1 Introduction

We have genetic samples of Culicoides from 24 sites spread around the Mediterranean Sea. The figure below displays the sample locations.

We measured a genetic distance (CSE, Cavalli-Sforza and Edwards’ chord distance) between samples and recorded the sampling dates.

We assume that populations spread through areas of suitable environmental conditions and that every pair of sampling locations is connected by a main (preferred) dispersal path, which is unknown.

These paths can be modelled as least-cost paths over a latent rugosity landscape, which is determined by environmental factors through an unknown relationship.
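To make the notion of a least-cost path concrete, here is a minimal sketch on a toy grid. It is not the implementation used in this study: it assumes a 4-connected raster, non-negative rugosity values, and a cost of entering a cell equal to that cell's rugosity. The function name `least_cost_path` and the toy `rugosity` grid are hypothetical and chosen for illustration only.

```python
import heapq


def least_cost_path(cost, start, goal):
    """Dijkstra's algorithm on a 4-connected grid (illustrative assumption).

    cost  : list of rows with non-negative cell costs (toy rugosity values)
    start : (row, col) of the origin cell
    goal  : (row, col) of the destination cell
    Returns (path, total), where total sums the cost of every cell entered.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    best = {start: 0.0}      # cheapest known cost to reach each cell
    previous = {}            # back-pointers used to rebuild the path
    queue = [(0.0, start)]
    while queue:
        dist, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if dist > best[cell]:
            continue         # stale queue entry, a cheaper route was found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    if goal not in best:
        raise ValueError("goal is not reachable from start")
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[goal]


# Toy landscape: a high-rugosity barrier in the middle column.
rugosity = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]
path, total = least_cost_path(rugosity, (0, 0), (0, 2))
```

In this toy landscape the cheapest route goes around the barrier rather than straight across it, which is the behaviour the dispersal-path assumption relies on.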

Assuming that the genetic distance between samples is associated with:

  • the length of the main dispersal path,

  • the environmental conditions along the path and

  • the temporal distance between the samples,

the objective of the present study is to estimate the rugosity landscape, the dispersal paths and the effect of the environmental factors on the genetic distance.

The method consists of an iterative procedure (an EM-like algorithm; Dempster, Laird, and Rubin (1977), see appendix) that starts by assuming that the geodesic (direct) paths between locations approximate the dispersal paths. Next, we:

  • compute a series of summaries of the environmental variables along the working paths that serve as covariates together with the length of the route and the difference in sampling times

  • fit a regression model for the genetic distances using the previously computed covariates

  • predict the rugosity (i.e. expected increase in genetic distance) at each pixel, given the environmental variables alone (i.e. geographical and temporal distance fixed at 0)

  • compute the next working paths as the least-cost paths between sampling locations, given the previously computed rugosity map

and iterate until convergence, which we declare when the maximum absolute relative difference between successive rugosity maps drops below 5%.
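The stopping rule can be sketched as follows. This is one possible reading of the criterion, not the code used in the study: it assumes the relative difference is computed pixel by pixel against the previous map, and that pixels whose previous value is exactly zero are skipped to avoid division by zero. The function names and the toy maps are hypothetical.

```python
def max_abs_relative_difference(previous, current):
    """Largest |current - previous| / |previous| over all pixels.

    Both maps are flat sequences of pixel values of equal length.
    Pixels whose previous value is zero are skipped (an assumption).
    """
    if len(previous) != len(current):
        raise ValueError("maps must have the same number of pixels")
    return max(
        (abs(c - p) / abs(p) for p, c in zip(previous, current) if p != 0),
        default=0.0,
    )


def has_converged(previous, current, threshold=0.05):
    """True when successive maps differ by less than the threshold (5%)."""
    return max_abs_relative_difference(previous, current) < threshold


# Toy rugosity maps from two successive iterations.
map_previous = [1.0, 2.0, 4.0]
map_current = [1.02, 1.94, 4.1]
converged = has_converged(map_previous, map_current)
```

Here the largest pixelwise change is about 3%, so the toy iteration would stop.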

This method is based on Bouyer et al. (2015), with the difference that here we work over a domain that contains a large water body (the Mediterranean Sea). As a consequence, some of the variables in the statistical model are specific to land areas (e.g. tree cover, elevation). Moreover, the variables common to both (e.g. air temperature) can have different effects over land and over sea.

2 Data description

Figure 2.1: Joint and marginal distributions of genetic and temporal distances between the 276 pairs of samples (24 sites yield 24 × 23 / 2 = 276 pairs).

2.1 Environmental variables

Figure 2.2: Spatial representation of centered and scaled environmental covariates.

3 Results

3.1 Convergence process

The process converged after 5 iterations, when the relative difference in the maximum absolute rugosity with respect to the previous step dropped below the established threshold of 5% (Fig. 3.1).

Figure 3.1: Relative difference of maximum absolute rugosity with respect to previous step.

Below, the convergence of the rugosity map and of the least-cost dispersal paths is displayed.