Modelling coexposition

1 Introduction

2 Model #0: base model

The binary outcome \(y_i\) of serology for the observation \(i\) is positive with probability \(p_i = 1/(1 + e^{-\mu_{k(i)}})\), where \(k(i)\) is the disease corresponding to observation \(i\).

\[\begin{equation} \tag{2.1} \begin{aligned} y_i & \sim \text{Ber}(p_i) \\ \eta_i = \log\frac{p_i}{1-p_i} & = \mu_i \\ \mu_i & = \beta_{k(i)} \end{aligned} \end{equation}\]

This model assumes that the serology of each animal for a disease \(k\) is a random draw of a binary variable with probability \(p_k\). The observed variation is that of the random sampling, and the probabilities of co-exposition are simply the result of independent sampling.

##  Family: MV(bernoulli, bernoulli, bernoulli, bernoulli, bernoulli) 
##   Links: mu = logit
##          mu = logit
##          mu = logit
##          mu = logit
##          mu = logit 
## Formula: sero_circo ~ 1 
##          sero_tub ~ 1 
##          sero_vhe ~ 1 
##          sero_myco ~ 1 
##          sero_auj ~ 1 
##    Data: sero_dat (Number of observations: 1217) 
##   Draws: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 2000
## 
## Regression Coefficients:
##                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## serocirco_Intercept     0.42      0.06     0.31     0.53 1.00     3181     1545
## serotub_Intercept      -2.62      0.11    -2.84    -2.40 1.00     2309     1469
## serovhe_Intercept       1.18      0.07     1.05     1.31 1.00     2702     1859
## seromyco_Intercept     -0.17      0.06    -0.28    -0.06 1.00     2224     1597
## seroauj_Intercept       0.33      0.06     0.21     0.44 1.00     2233     1578
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders.

Figure 2.1: Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders.

Same thing, but more concise.

Figure 2.2: Same thing, but more concise.

Figures 2.1 and 2.2 show that this model fits the observed seroprevalences of the 5 pathogens, but fails to adequately fit many of the co-exposures, suggesting that these are not really independent.

Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders, with observed prevalence by species.

Figure 2.3: Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders, with observed prevalence by species.

Furthermore, if we compare the model predictions with the corresponding prevalences by species, we realise that the model is predicting an average prevalence that does not really fit the situation for either species (Figure 2.3).

3 Model #1: species + latent correlated intercepts

The binary outcome \(y_i\) of serology for the observation \(i\) is positive with probability \(p_i\), which varies depending on the pathogen \(k\), the species \(s\) and the individual \(j\) as follows.

\[\begin{equation} \tag{2.1} \begin{aligned} y_i & \sim \text{Ber}(p_i) \\ \eta_i = \log\frac{p_i}{1-p_i} & = \mu_i + u_i\\ \mu_i & = \alpha^\text{dp}_{k(i)} + \delta^\text{wb}_{k(i)} \mathbb{I}_\text{wb}(i) \\ \mathbf{u} & \sim \mathcal{N}(0, \Sigma_k \otimes \mathbf{I}_n) \end{aligned} \end{equation}\]

This model assumes that there is a expected serology for pathogen \(k\) and species \(s\), with additional random individual variation that enables potential correlation across pathogens (the same for both species).

##  Family: MV(bernoulli, bernoulli, bernoulli, bernoulli, bernoulli) 
##   Links: mu = logit
##          mu = logit
##          mu = logit
##          mu = logit
##          mu = logit 
## Formula: sero_circo ~ 1 + species + (1 | i | id) 
##          sero_tub ~ 1 + species + (1 | i | id) 
##          sero_vhe ~ 1 + species + (1 | i | id) 
##          sero_myco ~ 1 + species + (1 | i | id) 
##          sero_auj ~ 1 + species + (1 | i | id) 
##    Data: sero_dat (Number of observations: 1217) 
##   Draws: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 2000
## 
## Multilevel Hyperparameters:
## ~id (Number of levels: 866) 
##                                             Estimate Est.Error l-95% CI
## sd(serocirco_Intercept)                         0.35      0.20     0.02
## sd(serotub_Intercept)                           0.62      0.42     0.03
## sd(serovhe_Intercept)                           0.88      0.21     0.45
## sd(seromyco_Intercept)                          0.37      0.23     0.02
## sd(seroauj_Intercept)                           0.66      0.16     0.35
## cor(serocirco_Intercept,serotub_Intercept)      0.10      0.38    -0.68
## cor(serocirco_Intercept,serovhe_Intercept)      0.42      0.34    -0.42
## cor(serotub_Intercept,serovhe_Intercept)        0.22      0.32    -0.51
## cor(serocirco_Intercept,seromyco_Intercept)     0.06      0.37    -0.69
## cor(serotub_Intercept,seromyco_Intercept)      -0.08      0.37    -0.74
## cor(serovhe_Intercept,seromyco_Intercept)       0.22      0.32    -0.50
## cor(serocirco_Intercept,seroauj_Intercept)      0.41      0.33    -0.40
## cor(serotub_Intercept,seroauj_Intercept)        0.16      0.34    -0.58
## cor(serovhe_Intercept,seroauj_Intercept)        0.78      0.12     0.48
## cor(seromyco_Intercept,seroauj_Intercept)       0.19      0.32    -0.50
##                                             u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(serocirco_Intercept)                         0.76 1.02      147      320
## sd(serotub_Intercept)                           1.59 1.03      150      262
## sd(serovhe_Intercept)                           1.30 1.00      399      577
## sd(seromyco_Intercept)                          0.87 1.01      209      456
## sd(seroauj_Intercept)                           0.98 1.00      447      742
## cor(serocirco_Intercept,serotub_Intercept)      0.80 1.01      380      405
## cor(serocirco_Intercept,serovhe_Intercept)      0.89 1.05       86      131
## cor(serotub_Intercept,serovhe_Intercept)        0.74 1.01      191      431
## cor(serocirco_Intercept,seromyco_Intercept)     0.74 1.00      351      572
## cor(serotub_Intercept,seromyco_Intercept)       0.66 1.00      446      607
## cor(serovhe_Intercept,seromyco_Intercept)       0.78 1.00      942     1058
## cor(serocirco_Intercept,seroauj_Intercept)      0.87 1.05       86      124
## cor(serotub_Intercept,seroauj_Intercept)        0.77 1.00      208      419
## cor(serovhe_Intercept,seroauj_Intercept)        0.96 1.00      670     1205
## cor(seromyco_Intercept,seroauj_Intercept)       0.76 1.01      779     1000
## 
## Regression Coefficients:
##                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## serocirco_Intercept           1.52      0.10     1.32     1.73 1.00     1132
## serotub_Intercept            -3.66      0.36    -4.61    -3.13 1.02      263
## serovhe_Intercept             2.50      0.18     2.16     2.88 1.00      781
## seromyco_Intercept            0.19      0.08     0.03     0.34 1.00     2728
## seroauj_Intercept             0.50      0.09     0.33     0.67 1.00     1660
## serocirco_specieswildboar    -2.56      0.16    -2.90    -2.26 1.00      846
## serotub_specieswildboar       1.54      0.27     1.04     2.09 1.00      905
## serovhe_specieswildboar      -2.25      0.20    -2.65    -1.87 1.00     1042
## seromyco_specieswildboar     -0.97      0.14    -1.25    -0.71 1.00     1361
## seroauj_specieswildboar      -0.33      0.13    -0.59    -0.06 1.00     2810
##                           Tail_ESS
## serocirco_Intercept           1072
## serotub_Intercept              268
## serovhe_Intercept             1077
## seromyco_Intercept            1353
## seroauj_Intercept              944
## serocirco_specieswildboar     1121
## serotub_specieswildboar       1009
## serovhe_specieswildboar       1166
## seromyco_specieswildboar       878
## seroauj_specieswildboar       1514
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders.

Figure 3.1: Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders.

Figure 3.1 is much better.

Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders, with observed prevalence by species.

Figure 3.2: Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders, with observed prevalence by species.

Consider the prevalence grouped by risk of interaction.

Distribution of risk levels across observations, by species.

Figure 3.3: Distribution of risk levels across observations, by species.

Posterior predictive checks of the proportion of observations for each combination of pathogen and risk level, with observed prevalence by species.

Figure 3.4: Posterior predictive checks of the proportion of observations for each combination of pathogen and risk level, with observed prevalence by species.

Figure 3.4 shows that the model does not capture correctly the observed seroprevalences of some pathogens at some risk levels. That is the case, for example, of VHE at high risk of interaction, or of Aujeszky at an intermediate risk of interaction.

This shows that the seroprevalence depends on the risk of interaction to some extent, and justifies the need of a more complex model.

4 Model #2: Add the risk of interaction

The binary outcome \(y_i\) of serology for the observation \(i\) is positive with probability \(p_i\), which varies depending on the pathogen \(k\), the species \(s\), the risk of interaction \(X_{m(i)}\) in municipality \(m\) and the individual \(j\) as follows.

\[\begin{equation} \tag{2.1} \begin{aligned} y_i & \sim \text{Ber}(p_i) \\ \eta_i = \log\frac{p_i}{1-p_i} & = \mu_i + u_i\\ \mu_i & = \alpha^\text{dp}_{k(i)} + \delta^\text{wb}_{k(i)} \mathbb{I}_\text{wb}(i) + (\beta^\text{dp}_{k(i)} + \gamma^\text{wb}_{k(i)} \mathbb{I}_\text{wb}(i)) X_{m(i)} \\ \mathbf{u} & \sim \mathcal{N}(0, \Sigma_k \otimes \mathbf{I}_n) \end{aligned} \end{equation}\]

This model assumes that there is a expected serology for pathogen \(k\) and species \(s\), with additional random individual variation that enables potential correlation across pathogens (the same for both species).

##  Family: MV(bernoulli, bernoulli, bernoulli, bernoulli, bernoulli) 
##   Links: mu = logit
##          mu = logit
##          mu = logit
##          mu = logit
##          mu = logit 
## Formula: sero_circo ~ 1 + species * latent_risk + (1 | i | id) 
##          sero_tub ~ 1 + species * latent_risk + (1 | i | id) 
##          sero_vhe ~ 1 + species * latent_risk + (1 | i | id) 
##          sero_myco ~ 1 + species * latent_risk + (1 | i | id) 
##          sero_auj ~ 1 + species * latent_risk + (1 | i | id) 
##    Data: filter(sero_dat, !is.na(latent_risk)) (Number of observations: 609) 
##   Draws: 2 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 2000
## 
## Multilevel Hyperparameters:
## ~id (Number of levels: 501) 
##                                             Estimate Est.Error l-95% CI
## sd(serocirco_Intercept)                         0.48      0.36     0.02
## sd(serotub_Intercept)                           1.47      1.01     0.07
## sd(serovhe_Intercept)                           1.24      0.54     0.21
## sd(seromyco_Intercept)                          0.35      0.23     0.02
## sd(seroauj_Intercept)                           0.61      0.27     0.06
## cor(serocirco_Intercept,serotub_Intercept)      0.05      0.38    -0.68
## cor(serocirco_Intercept,serovhe_Intercept)      0.21      0.34    -0.51
## cor(serotub_Intercept,serovhe_Intercept)        0.02      0.28    -0.56
## cor(serocirco_Intercept,seromyco_Intercept)     0.05      0.40    -0.72
## cor(serotub_Intercept,seromyco_Intercept)      -0.20      0.40    -0.83
## cor(serovhe_Intercept,seromyco_Intercept)       0.14      0.35    -0.58
## cor(serocirco_Intercept,seroauj_Intercept)      0.17      0.36    -0.57
## cor(serotub_Intercept,seroauj_Intercept)       -0.09      0.32    -0.67
## cor(serovhe_Intercept,seroauj_Intercept)        0.61      0.26    -0.10
## cor(seromyco_Intercept,seroauj_Intercept)       0.11      0.37    -0.64
##                                             u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(serocirco_Intercept)                         1.37 1.01      223      554
## sd(serotub_Intercept)                           3.85 1.03      110      247
## sd(serovhe_Intercept)                           2.45 1.07       41       59
## sd(seromyco_Intercept)                          0.84 1.02      210      732
## sd(seroauj_Intercept)                           1.20 1.02      149      136
## cor(serocirco_Intercept,serotub_Intercept)      0.75 1.01      199      344
## cor(serocirco_Intercept,serovhe_Intercept)      0.81 1.01      149      416
## cor(serotub_Intercept,serovhe_Intercept)        0.56 1.01      296      451
## cor(serocirco_Intercept,seromyco_Intercept)     0.74 1.00      659     1027
## cor(serotub_Intercept,seromyco_Intercept)       0.65 1.00      656      897
## cor(serovhe_Intercept,seromyco_Intercept)       0.77 1.00     1206     1274
## cor(serocirco_Intercept,seroauj_Intercept)      0.81 1.01      188      443
## cor(serotub_Intercept,seroauj_Intercept)        0.59 1.01      398      515
## cor(serovhe_Intercept,seroauj_Intercept)        0.94 1.02      196      163
## cor(seromyco_Intercept,seroauj_Intercept)       0.77 1.00     1141     1530
## 
## Regression Coefficients:
##                                       Estimate Est.Error l-95% CI u-95% CI Rhat
## serocirco_Intercept                       2.04      0.29     1.55     2.66 1.00
## serotub_Intercept                        -3.84      1.05    -6.59    -2.47 1.02
## serovhe_Intercept                         3.32      0.64     2.35     4.91 1.04
## seromyco_Intercept                        0.27      0.17    -0.07     0.62 1.00
## seroauj_Intercept                         0.93      0.20     0.53     1.33 1.00
## serocirco_specieswildboar                -2.70      0.40    -3.54    -2.03 1.00
## serocirco_latent_risk                    -0.20      0.11    -0.43     0.02 1.00
## serocirco_specieswildboar:latent_risk     0.33      0.14     0.06     0.59 1.00
## serotub_specieswildboar                   1.23      0.64     0.07     2.58 1.00
## serotub_latent_risk                       0.04      0.19    -0.31     0.42 1.00
## serotub_specieswildboar:latent_risk       0.05      0.22    -0.42     0.48 1.00
## serovhe_specieswildboar                  -2.29      0.59    -3.65    -1.31 1.01
## serovhe_latent_risk                       0.12      0.16    -0.18     0.44 1.00
## serovhe_specieswildboar:latent_risk       0.10      0.18    -0.24     0.44 1.00
## seromyco_specieswildboar                 -0.98      0.31    -1.59    -0.36 1.00
## seromyco_latent_risk                     -0.07      0.07    -0.21     0.08 1.00
## seromyco_specieswildboar:latent_risk      0.10      0.10    -0.09     0.29 1.00
## seroauj_specieswildboar                  -0.51      0.30    -1.10     0.09 1.00
## seroauj_latent_risk                       0.04      0.08    -0.11     0.20 1.00
## seroauj_specieswildboar:latent_risk       0.06      0.10    -0.14     0.26 1.00
##                                       Bulk_ESS Tail_ESS
## serocirco_Intercept                       1029      919
## serotub_Intercept                          164      302
## serovhe_Intercept                           73      396
## seromyco_Intercept                        2125     1705
## seroauj_Intercept                         2015     1556
## serocirco_specieswildboar                  921     1027
## serocirco_latent_risk                     1755     1306
## serocirco_specieswildboar:latent_risk     1607     1403
## serotub_specieswildboar                   1051      880
## serotub_latent_risk                        921      712
## serotub_specieswildboar:latent_risk       1173      707
## serovhe_specieswildboar                    353      804
## serovhe_latent_risk                       1505     1419
## serovhe_specieswildboar:latent_risk       2332     1409
## seromyco_specieswildboar                  1953     1278
## seromyco_latent_risk                      1825     1248
## seromyco_specieswildboar:latent_risk      1627     1000
## seroauj_specieswildboar                   2113     1606
## seroauj_latent_risk                       2048     1559
## seroauj_specieswildboar:latent_risk       1658     1516
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders, with observed prevalence by species (across risk levels).

Figure 4.1: Posterior predictive checks of the proportion of observations for each possible combination of pathogens of all orders, with observed prevalence by species (across risk levels).

Posterior predictive checks of the proportion of observations for each combination of pathogen and risk level, with observed prevalence by species.

Figure 4.2: Posterior predictive checks of the proportion of observations for each combination of pathogen and risk level, with observed prevalence by species.

5 Results

Posterior distribution (median and 50 % quantile credible interval) of the expected seroprevalence by species, pathogen and risk of interaction.

Figure 5.1: Posterior distribution (median and 50 % quantile credible interval) of the expected seroprevalence by species, pathogen and risk of interaction.

Posterior correlations (median a 89% quantile intervals) of co-expositions.

Figure 5.2: Posterior correlations (median a 89% quantile intervals) of co-expositions.

Figure 5.2 shows that only VHE and Aujeszki are clearly correlated positively. To a lesser extent, Circovirus seem to be positively correlated with both as well. For the others, the uncertainties are too large to be conclusive.

6 Conclusions

  • Which other results might be of interest? E.g.

    • can we compare the expected seroprevalences for 2 distinct villages in terms of risk level?

    • species-specific correlations?

  • Modelling alternatives:

    • municipality-level intercepts?

    • risk-specific correlations?