This package dlmwwbe (Dynamic Linear Model for Wastewater-based Epidemilogy with Missing Data) contains two main function pdlm() (Predictive Dynamic Linear Model) and dllm() (Dynamic Local Level Model). The first one is to fit a dynamic linear model for forecasting the clinical positive cases (or other similar data) using lagged clinical and wastewater data. The second one is to fit a local level model for smoothing the noisy wastewater data. For more details, see papers here.
First, we implement dllm() on the wastewater data collected between 2022 - 2024 in Twin Cities metro area in Minnesota, United States. For the detail of the data, see papers. There are two possible structures: 1. all wastewater data share a single latent state (S = ‘univariate’). 2. Each wastewater data has its own latent sate (S = ‘kvariate’). For a better model fitting, we encourage the use of the log transformation of the original wastewater data by setting the argument log10 = TRUE. This is because the data better approximates the normality assumption in practice. Other transformation might be necessary depending on the nature of the data. The summary() provides some information of the fitted model.
Consider both wastewater data have their individual latent state. The average of the smoother is provided.
data_TC <- wastewater[wastewater$Code == "TC",]
data_TC$SampleDate <- as.Date(data_TC$SampleDate)
fit <- dllm(
  equal.state.var=FALSE,
  equal.obs.var=FALSE,
  log10=TRUE,
  data = data_TC,
  date = "SampleDate",
  obs_cols = c("ORFlab", "Nlab"),
  S = c('kvariate')
)
summary(fit)#> Dynamic Linear Model Results
#> ============================
#> Model Structure:
#> - Observation columns: ORFlab, Nlab 
#> - Latent state dimension: 2 
#> - Observation dimension: 2 
#> Fit Statistics:
#> Log-likelihood: 239.71
#> AIC: -471.41
#> BIC: -456.16
#> Convergence: Achieved
#> 
#> Estimated Parameters:
#> sigmaW_state1 sigmaW_state2 sigmaV_ORFlab   sigmaV_Nlab 
#>   0.001783840   0.002291088   0.012992110   0.013768954Next, we implement pdlm() on the clinical and wastewater data. Different number of lags are demonstrated. For a better model fitting, we encourage the use of the log transformation of the original wastewater data by setting the argument log10 = TRUE (and add \(1\) for the positive count cases for a valid transformation). The summary() provides some information of the fitted model.
Here, We consider \(0\) and \(2\) lags and plot them along with the observed data on its original scale.
data_TC <- wastewaterhealthworker[wastewaterhealthworker$Code == "TC",]
data_TC$SampleDate <- as.Date(data_TC$SampleDate)
fit <- pdlm(
  data=data_TC,
  formula=HealthWorkerCaseCount ~ WW.tuesday + WW.thursday,
  lags=0,
  log10=TRUE,
  date = NULL,
  equal.state.var = TRUE,
  equal.obs.var = FALSE,
  auto_init = TRUE,
  control = list(maxit = 100))
summary(fit)#> Dynamic Linear Model Results
#> ============================
#> Model Structure:
#> - Formula: HealthWorkerCaseCount ~ WW.tuesday + WW.thursday 
#> - Lags: 0 
#> - Latent states dimension: 4 
#> - Observation dimension: 3 
#> Fit Statistics:
#> Log-likelihood: 156.52
#> AIC: -297.05
#> BIC: -281.91
#> Convergence: Failed
#> 
#> Estimated Parameters:
#> $Ft
#> [1]  0.004656649 -0.672803533  0.369672144
#> 
#> $Wt
#> [1] 0.005005797 0.015570229
#> 
#> $Vt
#> [1] 0.04743422 0.00998462 1.00000000data_TC <- wastewaterhealthworker[wastewaterhealthworker$Code == "TC",]
data_TC$SampleDate <- as.Date(data_TC$SampleDate)
fit <- pdlm(
  data=data_TC,
  formula=HealthWorkerCaseCount ~ WW.tuesday + WW.thursday,
  lags=2,
  log10=TRUE,
  date = NULL,
  equal.state.var = FALSE,
  equal.obs.var = TRUE,
  auto_init = TRUE,
  control = list(maxit = 100))
summary(fit)#> Dynamic Linear Model Results
#> ============================
#> Model Structure:
#> - Formula: HealthWorkerCaseCount ~ WW.tuesday + WW.thursday 
#> - Lags: 2 
#> - Latent states dimension: 10 
#> - Observation dimension: 3 
#> Fit Statistics:
#> Log-likelihood: 154.26
#> AIC: -284.52
#> BIC: -261.82
#> Convergence: Failed
#> 
#> Estimated Parameters:
#> $Ft
#> [1] -0.01790474  0.23394470 -0.42737618  0.11332519  0.27906550  0.14315139
#> [7] -0.17890679
#> 
#> $Wt
#> [1] 0.43087811 0.03331713 0.03668333
#> 
#> $Vt
#> [1] 0.0027912527 0.0005323801