
The ‘WES’ R package provides reproducible functions for collating and analyzing data from wastewater and environmental sampling studies. Wastewater and Environmental Sampling (WES) of infectious diseases involves collecting samples from various sources (such as sewage, water, air, soil, or surfaces) to monitor the presence of pathogens in the environment. The tools here are intended to do the heavy lifting when analyzing WES data and include:

  • establishing standardized data formats,
  • relative and absolute quantification of qPCR data,
  • collating spatial data for sampling sites,
  • and analysis of sample sizes and temporal trends.

We developed the ‘WES’ R package primarily for epidemiological surveillance studies for multiple pathogens in Low- and Middle-Income Countries (LMIC) where samples are collected from informal sewer networks. However, the functions should generalize to other applications provided they use the same data formats. Feel free to reach out with comments or questions; the package is currently in development and maintained by John Giles (@gilesjohnr).


Data standards

To use the data and methods provided in the ‘WES’ R package, input data must match the formatting shown in the template_WES_data and template_WES_standard_curves data objects. Both template data sets are described in more detail in the following vignette:

Templates for input data sets
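
As a rough illustration of what matching a template can mean in practice, the base R sketch below compares the column names and classes of an input data frame against a template. The helper check_against_template() and the columns in the stand-in template are hypothetical and are not part of the 'WES' package; with the package loaded, template_WES_data could presumably be passed as the template instead.

```r
# Hypothetical helper (not part of 'WES'): report column-name and column-class
# mismatches between an input data frame and a template data frame
check_against_template <- function(input, template) {
  missing_cols <- setdiff(names(template), names(input))
  extra_cols   <- setdiff(names(input), names(template))
  shared       <- intersect(names(template), names(input))

  # Shared columns whose classes differ from the template
  class_mismatch <- shared[vapply(shared, function(nm) {
    !identical(class(input[[nm]]), class(template[[nm]]))
  }, logical(1))]

  list(missing = missing_cols, extra = extra_cols, class_mismatch = class_mismatch)
}

# Stand-in template with made-up columns (an assumption for illustration only)
template <- data.frame(
  date        = as.Date("2024-01-01"),
  location_id = "A",
  ct_value    = 30.5
)

# Example input: date stored as text, one missing column, one extra column
input <- data.frame(
  date        = "2024-01-01",
  location_id = "A",
  site        = "x"
)

res <- check_against_template(input, template)
print(res)
```

A check like this only confirms structure; the vignette remains the reference for what each field should contain.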


Derivative qPCR quantities

Studies that use environmental sampling for disease surveillance often employ quantitative real-time polymerase chain reaction (qPCR) to detect pathogens in samples. In qPCR, the measurement is reported as the cycle threshold (Ct) value, which is often compared against a cutoff value to yield a presence/absence response. Ct values can also be transformed into absolute or relative quantities of a pathogen using the calc_n_copies(), calc_delta_delta_ct(), and est_amplification_efficiency() functions, which are described in more detail in the following vignettes:

Absolute quantification of qPCR data
Relative quantification of qPCR data
qPCR Amplification efficiency
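
For readers unfamiliar with these quantities, the following base R sketch shows the general textbook calculations: a standard curve fit for absolute quantification, the amplification efficiency implied by the curve's slope, and the 2^(-ΔΔCt) formula for relative quantification. It does not call the 'WES' functions and may differ from how they are implemented; the data values and the helper names ct_to_copies() and fold_change_ddct() are assumptions made for illustration.

```r
# Hypothetical standard-curve data: known copy numbers and observed Ct values
std <- data.frame(
  copies = c(1e1, 1e2, 1e3, 1e4, 1e5),
  ct     = c(35.0, 31.7, 28.4, 25.1, 21.8)
)

# Standard curve: Ct modeled as a linear function of log10(copies)
fit       <- lm(ct ~ log10(copies), data = std)
intercept <- unname(coef(fit)[1])
slope     <- unname(coef(fit)[2])

# Amplification efficiency implied by the slope
# (a value of 1 corresponds to perfect doubling each cycle)
efficiency <- 10^(-1 / slope) - 1

# Absolute quantification: invert the standard curve for an observed Ct
ct_to_copies <- function(ct, intercept, slope) {
  10^((ct - intercept) / slope)
}
est_copies <- ct_to_copies(30, intercept, slope)

# Relative quantification with the 2^(-delta delta Ct) formula, which assumes
# roughly 100% efficiency for both the target and the reference gene
fold_change_ddct <- function(ct_target_test, ct_ref_test,
                             ct_target_ctrl, ct_ref_ctrl) {
  d_test <- ct_target_test - ct_ref_test   # delta Ct in the test sample
  d_ctrl <- ct_target_ctrl - ct_ref_ctrl   # delta Ct in the control sample
  2^(-(d_test - d_ctrl))
}
fold_change <- fold_change_ddct(28, 20, 30, 20)

print(c(slope = slope, efficiency = efficiency,
        est_copies = est_copies, fold_change = fold_change))
```

When efficiency departs noticeably from 1, the simple 2^(-ΔΔCt) formula becomes less reliable, which is one reason to estimate efficiency alongside relative quantities.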


Covariates

Wastewater and Environmental Sampling studies conducted in informal sewer systems are vulnerable to confounding because the climate and local geography can affect the substrate available for collection. Therefore, we have included functions that download spatial data (such as climate, hydrology, and local populations) for the times and locations in the WES data. The following vignettes describe the data sources and the methods used to relate these spatial data to WES observations:

Sources of spatial data
Getting climate data
Calculating hydrological variables
Calculating local population size
Getting administrative data


Analysis

Additional analyses are under development and will likely include the calculation of cross-correlations among targets, time series models, and models of pathogen presence based on multiple gene targets. Currently, methods for quick tabulation of sampling statistics and overall detection rates are available in the calc_sample_sizes() function.
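
To give a sense of what tabulating sampling statistics and detection rates involves, here is a base R sketch that counts samples and detections by location and target using a Ct cutoff. It does not use calc_sample_sizes(), whose arguments and output are not shown here; the column names, the cutoff of 40, and the choice to treat a missing Ct as a non-detection are all assumptions for illustration.

```r
# Hypothetical WES-style observations; column names are illustrative only
obs <- data.frame(
  location_id = c("A", "A", "A", "B", "B", "B"),
  target_name = c("t1", "t1", "t2", "t1", "t2", "t2"),
  ct_value    = c(32.1, NA, 41.5, 29.8, 38.0, NA)
)

# Assumed cutoff: a sample counts as a detection when Ct is below this value
ct_cutoff <- 40

# A missing Ct (no amplification) is treated here as a non-detection
obs$detected <- !is.na(obs$ct_value) & obs$ct_value < ct_cutoff

# Number of samples, number of detections, and detection rate
# for each location and target combination
tab <- aggregate(
  detected ~ location_id + target_name,
  data = obs,
  FUN  = function(x) c(n = length(x), n_detected = sum(x), rate = mean(x))
)

# Flatten the matrix column returned by aggregate() into ordinary columns
tab <- do.call(data.frame, tab)
print(tab)
```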


Visualization

We are currently developing an RShiny application that will visualize the data and methods in the ‘WES’ package.


Installation

Install the released version of ‘WES’ from CRAN, or use the devtools package to install the development version from the GitHub repository. R version >= 4.1.1 is recommended.

# The WhiteboxTools R frontend is a prerequisite for GIS functionality
install.packages("whitebox", dependencies = TRUE)
whitebox::install_whitebox()
whitebox::wbt_version()

# Install the current CRAN version
install.packages("WES", dependencies = TRUE)

# Or install the development version from GitHub
install.packages('devtools')
devtools::install_github("gilesjohnr/WES", dependencies=TRUE)


Troubleshooting

For general questions, contact John Giles and/or Jillian Gauld. Note that this software is made available under a Creative Commons 4.0 license. It was developed for specific environmental sampling applications and therefore may not generalize perfectly to all settings.


Funding

This work was developed at the Institute for Disease Modeling in support of funded research grants made by the Bill & Melinda Gates Foundation.