Genetic Sequencing at the DSHS Laboratory
Scope of Genetic Sequencing at the Lab
The Genetic Sequencing Branch’s Advanced Molecular Detection (AMD) Team sequences bacterial, viral, and fungal genomes. Since 2017, the team has performed whole genome sequencing (WGS) of Salmonella enterica, Listeria monocytogenes, Shigella spp., Campylobacter spp., Vibrio spp., and Shiga toxin-producing E. coli (STEC) obtained from clinical, food, and environmental specimens. The team then uploads each pathogen’s sequence data, including species identification and subtype, to the National Center for Biotechnology Information (NCBI) database. The database, which is run by the National Library of Medicine, is open access. It allows researchers in the U.S. and around the world to compare submitted sequences from food and environmental samples and to analyze the data for indications of disease clusters (outbreaks of foodborne illness caused by the same bacteria).
Next-generation sequencing at the DSHS Laboratory is carried out using two methods: Whole Genome Sequencing (WGS) and Amplicon-Based Sequencing. Both methods can identify serotypes, drug resistance-related genes, toxin genes, and other characteristics of interest, and both support cluster analysis.
Amplicon-Based Sequencing is a targeted next-generation sequencing (NGS) method that allows the analysis of specific genomic regions with high precision. It is useful for studying specific mutations, genetic variations, or microbial communities and is cost-effective and efficient. Amplicon-based sequencing can also be used to amplify entire viral genomes.
Whole Genome Sequencing determines the entirety, or nearly the entirety, of the DNA sequence of an organism’s genome at a single time. This includes sequencing all genetic material, including the mitochondrial DNA of eukaryotic organisms. It provides a comprehensive view of the genome being studied, including all the genes, regulatory elements, and non-coding regions.
Sample WGS Workflow
1. Specimen Received: The Genetic Sequencing Branch receives isolates or DNA and RNA extracts obtained from clinical, food, and environmental specimens that were submitted to the DSHS Laboratory. These specimens may be linked to infectious disease cases and outbreaks.
2. Specimen Processing: Extracted bacterial genomic material is amplified, sequenced, and analyzed to identify the species and serotype and to detect antimicrobial resistance genes and other characteristics of interest. To prepare for sequencing, DNA is fragmented, tagged with unique identifiers, and amplified into DNA libraries. These libraries are cleaned and normalized before being pooled and loaded onto the sequencer.
3. Sequencing: The Genetic Sequencing Branch has several sequencing platforms that offer low to high throughput and 24–48 hour run times, including:
- Illumina MiSeq
- Illumina NextSeq 2000
- Oxford Nanopore Technologies GridION
- Clear Dx automated sequencer
- Short-read sequencing: sequencing by synthesis (SBS) by Illumina
- Long-read sequencing: nanopore sequencing by Oxford Nanopore Technologies
4. Bioinformatic Data Analysis: Fragments of DNA are aligned and assembled for comparison with reference genomes. Workflow pipelines clean the data and perform analyses such as typing and variant detection. These pipelines include:
- PulseNet 2.0 for PulseNet/GenomeTrakr samples
- Armadillo for ARLN samples
- Shigella Pipeline v1.0 for Shigella serotyping
- MycoSNP for Candida spp. and other fungi
- Varpipe_wgs for mycobacterial species
- Cecret for SARS-CoV-2 and human monkeypox virus
- CDC’s IRMA for influenza virus
- TaxTriage for metagenomics
- WARP for newborn screening
5. Results Reports: After analyzing the genomic data, the branch can report on the following:
- Microorganism identification
- Serotype/Lineage/Clade
- Toxin genes (stx)
- Antimicrobial resistance (AMR) genes
- Drug resistance
- Cluster reports to epidemiologists
Most sequence files are submitted to NCBI so that the sequencing data and basic metadata are publicly available. SARS-CoV-2 sequences are submitted to the Global Initiative on Sharing All Influenza Data (GISAID).
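The pipeline assignments listed in step 4 amount to a lookup from sample category to analysis pipeline. The short Python sketch below illustrates that idea only. The category labels and the `select_pipeline` function are hypothetical names chosen for this example, and the sketch does not represent the Laboratory's actual software or how any of these pipelines is invoked.

```python
# Illustrative only: maps hypothetical sample-category labels to the
# pipeline names listed in step 4 of the workflow above.
PIPELINES = {
    "pulsenet_genometrakr": "PulseNet 2.0",
    "arln": "Armadillo",
    "shigella_serotyping": "Shigella Pipeline v1.0",
    "fungal": "MycoSNP",
    "mycobacterial": "Varpipe_wgs",
    "sars_cov_2": "Cecret",
    "monkeypox": "Cecret",
    "influenza": "IRMA",
    "metagenomics": "TaxTriage",
    "newborn_screening": "WARP",
}


def select_pipeline(category: str) -> str:
    """Return the pipeline name for a sample category (hypothetical labels)."""
    key = category.strip().lower()
    if key not in PIPELINES:
        raise ValueError(f"No pipeline listed for sample category: {category!r}")
    return PIPELINES[key]


if __name__ == "__main__":
    for category in ("influenza", "fungal", "sars_cov_2"):
        print(f"{category} -> {select_pipeline(category)}")
```

Keeping the mapping in a single table means that adding a new sample category requires only one new entry, and an unrecognized category fails with an error instead of being routed to the wrong pipeline.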
Current Sequencing at the Laboratory (as of 2024)
Current Sequencing Projects
PulseNet
The DSHS Lab is the PulseNet Laboratory for the South-Central Area, which includes New Mexico, Louisiana, Arkansas, Oklahoma, Texas, the City of Houston, and the Texas State Chemist. PulseNet is an online, nationwide database that is maintained by the CDC. This network allows member laboratories to upload data to aid in detecting foodborne and infectious disease case clusters. PulseNet detects subtypes of E. coli O157 and other Shiga toxin-producing E. coli, Campylobacter, Cronobacter, Listeria monocytogenes, Salmonella, Vibrio cholerae, Vibrio parahaemolyticus, and Shigella.
These data are obtained from pulsed-field gel electrophoresis (PFGE), multiple-locus variable-number tandem repeat analysis (MLVA), and whole genome sequencing. The information from PulseNet allows for early identification of common-source outbreaks and helps epidemiologists investigate them.
Image Source: CDC
GenomeTrakr
GenomeTrakr is an FDA network of public health and university laboratories that uses whole genome sequencing to identify foodborne pathogens from food and environmental sources. The sequencing data generated by WGS are uploaded to public databases at the National Center for Biotechnology Information (NCBI). Researchers and public health officials can access this genomic data for real-time comparison and analysis, which helps speed up foodborne illness outbreak investigations and reduces foodborne illnesses and deaths.
The GenomeTrakr network primarily sequences foodborne pathogens, but it may also sequence non-foodborne illness pathogens that could disrupt the food supply in other ways. For example, in 2021, the FDA initiated a pilot study for states to test SARS-CoV-2 wastewater surveillance during the COVID-19 pandemic.
Network of GenomeTrakr Labs across the U.S. Image source: FDA
Influenza Sequencing Center
The Genetic Sequencing Branch is one of six regional Influenza Sequencing Centers (ISCs) in the U.S. supported by the CDC. As an ISC, the DSHS genomics team aims to sequence approximately 500 influenza viruses each year. These sequence data are shared with the CDC to rapidly identify viruses with genetic differences that may allow them to evade vaccine protection. This information assists in updating seasonal flu vaccines so that they continue to protect people and save lives.
Please refer to DSHS’ LTSM test menu for more details on the types of specimens accepted for whole genome sequencing at the Laboratory. General specimen submission guidance is available on the DSHS Laboratory’s specimen submission and shipping requirements pages.
Future Sequencing Projects at the DSHS Laboratory
- Metagenomics (clinical and environmental)
- Expanded wastewater surveillance
- Multiple gene panels for newborn screening disorders:
  1. Very long-chain acyl-CoA dehydrogenase (VLCAD) deficiency
  2. Hemoglobinopathies
  3. X-linked adrenoleukodystrophy (X-ALD)
  4. Galactosemia
  5. Pompe disease
  6. Mucopolysaccharidosis type I (MPS I)
  7. Mucopolysaccharidosis type II (MPS II)