Genetic Sequencing at the DSHS Laboratory

The interior of a whole genome sequencing laboratory. MiSeq and NextSeq sequencers are placed on lab benches.

Scope of Genetic Sequencing at the Lab

The Genetic Sequencing Branch’s Advanced Molecular Detection (AMD) Team sequences bacterial, viral, and fungal genomes. Since 2017, the team has performed whole genome sequencing (WGS) of Salmonella enterica, Listeria monocytogenes, Shigella spp., Campylobacter spp., Vibrio spp., and Shiga toxin-producing E. coli (STEC) obtained from clinical, food, and environmental specimens. The team then uploads the pathogens’ sequence data, including species identification and subtype, to the National Center for Biotechnology Information (NCBI) database. The database, which is run by the National Library of Medicine, is open access. It allows researchers in the U.S. and around the world to compare submitted sequences from food and environmental samples and to analyze the data for indications of disease clusters (outbreaks of foodborne illness caused by the same bacteria).

Next-generation sequencing at the DSHS Laboratory is carried out using two methods: Whole Genome Sequencing (WGS) and Amplicon-Based Sequencing. Both methods can identify serotypes, drug resistance-related genes, toxin genes, and other characteristics of interest, and both support cluster analysis.

Amplicon-Based Sequencing is a targeted next-generation sequencing (NGS) method that allows the analysis of specific genomic regions with high precision. It is useful for studying specific mutations, genetic variations, or microbial communities and is cost-effective and efficient. Amplicon-based sequencing can also be used to amplify entire viral genomes.
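To illustrate how amplicons can cover an entire viral genome, the sketch below lays out overlapping amplicons end to end across a genome. It is a simplified illustration only: the function name, the genome length, the amplicon length, and the overlap are hypothetical values chosen for the example, not the parameters of any primer scheme used at the DSHS Laboratory.

```python
def tile_amplicons(genome_length, amplicon_length, overlap):
    """Return (start, end) coordinates (0-based, end-exclusive) of amplicons
    that tile a genome, with a fixed overlap between neighbouring amplicons."""
    if amplicon_length <= overlap:
        raise ValueError("amplicon_length must be greater than overlap")
    step = amplicon_length - overlap  # distance between consecutive starts
    amplicons = []
    start = 0
    while True:
        # The last amplicon is truncated at the end of the genome.
        end = min(start + amplicon_length, genome_length)
        amplicons.append((start, end))
        if end == genome_length:
            break
        start += step
    return amplicons


# Hypothetical values: a 30,000-base genome, 400-base amplicons, 50-base overlaps.
tiles = tile_amplicons(30_000, 400, 50)
print(f"{len(tiles)} amplicons; first {tiles[0]}, last {tiles[-1]}")
```

Real tiled-amplicon designs also have to account for primer placement and sequence variation, which this sketch ignores.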

Whole Genome Sequencing determines the entirety, or nearly the entirety, of the DNA sequence of an organism’s genome at a single time. This includes sequencing all genetic material, including the mitochondrial DNA of eukaryotic organisms. It provides a comprehensive view of the genome being studied, including all the genes, regulatory elements, and non-coding regions.
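A common way to judge whether a run has captured a whole genome is mean sequencing depth (coverage): the total number of sequenced bases divided by the genome size. The sketch below shows that arithmetic. The function names, the genome size, the read length, and the target depth are illustrative assumptions, not DSHS Laboratory acceptance criteria.

```python
import math


def mean_coverage(num_reads, read_length, genome_size):
    """Expected mean depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical values: a 5,000,000-base bacterial genome and 150-base reads.
depth = mean_coverage(num_reads=1_000_000, read_length=150, genome_size=5_000_000)
needed = reads_needed(target_coverage=40, read_length=150, genome_size=5_000_000)
print(f"mean depth: {depth:.1f}x; reads needed for 40x: {needed}")
```

Mean depth is only an average. Individual regions of a genome can be covered more or less deeply than this figure suggests.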

A graphic that visually explains the Whole Genome Sequencing process

Sample WGS Workflow

1. Specimen Received: The Genetic Sequencing Branch receives isolates or DNA and RNA extracts obtained from clinical, food, and environmental specimens that were submitted to the DSHS Laboratory. These specimens may be linked to infectious disease cases and outbreaks.

2. Specimen Processing: Extracted bacterial genomic material is amplified, sequenced, and analyzed to identify the species and serotype and to detect antimicrobial resistance genes and other characteristics of interest. DNA is fragmented, tagged with unique identifiers, and amplified into DNA libraries. These libraries are cleaned and normalized before being pooled and loaded on the sequencer.

3. Sequencing: The Genetic Sequencing Branch has several sequencing platforms that offer low to high throughput and 24–48 hour run times, including:

Illumina MiSeq
Illumina NextSeq2000
Oxford Nanopore Technologies GridION
Clear Dx automated sequencer

Short-read sequencing: Sequencing by Synthesis (SBS), by Illumina
Long-read sequencing: nanopore sequencing, by Oxford Nanopore Technologies

A graphic that identifies the different parts and features of the MiSeq system

4. Bioinformatic Data Analysis: Fragments of DNA are aligned and assembled for comparison with reference genomes. Workflow pipelines clean the data and perform analyses, including typing and variant detection. These pipelines include:

PulseNet 2.0 for PulseNet/GenomeTrakr samples
Armadillo for ARLN samples
Shigella Pipeline v1.0 for Shigella serotyping
MycoSNP for Candida spp. and other fungi
Varpipe_wgs for mycobacterial species
Cecret for SARS-CoV-2 and human monkeypox virus
CDC’s IRMA for influenza virus
TaxTriage for metagenomics
WARP for Newborn Screening

5. Results Reports: After analyzing the genomic data, the branch can report on the following:

Microorganism identification
Serotype/Lineage/Clade
Toxin genes (stx)
Antimicrobial resistance (AMR) genes
Drug resistance
Cluster reports to epidemiologists

Most sequence files are submitted to NCBI to make sequencing data and simple metadata publicly available. SARS-CoV-2 sequences are submitted to the Global Initiative on Sharing All Influenza Data (GISAID).
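The normalization mentioned in step 2 of the workflow above is typically an equimolar calculation, so that each library contributes a similar number of DNA molecules to the pool. The sketch below shows the general arithmetic, using the common approximation of 660 g/mol per base pair of double-stranded DNA. The library names, concentrations, fragment lengths, and target amount are hypothetical, and this is not a description of the DSHS Laboratory's protocol.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a library concentration (ng/uL) to molarity (nM),
    assuming double-stranded DNA at about 660 g/mol per base pair."""
    return conc_ng_per_ul * 1e6 / (660 * mean_fragment_bp)


def pooling_volumes_ul(libraries, fmol_per_library):
    """Volume (uL) of each library that contributes the same molar amount.
    1 nM equals 1 fmol/uL, so volume = fmol / nM."""
    return {
        name: fmol_per_library / library_molarity_nm(conc, fragment_bp)
        for name, (conc, fragment_bp) in libraries.items()
    }


# Hypothetical libraries: name -> (concentration in ng/uL, mean fragment length in bp)
libraries = {
    "isolate_A": (2.0, 500),
    "isolate_B": (4.0, 500),
    "isolate_C": (3.3, 600),
}
volumes = pooling_volumes_ul(libraries, fmol_per_library=20)
for name, volume in volumes.items():
    print(f"{name}: {volume:.2f} uL")
```

A library that is twice as concentrated at the same fragment length needs half the volume, which is why isolate_B takes half as much as isolate_A in this example.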

Bar graph showing the numbers of PulseNet organisms that have been sequenced at the DSHS WGS lab

Current Sequencing at the Laboratory (as of 2024)

Salmonella enterica	Vibrio spp.
Escherichia coli	Carbapenem Resistant Organisms
Shigella spp.	Neisseria gonorrhoeae
Listeria monocytogenes	Mycobacterium tuberculosis
Campylobacter spp.	Mycobacterium tuberculosis
Cyclospora	Candida auris
SARS-CoV-2	Neisseria meningitidis
Human monkeypox virus	Haemophilus influenzae
Influenza virus

Current Sequencing Projects

PulseNet

The DSHS Lab is the PulseNet Laboratory for the South-Central Area, which includes New Mexico, Louisiana, Arkansas, Oklahoma, Texas, the City of Houston, and the Texas State Chemist. PulseNet is an online, nationwide database that is maintained by the CDC. This network allows member laboratories to upload data to aid in detecting foodborne and infectious disease case clusters. PulseNet detects subtypes of E. coli O157 and other Shiga toxin-producing E. coli, Campylobacter, Cronobacter, Listeria monocytogenes, Salmonella, Vibrio cholerae, Vibrio parahaemolyticus, and Shigella.

These data are obtained from pulsed-field gel electrophoresis (PFGE), multiple-locus variable-number tandem repeat analysis (MLVA), and whole genome sequencing. The information from PulseNet allows for early identification of common source outbreaks and helps epidemiologists investigate outbreaks.
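The idea behind sequence-based cluster detection can be shown with a toy example: isolates whose genomes differ at only a few positions are grouped together as a possible common source cluster. The sketch below counts differences between equal-length aligned sequences and groups isolates by single-linkage clustering. The isolate names, sequences, and distance threshold are invented for illustration, and the method is a simplification, not PulseNet's actual analysis.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ, ignoring
    positions where either sequence has a non-ACGT character."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for a, b in zip(seq_a, seq_b)
        if a != b and a in "ACGT" and b in "ACGT"
    )


def single_linkage_clusters(sequences, max_distance):
    """Group isolates so that any two isolates within max_distance of each
    other share a cluster (union-find over all isolate pairs)."""
    parent = {name: name for name in sequences}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for a, b in combinations(sequences, 2):
        if snp_distance(sequences[a], sequences[b]) <= max_distance:
            parent[find(a)] = find(b)

    clusters = {}
    for name in sequences:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical isolates with short, already-aligned sequences.
isolates = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACGTTCGTAT",
    "isolate_4": "TGCATGCAGG",
}
clusters = single_linkage_clusters(isolates, max_distance=1)
print(clusters)
```

With single linkage, isolate_1 and isolate_3 end up in the same cluster even though they differ at two positions, because each is within one difference of isolate_2. Whether that chaining is desirable depends on the investigation, and real thresholds vary by organism.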

A map of the United States that shows the PulseNet network of foodborne disease surveillance

Image Source: CDC

GenomeTrakr

GenomeTrakr is an FDA network of public health and university laboratories that uses whole genome sequencing to identify foodborne pathogens from food and environmental sources. The sequencing data generated by WGS are uploaded to public databases at the National Center for Biotechnology Information (NCBI). Researchers and public health officials can access this genomic data for real-time comparison and analysis, which helps speed up foodborne illness outbreak investigations and reduces foodborne illnesses and deaths.

The GenomeTrakr network primarily sequences foodborne pathogens, but it may also sequence non-foodborne illness pathogens that could disrupt the food supply in other ways. For example, in 2021, the FDA initiated a pilot study for states to test SARS-CoV-2 wastewater surveillance during the COVID-19 pandemic.

Map of the United States showing the state, university, and hospital labs that are part of the GenomeTrakr Network

Network of GenomeTrakr labs across the U.S. Image source: FDA

Influenza Sequencing Center

The Genetic Sequencing Branch is one of six regional Influenza Sequencing Centers (ISCs) in the U.S. supported by the CDC. As an ISC, the DSHS genomics team aims to sequence approximately 500 influenza viruses each year. These sequence data are shared with the CDC to rapidly identify viruses with genetic differences that may allow them to evade vaccine protection. This information assists in updating seasonal flu vaccines, helping ensure that they protect people and save lives.

A computer generated close up of a spherical flu virus particle. The viral genetic material is folded inside the capsule

Please refer to DSHS’ LTSM test menu for more details on the types of specimens accepted for whole genome sequencing at the Laboratory. General specimen submission guidance is available on the DSHS Laboratory’s specimen submission and shipping requirements pages.

Future Sequencing Projects at the DSHS Laboratory

Metagenomics (clinical and environmental)
Expanded wastewater surveillance
Multiple gene panels for newborn screening disorders:
- 1. very long chain acyl-CoA dehydrogenase (VLCAD) deficiency
- 2. hemoglobinopathies
- 3. X-linked adrenoleukodystrophy (X-ALD)
- 4. galactosemia
- 5. Pompe disease
- 6. mucopolysaccharidosis type I (MPS1)
- 7. mucopolysaccharidosis type II (MPS2)

Graphic showing the different sequencing projects being conducted at the DSHS whole genome sequencing laboratory.
