Improving Forensic SNP-Based Outcomes: Insights from Case Studies, Comparative Sequencing & Bioinformatic Approaches
Alaina Addison, MS1, Jessica Bouchet, BS1, Meghan Didier, MS1, Sheila Diepold, MS1, Kyla Hackman, MFS1, Kevin Lord2, Richard E. Green, PhD2, Cydne Holt, PhD1
1Forensic Services Division, Tetracore®, Inc., Rockville, MD 20850
2Astrea Forensics™ LLC, Scotts Valley, CA 95066
Summary
Forensic investigative genetic genealogy (FIGG) is driving technical innovation in advanced DNA techniques for problematic samples from recent cases, cold cases, and unidentified human remains investigations. Rigorous evaluation of assay chemistries, bioinformatics, computational advancements, and mixture resolution capabilities is of essential interest to forensic practitioners and law enforcement professionals. Methodological transparency in SNP sequencing and data handling is crucial to shaping the future of forensic genomics and to generating investigative leads ethically, in a way that supports evidentiary reliability and leads to solved cases.
A systematic assessment was conducted of two DNA library preparation kits, one targeted (Kintelligence) and one for whole genome sequencing (WGS), with particular attention to bioinformatic approaches, mixtures, and investigative utility for kinship detection following a GEDmatch PRO (GMP) query. Performance metrics included, but were not limited to, single nucleotide polymorphism (SNP) yield and quality, mixture deconvolution success, kinship estimation robustness, and genealogical reach.
Genomic data were generated from 100 pg of input DNA from an individual with nine known family members (1st-4th degree) in the GEDmatch database who are opted in for law enforcement purposes. This extended family provided a testbed for assessing sensitivity and specificity using the Shared cM metric in GMP across variables such as PCR cycle number, NextSeq 1000/2000 read length, and two FIGG-compatible bioinformatic pipelines for WGS-based SNP calling, QC, and imputation: a custom tool and AstreaImpute2 (Astrea Forensics). Both pipelines detected the known relatives with high sensitivity and specificity and in the correct rank order. The Shared cM values reported by GMP fell within the expected ranges (DNA Painter Shared cM Project 4.0 Tool v4), except in some instances where fewer Shared cM than expected were observed (e.g., 84 cM, below the 448 cM minimum anticipated for a 1st cousin (1C)). Both pipelines supported correct kinship degree estimation for the two known 1st cousins once removed (1C1R, 4th degree).
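The range comparison described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline or any GMP or DNA Painter interface: the `ExpectedRange` class and `classify_shared_cm` function are hypothetical names introduced here, and the only numbers used are the 84 cM observation and the 448 cM 1C minimum quoted in the text. No upper bound is given in the abstract, so none is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpectedRange:
    """Expected Shared cM bounds for one relationship (hypothetical helper)."""
    min_cm: float
    max_cm: Optional[float] = None  # None: no upper bound supplied


def classify_shared_cm(observed_cm: float, expected: ExpectedRange) -> str:
    """Return 'below', 'within', or 'above' relative to the expected range."""
    if observed_cm < expected.min_cm:
        return "below"
    if expected.max_cm is not None and observed_cm > expected.max_cm:
        return "above"
    return "within"


# Assumption: only the 1C minimum (448 cM) is taken from the text;
# other relationships would need ranges from the reference tool.
expected_by_relationship = {"1C": ExpectedRange(min_cm=448.0)}

# The 84 cM example from the text falls below the 1C minimum.
print(classify_shared_cm(84.0, expected_by_relationship["1C"]))
```

A check of this kind only flags a match for review; a low value for a known relative may reflect data quality rather than an incorrect relationship.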