Explanation rate by approach:
- Chromosomal abnormalities: ~5% (Phelan, Proc. of Greenwood Genetics Center 1996)
- Copy-number variants >50 kb: ~10% (De Vries, AJHG 2008)
- Exome sequencing (SNVs & indels, some large exonic variants): ~30% (De Ligt, NEJM 2012)
- Genome sequencing (SNVs, indels, some large variants): ~40% (Gilissen, Nature 2014)

What is missing?
- structural variants
- difficult-to-map regions
- repeat expansions
- phasing
WORKFLOW IMPLEMENTATION
- Designed with HPC/cloud job scheduling and scaling in mind
- Snakemake implementation from PacBio: https://github.com/PacificBiosciences/pb-human-wgs-workflow-snakemake
- WDL implementation adapted by Microsoft Genomics: https://github.com/PacificBiosciences/pb-human-wgs-workflow-wdl
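To illustrate what designing for job scheduling and scaling can mean in practice, here is a minimal Python sketch of a scatter-gather pattern: one independent job per region, gathered in order at the end. It is not taken from either repository. The region list, the `call_variants` placeholder, and the `scatter_gather` helper are all hypothetical, and a real workflow engine would submit each job to a cluster or cloud scheduler rather than to a local thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical region list; a real workflow would derive this from the reference.
REGIONS = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def call_variants(sample: str, region: str) -> dict:
    """Placeholder for one schedulable job; performs no real analysis."""
    return {"sample": sample, "region": region, "status": "done"}


def scatter_gather(sample: str, regions: list[str], max_workers: int = 4) -> list[dict]:
    """Run one independent job per region, then gather results in region order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input regions.
        return list(pool.map(lambda region: call_variants(sample, region), regions))


results = scatter_gather("sample01", REGIONS)
print(len(results), "jobs completed")
```

Because each job depends only on its own inputs, the same structure scales by raising the worker count or by handing each job to a scheduler.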
Explanation rate by approach, with long-read sequencing:
- Chromosomal abnormalities: ~5% (Phelan, Proc. of Greenwood Genetics Center 1996)
- Copy-number variants >50 kb: ~10% (De Vries, AJHG 2008)
- Exome sequencing (SNVs & indels, some large exonic variants): ~30% (De Ligt, NEJM 2012)
- Genome sequencing (SNVs, indels, some large variants): ~40% (Gilissen, Nature 2014)
- Long-read sequencing, HiFi genome (SNVs, indels, SVs, CNVs, phasing, translocations, inversions, repeat expansions): up to 67% (collaborations, presentations, and publications to date)
“All” variants called with short-read WGS, plus tens of thousands of additional SNVs, indels, and SVs per genome.

Candidate variants found in 30 of 80 samples, from:
• SNVs and indels in GC-rich and difficult-to-map regions
• Structural variants
• Phasing

Future work: long-read population control databases and improved variant interpretation tools.
Isabelle Thiffault

ACKNOWLEDGEMENTS
PacBio: Aaron Wenger, Shreyasee Chakraborty, Christine Lambert, Primo Baybayan
Microsoft Genomics: Roberto Lleras, Matthew McLoughlin, Benjamin Moskowitz