BioCentury
ARTICLE | Preclinical News

Study uncovers large batch effects in TCGA exome sequencing data

October 19, 2018 4:52 PM UTC

A preprint posted on bioRxiv identified large cross-institutional batch effects in a subset of The Cancer Genome Atlas (TCGA) data. The report serves as a warning to all multisite sequencing consortia and calls for re-examining results that depend on detection of germline single nucleotide variants.

Batch effects -- artifactual similarities in data from samples analyzed under similar conditions -- have long been recognized as a challenge for large-scale genomics projects. Over TCGA's 13-year history, researchers have found and filtered out artifacts such as oxidative DNA damage generated during sample preparation, and harmonized protocol components including sequencing instruments and bioinformatics tools. ...