Biobanking with genetics shapes precision medicine and global health
McInnes, G., Yee, S. W., Pershad, Y. & Altman, R. B. Genomewide association studies in pharmacogenomics. Clin. Pharmacol. Ther. 110, 637–648 (2021).
Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022). This study reports genome-wide association analyses on common variation and human height in more than five million individuals, which could account for nearly 100% of the estimated common SNP-based heritability.
Tan, V. Y. & Timpson, N. J. The UK Biobank: a shining example of genome-wide association study science with the power to detect the murky complications of real-world epidemiology. Annu. Rev. Genomics Hum. Genet. 23, 569–589 (2022).
Psaty, B. M. et al. Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium: design of prospective meta-analyses of genome-wide association studies from 5 cohorts. Circ. Cardiovasc. Genet. 2, 73–80 (2009).
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
Lazareva, T. E. et al. Biobanking as a tool for genomic research: from allele frequencies to cross-ancestry association studies. J. Pers. Med. 12, 2040 (2022).
Galinsky, K. J. et al. Population structure of UK Biobank and ancient Eurasians reveals adaptation at genes influencing blood pressure. Am. J. Hum. Genet. 99, 1130–1139 (2016).
Privé, F. Using the UK Biobank as a global reference of worldwide populations: application to measuring ancestry diversity from GWAS summary statistics. Bioinformatics 38, 3477–3480 (2022).
Sirugo, G., Williams, S. M. & Tishkoff, S. A. The missing diversity in human genetic studies. Cell 177, 1080 (2019).
Manrai, A. K. et al. Genetic misdiagnoses and the potential for health disparities. N. Engl. J. Med. 375, 655–665 (2016).
Zhou, W. et al. Global Biobank Meta-analysis Initiative: powering genetic discovery across human disease. Cell Genom. 2, 100192 (2022). This paper introduces trans-ancestry genome-wide association analyses that combine data from more than 25 cohorts and biobanks from around the world to perform meta-analyses across approximately 2.2 million individuals for a total of 14 harmonizable disease-relevant end-points.
Manolio, T. A., Goodhand, P. & Ginsburg, G. The International Hundred Thousand Plus Cohort Consortium: integrating large-scale cohorts to address global scientific challenges. Lancet Digit. Health 2, e567–e568 (2020).
All of Us Research Program Investigators et al. The “All of Us” research program. N. Engl. J. Med. 381, 668–676 (2019).
Cronin, R. M. et al. Development of the initial surveys for the All of Us research program. Epidemiology 30, 597–608 (2019).
Mapes, B. M. et al. Diversity and inclusion for the All of Us research program: a scoping review. PLoS ONE 15, e0234962 (2020).
Ramirez, A. H., Gebo, K. A. & Harris, P. A. Progress with the All of Us research program: opening access for researchers. JAMA 325, 2441–2442 (2021).
Ramirez, A. H. et al. The All of Us research program: data quality, utility, and diversity. Patterns 3, 100570 (2022).
Hedden, S. L. et al. The impact of COVID-19 on the All of Us research program. Am. J. Epidemiol. 192, 11–24 (2023). This study reports observations of positive detections of COVID-19 in the general population before what was originally reported to be the first clinically detected case.
Hirata, M. et al. Overview of BioBank Japan follow-up data in 32 diseases. J. Epidemiol. 27, S22–S28 (2017).
Nagai, A. et al. Overview of the BioBank Japan project: study design and profile. J. Epidemiol. 27, S2–S8 (2017).
Hirata, M. et al. Cross-sectional analysis of BioBank Japan clinical data: a large cohort of 200,000 patients with 47 common diseases. J. Epidemiol. 27, S9–S21 (2017).
Roden, D. M. et al. Development of a large-scale de-identified DNA biobank to enable personalized medicine. Clin. Pharmacol. Ther. 84, 362–369 (2008).
Pulley, J. et al. Principles of human subjects protections applied in an opt-out, de-identified biobank. Clin. Transl. Sci. 3, 42–48 (2010).
McGregor, T. L. et al. Inclusion of pediatric samples in an opt-out biorepository linking DNA to de-identified medical records: pediatric BioVU. Clin. Pharmacol. Ther. 93, 204–211 (2013).
Chen, Z. et al. China Kadoorie Biobank of 0.5 million people: survey methods, baseline characteristics and long-term follow-up. Int. J. Epidemiol. 40, 1652–1666 (2011).
Walters, R. G. et al. Genotyping and population characteristics of the China Kadoorie Biobank. Cell Genom. 3, 100361 (2023).
Chen, Z. et al. Cohort profile: the Kadoorie Study of Chronic Disease in China (KSCDC). Int. J. Epidemiol. 34, 1243–1249 (2005).
Leitsalu, L. et al. Linking a population biobank with national health registries—the Estonian experience. J. Pers. Med. 5, 96–106 (2015).
Leitsalu, L. et al. Cohort profile: Estonian Biobank of the Estonian Genome Center, University of Tartu. Int. J. Epidemiol. 44, 1137–1147 (2015).
Kurki, M. I. et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508–518 (2023).
Minton, K. The FinnGen study: disease insights from a ‘bottlenecked’ population. Nat. Rev. Genet. 24, 207 (2023).
Finer, S. et al. Cohort profile: East London Genes & Health (ELGH), a community-based population genomics and health study in British Bangladeshi and British Pakistani people. Int. J. Epidemiol. 49, 20–21i (2020).
Kvale, M. N. et al. Genotyping informatics and quality control for 100,000 subjects in the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. Genetics 200, 1051–1060 (2015).
Boutin, N. T. et al. The evolution of a large biobank at Mass General Brigham. J. Pers. Med. 12, 1323 (2022).
Karlson, E. W., Boutin, N. T., Hoffnagle, A. G. & Allen, N. L. Building the Partners Healthcare Biobank at Partners personalized medicine: informed consent, return of research results, recruitment lessons and operational considerations. J. Pers. Med. 6, 2 (2016).
Boutin, N. T. et al. Implementation of electronic consent at a biobank: an opportunity for precision medicine research. J. Pers. Med. 6, 17 (2016).
Castro, V. M. et al. The Mass General Brigham Biobank portal: an i2b2-based data repository linking disparate and high-dimensional patient data to support multimodal analytics. J. Am. Med. Inf. Assoc. 29, 643–651 (2022).
Zawistowski, M. et al. The Michigan Genomics Initiative: a biobank linking genotypes and electronic clinical records in Michigan Medicine patients. Cell Genom. 3, 100257 (2023).
Gaziano, J. M. et al. Million Veteran Program: a mega-biobank to study genetic influences on health and disease. J. Clin. Epidemiol. 70, 214–223 (2016).
Hunter-Zinck, H. et al. Genotyping array design and data quality control in the Million Veteran Program. Am. J. Hum. Genet. 106, 535–548 (2020).
Dewey, F. E. et al. Distribution and clinical impact of functional variants in 50,726 whole-exome sequences from the DiscovEHR study. Science 354, 6319 (2016).
Al Thani, A. et al. Qatar Biobank cohort study: study design and first results. Am. J. Epidemiol. 188, 1420–1433 (2019).
Al Kuwari, H. et al. The Qatar Biobank: background and methods. BMC Public Health 15, 1208 (2015).
Fthenou, E., Al Thani, A., Al Marri, A. & Afifi, N. Qatar Biobank: a paradigm of translating biobank science into evidence-based health care interventions. Biopreserv. Biobank. 17, 491–493 (2019).
Fthenou, E. et al. Conception, implementation, and integration of heterogenous information technology infrastructures in the Qatar Biobank. Biopreserv. Biobank. 17, 494–505 (2019).
Salman, A. et al. Qatar Biobank milestones in building a successful biobank. Biopreserv. Biobank. 17, 485–486 (2019).
Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Ollier, W., Sprosen, T. & Peakman, T. UK Biobank: from concept to reality. Pharmacogenomics 6, 639–646 (2005).
Peakman, T. C. & Elliott, P. The UK Biobank sample handling and storage validation studies. Int. J. Epidemiol. 37, i2–i6 (2008).
Collins, R. What makes UK Biobank special? Lancet 379, 1173–1174 (2012).
Suzuki, K. et al. Genetic drivers of heterogeneity in type 2 diabetes pathophysiology. Nature 627, 347–357 (2024).
Praveen, K. et al. Population-scale analysis of common and rare genetic variation associated with hearing loss in adults. Commun. Biol. 5, 540 (2022).
Li, B. et al. Frequencies of pharmacogenomic alleles across biogeographic groups in a large-scale biobank. Am. J. Hum. Genet. 110, 1628–1647 (2023).
Jiang, X. et al. Age-dependent topic modeling of comorbidities in UK Biobank identifies disease subtypes with differential genetic risk. Nat. Genet. 55, 1854–1865 (2023).
Stein, M. B. et al. Genome-wide association analyses of post-traumatic stress disorder and its symptom subdomains in the Million Veteran Program. Nat. Genet. 53, 174–184 (2021).
Suh, J. & Ressler, K. J. Common biological mechanisms of alcohol use disorder and post-traumatic stress disorder. Alcohol. Res. 39, 131–145 (2018).
Smith, N. D. L. & Cottler, L. B. The epidemiology of post-traumatic stress disorder and alcohol use disorder. Alcohol. Res. 39, 113–120 (2018).
Abbott, L. et al. Neale lab UKB round 2 GWAS summary statistics. UK Biobank (2018). This paper reports large-scale, automated association analyses performed across a total of 4,236 phenotypes with resulting summary statistics made readily available.
Rasooly, D. et al. Genome-wide association analysis and Mendelian randomization proteomics identify drug targets for heart failure. Nat. Commun. 14, 3826 (2023).
Pietzner, M. et al. Mapping the proteo-genomic convergence of human diseases. Science 374, eabj1541 (2021).
Ginsburg, G. S. & Voora, D. The long and winding road to warfarin pharmacogenetic testing. J. Am. Coll. Cardiol. 55, 2813–2815 (2010).
Turongkaravee, S. et al. A systematic review and meta-analysis of genotype-based and individualized data analysis of SLCO1B1 gene and statin-induced myopathy. Pharmacogenomics J. 21, 296–307 (2021).
Jithesh, P. V. et al. A population study of clinically actionable genetic variation affecting drug response from the Middle East. NPJ Genom. Med. 7, 10 (2022).
Markianos, K. et al. Pharmacogenetic allele variant frequencies: an analysis of the VA’s Million Veteran Program (MVP) as a representation of the diversity in US population. PLoS ONE 18, e0274339 (2023).
Amstutz, U. et al. HLA-A*31:01 and HLA-B*15:02 as genetic markers for carbamazepine hypersensitivity in children. Clin. Pharmacol. Ther. 94, 142–149 (2013).
Mallal, S. et al. Association between presence of HLA-B*5701, HLA-DR7, and HLA-DQ3 and hypersensitivity to HIV-1 reverse-transcriptase inhibitor abacavir. Lancet 359, 727–732 (2002).
Hung, S. I. et al. HLA-B*5801 allele as a genetic marker for severe cutaneous adverse reactions caused by allopurinol. Proc. Natl Acad. Sci. USA 102, 4134–4139 (2005).
Venner, E. et al. The frequency of pathogenic variation in the All of Us cohort reveals ancestry-driven disparities. Commun. Biol. 7, 174 (2024).
Choi, S. W., Mak, T. S. & O’Reilly, P. F. Tutorial: a guide to performing polygenic risk score analyses. Nat. Protoc. 15, 2759–2772 (2020).
Martin, A. R. et al. Human demographic history impacts genetic risk prediction across diverse populations. Am. J. Hum. Genet. 100, 635–649 (2017).
Shams, H. et al. Polygenic risk score association with multiple sclerosis susceptibility and phenotype in Europeans. Brain 146, 645–656 (2023).
Gottesman, O. et al. The Electronic Medical Records and Genomics (eMERGE) network: past, present, and future. Genet. Med. 15, 761–771 (2013).
McCarty, C. A. et al. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med. Genomics 4, 13 (2011).
Lennon, N. J. et al. Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations. Nat. Med. 30, 480–487 (2024). This study develops and validates PRS models for ten clinical end-points in eMERGE and All of Us, respectively.
Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
Sun, B. B. et al. Genetic associations of protein-coding variants in human disease. Nature 603, 95–102 (2022). This study first maps the role of rare genetic variation in human disease using whole-genome sequencing data from the UKBB and then compiles the results into a publicly browsable portal known as GeneBass.
Jurgens, S. J. et al. Analysis of rare genetic variation underlying cardiometabolic diseases and traits among 200,000 individuals in the UK Biobank. Nat. Genet. 54, 240–250 (2022).
Swanson, J. M. The UK Biobank and selection bias. Lancet 380, 110 (2012).
Fry, A. et al. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am. J. Epidemiol. 186, 1026–1034 (2017).
van Alten, S. et al. Reweighting UK Biobank corrects for pervasive selection bias due to volunteering. Int. J. Epidemiol. 53, dyae054 (2024).
Mignogna, G. et al. Patterns of item nonresponse behaviour to survey questionnaires are systematic and associated with genetic loci. Nat. Hum. Behav. 7, 1371–1387 (2023). This study shows that item-level non-response behaviours, such as participants responding "prefer not to answer" (PNA) or "I don't know" (IDK), have measurable and significant SNP-based heritability that may skew GWAS.
Huang, J. Y. Representativeness is not representative: addressing major inferential threats in the UK Biobank and other big data repositories. Epidemiology 32, 189–193 (2021).
Mars, N. et al. Genome-wide risk prediction of common diseases across ancestries in one million people. Cell Genom. 2 (2022).
Marquez-Luna, C. et al. Multiethnic polygenic risk scores improve risk prediction in diverse populations. Genet. Epidemiol. 41, 811–823 (2017).
Duncan, L. et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nat. Commun. 10, 3328 (2019).
Gomez, F., Hirbo, J. & Tishkoff, S. A. Genetic variation and adaptation in Africa: implications for human evolution and disease. Cold Spring Harb. Perspect. Biol. 6, a008524 (2014).
Lu, Z. et al. Multi-ancestry fine-mapping improves precision to identify causal genes in transcriptome-wide association studies. Am. J. Hum. Genet. 109, 1388–1404 (2022).
Sohail, M. et al. Mexican Biobank advances population and medical genomics of diverse ancestries. Nature 622, 775–783 (2023).
James, P. D. et al. The mutational spectrum of type 1 von Willebrand disease: results from a Canadian cohort study. Blood 109, 145–154 (2007).
O’Brien, L. A. et al. Founder von Willebrand factor haplotype associated with type 1 von Willebrand disease. Blood 102, 549–557 (2003).
Goodeve, A. et al. Phenotype and genotype of a cohort of families historically diagnosed with type 1 von Willebrand disease in the European study, Molecular and Clinical Markers for the Diagnosis and Management of Type 1 von Willebrand Disease (MCMDM-1VWD). Blood 109, 112–121 (2007).
Deflaux, N. et al. Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis. Nat. Commun. 14, 5419 (2023).
Isgut, M. et al. Effect of case and control definitions on genome-wide association study (GWAS) findings. Genet. Epidemiol. 47, 394–406 (2023).
Chen, C. Y. et al. Analysis across Taiwan Biobank, Biobank Japan, and UK Biobank identifies hundreds of novel loci for 36 quantitative traits. Cell Genom. 3, 100436 (2023).
Benjamin, I. et al. American Heart Association Cardiovascular Genome–Phenome Study: foundational basis and program. Circulation 131, 100–112 (2015).
Tsao, C. W. & Vasan, R. S. Cohort profile: the Framingham Heart Study (FHS): overview of milestones in cardiovascular epidemiology. Int. J. Epidemiol. 44, 1800–1813 (2015).
Wang, Y. & Wang, J. G. Genome-wide association studies of hypertension and several other cardiovascular diseases. Pulse 6, 169–186 (2019).
Levy, D. et al. Framingham Heart Study 100K Project: genome-wide associations for blood pressure and arterial stiffness. BMC Med. Genet. 8, S3 (2007).
Althoff, K. N. et al. Antibodies to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in All of Us research program participants, 2 January to 18 March 2020. Clin. Infect. Dis. 74, 584–590 (2022).
Helms, J. et al. Neurologic features in severe SARS-CoV-2 infection. N. Engl. J. Med. 382, 2268–2270 (2020).
Douaud, G. et al. SARS-CoV-2 is associated with changes in brain structure in UK Biobank. Nature 604, 697–707 (2022).
