7 research outputs found
Genotyping, sequencing and analysis of 140,000 adults from Mexico City
The Mexico City Prospective Study is a prospective cohort of more than 150,000 adults recruited two decades ago from the urban districts of Coyoacán and Iztapalapa in Mexico City. Here we generated genotype and exome-sequencing data for all individuals and whole-genome sequencing data for 9,950 selected individuals. We describe high levels of relatedness and substantial heterogeneity in ancestry composition across individuals. Most sequenced individuals had admixed Indigenous American, European and African ancestry, with extensive admixture from Indigenous populations in central, southern and southeastern Mexico. Indigenous Mexican segments of the genome had lower levels of coding variation but an excess of homozygous loss-of-function variants compared with segments of African and European origin. We estimated ancestry-specific allele frequencies at 142 million genomic variants, with an effective sample size of 91,856 for Indigenous Mexican ancestry at exome variants, all available through a public browser. Using whole-genome sequencing, we developed an imputation reference panel that outperforms existing panels at common variants in individuals with high proportions of central, southern and southeastern Indigenous Mexican ancestry. Our work illustrates the value of genetic studies in diverse populations and provides foundational imputation and allele frequency resources for future genetic studies in Mexico and in the United States, where the Hispanic/Latino population is predominantly of Mexican descent.
A Cloud-Based Infrastructure for Cancer Genomics
The advent of new genomic approaches, particularly next generation sequencing (NGS) has resulted in explosive growth of biological data. As the size of biological data keeps growing at exponential rates, new methods for data management and data processing are becoming essential in bioinformatics and computational biology. Indeed, data analysis has now become the central challenge in genomics. NGS has provided rich tools for defining genomic alterations that cause cancer. The processing time and computing requirements have now become a serious bottleneck to the characterization and analysis of these genomic alterations. Moreover, as the adoption of NGS continues to increase, the computing power required often exceeds what any single institution can provide, leading to major restraints in the type and number of analyses that can be performed. Cloud computing represents a potential solution to this problem. On a cloud platform, computing resources can be available on-demand, thus allowing users to implement scalable and highly parallel methods. However, few centralized frameworks exist to allow the average researcher the ability to apply bioinformatics workflows using cloud resources. Moreover, bioinformatics approaches are associated with multiple processing challenges, such as the variability in the methods or data used and the reproducibility requirements of the research analysis. Here, we present CloudConductor, a software system that is specifically designed to harness the power of cloud computing to perform complex analysis pipelines on large biological datasets. CloudConductor was designed with five central features in mind: scalability, modularity, parallelism, reproducibility and platform agnosticism. We demonstrate the processing power afforded by CloudConductor on a real-world genomics problem. Using CloudConductor, we processed and analyzed 101 whole genome tumor-normal paired samples from Burkitt lymphoma subtypes to identify novel genomic alterations. 
We identified a total of 72 driver genes associated with the disease. Somatic events were identified in both coding and non-coding regions of nearly all driver genes, notably in genes IGLL5, BACH2, SIN3A, and DNMT1. We have developed the analysis framework by implementing a graphical user interface, a back-end database system, a data loader and a workflow management system. In this thesis, we develop the concepts and describe an implementation of automated cloud-based infrastructure to analyze genomics data, creating a fast and efficient analysis resource for genomics researchers.
Analysis of rare genetic variation underlying cardiometabolic diseases and traits among 200,000 individuals in the UK Biobank
Cardiometabolic diseases are the leading cause of death worldwide. Despite a known genetic component, our understanding of these diseases remains incomplete. Here, we analyzed the contribution of rare variants to 57 diseases and 26 cardiometabolic traits, using data from 200,337 UK Biobank participants with whole-exome sequencing. We identified 57 gene-based associations, with broad replication of novel signals in Geisinger MyCode. There was a striking risk associated with mutations in known Mendelian disease genes, including MYBPC3, LDLR, GCK, PKD1 and TTN. Many genes showed independent convergence of rare and common variant evidence, including an association between GIGYF1 and type 2 diabetes. We identified several large-effect associations for height and 18 unique genes associated with blood lipid or glucose levels. Finally, we found that between 1.0% and 2.4% of participants carried rare potentially pathogenic variants for cardiometabolic disorders. These findings may facilitate studies aimed at therapeutics and screening of these common disorders.
Whole Exome and Transcriptome Sequencing in 1042 Cases Reveals Distinct Clinically Relevant Genetic Subgroups of Follicular Lymphoma
Follicular Lymphoma (FL) is the most common indolent lymphoma derived from light zone germinal center B cells and characterized by a t(14;18) translocation resulting in upregulation of BCL2 in over 80% of cases. This translocation alone is not sufficient for tumorigenesis, and must be combined with additional genetic mutations to transform B cells. FL is incurable and the disease course can be highly varied, with survival ranging from a few months to decades following diagnosis and treatment with standard chemoimmunotherapy. The heterogeneity of FL poses major challenges to identifying the association between genetic alterations and clinical outcome. Current WHO guidelines recommend establishing grade for each FL case, with grade 3 thought to be more aggressive than grades 1 and 2. The genetic basis and clinical implications of grade in FL are unclear. Recent sequencing studies have identified many genes found to be recurrently mutated in FL, including KMT2D and CREBBP. However, the degree to which genetic alterations cooperate with each other or contribute to clinical outcome is unclear. Based on the observed mutational rates in follicular lymphoma, we estimated 900 cases were needed to comprehensively delineate the genetic alterations that underlie histologic grade and clinical outcome. Accordingly, we enrolled a cohort of 1042 patients with newly diagnosed FL. All treated patients received rituximab-containing standard regimens. To go beyond the identification of gene-coding events, we developed a very large panel of 110 Mbp covering exonic (~40 Mbp) and non-exonic regions (~70 Mbp) of interest to enable a wide range of genomic analyses including mutation calling in both coding and non-coding regions, rearrangement detection, viral identification, and copy number analysis. In addition to the whole exome, we extended coverage to include introns, promoters, and untranslated regions of all known driver genes in cancer. 
We included the entirety of the immunoglobulin loci, T-cell receptor loci and CD3 loci to detect clonotypes and rearrangements. We also included lymphoma-relevant long non-coding RNAs, microRNAs, enhancers, and breakpoint-prone regions. For viral detection, we targeted the genomes of eight cancer-related viruses: Epstein-Barr virus, human papillomavirus, human immunodeficiency virus, hepatitis B, hepatitis C, Kaposi's sarcoma-associated herpesvirus, human T-lymphotropic virus, and Merkel cell polyomavirus. In addition, to enable high resolution identification of copy number variation (CNV) calls, the entire genome was tiled with probes spaced 10kb apart. DNA and RNA were extracted from all tumors and their paired normal samples, prepared into DNA and RNA sequencing libraries and subjected to sequencing on the Illumina platform to a targeted coverage of 150X. Somatic events were identified and further filtered to identify driver events in both coding and non-coding regions. FLs demonstrated a significant degree of genetic heterogeneity with over 100 genes mutated with a frequency of at least 2%. Nearly 100% of FL cases had a mutation in at least one chromatin-modifying gene. The most frequently mutated genes in follicular lymphoma were KMT2D, BCL2, IGLL5 and CREBBP. In addition, we identified frequent mutations in SPEN, BIRC6 and SETD2. To our knowledge, this is the first description of alterations in these genes in FL. Transcriptome analysis indicated a strong correlation between BIRC6 mutations and the previously described immune response 2 signature that is associated with a poor prognosis. We further performed unbiased clustering of genetic alterations in these FL cases. We identified a cluster that was specifically enriched in BCL6 and TP53 alterations and was strongly associated with grade 3 FLs which are predicted to have poorer outcomes with low intensity therapies. We further examined the genetic profiles of 1001 DLBCLs in comparison to this cohort of FLs. 
Our data indicate a continuum of highly overlapping genetic alterations with DLBCL displaying more complex patterns that included alterations in MYC, TP53 and CDKN2A (mainly copy number losses), indicating shared pathogenetic mechanisms underlying FL and DLBCL, particularly those of germinal center B cell origin. Disclosures Koff: Burroughs Wellcome Fund: Research Funding; V Foundation: Research Funding; Lymphoma Research Foundation: Research Funding; American Association for Cancer Research: Research Funding. Leppä: Roche: Honoraria, Research Funding; Takeda: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Bayer: Research Funding; Celgene: Membership on an entity's Board of Directors or advisory committees, Research Funding; Janssen-Cilag: Research Funding; Novartis: Membership on an entity's Board of Directors or advisory committees. Gang: ROCHE: Membership on an entity's Board of Directors or advisory committees; Novartis: Membership on an entity's Board of Directors or advisory committees. Hsi: Abbvie: Research Funding; Eli Lilly: Research Funding; Cleveland Clinic & Abbvie Biotherapeutics Inc: Patents & Royalties: US8,603,477 B2; Jazz: Consultancy. Flowers: AbbVie: Consultancy, Research Funding; Denovo Biopharma: Consultancy; BeiGene: Consultancy, Research Funding; Burroughs Wellcome Fund: Research Funding; Eastern Cooperative Oncology Group: Research Funding; National Cancer Institute: Research Funding; V Foundation: Research Funding; Optimum Rx: Consultancy; Millenium/Takeda: Research Funding; TG Therapeutics: Research Funding; Gilead: Consultancy, Research Funding; Celgene: Consultancy, Research Funding; Karyopharm: Consultancy; AstraZeneca: Consultancy; Pharmacyclics/Janssen: Consultancy, Research Funding; Spectrum: Consultancy; Bayer: Consultancy; Acerta: Research Funding; Genentech, Inc./F. Hoffmann-La Roche Ltd: Consultancy, Research Funding. 
Neff: Enzyvant: Consultancy; EUSA Pharma: Honoraria, Membership on an entity's Board of Directors or advisory committees. Fedoriw: Alexion Pharmaceuticals: Other: Consultant and Speaker. Reddy: Genentech: Research Funding; BMS: Consultancy, Research Funding; Celgene: Consultancy; KITE Pharma: Consultancy; Abbvie: Consultancy. Mason: Sysmex: Honoraria. Behdad: Loxo-Bayer: Membership on an entity's Board of Directors or advisory committees; Thermo Fisher: Membership on an entity's Board of Directors or advisory committees; Pfizer: Other: Speaker. Burton: Bristol-Myers Squibb: Honoraria, Membership on an entity's Board of Directors or advisory committees; Roche: Honoraria, Membership on an entity's Board of Directors or advisory committees, Other: Travel; Celgene: Membership on an entity's Board of Directors or advisory committees; Takeda: Honoraria, Membership on an entity's Board of Directors or advisory committees. Dave: Data Driven Bioscience: Equity Ownership.
NOTCH3 p.Arg1231Cys is markedly enriched in South Asians and associated with stroke
Acknowledgements: Supported by Regeneron Pharmaceuticals, Inc. This research has been conducted using the UK Biobank Resource (project 26041). The authors thank everyone who made this work possible, particularly the UK Biobank team, their funders, the professionals from the member institutions who contributed to and supported this work, and most especially the UK Biobank participants, without whom this research would not be possible. The exome sequencing was funded by the UK Biobank Exome Sequencing Consortium (Bristol Myers Squibb, Regeneron, Biogen, Takeda, Abbvie, Alnylam, AstraZeneca and Pfizer). Ethical approval for the UK Biobank was previously obtained from the North West Center for Research Ethics Committee (11/NW/0382). Disclosure forms provided by the authors are available with the full text of this article.
The genetic factors of stroke in South Asians are largely unexplored. Exome-wide sequencing and association analysis (ExWAS) in 75K Pakistanis identified NM_000435.3(NOTCH3):c.3691C>T, encoding the missense amino acid substitution p.Arg1231Cys, enriched in South Asians (alternate allele frequency = 0.58% compared to 0.019% in Western Europeans), and associated with subcortical hemorrhagic stroke (odds ratio (OR) = 3.39, 95% confidence interval (CI) = [2.26, 5.10], p = 3.87 × 10^-9), and all strokes (OR [CI] = 2.30 [1.77, 3.01], p = 7.79 × 10^-10). NOTCH3 p.Arg1231Cys was strongly associated with white matter hyperintensity on MRI in United Kingdom Biobank (UKB) participants (effect [95% CI] in SD units = 1.1 [0.61, 1.5], p = 3.0 × 10^-6). The variant accounts for approximately 2.0% of hemorrhagic strokes and 1.1% of all strokes in South Asians. These findings highlight the value of diversity in genetic studies and have major implications for genomic medicine and therapeutic development in South Asian populations.
Safety and Outcome of Revascularization Treatment in Patients With Acute Ischemic Stroke and COVID-19: The Global COVID-19 Stroke Registry
BACKGROUND AND OBJECTIVES: COVID-19 related inflammation, endothelial dysfunction and coagulopathy may increase the bleeding risk and lower efficacy of revascularization treatments in patients with acute ischemic stroke. We aimed to evaluate the safety and outcomes of revascularization treatments in patients with acute ischemic stroke and COVID-19. METHODS: Retrospective multicenter cohort study of consecutive patients with acute ischemic stroke receiving intravenous thrombolysis (IVT) and/or endovascular treatment (EVT) between March 2020 and June 2021, tested for SARS-CoV-2 infection. With a doubly-robust model combining propensity score weighting and multivariate regression, we studied the association of COVID-19 with intracranial bleeding complications and clinical outcomes. Subgroup analyses were performed according to treatment groups (IVT-only and EVT). RESULTS: Of a total of 15128 included patients from 105 centers, 853 (5.6%) were diagnosed with COVID-19. 5848 (38.7%) patients received IVT-only, and 9280 (61.3%) EVT (with or without IVT). Patients with COVID-19 had a higher rate of symptomatic intracerebral hemorrhage (SICH) (adjusted odds ratio [OR] 1.53; 95% CI 1.16-2.01), symptomatic subarachnoid hemorrhage (SSAH) (OR 1.80; 95% CI 1.20-2.69), SICH and/or SSAH combined (OR 1.56; 95% CI 1.23-1.99), 24-hour (OR 2.47; 95% CI 1.58-3.86) and 3-month mortality (OR 1.88; 95% CI 1.52-2.33). COVID-19 patients also had an unfavorable shift in the distribution of the modified Rankin score at 3 months (OR 1.42; 95% CI 1.26-1.60). DISCUSSION: Patients with acute ischemic stroke and COVID-19 showed higher rates of intracranial bleeding complications and worse clinical outcomes after revascularization treatments than contemporaneous non-COVID-19 treated patients. 
Currently available data do not allow direct conclusions to be drawn on the effectiveness of revascularization treatments in COVID-19 patients, or to establish different treatment recommendations in this subgroup of patients with ischemic stroke. Our findings can be taken into consideration for treatment decisions, patient monitoring and establishing prognosis.
Safety and Outcome of Revascularization Treatment in Patients With Acute Ischemic Stroke and COVID-19: The Global COVID-19 Stroke Registry.
BACKGROUND AND OBJECTIVES
COVID-19 related inflammation, endothelial dysfunction and coagulopathy may increase the bleeding risk and lower efficacy of revascularization treatments in patients with acute ischemic stroke. We aimed to evaluate the safety and outcomes of revascularization treatments in patients with acute ischemic stroke and COVID-19.
METHODS
Retrospective multicenter cohort study of consecutive patients with acute ischemic stroke receiving intravenous thrombolysis (IVT) and/or endovascular treatment (EVT) between March 2020 and June 2021, tested for SARS-CoV-2 infection. With a doubly-robust model combining propensity score weighting and multivariate regression, we studied the association of COVID-19 with intracranial bleeding complications and clinical outcomes. Subgroup analyses were performed according to treatment groups (IVT-only and EVT).
RESULTS
Of a total of 15128 included patients from 105 centers, 853 (5.6%) were diagnosed with COVID-19. 5848 (38.7%) patients received IVT-only, and 9280 (61.3%) EVT (with or without IVT). Patients with COVID-19 had a higher rate of symptomatic intracerebral hemorrhage (SICH) (adjusted odds ratio [OR] 1.53; 95% CI 1.16-2.01), symptomatic subarachnoid hemorrhage (SSAH) (OR 1.80; 95% CI 1.20-2.69), SICH and/or SSAH combined (OR 1.56; 95% CI 1.23-1.99), 24-hour (OR 2.47; 95% CI 1.58-3.86) and 3-month mortality (OR 1.88; 95% CI 1.52-2.33). COVID-19 patients also had an unfavorable shift in the distribution of the modified Rankin score at 3 months (OR 1.42; 95% CI 1.26-1.60).
DISCUSSION
Patients with acute ischemic stroke and COVID-19 showed higher rates of intracranial bleeding complications and worse clinical outcomes after revascularization treatments than contemporaneous non-COVID-19 treated patients. Currently available data do not allow direct conclusions to be drawn on the effectiveness of revascularization treatments in COVID-19 patients, or to establish different treatment recommendations in this subgroup of patients with ischemic stroke. Our findings can be taken into consideration for treatment decisions, patient monitoring and establishing prognosis.