The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration
The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is through the annotation of multiple bodies of data using common controlled vocabularies or ‘ontologies’. Unfortunately, the very success of this approach has led to a proliferation of ontologies, which itself creates obstacles to integration. The Open Biomedical Ontologies (OBO) consortium has set in train a strategy to overcome this problem. Existing OBO ontologies, including the Gene Ontology, are undergoing a process of coordinated reform, and new ontologies are being created, on the basis of an evolving set of shared principles governing ontology development. The result is an expanding family of ontologies designed to be interoperable, logically well-formed, and to incorporate accurate representations of biological reality. We describe the OBO Foundry initiative, and provide guidelines for those who might wish to become involved in the future.
Annotation of two large contiguous regions from the Haemonchus contortus genome using RNA-seq and comparative analysis with Caenorhabditis elegans
The genomes of numerous parasitic nematodes are currently being sequenced, but their complexity and size, together with high levels of intra-specific sequence variation and a lack of reference genomes, make their assembly and annotation a challenging task. Haemonchus contortus is an economically significant parasite of livestock that is widely used for basic research as well as for vaccine development and drug discovery. It is one of many medically and economically important parasites within the strongylid nematode group. This group of parasites has the closest phylogenetic relationship with the model organism Caenorhabditis elegans, making comparative analysis a potentially powerful tool for genome annotation and functional studies. To investigate this hypothesis, we sequenced two contiguous fragments from the H. contortus genome and undertook detailed annotation and comparative analysis with C. elegans. The adult H. contortus transcriptome was sequenced using an Illumina platform and RNA-seq was used to annotate a 409 kb overlapping BAC tiling path relating to the X chromosome and a 181 kb BAC insert relating to chromosome I. In total, 40 genes and 12 putative transposable elements were identified. 97.5% of the annotated genes had detectable homologues in C. elegans, of which 60% had putative orthologues, significantly higher than previous analyses based on EST analysis. Gene density appears to be less in H. contortus than in C. elegans, with annotated H. contortus genes being an average of two-to-three times larger than their putative C. elegans orthologues due to a greater intron number and size. Synteny appears high but gene order is generally poorly conserved, although areas of conserved microsynteny are apparent. C. elegans operons appear to be partially conserved in H. contortus. Our findings suggest that a combination of RNA-seq and comparative analysis with C. elegans is a powerful approach for the annotation and analysis of strongylid nematode genomes.
Genomic analyses identify recurrent MEF2D fusions in acute lymphoblastic leukemia
Chromosomal rearrangements are initiating events in acute lymphoblastic leukaemia (ALL). Here using RNA sequencing of 560 ALL cases, we identify rearrangements between MEF2D (myocyte enhancer factor 2D) and five genes (BCL9, CSF1R, DAZAP1, HNRNPUL1 and SS18) in 22 B progenitor ALL (B-ALL) cases with a distinct gene expression profile, the most common of which is MEF2D-BCL9. Examination of an extended cohort of 1,164 B-ALL cases identified 30 cases with MEF2D rearrangements, which include an additional fusion partner, FOXJ2; thus, MEF2D-rearranged cases comprise 5.3% of cases lacking recurring alterations. MEF2D-rearranged ALL is characterized by a distinct immunophenotype, DNA copy number alterations at the rearrangement sites, older diagnosis age and poor outcome. The rearrangements result in enhanced MEF2D transcriptional activity, lymphoid transformation, activation of HDAC9 expression and sensitivity to histone deacetylase inhibitor treatment. Thus, MEF2D-rearranged ALL represents a distinct form of high-risk leukaemia, for which new therapeutic approaches should be considered.
This work was supported in part by the American Lebanese Syrian Associated Charities of St. Jude Children’s Research Hospital; by a Stand Up to Cancer Innovative Research Grant and St. Baldrick’s Foundation Scholar Award (to C.G.M.); by a St. Baldrick’s Consortium Award (S.P.H.), by a Leukemia and Lymphoma Society Specialized Center of Research grant (S.P.H. and C.G.M.), by a Lady Tata Memorial Trust Award (I.I.), by a Leukemia and Lymphoma Society Special Fellow Award and Alex’s Lemonade Stand Foundation Young Investigator Awards (K.R.), by an Alex’s Lemonade Stand Foundation Award (M.L.) and by National Cancer Institute Grants CA21765 (St Jude Cancer Center Support Grant), U01 CA157937 (C.L.W. and S.P.H.), U24 CA114737 (to Dr Gastier-Foster), NCI Contract HHSN261200800001E (to Dr Gastier-Foster), U10 CA180820 (ECOG-ACRIN Operations) and CA180827 (E.P.); U10 CA180861 (C.D.B. and G.M.); U24 CA196171 (The Alliance NCTN Biorepository and Biospecimen Resource); CA145707 (C.L.W. and C.G.M.); and grants to the COG: U10 CA98543 (Chair’s grant and supplement to support the COG ALL TARGET project), U10 CA98413 (Statistical Center) and U24 CA114766 (Specimen Banking). This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Contract Number HHSN261200800001E.
Genomic, Pathway Network, and Immunologic Features Distinguishing Squamous Carcinomas
This integrated, multiplatform PanCancer Atlas study co-mapped and identified distinguishing
molecular features of squamous cell carcinomas (SCCs) from five sites associated with smokin
Pan-Cancer Analysis of lncRNA Regulation Supports Their Targeting of Cancer Genes in Each Tumor Context
Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1, MEG3, and TSIX, synergistically dysregulate cancer pathways in multiple tumor contexts.
Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas
Although the MYC oncogene has been implicated in cancer, a systematic assessment of alterations of MYC, related transcription factors, and co-regulatory proteins, forming the proximal MYC network (PMN), across human cancers is lacking. Using computational approaches, we define genomic and proteomic features associated with MYC and the PMN across the 33 cancers of The Cancer Genome Atlas. Pan-cancer, 28% of all samples had at least one of the MYC paralogs amplified. In contrast, the MYC antagonists MGA and MNT were the most frequently mutated or deleted members, proposing a role as tumor suppressors. MYC alterations were mutually exclusive with PIK3CA, PTEN, APC, or BRAF alterations, suggesting that MYC is a distinct oncogenic driver. Expression analysis revealed MYC-associated pathways in tumor subtypes, such as immune response and growth factor signaling; chromatin, translation, and DNA replication/repair were conserved pan-cancer. This analysis reveals insights into MYC biology and is a reference for biomarkers and therapeutics for cancers with alterations of MYC or the PMN.
Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images
Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characterization of TCGA samples demonstrates one use for the TCGA image archives with insights into the tumor-immune microenvironment.
The genetic landscape of high-risk neuroblastoma
Neuroblastoma is a malignancy of the developing sympathetic nervous system that often presents with widespread metastatic disease, resulting in survival rates of less than 50% [1]. To determine the spectrum of somatic mutation in high-risk neuroblastoma, we studied 240 cases using a combination of whole exome, genome and transcriptome sequencing as part of the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative. Here we report a low median exonic mutation frequency of 0.60 per megabase (0.48 non-silent), and remarkably few recurrently mutated genes in these tumors. Genes with significant somatic mutation frequencies included ALK (9.2% of cases), PTPN11 (2.9%), ATRX (2.5%, an additional 7.1% had focal deletions), MYCN (1.7%, a recurrent p.Pro44Leu alteration), and NRAS (0.83%). Rare, potentially pathogenic germline variants were significantly enriched in ALK, CHEK2, PINK1, and BARD1. The relative paucity of recurrent somatic mutations in neuroblastoma challenges current therapeutic strategies reliant upon frequently altered oncogenic drivers.
Molecular attributes underlying central nervous system and systemic relapse in diffuse large B-cell lymphoma
- …