556 research outputs found

    Text-derived concept profiles support assessment of DNA microarray data for acute myeloid leukemia and for androgen receptor stimulation

    BACKGROUND: High-throughput experiments, such as with DNA microarrays, typically result in hundreds of genes potentially relevant to the process under study, rendering the interpretation of these experiments problematic. Here, we propose and evaluate an approach to find functional associations between large numbers of genes and other biomedical concepts from free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in the literature. We assign a weight to each concept in the profile based on a likelihood ratio measure. Gene concept profiles can then be clustered to find related genes and other concepts. RESULTS: The experimental validation was done in two steps. We first applied our method to a controlled test set. After this proved successful, the datasets from two DNA microarray experiments were analyzed in the same way and the results were evaluated by domain experts. The first dataset was a gene-expression profile that characterizes the cancer cells of a group of acute myeloid leukemia patients. For this group of patients the biological background of the cancer cells is largely unknown. Using our methodology we found an association of these cells with monocytes, which agreed with other experimental evidence. The second dataset consisted of differentially expressed genes following androgen receptor stimulation in a prostate cancer cell line. Based on the analysis we put forward a hypothesis about the biological processes induced in the studied cells: secretory lysosomes are involved in the production of prostatic fluid and their development and/or secretion are androgen-regulated processes. CONCLUSION: Our method can be used to analyze DNA microarray datasets based on information explicitly and implicitly available in the literature. We provide a publicly available tool, dubbed Anni, for this purpose.
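
As a rough illustration of comparing weighted concept profiles, the sketch below scores pairs of genes by the cosine similarity of their profiles. This is an assumption for illustration only: the abstract does not say which similarity measure or clustering method is used, and the gene names, concept names, and weights are hypothetical.

```python
from math import sqrt


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two concept profiles (dicts: concept -> weight)."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[c] * profile_b[c] for c in shared)
    norm_a = sqrt(sum(w * w for w in profile_a.values()))
    norm_b = sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as unrelated to everything
    return dot / (norm_a * norm_b)


# Hypothetical profiles; in the described approach the weights would come
# from a likelihood ratio measure computed over the literature.
profiles = {
    "GENE_1": {"monocyte": 0.9, "lysosome": 0.2},
    "GENE_2": {"monocyte": 0.7, "apoptosis": 0.1},
    "GENE_3": {"androgen receptor": 0.8, "lysosome": 0.3},
}

sim_12 = cosine_similarity(profiles["GENE_1"], profiles["GENE_2"])
sim_13 = cosine_similarity(profiles["GENE_1"], profiles["GENE_3"])
print(f"GENE_1 vs GENE_2: {sim_12:.3f}")
print(f"GENE_1 vs GENE_3: {sim_13:.3f}")
```

Genes whose profiles share heavily weighted concepts score close to 1, so a pairwise similarity matrix built this way could feed any standard clustering routine.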

    Discovery and Characterization of Recurrent Gene Fusions in Prostate Cancer.

    Recurrent chromosomal rearrangements have been well characterized in hematologic and mesenchymal malignancies, but not in common carcinomas. A novel bioinformatics algorithm termed Cancer Outlier Profile Analysis (COPA) was developed to analyze DNA microarray data for genes markedly over-expressed (“outliers”) in a subset of cases. COPA identified the ETS family members ERG and ETV1 as high-ranking outliers in multiple prostate cancer profiling studies. In cases with outlier expression of ERG or ETV1, recurrent gene fusions of the 5’ untranslated region of the prostate-specific, androgen-induced gene TMPRSS2 to the respective ETS family member were identified. In vitro studies in cancer cell lines demonstrated that androgen-responsive promoter elements of TMPRSS2 mediate the aberrant ETS family member over-expression. Subsequent interrogation of all ETS family members in prostate cancer profiling studies identified outlier expression of ETV4 in two of 98 cases. In one such case, ETV4 over-expression was confirmed and a fusion of the TMPRSS2 and ETV4 loci was identified. A large scale profiling and integrated molecular concepts analysis demonstrated that ETS rearrangement-positive and -negative tumors have distinct transcriptional programs, with loss at 6q21 as a possible defining genetic event in ETS negative prostate cancers. While TMPRSS2:ERG fusions are predominant, fewer TMPRSS2:ETV1 cases were identified than would be expected based on the frequency of ETV1 outlier expression. Through characterizing additional ETV1 outlier cases, novel 5’ fusion partners defining distinct functional classes of ETS gene rearrangements were identified. These include fusions involving androgen-stimulated, androgen-repressed and androgen-insensitive 5’ partners. 
As the commonality of ETS rearrangements is aberrant over-expression, in vitro and in vivo recapitulation demonstrated that ETV1 or ERG over-expression in benign prostate cells and the mouse prostate confers neoplastic phenotypes. Together, this work suggests a pathogenetically important role for recurrent chromosomal rearrangements in a common epithelial tumor and has important implications in the molecular diagnosis and treatment of prostate cancer. Deregulation of ETS family member expression through gene fusions appears to be a generalized mechanism for prostate cancer development in the majority of cases. Additionally, other common epithelial tumors may be driven by uncharacterized gene rearrangements. Ph.D., Molecular & Cellular Pathology, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/57601/2/tomlinss_1.pd
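
A minimal sketch of an outlier-style ranking in the spirit of COPA follows. The abstract does not spell out the procedure, so the steps used here (median-centering each gene, scaling by the median absolute deviation, and ranking genes by a high percentile of the transformed values) are an assumption based on how COPA is commonly described. The gene names and expression values are hypothetical.

```python
from statistics import median


def copa_transform(values):
    """Median-center one gene's expression values and scale by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; values have no spread to scale by")
    return [(v - med) / mad for v in values]


def percentile(sorted_values, q):
    """Percentile q (0-100) of a sorted list, using linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def copa_rank(expression, q=90):
    """Rank genes by the q-th percentile of their transformed values, highest first."""
    scores = {
        gene: percentile(sorted(copa_transform(values)), q)
        for gene, values in expression.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: GENE_A is strongly over-expressed in two of eight samples,
# GENE_B varies evenly across all samples.
expression = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 6.0, 7.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
}

ranking = copa_rank(expression)
print(ranking)
```

Scaling by the median and MAD rather than the mean and standard deviation keeps a small subset of extreme samples from inflating the scale, which is why a gene over-expressed in only a few cases still stands out.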

    Genomic and epigenomic studies of acute myeloid leukemia with CEBPA abnormalities

    Notch-1 and IGF-1 as Survivin Regulatory Pathways in Cancer: A Dissertation

    The 21st century brought about a dramatic increase in knowledge about genetic and molecular profiles of cancer. This information has validated the complexity of tumor cells and increased awareness of “nodal proteins”, but has yet to advance the development of rational targeted cancer therapeutics. Nodal proteins are critical cellular proteins that collect biological inputs and distribute the information across diverse biological processes. Survivin acts as a nodal protein by interfacing the multiple signals involved in mitosis and apoptosis and functionally integrating proliferation, cell death, and cellular homeostasis. By characterizing survivin as a target of both Type 1 Insulin-like Growth Factor (IGF-1) and Notch developmental signaling, we contribute to the paradigm of survivin as a nodal protein. The two signaling systems, Notch and IGF-1, regulate survivin by two independent mechanisms. Notch activation induces survivin transcription preferentially in basal breast cancer, a breast cancer subtype with poor prognosis and a lack of molecular therapies. Activated Notch binds the transcription factor RBP-Jκ and drives transcription from the survivin promoter. Notch-mediated survivin expression increases cell cycle kinetics, promoting tumor proliferation. Inhibition of Notch in a breast xenograft model reduced tumor growth and systemic metastasis. On the other hand, IGF-1 signaling drives survivin protein translation in prostate cancer cells. Binding of IGF-1 to its receptor activates downstream kinases, mammalian target of rapamycin (mTOR) and p70 S6 protein kinase (p70S6K), which modulate survivin mRNA translation to increase the apoptotic threshold. The multiple roles of survivin in tumorigenesis implicate survivin as a rational target for the “next generation” of cancer therapeutics.

    Discovering information from an integrated graph database

    The information explosion in science has become a different problem: the difficulty is no longer the sheer amount of data per se, but the multiplicity and heterogeneity of massive sets of data sources. Relations mined from these heterogeneous sources, namely texts, database records, and ontologies, have been mapped to Resource Description Framework (RDF) triples in an integrated database. The subject and object resources are expressed as references to concepts in a biomedical ontology consisting of the Unified Medical Language System (UMLS), UniProt, and EntrezGene; the predicate resource is expressed as a reference to a predicate thesaurus. All RDF triples, including their provenance, have been stored in a graph database. For evaluation we used an actual formal PRISMA literature study identifying 61 cerebral spinal fluid biomarkers and 200 blood biomarkers for migraine. These biomarker sets could be retrieved with weighted mean average precision values of 0.32 and 0.59, respectively, and can be used as a first reference for further refinements.
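
For readers unfamiliar with the evaluation metric, the sketch below computes plain average precision for a ranked retrieval list and the unweighted mean over several queries. The abstract reports a weighted mean average precision without stating the weighting scheme, so the unweighted mean here is an assumption, and the biomarker identifiers are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average precision of a ranked list against a set of relevant items."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(queries):
    """Unweighted mean of average precision over (ranked, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Hypothetical retrieval result: two of the three known biomarkers are found.
ranked = ["marker_a", "other_1", "marker_b", "other_2"]
known = {"marker_a", "marker_b", "marker_c"}
print(round(average_precision(ranked, known), 3))
```

Dividing by the number of known relevant items, rather than the number retrieved, penalizes a ranking that misses reference biomarkers entirely.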

    Functional Cohesion of Gene Sets Determined by Latent Semantic Indexing of PubMed Abstracts

    High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature.