556 research outputs found
Text-derived concept profiles support assessment of DNA microarray data for acute myeloid leukemia and for androgen receptor stimulation
BACKGROUND: High-throughput experiments, such as those with DNA microarrays, typically yield hundreds of genes potentially relevant to the process under study, which makes these experiments difficult to interpret. Here, we propose and evaluate an approach to find functional associations between large numbers of genes and other biomedical concepts from free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in the literature. We assign a weight to each concept in the profile based on a likelihood ratio measure. Gene concept profiles can then be clustered to find related genes and other concepts. RESULTS: The experimental validation was done in two steps. We first applied our method to a controlled test set. After this proved successful, the datasets from two DNA microarray experiments were analyzed in the same way and the results were evaluated by domain experts. The first dataset was a gene-expression profile that characterizes the cancer cells of a group of acute myeloid leukemia patients. For this group of patients the biological background of the cancer cells is largely unknown. Using our methodology we found an association of these cells with monocytes, which agreed with other experimental evidence. The second dataset consisted of differentially expressed genes following androgen receptor stimulation in a prostate cancer cell line. Based on the analysis we put forward a hypothesis about the biological processes induced in the studied cells: secretory lysosomes are involved in the production of prostatic fluid and their development and/or secretion are androgen-regulated processes. CONCLUSION: Our method can be used to analyze DNA microarray datasets based on information explicitly and implicitly available in the literature. We provide a publicly available tool, dubbed Anni, for this purpose.
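The abstract above says each concept in a gene's profile is weighted by a likelihood ratio measure but does not say which one. As an illustration only, the sketch below assumes a log-likelihood ratio (G-squared) statistic on a 2x2 co-occurrence table of documents; the function names and the document-ID sets are hypothetical and are not taken from Anni.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 table of document counts.

    k11: documents mentioning both gene and concept
    k12: documents mentioning the gene but not the concept
    k21: documents mentioning the concept but not the gene
    k22: documents mentioning neither
    """
    def xlogx_sum(*counts):
        # Sum of k * ln(k), treating 0 * ln(0) as 0.
        return sum(k * math.log(k) for k in counts if k > 0)

    n = k11 + k12 + k21 + k22
    cells = xlogx_sum(k11, k12, k21, k22)
    rows = xlogx_sum(k11 + k12, k21 + k22)
    cols = xlogx_sum(k11 + k21, k12 + k22)
    return 2.0 * (cells + xlogx_sum(n) - rows - cols)


def concept_profile(gene_docs, concept_docs_by_name, n_docs):
    """Weight every concept for one gene (assumed weighting scheme).

    gene_docs: set of document IDs mentioning the gene
    concept_docs_by_name: dict mapping concept name -> set of document IDs
    n_docs: total number of documents in the corpus
    """
    profile = {}
    for name, concept_docs in concept_docs_by_name.items():
        k11 = len(gene_docs & concept_docs)
        k12 = len(gene_docs) - k11
        k21 = len(concept_docs) - k11
        k22 = n_docs - k11 - k12 - k21
        profile[name] = log_likelihood_ratio(k11, k12, k21, k22)
    return profile
```

A statistic of zero means the gene and the concept co-occur exactly as often as chance predicts; larger values indicate a stronger departure from independence. Note that G-squared alone does not distinguish over-representation from under-representation, so a real weighting scheme would likely also check the direction of the association.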
Discovery and Characterization of Recurrent Gene Fusions in Prostate Cancer.
Recurrent chromosomal rearrangements have been well characterized in hematologic and mesenchymal malignancies, but not in common carcinomas. A novel bioinformatics algorithm termed Cancer Outlier Profile Analysis (COPA) was developed to analyze DNA microarray data for genes markedly over-expressed (“outliers”) in a subset of cases. COPA identified the ETS family members ERG and ETV1 as high-ranking outliers in multiple prostate cancer profiling studies. In cases with outlier expression of ERG or ETV1, recurrent gene fusions of the 5’ untranslated region of the prostate-specific, androgen-induced gene TMPRSS2 to the respective ETS family member were identified. In vitro studies in cancer cell lines demonstrated that androgen-responsive promoter elements of TMPRSS2 mediate the aberrant ETS family member over-expression. Subsequent interrogation of all ETS family members in prostate cancer profiling studies identified outlier expression of ETV4 in two of 98 cases. In one such case, ETV4 over-expression was confirmed and a fusion of the TMPRSS2 and ETV4 loci was identified. A large scale profiling and integrated molecular concepts analysis demonstrated that ETS rearrangement-positive and -negative tumors have distinct transcriptional programs, with loss at 6q21 as a possible defining genetic event in ETS negative prostate cancers.
While TMPRSS2:ERG fusions are predominant, fewer TMPRSS2:ETV1 cases were identified than would be expected based on the frequency of ETV1 outlier expression. Through characterizing additional ETV1 outlier cases, novel 5’ fusion partners defining distinct functional classes of ETS gene rearrangements were identified. These include fusions involving androgen-stimulated, androgen-repressed and androgen-insensitive 5’ partners. As the commonality of ETS rearrangements is aberrant over-expression, in vitro and in vivo recapitulation demonstrated that ETV1 or ERG over-expression in benign prostate cells and the mouse prostate confers neoplastic phenotypes.
Together, this work suggests a pathogenetically important role for recurrent chromosomal rearrangements in a common epithelial tumor and has important implications in the molecular diagnosis and treatment of prostate cancer. Deregulation of ETS family member expression through gene fusions appears to be a generalized mechanism for prostate cancer development in the majority of cases. Additionally, other common epithelial tumors may be driven by uncharacterized gene rearrangements.
Ph.D., Molecular & Cellular Pathology, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/57601/2/tomlinss_1.pd
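The COPA idea described in the abstract above (finding genes markedly over-expressed in only a subset of cases) can be sketched as follows. This is a minimal illustration, assuming the commonly described form of the transform: center each gene on its median, scale by its median absolute deviation (MAD), then score the gene by a high percentile of the transformed values. The percentile choice, the nearest-rank percentile rule, and all names here are assumptions, not the dissertation's code.

```python
from statistics import median


def copa_transform(values):
    """Median-center one gene's expression values and scale by its MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; gene cannot be scaled")
    return [(v - med) / mad for v in values]


def percentile_nearest_rank(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = (q * len(ordered) + 99) // 100  # integer ceil(q * n / 100)
    return ordered[rank - 1]


def copa_score(values, q=90):
    """Outlier score for one gene: a high percentile of transformed values."""
    return percentile_nearest_rank(copa_transform(values), q)


def rank_outlier_genes(expression_by_gene, q=90):
    """Return gene names sorted by descending outlier score."""
    scores = {g: copa_score(v, q) for g, v in expression_by_gene.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Because the median and MAD are barely affected by a few extreme samples, a gene that is strongly over-expressed in only two of ten cases still receives a large score, whereas a mean and standard deviation would be inflated by those same samples and mute the signal.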
Notch-1 and IGF-1 as Survivin Regulatory Pathways in Cancer: A Dissertation
The 21st century brought about a dramatic increase in knowledge about genetic and molecular profiles of cancer. This information has validated the complexity of tumor cells and increased awareness of “nodal proteins”, but has yet to advance the development of rational targeted cancer therapeutics. Nodal proteins are critical cellular proteins that collect biological inputs and distribute the information across diverse biological processes. Survivin acts as a nodal protein by interfacing the multiple signals involved in mitosis and apoptosis and functionally integrating proliferation, cell death, and cellular homeostasis. By characterizing survivin as a target of both Type 1 Insulin-like Growth Factor (IGF-1) and Notch developmental signaling, we contribute to the paradigm of survivin as a nodal protein. The two signaling systems, Notch and IGF-1, regulate survivin by two independent mechanisms. Notch activation induces survivin transcription preferentially in basal breast cancer, a breast cancer subtype with a poor prognosis and a lack of molecular therapies. Activated Notch binds the transcription factor RBP-Jκ and drives transcription from the survivin promoter. Notch-mediated survivin expression increases cell cycle kinetics, promoting tumor proliferation. Inhibition of Notch in a breast xenograft model reduced tumor growth and systemic metastasis. On the other hand, IGF-1 signaling drives survivin protein translation in prostate cancer cells. Binding of IGF-1 to its receptor activates the downstream kinases mammalian target of rapamycin (mTOR) and p70 S6 protein kinase (p70S6K), which modulate survivin mRNA translation to increase the apoptotic threshold. The multiple roles of survivin in tumorigenesis implicate survivin as a rational target for the “next generation” of cancer therapeutics.
A Systems Biology Approach to Epigenetic Gene Regulation
The ability to control when, and how much of, the genetic code is expressed is the underlying principle behind gene regulation. Control of gene expression influences a cell's phenotype by determining which structural components underlying the cell's observable traits (shape, growth, and behavior) are made. In multicellular organisms, different cell types arise from the same genetic code because of differences in the patterns of genes being expressed. Essentially every step in the process of gene expression, from transcription, RNA processing, and translation to post-translational modification of the protein, is subject to regulation. As transcription is the first step in gene expression, it is the first level of regulation for influencing the cell phenotype. Transcription factors, histone modifiers, and other proteins work together to influence RNA polymerase's ability to complete the process of transcription. Transcription factors influence transcription by controlling the recruitment of RNA polymerase to the start of a protein-coding region, and histone modifiers can rearrange the histones of the chromatin, causing entire regions of a chromosome to become exposed or sequestered. These transcriptional regulators work in a combinatorial fashion with one another to activate and/or repress wide repertoires of transcriptional targets. Mapping out a network of interactions between these transcriptional regulators in gene expression programs allows researchers to understand how each protein influences the phenotype of the cell, and how mutations to any of these transcriptional regulators can drive the cell into a diseased state. In the case of cancer, changes in the mechanisms of gene regulation brought on by mutations to these transcriptional regulators may drive the cell's hyperproliferative state.
With the creation of next-generation sequencing, researchers are now better able to define where regulation is taking place in the genome and how much it influences gene expression. This gives researchers the ability to build these gene regulatory networks and evaluate their impact on gene expression. The subsequent chapters of this dissertation reflect my published work investigating the contribution of oncogenic processes to gene regulatory networks in cancer through the study of a hyperactivating somatic mutation of a histone modifier, changes in transcription factor response element specificity, epigenetic regulation of transcription factor signaling, and a transcription factor coactivation network.
Discovering information from an integrated graph database
The information explosion in science has become a different problem: not the sheer amount of data per se, but the multiplicity and heterogeneity of massive sets of data sources. Relations mined from these heterogeneous sources, namely texts, database records, and ontologies, have been mapped to Resource Description Framework (RDF) triples in an integrated database. The subject and object resources are expressed as references to concepts in a biomedical ontology consisting of the Unified Medical Language System (UMLS), UniProt and EntrezGene, and the predicate resource as a reference to a predicate thesaurus. All RDF triples have been stored in a graph database, including provenance. For evaluation we used a formal PRISMA literature study identifying 61 cerebrospinal fluid biomarkers and 200 blood biomarkers for migraine. These biomarker sets could be retrieved with weighted mean average precision values of 0.32 and 0.59, respectively, and can be used as a first reference for further refinements.
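The evaluation above reports weighted mean average precision, but the abstract does not define the weighting. As a hedged illustration of the underlying measure only, the sketch below computes plain (unweighted) average precision for one ranked retrieval list against a reference set; the function name and the example identifiers are hypothetical.

```python
def average_precision(ranked_items, relevant_items):
    """Average precision of a ranked list against a set of relevant items.

    Precision is sampled at every rank where a relevant item appears,
    and the samples are averaged over the total number of relevant items,
    so relevant items that are never retrieved count as zero.
    """
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


if __name__ == "__main__":
    # Hypothetical ranked biomarker candidates and a two-item reference set.
    ranked = ["marker_a", "other_x", "marker_b", "other_y"]
    print(round(average_precision(ranked, {"marker_a", "marker_b"}), 4))
```

Mean average precision is then the mean of this value over several queries; a weighted variant would multiply each query's value by a weight before averaging, with the weights chosen by whoever runs the evaluation.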
Functional Cohesion of Gene Sets Determined by Latent Semantic Indexing of PubMed Abstracts
High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature
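The abstract above states that gene-to-gene similarities feed a Fisher's exact test to give a literature cohesion p-value, without spelling out how the contingency table is built. The sketch below shows one plausible reading, clearly an assumption: count gene pairs whose similarity exceeds a threshold inside the gene set and in a background, then apply a one-sided Fisher's exact test. The threshold, the table layout, and the function names are hypothetical and are not taken from GCAT.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Returns the probability of observing a value >= a in the top-left cell
    under the hypergeometric null with all margins fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    tail = sum(
        comb(row1, x) * comb(row2, col1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return tail / total


def cohesion_p_value(set_similarities, background_similarities, threshold):
    """Assumed cohesion test: are high-similarity pairs enriched in the set?

    set_similarities: similarity scores for gene pairs within the gene set
    background_similarities: scores for background gene pairs
    threshold: score above which a pair counts as literature-related
    """
    a = sum(s > threshold for s in set_similarities)
    b = len(set_similarities) - a
    c = sum(s > threshold for s in background_similarities)
    d = len(background_similarities) - c
    return fisher_exact_greater(a, b, c, d)
```

With this layout, a small p-value means the gene set contains more strongly related pairs than the background would predict, which is the intuition behind calling a set functionally cohesive.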