49 research outputs found

    A Study of Human Gene Expression (Tutkimus ihmisen geenien ilmentymisestä)

    Gene expression is one of the most critical factors influencing the phenotype of a cell. As a result of several technological advances, measuring gene expression levels has become one of the most common molecular biological measurements for studying the behaviour of cells. The scientific community has produced an enormous and constantly growing collection of gene expression data from various human cells in both healthy and pathological conditions. However, while each of these studies is informative and enlightening in its own context and research setup, diverging methods and terminologies make it very challenging to integrate existing gene expression data into a more comprehensive view of human transcriptome function. On the other hand, bioinformatic science advances only through data integration and synthesis. The aim of this study was to develop biological and mathematical methods to overcome these challenges, to construct an integrated database of the human transcriptome, and to demonstrate its usage. The methods developed in this study can be divided into two distinct parts. First, the biological and medical annotation of the existing gene expression measurements needed to be encoded using systematic vocabularies. No single existing biomedical ontology or vocabulary was suitable for this purpose, so a new annotation terminology was developed as part of this work. The second part was to develop mathematical methods for correcting the noise and the systematic differences and errors in the data caused by the various array generations. Additionally, there was a need to develop suitable computational methods for sample collection and archiving, unique sample identification, database structures, data retrieval and visualization. Bioinformatic methods were developed to analyze gene expression levels and putative functional associations of human genes using the integrated gene expression data.
A method was also developed to interpret individual gene expression profiles across all the healthy and pathological tissues of the reference database. As a result of this work, 9783 human gene expression samples measured by Affymetrix microarrays were integrated to form a unique human transcriptome resource, GeneSapiens. This makes it possible to analyse the expression levels of 17330 genes across 175 types of healthy and pathological human tissues. Applying this resource to interpret individual gene expression measurements allowed identification of the tissue of origin with 92.0% accuracy among 44 healthy tissue types. A systematic analysis of the transcriptional activity levels of 459 kinase genes was performed across 44 healthy and 55 pathological tissue types, and a genome-wide analysis of kinase gene co-expression networks was carried out. This analysis revealed biologically and medically interesting data on putative kinase gene functions in health and disease. Finally, we developed a method for alignment of gene expression profiles (AGEP) to analyse individual patient samples and pinpoint gene- and pathway-specific changes in the test sample relative to the reference transcriptome database. We also showed how large-scale gene expression data resources can be used to quantitatively characterize changes in the transcriptomic program of differentiating stem cells. Taken together, these studies indicate the power of systematic bioinformatic analyses to infer biological and medical insights from existing published datasets as well as to facilitate the interpretation of new molecular profiling data from individual patients.
Every human cell contains the same number of genes, which together are called the genome. At any given moment, certain genes are active in each cell at a certain intensity. Gene activity is one of the most important factors determining the outward characteristics of cells.
With current technology, measuring gene activity levels from a cell or tissue sample is efficient and relatively accurate, so it is no wonder that genome-wide measurements of gene activity levels are now routine in molecular genetics. Over the years, the international scientific community has produced vast amounts of data on gene activity levels from both healthy and pathological samples. Although each of these studies is informative and enlightening in its own context and research setup, varying methods and terminologies make it considerably harder to compare and combine these existing studies in order to form broader theories and models. The aim of this study was to create methods to overcome these challenges and to enable the study of gene activity levels in all human cells by making use of the vast body of existing data. The methods developed in the study can be broadly divided into biological and mathematical methods. First, a biologically and medically meaningful unified terminology was created for describing the samples in the collected data precisely. In practice, for each study it was determined exactly what kind of sample had been examined, and the sample was described as precisely as possible. Second, mathematical methods were developed to make gene activity levels measured with different methods comparable. In addition, computational solutions were developed for collecting and archiving samples, along with database solutions for accessing and storing sample data. Finally, bioinformatic methods were developed for applying this harmonized database and visualizing the results. The work resulted in GeneSapiens, the world's largest database of the activity levels of 17330 human genes in 9783 samples taken from 175 different tissues.
We showed how the database can be used, for example, to identify the tissue of origin of an unknown sample with over 90% accuracy. Using the database, we carried out the most comprehensive study to date of the activity levels of human kinase genes. Kinases are a gene family central to cellular signaling and are also a target of active drug development. Finally, we developed a method by which an individual patient's molecular genetic profile can be compared with reference data collected from thousands of other patients to produce a personalized molecular-level diagnosis. The method also has other applications, for example in studying changes in the gene expression program of differentiating stem cells. This study demonstrates the power of bioinformatic methods and systematic analysis to produce new knowledge for building a broader overall picture and theories of how the human genome functions. The methods developed in the study also open doors to more personalized treatment, given state-of-the-art molecular genetics methods and the extensive reference data compiled as part of this research.

    Classification of unknown primary tumors with a data-driven method based on a large microarray reference database

    We present a new method to analyze cancer of unknown primary origin (CUP) samples. Our method achieves good classification accuracy (88% in leave-one-out cross-validation for primary tumors from 56 categories, 78% for CUP samples), and can also be used to study CUP samples on a gene-by-gene basis. Unlike many previous methods, it is not tied to any a priori defined gene set, and it is adaptable to new information as it emerges.

    Activation of Oncogenic and Immune-Response Pathways Is Linked to Disease-Specific Survival in Merkel Cell Carcinoma

    Background: Merkel cell carcinoma (MCC) is a rare but highly aggressive neuroendocrine carcinoma of the skin with a poor prognosis. Improving the prognosis of MCC by means of targeted therapies requires further understanding of the mechanisms that drive tumor progression. In this study, we aimed to identify the genes, processes, and pathways that play the most crucial roles in determining MCC outcomes. Methods: We investigated transcriptomes generated by RNA sequencing of formalin-fixed paraffin-embedded tissue samples of 102 MCC patients and identified the genes that were upregulated among survivors and in patients who died from MCC. We subsequently cross-referenced these genes with online databases to investigate the functions and pathways they represent. We further investigated differential gene expression based on viral status in patients who died from MCC. Results: We found several novel genes associated with MCC-specific survival. Genes upregulated in patients who died from MCC were most notably associated with angiogenesis and the PI3K-Akt and MAPK pathways; their expression predominantly had no association with viral status in patients who died from MCC. Genes upregulated among survivors were largely associated with antigen presentation and immune response. Conclusion: This outcome-based discrepancy in gene expression suggests that these pathways and processes likely play crucial roles in determining MCC outcomes. Peer reviewed

    Array-based gene expression, CGH and tissue data defines a 12q24 gain in neuroblastic tumors with prognostic implication

    Neuroblastoma has successfully served as a model system for the identification of neuroectoderm-derived oncogenes. However, in spite of various efforts, only a few clinically useful prognostic markers have been found. Here, we present a framework that integrates DNA, RNA and tissue data to identify and prioritize genetic events that represent clinically relevant new therapeutic targets and prognostic biomarkers for neuroblastoma. Peer reviewed

    Integrative functional genomics analysis of sustained polyploidy phenotypes in breast cancer cells identifies an oncogenic profile for GINS2

    Aneuploidy is among the most obvious differences between normal and cancer cells. However, the mechanisms contributing to the development and maintenance of aneuploid cell growth are diverse and incompletely understood. Functional genomics analyses have shown that aneuploidy in cancer cells is correlated with diffuse gene expression signatures and that aneuploidy can arise by a variety of mechanisms, including cytokinesis failures, DNA endoreplication and possibly polyploid intermediate states. Here, we used a novel cell spot microarray technique to identify genes with a loss-of-function effect inducing polyploidy and/or allowing maintenance of polyploid cell growth of breast cancer cells. Integrative genomics profiling of candidate genes highlighted GINS2 as a potential oncogene frequently overexpressed in clinical breast cancers as well as in several other cancer types. Multivariate analysis indicated GINS2 to be an independent prognostic factor for breast cancer outcome (p = 0.001). Suppression of GINS2 expression effectively inhibited breast cancer cell growth and induced polyploidy. In addition, protein-level detection of nuclear GINS2 accurately distinguished actively proliferating cancer cells, suggesting potential use as an operational biomarker. Peer reviewed

    Facilitates Chromatin Transcription Complex Is an “Accelerator” of Tumor Transformation and Potential Marker and Target of Aggressive Cancers

    Summary: The facilitates chromatin transcription (FACT) complex is involved in chromatin remodeling during transcription, replication, and DNA repair. FACT was previously considered to be ubiquitously expressed and not associated with any disease. However, we discovered that FACT is the target of a class of anticancer compounds and is not expressed in normal cells of adult mammalian tissues, except for undifferentiated and stem-like cells. Here, we show that FACT expression is strongly associated with poorly differentiated aggressive cancers with low overall survival. In addition, FACT was found to be upregulated during in vitro transformation and to be necessary, but not sufficient, for driving transformation. FACT also promoted survival and growth of established tumor cells. Genome-wide mapping of chromatin-bound FACT indicated that FACT’s role in cancer most likely involves selective chromatin remodeling of genes that stimulate proliferation, inhibit cell death and differentiation, and regulate cellular stress responses.