40 research outputs found

    Combining Support Vector Machines to Predict Novel Angiogenesis Genes

    Cancer is one of the most common and dangerous diseases today, causing 13% of all deaths worldwide each year. Despite years of effort, no effective treatment for the disease has yet been found. It is known, however, that angiogenesis plays an important role in cancer development: the tumor causes the blood vessels surrounding it to branch and grow. A better understanding of this process could potentially enable new and more effective treatments. Experiments carried out over the years have measured the expression of most human genes in more than 5000 conditions. In addition, our collaborators have compiled a list of 341 genes associated with blood vessel formation. The aim of this work is to investigate how new angiogenesis-related genes can be predicted from gene expression data and a small set of known angiogenesis genes. To this end, we first compare several existing machine learning methods and publicly available bioinformatics tools that could be used to predict candidate genes. All of these methods are given input data that are as similar as possible, and 10-fold cross-validation is then used to measure how well they recover already known angiogenesis genes. The second part of the work proposes a novel Comb-SVM method for predicting candidate genes. Its core idea rests on three steps. First, known angiogenesis genes and randomly selected negative genes are used to train several Support Vector Machine (SVM) based classifiers in parallel. Next, these classifiers are used to predict new angiogenesis genes. Finally, the results of all classifiers are aggregated into a single prediction. The work concludes by showing that, based on 10-fold cross-validation, Comb-SVM is more accurate than most existing methods. In addition, Comb-SVM predictions are shown to be considerably more stable under small changes in the training data than the results of the second-best algorithm. Finally, the scientific literature and the Gene Ontology database are used to confirm that the newly predicted genes are indeed associated with angiogenesis.
    Angiogenesis is the process of growing new blood vessels. It is part of normal bodily functions like wound healing, but it also plays an important role in cancer development. Without angiogenesis, tumors would not be able to grow larger than 1-2 millimeters in diameter due to the lack of oxygen and nutrients. However, only some of the genes involved in angiogenesis are known. In this work, we propose Comb-SVM, a new machine learning method for predicting new members of the positive class that does not require clearly defined negative examples. The idea is to train multiple Support Vector Machines (SVMs) using known genes as positive samples and various randomly selected sets of genes as negative examples. The multiple SVMs are then used to separately classify all remaining human genes, and the results are finally aggregated using a rank aggregation algorithm. The outcome is a list of genes ranked according to their similarity to the known input genes. We applied this method to 341 known angiogenesis genes. Experiments were conducted on a large Affymetrix microarray gene expression matrix consisting of 5732 experiments and 22283 probe sets obtained from ArrayExpress. We compared Comb-SVM to many other state-of-the-art approaches. According to cross-validation experiments, our method outperformed most of the existing methods when looking at areas under Receiver Operating Characteristic and Precision-Recall curves. We also determined that our method gave significantly more stable results than the second-best approach. Finally, we verified the biological relevance of the predicted genes by searching the literature and Gene Ontology.
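The three steps described in the abstract (train several SVMs against random negative sets, score the unlabeled genes with each, aggregate the rankings) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear SVM without a bias term, trained by plain subgradient descent, and mean rank as the aggregation rule, because the abstract does not name the rank aggregation algorithm. It also scores every unlabeled gene, including those sampled as negatives, as a simplification. The function names, parameters and two-feature toy "expression profiles" are all hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.05, steps=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss.
    No bias term: only the ordering of the scores is used downstream."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            if yi * sum(wj * xj for wj, xj in zip(w, xi)) < 1:  # margin violated
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def comb_svm_rank(positives, unlabeled, n_models=10, seed=0):
    """Rank unlabeled genes by mean rank over SVMs trained on random negatives."""
    rng = random.Random(seed)
    names = sorted(unlabeled)
    rank_sum = {g: 0 for g in names}
    pos_X = list(positives.values())
    for _ in range(n_models):
        # Step 1: known positives vs. a fresh random "negative" sample.
        neg_names = rng.sample(names, len(pos_X))
        X = pos_X + [unlabeled[g] for g in neg_names]
        y = [1] * len(pos_X) + [-1] * len(neg_names)
        w = train_linear_svm(X, y)
        # Step 2: score every unlabeled gene with this classifier.
        scores = {g: sum(wj * xj for wj, xj in zip(w, unlabeled[g])) for g in names}
        ordered = sorted(names, key=lambda g: -scores[g])
        for rank, g in enumerate(ordered, start=1):
            rank_sum[g] += rank
    # Step 3: aggregate by mean rank (assumed aggregation rule).
    return sorted(names, key=lambda g: rank_sum[g] / n_models)

# Toy data: five known positives, twenty background genes, two planted positives.
known = {f"pos{i}": [2.0 + 0.1 * i, 2.0 - 0.1 * i] for i in range(5)}
unlabeled = {f"bg{i:02d}": [-2.0 + 0.05 * (i % 5), -2.0 - 0.05 * (i % 3)]
             for i in range(20)}
unlabeled["hiddenA"] = [2.1, 1.9]
unlabeled["hiddenB"] = [1.8, 2.2]

ranking = comb_svm_rank(known, unlabeled)
print(ranking[:2])  # the two planted genes are expected at the top
```

Because a random negative set may contain an unknown positive, any single classifier can be misled. Averaging ranks over many random negative sets is what lets a scheme of this kind tolerate that noise.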

    eQTL Catalogue 2023: New datasets, X chromosome QTLs, and improved detection and visualisation of transcript-level QTLs

    The eQTL Catalogue is an open database of uniformly processed human molecular quantitative trait loci (QTLs). We are continuously updating the resource to further increase its utility for interpreting genetic associations with complex traits. Over the past two years, we have increased the number of uniformly processed studies from 21 to 31 and added X chromosome QTLs for 19 compatible studies. We have also implemented Leafcutter to directly identify splice-junction usage QTLs in all RNA sequencing datasets. Finally, to improve the interpretability of transcript-level QTLs, we have developed static QTL coverage plots that visualise the association between the genotype and average RNA sequencing read coverage in the region for all 1.7 million fine-mapped associations. To illustrate the utility of these updates to the eQTL Catalogue, we performed colocalisation analysis between vitamin D levels in the UK Biobank and all molecular QTLs in the eQTL Catalogue. Although most GWAS loci colocalised with both eQTLs and transcript-level QTLs, we found that visual inspection could sometimes be used to distinguish primary splicing QTLs from those that appear to be secondary consequences of large-effect gene expression QTLs. While these visually confirmed primary splicing QTLs explain just 6/53 of the colocalising signals, they are significantly less pleiotropic than eQTLs and identify a prioritised causal gene in 4/6 cases.

    Common genetic variation drives molecular heterogeneity in human iPSCs.

    Technology utilizing human induced pluripotent stem cells (iPS cells) has enormous potential to provide improved cellular models of human disease. However, variable genetic and phenotypic characterization of many existing iPS cell lines limits their potential use for research and therapy. Here we describe the systematic generation, genotyping and phenotyping of 711 iPS cell lines derived from 301 healthy individuals by the Human Induced Pluripotent Stem Cells Initiative. Our study outlines the major sources of genetic and phenotypic variation in iPS cells and establishes their suitability as models of complex human traits and cancer. Through genome-wide profiling we find that 5-46% of the variation in different iPS cell phenotypes, including differentiation capacity and cellular morphology, arises from differences between individuals. Additionally, we assess the phenotypic consequences of genomic copy-number alterations that are repeatedly observed in iPS cells. Finally, we present a comprehensive map of common regulatory variants affecting the transcriptome of human pluripotent cells.

    Loss of IL-10 signaling in macrophages limits bacterial killing driven by prostaglandin E2

    Loss of IL-10 signaling in macrophages (Mφs) leads to inflammatory bowel disease (IBD). Induced pluripotent stem cells (iPSCs) were generated from an infantile-onset IBD patient lacking a functional IL10RB gene. Mφs differentiated from IL10RB−/− iPSCs lacked IL-10RB mRNA expression, were unable to phosphorylate STAT3, and failed to reduce LPS-induced inflammatory cytokines in the presence of exogenous IL-10. IL-10RB−/− Mφs exhibited a striking defect in their ability to kill Salmonella enterica serovar Typhimurium, which could be rescued by experimentally introducing functional copies of the IL10RB gene. Genes involved in synthesis and receptor pathways for eicosanoid prostaglandin E2 (PGE2) were more highly induced in IL-10RB−/− Mφs, and these Mφs produced higher amounts of PGE2 after LPS stimulation compared with controls. Furthermore, pharmacological inhibition of PGE2 synthesis and PGE2 receptor blockade enhanced bacterial killing in Mφs. These results identify a regulatory interaction between IL-10 and PGE2, dysregulation of which may drive aberrant Mφ activation and impaired host defense contributing to IBD pathogenesis.

    Elucidating the transcriptional regulatory network controlling the TPO1 response to benzoic acid in yeast

    Multidrug resistance (MDR) is the simultaneous acquisition of resistance to a wide range of structurally and functionally unrelated cytotoxic chemical compounds, with severe consequences in cancer therapy, agriculture and the food industry. Saccharomyces cerevisiae is a well-established model organism used to study the mechanisms of MDR. In yeast and other related organisms, MDR is often caused by drug-efflux pumps that are able to export a wide range of unrelated chemicals. Tpo1, a drug:H+ antiporter of the major facilitator superfamily, is one such drug-efflux pump. In the current work, our aim was to characterize the transcriptional regulatory network controlling the TPO1 response to benzoic acid. We employed two complementary approaches to achieve this aim. First, we used RT-PCR to measure the transcript levels of TPO1 and five of its known and putative regulators (GCN4, STP1, STP2, PDR1, PDR3) over a time course in the wild type and the respective deletion mutants. We subsequently used this information to construct a logical model of TPO1 regulation. Second, we developed a computational approach that combines data from multiple public sources to predict novel regulators of TPO1, and we verified some of the predictions experimentally using β-galactosidase assays. Our results indicate that under benzoic acid stress, Pdr1/Pdr3 seem to play no role in regulating TPO1; instead, a complex interplay between Gcn4 and Stp1 is responsible for the upregulation of TPO1. Screening for new regulators revealed Hal9 and Ash1, which seem to repress TPO1 expression in control conditions and under benzoic acid stress, respectively. Furthermore, multiple transcription factors previously implicated in pseudohyphal growth also have a small effect on TPO1 expression.

    Chromatin accessibility QTL lead variants in macrophages stimulated with IFNg and Salmonella

    Lead caQTL variants from RASQUAL and FastQTL analyses

    Summary statistics of transcript usage QTLs in naive and stimulated macrophages (part 1)

    Summary statistics of transcript usage QTLs in naive and stimulated macrophages

    Revised transcript annotations for GRCh38 reference genome and Ensembl v87.

    Custom transcript annotations generated using the reviseAnnotations package. Reference genome: GRCh38. Ensembl version: 87. See the GitHub page of reviseAnnotations for more details: https://github.com/kauralasoo/reviseAnnotation