
    Identifying transcription factors and microRNAs as key regulators of pathways using Bayesian inference on known pathway structures

    Background: Transcription factors and microRNAs act in concert to regulate gene expression in eukaryotes. Numerous computational methods based on sequence information are available for the prediction of target genes of transcription factors and microRNAs. Although these methods provide a static snapshot of how genes may be regulated, they are not effective for the identification of condition-specific regulators. Results: We propose a new method that combines: a) transcription factors and microRNAs that are predicted to target genes in pathways, with b) microarray expression profiles of microRNAs and mRNAs, in conjunction with c) the known structure of molecular pathways. These elements are integrated into a Bayesian network derived from each pathway that, through probability inference, allows for the prediction of the key regulators in the pathway. We demonstrate 1) the steps to discretize the expression data for the computation of conditional probabilities in a Bayesian network, 2) the procedure to construct a Bayesian network using the structure of a known pathway and the transcription factors and microRNAs predicted to target genes in that pathway, and 3) the inference results as potential regulators of three signaling pathways using microarray expression profiles of microRNA and mRNA in estrogen receptor positive and estrogen receptor negative tumors. Conclusions: We displayed the ability of our framework to integrate multiple sets of microRNA and mRNA expression data, from two phenotypes, with curated molecular pathway structures by creating Bayesian networks. Moreover, by performing inference on the network using known evidence, e.g., status of differentially expressed genes, or by entering hypotheses to be tested, we obtain a list of potential regulators of the pathways. This, in turn, will help increase our understanding about the regulatory mechanisms relevant to the two phenotypes
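The first step the abstract names, discretizing expression data so that conditional probabilities can be computed, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: the three states, the thresholds of -1 and 1, the Laplace smoothing, the uniform prior, and the toy expression values.

```python
from collections import Counter

LEVELS = ("down", "same", "up")

def discretize(values, low=-1.0, high=1.0):
    """Map log-ratio expression values to three states (thresholds are hypothetical)."""
    states = []
    for v in values:
        if v <= low:
            states.append("down")
        elif v >= high:
            states.append("up")
        else:
            states.append("same")
    return states

def conditional_table(parent_states, child_states, alpha=1.0):
    """Estimate P(child | parent) from paired samples with Laplace smoothing."""
    counts = Counter(zip(parent_states, child_states))
    table = {}
    for p in LEVELS:
        total = sum(counts[(p, c)] for c in LEVELS) + alpha * len(LEVELS)
        table[p] = {c: (counts[(p, c)] + alpha) / total for c in LEVELS}
    return table

def posterior(table, observed_child):
    """P(parent | child = observed) by Bayes' rule, assuming a uniform prior on the parent."""
    unnorm = {p: table[p][observed_child] for p in LEVELS}
    z = sum(unnorm.values())
    return {p: v / z for p, v in unnorm.items()}

# Toy matched profiles for one regulator (a microRNA) and one target gene.
mirna = [1.5, 1.2, -0.2, -1.4, 0.1, 2.0]
gene = [-1.3, -1.1, 0.3, 1.6, 0.0, -2.2]

cpt = conditional_table(discretize(mirna), discretize(gene))
post = posterior(cpt, "down")  # evidence: the target gene is down-regulated
```

With this toy data, observing a down-regulated target makes "up" the most probable state of the regulator. A full pathway network would chain many such tables along the curated pathway edges.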

    A Local Genetic Algorithm for the Identification of Condition-Specific MicroRNA-Gene Modules

    Transcription factor and microRNA are two types of key regulators of gene expression. Their regulatory mechanisms are highly complex. In this study, we propose a computational method to predict condition-specific regulatory modules that consist of microRNAs, transcription factors, and their commonly regulated genes. We used matched global expression profiles of mRNAs and microRNAs together with the predicted targets of transcription factors and microRNAs to construct an underlying regulatory network. Our method searches for highly scored modules from the network based on a two-step heuristic method that combines genetic and local search algorithms. Using two matched expression datasets, we demonstrate that our method can identify highly scored modules with statistical significance and biological relevance. The identified regulatory modules may provide useful insights on the mechanisms of transcription factors and microRNAs
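A two-step heuristic that combines a genetic algorithm with local search can be sketched as follows. This is only an illustration of the general pattern: the scoring function, the node weights, the node names, and the operator choices (uniform crossover, single-node mutation, hill climbing by toggling one node) are all assumptions, not the method described in the paper.

```python
import random

def score(module, weights):
    """Hypothetical module score: sum of node weights scaled by sqrt of module size."""
    if not module:
        return 0.0
    return sum(weights[n] for n in module) / len(module) ** 0.5

def local_search(module, weights):
    """Hill climbing: toggle single nodes for as long as the score strictly improves."""
    current = set(module)
    improved = True
    while improved:
        improved = False
        for n in weights:
            candidate = current ^ {n}
            if score(candidate, weights) > score(current, weights):
                current, improved = candidate, True
    return current

def genetic_search(weights, pop_size=20, generations=30, seed=0):
    """Step 1: evolve candidate modules. Step 2: refine the best one locally."""
    rng = random.Random(seed)
    nodes = sorted(weights)
    population = [{n for n in nodes if rng.random() < 0.5} for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda m: score(m, weights), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each node is inherited from a randomly chosen parent.
            child = {n for n in nodes if n in (a if rng.random() < 0.5 else b)}
            # Mutation: flip the membership of one random node.
            child ^= {rng.choice(nodes)}
            children.append(child)
        population = parents + children
    best = max(population, key=lambda m: score(m, weights))
    return local_search(best, weights)

# Toy node weights standing in for expression-derived evidence (made-up values).
weights = {"miR-a": 2.0, "TF-b": 1.5, "gene-c": 1.8, "gene-d": -0.7, "gene-e": 0.1}
module = genetic_search(weights)
```

The local search step guarantees that the returned module cannot be improved by adding or removing any single node under the chosen score.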

    In silico phenotyping via co-training for improved phenotype prediction from genotype

    Motivation: Predicting disease phenotypes from genotypes is a key challenge in medical applications in the postgenomic era. Large training datasets of patients that have been both genotyped and phenotyped are the key requisite when aiming for high prediction accuracy. With current genotyping projects producing genetic data for hundreds of thousands of patients, large-scale phenotyping has become the bottleneck in disease phenotype prediction. Results: Here we present an approach for imputing missing disease phenotypes given the genotype of a patient. Our approach is based on co-training, which predicts the phenotype of unlabeled patients based on a second class of information, e.g. clinical health record information. Augmenting training datasets by this type of in silico phenotyping can lead to significant improvements in prediction accuracy. We demonstrate this on a dataset of patients with two diagnostic types of migraine, termed migraine with aura and migraine without aura, from the International Headache Genetics Consortium. Conclusions: Imputing missing disease phenotypes for patients via co-training leads to larger training datasets and improved prediction accuracy in phenotype prediction. Availability and implementation: The code can be obtained at: http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/co-training.html Supplementary information: Supplementary data are available at Bioinformatics online
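The idea of imputing phenotypes from a second view and then enlarging the genotype training set can be sketched in a simplified, one-directional form. This is not the authors' implementation: the 1-D nearest-centroid classifier, the margin threshold, the labels "MA" and "MO", and all numbers are hypothetical, and full co-training would typically iterate between both views.

```python
def fit_centroids(xs, ys):
    """Mean feature value per class (a 1-D nearest-centroid classifier)."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Return (label, margin): the nearest centroid and the distance gap to the runner-up."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[ranked[0]])
    return ranked[0], margin

def impute_phenotypes(clinical_lab, labels, clinical_unlab, min_margin=1.0):
    """Label unlabeled patients from the clinical view, keeping only confident calls."""
    centroids = fit_centroids(clinical_lab, labels)
    imputed = {}
    for i, x in enumerate(clinical_unlab):
        label, margin = predict(centroids, x)
        if margin >= min_margin:
            imputed[i] = label
    return imputed

# Toy data: one clinical score and one genotype score per patient (made-up values).
clinical_lab = [0.0, 1.0, 9.0, 10.0]
geno_lab = [0.2, 0.1, 0.9, 0.8]
labels = ["MO", "MO", "MA", "MA"]   # hypothetical codes: without aura / with aura
clinical_unlab = [0.8, 5.2, 9.1]
geno_unlab = [0.3, 0.5, 0.7]

imputed = impute_phenotypes(clinical_lab, labels, clinical_unlab)

# Augment the genotype training set with the confidently imputed patients only.
aug_geno = geno_lab + [geno_unlab[i] for i in sorted(imputed)]
aug_labels = labels + [imputed[i] for i in sorted(imputed)]
```

The ambiguous patient with clinical score 5.2 is left unlabeled, which reflects the usual design choice of adding only confident predictions to the training set.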

    Machine Learning Successfully Detects Patients with COVID-19 Prior to PCR Results and Predicts Their Survival Based on Standard Laboratory Parameters in an Observational Study

    Introduction: In the current COVID-19 pandemic, clinicians require a manageable set of decisive parameters that can be used to (i) rapidly identify SARS-CoV-2 positive patients, (ii) identify patients with a high risk of a fatal outcome on hospital admission, and (iii) recognize longitudinal warning signs of a possible fatal outcome. Methods: This comparative study was performed in 515 patients in the Maria Skłodowska-Curie Specialty Voivodeship Hospital in Zgierz, Poland. The study groups comprised 314 patients with COVID-like symptoms who tested negative and 201 patients who tested positive for SARS-CoV-2 infection; of the latter, 72 patients with COVID-19 died and 129 were released from hospital. Data on which we trained several machine learning (ML) models included clinical findings on admission and during hospitalization, symptoms, epidemiological risk, and reported comorbidities and medications. Results: We identified a set of eight on-admission parameters: white blood cells, antibody-synthesizing lymphocytes, ratios of basophils/lymphocytes, platelets/neutrophils, and monocytes/lymphocytes, procalcitonin, creatinine, and C-reactive protein. The medical decision tree built using these parameters differentiated between SARS-CoV-2 positive and negative patients with up to 90–100% accuracy. Patients with COVID-19 who on hospital admission were older, had higher procalcitonin, C-reactive protein, and troponin I levels together with lower hemoglobin and platelets/neutrophils ratio were found to be at highest risk of death from COVID-19. Furthermore, we identified longitudinal patterns in C-reactive protein, white blood cells, and D dimer that predicted the disease outcome. Conclusions: Our study provides sets of easily obtainable parameters that allow one to assess the status of a patient with SARS-CoV-2 infection, and the risk of a fatal disease outcome on hospital admission and during the course of the disease
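How a decision tree picks a cut-off on a single laboratory parameter can be shown with a small sketch of one split chosen by Gini impurity. The values below are invented for illustration and are not data from the study; the choice of Gini impurity and of midpoint thresholds is likewise an assumption about a typical tree learner, not a description of the authors' model.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (weighted impurity, threshold) of the best single cut on one parameter."""
    pairs = sorted(zip(values, labels))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= thr]
        right = [y for x, y in pairs if x > thr]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if weighted < best[0]:
            best = (weighted, thr)
    return best

# Made-up values of one on-admission parameter and a binary test result.
crp = [2.0, 4.0, 6.0, 40.0, 55.0, 80.0]
positive = [0, 0, 0, 1, 1, 1]

impurity, threshold = best_split(crp, positive)
```

A full tree repeats this search over all parameters and recurses into each side of the chosen split.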

    Spatial transcriptomics combined with single-cell RNA-sequencing unravels the complex inflammatory cell network in atopic dermatitis

    Background: Atopic dermatitis (AD) is the most common chronic inflammatory skin disease with complex pathogenesis for which the cellular and molecular crosstalk in AD skin has not been fully understood. Methods: Skin tissues examined for spatial gene expression were derived from the upper arm of 6 healthy control (HC) donors and 7 AD patients (lesion and nonlesion). We performed spatial transcriptomics sequencing to characterize the cellular infiltrate in lesional skin. For single‐cell analysis, we analyzed the single‐cell data from suction blister material from AD lesions and HC skin at the antecubital fossa skin (4 ADs and 5 HCs) and full‐thickness skin biopsies (4 ADs and 2 HCs). The multiple proximity extension assays were performed in the serum samples from 36 AD patients and 28 HCs. Results: The single‐cell analysis identified unique clusters of fibroblasts, dendritic cells, and macrophages in the lesional AD skin. Spatial transcriptomics analysis showed the upregulation of COL6A5, COL4A1, TNC, and CCL19 in COL18A1‐expressing fibroblasts in the leukocyte‐infiltrated areas in AD skin. CCR7‐expressing dendritic cells (DCs) showed a similar distribution in the lesions. Additionally, M2 macrophages expressed CCL13 and CCL18 in this area. Ligand–receptor interaction analysis of the spatial transcriptome identified neighboring infiltration and interaction between activated COL18A1‐expressing fibroblasts, CCL13‐ and CCL18‐expressing M2 macrophages, CCR7‐ and LAMP3‐expressing DCs, and T cells. As observed in skin lesions, serum levels of TNC and CCL18 were significantly elevated in AD, and correlated with clinical disease severity. Conclusion: In this study, we show the unknown cellular crosstalk in leukocyte‐infiltrated area in lesional skin. Our findings provide a comprehensive in‐depth knowledge of the nature of AD skin lesions to guide the development of better treatments

    Genome-Wide Progesterone Receptor Binding: Cell Type-Specific and Shared Mechanisms in T47D Breast Cancer Cells and Primary Leiomyoma Cells

    Progesterone, via its nuclear receptor (PR), exerts an overall tumorigenic effect on both uterine fibroid (leiomyoma) and breast cancer tissues, whereas the antiprogestin RU486 inhibits growth of these tissues through an unknown mechanism. Here, we determined the interaction between common or cell-specific genome-wide binding sites of PR and mRNA expression in RU486-treated uterine leiomyoma and breast cancer cells. ChIP-sequencing revealed 31,457 and 7,034 PR-binding sites in breast cancer and uterine leiomyoma cells, respectively; 1,035 sites overlapped in both cell types. Based on the chromatin-PR interaction in both cell types, we statistically refined the consensus progesterone response element to G•ACA• • •TGT•C. We identified two striking differences between uterine leiomyoma and breast cancer cells. First, the cis-regulatory elements for HSF, TEF-1, and C/EBPα and β were statistically enriched at genomic RU486/PR-targets in uterine leiomyoma, whereas E2F, FOXO1, FOXA1, and FOXF sites were preferentially enriched in breast cancer cells. Second, 51.5% of RU486-regulated genes in breast cancer cells but only 6.6% of RU486-regulated genes in uterine leiomyoma cells contained a PR-binding site within 5 kb from their transcription start sites (TSSs), whereas 75.4% of RU486-regulated genes contained a PR-binding site farther than 50 kb from their TSSs in uterine leiomyoma cells. RU486 regulated only seven mRNAs in both cell types. Among these, adipophilin (PLIN2), a pro-differentiation gene, was induced via RU486 and PR via the same regulatory region in both cell types. Our studies have identified molecular components in a RU486/PR-controlled gene network involved in the regulation of cell growth, cell migration, and extracellular matrix function. Tissue-specific and common patterns of genome-wide PR binding and gene regulation may determine the therapeutic effects of antiprogestins in uterine fibroids and breast cancer
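The refined consensus element G•ACA• • •TGT•C can be searched for in a sequence with a regular expression. The sketch below assumes that each • stands for any single nucleotide, which the abstract does not state explicitly; the function name and the example sequence are made up.

```python
import re

# Consensus G.ACA...TGT.C with each "•" read as any nucleotide (an assumption).
# The lookahead lets overlapping sites be reported as well.
PRE_PATTERN = re.compile(r"(?=(G[ACGT]ACA[ACGT]{3}TGT[ACGT]C))")

def find_pre_sites(sequence):
    """Return (start, site) for each consensus match, using 0-based positions."""
    return [(m.start(), m.group(1)) for m in PRE_PATTERN.finditer(sequence.upper())]

sites = find_pre_sites("ttGAACAGGGTGTTCaa")
```

This scans one strand only; scanning the reverse complement as well would be needed for a strand-independent search.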

    Computational Methods to Study Gene Regulation Using Genomic, Epigenomic and Chromosome Conformation Data

    Transcriptional regulation in eukaryotes is the process in which different cells regulate the expression of genes. It is extremely complex and the adequate regulation of genes at precise times is what makes many cellular processes viable. Additionally, errors or disruptions in the transcriptional machinery can often compromise the livelihood of the cell or cause disease. In the past few years, novel genomic techniques have been developed to probe the regulatory mechanisms of genes. These techniques include next-generation sequencing, for example, to determine the exact location of DNA-bound regulatory proteins and sophisticated methylation arrays among others. Here we describe a set of computational methods that approach the process of gene regulation from three different research perspectives. Firstly, we explore the standard view of transcription factors binding directly to DNA to promote or repress the expression of genes. The understanding of transcription regulation is enhanced when considering how microRNAs regulate genes at a post-transcriptional phase. Secondly, we analyze how other epigenetic factors, such as DNA methylation, can affect gene expression. Thirdly, we delve into a more complex scenario within the nucleus of the cell where we consider gene regulation as the product, not only of epigenetics or acting transcription factors, but also of the three-dimensional conformation of chromosomes. The significance of our work is based on the fact that it provides an encompassing view of the complex nature of gene regulation. Because of constant advances in experimental genomics there is a need to develop new analysis methods to cope with the ever increasing volume of biological data that are generated. The deliverables from each of the research aims mentioned above will include, in addition to sound mathematical formulations of how to model the problems, a set of generic (executable) tools from which other researchers can benefit

    Kernel conditional clustering and kernel conditional semi-supervised learning

    The results of clustering are often affected by covariates that are independent of the clusters one would like to discover. Traditionally, alternative clustering algorithms can be used to solve such clustering problems. However, these suffer from at least one of the following problems: (1) Continuous covariates or nonlinearly separable clusters cannot be handled; (2) assumptions are made about the distribution of the data; (3) one or more hyper-parameters need to be set. The presence of covariates also has an effect in a different type of problem such as semi-supervised learning. To the best of our knowledge, there is no existing method addressing the semi-supervised learning setting in the presence of covariates. Here we propose two novel algorithms, named kernel conditional clustering (KCC) and kernel conditional semi-supervised learning (KCSSL), whose objectives are derived from a kernel-based conditional dependence measure. KCC is parameter-light and makes no assumptions about the cluster structure, the covariates, or the distribution of the data, while KCSSL is fully parameter-free. On both simulated and real-world datasets, the proposed KCC and KCSSL algorithms perform better than state-of-the-art methods. The former detects the ground truth cluster structures more accurately, and the latter makes more accurate predictions. ISSN: 0219-1377; ISSN: 0219-311