9 research outputs found

    Identification of microRNA-mRNA modules using microarray data

    Abstract. Background: MicroRNAs (miRNAs) are post-transcriptional regulators of mRNA expression and are involved in numerous cellular processes. Consequently, miRNAs are an important component of gene regulatory networks and an improved understanding of miRNAs will further our knowledge of these networks. There is a many-to-many relationship between miRNAs and mRNAs because a single miRNA targets multiple mRNAs and a single mRNA is targeted by multiple miRNAs. However, most of the current methods for the identification of regulatory miRNAs and their target mRNAs ignore this biological observation and focus on miRNA-mRNA pairs. Results: We propose a two-step method for the identification of many-to-many relationships between miRNAs and mRNAs. In the first step, we obtain miRNA and mRNA clusters using a combination of miRNA-target mRNA prediction algorithms and microarray expression data. In the second step, we determine the associations between miRNA clusters and mRNA clusters based on changes in miRNA and mRNA expression profiles. We consider the miRNA-mRNA clusters with statistically significant associations to be potentially regulatory and, therefore, of biological interest. Conclusions: Our method reduces the interactions between several hundred miRNAs and several thousand mRNAs to a few miRNA-mRNA groups, thereby facilitating a more meaningful biological analysis and a more targeted experimental validation.
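The second step described above (associating miRNA clusters with mRNA clusters) can be sketched on toy data. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say which association measure or significance test is used, so the sketch assumes that each cluster is summarized by its mean expression profile, that association is measured by Pearson correlation, and that a fixed cutoff on strongly negative correlation stands in for a proper significance test. The function names, the cluster labels, and the expression values are all hypothetical.

```python
from math import sqrt


def mean_profile(rows):
    """Average expression per sample across the members of one cluster."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def pearson(x, y):
    """Pearson correlation between two equally long profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def associate(mirna_clusters, mrna_clusters, threshold=-0.8):
    """Return (miRNA cluster, mRNA cluster, r) for anti-correlated cluster pairs.

    Assumption: a correlation at or below `threshold` is treated as an
    association; a real analysis would use a significance test instead.
    """
    hits = []
    for mi_name, mi_rows in mirna_clusters.items():
        mi_mean = mean_profile(mi_rows)
        for m_name, m_rows in mrna_clusters.items():
            r = pearson(mi_mean, mean_profile(m_rows))
            if r <= threshold:
                hits.append((mi_name, m_name, r))
    return hits


# Hypothetical toy data: rows are cluster members, columns are 4 samples.
mirna_clusters = {"miR_A": [[1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.0, 4.5]]}
mrna_clusters = {
    "mRNA_1": [[8.0, 6.0, 4.0, 2.0], [7.0, 6.5, 3.5, 2.5]],  # falls as miR_A rises
    "mRNA_2": [[1.0, 3.0, 2.0, 4.0]],                        # rises with miR_A
}
hits = associate(mirna_clusters, mrna_clusters)
```

Working at the level of cluster pairs rather than individual miRNA-mRNA pairs is what keeps the number of candidate associations small, which is the reduction the abstract's conclusion refers to.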

    Modular combinatorial binding among human trans-acting factors reveals direct and indirect factor binding

    Background: The combinatorial binding of trans-acting factors (TFs) to the DNA is critical to the spatial and temporal specificity of gene regulation. For certain regulatory regions, more than one regulatory module (a set of TFs that bind together) is combined to achieve context-specific gene regulation. However, previous approaches are limited to either pairwise TF co-association analysis or the assumption that only one module is used in each regulatory region. Results: We present a new computational approach that models the modular organization of TF combinatorial binding. Our method learns compact and coherent regulatory modules from in vivo binding data using a topic model. We found that the binding of 115 TFs in K562 cells can be organized into 49 interpretable modules. Furthermore, we found that tens of thousands of regulatory regions use multiple modules, a structure that cannot be observed with previous methods based on hard clustering. The modules discovered recapitulate many published protein-protein physical interactions, have consistent functional annotations of chromatin states, and uncover context-specific co-binding such as gene-proximal binding of NFY + FOS + SP and distal binding of NFY + FOS + USF. For certain TFs, the co-binding partners of direct binding (motif present) differ from those of indirect binding (motif absent); the distinct set of co-binding partners can predict whether the TF binds directly or indirectly with up to 95% accuracy. Joint analysis across two cell types reveals both cell-type-specific and shared regulatory modules. Conclusions: Our results provide comprehensive cell-type-specific combinatorial binding maps and suggest a modular organization of combinatorial binding. Keywords: Computational genomics; Transcription factor; Combinatorial binding; Direct and indirect binding; Topic model. Funding: National Institutes of Health (U.S.) (grant 1U01HG007037-01)

    Analysis of Antisense Expression by Whole Genome Tiling Microarrays and siRNAs Suggests Mis-Annotation of Arabidopsis Orphan Protein-Coding Genes

    MicroRNAs (miRNAs) and trans-acting small-interfering RNAs (tasi-RNAs) are small (20-22 nt long) RNAs (smRNAs) generated from hairpin secondary structures or antisense transcripts, respectively, that regulate gene expression by Watson-Crick pairing to a target mRNA and altering expression by mechanisms related to RNA interference. The high sequence homology of plant miRNAs to their targets has been the mainstay of miRNA prediction algorithms, which are limited in their predictive power for other kingdoms because miRNA complementarity is less conserved yet transitive processes (production of antisense smRNAs) are active in eukaryotes. We hypothesize that antisense transcription and associated smRNAs are biomarkers which can be computationally modeled for gene discovery. We explored rice (Oryza sativa) sense and antisense gene expression in publicly available whole genome tiling array transcriptome data and sequenced smRNA libraries (as well as C. elegans) and found evidence of transitivity of MIRNA genes similar to that found in Arabidopsis. Statistical analysis of antisense transcript abundances, presence of antisense ESTs, and association with smRNAs suggests several hundred Arabidopsis 'orphan' hypothetical genes are non-coding RNAs. Consistent with this hypothesis, we found novel Arabidopsis homologues of some MIRNA genes on the antisense strand of previously annotated protein-coding genes. A Support Vector Machine (SVM) was applied using thermodynamic energy of binding plus novel expression features of sense/antisense transcription topology and siRNA abundances to build a prediction model of miRNA targets. The SVM when trained on targets could predict the "ancient" (deeply conserved) class of validated Arabidopsis MIRNA genes with an accuracy of 84%, and 76% for "new" rapidly-evolving MIRNA genes. Antisense and smRNA expression features and computational methods may identify novel MIRNA genes and other non-coding RNAs in plants and potentially other kingdoms, which can provide insight into antisense transcription, miRNA evolution, and post-transcriptional gene regulation.

    Efficient state estimation via inference on a probabilistic graphical model

    This thesis presents a unique and efficient solver for the state estimation (SE) problem in the power grid, based on probabilistic graphical models (PGMs). SE estimates the time-varying voltage magnitude and phase at every bus in a power grid from meter measurements. However, existing SE solvers are notoriously inefficient because they compute a matrix inverse, and they therefore converge slowly to the final state estimates. The proposed PGM-based solver approaches the problem from a different perspective. Instead of computing the matrix inverse directly, it models the power grid as a PGM and assigns potentials to the nodes and edges of the PGM based on the physical constraints of the power grid. The original SE problem is thereby transformed into an equivalent probabilistic inference problem on the PGM, for which two efficient algorithms based on Gaussian belief propagation (GBP) are proposed. The proposed PGM-based solver is shown to produce the same state estimates as existing SE solvers, and experiments demonstrate that it converges much faster than existing solvers.
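The idea of replacing a matrix inverse with message passing can be illustrated on a small linear system. The sketch below is not the thesis's solver. It assumes a generic symmetric, diagonally dominant system A x = b (for instance, the normal equations of a linearized weighted least-squares estimator) and runs textbook scalar Gaussian belief propagation with synchronous updates. The function name `gbp_solve` and the 3x3 example matrix are hypothetical, chosen only for illustration.

```python
def gbp_solve(A, b, iterations=50):
    """Approximate the solution of A x = b by Gaussian belief propagation.

    Assumes A is symmetric with a nonzero diagonal. The means are exact when
    the sparsity graph of A is a tree; on loopy graphs, convergence is only
    guaranteed under extra conditions such as diagonal dominance.
    """
    n = len(b)
    # Node j is a neighbour of node i when the off-diagonal entry A[i][j] is nonzero.
    nbrs = [[j for j in range(n) if j != i and A[i][j] != 0.0] for i in range(n)]
    # Each directed edge (i, j) carries a precision message P and a potential message h.
    P = {(i, j): 0.0 for i in range(n) for j in nbrs[i]}
    h = {(i, j): 0.0 for i in range(n) for j in nbrs[i]}

    for _ in range(iterations):
        new_P, new_h = {}, {}
        for (i, j) in P:
            # Combine node i's own potential with all incoming messages except j's.
            P_cav = A[i][i] + sum(P[(k, i)] for k in nbrs[i] if k != j)
            h_cav = b[i] + sum(h[(k, i)] for k in nbrs[i] if k != j)
            # Marginalize x_i out of the pairwise factor to get the message to j.
            new_P[(i, j)] = -A[j][i] * A[i][j] / P_cav
            new_h[(i, j)] = -A[j][i] * h_cav / P_cav
        P, h = new_P, new_h

    # Marginal mean of each variable: total potential divided by total precision.
    means = []
    for i in range(n):
        P_i = A[i][i] + sum(P[(k, i)] for k in nbrs[i])
        h_i = b[i] + sum(h[(k, i)] for k in nbrs[i])
        means.append(h_i / P_i)
    return means


# Hypothetical 3-variable chain; its graph is a tree, so the means are exact.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = gbp_solve(A, b)
```

Every update touches only a node and its neighbours, so the cost per iteration grows with the number of nonzero entries of A rather than with the cost of a dense inverse. That locality is the general reason message passing is attractive for sparse networks such as power grids.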

    Characterizing the Huntington's disease, Parkinson's disease, and pan-neurodegenerative gene expression signature with RNA sequencing

    Huntington's disease (HD) and Parkinson's disease (PD) are devastating neurodegenerative disorders that are characterized pathologically by degeneration of neurons in the brain and clinically by loss of motor function and cognitive decline in mid to late life. The cause of neuronal degeneration in these diseases is unclear, but both are histologically marked by aggregation of specific proteins in specific brain regions. In HD, fragments of a mutant Huntingtin protein aggregate and cause medium spiny interneurons of the striatum to degenerate. In contrast, PD brains exhibit aggregation of toxic fragments of the alpha synuclein protein throughout the central nervous system and trigger degeneration of dopaminergic neurons in the substantia nigra. Considering the commonalities and differences between these diseases, identifying common biological patterns across HD and PD as well as signatures unique to each may provide significant insight into the molecular mechanisms underlying neurodegeneration as a general process. State-of-the-art high-throughput sequencing technology allows for unbiased, whole genome quantification of RNA molecules within a biological sample that can be used to assess the level of activity, or expression, of thousands of genes simultaneously. In this thesis, I present three studies characterizing the RNA expression profiles of post-mortem HD and PD subjects using high-throughput mRNA sequencing data sets. The first study describes an analysis of differential expression between HD individuals and neurologically normal controls that indicates a widespread increase in immune, neuroinflammatory, and developmental gene expression. The second study expands upon the first study by making methodological improvements and extends the differential expression analysis to include PD subjects, with the goal of comparing and contrasting HD and PD gene expression profiles. This study was designed to identify common mechanisms underlying the neurodegenerative phenotype, transcending those of each unique disease, and has revealed specific biological processes, in particular those related to NFkB inflammation, common to HD and PD. The last study describes a novel methodology for combining mRNA and miRNA expression that seeks to identify associations between mRNA-miRNA modules and continuous clinical variables of interest, including CAG repeat length and clinical age of onset in HD.

    Characterization of miRNAs involved in the response to infection by bacteria of the genus Xanthomonas in rice and cassava

    Abstract. MicroRNAs (miRNAs) are small RNA molecules that control gene expression in eukaryotes by silencing complementary mRNAs (targets). This work addresses their role in interactions with bacteria of the genus Xanthomonas in rice and cassava. miRNAs were identified and quantified through bioinformatic analysis of small RNA (sRNA) libraries, and the expression of their possible targets was also studied. In cassava, the miRNA-mediated response is characterized mainly by the induction of miRNAs that repress auxin signaling and by the repression of miRNAs involved in the control of resistance genes. In rice, by contrast, miRNAs do not appear to play a crucial role in defense, even though miRNA expression varies in response to different bacterial strains. Possible mechanisms of transcriptional regulation of miRNAs mediated by plant and bacterial transcription factors were also identified. Maestrí