1,122 research outputs found

    MicroRNAs in the stressed heart: Sorting the signal from the noise

    The short noncoding RNAs, known as microRNAs, are of undisputed importance in cellular signaling during differentiation and development, and during adaptive and maladaptive responses of adult tissues, including those that comprise the heart. Cardiac microRNAs are regulated by hemodynamic overload resulting from exercise or hypertension, in the response of surviving myocardium to myocardial infarction, and in response to environmental or systemic disruptions to homeostasis, such as those arising from diabetes. A large body of work has explored microRNA responses in both physiological and pathological contexts but there is still much to learn about their integrated actions on individual mRNAs and signaling pathways. This review will highlight key studies of microRNA regulation in cardiac stress and suggest possible approaches for more precise identification of microRNA targets, with a view to exploiting the resulting data for therapeutic purposes

    Progress and Opportunities of Foundation Models in Bioinformatics

    Bioinformatics has witnessed a paradigm shift with the increasing integration of artificial intelligence (AI), particularly through the adoption of foundation models (FMs). These AI techniques have rapidly advanced, addressing historical challenges in bioinformatics such as the scarcity of annotated data and the presence of data noise. FMs are particularly adept at handling large-scale, unlabeled data, a common scenario in biological contexts due to the time-consuming and costly nature of experimentally determining labeled data. This characteristic has allowed FMs to excel and achieve notable results in various downstream validation tasks, demonstrating their ability to represent diverse biological entities effectively. Undoubtedly, FMs have ushered in a new era in computational biology, especially in the realm of deep learning. The primary goal of this survey is to conduct a systematic investigation and summary of FMs in bioinformatics, tracing their evolution, current research status, and the methodologies employed. Central to our focus is the application of FMs to specific biological problems, aiming to guide the research community in choosing appropriate FMs for their research needs. We delve into the specifics of the problem at hand including sequence analysis, structure prediction, function annotation, and multimodal integration, comparing the structures and advancements against traditional methods. Furthermore, the review analyses challenges and limitations faced by FMs in biology, such as data noise, model explainability, and potential biases. Finally, we outline potential development paths and strategies for FMs in future biological research, setting the stage for continued innovation and application in this rapidly evolving field. This comprehensive review serves not only as an academic resource but also as a roadmap for future explorations and applications of FMs in biology.
    Comment: 27 pages, 3 figures, 2 tables

    FilTar: Using RNA-Seq data to improve microRNA target prediction accuracy in animals

    MOTIVATION: MicroRNA (miRNA) target prediction algorithms do not generally consider biological context and therefore generic target prediction based on seed binding can lead to a high level of false-positive predictions. Here, we present FilTar, a method that incorporates RNA-Seq data to make miRNA target prediction specific to a given cell type or tissue of interest. RESULTS: We demonstrate that FilTar can be used to: (i) provide sample specific 3'-UTR reannotation; extending or truncating default annotations based on RNA-Seq read evidence and (ii) filter putative miRNA target predictions by transcript expression level, thus removing putative interactions where the target transcript is not expressed in the tissue or cell line of interest. We test the method on a variety of miRNA transfection datasets and demonstrate increased accuracy versus generic miRNA target prediction methods. AVAILABILITY AND IMPLEMENTATION: FilTar is freely available and can be downloaded from https://github.com/TBradley27/FilTar. The tool is implemented using the Python and R programming languages, and is supported on GNU/Linux operating systems. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online
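
    The expression filter described in (ii) can be illustrated with a minimal sketch. This is not FilTar's own code: the function name, the TPM cutoff, and the toy identifiers are assumptions chosen only to show the idea of dropping predicted interactions whose target transcript is not expressed in the sample of interest.

```python
from typing import Dict, List, Tuple


def filter_by_expression(
    predictions: List[Tuple[str, str]],
    tpm: Dict[str, float],
    min_tpm: float = 1.0,  # hypothetical cutoff, not a value taken from FilTar
) -> List[Tuple[str, str]]:
    """Keep (miRNA, transcript) pairs whose transcript reaches min_tpm.

    Transcripts absent from the expression table are treated as unexpressed.
    """
    return [(mirna, tx) for mirna, tx in predictions if tpm.get(tx, 0.0) >= min_tpm]


# Toy data (made up): generic seed-based predictions and sample-specific expression.
predictions = [("miR-1", "tx_A"), ("miR-1", "tx_B"), ("miR-21", "tx_C")]
tpm = {"tx_A": 12.5, "tx_B": 0.0}

kept = filter_by_expression(predictions, tpm)
print(kept)
```

    Only the pair involving the expressed transcript survives; in practice the cutoff would have to be tuned to the dataset.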

    Integration of Expressed Sequence Tag Data Flanking Predicted RNA Secondary Structures Facilitates Novel Non-Coding RNA Discovery

    Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, have explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking EST data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation.

    Discrimination between thermodynamic models of cis-regulation using transcription factor occupancy data

    Many studies have identified binding preferences for transcription factors (TFs), but few have yielded predictive models of how combinations of transcription factor binding sites generate specific levels of gene expression. Synthetic promoters have emerged as powerful tools for generating quantitative data to parameterize models of combinatorial cis-regulation. We sought to improve the accuracy of such models by quantifying the occupancy of TFs on synthetic promoters in vivo and incorporating these data into statistical thermodynamic models of cis-regulation. Using chromatin immunoprecipitation-seq, we measured the occupancy of Gcn4 and Cbf1 in synthetic promoter libraries composed of binding sites for Gcn4, Cbf1, Met31/Met32 and Nrg1. We measured the occupancy of these two TFs and the expression levels of all promoters in two growth conditions. Models parameterized using only expression data predicted expression but failed to identify several interactions between TFs. In contrast, models parameterized with occupancy and expression data predicted expression data, and also revealed Gcn4 self-cooperativity and a negative interaction between Gcn4 and Nrg1. Occupancy data also allowed us to distinguish between competing regulatory mechanisms for the factor Gcn4. Our framework for combining occupancy and expression data produces predictive models that better reflect the mechanisms underlying combinatorial cis-regulation of gene expression
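
    As a rough illustration of what a statistical thermodynamic model of occupancy computes, the sketch below enumerates all bound/unbound states of a small promoter and weights them Boltzmann-style. It is a generic textbook formulation, not the authors' model; the function name, the weights `q`, and the interaction terms `omega` are illustrative assumptions.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def site_occupancy(
    q: Sequence[float],
    omega: Dict[Tuple[int, int], float],
) -> List[float]:
    """Return the probability that each binding site is occupied.

    q[i] is the statistical weight of site i being bound (concentration
    times affinity). omega[(i, j)] with i < j multiplies the weight of any
    state in which sites i and j are both bound (>1 favorable interaction,
    <1 unfavorable, missing entries mean no interaction).
    """
    n = len(q)
    partition = 0.0
    bound_weight = [0.0] * n
    for state in product((0, 1), repeat=n):
        weight = 1.0
        for i in range(n):
            if state[i]:
                weight *= q[i]
                for j in range(i + 1, n):
                    if state[j]:
                        weight *= omega.get((i, j), 1.0)
        partition += weight
        for i in range(n):
            if state[i]:
                bound_weight[i] += weight
    return [w / partition for w in bound_weight]


# Two identical sites, first without and then with a cooperative interaction.
independent = site_occupancy([1.0, 1.0], {})
cooperative = site_occupancy([1.0, 1.0], {(0, 1): 10.0})
print(independent, cooperative)
```

    With these toy weights a favorable interaction raises the occupancy of both sites, which is the kind of effect that occupancy measurements can help to constrain when fitting interaction parameters.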

    Functional Identification and Characterization of cis-Regulatory Elements

    Transcription is regulated through interactions between regulatory proteins, such as transcription factors (TFs), and DNA sequence. It is known that TFs act combinatorially in some cases to regulate transcription, but in which situations and to what degree is unclear. I first studied the contribution of TF binding sites to expression in mouse embryonic stem (ES) cells by using synthetic cis-regulatory elements (CREs). The synthetic CREs were comprised of combinations of binding sites for the pluripotency TFs Oct4, Sox2, Klf4, and Esrrb. A statistical thermodynamic model explained 72% of the variation in expression driven by these CREs. The high predictive power of this model depended on five TF interaction parameters, including favorable heterotypic interactions between Oct4 and Sox2, Klf4 and Sox2, and Klf4 and Esrrb. The model also included two unfavorable homotypic interaction parameters. These homotypic parameters help to explain the fact that synthetic CREs with mixtures of binding sites for various TFs drive much higher expression than multiple binding sites for the same TF. I then found that the expression of these synthetic CREs largely changes as ES cells differentiate down the neural lineage. However, CREs with no repeat binding sites drove similar levels of expression, suggesting that heterotypic interactions may be similar in the two conditions. In a separate set of experiments I interrogated the determinants of expression driven by genomic sequences previously segmented into classes based on chromatin features. A set of these sequences was assayed in K562 cells. As expected, we found that Enhancers and Weak Enhancers drove expression over background, while Repressed elements and Enhancers from another cell type did not. Unexpectedly, we found that Weak Enhancers drove higher expression than Enhancers, possibly based on their lower H3K36me3 and H3K27ac, which we found to be weakly associated with lower expression. 
    Using a logistic regression model, we showed that matches to TF binding motifs were best able to predict active sequences, but chromatin features contributed significantly as well. These results demonstrate that interactions between certain combinations of pluripotency TFs, but not all combinations, are important to transcriptional regulation. Furthermore, chromatin modifications can still contribute to predictions of expression even after accounting for binding site motifs. Better understanding of the process of cis-regulation will allow us to predict which sequences can drive expression and how perturbations affect this expression.

    Optimal Use of Conservation and Accessibility Filters in MicroRNA Target Prediction

    It is generally accepted that filtering microRNA (miRNA) target predictions by conservation or by accessibility can reduce the false discovery rate. However, these two strategies are usually not exploited in a combined and flexible manner. Here, we introduce PACCMIT, a flexible method that filters miRNA binding sites by their conservation, accessibility, or both. The improvement in performance obtained with each of these three filters is demonstrated on the prediction of targets for both i) highly and ii) weakly conserved miRNAs, i.e., in two scenarios in which the miRNA-target interactions are subjected to different evolutionary pressures. We show that in the first scenario conservation is a better filter than accessibility (as both sensitivity and precision are higher among the top predictions) and that the combined filter improves performance of PACCMIT even further. In the second scenario, on the other hand, the accessibility filter performs better than both the conservation and combined filters, suggesting that the site conservation is not equally effective in rejecting false positive predictions for all miRNAs. Regarding the quality of the ranking criterion proposed by Robins and Press and used in PACCMIT, it is shown that top ranking interactions correspond to more downregulated proteins than do the lower ranking interactions. Comparison with several other target prediction algorithms shows that the ranking of predictions provided by PACCMIT is at least as good as the ranking generated by other conservation-based methods and considerably better than the energy-based ranking used in other accessibility-based methods

    Activity of microRNAs and transcription factors in Gene Regulatory Networks

    In biological research, diverse high-throughput techniques enable the investigation of whole systems at the molecular level. The development of new methods and algorithms is necessary to analyze and interpret measurements of gene and protein expression and of interactions between genes and proteins. One of the challenges is the integrated analysis of gene expression and the associated regulation mechanisms. The two most important types of regulators, transcription factors (TFs) and microRNAs (miRNAs), often cooperate in complex networks at the transcriptional and post-transcriptional level and, thus, enable a combinatorial and highly complex regulation of cellular processes. For instance, TFs activate and inhibit the expression of other genes including other TFs whereas miRNAs can post-transcriptionally induce the degradation of transcribed RNA and impair the translation of mRNA into proteins. The identification of gene regulatory networks (GRNs) is mandatory in order to understand the underlying control mechanisms. The expression of regulators is itself regulated, i.e. activating or inhibiting regulators in varying conditions and perturbations. Thus, measurements of gene expression following targeted perturbations (knockouts or overexpressions) of these regulators are of particular importance. The prediction of the activity states of the regulators and the prediction of the target genes are first important steps towards the construction of GRNs. This thesis deals with these first bioinformatics steps to construct GRNs. Targets of TFs and miRNAs are determined as comprehensively and accurately as possible. The activity state of regulators is predicted for specific high-throughput data and specific contexts using appropriate statistical approaches. Moreover, (parts of) GRNs are inferred, which lead to explanations of given measurements. The thesis describes new approaches for these tasks together with accompanying evaluations and validations. 
    This immediately defines the three main goals of the current thesis:
    1. The development of a comprehensive database of regulator-target relations. Regulators and targets are retrieved from public repositories, extracted from the literature via text mining and collected into the miRSel database. In addition, relations can be predicted using various published methods. In order to determine the activity states of regulators (see 2.) and to infer GRNs (3.), comprehensive and accurate regulator-target relations are required. It could be shown that text mining enables the reliable extraction of miRNA, gene, and protein names as well as their relations from scientific free texts. Overall, miRSel contains about three times more relations for the model organisms human, mouse, and rat as compared to state-of-the-art databases (e.g. TarBase, one of the currently most used resources for miRNA-target relations).
    2. The prediction of activity states of regulators based on improved target sets. In order to investigate mechanisms of gene regulation, the experimental contexts have to be determined in which the respective regulators become active. A regulator is predicted as active based on appropriate statistical tests applied to the expression values of its set of target genes. For this task various gene set enrichment (GSE) methods have been proposed. Unfortunately, before an actual experiment it is unknown which genes are affected. The missing standard-of-truth so far has prevented the systematic assessment and evaluation of GSE tests. In contrast, the trigger of gene expression changes is of course known for experiments where a particular regulator has been directly perturbed (i.e. by knockout, transfection, or overexpression). Based on such datasets, we have systematically evaluated 12 current GSE tests. In our analysis ANOVA and the Wilcoxon test performed best.
    3. The prediction of regulation cascades. Using gene expression measurements and given regulator-target relations (e.g. from the miRSel database), GRNs are derived. GSE tests are applied to determine TFs and miRNAs that change their activity as a cellular response to an overexpressed miRNA. Gene regulatory networks can be constructed iteratively. Our models show how miRNAs trigger gene expression changes: either directly or indirectly via cascades of miRNA-TF, miRNA-kinase-TF as well as TF-TF relations. In this thesis we focus on measurements which have been obtained after overexpression of miRNAs. Surprisingly, a number of cancer-relevant miRNAs influence a common core of TFs which are involved in processes such as proliferation and apoptosis.
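
    The idea of calling a regulator active from the expression of its target set can be sketched with a rank-based test. The code below is a plain-Python Wilcoxon rank-sum z-score using the normal approximation; it is an illustrative assumption about how such a test might be applied, not the thesis's implementation, and the function name and fold-change values are made up.

```python
import math
from typing import Sequence


def rank_sum_z(targets: Sequence[float], background: Sequence[float]) -> float:
    """Wilcoxon rank-sum z-score of targets versus background.

    Uses average ranks for ties and the normal approximation without a
    tie correction of the variance. A strongly negative value means the
    targets tend to rank lower (e.g. are more down-regulated).
    """
    pooled = sorted(
        [(v, True) for v in targets] + [(v, False) for v in background],
        key=lambda pair: pair[0],
    )
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(targets), len(background)
    rank_sum = sum(r for r, (_, is_target) in zip(ranks, pooled) if is_target)
    mean = n1 * (n + 1) / 2
    sd = math.sqrt(n1 * n2 * (n + 1) / 12)
    return (rank_sum - mean) / sd


# Made-up log fold changes after overexpressing a hypothetical miRNA.
target_lfc = [-1.2, -0.9, -1.5, -0.7]
background_lfc = [0.1, -0.1, 0.3, 0.0, 0.2, -0.2]
z = rank_sum_z(target_lfc, background_lfc)
print(round(z, 3))
```

    A clearly negative z-score for the predicted targets of the overexpressed miRNA would be read as evidence that the regulator is active; for real analyses an established statistics library and multiple-testing correction would be preferable.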