51 research outputs found

    Fungal Virulence and Development Is Regulated by Alternative Pre-mRNA 3′End Processing in Magnaporthe oryzae

    RNA-binding proteins play a central role in post-transcriptional mechanisms that control gene expression. Identification of novel RNA-binding proteins in fungi is essential to unravel post-transcriptional networks and cellular processes that confer identity to the fungal kingdom. Here, we carried out the functional characterisation of the filamentous fungus-specific RNA-binding protein RBP35 required for full virulence and development in the rice blast fungus. RBP35 contains an N-terminal RNA recognition motif (RRM) and six Arg-Gly-Gly tripeptide repeats. Immunoblots identified two RBP35 protein isoforms that show a steady-state nuclear localisation and bind RNA in vitro. RBP35 coimmunoprecipitates in vivo with Cleavage Factor I (CFI) 25 kDa, a highly conserved protein involved in polyA site recognition and cleavage of pre-mRNAs. Several targets of RBP35 have been identified using transcriptomics including 14-3-3 pre-mRNA, an important integrator of environmental signals. In Magnaporthe oryzae, RBP35 is not essential for viability but regulates the length of 3′UTRs of transcripts with developmental and virulence-associated functions. The Δrbp35 mutant is affected in the TOR (target of rapamycin) signaling pathway showing significant changes in nitrogen metabolism and protein secretion. The lack of clear RBP35 orthologues in yeast, plants and animals indicates that RBP35 is a novel auxiliary protein of the polyadenylation machinery of filamentous fungi. Our data demonstrate that RBP35 is the fungal equivalent of metazoan CFI 68 kDa and suggest the existence of 3′end processing mechanisms exclusive to the fungal kingdom

    Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes

    BACKGROUND: A cluster analysis is the most commonly performed procedure (often regarded as a first step) on a set of gene expression profiles. In most cases, a post hoc analysis is done to see if the genes in the same clusters can be functionally correlated. While past successes of such analyses have often been reported in a number of microarray studies (most of which used the standard hierarchical clustering, UPGMA, with one minus the Pearson's correlation coefficient as a measure of dissimilarity), oftentimes such groupings could be misleading. More importantly, a systematic evaluation of the entire set of clusters produced by such unsupervised procedures is necessary since they also contain genes that are seemingly unrelated or may have more than one common function. Here we quantify the performance of a given unsupervised clustering algorithm applied to a given microarray study in terms of its ability to produce biologically meaningful clusters using a reference set of functional classes. Such a reference set may come from prior biological knowledge specific to a microarray study or may be formed using the growing databases of gene ontologies (GO) for the annotated genes of the relevant species. RESULTS: In this paper, we introduce two performance measures for evaluating the results of a clustering algorithm in its ability to produce biologically meaningful clusters. The first measure is a biological homogeneity index (BHI). As the name suggests, it is a measure of how biologically homogeneous the clusters are. This can be used to quantify the performance of a given clustering algorithm such as UPGMA in grouping genes for a particular data set and also for comparing the performance of a number of competing clustering algorithms applied to the same data set. The second performance measure is called a biological stability index (BSI). For a given clustering algorithm and an expression data set, it measures the consistency of the clustering algorithm's ability to produce biologically meaningful clusters when applied repeatedly to similar data sets. A good clustering algorithm should have high BHI and moderate to high BSI. We evaluated the performance of ten well known clustering algorithms on two gene expression data sets and identified the optimal algorithm in each case. The first data set deals with SAGE profiles of differentially expressed tags between normal and ductal carcinoma in situ samples of breast cancer patients. The second data set contains the expression profiles over time of positively expressed genes (ORFs) during sporulation of budding yeast. Two separate choices of the functional classes were used for this data set and the results were compared for consistency. CONCLUSION: Functional information of annotated genes available from various GO databases mined using ontology tools can be used to systematically judge the results of an unsupervised clustering algorithm as applied to a gene expression data set in clustering genes. This information could be used to select the right algorithm from a class of clustering algorithms for the given data set.
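The abstract above describes the BHI only in words, so the following is a minimal sketch of one plausible reading rather than the authors' definition: for each cluster, take the fraction of annotated gene pairs that share at least one functional class, then average over clusters. The function name, the input layout, and the choice to skip clusters with fewer than two annotated genes are all assumptions made for illustration.

```python
from itertools import combinations


def biological_homogeneity_index(clusters, annotations):
    """Illustrative BHI-style score (an assumed reading of the abstract).

    clusters:    dict mapping cluster id -> list of gene ids
    annotations: dict mapping gene id -> set of functional class labels
    """
    scores = []
    for genes in clusters.values():
        # Only genes with at least one functional class can be compared.
        annotated = [g for g in genes if annotations.get(g)]
        pairs = list(combinations(annotated, 2))
        if not pairs:
            continue  # assumption: skip clusters with < 2 annotated genes
        # A pair counts as homogeneous if the two genes share any class.
        shared = sum(1 for a, b in pairs if annotations[a] & annotations[b])
        scores.append(shared / len(pairs))
    # Average the per-cluster fractions; 0.0 if nothing could be scored.
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data, not taken from the paper.
toy_clusters = {"c1": ["g1", "g2", "g3"], "c2": ["g4", "g5"]}
toy_annotations = {
    "g1": {"A"}, "g2": {"A", "B"}, "g3": {"B"},
    "g4": {"C"}, "g5": {"D"},
}
toy_bhi = biological_homogeneity_index(toy_clusters, toy_annotations)
```

Under this reading, a score near 1 means most co-clustered gene pairs share a functional class, which is the sense in which competing clustering algorithms could be compared on the same data set.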

    Genome-Wide Integration on Transcription Factors, Histone Acetylation and Gene Expression Reveals Genes Co-Regulated by Histone Modification Patterns

    N-terminal tails of H2A, H2B, H3 and H4 histone families are subjected to posttranslational modifications that take part in transcriptional regulation mechanisms, such as transcription factor binding and gene expression. Regulation mechanisms under control of histone modification are important but remain largely unclear, despite emerging datasets for comprehensive analysis of histone modification. In this paper, we focus on what we call genetic harmonious units (GHUs), which are co-occurring patterns among transcription factor binding, gene expression and histone modification. We present the first genome-wide approach that captures GHUs by combining ChIP-chip with microarray datasets from Saccharomyces cerevisiae. Our approach employs noise-robust soft clustering to select patterns which share the same preferences in transcription factor-binding, histone modification and gene expression, which are all currently implied to be closely correlated. The detected patterns are a well-studied acetylation of lysine 16 of H4 in glucose depletion as well as co-acetylation of five lysine residues of H3 with H4 Lys12 and H2A Lys7 responsible for ribosome biogenesis. Furthermore, applied to a microarray dataset on ESA1 and its bypass suppressor mutants, our method suggested that recognition of acetylated H4 Lys16 is crucial to the histone acetyltransferase ESA1, whose essential role is still controversial. These results demonstrate that our approach allows us to provide clearer principles behind gene regulation mechanisms under histone modifications and to detect further GHUs when applied to other microarray and ChIP-chip datasets. The source code of our method, which was implemented in MATLAB (http://www.mathworks.com/), is available from the supporting page for this paper: http://www.bic.kyoto-u.ac.jp/pathway/natsume/hm_detector.htm

    Light regulation of metabolic pathways in fungi

    Light represents a major carrier of information in nature. The molecular machineries translating its electromagnetic energy (photons) into the chemical language of cells transmit vital signals for adjustment of virtually every living organism to its habitat. Fungi react to illumination in various ways, and we found that they initiate considerable adaptations in their metabolic pathways upon growth in light or after perception of a light pulse. Alterations in response to light have predominantly been observed in carotenoid metabolism, polysaccharide and carbohydrate metabolism, fatty acid metabolism, nucleotide and nucleoside metabolism, and in regulation of production of secondary metabolites. Transcription of genes is initiated within minutes, abundance and activity of metabolic enzymes are adjusted, and subsequently, levels of metabolites are altered to cope with the harmful effects of light or to prepare for reproduction, which is dependent on light in many cases. This review aims to give an overview on metabolic pathways impacted by light and to illustrate the physiological significance of light for fungi. We provide a basis for assessment whether a given metabolic pathway might be subject to regulation by light and how these properties can be exploited for improvement of biotechnological processes

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research
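One of the baselines named above, Term Frequency–Inverse Document Frequency, can be illustrated with a small sketch that ranks candidate abstracts against a seed article by cosine similarity. This is not the consortium's implementation: whitespace tokenisation, the smoothed IDF formula, and all function names here are assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of text documents."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokeniser
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumed variant) keeps every weight positive.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(seed, candidates):
    """Return (candidate index, score) pairs, most similar first."""
    vectors = tfidf_vectors([seed] + candidates)
    seed_vec, cand_vecs = vectors[0], vectors[1:]
    scored = [(i, cosine(seed_vec, vec)) for i, vec in enumerate(cand_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical texts used only to exercise the functions.
ranking = rank_candidates(
    "rna binding protein fungal virulence",
    ["fungal virulence and rna processing", "clustering gene expression data"],
)
```

Because the abstract reports that the three baselines tend to return distinct sets of articles, a ranking like this would be only one input to the hybrid method the authors suggest.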

    An integrative approach for a network based meta-analysis of viral RNAi screens.

    BACKGROUND: Big data is becoming ubiquitous in biology, and poses significant challenges in data analysis and interpretation. RNAi screening has become a workhorse of functional genomics, and has been applied, for example, to identify host factors involved in infection for a panel of different viruses. However, the analysis of data resulting from such screens is difficult, with often low overlap between hit lists, even when comparing screens targeting the same virus. This makes it a major challenge to select interesting candidates for further detailed, mechanistic experimental characterization. RESULTS: To address this problem we propose an integrative bioinformatics pipeline that allows for a network based meta-analysis of viral high-throughput RNAi screens. Initially, we collate a human protein interaction network from various public repositories, which is then subjected to unsupervised clustering to determine functional modules. Modules that are significantly enriched with host dependency factors (HDFs) and/or host restriction factors (HRFs) are then filtered based on network topology and semantic similarity measures. Modules passing all these criteria are finally interpreted for their biological significance using enrichment analysis, and interesting candidate genes can be selected from the modules. CONCLUSIONS: We apply our approach to seven screens targeting three different viruses, and compare results with other published meta-analyses of viral RNAi screens. We recover key hit genes, and identify additional candidates from the screens. While we demonstrate the application of the approach using viral RNAi data, the method is generally applicable to identify underlying mechanisms from hit lists derived from high-throughput experimental data, and to select a small number of most promising genes for further mechanistic studies. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13015-015-0035-7) contains supplementary material, which is available to authorized users
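The abstract says that network modules are kept when they are significantly enriched with host dependency or restriction factors, but it does not name the statistical test. As a sketch under that caveat, a one-sided hypergeometric test is a common choice for this kind of enrichment; the function and parameter names below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, hits_in_population,
                                module_size, hits_in_module):
    """One-sided hypergeometric p-value P(X >= hits_in_module).

    population:         number of proteins in the interaction network
    hits_in_population: number of screen hits among them
    module_size:        number of proteins in the module under test
    hits_in_module:     number of screen hits observed in the module
    """
    # The module cannot contain more hits than exist or than it has members.
    upper = min(module_size, hits_in_population)
    total = comb(population, module_size)
    tail = sum(
        comb(hits_in_population, k)
        * comb(population - hits_in_population, module_size - k)
        for k in range(hits_in_module, upper + 1)
    )
    return tail / total


# Hypothetical numbers: a 3-protein module made entirely of hits,
# drawn from a 10-protein network containing 3 hits overall.
p_small = hypergeom_enrichment_pvalue(10, 3, 3, 3)
```

In a full pipeline the resulting p-values would normally be corrected for multiple testing across modules before any filtering on network topology or semantic similarity, a step this sketch leaves out.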