6,822 research outputs found

    Towards knowledge-based gene expression data mining

    The field of gene expression data analysis has grown in the past few years from being purely data-centric to integrative, aiming at complementing microarray analysis with data and knowledge from diverse available sources. In this review, we report on the plethora of gene expression data mining techniques and focus on their evolution toward knowledge-based data analysis approaches. In particular, we discuss recent developments in gene expression-based analysis methods used in association and classification studies, phenotyping and reverse engineering of gene networks.

    Transcription analysis of apple fruit development using cDNA microarrays

    Knowledge of the molecular mechanisms underlying fruit quality traits is fundamental to devising efficient marker-assisted selection strategies and to improving apple breeding. In this study, cDNA microarray technology was used to identify genes whose expression changes during fruit development and maturation and which are thus potentially involved in fruit quality traits. The expression profile of 1,536 transcripts was analysed by microarray hybridisation. A total of 177 genes were found to be differentially expressed in at least one of the developmental stages considered. Gene ontology annotation was employed to describe gene function unambiguously, while cluster analysis allowed genes to be grouped according to their expression profile. An overview of the transcriptional changes and of the metabolic pathways involved in fruit development was obtained. As expected, August and September were the two months in which the largest number of differentially expressed genes was observed. In particular, 85 genes were found to be up-regulated in September. Even though most of the differentially expressed genes are involved in primary metabolism, several other interesting functions were detected and will be presented.

    Analysis of Gene Expression Data Using BRB-Array Tools

    BRB-ArrayTools is an integrated software system for the comprehensive analysis of DNA microarray experiments. It was developed by professional biostatisticians experienced in the design and analysis of DNA microarray studies and incorporates methods developed by leading statistical laboratories. The software is designed for use by biomedical scientists who wish to have access to state-of-the-art statistical methods for the analysis of gene expression data and to receive training in the statistical analysis of high dimensional data. The software provides the most extensive set of tools available for predictive classifier development and complete cross-validation. It offers extensive links to genomic websites for gene annotation and analysis tools for pathway analysis. An archive of over 100 datasets of published microarray data with associated clinical data is provided and BRB-ArrayTools automatically imports data from the Gene Expression Omnibus public archive at the National Center for Biotechnology Information

    Bioinformatics protocols for analysis of functional genomics data applied to neuropathy microarray datasets

    Microarray technology allows the simultaneous measurement of the abundance of thousands of transcripts in living cells. The high-throughput nature of microarray technology means that automatic analytical procedures are required to handle the sheer amount of data typically generated in a single microarray experiment. Along these lines, this work presents a contribution to the automatic analysis of microarray data by attempting to construct protocols for the validation of publicly available methods for microarray data. At the experimental level, an evaluation of amplification of RNA targets prior to hybridisation with the physical array was undertaken. This had the important consequence of revealing the extent to which the significance of intensity ratios between varying biological conditions may be compromised following amplification, as well as identifying the underlying cause of this effect. On the basis of these findings, recommendations regarding the usability of RNA amplification protocols with microarray screening were drawn in the context of varying microarray experimental conditions. On the data analysis side, this work has had the important outcome of developing an automatic framework for the validation of functional analysis methods for microarray data. This is based on using a GO semantic similarity scoring metric to assess the similarity between functional terms found enriched by functional analysis of a model dataset and those anticipated from prior knowledge of the biological phenomenon under study. Using this validation system, this work has shown, for the first time, that 'Catmap', an early functional analysis method, performs better than the more recent and most popular methods of its kind. Crucially, the effectiveness of this validation system implies that such a system may be reliably adopted for validation of newly developed functional analysis methods for microarray data.
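
The comparison of enriched terms against expected terms can be illustrated with a small sketch. The abstract does not say which GO semantic similarity metric was used, so the code below substitutes a simple stand-in: Jaccard overlap of ancestor sets, combined across two term sets by a best-match average. The ontology, its term names, and all function names are hypothetical and are not real GO identifiers.

```python
# Toy ontology (hypothetical terms, NOT real GO identifiers): child -> parents.
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "sugar_metabolism": ["metabolism"],
    "kinase_signaling": ["signaling"],
}


def ancestors(term):
    """Return the term itself plus all of its ancestors in the toy DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Jaccard overlap of ancestor sets (an assumed stand-in metric)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def set_similarity(found, expected):
    """Best-match average: each term is scored against its closest
    counterpart in the other set, in both directions."""
    def best(src, dst):
        return [max(term_similarity(s, d) for d in dst) for s in src]

    scores = best(found, expected) + best(expected, found)
    return sum(scores) / len(scores)


# Terms a method reports as enriched vs. terms expected from prior knowledge.
found_terms = ["lipid_metabolism"]
expected_terms = ["sugar_metabolism", "kinase_signaling"]
print(set_similarity(found_terms, expected_terms))
```

Under a scheme of this kind, a functional analysis method scores higher when the terms it reports lie close, in the ontology, to those anticipated for the model dataset.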

    Discovering study-specific gene regulatory networks

    This article has been made available through the Brunel Open Access Publishing Fund. Microarrays are commonly used in biology because of their ability to simultaneously measure thousands of genes under different conditions. Because such datasets typically contain a large number of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches have recently been used that combine multiple microarray studies in order to find networks that are more robust. The purpose of this paper, however, is to combine multiple microarray studies to automatically identify subnetworks that are distinctive to specific experimental conditions rather than common to them all. To better understand key regulatory mechanisms and how they change under different conditions, we derive unique networks from multiple independent networks built using glasso, which goes beyond standard correlations. This involves calculating cluster prediction accuracies to detect the most predictive genes for a specific set of conditions. We differentiate between accuracies calculated using cross-validation within a selected cluster of studies (the intra prediction accuracy) and those calculated on a set of independent studies belonging to different study clusters (the inter prediction accuracy). Finally, we compare our method's results to related state-of-the-art techniques. We explore how the proposed pipeline performs on both synthetic data and real data (wheat and Fusarium). Our results show that subnetworks specific to subsets of studies can be identified reliably and that these networks reflect key mechanisms that are fundamental to the experimental conditions in each of those subsets.

    The Inferred Cardiogenic Gene Regulatory Network in the Mammalian Heart

    Cardiac development is a complex, multiscale process encompassing cell fate adoption, differentiation and morphogenesis. To elucidate pathways underlying this process, a recently developed algorithm to reverse engineer gene regulatory networks was applied to time-course microarray data obtained from the developing mouse heart. Approximately 200 genes of interest were input into the algorithm to generate putative network topologies that are capable of explaining the experimental data via model simulation. To cull specious network interactions, thousands of putative networks are merged and filtered to generate scale-free, hierarchical networks that are statistically significant and biologically relevant. The networks are validated with known gene interactions and used to predict regulatory pathways important for the developing mammalian heart. The areas under the precision-recall curve and the receiver operating characteristic curve are 9% and 58%, respectively. Of the top 10 ranked predicted interactions, 4 have already been validated. The algorithm is further tested using a network enriched with known interactions and another depleted of them. The inferred networks contained more interactions for the enriched network than for the depleted network. In all test cases, maximum performance of the algorithm was achieved when the purely data-driven method of network inference was combined with a data-independent, functional-based association method. Lastly, the network generated from the list of approximately 200 genes of interest was expanded using gene-profile uniqueness metrics to include approximately 900 additional known mouse genes and to form the most likely cardiogenic gene regulatory network. The resultant network supports known regulatory interactions and contains several novel cardiogenic regulatory interactions. The method outlined herein provides an informative approach to network inference and leads to clear testable hypotheses related to gene regulation.
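
The two summary figures quoted in the abstract, area under the precision-recall curve and area under the receiver operating characteristic curve, are standard ways of scoring a ranked list of predicted interactions against a set of known ones. The sketch below is a minimal illustration rather than the authors' evaluation code: the scores and labels are made-up toy values, the function names are hypothetical, and average precision is used as one common step-wise estimate of the area under the precision-recall curve.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (known, unknown) interaction pairs that are ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Average precision: the mean of the precision values measured at
    the rank of each known interaction in the score-sorted list."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


# Toy data: confidence score per predicted interaction, and whether that
# interaction appears in a reference set of known interactions (1) or not (0).
edge_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
edge_labels = [1, 0, 1, 0, 0, 1]

print(f"ROC AUC: {roc_auc(edge_scores, edge_labels):.3f}")
print(f"Average precision: {average_precision(edge_scores, edge_labels):.3f}")
```

A ROC area of 0.5 corresponds to random ranking, whereas the baseline for the precision-recall area equals the fraction of candidate interactions that are known, which is typically very small in network inference; the two numbers are therefore not directly comparable.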