46 research outputs found

    Evaluation of reference-based two-color methods for measurement of gene expression ratios using spotted cDNA microarrays

    BACKGROUND: Spotted cDNA microarrays generally employ co-hybridization of fluorescently-labeled RNA targets to produce gene expression ratios for subsequent analysis. Direct comparison of two RNA samples in the same microarray provides the highest level of accuracy; however, due to the number of combinatorial pair-wise comparisons, the direct method is impractical for studies including a large number of individual samples (e.g., tumor classification studies). For such studies, indirect comparisons using a common reference standard have been the preferred method. Here we evaluated the precision and accuracy of reconstructed ratios from three indirect methods relative to ratios obtained from direct hybridizations, herein considered as the gold standard. RESULTS: We performed hybridizations using a fixed amount of Cy3-labeled reference oligonucleotide (RefOligo) against distinct Cy5-labeled targets from prostate, breast and kidney tumor samples. Reconstructed ratios between all tissue pairs were derived from ratios between each tissue sample and RefOligo. Reconstructed ratios were compared to (i) ratios obtained in parallel from direct pair-wise hybridizations of tissue samples, and to (ii) reconstructed ratios derived from hybridization of each tissue against a reference RNA pool (RefPool). To evaluate the effect of the external references, reconstructed ratios were also calculated directly from intensity values of single-channel (One-Color) measurements derived from tissue sample data collected in the RefOligo experiments. We show that the average coefficients of variation of ratios between intra- and inter-slide replicates derived from RefOligo, RefPool and One-Color were similar and 2- to 4-fold higher than those of ratios obtained in direct hybridizations. Correlation coefficients calculated for all three tissue comparisons were also similar. In addition, the performance of all indirect methods in terms of their robustness to identify genes deemed as differentially expressed based on direct hybridizations, as well as false-positive and false-negative rates, was found to be comparable. CONCLUSION: RefOligo produces ratios as precise and accurate as ratios reconstructed from an RNA pool, thus representing a reliable alternative in reference-based hybridization experiments. In addition, One-Color measurements alone can reconstruct expression ratios without loss in precision or accuracy. We conclude that both methods are adequate options in large-scale projects where the amount of a common reference RNA pool is usually restrictive.
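
The reconstruction of tissue-to-tissue ratios from reference-based ratios can be illustrated with a minimal sketch. It assumes ratios are combined on the log2 scale, so that log2(A/B) = log2(A/Ref) - log2(B/Ref); the function names and the sample-based coefficient of variation are illustrative choices, not taken from the paper.

```python
import math


def reconstructed_log_ratio(a_vs_ref, b_vs_ref):
    """Reconstruct log2(A/B) from two reference-based ratios A/Ref and B/Ref.

    Assumption: both ratios were measured against the same reference,
    so the reference term cancels on the log scale.
    """
    return math.log2(a_vs_ref) - math.log2(b_vs_ref)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of replicate values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance) / mean


if __name__ == "__main__":
    # Hypothetical example values: tissue A is 4x the reference, tissue B is 2x.
    print(reconstructed_log_ratio(4.0, 2.0))
    print(coefficient_of_variation([2.0, 4.0]))
```

Comparing the coefficient of variation of reconstructed ratios across replicates with that of direct ratios is one way to express the precision comparison the abstract describes.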

    ProbCD: enrichment analysis accounting for categorization uncertainty

    As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. In spite of efforts to create probabilistic annotations, especially in the Gene Ontology context, or to deal with uncertainty in high-throughput-based datasets, current enrichment methods largely ignore this probabilistic information since they are mainly based on variants of the Fisher Exact Test. We developed an open-source R package to deal with probabilistic categorical data analysis, ProbCD, that does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli Scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at: http://xerad.systemsbiology.net/ProbCD/. We present an analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In particular, concerning the enrichment analysis, ProbCD can accommodate: (i) the stochastic nature of the high-throughput experimental techniques and (ii) probabilistic gene annotation.
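
The idea of an expected contingency table can be sketched as follows. This is a simplified illustration, not ProbCD's implementation: it assumes each gene has a probability of belonging to the category and a probability of being in the selected list, and that the two memberships are independent for a given gene. The function and parameter names are hypothetical.

```python
def expected_contingency_table(p_category, p_selected):
    """Expected 2x2 table from per-gene membership probabilities.

    p_category[i]: probability that gene i is annotated to the category.
    p_selected[i]: probability that gene i is in the selected gene list.
    Assumption: the two memberships are independent for each gene.
    Returns [[in_cat_selected, in_cat_unselected],
             [out_cat_selected, out_cat_unselected]].
    """
    if len(p_category) != len(p_selected):
        raise ValueError("probability vectors must have the same length")
    n11 = n10 = n01 = n00 = 0.0
    for p, q in zip(p_category, p_selected):
        n11 += p * q
        n10 += p * (1.0 - q)
        n01 += (1.0 - p) * q
        n00 += (1.0 - p) * (1.0 - q)
    return [[n11, n10], [n01, n00]]


if __name__ == "__main__":
    # Hypothetical probabilities for four genes.
    table = expected_contingency_table([0.9, 0.2, 0.5, 0.0], [1.0, 0.3, 0.5, 0.1])
    for row in table:
        print(row)
```

With probabilities restricted to 0 and 1 the table reduces to the ordinary count table used by the Fisher Exact Test, which is why the expected table can be read as a generalization of the static one.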

    ProbFAST: Probabilistic Functional Analysis System Tool

    Background: The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual methods of differential expression initially designed for paired analysis. Results: We proposed a web platform named ProbFAST for MDE analysis which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulation study revealed that our method performs better than other approaches and, applying it to public expression data, we demonstrated its flexibility to obtain relevant genes biologically associated with normal and abnormal biological processes. Conclusions: ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off related to functional information during the evolution of a tumor or tissue differentiation. The ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast.

    ProbMetab: an R package for Bayesian probabilistic annotation of LC-MS based metabolomics

    We present ProbMetab, an R package which promotes substantial improvement in automatic probabilistic LC-MS based metabolome annotation. The inference engine core is based on a Bayesian model implemented to: (i) allow diverse sources of experimental data and metadata to be systematically incorporated into the model with alternative ways to calculate the likelihood function; and (ii) allow sensitive selection of biologically meaningful biochemical reaction databases as Dirichlet-categorical prior distribution. Additionally, to ensure result interpretation by systems biologists, we display the annotation in a network where observed mass peaks are connected if their candidate metabolites are substrate/product of known biochemical reactions. This graph can be overlaid with other graph-based analyses, such as partial correlation networks, in a visualization scheme exported to Cytoscape, with web and stand-alone versions. ProbMetab was implemented in a modular fashion to fit together with established upstream (xcms, CAMERA, AStream, mzMatch.R, etc.) and downstream R package tools (GeneNet, RCytoscape, DiffCorr, etc.). ProbMetab, along with extensive documentation and case studies, is freely available under GNU license at: http://labpib.fmrp.usp.br/methods/probmetab/. Comment: Manuscript to be submitted very soon. 7 pages, 3 color figures. There is companion material, the two case studies, which is going to be posted here together with the main text in the next updated version.

    Simcluster: clustering enumeration gene expression data on the simplex space

    Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data is often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
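
To make the simplex constraint concrete, the sketch below computes the centered log-ratio (clr) transform and the Aitchison distance, a standard dissimilarity in compositional data analysis. It is not claimed to reproduce Simcluster's own procedure; the pseudocount used to avoid taking the logarithm of zero counts is an assumption of this example.

```python
import math


def closure(counts, pseudocount=0.5):
    """Turn raw tag counts into a composition that sums to 1.

    Assumption: a small pseudocount is added so zero counts stay positive.
    """
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    return [s / total for s in shifted]


def clr(composition):
    """Centered log-ratio transform of a strictly positive vector."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the clr transforms of x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))


if __name__ == "__main__":
    # Hypothetical tag counts for two libraries of different sequencing depth.
    library_a = closure([10, 0, 90])
    library_b = closure([100, 5, 900])
    print(aitchison_distance(library_a, library_b))
```

Because the clr transform removes the overall scale, two libraries with the same relative abundances but different total counts are at distance zero, a property plain Euclidean distance on raw counts does not have.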

    Markov Chain Ontology Analysis (MCOA)

    Background: Biomedical ontologies have become an increasingly critical lens through which researchers analyze the genomic, clinical and bibliographic data that fuels scientific research. Of particular relevance are methods, such as enrichment analysis, that quantify the importance of ontology classes relative to a collection of domain data. Current analytical techniques, however, remain limited in their ability to handle many important types of structural complexity encountered in real biological systems including class overlaps, continuously valued data, inter-instance relationships, non-hierarchical relationships between classes, semantic distance and sparse data. Results: In this paper, we describe a methodology called Markov Chain Ontology Analysis (MCOA) and illustrate its use through an MCOA-based enrichment analysis application based on a generative model of gene activation. MCOA models the classes in an ontology, the instances from an associated dataset and all directional inter-class, class-to-instance and inter-instance relationships as a single finite ergodic Markov chain. The adjusted transition probability matrix for this Markov chain enables the calculation of eigenvector values that quantify the importance of each ontology class relative to other classes and the associated data set members. On both controlled Gene Ontology (GO) data sets created with Escherichia coli, Drosophila melanogaster and Homo sapiens annotations and real gene expression data extracted from the Gene Expression Omnibus (GEO), the MCOA enrichment analysis approach provides the best performance among comparable state-of-the-art methods. Conclusion: A methodology based on Markov chain models and network analytic metrics can help detect the relevant signal within large, highly interdependent and noisy data sets and, for applications such as enrichment analysis, has been shown to generate superior performance on both real and simulated data relative to existing state-of-the-art approaches.
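
The eigenvector calculation mentioned in the abstract can be illustrated generically. The sketch below finds the stationary distribution of a finite ergodic Markov chain by power iteration, that is, the left eigenvector of the transition matrix with eigenvalue 1. How MCOA builds and adjusts its transition matrix is not reproduced here; the matrix in the example is hypothetical.

```python
def stationary_distribution(transition, tol=1e-12, max_iter=10000):
    """Stationary distribution of a row-stochastic matrix by power iteration.

    transition[i][j] is the probability of moving from state i to state j.
    Assumption: the chain is ergodic, so the iteration converges to a
    unique distribution regardless of the starting vector.
    """
    n = len(transition)
    current = [1.0 / n] * n
    for _ in range(max_iter):
        updated = [
            sum(current[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, current)) < tol:
            return updated
        current = updated
    return current


if __name__ == "__main__":
    # Hypothetical two-state chain standing in for ontology classes.
    matrix = [[0.9, 0.1],
              [0.5, 0.5]]
    print(stationary_distribution(matrix))
```

States with larger stationary probability are visited more often in the long run, which is the sense in which eigenvector values can rank the importance of nodes in such a chain.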

    Empirical Bayes analysis of sequencing-based transcriptional profiling without replicates

    Background: Recent technological advancements have made high-throughput sequencing an increasingly popular approach for transcriptome analysis. Advantages of sequencing-based transcriptional profiling over microarrays have been reported, including lower technical variability. However, advances in technology do not remove biological variation between replicates, and this variation is often neglected in analyses. Results: We propose an empirical Bayes method, titled Analysis of Sequence Counts (ASC), to detect differential expression based on sequencing technology. ASC borrows information across sequences to establish a prior distribution of sample variation, so that biological variation can be accounted for even when replicates are not available. Compared to current approaches that simply test for equality of proportions in two samples, ASC is less biased towards highly expressed sequences and can identify more genes with a greater log fold change at lower overall abundance. Conclusions: ASC unifies the biological and statistical significance of differential expression by estimating the posterior mean of log fold change and estimating false discovery rates based on the posterior mean. The implementation in R is available at http://www.stat.brown.edu/Zwu/research.aspx.

    ACUTE KIDNEY INJURY CAUSED BY Crotalus AND Bothrops SNAKE VENOM: A REVIEW OF EPIDEMIOLOGY, CLINICAL MANIFESTATIONS AND TREATMENT

    SUMMARY: Ophidic accidents are an important public health problem due to their incidence, morbidity and mortality. An increasing number of cases have been registered in Brazil in the last few years. Several studies point to the importance of knowing the clinical complications and the adequate approach in these accidents. However, knowledge about the risk factors is not enough and there is an increasing number of deaths due to these accidents in Brazil. In this context, acute kidney injury (AKI) appears as one of the main causes of death and of consequences for these victims, who are mainly young males working in rural areas. Snakes of the Bothrops and Crotalus genera are chiefly responsible for renal involvement in ophidic accidents in South America. The present study is a literature review of AKI caused by Bothrops and Crotalus snake venom regarding diverse characteristics, emphasizing the most appropriate therapeutic approach for these cases. Recent studies have been carried out searching for complementary therapies for the treatment of ophidic accidents, including the use of lipoic acid, simvastatin and allopurinol. Some plants, such as Apocynaceae, Lamiaceae and Rubiaceae, seem to have a beneficial role in the treatment of this type of envenomation. Future studies will certainly find new therapeutic measures for ophidic accidents.

    Co-expression network of neural-differentiation genes shows specific pattern in schizophrenia

    Background: Schizophrenia is a neurodevelopmental disorder with genetic and environmental factors contributing to its pathogenesis, although the mechanism is unknown due to the difficulties in accessing diseased tissue during human neurodevelopment. The aim of this study was to find neuronal differentiation genes disrupted in schizophrenia and to evaluate those genes in post-mortem brain tissues from schizophrenia cases and controls. Methods: We analyzed differentially expressed genes (DEG), copy number variation (CNV) and differential methylation in human induced pluripotent stem cells (hiPSC) derived from fibroblasts from one control and one schizophrenia patient and further differentiated into neurons (NPC). Expression of the DEG was analyzed with microarrays of post-mortem brain tissue (frontal cortex) from a cohort of 29 schizophrenia cases and 30 controls. A Weighted Gene Co-expression Network Analysis (WGCNA) using the DEG was used to detect clusters of co-expressed genes that were non-conserved between adult case and control brain samples. Results: We identified methylation alterations potentially involved with neuronal differentiation in schizophrenia, which displayed an over-representation of genes related to chromatin remodeling complex (adjP = 0.04). We found 228 DEG associated with neuronal differentiation. These genes were involved with metabolic processes, signal transduction, nervous system development, regulation of neurogenesis and neuronal differentiation. Between adult brain samples from cases and controls there were 233 DEG, with only four genes overlapping with the 228 DEG, probably because we compared single cells to bulk tissue and, more importantly, the cells were at different stages of development. The comparison of the co-expression network of the 228 genes in adult brain samples between cases and controls revealed a less conserved module enriched for genes associated with oxidative stress and negative regulation of cell differentiation. Conclusion: This study supports the relevance of using cellular approaches to dissect molecular aspects of neurogenesis with impact in the schizophrenic brain. We showed that, although generated by different approaches, both sets of DEG associated with schizophrenia were involved with neocortical development. The results add to the hypothesis that critical metabolic changes may be occurring during early neurodevelopment, influencing faulty development of the brain and potentially contributing to further vulnerability to the illness. We thank the patients, doctors and nurses involved with sample collection and the Stanley Medical Research Institute. This research was supported by Conselho Nacional de Desenvolvimento Cientifico e Tecnologico (CNPq #17/2008) and Fundação Carlos Chagas Filho de Amparo a Pesquisa do Estado do Rio de Janeiro (FAPERJ). MM (CNPq 304429/2014-7), ACT (FAPESP 2014/00041-1), LL (CAPES 10682/13-9), HV (CAPES) and BP (PPSUS 137270) were supported by their fellowships.