45 research outputs found

    Taking U out, with two nucleases?

    BACKGROUND: REX1 and REX2 are protein components of the RNA editing complex (the editosome) and function as exouridylylases. The exact roles of REX1 and REX2 in the editosome are unclear and the consequences of the presence of two related proteins are not fully understood. Here, a variety of computational studies were performed to enhance understanding of the structure and function of REX proteins in Trypanosoma and Leishmania species. RESULTS: Sequence analysis and homology modeling of the Endonuclease/Exonuclease/Phosphatase (EEP) domain at the C-terminus of REX1 and REX2 highlights a common active site shared by all EEP domains. Phylogenetic analysis indicates that REX proteins contain a distinct subfamily of EEP domains. Inspection of three-dimensional models of the EEP domain in Trypanosoma brucei REX1 and REX2, and Leishmania major REX1 suggests variations of previously characterized key residues likely to be important in catalysis and determining substrate specificity. CONCLUSION: We have identified features of the REX EEP domain that distinguish it from other family members and hence subfamily specific determinants of catalysis and substrate binding. The results provide specific guidance for experimental investigations about the role(s) of REX proteins in RNA editing

    A linear programming approach for estimating the structure of a sparse linear genetic network from transcript profiling data

    BACKGROUND: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. RESULTS: The structure learning task is cast as a sparse linear regression problem, which is then posed as a LASSO (l1-constrained fitting) problem and finally solved by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. CONCLUSION: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational-experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
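
    The step from sparse regression to a linear program can be made concrete with one standard construction, sketched here under assumptions: the abstract does not state the exact objective, so the absolute-error loss, the trade-off constant C, and all symbols below are illustrative choices rather than the authors' formulation. Each signed quantity is split into two non-negative parts so that every absolute value becomes a linear term.

```latex
% Assumed notation (not from the abstract): y is the n-vector of expression values
% of one target gene over n samples; X is the n x (p-1) matrix of the other genes.
\begin{aligned}
\min_{w^{+},\,w^{-},\,\xi^{+},\,\xi^{-}} \quad
  & \sum_{j=1}^{p-1} \bigl(w^{+}_{j} + w^{-}_{j}\bigr)
    + C \sum_{k=1}^{n} \bigl(\xi^{+}_{k} + \xi^{-}_{k}\bigr) \\
\text{subject to} \quad
  & X\,(w^{+} - w^{-}) + \xi^{+} - \xi^{-} = y, \\
  & w^{+},\; w^{-},\; \xi^{+},\; \xi^{-} \ge 0 .
\end{aligned}
```

    In this sketch the regression weights are recovered as w = w^{+} - w^{-}, and an undirected edge between the target gene and gene j would be drawn whenever w_j is nonzero; solving one such program per gene would yield the full graph.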

    Evaluating evolution as a learning algorithm

    We interpret the Moran model of natural selection and drift as an algorithm for learning features of a simplified fitness landscape, specifically genotype superiority. This algorithm's efficiency in extracting these characteristics is evaluated by comparing it to a novel Bayesian learning algorithm developed using information-theoretic tools. This algorithm makes use of a communication channel analogy between an environment and an evolving population. We use the associated channel-rate to determine an informative population-sampling procedure. We find that the algorithm can identify genotype superiority faster than the Moran model but at the cost of larger fluctuations in uncertainty
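
    For readers unfamiliar with the Moran model referred to above, the following is a minimal sketch of the two-genotype process, assuming genotype B has fitness 1 and genotype A has relative fitness `fitness_a`. It covers only the baseline birth-death dynamics, not the Bayesian learning algorithm or the channel-rate analysis; all function and parameter names are hypothetical.

```python
import random


def moran_step(num_a, pop_size, fitness_a, rng):
    """Apply one birth-death event and return the new count of genotype A."""
    num_b = pop_size - num_a
    # Birth: the parent is drawn with probability proportional to fitness
    # (genotype B is assumed to have fitness 1).
    weight_a = fitness_a * num_a
    parent_is_a = rng.random() * (weight_a + num_b) < weight_a
    # Death: the replaced individual is drawn uniformly from the population.
    dead_is_a = rng.random() * pop_size < num_a
    return num_a + int(parent_is_a) - int(dead_is_a)


def run_to_fixation(num_a, pop_size, fitness_a, rng):
    """Iterate until one genotype takes over; True if genotype A fixes."""
    while 0 < num_a < pop_size:
        num_a = moran_step(num_a, pop_size, fitness_a, rng)
    return num_a == pop_size


def fixation_probability(num_a, pop_size, fitness_a):
    """Closed-form fixation probability of A for this birth-death chain."""
    if fitness_a == 1.0:
        return num_a / pop_size  # neutral drift
    return (1 - fitness_a ** -num_a) / (1 - fitness_a ** -pop_size)


def estimate_fixation(num_a, pop_size, fitness_a, runs, seed=0):
    """Monte Carlo estimate of the fixation probability of genotype A."""
    rng = random.Random(seed)
    wins = sum(run_to_fixation(num_a, pop_size, fitness_a, rng) for _ in range(runs))
    return wins / runs
```

    Comparing `estimate_fixation(1, 10, 2.0, runs=2000)` with `fixation_probability(1, 10, 2.0)` gives a rough sense of how quickly selection reveals which genotype is superior in a small population.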

    Evaluation of normalization methods for cDNA microarray data by k-NN classification

    BACKGROUND: Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. RESULTS: Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques, which remove either spatial-dependent dye bias (hereafter, spatial effect) or intensity-dependent dye bias (hereafter, intensity effect), moderately reduce LOOCV classification errors, whereas double-bias-removal techniques, which remove both spatial and intensity effects, reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. CONCLUSION: Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics.
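
    The evaluation criterion described above, the LOOCV error of a k-NN classifier, can be sketched as follows. The study's own tools were written in R; this Python version is only an illustration, with hypothetical function names, Euclidean distance as an assumed metric, and majority voting with ties broken in favor of the nearer neighbor.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest samples."""
    # Rank training samples by Euclidean distance to the query (assumed metric).
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loocv_error(xs, ys, k=3):
    """Fraction of samples misclassified when each is held out in turn."""
    errors = 0
    for i in range(len(xs)):
        # Train on every sample except the i-th, then classify the held-out one.
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) != ys[i]:
            errors += 1
    return errors / len(xs)
```

    Under this scheme, each normalization strategy would be applied to the raw data first, and `loocv_error` computed on the normalized profiles; a lower error would suggest that the strategy removed more of the nuisance variation.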

    Homeostatic Imbalance between Apoptosis and Cell Renewal in the Liver of Premature Aging XpdTTD Mice

    Unrepaired or misrepaired DNA damage has been implicated as a causal factor in cancer and aging. XpdTTD mice, harboring defects in nucleotide excision repair and transcription due to a mutation in the Xpd gene (R722W), display severe symptoms of premature aging but have a reduced incidence of cancer. To gain further insight into the molecular basis of the mutant-specific manifestation of age-related phenotypes, we used comparative microarray analysis of young and old female livers to discover gene expression signatures distinguishing XpdTTD mice from their age-matched wild type controls. We found a transcription signature of increased apoptosis in the XpdTTD mice, which was confirmed by in situ immunohistochemical analysis and found to be accompanied by increased proliferation. However, apoptosis rate exceeded the rate of proliferation, resulting in homeostatic imbalance. Interestingly, a metabolic response signature was observed involving decreased energy metabolism and reduced IGF-1 signaling, a major modulator of life span. We conclude that while the increased apoptotic response to endogenous DNA damage contributes to the accelerated aging phenotypes and the reduced cancer incidence observed in the XpdTTD mice, the signature of reduced energy metabolism is likely to reflect a compensatory adjustment to limit the increased genotoxic stress in these mutants. These results support a general model for premature aging in DNA repair deficient mice based on cellular responses to DNA damage that impair normal tissue homeostasis

    MPHASYS: a mouse phenotype analysis system

    BACKGROUND: Systematic, high-throughput studies of mouse phenotypes have been hampered by the inability to analyze individual animal data from a multitude of sources in an integrated manner. Studies generally make comparisons at the level of genotype or treatment, thereby excluding associations that may be subtle or involve compound phenotypes. Additionally, the lack of integrated, standardized ontologies and methodologies for data exchange has inhibited scientific collaboration and discovery. RESULTS: Here we introduce a Mouse Phenotype Analysis System (MPHASYS), a platform for integrating data generated by studies of mouse models of human biology and disease such as aging and cancer. This computational platform is designed to provide a standardized methodology for working with animal data; a framework for data entry, analysis and sharing; and ontologies and methodologies for ensuring accurate data capture. We describe the tools that currently comprise MPHASYS, primarily ones related to mouse pathology, and outline its use in a study of individual animal-specific patterns of multiple pathology in mice harboring a specific germline mutation in the DNA repair and transcription-specific gene Xpd. CONCLUSION: MPHASYS is a system for analyzing multiple data types from individual animals. It provides a framework for developing data analysis applications, and tools for collecting and distributing high-quality data. The software is platform independent and freely available under an open-source license [1].

    A human breast cell model of pre-invasive to invasive transition

    A crucial step in human breast cancer progression is the acquisition of invasiveness. There is a distinct lack of human cell culture models to study the transition from pre-invasive to invasive phenotype as it may occur 'spontaneously' in vivo. To delineate molecular alterations important for this transition, we isolated human breast epithelial cell lines that showed partial loss of tissue polarity in three-dimensional reconstituted-basement membrane cultures. These cells remained non-invasive; however, unlike their non-malignant counterparts, they exhibited a high propensity to acquire invasiveness through basement membrane in culture. The genomic aberrations and gene expression profiles of the cells in this model showed a high degree of similarity to primary breast tumor profiles. The xenograft tumors formed by the cell lines in three different microenvironments in nude mice displayed metaplastic phenotypes, including squamous and basal characteristics, with invasive cells exhibiting features of higher grade tumors. To find functionally significant changes in transition from pre-invasive to invasive phenotype, we performed attribute profile clustering analysis on the list of genes differentially expressed between pre-invasive and invasive cells. We found integral membrane proteins, transcription factors, kinases, transport molecules, and chemokines to be highly represented. In addition, expression of the matrix metalloproteinases MMP-9, -13, -15 and -17 was upregulated in the invasive cells. Using siRNA-based approaches, we found these MMPs to be required for the invasive phenotype. This model provides a new tool for dissection of mechanisms by which pre-invasive breast cells could acquire invasiveness in a metaplastic context.