    Inference algorithms for gene networks: a statistical mechanics analysis

    The inference of gene regulatory networks from high throughput gene expression data is one of the major challenges in systems biology. This paper aims at analysing and comparing two different algorithmic approaches. The first approach uses pairwise correlations between regulated and regulating genes; the second one uses message-passing techniques for inferring activating and inhibiting regulatory interactions. The performance of these two algorithms can be analysed theoretically on well-defined test sets, using tools from the statistical physics of disordered systems like the replica method. We find that the second algorithm outperforms the first one since it takes into account collective effects of multiple regulators

    The liver pharmacological and xenobiotic gene response repertoire

    We have used a supervised classification approach to systematically mine a large microarray database derived from livers of compound-treated rats. Thirty-four distinct signatures (classifiers) for pharmacological and toxicological end points can be identified. Just 200 genes are sufficient to classify these end points. Signatures were enriched in xenobiotic and immune response genes and contain un-annotated genes, indicating that not all key genes in the liver xenobiotic responses have been characterized. Many signatures with equal classification capabilities but with no gene in common can be derived for the same phenotypic end point. The analysis of the union of all genes present in these signatures can reveal the underlying biology of that end point as illustrated here using liver fibrosis signatures. Our approach using the whole genome and a diverse set of compounds allows a comprehensive view of most pharmacological and toxicological questions and is applicable to other situations such as disease and development

    Tracking and coordinating an international curation effort for the CCDS Project

    The Consensus Coding Sequence (CCDS) collaboration involves curators at multiple centers with a goal of producing a conservative set of high quality, protein-coding region annotations for the human and mouse reference genome assemblies. The CCDS data set reflects a ‘gold standard’ definition of best supported protein annotations, and corresponding genes, which pass a standard series of quality assurance checks and are supported by manual curation. This data set supports use of genome annotation information by human and mouse researchers for effective experimental design, analysis and interpretation. The CCDS project consists of analysis of automated whole-genome annotation builds to identify identical CDS annotations, quality assurance testing and manual curation support. Identical CDS annotations are tracked with a CCDS identifier (ID) and any future change to the annotated CDS structure must be agreed upon by the collaborating members. CCDS curation guidelines were developed to address some aspects of curation in order to improve initial annotation consistency and to reduce time spent in discussing proposed annotation updates. Here, we present the current status of the CCDS database and details on our procedures to track and coordinate our efforts. We also present the relevant background and reasoning behind the curation standards that we have developed for CCDS database treatment of transcripts that are nonsense-mediated decay (NMD) candidates, for transcripts containing upstream open reading frames, for identifying the most likely translation start codons and for the annotation of readthrough transcripts. Examples are provided to illustrate the application of these guidelines

    Functional analysis of multiple genomic signatures demonstrates that classification algorithms choose phenotype-related genes

    Gene expression signatures of toxicity and clinical response benefit both safety assessment and clinical practice; however, difficulties in connecting signature genes with the predicted end points have limited their application. The Microarray Quality Control Consortium II (MAQCII) project generated 262 signatures for ten clinical and three toxicological end points from six gene expression data sets, an unprecedented collection of diverse signatures that has permitted a wide-ranging analysis on the nature of such predictive models. A comprehensive analysis of the genes of these signatures and their nonredundant unions using ontology enrichment, biological network building and interactome connectivity analyses demonstrated the link between gene signatures and the biological basis of their predictive power. Different signatures for a given end point were more similar at the level of biological properties and transcriptional control than at the gene level. Signatures tended to be enriched in function and pathway in an end point and model-specific manner, and showed a topological bias for incoming interactions. Importantly, the level of biological similarity between different signatures for a given end point correlated positively with the accuracy of the signature predictions. These findings will aid the understanding, and application of predictive genomic signatures, and support their broader application in predictive medicine

    A Flexible Approach for Highly Multiplexed Candidate Gene Targeted Resequencing

    We have developed an integrated strategy for targeted resequencing and analysis of gene subsets from the human exome for variants. Our capture technology is geared towards resequencing gene subsets substantially larger than can be done efficiently with simplex or multiplex PCR but smaller in scale than exome sequencing. We describe all the steps from the initial capture assay to single nucleotide variant (SNV) discovery. The capture methodology uses in-solution 80-mer oligonucleotides. To provide optimal flexibility in choosing human gene targets, we designed an in silico set of oligonucleotides, the Human OligoExome, that covers the gene exons annotated by the Consensus Coding Sequencing Project (CCDS). This resource is openly available as an Internet accessible database where one can download capture oligonucleotides sequences for any CCDS gene and design custom capture assays. Using this resource, we demonstrated the flexibility of this assay by custom designing capture assays ranging from 10 to over 100 gene targets with total capture sizes from over 100 Kilobases to nearly one Megabase. We established a method to reduce capture variability and incorporated indexing schemes to increase sample throughput. Our approach has multiple applications that include but are not limited to population targeted resequencing studies of specific gene subsets, validation of variants discovered in whole genome sequencing surveys and possible diagnostic analysis of disease gene subsets. We also present a cost analysis demonstrating its cost-effectiveness for large population studies

    Application of Biomarkers in Cancer Risk Management: Evaluation from Stochastic Clonal Evolutionary and Dynamic System Optimization Points of View

    Aside from primary prevention, early detection remains the most effective way to decrease mortality associated with the majority of solid cancers. Previous cancer screening models are largely based on classification of at-risk populations into three conceptually defined groups (normal, cancer without symptoms, and cancer with symptoms). Unfortunately, this approach has achieved limited successes in reducing cancer mortality. With advances in molecular biology and genomic technologies, many candidate somatic genetic and epigenetic “biomarkers” have been identified as potential predictors of cancer risk. However, none have yet been validated as robust predictors of progression to cancer or shown to reduce cancer mortality. In this Perspective, we first define the necessary and sufficient conditions for precise prediction of future cancer development and early cancer detection within a simple physical model framework. We then evaluate cancer risk prediction and early detection from a dynamic clonal evolution point of view, examining the implications of dynamic clonal evolution of biomarkers and the application of clonal evolution for cancer risk management in clinical practice. Finally, we propose a framework to guide future collaborative research between mathematical modelers and biomarker researchers to design studies to investigate and model dynamic clonal evolution. This approach will allow optimization of available resources for cancer control and intervention timing based on molecular biomarkers in predicting cancer among various risk subsets that dynamically evolve over time

    Somatic mutational landscape of hereditary hematopoietic malignancies caused by germline variants in RUNX1, GATA2, and DDX41

    Individuals with germ line variants associated with hereditary hematopoietic malignancies (HHMs) have a highly variable risk for leukemogenesis. Gaps in our understanding of premalignant states in HHMs have hampered efforts to design effective clinical surveillance programs, provide personalized preemptive treatments, and inform appropriate counseling for patients. We used the largest known comparative international cohort of germline RUNX1, GATA2, or DDX41 variant carriers without and with hematopoietic malignancies (HMs) to identify patterns of genetic drivers that are unique to each HHM syndrome before and after leukemogenesis. These patterns included striking heterogeneity in rates of early-onset clonal hematopoiesis (CH), with a high prevalence of CH in RUNX1 and GATA2 variant carriers who did not have malignancies (carriers-without HM). We observed a paucity of CH in DDX41 carriers-without HM. In RUNX1 carriers-without HM with CH, we detected variants in TET2, PHF6, and, most frequently, BCOR. These genes were recurrently mutated in RUNX1-driven malignancies, suggesting CH is a direct precursor to malignancy in RUNX1-driven HHMs. Leukemogenesis in RUNX1 and DDX41 carriers was often driven by second hits in RUNX1 and DDX41, respectively. This study may inform the development of HHM-specific clinical trials and gene-specific approaches to clinical monitoring. For example, trials investigating the potential benefits of monitoring DDX41 carriers-without HM for low-frequency second hits in DDX41 may now be beneficial. Similarly, trials monitoring carriers-without HM with RUNX1 germ line variants for the acquisition of somatic variants in BCOR, PHF6, and TET2 and second hits in RUNX1 are warranted

    The population biology and evolutionary significance of Ty elements in Saccharomyces cerevisiae

    The basic structure and properties of Ty elements are considered with special reference to their role as agents of evolutionary change. Ty elements may generate genetic variation for fitness by their action as mutagens, as well as by providing regions of portable homology for recombination. The mutational spectra generated by Ty1 transposition events may, due to their target specificity and gene regulatory capabilities, possess a higher frequency of adaptively favorable mutations than spectra resulting from other types of mutational processes. Laboratory strains contain between 25 and 35 elements, and in both these and industrial strains the insertions appear quite stable. In contrast, a wide variation in Ty number is seen in wild isolates, with a lower average number/genome. Factors which may determine Ty copy number in populations include transposition rates (dependent on Ty copy number and mating type), and stabilization of Ty elements in the genome as well as selection for and against Ty insertions in the genome. Although the average effect of Ty transpositions is deleterious, populations initiated with a single clone containing a single Ty element steadily accumulated Ty elements over 1,000 generations. Direct evidence that Ty transposition events can be selectively favored is provided by experiments in which populations containing large amounts of variability for Ty1 copy number were maintained for ∼100 generations in a homogeneous environment. At their termination, the frequency of clones containing 0 Ty elements had decreased to ∼0.0, and the populations had become dominated by a small number of clones containing >0 Ty elements. No such reduction in variability was observed in populations maintained in a structured environment, though changes in Ty number were observed. The implications of genetic (mating type and ploidy) changes and environmental fluctuations for the long-term persistence of Ty elements within the S. cerevisiae species group are discussed. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/42799/1/10709_2004_Article_BF00133718.pd