    Serendipitous discoveries in microarray analysis

    Background: Current microarray technologies allow scientists to perform very large-scale gene expression experiments. To find significance in the expression data, it is common to use clustering algorithms to group genes with similar expression patterns. Clusters often contain related genes, such as co-regulated genes or genes in the same biological pathway. Testing all of the relationships found in large-scale microarray experiments is too expensive and time-consuming, but many bioinformatics tools can be used to infer the significance of microarray experiments and cluster analyses. Materials and methods: In this project we reviewed several existing tools and used a combination of them to narrow down the number of significant clusters from a microarray experiment. Microarray data were obtained from the Cerebellar Gene Regulation in Time and Space (Cb GRiTS) database [2]. The data were clustered using paraclique, a graph-based clustering algorithm. Each cluster was evaluated using the Gene-Set Cohesion Analysis Tool (GCAT) [3], ONTO-Pathway Analysis [4], and Allen Brain Atlas data [1]. The clusters with the lowest p-values in each of the three analysis methods were researched to determine good candidate clusters for further experimental confirmation of gene relationships. Results and conclusion: While looking for genes important to cerebellar development, we serendipitously came across interesting clusters related to neural diseases. For example, we found two clusters that contain genes known to be associated with the Parkinson’s disease, Huntington’s disease, and Alzheimer’s disease pathways. Both clusters had low p-values in all three analyses and have very similar expression patterns, but at different expression levels. Such unexpected discoveries help unlock the real power of high-throughput data analysis.

    OpenML: networked science in machine learning

    Many sciences have made significant breakthroughs by adopting online tools that help organize, structure and mine information that is too detailed to be printed in journals. In this paper, we introduce OpenML, a place for machine learning researchers to share and organize data in fine detail, so that they can work more effectively, be more visible, and collaborate with others to tackle harder problems. We discuss how OpenML relates to other examples of networked science and what benefits it brings for machine learning research, individual scientists, as well as students and practitioners. Comment: 12 pages, 10 figures.

    PKD1 Phosphorylation-Dependent Degradation of SNAIL by SCF-FBXO11 Regulates Epithelial-Mesenchymal Transition and Metastasis

    Summary: Metastatic dissemination is often initiated by the reactivation of an embryonic development program referred to as epithelial-mesenchymal transition (EMT). The transcription factor SNAIL promotes EMT and elicits associated pathological characteristics such as invasion, metastasis, and stemness. To better understand the posttranslational regulation of SNAIL, we performed a luciferase-based, genome-wide E3 ligase siRNA library screen and identified SCF-FBXO11 as an important E3 that targets SNAIL for ubiquitylation and degradation. Furthermore, we discovered that SNAIL degradation by FBXO11 is dependent on Ser-11 phosphorylation of SNAIL by protein kinase D1 (PKD1). FBXO11 blocks SNAIL-induced EMT, tumor initiation, and metastasis in multiple breast cancer models. These findings establish the PKD1-FBXO11-SNAIL axis as a mechanism of posttranslational regulation of EMT and cancer metastasis.

    Expression profiling of drug response - from genes to pathways

    Understanding individual response to a drug, that is, what determines its efficacy and tolerability, is the major bottleneck in current drug development and clinical trials. Intracellular response and metabolism, for example through cytochrome P-450 enzymes, may either enhance or decrease the effect of different drugs, depending on the genetic variant. Microarrays offer the potential to screen the genetic composition of the individual patient. However, experiments are “noisy” and must be accompanied by solid and robust data analysis. Furthermore, recent research aims to combine high-throughput data with methods of mathematical modeling, enabling problem-oriented assistance in the drug discovery process. This article discusses state-of-the-art DNA array technology platforms and the basic elements of data analysis and bioinformatics research in drug discovery. Extending single-gene analysis, we present a new method for interpreting gene expression changes in the context of entire pathways. We also introduce the concept of systems biology as a new paradigm for drug development and highlight our recent research: the development of a modeling and simulation platform for biomedical applications. Finally, we discuss the potential of systems biology for modeling the drug response of the individual patient.

    Ab initio identification of putative human transcription factor binding sites by comparative genomics

    We discuss a simple and powerful approach for the ab initio identification of cis-regulatory motifs involved in transcriptional regulation. The method we present integrates several elements: human-mouse comparison, statistical analysis of genomic sequences and the concept of coregulation. We apply it to a complete scan of the human genome. Using the catalogue of conserved upstream sequences collected in the CORG database, we construct sets of genes sharing the same overrepresented motif (short DNA sequence) in their upstream regions in both human and mouse. We perform this construction for all possible motifs from 5 to 8 nucleotides in length and then filter the resulting sets, looking for two types of evidence of coregulation: first, we analyze the Gene Ontology annotation of the genes in the set, searching for statistically significant common annotations; second, we analyze the expression profiles of the genes in the set as measured by microarray experiments, searching for evidence of coexpression. The sets that pass one or both filters are conjectured to contain a significant fraction of coregulated genes, and the upstream motifs characterizing the sets are thus good candidates to be the binding sites of the TFs involved in such regulation. In this way we find various known motifs and also some new candidate binding sites. Comment: 22 pages, 2 figures. Supplementary material available from the author.
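
The set-construction step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it uses the mere presence of a motif in both the human and the mouse upstream sequence as a stand-in for the statistical overrepresentation test, and the function names, the `min_genes` threshold, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq, k_min=5, k_max=8):
    """Return the set of all substrings of seq with length k_min..k_max."""
    seq = seq.upper()
    return {
        seq[i:i + k]
        for k in range(k_min, k_max + 1)
        for i in range(len(seq) - k + 1)
    }


def shared_motif_sets(human_upstream, mouse_upstream, min_genes=2):
    """Map each motif to the genes carrying it upstream in BOTH species.

    Simplification (assumption): presence in both orthologous upstream
    regions replaces the overrepresentation statistics of the abstract.
    """
    sets = defaultdict(set)
    # Only genes with an upstream sequence in both species are considered.
    for gene in human_upstream.keys() & mouse_upstream.keys():
        conserved = kmers(human_upstream[gene]) & kmers(mouse_upstream[gene])
        for motif in conserved:
            sets[motif].add(gene)
    # Keep motifs shared by at least min_genes genes (hypothetical cutoff).
    return {m: genes for m, genes in sets.items() if len(genes) >= min_genes}


# Toy upstream sequences, invented for this example.
human = {"geneA": "TTGACGTCAAT", "geneB": "CCGACGTCAGG", "geneC": "AAAAAAAAAAA"}
mouse = {"geneA": "ATGACGTCATT", "geneB": "GGGACGTCACC", "geneC": "TTTTTTTTTTT"}

result = shared_motif_sets(human, mouse)
print(sorted(result["GACGTCA"]))
```

Each resulting gene set would then be passed to the two filters named in the abstract (shared Gene Ontology annotation and coexpression), which are outside the scope of this sketch.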

    Genomics of lithium action and response

    Lithium is the most successful mood stabiliser treatment for bipolar disorder. However, unlike conventional drugs that are designed to interact with a specific molecular target, the actions of lithium are distributed across many biological processes and pathways. Treatment response is subject to genetic variation between individuals, and similar genetic variation may dictate susceptibility to side-effects. Transcriptomic, genomic and cell model research strategies have all been deployed in the search for the genetic factors and biological systems that mediate the interaction between genetics and the therapeutic actions of lithium. In this review, recent findings from genome-wide studies and patient cell lines are summarised and discussed from the standpoint that genuine progress is being made to define clinically useful mechanisms of this treatment, to place it in the context of bipolar disorder pathology, and to move towards a time when the prescription of lithium is targeted to those individuals who will derive the greatest benefit.

    No wisdom in the crowd: genome annotation at the time of big data - current status and future prospects

    Science and engineering rely on the accumulation and dissemination of knowledge to make discoveries and create new designs. Discovery-driven genome research rests on knowledge passed on via gene annotations. In response to the deluge of sequencing big data, standard annotation practice employs automated procedures that rely on majority rules. We argue this hinders progress through the generation and propagation of errors, leading investigators into blind alleys. More subtly, this inductive process discourages the discovery of novelty, which remains essential in biological research and reflects the nature of biology itself. Annotation systems, rather than being repositories of facts, should be tools that support multiple modes of inference. By combining deduction, induction and abduction, investigators can generate hypotheses when accurate knowledge is extracted from model databases. A key stance is to depart from ‘the sequence tells the structure tells the function’ fallacy, placing function first. We illustrate our approach with examples of critical or unexpected pathways, using MicroScope to demonstrate how tools can be implemented following the principles we advocate. We end with a challenge to the reader.

    Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.