Serendipitous discoveries in microarray analysis
Background
Scientists are capable of performing very large-scale gene expression experiments with current microarray technologies. To find significance in the expression data, it is common to use clustering algorithms to group genes with similar expression patterns. Clusters will often contain related genes, such as co-regulated genes or genes in the same biological pathway. It is too expensive and time-consuming to test all of the relationships found in large-scale microarray experiments. There are many bioinformatics tools that can be used to infer the significance of microarray experiments and cluster analysis.
Materials and methods
In this project we reviewed several existing tools and used a combination of them to narrow down the number of significant clusters from a microarray experiment. Microarray data were obtained through the Cerebellar Gene Regulation in Time and Space (Cb GRiTS) database [2]. The data were clustered using paraclique, a graph-based clustering algorithm. Each cluster was evaluated using the Gene-Set Cohesion Analysis Tool (GCAT) [3], ONTO-Pathway Analysis [4], and Allen Brain Atlas data [1]. The clusters with the lowest p-values in each of the three analysis methods were researched to determine good candidate clusters for further experimental confirmation of gene relationships.
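The selection step described above can be sketched as follows. This is only an illustration, not the authors' procedure: it assumes each analysis yields one p-value per cluster and keeps the clusters that fall among the `top_k` lowest p-values in every analysis. The function name, the `top_k` cutoff, the cluster labels, and all p-values are hypothetical.

```python
from typing import Dict, List, Optional, Set


def shortlist_clusters(pvalues: Dict[str, Dict[str, float]], top_k: int) -> List[str]:
    """Return clusters ranked among the top_k lowest p-values in every analysis."""
    shortlisted: Optional[Set[str]] = None
    for scores in pvalues.values():
        # Clusters with the top_k smallest p-values for this analysis.
        best = set(sorted(scores, key=scores.get)[:top_k])
        # Keep only clusters that also scored well in the earlier analyses.
        shortlisted = best if shortlisted is None else shortlisted & best
    return sorted(shortlisted or [])


# Hypothetical p-values, invented purely for illustration.
example = {
    "gcat": {"c1": 0.001, "c2": 0.20, "c3": 0.004, "c4": 0.60},
    "onto_pathway": {"c1": 0.010, "c2": 0.03, "c3": 0.002, "c4": 0.50},
    "allen_brain_atlas": {"c1": 0.005, "c2": 0.40, "c3": 0.020, "c4": 0.01},
}

print(shortlist_clusters(example, top_k=2))
```

Intersecting per-analysis rankings is one simple way to require agreement across methods; a combined score would be an equally plausible reading of the text.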
Results and conclusion
While looking for genes important to cerebellar development, we serendipitously came across interesting clusters related to neural diseases. For example, we found two clusters that contain genes known to be associated with Parkinson’s disease, Huntington’s disease, and Alzheimer’s disease pathways. Both clusters scored low in all three analyses and have very similar expression patterns but at different expression levels. Such unexpected discoveries help unlock the real power of high-throughput data analysis.
OpenML: networked science in machine learning
Many sciences have made significant breakthroughs by adopting online tools
that help organize, structure and mine information that is too detailed to be
printed in journals. In this paper, we introduce OpenML, a place for machine
learning researchers to share and organize data in fine detail, so that they
can work more effectively, be more visible, and collaborate with others to
tackle harder problems. We discuss how OpenML relates to other examples of
networked science and what benefits it brings for machine learning research,
individual scientists, as well as students and practitioners.
Comment: 12 pages, 10 figures
PKD1 Phosphorylation-Dependent Degradation of SNAIL by SCF-FBXO11 Regulates Epithelial-Mesenchymal Transition and Metastasis
Summary
Metastatic dissemination is often initiated by the reactivation of an embryonic development program referred to as epithelial-mesenchymal transition (EMT). The transcription factor SNAIL promotes EMT and elicits associated pathological characteristics such as invasion, metastasis, and stemness. To better understand the posttranslational regulation of SNAIL, we performed a luciferase-based, genome-wide E3 ligase siRNA library screen and identified SCF-FBXO11 as an important E3 that targets SNAIL for ubiquitylation and degradation. Furthermore, we discovered that SNAIL degradation by FBXO11 is dependent on Ser-11 phosphorylation of SNAIL by protein kinase D1 (PKD1). FBXO11 blocks SNAIL-induced EMT, tumor initiation, and metastasis in multiple breast cancer models. These findings establish the PKD1-FBXO11-SNAIL axis as a mechanism of posttranslational regulation of EMT and cancer metastasis.
Expression profiling of drug response - from genes to pathways
Understanding individual response to a drug (what determines its efficacy and tolerability) is the major bottleneck in current drug development and clinical trials. Intracellular response and metabolism, for example through cytochrome P-450 enzymes, may either enhance or decrease the effect of different drugs, depending on the genetic variant. Microarrays offer the potential to screen the genetic composition of the individual patient. However, experiments are “noisy” and must be accompanied by solid and robust data analysis. Furthermore, recent research aims at the combination of high-throughput data with methods of mathematical modeling, enabling problem-oriented assistance in the drug discovery process. This article will discuss state-of-the-art DNA array technology platforms and the basic elements of data analysis and bioinformatics research in drug discovery. Enhancing single-gene analysis, we will present a new method for interpreting gene expression changes in the context of entire pathways. Furthermore, we will introduce the concept of systems biology as a new paradigm for drug development and highlight our recent research: the development of a modeling and simulation platform for biomedical applications. We discuss the potential of systems biology for modeling the drug response of the individual patient.
Facilitating insight into a simulation model using visualization and dynamic model previews
This paper shows how model simplification, by replacing iterative steps with unitary predictive equations, can enable dynamic interaction with a complex simulation process. Model previews extend the techniques of dynamic querying and query previews into the context of ad hoc simulation model exploration. A case study is presented within the domain of counter-current chromatography. The relatively novel method of insight evaluation was applied, given the exploratory nature of the task. The evaluation data show that the trade-off in accuracy is far outweighed by benefits of dynamic interaction. The number of insights gained using the enhanced interactive version of the computer model was more than six times higher than the number of insights gained using the basic version of the model. There was also a trend for dynamic interaction to facilitate insights of greater domain importance.
Ab initio identification of putative human transcription factor binding sites by comparative genomics
We discuss a simple and powerful approach for the ab initio identification of
cis-regulatory motifs involved in transcriptional regulation. The method we
present integrates several elements: human-mouse comparison, statistical
analysis of genomic sequences and the concept of coregulation. We apply it to a
complete scan of the human genome. By using the catalogue of conserved upstream
sequences collected in the CORG database we construct sets of genes sharing the
same overrepresented motif (short DNA sequence) in their upstream regions both
in human and in mouse. We perform this construction for all possible motifs
from 5 to 8 nucleotides in length and then filter the resulting sets looking
for two types of evidence of coregulation: first, we analyze the Gene Ontology
annotation of the genes in the set, searching for statistically significant
common annotations; second, we analyze the expression profiles of the genes in
the set as measured by microarray experiments, searching for evidence of
coexpression. The sets which pass one or both filters are conjectured to
contain a significant fraction of coregulated genes, and the upstream motifs
characterizing the sets are thus good candidates to be the binding sites of the
TFs involved in such regulation. In this way we find various known motifs and
also some new candidate binding sites.
Comment: 22 pages, 2 figures. Supplementary material available from the author
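The set-construction step in the abstract above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' method: it uses plain presence of a motif in both the human and the mouse upstream sequence of the same gene, rather than statistical overrepresentation, and it leaves out the Gene Ontology and expression filters. The function names, the `min_genes` threshold, the gene labels, and the sequences are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def kmers(seq: str, k_min: int = 5, k_max: int = 8) -> Set[str]:
    """All distinct substrings of length k_min..k_max found in seq."""
    seq = seq.upper()
    return {
        seq[i:i + k]
        for k in range(k_min, k_max + 1)
        for i in range(len(seq) - k + 1)
    }


def shared_motif_sets(
    human: Dict[str, str], mouse: Dict[str, str], min_genes: int = 2
) -> Dict[str, Set[str]]:
    """Map each motif to the genes carrying it upstream in both species."""
    sets: Dict[str, Set[str]] = defaultdict(set)
    # Only genes with an upstream sequence in both species are compared.
    for gene in human.keys() & mouse.keys():
        for motif in kmers(human[gene]) & kmers(mouse[gene]):
            sets[motif].add(gene)
    # Drop motifs shared by too few genes to suggest coregulation.
    return {m: g for m, g in sets.items() if len(g) >= min_genes}


# Toy upstream sequences, invented purely for illustration.
human_upstream = {"geneA": "TTGACGTCAAT", "geneB": "CCTGACGTCAGG", "geneC": "AAAAAAAAAA"}
mouse_upstream = {"geneA": "GGTGACGTCATT", "geneB": "ATGACGTCACC", "geneC": "CCCCCCCCCC"}

motif_sets = shared_motif_sets(human_upstream, mouse_upstream)
print(sorted(motif_sets["TGACGTCA"]))
```

In a fuller version, the presence test would be replaced by a significance test against a background model, and the resulting gene sets would then be passed to the annotation and coexpression filters.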
Genomics of lithium action and response
Lithium is the most successful mood stabiliser treatment for bipolar disorder. However, unlike conventional drugs that are designed to interact with a specific molecular target, the actions of lithium are distributed across many biological processes and pathways. Treatment response is subject to genetic variation between individuals and similar genetic variation may dictate susceptibility to side-effects. Transcriptomic, genomic and cell model research strategies have all been deployed in the search for the genetic factors and biological systems that mediate the interaction between genetics and the therapeutic actions of lithium. In this review, recent findings from genome-wide studies and patient cell lines will be summarised and discussed from a standpoint that genuine progress is being made to define clinically useful mechanisms of this treatment, to place it in the context of bipolar disorder pathology, and to move towards a time when the prescription of lithium is targeted to those individuals who will derive the greatest benefit.
No wisdom in the crowd: genome annotation at the time of big data - current status and future prospects
Science and engineering rely on the accumulation
and dissemination of knowledge to make discoveries
and create new designs. Discovery-driven genome
research rests on knowledge passed on via gene
annotations. In response to the deluge of sequencing
big data, standard annotation practice employs automated
procedures that rely on majority rules. We
argue this hinders progress through the generation
and propagation of errors, leading investigators into
blind alleys. More subtly, this inductive process discourages
the discovery of novelty, which remains
essential in biological research and reflects the nature
of biology itself. Annotation systems, rather than
being repositories of facts, should be tools that support
multiple modes of inference. By combining
deduction, induction and abduction, investigators can
generate hypotheses when accurate knowledge is
extracted from model databases. A key stance is to
depart from ‘the sequence tells the structure tells the
function’ fallacy, placing function first. We illustrate
our approach with examples of critical or unexpected
pathways, using MicroScope to demonstrate how
tools can be implemented following the principles we
advocate. We end with a challenge to the reader.
Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities
The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.