ModHMM: A Modular Supra-Bayesian Genome Segmentation Method
Genome segmentation methods are powerful tools to obtain cell type or tissue-specific genome-wide annotations and are frequently used to discover regulatory elements. However, traditional segmentation methods show low predictive accuracy and their data-driven annotations have some undesirable properties. As an alternative, we developed ModHMM, a highly modular genome segmentation method. Inspired by the supra-Bayesian approach, it incorporates predictions from a set of classifiers. This allows genome segmentations to be computed using state-of-the-art methodology. We demonstrate the method on ENCODE data and show that it outperforms traditional segmentation methods not only in terms of predictive performance, but also in qualitative aspects. Therefore, ModHMM is a valuable alternative for studying the epigenetic and regulatory landscape across and within cell types or tissues
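One generic way to turn per-bin classifier outputs into a genome segmentation is Viterbi decoding in a hidden Markov model. The sketch below is not ModHMM's implementation. It only illustrates the general idea, under the assumption that each classifier's predicted probability is used directly as the emission score of one state. The function name, the two-state layout, the transition probabilities and the toy probabilities are all hypothetical.

```python
import math

def viterbi(log_emissions, log_transitions, log_initial):
    """Return the highest-scoring state path.

    log_emissions[t][s]: log-score of state s at genomic bin t (assumed here
        to be the log of a classifier's predicted probability).
    log_transitions[r][s]: log-probability of moving from state r to state s.
    log_initial[s]: log-probability of starting in state s.
    """
    n_states = len(log_initial)
    score = [log_initial[s] + log_emissions[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emissions)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at bin t.
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_transitions[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev]
                             + log_transitions[best_prev][s]
                             + log_emissions[t][s])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical classifier probabilities per bin for two states
# (0 = background, 1 = active element); not real data.
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7],
         [0.6, 0.4], [0.2, 0.8], [0.1, 0.9]]
log_e = [[math.log(p) for p in row] for row in probs]
stay, switch = math.log(0.9), math.log(0.1)
log_t = [[stay, switch], [switch, stay]]
log_i = [math.log(0.5), math.log(0.5)]

segmentation = viterbi(log_e, log_t, log_i)
print(segmentation)
```

With these toy numbers the fourth bin slightly favours background on its own, but the sticky transitions keep it inside the surrounding active segment. Smoothing of this kind is the usual reason for decoding a whole path instead of labelling each bin independently.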
Gcn4p and novel upstream activating sequences regulate targets of the unfolded protein response.
Eukaryotic cells respond to accumulation of unfolded proteins in the endoplasmic reticulum (ER) by activating the unfolded protein response (UPR), a signal transduction pathway that communicates between the ER and the nucleus. In yeast, a large set of UPR target genes has been experimentally determined, but the previously characterized unfolded protein response element (UPRE), an upstream activating sequence (UAS) found in the promoter of the UPR target gene KAR2, cannot account for the transcriptional regulation of most genes in this set. To address this puzzle, we analyzed the promoters of UPR target genes computationally, identifying as candidate UASs short sequences that are statistically overrepresented. We tested the most promising of these candidate UASs for biological activity, and identified two novel UPREs, which are necessary and sufficient for UPR activation of promoters. A genetic screen for activators of the novel motifs revealed that the transcription factor Gcn4p plays an essential and previously unrecognized role in the UPR: Gcn4p and its activator Gcn2p are required for induction of a majority of UPR target genes during ER stress. Both Hac1p and Gcn4p bind target gene promoters to stimulate transcriptional induction. Regulation of Gcn4p levels in response to changing physiological conditions may function as an additional means to modulate the UPR. The discovery of a role for Gcn4p in the yeast UPR reveals an additional level of complexity and demonstrates a surprising conservation of the signaling circuit between yeast and metazoan cells
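The computational step described above, finding short sequences that are overrepresented in a set of promoters, can be sketched as a k-mer count comparison between target promoters and a background set. This is a minimal illustration, not the authors' procedure. The function names, the pseudocount smoothing, the choice of k and the toy sequences are all assumptions. A real analysis would use an explicit statistical test and a genome-wide background.

```python
from collections import Counter

def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def enrichment(target_seqs, background_seqs, k, pseudocount=1.0):
    """Rank k-mers by a smoothed target/background frequency ratio."""
    target = kmer_counts(target_seqs, k)
    background = kmer_counts(background_seqs, k)
    t_total = sum(target.values())
    b_total = sum(background.values())
    ratios = {}
    for kmer, count in target.items():
        # The pseudocount avoids division by zero for k-mers that are
        # absent from the background (a crude smoothing choice).
        t_freq = (count + pseudocount) / (t_total + pseudocount)
        b_freq = (background[kmer] + pseudocount) / (b_total + pseudocount)
        ratios[kmer] = t_freq / b_freq
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)

# Made-up sequences for illustration only; not real promoters.
targets = ["TTGACGTGAA", "CCACGTGTTA", "GACGTGCCAT"]
background = ["TTGATCAGAA", "CCATTAGTTA", "GGCATCCCAT"]

ranked = enrichment(targets, background, k=5)
top_kmer, top_ratio = ranked[0]
print(top_kmer, round(top_ratio, 2))
```

In this toy input the 5-mer shared by all three target sequences ranks first. Candidates found this way would still need experimental testing, as the abstract describes.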
Millisecond single-molecule localization microscopy combined with convolution analysis and automated image segmentation to determine protein concentrations in complexly structured, functional cells, one cell at a time
We present a single-molecule tool called the CoPro (Concentration of
Proteins) method that uses millisecond imaging with convolution analysis,
automated image segmentation and super-resolution localization microscopy to
generate robust estimates for protein concentration in different compartments
of single living cells, validated using realistic simulations of complex
multiple compartment cell types. We demonstrate its utility experimentally on
model Escherichia coli bacteria and Saccharomyces cerevisiae budding yeast
cells, and use it to address the biological question of how signals are
transduced in cells. Cells in all domains of life dynamically sense their
environment through signal transduction mechanisms, many involving gene
regulation. The glucose sensing mechanism of S. cerevisiae is a model system
for studying gene regulatory signal transduction. It uses the multi-copy
expression inhibitor of the GAL gene family, Mig1, to repress unwanted genes in
the presence of elevated extracellular glucose concentrations. We fluorescently
labelled Mig1 molecules with green fluorescent protein (GFP) via chromosomal
integration at physiological expression levels in living S. cerevisiae cells,
in addition to the RNA polymerase protein Nrd1 with the fluorescent protein
reporter mCherry. Using CoPro we make quantitative estimates of Mig1 and Nrd1
protein concentrations in the cytoplasm and nucleus compartments on a
cell-by-cell basis under physiological conditions. These estimates indicate a
4-fold shift towards higher values in concentration of diffusive Mig1 in the
nucleus if the external glucose concentration is raised, whereas equivalent
levels in the cytoplasm shift to smaller values with a relative change an order
of magnitude smaller. This compares with Nrd1 which is not involved directly in
glucose sensing, which is almost exclusively localized in the nucleus under
high and..
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective we take inventory of the impact and legal consequences of these technical advances and point out future directions of research
Challenges for modeling global gene regulatory networks during development: Insights from Drosophila
Development is regulated by dynamic patterns of gene expression, which are orchestrated through the action of complex gene regulatory networks (GRNs). Substantial progress has been made in modeling transcriptional regulation in recent years, ranging from qualitative “coarse-grain” models operating at the gene level to very “fine-grain” quantitative models operating at the biophysical “transcription factor-DNA level”. Recent advances in genome-wide studies have revealed an enormous increase in the size and complexity of GRNs. Even relatively simple developmental processes can involve hundreds of regulatory molecules, with extensive interconnectivity and cooperative regulation. This leads to an explosion in the number of regulatory functions, effectively impeding Boolean-based qualitative modeling approaches. At the same time, the lack of information on the biophysical properties for the majority of transcription factors within a global network restricts quantitative approaches. In this review, we explore the current challenges in moving from modeling medium scale well-characterized networks to more poorly characterized global networks. We suggest integrating coarse- and fine-grain approaches to model gene regulatory networks in cis. We focus on two very well-studied examples from Drosophila, which likely represent typical developmental regulatory modules across metazoans
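One way to read the "explosion in the number of regulatory functions" is through a standard counting argument: a gene with n binary regulators has 2^n possible input combinations, and each combination can map to on or off, giving 2^(2^n) candidate Boolean update rules. The short sketch below only tabulates that count. The function name is hypothetical, and the review itself may have a broader notion of regulatory function in mind.

```python
def boolean_function_count(n_inputs):
    """Number of distinct Boolean functions of n_inputs binary regulators."""
    return 2 ** (2 ** n_inputs)

# Tabulate the count for 1 to 5 regulators.
for n in range(1, 6):
    print(n, boolean_function_count(n))
```

Already at five regulators there are more than four billion candidate rules per gene, which suggests why exhaustive Boolean modeling becomes impractical for highly connected networks.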
Machine Learning and Genome Annotation: A Match Meant to Be?
By its very nature, genomics produces large, high-dimensional datasets that are well suited to analysis by machine learning approaches. Here, we explain some key aspects of machine learning that make it useful for genome annotation, with illustrative examples from ENCODE
Unsupervised Classification for Tiling Arrays: ChIP-chip and Transcriptome
Tiling arrays enable large-scale exploration of the genome thanks to
probes that cover the whole genome at very high density, up to 2,000,000
probes. The biological questions usually addressed are either the expression
difference between two conditions or the detection of transcribed regions. In
this work we propose to consider simultaneously both questions as an
unsupervised classification problem by modeling the joint distribution of the
two conditions. In contrast to previous methods, we account for all available
information on the probes as well as biological knowledge like annotation and
spatial dependence between probes. Since probes are not biologically relevant
units, we propose a classification rule for non-connected regions covered by
several probes. Applications to transcriptomic and ChIP-chip data of
Arabidopsis thaliana obtained with a NimbleGen tiling array highlight the
importance of precise modeling and of region classification
A hierarchical Bayesian model for inference of copy number variants and their association to gene expression
A number of statistical models have been successfully developed for the
analysis of high-throughput data from a single source, but few methods are
available for integrating data from different sources. Here we focus on
integrating gene expression levels with comparative genomic hybridization (CGH)
array measurements collected on the same subjects. We specify a measurement
error model that relates the gene expression levels to latent copy number
states which, in turn, are related to the observed surrogate CGH measurements
via a hidden Markov model. We employ selection priors that exploit the
dependencies across adjacent copy number states and investigate MCMC stochastic
search techniques for posterior inference. Our approach results in a unified
modeling framework for simultaneously inferring copy number variants (CNV) and
identifying their significant associations with mRNA transcript abundance. We
show performance on simulated data and illustrate an application to data from a
genomic study on human cancer cell lines. Comment: Published at
http://dx.doi.org/10.1214/13-AOAS705 in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics
(http://www.imstat.org)