14,798 research outputs found

The EM Algorithm and the Rise of Computational Biology

Author: Jun S. Liu
Xiaodan Fan
Yuan Yuan
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2010

In the past decade computational biology has grown from a cottage industry with a handful of researchers to an attractive interdisciplinary field, catching the attention and imagination of many quantitatively minded scientists. Of interest to us is the key role played by the EM algorithm during this transformation. We survey the use of the EM algorithm in a few important computational biology problems surrounding the "central dogma" of molecular biology: from DNA to RNA and then to proteins. Topics of this article include sequence motif discovery, protein sequence alignment, population genetics, evolutionary models and mRNA expression microarray data analysis. Comment: Published at http://dx.doi.org/10.1214/09-STS312 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

CiteSeerX

Crossref

Regulatory motif discovery using a population clustering evolutionary algorithm

Author: Lones Michael A.
Tyrrell Andy M.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/07/2007

This paper describes a novel evolutionary algorithm for regulatory motif discovery in DNA promoter sequences. The algorithm uses data clustering to logically distribute the evolving population across the search space. Mating then takes place within local regions of the population, promoting overall solution diversity and encouraging discovery of multiple solutions. Experiments using synthetic data sets have demonstrated the algorithm's capacity to find position frequency matrix models of known regulatory motifs in relatively long promoter sequences. These experiments have also shown the algorithm's ability to maintain diversity during search and discover multiple motifs within a single population. The utility of the algorithm for discovering motifs in real biological data is demonstrated by its ability to find meaningful motifs within muscle-specific regulatory sequences.

White Rose Research Online
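
The abstract above refers to position frequency matrix models of motifs. As a rough illustration of what such a model is, the sketch below builds a matrix of per-position base counts from a few aligned sites and slides it along a sequence. It is not the paper's evolutionary algorithm; the function names `build_pfm` and `best_match`, the example sites, and the simple summed-count score are all assumptions made for illustration.

```python
from collections import Counter

ALPHABET = "ACGT"

def build_pfm(sites):
    """Count how often each base occurs at each position of equal-length sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pfm = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        pfm.append({base: counts.get(base, 0) for base in ALPHABET})
    return pfm

def best_match(pfm, sequence):
    """Return (offset, score) of the window with the highest summed counts."""
    width = len(pfm)
    best = (-1, -1)
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        # Bases outside ACGT (for example N) contribute nothing to the score.
        score = sum(pfm[i].get(base, 0) for i, base in enumerate(window))
        if score > best[1]:
            best = (offset, score)
    return best

# Hypothetical aligned binding sites, chosen only to make the example concrete.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
pfm = build_pfm(sites)
offset, score = best_match(pfm, "GGCGTATAATGCC")
```

In practice the counts are usually converted to log-odds weights against a background base distribution before scoring; raw counts are used here only to keep the sketch short.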

A conserved filamentous assembly underlies the structure of the meiotic chromosome axis.

Author: Caballero Iracema
Corbett Kevin D
Hagemann Götz
Herzog Franz
Lehmer Madison K
MacQueen Amy J
Rosenberg Scott C
Ur Sarah N
Usón Isabel
West Alan Mv
Ye Qiaozhen
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

The meiotic chromosome axis plays key roles in meiotic chromosome organization and recombination, yet the underlying protein components of this structure are highly diverged. Here, we show that 'axis core proteins' from budding yeast (Red1), mammals (SYCP2/SYCP3), and plants (ASY3/ASY4) are evolutionarily related and play equivalent roles in chromosome axis assembly. We first identify 'closure motifs' in each complex that recruit meiotic HORMADs, the master regulators of meiotic recombination. We next find that axis core proteins form homotetrameric (Red1) or heterotetrameric (SYCP2:SYCP3 and ASY3:ASY4) coiled-coil assemblies that further oligomerize into micron-length filaments. Thus, the meiotic chromosome axis core in fungi, mammals, and plants shares a common molecular architecture, and likely also plays conserved roles in meiotic chromosome axis assembly and recombination control.

Open Access LMU

eScholarship - University of California

Digital.CSIC

Diversification of the Caenorhabditis heat shock response by Helitron transposable elements.

Author: Daugherty Matthew D
Garrigues Jacob M
Pasquinelli Amy E
Tsu Brian V
Publication venue: eScholarship, University of California
Publication date: 01/12/2019

Heat Shock Factor 1 (HSF-1) is a key regulator of the heat shock response (HSR). Upon heat shock, HSF-1 binds well-conserved motifs, called Heat Shock Elements (HSEs), and drives expression of genes important for cellular protection during this stress. Remarkably, we found that substantial numbers of HSEs in multiple Caenorhabditis species reside within Helitrons, a type of DNA transposon. Consistent with Helitron-embedded HSEs being functional, upon heat shock they display increased HSF-1 and RNA polymerase II occupancy and up-regulation of nearby genes in C. elegans. Interestingly, we found that different genes appear to be incorporated into the HSR by species-specific Helitron insertions in C. elegans and C. briggsae and by strain-specific insertions among different wild isolates of C. elegans. Our studies uncover previously unidentified targets of HSF-1 and show that Helitron insertions are responsible for rewiring and diversifying the Caenorhabditis HSR.

eScholarship - University of California

The Mathematics of Phylogenomics

Author: Pachter Lior
Sturmfels Bernd
Publication date: 01/01/2004

The grand challenges in biology today are being shaped by powerful high-throughput technologies that have revealed the genomes of many organisms, global expression patterns of genes and detailed information about variation within populations. We are therefore able to ask, for the first time, fundamental questions about the evolution of genomes, the structure of genes and their regulation, and the connections between genotypes and phenotypes of individuals. The answers to these questions are all predicated on progress in a variety of computational, statistical, and mathematical fields. The rapid growth in the characterization of genomes has led to the advancement of a new discipline called Phylogenomics. This discipline results from the combination of two major fields in the life sciences: Genomics, i.e., the study of the function and structure of genes and genomes; and Molecular Phylogenetics, i.e., the study of the hierarchical evolutionary relationships among organisms and their genomes. The objective of this article is to offer mathematicians a first introduction to this emerging field, and to discuss specific mathematical problems and developments arising from phylogenomics. Comment: 41 pages, 4 figures

arXiv.org e-Print Archive

CiteSeerX

Caltech Authors

PATTERNA: transcriptome-wide search for functional RNA elements via structural data signatures.

Author: Aviran Sharon
Ledda Mirko
Publication venue: eScholarship, University of California
Publication date: 01/03/2018

Establishing a link between RNA structure and function remains a great challenge in RNA biology. The emergence of high-throughput structure profiling experiments is revolutionizing our ability to decipher structure, yet principled approaches for extracting information on structural elements directly from these data sets are lacking. We present PATTERNA, an unsupervised pattern recognition algorithm that rapidly mines RNA structure motifs from profiling data. We demonstrate that PATTERNA detects motifs with an accuracy comparable to commonly used thermodynamic models and highlight its utility in automating data-directed structure modeling from large data sets. PATTERNA is versatile and compatible with diverse profiling techniques and experimental conditions.

eScholarship - University of California

Identification of motifs in biological sequences using genetic programming

Author: Universitat Autònoma de Barcelona. Escola d'Enginyeria
Velasco Àlex
Publication date: 01/01/2020

Current tools for motif discovery search for patterns that are over-represented in DNA sequences but do not use DNA curvature or cofactors associated with protein binding. We developed a tool that searches for motifs with a variable gap between patterns. The search uses a genetic programming algorithm that proposes candidate motif models and tries to fit them to a set of positive sequences containing the motif against a control dataset. To evaluate the fitness of the organisms we created an energy model for each component of the regulated bacterial promoters. The final genetic algorithm is able to find hidden motifs in synthetic sequences and real biological sequences.

Diposit Digital de Documents de la UAB

Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

Author: A Ambesi-Impiombato
A Blais
A Eto
A Subramanian
AE Kel
AG Clark
AL Lam
AM McGuire
Anat Reiner
Assif Yitzhaky
B Ren
C Kimura-Yoshida
C Plessy
C Yang
CT Harbison
D Pfeifer
D Wang
DB Allison
E Emberly
E Segal
Eytan Domany
FP Roth
GC Pipes
GC Yuan
GQ Yao
GZ Hertz
H Li
H Lodish
J Zheng
JD Hughes
JL DeRisi
JQ Ling
K Frech
K Quandt
KD MacIsaac
L Amir-Zilberstein
L Elnitski
L Marino-Ramirez
L McCue
M Ashburner
M Kellis
M Milyavsky
MA Nobrega
Mark Koudritsky
MC Frith
ML Howard
ML Whitfield
N Rajewsky
Or Zuk
P Carninci
P Cliften
PM Haverty
PR Buckland
R Elkon
R Liu
R Sharan
Ran Brosh
S Aerts
S Rashi-Elkeles
S Tavazoie
SJ Cooper
SJ Ho Sui
Sui Huang
U Gerland
Varda Rotter
WW Wasserman
X Xie
Y Barash
Y Benjamini
Y Tabach
Yossi Buganim
Yuval Tabach
Z Wang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2007

We introduce a novel method to screen the promoters of a set of genes with shared biological function against a precompiled library of motifs, and to find those motifs that are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold, independently for every location window (measured with respect to the TSS), taking into account the location-dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high-throughput analysis, searching the promoters (from 200 bp downstream to 1000 bp upstream of the TSS) of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and for 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional class and location-specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast. Comment: 31 pages, including Supplementary Information and figures

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central
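
The last abstract above speaks of motifs that are statistically over-represented in a gene set. One common way to quantify over-representation is a hypergeometric tail probability, sketched below. The abstract does not say which test the authors used, so the choice of test, the function name `hypergeom_tail`, and the example numbers are assumptions made for illustration only.

```python
from math import comb

def hypergeom_tail(total, with_motif, set_size, hits):
    """P(X >= hits) when set_size genes are drawn without replacement
    from total genes, with_motif of which carry the motif."""
    upper = min(with_motif, set_size)
    denom = comb(total, set_size)
    numer = sum(
        comb(with_motif, k) * comb(total - with_motif, set_size - k)
        for k in range(hits, upper + 1)
    )
    return numer / denom

# Hypothetical numbers: 10 genes in the background, 5 carry the motif,
# and both genes of a 2-gene functional set carry it.
p_value = hypergeom_tail(total=10, with_motif=5, set_size=2, hits=2)
```

A small tail probability suggests the motif occurs in the gene set more often than chance would predict. A screen over many gene sets and motifs, as described in the abstract, would also need a multiple-testing correction.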