
51 research outputs found

Recommended from our members

Taxonomic Classification of Bacterial 16S rRNA Genes Using Short Sequencing Reads: Evaluation of Effective Study Designs

Author: Davenport Emily R.
Gilad Yoav
Mizrahi-Man Orna
Publication date: 18/01/2024

Massively parallel high-throughput sequencing technologies allow us to interrogate the microbial composition of biological samples at unprecedented resolution. The typical approach is to perform high-throughput sequencing of 16S rRNA genes, which are then taxonomically classified based on similarity to known sequences in existing databases. Current technologies pose a predicament, though: although they enable deep coverage of samples, they are limited in the length of sequence they can produce. As a result, high-throughput studies of microbial communities often do not sequence the entire 16S rRNA gene. The challenge is to obtain a reliable representation of bacterial communities through taxonomic classification of short 16S rRNA gene sequences. In this study we explored properties of different study designs and developed specific recommendations for effective use of short-read sequencing technologies for the purpose of interrogating bacterial communities, with a focus on classification using naïve Bayesian classifiers. To assess precision and coverage of each design, we used a collection of ∼8,500 manually curated 16S rRNA gene sequences from cultured bacteria and a set of over one million bacterial 16S rRNA gene sequences retrieved from environmental samples, respectively. We also tested different configurations of taxonomic classification approaches using short-read sequencing data, and provide recommendations for optimal choice of the relevant parameters. We conclude that with a judicious selection of the sequenced region and the corresponding choice of a suitable training set for taxonomic classification, it is possible to explore bacterial communities at great depth using current technologies, with only a minimal loss of taxonomic resolution.

Knowledge UChicago

A Framework for Exploring Functional Variability in Olfactory Receptor Genes

Author: Crasto Chiquito J.
Gilad Yoav
Man Orna
Shepherd Gordon M.
Willhite David C.
Publication date: 17/01/2024

Background: Olfactory receptors (ORs) are the largest gene family in mammalian genomes. Since nearly all OR genes are orphan receptors, inference of functional similarity or differences between odorant receptors typically relies on sequence comparisons. Based on the alignment of entire coding region sequence, OR genes are classified into families and subfamilies, a classification that is believed to be a proxy for OR gene functional variability. However, the assumption that overall protein sequence diversity is a good proxy for functional properties is untested. Methodology: Here, we propose an alternative sequence-based approach to infer the similarities and differences in OR binding capacity. Our approach is based on similarities and differences in the predicted binding pockets of OR genes, rather than on the entire OR coding region. Conclusions: Interestingly, our approach yields markedly different results compared to the analysis based on the entire OR coding regions. While neither approach can be tested at this time, the discrepancy between the two calls into question the assumption that the current classification reliably reflects OR gene functional variability.

Knowledge UChicago

FoldIndex©: a simple tool to predict whether a given protein sequence is intrinsically unfolded

Author: Beckmann Jacques S.
Felder Clifford E.
Man Orna
Prilusky Jaime
Rydberg Edwin H.
Silman Israel
Sussman Joel L.
Zeev-Ben-Mordehai Tzviya
Publication date: 02/08/2017

Summary: An easy-to-use, versatile and freely available graphic web server, FoldIndex© is described: it predicts if a given protein sequence is intrinsically unfolded implementing the algorithm of Uversky and co-workers, which is based on the average residue hydrophobicity and net charge of the sequence. FoldIndex© has an error rate comparable to that of more sophisticated fold prediction methods. Sliding windows permit identification of large regions within a protein that possess folding propensities different from those of the whole protein. Availability: FoldIndex© can be accessed at http://bioportal.weizmann.ac.il/fldbin/findex Contact: [email protected] Supplementary information: http://www.weizmann.ac.il/sb/faculty_pages/Sussman/papers/suppl/Prilusky_200
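The abstract above says the prediction rests on average residue hydrophobicity and net charge, computed over the whole sequence and over sliding windows. The following is a minimal sketch of that idea, not the web server's own code. It assumes the commonly cited form of the score, 2.785<H> - |<R>| - 1.151, with <H> taken as the mean Kyte-Doolittle hydropathy rescaled to the range 0 to 1 and <R> as the mean net charge counting K and R as +1 and D and E as -1. The function names and the window size are hypothetical choices made for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def fold_index(sequence: str) -> float:
    """Score a sequence; positive suggests folded, negative unfolded.

    Assumed formula: 2.785 * <H> - |<R>| - 1.151 (see the lead-in).
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")

    # Mean hydropathy, rescaled from [-4.5, 4.5] to [0, 1].
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)

    # Mean net charge: K and R count +1, D and E count -1 (assumption).
    net = sum(aa in "KR" for aa in seq) - sum(aa in "DE" for aa in seq)
    mean_r = abs(net) / len(seq)

    return 2.785 * mean_h - mean_r - 1.151


def sliding_fold_index(sequence: str, window: int = 51) -> list[float]:
    """Score every full-length window; the window size is an illustrative choice."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        fold_index(sequence[start:start + window])
        for start in range(len(sequence) - window + 1)
    ]
```

Calling `sliding_fold_index(seq, window=51)` returns one score per window position, so a run of negative values marks a region that this sketch would flag as likely unfolded even when the score for the whole protein is positive.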

RERO DOC Digital Library

A Framework for Exploring Functional Variability in Olfactory Receptor Genes

Author: Chiquito J. Crasto
Yoav Gilad
Orna Man
Gordon M. Shepherd
David C. Willhite
Publication venue: Public Library of Science

BACKGROUND: Olfactory receptors (ORs) are the largest gene family in mammalian genomes. Since nearly all OR genes are orphan receptors, inference of functional similarity or differences between odorant receptors typically relies on sequence comparisons. Based on the alignment of entire coding region sequence, OR genes are classified into families and subfamilies, a classification that is believed to be a proxy for OR gene functional variability. However, the assumption that overall protein sequence diversity is a good proxy for functional properties is untested. METHODOLOGY: Here, we propose an alternative sequence-based approach to infer the similarities and differences in OR binding capacity. Our approach is based on similarities and differences in the predicted binding pockets of OR genes, rather than on the entire OR coding region. CONCLUSIONS: Interestingly, our approach yields markedly different results compared to the analysis based on the entire OR coding-regions. While neither approach can be tested at this time, the discrepancy between the two calls into question the assumption that the current classification reliably reflects OR gene functional variability

Crossref

Directory of Open Access Journals

PubMed Central

A reanalysis of mouse ENCODE comparative gene expression data [v1; ref status: indexed, http://f1000r.es/5ez]

Author: Orna Mizrahi-Man
Yoav Gilad
Publication venue: F1000 Research Ltd
Publication date: 01/05/2015

Recently, the Mouse ENCODE Consortium reported that comparative gene expression data from human and mouse tend to cluster more by species rather than by tissue. This observation was surprising, as it contradicted much of the comparative gene regulatory data collected previously, as well as the common notion that major developmental pathways are highly conserved across a wide range of species, in particular across mammals. Here we show that the Mouse ENCODE gene expression data were collected using a flawed study design, which confounded sequencing batch (namely, the assignment of samples to sequencing flowcells and lanes) with species. When we account for the batch effect, the corrected comparative gene expression data from human and mouse tend to cluster by tissue, not by species

Directory of Open Access Journals

Functional Characterization of Variations on Regulatory Motifs

Author: Lapidot Michal
Orna Mizrahi-Man
Yitzhak Pilpel
Publication date: 01/03/2008

Transcription factors (TFs) regulate gene expression through specific interactions with short promoter elements. The same regulatory protein may recognize a variety of related sequences. Moreover, once they are detected it is hard to predict whether highly similar sequence motifs will be recognized by the same TF and regulate similar gene expression patterns, or serve as binding sites for distinct regulatory factors. We developed computational measures to assess the functional implications of variations on regulatory motifs and to compare the functions of related sites. We have developed computational means for estimating the functional outcome of substituting a single position within a binding site and applied them to a collection of putative regulatory motifs. We predict the effects of nucleotide variations within motifs on gene expression patterns. In cases where such predictions could be compared to suitable published experimental evidence, we found very good agreement. We further accumulated statistics from multiple substitutions across various binding sites in an attempt to deduce general properties that characterize nucleotide substitutions that are more likely to alter expression. We found that substitutions involving Adenine are more likely to retain the expression pattern and that substitutions involving Guanine are more likely to alter expression compared to the rest of the substitutions. Our results should facilitate the prediction of the expression outcomes of binding site variations. One typical important implication i

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Correction: Functional Characterization of Variations on Regulatory Motifs

Author: Lapidot Michal
Mizrahi-Man Orna
Pilpel Yitzhak
Publication venue: Public Library of Science
Publication date: 01/06/2008

Crossref

Directory of Open Access Journals

PubMed Central

Data files and codes used in the reanalysis of the mouse encode comparative gene expression data

Author: Orna Mizrahi-Man
Yoav Gilad

We provide supplementary files of the Python code used to process and prepare the data for analysis with R, along with the data files for that code. We also provide the R code we used to perform the different analyses as supplementary files, as well as the input for the R code. Please see the supplementary text files for more details.

FigShare