

Targeted Assembly of Short Sequence Reads

Author: H Li
H Li
H Li
JD Freeman
JT Simpson
LD Stein
M Rasmussen
Olivier Lespinet
R Goya
R Li
R Li
R Morin
René L. Warren
RK Nam
RL Warren
RL Warren
RM Durbin
Robert A. Holt
S Nacu
SP Shah
WR Jeck
Publication date: 01/01/2011

As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants, by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphism, fusion and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible and easy-to-use tool for targeted assembly.
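
The read-recruitment step described in this abstract can be illustrated with a minimal sketch. It is not TASR's code: the function names, the word length `k`, and the rule "keep a read if it shares at least one exact word with the target or its reverse complement" are assumptions made for illustration, and the stringent assembly step is not shown.

```python
def kmers(seq, k):
    """Return the set of all length-k words (substrings) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def recruit_reads(target, reads, k=15):
    """Keep reads that share at least one exact k-mer with the target.

    Hypothetical recruitment rule: words from both strands of the
    target are indexed, and a read is kept if any of its own k-mers
    appears in that index. Input order is preserved.
    """
    words = kmers(target, k) | kmers(revcomp(target), k)
    return [read for read in reads if kmers(read, k) & words]


if __name__ == "__main__":
    target = "ACGTACGTTAGC"
    reads = ["TTACGTACGG", "GGGGGGGGGG", "GCTAACGT"]
    # A small k is used only because the toy sequences are short.
    print(recruit_reads(target, reads, k=6))
```

Only the recruited reads would then be passed to an assembler, which is why the approach avoids processing the whole data set.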

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Simon Fraser University Institutional Repository

Nature Precedings

Prediction of RNA secondary structure by maximizing pseudo-expected accuracy

Author: B Knudsen
C Do
D Mathews
H Kiryu
I Hofacker
I Holmes
IL Hofacker
JS McCaskill
K Sato
Kengo Sato
Kiyoshi Asai
L Carvalho
L Kall
M Andronescu
M Andronescu
M Hamada
M Hamada
M Hamada
M Hamada
M Parisien
M Zuker
M Zuker
MC Frith
Michiaki Hamada
N Michal
P Baldi
PP Gardner
R Durbin
RK Bradley
RK Bradley
S Bernhart
S Engelen
S Griffiths-Jones
S Gross
S Seemann
SJ Schroeder
Y Ding
Y Ding
Y Ding
ZJ Lu
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure prediction; therefore, new types of estimators, including the maximum expected accuracy (MEA) estimator, have been proposed. The MEA-based estimators have been designed to maximize the expected accuracy of the base-pairs and have achieved the highest level of accuracy. Those methods, however, do not give the single best prediction of the structure, but employ parameters to control the trade-off between the sensitivity and the positive predictive value (PPV). It is unclear what parameter value we should use, and even the well-trained default parameter value does not, in general, give the best result under popular accuracy measures for each RNA sequence. Results: Instead of using the expected values of the popular accuracy measures for RNA secondary structure prediction, which are difficult to calculate, the pseudo-expected accuracy, which can easily be computed from base-pairing probabilities, is introduced. It is shown that the pseudo-expected accuracy is a good approximation in terms of sensitivity, PPV, MCC, or F-score. The pseudo-expected accuracy can be approximately maximized for each RNA sequence by stochastic sampling. It is also shown that well-balanced secondary structures between sensitivity and PPV can be predicted with a small computational overhead by combining the pseudo-expected accuracy of MCC or F-score with the γ-centroid estimator. Conclusions: This study gives not only a method for predicting the secondary structure that balances between sensitivity and PPV, but also a general method for approximately maximizing the (pseudo-)expected accuracy with respect to various evaluation measures including MCC and F-score.
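
As a rough illustration of how an accuracy measure might be computed from base-pairing probabilities, the sketch below plugs expected true/false positive and negative counts into the usual F-score and MCC formulas. This reading of "pseudo-expected accuracy" is an assumption drawn from the abstract, not the paper's verified definition; the function names and the toy probabilities are hypothetical.

```python
import math


def pseudo_expected_counts(pairs, probs, n):
    """Expected TP, FP, FN, TN for a candidate structure.

    pairs: set of predicted base pairs (i, j) with i < j
    probs: dict mapping (i, j), i < j, to a base-pairing probability
    n:     sequence length, so there are n*(n-1)/2 possible pairs
    """
    tp = sum(probs.get(p, 0.0) for p in pairs)   # predicted and probably paired
    fp = len(pairs) - tp                         # predicted but probably unpaired
    fn = sum(probs.values()) - tp                # probable pairs left unpredicted
    tn = n * (n - 1) / 2 - tp - fp - fn          # everything else
    return tp, fp, fn, tn


def pseudo_f_score(pairs, probs, n):
    """F-score evaluated on the expected counts."""
    tp, fp, fn, _ = pseudo_expected_counts(pairs, probs, n)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def pseudo_mcc(pairs, probs, n):
    """Matthews correlation coefficient evaluated on the expected counts."""
    tp, fp, fn, tn = pseudo_expected_counts(pairs, probs, n)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    probs = {(0, 5): 0.9, (1, 4): 0.8, (2, 3): 0.1}  # toy values
    print(pseudo_f_score({(0, 5), (1, 4)}, probs, n=6))
    print(pseudo_f_score({(0, 5), (1, 4), (2, 3)}, probs, n=6))
```

In this toy case, adding the low-probability pair (2, 3) lowers the score, which is the kind of comparison that lets candidate structures (for example, sampled ones) be ranked without a tunable trade-off parameter.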

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Early readmission and length of hospitalization practices in the Dialysis Outcomes and Practice Patterns Study (DOPPS)

Author: Ashton CM
Cline CM
Daugirdas JT
Dhingra RK
Durbin CG
Fetter RB
Foley RN
Ganesh SK
Harnett JD
Hofer TP
Jaar BG
Martens KH
Muramatsu N
Nassar GM
Polanczyk CA
Port FK
Schwartz WB
Taheri PA
United States Renal Data System
United States Renal Data System
United States Renal Data System
Ward MM
Weinberger M
Woods JD
Yamaoka K
Publication venue: Wiley
Publication date: 01/07/2004

Background: Rising hospital care costs have created pressure to shorten hospital stays and emphasize outpatient care. This study tests the hypothesis that shorter median length of stay (LOS) as a dialysis facility practice is associated with higher rates of early readmission. Methods: Readmission within 30 days of each hospitalization was evaluated for participants in the Dialysis Outcomes and Practice Patterns Study, an observational study of randomly selected hemodialysis patients in the United States (142 facilities, 5095 patients with hospitalizations), five European countries (101 facilities, 2281 patients with hospitalizations), and Japan (58 facilities, 883 patients with hospitalizations). Associations between median facility LOS (estimated from all hospitalizations at the facility and interpreted as a dialysis facility practice pattern) and odds of readmission were assessed using logistic regression, adjusted for patient characteristics and the LOS of each index hospitalization. Results: Risk of readmission was directly and significantly associated with LOS of the index hospitalization (adjusted odds ratio [AOR] 1.005 per day in median facility LOS, p = 0.007) and inversely associated with median facility LOS (AOR = 0.974 per day, p = 0.016). This latter association was strongest for US hemodialysis centers (AOR = 0.954 per day, p = 0.015). Conclusions: Dialysis facilities with shorter median hospital LOS for their patients have higher odds of readmission, particularly in the United States, where there is greater pressure to shorten LOS. The determinants and consequences of practices related to hospital LOS for hemodialysis patients should be further studied. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/73641/1/j.1492-7535.2004.01107.x.pd
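
A per-day odds ratio from a logistic regression compounds multiplicatively, because the model is linear in the log-odds. The short sketch below applies that standard property to the AOR values quoted in the abstract; the three-day difference is a hypothetical example, not a figure from the study.

```python
import math


def odds_ratio_for_difference(aor_per_day, days):
    """Odds ratio implied by a difference of `days` in the predictor.

    Equivalent to exp(days * beta), where beta = ln(aor_per_day) is
    the logistic-regression coefficient on the log-odds scale.
    """
    return math.exp(days * math.log(aor_per_day))


if __name__ == "__main__":
    # Hypothetical comparison: facilities whose median LOS differs by 3 days.
    print(odds_ratio_for_difference(0.974, 3))  # all countries combined
    print(odds_ratio_for_difference(0.954, 3))  # US centers only
```

Under this reading, a facility with a median LOS three days longer would have roughly 0.92 times the odds of readmission (0.974 cubed), all else being equal in the adjusted model.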

Crossref

Deep Blue Documents

PicXAA-R: Efficient structural alignment of multiple RNA sequences using a greedy approach

Author: A Wilm
A Wilm
AO Harmanci
AS Schwartz
B Paten
Byung-Jun Yoon
C Do
C Notredame
CB Do
CB Do
CB Do
D Dalli
D Sankoff
DH Mathews
DH Mathews
FF Costa
G Storz
H Kiryu
H Kiryu
I Holmes
IL Hofacker
IL Hofacker
IL Hofacker
J Gorodkin
JH Havgaard
JH Havgaard
JS McCaskill
K Katoh
M Anwar
M Bauer
M Hamada
M Hamada
R Durbin
RD Dowell
RK Bradley
RK Bradley
S Griffiths-Jones
S Lindgreen
S Moretti
S Siebert
S Wang
S Washietl
S Will
Sayed Mohammad Ebrahim Sahraeian
SM Sahraeian
SR Eddy
U Roshan
X Xu
Y Tabei
ZJ Lu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Accurate and efficient structural alignment of non-coding RNAs (ncRNAs) has attracted increasing attention as recent studies have unveiled the significance of ncRNAs in living organisms. Because Sankoff-style structural alignment algorithms cannot efficiently handle multiple sequences, progressive schemes are mostly used to reduce the complexity. However, this idea tends to propagate early-stage errors throughout the entire process, thereby degrading the quality of the final alignment. For multiple protein sequence alignment, we have recently proposed PicXAA, which constructs an accurate alignment in a non-progressive fashion. Results: Here, we propose PicXAA-R as an extension to PicXAA for greedy structural alignment of ncRNAs. PicXAA-R efficiently captures both folding information within each sequence and local similarities between sequences. It uses a set of probabilistic consistency transformations to improve the posterior base-pairing and base alignment probabilities using the information of all sequences in the alignment. Using a graph-based scheme, we greedily build up the structural alignment from sequence regions with high base-pairing and base alignment probabilities. Conclusions: Several experiments on datasets with different characteristics confirm that PicXAA-R is one of the fastest algorithms for structural alignment of multiple RNAs and that it consistently yields accurate alignment results, especially for datasets with locally similar sequences. PicXAA-R source code is freely available at: http://www.ece.tamu.edu/~bjyoon/picxaa/.

Crossref

Directory of Open Access Journals

PubMed Central

OAKTrust Digital Repository (Texas A&M Univ)

Seasonal and Long-Term Changes in Relative Abundance of Bull Sharks from a Tourist Shark Feeding Site in Fiji

Author: A Cruz-Martínez
AA Myrberg
ALF Castro
BG Yeiser
Brian Gratwicke
C Cater
C Ward-Paige
CG Meyer
E Clua
E Rasalato
FF Snelson
GH Burgess
GJ Edgar
Harald Baensch
IF Porcher
J Altmann
J Catlin
J Dobson
J Dobson
J Durbin
JC Carrier
JH Colonello
JK Carlson
JM Brunnschweiler
JM Brunnschweiler
JM Brunnschweiler
Juerg M. Brunnschweiler
MA Samoilys
MB Orams
ML Dicken
ML Dicken
ML Domeier
MR Heithaus
MT O'Connell
N Buray
NE Kohler
RA Myers
RK Laroche
YP Papastamatiou
Publication venue: Public Library of Science
Publication date: 01/01/2011

Shark tourism has become increasingly popular, but remains controversial because of major concerns originating from the need of tour operators to use bait or chum to reliably attract sharks. We used direct underwater sampling to document changes in bull shark (Carcharhinus leucas) relative abundance at the Shark Reef Marine Reserve, a shark feeding site in Fiji, and the reproductive cycle of the species in Fijian waters. Between 2003 and 2009, the total number of C. leucas counted on each day ranged from 0 to 40. Whereas the number of C. leucas counted at the feeding site increased over the years, shark numbers decreased over the course of a calendar year, with the fewest animals counted in November. Externally visible reproductive status information indicates that the species' seasonal departure from the feeding site may be related to reproductive activity.

Repository for Publications and Research Data

Crossref

Directory of Open Access Journals

PubMed Central

Modeling the Evolution of Regulatory Elements by Simultaneous Detection and Alignment with Phylogenetic Pair HMMs

Author: A Loytynoja
A Siepel
A Siepel
A Viterbi
AL Halpern
AM Moses
AP Boyle
B Langmead
D Stanojevic
DA Pollard
DL Gumucio
DS Hirschberg
G Wray
GP Wagner
I Holmes
J Felsenstein
J Hawkins
JC Bryne
JD Thompson
JL Thorne
K Wong
MS Halfon
MZ Ludwig
MZ Ludwig
N Saitou
PR Ray
R Durbin
R Satija
R Siddharthan
RC Edgar
RK Bradley
RW Lusk
Uwe Ohler
W Huang
WH Majoros
WH Majoros
William H. Majoros
WJ Kent
WJL Quesne
Wyeth W. Wasserman
X He
Publication venue: Public Library of Science
Publication date: 01/01/2010

The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

DukeSpace (Duke Univ.)

MDC Repository

Responsibility Ascriptions in Technology Development and Engineering: Three Perspectives

In recent decades, increasing attention has been paid to the topic of responsibility in technology development and engineering. The discussion of this topic is often guided by questions related to liability and blameworthiness. Recent discussions in engineering ethics call for a reconsideration of the traditional quest for responsibility. Rather than on alleged wrongdoing and blaming, the focus should shift to more socially responsible engineering, some authors argue. The present paper aims at exploring the different approaches to responsibility in order to see which one is most appropriate to apply to engineering and technology development. Using the example of the development of a new sewage water treatment technology, the paper shows how different approaches for ascribing responsibilities have different implications for engineering practice in general, and R&D or technological design in particular. It was found that there was a tension between the demands that follow from these different approaches, most notably between efficacy and fairness. Although the consequentialist approach with its efficacy criterion turned out to be most powerful, it was also shown that the fairness of responsibility ascriptions should somehow be taken into account. It is proposed to look for alternative, more procedural ways to approach the fairness of responsibility ascriptions.

Crossref

TU Delft Repository

Springer - Publisher Connector

PubMed Central

Histone Deacetylase Inhibitors Globally Enhance H3/H4 Tail Acetylation Without Affecting H3 Lysine 56 Acetylation

Author: A Battu
A Battu
A Loyola
A Verreault
AM Falick
B Schwer
BD Strahl
BS Mann
C Campas-Moya
C Das
CA Davey
DL Swaney
E Michishita
F van Leeuwen
GA Orsi
H Masumoto
HM Prince
I Celic
J Ye
J Yuan
JE Bolden
JH Waterborg
JV Tjeertes
K Halkidou
K Luger
KA Lo
KM Miller
KR Durbin
L Stimson
LJ Benson
M Bots
M Dokmanovic
M Yoshida
MF Fraga
MJ Lee
N Siuti
OK Song
P Drogaris
PN Dyer
RE Sobel
RK Vempati
RK Vempati
RW Johnstone
S Anoopkumar-Dukie
S Kong
S Minucci
T Beckers
T Kouzarides
V Morales
W Xie
WF Marzluff
YP Ninios
Publication venue: Nature Publishing Group

Histone deacetylase inhibitors (HDACi) represent a promising avenue for cancer therapy. We applied mass spectrometry (MS) to determine the impact of clinically relevant HDACi on global levels of histone acetylation. Intact histone profiling revealed that the HDACi SAHA and MS-275 globally increased histone H3 and H4 acetylation in both normal diploid fibroblasts and transformed human cells. Histone H3 lysine 56 acetylation (H3K56ac) recently elicited much interest and controversy due to its potential as a diagnostic and prognostic marker for a broad diversity of cancers. Using quantitative MS, we demonstrate that H3K56ac is much less abundant than previously reported in human cells. Unexpectedly, in contrast to H3/H4 N-terminal tail acetylation, H3K56ac did not increase in response to inhibitors of each class of HDACs. In addition, we demonstrate that antibodies raised against H3K56ac peptides cross-react against H3 N-terminal tail acetylation sites that carry sequence similarity to residues flanking H3K56.

Crossref

PubMed Central

Trait Variation in Yeast Is Defined by Population History

A fundamental goal in biology is to achieve a mechanistic understanding of how and to what extent ecological variation imposes selection for distinct traits and favors the fixation of specific genetic variants. Key to such an understanding is the detailed mapping of the natural genomic and phenomic space and a bridging of the gap that separates these worlds. Here we chart a high-resolution map of natural trait variation in one of the most important genetic model organisms, the budding yeast Saccharomyces cerevisiae, and its closest wild relatives and trace the genetic basis and timing of major phenotype-changing events in its recent history. We show that natural trait variation in S. cerevisiae exceeds that of its relatives, despite limited genetic variation, and follows the population history rather than the source environment. In particular, the West African population is phenotypically unique, with an extreme abundance of low-performance alleles, notably a premature translational termination signal in GAL3 that causes inability to utilize galactose. Our observations suggest that many S. cerevisiae traits may be the consequence of genetic drift rather than selection, in line with the assumption that natural yeast lineages are remnants of recent population bottlenecks. Disconcertingly, the universal type strain S288C was found to be highly atypical, highlighting the danger of extrapolating gene-trait connections obtained in mosaic, lab-domesticated lineages to the species as a whole. Overall, this study represents a step towards an in-depth understanding of the causal relationship between co-variation in ecology, selection pressure, natural traits, molecular mechanism, and alleles in a key model organism.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Leicester Research Archive

Parameters for accurate genome alignment

Author: A Morgulis
A Morgulis
A Schwartz
A Stark
B Paten
CH Yuh
CN Dewey
D Gusfield
D Karolchik
D States
DA Pollard
E Kim
EH Margulies
F Chiaromonte
G Benson
G Lunter
G Lunter
I Holmes
J Ruan
J Wang
JC Wootton
JE Janecka
JO Kriegs
JT Reese
KD Pruitt
KM Wong
LA Newberg
LE Carvalho
M Brudno
M Hamada
Martin C Frith
MC Frith
Michiaki Hamada
MS Waterman
Paul Horton
PP Gardner
R Durbin
RC Friedman
RK Bradley
S Karlin
S Kumar
S Miyazawa
S Schwartz
S Sheetlin
SF Altschul
SF Altschul
TJ Treangen
W Huang
WJ Kent
WJ Kent
YK Yu
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Genome sequence alignments form the basis of much research. Genome alignment depends on various mundane but critical choices, such as how to mask repeats and which score parameters to use. Surprisingly, there has been no large-scale assessment of these choices using real genomic data. Moreover, rigorous procedures to control the rate of spurious alignment have not been employed. Results: We have assessed 495 combinations of score parameters for alignment of animal, plant, and fungal genomes. As our gold-standard of accuracy, we used genome alignments implied by multiple alignments of proteins and of structural RNAs. We found the HOXD scoring schemes underlying alignments in the UCSC genome database to be far from optimal, and suggest better parameters. Higher values of the X-drop parameter are not always better. E-values accurately indicate the rate of spurious alignment, but only if tandem repeats are masked in a non-standard way. Finally, we show that γ-centroid (probabilistic) alignment can find highly reliable subsets of aligned bases. Conclusions: These results enable more accurate genome alignment, with reliability measures for local alignments and for individual aligned bases. This study was made possible by our new software, LAST, which can align vertebrate genomes in a few hours (http://last.cbrc.jp/).
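
For readers unfamiliar with the X-drop parameter mentioned in this abstract, the sketch below shows the general idea in its simplest, gapless form: an extension stops once the running score falls more than X below the best score seen so far. It is a generic illustration, not LAST's implementation; the function name, the +1/-1 scores, and the toy sequences are assumptions.

```python
def xdrop_extend(a, b, match=1, mismatch=-1, x=3):
    """Gapless left-to-right extension with an X-drop stopping rule.

    Compares a and b position by position, tracking the running score.
    Returns (best_score, best_length): the highest score reached and
    the number of aligned positions at which it was reached.
    """
    score = best = best_len = 0
    for length, (ca, cb) in enumerate(zip(a, b), start=1):
        score += match if ca == cb else mismatch
        if score > best:
            best, best_len = score, length
        elif best - score > x:
            break  # score dropped more than x below the best: give up
    return best, best_len


if __name__ == "__main__":
    a = "AAAATTTTAAAAAA"
    b = "AAAACCCCAAAAAA"
    print(xdrop_extend(a, b, x=3))   # stops inside the mismatched stretch
    print(xdrop_extend(a, b, x=10))  # tolerates it and extends further
```

A larger X lets the extension bridge the mismatched stretch and reach the matching region beyond it. Whether bridging such a stretch is desirable depends on the data, which is consistent with the abstract's remark that higher values are not always better.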

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central