Telescoper: de novo assembly of highly repetitive regions.
Motivation: With advances in sequencing technology, it has become faster and cheaper to obtain short-read data from which to assemble genomes. Although there has been considerable progress in the field of genome assembly, producing high-quality de novo assemblies from short reads remains challenging, primarily because of the complex repeat structures found in the genomes of most higher organisms. The telomeric regions of many genomes are particularly difficult to assemble, though much could be gained from the study of these regions, as their evolution has not been fully characterized and they have been linked to aging. Results: In this article, we tackle the problem of assembling highly repetitive regions by developing a novel algorithm that iteratively extends long paths through a series of read-overlap graphs and evaluates them based on a statistical framework. Our algorithm, Telescoper, uses short- and long-insert libraries in an integrated way throughout the assembly process. Results on real and simulated data demonstrate that our approach can effectively resolve much of the complex repeat structures found in the telomeres of yeast genomes, especially when longer long-insert libraries are used. Availability: Telescoper is publicly available for download at sourceforge.net/p/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
Decoding coalescent hidden Markov models in linear time
In many areas of computational biology, hidden Markov models (HMMs) have been
used to model local genomic features. In particular, coalescent HMMs have been
used to infer ancient population sizes, migration rates, divergence times, and
other parameters such as mutation and recombination rates. As more loci,
sequences, and hidden states are added to the model, however, the runtime of
coalescent HMMs can quickly become prohibitive. Here we present a new algorithm
for reducing the runtime of coalescent HMMs from quadratic in the number of
hidden time states to linear, without making any additional approximations. Our
algorithm can be incorporated into various coalescent HMMs, including the
popular method PSMC for inferring variable effective population sizes. Here we
implement this algorithm to speed up our demographic inference method diCal,
which is equivalent to PSMC when applied to a sample of two haplotypes. We
demonstrate that the linear-time method can reconstruct a population size
change history more accurately than the quadratic-time method, given similar
computation resources. We also apply the method to data from the 1000 Genomes
project, inferring a high-resolution history of size changes in the European
population. Comment: 18 pages, 5 figures. To appear in the Proceedings of the 18th Annual
International Conference on Research in Computational Molecular Biology
(RECOMB 2014). The final publication is available at link.springer.co
An application of psychometric models and ATI methodology to the evaluation of instruction.
Education. Doctor of Education (Ed.D.)
Deep Learning for Population Genetic Inference
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Compensatory approaches and engagement techniques to gain flood storage in England and Wales
Flood storage involves creating sacrificial land that water can purposefully inundate in order to protect land downstream. Obtaining the right or co-operation to flood private property remains a challenge. This paper, based on empirical qualitative research with 14 key stakeholders involved in the practice of gaining land to flood in England and Wales, examines the different forms of financial and economic approach that might be used to facilitate this right. Expropriation of land, one-off payment, annual single payment and flood event losses compensation were explored. Availability of funding as compensation is the main driver for landowner adoption of flood storage schemes. Three funding approaches were revealed: flowage easement, full land purchase and Agricultural Schemes funding diffuse storage. Rather than in attempting to gain partnerships between spatially dislocated stakeholders in upper storage and lower impacted catchments, success resides in the storage land and in persuading landowners to co-operate. A clear, enforced legal framework of ownership of land and funding mechanisms is also viewed as essential.
Deciphering Courts of Appeals Decisions Using the U.S. Courts of Appeals Data Base
Is one circuit significantly more conservative or liberal than the others? Do circuit courts consistently avoid deciding the substance of certain appeals by concluding that the plaintiffs lack standing? Have state governments been more successful than other parties when they appeal adverse district court rulings? Do appeals courts act in a majoritarian or countermajoritarian manner with regard to elected institutions and the general public? The United States Courts of Appeals Data Base, an extensive data set of courts of appeals decisions, can address these and other questions about the circuit courts. This article describes the background, scope, and content of the database, explains how to use it, and illustrates applications to research questions of interest to the diverse law and social science community interested in courts of appeals
Evaluating circadian dysfunction in mouse models of Alzheimer's disease: Where do we stand?
Circadian dysfunction has been described in patients with symptomatic Alzheimer's disease (AD), as well as in presymptomatic phases of the disease. Modeling this circadian dysfunction in mouse models would provide an optimal platform for understanding mechanisms and developing therapies. While numerous studies have examined behavioral circadian function, and in some cases clock gene oscillation, in mouse models of AD, the results are variable and inconsistent across models, ages, and conditions. Ultimately, circadian changes observed in APP/PS1 models are inconsistent across studies and do not always replicate circadian phenotypes observed in human AD. Other models, including the 3xTG mouse, tau transgenic lines, and the accelerated aging SAMP8 line, show circadian phenotypes more consistent with human AD, although the literature is either inconsistent or minimal. We summarize these data and provide some recommendations to improve and standardize future studies of circadian function in AD mouse models.