Microbiome profiling by Illumina sequencing of combinatorial sequence-tagged PCR products
We developed a low-cost, high-throughput microbiome profiling method that
uses combinatorial sequence tags attached to PCR primers that amplify the rRNA
V6 region. Amplified PCR products are sequenced using an Illumina paired-end
protocol to generate millions of overlapping reads. Combinatorial sequence
tagging can be used to examine hundreds of samples with far fewer primers than
is required when sequence tags are incorporated at only a single end. The
number of reads generated permitted saturating or near-saturating analysis of
samples of the vaginal microbiome. The large number of reads allowed an
in-depth analysis of errors, and we found that PCR-induced errors composed the
vast majority of non-organism derived species variants, an observation that
has significant implications for sequence clustering of similar high-throughput
data. We show that the short reads are sufficient to assign organisms to the
genus or species level in most cases. We suggest that this method will be
useful for the deep sequencing of any short nucleotide region that is
taxonomically informative; these include the V3 and V5 regions of the bacterial
16S rRNA genes and the eukaryotic V9 region that is gaining popularity for
sampling protist diversity.
Comment: 28 pages, 13 figures
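To make the primer saving from combinatorial tagging concrete, here is a minimal Python sketch. The tag sequences, sample names, and the function `assign_tag_pairs` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import product


def assign_tag_pairs(forward_tags, reverse_tags, samples):
    """Give each sample a unique (forward tag, reverse tag) combination."""
    pairs = list(product(forward_tags, reverse_tags))
    if len(samples) > len(pairs):
        raise ValueError("not enough tag combinations for the samples")
    return dict(zip(samples, pairs))


# Hypothetical tags and sample names, for illustration only.
forward_tags = ["ACGT", "TGCA", "GATC"]
reverse_tags = ["CCAA", "GGTT", "AATT", "TTGG"]
samples = [f"sample_{i}" for i in range(12)]

layout = assign_tag_pairs(forward_tags, reverse_tags, samples)

# Tagging both ends needs 3 + 4 tagged primers for 12 samples.
primers_combinatorial = len(forward_tags) + len(reverse_tags)
# Tagging a single end needs one tagged primer per sample.
primers_single_end = len(samples)
```

With n forward and m reverse tags, n × m samples can be distinguished using n + m tagged primers, which is why the saving grows quickly as the number of samples rises.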
CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment
Background
Searching for similarities in protein and DNA databases has become a routine procedure in molecular biology. The Smith-Waterman algorithm has been available for more than 25 years. It is based on a dynamic programming approach that explores all the possible alignments between two sequences; as a result it returns the optimal local alignment. Unfortunately, the computational cost is very high, requiring a number of operations proportional to the product of the lengths of the two sequences. Furthermore, the exponential growth of protein and DNA databases makes the Smith-Waterman algorithm impractical for similarity searches in large sets of sequences. For these reasons, heuristic approaches such as those implemented in FASTA and BLAST tend to be preferred, allowing faster execution times at the cost of reduced sensitivity. The main motivation of our work is to exploit the huge computational power of commonly available graphics cards to develop high-performance solutions for sequence alignment.
Results
In this paper we present what we believe is the fastest solution of the exact Smith-Waterman algorithm running on commodity hardware. It is implemented in the recently released CUDA programming environment by NVIDIA. CUDA allows direct access to the hardware primitives of the latest-generation G80 Graphics Processing Units (GPUs). Speeds of more than 3.5 GCUPS (Giga Cell Updates Per Second) are achieved on a workstation running two GeForce 8800 GTX cards. Exhaustive tests have been done to compare our implementation to SSEARCH and BLAST, running on a 3 GHz Intel Pentium IV processor. Our solution was also compared to a recently published GPU implementation and to a Single Instruction Multiple Data (SIMD) solution. These tests show that our implementation performs from 2 to 30 times faster than any previous attempt available on commodity hardware.
Conclusions
The results show that graphics cards are now sufficiently advanced to be used as efficient hardware accelerators for sequence alignment. Their performance is better than any alternative available on commodity hardware platforms. The solution presented in this paper allows large-scale alignments to be performed at low cost, using the exact Smith-Waterman algorithm instead of the widely adopted heuristic approaches.
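The dynamic programming recurrence behind the Smith-Waterman algorithm can be sketched in a few lines of plain Python. This is a CPU-only illustration, not the CUDA implementation described in the abstract; the linear gap penalty and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for simplicity.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of a and b.

    Assumes a linear gap penalty; only the score is computed, not the
    alignment itself, so two rows of the matrix are enough.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

The inner statement runs once per cell, len(a) × len(b) times in total. That product is the "number of operations" mentioned in the Background and the unit counted by the GCUPS figure in the Results.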
Winter Bird Assemblages in Rural and Urban Environments: A National Survey
Urban development has a marked effect on the ecological and behavioural traits of many living
organisms, including birds. In this paper, we analysed differences in the numbers of wintering
birds between rural and urban areas in Poland. We also analysed species richness
and abundance in relation to longitude, latitude, human population size, and landscape
structure. All these parameters were analysed using modern statistical techniques incorporating
species detectability. We counted birds in 156 squares (0.25 km² each) in December
2012 and again in January 2013 in locations in and around 26 urban areas across Poland
(in each case we surveyed 3 squares in the urban area and 3 squares in nearby rural areas). The influence
of twelve potential environmental variables on species abundance and richness was
assessed with Generalized Linear Mixed Models, Principal Components and Detrended
Correspondence Analyses. Totals of 72 bird species and 89,710 individual birds were recorded
in this study. On average (±SE) 13.3 ± 0.3 species and 288 ± 14 individuals were recorded
in each square in each survey. A formal comparison of rural and urban areas
revealed that 27 species had a significant preference: 17 for rural areas and 10 for urban areas. Moreover, overall abundance in urban areas was more than double that of rural
areas. There was almost a complete separation of rural and urban bird communities. Significantly
more birds and more bird species were recorded in January compared to December.
We conclude that differences between rural and urban areas in terms of winter conditions
and the availability of resources are reflected in different bird communities in the two
environments.
Estimating Animal Abundance in Ground Beef Batches Assayed with Molecular Markers
Estimating animal abundance in industrial-scale batches of ground meat is important for mapping meat products through the manufacturing process and for effectively tracing the finished product during a food safety recall. The processing of ground beef involves a potentially large number of animals from diverse sources in a single product batch, which produces a high heterogeneity in capture probability. In order to estimate animal abundance through DNA profiling of ground beef constituents, two parameter-based statistical models were developed for incidence data. Simulations were applied to evaluate the maximum likelihood estimate (MLE) of a joint likelihood function from multiple surveys, showing superiority in the presence of high capture heterogeneity with small sample sizes, or comparable estimation in the presence of low capture heterogeneity with a large sample size, when compared to other existing models. Our model employs the full information on the pattern of the capture-recapture frequencies from multiple samples. We applied the proposed models to estimate animal abundance in six manufacturing beef batches, genotyped using 30 single nucleotide polymorphism (SNP) markers, from a large-scale beef grinding facility. Results show that between 411 and 1367 animals were present in the six manufacturing beef batches. These estimates are informative as a reference for improving recall processes and tracing finished meat products back to source.
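For readers unfamiliar with capture-recapture estimation, the basic idea can be illustrated with the classical two-sample Chapman estimator. This is a deliberately simpler, swapped-in technique, not the authors' models: it assumes equal capture probability for every animal, which is exactly the assumption the abstract says fails in ground beef batches. The genotype IDs below are hypothetical.

```python
def chapman_estimate(sample_1, sample_2):
    """Chapman's bias-corrected two-sample capture-recapture estimate.

    sample_1 and sample_2 are sets of distinct animal identifiers
    (for example, SNP genotype profiles) seen in two independent samples.
    """
    n1, n2 = len(sample_1), len(sample_2)
    m = len(sample_1 & sample_2)  # animals seen in both samples
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical genotype IDs, for illustration only.
first = set(range(50))        # 50 distinct animals in the first sample
second = set(range(40, 80))   # 40 in the second, 10 of them seen before
estimate = chapman_estimate(first, second)
```

The fewer animals that reappear in the second sample, the larger the estimated batch; models such as those in the abstract extend this idea to multiple samples and unequal capture probabilities.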
Measuring Global Credibility with Application to Local Sequence Alignment
Computational biology is replete with high-dimensional (high-D) discrete prediction and inference problems, including sequence alignment, RNA structure prediction, phylogenetic inference, motif finding, prediction of pathways, and model selection problems in statistical genetics. Even though prediction and inference in these settings are uncertain, little attention has been focused on the development of global measures of uncertainty. Regardless of the procedure employed to produce a prediction, when a procedure delivers a single answer, that answer is a point estimate selected from the solution ensemble, the set of all possible solutions. For high-D discrete space, these ensembles are immense, and thus there is considerable uncertainty. We recommend the use of Bayesian credibility limits to describe this uncertainty, where a 100(1−α)%, 0≤α≤1, credibility limit is the minimum Hamming distance radius of a hyper-sphere containing 100(1−α)% of the posterior distribution. Because sequence alignment is arguably the most extensively used procedure in computational biology, we employ it here to make these general concepts more concrete. The maximum similarity estimator (i.e., the alignment that maximizes the likelihood) and the centroid estimator (i.e., the alignment that minimizes the mean Hamming distance from the posterior weighted ensemble of alignments) are used to demonstrate the application of Bayesian credibility limits to alignment estimators. Application of Bayesian credibility limits to the alignment of 20 human/rodent orthologous sequence pairs and 125 orthologous sequence pairs from six Shewanella species shows that credibility limits of the alignments of promoter sequences of these species vary widely, and that centroid alignments dependably have tighter credibility limits than traditional maximum similarity alignments.
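The credibility limit defined above can be illustrated on a toy ensemble small enough to enumerate. In this sketch, solutions are encoded as equal-length strings and the posterior is given as an explicit dictionary; both are assumptions made for illustration, since real alignment ensembles are far too large to list and the function names here are hypothetical.

```python
def hamming(x, y):
    """Number of positions at which two equal-length encodings differ."""
    if len(x) != len(y):
        raise ValueError("encodings must have equal length")
    return sum(a != b for a, b in zip(x, y))


def credibility_limit(estimate, ensemble, alpha):
    """Smallest Hamming radius around `estimate` that holds at least
    a fraction 1 - alpha of the posterior mass.

    `ensemble` maps each solution to its posterior probability.
    """
    mass_by_distance = {}
    for solution, prob in ensemble.items():
        d = hamming(estimate, solution)
        mass_by_distance[d] = mass_by_distance.get(d, 0.0) + prob
    covered = 0.0
    for d in sorted(mass_by_distance):
        covered += mass_by_distance[d]
        if covered >= 1 - alpha - 1e-12:  # tolerance for float rounding
            return d
    raise ValueError("posterior mass is below 1 - alpha")


# Toy posterior over four encoded solutions, for illustration only.
posterior = {"0000": 0.5, "0001": 0.3, "0011": 0.15, "1111": 0.05}
radius_95 = credibility_limit("0000", posterior, alpha=0.05)
radius_80 = credibility_limit("0000", posterior, alpha=0.20)
```

A smaller radius means the posterior is concentrated near the point estimate, which is the sense in which the abstract calls one estimator's credibility limits "tighter" than another's.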
PLAST: parallel local alignment search tool for database comparison
Background: Sequence similarity searching is an important and challenging task in molecular biology and next-generation sequencing should further strengthen the need for faster algorithms to process such vast amounts of data. At the same time, the internal architecture of current microprocessors is tending towards more parallelism, leading to the use of chips with two, four and more cores integrated on the same die. The main purpose of this work was to design an effective algorithm to fit with the parallel capabilities of modern microprocessors. Results: A parallel algorithm for comparing large genomic banks and targeting middle-range computers has been developed and implemented in PLAST software. The algorithm exploits two key parallel features of existing and future microprocessors: the SIMD programming model (SSE instruction set) and the multithreading concept (multicore). Compared to multithreaded BLAST software, tests performed on an 8-processor server have shown speedup ranging from 3 to 6 with a similar level of accuracy. Conclusions: A parallel algorithmic approach driven by the knowledge of the internal microprocessor architecture allows significant speedup to be obtained while preserving standard sensitivity for similarity search problems.
Numerical study of circulation on the inner Amazon Shelf
Author Posting. © Springer, 2008. This is the author's version of the work. It is posted here by permission of Springer for personal use, not for redistribution. The definitive version was published in Ocean Dynamics 58 (2008): 187-198, doi:10.1007/s10236-008-0139-4.
We studied the circulation on the coastal
domain of the Amazon Shelf by applying the hydrodynamic
module of the Estuarine and Coastal Ocean
Model and Sediment Transport - ECOMSED. The first
barotropic experiment aimed to explain the major bathymetric
effects on tides and those generated by anisotropy
in sediment distribution. We analyzed the continental
shelf response of barotropic tides under realistic bottom
stress parametrization (Cd), considering sediment granulometry
obtained from a faciologic map, where river
mud deposits and reworked sediment areas are well distinguished,
among other classes of sediments. Very low
Cd values were set in the fluid mud regions off the Amapá
coast (1.0 × 10⁻⁴), in contrast to values around 3.5 × 10⁻³
for coarser sediment regions off the Pará coast. Three-dimensional
experiments represented the Amazon River
discharge and trade winds, combined with barotropic tide
influences and induced vertical mixing. The quasi-resonant
response of the Amazon Shelf to the M2 tide acts on
the local hydrodynamics by increasing tidal admittance,
along with tidal forcing at the shelf break and extensive
fluid mud regions. Harmonic analysis of modeled
currents agreed well with analysis of the AMASSEDS
observational data set. Tidal-induced vertical shear provided
strong homogenization of threshold waters, which
are subject to a kind of hydraulic control due to the topographic
steepness. Ahead of the hydraulic jump, the
low-salinity plume is disconnected from the bottom and
acquires negative vorticity, turning southeastward. Tides
act as a generator mechanism and topography, via hydraulic
control, as a maintainer mechanism for the low-salinity
frontal zone positioning. The tidally induced southeastward
plume fate is overwhelmed by northwestward
trade winds, which, along with background circulation,
probably play the most important role in the plume fate
and variability over the Amazon Shelf.
3-Methyl-1-butanol production in Escherichia coli: random mutagenesis and two-phase fermentation
Interest in producing biofuels from renewable sources has escalated due to energy and environmental concerns. Recently, the production of higher chain alcohols from 2-keto acid pathways has shown significant progress. In this paper, we demonstrate a mutagenesis approach in developing a strain of Escherichia coli for the production of 3-methyl-1-butanol by leveraging selective pressure toward L-leucine biosynthesis and screening for increased alcohol production. Random mutagenesis and selection with 4-aza-D,L-leucine, a structural analogue of L-leucine, resulted in the development of a new strain of E. coli able to produce 4.4 g/L of 3-methyl-1-butanol. Investigation of the host’s sensitivity to 3-methyl-1-butanol directed development of a two-phase fermentation process in which titers reached 9.5 g/L of 3-methyl-1-butanol with a yield of 0.11 g/g glucose after 60 h.
The Views of Mental Health Manager Towards the Use of a Family Work Model for Psychosis in Guangzhou, China
Family Interventions in Psychosis (FIP) have been promoted internationally but have been criticised for being based on western cultural models. This paper reports on a focus group study with 10 Integrated Mental Health Service Managers in Guangzhou, China, using thematic analysis. Managers believed FIP might benefit families but identified potential difficulties due to (a) families avoiding services due to the ‘shame’ of mental illness, (b) unrealistic expectations of services amongst families, and (c) deferral to ‘key decision-makers’ within families when discussing family issues with workers. The findings indicate that FIP work should focus on interaction between carers in the first instance, with service users being introduced into sessions at a later date, and that more attention needs to be given by the research community to how FIP may be adapted to cultural norms within China.