
    Performance analysis of a parallel, multi-node pipeline for DNA sequencing

    Post-sequencing DNA analysis typically consists of read mapping followed by variant calling and is very time-consuming, even on a multi-core machine. Recently, we proposed Halvade, a parallel, multi-node implementation of a DNA sequencing pipeline according to the GATK Best Practices recommendations. The MapReduce programming model is used to distribute the workload among different workers. In this paper, we study the impact of different hardware configurations on the performance of Halvade. Benchmarks indicate that the lack of good multithreading capabilities in the existing tools (BWA, SAMtools, Picard, GATK) in particular causes suboptimal scaling behavior. We demonstrate that it is possible to circumvent this bottleneck by using multiprocessing on high-memory machines rather than multithreading. Using a 15-node cluster with 360 CPU cores in total, this results in a runtime of 1 h 31 min. Compared to a single-threaded runtime of approximately 12 days, this corresponds to an overall parallel efficiency of 53%
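
The efficiency figure quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the usual definition of parallel efficiency (speedup divided by core count) and treats the single-threaded runtime as exactly 12 days, although the abstract gives that value only approximately. The function name is hypothetical.

```python
def parallel_efficiency(serial_hours, parallel_hours, cores):
    """Return speedup (serial time / parallel time) divided by the core count."""
    speedup = serial_hours / parallel_hours
    return speedup / cores


serial_hours = 12 * 24        # assumption: exactly 12 days, single-threaded
parallel_hours = 1 + 31 / 60  # 1 h 31 min on the cluster
efficiency = parallel_efficiency(serial_hours, parallel_hours, cores=360)
print(f"{efficiency:.0%}")    # prints 53%
```

Under these assumptions the speedup is roughly 190 on 360 cores, which agrees with the reported 53%.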

    Systematic review of acute physically active learning and classroom movement breaks on children's physical activity, cognition, academic performance and classroom behaviour: understanding critical design features.

    Objective: To examine the impact of acute classroom movement break (CMB) and physically active learning (PAL) interventions on physical activity (PA), cognition, academic performance and classroom behaviour. Design: Systematic review. Data sources: PubMed, EBSCO, Academic Search Complete, Education Resources Information Center, PsycINFO, SPORTDiscus, SCOPUS and Web of Science. Eligibility criteria for selecting studies: Studies investigating the effect of school-based acute bouts of CMB or PAL on PA, cognition, academic performance and classroom behaviour. The Downs and Black checklist assessed risk of bias. Results: Ten PAL and eight CMB studies were identified from 2929 potentially relevant articles. Risk of bias scores ranged from 33% to 64.3%. Variation in study designs drove specific, but differing, outcomes. Three studies assessed PA using objective measures. Interventions replaced sedentary time with either light PA or moderate-to-vigorous PA dependent on design characteristics (mode, duration and intensity). Only one study factored individual PA outcomes into analyses. Classroom behaviour improved after longer moderate-to-vigorous (>10 min), or shorter more intense (5 min), CMB/PAL bouts (9 out of 11 interventions). There was no support for enhanced cognition or academic performance due to limited repeated studies. Conclusion: Low-to-medium quality designs predominate in investigations of the acute impacts of CMB and PAL on PA, cognition, academic performance and classroom behaviour. Variable quality in experimental designs, outcome measures and intervention characteristics affects outcomes, making conclusions problematic. CMB and PAL increased PA and enhanced time on task. To improve confidence in study outcomes, future investigations should combine examples of good practice observed in current studies. PROSPERO registration number: CRD42017070981

    Illuminating Choices for Library Prep: A Comparison of Library Preparation Methods for Whole Genome Sequencing of Cryptococcus neoformans Using Illumina HiSeq.

    The next-generation sequencing industry is constantly evolving, with novel library preparation methods and new sequencing machines being released by the major sequencing technology companies annually. The Illumina TruSeq v2 library preparation method was the most widely used kit and the market leader; however, it has now been discontinued, and in 2013 was replaced by the TruSeq Nano and TruSeq PCR-free methods, leaving a gap in knowledge regarding which is the most appropriate library preparation method to use. Here, we used isolates from the pathogenic fungus Cryptococcus neoformans var. grubii and sequenced them using the existing TruSeq DNA v2 kit (Illumina), along with two new kits: the TruSeq Nano DNA kit (Illumina) and the NEBNext Ultra DNA kit (New England Biolabs) to provide a comparison. Compared to the original TruSeq DNA v2 kit, both newer kits gave equivalent or better sequencing data, with increased coverage. When comparing the two newer kits, we found little difference in cost and workflow, with the NEBNext Ultra both slightly cheaper and faster than the TruSeq Nano. However, the quality of data generated using the TruSeq Nano DNA kit was superior due to higher coverage at regions of low GC content, and more SNPs identified. Researchers should therefore evaluate their resources and the type of application (and hence data quality) being considered when ultimately deciding on which library prep method to use

    The PKR-binding domain of adenovirus VA RNAI exists as a mixture of two functionally non-equivalent structures

    VA RNAI is a non-coding adenoviral transcript that counteracts host cell anti-viral defenses such as immune responses mediated via PKR. We investigated potential alternate secondary structure conformations within the PKR-binding domain of VA RNAI using site-directed mutagenesis, RNA UV-melting analysis and enzymatic RNA secondary structure probing. The latter data clearly indicated that the wild-type VA RNAI apical stem can adopt two different conformations and that it exists as a mixed population of these two structures. In contrast, two sequence variants that we designed to eliminate one of the possible structures while leaving the other intact each formed a unique secondary structure. This clarification of the apical stem pairing also suggests a small alteration to the apical stem–loop secondary structure. The relative ability of the two apical stem conformations to bind PKR and inhibit kinase activity was measured by isothermal titration calorimetry and PKR autophosphorylation inhibition assay. We found that the two sequence variants displayed markedly different activities, with one being a significantly poorer binder and inhibitor of PKR. Whether the presence of the VA RNAI conformation with reduced PKR inhibitory activity is directly beneficial to the virus in the cell for some other function requires further investigation

    Examining longitudinal associations between the recreational physical activity environment, change in body mass index, and obesity by age in 8864 Yorkshire Health Study participants.

    The environment may lead to lower body mass index (BMI) and obesity risk by providing opportunities to be physically active. However, while intuitively appealing, associations are often inconsistent in direction and small in scale. This longitudinal study examined if change in BMI and obesity was associated with the availability of physical activity (PA) facilities and parks and explored if these associations differed by age. Longitudinal data (n = 8,864, aged 18-86 years) were provided at baseline (wave I: 2010-2012) and follow up (wave II: 2013-2015) of the Yorkshire Health Study. BMI was calculated using self-reported height (cm) and weight (kg) (obesity = BMI≥30.00). To define availability, home addresses were geocoded based on postcode zone centroids and neighbourhood was defined as a 2 km radial buffer. PA facilities were sourced from Ordnance Survey Points of Interest (PoI) and parks were sourced from OpenStreetMap. Environmental data temporally matched individual-level baseline data collection. PA facilities (b = -0.006 [-0.015, 0.003]) and parks (b = -0.001 [-0.015, 0.013]) at baseline were not associated with change in BMI. Change in obesity was unrelated to parks (OR = 0.994 [0.975, 1.015]) and while PA facilities were related (OR = 0.979 [0.965, 0.993]), effects were small. A combined measure of the recreational PA environment including parks and PA facilities was unrelated to change in BMI and obesity. Despite this, statistically significant interactions by age were found for both PA facilities and parks with change in obesity. Based on the premise that an individual's mobility varies with age, and although effects were small, this offers tentative evidence that it may be useful for policymakers in Public Health and Planning to consider the impact of environmental interventions across the life course

    Neighbourhood typologies and associations with body mass index and obesity: a cross-sectional study

    Little research has investigated associations between a combined measure of the food and physical activity (PA) environment, body mass index (BMI) and obesity. Cross-sectional data (n=22,889, age 18-86 years) from the Yorkshire Health Study were used [2010-2013]. BMI was calculated using self-reported height and weight; obesity=BMI≥30. Neighbourhood was defined as a 2km radial buffer. Food outlets and PA facilities were sourced from Ordnance Survey Points of Interest (PoI) and categorised into ‘fast-food’, ‘large supermarkets’, ‘convenience and other food retail outlets’ and ‘physical activity facilities’. Parks were sourced from OpenStreetMap. Latent class analysis was conducted on these five environmental variables and availability was defined by quartiles of exposure. Linear and logistic regression were then conducted for BMI and obesity respectively for different neighbourhood types. Models adjusted for age, gender, ethnicity, area-level deprivation, and rural/urban classification. A five-class solution demonstrated best fit and was interpretable. Neighbourhood typologies were defined as: ‘low availability’, ‘moderate availability’, ‘moderate PA, limited food’, ‘saturated’ and ‘moderate PA, ample food’. Compared to low availability, one typology demonstrated lower BMI (saturated, b= -0.50, [95% CI= -0.76,-0.23]), while three showed higher BMI (moderate availability, b= 0.49 [0.27,0.72]; moderate PA, limited food, b=0.30 [0.01,0.59]; moderate PA, ample food, b=0.32 [0.08,0.57]). Furthermore, compared to low availability, saturated neighbourhoods showed lower odds of obesity (OR=0.86 [0.75,0.99]) while moderate availability showed greater odds of obesity (OR=1.18 [1.05,1.32]). This study supports population-level approaches to tackling obesity; however, neighbourhoods contained features that were both health-promoting and health-constraining

    Quantifying single nucleotide variant detection sensitivity in exome sequencing

    BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed. RESULTS: Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give “power estimates” for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5–15% of heterozygous and 1–4% of homozygous SNVs in the targeted regions will be missed. CONCLUSIONS: Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. Such alleles are likely to be of functional importance in population based studies of rare diseases, somatic mutations in cancer and explaining the “missing heritability” of quantitative traits
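
To make the "naive expectation of sampling from a binomial" in this abstract concrete, the sketch below computes the probability that both alleles of a heterozygous site are each observed at least a minimum number of times at a given read depth, assuming every read samples either allele with probability 0.5. The two-read minimum and the function name are assumptions chosen for illustration; they are not the calling criteria used in the study, whose empirical model found measured sensitivity to be substantially worse than this kind of idealised figure.

```python
from math import comb


def naive_het_sensitivity(depth, min_reads=2):
    """P(each allele is seen >= min_reads times), alt reads ~ Binomial(depth, 0.5)."""
    favourable = 0
    # Count outcomes where both the alt and the ref allele reach min_reads.
    for alt_reads in range(min_reads, depth - min_reads + 1):
        favourable += comb(depth, alt_reads)
    return favourable / 2 ** depth


# Hypothetical depths, chosen only to show how the idealised value grows.
for depth in (3, 8, 13, 20):
    print(depth, round(naive_het_sensitivity(depth), 4))
```

Because this idealised model ignores sequencing error, mapping bias and caller behaviour, it is best read as an upper bound to compare against, not as a prediction.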

    Moderate-to-Vigorous Physical Activity in Primary School Children: Inactive Lessons Are Dominated by Maths and English.

    BACKGROUND: A large majority of primary school pupils fail to achieve 30 min of daily, in-school moderate-to-vigorous physical activity (MVPA). The aim of this study was to investigate MVPA accumulation and subject frequency during academic lesson segments and the broader segmented school day. METHODS: 122 children (42.6% boys; 9.9 ± 0.3 years) from six primary schools in North East England wore uniaxial accelerometers for eight consecutive days. Subject frequency was assessed by teacher diaries. Multilevel models (children nested within schools) examined significant predictors of MVPA across each school-day segment (lesson one, break, lesson two, lunch, lesson three). RESULTS: Pupils averaged 18.33 ± 8.34 min of in-school MVPA, and 90.2% failed to achieve the in-school 30-min MVPA threshold. Across all school-day segments, MVPA accumulation was typically influenced at the individual level. Lessons one and two, dominated by maths and English, were less active than lesson three. Break and lunch were the most active segments. CONCLUSION: This study breaks new ground, revealing that MVPA accumulation and subject frequency vary greatly during different academic lessons. Morning lessons were dominated by the inactive delivery of maths and English, whereas afternoon lessons involved a greater array of subject delivery that resulted in marginally higher levels of MVPA

    Control of laser light by a plasma immersed in a tunable strong magnetic field

    The interaction between laser light and an underdense plasma immersed in a spatio-temporally tunable magnetic field is studied analytically and numerically. The transversely nonuniform magnetic field can serve as a magnetic channel, which can act on laser propagation in a similar way to a density channel. The envelope equation for laser intensity evolution is derived, which contains the effects of the magnetic channel and relativistic self-focusing. Due to the applied magnetic field, the critical laser power for relativistic self-focusing can be significantly reduced. Theory and particle-in-cell simulations show that a weakly relativistic laser pulse can propagate with a nearly constant peak intensity along the magnetic channel for a distance much longer than its Rayleigh length. By making the magnetic field tunable in both space and time, the simulation further shows that the magnetized plasma can then act as a lens of varying focal length to control the movement of the laser focal spot, decoupling the laser group velocity from the light speed c in vacuum