
    Extreme Scale De Novo Metagenome Assembly

    Metagenome assembly is the process of transforming a set of short, overlapping, and potentially erroneous DNA segments from environmental samples into an accurate representation of the underlying microbiomes' genomes. State-of-the-art tools require large shared-memory machines and cannot handle contemporary metagenome datasets that exceed terabytes in size. In this paper, we introduce the MetaHipMer pipeline, a high-quality and high-performance metagenome assembler that employs an iterative de Bruijn graph approach. MetaHipMer leverages a specialized scaffolding algorithm that produces long scaffolds and accommodates the idiosyncrasies of metagenomes. MetaHipMer is end-to-end parallelized using the Unified Parallel C language and therefore can run seamlessly on shared and distributed-memory systems. Experimental results show that MetaHipMer matches or outperforms the state-of-the-art tools in terms of accuracy. Moreover, MetaHipMer scales efficiently to large concurrencies and is able to assemble previously intractable grand challenge metagenomes. We demonstrate the unprecedented capability of MetaHipMer by computing the first full assembly of the Twitchell Wetlands dataset, consisting of 7.5 billion reads totaling 2.6 TBytes. Comment: Accepted to SC1
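
As a rough illustration of the de Bruijn graph idea mentioned in the abstract above, the sketch below builds a toy graph from reads: each k-mer contributes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. The function name `de_bruijn_graph` and the sample reads are hypothetical, and this is not MetaHipMer's implementation, which is parallel and handles errors and iteration over k in ways not shown here.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a toy de Bruijn graph (hypothetical helper, not MetaHipMer code).

    Nodes are (k-1)-mers; every k-mer found in a read adds an edge
    from its prefix (first k-1 bases) to its suffix (last k-1 bases).
    """
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k across the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Illustrative reads only; real inputs would be billions of sequencer reads.
reads = ["ACGT", "CGTA"]
graph_k3 = de_bruijn_graph(reads, 3)
graph_k4 = de_bruijn_graph(reads, 4)
```

An iterative approach would, under this simplification, rebuild the graph for a sequence of increasing k values, as the two calls at the end hint; how results are carried from one k to the next is outside the scope of this sketch.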

    Be bold and take a challenge: could motivational strategies improve help-seeking?

    Part of the motivation behind the evolution of learning environments is the idea of providing students with individualized instructional strategies that allow them to learn as much as possible. It has been suggested that the goals an individual holds create a framework or orientation from which they react and respond to events. There is a large evidence-based literature which supports the notion of mastery and performance approaches to learning and which identifies distinct behavioural patterns associated with each. However, it remains unclear how these orientations manifest themselves within the individual: an important question to address when applying goal theory to the development of a goal-sensitive learner model. This paper exposes some of these issues by describing two empirical studies. Although they approach the subject from different perspectives, one from the implementation of an affective computing system and the other from a classroom-based study, both encountered the same empirical and theoretical problems: the dispositional/situational aspect and the dimensionality of goal orientation.

    Assembly, quantification, and downstream analysis for high throughput sequencing data

    Next Generation Sequencing is a set of relatively recent but already well-established technologies with a wide range of applications in the life sciences. Although these technologies are constantly being improved, multiple challenging problems remain in the analysis of high throughput sequencing data. In particular, genome assembly is still hindered by structural properties of genomes, such as single nucleotide polymorphisms and repeats, and by drawbacks of the technologies themselves, such as sequencing errors, all of which hamper the reconstruction of the true reference genomes. Other issues arise in transcriptome quantification and differential gene expression analysis. Processing millions of reads requires sophisticated algorithms that can compute gene expression with high precision and in a reasonable amount of time. In downstream analysis, the ultimate computational task is to infer the activity of biological pathways (e.g., metabolic). With many overlapping pathways, the challenge is to infer the role of each gene in the activity of a given pathway. Assigning the products of a gene to the wrong pathway may result in misleading differential activity analysis and, thus, wrong scientific conclusions. In this dissertation I present several algorithmic solutions to some of the problems enumerated above. In particular, I designed a scaffolding algorithm for genome assembly and created new tools for differential expression analysis of genes and biological pathways.

    A hybrid method for the analysis of learner behaviour in active learning environments

    Software-mediated learning requires adjustments in the teaching and learning process. In particular, active learning facilitated through interactive learning software differs from traditional instructor-oriented, classroom-based teaching. We present behaviour analysis techniques for Web-mediated learning. Motivation, acceptance of the learning approach and technology, learning organisation and actual tool usage are aspects of behaviour that require different analysis techniques. A hybrid method based on a combination of survey methods and Web usage mining techniques can provide accurate and comprehensive analysis results. These techniques allow us to evaluate active learning approaches implemented in the form of Web tutorials.

    Technology, Pedagogy and Digital Production: A Case Study of Children Learning New Media Skills

    This article presents an analysis of data from a project which investigated children and young people's learning of digital cultures in informal settings in Britain. The project aimed to build links between young people's leisure and learning experiences, by engaging with the content and styles of learning connected with digital cultures in homes and community centres. The focus of this article is on a computer games making course for young people aged 9–13. The article looks specifically at issues around technology and pedagogy. Questions are raised about types of software used with this age range, and the article includes a discussion of the models of learning which describe young people's interactions with digital cultures.

    REAPR: a universal tool for genome assembly evaluation.

    Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/

    BIGMAC : breaking inaccurate genomes and merging assembled contigs for long read metagenomic assembly.

    Background: The problem of de-novo assembly for metagenomes using only long reads is gaining attention. We study whether post-processing metagenomic assemblies with the original input long reads can result in quality improvement. Previous approaches have focused on pre-processing reads and optimizing assemblers. BIGMAC takes an alternative perspective to focus on the post-processing step. Results: Using both the assembled contigs and original long reads as input, BIGMAC first breaks the contigs at potentially mis-assembled locations and subsequently scaffolds contigs. Our experiments on metagenomes assembled from long reads show that BIGMAC can improve assembly quality by reducing the number of mis-assemblies while maintaining or increasing N50 and N75. Moreover, BIGMAC shows the largest N75 to number of mis-assemblies ratio on all tested datasets when compared to other post-processing tools. Conclusions: BIGMAC demonstrates the effectiveness of the post-processing approach in improving the quality of metagenomic assemblies.
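
For readers unfamiliar with the N50 and N75 statistics named in the abstract above, the sketch below computes them under the common definition: sort contig lengths from longest to shortest and report the length at which the running total first reaches 50% (or 75%) of all assembled bases. The function name `nxx` and the contig lengths are hypothetical and are not taken from BIGMAC.

```python
def nxx(lengths, percent=50):
    """Return the Nxx statistic for a list of contig lengths.

    Hypothetical helper: walks contigs from longest to shortest and
    returns the length of the contig at which the cumulative sum first
    reaches `percent` percent of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Made-up contig lengths for illustration; total is 300 bases.
contigs = [80, 70, 50, 40, 30, 20, 10]
n50 = nxx(contigs, 50)  # 80 + 70 = 150 reaches 50% of 300, so N50 is 70
n75 = nxx(contigs, 75)  # 80 + 70 + 50 + 40 = 240 reaches 75%, so N75 is 40
```

Because N75 requires covering more of the assembly, it is never larger than N50 for the same set of contigs.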

    The Use of a Mock Environment Summit to Support Learning about Global Climate Change

    NOTE: This is a large file, 26.6 mb in size! This article advocates the use of a Learner-Centered Environment (LCE) to teach Earth System Science. In this instance, LCE takes the form of a mock environmental summit in which students play the roles of country representatives and participate in activities such as writings, class discussions, presentations and negotiations. Rubrics developed for each activity are used both to assess student learning and to communicate feedback to students about their work. The study suggests that the adoption of an LCE enhanced student learning of content and critical skills. The frequent student presentations were found to play a major role in student learning. The rubrics served as scaffolding for knowledge construction, helped students to self-assess and maintain their quality of work, and allowed instructors to provide quick and efficient feedback. The development of basic learner-centered tools and teaching practices will help Earth System Science instructors provide learning environments most suitable for their discipline. Educational levels: Graduate or professional