
    Jabba: hybrid error correction for long sequencing reads using maximal exact matches

    Third generation sequencing platforms produce longer reads with higher error rates than second generation sequencing technologies. While the improved read length can provide useful information for downstream analysis, the underlying algorithms are challenged by the high error rate. Error correction methods in which accurate short reads are used to correct noisy long reads are an attractive way to generate high-quality long reads. Methods that align short reads to long reads do not optimally use the information contained in the second generation data and suffer from long runtimes. Recently, a new hybrid error correction method has been proposed in which the second generation data are first assembled into a de Bruijn graph, onto which the long reads are then aligned. In this context we present Jabba, a hybrid method that corrects long third generation reads by mapping them onto a corrected de Bruijn graph constructed from second generation data. Unique to our method is that this mapping is built with a seed-and-extend methodology, using maximal exact matches (MEMs) as seeds. In addition to benchmark results, we present theoretical results concerning the possibilities and limitations of using maximal exact matches in the context of third generation reads.
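
    The sketch below makes the MEM-seeding idea concrete: shared k-mers between a long read and a reference are extended in both directions until the first mismatch, yielding maximal exact matches. It is a simplified, hypothetical illustration; the k-mer hash index, the value of k, and the toy sequences are not Jabba's, which computes MEMs against the de Bruijn graph with a suffix-array-based index.

    # Hedged sketch: MEM-style seeding via shared k-mers plus greedy extension.
    from collections import defaultdict

    def find_seeds(read, ref, k=15):
        """Return (read_pos, ref_pos, length) triples of maximally
        extended exact matches seeded from shared k-mers."""
        index = defaultdict(list)
        for i in range(len(ref) - k + 1):
            index[ref[i:i + k]].append(i)
        seeds, seen = [], set()
        for i in range(len(read) - k + 1):
            for j in index.get(read[i:i + k], ()):
                # extend the shared k-mer left and right while bases agree
                li, lj = i, j
                while li > 0 and lj > 0 and read[li - 1] == ref[lj - 1]:
                    li, lj = li - 1, lj - 1
                ri, rj = i + k, j + k
                while ri < len(read) and rj < len(ref) and read[ri] == ref[rj]:
                    ri, rj = ri + 1, rj + 1
                if (li, lj) not in seen:   # k-mers inside one match extend identically
                    seen.add((li, lj))
                    seeds.append((li, lj, ri - li))
        return seeds

    print(find_seeds("ACGTACGTTTGACGTACGT", "ACGTACGTAAGACGTACGT", k=6))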

    OMSim : a simulator for optical map data

    Motivation: The Bionano Genomics platform allows for the optical detection of short sequence patterns in very long DNA molecules (up to 2.5 Mbp). Molecules with overlapping patterns can be assembled to generate a consensus optical map of the entire genome. In turn, these optical maps can be used to validate or improve de novo genome assembly projects or to detect large-scale structural variation in genomes. Simulated optical map data can assist in the development and benchmarking of tools that operate on such data, such as alignment and assembly software. Additionally, it can help to optimize the experimental setup for a genome of interest. Such a simulator is currently not available. Results: We have developed a simulator, OMSim, that produces synthetic optical map data mimicking real Bionano Genomics data. These simulated data have been tested for compatibility with the Bionano Genomics Irys software system and the Irys-scaffolding scripts. OMSim is capable of handling very large genomes (over 30 Gbp) with high throughput and low memory requirements.
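
    As a rough illustration of what such a simulator must do, the sketch below locates a nicking-enzyme recognition motif in a reference sequence and perturbs the resulting label positions with missed labels, spurious labels, and sizing noise. The motif shown (GCTCTTC, the Nt.BspQI recognition site) is commonly used on the Bionano platform, but the error rates and Gaussian noise model are invented placeholders, not OMSim's actual parameters.

    # Toy optical-map simulation: label positions with false negatives,
    # false positives, and Gaussian sizing noise (all parameters invented).
    import random, re

    def simulate_molecule(genome, motif="GCTCTTC", fn_rate=0.1,
                          fp_per_100kb=1.0, sigma_bp=150):
        labels = [m.start() for m in re.finditer(motif, genome)]
        observed = []
        for pos in labels:
            if random.random() < fn_rate:   # missed label (false negative)
                continue
            observed.append(pos + random.gauss(0, sigma_bp))  # sizing noise
        n_fp = int(len(genome) / 100_000 * fp_per_100kb)      # spurious labels
        observed += [random.uniform(0, len(genome)) for _ in range(n_fp)]
        return sorted(max(0.0, p) for p in observed)

    random.seed(42)
    genome = "".join(random.choice("ACGT") for _ in range(200_000))
    print(simulate_molecule(genome)[:10])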

    Computational assessment of the feasibility of protonation-based protein sequencing

    Recent advances in DNA sequencing methods have revolutionized biology by providing highly accurate reads with high throughput or high read length. These read data are used in many biological and medical applications. Modern DNA sequencing methods have no equivalent in protein sequencing, severely limiting the widespread application of protein data. Recently, several optical protein sequencing methods have been proposed that rely on the fluorescent labeling of amino acids. Here, we introduce the reprotonation-deprotonation protein sequencing method. Unlike other methods, this proposed technique relies on the measurement of an electrical signal and requires no fluorescent labeling. In reprotonation-deprotonation protein sequencing, the terminal amino acid is identified through its unique protonation signal, and by repeatedly cleaving the terminal amino acids one by one, each amino acid in the peptide is measured. By means of simulations, we show that, given a reference database of known proteins, reprotonation-deprotonation sequencing has the potential to correctly identify proteins in a sample. Our simulations provide target values for the signal-to-noise ratios that sensor devices need to attain in order to detect reprotonation-deprotonation events, as well as suitable pH values and required measurement times per amino acid. For instance, an SNR of 10 is required for a 61.71% proteome recovery rate with a 100 ms measurement time per amino acid.
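
    A toy version of the identification step described above: each amino acid is assumed to emit a characteristic protonation signal, noisy readings are taken residue by residue as the terminus is cleaved, and the noisy trace is matched against a small reference database. The signal levels, noise model, and database are invented for illustration; the paper's simulations rest on pH-dependent protonation physics rather than these placeholders.

    # Hypothetical protonation-signal sequencing and database matching.
    import random

    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
    SIGNAL = {aa: i + 1.0 for i, aa in enumerate(AMINO_ACIDS)}  # invented levels

    def measure(protein, snr=10.0):
        """Noisy trace, one reading per cleaved terminal residue."""
        return [SIGNAL[aa] + random.gauss(0, SIGNAL[aa] / snr) for aa in protein]

    def identify(trace, database):
        """Return the database entry whose expected trace is closest (L2)."""
        def dist(seq):
            if len(seq) != len(trace):
                return float("inf")
            return sum((SIGNAL[aa] - x) ** 2 for aa, x in zip(seq, trace))
        return min(database, key=lambda name: dist(database[name]))

    random.seed(1)
    db = {"protA": "ACDEF", "protB": "GAVLI", "protC": "WYFHK"}
    print(identify(measure("GAVLI"), db))   # expected: protB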

    Jabba: hybrid error correction for long sequencing reads

    Background: Third generation sequencing platforms produce longer reads with higher error rates than second generation technologies. While the improved read length can provide useful information for downstream analysis, the underlying algorithms are challenged by the high error rate. Error correction methods in which accurate short reads are used to correct noisy long reads are an attractive way to generate high-quality long reads. Methods that align short reads to long reads do not optimally use the information contained in the second generation data and suffer from long runtimes. Recently, a new hybrid error correction method has been proposed in which the second generation data are first assembled into a de Bruijn graph, onto which the long reads are then aligned. Results: In this context we present Jabba, a hybrid method that corrects long third generation reads by mapping them onto a corrected de Bruijn graph constructed from second generation data. Unique to our method is the use of a pseudo-alignment approach with a seed-and-extend methodology, using maximal exact matches (MEMs) as seeds. In addition to benchmark results, we present theoretical results concerning the possibilities and limitations of using MEMs in the context of third generation reads. Conclusion: Jabba produces highly reliable corrected reads: almost all corrected reads align to the reference, and these alignments have very high identity. Many of the aligned reads are error-free. Additionally, Jabba corrects reads using very little CPU time. From this we conclude that pseudo-alignment with MEMs is a fast and reliable method to map long, highly erroneous sequences onto a de Bruijn graph.
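
    For readers unfamiliar with the underlying data structure, the sketch below builds the kind of de Bruijn graph onto which the long reads are mapped: nodes are (k-1)-mers from the short reads and edges are the observed k-mers. The value of k and the toy reads are illustrative only; Jabba's graph is additionally error-corrected before mapping.

    # Minimal de Bruijn graph construction from short reads.
    from collections import defaultdict

    def build_de_bruijn(reads, k=5):
        graph = defaultdict(set)   # (k-1)-mer -> set of successor (k-1)-mers
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                graph[kmer[:-1]].add(kmer[1:])
        return graph

    reads = ["ACGTACGGA", "GTACGGATT", "CGGATTACA"]
    for node, succs in sorted(build_de_bruijn(reads).items()):
        print(node, "->", ", ".join(sorted(succs)))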

    Optimal randomized multilevel algorithms for infinite-dimensional integration on function spaces with ANOVA-type decomposition

    In this paper, we consider the infinite-dimensional integration problem on weighted reproducing kernel Hilbert spaces with norms induced by an underlying function space decomposition of ANOVA-type. The weights model the relative importance of different groups of variables. We present new randomized multilevel algorithms to tackle this integration problem and prove upper bounds for their randomized error. Furthermore, we provide in this setting the first non-trivial lower error bounds for general randomized algorithms, which, in particular, may be adaptive or non-linear. These lower bounds show that our multilevel algorithms are optimal. Our analysis refines and extends the analysis provided in [F. J. Hickernell, T. Müller-Gronbach, B. Niu, K. Ritter, J. Complexity 26 (2010), 229-254], and our error bounds improve substantially on the error bounds presented there. As an illustrative example, we discuss the unanchored Sobolev space and employ randomized quasi-Monte Carlo multilevel algorithms based on scrambled polynomial lattice rules.
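
    To give a flavour of the multilevel construction, the sketch below estimates an integral over infinitely many variables by a telescoping sum over dimension-truncation levels, spending more samples on the cheaper, coarser levels. The integrand, truncation scheme, and sample allocation are illustrative, and plain i.i.d. Monte Carlo stands in for the scrambled polynomial lattice rules used in the paper.

    # Bare-bones randomized multilevel estimator (illustrative only).
    import random

    def f(x):   # integrand on [0,1]^d, truncated to len(x) variables
        return sum(xi ** 2 / 2 ** i for i, xi in enumerate(x))

    def mc(level, n):
        """Plain MC estimate of E[f_level - f_{level-1}], with d = 2^level vars."""
        d, total = 2 ** level, 0.0
        for _ in range(n):
            x = [random.random() for _ in range(d)]
            coarse = f(x[:d // 2]) if level > 0 else 0.0
            total += f(x) - coarse
        return total / n

    random.seed(0)
    L, estimate = 5, 0.0
    for level in range(L + 1):
        n = 4 ** (L - level) * 16   # more samples on cheaper, coarser levels
        estimate += mc(level, n)    # telescoping sum recovers f at level L
    print(estimate)   # approximates sum_i E[x_i^2]/2^i = (1/3)*2 = 2/3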

    Muon Energy Measurement from Radiative Losses in a Calorimeter for a Collider Detector

    The performance demands of future particle-physics experiments investigating the high-energy frontier pose a number of new challenges, forcing us to find new solutions for the detection, identification, and measurement of final-state particles in subnuclear collisions. One such challenge is the precise measurement of muon momenta at very high energy, where the curvature provided by conceivable magnetic fields in realistic detectors proves insufficient to achieve the desired resolution. In this work we show the feasibility of an entirely new avenue for the measurement of the energy of muons based on their radiative losses in a dense, finely segmented calorimeter. This is made possible by the use of the spatial information of the clusters of deposited photon energy in the regression task. Using a homogeneous lead-tungstate calorimeter as a benchmark, we show how energy losses may provide significant complementary information for the estimate of muon energies above 1 TeV.
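
    A minimal sketch of the regression idea, under invented assumptions: synthetic "radiative loss" deposits, whose multiplicity and size grow with muon energy, are summarised into a few hand-crafted features and fed to a linear least-squares fit. Everything below (the event generator, the features, the model) is a placeholder; the paper instead exploits the full spatial information of the photon-energy clusters.

    # Toy muon-energy regression from summary features of calorimeter deposits.
    import numpy as np

    rng = np.random.default_rng(0)

    def synthetic_event(energy_tev, n_layers=50):
        """Invented model: deposit count and size grow with muon energy."""
        deposits = np.zeros(n_layers)
        for _ in range(rng.poisson(3 + 4 * energy_tev)):
            deposits[rng.integers(n_layers)] += rng.exponential(0.02 * energy_tev)
        return deposits

    def features(deposits):
        nz = deposits[deposits > 0]
        return [deposits.sum(), len(nz), nz.max(initial=0.0), deposits.std()]

    energies = rng.uniform(0.5, 4.0, size=2000)            # muon energies in TeV
    X = np.array([features(synthetic_event(e)) for e in energies])
    X = np.hstack([X, np.ones((len(X), 1))])               # bias column
    w, *_ = np.linalg.lstsq(X, energies, rcond=None)       # linear fit
    pred = X @ w
    print("relative resolution ~", np.std(pred - energies) / energies.mean())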

    How much spatial information is lost in the sensory substitution process? Comparing visual, tactile, and auditory approaches

    Sensory substitution devices (SSDs) can convey visuospatial information through spatialised auditory or tactile stimulation using wearable technology. However, the level of information loss associated with this transformation is unknown. In this study, novice users discriminated the location of two objects at 1.2 m using devices that transformed a 16×8 depth map into spatially distributed patterns of light, sound, or touch on the abdomen. Results showed that through active sensing, participants could discriminate the vertical position of objects to a visual angle of 1°, 14°, and 21°, and their distance to 2 cm, 8 cm, and 29 cm using these visual, auditory, and haptic SSDs respectively. Visual SSDs significantly outperformed auditory and tactile SSDs on vertical localisation, whereas for depth perception, all devices significantly differed from one another (visual > auditory > haptic). Our findings highlight the high level of acuity possible for SSDs even with low spatial resolutions (e.g. 16×8) and quantify the level of information loss attributable to this transformation for the SSD user. Finally, we discuss ways of closing this 'modality gap' found in SSDs and conclude that this process is best benchmarked against performance with SSDs that return to their primary modality (e.g. visuospatial into visual).
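
    As a concrete example of the transformation such devices perform, the sketch below reduces a 16×8 depth map to one tone per column, with pitch encoding the vertical position of the nearest object and loudness its distance. The frequency range and loudness law are hypothetical and are not taken from the devices evaluated in the study.

    # Hypothetical depth-map-to-audio mapping for an auditory SSD.
    import numpy as np

    def depth_map_to_tones(depth, f_lo=200.0, f_hi=2000.0, max_range=3.0):
        """depth: (8, 16) array of distances in metres; returns per-column
        (frequency_hz, amplitude) pairs for a left-to-right auditory sweep."""
        rows, cols = depth.shape
        tones = []
        for c in range(cols):
            r = int(np.argmin(depth[:, c]))          # nearest object in column
            height = 1.0 - r / (rows - 1)            # 1 = top row, 0 = bottom
            freq = f_lo * (f_hi / f_lo) ** height    # log-spaced pitch scale
            amp = max(0.0, 1.0 - depth[r, c] / max_range)   # nearer = louder
            tones.append((freq, amp))
        return tones

    depth = np.full((8, 16), 3.0)
    depth[2, 5] = 1.2                                # object at 1.2 m
    print(depth_map_to_tones(depth)[5])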

    Biopedagogies and Indigenous knowledge: examining sport for development and peace for urban Indigenous young women in Canada and Australia

    This paper uses transnational postcolonial feminist participatory action research (TPFPAR) to examine two sport for development and peace (SDP) initiatives that focus on Indigenous young women residing in urban areas, one in Vancouver, Canada, and one in Perth, Australia. We examine how SDP programs that target urban Indigenous young women and girls reproduce the hegemony of neoliberalism by deploying biopedagogies of neoliberalism to 'teach' Indigenous young women certain education and employment skills that are deemed necessary to participate in competitive capitalism. We found that activities in both programs were designed to equip the Indigenous girls and young women with individual attributes that would enhance their chances of future success in arenas valued by neoliberal capitalism: Eurocentric employment, post-secondary education and healthy active living. These forms of 'success' fall within neoliberal logic, where the focus is on the individual being able to provide for oneself. However, the girls and young women we interviewed argued that their participation in the SDP programs would help them challenge racist and sexist stereotypes about their communities. Thus, it is possible that these programs, despite their predominant use of neoliberal logic and biopedagogies, may help to prepare the participants to negotiate Eurocentric institutions more successfully, and through this assist the participants in contributing to social change. Nevertheless, based on our findings, we argue that SDP programs led by Indigenous peoples and fundamentally shaped by Indigenous voices, epistemologies, concerns and standpoints would provide better opportunities to shake SDP's current biopedagogical foundation. We conclude by suggesting that a more radical approach to SDP, one that fosters Indigenous self-determination and attempts to disrupt dominant relations of power, could have difficulty attracting the sort of corporate donors who currently play such important roles in the current SDP landscape.