Alignment-free Genomic Analysis via a Big Data Spark Platform

Author: Cattaneo Giuseppe
Giancarlo Raffaele
Palini Francesco
Petrillo Umberto Ferraro
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

Motivation: Alignment-free distance and similarity functions (AF functions, for short) are a well-established alternative to pairwise and multiple sequence alignments for many genomic, metagenomic and epigenomic tasks. Because the applications are data-intensive, computing AF functions is a Big Data problem, and the recent literature indicates that developing fast and scalable algorithms for AF functions is a high-priority task. Somewhat surprisingly, despite the increasing popularity of Big Data technologies in Computational Biology, the development of a Big Data platform for those tasks has not been pursued, possibly due to its complexity. Results: We fill this important gap by introducing FADE, the first extensible, efficient and scalable Spark platform for alignment-free genomic analysis. It natively supports eighteen of the best-performing AF functions coming out of a recent hallmark benchmarking study. FADE's development and potential impact comprise novel aspects of interest. Namely, (a) a considerable effort in designing distributed algorithms, the most tangible result being a much faster execution time for reference methods like MASH and FSWM; (b) a software design that makes FADE user-friendly and easily extendable by Spark non-specialists; (c) its ability to support data- and compute-intensive tasks. In this regard, we provide a novel and much-needed analysis of how informative and robust AF functions are, in terms of the statistical significance of their output. Our findings naturally extend those of the highly regarded benchmarking study, since the functions that can really be used are reduced to a handful of the eighteen included in FADE.
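
As a rough illustration of what an AF function computes, the following minimal Python sketch measures the Jaccard distance between the k-mer sets of two sequences. It is not FADE's implementation and is not one of the eighteen functions as FADE defines them; the function names `kmer_set` and `jaccard_distance` and the default `k = 3` are assumptions made for this example only.

```python
def kmer_set(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b, k=3):
    """Alignment-free distance: 1 - |A & B| / |A | B| over k-mer sets.

    Hypothetical helper for illustration; no alignment of a and b is built.
    """
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        # Both sequences are shorter than k: treat them as identical.
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


if __name__ == "__main__":
    print(jaccard_distance("ACGT", "ACGA", k=2))
```

The distance depends only on which k-mers occur, so each sequence can be summarized independently before any comparison is made, which is what makes this family of functions amenable to distributed computation.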

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Robust Algorithms for Registration of 3D Images of Human Brain

Author: Cizek Jiri
Publication date: 01/01/2004

This thesis is concerned with the process of automatically aligning 3D medical images of the human brain. It concentrates on rigid-body matching of Positron Emission Tomography (PET) images and Magnetic Resonance (MR) images within one patient and on non-linear matching of PET images of different patients. In recent years, mutual information has proved to be an excellent criterion for automatic registration of intra-individual images from different modalities. We propose and evaluate a method that combines a multi-resolution optimization of mutual information with an efficient segmentation of background voxels and a modified principal axes algorithm. We show that an acceleration factor of 6-7 can be achieved without loss of accuracy and that the method significantly reduces the rate of unsuccessful registrations. Emphasis was also placed on creating an automatic registration system that could be used routinely in a clinical environment. Non-linear registration tries to reduce the inter-individual variability of shape and structure between two brain images by deforming one image so that homologous regions in both images are aligned. It is an important step in many procedures in medical image processing and analysis. We present a novel algorithm for automatic non-linear registration of PET images based on hierarchical volume subdivisions and local affine optimizations. It produces a C2-continuous deformation function and guarantees that the deformation is one-to-one. The performance of the algorithm was evaluated on more than 600 clinical PET images.
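
The mutual information criterion mentioned above can be sketched in a few lines of Python. This is a minimal illustration that estimates mutual information from the joint histogram of two flattened images with discrete intensities; it is not the thesis's method, it omits the multi-resolution optimization and background segmentation entirely, and the function name `mutual_information` is an assumption made for this example.

```python
from collections import Counter
from math import log2


def mutual_information(img_a, img_b):
    """Mutual information (in bits) of two equally sized sequences of
    discrete intensity values, estimated from their joint histogram."""
    if len(img_a) != len(img_b) or len(img_a) == 0:
        raise ValueError("images must be non-empty and of equal size")
    n = len(img_a)
    joint = Counter(zip(img_a, img_b))   # counts of intensity pairs
    count_a = Counter(img_a)             # marginal counts, image A
    count_b = Counter(img_b)             # marginal counts, image B
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


if __name__ == "__main__":
    image = [0, 0, 1, 1]
    inverted = [1, 1, 0, 0]
    print(mutual_information(image, inverted))
```

Because the measure depends only on how consistently intensities co-occur, an image and its intensity-inverted copy score as highly as two identical images, which is why the criterion suits images from different modalities such as PET and MR.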

CiteSeerX

Kölner UniversitätsPublikationsServer

Appearance-Based Gaze Estimation in the Wild

Author: Bulling Andreas
Fritz Mario
Sugano Yusuke
Zhang Xucong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have not been evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset, which contains 213,659 images that we collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing ones with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks that significantly outperforms state-of-the-art methods in the most challenging cross-dataset evaluation. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild.

arXiv.org e-Print Archive

CiteSeerX

Crossref

CISPA – Helmholtz-Zentrum für Informationssicherheit

MPG.PuRe

PhylOTU: a high-throughput procedure quantifies microbial community diversity and resolves novel taxa from metagenomic data.

Author: Eisen Jonathan A
Green Jessica L
Kembel Steven W
Ladau Joshua
O'Dwyer James P
Pollard Katherine S
Riesenfeld Samantha J
Sharpton Thomas J
Publication venue: eScholarship, University of California
Publication date: 01/01/2011

Microbial diversity is typically characterized by clustering small-subunit ribosomal RNA (SSU-rRNA) sequences into operational taxonomic units (OTUs). Targeted sequencing of environmental SSU-rRNA markers via PCR may fail to detect OTUs due to biases in priming and amplification. Analysis of shotgun-sequenced environmental DNA, known as metagenomics, avoids amplification bias but generates fragmentary, non-overlapping sequence reads that cannot be clustered by existing OTU-finding methods. To circumvent these limitations, we developed PhylOTU, a computational workflow that identifies OTUs from metagenomic SSU-rRNA sequence data through the use of phylogenetic principles and probabilistic sequence profiles. Using simulated metagenomic data, we quantified the accuracy with which PhylOTU clusters reads into OTUs. Comparisons of PCR and shotgun-sequenced SSU-rRNA markers derived from the global open ocean revealed that while PCR libraries identify more OTUs per sequenced residue, metagenomic libraries recover a greater taxonomic diversity of OTUs. In addition, we discover novel species, genera and families in the metagenomic libraries, including OTUs from phyla missed by analysis of PCR sequences. Taken together, these results suggest that PhylOTU enables characterization of part of the biosphere currently hidden from PCR-based surveys of diversity.

Directory of Open Access Journals

PubMed Central

eScholarship - University of California