High-Performance Modelling and Simulation for Big Data Applications

Publication venue: 'Springer Science and Business Media LLC'

This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

OAPEN Library

Informational and Linguistic Analysis of Large Genomic Sequence Collections via Efficient Hadoop Cluster Algorithms

Author: Acquisti, Alfonso Valencia, Audano, Aurell, Ben-Ari, Benoit, Benson, Bhatia, Birol, Cattaneo, Chor, Compeau, Dean, Denning, Ferraro Petrillo, Giancarlo, Gianluca Roscigno, Giuseppe Cattaneo, Hampikian, Horwege, ITIS Partnership, Kokot, Leimeister, Lo Bosco, Marçais, Nordberg, Nordstrom, Nystedt, Pinello, Raffaele Giancarlo, Rahman, Rizk, Shvachko, Siretskiy, Umberto Ferraro Petrillo, Utro, Vergni, White, Zaharia, Zhou, Zimin
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2018

Information-theoretic and compositional/linguistic analyses of genomes have a central role in bioinformatics, even more so since the associated methodologies are also becoming very valuable for epigenomic and meta-genomic studies. The kernel of those methods is the collection of k-mer statistics, i.e., how many times each k-mer in {A, C, G, T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data now available in applications demands resorting to parallel and distributed computing. Indeed, algorithms of this type have been developed to collect k-mer statistics in the realm of genome assembly. However, they are so specialized to this domain that they do not extend easily to the computation of informational and linguistic indices concurrently on sets of genomes.
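To make the notion of k-mer statistics concrete, the following is a minimal single-machine sketch in Python. It is not the Hadoop algorithm the paper describes; the function names `count_kmers` and `merge_counts` are hypothetical, and skipping windows that contain characters outside A, C, G, T is an assumption made here for illustration. The merge step only hints at how per-chunk counts could be combined in a map-reduce style.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count how many times each k-mer over {A, C, G, T} occurs in `sequence`.

    Hypothetical helper for illustration only.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter()
    # Slide a window of length k over the sequence, one position at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are ignored.
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


def merge_counts(partial_counts):
    """Sum per-chunk k-mer counts, as a reduce step might (hypothetical)."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)  # Counter.update adds counts key by key
    return total


if __name__ == "__main__":
    print(count_kmers("ACGTACGT", 2))
```

In a distributed setting each worker would compute such counts on its own portion of the input and the partial results would then be summed; chunk boundaries would need to overlap by k - 1 characters so that no k-mer is lost, a detail this sketch leaves out.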

Archivio della Ricerca - Università di Salerno

Archivio della ricerca - Università di Roma La Sapienza

Open Access Repository

Archivio istituzionale della ricerca - Università di Palermo