
    E-Biothon: an experimental platform for BioInformatics

    The E-Biothon platform is an experimental Cloud platform designed to help speed up and advance research in biology, health, and the environment. It is based on a Blue Gene/P system and a web portal that allows members of the bioinformatics community to easily launch their scientific applications. In this paper, we describe the technical capacities of the platform, the applications it supports, and a set of user experiences on the platform.

    Synchronized navigation and comparative analyses across Ensembl complete bacterial genomes with INSYGHT.

    Motivation: High-throughput sequencing technologies provide access to an increasing number of bacterial genomes. Today, many analyses involve comparing biological properties among many strains of a given species, or among species of a particular genus. Tools that can help the microbiologist with these tasks are becoming increasingly important. Results: Insyght is a comparative visualization tool whose core features combine synchronized navigation across the genomic data of multiple organisms with versatile interoperability between complementary views. In this work, we have greatly increased the scope of the Insyght public dataset by including the 2,688 complete bacterial genomes available in Ensembl, thus vastly improving its phylogenetic coverage. We also report the development of a virtual machine that allows users to easily set up and customize their own local Insyght server.

    A geometric view of Biodiversity: scaling to metagenomics

    We have designed a new, efficient dimensionality reduction algorithm in order to investigate new ways of accurately characterizing biodiversity, namely from a geometric point of view, scaling to the large environmental datasets produced by NGS (currently ~10^5 reads). The approach is based on Multidimensional Scaling (MDS), which maps a set of n items into a low-dimensional Euclidean space given their pairwise distances. We compute all pairwise distances between the reads of a given sample, run MDS on the distance matrix, and analyze the projections on the first axes with visualization tools. We have addressed the quadratic complexity of computing the pairwise distances by implementing the computation on a hyperparallel machine at a national computing center (Turing, an IBM Blue Gene/Q), and the cubic complexity of the spectral decomposition within MDS by implementing an algorithm based on dense random projections. We have applied this data analysis scheme to a set of ~10^5 reads, which are amplicons of a diatom environmental sample from Lake Geneva. Analyzing the shape of the resulting point cloud paves the way for a geometric analysis of biodiversity, and for rigorously building OTUs (Operational Taxonomic Units) when the dataset is too large for unsupervised, hierarchical, high-dimensional clustering.
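
    The pipeline above has two bottlenecks: the quadratic cost of the pairwise distances and the cubic cost of the spectral step of MDS. The sketch below illustrates classical (Torgerson) MDS on a toy distance matrix; it is a minimal illustration of the technique under stated assumptions, not the authors' code, and it uses the dense O(n^3) eigendecomposition that the abstract says is replaced by a random-projection-based algorithm at scale. The function name classical_mds is illustrative.

```python
# Minimal sketch of classical (Torgerson) MDS, as used in the pipeline above.
# The dense eigendecomposition below is the O(n^3) step that the paper
# replaces with a random-projection-based algorithm for large n.
import numpy as np

def classical_mds(D, k=3):
    """Embed n items into k dimensions from an n x n pairwise-distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # dense spectral decomposition
    idx = np.argsort(eigvals)[::-1][:k]      # keep the top-k eigenvalues
    scale = np.sqrt(np.maximum(eigvals[idx], 0.0))
    return eigvecs[:, idx] * scale           # n x k coordinates

# Toy usage: 4 points on a line, D[i, j] = |i - j|; MDS recovers
# centered 1-D coordinates (up to sign): [-1.5, -0.5, 0.5, 1.5].
D = np.abs(np.arange(4)[:, None] - np.arange(4)[None, :]).astype(float)
print(np.round(classical_mds(D, k=1), 2))
```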

    diagno-syst: a tool for accurate inventories in metabarcoding

    Metabarcoding on amplicons is rapidly expanding as a method to produce molecular-based inventories of microbial communities. Here, we work on freshwater diatoms, microalgae that can be inventoried both on a morphological and on a molecular basis. We have developed an algorithm, implemented in a program called diagno-syst and based on the notion of informative read, which carries out supervised clustering of reads by mapping them exactly, one by one, onto all reads of a well-curated and taxonomically annotated reference database. The program has been run on an HPC (and HTC) infrastructure to handle the computational load. We compare optical and molecular-based inventories on 10 samples from Lake Léman and 30 from Swedish rivers. We track all possible mismatches between the two approaches, and compare the results with standard heuristic pipelines such as Mothur. We find that the comparison with optics is more accurate when using exact calculations, at the price of a heavier computational load. This is crucial when studying the long tail of biodiversity, which may be overestimated by pipelines or algorithms that use heuristics instead (more false positives). This work supports the view that these methods will benefit from progress in, first, building an agreement between molecular-based and morphology-based systematics and, second, having reference databases that are as complete as possible and publicly available.
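
    To make the idea of supervised clustering by exact comparison concrete, here is a minimal sketch: each read is compared, with an exact edit-distance computation rather than a heuristic, against every read of an annotated reference database, and it is assigned a taxon only when all of its close reference matches agree. The function names, the toy reference, and the distance threshold are illustrative assumptions, not diagno-syst's actual interface.

```python
# Minimal sketch of supervised, exact-comparison taxon assignment in the
# spirit of the "informative read" notion described above. Names and the
# max_dist threshold are hypothetical, not diagno-syst's real interface.

def edit_distance(a, b):
    """Plain dynamic-programming Levenshtein distance (exact, no heuristics)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def assign(read, reference, max_dist=2):
    """Assign a taxon only if every reference read within max_dist of
    `read` carries the same annotation (the read is "informative");
    otherwise report the read as ambiguous or unassigned."""
    taxa = {taxon for seq, taxon in reference
            if edit_distance(read, seq) <= max_dist}
    if not taxa:
        return "unassigned"
    return taxa.pop() if len(taxa) == 1 else "ambiguous"

# Toy reference: (sequence, taxon) pairs from a curated, annotated database.
reference = [("ACGTACGT", "Navicula"),
             ("ACGTACGA", "Navicula"),
             ("TTTTACGT", "Nitzschia")]
print(assign("ACGTACGT", reference))   # -> Navicula (all close hits agree)
```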