27 research outputs found

Automated and traceable processing for large-scale high-throughput sequencing facilities

Author: Cuccuru Gianmauro
Fotia Giorgio
Lianas Luca
Pireddu Luca
Vocale Matteo
Zanetti Gianluigi
Publication venue: 'EMBnet Stichting'
Publication date: 01/01/2013

Scaling up production in medium and large high-throughput sequencing facilities presents a number of challenges. As the rate of samples to process increases, manually performing and tracking the center’s operations becomes increasingly difficult, costly and error prone, while processing the massive amounts of data poses significant computational challenges. We present our ongoing work to automate and track all data-related procedures at the CRS4 Sequencing and Genotyping Platform, while integrating state-of-the-art processing technologies such as Hadoop, OMERO, iRODS, and Galaxy into our automated workflows. Currently, the core system is in its testing phase and is on schedule to be in production use at CRS4 by May 2013. The results obtained thus far are encouraging, and the authors are confident that the CRS4 Platform will increase its efficiency and capacity thanks to this system. In the near future, the integration components will be released as open source software.

P-arch


Training Infrastructure as a Service

Author: Bacon Wendi
Bretaudeau Anthony
Coraor Nate
Cuccuru Gianmauro
Davis John
Gladman Simon
Grüning Björn
Hillman-Jackson Jennifer
Hiltemann Saskia
Hyde Cameron
Rasche Helena
Serrano-Solano Beatriz
Stubbs Andrew
Zhou Miaomiao
Publication date: 28/12/2022

Background: Hands-on training, whether in bioinformatics or other domains, often requires significant technical resources and knowledge to set up and run. Instructors must have access to powerful compute infrastructure that can support resource-intensive jobs running efficiently. Often this is achieved using a private server where there is no contention for the queue. However, this places a significant prerequisite knowledge or labor barrier on instructors, who must spend time coordinating deployment and management of compute resources. Furthermore, with the increase of virtual and hybrid teaching, where learners are located in separate physical locations, it is difficult to track student progress as efficiently as during in-person courses. Findings: Together with the Galaxy community, we have created Training Infrastructure-as-a-Service (TIaaS), originally developed by Galaxy Europe and the Gallantries project and aimed at providing user-friendly training infrastructure to the global training community. TIaaS provides dedicated training resources for Galaxy-based courses and events. Event organizers register their course, after which trainees are transparently placed in a private queue on the compute infrastructure, which ensures jobs complete quickly, even when the main queue is experiencing high wait times. A built-in dashboard allows instructors to monitor student progress. Conclusions: TIaaS provides a significant improvement for instructors and learners, as well as infrastructure administrators. The instructor dashboard makes remote events not only possible but also easy. Students experience continuity of learning, as all training happens on Galaxy, which they can continue to use after the event. In the past 60 months, 504 training events with over 24,000 learners have used this infrastructure for Galaxy training.

Open Research Online (The Open University)

EUR Research Repository

HAL-Rennes 1

Exome sequencing in Crisponi/CISS-like individuals reveals unpredicted alternative diagnoses

Author: Angius Andrea
Annerén Göran
Aubertin Gudrun
Buers Insa
Crisponi Giangiorgio
Crisponi Laura
Cucca Francesco
Cuccuru Gianmauro
Fry Andrew E.
Hulait Gurdip
Muntoni Francesco
Onano Stefano
Oppo Manuela
Palomares-Bralo María
Persico Ivana
Rutsch Frank
Santos-Simarro Fernando
Stattin Eva-Lena
Uva Paolo
Van Allen Margot I.
Publication venue: 'Wiley'
Publication date: 01/05/2019

Crisponi/cold‐induced sweating syndrome (CS/CISS) is a rare autosomal recessive disorder characterized by a complex phenotype (hyperthermia and feeding difficulties in the neonatal period, followed by scoliosis and paradoxical sweating induced by cold since early childhood) and a high neonatal lethality. CS/CISS is a genetically heterogeneous disorder caused by mutations in CRLF1 (CS/CISS1), CLCF1 (CS/CISS2) and KLHL7 (CS/CISS‐like). Here, a whole exome sequencing approach in individuals with CS/CISS‐like phenotype with unknown molecular defect revealed unpredicted alternative diagnoses. This approach identified putative pathogenic variations in NALCN, MAGEL2 and SCN2A. They were already found implicated in the pathogenesis of other syndromes, respectively the congenital contractures of the limbs and face, hypotonia, and developmental delay syndrome, the Schaaf‐Yang syndrome, and the early infantile epileptic encephalopathy‐11 syndrome. These results suggest a high neonatal phenotypic overlap among these disorders and will be very helpful for clinicians. Genetic analysis of these genes should be considered for those cases with a suspected CS/CISS during neonatal period who were tested as mutation negative in the known CS/CISS genes, because an expedited and corrected diagnosis can improve patient management and can provide a specific clinical follow‐up

Crossref

Online Research @ Cardiff

UCL Discovery

Genome-wide association study of susceptibility loci for breast cancer in Sardinian population

Background: Despite progress in identifying genes associated with breast cancer, many more risk loci exist. Genome-wide association analyses in genetically-homogeneous populations, such as that of Sardinia (Italy), could represent an additional approach to detect low penetrance alleles. Methods: We performed a genome-wide association study comparing 1431 Sardinian patients with non-familial, BRCA1/2-mutation-negative breast cancer to 2171 healthy Sardinian blood donors. DNA was genotyped using GeneChip Human Mapping 500 K Arrays or Genome-Wide Human SNP Arrays 6.0. To increase genomic coverage, genotypes of additional SNPs were imputed using data from HapMap Phase II. After quality control filtering of genotype data, 1367 cases (9 men) and 1658 controls (1156 men) were analyzed on a total of 2,067,645 SNPs. Results: Overall, 33 genomic regions (67 candidate SNPs) were associated with breast cancer risk at the p < 10⁻⁶ level. Twenty of these regions contained defined genes, including one already associated with breast cancer risk: TOX3. Lowering the threshold for preliminary significance to p < 10⁻⁵, we identified 11 additional SNPs in FGFR2, a well-established breast cancer-associated gene. Ten candidate SNPs were selected, excluding those already associated with breast cancer, for technical validation as well as replication in 1668 samples from the same population. Only SNP rs345299, located in intron 1 of VAV3, remained suggestively associated (p-value, 1.16 × 10⁻⁵), but it did not associate with breast cancer risk in pooled data from two large, mixed-population cohorts. Conclusions: This study indicated the role of TOX3 and FGFR2 as breast cancer susceptibility genes in BRCA1/2-wild-type breast cancer patients from the Sardinian population.

Biblioteca Digital de la Comunidad de Madrid

Apollo (Cambridge)

Genome-wide association study of susceptibility loci for breast cancer in Sardinian population.

BACKGROUND: Despite progress in identifying genes associated with breast cancer, many more risk loci exist. Genome-wide association analyses in genetically-homogeneous populations, such as that of Sardinia (Italy), could represent an additional approach to detect low penetrance alleles. METHODS: We performed a genome-wide association study comparing 1431 Sardinian patients with non-familial, BRCA1/2-mutation-negative breast cancer to 2171 healthy Sardinian blood donors. DNA was genotyped using GeneChip Human Mapping 500 K Arrays or Genome-Wide Human SNP Arrays 6.0. To increase genomic coverage, genotypes of additional SNPs were imputed using data from HapMap Phase II. After quality control filtering of genotype data, 1367 cases (9 men) and 1658 controls (1156 men) were analyzed on a total of 2,067,645 SNPs. RESULTS: Overall, 33 genomic regions (67 candidate SNPs) were associated with breast cancer risk at the p < 10(-6) level. Twenty of these regions contained defined genes, including one already associated with breast cancer risk: TOX3. With a lower threshold for preliminary significance to p < 10(-5), we identified 11 additional SNPs in FGFR2, a well-established breast cancer-associated gene. Ten candidate SNPs were selected, excluding those already associated with breast cancer, for technical validation as well as replication in 1668 samples from the same population. Only SNP rs345299, located in intron 1 of VAV3, remained suggestively associated (p-value, 1.16 x 10(-5)), but it did not associate with breast cancer risk in pooled data from two large, mixed-population cohorts. CONCLUSIONS: This study indicated the role of TOX3 and FGFR2 as breast cancer susceptibility genes in BRCA1/2-wild-type breast cancer patients from Sardinian population

Crossref

Harvard University - DASH

PubMed Central

Apollo (Cambridge)

Deep Blue Documents at the University of Michigan

Tools and data services registry: a community effort to document bioinformatics resources

Author: Anthon Christian
Beard Niall
Berka Karel
Bolser Dan
Booth Tim
Bretaudeau Anthony
Brezovsky Jan
Brunak Søren
Casadio Rita
Cesareni Gianni
Chmura Piotr
Coppens Frederik
Cornell Michael
Cuccuru Gianmauro
Davidsen Kristian
de la Torre Victor
Dogan Tunca
Doppelt-Azeroual Olivia
Emery Laura
Friborg Rune Møllegaard
Gasteiger Elisabeth
Gatter Thomas
Goldberg Tatyana
Grosjean Marie
Grüning Björn
Helmer-Citterich Manuela
Ienasescu Hans
Ioannidis Vassilios
Ison Jon
Jespersen Martin Closter
Jimenez Rafael
Juty Nick
Juvan Peter
Kalaš Matúš
Koch Maximilian
Laibe Camille
Li Jing-Woei
Licata Luana
Løngreen Peter
Mareuil Fabien
Mičetić Ivan
Moretti Sebastien
Morris Chris
Ménager Hervé
Möller Steffen
Nenadic Aleksandra
Parkinson Helen
Peterson Hedi
Profiti Giuseppe
Rapacki Kristoffer
Rice Peter
Romano Paolo
Roncaglia Paola
Rost Burkhard
Rydza Emil
Saidi Rabie
Schafferhans Andrea
Schwämmle Veit
Smith Callum
Sperotto Maria Maddalena
Stockinger Heinz
Tosatto Silvio C.E.
Uva Paolo
Vařeková Radka Svobodová
Vedova Gianluca Della
Via Allegra
Vriend Gert
Yachdav Guy
Zambelli Federico
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2015

Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools

HAL Descartes

Online Research Database In Technology

Hal-Diderot

Archivio istituzionale della ricerca - Università di Padova

NERC Open Research Archive

Crossref

Ghent University Academic Bibliography

Copenhagen University Research Information System

PubMed Central

Archivsystem Ask23

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

University of Southern Denmark Research Output

Archivio della ricerca- Università di Roma La Sapienza

HAL-Rennes 1

Variants within the immunoregulatory CBLB gene are associated with multiple sclerosis

A genome-wide association scan of ~6.6 million genotyped or imputed variants in 882 Sardinian Multiple Sclerosis (MS) cases and 872 controls suggested association of CBLB gene variants with disease, which was confirmed in 1,775 cases and 2,005 controls (overall P = 1.60 × 10⁻¹⁰). CBLB encodes a negative regulator of adaptive immune responses and mice lacking the orthologue are prone to experimental autoimmune encephalomyelitis, the animal model of MS.

PubMed Central

Carolina Digital Repository

Simulating Cardiac Electrophysiology Using Unstructured All-Hexahedra Spectral Elements

Author: Fabio Maggio
Gianmauro Cuccuru
Giorgio Fotia
James Southern
Publication venue: Hindawi Limited
Publication date: 01/01/2015

We discuss the application of the spectral element method to the monodomain and bidomain equations describing propagation of cardiac action potential. Models of cardiac electrophysiology consist of a system of partial differential equations coupled with a system of ordinary differential equations representing cell membrane dynamics. The solution of these equations requires solving multiple length scales due to the ratio of advection to diffusion that varies among the different equations. High order approximation of spectral elements provides greater flexibility in resolving multiple length scales. Furthermore, spectral elements are extremely efficient to model propagation phenomena on complex shapes using fewer degrees of freedom than its finite element equivalent (for the same level of accuracy). We illustrate a fully unstructured all-hexahedra approach implementation of the method and we apply it to the solution of full 3D monodomain and bidomain test cases. We discuss some key elements of the proposed approach on some selected benchmarks and on an anatomically based whole heart human computational model

Directory of Open Access Journals

crs4/Galaxy4Developers: July 2017

Author: Gianmauro Cuccuru
Marco Tangaro
Paolo Uva
Rossano Atzeni

Training material for the ELIXIR-IIB course on "Galaxy for Bioinformatics tool developers" https://crs4.github.io/Galaxy4Developers

ZENODO

The PARIGA server for real time filtering and analysis of reciprocal BLAST results

Author: Carcangiu Simone
Cuccuru Gianmauro
Orsini Massimiliano
Tramontano Anna
Uva Paolo
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

BLAST-based similarity searches are commonly used in several applications involving both nucleotide and protein sequences. These applications span from simple tasks such as mapping sequences over a database to more complex procedures such as clustering or annotation processes. When the amount of analysed data increases, manual inspection of BLAST results becomes a tedious procedure. Tools for parsing or filtering BLAST results for different purposes are then required. We describe here PARIGA (http://resources.bioinformatica.crs4.it/pariga/), a server that enables users to perform all-against-all BLAST searches on two sets of sequences selected by the user. Moreover, since it stores the two BLAST outputs in a python-serialized-objects database, results can be filtered according to several parameters in real time, without re-running the process and avoiding additional programming efforts. Results can be interrogated by the user using logical operations, for example to retrieve cases where two queries match the same targets, or when sequences from the two datasets are reciprocal best hits, or when a query matches a target in multiple regions. The PARIGA web server is designed to be a helpful tool for managing the results of sequence similarity searches. The design and implementation of the server renders all operations very fast and easy to use.
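
The reciprocal-best-hit filter mentioned in this abstract can be illustrated with a minimal Python sketch. This is not PARIGA's implementation: the function names `best_hits` and `reciprocal_best_hits`, the `(query_id, target_id, bit_score)` tuple layout, and the sample identifiers are all assumptions made for illustration, and ranking hits by bit score is one common choice among several.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed hit layout (hypothetical, not PARIGA's format):
# (query_id, target_id, bit_score)
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, Tuple[str, float]]:
    """Map each query to its highest-scoring target; on ties the first hit seen wins."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, target, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return best


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for a, (b, _score) in best_ab.items():
        if b in best_ba and best_ba[b][0] == a:
            pairs.append((a, b))
    return sorted(pairs)


# Made-up example data for two sequence sets A and B.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
          ("a2", "b2", 90.0), ("a3", "b1", 120.0)]
b_vs_a = [("b1", "a1", 198.0), ("b1", "a3", 110.0),
          ("b2", "a2", 95.0), ("b2", "a1", 80.0)]

rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh)
```

In this made-up data, `a3` has `b1` as its best hit, but `b1` prefers `a1`, so the pair is excluded. Because both searches are stored once and only the filter is re-applied, a query like this can be changed without re-running BLAST, which is the property the abstract emphasises.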

CiteSeerX

Directory of Open Access Journals

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

FigShare