Standardization Initiatives in the (eco)toxicogenomics Domain: A Review
The purpose of this document is to provide readers with a resource describing the different ongoing standardization efforts within the 'omics' (genomics, proteomics, metabolomics) and related communities, with a particular focus on toxicological and environmental applications. The review covers initiatives within the research community as well as in the regulatory arena. It addresses data management issues (formats and reporting structures for the exchange of information) and database interoperability, highlighting key objectives, target audiences, and participants. A considerable amount of work still needs to be done; ideally, collaboration should be optimized, and duplication and incompatibility should be avoided where possible. The consequence of failing to deliver data standards is an escalation in the burden and cost of data management tasks.
Meeting Report from the Genomic Standards Consortium (GSC) Workshop 10
This report summarizes the proceedings of the 10th workshop of the Genomic Standards Consortium (GSC), held at Argonne National Laboratory, IL, USA. It was the second GSC workshop to have open registration and attracted over 60 participants who worked together to progress the full range of projects ongoing within the GSC. Overall, the primary focus of the workshop was on advancing the M5 platform for next-generation collaborative computational infrastructures. Other key outcomes included the formation of a GSC working group focused on MIGS/MIMS/MIENS compliance using the ISA software suite and the formal launch of the GSC Developer Working Group. Further information about the GSC and its range of activities can be found at http://gensc.org/
Automatic annotation of bioinformatics workflows with biomedical ontologies
Legacy scientific workflows, and the services within them, often present
scarce and unstructured (i.e. textual) descriptions. This makes it difficult to
find, share and reuse them, thus dramatically reducing their value to the
community. This paper presents an approach to annotating workflows and their
subcomponents with ontology terms, in an attempt to describe these artifacts in
a structured way. Despite a dearth of even textual descriptions, we
automatically annotated 530 myExperiment bioinformatics-related workflows,
including more than 2600 workflow-associated services, with relevant
ontological terms. Quantitative evaluation of the Information Content of these
terms suggests that, in cases where annotation was possible at all, the
annotation quality was comparable to manually curated bioinformatics resources.Comment: 6th International Symposium on Leveraging Applications (ISoLA 2014
conference), 15 pages, 4 figure
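The Information Content evaluation mentioned above follows a standard idea: a term that annotates many artifacts carries little information, a rare term carries a lot. A minimal sketch, using a hypothetical annotation corpus (the workflow names and terms below are invented for illustration):

```python
import math
from collections import Counter

# Hypothetical corpus: each workflow mapped to the ontology terms annotating it.
annotations = {
    "wf1": ["alignment", "alignment", "phylogeny"],
    "wf2": ["alignment"],
    "wf3": ["visualisation"],
}

# Frequency of each term across the whole corpus.
term_counts = Counter(t for terms in annotations.values() for t in terms)
total = sum(term_counts.values())

def information_content(term):
    """IC(t) = -log2 p(t): rarer annotation terms carry more information."""
    return -math.log2(term_counts[term] / total)

# A rare term such as "visualisation" scores higher than a common one.
print(information_content("visualisation") > information_content("alignment"))
```

Comparing the mean IC of automatically assigned terms against that of a manually curated resource gives the kind of quantitative quality check the paper describes.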
Updates in metabolomics tools and resources: 2014-2015
Data processing and interpretation represent the most challenging and time-consuming steps in high-throughput metabolomic experiments, regardless of the analytical platform (MS- or NMR-spectroscopy-based) used for data acquisition. Improved machinery in metabolomics generates increasingly complex datasets that create the need for more and better processing and analysis software and in silico approaches to understand the resulting data. However, a comprehensive source of information describing the utility of the most recently developed and released metabolomics resources, in the form of tools, software, and databases, is currently lacking. Thus, here we provide an overview of freely available and open-source tools, algorithms, and frameworks to make both upcoming and established metabolomics researchers aware of recent developments, in an attempt to advance and facilitate data processing workflows in their metabolomics research. The major topics include tools and resources for data processing, data annotation, and data visualization in MS- and NMR-based metabolomics. Most of the tools described in this review are dedicated to untargeted metabolomics workflows; however, some more specialist tools are described as well. All tools and resources described, including their analytical and computational platform dependencies, are summarized in an overview table.
High-throughput bioinformatics with the Cyrille2 pipeline system
Background: Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results: We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1) a web-based graphical user interface (GUI) that enables a pipeline operator to manage the system; 2) the Scheduler, which forms the functional core of the system, tracks what data enters the system, and determines what jobs must be scheduled for execution; and 3) the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion: The Cyrille2 system is an extensible, modular system implementing the stated requirements. Cyrille2 enables easy creation and execution of high-throughput, flexible bioinformatics pipelines.
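The Scheduler/Executor split described above is a common pipeline pattern: one component decides which jobs have all their prerequisites satisfied, another runs them. A minimal sketch of that pattern (an illustration of the general design, not Cyrille2's actual code or API):

```python
class Scheduler:
    """Tracks job completion and decides which jobs are runnable."""

    def __init__(self, jobs, deps):
        self.jobs = jobs   # name -> callable
        self.deps = deps   # name -> list of prerequisite job names
        self.done = set()

    def runnable(self):
        """Jobs not yet run whose prerequisites have all completed."""
        return [j for j in self.jobs
                if j not in self.done
                and all(d in self.done for d in self.deps.get(j, []))]

class Executor:
    """Repeatedly asks the scheduler for runnable jobs and executes them."""

    def run(self, scheduler):
        while len(scheduler.done) < len(scheduler.jobs):
            batch = scheduler.runnable()
            if not batch:
                raise RuntimeError("cyclic or unsatisfiable dependencies")
            for job in batch:  # a real system would farm these out to a cluster
                scheduler.jobs[job]()
                scheduler.done.add(job)

order = []
jobs = {"preprocess": lambda: order.append("preprocess"),
        "align":      lambda: order.append("align"),
        "report":     lambda: order.append("report")}
deps = {"align": ["preprocess"], "report": ["align"]}
Executor().run(Scheduler(jobs, deps))
print(order)  # jobs run in dependency order
```

In a system like Cyrille2 the Executor would submit each batch to a compute cluster rather than run it in-process, but the scheduling logic is the same.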
An evaluation of the Galaxy and Ruffus scripting workflow systems for DNA-seq analysis
Magister Scientiae - MSc
Functional genomics determines the biological functions of genes on a global scale by using large volumes of data obtained through techniques including next-generation sequencing (NGS). The application of NGS in biomedical research is gaining momentum, and with its adoption becoming more widespread, there is an increasing need for access to customizable computational workflows that can simplify, and offer access to, compute-intensive analyses of genomic data. In this study, the Galaxy and Ruffus frameworks were designed and implemented with a view to addressing the challenges faced in biomedical research. Galaxy, a graphical web-based framework, allows researchers to build a graphical NGS data analysis pipeline for accessible, reproducible, and collaborative data sharing. Ruffus, a UNIX command-line framework used by bioinformaticians as a Python library to write scripts in an object-oriented style, allows for building a workflow in terms of task dependencies and execution logic. In this study, a dual data analysis technique was explored, focusing on a comparative evaluation of the Galaxy and Ruffus frameworks as used in composing analysis pipelines. To this end, we developed an analysis pipeline in both Galaxy and Ruffus for the analysis of Mycobacterium tuberculosis sequence data. Furthermore, this study aimed to compare the Galaxy framework to Ruffus, with preliminary analysis revealing that the analysis pipeline in Galaxy displayed a higher percentage of load and store instructions. In comparison, pipelines in Ruffus tended to be CPU bound and memory intensive. CPU usage, memory utilization, and runtime execution are graphically represented in this study. Our evaluation suggests that workflow frameworks have distinctly different features, from ease of use, flexibility, and portability, to architectural design.
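The kind of runtime and memory comparison the study describes can be sketched with Python's standard `time` and `tracemalloc` modules. This is an illustrative harness, not the study's actual measurement setup, and `gc_content` is just a stand-in for a pipeline task:

```python
import time
import tracemalloc

def profile(task, *args):
    """Measure wall-clock runtime and peak allocated memory for one task."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = task(*args)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return result, elapsed, peak

def gc_content(seq):
    """Fraction of G and C bases: a typical lightweight NGS metric."""
    return (seq.count("G") + seq.count("C")) / len(seq)

result, elapsed, peak_bytes = profile(gc_content, "ATGCGC" * 100_000)
print(round(result, 2))  # 0.67
```

Running the same tasks under both frameworks with a harness like this is one way to obtain the CPU, memory, and runtime figures that the thesis presents graphically.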
Complex networks theory for analyzing metabolic networks
One of the main tasks of post-genomic informatics is to systematically investigate all molecules and their interactions within a living cell, so as to understand how these molecules and the interactions between them relate to the function of the organism; networks are an appropriate abstract description of all kinds of interactions. In the past few years, great progress has been made in developing the theory of complex networks to reveal the organizing principles that govern the formation and evolution of various complex biological, technological and social networks. This paper reviews the accomplishments in constructing genome-based metabolic networks and describes how the theory of complex networks is applied to analyze metabolic networks.
(13 pages, 2 figures.)
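A basic quantity in complex networks theory, applied here to a toy metabolic graph, is the degree distribution P(k): the fraction of nodes with k connections. The edge list below is an invented illustration, not a real pathway map:

```python
from collections import Counter

# Toy metabolic network: metabolites linked when a reaction converts
# one into the other (illustrative only).
edges = [("glucose", "g6p"), ("g6p", "f6p"), ("g6p", "6pg"),
         ("f6p", "fbp"), ("fbp", "dhap"), ("fbp", "g3p"), ("dhap", "g3p")]

# Node degree: number of edges touching each metabolite.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Degree distribution P(k): fraction of nodes with exactly k connections.
n = len(degree)
p_k = {k: c / n for k, c in Counter(degree.values()).items()}
print(degree["g6p"], degree["fbp"])  # hub-like metabolites have higher degree
```

On genome-scale metabolic networks the same computation reveals the heavy-tailed degree distributions, dominated by hub metabolites, that the review discusses.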
The EMBRACE web service collection
The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate the issues involved in developing and deploying web services for use in the life sciences. The project concluded that, for web services to achieve widespread adoption, standards must be defined for the choice of web service technology and for semantically annotating both service function and the data exchanged, and a mechanism for discovering services must be provided. Building on this, the project developed: EDAM, an ontology for describing life science web services; BioXSD, a schema for exchanging data between services; and a centralized registry (http://www.embraceregistry.net) that collects together around 1000 services developed by the consortium partners. This article presents the current status of the collection and its associated recommendations and standards definitions.
Single sample pathway analysis in metabolomics: performance evaluation and application
Background: Single sample pathway analysis (ssPA) transforms molecular-level omics data to the pathway level, enabling the discovery of patient-specific pathway signatures. ssPA overcomes the limitations of conventional pathway analysis by enabling multi-group comparisons, alongside facilitating numerous downstream analyses such as pathway-based machine learning. While ssPA is a widely used technique in transcriptomics, there is little literature evaluating its suitability for metabolomics. Here we provide a benchmark of established ssPA methods (ssGSEA, GSVA, SVD (PLAGE), and z-score) alongside an evaluation of two novel methods we propose, ssClustPA and kPCA, using semi-synthetic metabolomics data. We then demonstrate how ssPA can facilitate pathway-based interpretation of metabolomics data by performing a case study on inflammatory bowel disease mass spectrometry data, using clustering to determine subtype-specific pathway signatures. Results: While GSEA-based and z-score methods outperformed the others in terms of recall, clustering/dimensionality-reduction-based methods provided higher precision at moderate-to-high effect sizes. A case study applying ssPA to inflammatory bowel disease data demonstrates how these methods yield a much richer depth of interpretation than conventional approaches, for example by clustering pathway scores to visualise a pathway-based, patient-subtype-specific correlation network. We also developed the sspa python package (freely available at https://pypi.org/project/sspa/), providing implementations of all the methods benchmarked in this study. Conclusion: This work underscores the value ssPA methods can add to metabolomic studies and provides a useful reference for those wishing to apply ssPA methods to metabolomics data.
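Of the benchmarked methods, z-score ssPA is the simplest to illustrate: each sample's pathway score is the sum of the z-scores of the pathway's member metabolites, scaled by the square root of the pathway size. The sketch below uses invented toy data and is an illustration of the general idea, not the sspa package's implementation:

```python
import math

# Toy abundance matrix: sample -> metabolite -> abundance (illustrative values).
samples = {
    "s1": {"m1": 2.0, "m2": 1.5, "m3": 0.1},
    "s2": {"m1": 0.5, "m2": 0.8, "m3": 0.2},
    "s3": {"m1": 1.0, "m2": 1.0, "m3": 0.9},
}
pathway = ["m1", "m2"]  # hypothetical pathway membership

def zscores(metabolite):
    """Standardize one metabolite's abundances across all samples."""
    vals = [s[metabolite] for s in samples.values()]
    mean = sum(vals) / len(vals)
    sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / (len(vals) - 1))
    return {name: (s[metabolite] - mean) / sd for name, s in samples.items()}

# z-score ssPA: per-sample score = sum of member z-scores / sqrt(pathway size).
z = {m: zscores(m) for m in pathway}
scores = {name: sum(z[m][name] for m in pathway) / math.sqrt(len(pathway))
          for name in samples}
print(max(scores, key=scores.get))  # sample with highest pathway activity
```

The resulting samples-by-pathways score matrix is what downstream steps such as clustering or pathway-based machine learning operate on.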