Archivio istituzionale della ricerca - Università di Cagliari

BioCloud Search EnGene: Surfing Biological Data on the Cloud

Author: DESSI NICOLETTA
MILIA GABRIELE
Pascariello E
PES BARBARA
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The massive production and spread of biomedical data around the web introduces new challenges related to identifying computational approaches for providing quality search and browsing of web resources. This paper presents BioCloud Search EnGene (BSE), a cloud application that facilitates searching and integration of the many layers of biological information offered by public large-scale genomic repositories. Grounded in the concept of dataspace, BSE is built on top of a cloud platform that severely curtails issues associated with scalability and performance. Like popular online gene portals, BSE adopts a gene-centric approach: researchers can find their information of interest by means of a simple “Google-like” query interface that accepts standard gene identifiers as keywords. We present the BSE architecture and functionality and discuss how our strategies contribute to successfully tackling big data problems in querying gene-based web resources. BSE is publicly available at: http://biocloud-unica.appspot.com/

VisANT 3.0: new modules for pathway visualization, editing, prediction and construction

Author: Ball
Barrett
Bolan Linghu
Charles DeLisi
Chung
Chunnuan Chen
Dahlquist
David M. Ng
Demir
Demir
DeRisi
Fukuda
Gagneur
Gavin
Ge
Hasegawa
Herman
Hu
Hu
Hu
Jansen
Joe Mellor
Joshi-Tope
Joshua M. Stuart
Junker
Kanehisa
Keseler
Kitano
Klukas
Lashkari
Mellor
Minoru Kanehisa
Mlecnik
Ng
Ng
Owen
Ravasz
Saraiya
Segal
Shannon
Shuichi Kawashima
Spirin
Stuart
Sugiyama
Takuji Yamada
Zhenjun Hu
Publication venue: Oxford University Press
Publication date: 01/01/2007

With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG compatible visual notation and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved as computer-readable XML format (VisML), graphic images or high-resolution Scalable Vector Graphics (SVG). Pathways in the format of VisML can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu

Boston University Institutional Repository (OpenBU)

e-Science and biological pathway semantics

Author: A Gangemi
A Rector
A Rector
A Ruttenberg
AL Rector
AL Rector
B Yang
BioPAX workgroup
C Goble
C Goble
C Goble
C Lutz
DC Fallside
DL McGuinness
E Sirin
I Horrocks
JA Papin
JD Eckart
Joanne S Luciano
JS Grethe
L Stein
M Arita
M Ashburner
M Cary
M Horridge
M Kanehisa
ML Green
N Kotecha
O Lassila
P Romero
PD Karp
R Stevens
Robert D Stevens
S Bechhofer
S Bechhofer
S Klamt
T Hey
T Hey
TM McPhillips
V Curcin
Publication venue: BioMed Central
Publication date: 01/05/2007

Background: The development of e-Science presents a major set of opportunities and challenges for the future progress of biological and life science research. Major new tools are required and corresponding demands are placed on the high-throughput data generated and used in these processes. Nowhere is the demand greater than in the semantic integration of these data. Semantic Web tools and technologies afford the chance to achieve this semantic integration. Since pathway knowledge is central to much of the scientific research today, it is a good test-bed for semantic integration. Within the context of biological pathways, the BioPAX initiative, part of a broader movement towards the standardization and integration of life science databases, forms a necessary prerequisite for the successful application of e-Science in health care and life science research. This paper examines whether BioPAX, an effort to overcome the barrier of disparate and heterogeneous pathway data sources, addresses the needs of e-Science. Results: We demonstrate how BioPAX pathway data can be used to ask and answer some useful biological questions. We find that BioPAX comes close to meeting a broad range of e-Science needs, but certain semantic weaknesses mean that these goals are missed. We make a series of recommendations for re-modeling some aspects of BioPAX to better meet these needs. Conclusion: Once these semantic weaknesses are addressed, it will be possible to integrate pathway information in a manner that would be useful in e-Science.

The University of Manchester - Institutional Repository

A Semantic Problem Solving Environment for Integrative Parasite Research: Identification of Intervention Targets for Trypanosoma cruzi

Author: A Bernstein
A Brazma
A Ruttenberg
A Ruttenberg
AA Ackermann
Aaron R. Jex
Amir H. Asiaee
Amit P. Sheth
B Chukualim
BM Good
C Aurrecoechea
C Bizer
C Blaschke
C Goble
C Goble
C Hertz-Fowler
D Xu
E Antezana
E Sirin
H Dietze
H Lam
H Tang
J Bhagat
J Luciano
J Malone
JA Atwood
JC Jeremy
K Christoph
K Eilbeck
K-H Cheung
M Ashburner
M Aslett
M Johnson
M Kanehisa
NM El-Sayed
P Hitzler
P Mendes
PR Smart
Prashant Doshi
Priti P. Parikh
R Brinkman
Rick Tarleton
Sarasi Lalithsena
Satya S. Sahoo
SS Sahoo
SS Sahoo
SS Sahoo
SS Sahoo
T Minning
TA Minning
Todd A. Minning
V Cross
V Petri
Vinh Nguyen
W Hersh
Publication venue: Public Library of Science
Publication date: 01/01/2012

Effective research in parasite biology requires analyzing experimental lab data in the context of constantly expanding public data resources. Integrating lab data with public resources is particularly difficult for biologists who may not possess significant computational skills to acquire and process heterogeneous data stored at different locations. Therefore, we develop a semantic problem solving environment (SPSE) that allows parasitologists to query their lab data integrated with public resources using ontologies. An ontology specifies a common vocabulary and formal relationships among the terms that describe an organism, and experimental data and processes in this case. SPSE supports capturing and querying provenance information, which is metadata on the experimental processes and data recorded for reproducibility, and includes a visual query-processing tool to formulate complex queries without learning the query language syntax. We demonstrate the significance of SPSE in identifying gene knockout targets for T. cruzi. The overall goal of SPSE is to help researchers discover new or existing knowledge that is implicitly present in the data but not always easily detected. Results demonstrate improved usefulness of SPSE over existing lab systems and approaches, and support for complex query design that is otherwise difficult to achieve without the knowledge of query language syntax

Public Library of Science (PLOS)

Scholar Commons - Institutional Repository of the University of South Carolina

CORE

KA-SB: from data integration to large scale reasoning

Author: Aldana-Montes José F
Chniber Othmane
Kerzazi Amine
Molina-Castro Joaquín
Navas-Delgado Ismael
Roldán-García María del Mar
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The analysis of information in the biological domain is usually focused on the analysis of data from single on-line data sources. Unfortunately, studying a biological process requires having access to dispersed, heterogeneous, autonomous data sources. In this context, an analysis of the information is not possible without the integration of such data. Methods: KA-SB is a querying and analysis system for end users based on combining a data integration solution with a reasoner. The tool has been created with a process divided into two steps: 1) KOMF, the Khaos Ontology-based Mediator Framework, is used to retrieve information from heterogeneous and distributed databases; 2) the integrated information is crystallized in a (persistent and high performance) reasoner (DBOWL). This information can be further analyzed later (by means of querying and reasoning). Results: In this paper we present a novel system that combines the use of a mediation system with the reasoning capabilities of a large scale reasoner to provide a way of finding new knowledge and of analyzing the integrated information from different databases, which is retrieved as a set of ontology instances. This tool uses a graphical query interface to build user queries easily, which shows a graphical representation of the ontology and allows users to build queries by clicking on the ontology concepts. Conclusion: These kinds of systems (based on KOMF) will provide users with very large amounts of information (interpreted as ontology instances once retrieved), which cannot be managed using traditional main memory-based reasoners. We propose a process for creating persistent and scalable knowledgebases from sets of OWL instances obtained by integrating heterogeneous data sources with KOMF. This process has been applied to develop a demo tool, http://khaos.uma.es/KA-SB, which uses the BioPax Level 3 ontology as the integration schema, and integrates UNIPROT, KEGG, CHEBI, BRENDA and SABIORK databases.

Springer - Publisher Connector

University of Toronto Research Repository

cPath: open source software for collecting, storing, and querying biological pathways

Author: A Birkland
A Bouchie
A Zanzoni
Benjamin E Gross
C Sander
CF Schaefer
Chris Sander
CM Lloyd
D Hanahan
EM Zdobnov
Ethan G Cerami
F Campagne
F Iragne
G Joshi-Tope
Gary D Bader
GD Bader
H Hermjakob
H Hermjakob
H Kitano
H Ogata
I Xenarios
J Kohler
KH Buetow
L Salwinski
L Stein
LD Stein
M Hucka
M Kanehisa
MP Cary
N Le Novere
N Le Novere
P Nurse
P Shannon
PD Karp
PD Karp
PJ Kersey
R Aragues
RT Fielding
S Peri
SP Shah
T Ideker
T Ideker
WC Hahn
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Biological pathways, including metabolic pathways, protein interaction networks, signal transduction pathways, and gene regulatory networks, are currently represented in over 220 diverse databases. These data are crucial for the study of specific biological processes, including human diseases. Standard exchange formats for pathway information, such as BioPAX, CellML, SBML and PSI-MI, enable convenient collection of this data for biological research, but mechanisms for common storage and communication are required. RESULTS: We have developed cPath, an open source database and web application for collecting, storing, and querying biological pathway data. cPath makes it easy to aggregate custom pathway data sets available in standard exchange formats from multiple databases, present pathway data to biologists via a customizable web interface, and export pathway data via a web service to third-party software, such as Cytoscape, for visualization and analysis. cPath is software only, and does not include new pathway information. Key features include: a built-in identifier mapping service for linking identical interactors and linking to external resources; built-in support for PSI-MI and BioPAX standard pathway exchange formats; a web service interface for searching and retrieving pathway data sets; and thorough documentation. The cPath software is freely available under the LGPL open source license for academic and commercial use. CONCLUSION: cPath is a robust, scalable, modular, professional-grade software platform for collecting, storing, and querying biological pathways. It can serve as the core data handling component in information systems for pathway visualization, analysis and modeling
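The abstract above mentions an identifier mapping service for linking identical interactors. As a rough illustration of that general idea only, and not of cPath's actual implementation, the sketch below groups records that share at least one external identifier using a union-find structure. The function name `link_identical_interactors`, the record names, and the database and accession labels are all hypothetical.

```python
from collections import defaultdict


def link_identical_interactors(records):
    """Group records that share at least one external identifier.

    `records` maps a record name to a set of (database, accession) pairs.
    Returns a sorted list of sorted name lists, one per linked group.
    This is an illustrative sketch, not the cPath algorithm.
    """
    parent = {name: name for name in records}

    def find(name):
        # Walk up to the group representative, halving the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Two records carrying the same identifier are treated as one interactor.
    first_seen = {}
    for name, identifiers in records.items():
        for identifier in identifiers:
            if identifier in first_seen:
                union(first_seen[identifier], name)
            else:
                first_seen[identifier] = name

    groups = defaultdict(set)
    for name in records:
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical records from three sources; identifiers are placeholders.
records = {
    "source_a:protein_1": {("DB_X", "X0001"), ("DB_Y", "Y0042")},
    "source_b:protein_9": {("DB_Y", "Y0042")},
    "source_c:protein_4": {("DB_X", "X0777")},
}
groups = link_identical_interactors(records)
print(groups)
# [['source_a:protein_1', 'source_b:protein_9'], ['source_c:protein_4']]
```

Linking is transitive in this sketch: if record A shares an identifier with B, and B shares a different one with C, all three end up in one group. Whether that behaviour is desirable depends on how trustworthy the cross-references are, which the abstract does not discuss.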

Springer - Publisher Connector

arXiv.org e-Print Archive

Ranking relations using analogies in biological and information networks

Author: Airoldi EM
Ghahramani Z
Heller K
Silva R
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/06/2010

Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects $\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B^{(N)}\}$, measures how well other pairs $A:B$ fit in with the set $\mathbf{S}$. Our work addresses the following question: is the relation between objects $A$ and $B$ analogous to those relations found in $\mathbf{S}$? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided. Comment: Published at http://dx.doi.org/10.1214/09-AOAS321 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)