8 research outputs found

    Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes

    An approach to heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to the underlying data in each database, and then maps these terms based on a computable model of disease (SNOMED CT®). The approach was implemented using sample datasets from fMRIDC, GEO, The Whole Brain Atlas and Neuronames, and allowed for complex queries such as “List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases based on the ontological model of the disease or its anatomical and morphological attributes”. Precision of the NLP-derived coding of the unstructured phenotypes in each dataset was 88% (n = 50), and precision of the semantic mapping between these terms across datasets was 98% (n = 100). To our knowledge, this is the first example of the use of both semantic decomposition of disease relationships and hierarchical information found in ontologies to integrate heterogeneous phenotypes across clinical and molecular datasets.
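    To make the query pattern concrete, here is a minimal sketch in Python of an ontology-anchored cross-database retrieval of the kind the abstract describes. All data structures, concept names, and record identifiers below are invented placeholders, not the paper's PhenOS implementation: a toy "finding site" relation selects disorders located in a given brain region, and per-database term mappings then gather the semantically related records.

```python
# Toy ontology: disorder -> set of finding-site concepts (SNOMED CT-style).
# All entries are illustrative, not real SNOMED CT content.
FINDING_SITE = {
    "Alzheimer disease": {"hippocampus", "cerebral cortex"},
    "Parkinson disease": {"substantia nigra"},
    "Multiple sclerosis": {"white matter", "spinal cord"},
}

# Toy mappings from ontology-anchored disorder terms to records in each
# participating database (record IDs are made up).
DATASETS = {
    "fMRIDC": {"Alzheimer disease": ["study-017"], "Multiple sclerosis": ["study-041"]},
    "GEO":    {"Alzheimer disease": ["GSE1297"], "Parkinson disease": ["GSE7621"]},
}

def disorders_with_finding_site(region: str) -> list[str]:
    """Return disorders whose finding site includes the given brain region."""
    return [d for d, sites in FINDING_SITE.items() if region in sites]

def cross_database_references(region: str) -> dict[str, list[str]]:
    """For each database, collect records semantically linked to any
    disorder whose finding site is the given region."""
    hits = disorders_with_finding_site(region)
    return {db: [r for d in hits for r in mapping.get(d, [])]
            for db, mapping in DATASETS.items()}

print(cross_database_references("hippocampus"))
# {'fMRIDC': ['study-017'], 'GEO': ['GSE1297']}
```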

    Ontologies for Bioinformatics

    The past twenty years have witnessed an explosion of biological data in diverse database formats governed by heterogeneous infrastructures. Not only do semantics (attribute terms) differ in meaning across databases, but their organization varies widely. Ontologies are a concept imported from computer science to describe the different conceptual frameworks that guide the collection, organization and publication of biological data. An ontology is similar to a paradigm but has very strict implications for formatting and meaning in a computational context. The use of ontologies is a means of communicating and resolving semantic and organizational differences between biological databases in order to enhance their integration. The purpose of interoperability (sharing across divergent storage and semantic protocols) is to allow scientists from around the world to share data and communicate with one another. This paper describes the rapid accumulation of biological data, its various organizational structures, and the role that ontologies play in interoperability.
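    A minimal sketch of the core idea, using invented database and attribute names: each database's local attribute term is anchored to a shared ontology concept, so a query phrased against the concept can be translated into each local vocabulary.

```python
# Local schema terms mapped to a shared ontology concept. Database names,
# attribute names, and concepts are all hypothetical.
TERM_TO_CONCEPT = {
    ("db_A", "gene_symbol"): "Gene",
    ("db_B", "locus_name"):  "Gene",
    ("db_A", "tissue"):      "AnatomicalStructure",
    ("db_B", "organ"):       "AnatomicalStructure",
}

def local_terms_for(concept: str) -> dict[str, str]:
    """Find, per database, the local attribute that realises a concept."""
    return {db: term for (db, term), c in TERM_TO_CONCEPT.items() if c == concept}

print(local_terms_for("Gene"))  # {'db_A': 'gene_symbol', 'db_B': 'locus_name'}
```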

    BioWarehouse: a bioinformatics database warehouse toolkit

    BACKGROUND: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. RESULTS: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) and facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, as well as the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and Java languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type illustrate the value of the data warehousing approach to bioinformatics research. CONCLUSION: BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
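    The example query (what fraction of EC-numbered enzyme activities lacks any public sequence?) has the shape sketched below. The table and column names are invented for illustration, not the actual BioWarehouse schema, and the toy data yields 33% rather than the paper's reported 36%; the point is only the cross-database SQL pattern that a warehouse makes possible.

```python
# Toy in-memory warehouse (hypothetical schema) demonstrating the query shape.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
CREATE TABLE protein_sequence (id INTEGER PRIMARY KEY, ec_number TEXT);
INSERT INTO enzyme_activity VALUES ('1.1.1.1'), ('2.7.7.6'), ('4.2.1.11');
INSERT INTO protein_sequence (ec_number) VALUES ('1.1.1.1'), ('4.2.1.11');
""")

# Percentage of EC-numbered activities with no matching sequence record.
row = con.execute("""
    SELECT 100.0 * COUNT(*) / (SELECT COUNT(*) FROM enzyme_activity)
    FROM enzyme_activity a
    WHERE NOT EXISTS (SELECT 1 FROM protein_sequence s
                      WHERE s.ec_number = a.ec_number)
""").fetchone()
print(f"{row[0]:.0f}% of EC-numbered activities lack a sequence")  # 33%
```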

    Biomedical informatics and translational medicine

    Biomedical informatics involves a core set of methodologies that can provide a foundation for crossing the "translational barriers" associated with translational medicine. To this end, the fundamental aspects of biomedical informatics (e.g., bioinformatics, imaging informatics, clinical informatics, and public health informatics) may be essential in helping improve the ability to bring basic research findings to the bedside, evaluate the efficacy of interventions across communities, and enable the assessment of the eventual impact of translational medicine innovations on health policies. Here, a brief description is provided for a selection of key biomedical informatics topics (Decision Support, Natural Language Processing, Standards, Information Retrieval, and Electronic Health Records) and their relevance to translational medicine. Based on contributions and advancements in each of these topic areas, the article proposes that biomedical informatics practitioners ("biomedical informaticians") can be essential members of translational medicine teams.

    Design and implementation of a cyberinfrastructure for RNA motif search, prediction and analysis

    RNA secondary and tertiary structure motifs play important roles in cells. However, very few web servers are available for RNA motif search and prediction. In this dissertation, a cyberinfrastructure, named RNAcyber, capable of performing RNA motif search and prediction, is proposed, designed and implemented. The first component of RNAcyber is a web-based search engine, named RmotifDB. This web-based tool integrates an RNA secondary structure comparison algorithm with the secondary structure motifs stored in the Rfam database. With a user-friendly interface, RmotifDB provides the ability to search for ncRNA structure motifs in both structural and sequential ways. The second component of RNAcyber is an enhanced version of RmotifDB. This enhanced version combines data from multiple sources, incorporates a variety of well-established structure-based search methods, and is integrated with the Gene Ontology. To display RmotifDB’s search results, a software tool, called RSview, is developed. RSview is able to display the search results in a graphical manner. Finally, RNAcyber contains a web-based tool called Junction-Explorer, which employs a data mining method for predicting tertiary motifs in RNA junctions. Specifically, the tool is trained on solved RNA tertiary structures obtained from the Protein Data Bank, and is able to predict the configuration of coaxial helical stacks and families (topologies) in RNA junctions at the secondary structure level. Junction-Explorer employs several algorithms for motif prediction, including a random forest classification algorithm, a pseudoknot removal algorithm, and a feature ranking algorithm based on the Gini impurity measure. A series of experiments including 10-fold cross-validation has been conducted to evaluate the performance of the Junction-Explorer tool. Experimental results demonstrate the effectiveness of the proposed algorithms and the superiority of the tool over existing methods. The RNAcyber infrastructure is fully operational, with all of its components accessible on the Internet.
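    The classification setup named in the abstract (a random forest evaluated with 10-fold cross-validation, plus Gini-impurity-based feature ranking) can be sketched as below. The feature names and labels are synthetic placeholders, not Junction-Explorer's actual feature set; scikit-learn's feature_importances_ attribute is the mean decrease in Gini impurity across the forest's trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
feature_names = ["helix_count", "loop_length", "gc_content", "junction_size"]
X = rng.random((200, len(feature_names)))
y = (X[:, 0] + X[:, 2] > 1.0).astype(int)  # toy label: coaxial stack present?

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("10-fold CV accuracy:", cross_val_score(clf, X, y, cv=10).mean())

clf.fit(X, y)
# Rank features by mean decrease in Gini impurity, largest first.
for name, imp in sorted(zip(feature_names, clf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```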

    MASSA: Multi-agent system to support functional annotation

    Unpublished doctoral thesis, Universidad Complutense de Madrid, Facultad de Informática, Departamento de Ingeniería del Software e Inteligencia Artificial, defended 23-11-2015.
    Predicting the biological function of Deoxyribonucleic Acid (DNA) sequences is one of the many challenges faced by Bioinformatics. This task is called functional annotation, and it is a complex, labor-intensive, and time-consuming process. The annotation has to be as accurate and reliable as possible given its impact on further research and annotations. To guarantee a high-quality outcome, each sequence should be manually studied and annotated by an expert. Although desirable, manual annotation is only feasible for small datasets or reference genomes. As the volume of genomic data has been increasing, especially since the advent of Next Generation Sequencing techniques, automatic implementations of this process are a necessity. Automatic annotation can handle huge amounts of data and produce consistent analyses. Moreover, it is faster and less expensive than the manual approach. However, its outcome is less precise than manual annotation and often has to be curated by an expert. Although collaborative processes of community annotation could address this expert bottleneck, such efforts have so far failed. Furthermore, the annotation problem, like many others in this domain, has to deal with heterogeneous information that is distributed and constantly evolving. A possible way to overcome these hurdles is to shift the focus of the process from individual experts to communities, and to design tools that facilitate the management of knowledge and resources. This work follows this approach, proposing MASSA, an architecture for a Multi-Agent System (MAS) to Support functional Annotation...
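    A highly simplified sketch of a multi-agent annotation pipeline of the general kind the abstract describes. The agent roles, confidence scores, and curation threshold below are invented for illustration and do not reflect the thesis's actual MASSA design: a coordinator farms a sequence out to specialised annotator agents and flags the result for expert curation when no agent is sufficiently confident.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    source: str
    function: str
    confidence: float

class SimilarityAgent:
    name = "similarity"
    def annotate(self, seq: str) -> Annotation:
        # Stand-in for a sequence-similarity search (e.g., a BLAST wrapper).
        return Annotation(self.name, "kinase (putative)", 0.55)

class MotifAgent:
    name = "motif"
    def annotate(self, seq: str) -> Annotation:
        # Stand-in for a domain/motif scan.
        return Annotation(self.name, "protein kinase domain", 0.80)

class Coordinator:
    def __init__(self, agents, curation_threshold: float = 0.7):
        self.agents = agents
        self.threshold = curation_threshold
    def run(self, seq: str):
        results = [a.annotate(seq) for a in self.agents]
        needs_expert = all(r.confidence < self.threshold for r in results)
        return results, needs_expert

results, flag = Coordinator([SimilarityAgent(), MotifAgent()]).run("ATGGCC...")
print(results, "expert curation needed:", flag)
```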

    Resolving semantic conflicts through ontological layering

    We examine the problem of semantic interoperability in modern software systems, which exhibit pervasiveness, a range of heterogeneities and, in particular, semantic heterogeneity of data models built upon ubiquitous data repositories. We investigate whether we can build ontologies upon heterogeneous data repositories in order to resolve semantic conflicts in them and achieve their semantic interoperability. We propose a layered software architecture, which accommodates ontological layering at its core, resulting in a Generic ontology for Context aware, Interoperable and Data sharing (Go-CID) software applications. The software architecture supports retrievals from various data repositories and resolves semantic conflicts which arise from heterogeneities inherent in them. It allows extensibility of heterogeneous data repositories through ontological layering, whilst preserving the autonomy of their individual elements. Our specific ontological layering for interoperable data repositories is based on clearly defined reasoning mechanisms for performing ontology mappings. The reasoning mechanisms depend on the user's involvement in retrievals and on the types of semantic conflicts that have to be resolved after identifying semantically related data. Ontologies are described in terms of ontological concepts and their semantic roles, which make the types of semantic conflicts explicit. We contextualise semantically related data through our own categorisation of semantic conflicts and their degrees of similarity. Our software architecture has been tested through a case study of retrievals of semantically related data across repositories in pervasive healthcare and deployed with Semantic Web technology. The extensions to the research results include the applicability of our ontological layering and reasoning mechanisms in various problem domains and in environments where we need to (i) establish if and when we have overlapping “semantics”, and (ii) infer/assert a correct set of “semantics” which can support any decision making in such domains.
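    To illustrate the layering idea in miniature: local repository concepts can be anchored to a shared upper-layer concept, and pairs of local concepts sharing an anchor are recognised as semantically related and tagged with a category of conflict to resolve. All repository names, concepts, and conflict categories below are invented for illustration, not the Go-CID implementation.

```python
# Local (repository, concept) pairs anchored to a shared upper-layer concept.
UPPER_LAYER = {
    ("hospital_db", "patient_temp_f"): "BodyTemperature",
    ("clinic_db",   "temperature_c"):  "BodyTemperature",
    ("hospital_db", "pt_id"):          "PatientIdentifier",
    ("clinic_db",   "patient_no"):     "PatientIdentifier",
}

# Hypothetical categorisation of the conflict between the anchored concepts.
CONFLICT = {
    "BodyTemperature":   "scaling conflict (Fahrenheit vs Celsius)",
    "PatientIdentifier": "naming conflict (same meaning, different labels)",
}

def semantically_related():
    """Group local concepts by shared anchor; report multi-repository
    anchors together with the category of conflict to resolve."""
    by_concept = {}
    for local, shared in UPPER_LAYER.items():
        by_concept.setdefault(shared, []).append(local)
    return {c: (locs, CONFLICT.get(c, "none"))
            for c, locs in by_concept.items() if len(locs) > 1}

for concept, (locs, conflict) in semantically_related().items():
    print(concept, locs, "->", conflict)
```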