An ontology to standardize research output of nutritional epidemiology: from paper-based standards to linked content
Background: The use of linked data in the Semantic Web is a promising approach to add value to nutrition research. An ontology, which defines the logical relationships between well-defined taxonomic terms, enables linking and harmonizing research output. To enable the description of domain-specific output in nutritional epidemiology, we propose the Ontology for Nutritional Epidemiology (ONE) according to authoritative guidance for nutritional epidemiology.
Methods: Firstly, a scoping review was conducted to identify existing ontology terms for reuse in ONE. Secondly, existing data standards and reporting guidelines for nutritional epidemiology were converted into an ontology. The terms used in the standards were summarized and listed separately in a taxonomic hierarchy. Thirdly, the ontologies of the nutritional epidemiologic standards, reporting guidelines, and the core concepts were gathered in ONE. Three case studies were included to illustrate potential applications: (i) annotation of existing manuscripts and data, (ii) ontology-based inference, and (iii) estimation of reporting completeness in a sample of nine manuscripts.
Results: Ontologies for food and nutrition (n = 37), disease and specific population (n = 100), data description (n = 21), research description (n = 35), and supplementary (meta) data description (n = 44) were reviewed and listed. ONE consists of 339 classes: 79 new classes to describe data and 24 new classes to describe the content of manuscripts.
Conclusion: ONE is a resource to automate data integration, searching, and browsing, and can be used to assess reporting completeness in nutritional epidemiology
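As a rough illustration of the reporting-completeness estimation described in the case studies, the following sketch checks a manuscript's ontology annotations against a required-term checklist. The term names are hypothetical examples, not ONE's actual classes.

```python
# Illustrative sketch: assessing reporting completeness against a checklist
# of ontology terms. The term names below are hypothetical, not the actual
# classes defined in ONE.

REQUIRED_TERMS = {
    "dietary assessment method",
    "study design",
    "outcome definition",
    "population description",
}

def reporting_completeness(annotated_terms):
    """Return the fraction of required checklist terms found in a
    manuscript's set of ontology annotations."""
    found = REQUIRED_TERMS & set(annotated_terms)
    return len(found) / len(REQUIRED_TERMS)

manuscript = {"study design", "population description", "follow-up time"}
score = reporting_completeness(manuscript)  # 2 of 4 required terms -> 0.5
```

A real assessment would resolve synonyms through the ontology's class hierarchy rather than match literal strings.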
Towards structured sharing of raw and derived neuroimaging data across existing resources
Data sharing efforts increasingly contribute to the acceleration of
scientific discovery. Neuroimaging data is accumulating in distributed
domain-specific databases and there is currently no integrated access mechanism
nor an accepted format for the critically important meta-data that is necessary
for making use of the combined, available neuroimaging data. In this
manuscript, we present work from the Derived Data Working Group, an open-access
group sponsored by the Biomedical Informatics Research Network (BIRN) and the
International Neuroinformatics Coordinating Facility (INCF) focused on practical
tools for distributed access to neuroimaging data. The working group develops
models and tools facilitating the structured interchange of neuroimaging
meta-data and is making progress towards a unified set of tools for such data
and meta-data exchange. We report on the key components required for integrated
access to raw and derived neuroimaging data as well as associated meta-data and
provenance across neuroimaging resources. The components include (1) a
structured terminology that provides semantic context to data, (2) a formal
data model for neuroimaging with robust tracking of data provenance, (3) a web
service-based application programming interface (API) that provides a
consistent mechanism to access and query the data model, and (4) a provenance
library that can be used for the extraction of provenance data by image
analysts and imaging software developers. We believe that the framework and set
of tools outlined in this manuscript have great potential for solving many of
the issues the neuroimaging community faces when sharing raw and derived
neuroimaging data across the various existing database systems for the purpose
of accelerating scientific discovery
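The provenance-tracking idea behind components (2) and (4) can be conveyed with a minimal, self-contained sketch. This is an illustration of the concept only, not the BIRN/INCF provenance library itself, and the file and step names are invented.

```python
# Minimal sketch of provenance tracking for derived neuroimaging data.
# All names are invented; this is not the working group's actual tool.

from dataclasses import dataclass, field

@dataclass
class Record:
    name: str                      # e.g. an image file produced at some step
    derived_from: list = field(default_factory=list)
    process: str = ""              # the analysis step that produced this record

def lineage(record):
    """Walk back through derived_from links and return every ancestor name."""
    ancestors = []
    stack = list(record.derived_from)
    while stack:
        r = stack.pop()
        ancestors.append(r.name)
        stack.extend(r.derived_from)
    return ancestors

raw = Record("subject01_T1.nii")
skull_stripped = Record("subject01_brain.nii", [raw], "skull stripping")
segmented = Record("subject01_seg.nii", [skull_stripped], "tissue segmentation")
```

With such links in place, any derived result can be traced back to the raw acquisition it came from, which is the property the working group's formal data model provides at scale.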
Current advances in systems and integrative biology
Systems biology has gained a tremendous amount of interest in the last few years. This is partly due to the realization that traditional approaches focusing only on a few molecules at a time cannot describe the impact of aberrant or modulated molecular environments across a whole system. Furthermore, a hypothesis-driven study aims to prove or disprove its postulations, whereas a hypothesis-free systems approach can yield an unbiased and novel testable hypothesis as an end-result. This latter approach foregoes assumptions which predict how a biological system should react to an altered microenvironment within a cellular context, across a tissue or impacting on distant organs. Additionally, re-use of existing data by systematic data mining and re-stratification, one of the cornerstones of integrative systems biology, is also gaining attention. While tremendous efforts using a systems methodology have already yielded excellent results, it is apparent that a lack of suitable analytic tools and purpose-built databases poses a major bottleneck in applying a systematic workflow. This review addresses the current approaches used in systems analysis and obstacles often encountered in large-scale data analysis and integration which tend to go unnoticed, but have a direct impact on the final outcome of a systems approach. Its wide applicability, ranging from basic research, disease descriptors, pharmacological studies, to personalized medicine, makes this emerging approach well suited to address biological and medical questions where conventional methods are not ideal
Conceptualization of Computational Modeling Approaches and Interpretation of the Role of Neuroimaging Indices in Pathomechanisms for Pre-Clinical Detection of Alzheimer Disease
With swift advancements in next-generation sequencing technologies alongside the voluminous growth of biological data, a variety of data resources such as databases and web services have been created to facilitate data management, accessibility, and analysis. However, interoperability between dynamically growing data resources is an increasingly rate-limiting step in biomedicine, specifically concerning neurodegeneration. Over the years, massive investments and technological advancements in dementia research have resulted in large proportions of unmined data. Accordingly, there is an essential need for intelligent and integrative approaches to mine available data and substantiate novel research outcomes. Semantic frameworks provide a unique possibility to integrate multiple heterogeneous, high-resolution data resources with semantic integrity, using standardized ontologies and vocabularies for context-specific domains. In this work, (i) the functionality of a semantically structured terminology for mining pathway-relevant knowledge from the literature, called the Pathway Terminology System, is demonstrated and (ii) a context-specific, high-granularity semantic framework for neurodegenerative diseases, known as NeuroRDF, is presented. Neurodegenerative disorders are especially complex as they are characterized by widespread manifestations and the potential for dramatic alterations in disease progression over time. Early detection and prediction strategies based on clinical pointers can provide promising solutions for effective treatment of Alzheimer's disease (AD). We further present the importance of bridging the gap between clinical and molecular biomarkers to contribute effectively to dementia research, and address the need for a formalized framework, called NIFT, to automatically mine relevant clinical knowledge from the literature for substantiating high-resolution cause-and-effect models
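The kind of semantic integration a framework such as NeuroRDF performs over RDF can be caricatured with a tiny in-memory triple store. The facts and predicate names below are illustrative only; a real system would use RDF serializations and SPARQL rather than Python tuples.

```python
# Toy triple store illustrating how a semantic framework links
# heterogeneous facts. Predicate names are invented for this sketch.

triples = set()

def add(s, p, o):
    triples.add((s, p, o))

def query(s=None, p=None, o=None):
    """Return triples matching the given pattern; None acts as a wildcard,
    much like an unbound variable in a SPARQL triple pattern."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

# Linking molecular and clinical facts drawn from different resources:
add("APP", "associated_with", "Alzheimer's disease")
add("APP", "participates_in", "amyloid processing")
add("Alzheimer's disease", "has_clinical_sign", "memory impairment")
```

Once facts from separate resources share identifiers and predicates, a single pattern query can traverse from a gene to the clinical signs of the disease it is associated with, which is the integration benefit the abstract describes.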
A Knowledge-based Integrative Modeling Approach for In-Silico Identification of Mechanistic Targets in Neurodegeneration with Focus on Alzheimer's Disease
Dementia is the progressive decline in cognitive function due to damage or disease in the body beyond what might be expected from normal aging. Based on neuropathological and clinical criteria, dementia includes a spectrum of diseases, namely Alzheimer's dementia, Parkinson's dementia, Lewy Body disease, Alzheimer's dementia with Parkinson's, Pick's disease, Semantic dementia, and large and small vessel disease. It is thought that these disorders result from a combination of genetic and environmental risk factors. Despite the knowledge that has accumulated about pathophysiological and clinical characteristics of the disease, no coherent and integrative picture of the molecular mechanisms underlying neurodegeneration in Alzheimer's disease is available. Existing drugs only offer symptomatic relief to the patients and lack any efficient disease-modifying effects. The present research proposes a knowledge-based rationale towards integrative modeling of disease mechanisms for identifying potential candidate targets and biomarkers in Alzheimer's disease. Integrative disease modeling is an emerging knowledge-based paradigm in translational research that exploits the power of computational methods to collect, store, integrate, model and interpret accumulated disease information across different biological scales, from molecules to phenotypes. It prepares the ground for transitioning from 'descriptive' to 'mechanistic' representation of disease processes. The proposed approach was used to introduce an integrative framework, which integrates, on one hand, knowledge extracted from the literature using semantically supported text-mining technologies and, on the other hand, primary experimental data such as gene/protein expression or imaging readouts.
The aim of such a hybrid integrative modeling approach was not only to provide a consolidated systems view on the disease mechanism as a whole, but also to increase the specificity and sensitivity of the mechanistic model by providing disease-specific context. This approach was successfully used for correlating clinical manifestations of the disease to their corresponding molecular events, and led to the identification and modeling of three important mechanistic components underlying Alzheimer's dementia, namely the CNS, immune system, and endocrine components. These models were validated using a novel in-silico validation method, namely biomarker-guided pathway analysis, and a pathway-based target identification approach was introduced, which resulted in the identification of the MAPK signaling pathway as a potential candidate target at the crossroads of the three components underlying the disease mechanism in Alzheimer's dementia
Discovering lesser known molecular players and mechanistic patterns in Alzheimer's disease using an integrative disease modelling approach
Convergence of exponentially advancing technologies is driving medical research with life-changing discoveries. In contrast, repeated failures of high-profile drugs to battle Alzheimer's disease (AD) have made it one of the least successful therapeutic areas. This failure pattern has provoked researchers to grapple with their beliefs about Alzheimer's aetiology. Thus, the growing realisation that Amyloid-β and tau are not 'the' but rather 'one of the' factors necessitates the reassessment of pre-existing data to add new perspectives. To enable a holistic view of the disease, integrative modelling approaches are emerging as a powerful technique. Combining data at different scales and modes could considerably increase the predictive power of the integrative model by filling biological knowledge gaps. However, the reliability of the derived hypotheses largely depends on the completeness, quality, consistency, and context-specificity of the data. Thus, there is a need for agile methods and approaches that efficiently interrogate and utilise existing public data. This thesis presents the development of novel approaches and methods that address intrinsic issues of data integration and analysis in AD research. It aims to prioritise lesser-known AD candidates using highly curated and precise knowledge derived from integrated data. Here, much of the emphasis is put on quality, reliability, and context-specificity. This thesis work showcases the benefit of integrating well-curated and disease-specific heterogeneous data in a semantic web-based framework for mining actionable knowledge. Furthermore, it introduces the challenges encountered while harvesting information from literature and transcriptomic resources. State-of-the-art text-mining methodology is developed to extract miRNAs and their regulatory roles in diseases and genes from the biomedical literature.
To enable meta-analysis of biologically related transcriptomic data, a highly curated metadata database has been developed, which explicates annotations specific to human and animal models. Finally, to corroborate common mechanistic patterns, embedded with novel candidates, across large-scale AD transcriptomic data, a new approach to generate gene regulatory networks has been developed. The work presented here has demonstrated its capability in identifying testable mechanistic hypotheses containing previously unknown or emerging knowledge from public data in two major publicly funded projects for Alzheimer's disease, Parkinson's disease, and epilepsy
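The gene-regulatory-network step can be given a bare-bones flavour with a co-expression sketch: genes whose expression profiles correlate strongly are connected by an edge. Real network inference is far richer than this, and the gene names, expression values, and threshold below are invented for illustration.

```python
# Illustrative sketch: deriving a gene co-expression network from
# transcriptomic profiles. All values and names are made up.

from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

expression = {              # hypothetical expression values across samples
    "MAPT":  [1.0, 2.0, 3.0, 4.0],
    "APP":   [1.1, 2.1, 2.9, 4.2],
    "GAPDH": [5.0, 5.1, 4.9, 5.0],
}

def coexpression_edges(profiles, threshold=0.9):
    """Connect every gene pair whose |correlation| meets the threshold."""
    genes = sorted(profiles)
    return [(g1, g2) for i, g1 in enumerate(genes) for g2 in genes[i + 1:]
            if abs(pearson(profiles[g1], profiles[g2])) >= threshold]
```

In this toy data, only MAPT and APP co-vary strongly enough to be linked; a flat housekeeping-style profile such as GAPDH stays unconnected.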
A FRAMEWORK FOR BIOPROFILE ANALYSIS OVER GRID
An important trend in modern medicine is towards individualisation of healthcare to tailor
care to the needs of the individual. This makes it possible, for example, to personalise
diagnosis and treatment to improve outcome. However, the benefits of this can only be fully
realised if healthcare and ICT resources are exploited (e.g. to provide access to relevant data,
analysis algorithms, knowledge and expertise). Potentially, grid can play an important role
in this by allowing sharing of resources and expertise to improve the quality of care. The
integration of grid and the new concept of bioprofile represents a new topic in the healthgrid
for individualisation of healthcare.
A bioprofile represents a personal dynamic "fingerprint" that fuses together a person's
current and past bio-history, biopatterns and prognosis. It combines not just data, but also
analysis and predictions of future or likely susceptibility to disease, such as brain diseases
and cancer. The creation and use of bioprofile require the support of a number of healthcare
and ICT technologies and techniques, such as medical imaging and electrophysiology and
related facilities, analysis tools, data storage and computation clusters. The need to share
clinical data, storage and computation resources between different bioprofile centres creates
not only local problems, but also global problems.
Existing ICT technologies are inappropriate for bioprofiling because of the difficulties in the
use and management of heterogeneous IT resources at different bioprofile centres. Grid as an
emerging resource sharing concept fulfils the needs of bioprofile in several aspects, including
discovery, access, monitoring and allocation of distributed bioprofile databases, computation
resources, bioprofile knowledge bases, etc. However, the challenge of how to integrate the
grid and bioprofile technologies together in order to offer an advanced distributed bioprofile
environment to support individualized healthcare remains.
The aim of this project is to develop a framework for one of the key meta-level bioprofile
applications: bioprofile analysis over grid to support individualised healthcare. Bioprofile
analysis is a critical part of bioprofiling (i.e. the creation, use and update of bioprofiles).
Analysis makes it possible, for example, to extract markers from data for diagnosis and to
assess an individual's health status. The framework provides a basis for a "grid-based" solution
to the challenge of "distributed bioprofile analysis" in bioprofiling. The main contributions
of the thesis are fourfold:
A. An architecture for bioprofile analysis over grid. The design of a suitable architecture
is fundamental to the development of any ICT system. The architecture creates a
means for categorisation, determination and organisation of core grid components to
support the development and use of grid for bioprofile analysis;
B. A service model for bioprofile analysis over grid. The service model proposes a
service design principle, a service architecture for bioprofile analysis over grid, and
a distributed EEG analysis service model. The service design principle addresses
the main service design considerations behind the service model, in the aspects of
usability, flexibility, extensibility, reusability, etc. The service architecture identifies
the main categories of services and outlines an approach in organising services to
realise certain functionalities required by distributed bioprofile analysis applications.
The EEG analysis service model demonstrates the utilisation and development of
services to enable bioprofile analysis over grid;
C. Two grid test-beds and a practical implementation of EEG analysis over grid. The two
grid test-beds: the BIOPATTERN grid and PlymGRID are built based on existing
grid middleware tools. They provide essential experimental platforms for research in
bioprofiling over grid. The work here demonstrates how resources, grid middleware
and services can be utilised, organised and implemented to support distributed EEG
analysis for early detection of dementia. The distributed Electroencephalography
(EEG) analysis environment can be used to support a variety of research activities in
EEG analysis;
D. A scheme for organising multiple (heterogeneous) descriptions of individual grid
entities for knowledge representation of grid. The scheme solves the compatibility
and adaptability problems in managing heterogeneous descriptions (i.e. descriptions
using different languages and schemas/ontologies) for collaborated representation of
a grid environment in different scales. It underpins the concept of bioprofile analysis
over grid in the aspect of knowledge-based global coordination between components
of bioprofile analysis over grid
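The heterogeneous-description problem addressed by contribution D can be illustrated with a toy schema-mapping sketch: two centres describe the same grid entity in different local schemas, and a shared vocabulary reconciles them. All field names and the mapping are hypothetical, not the actual languages or ontologies used in the BIOPATTERN grid.

```python
# Sketch of reconciling multiple heterogeneous descriptions of one grid
# entity. Schemas and field names are invented for illustration.

# Two centres describe the same compute resource with different schemas.
desc_a = {"id": "node-01", "cpu_count": 8, "memory_gb": 32}
desc_b = {"id": "node-01", "cores": 8, "os": "Linux"}

# A simple mapping translates each local field into a shared term.
SCHEMA_MAP = {"cpu_count": "cores", "memory_gb": "memory_gb",
              "cores": "cores", "os": "os", "id": "id"}

def to_shared(desc):
    """Rewrite a local description using the shared vocabulary."""
    return {SCHEMA_MAP[k]: v for k, v in desc.items()}

def merge(*descriptions):
    """Merge per-centre descriptions of one entity into a single view."""
    merged = {}
    for d in descriptions:
        merged.update(to_shared(d))
    return merged
```

The thesis scheme operates at the level of whole description languages and ontologies rather than flat dictionaries, but the principle is the same: map local terms into a common vocabulary before combining them.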
NaviCell: a web-based environment for navigation, curation and maintenance of large molecular interaction maps
Molecular biology knowledge can be systematically represented in a
computer-readable form as a comprehensive map of molecular interactions. There
exist a number of maps of molecular interactions containing detailed
description of various cell mechanisms. It is difficult to explore these large
maps, to comment their content and to maintain them. Though there exist several
tools addressing these problems individually, the scientific community still
lacks an environment that combines these three capabilities together. NaviCell
is a web-based environment for exploiting large maps of molecular interactions,
created in CellDesigner, allowing their easy exploration, curation and
maintenance. NaviCell combines three features: (1) efficient map browsing based
on Google Maps engine; (2) semantic zooming for viewing different levels of
details or of abstraction of the map and (3) integrated web-based blog for
collecting the community feedback. NaviCell can be easily used by experts in
the field of molecular biology for studying molecular entities of their
interest in the context of signaling pathways and cross-talks between pathways
within a global signaling network. NaviCell allows both exploration of detailed
molecular mechanisms represented on the map and a more abstract view of the map
up to a top-level modular representation. NaviCell facilitates curation,
maintenance and updating the comprehensive maps of molecular interactions in an
interactive fashion due to an embedded blogging system. NaviCell provides an
easy way to explore large-scale maps of molecular interactions, thanks to the
Google Maps and WordPress interfaces, already familiar to many users. Semantic
zooming used for navigating geographical maps is adopted for molecular maps in
NaviCell, making any level of visualization meaningful to the user. In
addition, NaviCell provides a framework for community-based map curation.
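Semantic zooming, as adopted by NaviCell, can be conveyed by a toy level-of-detail filter: the deeper the zoom, the finer-grained the map elements shown. The element kinds and zoom cut-offs below are invented for this sketch.

```python
# Toy illustration of semantic zooming on a molecular map: which element
# kinds are visible depends on the zoom level. Names and cut-offs are
# invented for this sketch.

MAP_ELEMENTS = [
    ("signalling module", 0),    # visible from the top-level view onwards
    ("pathway", 1),
    ("protein complex", 2),
    ("phosphorylation site", 3),
]

def visible_at(zoom_level):
    """Return the element kinds shown at a given zoom level; deeper zoom
    reveals finer-grained detail, as on a geographical map."""
    return [name for name, min_zoom in MAP_ELEMENTS if min_zoom <= zoom_level]
```

At the top level only modules appear, giving the abstract overview the abstract describes; zooming in progressively reveals pathways, complexes, and finally individual modification sites.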
Non-coding RNA regulatory networks
It is well established that the vast majority of human RNA transcripts do not encode proteins and that non-coding RNAs regulate cell physiology and shape cellular functions. A subset of them is involved in gene regulation at different levels, from epigenetic gene silencing to post-transcriptional regulation of mRNA stability. Notably, the aberrant expression of many non-coding RNAs has been associated with aggressive pathologies. Rapid advances in network biology indicate that the robustness of cellular processes is the result of specific properties of biological networks, such as scale-free degree distribution and hierarchical modularity, suggesting that regulatory network analyses could provide new insights into gene regulation and dysfunction mechanisms. In this study we present an overview of public repositories where non-coding RNA regulatory interactions are collected and annotated, discuss unresolved questions for data integration, and review existing resources to build and analyse networks
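The scale-free property mentioned above can be probed by computing a node-degree distribution over a regulatory interaction list, sketched below with invented miRNA-target pairs.

```python
# Small sketch of a network-biology computation: the node-degree
# distribution of a list of regulatory interactions. The miRNA/target
# names are invented for illustration.

from collections import Counter

interactions = [
    ("miR-21", "PTEN"), ("miR-21", "PDCD4"), ("miR-21", "SPRY1"),
    ("miR-155", "SOCS1"), ("let-7", "RAS"),
]

def degree_distribution(edges):
    """Count, for each degree k, how many nodes have exactly k edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())

dist = degree_distribution(interactions)
# In a scale-free network, low-degree nodes dominate and hubs are rare;
# here miR-21 is the single hub with degree 3.
```

On real repositories of non-coding RNA interactions, plotting this distribution on log-log axes is the usual first check for the approximate power-law shape that scale-free topology implies.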