Metamodel Instance Generation: A systematic literature review
Modelling and thus metamodelling have become increasingly important in
Software Engineering through the use of Model Driven Engineering. In this paper
we present a systematic literature review of instance generation techniques for
metamodels, i.e. the process of automatically generating models from a given
metamodel. We start by presenting a set of research questions that our review
is intended to answer. We then identify the main topics that are related to
metamodel instance generation techniques, and use these to initiate our
literature search. This search resulted in the identification of 34 key papers
in the area, and each of these is reviewed here and discussed in detail. The
outcome is that we are able to identify a knowledge gap in this field, and we
offer suggestions for potential directions for future research.
Heterogeneous biomedical database integration using a hybrid strategy: a p53 cancer research database.
Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and finding a small drug-like molecule able to restore native function to cancerous p53 mutants is a long-held goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)
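The hybrid strategy described above can be sketched as follows: stable, frequently used data is materialized in a local warehouse, while volatile data is fetched on demand through a mediator wrapper. This is a minimal illustration only; all names and values are invented and the actual CRDB schema is not reproduced here.

```python
# "Warehoused" data: materialized locally, refreshed in batch.
# (Mutant names, structures and scores below are illustrative.)
warehouse = {
    "p53_R175H": {"structure": "1TSR", "assay": "inactive"},
    "p53_R273H": {"structure": "4IBS", "assay": "inactive"},
}

def remote_docking_source(mutant):
    """Stand-in wrapper for a live source queried at request time
    (the mediation half of the hybrid strategy)."""
    docking = {"p53_R175H": -7.2, "p53_R273H": -6.8}
    return {"docking_score": docking.get(mutant)}

def mediated_query(mutant):
    """Combine warehoused and live data into one integrated record."""
    record = dict(warehouse.get(mutant, {}))
    record.update(remote_docking_source(mutant))
    return record

print(mediated_query("p53_R175H"))
```

The design choice is the usual warehouse/mediator trade-off: warehousing gives fast, stable access at the cost of staleness, while mediation gives freshness at the cost of query-time latency; the hybrid keeps each kind of data on the side where it fits best.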
Towards an Intelligent Database System Founded on the SP Theory of Computing and Cognition
The SP theory of computing and cognition, described in previous publications,
is an attractive model for intelligent databases because it provides a simple
but versatile format for different kinds of knowledge, it has capabilities in
artificial intelligence, and it can also function like established database
models when that is required.
This paper describes how the SP model can emulate other models used in
database applications and compares the SP model with those other models. The
artificial intelligence capabilities of the SP model are reviewed and its
relationship with other artificial intelligence systems is described. Also
considered are ways in which current prototypes may be translated into an
'industrial strength' working system.
Semantic technologies: from niche to the mainstream of Web 3? A comprehensive framework for web Information modelling and semantic annotation
Context: Web information technologies developed and applied in the last decade
have considerably changed the way web applications operate and have
revolutionised information management and knowledge discovery. Social
technologies, user-generated classification schemes and formal semantics have a
far-reaching sphere of influence. They promote collective intelligence, support
interoperability, enhance sustainability and instigate innovation.
Contribution: The research carried out and consequent publications follow the
various paradigms of semantic technologies, assess each approach, evaluate its
efficiency, identify the challenges involved and propose a comprehensive framework for web information modelling and semantic annotation, which is the thesis’ original contribution to knowledge. The proposed framework assists web information
modelling, facilitates semantic annotation and information retrieval, enables system interoperability and enhances information quality.
Implications: Semantic technologies coupled with social media and end-user
involvement can instigate innovative influence with wide organisational implications that can benefit a considerable range of industries. The scalable and sustainable business models of social computing and the collective intelligence of organisational social media can be resourcefully paired with internal research and knowledge from interoperable information repositories, back-end databases and legacy systems.
Semantified information assets can free human resources so that they can be used to better serve business development, support innovation and increase productivity.
The NASA Astrophysics Data System: Architecture
The powerful discovery capabilities available in the ADS bibliographic
services are possible thanks to the design of a flexible search and retrieval
system based on a relational database model. Bibliographic records are stored
as a corpus of structured documents containing fielded data and metadata, while
discipline-specific knowledge is segregated in a set of files independent of
the bibliographic data itself.
The creation and management of links to both internal and external resources
associated with each bibliography in the database is made possible by
representing them as a set of document properties and their attributes.
To improve global access to the ADS data holdings, a number of mirror sites
have been created by cloning the database contents and software on a variety of
hardware and software platforms.
The procedures used to create and manage the database and its mirrors have
been written as a set of scripts that can be run in either an interactive or
unsupervised fashion.
The ADS can be accessed at http://adswww.harvard.edu
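The representation of resource links as document properties with attributes can be sketched as below. This is an illustration of the idea, not ADS code; the property names and values are invented for the example.

```python
# A bibliographic record: fielded metadata plus a set of document
# properties, each carrying its own attributes. New resource types can
# be added as new properties without changing the record layout.
record = {
    "bibcode": "1999A&A...351..459C",
    "properties": {
        "electronic_text": {"url": "https://example.org/article", "count": 1},
        "citations": {"count": 42},
    },
}

def has_property(rec, name):
    """Check whether a record links to a given resource type."""
    return name in rec["properties"]

print(has_property(record, "citations"))  # True
```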
XML-based approaches for the integration of heterogeneous bio-molecular data
Background: Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, bio-medical and bioinformatics research, but also raising new problems for their integration and computational processing. Results: In this paper we survey the most interesting and novel approaches for the representation, integration and management of different kinds of biological data by exploiting XML and the related recommendations and approaches. Moreover, we present new and interesting cutting-edge approaches for the appropriate management of heterogeneous biological data represented through XML. Conclusion: XML has succeeded in the integration of heterogeneous biomolecular information, and has established itself as the syntactic glue for biological data sources. Nevertheless, a large variety of XML-based data formats have been proposed, hindering the effective integration of bioinformatics data schemes. The adoption of a few semantically rich standard formats is urgent to achieve a seamless integration of the current biological resources.
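The "syntactic glue" role of XML can be illustrated with a short sketch: two heterogeneous bio-data formats are parsed into one common record shape. The element names and accessions below are invented for the example, not taken from any real standard.

```python
import xml.etree.ElementTree as ET

# Two sources describe proteins with different element and attribute names.
source_a = "<proteins><protein id='P04637' name='p53'/></proteins>"
source_b = "<entries><entry acc='P38398'><label>BRCA1</label></entry></entries>"

def from_a(xml_text):
    """Map source A's attribute-based format to the common record shape."""
    return [{"id": p.get("id"), "name": p.get("name")}
            for p in ET.fromstring(xml_text).iter("protein")]

def from_b(xml_text):
    """Map source B's element-based format to the same record shape."""
    return [{"id": e.get("acc"), "name": e.findtext("label")}
            for e in ET.fromstring(xml_text).iter("entry")]

integrated = from_a(source_a) + from_b(source_b)
print(integrated)
```

XML makes the parsing uniform, but as the abstract notes, each format still needs its own mapping function; this per-format effort is exactly what a few shared standard formats would remove.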
Data integration support for offshore decommissioning waste management
Offshore oil and gas platforms have a design life of about 25 years whereas the techniques and tools
used for managing their data are constantly evolving. Therefore, data captured about platforms during
their lifetimes will be in varying forms. Additionally, due to the many stakeholders involved with a facility
over its life cycle, information representation of its components varies. These challenges make data
integration difficult. Over the years, data integration technology application in the oil and gas industry
has focused on meeting the needs of asset life cycle stages other than decommissioning. This is the
case because most assets are just reaching the end of their design lives.
Currently, limited work has
been done on integrating life cycle data for offshore decommissioning purposes, and reports by industry
stakeholders underscore this need.
This thesis proposes a method for the integration of the common data types relevant in oil and gas
decommissioning. The key features of the method are that it (i) ensures semantic homogeneity using
knowledge representation languages (Semantic Web) and domain specific reference data (ISO 15926);
and (ii) allows stakeholders to continue to use their current applications. Prototypes of the framework
have been implemented using open source software applications and performance measures made.
The work of this thesis has been motivated by the business case of reusing offshore decommissioning
waste items. The framework developed is generic and can be applied whenever there is a need to
integrate and query disparate data involving oil and gas assets. The prototypes presented show how
the data management challenges associated with assessing the suitability of decommissioned offshore
facility items for reuse can be addressed. The performance of the prototypes shows that significant time
and effort are saved compared to the state-of-the-art solution. The ability to do this effectively and
efficiently during decommissioning will advance the oil and gas industry's transition toward a
circular economy and help save on costs.
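The idea behind feature (i) above, achieving semantic homogeneity through shared reference data, can be sketched as follows: two stakeholders describe the same physical item in different local vocabularies, and mapping both to one reference-data class (in the thesis, drawn from ISO 15926) makes the records queryable together. All identifiers and terms here are invented, and plain tuples stand in for RDF triples.

```python
# Each stakeholder's records, as (subject, predicate, object) triples
# using their own local terminology.
operator_data = [("item-001", "type", "Xmas tree")]
contractor_data = [("asset-77", "kind", "christmas tree valve assembly")]

# Local terms mapped to one shared reference-data class (hypothetical IRI).
reference_map = {
    "Xmas tree": "rdl:ChristmasTree",
    "christmas tree valve assembly": "rdl:ChristmasTree",
}

def normalise(triples):
    """Replace local classification terms with reference-data classes."""
    return [(s, "a", reference_map.get(o, o)) for s, _, o in triples]

merged = normalise(operator_data) + normalise(contractor_data)

# Query across sources: every item classified as rdl:ChristmasTree.
matches = [s for s, _, c in merged if c == "rdl:ChristmasTree"]
print(matches)  # ['item-001', 'asset-77']
```

The point of feature (ii) is that this mapping happens in the integration layer, so each stakeholder keeps producing data in their existing applications and formats.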
ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining
Background
New systems biology studies require researchers to understand how interplay among myriads of biomolecular entities is orchestrated in order to achieve high-level cellular and physiological functions. Many software tools have been developed in the past decade to help researchers visually navigate large networks of biomolecular interactions with built-in template-based query capabilities. To further advance researchers' ability to interrogate global physiological states of cells through multi-scale visual network explorations, new visualization software tools still need to be developed to empower the analysis. A robust visual data analysis platform driven by database management systems to perform bi-directional data processing-to-visualizations with declarative querying capabilities is needed.
Results
We developed ProteoLens as a Java-based visual analytic software tool for creating, annotating and exploring multi-scale biological networks. It supports direct database connectivity to either Oracle or PostgreSQL database tables/views, on which SQL statements using both the Data Definition Language (DDL) and the Data Manipulation Language (DML) may be specified. The robust query languages embedded directly within the visualization software help users to bring their network data into a visualization context for annotation and exploration. ProteoLens supports graph/network data in the standard Graph Modeling Language (GML) format, and this enables interoperation with a wide range of other visual layout tools. The architectural design of ProteoLens enables the de-coupling of complex network data visualization tasks into two distinct phases: 1) creating network data association rules, which are mapping rules between network node IDs or edge IDs and data attributes such as functional annotations, expression levels, scores, synonyms, descriptions, etc.; 2) applying network data association rules to build the network and perform the visual annotation of graph nodes and edges according to associated data values. We demonstrated the advantages of these new capabilities through three biological network visualization case studies: human disease association network, drug-target interaction network and protein-peptide mapping network.
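The two-phase design described above can be sketched in a few lines: phase 1 defines association rules mapping node IDs to data attributes, and phase 2 applies them to annotate the network. This is an illustration of the decoupling idea only, not ProteoLens code; the genes, attributes and values are invented.

```python
# A small network: node IDs and edges between them.
network = {"nodes": ["TP53", "MDM2"], "edges": [("TP53", "MDM2")]}

# Phase 1: association rules — for each attribute, a mapping from
# node ID to value. Rules are defined independently of any network.
rules = {
    "expression": {"TP53": 2.4, "MDM2": 0.7},
    "description": {"TP53": "tumor suppressor"},
}

# Phase 2: apply the rules to annotate each node with whatever
# attribute values the rules provide for its ID.
annotated = {n: {attr: vals[n] for attr, vals in rules.items() if n in vals}
             for n in network["nodes"]}
print(annotated["TP53"])
```

Because the rules are separate from the network, the same rule set can be reapplied as expression data is refreshed, and the same network can be annotated with different rule sets, which is the payoff of the decoupling.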
Conclusion
The architectural design of ProteoLens makes it suitable for bioinformatics expert data analysts who are experienced with relational database management to perform large-scale integrated network visual explorations. ProteoLens is a promising visual analytic platform that will facilitate knowledge discoveries in future network and systems biology studies.