4 research outputs found

    Remote Data Retrieval for Bioinformatics Applications: An Agent Migration Approach

    Several approaches have been developed to retrieve data automatically from one or more remote biological data sources. However, most of them require researchers to remain online and wait for the returned results. This not only requires a highly available network connection but may also overload the network. Moreover, none of the existing approaches has been designed to address the following problems that arise when retrieving remote data in a mobile network environment: (1) the resources of mobile devices are limited; (2) the network connection is of relatively low quality; and (3) mobile users are not always online. To address these problems, we integrate an agent migration approach with a multi-agent system, which overcomes high latency and limited bandwidth by moving computations to the required resources or services. More importantly, the approach is well suited to mobile computing environments. This paper also presents the system architecture, the migration strategy, and the security authentication of agent migration. As a demonstration, remote data retrieval from GenBank is used to illustrate the feasibility of the proposed approach.

    Conceptual Modeling Applied to Genomics: Challenges Faced in Data Loading

    Today's genomic domain revolves around insecurity: too many imprecise concepts, too much information to be properly managed. Considering that conceptualization is the most exclusively human characteristic, it makes full sense to try to conceptualize the principles that guide the essence of why humans are as we are. This question can of course be generalized to any species, but in this work we are especially interested in showing how conceptual modeling is strictly required to understand the "execution model" that human beings "implement". The main aim is to defend the idea that the Human Genome can be properly understood only through in-depth knowledge of the Conceptual Model associated with it. This Model-Driven perspective on the Human Genome opens challenging possibilities: individuals are viewed as implementations of that Conceptual Model, in which different values associated with different modeling primitives explain the diversity among individuals and the potential, unexpected variations together with their unwanted effects in terms of illnesses. This work focuses on the challenges faced in loading data from conventional resources into Information Systems created according to the above-mentioned conceptual modeling approach. It reports on various loading efforts, the problems encountered and the solutions to these problems. It also argues strongly why conventional methods for solving the so-called "data chaos" problems associated with the genomics domain so often fail to meet the demands. Van Der Kroon, M. (2011). Conceptual Modeling Applied to Genomics: Challenges Faced in Data Loading. http://hdl.handle.net/10251/16993

    SEMEDA (Semantic Meta-Database) : ontology based semantic integration of biological databases

    Köhler J. SEMEDA (Semantic Meta-Database) : ontology based semantic integration of biological databases. Bielefeld (Germany): Bielefeld University; 2003. The work presented in this thesis is outlined in the following. The state of the art in the relevant disciplines is introduced and reviewed in chapter 2. This includes, on the one hand, the current state of molecular biological databases, their heterogeneity and their integration. On the other hand, the current usage of ontologies is described, both in general and with special regard to database integration. The principles of semantic database integration introduced in this thesis are new and are also suitable for other database integration systems that have to deal with a large number of semantically heterogeneous databases. Therefore, chapter 3 presents the newly introduced principles for ontology based semantic database integration independently of their implementation. Chapter 4 introduces the requirements for the implementation of a semantic database integration system (SEMEDA). Several general requirements for the integration of molecular biological systems, taken from the scientific literature, are discussed with regard to the feasibility of their implementation in general and in SEMEDA. In addition, the requirements specific to semantic database integration are introduced, and it is analysed how the BioDataServer is used to overcome "technical" heterogeneity, so that SEMEDA only has to deal with semantic heterogeneity. In chapter 5, an appropriate data structure is developed for storing ontologies, database metadata and the semantic definitions described in chapter 3. Subsequently, it is discussed how this data structure can be edited and queried. Chapter 6 presents SEMEDA's software design, implementation and system architecture. Chapter 7 describes the use of SEMEDA and its interfaces. The user interface SEMEDA-edit is used to collaboratively edit ontologies and to semantically define databases using ontologies. SEMEDA-query is the query interface that provides uniform access to heterogeneous databases. In addition, a set of procedures exists which can be used by external applications. In order to use SEMEDA to semantically define databases, an appropriate ontology is needed. Although SEMEDA allows ontologies to be built from scratch, generating ontologies is a labour-intensive and time-consuming task, so it would be preferable to use an existing ontology. Therefore, in chapter 8 several ontologies are evaluated for their usability in SEMEDA. The intention was to find out whether a suitable ontology could be found and imported or whether it would be more appropriate to build a custom ontology for SEMEDA. It turned out that the existing ontologies were not well suited for semantic database integration. In chapter 9, general and SEMEDA-specific ontology design principles are introduced, which were then followed to build a custom ontology for database integration. The structure of this custom ontology and some issues concerning its use for semantic database integration are explained. In chapter 10, the practical use of SEMEDA is described by two examples. The first section of this chapter shows how SEMEDA supports the building of user schemata for the BioDataServer. The second section describes how the clone database of the RZPD Berlin (Deutsches Ressourcenzentrum für Genomforschung GmbH) is connected to SEMEDA and thus linked to the other databases. In the discussion (chapter 11), SEMEDA is compared to existing database integration systems, especially other ontology based integration systems. It is further discussed how the principles for semantic database integration apply to other database integration systems and how they might be implemented there. A database mirror is proposed to improve the overall performance of SEMEDA and the BioDataServer.

    A model system for studying the integration of molecular biology databases.

    MOTIVATION: Integration of molecular biology databases remains limited in practice despite its practical importance and considerable research effort. The complexity of the problem is such that an experimental approach is mandatory, yet this very complexity makes it hard to design definitive experiments. This dilemma is common in science, and one tried-and-true strategy is to work with model systems. We propose a model system for this problem, namely a database of genes integrating diverse data across organisms, and describe an experiment using this model. RESULTS: We attempted to construct a database of human and mouse genes integrating data from GenBank and the human and mouse genome-databases. We discovered numerous errors in these well-respected databases: approximately 15% of genes are apparently missing from the genome-databases; links between the sequence and genome-databases are missing for another 5-10% of the cases; about a third of likely homology links are missing between the genome-databases; 10-20% of entries classified as 'genes' are apparently misclassified. By using a model system, we were able to study the problems caused by anomalous data without having to face all the hard problems of database integration.