
    An object query language for multimedia federations

    The Fischlar system provides a large centralised repository of multimedia files. As expansion is difficult in centralised systems, and as different user groups need to define their own schemas, the EGTV (Efficient Global Transactions for Video) project was established to examine how the distribution of this database could be managed. The federated database approach is advocated, in which the global schema is designed top-down while all multimedia and textual data is stored in object-oriented (O-O) and object-relational (O-R) compliant databases. This thesis investigates queries and updates on large multimedia collections organised in the database federation. The goal of this research is to provide a generic query language capable of interrogating global and local multimedia database schemas. Therefore, a new query language, EQL, is defined to facilitate the querying of object-oriented and object-relational database schemas in a database- and platform-independent manner, and acts as a canonical language for database federations. A new canonical language was required because the existing query language standards (SQL:1999 and OQL) are generally incompatible and translation between them is not trivial. EQL is supported by a formally defined object algebra and specified semantics for query evaluation. The ability to capture and store metadata for multiple database schemas is essential when constructing and querying a federated schema. Therefore we also present a new platform-independent metamodel for specifying multimedia schemas stored in both object-oriented and object-relational databases. This metadata is later used for the construction of global schemas and during the evaluation of local and global queries. Another important feature of any federated system is the ability to unambiguously define database schemas. The schema definition language for an EGTV database federation must be capable of specifying both object-oriented and object-relational schemas in a database-independent format. As XML is a standard for encoding and distributing data across various platforms, a language based upon XML has been developed as part of our research. The ODLx (Object Definition Language XML) language specifies a set of XML-based structures for defining complex database schemas capable of representing different multimedia types. The language is fully integrated with the EGTV metamodel, through which ODLx schemas can be mapped to O-O and O-R databases.
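For illustration only, the short Python sketch below builds an XML fragment in the spirit of what the abstract describes for ODLx: a class definition for a multimedia type with typed attributes and a relationship. The element names, attribute names, and the VideoClip example are hypothetical assumptions and are not taken from the actual ODLx specification.

```python
# Hypothetical ODLx-style XML fragment built with the standard library;
# element/attribute names are illustrative, not the real ODLx vocabulary.
import xml.etree.ElementTree as ET

def video_clip_class() -> ET.Element:
    """Build an XML description of a simple multimedia class."""
    cls = ET.Element("class", name="VideoClip")
    ET.SubElement(cls, "attribute", name="title", type="string")
    ET.SubElement(cls, "attribute", name="duration", type="interval")
    # A multimedia-typed attribute, since ODLx aims to represent media content.
    ET.SubElement(cls, "attribute", name="content", type="video")
    # A relationship to another class in the federated schema.
    ET.SubElement(cls, "relationship", name="programme",
                  target="Programme", cardinality="many-to-one")
    return cls

if __name__ == "__main__":
    print(ET.tostring(video_clip_class(), encoding="unicode"))
```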

    Integration of Legacy and Heterogeneous Databases


    An automated ETL for online datasets

    While using online datasets for machine learning is commonplace today, the quality of these datasets impacts the performance of prediction algorithms. One method for improving the semantics of new data sources is to map these sources to a common data model or ontology. While semantic and structural heterogeneities must still be resolved, this provides a well-established approach to producing clean datasets suitable for machine learning and analysis. However, when there is a requirement for close-to-real-time usage of online data, a method for dynamic Extract-Transform-Load (ETL) of new source data must be developed. In this work, we present a framework for integrating online and enterprise data sources, in close to real time, to provide datasets for machine learning and predictive algorithms. An exhaustive evaluation compares a human-built data transformation process with our system’s machine-generated ETL process, with very favourable results, illustrating the value and impact of an automated approach.
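The Python sketch below illustrates the general Extract-Transform-Load pattern discussed above: fields from a hypothetical online source are renamed into a common data model before loading. The source fields, target field names, and mapping are invented for illustration; they are not the paper's datasets, nor the system's generated ETL code.

```python
# Minimal hand-written ETL sketch; field names and the mapping are
# hypothetical examples, not the authors' actual data or generated code.
from typing import Any

# Mapping from a hypothetical online source's fields to a common data model.
FIELD_MAP = {"temp_c": "temperature_celsius",
             "ts": "observed_at",
             "stn": "station_id"}

def extract(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Pull raw records from the (already fetched) online source."""
    return records

def transform(record: dict[str, Any]) -> dict[str, Any]:
    """Rename fields to the common data model and drop unmapped ones."""
    return {FIELD_MAP[k]: v for k, v in record.items() if k in FIELD_MAP}

def load(records: list[dict[str, Any]], target: list[dict[str, Any]]) -> None:
    """Append cleaned records to the target dataset (stand-in for a DB write)."""
    target.extend(records)

if __name__ == "__main__":
    raw = [{"temp_c": 11.2, "ts": "2024-05-01T10:00Z", "stn": "DUB-03", "junk": 1}]
    clean: list[dict[str, Any]] = []
    load([transform(r) for r in extract(raw)], clean)
    print(clean)
```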

    Towards interoperability in heterogeneous database systems

    Distributed heterogeneous databases consist of systems which differ physically and logically, containing different data models and data manipulation languages. Although these databases are independently created and administered, they must cooperate and interoperate. Users need to access and manipulate data from several databases, and applications may require data from a wide variety of independent databases. Therefore, a new system architecture is required to manipulate and manage distinct and multiple databases, in a transparent way, while preserving their autonomy. This report contains an extensive survey on heterogeneous databases, analysing and comparing the different aspects, concepts and approaches related to the topic. It introduces an architecture to support interoperability among heterogeneous database systems. The architecture avoids the use of a centralised structure to assist in the different phases of the interoperability process. It aims to support scalability, and to assure the privacy and confidentiality of the data. The proposed architecture allows the databases to decide when to participate in the system, what type of data to share and with which other databases, thereby preserving their autonomy. The report also describes an approach to information discovery in the proposed architecture, without using any centralised structures such as repositories and dictionaries, and without broadcasting to all databases. It attempts to reduce the number of databases searched and to preserve the privacy of the shared data. The main idea is to visit a database that either contains the requested data or knows about another database that possibly contains this data.
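The Python sketch below illustrates the referral-style discovery idea in the last sentence: a request is passed from one database to a peer that may hold the data, with no central dictionary and no broadcast. The class, item names, and referral table are hypothetical; the report's actual discovery mechanism may differ.

```python
# Sketch of referral-based discovery: each node either holds the data or
# knows a peer that might. Names and structures are illustrative only.
from typing import Optional

class DatabaseNode:
    def __init__(self, name: str, data: set[str]):
        self.name = name
        self.data = data                                  # items this database shares
        self.referrals: dict[str, "DatabaseNode"] = {}    # item -> peer that may hold it

    def find(self, item: str, visited: Optional[set[str]] = None) -> Optional["DatabaseNode"]:
        """Return a node holding `item`, following referrals and avoiding cycles."""
        visited = visited or set()
        if self.name in visited:
            return None
        visited.add(self.name)
        if item in self.data:
            return self
        peer = self.referrals.get(item)
        return peer.find(item, visited) if peer else None

if __name__ == "__main__":
    c = DatabaseNode("C", {"patient-records"})
    b = DatabaseNode("B", set()); b.referrals["patient-records"] = c
    a = DatabaseNode("A", set()); a.referrals["patient-records"] = b
    hit = a.find("patient-records")
    print(hit.name if hit else "not found")  # -> C
```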

    The INCF Digital Atlasing Program: Report on Digital Atlasing Standards in the Rodent Brain

    The goal of the INCF Digital Atlasing Program is to provide the vision and direction necessary to make the rapidly growing collection of multidimensional data of the rodent brain (images, gene expression, etc.) widely accessible and usable to the international research community. This Digital Brain Atlasing Standards Task Force was formed in May 2008 to investigate the state of rodent brain digital atlasing, and formulate standards, guidelines, and policy recommendations.

Our first objective has been the preparation of a detailed document that includes the vision and a specific description of an infrastructure, systems and methods capable of serving the scientific goals of the community, as well as practical issues for achieving those goals. This report builds on the 1st INCF Workshop on Mouse and Rat Brain Digital Atlasing Systems (Boline et al., 2007, _Nature Precedings_, doi:10.1038/npre.2007.1046.1) and includes a more detailed analysis of both the current state and the desired state of digital atlasing, along with specific recommendations for achieving these goals.

    Semantics of Database Transformations

    Database transformations arise in many different settings, including database integration, evolution of database systems, and implementing user views and data-entry tools. This paper surveys approaches that have been taken to problems in these settings, assesses their strengths and weaknesses, and develops requirements on a formal model for specifying and implementing database transformations. We also consider the problem of ensuring the correctness of database transformations. In particular, we demonstrate that the usefulness of correctness conditions such as information preservation is hindered by the interactions of transformations and database constraints, and by the limited expressive power of established database constraint languages. We conclude that more general notions of correctness are required, and that there is a need for a uniform formalism for expressing both database transformations and constraints, and for reasoning about their interactions. Finally we introduce WOL, a declarative language for specifying and implementing database transformations and constraints. We briefly describe the WOL language and its semantics, and argue that it addresses many of the requirements of a formalism for dealing with general database transformations.

    A Service Late Binding Enabled Solution for Data Integration from Autonomous and Evolving Databases

    Integrating data from autonomous, distributed and heterogeneous data sources to provide a unified vision is a common demand for many businesses. Since the data sources may evolve frequently to satisfy their own independent business needs, solutions which use hard-coded queries to integrate participating databases may cause high maintenance costs when evolution occurs. Thus a new solution which can handle database evolution with lower maintenance effort is required. This thesis presents a new solution, Service Late Binding Enabled Data Integration (SLEDI), which is set into a framework modeling the essential processes of the data integration activity. It integrates schematically heterogeneous relational databases with decreased maintenance costs for handling database evolution. An algorithm named Information Provision Unit Describing (IPUD) is designed to describe each database as a set of Information Provision Units (IPUs). The IPUs are represented as Directed Acyclic Graph (DAG) structured data instead of hard-coded queries, and are further realized as data services. Hence data integration is achieved through service invocations. Furthermore, a set of processes is defined to handle database evolution by automatically identifying and modifying the IPUs which are affected by the evolution. An extensive evaluation based on a case study is presented. The results show that the schematic heterogeneities defined in this thesis can be resolved by IPUD, except for the relation isomorphism discrepancy. Ten out of thirteen types of schematic database evolution can be handled automatically by the evolution handling processes, as long as the evolution is represented by the designed data model. The computational cost of automatic evolution handling shows slow linear growth with the number of participating databases. Other characteristics addressed include SLEDI’s scalability and its independence from application domain and database model. A descriptive comparison with other data integration approaches shows that, although the Data as a Service approach may result in lower performance under some circumstances, it supports better flexibility for integrating data from autonomous and evolving data sources.
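The Python sketch below illustrates the general idea of describing a provision unit as DAG-structured data rather than a hard-coded query string, so the query text can be regenerated whenever a participating schema evolves. The node kinds, fields, example tables, and SQL rendering are illustrative assumptions, not the actual SLEDI/IPUD data model.

```python
# Representing a query as a small DAG of operations instead of a fixed SQL
# string; node kinds and the rendering below are hypothetical illustrations.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                   # e.g. "scan", "project", "join"
    args: dict = field(default_factory=dict)
    inputs: list["Node"] = field(default_factory=list)

def to_sql(node: Node) -> str:
    """Naively render the DAG as SQL; regenerated when the schema changes."""
    if node.op == "scan":
        return f"SELECT * FROM {node.args['table']}"
    if node.op == "project":
        cols = ", ".join(node.args["columns"])
        return f"SELECT {cols} FROM ({to_sql(node.inputs[0])}) AS s"
    if node.op == "join":
        left, right = (to_sql(n) for n in node.inputs)
        return (f"SELECT * FROM ({left}) AS l JOIN ({right}) AS r "
                f"ON {node.args['on']}")
    raise ValueError(f"unknown op: {node.op}")

if __name__ == "__main__":
    orders = Node("scan", {"table": "orders"})
    customers = Node("scan", {"table": "customers"})
    joined = Node("join", {"on": "l.customer_id = r.id"}, [orders, customers])
    unit = Node("project", {"columns": ["order_id", "name"]}, [joined])
    print(to_sql(unit))
```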