Using Ontologies for the Design of Data Warehouses
Implementing a data warehouse is a complex task that requires designers to
acquire wide knowledge of the domain; it therefore demands a high level of
expertise and is prone to failure. Based on our experience, we have identified
a set of situations encountered in real-world projects
in which we believe that the use of ontologies will improve several aspects of
the design of data warehouses. The aim of this article is to describe several
shortcomings of current data warehouse design approaches and discuss the
benefit of using ontologies to overcome them. This work is a starting point for
discussing the convenience of using ontologies in data warehouse design.
Comment: 15 pages, 2 figures
XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema allows data sources to be added or
cross-referenced without modifying existing data sets, while still allowing
structural evolution. We also show that the existence of hidden data implies
increased complexity for traditional SQL users. XML content warehousing allows
both exhaustive warehousing and recursive queries through contents, with
far less dependence on the initial storage. We finally present the possibility
of exporting the data stored in the warehouse to commonly used advanced
software devoted to sociological analysis.
Finding Temporal Patterns in Noisy Longitudinal Data: A Study in Diabetic Retinopathy
This paper describes an approach to temporal pattern mining that uses the concept of user-defined temporal prototypes to define the nature of the trends of interest. The temporal patterns are defined in terms of sequences of support values associated with identified frequent patterns. The prototypes are defined mathematically so that they can be mapped onto the temporal patterns. The focus for the advocated temporal pattern mining process is a large longitudinal patient database collected as part of a diabetic retinopathy screening programme. The data set is, in itself, also of interest as it is very noisy (in common with other similar medical datasets) and does not feature a clear association between specific time stamps and subsets of the data. The diabetic retinopathy application, the data warehousing and cleaning process, and the frequent pattern mining procedure (together with the application of the prototype concept) are all described in the paper. An evaluation of the frequent pattern mining process is also presented.
The representation of time in data warehouses
This thesis researches the problems concerning the specification and implementation of temporal requirements in data warehouses. The thesis focuses on two areas: firstly, the methods for identifying and capturing the business information needs and associated temporal requirements at the conceptual level; and secondly, methods for classifying and implementing the requirements at the logical level using the relational model.
At the conceptual level, eight candidate methodologies were investigated to examine their suitability for the creation of data models that are appropriate for a data warehouse. The methods were evaluated to assess their representation of time, their ability to reflect the dimensional nature of data warehouse models and their simplicity of use. The research found that none of the methods under review fully satisfied the criteria.
At the logical level, the research concluded that the methods widely used in current practice result in data structures that are either incapable of answering some very basic questions involving history or that return inaccurate results.
Specific proposals are made in three areas. Firstly, a new conceptual model is described that is designed to capture the information requirements for dimensional models and has full support for time. Secondly, a new approach at the logical level is proposed. It provides the data structures that enable the requirements captured in the conceptual model to be implemented, thus enabling the historical questions to be answered simply and accurately. Thirdly, a set of rules is developed to help minimise the inaccuracy caused by time.
A guide has been produced that provides practitioners with tools and instructions for implementing data warehouses using the methods developed in the thesis.
Logic-Based Specification Languages for Intelligent Software Agents
The research field of Agent-Oriented Software Engineering (AOSE) aims to find
abstractions, languages, methodologies and toolkits for modeling, verifying,
validating and prototyping complex applications conceptualized as Multiagent
Systems (MASs). A very lively research sub-field studies how formal methods can
be used for AOSE. This paper presents a detailed survey of six logic-based
executable agent specification languages that have been chosen for their
potential to be integrated in our ARPEGGIO project, an open framework for
specifying and prototyping a MAS. The six languages are ConGoLog, Agent-0, the
IMPACT agent programming language, DyLog, Concurrent METATEM and Ehhf. For each
executable language, the logic foundations are described and an example of use
is shown. A comparison of the six languages and a survey of similar approaches
complete the paper, together with considerations of the advantages of using
logic-based languages in MAS modeling and prototyping.
Comment: 67 pages, 1 table, 1 figure. Accepted for publication by the Journal
"Theory and Practice of Logic Programming", volume 4, Maurice Bruynooghe
Editor-in-Chief
Warehousing of object oriented petroleum data for knowledge mapping
Australia produces a third of the world's natural resources. Enormous amounts of energy and financial resources are expended in order to tap these natural reserves from the earth's surface. Vast amounts of these resources, however, remain unexplored and under-exploited. Data pertaining to natural resources, such as minerals and petroleum, are, in general, heterogeneous and complex in nature. Volumes of these types of data are geographically distributed among many companies in Australia and abroad. The existing historical resources data are logically and physically organized using warehousing techniques. Entity relationship (ER) and object-oriented (OO) data mapping techniques are used for analyzing the data entities, dimensions and objects. This paper describes object-oriented data and the warehousing of object class data models. Data mining techniques can be employed to explore many more resources hidden at great depths of the earth's crust, without additional exploration and development effort. Warehoused object-oriented resources data can significantly reduce the complexity of structuring resources data and enhance data integration and information sharing among the various operational units of the resources industry. Large amounts of financial input can be saved if these technologies are successfully implemented in the resources industry.