Using Ontologies for the Design of Data Warehouses

Author: Mazón Jose-Norberto
Pardillo Jesús
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 01/01/2011

Obtaining an implementation of a data warehouse is a complex task that forces designers to acquire wide knowledge of the domain, thus requiring a high level of expertise and making it a failure-prone task. Based on our experience, we have identified a set of situations faced in real-world projects in which we believe that the use of ontologies will improve several aspects of the design of data warehouses. The aim of this article is to describe several shortcomings of current data warehouse design approaches and discuss the benefit of using ontologies to overcome them. This work is a starting point for discussing the convenience of using ontologies in data warehouse design. Comment: 15 pages, 2 figures

arXiv.org e-Print Archive

Repositorio Institucional de la Universidad de Alicante

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

XML content warehousing: Improving sociological studies of mailing lists and web data

Author: Colazzo Dario
Dudouet François-Xavier
Manolescu Ioana
Nguyen Benjamin
Senellart Pierre
Vion Antoine
Publication date: 01/01/2011

In this paper, we present the guidelines for an XML-based approach for the sociological study of Web data such as the analysis of mailing lists or databases available online. The use of an XML warehouse is a flexible solution for storing and processing this kind of data. We propose an implemented solution and show possible applications with our case study of profiles of experts involved in W3C standard-setting activity. We illustrate the sociological use of semi-structured databases by presenting our XML Schema for mailing-list warehousing. An XML Schema allows many additions or crossings of data sources, without modifying existing data sets, while allowing possible structural evolution. We also show that the existence of hidden data implies increased complexity for traditional SQL users. XML content warehousing allows both exhaustive warehousing and recursive queries through contents, with far less dependence on the initial storage. We finally present the possibility of exporting the data stored in the warehouse to commonly used advanced software devoted to sociological analysis.

arXiv.org e-Print Archive

Base de publications de l'université Paris-Dauphine

Crossref

INRIA a CCSD electronic archive server

HAL UVSQ

HAL-Rennes 1

Finding Temporal Patterns in Noisy Longitudinal Data: A Study in Diabetic Retinopathy

Author: F.P. Coenen
G. Dong
G. Kalton
J.D. Singer
J.D. Skinner
J.F. Muñoz
J.W.R. Twisk
K. Yamaguchi
L.J.T. Kamp van der
M.L. Levy
P.N.E. Nohuddin
R.J. Little
S.Y.S. Kimm
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

This paper describes an approach to temporal pattern mining using the concept of user-defined temporal prototypes to define the nature of the trends of interest. The temporal patterns are defined in terms of sequences of support values associated with identified frequent patterns. The prototypes are defined mathematically so that they can be mapped onto the temporal patterns. The focus for the advocated temporal pattern mining process is a large longitudinal patient database collected as part of a diabetic retinopathy screening programme. The data set is, in itself, also of interest as it is very noisy (in common with other similar medical datasets) and does not feature a clear association between specific time stamps and subsets of the data. The diabetic retinopathy application, the data warehousing and cleaning process, and the frequent pattern mining procedure (together with the application of the prototype concept) are all described in the paper. An evaluation of the frequent pattern mining process is also presented.

Crossref

University of Huddersfield Repository

The representation of time in data warehouses

Author: Todman Christopher Derek
Publication date: 01/01/1999

This thesis researches the problems concerning the specification and implementation of the temporal requirements in data warehouses. The thesis focuses on two areas: first, the methods for identifying and capturing the business information needs and associated temporal requirements at the conceptual level; and second, methods for classifying and implementing the requirements at the logical level using the relational model. At the conceptual level, eight candidate methodologies were investigated to examine their suitability for the creation of data models that are appropriate for a data warehouse. The methods were evaluated to assess their representation of time, their ability to reflect the dimensional nature of data warehouse models and their simplicity of use. The research found that none of the methods under review fully satisfied the criteria. At the logical level, the research concluded that the methods widely used in current practice result in data structures that are either incapable of answering some very basic questions involving history or that return inaccurate results. Specific proposals are made in three areas. Firstly, a new conceptual model is described that is designed to capture the information requirements for dimensional models and has full support for time. Secondly, a new approach at the logical level is proposed. It provides the data structures that enable the requirements captured in the conceptual model to be implemented, thus enabling the historical questions to be answered simply and accurately. Thirdly, a set of rules is developed to help minimise the inaccuracy caused by time. A guide has been produced that provides practitioners with the tools and instructions on how to implement data warehouses using the methods developed in the thesis.

Open Research Online (The Open University)

Logic-Based Specification Languages for Intelligent Software Agents

Author: Martelli Maurizio
Mascardi Viviana
Sterling Leon
Publication date: 20/11/2003

The research field of Agent-Oriented Software Engineering (AOSE) aims to find abstractions, languages, methodologies and toolkits for modeling, verifying, validating and prototyping complex applications conceptualized as Multiagent Systems (MASs). A very lively research sub-field studies how formal methods can be used for AOSE. This paper presents a detailed survey of six logic-based executable agent specification languages that have been chosen for their potential to be integrated in our ARPEGGIO project, an open framework for specifying and prototyping a MAS. The six languages are ConGoLog, Agent-0, the IMPACT agent programming language, DyLog, Concurrent METATEM and Ehhf. For each executable language, the logic foundations are described and an example of use is shown. A comparison of the six languages and a survey of similar approaches complete the paper, together with considerations of the advantages of using logic-based languages in MAS modeling and prototyping. Comment: 67 pages, 1 table, 1 figure. Accepted for publication by the Journal "Theory and Practice of Logic Programming", volume 4, Maurice Bruynooghe Editor-in-Chief

arXiv.org e-Print Archive

CiteSeerX

Archivio istituzionale della ricerca - Università di Genova

Warehousing of object oriented petroleum data for knowledge mapping

Author: Dreher Heinz
Nimmagadda Shastri
Rudra Amit
Publication venue: 'IBIMA Publishing'
Publication date: 01/01/2005

Australia produces a third of the world's natural resources. Enormous amounts of energy and financial resources are expended in order to tap these natural reserves from the earth's surface. Vast amounts of these resources, however, remain unexplored and under-exploited. Data pertaining to natural resources, such as minerals and petroleum, are, in general, heterogeneous and complex in nature. Volumes of these types of data are geographically distributed among many companies in Australia and abroad. The existing historical resources data are logically and physically organized using warehousing techniques. Entity relationship (ER) and object oriented (OO) data mapping techniques are used for analyzing the data entities, dimensions and objects. In this paper, object oriented data and the warehousing of object class data models are described. Data mining techniques can be employed to explore many more resources hidden at great depths in the earth's crust, without additional exploration and development effort. Warehoused object oriented resources data can significantly reduce the complexity of structuring resources data and enhance data integration and information sharing among various operational units of the resources industry. Large amounts of financial input can be saved if these technologies are successfully implemented in the resources industry.

espace@Curtin