17,783 research outputs found
Automatic generation of data merging program codes.
Data merging is an essential part of ETL (Extract-Transform-Load) processes to build a data warehouse system. To avoid rewheeling merging techniques, we propose a Data Merging Meta-model (DMM) and its transformation into executable program codes in the manner of model driven engineering. DMM allows defining relationships of different model entities and their merging types in conceptual level. Our formalized transformation described using ATL (ATLAS Transformation Language) enables automatic generation of PL/SQL packages to execute data merging in commercial ETL tools. With this approach data warehouse engineers can be relieved from the burden of repetitive complex script coding and the pain of maintaining consistency of design and implementation
Automatic generation of data merging program codes
Data merging is an essential part of ETL (Extract-Transform-Load) processes to build a data warehouse system. To avoid rewheeling merging techniques, we propose a Data Merging Meta-model (DMM) and its transformation into executable program codes in the manner of model driven engineering. DMM allows defining relationships of different model entities and their merging types in conceptual level. Our formalized transformation described using ATL (ATLAS Transformation Language) enables automatic generation of PL/SQL packages to execute data merging in commercial ETL tools. With this approach data warehouse engineers can be relieved from the burden of repetitive complex script coding and the pain of maintaining consistency of design and implementation
Using Ontologies for the Design of Data Warehouses
Obtaining an implementation of a data warehouse is a complex task that forces
designers to acquire wide knowledge of the domain, thus requiring a high level
of expertise and becoming it a prone-to-fail task. Based on our experience, we
have detected a set of situations we have faced up with in real-world projects
in which we believe that the use of ontologies will improve several aspects of
the design of data warehouses. The aim of this article is to describe several
shortcomings of current data warehouse design approaches and discuss the
benefit of using ontologies to overcome them. This work is a starting point for
discussing the convenience of using ontologies in data warehouse design.Comment: 15 pages, 2 figure
XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema allows many adjunctions or crossings of
data sources, without modifying existing data sets, while allowing possible
structural evolution. We also show that the existence of hidden data implies
increased complexity for traditional SQL users. XML content warehousing allows
altogether exhaustive warehousing and recursive queries through contents, with
far less dependence on the initial storage. We finally present the possibility
of exporting the data stored in the warehouse to commonly-used advanced
software devoted to sociological analysis
SODA: Generating SQL for Business Users
The purpose of data warehouses is to enable business analysts to make better
decisions. Over the years the technology has matured and data warehouses have
become extremely successful. As a consequence, more and more data has been
added to the data warehouses and their schemas have become increasingly
complex. These systems still work great in order to generate pre-canned
reports. However, with their current complexity, they tend to be a poor match
for non tech-savvy business analysts who need answers to ad-hoc queries that
were not anticipated. This paper describes the design, implementation, and
experience of the SODA system (Search over DAta Warehouse). SODA bridges the
gap between the business needs of analysts and the technical complexity of
current data warehouses. SODA enables a Google-like search experience for data
warehouses by taking keyword queries of business users and automatically
generating executable SQL. The key idea is to use a graph pattern matching
algorithm that uses the metadata model of the data warehouse. Our results with
real data from a global player in the financial services industry show that
SODA produces queries with high precision and recall, and makes it much easier
for business users to interactively explore highly-complex data warehouses.Comment: VLDB201
Recommended from our members
Radio frequency identification (RFID) technologies for locating warehouse resources: A conceptual framework
Copyright @ 2012 Information Technology SocietyIn the supply chain, a warehouse is a crucial component for linking all chain parties. It is necessary to track the real time resource location and status to support warehouse operations effectively. Therefore, RFID technology has been adopted to facilitate the collection and sharing of data in a warehouse environment. However, an essential decision should be made on the type of RFID tags the warehouse managers should adopt, because it is very important to implement RFID tags that work in warehouse environment. As a result, the warehouse resources will be easily tracked and accurately located which will improve the visibility of warehouse operations, enhance the productivity and reduce the operation costs of the warehouse. Therefore, it is crucial to evaluate the reading performance of all types of RFID tags in a warehouse environment in order to choose the most appropriate RFID tags which will enhance the operational efficiency of a warehouse. Reading performance of active and passive RFID tags have been evaluated before while, semi-passive RFID tag, which is battery-assisted with greater sensitivity than passive tags and cheaper than active tags, has not been examined yet in a warehouse environment. This research is in- progress research and it seeks to (i) provide a general overview of the existing real-time data management techniques in tracking warehouse resources location, (ii) provide an overall conceptual framework that can help warehouse managers to choose the best RFID technologies for a warehouse environment, (iii) Finally, the paper submits an experiment design for evaluating the reading performance of semi-passive RFID tags in a warehouse environment
- …