4,305 research outputs found

Representation Independent Analytics Over Structured Data

Author: Chodpathumwan Yodsawalai
Fern Alan
Picado Jose
Sun Yizhou
Termehchy Arash
Publication date: 08/09/2014

Database analytics algorithms leverage quantifiable structural properties of the data to predict interesting concepts and relationships. The same information, however, can be represented using many different structures and the structural properties observed over particular representations do not necessarily hold for alternative structures. Thus, there is no guarantee that current database analytics algorithms will still provide the correct insights, no matter what structures are chosen to organize the database. Because these algorithms tend to be highly effective over some choices of structure, such as that of the databases used to validate them, but not so effective with others, database analytics has largely remained the province of experts who can find the desired forms for these algorithms. We argue that in order to make database analytics usable, we should use or develop algorithms that are effective over a wide range of choices of structural organizations. We introduce the notion of representation independence, study its fundamental properties for a wide range of data analytics algorithms, and empirically analyze the amount of representation independence of some popular database analytics algorithms. Our results indicate that most algorithms are not generally representation independent and find the characteristics of more representation independent heuristics under certain representational shifts.

arXiv.org e-Print Archive

CiteSeerX

Classifier System Learning of Good Database Schema

Author: Tanaka Mitsuru
Publication venue: ScholarWorks@UNO
Publication date: 07/08/2008

This thesis presents an implementation of a learning classifier system that learns good database schemas. The system is implemented in Java using the NetBeans development environment, which provides good control over the GUI components. The system contains four components: a user interface, a rule and message system, an apportionment of credit system, and genetic algorithms. The input to the system is a set of simple database schemas, and the objective of the classifier system is to keep the good database schemas, which are represented by classifiers. The learning classifier system is given some basic knowledge about database concepts or rules. The results showed that the system could decrease the number of bad schemas and keep the good ones.

University of New Orleans

GeneReg: integration of experimental data on the DNA transcription process

Author: Cortés-Calabuig Álvaro
De Moor Bart
Denecker Marc
Lemmens Karen
Marchal Kathleen
Pastor David
Publication date: 01/01/2007

Ghent University Academic Bibliography

Quality measures for ETL processes: from goals to implementation

Author: Akkaoui
Batini
Bellatreche
Brereton
Bresciani
Chung
Dustdar
Frakes
Gill
Giorgini
Horkoff
Horkoff
Horkoff
Jarke
Jarke
Jogalekar
Kitchenham
Lamsweerde
Leite
Naumann
Romero
Simitsis
Sánchez-González
Thiele
Yu
Publication venue: 'Wiley'
Publication date: 01/01/2016

Extraction transformation loading (ETL) processes play an increasingly important role in supporting modern business operations. These business processes are centred around artifacts with high variability and diverse lifecycles, which correspond to key business entities. The apparent complexity of these activities has been examined through the prism of business process management, mainly focusing on functional requirements and performance optimization. However, the quality dimension has not yet been thoroughly investigated, and there is a need for a more human-centric approach to bring them closer to business users' requirements. In this paper, we take a first step in this direction by defining a sound model for ETL process quality characteristics and quantitative measures for each characteristic, based on existing literature. Our model shows dependencies among quality characteristics and can provide the basis for subsequent analysis using goal modeling techniques. We showcase the use of goal modeling for ETL process design through a use case, where we employ a goal model that includes quantitative components (i.e., indicators) for evaluation and analysis of alternative design decisions. Peer reviewed. Postprint (author's final draft).

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

What Makes Data Possible? A Sociotechnical View on Structured Data Innovations

Author: Aaltonen Aleksi
Penttinen Esko
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2021

Drawing from the theory of digital objects, this paper examines the distinction between structured and unstructured data as carriers of facts. We argue that data do not ‘have’ a structure but are made by a structure that confers on data their capacity to represent contextual facts. We employ a case vignette involving XBRL (eXtensible Business Reporting Language) and its use in statutory financial reporting to illustrate and explore the sociotechnical nature of data and to describe what we call data innovations: new valuable ways to render phenomena as data. We find that data structure is best viewed as a matter that is relative to a purpose in a context. Theorizing data from a sociotechnical perspective could evolve to provide, in effect, the material science of the digital economy.

ScholarSpace at University of Hawai'i at Manoa

Aaltodoc Publication Archive

AIS Electronic Library (AISeL)

Database design: A practical methodology.

Author: Nemovicher Kerry
Publication venue: Lehigh Preserve

Lehigh University: Lehigh Preserve

Schema Independent Relational Learning

Author: Abiteboul S.
Anderson M.
Arias M.
Kraska T.
Muggleton S.
Muggleton S.
Muggleton S.
Yin X.
Publication date: 06/11/2017

Learning novel concepts and relations from relational databases is an important problem with many applications in database systems and machine learning. Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. Nevertheless, the same data set may be represented under different schemas for various reasons, such as efficiency, data quality, and usability. Unfortunately, the output of current relational learning algorithms tends to vary quite substantially over the choice of schema, both in terms of learning accuracy and efficiency. This variation complicates their off-the-shelf application. In this paper, we introduce and formalize the property of schema independence of relational learning algorithms, and study both the theoretical and empirical dependence of existing algorithms on the common class of (de)composition schema transformations. We study both sample-based learning algorithms, which learn from sets of labeled examples, and query-based algorithms, which learn by asking queries to an oracle. We prove that current relational learning algorithms are generally not schema independent. For query-based learning algorithms we show that the (de)composition transformations influence their query complexity. We propose Castor, a sample-based relational learning algorithm that achieves schema independence by leveraging data dependencies. We support the theoretical results with an empirical study that demonstrates the schema dependence/independence of several algorithms on existing benchmark and real-world datasets under (de)compositions.

arXiv.org e-Print Archive

Crossref

Modeling ontology views: An abstract view model for semantic web

Author: Chang Elizabeth
Dillon Tharam S.
Feng L.
Rajugan R.
Wouters C.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

The emergence of the Semantic Web (SW) and related technologies promises to make the web a meaningful experience. However, high-level modelling, design and querying techniques prove to be challenging for organizations that hope to utilize the SW paradigm for their industrial applications. To address one such issue, in this paper, we propose an abstract view model with conceptual extensions for the SW. First we outline the view model, its properties and some modelling issues with the help of an industrial case study example. Then, we discuss constructing such views (at the conceptual level) using a set of operators. Later we briefly discuss how this view model can be utilized in the MOVE [1] system to design and construct materialized Ontology views to support Ontology extraction.

University of Twente Research Information

espace@Curtin