35,476 research outputs found

Revisiting Pattern Structures for Structured Attribute Sets

Author: Alam Mehwish
Buzmakov Aleksey
Napoli Amedeo
Sailanbayev Alibek
Publication venue: HAL CCSD
Publication date: 13/08/2015

In this paper, we revisit an original proposition on pattern structures for structured sets of attributes. There are several reasons for carrying out this research. The original proposition gives few details on the overall framework, especially on the possible ways of implementing the similarity operation. There is an alternative definition that makes no reference to pattern structures, and we would like to draw a parallel between the two points of view. Moreover, we discuss an efficient implementation of the intersection operation in the corresponding pattern structure. Finally, we discovered that pattern structures for structured attribute sets are very well adapted to the classification and analysis of RDF data. We conclude the paper with an experimental section showing that the provided implementation of pattern structures for structured attribute sets is quite efficient.

INRIA a CCSD electronic archive server

On the performance impact of using JSON, beyond impedance mismatch

Author: Abelló Gamazo Alberto
Hewasinghage Moditha Lakshan Dharmasir
Nadal Francesch Sergi
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

NoSQL database management systems adopt semi-structured data models, such as JSON, to easily accommodate schema evolution and to avoid the overhead of transforming internal structures to tabular data (i.e., impedance mismatch). There are multiple equivalent ways to physically represent semi-structured data, but there is a lack of evidence about their potential impact on space and query performance. In this paper, we embark on the task of quantifying that impact, specifically for document stores. We empirically compare multiple ways of representing semi-structured data, which allows us to derive a set of guidelines for efficient physical database design considering both JSON and relational options in the same palette. Partly funded by the European Commission through the programme “EM IT4BI-DC”.

UPCommons. Portal del coneixement obert de la UPC

Evolving Ensemble Fuzzy Classifier

Author: Lughofer Edwin
Pedrycz Witold
Pratama Mahardhika
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2018

The concept of ensemble learning offers a promising avenue for learning from data streams in complex environments because it addresses the bias-variance dilemma better than its single-model counterpart and features a reconfigurable structure, which is well suited to the given context. While various extensions of ensemble learning for mining non-stationary data streams can be found in the literature, most of them are built on a static base classifier and revisit preceding samples in the sliding window for a retraining step. This feature causes computationally prohibitive complexity and is not flexible enough to cope with rapidly changing environments. Their complexity is often demanding because they involve a large collection of offline classifiers, owing to the absence of structural complexity reduction mechanisms and the lack of an online feature selection mechanism. A novel evolving ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in this paper. pENsemble differs from existing architectures in that it is built upon an evolving classifier from data streams, termed Parsimonious Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism, which estimates a localized generalization error of a base classifier. A dynamic online feature selection scenario is integrated into pENsemble. This method allows for dynamic selection and deselection of input features on the fly. pENsemble adopts a dynamic ensemble structure to output a final classification decision, and it features a novel drift detection scenario to grow the ensemble structure. The efficacy of pENsemble has been demonstrated through rigorous numerical studies with dynamic and evolving data streams, where it delivers the most encouraging performance in attaining a tradeoff between accuracy and complexity. Comment: this paper has been published in IEEE Transactions on Fuzzy Systems.

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Form and function in hillslope hydrology: in situ imaging and characterization of flow-relevant structures

Author: C. Jackisch
E. Zehe
J. Tronicke
L. Angermann
M. Sprenger
N. Allroggen
T. Blume
Publication venue: Copernicus GmbH
Publication date: 01/07/2017

Thanks to Elly Karle and the Engler-Bunte-Institute, KIT, for the IC measurements of bromide. We are grateful to Selina Baldauf, Marcel Delock, Razije Fiden, Barbara Herbstritt, Lisei Köhn, Jonas Lanz, Francois Nyobeu, Marvin Reich and Begona Lorente Sistiaga for their support in the lab and during fieldwork, as well as Markus Morgner and Jean Francois Iffly for technical support and Britta Kattenstroth for hydrometeorological data acquisition. Laurent Pfister and Jean-Francois Iffly from the Luxembourg Institute of Science and Technology (LIST) are acknowledged for organizing the permissions for the experiments. Moreover, we thank Markus Weiler (University of Freiburg) for his strong support during the planning of the hillslope experiment and the preparation of the manuscript. This study is part of the DFG-funded CAOS project “From Catchments as Organised Systems to Models based on Dynamic Functional Units” (FOR 1598). The manuscript was substantially improved by the critical and constructive comments of the anonymous reviewers, Christian Stamm and Alexander Zimmermann, and the editor Ross Woods during the open review process, which is highly appreciated.

Aberdeen University Research

Crossref

KITopen

Directory of Open Access Journals

Ontologies and Information Extraction

Author: Nazarenko Adeline
Nédellec Claire
Publication date: 01/01/2005

This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of text are interpreted with respect to a predefined partial domain model. The report shows that the amount of knowledge that must be involved depends on the nature and depth of the interpretation needed to extract the information. The report is mainly illustrated with examples from biology, a domain in which there is a critical need for content-based exploration of the scientific literature and which is becoming a major application domain for IE.

arXiv.org e-Print Archive

HAL Descartes

HAL-Paris 13