Entity Resolution On-Demand for Querying Dirty Datasets

Author: Bergamaschi Sonia
Naumann Felix
Simonini Giovanni
Zecchini Luca
Publication date: 01/01/2023

Entity Resolution (ER) is the process of identifying and merging records that refer to the same real-world entity. ER is usually applied as an expensive cleaning step on the entire dataset before consuming it, yet the relevance of the cleaned entities ultimately depends on the user’s specific application, which may only require a small portion of them. We introduce BrewER, a framework designed to evaluate SQL SP queries on unclean data while progressively returning results as if they were obtained from cleaned data. BrewER cleans a single entity at a time, following the order given by the ORDER BY predicate; it therefore inherently supports top-k queries and stop-and-resume execution. This approach can save a significant amount of resources for various applications. BrewER has been implemented as an open-source Python library and can be seamlessly employed with existing ER tools and algorithms. We demonstrate its efficiency through an evaluation on four real-world datasets.

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

The Case for Multi-task Active Learning Entity Resolution

Author: Domenico Beneventano
Giovanni Simonini
Henrique Saccani
Luca Gagliardelli
Luca Zecchini
Sonia Bergamaschi
Publication date: 01/01/2021

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Big Data Integration for Data-Centric AI

Author: Aslam Adeel
Beneventano Domenico
Bergamaschi Sonia
De Sabbata Giulio
Gagliardelli Luca
Simonini Giovanni
Zecchini Luca
Publication date: 01/01/2022

Big data integration represents one of the main challenges for the use of techniques and tools based on Artificial Intelligence (AI) in several crucial areas: eHealth, energy management, enterprise data, etc. In this context, Data-Centric AI plays a primary role in guaranteeing the quality of the data on which these tools and techniques operate. The activities of the Database Research Group (DBGroup) of the “Enzo Ferrari” Engineering Department of the University of Modena and Reggio Emilia are moving in this direction. We therefore present the main research projects of the DBGroup, which are part of collaborations in various application sectors.

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Progressive Entity Resolution with Node Embeddings

Author: Aslam Adeel
Beneventano Domenico
Bergamaschi Sonia
De Sabbata Giulio
Gagliardelli Luca
Rinaldi Michele
Simonini Giovanni
Zecchini Luca
Publication date: 01/01/2022

Entity Resolution (ER) is the task of finding records that refer to the same real-world entity, which are called matches. ER is a fundamental pre-processing step when dealing with dirty and/or heterogeneous datasets; however, it can be very time-consuming when employing complex machine learning models to detect matches, as state-of-the-art ER methods do. Thus, when time is a critical component and having a partial ER result is better than having no result at all, progressive ER methods are employed to try to maximize the number of detected matches as a function of time. In this paper, we study how to perform progressive ER by exploiting graph embeddings. The basic idea is to represent candidate matches in a graph, where each node is a record and each edge is a possible comparison to check; we build this graph on top of a well-known graph-based ER framework. We experimentally show that our method performs better than existing state-of-the-art progressive ER methods on real-world benchmark datasets.

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

ECDP: A Big Data Platform for the Smart Monitoring of Local Energy Communities

Author: Andrea Livaldi
Domenico Beneventano
Emma Mescoli
Fabio Moretti
Fabrizio Paolucci
Gianluca D’Agosta
Giovanni Simonini
Luca Gagliardelli
Luca Magnotta
Luca Zecchini
Mirko Orsini
Nicola Gessa
Piero De Sabbata
Sonia Bergamaschi
Publication date: 01/01/2022

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia