31 research outputs found

Automatic Table Extension with Open Data

Author: Kleppmann Benedikt
Publication venue: Technological University Dublin
Publication date: 01/01/2018

With thousands of data sources available on the web as well as within organisations, data scientists increasingly spend more time searching for data than analysing it. To ease the task of finding and integrating relevant data for data mining projects, this dissertation presents two new methods for automatic table extension. Automatic table extension systems take over the task of data discovery and data integration by adding new columns with new information (new attributes) to any table. The data values in the new columns are extracted from a given corpus of tables
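To make the idea of table extension concrete, here is a minimal sketch, not the dissertation's method: it assumes each table is a list of row dictionaries and that rows can be matched on a shared key column after simple case and whitespace normalisation. The function name `extend_table` and the sample tables are hypothetical.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]
Table = List[Row]


def extend_table(base: Table, key: str, corpus: List[Table], new_attr: str) -> Table:
    """Return a copy of `base` with a `new_attr` column filled from `corpus`.

    Assumption: rows match when their `key` values are equal after
    lowercasing and stripping whitespace. The first corpus row that
    supplies a value wins; unmatched rows get None.
    """
    lookup: Dict[str, Optional[str]] = {}
    for table in corpus:
        for row in table:
            key_value = row.get(key)
            if key_value is not None and new_attr in row:
                # setdefault keeps the first value seen for each key
                lookup.setdefault(key_value.strip().lower(), row[new_attr])

    extended: Table = []
    for row in base:
        new_row = dict(row)
        key_value = row.get(key)
        normalised = key_value.strip().lower() if key_value is not None else None
        new_row[new_attr] = lookup.get(normalised) if normalised is not None else None
        extended.append(new_row)
    return extended


# Hypothetical input table and corpus, for illustration only.
base_table = [{"country": "Ireland"}, {"country": "Germany"}, {"country": "Atlantis"}]
corpus_tables = [
    [{"country": "ireland ", "capital": "Dublin"}],
    [{"country": "Germany", "capital": "Berlin"}],
]
extended_table = extend_table(base_table, "country", corpus_tables, "capital")
```

A real system would also have to discover which corpus tables are relevant and resolve conflicting values; this sketch only shows the final column-adding step.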

Arrow@TUDublin

Extending RapidMiner with data search and integration capabilities

Author: Bizer Christian
Gentile Anna Lisa
Kirstein Sabrina
Paulheim Heiko
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Crossref

MAnnheim DOCument Server

On-the-fly Table Generation

Author: Nguyen Thanh Tam
Sekhavat Yoones A.
Yahya Mohamed
Yin Pengcheng
Zwicklbauer Stefan
Publication venue: Association for Computing Machinery (ACM)
Publication date: 13/05/2018

Many information needs revolve around entities, which would be better answered by summarizing results in a tabular format, rather than presenting them as a ranked list. Unlike previous work, which is limited to retrieving existing tables, we aim to answer queries by automatically compiling a table in response to a query. We introduce and address the task of on-the-fly table generation: given a query, generate a relational table that contains relevant entities (as rows) along with their key properties (as columns). This problem is decomposed into three specific subtasks: (i) core column entity ranking, (ii) schema determination, and (iii) value lookup. We employ a feature-based approach for entity ranking and schema determination, combining deep semantic features with task-specific signals. We further show that these two subtasks are not independent of each other and can assist each other in an iterative manner. For value lookup, we combine information from existing tables and a knowledge base. Using two sets of entity-oriented queries, we evaluate our approach both on the component level and on the end-to-end table generation task. Comment: The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval
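The three subtasks named in the abstract can be sketched as a toy pipeline. This is only an illustration under strong assumptions, not the paper's feature-based models: entity ranking is replaced by query-term overlap with a short description, schema determination by property frequency among the ranked entities, and value lookup by a plain dictionary acting as the knowledge base. All function names and the sample data are hypothetical, and the iterative interaction between the first two subtasks is left out.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

KnowledgeBase = Dict[str, Dict[str, str]]  # entity -> {property: value}


def rank_entities(query: str, descriptions: Dict[str, str], k: int) -> List[str]:
    """(i) Core column entity ranking by query-term overlap (a stand-in scorer)."""
    terms = set(query.lower().split())
    scored = []
    for entity, text in descriptions.items():
        score = len(terms & set(text.lower().split()))
        if score > 0:
            scored.append((score, entity))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [entity for _, entity in scored[:k]]


def determine_schema(entities: List[str], kb: KnowledgeBase, n: int) -> List[str]:
    """(ii) Schema determination: the n properties most common among the entities."""
    counts = Counter(prop for e in entities for prop in kb.get(e, {}))
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [prop for prop, _ in ordered[:n]]


def lookup_values(entities: List[str], schema: List[str],
                  kb: KnowledgeBase) -> List[List[Optional[str]]]:
    """(iii) Value lookup: fill each cell from the knowledge base, None if missing."""
    return [[e] + [kb.get(e, {}).get(p) for p in schema] for e in entities]


def generate_table(query: str, kb: KnowledgeBase, descriptions: Dict[str, str],
                   k: int = 3, n: int = 2) -> Tuple[List[str], List[List[Optional[str]]]]:
    entities = rank_entities(query, descriptions, k)
    schema = determine_schema(entities, kb, n)
    return ["entity"] + schema, lookup_values(entities, schema, kb)


# Hypothetical knowledge base and descriptions, for illustration only.
kb = {
    "Dublin": {"country": "Ireland", "river": "Liffey"},
    "Berlin": {"country": "Germany"},
    "Liffey": {"mouth": "Dublin Bay"},
}
descriptions = {
    "Dublin": "capital city of Ireland",
    "Berlin": "capital city of Germany",
    "Liffey": "river in Ireland",
}
header, rows = generate_table("capital city", kb, descriptions)
```

Keeping the three steps as separate functions mirrors the decomposition in the abstract and leaves room to swap in a learned ranker or a table-corpus lookup for any single step.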

arXiv.org e-Print Archive

Crossref

Dataset search: a survey

Author: Chapman Adriane
Groth Paul
Ibáñez-Gonzalez Luis-Daniel
Kacprzak Emilia
Koesten Laura
Konstantinidis George
Simperl Elena
Publication date: 03/01/2019

Generating value from data requires the ability to find, access and make sense of datasets. There are many efforts underway to encourage data sharing and reuse, from scientific publishers asking authors to submit data alongside manuscripts to data marketplaces, open data portals and data communities. Google recently beta released a search service for datasets, which allows users to discover data stored in various online repositories via keyword queries. These developments foreshadow an emerging research field around dataset search or retrieval that broadly encompasses frameworks, methods and tools that help match a user data need against a collection of datasets. Here, we survey the state of the art of research and commercial systems in dataset retrieval. We identify what makes dataset search a research field in its own right, with unique challenges and methods, and highlight open problems. We look at approaches and implementations from related areas that dataset search is drawing upon, including information retrieval, databases, and entity-centric and tabular search, in order to identify possible paths to resolve these open problems as well as immediate next steps that will take the field forward. Comment: 20 pages, 153 references

arXiv.org e-Print Archive

King's Research Portal

International Migration, Integration and Social Cohesion online publications

UvA-DARE