
8 research outputs found

Knowledge Rich Natural Language Queries over Structured Biological Databases

Author: Chu W. W.
Goldsmith E. J.
InterProlog
Kossmann D.
Lawrence C.
Maio C. D.
Mir S.
Mou X.
Nandi A.
Novik L.
Safran M.
Swofford D. L.
Publication date: 30/03/2017

Keyword, natural language, and NoSQL queries are increasingly used for information retrieval from traditional as well as non-traditional databases such as web, document, image, GIS, legal, and health databases. While their popularity is undeniable, their engineering is far from simple. For the most part, a semantics- and intent-preserving mapping of a well-understood natural language query expressed over a structured database schema to a structured query language remains a difficult task, and research to tame the complexity is intense. In this paper, we propose a multi-level knowledge-based middleware to facilitate such mappings that separate the conceptual level from the physical level. We augment these multi-level abstractions with a concept reasoner and a query strategy engine to dynamically link arbitrary natural language querying to well-defined structured queries. We demonstrate the feasibility of our approach by presenting a Datalog-based prototype system, called BioSmart, that can compute responses to arbitrary natural language queries over arbitrary databases once a syntactic classification of the natural language query is made.

arXiv.org e-Print Archive

Crossref

Supporting users tasks with personal information management and web forms augmentation

Author: A. Girgensohn
D. Recordon
D. Zhou
G.A. Toda
J.A. Bargas-Avila
K.A. Olsen
M. Hartmann
M. Heinrich
M. Hori
M.C. Norrie
R. Khare
S. Araujo
S. Firmenich
S. Leone
T. Stocky
W. Jones
X. Guo
Publication date: 01/01/2012

Currently, many tasks performed on the Web prompt users to provide personal information through forms. Although most users are familiar with this kind of interaction technique, the use of Web forms is not always straightforward. Indeed, some users might need assistance to understand the labels and complex data formats required to fill in form fields, which quite often vary from one Web site to another even when requesting similar data. Filling in forms can be tedious and repetitive, as many Web sites request similar information. In this work we analyze users' interactions with Web forms and propose an approach for enhancing Web forms using client-side adaptation techniques in order to help users fill them in. As the use of Web forms is closely related to the management of personal information, our approach includes support for data exchange between users' personal information management systems (PIMs) and third-party Web forms. The approach is illustrated by a set of client-side adaptation tools and a pervasive Personal Information Management System called PIMI. Published in the Lecture Notes in Computer Science book series (vol. 7387). Laboratorio de Investigación y Formación en Informática Avanzad

Crossref

Supporting users tasks with personal information management and web forms augmentation

Author: Firmenich Sergio
Gaits Vincent
Gordillo Silvia Ethel
Rossi Gustavo Héctor
Winckler Marco
Publication date: 01/11/2019

DeepEC: uma abordagem para extração e catalogação de conteúdo presente na Deep Web

Author: Souza Augusto Ferreira de
Publication date: 01/01/2013

Master's dissertation, Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2013. This work presents an approach called DeepEC (Deep Web Extraction and Cataloguing Process) that extracts and catalogs relevant data from Deep Web databases, also called hidden databases. This information is extracted from a set of HTML pages generated by queries posed on web forms. The intention is to obtain knowledge about these databases and thus enable structured queries over this hidden content. Experiments have shown the effectiveness of the proposed approach. Compared to related work, the contributions of this dissertation are the joint and sequential execution of data extraction and cataloging for hidden databases, an automatic extraction process supported by a knowledge base, and a cataloging process that generates structured records and can detect attributes whose values are missing from the extracted data.

Repositório Institucional da UFSC

Supporting users tasks with personal information management and web forms augmentation

Author: Firmenich Sergio
Gaits Vincent
Gordillo Silvia Ethel
Rossi Gustavo Héctor
Winckler Marco
Publication date: 01/11/2019

Servicio de Difusión de la Creación Intelectual

A probabilistic approach for automatically filling form-based web interfaces

Author: J.
J. D.
T.
Usher K.
W.
Xplorer S.
Publication venue: VLDB Endowment

Crossref

BIG DATA AND ANALYTICS AS A NEW FRONTIER OF ENTERPRISE DATA MANAGEMENT

Author: FADLER Martin
Publication venue: Université de Lausanne, Faculté des hautes études commerciales
Publication date: 01/01/2021

Big Data and Analytics (BDA) promises significant value generation opportunities across industries. Even though companies increase their investments, their BDA initiatives fall short of expectations and they struggle to guarantee a return on investment. In order to create business value from BDA, companies must build and extend their data-related capabilities. While BDA literature has emphasized the capabilities needed to analyze the increasing volumes of data from heterogeneous sources, enterprise data management (EDM) researchers have suggested organizational capabilities to improve data quality. However, to date, little is known about how companies actually orchestrate the allocated resources, especially regarding the quality and use of data to create value from BDA. Considering these gaps, this thesis, through five interrelated essays, investigates how companies adapt their EDM capabilities to create additional business value from BDA. The first essay lays the foundation of the thesis by investigating how companies extend their Business Intelligence and Analytics (BI&A) capabilities to build more comprehensive enterprise analytics platforms. The second and third essays contribute to fundamental reflections on how organizations are changing and designing data governance in the context of BDA. The fourth and fifth essays look at how companies provide high-quality data to an increasing number of users with innovative EDM tools, namely machine learning (ML) and enterprise data catalogs (EDCs). The thesis outcomes show that BDA has profound implications for EDM practices. In the past, operational data processing and analytical data processing were two "worlds" that were managed separately from each other. With BDA, these "worlds" are becoming increasingly interdependent, and organizations must manage the lifecycles of data and analytics products in close coordination. Also, with BDA, data have become the long-expected, strategically relevant resource. As such, data must now be viewed as a distinct value driver separate from IT, as it requires specific mechanisms to foster value creation from BDA. BDA thus extends data governance goals: in addition to data quality and regulatory compliance, governance should facilitate data use by broadening data availability and enabling data monetization. Accordingly, companies establish comprehensive data governance designs including structural, procedural, and relational mechanisms to enable a broad network of employees to work with data. Existing EDM practices therefore need to be rethought to meet the emerging BDA requirements. While ML is a promising solution to improve data quality in a scalable and adaptable way, EDCs help companies democratize data to a broader range of employees.

Serveur académique lausannois