RDF Querying
Reactive Web systems, Web services, and Web-based publish/subscribe systems communicate events as XML messages and in many cases require composite event detection: it is not sufficient to react to single event messages; events must be considered in relation to other events received over time.
Emphasizing language design and formal semantics, we describe the
rule-based query language XChangeEQ for detecting composite events.
XChangeEQ is designed to completely cover and integrate the four complementary
querying dimensions: event data, event composition, temporal
relationships, and event accumulation. Semantics are provided as
model and fixpoint theories; while this is an established approach for rule languages, it has not been applied to event queries before.
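The abstract does not reproduce XChangeEQ's concrete syntax, but the four querying dimensions it names can be illustrated with a minimal, purely hypothetical Python sketch: correlating an "order" event with a later "payment" event on a shared key within a time window. This is not XChangeEQ; all names below are illustrative.

```python
# Minimal sketch of composite event detection (illustrative only, not
# XChangeEQ syntax): relate two event types received over time.
from dataclasses import dataclass, field

@dataclass
class Event:
    kind: str      # event data, e.g. "order" or "payment"
    key: str       # correlation key, e.g. an order id
    time: float    # reception timestamp in seconds

@dataclass
class CompositeDetector:
    window: float = 3600.0                         # temporal relationship
    pending: dict = field(default_factory=dict)    # event accumulation

    def feed(self, ev: Event) -> None:
        """React to a single event in relation to earlier ones."""
        if ev.kind == "order":
            self.pending[ev.key] = ev              # accumulate
        elif ev.kind == "payment":
            order = self.pending.pop(ev.key, None) # composition: join on key
            if order and ev.time - order.time <= self.window:
                print(f"composite event: order {ev.key} paid "
                      f"after {ev.time - order.time:.0f}s")

detector = CompositeDetector()
detector.feed(Event("order", "42", time=0.0))
detector.feed(Event("payment", "42", time=120.0))  # fires the composite event
```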
Taxonomy, Semantic Data Schema, and Schema Alignment for Open Data in Urban Building Energy Modeling
Urban Building Energy Modeling (UBEM) is a critical tool for quantitative analysis of building decarbonization, sustainability, building-to-grid integration, and renewable energy applications at city, regional, and national scales. Researchers usually use open data as inputs to
build and calibrate UBEM. However, open data come from thousands of sources covering various aspects such as weather and building characteristics. Moreover, the lack of semantic features in open data further increases the engineering effort needed to process the information into a form directly usable as UBEM input. In this paper, we first reviewed open data types used for UBEM and
developed a taxonomy to categorize open data. Based on that, we further
developed a semantic data schema for each open data category to maintain data
consistency and improve model automation for UBEM. In a case study, we use three popular open datasets to show how they can be automatically processed against the proposed semantic data schema using large language models. The accurate results generated by the large language models indicate the machine-readability and human-interpretability of the developed semantic data schema.
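The paper's actual schema is not reproduced in the abstract; as a hedged sketch of the general pattern it describes (prompting an LLM to map a raw open-data record onto a fixed semantic schema), one might write the following. The schema fields and the `call_llm` helper are hypothetical placeholders.

```python
import json

# Hypothetical semantic schema for one open-data category (illustrative;
# not the schema developed in the paper).
BUILDING_SCHEMA = {
    "building_type": "string, e.g. 'office' or 'residential'",
    "floor_area_m2": "number, gross floor area in square meters",
    "year_built": "integer",
}

def schema_prompt(raw_record: str) -> str:
    """Build a prompt asking an LLM to emit JSON conforming to the schema."""
    return (
        "Map the following open-data record onto this JSON schema and "
        "return only JSON.\n"
        f"Schema: {json.dumps(BUILDING_SCHEMA)}\n"
        f"Record: {raw_record}"
    )

def parse_response(llm_output: str) -> dict:
    """Check that the reply is machine-readable JSON with known keys."""
    data = json.loads(llm_output)
    return {k: data[k] for k in BUILDING_SCHEMA if k in data}

# call_llm(...) stands in for any LLM client; it is not a real API.
# parsed = parse_response(call_llm(schema_prompt("Office tower, 12,000 sqm, 1998")))
```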
Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey
The emergence of natural language processing has revolutionized the way users
interact with tabular data, enabling a shift from traditional query languages
and manual plotting to more intuitive, language-based interfaces. The rise of
large language models (LLMs) such as ChatGPT and its successors has further
advanced this field, opening new avenues for natural language processing
techniques. This survey presents a comprehensive overview of natural language
interfaces for tabular data querying and visualization, which allow users to
interact with data using natural language queries. We introduce the fundamental
concepts and techniques underlying these interfaces with a particular emphasis
on semantic parsing, the key technology facilitating the translation from
natural language to SQL queries or data visualization commands. We then delve
into the recent advancements in Text-to-SQL and Text-to-Vis problems from the
perspectives of datasets, methodologies, metrics, and system designs. This
includes a deep dive into the influence of LLMs, highlighting their strengths,
limitations, and potential for future improvements. Through this survey, we aim
to provide a roadmap for researchers and practitioners interested in developing
and applying natural language interfaces for data interaction in the era of
large language models.
Comment: 20 pages, 4 figures, 5 tables. Submitted to IEEE TKD
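The Text-to-SQL pipeline the survey covers can be sketched end to end in a few lines: parse a natural language question into SQL against a known schema, execute it, and return rows. The stub parser below is a stand-in for a trained semantic parser or LLM; its fixed output and the toy schema are assumptions for illustration.

```python
import sqlite3

# Toy schema and data for the demo question below (illustrative only).
SCHEMA = "CREATE TABLE sales (region TEXT, year INTEGER, revenue REAL);"

def text_to_sql(question: str, schema: str) -> str:
    """Stand-in semantic parser: a real system would prompt an LLM (or run
    a trained parser) with the schema and question to generate SQL."""
    return "SELECT region, SUM(revenue) FROM sales WHERE year = 2023 GROUP BY region;"

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [("north", 2023, 10.0), ("south", 2023, 7.5)])

sql = text_to_sql("Total 2023 revenue by region?", SCHEMA)
for row in conn.execute(sql):
    print(row)   # ('north', 10.0) then ('south', 7.5)
```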
Identifying and Consolidating Knowledge Engineering Requirements
Knowledge engineering is the process of creating and maintaining
knowledge-producing systems. Throughout the history of computer science and AI,
knowledge engineering workflows have been widely used because high-quality
knowledge is assumed to be crucial for reliable intelligent agents. However,
the landscape of knowledge engineering has changed, presenting four challenges:
unaddressed stakeholder requirements, mismatched technologies, adoption
barriers for new organizations, and misalignment with software engineering
practices. In this paper, we propose to address these challenges by developing
a reference architecture using a mainstream software methodology. By studying
the requirements of different stakeholders and eras, we identify 23 essential
quality attributes for evaluating reference architectures. We assess three
candidate architectures from recent literature based on these attributes.
Finally, we discuss the next steps towards a comprehensive reference
architecture, including prioritizing quality attributes, integrating components
with complementary strengths, and supporting missing socio-technical
requirements. As this endeavor requires a collaborative effort, we invite all
knowledge engineering researchers and practitioners to join us.
SPEIR: Scottish Portals for Education, Information and Research. Final Project Report: Elements and Future Development Requirements of a Common Information Environment for Scotland
The SPEIR (Scottish Portals for Education, Information and Research) project was funded by the Scottish Library and Information Council (SLIC). It ran from February 2003 to September 2004, slightly longer than the 18 months originally scheduled, and was managed by the Centre for Digital Library Research (CDLR). With SLIC's agreement, community stakeholders were represented in the project by the Confederation of Scottish Mini-Cooperatives (CoSMiC), an organisation whose members include SLIC, the National Library of Scotland (NLS), the Scottish Further Education Unit (SFEU), the Scottish Confederation of University and Research Libraries (SCURL), regional cooperatives such as the Ayrshire Libraries Forum (ALF), and representatives from the Museums and Archives communities in Scotland.
Aims: A Common Information Environment for Scotland. The aims of the project were to:
o Conduct basic research into the distributed information infrastructure requirements of the Scottish Cultural Portal pilot and the public library CAIRNS integration proposal;
o Develop associated pilot facilities by enhancing existing facilities or developing new ones;
o Ensure that both infrastructure proposals and pilot facilities were sufficiently generic to be utilised in support of other portals developed by the Scottish information community;
o Ensure the interoperability of infrastructural elements beyond Scotland through adherence to established or developing national and international standards.
Since the Scottish information landscape is taken by CoSMiC members to encompass relevant activities in Archives, Libraries, Museums, and related domains, the project was, in essence, concerned with identifying, researching, and developing the elements of an internationally interoperable common information environment for Scotland, and with determining the best path for future progress.
Semantic interoperability through context interchange: representing and reasoning about data conflicts in heterogeneous and autonomous systems
Cover title. Includes bibliographical references (p. 24-25). Supported in part by ARPA, International Financial Services Research Center (IFSRC), PROductivity From Information Technology (PROFIT), National University of Singapore, and USAF/Rome Laboratory (contract F30602-93-C-0160).
Cheng Hian Goh, Stuart E. Madnick, Michael D. Siegel
Neo: A Learned Query Optimizer
Query optimization is one of the most challenging problems in database
systems. Despite the progress made over the past decades, query optimizers
remain extremely complex components that require a great deal of hand-tuning
for specific workloads and datasets. Motivated by this shortcoming and inspired
by recent advances in applying machine learning to data management challenges,
we introduce Neo (Neural Optimizer), a novel learning-based query optimizer
that relies on deep neural networks to generate query execution plans. Neo
bootstraps its query optimization model from existing optimizers and continues
to learn from incoming queries, building upon its successes and learning from
its failures. Furthermore, Neo naturally adapts to underlying data patterns and
is robust to estimation errors. Experimental results demonstrate that Neo, even when bootstrapped from a simple optimizer like PostgreSQL, can learn a model that offers similar performance to state-of-the-art commercial optimizers and in some cases even surpasses them.
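The feedback loop the abstract describes (score candidate plans with a learned model, execute the chosen one, learn from the observed latency) can be sketched as follows. This is a deliberately tiny stand-in: Neo itself uses deep neural networks over plan trees, whereas the linear model and toy features here are assumptions for illustration.

```python
import random

# Sketch of a learned-optimizer feedback loop (illustrative; not Neo's model).
def featurize(plan):           # plan: tuple of table names in join order
    return [len(plan), sum(len(t) for t in plan)]   # toy plan features

weights = [0.0, 0.0]           # parameters of a stand-in linear cost model

def predict(plan):
    """Estimated cost of a candidate plan under the current model."""
    return sum(w * f for w, f in zip(weights, featurize(plan)))

def choose(candidates):
    """Pick the plan the model currently believes is cheapest."""
    return min(candidates, key=predict)

def learn(plan, observed_latency, lr=0.01):
    """Feedback step: nudge the model toward the observed execution cost,
    building on successes and learning from failures."""
    err = predict(plan) - observed_latency
    for i, f in enumerate(featurize(plan)):
        weights[i] -= lr * err * f

candidates = [("a", "b", "c"), ("c", "a", "b")]
plan = choose(candidates)
learn(plan, observed_latency=random.uniform(1.0, 2.0))  # simulated execution
```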
Beacon v2 and Beacon networks: A "lingua franca" for federated data discovery in biomedical genomics, and beyond
Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal addressed by version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data by providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers. The popularity of this concept has fostered the design of version 2, which better serves real-world requirements and addresses the needs of clinical genomics research and healthcare, as assessed by several contributing projects and organizations. In particular, rare disease genetics and cancer research will benefit from new case-level and genomic-variant-level requests, richer phenotype and clinical queries, and support for fuzzy searches. Beacon is designed as a "lingua franca" to bridge data collections hosted in software solutions with different and rich interfaces. Beacon version 2 works alongside popular standards like Phenopackets, OMOP, or FHIR, allowing implementing consortia to return matches in Beacon responses and provide a handover to their preferred data exchange format. The protocol is being explored by other research domains and is being tested in several international projects.
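The core version 1 interaction (a yes/no answer about whether a variant is present) can be sketched as a simple HTTP exchange. The endpoint path, parameter names, and response field below follow the general shape of a Beacon query but are placeholders, not the exact GA4GH specification.

```python
import json
from urllib import parse, request

# Sketch of a Beacon-style "is this variant present?" query (illustrative;
# the URL and field names are placeholders, not the exact GA4GH Beacon API).
def beacon_query(base_url: str, chrom: str, pos: int, ref: str, alt: str) -> bool:
    params = parse.urlencode({
        "referenceName": chrom, "start": pos,
        "referenceBases": ref, "alternateBases": alt,
    })
    with request.urlopen(f"{base_url}/query?{params}") as resp:
        body = json.load(resp)
    # A boolean field carries the "yes"/"no" answer; no record data is shared.
    return bool(body.get("exists", False))

# exists = beacon_query("https://beacon.example.org", "17", 41244936, "G", "A")
```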