Knowledge Rich Natural Language Queries over Structured Biological Databases
Increasingly, keyword, natural language and NoSQL queries are being used for
information retrieval from traditional as well as non-traditional databases
such as web, document, image, GIS, legal, and health databases. While their
popularity is undeniable for obvious reasons, their engineering is far from
simple. For the most part, semantics- and intent-preserving mapping of a well
understood natural language query expressed over a structured database schema
to a structured query language is still a difficult task, and research to tame
the complexity is intense. In this paper, we propose a multi-level
knowledge-based middleware to facilitate such mappings that separate the
conceptual level from the physical level. We augment these multi-level
abstractions with a concept reasoner and a query strategy engine to dynamically
link arbitrary natural language querying to well-defined structured queries. We
demonstrate the feasibility of our approach by presenting a Datalog-based
prototype system, called BioSmart, that can compute responses to arbitrary
natural language queries over arbitrary databases once a syntactic
classification of the natural language query is made.
A Human-Centric Approach to Group-Based Context-Awareness
The emerging need for qualitative approaches in context-aware information
processing calls for proper modeling of context information and efficient
handling of its inherent uncertainty resulting from human interpretation and
usage. Many of the current approaches to context-awareness either lack a solid
theoretical basis for modeling or ignore important requirements such as
modularity, high-order uncertainty management and group-based
context-awareness. Therefore, their real-world application and extendability
remain limited. In this paper, we present f-Context as a service-based
context-awareness framework, based on language-action perspective (LAP) theory
for modeling. Then we identify some of the complex, informational parts of
context which contain high-order uncertainties due to differences between
members of the group in defining them. An agent-based perceptual computer
architecture is proposed for implementing f-Context that uses computing with
words (CWW) for handling uncertainty. The feasibility of f-Context is analyzed
using a realistic scenario involving a group of mobile users. We believe that
the proposed approach can open the door to future research on context-awareness
by offering a theoretical foundation based on human communication, and a
service-based layered architecture which exploits CWW for context-aware,
group-based and platform-independent access to information systems.
Enabling Global Price Comparison through Semantic Integration of Web Data
“Sell Globally” and “Shop Globally” have been seen as potential
benefits of web-enabled electronic business. One important step toward realizing
these benefits is to know how things are selling in various parts of the world. A
global price comparison service would address this need, but few such services
exist. In this paper, we use a case study of global price
dispersion to illustrate the need and the value of a global price comparison
service. Then we identify and discuss several technology challenges, including
semantic heterogeneity, in providing a global price comparison service. We
propose a mediation architecture to address the semantic heterogeneity
problem, and demonstrate the feasibility of the proposed architecture by
implementing a prototype that enables global price comparison using data from
web sources in several countries.
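The semantic heterogeneity problem mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's mediation architecture; it only assumes that each source reports prices in its own context (currency, and whether tax is included) and that a mediator converts them into the receiver's context. The `Context` class, the `mediate` function, and the exchange rates are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical description of how a source (or receiver) reports prices."""
    currency: str
    tax_included: bool
    tax_rate: float  # e.g. 0.20 for 20% tax


# Made-up exchange rates to USD, for illustration only.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25, "JPY": 0.01}


def mediate(price: float, source: Context, target: Context) -> float:
    """Convert a price from the source context into the target context."""
    # Step 1: strip tax if the source price includes it.
    net = price / (1 + source.tax_rate) if source.tax_included else price
    # Step 2: convert currency via USD as a pivot.
    net_usd = net * RATES_TO_USD[source.currency]
    net_target = net_usd / RATES_TO_USD[target.currency]
    # Step 3: add tax if the target context expects tax-inclusive prices.
    if target.tax_included:
        return net_target * (1 + target.tax_rate)
    return net_target


if __name__ == "__main__":
    eu_shop = Context("EUR", tax_included=True, tax_rate=0.20)
    us_buyer = Context("USD", tax_included=False, tax_rate=0.0)
    print(round(mediate(120.0, eu_shop, us_buyer), 2))
```

Keeping the context declarations separate from the conversion logic means a new source only needs a new `Context` entry, which is the general idea behind mediation, under the assumptions stated above.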
Query Processing in a P2P Network of Taxonomy-based Information Sources
In this study we address the problem of answering queries over a peer-to-peer system of taxonomy-based sources. A taxonomy states subsumption relationships between negation-free DNF formulas on terms and negation-free conjunctions of terms. To lay the foundations of our study, we first consider the centralized case, deriving the complexity of the decision problem and of query evaluation. We conclude by presenting an algorithm that is efficient in data complexity and is based on hypergraphs. We then move to the distributed case, and introduce a logical model of a network of taxonomy-based sources. On such a network, a distributed version of the centralized algorithm is then presented, based on a message passing paradigm, and its correctness is proved. We finally discuss optimization issues, and relate our work to the literature.
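To make the centralized case concrete, here is a deliberately simplified sketch. It assumes a taxonomy that relates single terms only (broader term to narrower terms), whereas the abstract describes subsumption between DNF formulas and conjunctions and an algorithm based on hypergraphs; none of that is reproduced here. The names `TAXONOMY`, `INDEX`, `subsumed_terms`, and `answer` are hypothetical.

```python
def subsumed_terms(taxonomy, term):
    """Return the term itself plus every term it subsumes (transitively)."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for narrower in taxonomy.get(current, ()):
            if narrower not in seen:
                seen.add(narrower)
                stack.append(narrower)
    return seen


def answer(taxonomy, index, query_term):
    """Collect objects indexed under the query term or any subsumed term."""
    result = set()
    for term in subsumed_terms(taxonomy, query_term):
        result |= index.get(term, set())
    return result


# Hypothetical taxonomy: each key subsumes the terms in its list.
TAXONOMY = {"animal": ["mammal", "bird"], "mammal": ["dog", "cat"]}
# Hypothetical index: objects stored under each term.
INDEX = {"dog": {"doc1"}, "cat": {"doc2"}, "bird": {"doc3"}, "animal": {"doc4"}}

if __name__ == "__main__":
    print(sorted(answer(TAXONOMY, INDEX, "mammal")))
```

In a distributed setting, the lookup for a narrower term would presumably be forwarded to the peer that owns it rather than read from a local dictionary; this sketch does not model that message passing.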
An Approach for Semantic Integration of Heterogeneous Data Sources
Integrating data from multiple heterogeneous data sources entails dealing with data distributed among information sources that can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. In general, gathering information is challenging, and one of the main reasons is that data sources are designed to support specific applications. Very often their structure is unknown to most users. Moreover, the stored data is often redundant, mixed with information only needed to support enterprise processes, and incomplete with respect to the business domain. Collecting, integrating, reconciling and efficiently extracting information from heterogeneous and autonomous data sources is regarded as a major challenge. In this paper, we present an approach for the semantic integration of heterogeneous data sources, DIF (Data Integration Framework), and a software prototype to support all aspects of a complex data integration process. The proposed approach is an ontology-based generalization of both Global-as-View and Local-as-View approaches. In particular, to overcome problems due to semantic heterogeneity and to support interoperability with external systems, ontologies are used as a conceptual schema to represent both data sources to be integrated and the global view.
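As a rough illustration of the Global-as-View idea named in this abstract, the sketch below defines one global relation as a view over two sources with different field names. It is not DIF, it uses no ontology, and it does not cover Local-as-View; the source data, field names, and functions are hypothetical.

```python
# Two hypothetical sources that store customers under different schemas.
SOURCE_A = [{"cust_id": 1, "name": "Ada"}]
SOURCE_B = [{"id": 2, "full_name": "Grace"}]


def global_customer():
    """Global-as-View: the global relation is defined as a query over the sources."""
    for row in SOURCE_A:
        yield {"id": row["cust_id"], "name": row["name"]}
    for row in SOURCE_B:
        yield {"id": row["id"], "name": row["full_name"]}


def query_names():
    """A user query written only against the global schema."""
    return sorted(row["name"] for row in global_customer())


if __name__ == "__main__":
    print(query_names())
```

The user query never mentions `cust_id` or `full_name`; answering it only requires unfolding the view definition, which is the usual appeal of Global-as-View mappings.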
Understanding Questions that Arise When Working with Business Documents
While digital assistants are increasingly used to help with various
productivity tasks, less attention has been paid to employing them in the
domain of business documents. To build an agent that can handle users'
information needs in this domain, we must first understand the types of
assistance that users desire when working on their documents. In this work, we
present results from two user studies that characterize the information needs
and queries of authors, reviewers, and readers of business documents. In the
first study, we used experience sampling to collect users' questions in-situ as
they were working with their documents, and in the second, we built a
human-in-the-loop document Q&A system which rendered assistance with a variety
of users' questions. Our results have implications for the design of document
assistants that complement AI with human intelligence, including whether
particular skillsets or roles within the document are needed from human
respondents, as well as the challenges around such systems. Comment: This paper will appear in CSCW'2