29,339 research outputs found
Bridging the Semantic Gap with SQL Query Logs in Natural Language Interfaces to Databases
A critical challenge in constructing a natural language interface to database
(NLIDB) is bridging the semantic gap between a natural language query (NLQ) and
the underlying data. Two specific ways this challenge exhibits itself is
through keyword mapping and join path inference. Keyword mapping is the task of
mapping individual keywords in the original NLQ to database elements (such as
relations, attributes or values). It is challenging due to the ambiguity in
mapping the user's mental model and diction to the schema definition and
contents of the underlying database. Join path inference is the process of
selecting the relations and join conditions in the FROM clause of the final SQL
query, and is difficult because NLIDB users lack the knowledge of the database
schema or SQL and therefore cannot explicitly specify the intermediate tables
and joins needed to construct a final SQL query. In this paper, we propose
leveraging information from the SQL query log of a database to enhance the
performance of existing NLIDBs with respect to these challenges. We present a
system Templar that can be used to augment existing NLIDBs. Our extensive
experimental evaluation demonstrates the effectiveness of our approach, leading
up to 138% improvement in top-1 accuracy in existing NLIDBs by leveraging SQL
query log information.Comment: Accepted to IEEE International Conference on Data Engineering (ICDE)
201
Bletchley Park text: using mobile and semantic web technologies to support the post-visit use of online museum resources
A number of technologies have been developed to support the museum visitor, with the aim of making their visit more educationally rewarding and/or entertaining. Examples include PDA-based personalized tour guides and virtual reality representations of cultural objects or scenes. Rather than supporting the actual visit, we decided to employ technology to support the post-visitor, that is, encourage follow-up activities among recent visitors to a museum. This allowed us to use the technology in a way that would not detract from the existing curated experience and allow the museum to provide access to additional heritage resources that cannot be presented during the physical visit. Within our application, called Bletchley Park Text, visitors express their interests by sending text (SMS) messages containing suggested keywords using their own mobile phone. The semantic description of the archive of resources is then used to retrieve and organize a collection of content into a personalized web site for use when they get home. Organization of the collection occurs both bottom-up from the semantic description of each item in the collection, and also top-down according to a formal representation of the overall museum story. In designing the interface we aimed to support exploration across the content archive rather than just the search and retrieval of specific resources. The service was developed for the Bletchley Park museum and has since been launched for use by all visitors
Knowledge-rich Image Gist Understanding Beyond Literal Meaning
We investigate the problem of understanding the message (gist) conveyed by
images and their captions as found, for instance, on websites or news articles.
To this end, we propose a methodology to capture the meaning of image-caption
pairs on the basis of large amounts of machine-readable knowledge that has
previously been shown to be highly effective for text understanding. Our method
identifies the connotation of objects beyond their denotation: where most
approaches to image understanding focus on the denotation of objects, i.e.,
their literal meaning, our work addresses the identification of connotations,
i.e., iconic meanings of objects, to understand the message of images. We view
image understanding as the task of representing an image-caption pair on the
basis of a wide-coverage vocabulary of concepts such as the one provided by
Wikipedia, and cast gist detection as a concept-ranking problem with
image-caption pairs as queries. To enable a thorough investigation of the
problem of gist understanding, we produce a gold standard of over 300
image-caption pairs and over 8,000 gist annotations covering a wide variety of
topics at different levels of abstraction. We use this dataset to
experimentally benchmark the contribution of signals from heterogeneous
sources, namely image and text. The best result with a Mean Average Precision
(MAP) of 0.69 indicate that by combining both dimensions we are able to better
understand the meaning of our image-caption pairs than when using language or
vision information alone. We test the robustness of our gist detection approach
when receiving automatically generated input, i.e., using automatically
generated image tags or generated captions, and prove the feasibility of an
end-to-end automated process
Ontology mapping by concept similarity
This paper presents an approach to the problem of mapping ontologies. The motivation for the research stems from the Diogene Project which is developing a web training environment for ICT professionals. The system includes high quality training material from registered content providers, and free web material will also be made available through the project's "Web Discovery" component. This involves using web search engines to locate relevant material, and mapping the ontology at the core of the Diogene system to other ontologies that exist on the Semantic Web. The project's approach to ontology mapping is presented, and an evaluation of this method is described
Biology of Applied Digital Ecosystems
A primary motivation for our research in Digital Ecosystems is the desire to
exploit the self-organising properties of biological ecosystems. Ecosystems are
thought to be robust, scalable architectures that can automatically solve
complex, dynamic problems. However, the biological processes that contribute to
these properties have not been made explicit in Digital Ecosystems research.
Here, we discuss how biological properties contribute to the self-organising
features of biological ecosystems, including population dynamics, evolution, a
complex dynamic environment, and spatial distributions for generating local
interactions. The potential for exploiting these properties in artificial
systems is then considered. We suggest that several key features of biological
ecosystems have not been fully explored in existing digital ecosystems, and
discuss how mimicking these features may assist in developing robust, scalable
self-organising architectures. An example architecture, the Digital Ecosystem,
is considered in detail. The Digital Ecosystem is then measured experimentally
through simulations, with measures originating from theoretical ecology, to
confirm its likeness to a biological ecosystem. Including the responsiveness to
requests for applications from the user base, as a measure of the 'ecological
succession' (development).Comment: 9 pages, 4 figure, conferenc
A BASILar Approach for Building Web APIs on top of SPARQL Endpoints
The heterogeneity of methods and technologies to publish open data is still an issue to develop distributed systems on the Web. On the one hand, Web APIs, the most popular approach to offer data services, implement REST principles, which focus on addressing loose coupling and interoperability issues. On the other hand, Linked Data, available through SPARQL endpoints, focus on data integration between distributed data sources. The paper proposes BASIL, an approach to build Web APIs on top of SPARQL endpoints, in order to benefit of the advantages from both Web APIs and Linked Data approaches. Compared to similar solution, BASIL aims on minimising the learning curve for users to promote its adoption. The main feature of BASIL is a simple API that does not introduce new specifications, formalisms and technologies for users that belong to both Web APIs and Linked Data communities
The state-of-the-art in personalized recommender systems for social networking
With the explosion of Web 2.0 application such as blogs, social and professional networks, and various other types of social media, the rich online information and various new sources of knowledge flood users and hence pose a great challenge in terms of information overload. It is critical to use intelligent agent software systems to assist users in finding the right information from an abundance of Web data. Recommender systems can help users deal with information overload problem efficiently by suggesting items (e.g., information and products) that match users’ personal interests. The recommender technology has been successfully employed in many applications such as recommending films, music, books, etc. The purpose of this report is to give an overview of existing technologies for building personalized recommender systems in social networking environment, to propose a research direction for addressing user profiling and cold start problems by exploiting user-generated content newly available in Web 2.0
- …