Improving the presentation of search results by multipartite graph clustering of multiple reformulated queries and a novel document representation
The goal of clustering web search results is to reveal the semantics of the retrieved documents. The main challenge is to make the cluster partition relevant to a user's query. In this paper, we describe a method of clustering search results using a similarity measure between documents retrieved by multiple reformulated queries. The method produces clusters of documents that are most relevant to the original query and, at the same time, represent a more diverse set of semantically related queries. In order to cluster thousands of documents in real time, we designed a novel multipartite graph clustering algorithm that has low polynomial complexity and no manually adjusted hyperparameters. The loss of semantics resulting from the stem-based document representation is a common problem in information retrieval. To address this problem, we propose an alternative novel document representation, under which words are represented by their synonymy groups. This work was supported by Yandex grant 110104
Weakly Supervised Semantic Parsing with Execution-based Spurious Program Filtering
The problem of spurious programs is a longstanding challenge when training a
semantic parser from weak supervision. To eliminate such programs that have
wrong semantics but correct denotation, existing methods focus on exploiting
similarities between examples based on domain-specific knowledge. In this
paper, we propose a domain-agnostic filtering mechanism based on program
execution results. Specifically, for each program obtained through the search
process, we first construct a representation that captures the program's
semantics as execution results under various inputs. Then, we run a majority
vote on these representations to identify and filter out programs with
significantly different semantics from the other programs. In particular, our
method is orthogonal to the program search process so that it can easily
augment any of the existing weakly supervised semantic parsing frameworks.
Empirical evaluations on Natural Language Visual Reasoning and
WikiTableQuestions demonstrate that applying our method to existing
semantic parsers significantly improves performance. Comment: EMNLP 202
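The filtering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each candidate program is a plain Python callable, it uses an exact-match majority vote over execution results rather than whatever similarity criterion the paper applies, and the names `execution_signature`, `filter_by_majority`, and `probe_inputs` are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

Program = Callable[[int], Hashable]


def execution_signature(program: Program, probe_inputs: Sequence[int]) -> Tuple:
    """Represent a program by its execution results on several inputs."""
    results = []
    for value in probe_inputs:
        try:
            results.append(program(value))
        except Exception:
            # A crash is recorded as part of the behaviour (assumption of this sketch).
            results.append("<error>")
    return tuple(results)


def filter_by_majority(programs: List[Program], probe_inputs: Sequence[int]) -> List[Program]:
    """Keep only the programs whose signature equals the most common signature."""
    if not programs:
        return []
    signatures = [execution_signature(p, probe_inputs) for p in programs]
    majority_signature, _ = Counter(signatures).most_common(1)[0]
    return [p for p, s in zip(programs, signatures) if s == majority_signature]


# All four candidates return 4 on the original input 2 (correct denotation),
# but squaring behaves differently on the other probe inputs.
candidates = [
    lambda x: x * 2,
    lambda x: x + x,
    lambda x: x ** 2,  # spurious: agrees only on x == 2
    lambda x: 2 * x,
]
kept = filter_by_majority(candidates, probe_inputs=[2, 3, 5])
```

Because the vote only inspects execution results, a filter of this shape is independent of how the candidates were found, which matches the abstract's claim that the method is orthogonal to the program search process.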
Semantic-driven matchmaking of web services using case-based reasoning
With the rapid proliferation of Web services as the medium of choice to securely publish application services beyond the firewall, accurate yet flexible matchmaking of similar services is becoming increasingly important both for the human user and for dynamic composition engines. In this paper, we present a novel approach that utilizes the case-based reasoning (CBR) methodology for modelling dynamic Web service discovery and matchmaking. Our framework considers Web service execution experiences in the decision-making process and is highly adaptable to the service requester's constraints. The framework also utilizes OWL semantic descriptions extensively for implementing both the components of the CBR engine and the matchmaking profile of the Web services.
A Logic-based Approach for Recognizing Textual Entailment Supported by Ontological Background Knowledge
We present the architecture and the evaluation of a new system for
recognizing textual entailment (RTE). In RTE we want to identify automatically
the type of a logical relation between two input texts. In particular, we are
interested in proving the existence of an entailment between them. We conceive
our system as a modular environment allowing for a high-coverage syntactic and
semantic text analysis combined with logical inference. For the syntactic and
semantic analysis we combine a deep semantic analysis with a shallow one
supported by statistical models in order to increase the quality and the
accuracy of results. For RTE we use first-order logical inference employing
model-theoretic techniques and automated reasoning tools. The inference is
supported with problem-relevant background knowledge extracted automatically
and on demand from external sources such as WordNet, YAGO, and OpenCyc, or
other, more experimental sources with, e.g., manually defined presupposition
resolutions, or with axiomatized general and common sense knowledge. The
results show that fine-grained and consistent knowledge coming from diverse
sources is a necessary condition determining the correctness and traceability
of results. Comment: 25 pages, 10 figure