Target Type Identification for Entity-Bearing Queries

Author: Balog Krisztian
Croft W. Bruce
Mikolov Tomas
Sawant Uma
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 27/07/2017

Identifying the target types of entity-bearing queries can help improve retrieval performance as well as the overall search experience. In this work, we address the problem of automatically detecting the target types of a query with respect to a type taxonomy. We propose a supervised learning approach with a rich variety of features. Using a purpose-built test collection, we show that our approach outperforms existing methods by a remarkable margin. This is an extended version of the article published with the same title in the Proceedings of SIGIR'17. Comment: Extended version of SIGIR'17 short paper, 5 pages.

arXiv.org e-Print Archive

Crossref

Tools for producing formal specifications : a view of current architectures and future directions

Author: Meziane F
Vadera S
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1997

During the last decade, one important contribution towards requirements engineering has been the advent of formal specification languages. They offer a well-defined notation that can improve consistency and avoid ambiguity in specifications. However, the process of obtaining formal specifications that are consistent with the requirements is itself a difficult activity. Hence various researchers are developing systems that aid the transition from informal to formal specifications. The kinds of problems tackled and the contributions made by these proposed systems are very diverse. This paper brings these studies together to provide a vision for future architectures that aim to aid the transition from informal to formal specifications. The new architecture, which is based on the strengths of existing studies, tackles a number of key issues in requirements engineering such as identifying ambiguities, incompleteness, and reusability. The paper concludes with a discussion of the research problems that need to be addressed in order to realise the proposed architecture.

University of Salford Institutional Repository

Knowledge management support for enterprise distributed systems

Author: Chen-Burger Jessica
Kalfoglou Yannis
Publication venue: Information Science Reference
Publication date: 01/01/2008

The explosion of information and increasing demands on semantic processing in web applications have pushed software systems to their limits. To address the problem, we propose a semantic-based formal framework (ADP) that makes use of promising technologies to enable knowledge generation and retrieval. We argue that this approach is cost-effective, as it reuses and builds on existing knowledge and structure. It is also a good starting point for creating an organisational memory and providing knowledge management functions.

Southampton (e-Prints Soton)

Cross-lingual document retrieval categorisation and navigation based on distributed services

Author: Deksne D.
Demetriou G.
Gaizauskas R.
Hansen P.
Karlgren J.
Keskustalo H.
Petrelli D.
Sanderson M.
Skadina I.

The widespread use of the Internet across countries has increased the need for access to document collections that are often written in languages different from a user’s native language. In this paper we describe Clarity, a Cross Language Information Retrieval (CLIR) system for English, Finnish, Swedish, Latvian and Lithuanian. Clarity is a fully-fledged retrieval system that supports the user during the whole process of query formulation, text retrieval and document browsing. We address four of the major aspects of Clarity: (i) the user-driven methodology that formed the basis for the iterative design cycle and framework in the project, (ii) the system architecture that was developed to support the interaction and coordination of Clarity’s distributed services, (iii) the data resources and methods for query translation, and (iv) the support for Baltic languages. Clarity is an example of a distributed CLIR system built with minimal translation resources and, to our knowledge, the only such system that currently supports Baltic languages.

White Rose Research Online

Multiple Models for Recommending Temporal Aspects of Entities

Author: Kanhabua Nattiya
Nejdl Wolfgang
Nguyen Tu Ngoc
Publication date: 03/06/2018

Entity aspect recommendation is an emerging task in semantic search that helps users discover serendipitous and prominent information with respect to an entity, of which salience (e.g., popularity) is the most important factor in previous work. However, entity aspects are temporally dynamic and often driven by events happening over time. For such cases, aspect suggestion based solely on salience features can give unsatisfactory results, for two reasons. First, salience is often accumulated over a long time period and does not account for recency. Second, many aspects related to an event entity are strongly time-dependent. In this paper, we study the task of temporal aspect recommendation for a given entity, which aims at recommending the most relevant aspects and takes time into account in order to improve the search experience. We propose a novel event-centric ensemble ranking method that learns from multiple time- and type-dependent models and dynamically trades off salience and recency characteristics. Through extensive experiments on real-world query logs, we demonstrate that our method is robust and achieves better effectiveness than competitive baselines. Comment: In proceedings of the 15th Extended Semantic Web Conference (ESWC 2018).
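
As a rough illustration of what trading off salience against recency can mean, the sketch below ranks aspects by a plain linear interpolation of two scores. This is an assumption made for illustration only: the abstract does not specify the authors' ensemble method, and the names `Aspect`, `blended_score`, `rank_aspects`, and `event_weight`, as well as the example scores, are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Aspect:
    name: str
    salience: float  # hypothetical long-term popularity score in [0, 1]
    recency: float   # hypothetical short-term, event-driven score in [0, 1]


def blended_score(aspect: Aspect, event_weight: float) -> float:
    """Linear interpolation; event_weight near 1 favours recency."""
    return (1.0 - event_weight) * aspect.salience + event_weight * aspect.recency


def rank_aspects(aspects: list[Aspect], event_weight: float) -> list[str]:
    """Return aspect names ordered by blended score, best first."""
    if not 0.0 <= event_weight <= 1.0:
        raise ValueError("event_weight must lie in [0, 1]")
    ranked = sorted(
        aspects,
        key=lambda a: blended_score(a, event_weight),
        reverse=True,
    )
    return [a.name for a in ranked]


# Hypothetical aspects of a sports-event entity.
aspects = [
    Aspect("history", salience=0.9, recency=0.1),
    Aspect("live results", salience=0.3, recency=0.95),
]
quiet_period = rank_aspects(aspects, event_weight=0.0)
during_event = rank_aspects(aspects, event_weight=0.8)
```

With a weight of zero the ranking reduces to salience alone; raising the weight lets time-dependent aspects overtake long-established ones. A learned model would presumably set this weight per query and time rather than by hand.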

arXiv.org e-Print Archive

A distributional and syntactic approach to fine-grained opinion mining

Author: Sayeed Asad Basheer
Publication date: 01/01/2011

This thesis contributes to a larger social science research program of analyzing the diffusion of IT innovations. We show how to automatically discriminate portions of text dealing with opinions about innovations by finding {source, target, opinion} triples in text. In this context, we can discern a list of innovations as targets from the domain itself. We can then use this list as an anchor for finding the other two members of the triple at a "fine-grained" level: paragraph contexts or less. We first demonstrate a vector space model for finding opinionated contexts in which the innovation targets are mentioned. We can find paragraph-level contexts by searching for an "expresses-an-opinion-about" relation between sources and targets using a supervised model with an SVM that uses features derived from a general-purpose subjectivity lexicon and a corpus indexing tool. We show that our algorithm correctly filters the domain-relevant subset of subjectivity terms so that they are more highly valued. We then turn to identifying the opinion. Typically, opinions in opinion mining are taken to be positive or negative. We discuss a crowdsourcing technique developed to create the seed data describing human perception of opinion-bearing language needed for our supervised learning algorithm. Our user interface successfully limited the meta-subjectivity inherent in the task ("What is an opinion?") while reliably retrieving relevant opinionated words using labour not expert in the domain. Finally, we developed a new data structure and modeling technique for connecting targets with the correct within-sentence opinionated language. Syntactic relatedness tries (SRTs) contain all paths from a dependency graph of a sentence that connect a target expression to a candidate opinionated word. We use factor graphs to model how far a path through the SRT must be followed in order to connect the right targets to the right words.
It turns out that we can correctly label significant portions of these tries with very rudimentary features such as part-of-speech tags and dependency labels with minimal processing. This technique uses the data from the crowdsourcing technique we developed as training data. We conclude by placing our work in the context of a larger sentiment classification pipeline and by describing a model for learning from the data structures produced by our work. This work contributes to computational linguistics by proposing and verifying new data gathering techniques and applying recent developments in machine learning to inference over grammatical structures for highly subjective purposes. It applies a suffix tree-based data structure to model opinion in a specific domain by imposing a restriction on the order in which the data is stored in the structure.
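
The idea of a trie of dependency paths rooted at a target can be sketched as follows. This is only a minimal illustration under stated assumptions, not the thesis's actual SRT construction: the example parse, the edge labels, the nested-dict layout, and the function names `build_adjacency`, `path_between`, and `build_trie` are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical dependency parse of "critics praised the new tablet",
# given as (head, label, dependent) triples.
EDGES = [
    ("praised", "nsubj", "critics"),
    ("praised", "dobj", "tablet"),
    ("tablet", "det", "the"),
    ("tablet", "amod", "new"),
]


def build_adjacency(edges):
    """Undirected view of the parse; each step records label and direction."""
    adj = defaultdict(list)
    for head, label, dep in edges:
        adj[head].append((f"{label}>", dep))   # step from head to dependent
        adj[dep].append((f"<{label}", head))   # step from dependent to head
    return adj


def path_between(adj, source, goal):
    """Breadth-first search for the sequence of steps from source to goal."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for step, nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(step, nxt)]))
    return None  # goal is not connected to source


def build_trie(edges, target, candidates):
    """Nested-dict trie rooted at the target; shared path prefixes share nodes."""
    adj = build_adjacency(edges)
    root = {"word": target, "children": {}}
    for cand in candidates:
        path = path_between(adj, target, cand)
        if path is None:
            continue
        node = root
        for step, word in path:
            node = node["children"].setdefault(
                (step, word), {"word": word, "children": {}}
            )
    return root


trie = build_trie(EDGES, "tablet", ["praised", "critics", "new"])
```

Here the paths to "praised" and "critics" share the first step up the `dobj` edge, so they share a trie node; a labelling model would then decide how far along each branch to go before accepting a word as opinion-bearing for the target.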

Digital Repository at the University of Maryland