4,835 research outputs found
Applying semantic web technologies to knowledge sharing in aerospace engineering
This paper details an integrated methodology to optimise Knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses Ontologies as a central modelling strategy for the Capture of Knowledge from legacy docu-ments via automated means, or directly in systems interfacing with Knowledge workers, via user-defined, web-based forms. The domain ontologies used for Knowledge Capture also guide the retrieval of the Knowledge extracted from the data using a Semantic Search System that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale
Relation Discovery from Web Data for Competency Management
This paper describes a technique for automatically discovering associations between people and expertise from an analysis of very large data sources (including web pages, blogs and emails), using a family of algorithms that perform accurate named-entity recognition, assign different weights to terms according to an analysis of document structure, and access distances between terms in a document. My contribution is to add a social networking approach called BuddyFinder which relies on associations within a large enterprise-wide "buddy list" to help delimit the search space and also to provide a form of 'social triangulation' whereby the system can discover documents from your colleagues that contain pertinent information about you. This work has been influential in the information retrieval community generally, as it is the basis of a landmark system that achieved overall first place in every category in the Enterprise Search Track of TREC2006
Transitive probabilistic CLIR models.
Transitive translation could be a useful technique to enlarge the number of supported language pairs for a cross-language information retrieval (CLIR) system in a cost-effective manner. The paper describes several setups for transitive translation based on probabilistic translation models. The transitive CLIR models were evaluated on the CLEF test collection and yielded a retrieval effectiveness\ud
up to 83% of monolingual performance, which is significantly better than a baseline using the synonym operator
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries ; and (iv) evaluation of NLP systems
Word-Entity Duet Representations for Document Ranking
This paper presents a word-entity duet framework for utilizing knowledge
bases in ad-hoc retrieval. In this work, the query and documents are modeled by
word-based representations and entity-based representations. Ranking features
are generated by the interactions between the two representations,
incorporating information from the word space, the entity space, and the
cross-space connections through the knowledge graph. To handle the
uncertainties from the automatically constructed entity representations, an
attention-based ranking model AttR-Duet is developed. With back-propagation
from ranking labels, the model learns simultaneously how to demote noisy
entities and how to rank documents with the word-entity duet. Evaluation
results on TREC Web Track ad-hoc task demonstrate that all of the four-way
interactions in the duet are useful, the attention mechanism successfully
steers the model away from noisy entities, and together they significantly
outperform both word-based and entity-based learning to rank systems
Entity Query Feature Expansion Using Knowledge Base Links
Recent advances in automatic entity linking and knowledge base
construction have resulted in entity annotations for document and
query collections. For example, annotations of entities from large
general purpose knowledge bases, such as Freebase and the Google
Knowledge Graph. Understanding how to leverage these entity
annotations of text to improve ad hoc document retrieval is an open
research area. Query expansion is a commonly used technique to
improve retrieval effectiveness. Most previous query expansion
approaches focus on text, mainly using unigram concepts. In this
paper, we propose a new technique, called entity query feature
expansion (EQFE) which enriches the query with features from
entities and their links to knowledge bases, including structured
attributes and text. We experiment using both explicit query entity
annotations and latent entities. We evaluate our technique on TREC
text collections automatically annotated with knowledge base entity
links, including the Google Freebase Annotations (FACC1) data.
We find that entity-based feature expansion results in significant
improvements in retrieval effectiveness over state-of-the-art text
expansion approaches
Modeling Documents as Mixtures of Persons for Expert Finding
In this paper we address the problem of searching for knowledgeable
persons within the enterprise, known as the expert finding (or
expert search) task. We present a probabilistic algorithm using the assumption
that terms in documents are produced by people who are mentioned
in them.We represent documents retrieved to a query as mixtures
of candidate experts language models. Two methods of personal language
models extraction are proposed, as well as the way of combining
them with other evidences of expertise. Experiments conducted with the
TREC Enterprise collection demonstrate the superiority of our approach
in comparison with the best one among existing solutions
- …