SEARCHING BASED ON QUERY DOCUMENTS
Searches can start with query documents, where search queries are formulated based on document-level descriptions. This type of search is more common in domain-specific search environments. For example, in patent retrieval, one major search task is finding relevant information for new (query) patents, and search queries are generated from the query patents. One unique characteristic of this search is that the search process can take longer and be more comprehensive than general web search. As an example, to complete a single patent retrieval task, a typical user may generate 15 queries and examine more than 100 retrieved documents. In these search environments, searchers need to formulate multiple queries based on query documents that are typically complex and difficult to understand. In this work, we describe methods for automatically generating queries and diversifying search results based on query documents, which can be used for query suggestion and for improving the quality of retrieval results. In particular, we focus on resolving three main issues related to query document-based searches: (1) query generation, (2) query suggestion and formulation, and (3) search result diversification. Automatic query generation helps users by reducing the burden of formulating queries from query documents. Using generated queries as suggestions is investigated as a method of presenting alternative queries. Search result diversification is important in domain-specific search because of the nature of the query documents. Since query documents generally contain long, complex descriptions, diverse query topics can be identified, and a range of relevant documents can be found that are related to these diverse topics. The methods we study in this thesis explicitly address these three issues.
To solve the query generation issue, we use binary decision trees to generate effective Boolean queries and label propagation to formulate more effective phrasal-concept queries. To diversify search results, we propose two different approaches: query-side and result-level diversification. For query-side diversification, we identify important topics from query documents and generate diverse queries based on the identified topics. For result-level diversification, we extract query topics from query documents and apply state-of-the-art diversification algorithms based on the extracted topics. In addition, we devise query suggestion techniques for each query generation method. To demonstrate the effectiveness of our approach, we conduct experiments for various domain-specific search tasks and devise appropriate evaluation measures for domain-specific search environments.
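The idea of reading a Boolean query off a binary decision tree can be illustrated with a minimal sketch. It is not the thesis's implementation: the `Node` and `Leaf` classes, the function names, and the toy terms ("battery", "lithium", "capacitor") are all hypothetical, and the tree is written by hand rather than learned from data. Each internal node tests whether a term occurs in a document, and every root-to-leaf path ending in a "relevant" leaf becomes one AND-clause of the final query.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    relevant: bool  # True if documents reaching this leaf are predicted relevant


@dataclass
class Node:
    term: str                      # test: does the document contain this term?
    present: Union["Node", Leaf]   # branch taken when the term is present
    absent: Union["Node", Leaf]    # branch taken when the term is absent


def tree_to_clauses(tree, path=None):
    """Return one AND-clause per root-to-leaf path that ends in a relevant leaf."""
    path = path or []
    if isinstance(tree, Leaf):
        # A relevant leaf contributes the conjunction of the tests on its path.
        return [" AND ".join(path)] if tree.relevant and path else []
    return (
        tree_to_clauses(tree.present, path + [tree.term])
        + tree_to_clauses(tree.absent, path + ["NOT " + tree.term])
    )


def format_query(clauses):
    """Join the AND-clauses into a single disjunctive Boolean query string."""
    return " OR ".join(f"({clause})" for clause in clauses)


# Hypothetical hand-built tree; a real system would learn it from labeled documents.
toy_tree = Node(
    term="battery",
    present=Node(term="lithium", present=Leaf(True), absent=Leaf(False)),
    absent=Node(term="capacitor", present=Leaf(True), absent=Leaf(False)),
)

query = format_query(tree_to_clauses(toy_tree))
print(query)  # (battery AND lithium) OR (NOT battery AND capacitor)
```

The resulting string is in disjunctive normal form, which most Boolean search interfaces accept directly; how the tree is trained and which terms it may test are left open here.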
Formal Linguistic Models and Knowledge Processing. A Structuralist Approach to Rule-Based Ontology Learning and Population
2013 - 2014. The main aim of this research is to propose a structuralist approach for knowledge processing by means of ontology learning and population, achieved starting from unstructured and structured texts. The method suggested includes distributional semantic approaches and NL formalization theories, in order to develop a framework, which relies upon deep linguistic analysis... [edited by author] XIII n.s.
Interactive grammar development with WCDG
The manual design of grammars for accurate natural language analysis is an iterative process; while modelling decisions usually determine parser behaviour, evidence from analysing more or different input can suggest unforeseen regularities, which leads to a reformulation of rules, or even to a different model of previously analysed phenomena. We describe an implementation of Weighted Constraint Dependency Grammar that supports the grammar writer by providing display, automatic analysis, and diagnosis of dependency analyses and allows the direct exploration of alternative analyses and their status under the current grammar.