
    Category Estimation of Trend Queries by Label Propagation

    Query classification is an important technique for web search engines, allowing them to improve users' search experience. Specifically, query classification methods classify queries according to topical categories, such as celebrities and sports. Such category information is effective in improving web search results, online advertisements, and so on. Unlike previous studies, our research focuses on trend queries that have suddenly become popular and are extensively searched. Our aim is to classify such trend queries in a timely manner, i.e., classify the queries on the same day when they become popular, in order to provide a better search experience. To reduce the expensive manual annotation costs to train supervised learning methods, we focus on a label propagation method that belongs to the semi-supervised learning family. Specifically, the proposed method is based on our previous method that constructs a graph using a corpus, and propagates a small number of ground-truth categories of labeled queries in order to estimate the categories of unlabeled queries. We extend this method to cut ineffective edges to improve both classification accuracy and computational efficiency. Furthermore, we investigate in detail the effects of different corpora, i.e., web/blog/news search results, Tweets, and news pages, on the trend query classification task. Our experiments replicate the situation of an emerging trend query; the results show that web search results are the most effective for trend query classification, achieving a 50.1% F-score, which significantly outperforms the state-of-the-art method by 7.2 points. These results provide useful insights into selecting an appropriate dataset for query classification from the various types of data available.
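
The graph-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the graph is built or which edges count as ineffective, so the toy query-term graph, the weight threshold `min_weight`, and the helper names `prune_edges` and `propagate_labels` are all assumptions made for illustration.

```python
from collections import defaultdict


def prune_edges(edges, min_weight):
    """Drop edges lighter than min_weight (an assumed pruning rule)."""
    return [(u, v, w) for u, v, w in edges if w >= min_weight]


def propagate_labels(edges, seeds, iterations=20):
    """Spread category scores from labeled seed queries across the graph."""
    neighbors = defaultdict(list)
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    # scores[node] maps category -> score; seeds start with full confidence.
    scores = {query: {cat: 1.0} for query, cat in seeds.items()}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                continue  # seed labels are clamped below
            acc = defaultdict(float)
            total = sum(w for _, w in nbrs)
            for nbr, w in nbrs:
                for cat, score in scores.get(nbr, {}).items():
                    acc[cat] += w * score
            if acc and total > 0:
                # Weighted average of the neighbors' category scores.
                updated[node] = {cat: s / total for cat, s in acc.items()}
        for query, cat in seeds.items():
            updated[query] = {cat: 1.0}  # clamp ground-truth labels
        scores = updated
    # Pick the highest-scoring category for every node that received one.
    return {node: max(d, key=d.get) for node, d in scores.items()}


# Hypothetical toy graph: queries linked to terms from an assumed corpus.
edges = [
    ("messi", "goal", 0.9),
    ("world cup final", "goal", 0.8),
    ("oscars", "red carpet", 0.9),
    ("met gala", "red carpet", 0.7),
    ("world cup final", "red carpet", 0.05),  # weak, likely noisy edge
]
seeds = {"messi": "sports", "oscars": "celebrities"}

pruned = prune_edges(edges, min_weight=0.1)
predicted = propagate_labels(pruned, seeds)
print(predicted["world cup final"], predicted["met gala"])
```

Pruning before propagation shrinks the graph each iteration has to traverse and keeps weak, possibly noisy links from leaking scores between categories, which matches the motivation the abstract gives for cutting edges.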

    NITELIGHT: A Graphical Tool for Semantic Query Construction

    Query formulation is a key aspect of information retrieval, contributing to both the efficiency and usability of many semantic applications. A number of query languages, such as SPARQL, have been developed for the Semantic Web; however, there are, as yet, few tools to support end users with respect to the creation and editing of semantic queries. In this paper we introduce a graphical tool for semantic query construction (NITELIGHT) that is based on the SPARQL query language specification. The tool supports end users by providing a set of graphical notations that represent semantic query language constructs. This language provides a visual query language counterpart to SPARQL that we call vSPARQL. NITELIGHT also provides an interactive graphical editing environment that combines ontology navigation capabilities with graphical query visualization techniques. This paper describes the functionality and user interaction features of the NITELIGHT tool based on our work to date. We also present details of the vSPARQL constructs used to support the graphical representation of SPARQL queries.

    A Factoid Question Answering System for Vietnamese

    In this paper, we describe the development of an end-to-end factoid question answering system for the Vietnamese language. This system combines both statistical models and ontology-based methods in a chain of processing modules to provide high-quality mappings from natural language text to entities. We present the challenges in the development of such an intelligent user interface for an isolating language like Vietnamese and show that techniques developed for inflectional languages cannot be applied "as is". Our question answering system can answer a wide range of general knowledge questions with promising accuracy on a test set. Comment: In the proceedings of the HQA'18 workshop, The Web Conference Companion, Lyon, France.

    A semantic-based system for querying personal digital libraries

    This is the author's accepted manuscript. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-540-28640-0_4. Copyright © Springer 2004. The decreasing cost and the increasing availability of new technologies are enabling people to create their own digital libraries. One of the main topics in personal digital libraries is allowing people to select interesting information among all the different digital formats available today (pdf, html, tiff, etc.). Moreover, the increasing availability of these on-line libraries, as well as the advent of the so-called Semantic Web [1], is raising the demand for converting paper documents into digital, possibly semantically annotated, documents. These motivations drove us to design a new system that enables the user to interact with and query documents independently of the digital formats in which they are represented. To achieve this independence from the format, we consider all the digital documents contained in a digital library as images. Our system tries to automatically detect the layout of the digital documents and recognize the geometric regions of interest. All the extracted information is then encoded with respect to a reference ontology, so that the user can query his digital library by typing free text or browsing the ontology.

    Survey over Existing Query and Transformation Languages

    A widely acknowledged obstacle for realizing the vision of the Semantic Web is the inability of many current Semantic Web approaches to cope with data available in such diverging representation formalisms as XML, RDF, or Topic Maps. A common query language is the first step to allow transparent access to data in any of these formats. To further the understanding of the requirements and approaches proposed for query languages in the conventional as well as the Semantic Web, this report surveys a large number of query languages for accessing XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from all these areas. From the detailed survey of these query languages, a common classification scheme is derived that is useful for understanding and differentiating languages within and among all three areas.