87 research outputs found

    FuFaIR: a Fuzzy Farsi Information Retrieval System

    Get PDF
    Persian (Farsi) is one of the languages of Middle East. There are significant amount of Persian documents available in digital form and even more are created every day. Therefore, there is a necessity to implement Information Retrieval System with high precision for this language. This paper discusses the design, implementation and testing of a Fuzzy retrieval system for Persian called FuFaIR. This system also supports Fuzzy quantifiers in its query language. Tests have been conducted using a standard Persian test corpus called Hamshari. The performance results obtained from FuFaIR are positive and they indicate that the FuFaIR could notably outperform well known industry systems such as the vector space model

    Aplicaciones de la teoria de conjuntos borrosos

    Get PDF

    Evaluating Information Retrieval and Access Tasks

    Get PDF
    This open access book summarizes the first two decades of the NII Testbeds and Community for Information access Research (NTCIR). NTCIR is a series of evaluation forums run by a global team of researchers and hosted by the National Institute of Informatics (NII), Japan. The book is unique in that it discusses not just what was done at NTCIR, but also how it was done and the impact it has achieved. For example, in some chapters the reader sees the early seeds of what eventually grew to be the search engines that provide access to content on the World Wide Web, today’s smartphones that can tailor what they show to the needs of their owners, and the smart speakers that enrich our lives at home and on the move. We also get glimpses into how new search engines can be built for mathematical formulae, or for the digital record of a lived human life. Key to the success of the NTCIR endeavor was early recognition that information access research is an empirical discipline and that evaluation therefore lay at the core of the enterprise. Evaluation is thus at the heart of each chapter in this book. They show, for example, how the recognition that some documents are more important than others has shaped thinking about evaluation design. The thirty-three contributors to this volume speak for the many hundreds of researchers from dozens of countries around the world who together shaped NTCIR as organizers and participants. This book is suitable for researchers, practitioners, and students—anyone who wants to learn about past and present evaluation efforts in information retrieval, information access, and natural language processing, as well as those who want to participate in an evaluation task or even to design and organize one

    Graph-Based Entity-Oriented Search

    Get PDF

    Accommodating Complex Chained Prepositional Phrases in Natural Language Query Interface to an Event-Based Triplestore

    Get PDF
    Building Natural language query interfaces (NLI) to databases is one the most interesting and challenging fields of study for computer scientists and researchers. There have been many advancements and achievements in this area that enables NLIs to operate more efficiently and have wide NL coverage. However, there exists some shortcomings in query interface to semantic web triplestores. Some researchers have attempted to extend the range of queries that can be answered. However, only a few techniques can handle queries containing complex chained prepositional phrases. This thesis involves extending an existing method that can accommodate prepositional phrases to also be able to handle when..., where..., and with what... type queries. The approach developed is implemented in the Miranda programing environment

    Applying Wikipedia to Interactive Information Retrieval

    Get PDF
    There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday webscale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval

    A portfolio theory approach to ease navigation task of users

    Get PDF
    The way the users interact with Information Retrieval (IR) systems is an interesting topic of interest in the field of Human Computer Interaction (HCI) and IR. With the ever increasing information in the web, users are often lost in the vast information space. Navigating in the complex information space to find the required information, is often an abstruse task by users. One of the reasons is the difficulty in designing systems that would present the user with an optimal set of navigation options to support varying information needs. As a solution to the navigation problem, in this thesis we propose a method referred as interaction portfolio theory, based on Markowitz's 'Modern Portfolio theory', a theory of finance. It provides the users with N optimal interaction options in each iteration, by taking into account user's goal expressed via interaction during the task, but also the risk related to a potentially suboptimal choice made by the user. In each iteration, the proposed method learns the relevant interaction options from user behaviour interactively and optimizes relevance and diversity to allow the user to accomplish the task in a shorter interaction sequence. This theory can be applied to any IR system to help users to retrieve the required information efficiently
    corecore