339 research outputs found

    Reformulation of queries using similarity thesauri

    Get PDF
    Este artículo trata sobre la recuparación de la información en thesaurus similares.One of the major problems in information retrieval is the formulation of queries on thepart of the user. This entails specifying a set of words or terms that express their informationalneed. However, it is well-known that two people can assign different terms to refer tothe same concepts. The techniques that attempt to reduce this problem as much as possiblegenerally start from a first search, and then study how the initial query can be modified toobtain better results. In general, the construction of the new query involves expanding theterms of the initial query and recalculating the importance of each term in the expandedquery. Depending on the technique used to formulate the new query several strategies aredistinguished. These strategies are based on the idea that if two terms are similar (withrespect to any criterion), the documents in which both terms appear frequently will also berelated. The technique we used in this study is known as query expansion using similaritythesauri

    A Systematic Review of Automated Query Reformulations in Source Code Search

    Full text link
    Fixing software bugs and adding new features are two of the major maintenance tasks. Software bugs and features are reported as change requests. Developers consult these requests and often choose a few keywords from them as an ad hoc query. Then they execute the query with a search engine to find the exact locations within software code that need to be changed. Unfortunately, even experienced developers often fail to choose appropriate queries, which leads to costly trials and errors during a code search. Over the years, many studies attempt to reformulate the ad hoc queries from developers to support them. In this systematic literature review, we carefully select 70 primary studies on query reformulations from 2,970 candidate studies, perform an in-depth qualitative analysis (e.g., Grounded Theory), and then answer seven research questions with major findings. First, to date, eight major methodologies (e.g., term weighting, term co-occurrence analysis, thesaurus lookup) have been adopted to reformulate queries. Second, the existing studies suffer from several major limitations (e.g., lack of generalizability, vocabulary mismatch problem, subjective bias) that might prevent their wide adoption. Finally, we discuss the best practices and future opportunities to advance the state of research in search query reformulations.Comment: 81 pages, accepted at TOSE

    Cross-concordances: terminology mapping and its effectiveness for information retrieval

    Get PDF
    The German Federal Ministry for Education and Research funded a major terminology mapping initiative, which found its conclusion in 2007. The task of this terminology mapping initiative was to organize, create and manage 'cross-concordances' between controlled vocabularies (thesauri, classification systems, subject heading lists) centred around the social sciences but quickly extending to other subject areas. 64 crosswalks with more than 500,000 relations were established. In the final phase of the project, a major evaluation effort to test and measure the effectiveness of the vocabulary mappings in an information system environment was conducted. The paper reports on the cross-concordance work and evaluation results.Comment: 19 pages, 4 figures, 11 tables, IFLA conference 200

    Improving Retrieval Results with discipline-specific Query Expansion

    Full text link
    Choosing the right terms to describe an information need is becoming more difficult as the amount of available information increases. Search-Term-Recommendation (STR) systems can help to overcome these problems. This paper evaluates the benefits that may be gained from the use of STRs in Query Expansion (QE). We create 17 STRs, 16 based on specific disciplines and one giving general recommendations, and compare the retrieval performance of these STRs. The main findings are: (1) QE with specific STRs leads to significantly better results than QE with a general STR, (2) QE with specific STRs selected by a heuristic mechanism of topic classification leads to better results than the general STR, however (3) selecting the best matching specific STR in an automatic way is a major challenge of this process.Comment: 6 pages; to be published in Proceedings of Theory and Practice of Digital Libraries 2012 (TPDL 2012

    Consolidated study on query expansion

    Full text link
    A typical day of million web users all over the world starts with a simple query. The quest for information on a particular topic drives them to search for it, and in the pursuit of their info the terms they supply for queries varies from person to person depending on the knowledge they have. With a vast collection of documents available on the web universe it is the onus of the retrieval system to return only those documents that are relevant and satisfy the user’s search requirements. The document mismatch problem is resolved by appending extra query terms to the original query which improves the retrieval performance. The addition of terms tends to minimize the bridging-gap between the documents and queries. In this thesis, a brief study is done on the reformulation of queries, along with methods of calculating the relevancy of candidate terms for query expansion by using several ranking algorithms, term weighting algorithms and feedback processes involving evaluations. Comparisons of various methods based on their efficiencies are also discussed. On the whole a consolidated report of query expansion in general is given
    corecore