41,124 research outputs found

    Meeting of the MINDS: an information retrieval research agenda

    Since its inception in the late 1950s, the field of Information Retrieval (IR) has developed tools that help people find, organize, and analyze information. The key early influences on the field are well known. Among them are H. P. Luhn's pioneering work, the development of the vector space retrieval model by Salton and his students, Cleverdon's development of the Cranfield experimental methodology, Spärck Jones' development of idf, and a series of probabilistic retrieval models by Robertson and Croft. Until the development of the World Wide Web (Web), IR was of greatest interest to professional information analysts such as librarians, intelligence analysts, the legal community, and the pharmaceutical industry.

    Distributed resource discovery using a context sensitive infrastructure

    Distributed Resource Discovery in a World Wide Web environment using full-text indices will never scale. The distinct properties of WWW information (volume, rate of change, topical diversity) limit the scalability of traditional approaches to distributed Resource Discovery. An approach combining metadata clustering and query routing can, on the other hand, be proven to scale much better. This paper presents the Content-Sensitive Infrastructure, a design that builds on these results. We also present an analytical framework for comparing the scalability of different distribution strategies.

    Simplifying Deep-Learning-Based Model for Code Search

    To accelerate software development, developers frequently search and reuse existing code snippets from a large-scale codebase, e.g., GitHub. Over the years, researchers have proposed many information retrieval (IR) based models for code search, which match keywords in the query with code text. But they fail to bridge the semantic gap between query and code. To address this challenge, Gu et al. proposed a deep-learning-based model named DeepCS. It jointly embeds method code and natural language descriptions into a shared vector space, where methods related to a natural language query are retrieved according to their vector similarities. However, DeepCS' working process is complicated and time-consuming. To overcome this issue, we propose a simplified model, CodeMatcher, that leverages the IR technique but maintains many features of DeepCS. Generally, CodeMatcher combines query keywords in their original order, performs a fuzzy search on the name and body strings of methods, and returns the best-matched methods with the longer sequence of used keywords. We verified its effectiveness on a large-scale codebase with about 41k repositories. Experimental results showed that the simplified model CodeMatcher outperforms DeepCS by 97% in terms of MRR (a widely used accuracy measure for code search), and it is over 66 times faster than DeepCS. Besides, compared with the state-of-the-art IR-based model CodeHow, CodeMatcher also improves the MRR by 73%. We also observed that fusing the advantages of IR-based and deep-learning-based models is promising because they complement each other by nature, and that improving the quality of method naming helps code search, since the method name plays an important role in connecting query and code.
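
    The two ideas named in the abstract above (matching query keywords in their original order, and scoring a ranked list with MRR) can be illustrated with a minimal Python sketch. This is not the authors' CodeMatcher implementation: the function names, the greedy left-to-right matching strategy, and the sample method names are all assumptions made for illustration, and only method names (not bodies) are searched here.

```python
from typing import List, Optional, Sequence


def keywords_matched_in_order(keywords: Sequence[str], text: str) -> int:
    """Count query keywords found in `text` in their original order.

    Greedy, case-insensitive scan (an assumed simplification): each keyword
    is searched for after the end of the previous match; a keyword that
    cannot be found there is skipped.
    """
    lowered = text.lower()
    position = 0
    matched = 0
    for keyword in keywords:
        found_at = lowered.find(keyword.lower(), position)
        if found_at != -1:
            matched += 1
            position = found_at + len(keyword)
    return matched


def rank_methods(keywords: Sequence[str], method_names: Sequence[str]) -> List[str]:
    """Return method names matching at least one keyword, best match first."""
    scored = [(keywords_matched_in_order(keywords, name), name) for name in method_names]
    hits = [(score, name) for score, name in scored if score > 0]
    # Higher score first; Python's sort is stable, so ties keep input order.
    hits.sort(key=lambda pair: -pair[0])
    return [name for _, name in hits]


def mean_reciprocal_rank(first_relevant_ranks: Sequence[Optional[int]]) -> float:
    """Mean reciprocal rank over a set of queries.

    Each entry is the 1-based rank of the first relevant result for one
    query, or None when no relevant result was returned (contributes 0).
    """
    if not first_relevant_ranks:
        return 0.0
    total = sum(1.0 / rank for rank in first_relevant_ranks if rank is not None)
    return total / len(first_relevant_ranks)


if __name__ == "__main__":
    # Hypothetical query and candidate method names.
    query = ["read", "file", "line"]
    candidates = ["readFileByLine", "writeFile", "lineReader", "closeStream"]
    print(rank_methods(query, candidates))
    # -> ['readFileByLine', 'writeFile', 'lineReader']

    # Three queries: first relevant hit at rank 1, at rank 2, and not found.
    print(mean_reciprocal_rank([1, 2, None]))  # -> 0.5
```

    In this sketch a method that uses more of the query keywords in order ranks higher, which mirrors the abstract's preference for "the longer sequence of used keywords"; MRR then rewards rankings that place the first relevant method near the top.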

    The End of Institutional Repositories and the Beginning of Social Academic Research Service: An Enhanced Role for Libraries

    As more and more universities establish Institutional Repositories (IR), awareness is developing about the limitations of IRs in enhancing the academic research service. The concept of an IR needs to be expanded to include the integration of the processes that transform intellectual endeavor into a broadening array of academic and research support services which are fundamentally social. These include, but are not limited to – (1) sharing institutionally developed intellectual product (traditional IR) (2) informing others of the availability of this product with defined purpose (3) collecting additional academically relevant materials in digital formats using IRs (4) disseminating timely information about what has been collected to researchers (5) creating an environment that encourages awareness and exchange of information (6) and more…. In brief, information gathering, dissemination, and discussion in the form of library service must become a crucial part of researchers’ networks. An IR cannot and should not be viewed as a stand alone endeavor. It needs to be viewed and used as a research and communication tool in an environment that synergizes all elements of the research process. If an IR does not create discussions between librarians (information specialists) and researchers, its potential is lost both to the academy and the library. The library and its librarians must be interactive with researchers and the institution served. With the advent of digital acquisition that IRs started, a new vision of the role of librarians can be fulfilled. The foundational concepts behind this vision are found in my article: The Library as an Agent of Change: Pushing the Client Institution Forward Information Outlook (Journal of the Special Libraries Association), Vol. 3, No. 8, August 1999, pages 37-40. The above is not theoretical. It is being practiced every day at the Martin P. Catherwood Library of the School of Industrial and Labor Relations (ILR) at Cornell University where I work. By combining the uses of an IR, known as the DigitalCommons@ILR – see http://www.digitalcommons.ilr.cornell.edu, with a discipline-based Internet news service, see -- http://www.ilr.cornell.edu/iws/news-bureau/index.html, supported with outstanding web content, technical support for both print and digital collecting, reference, referral, and teaching, a goal has been realized. The library is seamlessly integrated into the outreach, research and teaching of the institution it serves. The library is part of the social fabric and network of the school.

    Identifying Student Difficulties with Entropy, Heat Engines, and the Carnot Cycle

    We report on several specific student difficulties regarding the Second Law of Thermodynamics in the context of heat engines within upper-division undergraduate thermal physics courses. Data come from ungraded written surveys, graded homework assignments, and videotaped classroom observations of tutorial activities. Written data show that students in these courses do not clearly articulate the connection between the Carnot cycle and the Second Law after lecture instruction. This result is consistent both within and across student populations. Observation data provide evidence for myriad difficulties related to entropy and heat engines, including students' struggles in reasoning about situations that are physically impossible and failures to differentiate between differential and net changes of state properties of a system. Some results herein may be seen as the application of previously documented difficulties in the context of heat engines, but others are novel and emphasize the subtle and complex nature of cyclic processes and heat engines, which are central to the teaching and learning of thermodynamics and its applications. Moreover, the sophistication of these difficulties is indicative of the more advanced thinking required of students at the upper division, whose developing knowledge and understanding give rise to questions and struggles that are inaccessible to novices.

    Academic Libraries in Transition: Current Trends, Future Prospects

    Academic libraries are in transition because of changes in the context of higher education. Changes in the world of information are even more radical: the displacement of paper, the primacy of the search engine, the emergence of the digital lifestyle, and innovative patterns of scholarly communication. Decreasing reliance on local collections is transforming the library as a physical destination. Traditional measures of library success have begun to be replaced. Given the superiority of other information professionals’ data management skills, the role of academic librarians will shift toward the enablement of learning. This environment of upheaval will pose both opportunities and challenges for academic librarians.

    Report on the Information Retrieval Festival (IRFest2017)

    The Information Retrieval Festival took place in April 2017 in Glasgow. The focus of the workshop was to bring together IR researchers from the various Scottish universities and beyond in order to facilitate more awareness, increased interaction and reflection on the status of the field and its future. The program included an industry session, research talks, demos and posters as well as two keynotes. The first keynote was delivered by Prof. Jaana Kekäläinen, who provided a historical, critical reflection on realism in Interactive Information Retrieval experimentation, while the second keynote was delivered by Prof. Maarten de Rijke, who argued for more use of Artificial Intelligence in IR solutions and deployments. The workshop was followed by a "Tour de Scotland" where delegates were taken from Glasgow to Aberdeen for the European Conference on Information Retrieval (ECIR 2017).

    DYNIQX: A novel meta-search engine for the web

    The effect of metadata in collection fusion has not been sufficiently studied. In response to this, we present a novel meta-search engine called Dyniqx for metadata-based search. Dyniqx integrates search results from search services for documents, images, and videos to generate a unified list of ranked search results. Dyniqx exploits the availability of metadata in search services such as PubMed, Google Scholar, Google Image Search, and Google Video Search to fuse search results from heterogeneous search engines. In addition, metadata from these search engines are used to generate dynamic query controls, such as sliders and tick boxes, which users can use to filter search results. Our preliminary user evaluation shows that Dyniqx can help users complete information search tasks more efficiently and successfully than each of three well-known search engines. We also carried out one controlled user evaluation of the integration of six document/image/video based search engines (Google Scholar, PubMed, Intute, Google Image, Yahoo Image, and Google Video) in Dyniqx. We designed a questionnaire for evaluating different aspects of Dyniqx in helping users complete search tasks. Each user used Dyniqx to perform a number of search tasks before completing the questionnaire. Our evaluation results confirm the effectiveness of the meta-search of Dyniqx in assisting user search tasks, and provide insights into better designs of the Dyniqx interface.