
    Managing Keyword Variation with Frequency Based Generation of Word Forms in IR

    Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit. University of Tartu, Tartu, 2007. ISBN 978-9985-4-0513-0 (online) ISBN 978-9985-4-0514-7 (CD-ROM) pp. 318-323

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Irrigation Management Information Network (IMIN) Keyword thesaurus

    Irrigation management / Thesauri / Information services / Networks

    Retrieving relevant parts from large environmental-related documents

    When attempting to consider the environment, a large quantity of information is available. Historically, librarians have provided a facility for both sorting this information into storage and guiding users to the material relevant to their queries. With the steady increase in the volume, detail and character of this information, existing handling methods cannot cope. This thesis addresses this problem by developing a novel information system framework and applying it to the environmental domain. A brief study was made of information retrieval systems. An information system framework was developed through the project. It covers the areas of query augmentation and search execution. In particular, the framework considers the issues of using a domain model to help in specifying queries, and of assessing and retrieving sub-parts of large documents. In order to test the novel concepts, a case study, which covers many steps in the information retrieval process, was designed and carried out with supportive results.

    From Frequency to Meaning: Vector Space Models of Semantics

    Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.