96 research outputs found

    Natural language processing

    Get PDF
    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries ; and (iv) evaluation of NLP systems

    Applying digital content management to support localisation

    Get PDF
    The retrieval and presentation of digital content such as that on the World Wide Web (WWW) is a substantial area of research. While recent years have seen huge expansion in the size of web-based archives that can be searched efficiently by commercial search engines, the presentation of potentially relevant content is still limited to ranked document lists represented by simple text snippets or image keyframe surrogates. There is expanding interest in techniques to personalise the presentation of content to improve the richness and effectiveness of the user experience. One of the most significant challenges to achieving this is the increasingly multilingual nature of this data, and the need to provide suitably localised responses to users based on this content. The Digital Content Management (DCM) track of the Centre for Next Generation Localisation (CNGL) is seeking to develop technologies to support advanced personalised access and presentation of information by combining elements from the existing research areas of Adaptive Hypermedia and Information Retrieval. The combination of these technologies is intended to produce significant improvements in the way users access information. We review key features of these technologies and introduce early ideas for how these technologies can support localisation and localised content before concluding with some impressions of future directions in DCM

    How NLP Can Improve Question Answering

    Get PDF
    Answering open-domain factual questions requires Natural Language processing for refining document selection and answer identification. With our system QALC, we have participated to the Question Answering track of the TREC8, TREC9, and TREC10 evaluations. QALC performs an analysis of documents relying on multi-word term search and their linguistic variation both to minimize the number of documents selected and to provide additional clues when comparing question and sentence representations. This comparison process also makes use of the results of a syntactic parsing of the questions and Named Entity recognition functionalities. Answer extraction relies on the application of syntactic patterns chosen according to the kind of information that is sought for, and categorized depending on the syntactic form of the question. These patterns allow QALC to handle nicely linguistic variations at the answer leve

    Evaluating epistemic uncertainty under incomplete assessments

    Get PDF
    The thesis of this study is to propose an extended methodology for laboratory based Information Retrieval evaluation under incomplete relevance assessments. This new methodology aims to identify potential uncertainty during system comparison that may result from incompleteness. The adoption of this methodology is advantageous, because the detection of epistemic uncertainty - the amount of knowledge (or ignorance) we have about the estimate of a system's performance - during the evaluation process can guide and direct researchers when evaluating new systems over existing and future test collections. Across a series of experiments we demonstrate how this methodology can lead towards a finer grained analysis of systems. In particular, we show through experimentation how the current practice in Information Retrieval evaluation of using a measurement depth larger than the pooling depth increases uncertainty during system comparison

    Einsatz neuronaler Netze als Transferkomponenten beim Retrieval in heterogenen DokumentbestÀnden

    Full text link
    "Die zunehmende weltweite Vernetzung und der Aufbau von digitalen Bibliotheken fĂŒhrt zu neuen Möglichkeiten bei der Suche in mehreren DatenbestĂ€nden. Dabei entsteht das Problem der semantischen HeterogenitĂ€t, da z.B. Begriffe in verschiedenen Kontexten verschiedene Bedeutung haben können. Die dafĂŒr notwendigen Transferkomponenten bilden eine neue Herausforderung, fĂŒr die neuronale Netze gut geeignet sind." (Autorenreferat

    Connexionisme et génétique pour la recherche d'information

    Get PDF
    Ce papier prĂ©sente le champ d'application des techniques issues des rĂ©seaux de neurones et de l'algorithmique gĂ©nĂ©tique au domaine de la recherche d'information. Un intĂ©rĂȘt particulier portera sur les principes de reformulation de requĂȘtes dans un processus de recherche d'informatio

    Approaches to collection selection and results merging for distributed information retrieval

    Get PDF
    • 

    corecore