
    Placenames analysis in historical texts: tools, risks and side effects

    This article presents an approach combining linguistic analysis, geographic information retrieval and visualization in order to go from toponym extraction in historical texts to projection on customizable maps. The toolkit is released under an open source license and features bootstrapping options, geocoding and disambiguation algorithms, as well as cartographic processing. The software setting is designed to be adaptable to various historical contexts: it can be extended by further automatically processed or user-curated gazetteers, used directly on texts, or plugged into a larger processing pipeline. I provide an example of the issues raised by generic extraction and show the benefits of an integrated knowledge-based approach, data cleaning and filtering.

    Learning to Retrieve Videos by Asking Questions

    The majority of traditional text-to-video retrieval systems operate in static environments, i.e., there is no interaction between the user and the agent beyond the initial textual query provided by the user. This can be sub-optimal if the initial query has ambiguities, which would lead to many falsely retrieved videos. To overcome this limitation, we propose a novel framework for Video Retrieval using Dialog (ViReD), which enables the user to interact with an AI agent via multiple rounds of dialog, where the user refines retrieved results by answering questions generated by the AI agent. Our novel multimodal question generator learns to ask questions that maximize the subsequent video retrieval performance using (i) the video candidates retrieved during the last round of interaction with the user and (ii) the text-based dialog history documenting all previous interactions, to generate questions that incorporate both visual and linguistic cues relevant to video retrieval. Furthermore, to generate maximally informative questions, we propose Information-Guided Supervision (IGS), which guides the question generator to ask questions that would boost subsequent video retrieval accuracy. We validate the effectiveness of our interactive ViReD framework on the AVSD dataset, showing that our interactive method performs significantly better than traditional non-interactive video retrieval systems. We also demonstrate that our proposed approach generalizes to real-world settings that involve interactions with real humans, thus demonstrating the robustness and generality of our framework.

    A Word Sense-Oriented User Interface for Interactive Multilingual Text Retrieval

    In this paper we present an interface for supporting a user in an interactive cross-language search process using semantic classes. In order to enable users to access multilingual information, different problems have to be solved: disambiguating and translating the query words, as well as categorizing and presenting the results appropriately. Therefore, we first give a brief introduction to word sense disambiguation, cross-language text retrieval and document categorization, and then describe recent achievements of our research towards an interactive multilingual retrieval system. We focus especially on the problem of browsing and navigating the different word senses in one source language and possibly several target languages. In the last part of the paper, we discuss the developed user interface and its functionalities in more detail.

    Language-based multimedia information retrieval

    This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE was building on subtitles or captions as the prime language key for disclosing video fragments, OLIVE is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.

    Wykorzystanie terminologii w systemie informacyjno-wyszukiwawczym językoznawstwa slawistycznego iSybislaw [The use of terminology in the iSybislaw information retrieval system for Slavic linguistics]

    The paper describes the usage of linguistic terminology in the online iSybislaw bibliographic database, which represents the information retrieval system of the Slavic linguistics bibliography (www.isybislaw.ispan.waw.pl). The concept of the system was presented at the International Congress of Slavists in Krakow in 1998. iSybislaw consists of works in the field of Slavic linguistics, contrastive studies, Slavic – non-Slavic contrastive studies, and to a lesser degree, general and theoretical linguistics. The linguistic terminology reflected in the keyword language is the core of one of the information retrieval tools (IR tools) in the iSybislaw system; its primary function is information retrieval. The iSybislaw system is multilingual, and it provides information in all Slavic languages as well as in English. The collected documents may be written in any natural language. Due to the multilinguality of the system, methodological problems such as synonymy and polysemy need to be resolved. The constructors of the system use an original methodological approach to create classes of equivalent terms that are relevant for current Slavic linguistics. The classes of terms in one language are combined with those in another and are joined in a multilingual class of equivalence. The use of linguistic terminology as keywords in the iSybislaw system makes information retrieval efficient for cross-lingual searches and multilingual users. The paper focuses on the theoretical and practical usage of linguistic terminology in the iSybislaw system.
    [Polish abstract, translated:] The article discusses the use of linguistic terminology in the iSybislaw database, which represents a modern information retrieval system for Slavic linguistics. In the language of this system, linguistic terms serve a special function: that of keywords. The authors pay particular attention to solving problems related to the multilinguality of the information collection and to the synonymy and polysemy of terms. They also point out the benefits to users of using linguistic terminology in the iSybislaw system.
    [Series, translated from Serbian:] Scientific Meetings / Serbian Academy of Sciences and Arts; vol. 157. Department of Language and Literature; vol. 2

    Beyond English text: Multilingual and multimedia information retrieval.
