78,289 research outputs found
Placenames analysis in historical texts: tools, risks and side effects
This article presents an approach combining linguistic analysis, geographic information retrieval and visualization in order to go from toponym extraction in historical texts to projection on customizable maps. The toolkit is released under an open source license and features bootstrapping options, geocoding and disambiguation algorithms, as well as cartographic processing. The software setting is designed to be adaptable to various historical contexts: it can be extended by further automatically processed or user-curated gazetteers, used directly on texts, or plugged into a larger processing pipeline. I provide an example of the issues raised by generic extraction and show the benefits of an integrated knowledge-based approach, data cleaning and filtering.
Learning to Retrieve Videos by Asking Questions
The majority of traditional text-to-video retrieval systems operate in static
environments, i.e., there is no interaction between the user and the agent
beyond the initial textual query provided by the user. This can be sub-optimal
if the initial query has ambiguities, which would lead to many falsely
retrieved videos. To overcome this limitation, we propose a novel framework for
Video Retrieval using Dialog (ViReD), which enables the user to interact with
an AI agent via multiple rounds of dialog, where the user refines retrieved
results by answering questions generated by an AI agent. Our novel multimodal
question generator learns to ask questions that maximize the subsequent video
retrieval performance using (i) the video candidates retrieved during the last
round of interaction with the user and (ii) the text-based dialog history
documenting all previous interactions, to generate questions that incorporate
both visual and linguistic cues relevant to video retrieval. Furthermore, to
generate maximally informative questions, we propose an Information-Guided
Supervision (IGS), which guides the question generator to ask questions that
would boost subsequent video retrieval accuracy. We validate the effectiveness
of our interactive ViReD framework on the AVSD dataset, showing that our
interactive method performs significantly better than traditional
non-interactive video retrieval systems. We also demonstrate that our proposed
approach generalizes to real-world settings that involve interactions with
real humans, thus demonstrating the robustness and generality of our framework.
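The interaction loop described in this abstract (retrieve, ask a question, fold the answer into the next retrieval round) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the bag-of-words scoring, the "split the candidates evenly" question heuristic, and the names `retrieve`, `pick_question`, `dialog_retrieval` and `answer_fn` are hypothetical stand-ins, not the learned multimodal question generator or the Information-Guided Supervision of ViReD.

```python
from collections import Counter


def score(positive, negative, caption_terms):
    """Toy relevance: confirmed terms present minus rejected terms present."""
    return len(positive & caption_terms) - len(negative & caption_terms)


def retrieve(positive, negative, videos, k=3):
    """Return the ids of the k best-scoring videos (ties broken by id)."""
    ranked = sorted(videos, key=lambda v: (-score(positive, negative, videos[v]), v))
    return ranked[:k]


def pick_question(candidates, videos, asked):
    """Hypothetical heuristic: ask about the term that splits the current
    candidates most evenly. This stands in for a learned question generator."""
    counts = Counter(term for v in candidates for term in videos[v])
    options = [term for term in counts if term not in asked]
    if not options:
        return None
    half = len(candidates) / 2
    return min(options, key=lambda term: (abs(counts[term] - half), term))


def dialog_retrieval(initial_query, videos, answer_fn, rounds=3, k=3):
    """Refine retrieval over several rounds of yes/no questions."""
    positive, negative = set(initial_query), set()
    for _ in range(rounds):
        candidates = retrieve(positive, negative, videos, k)
        term = pick_question(candidates, videos, positive | negative)
        if term is None:
            break
        # The dialog history is kept as confirmed and rejected terms.
        if answer_fn(term):
            positive.add(term)
        else:
            negative.add(term)
    return retrieve(positive, negative, videos, k)


# Illustrative data only: video ids mapped to caption terms.
videos = {
    "v1": {"dog", "park", "ball"},
    "v2": {"dog", "beach", "ball"},
    "v3": {"dog", "kitchen", "bowl"},
}
# Simulated user who is looking for "v2" and answers truthfully.
result = dialog_retrieval({"dog"}, videos, lambda term: term in videos["v2"])
```

In this toy setting the ambiguous initial query "dog" matches every video equally, and the answers collected over the rounds are what separate the target from the rest.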
A Word Sense-Oriented User Interface for Interactive Multilingual Text Retrieval
In this paper we present an interface for supporting a user in an interactive cross-language search process using semantic classes. To enable users to access multilingual information, several problems have to be solved: disambiguating and translating the query words, as well as categorizing and presenting the results appropriately. We therefore first give a brief introduction to word sense disambiguation, cross-language text retrieval and document categorization, and then describe recent achievements of our research towards an interactive multilingual retrieval system. We focus especially on the problem of browsing and navigating the different word senses in one source language and possibly several target languages. In the last part of the paper, we discuss the developed user interface and its functionalities in more detail.
Language-based multimedia information retrieval
This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE built on subtitles or captions as the prime language key for disclosing video fragments, OLIVE makes use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.
Wykorzystanie terminologii w systemie informacyjno-wyszukiwawczym językoznawstwa slawistycznego iSybislaw (The use of terminology in iSybislaw, the information retrieval system for Slavic linguistics)
The paper describes the use of linguistic terminology in the online iSybislaw bibliographic
database, the information retrieval system of the Slavic linguistics bibliography
(www.isybislaw.ispan.waw.pl). The concept of the system was presented at the International Congress
of Slavists in Krakow in 1998. iSybislaw covers works in the field of Slavic linguistics,
contrastive studies, Slavic – non-Slavic contrastive studies, and, to a lesser degree, general and theoretical
linguistics. The linguistic terminology reflected in the keyword language is the core of one
of the information retrieval tools (IR tools) in the iSybislaw system; its primary function is
information retrieval. The iSybislaw system is multilingual and provides information in all
Slavic languages as well as in English. The collected documents themselves may be in any natural
language. Because the system is multilingual, methodological problems such as synonymy and polysemy need to be resolved. The designers of the system use an original methodological
approach to create classes of equivalent terms that are relevant for current Slavic linguistics. The
classes of terms in one language are combined with those in another and are joined in a multilingual
class of equivalence. Using linguistic terminology as keywords in the iSybislaw
system makes information retrieval efficient for cross-lingual searches and multilingual
users. The paper focuses on the theoretical and practical use of linguistic terminology in the iSybislaw
system.
The article discusses the use of linguistic terminology in the iSybislaw database,
which represents a modern information retrieval system for Slavic linguistics.
In the language of this system, linguistic terms serve a special function:
that of keywords. The authors pay particular attention
to solving problems related to the multilinguality of the information collection
and to the synonymy and polysemy of terms. They also point out the benefits for users
of using linguistic terminology in the iSybislaw system.
Scientific Conferences / Serbian Academy of Sciences and Arts; vol. 157. Department of Language and Literature; vol. 2
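The multilingual classes of equivalent terms described in the iSybislaw abstract can be pictured with a small sketch. It is only an assumption about how such keyword classes might be represented: the `TermIndex` class, its methods, the class identifier and the sample terms are hypothetical and are not taken from the iSybislaw system. Polysemy, where one term belongs to several classes, is deliberately left out to keep the sketch short.

```python
from collections import defaultdict


class TermIndex:
    """Hypothetical store of multilingual equivalence classes of keywords."""

    def __init__(self):
        self._class_of = {}               # (lang, term) -> class id
        self._members = defaultdict(set)  # class id -> {(lang, term), ...}

    def add_class(self, class_id, terms):
        """Register synonyms and cross-language equivalents under one class."""
        for lang, term in terms:
            key = (lang, term.lower())
            self._class_of[key] = class_id
            self._members[class_id].add(key)

    def expand(self, lang, term):
        """Return every equivalent keyword for a query term.

        An unknown term is returned unchanged so a search can still run."""
        key = (lang, term.lower())
        class_id = self._class_of.get(key)
        if class_id is None:
            return {key}
        return set(self._members[class_id])


# Illustrative data only, not actual iSybislaw keywords.
index = TermIndex()
index.add_class("ASPECT", [
    ("en", "aspect"),
    ("en", "verbal aspect"),  # synonym within one language
    ("pl", "aspekt"),
    ("ru", "вид"),
])
expanded = index.expand("pl", "Aspekt")
```

A query entered in one language is expanded to the whole class, so documents indexed with the equivalent keyword in another language can be matched as well.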