Domain-sensitive Temporal Tagging for Event-centric Information Retrieval
Temporal and geographic information is of major importance in virtually all contexts. Thus, it also occurs frequently in many types of text documents in the form of temporal and geographic expressions. Often, those are used to refer to something that was, is, or will be happening at some specific time and some specific place – in other words, temporal and geographic expressions are often used to refer to events. However, so far, event-related information needs are not well served by standard information retrieval approaches, which motivates the topic of this thesis: event-centric information retrieval.
An important characteristic of temporal and geographic expressions – and thus of two components of events – is that they can be normalized so that their meaning is unambiguous and can be placed on a timeline or pinpointed on a map. In many research areas in which natural language processing is involved, e.g., in information retrieval, document summarization, and question answering, applications can highly benefit from having access to normalized information instead of only the words as they occur in documents.
In this thesis, we present several frameworks for searching and exploring document collections with respect to occurring temporal, geographic, and event information. While we rely on an existing tool for extracting and normalizing geographic expressions, we study the task of temporal tagging, i.e., the extraction and normalization of temporal expressions. A crucial issue is that so far most research on temporal tagging dealt with English news-style documents. However, temporal expressions have to be handled in different ways depending on the domain of the documents from which they are extracted. Since we do not want to limit our research to one domain and one language, we develop the multilingual, cross-domain temporal tagger HeidelTime. It is the only publicly available temporal tagger for several languages and easy to extend to further languages. In addition, it achieves state-of-the-art evaluation results for all addressed domains and languages, and lays the foundations for all further contributions developed in this thesis.
To achieve our goal of exploiting temporal and geographic expressions for event-centric information retrieval from a variety of text documents, we introduce the concept of spatio-temporal events and several concepts to "compute" with temporal, geographic, and event information. These concepts are used to develop a spatio-temporal ranking approach, which considers not only textual, temporal, and geographic query parts but also two different types of proximity information. Furthermore, we adapt the spatio-temporal search idea by presenting a framework to directly search for events. Additionally, several map-based exploration frameworks are introduced that allow a new way of exploring event information latently contained in huge document collections. Finally, an event-centric document similarity model is developed that calculates document similarity on multilingual corpora solely based on extracted and normalized event information.
Exploring mobile trajectories: An investigation of individual spatial behaviour and geographic filters for information retrieval - Volume 2
This section presents both quantitative results, related to the performance of prediction surfaces according to the evaluation criteria described in section 3.5, and qualitative results, based upon testing geographic filters for information retrieval in an outdoor mobile computing environment as part of a major user evaluation study in the Swiss National Park, in the Summer of 2004.
The chapter begins with quantitative evaluation, where three test scenarios are described (walking, driving and daily migration), followed by a description of the systematic variation in the temporal component of prediction, designed to compare short-term (10 minutes) and long-term (60 minutes) predictions into the future. Next, three geographic filters described in the methodology are introduced as prediction surfaces. The three filters are spatial proximity, temporal proximity and speed-heading predictions. For each approach, the prediction input parameters were varied in a systematic way to uncover the impact of buffer size and distance decay functions (for spatial proximity prediction surfaces), the ‘recent behaviour period’, temporal weighting and decay function (for speed-heading prediction surfaces) and the time budget and enclosing function (for temporal proximity prediction surfaces). Next, the results are presented and analysed to describe, explain and contrast the characteristics of each prediction approach. This is followed by a brief assessment of the suitability of each approach to the scenarios in which it was tested.
Overall, predictions based upon temporal proximity are found to be more effective than speed-heading predictions, which in turn are more effective than predictions based upon spatial proximity.
Finally, the chapter concludes with the results of the user evaluation study, which suggests that users of mobile information retrieval tools found the implemented “search ahead” filter useful, are receptive to the idea of other geographic filters, and benefit from the use of personalised geographic information with a spatial and temporal component.
Exploring mobile trajectories: An investigation of individual spatial behaviour and geographic filters for information retrieval - Volume 1
This section presents both quantitative results, related to the performance of prediction surfaces according to the evaluation criteria described in section 3.5, and qualitative results, based upon testing geographic filters for information retrieval in an outdoor mobile computing environment as part of a major user evaluation study in the Swiss National Park, in the Summer of 2004.
The chapter begins with quantitative evaluation, where three test scenarios are described (walking, driving and daily migration), followed by a description of the systematic variation in the temporal component of prediction, designed to compare short-term (10 minutes) and long-term (60 minutes) predictions into the future. Next, three geographic filters described in the methodology are introduced as prediction surfaces. The three filters are spatial proximity, temporal proximity and speed-heading predictions. For each approach, the prediction input parameters were varied in a systematic way to uncover the impact of buffer size and distance decay functions (for spatial proximity prediction surfaces), the ‘recent behaviour period’, temporal weighting and decay function (for speed-heading prediction surfaces) and the time budget and enclosing function (for temporal proximity prediction surfaces). Next, the results are presented and analysed to describe, explain and contrast the characteristics of each prediction approach. This is followed by a brief assessment of the suitability of each approach to the scenarios in which it was tested.
Overall, predictions based upon temporal proximity are found to be more effective than speed-heading predictions, which in turn are more effective than predictions based upon spatial proximity.
Finally, the chapter concludes with the results of the user evaluation study, which suggests that users of mobile information retrieval tools found the implemented “search ahead” filter useful, are receptive to the idea of other geographic filters, and benefit from the use of personalised geographic information with a spatial and temporal component.
The DIGMAP geo-temporal web gazetteer service
This paper presents the DIGMAP geo-temporal Web gazetteer service, a system providing access to names of places, historical periods, and associated geo-temporal information. Within the DIGMAP project, this gazetteer serves as the unified repository of geographic and temporal information, assisting in the recognition and disambiguation of geo-temporal expressions in text, as well as in resource searching and indexing. We describe the data integration methodology, the handling of temporal information and some of the applications that use the gazetteer. Initial evaluation results show that the proposed system can adequately support several tasks related to geo-temporal information extraction and retrieval.
A geo-temporal information extraction service for processing descriptive metadata in digital libraries
In the context of digital map libraries, resources are usually described by metadata records that define the relevant subject, location, time-span, format and keywords. With regard to locations and time-spans, metadata records are often incomplete, or they provide information in a way that is not machine-understandable (e.g. textual descriptions). This paper presents techniques for extracting geotemporal information from text, using relatively simple text mining methods that leverage a Web gazetteer service. The idea is to go from human-made geotemporal referencing (i.e. using place and period names in textual expressions) to geo-spatial coordinates and time-spans. A prototype system, implementing the proposed methods, is described in detail. Experimental results demonstrate the efficiency and accuracy of the proposed approaches.
Exploring scholarly data with Rexplore.
Despite the large number and variety of tools and services available today for exploring scholarly data, current support is still very limited in the context of sensemaking tasks, which go beyond standard search and ranking of authors and publications, and focus instead on i) understanding the dynamics of research areas, ii) relating authors ‘semantically’ (e.g., in terms of common interests or shared academic trajectories), or iii) performing fine-grained academic expert search along multiple dimensions. To address this gap we have developed a novel tool, Rexplore, which integrates statistical analysis, semantic technologies, and visual analytics to provide effective support for exploring and making sense of scholarly data. Here, we describe the main innovative elements of the tool and we present the results from a task-centric empirical evaluation, which shows that Rexplore is highly effective at providing support for the aforementioned sensemaking tasks. In addition, these results are robust both with respect to the background of the users (i.e., expert analysts vs. ‘ordinary’ users) and also with respect to whether the tasks are selected by the evaluators or proposed by the users themselves.