3,526 research outputs found

    Domain-sensitive Temporal Tagging for Event-centric Information Retrieval

    Get PDF
    Temporal and geographic information is of major importance in virtually all contexts. Thus, it also occurs frequently in many types of text documents in the form of temporal and geographic expressions. Often, those are used to refer to something that was, is, or will be happening at some specific time and some specific place – in other words, temporal and geographic expressions are often used to refer to events. However, so far, event-related information needs are not well served by standard information retrieval approaches, which motivates the topic of this thesis: event-centric information retrieval. An important characteristic of temporal and geographic expressions – and thus of two components of events – is that they can be normalized so that their meaning is unambiguous and can be placed on a timeline or pinpointed on a map. In many research areas in which natural language processing is involved, e.g., in information retrieval, document summarization, and question answering, applications can highly benefit from having access to normalized information instead of only the words as they occur in documents. In this thesis, we present several frameworks for searching and exploring document collections with respect to occurring temporal, geographic, and event information. While we rely on an existing tool for extracting and normalizing geographic expressions, we study the task of temporal tagging, i.e., the extraction and normalization of temporal expressions. A crucial issue is that so far most research on temporal tagging dealt with English news-style documents. However, temporal expressions have to be handled in different ways depending on the domain of the documents from which they are extracted. Since we do not want to limit our research to one domain and one language, we develop the multilingual, cross-domain temporal tagger HeidelTime. It is the only publicly available temporal tagger for several languages and easy to extend to further languages. In addition, it achieves state-of-the-art evaluation results for all addressed domains and languages, and lays the foundations for all further contributions developed in this thesis. To achieve our goal of exploiting temporal and geographic expressions for event-centric information retrieval from a variety of text documents, we introduce the concept of spatio-temporal events and several concepts to "compute" with temporal, geographic, and event information. These concepts are used to develop a spatio-temporal ranking approach, which does not only consider textual, temporal, and geographic query parts but also two different types of proximity information. Furthermore, we adapt the spatio-temporal search idea by presenting a framework to directly search for events. Additionally, several map-based exploration frameworks are introduced that allow a new way of exploring event information latently contained in huge document collections. Finally, an event-centric document similarity model is developed that calculates document similarity on multilingual corpora solely based on extracted and normalized event information

    Time and information retrieval: Introduction to the special issue

    Get PDF
    The Special Issue of Information Processing and Management includes research papers on the intersection between time and information retrieval. In 'Evaluating Document Filtering Systems over Time', Tom Kenter and Krisztian Balog propose a time-aware way of measuring a system's performance at filtering documents. Manika Kar, SeAa7acute;rgio Nunes and Cristina Ribeiro present interesting methods for summarizing changes in dynamic text collections over time in their paper 'Summarization of Changes in Dynamic Text Collection using Latent Dirichlet Allocation Model.' Hideo Joho, Adam Jatowt and Roi Blanco report on the temporal information searching behaviour of users and their strategies for dealing with searches that have a temporal nature in 'Temporal Information Searching Behaviour and Strategies', a user study. In controlled settings, thirty participants are asked to perform searches on an array of topics on the web to find information related to particular time scopes. Adam Jatowt, Ching-man Au Yeung and Katsumi Tanaka present a 'Generic Method for Detecting Content Time of Documents'. The authors propose several methods for estimating the focus time of documents, i.e. the time a document's content refers to. Xujian Zhao, Peiquan Jin and Lihua Yue present an approach to determining the time of the underlying topic or event in their article entitled 'Discovering Topic Time from Web News'

    A Web GIS-based Integration of 3D Digital Models with Linked Open Data for Cultural Heritage Exploration

    Get PDF
    This PhD project explores how geospatial semantic web concepts, 3D web-based visualisation, digital interactive map, and cloud computing concepts could be integrated to enhance digital cultural heritage exploration; to offer long-term archiving and dissemination of 3D digital cultural heritage models; to better interlink heterogeneous and sparse cultural heritage data. The research findings were disseminated via four peer-reviewed journal articles and a conference article presented at GISTAM 2020 conference (which received the ‘Best Student Paper Award’)
    • …
    corecore