70 research outputs found

    Learning Dynamic Classes of Events using Stacked Multilayer Perceptron Networks

    People often use a web search engine to find information about events of interest, for example, sport competitions, political elections, festivals and entertainment news. In this paper, we study the problem of detecting event-related queries, which is the first step before selecting a suitable time-aware retrieval model. In general, event-related information needs can be observed in query streams through various temporal patterns of user search behavior, e.g., spiky peaks for popular events, and periodicities for repetitive events. However, it is also common that users search for non-popular events, which may not exhibit temporal variations in query streams, e.g., past events that occurred recently, historical events triggered by anniversaries or similar events, and future events anticipated to happen. To address the challenge of detecting dynamic classes of events, we propose a novel deep learning model to classify a given query into a predetermined set of multiple event types. Our proposed model, a Stacked Multilayer Perceptron (S-MLP) network, uses a multilayer perceptron as its basic learning unit. We assemble stacked units to further learn complex relationships between neurons in successive layers. To evaluate our proposed model, we conduct experiments using real-world queries and a set of manually created ground truth. Preliminary results have shown that our proposed deep learning model outperforms the state-of-the-art classification models significantly. Comment: Neu-IR '16 SIGIR Workshop on Neural Information Retrieval, 6 pages, 4 figures

    Multiple Models for Recommending Temporal Aspects of Entities

    Entity aspect recommendation is an emerging task in semantic search that helps users discover serendipitous and prominent information with respect to an entity, of which salience (e.g., popularity) is the most important factor in previous work. However, entity aspects are temporally dynamic and often driven by events happening over time. For such cases, aspect suggestion based solely on salience features can give unsatisfactory results, for two reasons. First, salience is often accumulated over a long time period and does not account for recency. Second, many aspects related to an event entity are strongly time-dependent. In this paper, we study the task of temporal aspect recommendation for a given entity, which aims at recommending the most relevant aspects and takes into account time in order to improve search experience. We propose a novel event-centric ensemble ranking method that learns from multiple time- and type-dependent models and dynamically trades off salience and recency characteristics. Through extensive experiments on real-world query logs, we demonstrate that our method is robust and achieves better effectiveness than competitive baselines. Comment: In proceedings of the 15th Extended Semantic Web Conference (ESWC 2018)

    Location Inference for Non-geotagged Tweets in User Timelines


    How to Search the Internet Archive Without Indexing It

    Significant parts of our cultural heritage have been produced on the web during the last decades. While easy accessibility to the current web is a good baseline, optimal access to the past web faces several challenges. These include dealing with large-scale web archive collections and a lack of usage logs, which contain the implicit human feedback most relevant for today's web search. In this paper, we propose an entity-oriented search system to support retrieval and analytics on the Internet Archive. We use Bing to retrieve a ranked list of results from the current web. In addition, we link retrieved results to the WayBack Machine, thus allowing keyword search on the Internet Archive without processing and indexing its raw archived content. Our search system complements existing web archive search tools through a user-friendly interface, which comes close to the functionalities of modern web search engines (e.g., keyword search, query auto-completion and related query suggestion), and provides the great benefit of taking user feedback on the current web into account for web archive search as well. Through extensive experiments, we conduct quantitative and qualitative analyses in order to provide insights that enable further research on and practical applications of web archives.
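
A minimal sketch of the linking step described in this abstract, assuming the live-web results arrive as a list of dictionaries with a `url` key. That result format and the function names `wayback_url` and `link_results` are hypothetical and not taken from the paper. The code relies only on the public Wayback Machine address pattern `https://web.archive.org/web/<timestamp>/<url>`, where the archive serves the capture closest to the given timestamp.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url: str, timestamp: str) -> str:
    """Build a Wayback Machine link for a live-web URL.

    `timestamp` is 1 to 14 ASCII digits read as a prefix of YYYYMMDDhhmmss.
    """
    if not (timestamp.isascii() and timestamp.isdigit() and len(timestamp) <= 14):
        raise ValueError("timestamp must be 1-14 digits (YYYYMMDDhhmmss prefix)")
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


def link_results(results: list[dict], timestamp: str) -> list[dict]:
    """Attach an archive link to each result while keeping the original ranking.

    `results` stands in for whatever a live-web search API returns (hypothetical).
    """
    return [{**r, "archive_url": wayback_url(r["url"], timestamp)} for r in results]


if __name__ == "__main__":
    ranked = [{"title": "Example", "url": "http://example.com/"}]
    for hit in link_results(ranked, "20160101"):
        print(hit["archive_url"])
```

Because the ranking comes from the live-web engine and only the link target changes, nothing in the archive itself has to be processed or indexed, which is the point the abstract makes.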

    Diachronic Variation of Temporal Expressions in Scientific Writing Through the Lens of Relative Entropy

    The abundance of temporal information in documents has led to an increased interest in processing such information in the NLP community by considering temporal expressions. Besides domain adaptation, acquiring knowledge on how temporal expressions vary over time is relevant for improving automatic processing. So far, frequency-based accounts dominate in the investigation of specific temporal expressions. We present an approach to investigate diachronic changes of temporal expressions based on relative entropy – with the advantage of using conditioned probabilities rather than mere frequency. While we focus on scientific writing, our approach is generalizable to other domains and interesting not only in the field of NLP, but also in the humanities. This work is partially funded by Deutsche Forschungsgemeinschaft (DFG) under grant SFB 1102: Information Density and Linguistic Encoding (www.sfb1102.uni-saarland.de).
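
As an illustration of the measure named in this abstract, the sketch below computes relative entropy (Kullback-Leibler divergence) between the distributions of temporal expressions in two time periods. The counts, the additive smoothing constant `alpha`, and the function names are assumptions made for the example. The paper's actual conditioning and estimation choices are not given in the abstract.

```python
import math


def distribution(counts: dict, vocab: set, alpha: float = 0.5) -> dict:
    """Turn raw counts into smoothed probabilities over a shared vocabulary."""
    total = sum(counts.get(w, 0) for w in vocab) + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / total for w in vocab}


def relative_entropy(p_counts: dict, q_counts: dict, alpha: float = 0.5):
    """Return D(P || Q) in bits plus each item's pointwise contribution."""
    vocab = set(p_counts) | set(q_counts)
    p = distribution(p_counts, vocab, alpha)
    q = distribution(q_counts, vocab, alpha)
    # Smoothing keeps q[w] > 0, so the logarithm is always defined.
    contributions = {w: p[w] * math.log2(p[w] / q[w]) for w in vocab}
    return sum(contributions.values()), contributions


if __name__ == "__main__":
    # Hypothetical counts of temporal expressions in two periods.
    period_a = {"last year": 12, "in 1850": 30, "recently": 5}
    period_b = {"last year": 25, "in 1850": 4, "recently": 20}
    divergence, parts = relative_entropy(period_a, period_b)
    print(f"D(A || B) = {divergence:.3f} bits")
    for expr, value in sorted(parts.items(), key=lambda kv: -kv[1]):
        print(f"{expr!r}: {value:+.3f}")
```

The pointwise contributions show which expressions are most distinctive of one period relative to the other, which a raw frequency comparison does not provide directly.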

    Time and information retrieval: Introduction to the special issue

    The Special Issue of Information Processing and Management includes research papers on the intersection between time and information retrieval. In 'Evaluating Document Filtering Systems over Time', Tom Kenter and Krisztian Balog propose a time-aware way of measuring a system's performance at filtering documents. Manika Kar, Sérgio Nunes and Cristina Ribeiro present interesting methods for summarizing changes in dynamic text collections over time in their paper 'Summarization of Changes in Dynamic Text Collection using Latent Dirichlet Allocation Model.' Hideo Joho, Adam Jatowt and Roi Blanco report on the temporal information searching behaviour of users and their strategies for dealing with searches that have a temporal nature in 'Temporal Information Searching Behaviour and Strategies', a user study. In controlled settings, thirty participants are asked to perform searches on an array of topics on the web to find information related to particular time scopes. Adam Jatowt, Ching-man Au Yeung and Katsumi Tanaka present a 'Generic Method for Detecting Content Time of Documents'. The authors propose several methods for estimating the focus time of documents, i.e. the time a document's content refers to. Xujian Zhao, Peiquan Jin and Lihua Yue present an approach to determining the time of the underlying topic or event in their article entitled 'Discovering Topic Time from Web News'.

    Real-time timeline summarisation for high-impact events in Twitter.

    Twitter has become a valuable source of event-related information, namely, breaking news and local event reports. Due to its capability of transmitting information in real-time, Twitter is further exploited for timeline summarisation of high-impact events, such as protests, accidents, natural disasters or disease outbreaks. Such summaries can serve as important event digests where users urgently need information, especially if they are directly affected by the events. In this paper, we study the problem of timeline summarisation of high-impact events that need to be generated in real-time. Our proposed approach includes four stages: classification of tweets reporting real-world events, online incremental clustering, postprocessing and sub-event summarisation. We conduct a comprehensive evaluation of different stages on the “Ebola outbreak” tweet stream, and compare our approach with several baselines, to demonstrate its effectiveness. Our approach can be applied as a replacement for a manually generated timeline and provides early alarms for disaster surveillance.
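
The online incremental clustering stage named in this abstract can be illustrated with a generic single-pass scheme. Each incoming tweet joins the most similar existing cluster if its cosine similarity to that cluster's centroid reaches a threshold, and otherwise starts a new cluster. The bag-of-words representation, the threshold value, and the class name `IncrementalClusterer` are assumptions for this sketch and do not reproduce the paper's algorithm.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IncrementalClusterer:
    """Single-pass clustering: each item is seen once, in stream order."""

    def __init__(self, threshold: float = 0.3):
        self.threshold = threshold  # hypothetical value, would need tuning
        self.centroids = []         # summed term counts per cluster
        self.members = []           # original texts per cluster

    def add(self, text: str) -> int:
        """Assign `text` to a cluster and return that cluster's index."""
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            self.centroids[best].update(vec)  # fold the tweet into the centroid
            self.members[best].append(text)
            return best
        self.centroids.append(vec)
        self.members.append([text])
        return len(self.centroids) - 1


if __name__ == "__main__":
    clusterer = IncrementalClusterer()
    stream = [
        "ebola outbreak reported in guinea",
        "new ebola outbreak in guinea confirmed",
        "football match tonight",
    ]
    for tweet in stream:
        print(clusterer.add(tweet), tweet)
```

Processing each tweet once, without revisiting earlier assignments, is what makes a scheme of this kind usable on a live stream.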