70 research outputs found

Learning Dynamic Classes of Events using Stacked Multilayer Perceptron Networks

Author: Kanhabua Nattiya
Moeslund Thomas B.
Ren Huamin
Publication date: 01/01/2016

People often use a web search engine to find information about events of interest, for example, sport competitions, political elections, festivals and entertainment news. In this paper, we study the problem of detecting event-related queries, which is the first step before selecting a suitable time-aware retrieval model. In general, event-related information needs can be observed in query streams through various temporal patterns of user search behavior, e.g., spiky peaks for popular events and periodicities for repetitive events. However, it is also common for users to search for non-popular events, which may not exhibit temporal variations in query streams, e.g., past events that occurred recently, historical events triggered by anniversaries or similar events, and future events anticipated to happen. To address the challenge of detecting dynamic classes of events, we propose a novel deep learning model to classify a given query into a predetermined set of multiple event types. Our proposed model, a Stacked Multilayer Perceptron (S-MLP) network, uses a multilayer perceptron as its basic learning unit. We assemble stacked units to further learn complex relationships between neurons in successive layers. To evaluate our proposed model, we conduct experiments using real-world queries and a set of manually created ground truth. Preliminary results show that our proposed deep learning model significantly outperforms state-of-the-art classification models.

Comment: Neu-IR '16 SIGIR Workshop on Neural Information Retrieval, 6 pages, 4 figures

arXiv.org e-Print Archive

VBN

Multiple Models for Recommending Temporal Aspects of Entities

Author: Kanhabua Nattiya
Nejdl Wolfgang
Nguyen Tu Ngoc
Publication date: 03/06/2018

Entity aspect recommendation is an emerging task in semantic search that helps users discover serendipitous and prominent information with respect to an entity. In previous work, salience (e.g., popularity) has been the most important factor. However, entity aspects are temporally dynamic and often driven by events happening over time. For such cases, aspect suggestion based solely on salience features can give unsatisfactory results, for two reasons. First, salience is often accumulated over a long time period and does not account for recency. Second, many aspects related to an event entity are strongly time-dependent. In this paper, we study the task of temporal aspect recommendation for a given entity, which aims at recommending the most relevant aspects and takes time into account in order to improve the search experience. We propose a novel event-centric ensemble ranking method that learns from multiple time- and type-dependent models and dynamically trades off salience and recency characteristics. Through extensive experiments on real-world query logs, we demonstrate that our method is robust and achieves better effectiveness than competitive baselines.

Comment: In proceedings of the 15th Extended Semantic Web Conference (ESWC 2018)

arXiv.org e-Print Archive

Location Inference for Non-geotagged Tweets in User Timelines

Author: Kanhabua Nattiya
Li Pengfei
Lu Hua
Pan Gang
Zhao Sha
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

VBN

How to Search the Internet Archive Without Indexing It

Author: Kanhabua Nattiya
Kemkes Philipp
Nejdl Wolfgang
Nguyen Tu Ngoc
Reis Felipe
Tran Nam Khanh
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Significant parts of our cultural heritage have been produced on the web over the last decades. While easy accessibility to the current web is a good baseline, optimal access to the past web faces several challenges. These include dealing with large-scale web archive collections and the lack of usage logs containing the implicit human feedback that is most relevant for today's web search. In this paper, we propose an entity-oriented search system to support retrieval and analytics on the Internet Archive. We use Bing to retrieve a ranked list of results from the current web. In addition, we link the retrieved results to the Wayback Machine, thus allowing keyword search on the Internet Archive without processing and indexing its raw archived content. Our search system complements existing web archive search tools through a user-friendly interface, which comes close to the functionalities of modern web search engines (e.g., keyword search, query auto-completion and related query suggestion), and provides the benefit of taking user feedback on the current web into account for web archive search as well. Through extensive experiments, we conduct quantitative and qualitative analyses in order to provide insights that enable further research on and practical applications of web archives.
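
The linking step in this abstract can be illustrated with a minimal sketch. It assumes the public Wayback Machine URL pattern (`https://web.archive.org/web/<timestamp>/<url>`, with a `YYYYMMDDhhmmss` timestamp); the helper name `wayback_link` and the example URL are hypothetical and are not taken from the paper, which does not describe its implementation here.

```python
from datetime import datetime

# Assumed public URL prefix of the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_link(url, when):
    """Build a Wayback Machine link for `url` at (or near) the moment `when`.

    Hypothetical helper: it only formats the URL and does not check
    whether a capture actually exists for that time.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


# A result URL returned by a live web search, paired with a target date.
link = wayback_link("http://example.com/", datetime(2010, 6, 1))
print(link)
```

Because the link is derived from the live result URL alone, nothing in this step needs an index over the archived content, which is the point the abstract makes.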

arXiv.org e-Print Archive

Crossref

VBN

Diachronic Variation of Temporal Expressions in Scientific Writing Through the Lens of Relative Entropy

Author: D Atkinson
D Biber
I Dagan
J Gleick
J Pustejovsky
J Strötgen
JC Meister
JF Allen
N Kanhabua
P Mazur
R Campos
S Degaetano-Ortlieb
S Kullback
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

The abundance of temporal information in documents has led to an increased interest in processing such information in the NLP community by considering temporal expressions. Besides domain adaptation, acquiring knowledge about how temporal expressions vary over time is relevant for improving automatic processing. So far, frequency-based accounts dominate the investigation of specific temporal expressions. We present an approach to investigating diachronic changes of temporal expressions based on relative entropy, with the advantage of using conditioned probabilities rather than mere frequency. While we focus on scientific writing, our approach is generalizable to other domains and is of interest not only in the field of NLP, but also in the humanities.

This work is partially funded by Deutsche Forschungsgemeinschaft (DFG) under grant SFB 1102: Information Density and Linguistic Encoding (www.sfb1102.uni-saarland.de)
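
Relative entropy (Kullback-Leibler divergence) between two distributions P and Q is D(P || Q) = sum over w of P(w) * log2(P(w) / Q(w)). The sketch below computes it for two toy token samples standing in for an earlier and a later time period. The token lists, the additive smoothing, and the use of plain unigram distributions are all assumptions made for illustration; the abstract refers to conditioned probabilities, and the paper's actual modelling is not reproduced here.

```python
import math
from collections import Counter


def distribution(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def relative_entropy(p, q):
    """D(P || Q) in bits; assumes q[w] > 0 wherever p[w] > 0."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)


# Toy samples of temporal expressions from two hypothetical periods.
early = ["yesterday", "year", "year", "century"]
late = ["year", "month", "month", "week"]
vocab = sorted(set(early) | set(late))

p = distribution(early, vocab)
q = distribution(late, vocab)

d_pq = relative_entropy(p, q)  # how poorly the later period models the earlier one
d_pp = relative_entropy(p, p)  # a distribution compared with itself gives zero
print(d_pq, d_pp)
```

Smoothing keeps every probability in Q non-zero, so the logarithm is always defined. Note that the measure is asymmetric: D(P || Q) and D(Q || P) generally differ.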

Crossref

Universaar

MPG.PuRe

Scientific publications of the Saarland University

Time and information retrieval: Introduction to the special issue

Author: Belkin
Bethard
Campos
Derczynski
Jannik Strötgen
Jatowt
Jensen
Joho
Kahle
Kanhabua
Kar
Kenter
Leon Derczynski
Llorens
Matthews
Minard
Omar Alonso
Plaisant
Popescu
Radinsky
Ricardo Campos
Rula
Strötgen
Sugumaran
Talukdar
Zhao
Publication venue: 'Elsevier BV'
Publication date: 01/11/2015

The Special Issue of Information Processing and Management includes research papers on the intersection between time and information retrieval. In 'Evaluating Document Filtering Systems over Time', Tom Kenter and Krisztian Balog propose a time-aware way of measuring a system's performance at filtering documents. Manika Kar, Sérgio Nunes and Cristina Ribeiro present interesting methods for summarizing changes in dynamic text collections over time in their paper 'Summarization of Changes in Dynamic Text Collection using Latent Dirichlet Allocation Model.' Hideo Joho, Adam Jatowt and Roi Blanco report on the temporal information searching behaviour of users and their strategies for dealing with searches that have a temporal nature in 'Temporal Information Searching Behaviour and Strategies', a user study. In controlled settings, thirty participants are asked to perform searches on an array of topics on the web to find information related to particular time scopes. Adam Jatowt, Ching-man Au Yeung and Katsumi Tanaka present a 'Generic Method for Detecting Content Time of Documents'. The authors propose several methods for estimating the focus time of documents, i.e. the time a document's content refers to. Xujian Zhao, Peiquan Jin and Lihua Yue present an approach to determining the time of the underlying topic or event in their article entitled 'Discovering Topic Time from Web News'.

Crossref

White Rose Research Online

Real-time timeline summarisation for high-impact events in Twitter.

Author: Bouquet Paolo
Cristea A.I.
Dignum Frank
Dignum Virginia
Fox Maria
Harmelen Frank van
Hüllermeier Eyke
Kaminka Gal A.
Kanhabua N.
Zhou Yiwei
Publication venue: 'IOS Press'
Publication date: 24/08/2016

Twitter has become a valuable source of event-related information, such as breaking news and local event reports. Due to its capability of transmitting information in real time, Twitter is further exploited for timeline summarisation of high-impact events, such as protests, accidents, natural disasters or disease outbreaks. Such summaries can serve as important event digests where users urgently need information, especially if they are directly affected by the events. In this paper, we study the problem of timeline summarisation of high-impact events, where timelines need to be generated in real time. Our proposed approach includes four stages: classification of tweets reporting real-world events, online incremental clustering, post-processing and sub-event summarisation. We conduct a comprehensive evaluation of the different stages on the “Ebola outbreak” tweet stream, and compare our approach with several baselines to demonstrate its effectiveness. Our approach can be applied as a replacement for a manually generated timeline and provides early alarms for disaster surveillance.
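
The online incremental clustering stage named in this abstract can be illustrated with a generic single-pass scheme: each incoming text joins the most similar existing cluster if the similarity clears a threshold, and otherwise starts a new cluster. This is a minimal sketch under assumed choices (bag-of-words vectors, cosine similarity, a threshold of 0.3, made-up example texts); the abstract does not specify the paper's actual clustering method or parameters.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_stream(texts, threshold=0.3):
    """Single-pass clustering: every text is seen once, in arrival order."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [str]}
    for text in texts:
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Summed counts act as the centroid; cosine ignores the scale.
            best["centroid"].update(vec)
            best["members"].append(text)
        else:
            clusters.append({"centroid": Counter(vec), "members": [text]})
    return clusters


# Made-up example stream (not data from the paper).
stream = [
    "new ebola case confirmed in the city",
    "second ebola case confirmed in the city hospital",
    "airport screening starts for travellers",
    "travellers face airport screening delays",
]
clusters = cluster_stream(stream)
for cluster in clusters:
    print(cluster["members"])
```

A single pass suits a real-time setting because earlier assignments are never revisited; the trade-off is that results depend on arrival order and on the chosen threshold.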

Durham Research Online

Warwick Research Archives Portal Repository