927 research outputs found
Modeling Temporal Evidence from External Collections
Newsworthy events are broadcast through multiple mediums and prompt the
crowds to produce comments on social media. In this paper, we propose to
leverage on this behavioral dynamics to estimate the most relevant time periods
for an event (i.e., query). Recent advances have shown how to improve the
estimation of the temporal relevance of such topics. In this approach, we build
on two major novelties. First, we mine temporal evidences from hundreds of
external sources into topic-based external collections to improve the
robustness of the detection of relevant time periods. Second, we propose a
formal retrieval model that generalizes the use of the temporal dimension
across different aspects of the retrieval process. In particular, we show that
temporal evidence of external collections can be used to (i) infer a topic's
temporal relevance, (ii) select the query expansion terms, and (iii) re-rank
the final results for improved precision. Experiments with TREC Microblog
collections show that the proposed time-aware retrieval model makes an
effective and extensive use of the temporal dimension to improve search results
over the most recent temporal models. Interestingly, we observe a strong
correlation between precision and the temporal distribution of retrieved and
relevant documents.Comment: To appear in WSDM 201
Predicting Audio Advertisement Quality
Online audio advertising is a particular form of advertising used abundantly
in online music streaming services. In these platforms, which tend to host tens
of thousands of unique audio advertisements (ads), providing high quality ads
ensures a better user experience and results in longer user engagement.
Therefore, the automatic assessment of these ads is an important step toward
audio ads ranking and better audio ads creation. In this paper we propose one
way to measure the quality of the audio ads using a proxy metric called Long
Click Rate (LCR), which is defined by the amount of time a user engages with
the follow-up display ad (that is shown while the audio ad is playing) divided
by the impressions. We later focus on predicting the audio ad quality using
only acoustic features such as harmony, rhythm, and timbre of the audio,
extracted from the raw waveform. We discuss how the characteristics of the
sound can be connected to concepts such as the clarity of the audio ad message,
its trustworthiness, etc. Finally, we propose a new deep learning model for
audio ad quality prediction, which outperforms the other discussed models
trained on hand-crafted features. To the best of our knowledge, this is the
first large-scale audio ad quality prediction study.Comment: WSDM '18 Proceedings of the Eleventh ACM International Conference on
Web Search and Data Mining, 9 page
Graph Summarization via Node Grouping: A Spectral Algorithm
Graph summarization via node grouping is a popular method to build concise graph representations by grouping nodes from the original graph into supernodes and encoding edges into superedges such that the loss of adjacency information is minimized. Such summaries have immense applications in large-scale graph analytics due to their small size and high query processing efficiency. In this paper, we reformulate the loss minimization problem for summarization into an equivalent integer maximization problem. By initially allowing relaxed (fractional) solutions for integer maximization, we analytically expose the underlying connections to the spectral properties of the adjacency matrix. Consequently, we design an algorithm called SpecSumm that consists of two phases. In the first phase, motivated by spectral graph theory, we apply k-means clustering on the k largest (in magnitude) eigenvectors of the adjacency matrix to assign nodes to supernodes. In the second phase, we propose a greedy heuristic that updates the initial assignment to further improve summary quality. Finally, via extensive experiments on 11 datasets, we show that SpecSumm efficiently produces high-quality summaries compared to state-of-the-art summarization algorithms and scales to graphs with millions of nodes.Peer reviewe
Closing the loop: assisting archival appraisal and information retrieval in one sweep
In this article, we examine the similarities between the concept of appraisal, a process that takes place within the archives, and the concept of relevance judgement, a process fundamental to the evaluation of information retrieval systems. More specifically, we revisit selection criteria proposed as result of archival research, and work within the digital curation communities, and, compare them to relevance criteria as discussed within information retrieval's literature based discovery. We illustrate how closely these criteria relate to each other and discuss how understanding the relationships between the these disciplines could form a basis for proposing automated selection for archival processes and initiating multi-objective learning with respect to information retrieval
Why people search for images using web search engines
What are the intents or goals behind human interactions with image search engines? Knowing why people search for images is of major concern to Web image search engines because user satisfaction may vary as intent varies. Previous analyses of image search behavior have mostly been query-based, focusing on what images people search for, rather than intent-based, that is, why people search for images. To date, there is no thorough investigation of how different image search intents affect users' search behavior. In this paper, we address the following questions: (1) Why do people search for images in text-based Web image search systems? (2) How does image search behavior
- …