Search CORE

36 research outputs found

Learning Points and Routes to Recommend Trajectories

Author: Cheng C.
Lim K. H.
Liu Q.
Papadimitriou C. H.
Shi Y.
Sinnott R. W.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 25/08/2016
Field of study

The problem of recommending tours to travellers is an important and broadly studied area. Suggested solutions include various approaches of points-of-interest (POI) recommendation and route planning. We consider the task of recommending a sequence of POIs, that simultaneously uses information about POIs and routes. Our approach unifies the treatment of various sources of information by representing them as features in machine learning algorithms, enabling us to learn from past behaviour. Information about POIs are used to learn a POI ranking model that accounts for the start and end points of tours. Data about previous trajectories are used for learning transition patterns between POIs that enable us to recommend probable routes. In addition, a probabilistic model is proposed to combine the results of POI ranking and the POI to POI transitions. We propose a new F

_1

score on pairs of POIs that capture the order of visits. Empirical results show that our approach improves on recent methods, and demonstrate that combining points and routes enables better trajectory recommendations

arXiv.org e-Print Archive

Crossref

WordRank: Learning Word Embeddings via Robust Ranking

Author: Ji Shihao
Matsushima Shin
Vishwanathan S. V. N.
Yanardag Pinar
Yun Hyokun
Publication venue
Publication date: 01/01/2016
Field of study

Embedding words in a vector space has gained a lot of attention in recent years. While state-of-the-art methods provide efficient computation of word similarities via a low-dimensional matrix embedding, their motivation is often left unclear. In this paper, we argue that word embedding can be naturally viewed as a ranking problem due to the ranking nature of the evaluation metrics. Then, based on this insight, we propose a novel framework WordRank that efficiently estimates word representations via robust ranking, in which the attention mechanism and robustness to noise are readily achieved via the DCG-like ranking losses. The performance of WordRank is measured in word similarity and word analogy benchmarks, and the results are compared to the state-of-the-art word embedding techniques. Our algorithm is very competitive to the state-of-the- arts on large corpora, while outperforms them by a significant margin when the training set is limited (i.e., sparse and noisy). With 17 million tokens, WordRank performs almost as well as existing methods using 7.2 billion tokens on a popular word similarity benchmark. Our multi-node distributed implementation of WordRank is publicly available for general usage.Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP), November 1-5, 2016, Austin, Texas, US

arXiv.org e-Print Archive

Crossref

PHD-GIFs: Personalized Highlight Detection for Automatic GIF Creation

Author: Ioffe Sergey
Jaimes Alejandro
Jiao Yifan
Li Yingbo
Ricci Francesco
Sharghi Aidean
Song Yale
Srivastava Nitish
Yao Ting
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/08/2018
Field of study

Highlight detection models are typically trained to identify cues that make visual content appealing or interesting for the general public, with the objective of reducing a video to such moments. However, the "interestingness" of a video segment or image is subjective. Thus, such highlight models provide results of limited relevance for the individual user. On the other hand, training one model per user is inefficient and requires large amounts of personal information which is typically not available. To overcome these limitations, we present a global ranking model which conditions on each particular user's interests. Rather than training one model per user, our model is personalized via its inputs, which allows it to effectively adapt its predictions, given only a few user-specific examples. To train this model, we create a large-scale dataset of users and the GIFs they created, giving us an accurate indication of their interests. Our experiments show that using the user history substantially improves the prediction accuracy. On our test set of 850 videos, our model improves the recall by 8% with respect to generic highlight detectors. Furthermore, our method proves more precise than the user-agnostic baselines even with just one person-specific example.Comment: Accepted for publication at the 2018 ACM Multimedia Conference (MM '18

arXiv.org e-Print Archive

Crossref

Graph-of-Entity: A Model for Combined Data Representation and Retrieval

Author: Lopes Carla
Publication venue: OASIcs - OpenAccess Series in Informatics. 8th Symposium on Languages, Applications and Technologies (SLATE 2019)
Publication date: 01/01/2019
Field of study

Managing large volumes of digital documents along with the information they contain, or are associated with, can be challenging. As systems become more intelligent, it increasingly makes sense to power retrieval through all available data, where every lead makes it easier to reach relevant documents or entities. Modern search is heavily powered by structured knowledge, but users still query using keywords or, at the very best, telegraphic natural language. As search becomes increasingly dependent on the integration of text and knowledge, novel approaches for a unified representation of combined data present the opportunity to unlock new ranking strategies. We tackle entity-oriented search using graph-based approaches for representation and retrieval. In particular, we propose the graph-of-entity, a novel approach for indexing combined data, where terms, entities and their relations are jointly represented. We compare the graph-of-entity with the graph-of-word, a text-only model, verifying that, overall, it does not yet achieve a better performance, despite obtaining a higher precision. Our assessment was based on a small subset of the INEX 2009 Wikipedia Collection, created from a sample of 10 topics and respectively judged documents. The offline evaluation we do here is complementary to its counterpart from TREC 2017 OpenSearch track, where, during our participation, we had assessed graph-of-entity in an online setting, through team-draft interleaving

Dagstuhl Research Online Publication Server