A Batch Learning Framework for Scalable Personalized Ranking
In designing personalized ranking algorithms, it is desirable to encourage
high precision at the top of the ranked list. Existing methods either seek a
smooth convex surrogate for a non-smooth ranking metric or directly modify
updating procedures to encourage top accuracy. In this work we point out that
these methods do not scale well to large-scale settings, partly because of
inaccurate pointwise or pairwise rank estimation. We propose a new
framework for personalized ranking. It uses batch-based rank estimators and
smooth rank-sensitive loss functions. This new batch learning framework leads
to more stable and accurate rank approximations compared to previous work.
Moreover, it enables explicit use of parallel computation to speed up training.
We conduct empirical evaluation on three item recommendation tasks. Our method
shows consistent accuracy improvements over state-of-the-art methods.
Additionally, we observe time-efficiency advantages as the data scale increases.
Comment: AAAI 2018, Feb 2-7, New Orleans, US
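The abstract describes the batch-based rank estimators and smooth rank-sensitive loss functions only at a high level. As a rough illustration (not the authors' implementation), the following Python sketch estimates an item's rank from a batch of sampled negatives and applies a WARP-style log weight to a smooth hinge penalty; the function names, margin, and weighting form are assumptions made for illustration.

import numpy as np

def smooth_batch_rank_loss(pos_score, neg_scores, margin=1.0):
    # Smooth indicator of negatives scoring above the positive (sigmoid of the margin violation).
    violations = 1.0 / (1.0 + np.exp(-(neg_scores - pos_score + margin)))
    # Batch-based rank estimate: expected number of negatives ranked above the positive item.
    est_rank = violations.sum()
    # Rank-sensitive weighting (WARP-style log weight) applied to a smooth hinge penalty.
    weight = np.log1p(est_rank)
    hinge = np.maximum(0.0, neg_scores - pos_score + margin)
    return weight * hinge.mean()

# Toy usage: one positive item scored against a batch of 256 sampled negatives.
rng = np.random.default_rng(0)
print(smooth_batch_rank_loss(pos_score=2.0, neg_scores=rng.normal(size=256)))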
Implicit Language Model in LSTM for OCR
Neural networks have become the technique of choice for OCR, but many aspects
of how and why they deliver superior performance are still unknown. One key
difference between current neural network techniques using LSTMs and the
previous state-of-the-art HMM systems is that HMM systems have a strong
independence assumption. In comparison, LSTMs have no explicit constraints on
the amount of context that can be considered during decoding. In this paper we
show that they learn an implicit LM and attempt to characterize the strength of
the LM in terms of equivalent n-gram context. We show that this implicitly
learned language model provides a 2.4% CER improvement on our synthetic test
set when compared against a test set of random characters (i.e. not naturally
occurring sequences), and that the LSTM learns to use up to 5 characters of
context (which is roughly 88 frames in our configuration). We believe that this
is the first ever attempt at characterizing the strength of the implicit LM in
LSTM-based OCR systems.
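The reported gain comes from comparing recognition error on natural text against random character strings. A minimal sketch of such a comparison is given below; char_error_rate is a standard Levenshtein-based CER, and ocr_model stands in for any recognizer (a hypothetical callable, not the paper's system).

def char_error_rate(reference, hypothesis):
    # Levenshtein distance between the strings, normalised by reference length.
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, 1):
        curr = [i]
        for j, h in enumerate(hypothesis, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1] / max(1, len(reference))

def implicit_lm_gain(ocr_model, natural_lines, random_lines):
    # CER on random character strings minus CER on natural text; a positive gap
    # suggests the recognizer benefits from an implicitly learned language model.
    cer = lambda lines: sum(char_error_rate(ref, ocr_model(img)) for img, ref in lines) / len(lines)
    return cer(random_lines) - cer(natural_lines)

print(char_error_rate("kitten", "sitting"))  # 0.5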
An Investigation into the Pedagogical Features of Documents
Characterizing the content of a technical document in terms of its learning
utility can be useful for applications related to education, such as generating
reading lists from large collections of documents. We refer to this learning
utility as the "pedagogical value" of the document to the learner. While
pedagogical value is an important concept that has been studied extensively
within the education domain, there has been little work exploring it from a
computational, i.e., natural language processing (NLP), perspective. To allow a
computational exploration of this concept, we introduce the notion of
"pedagogical roles" of documents (e.g., Tutorial and Survey) as an intermediary
component for the study of pedagogical value. Given the lack of available
corpora for our exploration, we create the first annotated corpus of
pedagogical roles and use it to test baseline techniques for automatic
prediction of such roles.
Comment: 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at EMNLP 2017; 12 pages
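The abstract does not specify the baseline techniques; a common starting point for this kind of document-role prediction is a bag-of-words classifier. The sketch below is illustrative only: the role labels and toy documents are assumptions, and the paper's corpus and features may differ.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = [
    "Step-by-step guide showing how to train a parser, with worked examples.",
    "We review two decades of research on statistical machine translation.",
]
roles = ["Tutorial", "Survey"]  # hypothetical subset of pedagogical roles

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(docs, roles)
print(clf.predict(["A comprehensive overview of prior work on question answering."]))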
Deep Multimodal Image-Repurposing Detection
Nefarious actors on social media and other platforms often spread rumors and
falsehoods through images whose metadata (e.g., captions) have been modified to
provide visual substantiation of the rumor/falsehood. This type of modification
is referred to as image repurposing, in which an otherwise unmanipulated image is
published along with incorrect or manipulated metadata to serve the actor's
ulterior motives. We present the Multimodal Entity Image Repurposing (MEIR)
dataset, which is substantially more challenging than previously available
datasets for research into image repurposing detection. The
new dataset includes location, person, and organization manipulations on
real-world data sourced from Flickr. We also present a novel, end-to-end, deep
multimodal learning model for assessing the integrity of an image by combining
information extracted from the image with related information from a knowledge
base. The proposed method is compared against state-of-the-art techniques on
existing datasets as well as MEIR, where it outperforms existing methods across
the board, with AUC improvements of up to 0.23.
Comment: To be published at ACM Multimedia 2018 (oral)
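As a rough sketch of the kind of two-branch architecture the abstract alludes to (not the MEIR model itself; the dimensions, layers, and feature extractors here are illustrative assumptions), a PyTorch module can fuse an image embedding with an embedding of the related metadata or knowledge-base evidence and output an integrity logit.

import torch
import torch.nn as nn

class IntegrityScorer(nn.Module):
    def __init__(self, img_dim=2048, ctx_dim=768, hidden=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hidden)  # projects pooled image features
        self.ctx_proj = nn.Linear(ctx_dim, hidden)  # projects metadata / retrieved-evidence features
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # logit: manipulated vs. consistent
        )

    def forward(self, img_feat, ctx_feat):
        fused = torch.cat([self.img_proj(img_feat), self.ctx_proj(ctx_feat)], dim=-1)
        return self.classifier(fused)

# Toy usage with random tensors standing in for real image / knowledge-base embeddings.
model = IntegrityScorer()
print(model(torch.randn(4, 2048), torch.randn(4, 768)).shape)  # torch.Size([4, 1])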
Temporal Learning and Sequence Modeling for a Job Recommender System
We present our solution to the job recommendation task for RecSys Challenge
2016. The main contribution of our work is to combine temporal learning with
sequence modeling to capture complex user-item activity patterns and improve job
recommendations. First, we propose a time-based ranking model applied to
historical observations and a hybrid matrix factorization over time-reweighted
interactions. Second, we exploit sequence properties in user-item activities
and develop an RNN-based recommendation model. Our solution achieved 5th
place in the challenge among more than 100 participants. Notably, the strong
performance of our RNN approach shows a promising new direction in employing
sequence modeling for recommendation systems.
Comment: A shorter version appears in the proceedings of RecSys Challenge 2016
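The time re-weighting of interactions is described only at a high level; a simple way to realize it is an exponential decay on interaction age, as in the sketch below (the decay form and half-life are assumptions for illustration, not the challenge solution's exact scheme).

import numpy as np

def time_reweight(timestamps, now, half_life_days=30.0):
    # Weight each user-item interaction by its recency with exponential decay.
    age_days = (now - np.asarray(timestamps, dtype=float)) / 86400.0
    return np.power(0.5, age_days / half_life_days)

# Interactions observed 1, 10 and 60 days before "now" (timestamps in seconds).
now = 1_600_000_000
ts = [now - 1 * 86400, now - 10 * 86400, now - 60 * 86400]
print(time_reweight(ts, now))  # recent interactions receive weights closer to 1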