Transcribing Content from Structural Images with Spotlight Mechanism
Transcribing content from structural images, e.g., writing notes from music
scores, is a challenging task because not only must the content objects be
recognized, but the internal structure must also be preserved. Existing image
recognition methods mainly work on images with simple content (e.g., text lines
with characters), but are not capable of identifying ones with more complex
content (e.g., structured symbols), which often follow a fine-grained grammar.
To this end, in this paper, we propose a hierarchical Spotlight Transcribing
Network (STN) framework that follows a two-stage "where-to-what" solution.
Specifically, we first decide "where-to-look" through a novel spotlight
mechanism to focus on different areas of the original image following its
structure. Then, we decide "what-to-write" by developing a GRU-based network
with the spotlight areas for transcribing the content accordingly. Moreover, we
propose two implementations on the basis of STN, i.e., STNM and STNR, where the
spotlight movement follows the Markov property and Recurrent modeling,
respectively. We also design a reinforcement method to refine the framework by
self-improving the spotlight mechanism. We conduct extensive experiments on
many structural image datasets, where the results clearly demonstrate the
effectiveness of the STN framework.
Comment: Accepted by KDD2018 Research Track. In proceedings of the 24th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining
(KDD'18)
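As a rough illustration of the "where-to-look" step, the sketch below treats a spotlight as a soft attention mask over a grid of image features. The abstract does not specify the form of the spotlight, so the Gaussian shape, the parameters `cx`, `cy`, and `sigma`, and both function names are assumptions made for this example, not the authors' implementation.

```python
import math


def spotlight_weights(height, width, cx, cy, sigma):
    """Return a height x width grid of weights summing to 1, peaked at (cx, cy).

    Assumption: the spotlight is an isotropic Gaussian with radius sigma > 0.
    """
    raw = [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]
    total = sum(sum(row) for row in raw)
    return [[w / total for w in row] for row in raw]


def spotlight_context(features, cx, cy, sigma):
    """Weighted average of the per-cell feature vectors under the spotlight.

    `features[y][x]` is the feature vector of grid cell (x, y).
    """
    height, width = len(features), len(features[0])
    dim = len(features[0][0])
    weights = spotlight_weights(height, width, cx, cy, sigma)
    context = [0.0] * dim
    for y in range(height):
        for x in range(width):
            for d in range(dim):
                context[d] += weights[y][x] * features[y][x][d]
    return context
```

In a full model, the resulting context vector would be fed to the GRU-based "what-to-write" network at each step, and the spotlight position would be updated before the next step.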
Single stream parallelization of generalized LSTM-like RNNs on a GPU
Recurrent neural networks (RNNs) have shown outstanding performance on
processing sequence data. However, they suffer from long training time, which
demands parallel implementations of the training procedure. Parallelization of
the training algorithms for RNNs is very challenging because internal
recurrent paths form dependencies between two different time frames. In this
paper, we first propose a generalized graph-based RNN structure that covers the
most popular long short-term memory (LSTM) network. Then, we present a
parallelization approach that automatically explores the parallelism of arbitrary
RNNs by analyzing the graph structure. The experimental results show that the
proposed approach achieves a substantial speed-up even with a single training
stream, and further accelerates training when combined with multiple parallel
training streams.
Comment: Accepted by the 40th IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP) 201
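To make the idea of finding parallelism from a graph structure concrete, here is a minimal sketch of level scheduling on a dependency graph: operations in the same level have no dependencies on each other and could run concurrently. The abstract does not describe the authors' algorithm, so this is a generic technique shown under that caveat; the function name and the node names in the example are hypothetical.

```python
def parallel_levels(deps):
    """Group the nodes of a dependency graph into levels.

    `deps` maps each node to the set of nodes it depends on. Every node,
    including inputs with no dependencies, must appear as a key.
    Nodes in the same returned level do not depend on one another.
    """
    remaining = {node: set(parents) for node, parents in deps.items()}
    levels = []
    while remaining:
        # Nodes whose dependencies have all been scheduled are ready to run.
        ready = sorted(n for n, parents in remaining.items() if not parents)
        if not ready:
            raise ValueError("dependency cycle detected")
        levels.append(ready)
        for n in ready:
            del remaining[n]
        for parents in remaining.values():
            parents.difference_update(ready)
    return levels


# Hypothetical single time step of an LSTM-like cell: the four gates depend
# only on the previous hidden state, so they land in the same level.
lstm_step = {
    "h_prev": set(),
    "c_prev": set(),
    "i": {"h_prev"},
    "f": {"h_prev"},
    "o": {"h_prev"},
    "g": {"h_prev"},
    "c": {"i", "f", "g", "c_prev"},
    "h": {"o", "c"},
}
levels = parallel_levels(lstm_step)
```

The recurrent path shows up as the chain from `h_prev` to `h`: the number of levels bounds how much a single time step can be parallelized, which is why dependencies between time frames make the problem hard.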
Context-Aware Systems for Sequential Item Recommendation
Quizlet is the most popular online learning tool in the United States, and is
used by over 2/3 of high school students and 1/2 of college students. With
more than 95% of Quizlet users reporting improved grades as a result, the
platform has become the de facto tool used in millions of classrooms. In this
paper, we explore the task of recommending suitable content for a student to
study, given their prior interests, as well as what their peers are studying.
We propose a novel approach, the Neural Educational Recommendation Engine
(NERE), to recommend educational content by leveraging student behaviors rather
than ratings. We have found that this approach better captures social factors
that are more aligned with learning. NERE is based on a recurrent neural
network that includes collaborative and content-based approaches for
recommendation, and takes into account any particular student's speed, mastery,
and experience to recommend the appropriate task. We train NERE by jointly
learning the user embeddings and content embeddings, and attempt to predict the
content embedding for the final timestamp. We also develop a confidence
estimator for our neural network, which is a crucial requirement for
productionizing this model. We apply NERE to Quizlet's proprietary dataset and
present our results. We achieve an R^2 score of 0.81 in the content embedding
space, and a recall of 54% among the 100 nearest neighbors. This vastly
exceeds the recall@100 score of 12% that a standard matrix-factorization
approach provides. We conclude with a discussion on how NERE will be deployed,
and position our work as one of the first educational recommender systems for
the K-12 space.
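The recall figures above can be read as follows: a prediction counts as a hit when the true item is among the k items nearest to the predicted content embedding. The sketch below shows one way to compute such a recall@k. The Euclidean distance, the data layout, and the function names are assumptions for illustration; the abstract does not state the exact evaluation protocol.

```python
import math


def nearest_items(predicted, item_embeddings, k):
    """Return the ids of the k items whose embeddings are closest to `predicted`.

    Assumption: closeness is Euclidean distance in the embedding space.
    """
    ranked = sorted(
        item_embeddings,
        key=lambda item_id: math.dist(predicted, item_embeddings[item_id]),
    )
    return ranked[:k]


def recall_at_k(predictions, targets, item_embeddings, k):
    """Fraction of predictions whose true item is among the k nearest items."""
    hits = sum(
        1
        for predicted, target in zip(predictions, targets)
        if target in nearest_items(predicted, item_embeddings, k)
    )
    return hits / len(targets)


# Toy data: three items in a 2-D embedding space (hypothetical values).
items = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 5.0)}
predictions = [(0.1, 0.0), (4.0, 4.0)]
targets = ["b", "a"]
score = recall_at_k(predictions, targets, items, k=2)
```

With a catalogue of realistic size, the exhaustive sort would normally be replaced by an approximate nearest-neighbor index, but the metric itself is unchanged.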