Character-level Recurrent Neural Networks in Practice: Comparing Training and Sampling Schemes
Recurrent neural networks are now used successfully in a wide range of
applications, from text, speech and image processing to recommender
systems. Backpropagation through time is the algorithm commonly used to
train these networks on specific tasks. Many deep learning frameworks have
their own implementation of training and sampling procedures for recurrent
neural networks, while there are in fact multiple other possibilities to choose
from and other parameters to tune. In existing literature this is very often
overlooked or ignored. In this paper we therefore give an overview of possible
training and sampling schemes for character-level recurrent neural networks to
solve the task of predicting the next token in a given sequence. We test these
different schemes on a variety of datasets, neural network architectures and
parameter settings, and formulate a number of take-home recommendations. The
choice of training and sampling scheme turns out to be subject to a number of
trade-offs, such as training stability, sampling time, model performance and
implementation effort, but is largely independent of the data. Perhaps the most
surprising result is that transferring hidden states for correctly initializing
the model on subsequences often leads to unstable training behavior depending
on the dataset.
Comment: 23 pages, 11 figures, 4 table
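As a rough illustration of the trade-off mentioned in this abstract, the sketch below splits a text into consecutive subsequences for next-character prediction and either carries the hidden state over between subsequences or resets it each time. It is not the paper's implementation: `make_subsequences`, `run_epoch`, the `step` callback and the `transfer_state` flag are hypothetical names, and the state here is an opaque value rather than a real network's hidden state.

```python
from typing import Callable, List, Tuple, TypeVar

State = TypeVar("State")


def make_subsequences(text: str, seq_len: int) -> List[Tuple[str, str]]:
    """Split text into consecutive (input, target) pairs.

    The target is the input shifted by one character, which is the
    usual setup for next-character prediction.
    """
    pairs = []
    for start in range(0, len(text) - 1, seq_len):
        target = text[start + 1:start + seq_len + 1]
        # Trim the input so the last (possibly shorter) pair stays aligned.
        inp = text[start:start + len(target)]
        pairs.append((inp, target))
    return pairs


def run_epoch(
    pairs: List[Tuple[str, str]],
    step: Callable[[str, str, State], State],
    initial_state: State,
    transfer_state: bool,
) -> State:
    """Process the subsequences in order.

    With transfer_state=True the state returned for one subsequence
    initializes the next one; otherwise every subsequence starts from
    initial_state. `step` stands in for one training update
    (hypothetical callback).
    """
    state = initial_state
    for inp, target in pairs:
        if not transfer_state:
            state = initial_state
        state = step(inp, target, state)
    return state
```

In a real setup, `step` would run truncated backpropagation through time on one subsequence and return the final hidden state, typically detached from the computation graph before being reused.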
Semantics-driven event clustering in Twitter feeds
Detecting events using social media such as Twitter has many useful real-life applications. Many algorithms have been developed for this task, each using different information sources: textual, temporal, geographic or community features. Semantic information is often added at the end of the event detection to classify events into semantic topics. But semantic information can also be used to drive the actual event detection, an approach that is less covered by academic research. We therefore supplemented an existing baseline event clustering algorithm with semantic information about the tweets in order to improve its performance. This paper lays out the details of the semantics-driven event clustering algorithms developed, discusses a novel method to aid in the creation of a ground truth for event detection purposes, and analyses how well the algorithms improve over baseline. We find that assigning semantic information to every individual tweet results in worse F1 performance than the baseline. If, however, semantics are assigned on a coarser, hashtag level, the improvement over baseline is substantial and significant in both precision and recall.
Representation learning for very short texts using weighted word embedding aggregation
Short text messages such as tweets are very noisy and sparse in their use of
vocabulary. Traditional textual representations, such as tf-idf, have
difficulty grasping the semantic meaning of such texts, which is important in
applications such as event detection, opinion mining, news recommendation, etc.
We constructed a method based on semantic word embeddings and frequency
information to arrive at low-dimensional representations for short texts
designed to capture semantic similarity. For this purpose we designed a
weight-based model and a learning procedure based on a novel median-based loss
function. This paper discusses the details of our model and the optimization
methods, together with the experimental results on both Wikipedia and Twitter
data. We find that our method outperforms the baseline approaches in the
experiments, and that it generalizes well on different word embeddings without
retraining. Our method is therefore capable of retaining most of the semantic
information in the text, and is applicable out-of-the-box.
Comment: 8 pages, 3 figures, 2 tables, appears in Pattern Recognition Letter
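To make the idea of combining word embeddings with frequency information concrete, here is a minimal sketch of a weighted average of word vectors. It is an assumption-laden illustration, not the paper's model: the weights here come from a simple smoothed idf rather than being learned with the median-based loss, and `idf_weights` and `aggregate` are hypothetical helper names.

```python
import math
from typing import Dict, List, Sequence


def idf_weights(docs: Sequence[Sequence[str]]) -> Dict[str, float]:
    """Compute a smoothed inverse document frequency per word.

    Assumption: idf is used as a stand-in for "frequency information";
    rarer words receive larger weights.
    """
    n_docs = len(docs)
    doc_freq: Dict[str, int] = {}
    for doc in docs:
        for word in set(doc):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {w: math.log(n_docs / c) + 1.0 for w, c in doc_freq.items()}


def aggregate(
    tokens: Sequence[str],
    embeddings: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return the weighted mean of the embeddings of the known tokens."""
    known = [t for t in tokens if t in embeddings and t in weights]
    if not known:
        raise ValueError("no token has both an embedding and a weight")
    dim = len(embeddings[known[0]])
    total = sum(weights[t] for t in known)
    return [
        sum(weights[t] * embeddings[t][i] for t in known) / total
        for i in range(dim)
    ]
```

The resulting fixed-length vectors can be compared with cosine similarity; replacing the idf weights with learned ones would bring the sketch closer to the weight-based model the abstract describes.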
Learning perception and planning with deep active inference
Active inference is a process theory of the brain that states that all living organisms infer actions in order to minimize their (expected) free energy. However, current experiments are limited to predefined, often discrete, state spaces. In this paper we use recent advances in deep learning to learn the state space and approximate the necessary probability distributions to engage in active inference.
LiDAR-assisted Large-scale Privacy Protection in Street-view Cycloramas
Privacy is of growing importance in several domains, especially for
street-view images. The conventional way to protect it is to automatically
detect and blur sensitive information in these images. However, the
processing cost of blurring increases with the ever-growing resolution of
images. We propose a system that is cost-effective even after increasing the
resolution by a factor of 2.5. The new system utilizes depth data obtained from
LiDAR to significantly reduce the search space for detection, thereby reducing
the processing cost. Besides this, we test several detectors after reducing the
detection space and provide an alternative solution based on state-of-the-art
deep learning detectors to the existing HoG-SVM-Deep system that is faster and
performs better.
Comment: Accepted at Electronic Imaging 201