10,249 research outputs found
Inferring transportation modes from GPS trajectories using a convolutional neural network
Identifying the distribution of users' transportation modes is an essential
part of travel demand analysis and transportation planning. With the advent of
ubiquitous GPS-enabled devices (e.g., a smartphone), a cost-effective approach
for inferring commuters' mobility mode(s) is to leverage their GPS
trajectories. A majority of studies have proposed mode inference models based
on hand-crafted features and traditional machine learning algorithms. However,
manual features engender some major drawbacks including vulnerability to
traffic and environmental conditions as well as possessing human's bias in
creating efficient features. One way to overcome these issues is by utilizing
Convolutional Neural Network (CNN) schemes that are capable of automatically
driving high-level features from the raw input. Accordingly, in this paper, we
take advantage of CNN architectures so as to predict travel modes based on only
raw GPS trajectories, where the modes are labeled as walk, bike, bus, driving,
and train. Our key contribution is designing the layout of the CNN's input
layer in such a way that not only is adaptable with the CNN schemes but
represents fundamental motion characteristics of a moving object including
speed, acceleration, jerk, and bearing rate. Furthermore, we ameliorate the
quality of GPS logs through several data preprocessing steps. Using the clean
input layer, a variety of CNN configurations are evaluated to achieve the best
CNN architecture. The highest accuracy of 84.8% has been achieved through the
ensemble of the best CNN configuration. In this research, we contrast our
methodology with traditional machine learning algorithms as well as the seminal
and most related studies to demonstrate the superiority of our framework.Comment: 12 pages, 3 figures, 7 tables, Transportation Research Part C:
Emerging Technologie
A High Quality Text-To-Speech System Composed of Multiple Neural Networks
While neural networks have been employed to handle several different
text-to-speech tasks, ours is the first system to use neural networks
throughout, for both linguistic and acoustic processing. We divide the
text-to-speech task into three subtasks, a linguistic module mapping from text
to a linguistic representation, an acoustic module mapping from the linguistic
representation to speech, and a video module mapping from the linguistic
representation to animated images. The linguistic module employs a
letter-to-sound neural network and a postlexical neural network. The acoustic
module employs a duration neural network and a phonetic neural network. The
visual neural network is employed in parallel to the acoustic module to drive a
talking head. The use of neural networks that can be retrained on the
characteristics of different voices and languages affords our system a degree
of adaptability and naturalness heretofore unavailable.Comment: Source link (9812006.tar.gz) contains: 1 PostScript file (4 pages)
and 3 WAV audio files. If your system does not support Windows WAV files, try
a tool like "sox" to translate the audio into a format of your choic
Convolutional Drift Networks for Video Classification
Analyzing spatio-temporal data like video is a challenging task that requires
processing visual and temporal information effectively. Convolutional Neural
Networks have shown promise as baseline fixed feature extractors through
transfer learning, a technique that helps minimize the training cost on visual
information. Temporal information is often handled using hand-crafted features
or Recurrent Neural Networks, but this can be overly specific or prohibitively
complex. Building a fully trainable system that can efficiently analyze
spatio-temporal data without hand-crafted features or complex training is an
open challenge. We present a new neural network architecture to address this
challenge, the Convolutional Drift Network (CDN). Our CDN architecture combines
the visual feature extraction power of deep Convolutional Neural Networks with
the intrinsically efficient temporal processing provided by Reservoir
Computing. In this introductory paper on the CDN, we provide a very simple
baseline implementation tested on two egocentric (first-person) video activity
datasets.We achieve video-level activity classification results on-par with
state-of-the art methods. Notably, performance on this complex spatio-temporal
task was produced by only training a single feed-forward layer in the CDN.Comment: Published in IEEE Rebooting Computin
- …