40 research outputs found
Temporal activity detection in untrimmed videos with recurrent neural networks
This work proposes a simple pipeline to classify and temporally localize activities in untrimmed videos. Our system uses features from a 3D Convolutional Neural Network (C3D) as input to train a a recurrent neural network (RNN) that learns to classify video clips of 16 frames. After clip prediction, we post-process the output of the RNN to assign a single activity label to each video, and determine the temporal boundaries of the activity within the video. We show how our system can achieve competitive results in both tasks with a simple architecture. We evaluate our method in the ActivityNet Challenge 2016, achieving a 0.5874 mAP and a 0.2237 mAP in the classification and detection tasks, respectively. Our code and models are publicly available at: https://imatge-upc.github.io/activitynet-2016-cvprw/Peer ReviewedPostprint (published version
DEEP LEARNING MODEL FOR BILINGUAL SENTIMENT CLASSIFICATION OF SHORT TEXTS
Sentiment analysis of short texts such as Twitter messages and comments in news portals is challenging due to the lack of contextual information. We propose a deep neural network model that uses bilingual word embeddings to effectively solve sentiment classification problem for a given pair of languages. We apply our approach to two corpora of two different language pairs: English-Russian and Russian-Kazakh. We show how to train a classifier in one language and predict in another. Our approach achieves 73% accuracy for English and 74% accuracy for Russian. For Kazakh sentiment analysis, we propose a baseline method, that achieves 60% accuracy; and a method to learn bilingual embeddings from a large unlabeled corpus using a bilingual word pairs
Modeling The Intensity Function Of Point Process Via Recurrent Neural Networks
Event sequence, asynchronously generated with random timestamp, is ubiquitous
among applications. The precise and arbitrary timestamp can carry important
clues about the underlying dynamics, and has lent the event data fundamentally
different from the time-series whereby series is indexed with fixed and equal
time interval. One expressive mathematical tool for modeling event is point
process. The intensity functions of many point processes involve two
components: the background and the effect by the history. Due to its inherent
spontaneousness, the background can be treated as a time series while the other
need to handle the history events. In this paper, we model the background by a
Recurrent Neural Network (RNN) with its units aligned with time series indexes
while the history effect is modeled by another RNN whose units are aligned with
asynchronous events to capture the long-range dynamics. The whole model with
event type and timestamp prediction output layers can be trained end-to-end.
Our approach takes an RNN perspective to point process, and models its
background and history effect. For utility, our method allows a black-box
treatment for modeling the intensity which is often a pre-defined parametric
form in point processes. Meanwhile end-to-end training opens the venue for
reusing existing rich techniques in deep network for point process modeling. We
apply our model to the predictive maintenance problem using a log dataset by
more than 1000 ATMs from a global bank headquartered in North America.Comment: Accepted at Thirty-First AAAI Conference on Artificial Intelligence
(AAAI17
BiLSTM with CNN Features For HAR in Videos
El reconocimiento de acciones en videos es actualmente un tema de inter茅s en el 谩rea de visi贸n por computadora debido a sus potenciales aplicaciones tales como indexaci贸n en multimedia, vigilancia en espacios p煤blicos, entre otras. En este trabajo proponemos una arquitectura CNN-BiLSTM. Primero, una red neuronal convolucional VGG16 previamente entrenada extrae las caracter铆sticas del video de entrada. Luego, un BiLSTM clasifica el video en una clase en particular. Evaluamos el rendimiento de nuestro sistema utilizando la precisi贸n como m茅trica de evaluaci贸n, obteniendo 40.9% y 78.1% para los conjuntos de datos HMDB-51 y LTCF-101 respectivamente.Sociedad Argentina de Inform谩tica e Investigaci贸n Operativ