Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks
Skeleton based action recognition distinguishes human actions using the
trajectories of skeleton joints, which provide a very good representation for
describing actions. Considering that recurrent neural networks (RNNs) with Long
Short-Term Memory (LSTM) can learn feature representations and model long-term
temporal dependencies automatically, we propose an end-to-end fully connected
deep LSTM network for skeleton based action recognition. Inspired by the
observation that the co-occurrences of the joints intrinsically characterize
human actions, we take the skeleton as the input at each time slot and
introduce a novel regularization scheme to learn the co-occurrence features of
skeleton joints. To train the deep LSTM network effectively, we propose a new
dropout algorithm which simultaneously operates on the gates, cells, and output
responses of the LSTM neurons. Experimental results on three human action
recognition datasets consistently demonstrate the effectiveness of the proposed
model.
Comment: AAAI 2016 conference
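The abstract does not spell out the dropout algorithm, so the following is only a minimal sketch of the general idea: inverted-dropout masks applied to the gate activations, the cell state, and the output response of a single LSTM step. The names `lstm_step`, `dropout`, `affine`, `params`, and `keep_prob` are hypothetical, and the authors' actual scheme may differ, for example in whether the masked values are fed back through the recurrent path.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    # Matrix-vector product plus bias, using plain lists.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def dropout(values, keep_prob, rng):
    # Inverted dropout: zero each value with probability 1 - keep_prob
    # and rescale the survivors so the expected value is unchanged.
    if keep_prob >= 1.0:
        return list(values)
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


def lstm_step(x, h_prev, c_prev, params, keep_prob=1.0, rng=None):
    # One LSTM time step. ASSUMPTION: masks are applied to all four gate
    # activations, to the new cell state, and to the output response.
    rng = rng or random.Random(0)
    z = x + h_prev  # concatenate the input and the previous hidden state
    gates = {}
    for name, act in (("i", sigmoid), ("f", sigmoid), ("o", sigmoid), ("g", math.tanh)):
        W, b = params[name]
        gates[name] = dropout([act(v) for v in affine(W, z, b)], keep_prob, rng)
    c = [f * cp + i * g
         for f, cp, i, g in zip(gates["f"], c_prev, gates["i"], gates["g"])]
    c = dropout(c, keep_prob, rng)
    h = dropout([o * math.tanh(ci) for o, ci in zip(gates["o"], c)], keep_prob, rng)
    return h, c


# Usage: two input features, one hidden unit, all weights zero (illustrative only).
zero_params = {name: ([[0.0, 0.0, 0.0]], [0.0]) for name in "ifog"}
h, c = lstm_step([0.0, 0.0], [0.0], [2.0], zero_params, keep_prob=1.0)
```

With `keep_prob=1.0` the step reduces to a standard LSTM update, which is the behavior one would use at test time.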
Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US
The United States spends more than $1B each year on initiatives such as the
American Community Survey (ACS), a labor-intensive door-to-door study that
measures statistics relating to race, gender, education, occupation,
unemployment, and other demographic factors. Although a comprehensive source of
data, the lag between demographic changes and their appearance in the ACS can
exceed half a decade. As digital imagery becomes ubiquitous and machine vision
techniques improve, automated data analysis may provide a cheaper and faster
alternative. Here, we present a method that determines socioeconomic trends
from 50 million images of street scenes, gathered in 200 American cities by
Google Street View cars. Using deep learning-based computer vision techniques,
we determined the make, model, and year of all motor vehicles encountered in
particular neighborhoods. Data from this census of motor vehicles, which
enumerated 22M automobiles in total (8% of all automobiles in the US), was used
to accurately estimate income, race, education, and voting patterns, with
single-precinct resolution. (The average US precinct contains approximately
1000 people.) The resulting associations are surprisingly simple and powerful.
For instance, if the number of sedans encountered during a 15-minute drive
through a city is higher than the number of pickup trucks, the city is likely
to vote for a Democrat during the next Presidential election (88% chance);
otherwise, it is likely to vote Republican (82%). Our results suggest that
automated systems for monitoring demographic trends may effectively complement
labor-intensive approaches, with the potential to detect trends with fine
spatial resolution, in close to real time.
Comment: 41 pages including supplementary material. Under review at PNA
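The sedan-versus-pickup association quoted in the abstract can be written as a one-line decision rule. This is only an illustration of that quoted rule, not the authors' full model; the function name `predict_vote` is hypothetical, and the two probabilities are the figures stated in the abstract.

```python
def predict_vote(sedan_count, pickup_count):
    # Rule quoted in the abstract: more sedans than pickup trucks seen during
    # a 15-minute drive suggests a Democratic vote (88% chance);
    # otherwise a Republican vote (82%).
    if sedan_count > pickup_count:
        return ("Democrat", 0.88)
    return ("Republican", 0.82)


# Usage with made-up counts for two hypothetical cities.
city_a = predict_vote(sedan_count=120, pickup_count=45)
city_b = predict_vote(sedan_count=30, pickup_count=80)
```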
A Neutral Network Based Vehicle Classification System for Pervasive Smart Road Security
Pervasive smart computing environments accustom people to convenient and secure services. The overall goal of this research is to classify vehicles along the I215 freeway in Salt Lake City, USA. This information will be used to predict future roadway needs and the expected life of a roadway. Vehicles will be classified by combining multiple sets of features. Not all feature sets have been determined yet; however, one such set will be the reduced wavelet transform of the image of a vehicle. To use such a feature, the image must be normalized with respect to size, position, and so on. For example, a car in the rightmost lane in an image will appear smaller than one in the leftmost lane, because the rightmost lane is closest to the camera. Likewise, a vehicle’s size will vary depending on where in a lane its image is captured. In our case, the image capture area for each lane is approximately 100 feet of roadway. A goal of this paper is to normalize the image of a vehicle so that, regardless of its lane or position in a lane, the features will be approximately the same. The wavelet transform itself will not be used directly for recognition. Instead, it will be input to a neural network, and the output of the neural network will be one element of the feature set used for recognition.
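The normalize-then-transform idea described above can be sketched as follows. The abstract names neither the wavelet family nor the normalization method, so this sketch assumes nearest-neighbour resizing to a fixed size and the approximation band of a Haar transform (computed as 2x2 block averages, which equals the orthonormal Haar band up to a constant scale). The names `resize_nearest`, `haar_approx`, and `reduced_wavelet_features` are hypothetical.

```python
def resize_nearest(img, out_h, out_w):
    # Size normalization: nearest-neighbour resampling to a fixed grid.
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def haar_approx(img):
    # One level of the 2D Haar approximation band: the mean of each 2x2 block.
    # Assumes even height and width.
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def reduced_wavelet_features(img, size=8, levels=2):
    # Normalize the image size, then keep only the coarse approximation band.
    # size must be divisible by 2 ** levels.
    band = resize_nearest(img, size, size)
    for _ in range(levels):
        band = haar_approx(band)
    return [v for row in band for v in row]


# Usage: the same pattern captured at two different sizes (made-up pixel values).
small = [[0, 0, 4, 4],
         [0, 0, 4, 4],
         [8, 8, 12, 12],
         [8, 8, 12, 12]]
large = [[v for v in row for _ in range(2)] for row in small for _ in range(2)]
features_small = reduced_wavelet_features(small)
features_large = reduced_wavelet_features(large)
```

Both images yield the same feature vector, which is the property the normalization step is meant to provide before the features are passed to a neural network.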
Automatic target recognition with convolutional neural networks.
Automatic Target Recognition (ATR) is the ability of an algorithm or device to identify targets or other objects based on data obtained from sensors, most commonly thermal ones. ATR is an important technology for both civilian and military computer vision applications. However, current performance falls well short of the requirements. This is mainly due to the difficulty of acquiring targets in realistic environments, and also to restrictions on distributing classified data to the academic community for research purposes. This thesis proposes to solve the ATR task using Convolutional Neural Networks (CNNs). We present three learning approaches using WideResNet-28-2 as a backbone CNN. The first method uses random initialization of the network weights. The second method explores transfer learning. Finally, the third approach relies on spatial transformer networks to enhance the geometric invariance of the model. To validate, analyze, and compare our three proposed models, we use a large-scale real benchmark dataset that includes civilian and military vehicles. These targets are captured at different viewing angles, different resolutions, and different times of the day. We evaluate the effectiveness of our methods by studying their robustness to realistic scenarios where no ground truth data is available and targets are automatically detected. We show that the method that uses spatial transformer networks achieves the best results and is the most robust to the various perturbations that can be encountered in real applications.
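To illustrate how a spatial transformer resamples its input, here is a minimal sketch of the sampling step only: an affine grid over normalized coordinates followed by bilinear interpolation. In a real spatial transformer network the 2x3 matrix `theta` would be predicted by a learned localization network; here it is passed in by hand. The function names are hypothetical, the corner-aligned coordinate convention is an assumption, and none of this is taken from the thesis itself.

```python
import math


def pixel(img, r, c):
    # Out-of-bounds samples are treated as zero (zero padding).
    if 0 <= r < len(img) and 0 <= c < len(img[0]):
        return img[r][c]
    return 0.0


def affine_sample(img, theta):
    # Resample img under the 2x3 affine map theta, using normalized
    # coordinates in [-1, 1]. Requires height and width of at least 2.
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            # Target pixel in normalized coordinates.
            xt = -1.0 + 2.0 * c / (w - 1)
            yt = -1.0 + 2.0 * r / (h - 1)
            # Source location given by the affine map.
            xs = theta[0][0] * xt + theta[0][1] * yt + theta[0][2]
            ys = theta[1][0] * xt + theta[1][1] * yt + theta[1][2]
            px = (xs + 1.0) * (w - 1) / 2.0
            py = (ys + 1.0) * (h - 1) / 2.0
            # Bilinear interpolation over the four neighbouring pixels.
            x0, y0 = math.floor(px), math.floor(py)
            ax, ay = px - x0, py - y0
            row.append((1 - ax) * (1 - ay) * pixel(img, y0, x0)
                       + ax * (1 - ay) * pixel(img, y0, x0 + 1)
                       + (1 - ax) * ay * pixel(img, y0 + 1, x0)
                       + ax * ay * pixel(img, y0 + 1, x0 + 1))
        out.append(row)
    return out


# Usage on a made-up 3x3 image: identity map and horizontal flip.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
same = affine_sample(image, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
flipped = affine_sample(image, [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
```

Because the sampling is a smooth function of `theta`, a network can learn the transformation by gradient descent, which is what gives the approach its geometric invariance.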