87 research outputs found
Combining Appearance and Motion for Human Action Classification in Videos
We study the question of activity classification in videos and present a novel approach for recognizing human action categories by combining information from the appearance and motion of human body parts. Our approach uses a tracking step that involves Particle Filtering and a local non-parametric clustering step. The motion information is provided by the trajectory of the cluster modes of a local set of particles, while the statistical information about the particles of that cluster over a number of frames provides the appearance information. We then use a “Bag of Words” model to build one histogram per video sequence from this set of robust appearance and motion descriptors. These histograms provide characteristic information that helps us discriminate among various human actions and thus classify them correctly. We tested our approach on the standard KTH and Weizmann human action datasets and obtained results comparable to the state of the art. Additionally, our approach is able to distinguish activities that involve motion of the complete body from those in which only certain body parts move. In other words, our method discriminates well between activities with “gross motion”, like running and jogging, and “local motion”, like waving and boxing.
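The “Bag of Words” step described above can be illustrated with a minimal sketch: quantize per-frame appearance/motion descriptors against a learned codebook and build one histogram per video. This assumes the descriptors (from particle filtering and mode tracking) are already extracted; the vocabulary size and clustering choice here are illustrative, not the paper's exact settings.

```python
# Minimal sketch of the Bag of Words stage: cluster descriptors into a visual
# vocabulary, then summarize each video as a normalized word-count histogram.
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(all_descriptors, vocab_size=200, seed=0):
    """Cluster descriptors pooled from the training videos into a visual vocabulary."""
    return KMeans(n_clusters=vocab_size, random_state=seed).fit(all_descriptors)

def video_histogram(codebook, video_descriptors):
    """One L1-normalized histogram of visual-word counts per video sequence."""
    words = codebook.predict(video_descriptors)           # assign each descriptor to a word
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)                    # normalize so videos of any length compare
```

The resulting histograms can then be fed to any standard classifier to separate the action categories.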
People Detection and Pose Classification Inside a Moving Train Using Computer Vision
This paper was presented at the 5th International Visual Informatics Conference (IVIC 2017) and is also part of the Image Processing, Computer Vision, Pattern Recognition, and Graphics book sub-series (LNIP, volume 10645).

The use of surveillance video cameras in public transport is increasingly regarded as a solution to control vandalism and emergency situations. The widespread use of cameras brings the problem of managing high volumes of data, resulting in pressure on people and resources. We illustrate a possible step towards automating the monitoring task in the context of a moving train, where popular background removal algorithms struggle with rapidly changing illumination. We consider the detection of people in three possible postures: Sat down (on a train seat), Standing, and Sitting (halfway between sat down and standing). We then use the popular Histogram of Oriented Gradients (HOG) descriptor to train Support Vector Machines to detect people in any of the predefined postures. As a case study, we use the public BOSS dataset. We show different ways of training and combining the classifiers, obtaining a sensitivity improvement of about 12% when using a combination of three SVM classifiers instead of a single global (all-classes) classifier, at the expense of an increase of 6% in false positive rate. We believe this is the first set of public results on people detection using the BOSS dataset, so future researchers can use our results as a baseline to improve upon.

The work described here was carried out as part of the OBSERVE project funded by the Fondecyt Regular Program of Conicyt (Chilean Research Council for Science and Technology) under grant no. 1140209. S.A. Velastin is grateful for funding received from the Universidad Carlos III de Madrid, the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 600371, the Ministerio de Economía y Competitividad (COFUND2013-51509) and Banco Santander.
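A hedged sketch of the HOG + SVM posture pipeline the abstract describes: one one-vs-rest linear SVM per posture trained on HOG descriptors of person windows. Window extraction and the BOSS labels are assumed to be given; window size, SVM parameters, and the class names are illustrative assumptions rather than the paper's exact configuration.

```python
# Sketch: per-posture HOG + linear SVM classifiers, combined by taking the
# posture whose SVM produces the highest decision score.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.svm import LinearSVC

POSTURES = ["sat_down", "sitting", "standing"]    # illustrative class names

def hog_descriptor(window, size=(128, 64)):
    """Resize a grayscale person window to a fixed size and compute its HOG vector."""
    return hog(resize(window, size), orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_per_posture_svms(windows, labels):
    """Train one one-vs-rest linear SVM per posture (the three-classifier combination)."""
    X = np.array([hog_descriptor(w) for w in windows])
    return {p: LinearSVC(C=1.0).fit(X, np.array(labels) == p) for p in POSTURES}

def classify(window, svms):
    """Pick the posture whose SVM gives the highest decision score."""
    x = hog_descriptor(window).reshape(1, -1)
    return max(POSTURES, key=lambda p: svms[p].decision_function(x)[0])
```

Combining the per-posture scores this way is one plausible reading of the "combination of three SVM classifiers" setup; a single multi-class SVM would correspond to the global classifier it is compared against.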
Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation
We introduce a new loss function for the weakly-supervised training of
semantic image segmentation models based on three guiding principles: to seed
with weak localization cues, to expand objects based on the information about
which classes can occur in an image, and to constrain the segmentations to
coincide with object boundaries. We show experimentally that training a deep
convolutional neural network using the proposed loss function leads to
substantially better segmentations than previous state-of-the-art methods on
the challenging PASCAL VOC 2012 dataset. We furthermore give insight into the
working mechanism of our method by a detailed experimental study that
illustrates how the segmentation quality is affected by each term of the
proposed loss function as well as their combinations. Comment: ECCV 2016
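Of the three principles, the "seed" term is the simplest to sketch: a cross-entropy evaluated only at pixels where weak localization cues assign a label, leaving all other pixels unconstrained. The tensor shapes and cue encoding below are assumptions for illustration, not the paper's reference code; the expand and constrain terms are omitted.

```python
# Minimal sketch of a seeding loss: supervise only the cued pixels.
import torch
import torch.nn.functional as F

def seeding_loss(logits, seed_mask):
    """logits: (B, C, H, W) segmentation scores.
    seed_mask: (B, H, W) long tensor with a class index at cued pixels
    and -1 (ignore) everywhere else."""
    log_probs = F.log_softmax(logits, dim=1)
    return F.nll_loss(log_probs, seed_mask, ignore_index=-1)
```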
Superpixel Convolutional Networks using Bilateral Inceptions
In this paper we propose a CNN architecture for semantic image segmentation.
We introduce a new 'bilateral inception' module that can be inserted in
existing CNN architectures and performs bilateral filtering, at multiple
feature-scales, between superpixels in an image. The feature spaces for
bilateral filtering and other parameters of the module are learned end-to-end
using standard backpropagation techniques. The bilateral inception module
addresses two issues that arise with general CNN segmentation architectures.
First, this module propagates information between (super) pixels while
respecting image edges, thus using the structured information of the problem
for improved results. Second, the layer recovers a full resolution segmentation
result from the lower resolution solution of a CNN. In the experiments, we
modify several existing CNN architectures by inserting our inception module
between the last CNN (1x1 convolution) layers. Empirical results on three
different datasets show reliable improvements not only in comparison to the
baseline networks, but also in comparison to several dense-pixel prediction
techniques such as CRFs, while being competitive in time. Comment: European Conference on Computer Vision (ECCV), 2016
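The core idea can be sketched as follows: features are propagated between superpixels with Gaussian weights computed in a learnable feature space, so information flows along image structure rather than the pixel grid. This is a deliberate simplification under assumed shapes, not the module's reference implementation, and it omits the multiple feature scales and the upsampling back to full resolution.

```python
# Sketch of bilateral filtering between superpixels with a learned feature space.
import torch
import torch.nn as nn

class BilateralSuperpixelFilter(nn.Module):
    def __init__(self, guide_dim, scale=1.0):
        super().__init__()
        self.embed = nn.Linear(guide_dim, guide_dim)        # learned bilateral feature space
        self.log_scale = nn.Parameter(torch.tensor(float(scale)).log())

    def forward(self, sp_feats, sp_guides):
        """sp_feats: (N, D) superpixel features to be filtered.
        sp_guides: (N, guide_dim) guidance features, e.g. mean color + position."""
        g = self.embed(sp_guides)
        d2 = torch.cdist(g, g).pow(2)                       # pairwise squared distances
        w = torch.softmax(-d2 * self.log_scale.exp(), dim=1)
        return w @ sp_feats                                  # affinity-weighted averaging
```

Because the weights depend on distances in the embedded guidance space, edges in the image (large guidance differences) suppress propagation, which is the structured behaviour the module is designed to exploit.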
Deterministic variational inference for robust Bayesian neural networks
Bayesian neural networks (BNNs) hold great promise as a flexible and
principled solution to deal with uncertainty when learning from finite data.
Among approaches to realize probabilistic inference in deep neural networks,
variational Bayes (VB) is theoretically grounded, generally applicable, and
computationally efficient. With wide recognition of potential advantages, why
is it that variational Bayes has seen very limited practical use for BNNs in
real applications? We argue that variational inference in neural networks is
fragile: successful implementations require careful initialization and tuning
of prior variances, as well as controlling the variance of Monte Carlo gradient
estimates. We provide two innovations that aim to turn VB into a robust
inference tool for Bayesian neural networks: first, we introduce a novel
deterministic method to approximate moments in neural networks, eliminating
gradient variance; second, we introduce a hierarchical prior for parameters and
a novel Empirical Bayes procedure for automatically selecting prior variances.
Combining these two innovations, the resulting method is highly efficient and
robust. On heteroscedastic regression tasks we demonstrate good
predictive performance compared to alternative approaches.
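The first innovation can be illustrated with a toy moment calculation: instead of sampling weights, push the mean and variance of the output through a layer in closed form, which removes the Monte Carlo gradient variance the abstract points to. The sketch below assumes fully-factorized Gaussian weights and a deterministic input; propagating moments through nonlinear activations requires additional approximations not shown here.

```python
# Sketch: closed-form output moments of a linear layer with factorized Gaussian weights.
import numpy as np

def linear_layer_moments(x, w_mean, w_var, b_mean, b_var):
    """Mean and variance of y = W x + b for independent Gaussian W and b.
    x: (d_in,); w_mean, w_var: (d_out, d_in); b_mean, b_var: (d_out,)."""
    y_mean = w_mean @ x + b_mean
    y_var = w_var @ (x ** 2) + b_var    # Var[sum_i W_ji x_i] = sum_i Var[W_ji] * x_i^2
    return y_mean, y_var
```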
Segmenting Planar Superpixel Adjacency Graphs w.r.t. Non-planar Superpixel Affinity Graphs
We address the problem of segmenting an image into a previously unknown number of segments from the perspective of graph partitioning. Specifically, we consider minimum multicuts of superpixel affinity graphs in which all affinities between non-adjacent superpixels are negative. We propose a relaxation by Lagrangian decomposition and a constrained set of re-parameterizations for which we can optimize exactly and efficiently. Our contribution is to show how the planarity of the adjacency graph can be exploited if the affinity graph is non-planar. We demonstrate the effectiveness of this approach in user-assisted image segmentation and show that the solution of the relaxed problem is fast and the relaxation is tight in practice.
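For orientation only, the objective being minimized can be written down in a few lines: the total affinity of edges whose endpoints end up in different segments. The Lagrangian decomposition and the planar solver themselves are well beyond a short sketch and are omitted.

```python
# Sketch: evaluating the multicut objective for a given superpixel segmentation.
def multicut_cost(edges, labels):
    """edges: iterable of (u, v, affinity); labels: dict superpixel -> segment id.
    Edges with negative affinity (the non-adjacent, repulsive ones) lower the
    objective when cut, so the minimizer tends to separate their endpoints."""
    return sum(w for u, v, w in edges if labels[u] != labels[v])
```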
A Comparative Study of Modern Inference Techniques for Structured Discrete Energy Minimization Problems
Szeliski et al. published an influential study in 2006 on energy minimization methods for Markov Random Fields (MRF). This study provided valuable insights into choosing the best optimization technique for certain classes of problems. While these insights remain generally useful today, the phenomenal success of random field models means that the kinds of inference problems that have to be solved have changed significantly. Specifically, the models today often include higher-order interactions, flexible connectivity structures, large label spaces of different cardinalities, or learned energy tables. To reflect these changes, we provide a modernized and enlarged study. We present an empirical comparison of more than 27 state-of-the-art optimization techniques on a corpus of 2,453 energy minimization instances from diverse applications in computer vision. To ensure reproducibility, we evaluate all methods in the OpenGM 2 framework and report extensive results regarding runtime and solution quality. Key insights from our study agree with the results of Szeliski et al. for the types of models they studied. However, on new and challenging types of models our findings disagree, suggesting that polyhedral methods and integer programming solvers are competitive in terms of runtime and solution quality over a large range of model types.
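As a point of reference, the quantity all of these solvers minimize is a discrete energy over a labeling; a small sketch for the pairwise case is given below under assumed data layouts. Higher-order models from the study generalize this by adding factors over larger variable subsets.

```python
# Sketch: energy of a labeling under unary and pairwise terms of a discrete MRF.
import numpy as np

def pairwise_energy(unaries, edges, pairwise, labeling):
    """unaries: (n_vars, n_labels) array; edges: list of (i, j) pairs;
    pairwise: dict mapping (i, j) -> (n_labels, n_labels) cost table;
    labeling: length-n_vars sequence of integer labels."""
    e = sum(unaries[i, labeling[i]] for i in range(len(labeling)))
    e += sum(pairwise[(i, j)][labeling[i], labeling[j]] for i, j in edges)
    return e
```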
A Two-Dimensional Electron Gas as a Sensitive Detector for Time-Resolved Tunneling Measurements on Self-Assembled Quantum Dots
A two-dimensional electron gas (2DEG) situated near a single layer of self-assembled quantum dots (QDs) in an inverted high electron mobility transistor (HEMT) structure is used as a detector for time-resolved tunneling measurements. We demonstrate a strong influence of charged QDs on the conductance of the 2DEG, which allows us to probe the tunneling dynamics between the 2DEG and the QDs in a time-resolved manner. Measurements of hysteresis curves with different sweep times and real-time conductance measurements, in combination with a boxcar-like evaluation method, enable us to unambiguously identify the transients as tunneling events between the s- and p-electron QD states and the 2DEG and to rule out defect-related transients.