Histogram of oriented rectangles: A new pose descriptor for human action recognition
Most approaches to human action recognition tend to form complex models that require extensive parameter estimation and computation time. In this study, we show that human actions can be represented simply by pose, without dealing with a complex representation of dynamics. Based on this idea, we propose a novel pose descriptor, which we name Histogram-of-Oriented-Rectangles (HOR), for representing and recognizing human actions in videos. We represent each human pose in an action sequence by oriented rectangular patches extracted over the human silhouette. We then form spatial oriented histograms to represent the distribution of these rectangular patches. We make use of several matching strategies to carry the information from the spatial domain described by the HOR descriptor to the temporal domain. These are (i) nearest neighbor classification, which recognizes the actions by matching the descriptors of each frame, (ii) global histogramming, which extends the idea of the Motion Energy Image proposed by Bobick and Davis to rectangular patches, (iii) a classifier-based approach using Support Vector Machines, and (iv) an adaptation of Dynamic Time Warping to the temporal representation of the HOR descriptor. For cases where the pose descriptor alone is not sufficiently discriminative, such as differentiating the actions "jogging" and "running", we also incorporate a simple velocity descriptor as a prior to the pose-based classification step. We test our system with different configurations and experiment on two commonly used action datasets: the Weizmann dataset and the KTH dataset. Results show that our method outperforms other methods on the Weizmann dataset with a perfect accuracy of 100%, and is comparable to other methods on the KTH dataset with a success rate close to 90%. These results show that, with a simple and compact representation, we can achieve robust recognition of human actions, compared to complex representations. © 2009 Elsevier B.V. All rights reserved
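As a rough illustration of the descriptor itself, the Python sketch below bins oriented rectangular patches into a spatial grid of orientation histograms. The grid size, the number of orientation bins, and the assumption that rectangle detections arrive as (center_x, center_y, angle) tuples are illustrative choices, not the configuration reported in the paper.

# Minimal HOR-style pose descriptor sketch, assuming rectangle detections
# are already available as (center_x, center_y, orientation_in_degrees).
import numpy as np

def hor_descriptor(rects, frame_shape, grid=(3, 3), n_orient_bins=12):
    """Spatially binned orientation histogram of rectangular patches.

    rects       : iterable of (cx, cy, theta_deg) rectangle detections
    frame_shape : (height, width) of the silhouette frame
    returns     : flattened histogram of length grid[0]*grid[1]*n_orient_bins
    """
    h, w = frame_shape
    hist = np.zeros((grid[0], grid[1], n_orient_bins), dtype=float)
    for cx, cy, theta in rects:
        gy = min(int(cy / h * grid[0]), grid[0] - 1)   # spatial row bin
        gx = min(int(cx / w * grid[1]), grid[1] - 1)   # spatial column bin
        ob = int((theta % 180) / 180.0 * n_orient_bins) % n_orient_bins  # orientation bin
        hist[gy, gx, ob] += 1.0
    total = hist.sum()
    return (hist / total).ravel() if total > 0 else hist.ravel()

# Toy usage: three synthetic rectangles on a 120x80 frame.
if __name__ == "__main__":
    rects = [(20, 30, 10.0), (60, 90, 95.0), (55, 100, 170.0)]
    print(hor_descriptor(rects, frame_shape=(120, 80)).shape)  # (108,)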
Diagnosis of gastric carcinoma by classification on feature projections
A new classification algorithm, called benefit maximizing classifier on feature projections (BCFP), is developed and applied to the problem of diagnosis of gastric carcinoma. The domain contains records of patients with known diagnoses obtained through gastroscopy. Given a training set of such records, the BCFP classifier learns how to classify a new case in the domain. BCFP represents a concept in the form of feature projections on each feature dimension separately. Classification in the BCFP algorithm is based on a vote among the individual predictions made on each feature. In the gastric carcinoma domain, a lesion can be an indicator of one of nine different levels of gastric carcinoma, from early to late stages. The benefit of correctly classifying early levels is much greater than that of late cases. Also, the costs of wrong classifications are not symmetric. In the training phase, the BCFP algorithm learns classification rules that maximize the benefit of classification. In the querying phase, using these rules, the BCFP algorithm tries to make a prediction that maximizes the benefit. A genetic algorithm is applied to select the relevant features. The performance of the BCFP algorithm is evaluated in terms of accuracy and running time. The rules induced are verified by domain experts. © 2004 Elsevier B.V. All rights reserved
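A rough sketch of the voting idea behind classification on feature projections is given below: each feature dimension casts a vote for the class of its nearest training projection, and the majority wins. The actual BCFP algorithm learns benefit-maximizing intervals on each feature and weights predictions by benefit, which this minimal stand-in does not attempt.

import numpy as np

class FeatureProjectionVoter:
    """Per-feature voting on 1-D projections; a simplified analogue of BCFP."""

    def fit(self, X, y):
        self.X = np.asarray(X, dtype=float)   # (n_samples, n_features)
        self.y = np.asarray(y)
        self.classes_ = np.unique(self.y)
        return self

    def predict_one(self, x):
        votes = {c: 0.0 for c in self.classes_}
        for f, value in enumerate(x):
            # Project the query onto feature f and find the closest training value.
            nearest = np.argmin(np.abs(self.X[:, f] - value))
            votes[self.y[nearest]] += 1.0      # one vote per feature dimension
        return max(votes, key=votes.get)

    def predict(self, X):
        return np.array([self.predict_one(x) for x in np.asarray(X, dtype=float)])

# Toy usage with two overlapping 2-D classes.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
    y = np.array([0] * 20 + [1] * 20)
    clf = FeatureProjectionVoter().fit(X, y)
    print(clf.predict([[0.2, -0.1], [2.8, 3.1]]))  # expected roughly [0 1]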
From Traditional to Modern: Domain Adaptation for Action Classification in Short Social Video Clips
Short internet video clips like vines present a significantly wilder distribution than traditional video datasets. In this paper, we focus on the problem of unsupervised action classification in wild vine clips using traditional labeled datasets. To this end, we use a simple domain adaptation strategy based on data augmentation. We utilise the semantic word2vec space as a common subspace in which to embed video features from both the labeled source domain and the unlabeled target domain. Our method incrementally augments the labeled source set with target samples and iteratively modifies the embedding function to bring the source and target distributions together. Additionally, we utilise a multi-modal representation that incorporates the noisy semantic information available in the form of hash-tags. We show the effectiveness of this simple adaptation technique on a test set of vines and achieve notable improvements in performance.
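The incremental augmentation step can be pictured with the schematic sketch below: labeled source features and unlabeled target features live in a common space, the most confidently pseudo-labeled target samples are folded into the labeled pool each round, and the process repeats. The paper embeds features in a word2vec space and also refines the embedding function itself; both are omitted here, and the nearest-class-mean rule is only an assumed stand-in.

import numpy as np

def incremental_augmentation(Xs, ys, Xt, n_rounds=5, per_round=5):
    """Grow the labeled pool (Xs, ys) with pseudo-labeled target samples Xt."""
    X_lab, y_lab = Xs.copy(), ys.copy()
    remaining = Xt.copy()
    for _ in range(n_rounds):
        if len(remaining) == 0:
            break
        classes = np.unique(y_lab)
        means = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
        # Distance of every unlabeled target sample to each class mean.
        d = np.linalg.norm(remaining[:, None, :] - means[None, :, :], axis=2)
        pseudo = classes[d.argmin(axis=1)]
        conf = -d.min(axis=1)                     # closer = more confident
        pick = np.argsort(conf)[-per_round:]      # most confident samples this round
        X_lab = np.vstack([X_lab, remaining[pick]])
        y_lab = np.concatenate([y_lab, pseudo[pick]])
        remaining = np.delete(remaining, pick, axis=0)
    return X_lab, y_lab

# Usage: X_aug, y_aug = incremental_augmentation(source_feats, source_labels, target_feats)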
Human action recognition using distribution of oriented rectangular patches
We describe a "bag-of-rectangles" method for representing and recognizing human actions in videos. In this method, each human pose in an action sequence is represented by oriented rectangular patches extracted over the whole body. Spatial oriented histograms are then formed to represent the distribution of these rectangular patches. In order to carry the information from the spatial domain described by the bag-of-rectangles descriptor to the temporal domain for recognition of the actions, four different methods are proposed: (i) frame-by-frame voting, which recognizes the actions by matching the descriptors of each frame, (ii) global histogramming, which extends the idea of the Motion Energy Image proposed by Bobick and Davis to rectangular patches, (iii) a classifier-based approach using SVMs, and (iv) an adaptation of Dynamic Time Warping to the temporal representation of the descriptor. Detailed experiments are carried out on the action dataset of Blank et al. High success rates (100%) show that with a very simple and compact representation, we can achieve robust recognition of human actions, compared to complex representations. © Springer-Verlag Berlin Heidelberg 2007
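Of the four temporal matching methods, the Dynamic Time Warping variant is the least standard to spell out. The sketch below computes a DTW cost between two sequences of per-frame descriptors (for example, the rectangle histograms above); a query sequence would then be assigned the label of the training sequence with the smallest cost. The frame-to-frame Euclidean cost is an assumption, and the paper's exact temporal representation may differ.

import numpy as np

def dtw_distance(seq_a, seq_b):
    """DTW alignment cost between two (T, D) descriptor sequences."""
    a, b = np.asarray(seq_a, float), np.asarray(seq_b, float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])          # frame-to-frame cost
            cost[i, j] = d + min(cost[i - 1, j],              # insertion
                                 cost[i, j - 1],              # deletion
                                 cost[i - 1, j - 1])          # match
    return cost[n, m]

# Nearest-neighbour over sequences: label a query with the training sequence
# that minimizes dtw_distance(query, training_sequence).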
Searching for complex human activities with no visual examples
We describe a method of representing human activities that allows a collection of motions to be queried without examples, using a simple and effective query language. Our approach is based on units of activity at segments of the body, which can be composed across space and across the body to produce complex queries. The presence of search units is inferred automatically by tracking the body, lifting the tracks to 3D, and comparing them to models trained using motion capture data. Our models of short-time-scale limb behaviour are built using a labelled motion capture set. We show results for a large range of queries applied to a collection of complex motions and activities. We compare with discriminative methods applied to tracker data; our method offers significantly improved performance. We show experimental evidence that our method is robust to view direction and is unaffected by some important changes of clothing. © 2008 Springer Science+Business Media, LLC
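A toy illustration of composing queries from body-part activity units is sketched below: each frame carries a set of detected unit names, and a query is a conjunction of units that must co-occur within a temporal window. The unit names and the window-based conjunction are hypothetical simplifications of the paper's query language.

def query_hits(frames, required_units, window=10):
    """frames: list of sets of detected unit names, one set per frame.
    Returns start indices of windows in which all required units occur."""
    hits = []
    for start in range(0, max(len(frames) - window + 1, 0)):
        seen = set().union(*frames[start:start + window])
        if all(u in seen for u in required_units):
            hits.append(start)
    return hits

# Example query: windows where the legs walk while an arm reaches.
# query_hits(frames, {"legs_walk", "left_arm_reach"})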
Person search made easy
In this study, we present a method to substantially reduce the number of retrieved images and increase retrieval performance for person queries on broadcast news videos. A multi-modal approach which integrates face and text information is proposed. A state-of-the-art face detection algorithm is improved with a skin-color based method to eliminate false alarms. This pruned set is clustered to group similar faces, and representative faces are selected from each cluster to be presented to the user. For six person queries of TRECVID2004, on average, the retrieval rate is increased from 8% to around 50%, and the number of images that the user has to inspect is reduced from hundreds or thousands to tens. © Springer-Verlag Berlin Heidelberg 2005
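The skin-color based pruning of face detections can be sketched as follows: a detection is kept only if a sufficient fraction of its pixels passes a coarse RGB skin rule. The specific RGB thresholds and the 0.4 keep ratio below are illustrative heuristics, not the model used in the paper.

import numpy as np

def skin_ratio(crop_rgb):
    """Fraction of pixels in an (H, W, 3) uint8 crop that pass a coarse skin rule."""
    r = crop_rgb[..., 0].astype(int)
    g = crop_rgb[..., 1].astype(int)
    b = crop_rgb[..., 2].astype(int)
    spread = crop_rgb.max(axis=-1).astype(int) - crop_rgb.min(axis=-1).astype(int)
    skin = (r > 95) & (g > 40) & (b > 20) & (spread > 15) & \
           (np.abs(r - g) > 15) & (r > g) & (r > b)
    return skin.mean()

def prune_detections(frame_rgb, boxes, min_skin=0.4):
    """Keep (x, y, w, h) detector boxes whose crop contains enough skin-like pixels."""
    kept = []
    for (x, y, w, h) in boxes:
        crop = frame_rgb[y:y + h, x:x + w]
        if crop.size and skin_ratio(crop) >= min_skin:
            kept.append((x, y, w, h))
    return kept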
Maximizing benefit of classifications using feature intervals
There is a great need for classification methods that can properly handle asymmetric cost and benefit constraints of classification. In this study, we aim to emphasize the importance of classification benefits by means of a new classification algorithm, Benefit-Maximizing classifier with Feature Intervals (BMFI), which uses a feature projection based knowledge representation. Empirical results show that BMFI has promising performance compared to recent cost-sensitive algorithms in terms of the benefit gained.
Human action recognition with line and flow histograms
We present a compact representation for human action recognition in videos using line and optical flow histograms. We introduce a new shape descriptor based on the distribution of lines fitted to the boundaries of human figures. Using an entropy-based approach, we apply feature selection to densify our feature representation, thus minimizing classification time without degrading accuracy. We also use a compact representation of optical flow for motion information. Using line and flow histograms together with global velocity information, we show that high-accuracy action recognition is possible, even in challenging recording conditions. © 2008 IEEE
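The entropy-based feature selection can be sketched as follows: discretize each feature over the training set, compute its entropy, and keep the most informative dimensions. The bin count and the number of retained features below are assumed values for illustration, not the paper's settings.

import numpy as np

def entropy_select(X, k=50, n_bins=10):
    """Return indices of the k features with highest (discretized) entropy."""
    X = np.asarray(X, dtype=float)
    ent = np.zeros(X.shape[1])
    for f in range(X.shape[1]):
        counts, _ = np.histogram(X[:, f], bins=n_bins)
        p = counts / counts.sum()
        p = p[p > 0]
        ent[f] = -(p * np.log2(p)).sum()
    return np.argsort(ent)[-k:]

# Usage: idx = entropy_select(train_features); compact = train_features[:, idx]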
On recognizing actions in still images via multiple features
We propose a multi-cue based approach for recognizing human actions in still images, where relevant object regions are discovered and utilized in a weakly supervised manner. Our approach does not require any explicitly trained object detector or part/attribute annotation. Instead, a multiple instance learning approach is used over sets of object hypotheses in order to represent objects relevant to the actions. We test our method on the extensive Stanford 40 Actions dataset [1] and achieve significant performance gains compared to the state-of-the-art. Our results show that using multiple object hypotheses within multiple instance learning is effective for human action recognition in still images and that such an object representation is suitable for use in conjunction with other visual features. © 2012 Springer-Verlag
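As a crude stand-in for the multiple instance learning step, the sketch below trains an instance-level classifier by letting every object hypothesis inherit its image's action label, and scores a test image by its best-scoring hypothesis. This max-scoring baseline only approximates a proper MIL formulation, and the feature extraction for region hypotheses is assumed to happen elsewhere.

import numpy as np
from sklearn.linear_model import LogisticRegression

def train_instance_model(bags, labels):
    """bags: list of (n_regions_i, D) feature arrays; labels: one label per bag."""
    X = np.vstack(bags)
    y = np.concatenate([[lab] * len(b) for b, lab in zip(bags, labels)])
    return LogisticRegression(max_iter=1000).fit(X, y)

def predict_bag(model, bag):
    """Label the bag by its single most confident region hypothesis."""
    proba = model.predict_proba(bag)            # (n_regions, n_classes)
    best_region = proba.max(axis=1).argmax()
    return model.classes_[proba[best_region].argmax()]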