3D Human Activity Recognition with Reconfigurable Convolutional Neural Networks
Human activity understanding with 3D/depth sensors has received increasing
attention in multimedia processing and interactions. This work aims to
develop a novel deep model for automatic activity recognition from RGB-D
videos. We represent each human activity as an ensemble of cubic-like video
segments, and learn to discover the temporal structures for a category of
activities, i.e. how the activities are to be decomposed for
classification. Our model can be regarded as a structured deep architecture, as
it extends the convolutional neural networks (CNNs) by incorporating structure
alternatives. Specifically, we build a network of 3D convolutions
and max-pooling operators over the video segments, and introduce latent
variables in each convolutional layer that control the activation of neurons.
Our model thus advances existing approaches in two aspects: (i) it acts
directly on the raw inputs (grayscale-depth data) to conduct recognition
instead of relying on hand-crafted features, and (ii) the model structure can
be dynamically adjusted to account for the temporal variations of human
activities, i.e. the network configuration is allowed to be partially activated
during inference. For model training, we propose an EM-type optimization method
that iteratively (i) discovers the latent structure by determining the
decomposed actions for each training example, and (ii) learns the network
parameters by using the back-propagation algorithm. Our approach is validated
in challenging scenarios, and outperforms state-of-the-art methods. A large
human activity database of RGB-D videos is presented in addition.
Comment: This manuscript has 10 pages with 9 figures, and a preliminary
version was published in ACM MM'14 conference
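The EM-type alternation described in the abstract above can be illustrated with a toy sketch. This is not the authors' model: a set of scalar weights stands in for the network, the latent variable is simply the index of the weight that is "activated" for each example, and plain gradient descent stands in for back-propagation. The names `e_step`, `m_step`, and `train`, the data, and the hyperparameters are all assumptions made for illustration.

```python
def e_step(data, weights):
    """Latent-structure step: for each (x, y) example, pick the index of the
    weight whose prediction weights[k] * x has the lowest squared error."""
    return [
        min(range(len(weights)), key=lambda k: (weights[k] * x - y) ** 2)
        for x, y in data
    ]


def m_step(data, assign, weights, lr=0.05, steps=20):
    """Parameter step: gradient descent on each weight, using only the
    examples currently assigned to it (a stand-in for back-propagation)."""
    weights = list(weights)
    for _ in range(steps):
        grads = [0.0] * len(weights)
        counts = [0] * len(weights)
        for (x, y), k in zip(data, assign):
            grads[k] += 2 * (weights[k] * x - y) * x
            counts[k] += 1
        for k in range(len(weights)):
            if counts[k]:
                weights[k] -= lr * grads[k] / counts[k]
    return weights


def train(data, weights, rounds=10):
    """Alternate the two steps for a fixed number of rounds."""
    for _ in range(rounds):
        assign = e_step(data, weights)
        weights = m_step(data, assign, weights)
    return weights, e_step(data, weights)


# Toy data drawn from two hidden "structures": y = 2x and y = -x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0),
        (1.0, -1.0), (2.0, -2.0), (3.0, -3.0)]
weights, assign = train(data, [0.5, -0.5])
```

In this toy setting the first weight converges towards 2 and the second towards -1, with each example assigned to the weight that explains it. The real method replaces the per-example index with latent variables inside each convolutional layer.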
Characterizing Compressibility of Disjoint Subgraphs with NLC Grammars
We consider compression of a given set S of isomorphic and disjoint subgraphs of a graph G using node label controlled (NLC) graph grammars. Given S and G, we characterize whether or not there exists an NLC graph grammar consisting of exactly one rule such that (1) each of the subgraphs of G in S is compressed (i.e., replaced by a nonterminal) in the (unique) initial graph I, and (2) the set of generated terminal graphs is the singleton {G}.
Acceptance rate: 39%. Status: published
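For readers unfamiliar with NLC grammars, a single derivation step can be sketched as follows. This is a minimal illustration of the standard NLC embedding mechanism, not code from the paper: the graph encoding (a label dictionary plus a set of `frozenset` edges), the function name `apply_nlc_rule`, and the convention that the connection relation holds `(daughter_label, neighbour_label)` pairs are all assumptions. Simple undirected graphs without self-loops are assumed.

```python
def apply_nlc_rule(nodes, edges, target, daughter_nodes, daughter_edges, connection):
    """Replace node `target` with a daughter graph.

    nodes:          dict mapping node id -> label
    edges:          set of frozenset({u, v}) undirected edges
    daughter_nodes: dict mapping fresh node ids -> label
    daughter_edges: set of frozenset({u, v}) edges inside the daughter graph
    connection:     set of (daughter_label, neighbour_label) pairs; a daughter
                    node is joined to a former neighbour of `target` exactly
                    when their label pair is in this relation
    """
    if set(daughter_nodes) & set(nodes):
        raise ValueError("daughter node ids must be fresh")

    # Former neighbours of the replaced node.
    neighbours = {next(iter(e - {target})) for e in edges if target in e}

    # Remove the target and its incident edges, then add the daughter graph.
    new_nodes = {n: label for n, label in nodes.items() if n != target}
    new_nodes.update(daughter_nodes)
    new_edges = {e for e in edges if target not in e} | set(daughter_edges)

    # Embedding: connect by label only, as the connection relation dictates.
    for d, d_label in daughter_nodes.items():
        for u in neighbours:
            if (d_label, nodes[u]) in connection:
                new_edges.add(frozenset((d, u)))
    return new_nodes, new_edges


# Host graph: nonterminal node 0 ("X") adjacent to terminals 1 ("a") and 2 ("b").
host_nodes = {0: "X", 1: "a", 2: "b"}
host_edges = {frozenset((0, 1)), frozenset((0, 2))}

# Rule X -> daughter graph with an "a" node and a "b" node joined by an edge.
result_nodes, result_edges = apply_nlc_rule(
    host_nodes, host_edges, 0,
    {10: "a", 11: "b"}, {frozenset((10, 11))},
    {("a", "b"), ("b", "a")},
)
```

Compression in the sense of the abstract runs this step in reverse: each subgraph in S is collapsed to a nonterminal, and the question is whether one rule with one connection relation regenerates exactly G.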
A new paradigm based on agents applied to free-hand sketch recognition
Important advances in natural calligraphic interfaces for CAD (Computer Aided Design) applications are being achieved, enabling the development of CAS (Computer Aided Sketching) devices that support the conceptual design phase of a product. Recognizers play an important role in this field, allowing the interpretation of the user’s intention, but they still have some important shortcomings. This paper proposes a new recognition paradigm using an agent-based architecture that does not depend on the drawing sequence and takes context information into account to help decisions. Another improvement is the absence of operation modes, that is, no button is needed to distinguish geometry from symbols or gestures, and “interspersing” and “overtracing” are also supported.
The Spanish Ministry of Science and Education and the FEDER Funds, through the CUESKETCH project (Ref. DPI2007-66755-C02-01), partially supported this work.
Fernández Pacheco, D.; Albert Gil, FE.; Aleixos Borrás, MN.; Conesa Pastor, J. (2012). A new paradigm based on agents applied to free-hand sketch recognition. Expert Systems with Applications. 39(8):7181-7195. https://doi.org/10.1016/j.eswa.2012.01.063
Agent-based framework for person re-identification
In computer based human object re-identification, a detected human is recognised to a
level sufficient to re-identify a tracked person in either a different camera capturing the
same individual, often at a different angle, or the same camera at a different time and/or
the person approaching the camera at a different angle. Instead of relying on face
recognition technology such systems study the clothing of the individuals being monitored
and/or objects being carried to establish correspondence and hence re-identify the human
object.
Unfortunately, present human-object re-identification systems consider the entire human
object as one connected region in making the decisions about similarity of two objects
being matched. This assumption has a major drawback in that when a person is partially
occluded, a part of the occluding foreground will be picked up and used in matching. Our
research revealed that when a human observer carries out a manual human-object re-identification
task, the attention is often taken over by some parts of the human
figure/body, more than the others, e.g. face, brightly coloured shirt, presence of texture
patterns in clothing etc., and occluding parts are ignored.
In this thesis, a novel multi-agent based framework is proposed for the design of a human
object re-identification system. Initially, HOG-based feature extraction is used in an
SVM-based classification of a human object as either full-body or half-body.
Subsequently, the relative visual significance of the top and bottom parts of the human
in re-identification is quantified by the analysis of Gray Level Co-occurrence based
texture features and colour histograms obtained in the HSV colour space. Accordingly,
different weights are assigned to the top and bottom of the human body using a novel
probabilistic approach. The weights are then used to modify the Hybrid Spatiogram and
Covariance Descriptor (HSCD) feature based re-identification algorithm adopted.
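The idea of weighting the top and bottom halves separately can be sketched as below. This is only an illustration under stated assumptions: plain histogram intersection stands in for the HSCD-based matching, the weight `w_top` is supplied by hand rather than derived from the thesis's probabilistic approach, and the function names and example histograms are hypothetical.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two histograms with the same bins.
    Each histogram is normalised to sum to 1 before comparison."""
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 0.0
    return sum(min(a / s1, b / s2) for a, b in zip(h1, h2))


def weighted_similarity(top_a, bottom_a, top_b, bottom_b, w_top):
    """Combine per-part similarities; w_top is the weight of the top half
    and (1 - w_top) the weight of the bottom half."""
    if not 0.0 <= w_top <= 1.0:
        raise ValueError("w_top must lie in [0, 1]")
    top_score = histogram_intersection(top_a, top_b)
    bottom_score = histogram_intersection(bottom_a, bottom_b)
    return w_top * top_score + (1.0 - w_top) * bottom_score


# Hypothetical 3-bin colour histograms: the tops match, the bottoms do not
# (for example, because the lower body is occluded in one view).
score = weighted_similarity(
    top_a=[4, 0, 0], bottom_a=[0, 3, 0],
    top_b=[2, 0, 0], bottom_b=[0, 0, 5],
    w_top=0.75,
)
```

With a high weight on the more distinctive half, a mismatch in the less significant or occluded half lowers the score without dominating it.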
A significant novelty of the human object re-identification systems proposed in this thesis
is the agent based design procedure adopted that separates the use of computer vision
algorithms for feature extraction, comparison etc., from the decision making process of
re-identification. Multiple agents are assigned to execute different algorithmic tasks and
the agents communicate to make the required logical decisions.
Detailed experimental results are provided to demonstrate that the proposed multi-agent based
framework for human object re-identification performs significantly better than the
state-of-the-art algorithms. Further, it is shown that the design flexibility and scalability of
the proposed system allow it to be effectively utilised in more complex computer vision
based video analytic/forensic tasks often conducted within distributed, multi-camera
systems.