
    Semi-supervised and unsupervised extensions to maximum-margin structured prediction

    University of Technology Sydney. Faculty of Engineering and Information Technology. Structured prediction is the backbone of various computer vision and machine learning applications. Inspired by the success of maximum-margin classifiers in recent years, in this thesis we present novel semi-supervised and unsupervised extensions to structured prediction via maximum-margin classifiers. For semi-supervised structured prediction, we tackle the problem of recognizing actions from single images. Action recognition from a single image is an important task for applications such as image annotation, robotic navigation, video surveillance, and several others. We propose approaching action recognition by first partitioning the entire image into “superpixels”, and then using their latent classes as attributes of the action. The action class is predicted based on a graphical model composed of measurements from each superpixel and a fully-connected graph of superpixel classes. The model is learned using a latent structural SVM approach, and an efficient greedy algorithm is proposed to perform inference over the graph. Unlike most existing methods, the proposed approach does not require annotation of the actor (usually provided as a bounding box). For the unsupervised extension of structured prediction, we consider the case of labeling binary sequences. This case is important in a detection scenario, where one is interested in detecting an action or an event. In particular, we address the unsupervised SVM relaxation recently proposed in (Li et al. 2013) and extend it to structured prediction by merging it with structural SVM. The main contribution of the proposed extension (named Well-SSVM) is a re-organization of the feature map and loss function of structural SVM that permits finding the violating labelings required by the relaxation. 
Experiments on synthetic and real datasets in a fully unsupervised setting show performance competitive with other unsupervised algorithms such as k-means and latent structural SVM. Finally, we approach the problem of unsupervised structured prediction with M³ Networks. M³ Networks are an alternative formulation of maximum-margin structured prediction that can satisfy the complete set of constraints for decomposable feature and loss functions; hence, unlike structural SVM, the entire set of constraints is considered during the search for the optimal margin. In the thesis, we present the interpretation of M³ Networks in Well-SSVM, thus allowing us to use them in semi-supervised and unsupervised scenarios.

    Actom Sequence Models for Efficient Action Detection

    We address the problem of detecting actions, such as drinking or opening a door, in hours of challenging video data. We propose a model based on a sequence of atomic action units, termed "actoms", that are characteristic of the action. Our model represents the temporal structure of actions as a sequence of histograms of actom-anchored visual features. Our representation, which can be seen as a temporally structured extension of the bag-of-features, is flexible, sparse and discriminative. We refer to our model as the Actom Sequence Model (ASM). Training requires the annotation of actoms for action clips. At test time, actoms are detected automatically, based on a non-parametric model of the distribution of actoms, which also acts as a prior on an action's temporal structure. We present experimental results on two recent benchmarks for temporal action detection. We show that our ASM method outperforms the current state of the art in temporal action detection.

    Detectability prediction of hidden Markov models with cluttered observation sequences

    There is good reason to model an asymmetric threat (a structured action such as a terrorist attack) as an HMM whose observations are cluttered. Recently a Bernoulli filter was presented that can process cluttered observations ("transactions") and is capable of detecting whether an HMM is present and, if so, estimating the state of the HMM. An important question in this context is: when is the HMM-in-clutter problem feasible? In other words, what system properties allow for a solvable problem? In this paper we show that, given a Gaussian approximation of the pdf of the log-likelihood, approximate detection error bounds can be derived. These error bounds allow a prediction of the detection performance, i.e. a prediction of the probability of detection given an "operating point" of transaction-level false alarm rate and miss probability. Simulations show that our analysis accurately predicts detectability of such threats. Our purpose here is to make statements about what sort of threats can be detected, and what quality of observations is necessary to accomplish this.
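    As a rough illustration of how a Gaussian approximation of a log-likelihood statistic can yield a predicted probability of detection, the sketch below applies the textbook threshold test to two Gaussian hypotheses. It is not the paper's derivation: the function name `detection_probability`, the parameters `mu0`, `sigma0`, `mu1`, `sigma1`, and the numbers in the usage example are all hypothetical, and the mapping from transaction-level false alarm rate and miss probability to these moments is left out entirely.

```python
from statistics import NormalDist


def detection_probability(mu0, sigma0, mu1, sigma1, p_fa):
    """Predicted P_D of a threshold test on a Gaussian-distributed statistic.

    Assumption (not from the abstract): the statistic is N(mu0, sigma0^2)
    when no HMM is present and N(mu1, sigma1^2) when one is present.
    """
    # Threshold that yields the requested false-alarm probability under "no HMM".
    tau = NormalDist(mu0, sigma0).inv_cdf(1.0 - p_fa)
    # Probability that the statistic exceeds the threshold under "HMM present".
    return 1.0 - NormalDist(mu1, sigma1).cdf(tau)


if __name__ == "__main__":
    # Hypothetical moments, chosen only to exercise the function.
    pd = detection_probability(mu0=0.0, sigma0=1.0, mu1=3.0, sigma1=1.5, p_fa=0.01)
    print(f"Predicted P_D at P_FA = 0.01: {pd:.3f}")
```

    When the two hypotheses have identical moments, the predicted detection probability collapses to the false-alarm probability, which is a quick sanity check on the sketch.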