
    Attributes2Classname: A Discriminative Model for Attribute-Based Unsupervised Zero-Shot Learning

    We propose a novel approach for unsupervised zero-shot learning (ZSL) of classes based on their names. Most existing unsupervised ZSL methods aim to learn a model for directly comparing image features and class names. However, this proves to be a difficult task due to the dominance of non-visual semantics in the underlying vector-space embeddings of class names. To address this issue, we discriminatively learn a word representation such that the similarities between class names and combinations of attribute names align with visual similarity. Contrary to traditional zero-shot learning approaches built upon attribute presence, our approach bypasses the laborious attribute-class relation annotations for unseen classes. In addition, our proposed approach renders text-only training possible; hence, training can be augmented without the need to collect additional image data. The experimental results show that our method yields state-of-the-art results for unsupervised ZSL on three benchmark datasets. © 2017 IEEE
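
    As a minimal sketch of how unseen classes might be scored against combinations of attribute names, assuming precomputed word embeddings and per-image attribute confidences (the variable names and the weighted-average combination below are our illustration, not the paper's exact formulation):

    import numpy as np

    def l2_normalize(x, axis=-1, eps=1e-12):
        return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

    def score_unseen_classes(attr_scores, attr_embeddings, class_embeddings):
        # attr_scores: (A,) predicted attribute confidences for one image
        # attr_embeddings: (A, D) word vectors of the attribute names
        # class_embeddings: (C, D) word vectors of the unseen class names
        combo = l2_normalize(attr_scores @ attr_embeddings)    # attribute-name combination
        classes = l2_normalize(class_embeddings, axis=1)
        return classes @ combo                                  # cosine similarity per class

    rng = np.random.default_rng(0)
    attr_vecs = rng.normal(size=(5, 300))     # toy stand-ins for real word embeddings
    class_vecs = rng.normal(size=(3, 300))
    sims = score_unseen_classes(rng.uniform(size=5), attr_vecs, class_vecs)
    print(int(sims.argmax()))                 # index of the top-ranked unseen class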

    Human action recognition with line and flow histograms

    We present a compact representation for human action recognition in videos using line and optical flow histograms. We introduce a new shape descriptor based on the distribution of lines which are fitted to boundaries of human figures. By using an entropy-based approach, we apply feature selection to densify our feature representation, thus, minimizing classification time without degrading accuracy. We also use a compact representation of optical flow for motion information. Using line and flow histograms together with global velocity information, we show that high-accuracy action recognition is possible, even in challenging recording conditions. © 2008 IEEE
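
    A rough sketch of the entropy-based feature selection step; the exact criterion is not given above, so ranking histogram dimensions by the entropy of their value distribution and keeping the top k is an assumption on our part:

    import numpy as np

    def dim_entropy(values, bins=16):
        hist, _ = np.histogram(values, bins=bins)
        p = hist / max(hist.sum(), 1)
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    def select_dims(X, k):
        # X: (n_samples, n_dims) line/flow histogram features
        ent = np.array([dim_entropy(X[:, j]) for j in range(X.shape[1])])
        return np.argsort(ent)[::-1][:k]      # keep the k highest-entropy dimensions

    X = np.random.default_rng(1).random((200, 120))   # toy feature matrix
    X_reduced = X[:, select_dims(X, k=40)]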

    Recognizing actions from still images

    In this paper, we approach the problem of understanding human actions from still images. Our method involves representing the pose with a spatial and orientational histogramming of rectangular regions on a parse probability map. We use LDA to obtain a more compact and discriminative feature representation and binary SVMs for classification. Our results over a new dataset collected for this problem show that by using a rectangle histogramming approach, we can discriminate actions to a great extent. We also show how we can use this approach in an unsupervised setting. To the best of our knowledge, this is one of the first studies that try to recognize actions within still images. © 2008 IEEE
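
    A minimal sketch of the classification pipeline described above, wired together with scikit-learn (LDA for a compact representation, then one binary SVM per action); the random features stand in for the rectangle histograms:

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(2)
    X = rng.random((300, 128))                # toy rectangle-histogram features
    y = rng.integers(0, 6, size=300)          # toy labels for 6 actions

    clf = make_pipeline(
        LinearDiscriminantAnalysis(n_components=5),   # at most n_classes - 1 components
        OneVsRestClassifier(LinearSVC()),             # one binary SVM per action
    )
    clf.fit(X, y)
    print(clf.predict(X[:3]))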

    On recognizing actions in still images via multiple features

    We propose a multi-cue based approach for recognizing human actions in still images, where relevant object regions are discovered and utilized in a weakly supervised manner. Our approach does not require any explicitly trained object detector or part/attribute annotation. Instead, a multiple instance learning approach is used over sets of object hypotheses in order to represent objects relevant to the actions. We test our method on the extensive Stanford 40 Actions dataset [1] and achieve a significant performance gain compared to the state of the art. Our results show that using multiple object hypotheses within multiple instance learning is effective for human action recognition in still images and that such an object representation is suitable for use in conjunction with other visual features. © 2012 Springer-Verlag
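
    A hedged sketch of multiple instance learning over object hypotheses: each image is a bag of candidate-region features and is scored by its best instance. The mi-SVM-style alternation below is a generic MIL strategy, not necessarily the formulation used in the paper:

    import numpy as np
    from sklearn.svm import LinearSVC

    def train_mil(bags, bag_labels, iters=5):
        # bags: list of (n_i, d) arrays of object-hypothesis features
        # bag_labels: array of {0, 1} labels, one per bag
        X = np.vstack(bags)
        y = np.concatenate([np.full(len(b), l) for b, l in zip(bags, bag_labels)])
        clf = LinearSVC().fit(X, y)           # start: instances inherit bag labels
        for _ in range(iters):
            labels = []
            for b, l in zip(bags, bag_labels):
                lab = np.zeros(len(b))
                if l == 1:
                    lab[np.argmax(clf.decision_function(b))] = 1   # keep the best witness positive
                labels.append(lab)
            clf = LinearSVC().fit(X, np.concatenate(labels))
        return clf

    def bag_score(clf, bag):
        return float(clf.decision_function(bag).max())   # max-pooled bag score

    rng = np.random.default_rng(4)
    bags = [rng.random((int(rng.integers(3, 8)), 32)) for _ in range(20)]   # toy bags
    labels = rng.integers(0, 2, size=20)                                    # toy bag labels
    model = train_mil(bags, labels)
    print(bag_score(model, bags[0]))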

    You Talkin' to Me?


    Recognizing human actions from noisy videos via multiple instance learning [Gürültü içeren videolardan insan hareketlerinin çoklu örnekle öğrenme ile tanınması]

    In this work, we study the task of recognizing human actions from noisy videos, examine the effects of noise on recognition performance, and propose a possible solution. Datasets available in the computer vision literature are relatively small and may include noise due to the labeling source. For newer and relatively large datasets, the amount of noise is likely to increase, and the performance of traditional instance-based learning methods is likely to decrease. In this work, we propose a multiple instance learning-based solution for the case of increased noise. For this purpose, each video is represented with spatio-temporal features, to which the bag-of-words method is then applied. Then, using support vector machines (SVM), both instance-based learning and multiple instance learning classifiers are constructed and compared. The classification results show that multiple instance learning classifiers perform better than their instance-based learning counterparts on noisy videos. © 2013 IEEE
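
    A sketch of the bag-of-words step mentioned above: a k-means vocabulary over local spatio-temporal descriptors, per-video word histograms, and an SVM on top. Descriptor extraction (e.g. around spatio-temporal interest points) is assumed to have happened already, and the data here is random:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC

    rng = np.random.default_rng(3)
    videos = [rng.random((int(rng.integers(50, 200)), 64)) for _ in range(30)]  # toy descriptors
    labels = rng.integers(0, 3, size=30)                                        # toy action labels

    vocab = KMeans(n_clusters=100, n_init=10).fit(np.vstack(videos))            # visual vocabulary

    def bow_histogram(descriptors, km):
        words = km.predict(descriptors)
        hist = np.bincount(words, minlength=km.n_clusters).astype(float)
        return hist / hist.sum()

    X = np.array([bow_histogram(v, vocab) for v in videos])
    clf = SVC(kernel="linear").fit(X, labels)
    print(clf.score(X, labels))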

    Recognizing Complex Events in Videos by Learning Key Static-Dynamic Evidences

    Complex events consist of various human interactions with different objects in diverse environments. The evidences needed to recognize events may occur in short time periods with variable lengths and can happen anywhere in a video. This fact prevents conventional machine learning algorithms from effectively recognizing the events. In this paper, we propose a novel method that can automatically identify the key evidences in videos for detecting complex events. Both static instances (objects) and dynamic instances (actions) are considered by sampling frames and temporal segments, respectively. To compare the characteristic power of heterogeneous instances, we embed static and dynamic instances into a multiple instance learning framework via instance similarity measures, and cast the problem as an Evidence Selective Ranking (ESR) process. We impose an ℓ1 norm to select key evidences while using the Infinite Push Loss Function to enforce positive videos to have higher detection scores than negative videos. The Alternating Direction Method of Multipliers (ADMM) algorithm is used to solve the optimization problem. Experiments on large-scale video datasets show that our method can improve the detection accuracy while providing the unique capability of discovering key evidences of each complex event.
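
    For orientation, a generic form of an ℓ1-regularized infinite-push ranking objective is sketched below in LaTeX; the paper's exact notation and formulation may differ, and the surrogate loss is our assumption:

    \min_{w}\; \lambda \lVert w \rVert_1
      \;+\; \max_{j \in \mathcal{N}} \frac{1}{|\mathcal{P}|}
            \sum_{i \in \mathcal{P}} \ell\bigl( s_w(x_i) - s_w(x_j) \bigr)

    Here P and N index positive and negative videos, s_w(·) is the detection score built from the selected evidences, and ℓ is a convex surrogate such as the hinge loss; ADMM can then decouple the non-smooth ℓ1 term from the ranking loss during optimization.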

    Action Recognition with Stacked Fisher Vectors


    Human Activity Recognition Using Hierarchically-Mined Feature Constellations

    In this paper we address the problem of human activity modelling and recognition by means of a hierarchical representation of mined dense spatiotemporal features. At each level of the hierarchy, the proposed method selects feature constellations that are increasingly discriminative and characteristic of a specific…