148,046 research outputs found

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which puts in evidence the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypothesis assumed and thus, the constraints imposed on the type of video that each technique is able to address. Expliciting the hypothesis and constraints makes the framework particularly useful to select a method, given an application. Another advantage of the proposed organization is that it allows categorizing newest approaches seamlessly with traditional ones, while providing an insightful perspective of the evolution of the action recognition task up to now. That perspective is the basis for the discussion in the end of the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 table

    Unsupervised Action Proposal Ranking through Proposal Recombination

    Full text link
    Recently, action proposal methods have played an important role in action recognition tasks, as they reduce the search space dramatically. Most unsupervised action proposal methods tend to generate hundreds of action proposals which include many noisy, inconsistent, and unranked action proposals, while supervised action proposal methods take advantage of predefined object detectors (e.g., human detector) to refine and score the action proposals, but they require thousands of manual annotations to train. Given the action proposals in a video, the goal of the proposed work is to generate a few better action proposals that are ranked properly. In our approach, we first divide action proposal into sub-proposal and then use Dynamic Programming based graph optimization scheme to select the optimal combinations of sub-proposals from different proposals and assign each new proposal a score. We propose a new unsupervised image-based actioness detector that leverages web images and employs it as one of the node scores in our graph formulation. Moreover, we capture motion information by estimating the number of motion contours within each action proposal patch. The proposed method is an unsupervised method that neither needs bounding box annotations nor video level labels, which is desirable with the current explosion of large-scale action datasets. Our approach is generic and does not depend on a specific action proposal method. We evaluate our approach on several publicly available trimmed and un-trimmed datasets and obtain better performance compared to several proposal ranking methods. In addition, we demonstrate that properly ranked proposals produce significantly better action detection as compared to state-of-the-art proposal based methods

    Pre and Post-hoc Diagnosis and Interpretation of Malignancy from Breast DCE-MRI

    Full text link
    We propose a new method for breast cancer screening from DCE-MRI based on a post-hoc approach that is trained using weakly annotated data (i.e., labels are available only at the image level without any lesion delineation). Our proposed post-hoc method automatically diagnosis the whole volume and, for positive cases, it localizes the malignant lesions that led to such diagnosis. Conversely, traditional approaches follow a pre-hoc approach that initially localises suspicious areas that are subsequently classified to establish the breast malignancy -- this approach is trained using strongly annotated data (i.e., it needs a delineation and classification of all lesions in an image). Another goal of this paper is to establish the advantages and disadvantages of both approaches when applied to breast screening from DCE-MRI. Relying on experiments on a breast DCE-MRI dataset that contains scans of 117 patients, our results show that the post-hoc method is more accurate for diagnosing the whole volume per patient, achieving an AUC of 0.91, while the pre-hoc method achieves an AUC of 0.81. However, the performance for localising the malignant lesions remains challenging for the post-hoc method due to the weakly labelled dataset employed during training.Comment: Submitted to Medical Image Analysi

    About the nature of Kansei information, from abstract to concrete

    Get PDF
    Designer’s expertise refers to the scientific fields of emotional design and kansei information. This paper aims to answer to a scientific major issue which is, how to formalize designer’s knowledge, rules, skills into kansei information systems. Kansei can be considered as a psycho-physiologic, perceptive, cognitive and affective process through a particular experience. Kansei oriented methods include various approaches which deal with semantics and emotions, and show the correlation with some design properties. Kansei words may include semantic, sensory, emotional descriptors, and also objects names and product attributes. Kansei levels of information can be seen on an axis going from abstract to concrete dimensions. Sociological value is the most abstract information positioned on this axis. Previous studies demonstrate the values the people aspire to drive their emotional reactions in front of particular semantics. This means that the value dimension should be considered in kansei studies. Through a chain of value-function-product attributes it is possible to enrich design generation and design evaluation processes. This paper describes some knowledge structures and formalisms we established according to this chain, which can be further used for implementing computer aided design tools dedicated to early design. These structures open to new formalisms which enable to integrate design information in a non-hierarchical way. The foreseen algorithmic implementation may be based on the association of ontologies and bag-of-words.AN

    STV-based Video Feature Processing for Action Recognition

    Get PDF
    In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurred over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progresses have been made in the last decade on image processing and seen its successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and the often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method has been proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stemmed from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reported the investigation into techniques for efficient STV data filtering to reduce the amount of voxels (volumetric-pixels) that need to be processed in each operational cycle in the implemented system. The encouraging features and improvements on the operational performance registered in the experiments have been discussed at the end

    Going Deeper into Action Recognition: A Survey

    Full text link
    Understanding human actions in visual data is tied to advances in complementary research areas including object recognition, human dynamics, domain adaptation and semantic segmentation. Over the last decade, human action analysis evolved from earlier schemes that are often limited to controlled environments to nowadays advanced solutions that can learn from millions of videos and apply to almost all daily activities. Given the broad range of applications from video surveillance to human-computer interaction, scientific milestones in action recognition are achieved more rapidly, eventually leading to the demise of what used to be good in a short time. This motivated us to provide a comprehensive review of the notable steps taken towards recognizing human actions. To this end, we start our discussion with the pioneering methods that use handcrafted representations, and then, navigate into the realm of deep learning based approaches. We aim to remain objective throughout this survey, touching upon encouraging improvements as well as inevitable fallbacks, in the hope of raising fresh questions and motivating new research directions for the reader
    corecore