
    Focused Attention for Action Recognition

    Current state-of-the-art approaches to action recognition emphasize learning ConvNets on large amounts of training data, using 3D convolutions to process the temporal dimension. This approach is expensive in terms of memory usage and constitutes a major performance bottleneck of existing approaches. Further, video inputs typically include irrelevant information alongside useful features, which limits the level of detail that networks can process, regardless of the quality of the original video. Hence, models that can focus computational resources on the relevant training signal are desirable. To address this problem, we rely on network-specific saliency outputs to drive an attention model that provides tighter crops around relevant video regions. We experimentally validate this approach and show how this strategy improves performance for the action recognition task.
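As a rough illustration of what a saliency-driven crop could look like, the sketch below thresholds a 2D saliency map and crops a frame to the tight bounding box of the salient cells. The abstract does not say how crops are derived, so the relative-threshold rule, the function names `saliency_bbox` and `crop`, and the list-of-lists frame format are all assumptions made for this example.

```python
def saliency_bbox(saliency, rel_threshold=0.5):
    """Return the inclusive box (top, left, bottom, right) covering every
    cell whose saliency is at least rel_threshold * the map's maximum.
    Assumes a non-empty, rectangular map of non-negative values."""
    peak = max(max(row) for row in saliency)
    cutoff = rel_threshold * peak
    # Rows and columns that contain at least one salient cell.
    rows = [r for r, row in enumerate(saliency) if any(v >= cutoff for v in row)]
    cols = [c for c in range(len(saliency[0]))
            if any(row[c] >= cutoff for row in saliency)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop(frame, bbox):
    """Crop a frame (list of rows) to an inclusive bounding box."""
    top, left, bottom, right = bbox
    return [row[left:right + 1] for row in frame[top:bottom + 1]]


saliency = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.9, 0.0],
    [0.0, 0.8, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
bbox = saliency_bbox(saliency, rel_threshold=0.5)
cropped = crop(frame, bbox)
```

In a video setting the same box would presumably be computed per clip and the cropped region resized before being fed back to the network, but that pipeline is not described in the abstract.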

    Skeleton-based action analysis for ADHD diagnosis

    Attention Deficit Hyperactivity Disorder (ADHD) is a common neurobehavioral disorder worldwide. While extensive research has focused on machine learning methods for ADHD diagnosis, most of it relies on high-cost equipment, e.g., MRI machines and EEG patches. Therefore, low-cost diagnostic methods based on the action characteristics of ADHD are desired. Skeleton-based action recognition has gained attention due to its action-focused nature and robustness. In this work, we propose a novel ADHD diagnosis system with a skeleton-based action recognition framework, utilizing a real multi-modal ADHD dataset and state-of-the-art detection algorithms. Compared to conventional methods, the proposed method is more cost-efficient and shows a significant performance improvement, making it more accessible for a broad range of initial ADHD diagnoses. The experimental results show that the proposed method outperforms the conventional methods in accuracy and AUC. Our method is also widely applicable for mass screening.

    Qualitative Action Recognition by Wireless Radio Signals in Human–Machine Systems

    Human-machine systems require a deep understanding of human behaviors. Most existing research on action recognition has focused on discriminating between different actions; the quality of executing an action has received little attention thus far. In this paper, we study the quality assessment of driving behaviors and present WiQ, a system to assess the quality of actions based on radio signals. This system includes three key components: a deep-neural-network-based learning engine to extract the quality information from the changes of signal strength, a gradient-based method to detect the signal boundary for an individual action, and an activity-based fusion policy to improve the recognition performance in a noisy environment. By using the quality information, WiQ can differentiate a triple body status with an accuracy of 97%, whereas for identification among 15 drivers, the average accuracy is 88%. Our results show that, via dedicated analysis of radio signals, a fine-grained action characterization can be achieved, which can facilitate a large variety of applications, such as smart driving assistants.
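The gradient-based boundary detection mentioned above can be pictured with a minimal sketch: take the first difference of a signal-strength series and treat contiguous runs of large changes as one action. This is not WiQ's actual method, which the abstract does not detail; the function name `action_boundaries`, the fixed `threshold`, and the sample values are hypothetical.

```python
def action_boundaries(signal, threshold):
    """Return (start, end) sample-index pairs, one per contiguous run in
    which the absolute first difference of `signal` exceeds `threshold`.
    `end` is the last sample touched by the run (inclusive)."""
    segments = []
    start = None
    for i in range(len(signal) - 1):
        # The difference at index i spans samples i and i + 1.
        active = abs(signal[i + 1] - signal[i]) > threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        # The run lasted until the final sample.
        segments.append((start, len(signal) - 1))
    return segments


# Flat signal, a burst of activity, then flat again (made-up values).
rssi = [1.0, 1.0, 1.0, 3.0, 0.5, 2.5, 1.0, 1.0, 1.0]
segments = action_boundaries(rssi, threshold=0.5)
```

A practical detector would likely smooth the signal first and adapt the threshold to the noise level; both are left out here to keep the idea visible.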

    Towards Active Image Segmentation: the Foveal Bounded Irregular Pyramid

    Presented at: 2nd Workshop on Recognition and Action for Scene Understanding, York, England, August 30, 2013. It is well established that the units of attention in human vision are not merely spatial but closely related to perceptual objects. This implies a strong relationship between segmentation and attention processes. This interaction is bi-directional: if the segmentation process constrains attention, the way an image is segmented may depend on the specific question asked of an observer, i.e. what she "attends" to in this sense. When the focus of attention is deployed from one visual unit to another, the rest of the scene is perceived but at a lower resolution than the focused object. The result is a multi-resolution visual perception in which the fovea, a dimple on the central retina, provides the highest resolution vision. While much recent work has focused on computational models for object-based attention, the design and development of multi-resolution structures that can segment the input image according to the focused perceptual unit is largely unexplored. This paper proposes a novel structure for multi-resolution image segmentation that extends the encoding provided by the Bounded Irregular Pyramid. Bottom-up attention is enclosed in the same structure, allowing the fovea to be set over the most salient image region. Preliminary results obtained from the segmentation of natural images show that the approach performs well in terms of speed and accuracy. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.

    Interaction-aware spatio-temporal pyramid attention networks for action classification

    For CNN-based visual action recognition, accuracy may be increased by focusing on local key action regions. The task of self-attention is to focus on key features and ignore irrelevant information, so self-attention is useful for action recognition. However, current self-attention methods usually ignore correlations among local feature vectors at spatial positions in CNN feature maps. In this paper, we propose an effective interaction-aware self-attention model which can extract information about the interactions between feature vectors to learn attention maps. Since the different layers in a network capture feature maps at different scales, we introduce a spatial pyramid built from the feature maps at different layers for attention modeling. The multi-scale information is utilized to obtain more accurate attention scores. These attention scores are used to weight the local feature vectors and the feature maps and then calculate the attention feature maps. Since the number of feature maps input to the spatial pyramid attention layer is unrestricted, we easily extend this attention layer to a spatial-temporal version. Our model can be embedded into any general CNN to form a video-level end-to-end attention network for action recognition. Besides using the RGB stream alone, several methods are investigated to combine the RGB and flow streams for the final prediction of the classes of human actions. Experimental results show that our method achieves state-of-the-art results on the datasets UCF101, HMDB51, Kinetics-400 and untrimmed Charades.
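The step in which attention scores weight local feature vectors can be sketched with plain softmax attention pooling. This is a generic illustration only, not the interaction-aware pyramid model of the paper: the scores are taken as given rather than learned, and the names `softmax` and `attention_pool` are chosen for this example.

```python
import math


def softmax(scores):
    """Normalize raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, scores):
    """Weighted sum of local feature vectors, one score per vector."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features))
            for d in range(dim)]


# Two local feature vectors with equal scores: the result is their mean.
features = [[1.0, 0.0], [0.0, 1.0]]
pooled = attention_pool(features, scores=[0.0, 0.0])
```

With unequal scores the pooled vector moves toward the higher-scored feature, which is the sense in which attention "focuses" on key regions.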

    Learning to learn: A case for developing Small Firm Owner/Managers

    Purpose: The paper seeks to contribute to the management development debate by providing insight into the dynamics of organisational learning and human interaction in the SME firm. The paper sets out to consider how a practice-based perspective of knowledge is useful in this regard. Design/methodology/approach: The paper is theoretical in its intent and adopts a social constructionist view of knowledge and learning. Using qualitative analysis, the paper reviews the current literature, highlighting the centrality of knowledge and learning. Findings: The literature suggests that critical aspects of learning within the SME firm are based around contextualised action, critical reflection and social interaction. A limited number of studies account for how practice is configured and influenced, in terms of value, uniqueness and scope of what is known, and how these influences can vary depending upon the contexts in which knowledge is being used, and potentially used. Practical Implications: Many of the empirical studies of learning and its use in the SME firm strongly recognise that knowledge is gained through practice as opposed to formal instruction. What current research does not reflect is the changing nature of knowledge research in the wider organisational community, which has turned its attention towards the situated nature of knowledgeable activity, or knowing in practice. Originality/Value: The paper argues that learning through practice, with its focus on real-world issues and lived experiences, which are contextually embedded in the owner-manager's environment, may provide a better means of successfully developing practitioner-focused owner/managers.

    Exclusion, Transition, and Recognition: Normative Archetypes for Crossing Urban Social Spaces

    The paper explores three archetypes of possible interaction between the agent and the social space in which the agent's own action is located. We discuss modalities endowed with normative significance, that is, modalities focused on universal scopes and extra-contextual validities (values). Special attention is paid to the dimension of the "intersection of social spaces" (the scheme assumes both the permanent dimension of "acting within spaces" and the dynamic dimension of "passing beyond them"), and the modalities of exclusion, transition, and recognition are presented on that basis. Their action is complicated by alternative intersection paths in introdynamic and extradynamic dimensions. The study proposes to represent these modalities in order to further offer scenarios for the development and change of urban social spaces. Finally, the paper proposes a phenomenological interpretation of their possible interaction with reference to some ways of transforming urban spaces that are typical of the European context.
