    Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification

    Recently, substantial research effort has focused on how to apply CNNs or RNNs to better extract temporal patterns from videos, so as to improve the accuracy of video classification. In this paper, however, we show that temporal information, especially longer-term patterns, may not be necessary to achieve competitive results on common video classification datasets. We investigate the potential of a purely attention based local feature integration. Accounting for the characteristics of such features in video classification, we propose a local feature integration framework based on attention clusters, and introduce a shifting operation to capture more diverse signals. We carefully analyze and compare the effect of different attention mechanisms, cluster sizes, and the use of the shifting operation, and also investigate the combination of attention clusters for multimodal integration. We demonstrate the effectiveness of our framework on three real-world video classification datasets. Our model achieves competitive results across all of these. In particular, on the large-scale Kinetics dataset, our framework obtains an excellent single-model top-1 accuracy of 79.4% and top-5 accuracy of 94.0% on the validation set. The attention clusters are the backbone of our winning solution at ActivityNet Kinetics Challenge 2017. Code and models will be released soon. Comment: The backbone of the winning solution at ActivityNet Kinetics Challenge 201
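
    The abstract does not give formulas, so the following is only a minimal sketch of what order-independent, attention-based pooling of local features could look like. The linear scoring vector `w`, the per-unit scale `alpha` and offset `beta` (standing in for the "shifting operation"), and the L2 normalization are all assumptions made for illustration, not the authors' published definition.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_unit(features, w, alpha, beta):
    """One attention unit over a set of local feature vectors.

    features: list of equal-length vectors (e.g. one per frame).
    w, alpha, beta: hypothetical learned parameters (assumed here).
    """
    # Assumed scoring function: a dot product with w per local feature.
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Attention-weighted sum; frame order plays no role in the result.
    pooled = [sum(a * x[d] for a, x in zip(weights, features)) for d in range(dim)]
    # Assumed form of the shift: scale, offset, then L2-normalize.
    shifted = [alpha * p + beta for p in pooled]
    norm = math.sqrt(sum(s * s for s in shifted)) or 1.0
    return [s / norm for s in shifted]


def attention_cluster(features, units):
    """Concatenate the outputs of several independent attention units."""
    out = []
    for w, alpha, beta in units:
        out.extend(attention_unit(features, w, alpha, beta))
    return out


# Toy input: three 2-D local features and a cluster of two units.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
units = [([0.5, -0.5], 1.0, 0.0), ([-1.0, 1.0], 2.0, 0.1)]
cluster_output = attention_cluster(features, units)
```

    Because each unit is a weighted sum over a set, reordering the frames leaves the output unchanged up to floating-point rounding. That matches the abstract's claim that longer-term temporal patterns may not be required.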

    Visual analysis for drum sequence transcription

    A system is presented for analysing drum performance video sequences. A novel ellipse detection algorithm is introduced that automatically locates drum tops. This algorithm fits ellipses to edge clusters, and ranks them according to various fitness criteria. A background/foreground segmentation method is then used to extract the silhouette of the drummer and drum sticks. Coupled with a motion intensity feature, this allows for the detection of ‘hits’ in each of the extracted regions. In order to obtain a transcription of the performance, each of these regions is automatically labeled with the corresponding instrument class. A partial audio transcription and color cues are used to measure the compatibility between a region and its label; the Kuhn-Munkres algorithm is then employed to find the optimal labeling. Experimental results demonstrate the ability of visual analysis to enhance the performance of an audio drum transcription system.
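
    The labeling step described above is an assignment problem, and a minimal sketch of it might look as follows. The compatibility scores, the region and instrument names, and the helper `label_regions` are hypothetical; the abstract does not say how compatibility is computed. The solver is a standard O(n^3) Kuhn-Munkres (Hungarian) routine for square matrices.

```python
def hungarian(cost):
    """Kuhn-Munkres on a square cost matrix.

    Returns a list where entry r is the column assigned to row r,
    minimizing the total cost.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (n + 1)   # column potentials (1-based)
    p = [0] * (n + 1)     # p[j] = row matched to column j, 0 if none
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def label_regions(compatibility):
    """Maximize total compatibility by minimizing its negation."""
    cost = [[-score for score in row] for row in compatibility]
    return hungarian(cost)


# Hypothetical scores: rows are detected regions, columns are instruments.
instruments = ["snare", "hi-hat", "kick"]
compatibility = [
    [0.9, 0.8, 0.1],
    [0.9, 0.2, 0.1],
    [0.3, 0.3, 0.6],
]
labels = [instruments[c] for c in label_regions(compatibility)]
```

    In this toy matrix, assigning regions greedily in order would give the first region "snare" and leave the second with a poor match. The global optimum instead labels the first region "hi-hat" and the second "snare", which is why an assignment solver is preferable to per-region maximization.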

    Immersive Composition for Sensory Rehabilitation: 3D Visualisation, Surround Sound, and Synthesised Music to Provoke Catharsis and Healing

    There is a wide range of sensory therapies using sound, music and visual stimuli. Some focus on soothing or distracting stimuli such as natural sounds or classical music as an analgesic, while other approaches emphasize the active performance of producing music as therapy. This paper proposes an immersive multi-sensory Exposure Therapy for people suffering from anxiety disorders, based on a rich, detailed surround-soundscape. This soundscape is composed to include the users’ own idiosyncratic anxiety triggers as a form of habituation, and to provoke psychological catharsis, as a non-verbal, visceral and enveloping exposure. To accurately pinpoint the most effective sounds and to optimally compose the soundscape we will monitor the participants’ physiological responses such as electroencephalography, respiration, electromyography, and heart rate during exposure. We hypothesize that such physiologically optimized sensory landscapes will aid the development of future immersive therapies for various psychological conditions. Sound is a major trigger of anxiety, and auditory hypersensitivity is an extremely problematic symptom. Exposure to stress-inducing sounds can free anxiety sufferers from entrenched avoidance behaviors, teaching physiological coping strategies and encouraging resolution of the psychological issues agitated by the sound.