5 research outputs found

    Feature extraction for human action recognition based on saliency map

    Human Action Recognition (HAR) plays an important role in computer vision for the interaction between humans and their environment, and it has been widely used in many applications. Recent research has focused on reliable feature extraction, often using saliency maps, to achieve high recognition performance. The task remains challenging: most videos are captured against cluttered background scenery, and merging effects and differing levels of interest make it difficult to detect or recognize human actions accurately. The main objective of this project is to design a model that performs feature extraction with an optical flow method and an edge detector. In addition, the accuracy of saliency map generation needs to be improved using the extracted features so that various human actions can be recognized. For feature extraction, motion and edge features are proposed as two spatio-temporal cues, computed with a Motion Boundary Histogram (MBH) descriptor and an edge detector respectively; both describe pixels in terms of gradients and other vector components. The extracted features are then used in saliency computation with the Spectral Residual (SR) method, which converts the Fourier amplitude spectrum of the vectors to a log spectrum and suppresses excessive noise through filtering and data compression. After the remaining salient regions are obtained, they are combined to form a final saliency map. Simulation and data analysis are carried out on benchmark human action datasets using a Matlab implementation. The proposed methodology is expected to achieve state-of-the-art results in recognizing human actions.

    Semantic Prior Analysis for Salient Object Detection

    Visual saliency computation for image analysis

    Visual saliency computation is about detecting and understanding salient regions and elements in a visual scene. Algorithms for visual saliency computation can give clues to where people will look in images, what objects are visually prominent in a scene, etc. Such algorithms could be useful in a wide range of applications in computer vision and graphics. In this thesis, we study the following visual saliency computation problems. 1) Eye Fixation Prediction. Eye fixation prediction aims to predict where people look in a visual scene. For this problem, we propose a Boolean Map Saliency (BMS) model which leverages the global surroundedness cue using a Boolean map representation. We draw a theoretic connection between BMS and the Minimum Barrier Distance (MBD) transform to provide insight into our algorithm. Experimental results show that BMS compares favorably with state-of-the-art methods on seven benchmark datasets. 2) Salient Region Detection. Salient region detection entails computing a saliency map that highlights the regions of dominant objects in a scene. We propose a salient region detection method based on the Minimum Barrier Distance (MBD) transform. We present a fast approximate MBD transform algorithm with an error bound analysis. Powered by this fast MBD transform algorithm, our method can run at about 80 FPS and achieve state-of-the-art performance on four benchmark datasets. 3) Salient Object Detection. Salient object detection aims to localize each salient object instance in an image. We propose a method using a Convolutional Neural Network (CNN) model for proposal generation and a novel subset optimization formulation for bounding box filtering. In experiments, our subset optimization formulation consistently outperforms heuristic bounding box filtering baselines, such as Non-maximum Suppression, and our method substantially outperforms previous methods on three challenging datasets. 4) Salient Object Subitizing.
We propose a new visual saliency computation task, called Salient Object Subitizing, which is to predict the existence and the number of salient objects in an image using holistic cues. To this end, we present an image dataset of about 14K everyday images which are annotated using an online crowdsourcing marketplace. We show that an end-to-end trained CNN subitizing model can achieve promising performance without requiring any localization process. A method is proposed to further improve the training of the CNN subitizing model by leveraging synthetic images. 5) Top-down Saliency Detection. Unlike the aforementioned tasks, top-down saliency detection entails generating task-specific saliency maps. We propose a weakly supervised top-down saliency detection approach by modeling the top-down attention of a CNN image classifier. We propose Excitation Backprop and the concept of contrastive attention to generate highly discriminative top-down saliency maps. Our top-down saliency detection method achieves superior performance in weakly supervised localization tasks on challenging datasets. The usefulness of our method is further validated in the text-to-region association task, where our method provides state-of-the-art performance using only weakly labeled web images for training.
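The global surroundedness cue behind the BMS model above can be illustrated simply: threshold the image at several levels, and on each Boolean map mark the connected regions that do not touch the image border as salient. The sketch below is a simplified illustration of that cue only, not the thesis implementation; the function name, the number of thresholds, and the per-channel handling are assumptions:

```python
import numpy as np
from scipy.ndimage import label

def boolean_map_saliency(img, n_thresh=8):
    """Sketch of the global surroundedness cue used by BMS.

    img: 2-D float array in [0, 1]. For each threshold, connected
    components of the Boolean map (and its complement) that do not
    touch the image border are treated as surrounded, hence salient.
    Returns an attention map normalized to [0, 1].
    """
    attention = np.zeros(img.shape, dtype=float)
    for t in np.linspace(img.min(), img.max(), n_thresh + 2)[1:-1]:
        for bmap in (img > t, img <= t):        # Boolean map and complement
            labels, _ = label(bmap)
            # Component ids present on the border (0 is the bmap background).
            border_ids = np.unique(np.concatenate(
                [labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
            surrounded = bmap & ~np.isin(labels, border_ids)
            attention += surrounded             # accumulate surrounded pixels
    rng = attention.max() - attention.min()
    return (attention - attention.min()) / (rng + 1e-8)
```

For an enclosed bright region on a darker background, every threshold level votes for the enclosed region, so the accumulated attention peaks there while border-connected background stays at zero.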