Texture-Aware Superpixel Segmentation
Most superpixel algorithms compute a trade-off between spatial and color
features at the pixel level. Hence, they may require fine parameter tuning to
balance the two measures, and they often fail to group pixels with similar
local texture properties. In this paper, we address these issues with a new
Texture-Aware SuperPixel (TASP) method. To accurately segment both textured and
smooth areas, TASP automatically adjusts its spatial constraint according to
the local feature variance. Then, to ensure texture homogeneity within
superpixels, a new pixel-to-superpixel patch-based distance is proposed. TASP
outperforms state-of-the-art methods in segmentation accuracy on both texture
and natural color image datasets.
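The adaptive spatial constraint can be illustrated with a minimal sketch, assuming a SLIC-style combined color/spatial distance; the function name, the variance-based scaling rule, and the default compactness below are illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def adaptive_superpixel_distance(pixel_lab, pixel_xy, center_lab, center_xy,
                                 local_variance, base_compactness=10.0):
    """Combined color/spatial distance whose spatial term is relaxed in
    high-variance (textured) regions, in the spirit of TASP's adaptive
    spatial constraint. Illustrative sketch only."""
    color_dist = np.linalg.norm(np.asarray(pixel_lab) - np.asarray(center_lab))
    spatial_dist = np.linalg.norm(np.asarray(pixel_xy) - np.asarray(center_xy))
    # Relax the spatial constraint as the local feature variance grows,
    # so textured regions are grouped by appearance rather than proximity.
    compactness = base_compactness / (1.0 + local_variance)
    return color_dist + compactness * spatial_dist
```

With this weighting, the same pixel/center pair is penalized less for spatial distance in a textured (high-variance) neighborhood than in a smooth one.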
Audiovisual Saliency Prediction in Uncategorized Video Sequences based on Audio-Video Correlation
Substantial research has been done in saliency modeling to develop
intelligent machines that can perceive and interpret their surroundings.
However, existing models treat videos as mere image sequences, excluding any
audio information and thus failing to cope with inherently varying content.
Based on the hypothesis that an audiovisual saliency model will improve on
traditional saliency models for natural, uncategorized videos, this work aims
to provide a generic audiovisual saliency model that augments a visual saliency
map with an audio saliency map computed by synchronizing low-level audio and
visual features. The proposed model was evaluated using different criteria
against eye-fixation data from the publicly available DIEM video dataset. The
results show that the model outperformed two state-of-the-art visual saliency
models.
Comment: 9 pages, 2 figures, 4 tables
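The fusion idea can be sketched as follows, assuming the audio contribution is weighted by the correlation between low-level audio and visual motion traces; the function name, the Pearson-correlation weighting, and the additive fusion rule are illustrative assumptions, not the paper's exact synchronization scheme:

```python
import numpy as np

def fuse_saliency(visual_map, audio_map, audio_energy, motion_energy):
    """Augment a visual saliency map with an audio saliency map, weighting
    the audio term by how well a 1-D audio-energy trace correlates with a
    1-D visual-motion trace over a short temporal window. Sketch only."""
    a = np.asarray(audio_energy, dtype=float)
    m = np.asarray(motion_energy, dtype=float)
    # Pearson correlation between the two low-level feature traces.
    corr = np.corrcoef(a, m)[0, 1]
    w = max(corr, 0.0)  # only reinforce when the two streams agree
    fused = visual_map + w * audio_map
    return fused / (fused.max() + 1e-12)  # renormalize to [0, 1]
```

When the audio and motion traces are uncorrelated or anti-correlated, the weight drops to zero and the model falls back to purely visual saliency.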
Feature extraction for human action recognition based on saliency map
Human Action Recognition (HAR) plays an important role in computer vision for the interaction between humans and their environments, and it has been widely used in many applications. The focus of recent research is the reliability of feature extraction, aiming for high performance through the use of saliency maps. However, this task is challenging: most videos are taken against cluttered background scenery, which increases the difficulty of detecting or recognizing human actions accurately due to merging effects and differing levels of interest. The main objective of this project is to design a model that performs feature extraction with an optical flow method and an edge detector. In addition, the accuracy of the saliency map generation needs to be improved using the extracted features so that various human actions can be recognized. For feature extraction, motion and edge features are proposed as two spatio-temporal cues, computed with an edge detector and the Motion Boundary Histogram (MBH) descriptor respectively; both describe pixels in terms of gradients and other vector components. The extracted features are then fed into a saliency computation using the Spectral Residual (SR) method, which maps the Fourier transform of the feature vectors to a log spectrum and eliminates excessive noise through filtering and data compression. The remaining salient regions are combined to form a final saliency map. Simulation and data analysis are carried out on benchmark human action datasets using a Matlab implementation. The proposed methodology is expected to achieve state-of-the-art results in recognizing human actions.
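The Spectral Residual step mentioned above follows a well-known construction (Hou & Zhang, 2007): subtract a locally averaged log-amplitude spectrum from the original one, keep the original phase, and invert the FFT. A minimal grayscale NumPy sketch, with an illustrative box-filter size (the abstract's Matlab pipeline is not reproduced here):

```python
import numpy as np

def spectral_residual_saliency(image, avg_size=3):
    """Spectral Residual saliency for a 2-D grayscale array: the residual
    between the log-amplitude spectrum and its local average highlights
    'unexpected' image content."""
    f = np.fft.fft2(image)
    log_amp = np.log(np.abs(f) + 1e-12)
    phase = np.angle(f)
    # Box-filter the log amplitude to estimate the 'expected' spectrum.
    pad = avg_size // 2
    padded = np.pad(log_amp, pad, mode='edge')
    avg = np.zeros_like(log_amp)
    for dy in range(avg_size):
        for dx in range(avg_size):
            avg += padded[dy:dy + log_amp.shape[0], dx:dx + log_amp.shape[1]]
    avg /= avg_size ** 2
    residual = log_amp - avg
    # Reconstruct with the residual amplitude and the original phase.
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return saliency / (saliency.max() + 1e-12)
```

An isolated bright pixel on a flat background, for example, survives the residual reconstruction and dominates the resulting map.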
Unsupervised Superpixel Generation using Edge-Sparse Embedding
Partitioning an image into superpixels based on the similarity of pixels with
respect to features such as colour or spatial location can significantly reduce
data complexity and improve subsequent image processing tasks. Initial
algorithms for unsupervised superpixel generation solely relied on local cues
without prioritizing significant edges over arbitrary ones. On the other hand,
more recent methods based on unsupervised deep learning either fail to properly
address the trade-off between superpixel edge adherence and compactness or lack
control over the generated number of superpixels. By using random images with
strong spatial correlation as input, i.e., blurred noise images, in a
non-convolutional image decoder we can reduce the expected number of contrasts
and enforce smooth, connected edges in the reconstructed image. We generate
edge-sparse pixel embeddings by encoding additional spatial information into
the piece-wise smooth activation maps from the decoder's last hidden layer and
use a standard clustering algorithm to extract high-quality superpixels. Our
proposed method reaches state-of-the-art performance on the BSDS500,
PASCAL-Context, and a microscopy dataset.
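The final clustering stage can be sketched with plain k-means over per-pixel embeddings, with normalized spatial coordinates appended as extra channels to mirror the idea of encoding spatial information into the embedding; the function, the coordinate scaling, and the NumPy-only k-means loop are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def embeddings_to_superpixels(embeddings, n_superpixels=4, n_iter=20, seed=0):
    """Cluster an (H, W, D) array of per-pixel embeddings into superpixels
    with a minimal k-means; appending normalized (y, x) coordinates keeps
    the resulting clusters spatially coherent. Illustrative sketch only."""
    h, w, d = embeddings.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.concatenate(
        [embeddings, ys[..., None] / h, xs[..., None] / w], axis=-1
    ).reshape(-1, d + 2)

    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), n_superpixels, replace=False)]
    for _ in range(n_iter):
        # Assign each pixel to its nearest cluster center.
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        # Recompute centers; keep the old center if a cluster empties.
        for k in range(n_superpixels):
            if np.any(labels == k):
                centers[k] = feats[labels == k].mean(0)
    return labels.reshape(h, w)
```

Because the embedding channels dominate the appended coordinates, pixels with clearly distinct embeddings end up in different superpixels even when they are spatially adjacent.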