859 research outputs found

    Spatiotemporal Saliency Detection: State of Art

    Get PDF
    Saliency detection has become a very prominent subject for research in recent time. Many techniques has been defined for the saliency detection.In this paper number of techniques has been explained that include the saliency detection from the year 2000 to 2015, almost every technique has been included.all the methods are explained briefly including their advantages and disadvantages. Comparison between various techniques has been done. With the help of table which includes authors name,paper name,year,techniques,algorithms and challenges. A comparison between levels of acceptance rates and accuracy levels are made

    Image analysis using visual saliency with applications in hazmat sign detection and recognition

    Get PDF
    Visual saliency is the perceptual process that makes attractive objects stand out from their surroundings in the low-level human visual system. Visual saliency has been modeled as a preprocessing step of the human visual system for selecting the important visual information from a scene. We investigate bottom-up visual saliency using spectral analysis approaches. We present separate and composite model families that generalize existing frequency domain visual saliency models. We propose several frequency domain visual saliency models to generate saliency maps using new spectrum processing methods and an entropy-based saliency map selection approach. A group of saliency map candidates are then obtained by inverse transform. A final saliency map is selected among the candidates by minimizing the entropy of the saliency map candidates. The proposed models based on the separate and composite model families are also extended to various color spaces. We develop an evaluation tool for benchmarking visual saliency models. Experimental results show that the proposed models are more accurate and efficient than most state-of-the-art visual saliency models in predicting eye fixation.^ We use the above visual saliency models to detect the location of hazardous material (hazmat) signs in complex scenes. We develop a hazmat sign location detection and content recognition system using visual saliency. Saliency maps are employed to extract salient regions that are likely to contain hazmat sign candidates and then use a Fourier descriptor based contour matching method to locate the border of hazmat signs in these regions. This visual saliency based approach is able to increase the accuracy of sign location detection, reduce the number of false positive objects, and speed up the overall image analysis process. We also propose a color recognition method to interpret the color inside the detected hazmat sign. Experimental results show that our proposed hazmat sign location detection method is capable of detecting and recognizing projective distorted, blurred, and shaded hazmat signs at various distances.^ In other work we investigate error concealment for scalable video coding (SVC). When video compressed with SVC is transmitted over loss-prone networks, the decompressed video can suffer severe visual degradation across multiple frames. In order to enhance the visual quality, we propose an inter-layer error concealment method using motion vector averaging and slice interleaving to deal with burst packet losses and error propagation. Experimental results show that the proposed error concealment methods outperform two existing methods

    PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions

    Get PDF
    Hypercomplex neural networks have proven to reduce the overall number of parameters while ensuring valuable performance by leveraging the properties of Clifford algebras. Recently, hypercomplex linear layers have been further improved by involving efficient parameterized Kronecker products. In this article, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs) that are lightweight and efficient large-scale models. Our method grasps the convolution rules and the filter organization directly from data without requiring a rigidly predefined domain structure to follow. PHNNs are flexible to operate in any user-defined or tuned domain, from 1-D to nD regardless of whether the algebra rules are preset. Such a malleability allows processing multidimensional inputs in their natural domain without annexing further dimensions, as done, instead, in quaternion neural networks (QNNs) for 3-D inputs like color images. As a result, the proposed family of PHNNs operates with 1/n free parameters as regards its analog in the real domain. We demonstrate the versatility of this approach to multiple domains of application by performing experiments on various image datasets and audio datasets in which our method outperforms real and quaternion-valued counterparts

    Non-Intrusive Affective Assessment in the Circumplex Model from Pupil Diameter and Facial Expression Monitoring

    Get PDF
    Automatic methods for affective assessment seek to enable computer systems to recognize the affective state of their users. This dissertation proposes a system that uses non-intrusive measurements of the user’s pupil diameter and facial expression to characterize his /her affective state in the Circumplex Model of Affect. This affective characterization is achieved by estimating the affective arousal and valence of the user’s affective state. In the proposed system the pupil diameter signal is obtained from a desktop eye gaze tracker, while the face expression components, called Facial Animation Parameters (FAPs) are obtained from a Microsoft Kinect module, which also captures the face surface as a cloud of points. Both types of data are recorded 10 times per second. This dissertation implemented pre-processing methods and fixture extraction approaches that yield a reduced number of features representative of discrete 10-second recordings, to estimate the level of affective arousal and the type of affective valence experienced by the user in those intervals. The dissertation uses a machine learning approach, specifically Support Vector Machines (SVMs), to act as a model that will yield estimations of valence and arousal from the features derived from the data recorded. Pupil diameter and facial expression recordings were collected from 50 subjects who volunteered to participate in an FIU IRB-approved experiment to capture their reactions to the presentation of 70 pictures from the International Affective Picture System (IAPS) database, which have been used in large calibration studies and therefore have associated arousal and valence mean values. Additionally, each of the 50 volunteers in the data collection experiment provided their own subjective assessment of the levels of arousal and valence elicited in him / her by each picture. This process resulted in a set of face and pupil data records, along with the expected reaction levels of arousal and valence, i.e., the “labels”, for the data used to train and test the SVM classifiers. The trained SVM classifiers achieved 75% accuracy for valence estimation and 92% accuracy in arousal estimation, confirming the initial viability of non-intrusive affective assessment systems based on pupil diameter and face expression monitoring
    • …
    corecore