3,215 research outputs found

Interaction between high-level and low-level image analysis for semantic video object extraction

Author: Cavallaro A
Ebrahimi T
Publication date: 01/01/2004

Authors of articles published in EURASIP Journal on Advances in Signal Processing are the copyright holders of their articles and have granted to any third party, in advance and in perpetuity, the right to use, reproduce or disseminate the article, according to the SpringerOpen copyright and license agreement (http://www.springeropen.com/authors/license)

Springer - Publisher Connector

Directory of Open Access Journals

Queen Mary Research Online

Information recovery from rank-order encoded images

Author: Furber Steve
sen Bhattacharya Basabdatta
Publication date: 04/03/2008

The time to detection of a visual stimulus by the primate eye is recorded at 100–150 ms. This near-instantaneous recognition is in spite of the considerable processing required by the several stages of the visual pathway to recognise and react to a visual scene. How this is achieved is still a matter of speculation. Rank-order codes have been proposed as a means of encoding by the primate eye in the rapid transmission of the initial burst of information from the sensory neurons to the brain. We study the efficiency of rank-order codes in encoding perceptually-important information in an image. VanRullen and Thorpe built a model of the ganglion cell layers of the retina to simulate and study the viability of rank-order as a means of encoding by retinal neurons. We validate their model and quantify the information retrieved from rank-order encoded images in terms of the visually-important information recovered. Towards this goal, we apply the ‘perceptual information preservation algorithm’ proposed by Petrovic and Xydeas, after slight modification. We observe a low information recovery due to losses suffered during the rank-order encoding and decoding processes. We propose to minimise these losses to recover maximum information in minimum time from rank-order encoded images. We first maximise information recovery by using the pseudo-inverse of the filter-bank matrix to minimise losses during rank-order decoding. We then apply the biological principle of lateral inhibition to minimise losses during rank-order encoding. In doing so, we propose the Filter-overlap Correction algorithm. To test the performance of rank-order codes in a biologically realistic model, we design and simulate a model of the foveal-pit ganglion cells of the retina keeping close to biological parameters. We use this as a rank-order encoder and analyse its performance relative to VanRullen and Thorpe’s retinal model
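
To make the idea of rank-order coding in the abstract above concrete, here is a toy sketch in Python. It is not VanRullen and Thorpe's retinal model and does not include the pseudo-inverse or lateral-inhibition steps. The function names, the explicit sign flag, and the look-up table `lut` are all assumptions made for illustration. The encoder keeps only the order in which coefficients would "fire" (largest magnitude first), and the decoder assigns each rank a fixed magnitude, which shows where information is lost.

```python
def rank_order_encode(coeffs):
    """Return (index, sign) pairs ordered by decreasing magnitude.

    The magnitudes themselves are discarded; only the firing order survives.
    """
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    return [(i, 1 if coeffs[i] >= 0 else -1) for i in order]


def rank_order_decode(code, n, lut):
    """Rebuild an estimate by giving the r-th fired coefficient magnitude lut[r]."""
    estimate = [0.0] * n
    for rank, (index, sign) in enumerate(code):
        estimate[index] = sign * lut[rank]
    return estimate


# Hypothetical filter responses and a hypothetical decreasing look-up table.
coeffs = [0.2, -0.9, 0.5, 0.1]
lut = [1.0, 0.5, 0.25, 0.125]

code = rank_order_encode(coeffs)
estimate = rank_order_decode(code, len(coeffs), lut)
```

The estimate preserves the ordering and signs of the original coefficients but not their values, which is the kind of loss the abstract sets out to reduce.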

University of Lincoln Institutional Repository

An investigation into the use of genetic algorithms for shape recognition

Author: Egan Thomas Michael
Publication date: 01/01/1999

The use of the genetic algorithm for shape recognition has been investigated in relation to features along a shape boundary contour. Various methods for encoding chromosomes were investigated, the most successful of which led to the development of a new technique to input normalised 'perceptually important point' features from the contour into a genetic algorithm. Chromosomes evolve with genes defining various ways of 'observing' different parts of the contour. The normalisation process provides the capability for multi-scale spatial frequency filtering and fine/coarse resolution of the contour features. A standard genetic algorithm was chosen for this investigation because its performance can be analysed by applying schema analysis to the genes. A new method for measurement of gene diversity has been developed. It is shown that this diversity measure can be used to direct the genetic algorithm parameters to evolve a number of 'good' chromosomes. In this way a variety of sections along the contour can be observed. A new and effective recognition technique has been developed which makes use of these 'good' chromosomes and the same fitness calculation as used in the genetic algorithm. Correct recognition can be achieved by selecting chromosomes and adjusting two thresholds, the values of which are found not to be critical. Difficulties associated with the calculation of a shape's fitness were analysed and the structure of the genes in the chromosome investigated using schema and epistatic analysis. It was shown that the behaviour of the genetic algorithm is compatible with the schema theorem of J. H. Holland. Reasons are given to explain the minimum value for the mutation probability that is required for the evolution of a number of 'good' chromosomes. Suggestions for future research are made and, in particular, it is recommended that the convergence properties of the standard genetic algorithm be investigated

Open Research Online (The Open University)

A perceptual comparison of empirical and predictive region-of-interest video

Author: Ghinea G
Gulliver SR
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

When viewing multimedia presentations, a user only attends to a relatively small part of the video display at any one point in time. By shifting allocation of bandwidth from peripheral areas to those locations where a user’s gaze is more likely to rest, attentive displays can be produced. Attentive displays aim to reduce resource requirements while minimizing negative user perception—understood in this paper as not only a user’s ability to assimilate and understand information but also his/her subjective satisfaction with the video content. This paper introduces and discusses a perceptual comparison between two region-of-interest display (RoID) adaptation techniques. A RoID is an attentive display where bandwidth has been preallocated around measured or highly probable areas of user gaze. In this paper, video content was manipulated using two sources of data: empirical measured data (captured using eye-tracking technology) and predictive data (calculated from the physical characteristics of the video data). Results show that display adaptation causes significant variation in users’ understanding of specific multimedia content. Interestingly, RoID adaptation and the type of video being presented both affect user perception of video quality. Moreover, the use of frame rates less than 15 frames per second, for any video adaptation technique, caused a significant reduction in user perceived quality, suggesting that although users are aware of video quality reduction, it does impact level of information assimilation and understanding. Results also highlight that user level of enjoyment is significantly affected by the type of video yet is not as affected by the quality or type of video adaptation—an interesting implication in the field of entertainment

Central Archive at the University of Reading

CiteSeerX

Crossref

Brunel University Research Archive

A Perceptually Based Comparison of Image Similarity Metrics

Author: Russell Richard
Sinha Pawan
Publication venue: The Cupola: Scholarship at Gettysburg College
Publication date: 01/01/2011

The assessment of how well one image matches another forms a critical component both of models of human visual processing and of many image analysis systems. Two of the most commonly used norms for quantifying image similarity are L1 and L2, which are specific instances of the Minkowski metric. However, there is often not a principled reason for selecting one norm over the other. One way to address this problem is by examining whether one metric, better than the other, captures the perceptual notion of image similarity. This can be used to derive inferences regarding similarity criteria the human visual system uses, as well as to evaluate and design metrics for use in image-analysis applications. With this goal, we examined perceptual preferences for images retrieved on the basis of the L1 versus the L2 norm. These images were either small fragments without recognizable content, or larger patterns with recognizable content created by vector quantization. In both conditions the participants showed a small but consistent preference for images matched with the L1 metric. These results suggest that, in the domain of natural images of the kind we have used, the L1 metric may better capture human notions of image similarity
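
The L1 and L2 norms mentioned in the abstract above are the order-1 and order-2 cases of the Minkowski metric. The short Python sketch below shows how the two can retrieve different matches for the same query. The pixel values and helper names are made up for illustration and are not taken from the study.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length pixel sequences."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest(query, candidates, p):
    """Index of the candidate closest to the query under the order-p metric."""
    return min(
        range(len(candidates)),
        key=lambda i: minkowski_distance(query, candidates[i], p),
    )


query = [0, 0, 0, 0]
candidates = [
    [3, 0, 0, 0],  # one large deviation: L1 = 3, L2 = 3
    [1, 1, 1, 1],  # many small deviations: L1 = 4, L2 = 2
]

l1_choice = nearest(query, candidates, p=1)
l2_choice = nearest(query, candidates, p=2)
```

Here L1 selects the candidate with a single large deviation and L2 selects the candidate with many small ones, because squaring penalises large per-pixel differences more heavily.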

CiteSeerX

Gettysburg College

Perceptual compression of magnitude-detected synthetic aperture radar imagery

Author: Gorman John D.
Werness Susan A.

A perceptually-based approach for compressing synthetic aperture radar (SAR) imagery is presented. Key components of the approach are a multiresolution wavelet transform, a bit allocation mask based on an empirical human visual system (HVS) model, and hybrid scalar/vector quantization. Specifically, wavelet shrinkage techniques are used to segregate wavelet transform coefficients into three components: local means, edges, and texture. Each of these three components is then quantized separately according to a perceptually-based bit allocation scheme. Wavelet coefficients associated with local means and edges are quantized using high-rate scalar quantization while texture information is quantized using low-rate vector quantization. The impact of the perceptually-based multiresolution compression algorithm on visual image quality, impulse response, and texture properties is assessed for fine-resolution magnitude-detected SAR imagery; excellent image quality is found at bit rates at or above 1 bpp along with graceful performance degradation at rates below 1 bpp
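
As a rough illustration of the pipeline described in the abstract above, the Python sketch below applies one level of an unnormalised Haar transform to a 1-D signal, splits the detail coefficients by magnitude, and quantizes the parts with different step sizes. Several things here are assumptions: the Haar filter, the magnitude threshold used to separate "edges" from "texture", the step sizes, and the use of a coarse scalar quantizer in place of the vector quantizer the authors describe.

```python
import math


def haar_step(signal):
    """One level of an unnormalised Haar transform on an even-length signal."""
    means = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return means, details


def split_details(details, threshold):
    """Assumed split: large-magnitude details are edges, the rest is texture."""
    edges = [d if abs(d) > threshold else 0.0 for d in details]
    texture = [d if abs(d) <= threshold else 0.0 for d in details]
    return edges, texture


def quantize(values, step):
    """Uniform scalar quantizer; a smaller step means a higher bit rate."""
    return [step * math.floor(v / step + 0.5) for v in values]


signal = [10, 10, 10, 30, 31, 29, 30, 30]
means, details = haar_step(signal)
edges, texture = split_details(details, threshold=2)

# Fine steps for local means and edges; a coarse step stands in for low-rate VQ.
q_means = quantize(means, step=1)
q_edges = quantize(edges, step=5)
q_texture = quantize(texture, step=4)
```

With these values the single strong edge survives quantization while the small texture coefficient is rounded to zero, mirroring the idea of spending bits where the viewer is most sensitive.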

NASA Technical Reports Server

Detection of Motion Vector-Based Video Steganography by Adding or Subtracting One Motion Vector Value

Author: Bachu Srinivas
Madam Aravind Kumar
Publication venue: 'IntechOpen'
Publication date: 05/11/2018

Steganography has made tremendous progress in recent decades, yet detecting it in motion-based video remains difficult because the content is constantly in motion. Analyzing differences in the rated motion value plays a crucial role: it lets us focus on the difference between the locally optimal SAD and the actual SAD after an adding-or-subtracting-one operation on the motion value. Classification and feature extraction are then performed on the motion vectors, using two feature sets based on the fact that most motion vectors are locally optimal for most video codecs. Compared with conventional approaches in the literature, the proposed technique is reported to meet application requirements for steganalysis of video
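
The add-or-subtract-one idea in the abstract above can be sketched in Python as follows. SAD is the sum of absolute differences between a block and its motion-compensated reference. The sketch computes the SAD at a given motion vector and at its neighbours reached by adding or subtracting one from each component, then reports how far the actual SAD is from the local optimum. The frame data, function names, and the single "gap" value are illustrative assumptions, not the authors' feature sets.

```python
def sad(cur, ref, bx, by, mv, size):
    """SAD between the block at (bx, by) in cur and the block displaced by mv in ref."""
    dx, dy = mv
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
        for y in range(size)
        for x in range(size)
    )


def neighbourhood_sads(cur, ref, bx, by, mv, size):
    """SAD at mv and at every vector within +/-1 of it that stays inside ref."""
    height, width = len(ref), len(ref[0])
    results = {}
    for ddy in (-1, 0, 1):
        for ddx in (-1, 0, 1):
            cand = (mv[0] + ddx, mv[1] + ddy)
            x0, y0 = bx + cand[0], by + cand[1]
            if 0 <= x0 and x0 + size <= width and 0 <= y0 and y0 + size <= height:
                results[cand] = sad(cur, ref, bx, by, cand, size)
    return results


def local_optimality_gap(cur, ref, bx, by, mv, size):
    """Actual SAD minus the locally optimal SAD; zero means mv is locally optimal."""
    sads = neighbourhood_sads(cur, ref, bx, by, mv, size)
    return sads[mv] - min(sads.values())


# Synthetic 6x6 frames: cur is ref shifted by one pixel, so the true vector is (1, 0).
ref = [[x * x + 10 * y for x in range(6)] for y in range(6)]
cur = [[ref[y][min(x + 1, 5)] for x in range(6)] for y in range(6)]

gap_clean = local_optimality_gap(cur, ref, 2, 2, (1, 0), 2)
gap_modified = local_optimality_gap(cur, ref, 2, 2, (2, 0), 2)
```

An unmodified vector gives a gap of zero, while a vector perturbed by one (as an embedding scheme might do) gives a positive gap, which is the kind of signal a steganalysis classifier could use.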

IntechOpen

Crossref

Real-time filtering and detection of dynamics for compression of HDTV

Author: Bauer Peter
Sauer Ken D.

The preprocessing of video sequences for data compression is discussed. The end goal associated with this is a compression system for HDTV capable of transmitting perceptually lossless sequences at under one bit per pixel. Two subtopics were emphasized to prepare the video signal for more efficient coding: (1) nonlinear filtering to remove noise and shape the signal spectrum to take advantage of insensitivities of human viewers; and (2) segmentation of each frame into temporally dynamic/static regions for conditional frame replenishment. The latter technique operates best under the assumption that the sequence can be modelled as a superposition of active foreground and static background. The considerations were restricted to monochrome data, since the standard luminance/chrominance decomposition, which concentrates most of the bandwidth requirements in the luminance, was expected to be used. Similar methods may be applied to the two chrominance signals
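
Item (2) in the abstract above, conditional frame replenishment, can be illustrated with a small Python sketch. Each frame is divided into blocks, a block is marked dynamic when its mean absolute difference from the previous frame exceeds a threshold, and only dynamic blocks are copied into the receiver's frame. The block size, threshold, mean-absolute-difference test, and function names are assumptions for illustration and are not taken from the report.

```python
def dynamic_block_mask(prev, cur, block, threshold):
    """Mark each block True if its mean absolute frame difference exceeds threshold."""
    height, width = len(cur), len(cur[0])
    mask = []
    for by in range(0, height, block):
        row = []
        for bx in range(0, width, block):
            diffs = [
                abs(cur[y][x] - prev[y][x])
                for y in range(by, min(by + block, height))
                for x in range(bx, min(bx + block, width))
            ]
            row.append(sum(diffs) / len(diffs) > threshold)
        mask.append(row)
    return mask


def replenish(receiver, cur, mask, block):
    """Copy only the dynamic blocks of cur into a copy of the receiver's frame."""
    out = [row[:] for row in receiver]
    for j, mask_row in enumerate(mask):
        for i, is_dynamic in enumerate(mask_row):
            if not is_dynamic:
                continue
            for y in range(j * block, min((j + 1) * block, len(cur))):
                for x in range(i * block, min((i + 1) * block, len(cur[0]))):
                    out[y][x] = cur[y][x]
    return out


# Static 4x4 background, a moving 2x2 patch, and one pixel of low-level noise.
prev = [[0] * 4 for _ in range(4)]
cur = [row[:] for row in prev]
for y in range(2):
    for x in range(2):
        cur[y][x] = 50
cur[3][3] = 1

mask = dynamic_block_mask(prev, cur, block=2, threshold=2)
updated = replenish(prev, cur, mask, block=2)
```

Only the top-left block is transmitted; the noisy but static bottom-right block is left as it was, which is where the bandwidth saving comes from under the foreground/background assumption.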

NASA Technical Reports Server

Perceptual Copyright Protection Using Multiresolution Wavelet-Based Watermarking And Fuzzy Logic

Author: Hsieh Ming-Shing
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 01/07/2010

In this paper, an efficient DWT-based watermarking technique is proposed to embed signatures in images to attest the owner identification and discourage unauthorized copying. This paper deals with a fuzzy inference filter to choose the larger-entropy coefficients in which to embed watermarks. Unlike most previous watermarking frameworks, which embedded watermarks in the larger coefficients of inner coarser subbands, the proposed technique is based on utilizing a context model and fuzzy inference filter by embedding watermarks in the larger-entropy coefficients of coarser DWT subbands. The proposed approaches allow us to embed watermarks with an adaptive casting degree for transparency and robustness to general image-processing attacks such as smoothing, sharpening, and JPEG compression. The approach does not need the original host image to extract watermarks. Our schemes have been shown to provide very good results in both image transparency and robustness. Comment: 13 pages, 7 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals