3,215 research outputs found
Interaction between high-level and low-level image analysis for semantic video object extraction
Authors of articles published in EURASIP Journal on Advances in Signal Processing are the copyright holders of their articles and have granted to any third party, in advance and in perpetuity, the right to use, reproduce or disseminate the article, according to the SpringerOpen copyright and license agreement (http://www.springeropen.com/authors/license)
Information recovery from rank-order encoded images
The time to detection of a visual stimulus by the primate eye is recorded at
100–150 ms. This near-instantaneous recognition is in spite of the considerable
processing required by the several stages of the visual pathway to recognise and
react to a visual scene. How this is achieved is still a matter of speculation.
Rank-order codes have been proposed as a means of encoding by the primate
eye in the rapid transmission of the initial burst of information from the sensory
neurons to the brain. We study the efficiency of rank-order codes in encoding
perceptually-important information in an image. VanRullen and Thorpe built a
model of the ganglion cell layers of the retina to simulate and study the viability
of rank-order as a means of encoding by retinal neurons. We validate their model
and quantify the information retrieved from rank-order encoded images in terms
of the visually-important information recovered. Towards this goal, we apply
the ‘perceptual information preservation algorithm’, proposed by Petrovic and
Xydeas, after slight modification. We observe low information recovery due
to losses suffered during the rank-order encoding and decoding processes. We
propose to minimise these losses to recover maximum information in minimum
time from rank-order encoded images. We first maximise information recovery by
using the pseudo-inverse of the filter-bank matrix to minimise losses during rank-order
decoding. We then apply the biological principle of lateral inhibition to
minimise losses during rank-order encoding. In doing so, we propose the Filter-overlap
Correction algorithm. To test the performance of rank-order codes in
a biologically realistic model, we design and simulate a model of the foveal-pit
ganglion cells of the retina keeping close to biological parameters. We use this
as a rank-order encoder and analyse its performance relative to VanRullen and
Thorpe’s retinal model
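
As a rough illustration of why rank-order coding loses information, the following minimal Python sketch encodes a list of filter responses by firing order alone and then decodes them with an assumed geometric decay. The function names, the `decay` value, and the separate `signs` list are hypothetical choices made for this example; they are not taken from VanRullen and Thorpe's model or from the thesis.

```python
def rank_order_encode(coeffs):
    """Return coefficient indices ordered by decreasing magnitude (the firing order)."""
    return sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))


def rank_order_decode(order, signs, decay=0.8):
    """Rebuild approximate coefficients from rank alone.

    Assumption: the value at rank r is decay**r. Real decoders use a
    lookup table or, as in the abstract, a pseudo-inverse of the filter bank.
    """
    recon = [0.0] * len(order)
    for rank, idx in enumerate(order):
        recon[idx] = signs[idx] * decay ** rank
    return recon


# Hypothetical filter responses for four cells.
coeffs = [0.2, -0.9, 0.5, 0.05]
order = rank_order_encode(coeffs)
signs = [1 if c >= 0 else -1 for c in coeffs]
recon = rank_order_decode(order, signs)
print(order)
print(recon)
```

Only the ordering survives encoding, so the reconstructed magnitudes depend entirely on the assumed decay. This is the kind of loss the abstract sets out to reduce.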
An investigation into the use of genetic algorithms for shape recognition
The use of the genetic algorithm for shape recognition has been investigated in relation to features along a shape boundary contour. Various methods for encoding chromosomes were investigated, the most successful of which led to the development of a new technique to input normalised 'perceptually important point' features from the contour into a genetic algorithm. Chromosomes evolve with genes defining various ways of 'observing' different parts of the contour. The normalisation process provides the capability for multi-scale spatial frequency filtering and fine/coarse resolution of the contour features. A standard genetic algorithm was chosen for this investigation because its performance can be analysed by applying schema analysis to the genes. A new method for measurement of gene diversity has been developed. It is shown that this diversity measure can be used to direct the genetic algorithm parameters to evolve a number of 'good' chromosomes. In this way a variety of sections along the contour can be observed. A new and effective recognition technique has been developed which makes use of these 'good' chromosomes and the same fitness calculation as used in the genetic algorithm. Correct recognition can be achieved by selecting chromosomes and adjusting two thresholds, the values of which are found not to be critical. Difficulties associated with the calculation of a shape's fitness were analysed and the structure of the genes in the chromosome investigated using schema and epistatic analysis. It was shown that the behaviour of the genetic algorithm is compatible with the schema theorem of J. H. Holland. Reasons are given to explain the minimum value for the mutation probability that is required for the evolution of a number of 'good' chromosomes. Suggestions for future research are made and, in particular, it is recommended that the convergence properties of the standard genetic algorithm be investigated
A perceptual comparison of empirical and predictive region-of-interest video
When viewing multimedia presentations, a user only attends to a relatively small part of the video display at any one point in time. By shifting allocation of bandwidth from peripheral areas to those locations where a user’s gaze is more likely to rest, attentive displays can be produced. Attentive displays aim to reduce resource requirements while minimizing negative user perception, understood in this paper as not only a user’s ability to assimilate and understand information but also his/her subjective satisfaction with the video content. This paper introduces and discusses a perceptual comparison between two region-of-interest display (RoID) adaptation techniques. A RoID is an attentive display where bandwidth has been preallocated around measured or highly probable areas of user gaze. In this paper, video content was manipulated using two sources of data: empirical measured data (captured using eye-tracking technology) and predictive data (calculated from the physical characteristics of the video data). Results show that display adaptation causes significant variation in users’ understanding of specific multimedia content. Interestingly, RoID adaptation and the type of video being presented both affect user perception of video quality. Moreover, the use of frame rates less than 15 frames per second, for any video adaptation technique, caused a significant reduction in user perceived quality, suggesting that although users are aware of video quality reduction, it does impact level of information assimilation and understanding. Results also highlight that user level of enjoyment is significantly affected by the type of video yet is not as affected by the quality or type of video adaptation, an interesting implication in the field of entertainment
A Perceptually Based Comparison of Image Similarity Metrics
The assessment of how well one image matches another forms a critical component both of models of human visual processing and of many image analysis systems. Two of the most commonly used norms for quantifying image similarity are L1 and L2, which are specific instances of the Minkowski metric. However, there is often not a principled reason for selecting one norm over the other. One way to address this problem is by examining whether one metric, better than the other, captures the perceptual notion of image similarity. This can be used to derive inferences regarding similarity criteria the human visual system uses, as well as to evaluate and design metrics for use in image-analysis applications. With this goal, we examined perceptual preferences for images retrieved on the basis of the L1 versus the L2 norm. These images were either small fragments without recognizable content, or larger patterns with recognizable content created by vector quantization. In both conditions the participants showed a small but consistent preference for images matched with the L1 metric. These results suggest that, in the domain of natural images of the kind we have used, the L1 metric may better capture human notions of image similarity
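
To make the difference between the two norms concrete, here is a small Python sketch of the Minkowski distance, applied to two hypothetical candidate patches. The pixel values are invented for illustration and do not come from the study. One candidate has a small error at every pixel, the other a single large error.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length pixel lists."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Hypothetical 4-pixel patches (illustrative values only).
query = [10, 10, 10, 10]
cand_a = [12, 12, 12, 12]  # small error at every pixel
cand_b = [10, 10, 10, 17]  # one large error, otherwise exact

l1_a = minkowski_distance(query, cand_a, 1)
l1_b = minkowski_distance(query, cand_b, 1)
l2_a = minkowski_distance(query, cand_a, 2)
l2_b = minkowski_distance(query, cand_b, 2)
print(l1_a, l1_b)  # L1 ranks cand_b as the closer match
print(l2_a, l2_b)  # L2 ranks cand_a as the closer match
```

Because L2 squares each difference, it penalises the single large error more heavily than L1 does, so the two norms can retrieve different best matches for the same query.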
Perceptual compression of magnitude-detected synthetic aperture radar imagery
A perceptually-based approach for compressing synthetic aperture radar (SAR) imagery is presented. Key components of the approach are a multiresolution wavelet transform, a bit allocation mask based on an empirical human visual system (HVS) model, and hybrid scalar/vector quantization. Specifically, wavelet shrinkage techniques are used to segregate wavelet transform coefficients into three components: local means, edges, and texture. Each of these three components is then quantized separately according to a perceptually-based bit allocation scheme. Wavelet coefficients associated with local means and edges are quantized using high-rate scalar quantization while texture information is quantized using low-rate vector quantization. The impact of the perceptually-based multiresolution compression algorithm on visual image quality, impulse response, and texture properties is assessed for fine-resolution magnitude-detected SAR imagery; excellent image quality is found at bit rates at or above 1 bpp along with graceful performance degradation at rates below 1 bpp
Detection of Motion Vector-Based Video Steganography by Adding or Subtracting One Motion Vector Value
Steganography has made tremendous progress in recent decades, yet detecting it in motion-vector-based video remains difficult because the content is constantly in motion. Analyzing differences in the motion values plays a crucial role: it lets us focus on the difference between the locally optimal sum of absolute differences (SAD) and the actual SAD after an adding-or-subtracting-one operation on the motion value. Classification and feature extraction operate on the motion vectors, using two feature sets based on the fact that most motion vectors are locally optimal for most video codecs. Compared with conventional approaches reported in the literature, the proposed technique is claimed to better meet application requirements for detecting steganography in videos
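
The local-optimality idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the feature extraction used in the paper: the helper names, the 8-neighbour search window, and the toy frame are all hypothetical.

```python
def sad(block, ref, top, left):
    """Sum of absolute differences between block and the same-sized window of ref."""
    return sum(
        abs(block[r][c] - ref[top + r][left + c])
        for r in range(len(block))
        for c in range(len(block[0]))
    )


def is_locally_optimal(block, ref, pos, mv):
    """True if no add-or-subtract-one change to mv yields a strictly lower SAD."""
    (y, x), (dy, dx) = pos, mv
    h, w = len(block), len(block[0])
    actual = sad(block, ref, y + dy, x + dx)
    for ddy in (-1, 0, 1):
        for ddx in (-1, 0, 1):
            ty, tx = y + dy + ddy, x + dx + ddx
            inside = 0 <= ty <= len(ref) - h and 0 <= tx <= len(ref[0]) - w
            if (ddy or ddx) and inside and sad(block, ref, ty, tx) < actual:
                return False
    return True


# Toy reference frame and a 2x2 block that matches it exactly at offset (1, 1).
ref = [
    [0, 0, 0, 0],
    [0, 5, 9, 0],
    [0, 7, 3, 0],
    [0, 0, 0, 0],
]
block = [[5, 9], [7, 3]]

clean = is_locally_optimal(block, ref, (0, 0), (1, 1))     # untouched motion vector
tampered = is_locally_optimal(block, ref, (0, 0), (1, 2))  # one component shifted by 1
print(clean, tampered)
```

A motion vector modified to carry hidden data tends to lose local optimality, and that is the property a steganalysis feature built on the SAD difference can exploit.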
Real-time filtering and detection of dynamics for compression of HDTV
The preprocessing of video sequences for data compression is discussed. The end goal is a compression system for HDTV capable of transmitting perceptually lossless sequences at under one bit per pixel. Two subtopics were emphasized to prepare the video signal for more efficient coding: (1) nonlinear filtering to remove noise and shape the signal spectrum to take advantage of insensitivities of human viewers; and (2) segmentation of each frame into temporally dynamic/static regions for conditional frame replenishment. The latter technique operates best under the assumption that the sequence can be modelled as a superposition of active foreground and static background. The considerations were restricted to monochrome data, since the system is expected to use the standard luminance/chrominance decomposition, which concentrates most of the bandwidth requirements in the luminance. Similar methods may be applied to the two chrominance signals
Perceptual Copyright Protection Using Multiresolution Wavelet-Based Watermarking And Fuzzy Logic
In this paper, an efficient DWT-based watermarking technique is proposed to
embed signatures in images to attest the owner identification and discourage
the unauthorized copying. This paper deals with a fuzzy inference filter to
choose the larger entropy of coefficients to embed watermarks. Unlike most
previous watermarking frameworks which embedded watermarks in the larger
coefficients of inner coarser subbands, the proposed technique is based on
utilizing a context model and fuzzy inference filter by embedding watermarks in
the larger-entropy coefficients of coarser DWT subbands. The proposed
approaches allow us to embed adaptive casting degree of watermarks for
transparency and robustness to the general image-processing attacks such as
smoothing, sharpening, and JPEG compression. The approach does not need the
original host image to extract watermarks. Our schemes have been shown to
provide very good results in both image transparency and robustness. Comment: 13 pages, 7 figures