6,619 research outputs found
A perceptual comparison of empirical and predictive region-of-interest video
When viewing multimedia presentations, a user only
attends to a relatively small part of the video display at any one point in time. By shifting allocation of bandwidth from peripheral areas to those locations where a userâs gaze is more likely to rest, attentive displays can be produced. Attentive displays aim to reduce resource requirements while minimizing negative user perceptionâunderstood in this paper as not only a userâs ability to assimilate and understand information but also his/her subjective satisfaction with the video content. This paper introduces and discusses a perceptual comparison between two region-of-interest display (RoID) adaptation techniques. A RoID is an attentive display where bandwidth has been preallocated around measured or highly probable areas of user gaze. In this paper, video content was manipulated using two sources of data: empirical measured data (captured using eye-tracking technology) and predictive data (calculated from the physical characteristics of the video data). Results show that display adaptation causes significant variation in usersâ understanding of specific multimedia content. Interestingly, RoID adaptation and the type of video being presented both affect user perception of video quality. Moreover, the use of frame rates less than 15 frames per second, for any video adaptation technique, caused a significant reduction in user perceived quality, suggesting that although users are aware of video quality reduction, it does impact level of information assimilation and understanding. Results also highlight that user level of enjoyment is significantly affected by the type of video yet is not as affected by the quality or type of video adaptationâan interesting implication in the field of entertainment
A Fuzzy Approach to Text Segmentation in Web Images Based on Human Colour Perception
This chapter describes a new approach for the segmentation of text in images on Web pages. In the same spirit as the authorsâ previous work on this subject, this approach attempts to model the ability of humans to differentiate between colours. In this case, pixels of similar colour are first grouped using a colour distance defined in a perceptually uniform colour space (as opposed to the commonly used RGB). The resulting colour connected components are then grouped to form larger (character-like) regions with the aid of a propinquity measure, which is the output of a fuzzy inference system. This measure expresses the likelihood for merging two components based on two features. The first feature is the colour distance between the components, in the L*a*b* colour space. The second feature expresses the topological relationship of two components. The results of the method indicate a better performance than previous methods devised by the authors and possibly better (a direct comparison is not really possible due to the differences in application domain characteristics between this and previous methods) performance to other existing methods
Explaining Classifiers using Adversarial Perturbations on the Perceptual Ball
We present a simple regularization of adversarial perturbations based upon
the perceptual loss. While the resulting perturbations remain imperceptible to
the human eye, they differ from existing adversarial perturbations in that they
are semi-sparse alterations that highlight objects and regions of interest
while leaving the background unaltered. As a semantically meaningful adverse
perturbations, it forms a bridge between counterfactual explanations and
adversarial perturbations in the space of images. We evaluate our approach on
several standard explainability benchmarks, namely, weak localization,
insertion deletion, and the pointing game demonstrating that perceptually
regularized counterfactuals are an effective explanation for image-based
classifiers.Comment: CVPR 202
Two Approaches for Text Segmentation in Web Images
There is a significant need to recognise the text in images on web pages, both for effective indexing and for presentation by non-visual means (e.g., audio). This paper presents and compares two novel methods for the segmentation of characters for subsequent extraction and recognition. The novelty of both approaches is the combination of (different in each case) topological features of characters with an anthropocentric perspective of colour perceptionâ in preference to RGB space analysis. Both approaches enable the extraction of text in complex situations such as in the presence of varying colour and texture (characters and background)
Two Approaches for Text Segmentation in Web Images
There is a significant need to recognise the text in images on web pages, both for effective indexing and for presentation by non-visual means (e.g., audio). This paper presents and compares two novel methods for the segmentation of characters for subsequent extraction and recognition. The novelty of both approaches is the combination of (different in each case) topological features of characters with an anthropocentric perspective of colour perceptionâ in preference to RGB space analysis. Both approaches enable the extraction of text in complex situations such as in the presence of varying colour and texture (characters and background)
- âŠ