
    HDR or SDR? A Subjective and Objective Study of Scaled and Compressed Videos

    We conducted a large-scale study of human perceptual quality judgments of High Dynamic Range (HDR) and Standard Dynamic Range (SDR) videos subjected to various scaling and compression levels and viewed on three different display devices. HDR videos are able to present wider color gamuts, better contrasts, and brighter whites and darker blacks than SDR videos. While conventional expectations are that HDR quality is better than SDR quality, we found that subject preference for HDR versus SDR depends heavily on the display device, as well as on resolution scaling and bitrate. To study this question, we collected more than 23,000 quality ratings from 67 volunteers who watched 356 videos on OLED, QLED, and LCD televisions. Since it is of interest to be able to measure the quality of videos under these scenarios, e.g. to inform decisions regarding scaling, compression, and SDR vs HDR, we tested several well-known full-reference and no-reference video quality models on the new database. Towards advancing progress on this problem, we also developed a novel no-reference model called HDRPatchMAX, which uses both classical and bit-depth-sensitive distortion statistics and performs more accurately than existing metrics.

    Multimodal Content Analysis for Effective Advertisements on YouTube

    The rapid advances in e-commerce and Web 2.0 technologies have greatly increased the impact of commercial advertisements on the general public. As a key enabling technology, a multitude of recommender systems exist that analyze user features and browsing patterns to recommend appealing advertisements to users. In this work, we study the attributes that characterize an effective advertisement and recommend a useful set of features to aid the design and production processes of commercial advertisements. We analyze the temporal patterns from multimedia content of advertisement videos, including auditory, visual and textual components, and study their individual roles and synergies in the success of an advertisement. The objective of this work is then to measure the effectiveness of an advertisement, and to recommend a useful set of features to advertisement designers to make it more successful and approachable to users. Our proposed framework employs the signal processing technique of cross-modality feature learning, where data streams from different components are employed to train separate neural network models and are then fused together to learn a shared representation. Subsequently, a neural network model trained on this joint feature embedding representation is utilized as a classifier to predict advertisement effectiveness. We validate our approach using subjective ratings from a dedicated user study, the sentiment strength of online viewer comments, and a viewer opinion metric of the ratio of the Likes and Views received by each advertisement from an online platform. Comment: 11 pages, 5 figures, ICDM 201
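    The cross-modality fusion described in this abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' model: the feature dimensions, layer sizes, and the names `DIMS`, `init_layer`, `dense`, and `predict_effectiveness` are all hypothetical, the weights are random, and training is omitted. It only shows the data flow: one encoder per modality, concatenation into a shared representation, then a classifier.

```python
import math
import random

random.seed(0)

def init_layer(n_in, n_out):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[random.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

def dense(x, layer, activation=math.tanh):
    """Apply one fully connected layer to a single feature vector."""
    weights, bias = layer
    return [
        activation(sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j])
        for j in range(len(bias))
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical per-modality feature sizes (assumptions, not from the paper).
DIMS = {"audio": 20, "visual": 64, "text": 30}

# One separate encoder per modality, each mapping to an 8-dimensional code.
encoders = {name: init_layer(dim, 8) for name, dim in DIMS.items()}
# Shared layer operating on the concatenated modality codes.
shared = init_layer(8 * len(DIMS), 16)
# Classifier on top of the joint embedding.
classifier = init_layer(16, 1)

def predict_effectiveness(features):
    """Return a score in (0, 1) from a dict of per-modality feature vectors."""
    encoded = []
    for name in DIMS:
        encoded.extend(dense(features[name], encoders[name]))
    joint = dense(encoded, shared)  # shared representation
    return dense(joint, classifier, activation=sigmoid)[0]

features = {name: [random.gauss(0.0, 1.0) for _ in range(dim)]
            for name, dim in DIMS.items()}
score = predict_effectiveness(features)
```

    Because the weights are untrained, `score` carries no meaning here; in practice each encoder and the shared layer would be fitted to labelled advertisements.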

    Digital color image processing and psychophysics within the framework of a human visual model

    Journal Article. A three-dimensional homomorphic model of human color vision based on neurophysiological and psychophysical evidence is presented. This model permits the quantitative definition of perceptually important parameters such as brightness, saturation, hue and strength. By modelling neural interaction in the human visual system as three linear filters operating on perceptual quantities, this model accounts for the automatic gain control properties of the eye and for brightness and color contrast effects. In relation to color contrast effects, a psychophysical experiment was performed. It utilized a high quality color television monitor driven by a general purpose digital computer. This experiment, based on the cancellation by human subjects of simultaneous color contrast illusions, allowed the measurement of the low spatial frequency part of the frequency responses of the filters operating on the two chromatic channels of the human visual system. The experiment is described and its results are discussed. Next, the model is shown to provide a suitable framework in which to perform digital image processing tasks. First, applications to color image enhancement are presented and discussed in relation to photographic masking techniques and to the handling of digital color images. Second, application of the model to the definition of a distortion measure between color images (in the sense of Shannon's rate-distortion theory), meaningful in terms of human evaluation, is shown. Mathematical norms in the "perceptual" space defined by the model are used to evaluate quantitatively the amount of subjective distortion present in artificially distorted color images. Results of a coding experiment yielding digital color images coded at an average bit rate of 1 bit/pixel are shown. Finally, conclusions are drawn about the implications of this research from the standpoints of psychophysics and of digital image processing.

    Beyond pitch/duration scoring: Towards a system dynamics model of electroacoustic music

    Based on a hierarchy of discrete pitches and metrically sub-divisible duration, Western tonal art music is usually modelled through printed music scores. Scoring acoustic musical events beyond this paradigm has resulted in non-standard graphs in two dimensions. New digitally generated ‘soundscape’ forms are often not conceived or understandable within traditional musical paradigms or notation models, and often explore attributes of music, such as spatial processing, that fall outside two-dimensional graphic scoring. To date there is no commonly accepted model that approximates the structural dynamics of electroacoustic music and provides a conceptual framework independent of the music to the degree that standard music notation does. Based on recent work in spectro-morphology as a way of explaining sound shapes, a system dynamics model is proposed through mapping a dynamic taxonomy for structural listening as an aid to composition. This approach captures formal but not semiotic discourse.

    Dynamic Lighting for Tension in Games

    Video and computer games are among the most complex forms of interactive media. Games simulate many elements of traditional media, such as plot, characters, sound and music, lighting and mise-en-scene. However, games are digital artifacts played through graphic interfaces and controllers. As interactive experiences, games present a host of player challenges ranging from more deliberate decision-making and problem-solving strategies to the immediate charge of reflex action. Games thus draw upon a unique mix of player resources, contributing to what Lindley refers to as the "game-play gestalt", "a particular way of thinking about the game state from the perspective of a player, together with a pattern of repetitive perceptual, cognitive, and motor operations" (Lindley, 2003).

    Literature survey: perceived quality of fluoroscopic images


    Multimodal enhancement-fusion technique for natural images.

    Masters Degree. University of KwaZulu-Natal, Durban. This dissertation presents a multimodal enhancement-fusion (MEF) technique for natural images. The MEF is expected to contribute value to machine vision applications and personal image collections for the human user. Image enhancement techniques and the metrics that are used to assess their performance are prolific, and each is usually optimised for a specific objective. The MEF proposes a framework that adaptively fuses multiple enhancement objectives into a seamless pipeline. Given a segmented input image and a set of enhancement methods, the MEF applies all the enhancers to the image in parallel. The most appropriate enhancement in each image segment is identified, and finally, the differentially enhanced segments are seamlessly fused. To begin with, this dissertation studies targeted contrast enhancement methods and performance metrics that can be utilised in the proposed MEF. It addresses a selection of objective assessment metrics for contrast-enhanced images and determines their relationship with the subjective assessment of human visual systems. This is to identify which objective metrics best approximate human assessment and may therefore be used as an effective replacement for tedious human assessment surveys. A subsequent human visual assessment survey is conducted on the same dataset to ascertain image quality as perceived by a human observer. The interrelated concepts of naturalness and detail were found to be key motivators of human visual assessment. Findings show that when assessing the quality or accuracy of these methods, no single quantitative metric correlates well with human perception of naturalness and detail; however, a combination of two or more metrics may be used to approximate the complex human visual response. Thereafter, this dissertation proposes the multimodal enhancer that adaptively selects the optimal enhancer for each image segment. MEF focusses on improving chromatic irregularities such as poor contrast distribution. It deploys a concurrent enhancement pathway that subjects an image to multiple image enhancers in parallel, followed by a fusion algorithm that creates a composite image that combines the strengths of each enhancement path. The study develops a framework for parallel image enhancement, followed by parallel image assessment and selection, leading to final merging of selected regions from the enhanced set. The output combines desirable attributes from each enhancement pathway to produce a result that is superior to each path taken alone. The study showed that the proposed MEF technique performs well for most image types. MEF is subjectively favourable to a human panel and achieves better performance for objective image quality assessment compared to other enhancement methods.
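    The enhance, assess, select, and fuse steps described in this abstract can be sketched as follows. This is an illustrative toy under stated assumptions, not the dissertation's method: the image is a flattened list of grayscale intensities in [0, 1], the two enhancers (`gamma_enhance`, `stretch_enhance`) and the standard-deviation `contrast_score` are hypothetical stand-ins, and fusion is a hard per-segment composite with no seam blending.

```python
from statistics import pstdev

def gamma_enhance(pixels, gamma=0.5):
    """Hypothetical enhancer: gamma correction that brightens dark regions."""
    return [min(max(p, 0.0), 1.0) ** gamma for p in pixels]

def stretch_enhance(pixels):
    """Hypothetical enhancer: linear stretch to the full [0, 1] range."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return list(pixels)
    return [(p - lo) / (hi - lo) for p in pixels]

def contrast_score(values):
    """Stand-in quality metric: population standard deviation of intensities."""
    return pstdev(values)

def fuse_best_per_segment(pixels, labels, enhancers, score=contrast_score):
    """Apply every enhancer to the whole image, then keep, for each
    segment, the pixels from whichever enhanced copy scores highest."""
    candidates = [enhance(pixels) for enhance in enhancers]
    fused = list(pixels)
    chosen = {}
    for seg in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == seg]
        scores = [score([cand[i] for i in idx]) for cand in candidates]
        best = scores.index(max(scores))
        for i in idx:
            fused[i] = candidates[best][i]  # hard composite, no blending
        chosen[seg] = best
    return fused, chosen

pixels = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # two segments
fused, chosen = fuse_best_per_segment(
    pixels, labels, [gamma_enhance, stretch_enhance]
)
```

    `chosen` maps each segment label to the index of the selected enhancer. The abstract reports that no single metric tracks human judgments well, so a realistic `score` would likely combine several metrics rather than rely on contrast alone.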