1,636 research outputs found

    Kuvanlaatukokemuksen arvionnin instrumentit (Instruments for Evaluating the Experience of Image Quality)

    This dissertation describes the instruments available for image quality evaluation, develops new methods for subjective image quality evaluation and provides image and video databases for the assessment and development of image quality assessment (IQA) algorithms. The contributions of the thesis are based on six original publications. The first publication introduced the VQone toolbox for subjective image quality evaluation. It created a platform for free-form experimentation with standardized image quality methods and was the foundation for later studies. The second publication focused on the dilemma of reference in subjective experiments by proposing a new method for image quality evaluation: the absolute category rating with dynamic reference (ACR-DR). The third publication presented a database (CID2013) in which 480 images were evaluated by 188 observers using the ACR-DR method proposed in the prior publication. Providing databases of image files along with their quality ratings is essential in the field of IQA algorithm development. The fourth publication introduced a video database (CVD2014) based on having 210 observers rate 234 video clips. The temporal aspect of the stimuli creates peculiar artifacts and degradations, as well as challenges to experimental design and video quality assessment (VQA) algorithms. When the CID2013 and CVD2014 databases were published, most state-of-the-art I/VQAs had been trained on and tested against databases created by degrading an original image or video with a single distortion at a time. The novel aspect of CID2013 and CVD2014 was that they consisted of multiple concurrent distortions. To facilitate communication and understanding among professionals in various fields of image quality as well as among non-professionals, an attribute lexicon of image quality, the image quality wheel, was presented in the fifth publication of this thesis. 
Reference wheels and terminology lexicons have a long tradition in sensory evaluation contexts, such as taste experience studies, where they are used to facilitate communication among interested stakeholders; however, such an approach has not been common in visual experience domains, especially in studies on image quality. The sixth publication examined how the free descriptions given by the observers influenced the ratings of the images. Understanding how various elements, such as perceived sharpness and naturalness, affect subjective image quality can help to understand the decision-making processes behind image quality evaluation. Knowing the impact of each preferential attribute can then be used for I/VQA algorithm development; certain I/VQA algorithms already incorporate low-level human visual system (HVS) models.
[Translated from the Finnish abstract:] This dissertation examines and develops new methods for image quality evaluation and provides image and video databases for testing and developing image quality assessment (IQA) algorithms. What is experienced as beautiful and pleasing is a psychologically interesting question. The work is also relevant to industry in improving the image quality of cameras. The dissertation comprises six publications that examine the topic from different perspectives. In Publication I, an application was developed for collecting people's ratings of presented images and made freely available to researchers. It made it possible to test standardized image quality evaluation methods and to develop new methods based on them, laying the foundation for the later studies. In Publication II, a new image quality evaluation method was developed. The method uses a serial presentation of images that gives observers an impression of the range of quality variation before the actual evaluation. This was found to reduce the variance of the results and to discriminate smaller differences in image quality. Publication III describes a database containing the quality ratings given by 188 observers for 480 images, together with the associated image files. Databases are a valuable tool when developing algorithms for automatic image quality assessment. They are needed, among other things, as training material for AI-based algorithms and for comparing the performance of different algorithms. The better an algorithm's predictions correlate with human quality ratings, the better its performance can be said to be. Publication IV presents a database containing the quality ratings given by 210 observers for 234 video clips, together with the associated video files. Because of the temporal dimension, the artifacts in video stimuli differ from those in still images, which poses its own challenges for video quality assessment (VQA) algorithms. The stimuli in earlier databases were created, for example, by gradually blurring a single image, so they contain only one-dimensional distortions. The databases presented here differ from earlier ones in that they contain multiple simultaneous distortions, whose interaction may be significant for image quality. Publication V presents the image quality wheel, a lexicon of image quality concepts compiled by analyzing 39,415 verbal descriptions of image quality produced by 146 observers. Lexicons have a long tradition in sensory evaluation research, but none had previously been developed for image quality. Publication VI studied how the concepts given by observers affect the evaluation of image quality. For example, the rated sharpness or naturalness of images helps in understanding the decision-making processes behind quality evaluation. This knowledge can be used, for example, in the development of image and video quality assessment (I/VQA) algorithms.
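The abstract above states that an algorithm performs better the more closely its predictions correlate with human quality ratings. A minimal sketch of that comparison follows, assuming the two correlation measures commonly reported in IQA benchmarking (Pearson for linearity, Spearman for rank order). The abstract does not say which measure the author used, and the scores below are hypothetical, not taken from CID2013 or CVD2014.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: mean opinion scores and one algorithm's predictions.
mean_opinion_scores = [1.8, 2.5, 3.1, 3.9, 4.6]
predicted_quality = [0.20, 0.35, 0.30, 0.70, 0.90]

print("Pearson: ", round(pearson(mean_opinion_scores, predicted_quality), 3))
print("Spearman:", round(spearman(mean_opinion_scores, predicted_quality), 3))
```

Rank correlation is often preferred when the algorithm's output scale is not linearly related to the rating scale, since it only asks whether the ordering of stimuli agrees.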

    Visual experience of 3D TV


    Sound mosaics: a graphical user interface for sound synthesis based on audio-visual associations.

    This thesis presents the design of a Graphical User Interface (GUI) for computer-based sound synthesis to support users in the externalisation of their musical ideas when interacting with the system in order to create and manipulate sound. The approach taken consisted of three research stages. The first stage was the formulation of a novel visualisation framework to display perceptual dimensions of sound in visual terms. This framework was based on the findings of existing related studies and a series of empirical investigations of the associations between auditory and visual percepts that we performed for the first time in the area of computer-based sound synthesis. The results of our empirical investigations suggested associations between the colour dimensions of brightness and saturation with the auditory dimensions of pitch and loudness respectively, as well as associations between the multidimensional percepts of visual texture and timbre. The second stage of the research involved the design and implementation of Sound Mosaics, a prototype GUI for sound synthesis based on direct manipulation of visual representations that make use of the visualisation framework developed in the first stage. We followed an iterative design approach that involved the design and evaluation of an initial Sound Mosaics prototype. The insights gained during this first iteration assisted us in revising various aspects of the original design and visualisation framework that led to a revised implementation of Sound Mosaics. The final stage of this research involved an evaluation study of the revised Sound Mosaics prototype that comprised two controlled experiments. First, a comparison experiment with the widely used frequency-domain representations of sound indicated that visual representations created with Sound Mosaics were more comprehensible and intuitive.
Comprehensibility was measured as the level of accuracy in a series of sound-image association tasks, while intuitiveness was related to subjects' response times and perceived levels of confidence. Second, we conducted a formative evaluation of Sound Mosaics, in which it was exposed to a number of users with and without musical background. Three usability factors were measured: effectiveness, efficiency, and subjective satisfaction. Sound Mosaics was demonstrated to perform satisfactorily in all three factors for music subjects, although non-music subjects yielded less satisfactory results that can be primarily attributed to the subjects' unfamiliarity with the task of sound synthesis. Overall, our research has set the necessary groundwork for empirically derived and validated associations between auditory and visual dimensions that can be used in the design of cognitively useful GUIs for computer-based sound synthesis and related areas.

    Perceptual Image Quality Of Launch Vehicle Imaging Telescopes

    A large fleet (in the hundreds) of high-quality telescopes is used for tracking and imaging launch vehicles during ascent from Cape Canaveral Air Force Station and Kennedy Space Center. A maintenance tool has been developed for use with these telescopes. The tool requires rankings of telescope condition in terms of the ability to generate useful imagery. It is thus a case of ranking telescope conditions on the basis of the perceptual image quality of their imagery. Perceptual image quality metrics that are well correlated with observer opinions of image quality have been available for several decades. However, these are quite limited in their applications, not being designed to compare various optical systems. The perceptual correlation of the metrics implies that a constant image quality curve (such as the boundary between two qualitative categories labeled as excellent and good) would have a constant value of the metric. This is not the case if the optical system parameters (such as object distance or aperture diameter) are varied. No published data on such direct variation are available, and this dissertation presents an investigation into the perceptual metric responses as system parameters are varied. This investigation leads to some non-intuitive conclusions. The perceptual metrics are reviewed, as are more common metrics and their inability to perform as this research requires. Perceptual test methods are also reviewed, as is the human visual system. Image formation theory is presented in a non-traditional form, yielding the surprising result that perceptual image quality is invariant under changes in focal length if the final displayed image remains constant. Experimental results are presented of changes in perceived image quality as aperture diameter is varied. Results are analyzed and shortcomings in the process and metrics are discussed.
Using the test results, predictions are made about the form of the metric response to object distance variations, and subsequent testing was conducted to validate the predictions. The utility of the results, limitations of applicability, and the immediate ability to further generalize the results are presented.

    Scene-Dependency of Spatial Image Quality Metrics

    This thesis is concerned with the measurement of spatial imaging performance and the modelling of spatial image quality in digital capturing systems. Spatial imaging performance and image quality relate to the objective and subjective reproduction of luminance contrast signals by the system, respectively; they are critical to overall perceived image quality. The Modulation Transfer Function (MTF) and Noise Power Spectrum (NPS) describe the signal (contrast) transfer and noise characteristics of a system, respectively, with respect to spatial frequency. They are both, strictly speaking, only applicable to linear systems since they are founded upon linear system theory. Many contemporary capture systems use adaptive image signal processing, such as denoising and sharpening, to optimise output image quality. These non-linear processes change their behaviour according to characteristics of the input signal (i.e. the scene being captured). This behaviour renders system performance “scene-dependent” and difficult to measure accurately. The MTF and NPS are traditionally measured from test charts containing suitable predefined signals (e.g. edges, sinusoidal exposures, noise or uniform luminance patches). These signals trigger adaptive processes at uncharacteristic levels since they are unrepresentative of natural scene content. Thus, for systems using adaptive processes, the resultant MTFs and NPSs are not representative of performance “in the field” (i.e. capturing real scenes). Spatial image quality metrics for capturing systems aim to predict the relationship between MTF and NPS measurements and subjective ratings of image quality. They cascade both measures with contrast sensitivity functions that describe human visual sensitivity with respect to spatial frequency. The most recent metrics designed for adaptive systems use MTFs measured using the dead leaves test chart that is more representative of natural scene content than the abovementioned test charts. 
This marks a step toward modelling image quality with respect to real scene signals. This thesis presents novel scene-and-process-dependent MTFs (SPD-MTF) and NPSs (SPD-NPS). They are measured from imaged pictorial scene (or dead leaves target) signals to account for system scene-dependency. Further, a number of spatial image quality metrics are revised to account for capture system and visual scene-dependency. Their MTF and NPS parameters were replaced with SPD-MTFs and SPD-NPSs. Likewise, their standard visual functions were replaced with contextual detection (cCSF) or discrimination (cVPF) functions. In addition, two novel spatial image quality metrics are presented (the log Noise Equivalent Quanta (NEQ) and Visual log NEQ) that implement SPD-MTFs and SPD-NPSs. The metrics, SPD-MTFs and SPD-NPSs were validated by analysing measurements from simulated image capture pipelines that applied either linear or adaptive image signal processing. The SPD-NPS measures displayed little evidence of measurement error, and the metrics performed most accurately when they used SPD-NPSs measured from images of scenes. The benefit of deriving SPD-MTFs from images of scenes was traded off, however, against measurement bias. Most metrics performed most accurately with SPD-MTFs derived from dead leaves signals. Implementing the cCSF or cVPF did not increase metric accuracy. The log NEQ and Visual log NEQ metrics proposed in this thesis were highly competitive, outperforming metrics of the same genre. They were also more consistent than the IEEE P1858 Camera Phone Image Quality (CPIQ) metric when their input parameters were modified. The advantages and limitations of all performance measures and metrics were discussed, as well as their practical implementation and relevant applications.
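The abstract names the log NEQ metric without defining it. As a rough illustration only, the sketch below uses the textbook relationship in which noise equivalent quanta at each spatial frequency is proportional to MTF squared divided by NPS, and summarises it as a mean of log10 values. The signal-level scaling factor is omitted, the averaging step is an assumption rather than the thesis's formulation, and the sample MTF and NPS values are hypothetical.

```python
import math


def neq(mtf, nps):
    """Per-frequency NEQ, up to a signal scaling factor: MTF(u)^2 / NPS(u)."""
    return [m * m / n for m, n in zip(mtf, nps)]


def mean_log_neq(mtf, nps):
    """Illustrative summary: mean of log10 NEQ over the sampled frequencies."""
    values = neq(mtf, nps)
    return sum(math.log10(v) for v in values) / len(values)


# Hypothetical measurements at three spatial frequencies.
mtf_samples = [1.0, 0.5, 0.1]      # contrast transfer falls with frequency
nps_samples = [0.01, 0.01, 0.01]   # flat (white) noise power spectrum

print(neq(mtf_samples, nps_samples))
print(round(mean_log_neq(mtf_samples, nps_samples), 4))
```

In the scene-dependent variant the abstract describes, the inputs would be SPD-MTF and SPD-NPS measurements rather than chart-based ones; the ratio itself would be computed the same way under this sketch's assumptions.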

    Contrast sensitivity in images of natural scenes

    The contrast sensitivity function (CSF) characterizes spatial detection in the human visual system and is typically measured from simple, synthetic stimuli. We used spatial frequency decomposition, RMS contrast modulation, a yes/no paradigm and an adaptive staircase to measure isolated and contextual CSFs (iCSFs and cCSFs) from natural images. We employed Barten’s mechanistic model and adapted it for contextual modeling purposes by postulating that signal detection in a given frequency band, when presented amongst other broadband signals, can be modeled as if amongst noise. We found that the iCSF varies with pictorial content, but that the standard CSF model and the image’s contrast spectrum are sufficient to predict the cCSF for any given image with relative success. Finally, we discuss the suitability of cCSF models in image quality modeling.
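The abstract refers to RMS contrast without stating a formula. One common definition, assumed here, is the standard deviation of luminance divided by its mean; the authors may have used a different normalisation. The pixel values are hypothetical.

```python
import math


def rms_contrast(luminance):
    """RMS contrast as population standard deviation over mean luminance."""
    mean = sum(luminance) / len(luminance)
    variance = sum((v - mean) ** 2 for v in luminance) / len(luminance)
    return math.sqrt(variance) / mean


# Hypothetical luminance samples (cd/m^2) from a frequency band plus its mean level.
uniform_patch = [100.0, 100.0, 100.0, 100.0]
modulated_patch = [50.0, 150.0, 50.0, 150.0]

print(rms_contrast(uniform_patch))    # no variation, so zero contrast
print(rms_contrast(modulated_patch))
```

Under this definition, scaling the deviations from the mean by a factor scales the RMS contrast by the same factor, which is what makes it convenient to modulate in a staircase procedure.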

    Soundscape design of water features used in outdoor spaces where road traffic noise is audible

    This research focused on the soundscape design of a wide range of small to medium sized water features (waterfalls, fountains with upward jet(s), and streams) which can be used in gardens or parks for promoting peacefulness and relaxation in the presence of road traffic noise. Firstly, the thesis examined the audio-visual interaction and perceptual assessment of water features, including the semantic components and the qualitative categorisation and evocation of water sounds; and secondly, the thesis investigated the effectiveness of the water features tested in promoting relaxation through sound mapping. Different laboratory tests were carried out, and these included paired comparison tests (audio-only, visual-only and audio-visual tests), semantic differential tests, as well as tests aimed at the qualitative categorisation and evocation of water features. Sound maps of the water-generated sounds were developed through the use of propagation models based on either point or line sources. Three acoustic zones (‘water sounds dominant zone’, ‘optimum zone’ and ‘RTN dominant zone’ (RTN: road traffic noise)) were defined in the maps as the zones where relaxation/pleasantness can be promoted over a 20 m × 20 m area for different road traffic noise levels. Paired comparisons highlighted the interdependence between uni-modal (audio-only or visual-only) and bi-modal (audio-visual) perception, indicating that equal attention should be given to the design of both stimuli. In general, natural-looking features tended to increase preference scores (compared to audio-only paired comparison scores), while manmade-looking features decreased them. Semantic descriptors showed significant correlations with preferences and were found to be more reliable design criteria than physical parameters.
A principal component analysis identified three components within the nine semantic attributes tested: “emotional assessment,” “sound quality,” and “envelopment and temporal variation.” The first two showed significant correlations with audio-only preferences, “emotional assessment” being the most important predictor of preferences, and its attributes naturalness, relaxation, and freshness also being significantly correlated with preferences. Categorisation results indicated that natural stream sounds are easily identifiable (unlike waterfalls and fountains), while evocation results showed no unique relationship with preferences. The results of sound maps indicated that small to medium sized water features can be used mainly in environments where road traffic noise levels are equal to or lower than 65 dBA.
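The abstract mentions sound maps built from point-source or line-source propagation models but gives no equations. A minimal sketch, assuming free-field geometric spreading only: the level falls by 20 log10(r/r0) dB for a point source and 10 log10(r/r0) dB for a line source. Ground effects, air absorption and barriers are ignored, and the reference level and distances are hypothetical.

```python
import math

# Decay slope in dB per decade of distance for each idealised source type.
SPREADING_SLOPE_DB = {"point": 20.0, "line": 10.0}


def level_at(distance_m, ref_level_db, ref_distance_m, source="point"):
    """Sound level at distance_m, given a level measured at ref_distance_m."""
    slope = SPREADING_SLOPE_DB[source]
    return ref_level_db - slope * math.log10(distance_m / ref_distance_m)


# Hypothetical fountain measured at 70 dBA, 1 m from the source.
for distance in (1.0, 2.0, 5.0, 10.0, 20.0):
    point_level = level_at(distance, 70.0, 1.0, "point")
    line_level = level_at(distance, 70.0, 1.0, "line")
    print(f"{distance:5.1f} m  point: {point_level:5.1f} dBA  line: {line_level:5.1f} dBA")
```

Evaluating such a function over a grid of positions and comparing the result with the road traffic noise level at each position is one way zones like those described in the abstract could be delineated, though the thesis's actual procedure is not given here.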