
    Extraction and Investigation of Biosignal Features for Visual Discomfort Evaluation (Biosignalų požymių regos diskomfortui vertinti išskyrimas ir tyrimas)

    Comfortable stereoscopic perception continues to be an essential area of research. The growing interest in virtual reality content and the increasing market for head-mounted displays (HMDs) still raise the issue of balancing depth perception and comfortable viewing. Stereoscopic views stimulate binocular cues, one of several human visual depth cues, which become conflicting cues when stereoscopic displays are used. Depth perception by binocular cues is based on matching image features from one retina with corresponding features from the other retina. It is known that our eyes can tolerate small amounts of retinal defocus, a tolerance known as depth of focus. When magnitudes are larger, visual discomfort arises. The research object of the doctoral dissertation is the level of visual discomfort. This work aimed at the objective evaluation of visual discomfort based on physiological signals. In some cases, different levels of disparity and the number of details in stereoscopic views make it difficult to quickly find the focus point for comfortable depth perception. During this investigation, a tendency toward differences in single-sensor electroencephalographic (EEG) signal activity at specific frequencies was found. Changes in gaze signals collected by an eye tracker were also found. A dataset of EEG and gaze signal records from 28 control subjects was collected and used for further evaluation. The dissertation consists of an introduction, three chapters and general conclusions. The first chapter presents the fundamental knowledge on ways of measuring visual discomfort with objective and subjective methods. The second chapter presents the theoretical research results. This research aimed to investigate methods which use physiological signals to detect changes in the level of the sense of presence. Results of the experimental research are presented in the third chapter. 
This research aimed to find differences in collected physiological signals when the level of visual discomfort changes. An experiment with 28 control subjects was conducted to collect these signals. The results of the thesis were published in six scientific publications: three as peer-reviewed scientific papers and three in conference proceedings. Additionally, the results of the research were presented at eight conferences. Dissertation

    Advances in Image Processing, Analysis and Recognition Technology

    For many decades, researchers have been trying to make computer analysis of images as effective as the human visual system. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment but quite often significantly increase our safety. In fact, the range of practical applications of image processing algorithms is particularly wide. Moreover, the rapid growth of computational complexity and computer efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues remain, resulting in the need for the development of novel approaches.

    Perceptually Optimized Visualization on Autostereoscopic 3D Displays

    The family of displays that aim to visualize a 3D scene with realistic depth is known as "3D displays". Due to technical limitations and design decisions, such displays create visible distortions, which are interpreted by human vision as artefacts. In the absence of a visual reference (e.g. when the original scene is not available for comparison), one can improve the perceived quality of the representations by making the distortions less visible. This thesis proposes a number of signal processing techniques for decreasing the visibility of artefacts on 3D displays. The visual perception of depth is discussed, and the properties (depth cues) of a scene which the brain uses for assessing an image in 3D are identified. Following the physiology of vision, a taxonomy of 3D artefacts is proposed. The taxonomy classifies the artefacts based on their origin and on the way they are interpreted by the human visual system. The principles of operation of the most popular types of 3D displays are explained. Based on the display operation principles, 3D displays are modelled as a signal processing channel. The model is used to explain the process of introducing distortions. It also allows one to identify which optical properties of a display are most relevant to the creation of artefacts. A set of optical properties for dual-view and multiview 3D displays is identified, and a methodology for measuring them is introduced. The measurement methodology allows one to derive the angular visibility and crosstalk of each display element without the need for precision measurement equipment. Based on the measurements, a methodology for creating a quality profile of 3D displays is proposed. The quality profile can be either simulated using the angular brightness function or directly measured from a series of photographs. A comparative study is presented that reports measurement results on the visual quality and sweet-spot positions of eleven 3D displays of different types. 
Knowing the sweet-spot position and the quality profile allows for easy comparison between 3D displays. The shape and size of the passband allow the depth and textures of 3D content to be optimized for a given 3D display. Based on knowledge of 3D artefact visibility and an understanding of the distortions introduced by 3D displays, a number of signal processing techniques for artefact mitigation are created. A methodology for creating anti-aliasing filters for 3D displays is proposed. For multiview displays, the methodology is extended towards so-called passband optimization, which addresses the Moiré, fixed-pattern-noise and ghosting artefacts that are characteristic of such displays. Additionally, the design of tuneable anti-aliasing filters is presented, along with a framework which allows the user to select the so-called 3D sharpness parameter according to his or her preferences. Finally, a set of real-time algorithms for view-point-based optimization is presented. These algorithms require active user-tracking, which is implemented as a combination of face- and eye-tracking. Once the observer position is known, the image on a stereoscopic display is optimised for the derived observation angle and distance. For multiview displays, the combination of precise light re-direction and less-precise face-tracking is used for extending the head parallax. For some user-tracking algorithms, implementation details are given regarding execution of the algorithm on a mobile device or on a desktop computer with a graphics accelerator.

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Landmark Localization, Feature Matching and Biomarker Discovery from Magnetic Resonance Images

    The work presented in this thesis proposes several methods that can be roughly divided into three categories: I) landmark localization in medical images, II) feature matching for image registration, and III) biomarker discovery in neuroimaging. The first part deals with the identification of anatomical landmarks. The motivation stems from the fact that the manual identification and labeling of these landmarks is very time-consuming and prone to observer errors, especially when large datasets must be analyzed. In this thesis we present three methods to tackle this challenge: a landmark descriptor based on local self-similarities (SS), a subspace building framework based on manifold learning, and a sparse coding landmark descriptor based on a data-specific learned dictionary basis. The second part of this thesis deals with finding matching features between a pair of images. These matches can be used to perform a registration between them. Registration is a powerful tool that allows mapping images into a common space in order to aid in their analysis. Accurate registration can be challenging to achieve using intensity-based registration algorithms. Here, a framework is proposed for learning correspondences in pairs of images by matching SS features, and random sample consensus (RANSAC) is employed as a robust model estimator to learn a deformation model based on feature matches. Finally, the third part of the thesis deals with biomarker discovery using machine learning. In this section, a framework is proposed for feature extraction from learned low-dimensional subspaces that represent inter-subject variability. The manifold subspace is built using data-driven regions of interest (ROI). These regions are learned via sparse regression with stability selection. 
Also, probabilistic distribution models for different stages in the disease trajectory are estimated for different class populations in the low-dimensional manifold and used to construct a probabilistic scoring function. Open Access

    Text–to–Video: Image Semantics and NLP

    When automatically translating an arbitrary text into a visual story, the main challenge is to find a semantically close visual representation whose displayed meaning remains the same as in the given text. Besides, the appearance of an image itself largely influences how its meaningful information is conveyed to an observer. This thesis demonstrates that investigating both image semantics and the semantic relatedness between visual and textual sources enables us to tackle the challenging semantic gap and to find a semantically close translation from natural language to a corresponding visual representation. In recent years, social networking has attracted great interest, leading to an enormous and still increasing amount of data available online. Photo sharing sites like Flickr allow users to associate textual information with their uploaded imagery. Thus, this thesis exploits this huge knowledge source of user-generated data, which provides initial links between images and words as well as other meaningful data. In order to approach visual semantics, this work presents various methods to analyze the visual structure as well as the appearance of images in terms of meaningful similarities, aesthetic appeal, and emotional effect on an observer. In detail, our GPU-based approach efficiently finds visual similarities between images in large datasets across visual domains and identifies various meanings for ambiguous words by exploring similarity in online search results. Further, we investigate the highly subjective aesthetic appeal of images and make use of deep learning to directly learn aesthetic rankings from a broad diversity of user reactions in social online behavior. To gain even deeper insights into the influence of visual appearance on an observer, we explore how simple image processing is capable of actually changing the emotional perception and derive a simple but effective image filter. 
To identify meaningful connections between written text and visual representations, we employ methods from Natural Language Processing (NLP). Extensive textual processing allows us to create semantically relevant illustrations for simple text elements as well as complete storylines. More precisely, we present an approach that resolves dependencies in textual descriptions to arrange 3D models correctly. Further, we develop a method that finds semantically relevant illustrations for texts of different types based on a novel hierarchical querying algorithm. Finally, we present an optimization-based framework that is capable of generating picture stories in different styles that are not only semantically relevant but also visually coherent.

    Biometric Systems

    Because of the accelerating progress in biometrics research and the latest nation-state threats to security, this book's publication is not only timely but also much needed. This volume contains seventeen peer-reviewed chapters reporting the state of the art in biometrics research: security issues, signature verification, fingerprint identification, wrist vascular biometrics, ear detection, face detection and identification (including a new survey of face recognition), person re-identification, electrocardiogram (ECG) recognition, and several multi-modal systems. This book will be a valuable resource for graduate students, engineers, and researchers interested in understanding and investigating this important field of study.

    Using binaural audio for inducing intersensory illusions to create illusory tactile feedback in virtual reality

    Virtual reality (VR) has the potential to simulate a variety of real-world scenarios for training and entertainment purposes, as it can induce a sense of “presence”: the illusion that the user is physically transported to another location and is really “there”. VR technologies have seen a recent market resurgence due to the arrival of affordable, mass-market VR display systems, such as the Oculus Rift, HTC Vive, PlayStation VR, Samsung GearVR, and Google Cardboard. However, tactile feedback that conveys information about the virtual environment is often lacking in VR applications. This study addresses this gap by proposing the use of binaural audio in VR to induce illusory tactile feedback. It examines the literature on intersensory illusions, as well as the relationship between audio and tactile feedback, to inform the design of a software prototype that is able to induce the desired feedback. This prototype is used to test the viability of such an approach and to investigate the nature of the resulting feedback. The software prototype is used to collect data from users regarding their experiences of this type of feedback and its underlying causes. Data were collected through observation, questionnaires, interviews, and focus groups, and the results indicate that binaural audio in VR can effectively induce an illusory sense of tactile feedback in the absence of real-world feedback. This study contributes insights regarding the nature of illusory sensations in VR, focusing on touch sensations. This study also provides consolidated definitions of immersion and presence as well as a consolidated list of aspects of immersion, both of which are used to detail the relationship between immersion, presence, and illusory tactile feedback. Findings provide insight into the relationship between the design of audio in VR and its ability to alter perception in the tactile modality. 
Findings also provide insight into aspects of VR, such as presence and believability, and their relationship to perception across various sensory modalities. Dissertation (MIS)--University of Pretoria 2018. Information Science. MIS. Unrestricted.