
    Visual Distortions in 360-degree Videos.

    Omnidirectional (or 360°) images and videos are emergent signals used in many areas, such as robotics and virtual/augmented reality. In particular, for virtual reality applications, they allow an immersive experience in which the user, wearing a head-mounted display, can interactively navigate through a scene with three degrees of freedom. Current approaches for capturing, processing, delivering, and displaying 360° content, however, present many open technical challenges and introduce several types of distortions in the visual signal. Some of these distortions are specific to the nature of 360° images and often differ from those encountered in classical visual communication frameworks. This paper provides a first comprehensive review of the most common visual distortions that alter 360° signals as they pass through the different processing elements of the visual communication pipeline. While their impact on viewers' visual perception and on the immersive experience at large is still unknown (and thus an open research topic), this review proposes a taxonomy of the visual distortions that can be encountered in 360° signals. Their underlying causes in the end-to-end 360° content distribution pipeline are identified. This taxonomy is essential as a basis for comparing different processing techniques, such as visual enhancement, encoding, and streaming strategies, and for enabling the effective design of new algorithms and applications. It is also a useful resource for the design of psycho-visual studies aiming to characterize human perception of 360° content in interactive and immersive applications.

    The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

    Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning, and the results have been utilized by other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found only in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments, some "old" and some new. These are generally unknown to most of the astronomical community, but are vital to the analysis and visualization of complex data sets and images. In order for astronomers to take advantage of the richness and complexity of the new era of data, and to be able to identify, adopt, and apply new solutions, the astronomical community needs a certain degree of awareness and understanding of the new concepts. One of the goals of this paper is to help bridge the gap between applied mathematics, artificial intelligence and computer science on the one side and astronomy on the other. Comment: 24 pages, 8 figures, 1 table. Accepted for publication in "Advances in Astronomy", special issue "Robotic Astronomy".

    3D Scene Geometry Estimation from 360° Imagery: A Survey

    This paper provides a comprehensive survey of pioneering and state-of-the-art 3D scene geometry estimation methodologies based on one, two, or multiple images captured with omnidirectional optics. We first revisit the basic concepts of the spherical camera model, and review the most common acquisition technologies and representation formats suitable for omnidirectional (also called 360°, spherical or panoramic) images and videos. We then survey monocular layout and depth inference approaches, highlighting the recent advances in learning-based solutions suited for spherical data. Classical stereo matching is then revisited in the spherical domain, where methodologies for detecting and describing sparse and dense features become crucial. The stereo matching concepts are then extended to multiple-view camera setups, categorized into light fields, multi-view stereo, and structure from motion (or visual simultaneous localization and mapping). We also compile and discuss commonly adopted datasets and figures of merit indicated for each purpose, and list recent results for completeness. We conclude this paper by pointing out current and future trends. Comment: Published in ACM Computing Surveys.

    A Dynamic Approach to Pose Invariant Face Identification Using Cellular Simultaneous Recurrent Networks

    Face recognition is a widely studied research field that has produced many techniques and approaches. Most of them have severe limitations with pose variations or face rotation. The immediate goal of this thesis is to deal with pose variations by implementing a face recognition system using a Cellular Simultaneous Recurrent Network (CSRN). The CSRN is a novel bio-inspired recurrent neural network that mimics reinforcement learning in the brain. The recognition task is defined as an identification problem on image sequences. The goal is to correctly match a set of unknown pose-distorted probe face sequences with a set of known gallery sequences. The system comprises a pre-processing stage for face and feature extraction and a recognition stage to perform the identification. The face detection algorithm is based on the scale-space method combined with facial structural knowledge. These steps include extraction of key landmark points and motion unit vectors that describe the movement of face sequences. The identification process applies Eigenface and PCA and reduces each image to a pattern vector used as input for the CSRN. In the training phase the CSRN learns the temporal information contained in image sequences. In the testing phase the network predicts the output pattern and measures its similarity with a test input pattern, indicating a match or mismatch. Previous applications of a CSRN system in face recognition have shown promise. The first objective of this research is to evaluate those prior implementations of CSRN-based pose-invariant face recognition in video images with large-scale databases. The publicly available VidTIMIT Audio-Video face dataset provides all the sequences needed for this study. The second objective is to modify a few well-known standard face recognition algorithms to handle pose-invariant face recognition for appropriate benchmarking with the CSRN. The final objective is to further improve CSRN face recognition by introducing motion units which can be used to capture the direction and intensity of movement of feature points in a rotating fac

    Automatic Detection and Characterization of Pathological Fluid Regions in Optical Coherence Tomography Images

    Programa Oficial de Doutoramento en Computación. 5009V01. [Abstract] Intraretinal fluid accumulation is both a common symptom and a culprit of the main causes of blindness in developed countries: Age-related Macular Degeneration and Diabetic Macular Edema. For its diagnosis, domain experts employ Optical Coherence Tomography (OCT) images, which provide non-invasive cross-sectional representations of the retinal structures. However, like any medical imaging modality, OCT is influenced by multiple factors that impact its quality and subsequent interpretation. Coupled with the subjectivity of the human experts, these factors can significantly affect the diagnostic process, treatment and quality of life of the affected individuals (particularly in these pathologies, where early detection is crucial). To address these challenges, Computer-Aided Diagnosis (CAD) methodologies are developed, offering a layer of abstraction of the information present in the images. Still, in the particular scenario of these pathological fluid accumulations, the development of these methodologies is especially difficult due to their diffuse nature, without defined boundaries. In this thesis, we propose different CAD methodologies with the objective of helping expert clinicians to better detect and understand these pathologies. Furthermore, we expand the developed methodologies to other medical imaging modalities and conditions, such as macular neovascularizations in OCT Angiographies and COVID-19 diagnosis through the analysis of chest radiographs.