1,915 research outputs found

    Motion and disparity estimation with self adapted evolutionary strategy in 3D video coding

    Real-world information obtained by humans is three-dimensional (3-D). In experimental user trials, subjective assessments have clearly demonstrated the increased impact of 3-D pictures compared to conventional flat-picture techniques. It is reasonable, therefore, that we humans want an imaging system that produces pictures that are as natural and real as the things we see and experience every day. Three-dimensional imaging, and hence 3-D television (3DTV), is a very promising approach expected to satisfy these desires. Integral imaging, which can capture true 3-D color images with only one camera, has been seen as the right technology to offer stress-free viewing to audiences of more than one person. In this paper, we propose a novel approach that uses an Evolutionary Strategy (ES) for joint motion and disparity estimation to compress 3-D integral video sequences. We propose to decompose the integral video sequence into viewpoint video sequences and to jointly exploit motion and disparity redundancies to maximize compression using a self-adapted ES. A half-pixel refinement algorithm is then applied by interpolating macroblocks in the previous frame to further improve the video quality. Experimental results demonstrate that the proposed adaptable ES with Half Pixel Joint Motion and Disparity Estimation can achieve up to 1.5 dB objective quality gain without any additional computational cost over our previous algorithm. Furthermore, the proposed technique achieves objective quality similar to that of the full search algorithm while reducing the computational cost by up to 90%.

    Spatial prediction based on self-similarity compensation for 3D holoscopic image and video coding

    WOS:000298962501022 (Web of Science accession number). Holoscopic imaging, also known as integral imaging, provides a solution for glassless 3D and promises to change the market for 3D television. To start, this paper briefly describes the general concepts of holoscopic imaging, focusing mainly on the spatial correlations inherent to this new type of content, which appear due to the micro-lens array that is used for both acquisition and display. The micro-images that are formed behind each micro-lens, from which only one pixel is viewed from a given observation point, are highly cross-correlated, which can be exploited for coding. A novel scheme for spatial prediction, exploiting the particular arrangement of holoscopic images, is proposed. The proposed scheme can be used for both still image coding and intra-coding of video. Experimental results based on an H.264/AVC video codec modified to handle 3D holoscopic images and video are presented, showing the superior performance of this approach.

    Cortical Dynamics of Navigation and Steering in Natural Scenes: Motion-Based Object Segmentation, Heading, and Obstacle Avoidance

    Visually guided navigation through a cluttered natural scene is a challenging problem that animals and humans accomplish with ease. The ViSTARS neural model proposes how primates use motion information to segment objects and determine heading for purposes of goal approach and obstacle avoidance in response to video inputs from real and virtual environments. The model produces trajectories similar to those of human navigators. It does so by predicting how computationally complementary processes in cortical areas MT-/MSTv and MT+/MSTd compute object motion for tracking and self-motion for navigation, respectively. The model retina responds to transients in the input stream. Model V1 generates a local speed and direction estimate. This local motion estimate is ambiguous due to the neural aperture problem. Model MT+ interacts with MSTd via an attentive feedback loop to compute accurate heading estimates in MSTd that quantitatively simulate properties of human heading estimation data. Model MT- interacts with MSTv via an attentive feedback loop to compute accurate estimates of speed, direction and position of moving objects. This object information is combined with heading information to produce steering decisions wherein goals behave like attractors and obstacles behave like repellers. These steering decisions lead to navigational trajectories that closely match human performance. Funding: National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial Intelligence Agency (NMA201-01-1-2016)

    Dense light field coding: a survey

    Light Field (LF) imaging is a promising solution for providing more immersive and closer to reality multimedia experiences to end-users with unprecedented creative freedom and flexibility for applications in different areas, such as virtual and augmented reality. Due to the recent technological advances in optics, sensor manufacturing and available transmission bandwidth, as well as the investment of many tech giants in this area, it is expected that soon many LF transmission systems will be available to both consumers and professionals. Recognizing this, novel standardization initiatives have recently emerged in both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG), triggering the discussion on the deployment of LF coding solutions to efficiently handle the massive amount of data involved in such systems. Since then, the topic of LF content coding has become a booming research area, attracting the attention of many researchers worldwide. In this context, this paper provides a comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs. Special attention is placed on a thorough description of the different LF coding methods and on the main concepts related to this relevant area. Moreover, comprehensive insights are presented into open research challenges and future research directions for LF coding.

    HEVC-based 3D holoscopic video coding using self-similarity compensated prediction

    Holoscopic imaging, also known as integral, light field, and plenoptic imaging, is an appealing technology for glassless 3D video systems, which has recently emerged as a prospective candidate for future image and video applications, such as 3D television. However, to successfully introduce 3D holoscopic video applications into the market, adequate coding tools that can efficiently handle 3D holoscopic video are necessary. In this context, this paper discusses the requirements and challenges for 3D holoscopic video coding, and presents an efficient 3D holoscopic coding scheme based on High Efficiency Video Coding (HEVC). The proposed 3D holoscopic codec makes use of the self-similarity (SS) compensated prediction concept to efficiently exploit the inherent correlation of the 3D holoscopic content in Intra- and Inter-coded frames, as well as a novel vector prediction scheme to take advantage of the peculiar characteristics of the SS prediction data. Extensive experiments were conducted, and have shown that the proposed solution is able to outperform HEVC as well as other coding solutions proposed in the literature. Moreover, a consistently better performance is also observed for a set of different quality metrics proposed in the literature for 3D holoscopic content, as well as for the visual quality of views synthesized from decompressed 3D holoscopic content.

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) objects may be stored in and retrieved from a pre-attentional store during this task.

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

    No abstract available

    Stereo Pictorial Structure for 2D Articulated Human Pose Estimation

    In this paper, we consider the problem of 2D human pose estimation on stereo image pairs. In particular, we aim at estimating the location, orientation and scale of upper-body parts of people detected in stereo image pairs from realistic stereo videos that can be found on the Internet. To address this task, we propose a novel pictorial structure model to exploit the stereo information included in such stereo image pairs: the Stereo Pictorial Structure (SPS). To validate our proposed model, we contribute a new annotated dataset of stereo image pairs, the Stereo Human Pose Estimation Dataset (SHPED), obtained from YouTube stereoscopic video sequences, depicting people in challenging poses and diverse indoor and outdoor scenarios. The experimental results on SHPED indicate that SPS improves on state-of-the-art monocular models thanks to the appropriate use of the stereo information.

    Spatial and temporal integration of binocular disparity in the primate brain

    The primate visual system strongly relies on the small differences between the two retinal projections to perceive depth. However, it is not fully understood how those binocular disparities are computed and integrated by the nervous system. On the one hand, single-unit recordings in macaques give access to the neuronal encoding of disparity at a very local level. On the other hand, functional neuroimaging (fMRI) studies in humans shed light on the cortical networks involved in disparity processing at a macroscopic level, but in a different species. In this thesis, we propose to use an fMRI approach in macaques to bridge the gap between single-unit and fMRI recordings conducted in the non-human and human primate brain, respectively, by allowing direct comparisons between the two species. More specifically, we focused on the temporal and spatial processing of binocular disparities at the cortical but also at the perceptual level. Investigating cortical activity in response to motion-in-depth, we could show for the first time that 1) there is a dedicated network in macaques that comprises areas beyond the MT cluster and its surroundings and that 2) there are homologies with the human network involved in processing very similar stimuli. In a second study, we tried to establish a link between perceptual biases that reflect statistical regularities in the three-dimensional visual environment and cortical activity, by investigating whether such biases exist and can be related to specific responses at a macroscopic level. We found stronger activity for the stimulus reflecting natural statistics in one subject, demonstrating a potential influence of spatial regularities on cortical activity. Further work is needed to draw firm conclusions about such a link. Nonetheless, we robustly confirmed the existence of a vast cortical network responding to correlated disparities in the macaque brain. Finally, we could measure for the first time retinal corresponding points on the vertical meridian of a macaque subject performing a behavioural task (forced-choice procedure) and compare them with the data we also collected in several human observers with the very same protocol. In the discussion sections, we show how these findings open the door to varied perspectives.