
    Unsupervised learning of human motion

    An unsupervised learning algorithm is presented that can automatically obtain a probabilistic model of an object composed of a collection of parts (a moving human body in our examples) from unlabeled training data. The training data include both useful "foreground" features and features that arise from irrelevant background clutter; the correspondence between parts and detected features is unknown. The joint probability density function of the parts is represented by a mixture of decomposable triangulated graphs, which allow for fast detection. To learn the model structure as well as the model parameters, an EM-like algorithm is developed in which the labeling of the data (the part assignments) is treated as hidden variables. The unsupervised learning technique is not limited to decomposable triangulated graphs. The efficiency and effectiveness of our algorithm are demonstrated by applying it to generate models of human motion automatically from unlabeled image sequences, and by testing the learned models on a variety of sequences.
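
    To make the learning procedure concrete, here is a minimal sketch of an EM-like loop that treats the assignment of detected features to parts as hidden variables. The isotropic-Gaussian part model, the uniform clutter model, and all names (`em_parts_model`, `frames`, `area`) are illustrative assumptions; the abstract's actual model is a mixture of decomposable triangulated graphs, which this sketch does not implement.

```python
# Hedged sketch: EM-like learning of a parts model from unlabeled
# feature detections, with part assignments as hidden variables.
# Gaussian parts + uniform clutter are assumptions, not the paper's model.
import numpy as np

def em_parts_model(frames, n_parts, area, n_iters=50, seed=0):
    """frames: list of (n_features, 2) arrays of detected feature positions.
    area: size of the image region, used for the uniform clutter density."""
    rng = np.random.default_rng(seed)
    all_feats = np.vstack(frames)
    # Initialise part means from randomly chosen features; one shared
    # isotropic variance per part.
    means = all_feats[rng.choice(len(all_feats), n_parts, replace=False)]
    var = np.full(n_parts, 100.0)
    clutter_density = 1.0 / area          # uniform background clutter model
    prior = np.full(n_parts + 1, 1.0 / (n_parts + 1))
    for _ in range(n_iters):
        resp_sum = np.zeros(n_parts + 1)
        weighted = np.zeros((n_parts, 2))
        weighted_sq = np.zeros(n_parts)
        for X in frames:
            # E-step: responsibility of each part (last column: clutter)
            # for each detected feature in this frame.
            d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(-1)
            lik = np.exp(-d2 / (2 * var)) / (2 * np.pi * var)
            lik = np.hstack([lik, np.full((len(X), 1), clutter_density)])
            resp = lik * prior
            resp /= resp.sum(axis=1, keepdims=True)
            resp_sum += resp.sum(axis=0)
            weighted += resp[:, :n_parts].T @ X
            weighted_sq += (resp[:, :n_parts] * d2).sum(axis=0)
        # M-step: re-estimate part means, variances, and mixing weights
        # from the soft assignments.
        means = weighted / resp_sum[:n_parts, None]
        var = weighted_sq / (2 * resp_sum[:n_parts])
        prior = resp_sum / resp_sum.sum()
    return means, var, prior
```

    Each E-step soft-assigns every feature either to one of the parts or to background clutter; each M-step re-estimates the part statistics from those soft assignments, so no labeled correspondences are ever needed.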

    Towards Detection of Human Motion

    Detecting humans in images is a useful application of computer vision. Loose and textured clothing, occlusion, and scene clutter make it a difficult problem because bottom-up segmentation and grouping do not always work. We address the problem of detecting humans from their motion pattern in monocular image sequences; extraneous motions and occlusion may be present. We assume that we may rely neither on segmentation nor on grouping, and that the vision front-end is limited to observing the motion of key points and textured patches between pairs of frames. We do not assume that we are able to track features for more than two frames. Our method is based on learning an approximate probabilistic model of the joint position and velocity of different body features. Detection is performed by hypothesis testing on the maximum a posteriori estimate of the pose and motion of the body. Our experiments on a dozen walking sequences indicate that our algorithm is accurate and efficient.
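
    The following is a minimal sketch of the detection-by-hypothesis-testing idea: score the best assignment of detected features to body parts under a learned Gaussian over joint positions and velocities, then threshold that score. The single-Gaussian body model, the exhaustive search over assignments, and the names (`detect_person`, `threshold`) are illustrative assumptions; the abstract's model and search procedure are more elaborate.

```python
# Hedged sketch: hypothesis test on the best feature-to-part labeling
# under a learned joint position/velocity model. Not the paper's exact
# formulation; the brute-force permutation search is for illustration only.
from itertools import permutations
import numpy as np
from scipy.stats import multivariate_normal

def detect_person(features, mean, cov, threshold):
    """features: (n, 4) array of (x, y, vx, vy) per tracked point.
    mean, cov: learned Gaussian over the stacked (x, y, vx, vy) of all
    body parts, i.e. mean has length 4 * n_parts."""
    n_parts = len(mean) // 4
    body = multivariate_normal(mean, cov)
    best = -np.inf
    # Try every assignment of detected features to body parts and keep
    # the highest log-likelihood (a stand-in for the MAP labeling).
    for assign in permutations(range(len(features)), n_parts):
        x = features[list(assign)].ravel()
        best = max(best, body.logpdf(x))
    # Hypothesis test: declare "person present" only if the best
    # labeling is likely enough under the learned body model.
    return best > threshold, best
```

    The exhaustive search is exponential in the number of parts; decomposable graphical structure (as in the companion work above) is what makes such a search tractable in practice.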

    A computational model for motion detection and direction discrimination in humans

    Seeing biological motion is very important for both humans and computers. Psychophysics experiments show that our visual system's ability to detect biological motion and discriminate its direction differs from its ability to do so for simple translation. Existing quantitative models of motion perception cannot explain these findings. We propose a computational model that uses learning and statistical inference based on the joint probability density function (PDF) of the position and motion of the body, applied to stimuli similar to those of (Neri et al., 1998). Our results agree with the psychophysics, indicating that our model is consistent with human motion perception and accounts for both biological motion and pure translation.
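
    As a minimal sketch of direction discrimination by statistical inference on such a joint PDF, one can compare the likelihood of the observed features under a "walking right" model and its mirror image. The Gaussian form, the mirroring trick, and the names (`discriminate_direction`, `mean_right`) are illustrative assumptions, not the abstract's exact model.

```python
# Hedged sketch: direction discrimination as a likelihood comparison
# under a learned joint position/velocity PDF and its mirror image.
import numpy as np
from scipy.stats import multivariate_normal

def discriminate_direction(obs, mean_right, cov):
    """obs: stacked (x, y, vx, vy) vector over all body features.
    mean_right, cov: learned joint PDF for a rightward walker; the
    leftward model is assumed here to be its mirror image."""
    # Mirror x positions and x velocities: evaluating the flipped
    # observation under the rightward model equals evaluating the
    # original observation under the mirrored (leftward) model.
    flip = np.tile([-1.0, 1.0, -1.0, 1.0], len(mean_right) // 4)
    model = multivariate_normal(mean_right, cov)
    log_r = model.logpdf(obs)
    log_l = model.logpdf(obs * flip)
    # Positive log-likelihood ratio -> respond "rightward".
    return ("right" if log_r > log_l else "left"), log_r - log_l
```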

    Towards Object-Centric Scene Understanding

    Visual perception for autonomous agents continues to attract community attention due to disruptive technologies and the wide applicability of such solutions. Autonomous Driving (AD), a major application in this domain, promises to revolutionize our approach to mobility while bringing critical advantages in reducing accident fatalities. Fueled by recent advances in Deep Learning (DL), more computer vision tasks are being addressed using a learning paradigm. Deep Neural Networks (DNNs) have consistently succeeded in pushing performance to unprecedented levels, demonstrating the ability of such approaches to generalize to an increasing number of difficult problems, such as 3D vision tasks. In this thesis, we address two main challenges arising from current approaches: the computational complexity of multi-task pipelines, and the increasing need for manual annotations. On the one hand, AD systems need to perceive the surrounding environment at different levels of detail and, subsequently, take timely actions. This multitasking further limits the time available for each perception task. On the other hand, the need for such systems to generalize to massively diverse situations requires the use of large-scale datasets covering long-tailed cases. Such a requirement renders traditional supervised approaches, despite the data readily available in the AD domain, unsustainable in terms of annotation costs, especially for 3D tasks. Driven by the nature of the AD environment, whose complexity (unlike that of indoor scenes) is dominated by the presence of other scene elements (mainly cars and pedestrians), we focus on the above-mentioned challenges in object-centric tasks. We then situate our contributions appropriately in the fast-paced literature, while supporting our claims with extensive experimental analysis leveraging up-to-date state-of-the-art results and community-adopted benchmarks.

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point, as sketched below. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This suggests two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
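
    A minimal sketch of the radial-shift manipulation: each rectangle is moved ±1 degree of visual angle along the spoke joining central fixation to its own centre. The coordinate convention (fixation at the origin, positions in degrees) and the random choice of shift sign are illustrative assumptions.

```python
# Hedged sketch: shift rectangle centres in or out along imaginary
# spokes from fixation, as in the manipulation described above.
import numpy as np

def shift_along_spokes(centres, shift_deg=1.0, seed=0):
    """centres: (n, 2) rectangle centres in degrees of visual angle,
    with fixation at the origin (so no centre sits exactly at fixation)."""
    rng = np.random.default_rng(seed)
    radii = np.linalg.norm(centres, axis=1, keepdims=True)
    spokes = centres / radii                      # unit vectors from fixation
    signs = rng.choice([-1.0, 1.0], size=(len(centres), 1))
    return centres + signs * shift_deg * spokes   # move in or out by 1 deg
```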

    Visual System Development in People with One Eye: Behaviour and Structural Neural Correlates

    Postnatal monocular deprivation from the surgical removal (enucleation) of one eye in humans results in intact spatial form vision, although its consequences for the development of motion perception are less clear. Changes in brain structure following early monocular enucleation have been assessed either in species whose visual systems are quite different from the human visual system, or in enucleated monkeys and humans following short-term survival. In this dissertation, I sought to determine the long-term effects of enucleation on visual system development by examining behavioural visual abilities and visual system morphology in adults who had one eye enucleated early in life due to retinoblastoma. In Chapter II, I conducted a series of speed and luminance contrast discrimination tasks not yet implemented in this group. Early monocular enucleation results in impaired speed discrimination but intact contrast perception compared to binocular and monocular viewing controls. These findings suggest differential effects of enucleation on the development of spatial form vision and motion perception. In Chapters III and IV, I obtained high-resolution structural magnetic resonance images to assess the morphological development of subcortical (Chapter III) and cortical (Chapter IV) structures in the visual pathway. Early monocular enucleation resulted in decreased optic chiasm width and volume, optic tract diameters, and lateral geniculate nucleus (LGN) volumes compared with binocularly intact controls. Surprisingly, however, the decreases in optic tract diameter and LGN volume were less severe contralateral to the remaining eye. Early monocular enucleation also resulted in increased grey matter surface area of visual and non-visual cortices compared with binocularly intact controls. Consistent with the LGN asymmetry, the increased surface area of the primary visual cortex was restricted to the hemisphere contralateral to the remaining eye. Surprisingly, however, these increases were found for those with right- but not left-eye enucleation, suggesting different developmental time periods for each hemisphere. Possible mechanisms of altered development following early monocular enucleation include: (1) recruitment of deafferented cells by the remaining eye, (2) retention of deafferented cells due to feedback from visual cortex, and (3) a disruption of synaptic pruning. These data highlight the importance of receiving normal levels of binocular visual input during infancy for typical visual development.