
    Occlusion resistant learning of intuitive physics from videos

    To reach human performance on complex tasks, a key ability for artificial systems is to understand physical interactions between objects and to predict future outcomes of a situation. This ability, often referred to as intuitive physics, has recently received attention, and several methods have been proposed to learn these physical rules from video sequences. Yet most of these methods are restricted to the case where no, or only limited, occlusions occur. In this work we propose a probabilistic formulation of learning intuitive physics in 3D scenes with significant inter-object occlusions. In our formulation, object positions are modeled as latent variables enabling the reconstruction of the scene. We then propose a series of approximations that make this problem tractable. Object proposals are linked across frames using a combination of a recurrent interaction network, modeling the physics in object space, and a compositional renderer, modeling the way in which objects project onto pixel space. We demonstrate significant improvements over the state of the art on the IntPhys intuitive physics benchmark. We apply our method to a second dataset with increasing levels of occlusions, showing that it realistically predicts segmentation masks up to 30 frames into the future. Finally, we also show results on predicting the motion of objects in real videos.
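The two components this abstract describes (a dynamics model operating in object space, and a compositional renderer projecting objects into pixel space) can be caricatured in a few lines. The following is a minimal sketch only: a toy pairwise-force dynamics step stands in for the recurrent interaction network, and Gaussian blobs stand in for the renderer; all function names and constants are illustrative, not the paper's architecture.

```python
import numpy as np

def interaction_step(positions, velocities, dt=0.1):
    """Advance latent object states in object space.

    A crude stand-in for a learned interaction network: each object
    exerts a small pairwise force on every other object.
    """
    n = positions.shape[0]
    forces = np.zeros_like(positions)
    for i in range(n):
        for j in range(n):
            if i != j:
                diff = positions[j] - positions[i]
                dist = np.linalg.norm(diff) + 1e-6
                forces[i] += 0.01 * diff / dist**3  # toy pairwise interaction
    velocities = velocities + dt * forces
    positions = positions + dt * velocities
    return positions, velocities

def render(positions, size=32, sigma=1.5):
    """Compositional renderer: project each latent object into pixel
    space as a Gaussian blob and composite frames with a max over
    objects, so overlapping blobs occlude each other in intensity."""
    ys, xs = np.mgrid[0:size, 0:size]
    frame = np.zeros((size, size))
    for (x, y) in positions:
        blob = np.exp(-((xs - x)**2 + (ys - y)**2) / (2 * sigma**2))
        frame = np.maximum(frame, blob)
    return frame

pos = np.array([[8.0, 8.0], [20.0, 20.0]])
vel = np.array([[1.0, 0.0], [-1.0, 0.0]])
frames = []
for _ in range(5):
    pos, vel = interaction_step(pos, vel)
    frames.append(render(pos))
print(len(frames), frames[0].shape)
```

The key structural point the sketch preserves is the split the abstract emphasizes: physics is stepped in object space, and only then projected into pixel space, which is what allows occluded objects to keep well-defined latent positions.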

    Ongoing Emergence: A Core Concept in Epigenetic Robotics

    We propose ongoing emergence as a core concept in epigenetic robotics. Ongoing emergence refers to the continuous development and integration of new skills and is exhibited when six criteria are satisfied: (1) continuous skill acquisition, (2) incorporation of new skills with existing skills, (3) autonomous development of values and goals, (4) bootstrapping of initial skills, (5) stability of skills, and (6) reproducibility. In this paper we: (a) provide a conceptual synthesis of ongoing emergence based on previous theorizing, (b) review current research in epigenetic robotics in light of ongoing emergence, (c) provide prototypical examples of ongoing emergence from infant development, and (d) outline computational issues relevant to creating robots that exhibit ongoing emergence.

    Understanding the effects of one’s actions upon hidden objects and the development of search behaviour in 7-month-old infants

    Infants' understanding of how their actions affect the visibility of hidden objects may be a crucial aspect of the development of search behaviour. To investigate this possibility, 7-month-old infants took part in a two-day training study. At the start of the first session, and at the end of the second, all infants performed a search task with a hiding-well. On both days, infants had an additional training experience. The 'Agency' group learnt to spin a turntable to reveal a hidden toy, whilst the 'Means-End' group learnt the same means-end motor action, but the toy was always visible. The Agency group showed greater improvement on the hiding-well search task following their training experience. We suggest that the Agency group's turntable experience was effective because it provided the experience of bringing objects back into visibility through one's own actions. Further, the performance of the Agency group demonstrates generalized transfer of learning across situations with both different motor actions and stimuli in infants as young as 7 months.

    Modellierung der kognitiven Säuglingsentwicklung mittels neuronaler Netze (Modeling Cognitive Infant Development Using Neural Networks)

    This thesis investigates the development of early cognition in infancy using neural network models. Fundamental events in visual perception such as caused motion, occlusion, object permanence, tracking of moving objects behind occluders, object unity perception and sequence learning are modeled in a unifying computational framework while staying close to experimental data in developmental psychology of infancy. In the first project, the development of causality and occlusion perception in infancy is modeled using a simple, three-layered, recurrent network trained with error backpropagation to predict future inputs (Elman network). The model unifies two infant studies on causality and occlusion perception. Subsequently, in the second project, the established framework is extended to a larger prediction network that models the development of object unity, object permanence and occlusion perception in infancy. It is shown that these different phenomena can be unified into a single theoretical framework thereby explaining experimental data from 14 infant studies. The framework shows that these developmental phenomena can be explained by accurately representing and predicting statistical regularities in the visual environment. The models assume (1) different neuronal populations processing different motion directions of visual stimuli in the visual cortex of the newborn infant which are supported by neuroscientific evidence and (2) available learning algorithms that are guided by the goal of predicting future events. Specifically, the models demonstrate that no innate force notions, motion analysis modules, common motion detectors, specific perceptual rules or abilities to "reason" about entities which have been widely postulated in the developmental literature are necessary for the explanation of the discussed phenomena. 
Since the prediction of future events turned out to be fruitful for the theoretical explanation of various developmental phenomena and a guideline for learning in infancy, the third model addresses the development of visual expectations themselves. A self-organising, fully recurrent neural network model is proposed that forms internal representations of input sequences and maps them onto eye movements. The reinforcement learning architecture (RLA) of the model learns to perform anticipatory eye movements as observed in a range of infant studies. The model suggests that the goal of maximizing the looking time at interesting stimuli guides infants' looking behavior, thereby explaining the occurrence and development of anticipatory eye movements and reaction times. In contrast to classical neural network modelling approaches in the developmental literature, the model uses local learning rules and contains several biologically plausible elements such as excitatory and inhibitory spiking neurons, spike-timing dependent plasticity (STDP), intrinsic plasticity (IP) and synaptic scaling. It is also novel from a technical point of view, as it uses a dynamic recurrent reservoir shaped by various plasticity mechanisms and combines it with reinforcement learning. The model accounts for twelve experimental studies and predicts, among other things, anticipatory behavior for arbitrary sequences and facilitated reacquisition of already learned sequences. All models emphasize the development of the perception of the discussed phenomena, thereby addressing the questions of how and why this developmental change takes place - questions that are difficult to assess experimentally. Despite the diversity of the discussed phenomena, all three projects rely on the same principle: the prediction of future events.
This principle suggests that cognitive development in infancy may largely be guided by building internal models and representations of the visual environment and using those models to predict its future development.
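The prediction-driven Elman setup in the first project can be sketched in miniature. This is a hedged toy, not the thesis model: a fixed random recurrent layer with a delta-rule readout stands in for full backpropagation training, and a repeating one-hot sequence stands in for visual stimuli; all sizes and rates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Elman-style network: input -> hidden (with a context copy of the
# previous hidden state) -> output, trained to predict the next input.
n_in, n_hid = 4, 8
W_ih = rng.normal(0, 0.5, (n_hid, n_in))
W_hh = rng.normal(0, 0.5, (n_hid, n_hid))   # context (recurrent) weights
W_ho = rng.normal(0, 0.5, (n_in, n_hid))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A repeating sequence of one-hot "stimulus positions", standing in for
# an object moving left to right and back again.
seq = [np.eye(n_in)[i] for i in [0, 1, 2, 3, 2, 1]] * 200

h = np.zeros(n_hid)
lr = 0.5
errors = []
for t in range(len(seq) - 1):
    x, target = seq[t], seq[t + 1]
    h = sigmoid(W_ih @ x + W_hh @ h)       # hidden state carries context
    y = sigmoid(W_ho @ h)                  # prediction of the next input
    err = target - y
    errors.append(np.mean(err**2))
    # Delta rule on the readout only, a simplification of the full
    # error backpropagation used in the thesis models.
    W_ho += lr * np.outer(err * y * (1 - y), h)

print(f"early error {np.mean(errors[:50]):.3f}, late error {np.mean(errors[-50:]):.3f}")
```

Even this stripped-down version exhibits the thesis's core principle: prediction error on future inputs falls as the network internalizes the regularities of the sequence, with no built-in notions of objects or motion.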

    Object Permanence Emerges in a Random Walk along Memory

    This paper proposes a self-supervised objective for learning representations that localize objects under occlusion - a property known as object permanence. A central question is the choice of learning signal in cases of total occlusion. Rather than directly supervising the locations of invisible objects, we propose a self-supervised objective that requires neither human annotation, nor assumptions about object dynamics. We show that object permanence can emerge by optimizing for temporal coherence of memory: we fit a Markov walk along a space-time graph of memories, where the states in each time step are non-Markovian features from a sequence encoder. This leads to a memory representation that stores occluded objects and predicts their motion, to better localize them. The resulting model outperforms existing approaches on several datasets of increasing complexity and realism, despite requiring minimal supervision and assumptions, and hence being broadly applicable
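The "Markov walk along a space-time graph" objective resembles a cycle-consistency loss over softmaxed feature similarities: walk forward through the frames and back, and ask that each node return to itself. A minimal NumPy sketch under that assumption; the feature matrices, temperature, and graph size are illustrative, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transition(f_a, f_b, temp=0.1):
    """Stochastic matrix for one random-walk step between two frames:
    entry (i, j) is the probability of walking from node i in frame a
    to node j in frame b, from softmaxed feature similarity."""
    sim = f_a @ f_b.T
    return softmax(sim / temp, axis=1)

def cycle_loss(features):
    """Walk forward through all frames, then back; temporal coherence
    asks that the round trip return each node to itself."""
    n = features[0].shape[0]
    walk = np.eye(n)
    for a, b in zip(features[:-1], features[1:]):
        walk = walk @ transition(a, b)
    rev = features[::-1]
    for a, b in zip(rev[:-1], rev[1:]):
        walk = walk @ transition(a, b)
    # Cross-entropy of the round-trip walk against the identity target.
    return -np.mean(np.log(np.diag(walk) + 1e-9))

rng = np.random.default_rng(0)
base = rng.normal(size=(5, 16))           # 5 nodes, 16-dim features
coherent = [base + 0.01 * rng.normal(size=base.shape) for _ in range(4)]
incoherent = [rng.normal(size=base.shape) for _ in range(4)]
print(cycle_loss(coherent) < cycle_loss(incoherent))
```

The point of the sketch is the learning signal: temporally coherent features make the round-trip walk nearly the identity and thus incur low loss, so minimizing it encourages a memory whose features stay matchable through time, including across gaps where an object is unobserved.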

    Occlusion-Robust MVO: Multimotion Estimation Through Occlusion Via Motion Closure

    Visual motion estimation is an integral and well-studied challenge in autonomous navigation. Recent work has focused on addressing multimotion estimation, which is especially challenging in highly dynamic environments. Such environments not only comprise multiple, complex motions but also tend to exhibit significant occlusion. Previous work in object tracking focuses on maintaining the integrity of object tracks but usually relies on specific appearance-based descriptors or constrained motion models. These approaches are very effective in specific applications but do not generalize to the full multimotion estimation problem. This paper presents a pipeline for estimating multiple motions, including the camera egomotion, in the presence of occlusions. This approach uses an expressive motion prior to estimate the SE(3) trajectory of every motion in the scene, even during temporary occlusions, and to identify the reappearance of motions through motion closure. The performance of this occlusion-robust multimotion visual odometry (MVO) pipeline is evaluated on real-world data and the Oxford Multimotion Dataset.
    Comment: To appear at the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). An earlier version of this work first appeared at the Long-term Human Motion Planning Workshop (ICRA 2019). 8 pages, 5 figures. Video available at https://www.youtube.com/watch?v=o_N71AA6FR
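The role of a motion prior during occlusion can be illustrated with the simplest such prior on SE(3): constant velocity, i.e. reapplying the last observed frame-to-frame transform until the motion reappears. This is only a crude stand-in for the paper's more expressive prior, and the trajectory below is synthetic.

```python
import numpy as np

def se3(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Observed poses of one tracked motion before it is occluded: the body
# turns by 0.1 rad and advances 1 m along its x-axis each frame.
step = se3(rot_z(0.1), np.array([1.0, 0.0, 0.0]))
poses = [np.eye(4)]
for _ in range(4):
    poses.append(poses[-1] @ step)

# Constant-velocity extrapolation: take the last observed frame-to-frame
# transform and keep applying it while the motion is occluded.
rel = np.linalg.inv(poses[-2]) @ poses[-1]
predicted = poses[-1]
for _ in range(3):                # 3 occluded frames
    predicted = predicted @ rel
# On reappearance, motion closure would match new observations against
# this extrapolated trajectory to re-identify the motion.
print(np.round(predicted[:3, 3], 2))
```

Because the extrapolation is carried out on full SE(3) poses rather than image-space tracks, the predicted trajectory remains geometrically meaningful through the occlusion, which is what makes matching on reappearance (motion closure) well posed.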