20,722 research outputs found

    Introduction: The Fourth International Workshop on Epigenetic Robotics

    Get PDF
    As in the previous editions, this workshop is trying to be a forum for multi-disciplinary research ranging from developmental psychology to neural sciences (in its widest sense) and robotics including computational studies. This is a two-fold aim of, on the one hand, understanding the brain through engineering embodied systems and, on the other hand, building artificial epigenetic systems. Epigenetic contains in its meaning the idea that we are interested in studying development through interaction with the environment. This idea entails the embodiment of the system, the situatedness in the environment, and of course a prolonged period of postnatal development when this interaction can actually take place. This is still a relatively new endeavor although the seeds of the developmental robotics community were already in the air since the nineties (Berthouze and Kuniyoshi, 1998; Metta et al., 1999; Brooks et al., 1999; Breazeal, 2000; Kozima and Zlatev, 2000). A few had the intuition – see Lungarella et al. (2003) for a comprehensive review – that, intelligence could not be possibly engineered simply by copying systems that are “ready made” but rather that the development of the system fills a major role. This integration of disciplines raises the important issue of learning on the multiple scales of developmental time, that is, how to build systems that eventually can learn in any environment rather than program them for a specific environment. On the other hand, the hope is that robotics might become a new tool for brain science similarly to what simulation and modeling have become for the study of the motor system. Our community is still pretty much evolving and “under construction” and for this reason, we tried to encourage submissions from the psychology community. Additionally, we invited four neuroscientists and no roboticists for the keynote lectures. We received a record number of submissions (more than 50), and given the overall size and duration of the workshop together with our desire to maintain a single-track format, we had to be more selective than ever in the review process (a 20% acceptance rate on full papers). This is, if not an index of quality, at least an index of the interest that gravitates around this still new discipline

    Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding

    Get PDF
    Gait recognition and understanding systems have shown a wide-ranging application prospect. However, their use of unstructured data from image and video has affected their performance, e.g., they are easily influenced by multi-views, occlusion, clothes, and object carrying conditions. This paper addresses these problems using a realistic 3-dimensional (3D) human structural data and sequential pattern learning framework with top-down attention modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate 2-dimensional (2D) to 3D human body pose and shape semantic parameters estimation method is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded using a sparse 2D matrix to construct the structural gait semantic image. In order to achieve time-based gait recognition, an HTM Network is constructed to obtain the sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to deal with various conditions including multi-views by refining the SL-GSDRs, according to prior knowledge. The proposed gait learning model not only aids gait recognition tasks to overcome the difficulties in real application scenarios but also provides the structured gait semantic images for visual cognition. Experimental analyses on CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness

    The Ionizing Stars of the Galactic Ultra-Compact HII Region G45.45+0.06

    Full text link
    Using the NIFS near-infrared integral-field spectrograph behind the facility adaptive optics module, ALTAIR, on Gemini North, we have identified several massive O-type stars that are responsible for the ionization of the Galactic Ultra-Compact HII region G45.45+0.06. The sources ``m'' and ``n'' from the imaging study of Feldt et a. 1998 are classified as hot, massive O-type stars based on their K-band spectra. Other bright point sources show red and/or nebular spectra and one appears to have cool star features that we suggest are due to a young, low-mass pre-main sequence component. Still two other embedded sources (``k'' and ``o'' from Feldt et al.) exhibit CO bandhead emission that may arise in circumstellar disks which are possibly still accreting. Finally, nebular lines previously identified only in higher excitation planetary nebulae and associated with KrIII and SeIV ions are detected in G45.45+0.06.Comment: Latex, 28 pages, 10 figure

    Computer vision methods for unconstrained gesture recognition in the context of sign language annotation

    Get PDF
    Cette thèse porte sur l'étude des méthodes de vision par ordinateur pour la reconnaissance de gestes naturels dans le contexte de l'annotation de la Langue des Signes. La langue des signes (LS) est une langue gestuelle développée par les sourds pour communiquer. Un énoncé en LS consiste en une séquence de signes réalisés par les mains, accompagnés d'expressions du visage et de mouvements du haut du corps, permettant de transmettre des informations en parallèles dans le discours. Même si les signes sont définis dans des dictionnaires, on trouve une très grande variabilité liée au contexte lors de leur réalisation. De plus, les signes sont souvent séparés par des mouvements de co-articulation. Cette extrême variabilité et l'effet de co-articulation représentent un problème important dans les recherches en traitement automatique de la LS. Il est donc nécessaire d'avoir de nombreuses vidéos annotées en LS, si l'on veut étudier cette langue et utiliser des méthodes d'apprentissage automatique. Les annotations de vidéo en LS sont réalisées manuellement par des linguistes ou experts en LS, ce qui est source d'erreur, non reproductible et extrêmement chronophage. De plus, la qualité des annotations dépend des connaissances en LS de l'annotateur. L'association de l'expertise de l'annotateur aux traitements automatiques facilite cette tâche et représente un gain de temps et de robustesse. Le but de nos recherches est d'étudier des méthodes de traitement d'images afin d'assister l'annotation des corpus vidéo: suivi des composantes corporelles, segmentation des mains, segmentation temporelle, reconnaissance de gloses. Au cours de cette thèse nous avons étudié un ensemble de méthodes permettant de réaliser l'annotation en glose. Dans un premier temps, nous cherchons à détecter les limites de début et fin de signe. Cette méthode d'annotation nécessite plusieurs traitements de bas niveau afin de segmenter les signes et d'extraire les caractéristiques de mouvement et de forme de la main. D'abord nous proposons une méthode de suivi des composantes corporelles robuste aux occultations basée sur le filtrage particulaire. Ensuite, un algorithme de segmentation des mains est développé afin d'extraire la région des mains même quand elles se trouvent devant le visage. Puis, les caractéristiques de mouvement sont utilisées pour réaliser une première segmentation temporelle des signes qui est par la suite améliorée grâce à l'utilisation de caractéristiques de forme. En effet celles-ci permettent de supprimer les limites de segmentation détectées en milieu des signes. Une fois les signes segmentés, on procède à l'extraction de caractéristiques visuelles pour leur reconnaissance en termes de gloses à l'aide de modèles phonologiques. Nous avons évalué nos algorithmes à l'aide de corpus internationaux, afin de montrer leur avantages et limitations. L'évaluation montre la robustesse de nos méthodes par rapport à la dynamique et le grand nombre d'occultations entre les différents membres. L'annotation résultante est indépendante de l'annotateur et représente un gain de robustese important.This PhD thesis concerns the study of computer vision methods for the automatic recognition of unconstrained gestures in the context of sign language annotation. Sign Language (SL) is a visual-gestural language developed by deaf communities. Continuous SL consists on a sequence of signs performed one after another involving manual and non-manual features conveying simultaneous information. Even though standard signs are defined in dictionaries, we find a huge variability caused by the context-dependency of signs. In addition signs are often linked by movement epenthesis which consists on the meaningless gesture between signs. The huge variability and the co-articulation effect represent a challenging problem during automatic SL processing. It is necessary to have numerous annotated video corpus in order to train statistical machine translators and study this language. Generally the annotation of SL video corpus is manually performed by linguists or computer scientists experienced in SL. However manual annotation is error-prone, unreproducible and time consuming. In addition de quality of the results depends on the SL annotators knowledge. Associating annotator knowledge to image processing techniques facilitates the annotation task increasing robustness and speeding up the required time. The goal of this research concerns on the study and development of image processing technique in order to assist the annotation of SL video corpus: body tracking, hand segmentation, temporal segmentation, gloss recognition. Along this PhD thesis we address the problem of gloss annotation of SL video corpus. First of all we intend to detect the limits corresponding to the beginning and end of a sign. This annotation method requires several low level approaches for performing temporal segmentation and for extracting motion and hand shape features. First we propose a particle filter based approach for robustly tracking hand and face robust to occlusions. Then a segmentation method for extracting hand when it is in front of the face has been developed. Motion is used for segmenting signs and later hand shape is used to improve the results. Indeed hand shape allows to delete limits detected in the middle of a sign. Once signs have been segmented we proceed to the gloss recognition using lexical description of signs. We have evaluated our algorithms using international corpus, in order to show their advantages and limitations. The evaluation has shown the robustness of the proposed methods with respect to high dynamics and numerous occlusions between body parts. Resulting annotation is independent on the annotator and represents a gain on annotation consistency
    corecore