2,823 research outputs found

    SDR-GAIN: A High Real-Time Occluded Pedestrian Pose Completion Method for Autonomous Driving

    To mitigate the challenges arising from partial occlusion in human pose keypoint-based pedestrian detection methods, we present a novel pedestrian pose keypoint completion method called the separation and dimensionality reduction-based generative adversarial imputation networks (SDR-GAIN). First, we use OpenPose to estimate pedestrian poses in images. Then, we isolate the head and torso keypoints of pedestrians with incomplete keypoints due to occlusion or other factors and perform dimensionality reduction to enhance features and further unify feature distribution. Finally, we introduce two generative models based on the generative adversarial networks (GAN) framework, which incorporate Huber loss, residual structure, and L1 regularization to generate the missing parts of the incomplete head and torso pose keypoints of partially occluded pedestrians, resulting in pose completion. Our experiments on the MS COCO and JAAD datasets demonstrate that SDR-GAIN outperforms the basic GAIN framework, the interpolation methods PCHIP and MAkima, and the machine learning methods k-NN and MissForest on the pose completion task. In addition, the runtime of SDR-GAIN is approximately 0.4 ms, displaying high real-time performance and significant application value in the field of autonomous driving.
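The abstract names Huber loss as one ingredient of the generators without giving details. As a rough illustration only, the standard Huber loss can be sketched in plain Python as below. The function name `huber_loss` and the threshold `delta` are assumptions made for this example, not code or values from the paper.

```python
def huber_loss(predicted, target, delta=1.0):
    """Mean Huber loss over two equal-length sequences of numbers.

    The penalty is quadratic for residuals up to `delta` and linear beyond it,
    which makes it less sensitive to outliers than squared error.
    `delta=1.0` is an assumed default, not a value taken from the paper.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if len(predicted) == 0:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for p, t in zip(predicted, target):
        residual = abs(p - t)
        if residual <= delta:
            # Small residual: behaves like half the squared error.
            total += 0.5 * residual * residual
        else:
            # Large residual: grows linearly, continuous at residual == delta.
            total += delta * (residual - 0.5 * delta)
    return total / len(predicted)
```

With `delta=1.0`, a residual of 0.5 falls in the quadratic branch and a residual of 3.0 falls in the linear branch, so a single badly placed keypoint contributes far less than it would under squared error.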

    "'Who are you?' - Learning person specific classifiers from video"

    We investigate the problem of automatically labelling faces of characters in TV or movie material with their names, using only weak supervision from automatically aligned subtitle and script text. Our previous work (Everingham et al. [8]) demonstrated promising results on the task, but the coverage of the method (proportion of video labelled) and its generalization were limited by a restriction to frontal faces and nearest neighbour classification. In this paper we build on that method, extending the coverage greatly by the detection and recognition of characters in profile views. In addition, we make the following contributions: (i) seamless tracking, integration and recognition of profile and frontal detections, and (ii) a character-specific multiple kernel classifier which is able to learn the features best able to discriminate between the characters. We report results on seven episodes of the TV series “Buffy the Vampire Slayer”, demonstrating significantly increased coverage and performance with respect to previous methods on this material.

    A Longitudinal Analysis on the Feasibility of Iris Recognition Performance for Infants 0-2 Years Old

    The focus of this study was to longitudinally evaluate iris recognition for infants between 0 and 2 years old. Image quality metrics of infant and adult irises acquired on the same iris camera were compared. Matching performance was evaluated for four groups: infants 0 to 6 months, 7 to 12 months, and 13 to 24 months, and adults. A mixed linear regression model was used to determine whether infants’ genuine similarity scores changed over time. This study found that image quality metrics differed between infants and adults, but in the oldest infant group (13 to 24 months old) the image quality metric scores were more likely to be similar to those of adults. Infants 0 to 6 months old had worse performance at an FMR of 0.01% than infants 7 to 12 months, infants 13 to 24 months, and adults.

    The Missing Link between Morphemic Assemblies and Behavioral Responses: a Bayesian Information-Theoretical model of lexical processing

    We present the Bayesian Information-Theoretical (BIT) model of lexical processing: a mathematical model illustrating a novel approach to the modelling of language processes. The model shows how a neurophysiological theory of lexical processing relying on Hebbian association and neural assemblies can directly account for a variety of effects previously observed in behavioural experiments. We develop two information-theoretical measures of the distribution of usages of a morpheme or word, and use them to predict responses in three visual lexical decision datasets investigating inflectional morphology and polysemy. Our model offers a neurophysiological basis for the effects of morpho-semantic neighbourhoods. These results demonstrate how distributed patterns of activation naturally give rise to symbolic structures. We conclude by arguing that the modelling framework exemplified here is a powerful tool for integrating behavioural and neurophysiological results.

    Seeing with ears: how we create an auditory representation of space with echoes and its relation with other senses

    Spatial perception is the capability that allows us to learn about the environment. All our senses are involved in creating a representation of the external world. When we create a representation of space we rely primarily on visual information, but it is the integration with the other senses that gives us a more global and truthful representation of it. While the influence of vision and the integration of the different senses in spatial perception have been widely investigated, many questions remain about the role of the acoustic system in space perception and how it can be influenced by the other senses. Answering these questions in healthy people can help to clarify whether the same “rules” apply to, for example, people who lost vision in the early stages of development. Understanding how spatial perception works in people who are blind from birth is essential for developing rehabilitative methodologies or technologies that help them compensate for the lack of vision, since vision is the main source of spatial information. For this reason, one of the main scientific objectives of this thesis is to increase knowledge about auditory spatial perception in sighted and visually impaired people through the development of new tasks to assess spatial abilities. Moreover, I focus on a recent topic of investigation in humans: echolocation. Echolocation has great potential for improving the spatial and navigation skills of people with visual disabilities. Several studies demonstrate how the use of this technique can be beneficial in the absence of vision, at both the perceptual and the social level. Given the importance of echolocation, we developed tasks to test the ability of novices, and we gave the participants echolocation training to see how long it takes to master the technique (in a simple task).
Instead of testing blind individuals, we decided to test novice sighted people to see whether the technique is specific to blindness and whether it is possible to create a representation of space using echolocation

    Eye movement studies with a vestibular prosthesis

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references. Vestibular loss, which can manifest as dizziness, imbalance, or spatial disorientation, is widespread and often caused by inner ear hair cell malfunction. To address these problems, we are developing a vestibular implant analogous to cochlear implants for the deaf. This vestibular prosthesis provides pulsatile electrical stimulation to the vestibular nerve. Prosthesis effectiveness is assessed using the vestibulo-ocular reflex (VOR), since the VOR helps stabilize gaze in healthy individuals by evoking eye movements that compensate for head movements. In this thesis, the prosthesis was used to probe the high frequency VOR in squirrel monkeys and guinea pigs. In two studies, modulated stimulation was applied acutely to characterize the VOR between 1.5 and 701 Hz. A third study characterized the VOR response to chronic stimulation with a constant rate of 250 Hz. The VOR has previously been characterized up to 50 Hz in monkeys and 2 Hz in guinea pigs by physically rotating subjects. This range was extended in these studies by using electrical stimulation from the prosthesis. Eye movement spectral peaks were used to characterize the VOR frequency response. The VOR was measurable up to 267 Hz in squirrel monkeys and 151 Hz in guinea pigs. The magnitude response was similar in both species: it increased gradually with frequency, peaked (at 140 Hz in squirrel monkeys and 50 Hz in guinea pigs), and then rolled off. The high frequency fall-off was consistent with the low-pass nature of the oculomotor plant. The phase responses had a linear lag with frequency, consistent with a fixed 4 ms delay of the VOR three-neuron arc.
Since the VOR responds at high frequencies, this raises the question of whether the prosthesis causes eye movements at the prosthesis pulse rate, since electrical stimulation elicits neural responses that are phase-locked with the stimulation. Such responses might cause visual blurring for a patient using the device. This thesis shows that such eye movements are measurable and have a substantial velocity magnitude of 8.1 deg/s initially, but within 30 minutes the magnitude reduces by 80% and probably does not yield perceptible visual blurring. By Michael A. Saginaw. Ph.D.
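The link between a fixed delay and a linear phase lag can be checked numerically. A pure delay of tau seconds shifts a sinusoid of frequency f by -360 * f * tau degrees, so the lag grows in proportion to frequency. The sketch below applies that standard relation to the 4 ms figure quoted in the abstract. The function names and sample frequencies are chosen for illustration and are not taken from the thesis.

```python
def delay_phase_deg(freq_hz, delay_s=0.004):
    """Phase shift in degrees that a pure time delay imposes on a sinusoid.

    A negative result means a lag. `delay_s=0.004` mirrors the 4 ms delay
    mentioned in the abstract; the function itself is only an illustration.
    """
    return -360.0 * freq_hz * delay_s


def delay_from_phase_slope(freq_a_hz, phase_a_deg, freq_b_hz, phase_b_deg):
    """Recover the delay in seconds from two points on a linear phase curve."""
    if freq_a_hz == freq_b_hz:
        raise ValueError("the two frequencies must differ")
    slope_deg_per_hz = (phase_b_deg - phase_a_deg) / (freq_b_hz - freq_a_hz)
    # For a pure delay, slope = -360 * delay, so delay = -slope / 360.
    return -slope_deg_per_hz / 360.0
```

For example, a 4 ms delay corresponds to a lag of about 72 degrees at 50 Hz and about 144 degrees at 100 Hz, and feeding those two points to `delay_from_phase_slope` returns roughly 0.004 s.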

    Learning Representations that Support Extrapolation

    Extrapolation -- the ability to make inferences that go beyond the scope of one's experiences -- is a hallmark of human intelligence. By contrast, the generalization exhibited by contemporary neural network algorithms is largely limited to interpolation between data points in their training corpora. In this paper, we consider the challenge of learning representations that support extrapolation. We introduce a novel visual analogy benchmark that allows the graded evaluation of extrapolation as a function of distance from the convex domain defined by the training data. We also introduce a simple technique, temporal context normalization, that encourages representations that emphasize the relations between objects. We find that this technique enables a significant improvement in the ability to extrapolate, considerably outperforming a number of competitive techniques. Comment: ICML 202

    Diagnostic trials: a field guide

    The Diagnostic Trials, conducted in Kenya, Malawi, Mali, Nigeria, and Tanzania, constitute a major part of the Africa Soil Information Service agronomic activities. This guide provides a standard tool that is part of a structured approach for diagnosing soil-health-related constraints to crop production. It is intended for use by national and international agricultural research systems, development partners and extension services to ensure standard procedures in data collection that will feed into an Africa-wide database of diagnostic trials, allowing an increase in data density over time and an improvement in the reliability of the assessment of soil constraints and inferences.