65,783 research outputs found

    Extraction of Eye and Mouth Features for Drowsiness Face Detection Using Neural Network

    Facial feature extraction is the process of locating facial components such as the eyes, nose, and mouth. It is an essential initialization step for processing techniques such as face tracking, facial expression recognition, and face shape recognition. Among all facial features, eye area detection is especially important because, once the eyes are detected and localized, the locations of all other facial features can be identified. This study describes automated algorithms for feature extraction of the eyes and mouth. The data take the form of video, which is converted into a sequence of images through a frame extraction process. From this sequence of images, features are extracted based on the morphology of the eyes and mouth using a backpropagation neural network. Once feature extraction of the eyes and mouth is complete, the results will be used to detect a person's drowsiness, being useful for other research

    Predicting OCT biological marker localization from weak annotations.

    Recent developments in deep learning have shown success in accurately predicting the location of biological markers in Optical Coherence Tomography (OCT) volumes of patients with Age-Related Macular Degeneration (AMD) and Diabetic Retinopathy (DR). We propose a method that automatically locates biological markers to the Early Treatment Diabetic Retinopathy Study (ETDRS) rings, only requiring B-scan-level presence annotations. We trained a neural network using 22,723 OCT B-Scans of 460 eyes (433 patients) with AMD and DR, annotated with slice-level labels for Intraretinal Fluid (IRF) and Subretinal Fluid (SRF). The neural network outputs were mapped into the corresponding ETDRS rings. We incorporated the class annotations and domain knowledge into a loss function to constrain the output with biologically plausible solutions. The method was tested on a set of OCT volumes with 322 eyes (189 patients) with Diabetic Macular Edema, with slice-level SRF and IRF presence annotations for the ETDRS rings. Our method accurately predicted the presence of IRF and SRF in each ETDRS ring, outperforming previous baselines even in the most challenging scenarios. Our model was also successfully applied to en-face marker segmentation and showed consistency within C-scans, despite not incorporating volume information in the training process. We achieved a correlation coefficient of 0.946 for the prediction of the IRF area
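    The abstract above does not say how network outputs are mapped into the ETDRS rings. As a rough illustration only, the sketch below assumes the standard ETDRS grid (concentric circles of 1, 3 and 6 mm diameter centered on the fovea) and marker locations already expressed in millimetres relative to the fovea. The function names `etdrs_ring` and `rings_with_marker` are hypothetical and are not taken from the paper.

```python
import math

# Radii of the standard ETDRS grid in millimetres (diameters of 1, 3 and 6 mm).
CENTER_RADIUS_MM = 0.5
INNER_RADIUS_MM = 1.5
OUTER_RADIUS_MM = 3.0


def etdrs_ring(x_mm, y_mm):
    """Return the ETDRS ring containing a point given relative to the fovea."""
    r = math.hypot(x_mm, y_mm)  # radial distance from the fovea
    if r <= CENTER_RADIUS_MM:
        return "center"
    if r <= INNER_RADIUS_MM:
        return "inner"
    if r <= OUTER_RADIUS_MM:
        return "outer"
    return "outside"


def rings_with_marker(points_mm):
    """Collect the rings that contain at least one predicted marker location."""
    return {etdrs_ring(x, y) for x, y in points_mm} - {"outside"}
```

    For example, `rings_with_marker([(0.2, 0.1), (2.0, 0.0)])` reports a marker in the center and outer rings. The sketch ignores the quadrant subdivision of the full ETDRS grid because the abstract refers only to rings.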

    3D Camouflaging Object using RGB-D Sensors

    This paper proposes a new optical camouflage system that uses RGB-D cameras to acquire a point cloud of the background scene and to track the observer's eyes. The system enables a user to conceal an object located behind a display that is surrounded by 3D objects. If the tracked point of the observer's eyes is treated as a light source, the system estimates the shape of the shadow that the display device casts on the objects in the background. The system uses the 3D position of the observer's eyes and the locations of the display corners to predict their shadow points, which have nearest neighbors in the constructed point cloud of the background scene. Comment: 6 pages, 12 figures, 2017 IEEE International Conference on SM
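    One way to read the shadow-point step is as a ray query: cast a ray from the eye position through a display corner and pick the background point that lies closest to that ray beyond the corner. The sketch below illustrates that reading under stated assumptions. It uses a brute-force search over a small list of 3D tuples, and the function name `nearest_point_to_ray` is hypothetical; the paper's actual procedure may differ.

```python
def nearest_point_to_ray(origin, through, cloud):
    """Return the cloud point closest to the ray origin -> through.

    Only points farther along the ray than `through` (the display corner)
    are considered, since a shadow can only fall behind the display.
    Returns None if no such point exists.
    """
    direction = [t - o for t, o in zip(through, origin)]
    norm_sq = sum(c * c for c in direction)
    best, best_dist_sq = None, float("inf")
    for point in cloud:
        offset = [p - o for p, o in zip(point, origin)]
        # Ray parameter of the projection; s == 1.0 at the display corner.
        s = sum(a * b for a, b in zip(offset, direction)) / norm_sq
        if s <= 1.0:
            continue  # point is in front of, or level with, the corner
        closest = [o + s * d for o, d in zip(origin, direction)]
        dist_sq = sum((p - c) ** 2 for p, c in zip(point, closest))
        if dist_sq < best_dist_sq:
            best, best_dist_sq = point, dist_sq
    return best
```

    A real point cloud would call for a spatial index rather than a linear scan; the loop is kept here only to make the geometry explicit.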


    In the near future, robots are expected to interact with humans. Communication takes many forms: not only words but also body language can serve as the medium, and facial expressions are one example. In human communication, facial expressions are used to show emotions such as happiness, sadness, anger, shock, disappointment, or relaxation. This final project focuses on building a robot that consists only of a head and can produce a variety of human-like facial expressions. The Face Humanoid Robot is divided into three subsystems: an image processing subsystem, a hardware subsystem, and a controller subsystem. In the image processing subsystem, a webcam acquires image data that is processed by a computer. This requires the Microsoft Visual C compiler with the functions of the Open Source Computer Vision Library (OpenCV) installed. The image processing subsystem recognizes human facial expressions: image processing reveals the pattern of an object, and a backpropagation neural network recognizes that pattern. The hardware subsystem is the humanoid robot face. The controller subsystem consists of a single ATMega128 microcontroller and a camera that can capture images at a distance of 50 to 120 cm. The robot runs as follows. Images are captured by the webcam. The computer processes the images to obtain the human facial expression. The result is sent to the controller subsystem via serial communication. The microcontroller then commands the hardware subsystem to make that facial expression. The result of this final project is that all of the subsystems can be integrated into a robot that responds to human expressions. The method used is simple but appears quite capable of recognizing human facial expressions. Keyword: OpenCV, Neural Network BackPropagation, Humanoid Robo

    Neural Representations for Sensory-Motor Control, II: Learning a Head-Centered Visuomotor Representation of 3-D Target Position

    A neural network model is described for how an invariant head-centered representation of 3-D target position can be autonomously learned by the brain in real time. Once learned, such a target representation may be used to control both eye and limb movements. The target representation is derived from the positions of both eyes in the head, and the locations which the target activates on the retinas of both eyes. A Vector Associative Map, or VAM, learns the many-to-one transformation from multiple combinations of eye-and-retinal position to invariant 3-D target position. Eye position is derived from outflow movement signals to the eye muscles. Two successive stages of opponent processing convert these corollary discharges into a head-centered representation that closely approximates the azimuth, elevation, and vergence of the eyes' gaze position with respect to a cyclopean origin located between the eyes. VAM learning combines this cyclopean representation of present gaze position with binocular retinal information about target position into an invariant representation of 3-D target position with respect to the head. VAM learning can use a teaching vector that is externally derived from the positions of the eyes when they foveate the target. A VAM can also autonomously discover and learn the invariant representation, without an explicit teacher, by generating internal error signals from environmental fluctuations in which these invariant properties are implicit. VAM error signals are computed by Difference Vectors, or DVs, that are zeroed by the VAM learning process. VAMs may be organized into VAM Cascades for learning and performing both sensory-to-spatial maps and spatial-to-motor maps. These multiple uses clarify why DV-type properties are computed by cells in the parietal, frontal, and motor cortices of many mammals. VAMs are modulated by gating signals that express different aspects of the will-to-act. These signals transform a single invariant representation into movements of different speed (GO signal) and size (GRO signal), and thereby enable VAM controllers to match a planned action sequence to variable environmental conditions. National Science Foundation (IRI-87-16960, IRI-90-24877); Office of Naval Research (N00014-92-J-1309
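    The idea that learning drives a difference vector toward zero can be illustrated with a much simpler stand-in than the Vector Associative Map itself. The sketch below is an assumption-laden simplification: it uses a plain linear map trained with a delta rule, where the difference vector is the teaching vector minus the current output. The function names, learning rate, and toy data are hypothetical and are not taken from the article.

```python
import random


def apply_map(weights, x):
    """Compute the output of the linear map for input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def difference_vector(weights, x, teaching):
    """Difference vector: teaching vector minus the current output."""
    return [t - y for t, y in zip(teaching, apply_map(weights, x))]


def train_linear_map(samples, n_in, n_out, rate=0.1, epochs=200, seed=0):
    """Adjust weights until the difference vectors shrink toward zero."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    for _ in range(epochs):
        for x, teaching in samples:
            dv = difference_vector(weights, x, teaching)
            # Delta rule: each weight moves in proportion to DV and input.
            for i in range(n_out):
                for j in range(n_in):
                    weights[i][j] += rate * dv[i] * x[j]
    return weights


# Toy many-to-one data: several (eye position, retinal position) pairs
# share one head-centered target, modeled here as their sum.
SAMPLES = [
    ([0.0, 1.0], [1.0]),
    ([1.0, 0.0], [1.0]),
    ([0.5, 0.5], [1.0]),
    ([1.0, 1.0], [2.0]),
    ([0.2, 0.3], [0.5]),
]
```

    After `train_linear_map(SAMPLES, 2, 1)`, the difference vector for an unseen combination such as `[0.3, 0.7]` with target `[1.0]` should be close to zero, which is the sense in which the learned representation is invariant across input combinations in this toy setting.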

    Neural Representations for Sensory-Motor Control, III: Learning a Body-Centered Representation of 3-D Target Position

    A neural model is described of how the brain may autonomously learn a body-centered representation of 3-D target position by combining information about retinal target position, eye position, and head position in real time. Such a body-centered spatial representation enables accurate movement commands to the limbs to be generated despite changes in the spatial relationships between the eyes, head, body, and limbs through time. The model learns a vector representation--otherwise known as a parcellated distributed representation--of target vergence with respect to the two eyes, and of the horizontal and vertical spherical angles of the target with respect to a cyclopean egocenter. Such a vergence-spherical representation has been reported in the caudal midbrain and medulla of the frog, as well as in psychophysical movement studies in humans. A head-centered vergence-spherical representation of foveated target position can be generated by two stages of opponent processing that combine corollary discharges of outflow movement signals to the two eyes. Sums and differences of opponent signals define angular and vergence coordinates, respectively. The head-centered representation interacts with a binocular visual representation of non-foveated target position to learn a visuomotor representation of both foveated and non-foveated target position that is capable of commanding yoked eye movements. This head-centered vector representation also interacts with representations of neck movement commands to learn a body-centered estimate of target position that is capable of commanding coordinated arm movements. Learning occurs during head movements made while gaze remains fixed on a foveated target. An initial estimate is stored and a VOR-mediated gating signal prevents the stored estimate from being reset during a gaze-maintaining head movement. As the head moves, new estimates are compared with the stored estimate to compute difference vectors which act as error signals that drive the learning process, as well as control the on-line merging of multimodal information. Air Force Office of Scientific Research (F49620-92-J-0499); National Science Foundation (IRI-87-16960, IRI-90-24877); Office of Naval Research (N00014-92-J-1309
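    The statement that sums and differences of the two eyes' signals give angular and vergence coordinates can be made concrete with simple binocular geometry. The sketch below is a geometric simplification, not the opponent-processing network described in the abstract. It assumes horizontal eye rotations in degrees (positive toward the right), symmetric fixation when computing distance, and a hypothetical interocular separation of 6.5 cm.

```python
import math


def cyclopean_coordinates(left_azimuth_deg, right_azimuth_deg):
    """Combine the horizontal rotations of the two eyes.

    The averaged sum gives the shared gaze direction as seen from a point
    between the eyes; the difference gives the vergence angle.
    """
    version = 0.5 * (left_azimuth_deg + right_azimuth_deg)
    vergence = left_azimuth_deg - right_azimuth_deg
    return version, vergence


def distance_from_vergence(vergence_deg, interocular_cm=6.5):
    """Fixation distance for a target straight ahead (symmetric fixation)."""
    half_angle = math.radians(vergence_deg) / 2.0
    return (interocular_cm / 2.0) / math.tan(half_angle)
```

    With the left eye rotated 5 degrees rightward and the right eye 5 degrees leftward, `cyclopean_coordinates(5.0, -5.0)` gives a straight-ahead gaze direction and a vergence of 10 degrees, which `distance_from_vergence` converts to a fixation distance of roughly 37 cm under the stated assumptions.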

    Neural Representations for Sensory-Motor Control I: Head-Centered 3-D Target Positions from Opponent Eye Commands

    This article describes how corollary discharges from outflow eye movement commands can be transformed by two stages of opponent neural processing into a head-centered representation of 3-D target position. This representation implicitly defines a cyclopean coordinate system whose variables approximate the binocular vergence and spherical horizontal and vertical angles with respect to the observer's head. Various psychophysical data concerning binocular distance perception and reaching behavior are clarified by this representation. The representation provides a foundation for learning head-centered and body-centered invariant representations of both foveated and non-foveated 3-D target positions. It also enables a solution to be developed for the classical motor equivalence problem, whereby many different joint configurations of a redundant manipulator can all be used to realize a desired trajectory in 3-D space. Air Force Office of Scientific Research (URI 90-0175); Defense Advanced Research Projects Agency (AFOSR-90-0083); National Science Foundation (IRI-87-16960, IRI-90-24877

    Encoding of Intention and Spatial Location in the Posterior Parietal Cortex

    The posterior parietal cortex is functionally situated between sensory cortex and motor cortex. The responses of cells in this area are difficult to classify as strictly sensory or motor, since many have both sensory- and movement-related activities, as well as activities related to higher cognitive functions such as attention and intention. In this review we will provide evidence that the posterior parietal cortex is an interface between sensory and motor structures and performs various functions important for sensory-motor integration. The review will focus on two specific sensory-motor tasks: the formation of motor plans and the abstract representation of space. Cells in the lateral intraparietal area, a subdivision of the parietal cortex, have activity related to eye movements the animal intends to make. This finding represents the lowest stage in the sensory-motor cortical pathway in which activity related to intention has been found and may represent the cortical stage in which sensory signals go "over the hump" to become intentions and plans to make movements. The second part of the review will discuss the representation of space in the posterior parietal cortex. Encoding spatial locations is an essential step in sensory-motor transformations. Since movements are made to locations in space, these locations should be coded invariant of eye and head position or the sensory modality signaling the target for a movement. Data will be reviewed demonstrating that there exists in the posterior parietal cortex an abstract representation of space that is constructed from the integration of visual, auditory, vestibular, eye position, and proprioceptive head position signals. This representation is in the form of a population code and the above signals are not combined in a haphazard fashion. Rather, they are brought together using a specific operation to form "planar gain fields" that are the common foundation of the population code for the neural construct of space
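    The abstract does not spell out the operation that produces planar gain fields. A commonly described form, assumed here, is a retinal receptive field whose amplitude is scaled by a linear (planar) function of eye position. The sketch below illustrates that assumption for a single model cell; the function name, the Gaussian receptive field, and all parameter values are hypothetical.

```python
import math


def gain_field_response(retinal_x, retinal_y, eye_x, eye_y,
                        rf_center=(0.0, 0.0), rf_width=10.0,
                        gain_slope=(0.02, 0.01), gain_offset=1.0):
    """Response of one model cell with a planar eye-position gain field.

    All positions are in degrees. The retinal term is a Gaussian tuning
    curve; the gain term is a plane over horizontal and vertical eye
    position, clipped at zero so the response never goes negative.
    """
    dx = retinal_x - rf_center[0]
    dy = retinal_y - rf_center[1]
    visual = math.exp(-(dx * dx + dy * dy) / (2.0 * rf_width ** 2))
    gain = gain_offset + gain_slope[0] * eye_x + gain_slope[1] * eye_y
    return max(0.0, gain) * visual
```

    In this form the location of the retinal receptive field stays fixed while its amplitude changes linearly with eye position, so a population of such cells with different receptive fields and gain planes can jointly carry information about target location relative to the head.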

    View-Invariant Object Category Learning, Recognition, and Search: How Spatial and Object Attention Are Coordinated Using Surface-Based Attentional Shrouds

    Air Force Office of Scientific Research (F49620-01-1-0397); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624