
    Credit assignment in multiple goal embodied visuomotor behavior

    The intrinsic complexity of the brain can lead one to set aside issues related to its relationships with the body, but the field of embodied cognition emphasizes that understanding brain function at the system level requires one to address the role of the brain-body interface. It has only recently been appreciated that this interface performs huge amounts of computation that does not have to be repeated by the brain, and thus affords the brain great simplifications in its representations. In effect, the brain’s abstract states can refer to coded representations of the world created by the body. But even if the brain can communicate with the world through abstractions, the severe speed limitations in its neural circuitry mean that vast amounts of indexing must be performed during development so that appropriate behavioral responses can be rapidly accessed. One way this could happen would be if the brain used a decomposition whereby behavioral primitives could be quickly accessed and combined. This realization motivates our study of independent sensorimotor task solvers, which we call modules, in directing behavior. The issue we focus on herein is how an embodied agent can learn to calibrate such individual visuomotor modules while pursuing multiple goals. The biologically plausible standard for module programming is that of reinforcement given during exploration of the environment. However, this formulation contains a substantial issue when sensorimotor modules are used in combination: the credit for their overall performance must be divided amongst them. We show that this problem can be solved and that diverse task combinations are beneficial in learning, not a complication, as usually assumed. Our simulations show that fast algorithms are available that allot credit correctly and are insensitive to measurement noise.
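    The module-combination problem above can be made concrete with a small sketch. This is a minimal illustration, not the paper's algorithm: it assumes the single global reward is split among the concurrently active modules in proportion to their current value estimates, after which each module updates independently by standard Q-learning.

```python
# Minimal sketch of credit assignment among concurrently active sensorimotor
# modules (an assumed scheme for illustration, not the paper's algorithm):
# the global reward is split in proportion to each module's value estimate.
import numpy as np

class Module:
    """One independent sensorimotor task solver with its own small state/action space."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95):
        self.Q = np.zeros((n_states, n_actions))
        self.alpha, self.gamma = alpha, gamma

    def update(self, s, a, reward_share, s_next):
        # Standard Q-learning step driven only by this module's share of the reward.
        td = reward_share + self.gamma * self.Q[s_next].max() - self.Q[s, a]
        self.Q[s, a] += self.alpha * td

def split_reward(global_reward, modules, states, actions):
    """Divide one scalar reward among modules by their (non-negative) value estimates."""
    values = np.array([max(m.Q[s, a], 0.0)
                       for m, s, a in zip(modules, states, actions)])
    if values.sum() == 0.0:
        return np.full(len(modules), global_reward / len(modules))
    return global_reward * values / values.sum()
```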

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

    No abstract available

    Object Transfer Point Estimation for Prompt Human to Robot Handovers

    Handing over objects is the foundation of many human-robot interaction and collaboration tasks. In the scenario where a human is handing over an object to a robot, the human chooses where the object needs to be transferred. The robot needs to accurately predict this point of transfer to reach out proactively, instead of waiting for the final position to be presented. We first conduct a human-to-robot handover motion study to analyze the effect of user height, arm length, position, orientation, and robot gaze on the object transfer point. Our study presents new observations on the effect of the robot’s gaze on the point of object transfer. Next, we present an efficient method for predicting the Object Transfer Point (OTP), which synthesizes (1) an offline OTP calculated based on human preferences observed in the human-robot motion study with (2) a dynamic OTP predicted based on the observed human motion. Our proposed OTP predictor is implemented on a humanoid nursing robot and experimentally validated in human-robot handover tasks. Compared to using only static or dynamic OTP estimators, it has better accuracy at the earlier phase of handover (up to 45% of the handover motion) and can render fluent handovers with a reach-to-grasp response time (about 3.1 seconds) close to that of a natural human receiver. In addition, the OTP prediction accuracy is maintained across the robot’s visible workspace by utilizing a user-adaptive reference frame.
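    The static/dynamic synthesis described above can be sketched as follows. This is a hypothetical illustration, not the paper's predictor: the function names, the constant-velocity extrapolation of the hand trajectory, and the linear blending schedule over the handover phase are all assumptions.

```python
# Hypothetical sketch of blending a preference-based (static) OTP with a
# motion-based (dynamic) OTP; names and models here are assumptions.
import numpy as np

def static_otp(shoulder_pos, arm_length, toward_robot):
    """Offline estimate: a point partway along the giver's reach direction."""
    return shoulder_pos + 0.7 * arm_length * toward_robot

def dynamic_otp(hand_positions, dt, horizon=0.5):
    """Online estimate: extrapolate the observed hand motion with a
    constant-velocity model over a short horizon (seconds)."""
    velocity = (hand_positions[-1] - hand_positions[-2]) / dt
    return hand_positions[-1] + velocity * horizon

def blended_otp(p_static, p_dynamic, phase):
    """Weight the dynamic estimate more as the handover progresses (phase in [0, 1])."""
    w = float(np.clip(phase, 0.0, 1.0))
    return (1.0 - w) * p_static + w * p_dynamic
```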

    Steered mixture-of-experts for light field images and video: representation and coding

    Research in light field (LF) processing has increased heavily over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays arriving at a certain region from any angle. The global model thus consists of a set of kernels that define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application to 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art at low-to-mid range bitrates with respect to the subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently provides functionality desired for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
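    To make the kernel-based representation concrete, the following is a minimal 2-D regression sketch in the spirit of SMoE: Gaussian kernels act as soft gates and each expert is an affine function of pixel position, so the value at any coordinate is a gate-weighted sum of expert outputs. The parameter shapes are placeholders; the actual framework extends this idea to 4-D LF images and 5-D LF video and adds a coding stage for the model parameters.

```python
# Minimal 2-D sketch in the spirit of SMoE: Gaussian kernels as soft gates,
# affine experts, and a gate-weighted sum as the continuous reconstruction.
import numpy as np

def smoe_reconstruct(coords, centers, covs, expert_w, expert_b):
    """coords: (N, 2) pixel positions; centers: (K, 2) and covs: (K, 2, 2)
    kernel parameters; expert_w: (K, 2) and expert_b: (K,) affine experts.
    Returns the (N,) reconstructed values."""
    n, k = coords.shape[0], centers.shape[0]
    resp = np.empty((n, k))
    for j in range(k):
        d = coords - centers[j]
        inv = np.linalg.inv(covs[j])
        resp[:, j] = np.exp(-0.5 * np.einsum('ni,ij,nj->n', d, inv, d))
    gates = resp / resp.sum(axis=1, keepdims=True)   # soft gating weights
    experts = coords @ expert_w.T + expert_b          # (N, K) affine expert outputs
    return (gates * experts).sum(axis=1)
```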

    Video-based iris feature extraction and matching using Deep Learning

    This research was initiated to enhance a video-based eye tracker’s ability to detect small eye movements [1]. Chaudhary and Pelz (2019) laid an excellent foundation with their motion tracking of iris features to detect small eye movements [1], in which they successfully used classical handcrafted feature extraction methods, such as the Scale Invariant Feature Transform (SIFT), to match features across iris image frames. They extracted features from the eye-tracking videos and then applied the approach of patent [2], which tracks the geometric median of the distribution. This approach excludes outliers, and the velocity is approximated by scaling by the sampling rate. To detect microsaccades (small, rapid eye movements that occur in only one eye at a time), a threshold on the estimated velocity was used in [1]. Our goal is to build a robust mathematical model of the 2D feature distribution used in patent [2]. We worked in two steps. First, we studied a number of recent deep learning approaches, alongside classical hand-crafted feature extractors such as SIFT, to extract features from eye-tracker videos collected at the Multidisciplinary Vision Research Lab (MVRL), and then determined the best matching process for the RIT-Eyes dataset [3]. The goal is to make the feature extraction as robust as possible. Second, we showed that deep learning methods detect more feature points in the iris images and that frame-by-frame matching of the extracted features is more accurate than with the classical approach.
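    As a concrete reference for the classical baseline mentioned above, the sketch below uses OpenCV's SIFT detector with a ratio-test matcher to match iris features between two frames and collect the resulting keypoint displacements. The function name and parameters are illustrative, and the deep-learning extractors compared in this work are not reproduced here.

```python
# Illustrative classical baseline: SIFT keypoints matched between two iris
# frames with Lowe's ratio test; displacements approximate the eye motion.
import cv2

def match_iris_frames(frame_a, frame_b, ratio=0.75):
    """Detect and match SIFT features between two grayscale iris crops."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(frame_a, None)
    kp_b, des_b = sift.detectAndCompute(frame_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des_a, des_b, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    # Per-match keypoint displacements between the two frames.
    shifts = [(kp_b[m.trainIdx].pt[0] - kp_a[m.queryIdx].pt[0],
               kp_b[m.trainIdx].pt[1] - kp_a[m.queryIdx].pt[1]) for m in good]
    return good, shifts
```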