21 research outputs found

    Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression

    Full text link
    We design a new approach that allows robot learning of new activities from unlabeled human example videos. Given videos of humans executing the same activity from a human's viewpoint (i.e., first-person videos), our objective is to make the robot learn the temporal structure of the activity as its future regression network, and learn to transfer this model to its own motor execution. We present a new deep learning model: we extend the state-of-the-art convolutional object detection network for the representation/estimation of human hands in training videos, and newly introduce the concept of using a fully convolutional network to regress (i.e., predict) the intermediate scene representation corresponding to the future frame (e.g., 1-2 seconds later). Combining these allows direct prediction of future locations of human hands and objects, which enables the robot to infer the motor control plan using our manipulation network. We experimentally confirm that our approach makes learning of robot activities from unlabeled human interaction videos possible, and demonstrate that our robot is able to execute the learned collaborative activities in real time directly based on its camera input.

    Forecasting Hands and Objects in Future Frames

    Full text link
    This paper presents an approach to forecast the future presence and location of human hands and objects. Given an image frame, the goal is to predict what objects will appear in the future frame (e.g., 5 seconds later) and where they will be located, even when they are not visible in the current frame. The key idea is that (1) an intermediate representation of a convolutional object recognition model abstracts scene information in its frame and that (2) we can predict (i.e., regress) such representations corresponding to the future frames based on that of the current frame. We design a new two-stream convolutional neural network (CNN) architecture for videos by extending the state-of-the-art convolutional object detection network, and present a new fully convolutional regression network for predicting future scene representations. Our experiments confirm that combining the regressed future representation with our detection network allows reliable estimation of future hands and objects in videos. We obtain much higher accuracy compared to the state-of-the-art future object presence forecast method on a public dataset.

    Identifying First-person Camera Wearers in Third-person Videos

    Full text link
    We consider scenarios in which we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks in environments in which multiple people are wearing body-worn cameras while a third-person static camera also captures the scene. To do this, we need to establish person-level correspondences across first- and third-person videos, which is challenging because the camera wearer is not visible in his/her own egocentric video, preventing the use of direct feature matching. In this paper, we propose a new semi-Siamese Convolutional Neural Network architecture to address this novel challenge. We formulate the problem as learning a joint embedding space for first- and third-person videos that considers both spatial- and motion-domain cues. A new triplet loss function is designed to minimize the distance between correct first- and third-person matches while maximizing the distance between incorrect ones. This end-to-end approach performs significantly better than several baselines, in part by learning the first- and third-person features optimized for matching jointly with the distance measure itself.
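
The triplet objective mentioned in this abstract can be illustrated with the standard hinge-style triplet margin loss. This is only a minimal sketch: the abstract does not give the form of its new loss, so the Euclidean distance, the `margin` parameter, and the function names below are all assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (an assumed stand-in, not the paper's loss).

    anchor:   embedding of a first-person video (hypothetical)
    positive: embedding of the matching third-person person track
    negative: embedding of a non-matching third-person person track
    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    print(triplet_loss(anchor, [0.0, 0.0], [3.0, 4.0]))  # well-separated triplet
    print(triplet_loss(anchor, [3.0, 4.0], [0.0, 0.0]))  # violated triplet
```

Minimizing this quantity over many triplets pulls correct matches together and pushes incorrect ones apart, which is the behavior the abstract describes.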

    Solar Farm Suitability Using Geographic Information System Fuzzy Sets and Analytic Hierarchy Processes: Case Study of Ulleung Island, Korea

    No full text
    Solar farm suitability in remote areas will involve a multi-criteria evaluation (MCE) process, particularly well suited for the geographic information system (GIS) environment. Photovoltaic (PV) solar farm criteria were evaluated for an island-based case region having complex topographic and regulatory criteria, along with high demand for low-carbon local electricity production: Ulleung Island, Korea. Constraint variables that identified areas forbidden to PV farm development were consolidated into a single binary constraint layer (e.g., environmental regulation, ecological protection, future land use). Six factor variables were selected as influential on site suitability within the geospatial database to seek out increased annual average power performance and reduced potential investment costs, forming new criteria layers for site suitability: solar irradiation, sunshine hours, average temperature in summer, proximity to transmission line, proximity to roads, and slope. Each factor variable was normalized via a fuzzy membership function (FMF) and parameter setting based on the local characteristics and criteria for a fixed-axis PV system. Representative weighting of the relative importance for each factor variable was assigned via pairwise comparison completed by experts. A suitability index (SI) with six factor variables was derived using a weighted fuzzy summation method. Sensitivity analysis was conducted to assess four different SIs based on the development scenarios (i.e., the combination of factors being considered). From the resulting map, three highly suitable regions were suggested and validated by comparison with satellite images to confirm the candidate sites for solar farm development. The proposed GIS-MCE method can also be applied widely to other PV solar farm site selection projects with appropriate adaptation for local variables.
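
The pipeline in this abstract (fuzzy normalization, pairwise-comparison weights, weighted summation, binary constraint mask) can be sketched for a single map cell as follows. Everything specific here is an assumption for illustration: the linear membership shape, the row geometric-mean approximation of AHP weights (the classical method uses the principal eigenvector), and all function names, factor names, and numbers. None of them come from the study.

```python
import math


def linear_membership(value, low, high, increasing=True):
    """Map a raw factor value to [0, 1] with a linear fuzzy membership function."""
    if high == low:
        raise ValueError("low and high must differ")
    t = (value - low) / (high - low)
    t = min(1.0, max(0.0, t))  # clamp outside the [low, high] range
    return t if increasing else 1.0 - t


def ahp_weights(pairwise):
    """Approximate AHP priorities via normalized row geometric means.

    pairwise[i][j] states how much more important factor i is than factor j.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def suitability_index(memberships, weights, allowed=True):
    """Weighted fuzzy summation for one cell, masked by a binary constraint."""
    if not allowed:
        return 0.0  # cell excluded by the constraint layer
    total = sum(weights)
    return sum(w * m for w, m in zip(weights, memberships)) / total


if __name__ == "__main__":
    # Two hypothetical factors: irradiation (higher is better), slope (lower is better).
    weights = ahp_weights([[1.0, 3.0], [1.0 / 3.0, 1.0]])
    memberships = [
        linear_membership(1400.0, 1000.0, 1500.0),
        linear_membership(5.0, 0.0, 20.0, increasing=False),
    ]
    print(weights, memberships, suitability_index(memberships, weights))
```

In a GIS setting the same arithmetic would be applied per raster cell, with one membership layer per factor and the constraint layer supplying `allowed`.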

    Co-registration of speech production datasets from electromagnetic articulography and real-time magnetic resonance imaging

    No full text
    This paper describes a spatio-temporal registration approach for speech articulation data obtained from electromagnetic articulography (EMA) and real-time magnetic resonance imaging (rtMRI). This is motivated by the potential for combining the complementary advantages of both types of data. The registration method is validated on EMA and rtMRI datasets obtained at different times, but using the same stimuli. The aligned corpus offers the advantages of high temporal resolution (from EMA) and a complete mid-sagittal view (from rtMRI). The co-registration also yields optimum placement of EMA sensors as articulatory landmarks on the magnetic resonance images, thus providing richer spatio-temporal information about articulatory dynamics. © 2014 Acoustical Society of America.

    A Systematic Review of Systematic Reviews on the Epidemiology, Evaluation, and Treatment of Plantar Fasciitis

    No full text
    The number of systematic reviews and meta-analyses on plantar fasciitis is expanding. The purpose of this review was to provide a comprehensive summary of reviews on plantar fasciitis, identify any conflicting and inconsistent results, and propose future research directions. A qualitative review of all systematic reviews and meta-analyses related to plantar fasciitis up to February 2021 was performed using PubMed, Embase, Web of Science, and the Cochrane Database. A total of 1052 articles were initially identified and 96 met the inclusion criteria. Included articles were summarized and divided into the following topics: epidemiology, diagnosis, and treatment. While the majority of reviews had a high level of heterogeneity and included a small number of studies, there was general consensus on certain topics, such as BMI as a risk factor for plantar fasciitis and extracorporeal shockwave therapy as an effective mode of therapy. A qualitative summary of systematic reviews and meta-analyses published on plantar fasciitis provides a single source of updated information for clinicians. Evidence on topics such as the epidemiology, exercise therapy, or cost-effectiveness of treatment options for plantar fasciitis is lacking and warrants future research.

    Spatial Distribution of Lead Iodide and Local Passivation on Organo-Lead Halide Perovskite

    No full text
    We identify the nanoscale spatial distribution of PbI2 on the (FAPbI3)0.85(MAPbBr3)0.15 perovskite thin film and investigate the local passivation effect using confocal-based optical microscopy of steady-state and time-resolved photoluminescence (PL). Different from a typical scanning electron microscope (SEM) morphology study, confocal-based PL spectroscopy and microscopy allow researchers to map the morphologies of both perovskite and PbI2 grains simultaneously, by selectively detecting their characteristic fluorescent bands using band-pass filters. In this work, we compare the perovskite samples without and with excess PbI2 incorporation and unambiguously reveal the PbI2 distribution for the PbI2-rich sample. In addition, using the nanoscale time-resolved PL technique we show that the PbI2-rich regions exhibit longer lifetimes due to suppressed defect trapping, compared to the PbI2-poor regions. The measurement on the PbI2-rich sample indicates that the passivation effect of PbI2 in perovskite film is effective, especially in localized regions. Hence, this finding is important for further improvement of the solar cells by considering the strategy of excess PbI2 incorporation.