Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression
We design a new approach that allows robot learning of new activities from
unlabeled human example videos. Given videos of humans executing the same
activity from a human's viewpoint (i.e., first-person videos), our objective is
to make the robot learn the temporal structure of the activity as its future
regression network, and learn to transfer such a model for its own motor
execution. We present a new deep learning model: We extend the state-of-the-art
convolutional object detection network for the representation/estimation of
human hands in training videos, and newly introduce the concept of using a
fully convolutional network to regress (i.e., predict) the intermediate scene
representation corresponding to the future frame (e.g., 1-2 seconds later).
Combining these allows direct prediction of future locations of human hands and
objects, which enables the robot to infer the motor control plan using our
manipulation network. We experimentally confirm that our approach makes
learning of robot activities from unlabeled human interaction videos possible,
and demonstrate that our robot is able to execute the learned collaborative
activities in real time directly from its camera input.
Forecasting Hands and Objects in Future Frames
This paper presents an approach to forecast future presence and location of
human hands and objects. Given an image frame, the goal is to predict what
objects will appear in the future frame (e.g., 5 seconds later) and where they
will be located, even when they are not visible in the current frame. The
key idea is that (1) an intermediate representation of a convolutional object
recognition model abstracts scene information in its frame and that (2) we can
predict (i.e., regress) such representations corresponding to the future frames
based on that of the current frame. We design a new two-stream convolutional
neural network (CNN) architecture for videos by extending the state-of-the-art
convolutional object detection network, and present a new fully convolutional
regression network for predicting future scene representations. Our experiments
confirm that combining the regressed future representation with our detection
network allows reliable estimation of future hands and objects in videos. We
obtain much higher accuracy than the state-of-the-art future object
presence forecast method on a public dataset.
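The two-part idea in this abstract can be written compactly. This is a sketch only: the symbols g, h, R, and Delta are assumptions introduced here for illustration, and the squared-error objective is an assumed, common choice for regression that the abstract does not state.

```latex
% Assumed notation: I_t is the frame at time t, g the early layers of the
% detection network, h its remaining detection layers, R the regression network.
\begin{align}
  F_t &= g(I_t)
      && \text{intermediate scene representation of the current frame} \\
  \hat{F}_{t+\Delta} &= R(F_t)
      && \text{regressed representation of the future frame} \\
  R^{*} &= \arg\min_{R} \sum_{t} \bigl\lVert R\bigl(g(I_t)\bigr) - g(I_{t+\Delta}) \bigr\rVert_2^2
      && \text{assumed training objective} \\
  \hat{y}_{t+\Delta} &= h\bigl(\hat{F}_{t+\Delta}\bigr)
      && \text{forecast hand and object detections}
\end{align}
```

Under this reading, the regression target g(I_{t+Delta}) comes from a later frame of the same video, so fitting R needs no manual annotation.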
Identifying First-person Camera Wearers in Third-person Videos
We consider scenarios in which we wish to perform joint scene understanding,
object tracking, activity recognition, and other tasks in environments in which
multiple people are wearing body-worn cameras while a third-person static
camera also captures the scene. To do this, we need to establish person-level
correspondences across first- and third-person videos, which is challenging
because the camera wearer is not visible from his/her own egocentric video,
preventing the use of direct feature matching. In this paper, we propose a new
semi-Siamese Convolutional Neural Network architecture to address this novel
challenge. We formulate the problem as learning a joint embedding space for
first- and third-person videos that considers both spatial- and motion-domain
cues. A new triplet loss function is designed to minimize the distance between
correct first- and third-person matches while maximizing the distance between
incorrect ones. This end-to-end approach performs significantly better than
several baselines, in part by learning the first- and third-person features
optimized for matching jointly with the distance measure itself.
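For readers unfamiliar with triplet objectives, the following is a minimal sketch of the standard margin-based triplet loss on precomputed embeddings. It is not the new loss function designed in the paper, whose exact form the abstract does not give; the function name, the margin value, and the toy embeddings are assumptions for illustration.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss, averaged over a batch.

    Each argument is an array of shape (batch, dim). The margin of 0.2
    is an arbitrary illustrative value, not taken from the paper.
    """
    # Squared Euclidean distance from each anchor to its match and non-match.
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    # Penalize triplets where the non-match is not at least `margin` farther away.
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

# Hypothetical embeddings: anchors are first-person clips, positives are the
# matching third-person detections, negatives are other people in the scene.
anchor = np.array([[0.0, 0.0], [1.0, 1.0]])
positive = np.array([[0.1, 0.0], [1.0, 0.9]])
negative = np.array([[1.0, 1.0], [0.0, 0.0]])

loss_good = triplet_loss(anchor, positive, negative)  # matches are already close
loss_bad = triplet_loss(anchor, negative, positive)   # roles swapped on purpose
```

With well-separated embeddings the loss is zero, while swapping the roles of matches and non-matches yields a large penalty; minimizing the loss pulls correct pairs together and pushes incorrect pairs apart.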
Solar Farm Suitability Using Geographic Information System Fuzzy Sets and Analytic Hierarchy Processes: Case Study of Ulleung Island, Korea
Assessing solar farm suitability in remote areas involves a multi-criteria evaluation (MCE) process, which is particularly well suited to a geographic information system (GIS) environment. Photovoltaic (PV) solar farm criteria were evaluated for Ulleung Island, Korea, an island case region with complex topographic and regulatory criteria and high demand for low-carbon local electricity production. Constraint variables that identified areas forbidden to PV farm development (e.g., environmental regulation, ecological protection, future land use) were consolidated into a single binary constraint layer. Six factor variables that influence site suitability were selected within the geospatial database, with the aim of increasing annual average power performance and reducing potential investment costs, to form new criteria layers: solar irradiation, sunshine hours, average temperature in summer, proximity to transmission line, proximity to roads, and slope. Each factor variable was normalized via a fuzzy membership function (FMF) whose parameters were set based on local characteristics and the criteria for a fixed-axis PV system. Weights representing the relative importance of each factor variable were assigned via pairwise comparisons completed by experts. A suitability index (SI) with six factor variables was derived using a weighted fuzzy summation method. Sensitivity analysis was conducted to assess four different SIs based on development scenarios (i.e., the combination of factors being considered). From the resulting map, three highly suitable regions were suggested and validated by comparison with satellite images to confirm the candidate sites for solar farm development. The proposed GIS-MCE method is also widely applicable to other PV solar farm site selection projects, with appropriate adaptation for local variables.
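The workflow described above (pairwise-comparison weights, fuzzy normalization, weighted summation, binary constraint) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: it uses two factors instead of six, a linear membership function, the principal-eigenvector method for the weights, and made-up raster values, thresholds, and judgments.

```python
import numpy as np

def ahp_weights(pairwise):
    """Priority weights from a pairwise comparison matrix (principal eigenvector)."""
    vals, vecs = np.linalg.eig(pairwise)
    principal = np.real(vecs[:, np.argmax(np.real(vals))])
    return principal / principal.sum()  # normalizing also fixes the sign

def linear_fmf(x, low, high):
    """Increasing linear fuzzy membership: 0 at or below `low`, 1 at or above `high`."""
    return np.clip((x - low) / (high - low), 0.0, 1.0)

def suitability_index(memberships, weights, constraint):
    """Weighted sum of membership layers, zeroed where the constraint layer is 0."""
    stacked = np.stack(memberships, axis=0)  # shape (n_factors, rows, cols)
    return np.tensordot(weights, stacked, axes=1) * constraint

# Hypothetical expert judgment: irradiation is twice as important as slope.
pairwise = np.array([[1.0, 2.0],
                     [0.5, 1.0]])
weights = ahp_weights(pairwise)

# Hypothetical 2x2 rasters; values and thresholds are illustrative only.
irradiation = np.array([[3.0, 4.0], [5.0, 4.5]])   # higher is better
slope = np.array([[10.0, 20.0], [0.0, 40.0]])      # lower is better
constraint = np.array([[1, 1], [0, 1]])            # 0 marks a forbidden cell

m_irradiation = linear_fmf(irradiation, low=3.0, high=5.0)
m_slope = 1.0 - linear_fmf(slope, low=0.0, high=40.0)  # decreasing membership

si = suitability_index([m_irradiation, m_slope], weights, constraint)
```

In this toy grid the cell with the best irradiation and flattest slope still scores zero because the constraint layer excludes it, which mirrors the role of the binary constraint layer in the abstract.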
Co-registration of speech production datasets from electromagnetic articulography and real-time magnetic resonance imaging
This paper describes a spatio-temporal registration approach for speech articulation data obtained from electromagnetic articulography (EMA) and real-time magnetic resonance imaging (rtMRI). This is motivated by the potential for combining the complementary advantages of both types of data. The registration method is validated on EMA and rtMRI datasets obtained at different times, but using the same stimuli. The aligned corpus offers the advantages of high temporal resolution (from EMA) and a complete mid-sagittal view (from rtMRI). The co-registration also yields optimum placement of EMA sensors as articulatory landmarks on the magnetic resonance images, thus providing richer spatio-temporal information about articulatory dynamics. (C) 2014 Acoustical Society of America.
A Systematic Review of Systematic Reviews on the Epidemiology, Evaluation, and Treatment of Plantar Fasciitis
The number of systematic reviews and meta-analyses on plantar fasciitis is expanding. The purpose of this review was to provide a comprehensive summary of reviews on plantar fasciitis, identify any conflicting and inconsistent results, and propose future research directions. A qualitative review of all systematic reviews and meta-analyses related to plantar fasciitis up to February 2021 was performed using PubMed, Embase, Web of Science, and the Cochrane Database. A total of 1052 articles were initially identified and 96 met the inclusion criteria. Included articles were summarized and divided into the following topics: epidemiology, diagnosis, and treatment. While the majority of reviews had a high level of heterogeneity and included a small number of studies, there was general consensus on certain topics, such as BMI as a risk factor for plantar fasciitis and extracorporeal shockwave therapy as an effective mode of therapy. A qualitative summary of systematic reviews and meta-analyses published on plantar fasciitis provides a single source of updated information for clinicians. Evidence on topics such as epidemiology, exercise therapy, and the cost-effectiveness of treatment options for plantar fasciitis is lacking and warrants future research.
Spatial Distribution of Lead Iodide and Local Passivation on Organo-Lead Halide Perovskite
We identify the nanoscale spatial distribution of PbI2 on the (FAPbI3)0.85(MAPbBr3)0.15 perovskite thin film and investigate the local passivation effect using confocal-based optical microscopy of steady-state and time-resolved photoluminescence (PL). Unlike a typical scanning electron microscope (SEM) morphology study, confocal-based PL spectroscopy and microscopy allow researchers to map the morphologies of both perovskite and PbI2 grains simultaneously by selectively detecting their characteristic fluorescent bands using band-pass filters. In this work, we compare perovskite samples with and without excess PbI2 incorporation and unambiguously reveal the PbI2 distribution for the PbI2-rich sample. In addition, using the nanoscale time-resolved PL technique, we show that the PbI2-rich regions exhibit a longer lifetime than the PbI2-poor regions due to suppressed defect trapping. The measurement on the PbI2-rich sample indicates that the passivation effect of PbI2 in perovskite film is effective, especially in localized regions. Hence, this finding is important for further improvement of solar cells through the strategy of excess PbI2 incorporation.