Toward Abstraction from Multi-modal Data: Empirical Studies on Multiple Time-scale Recurrent Models
Abstraction tasks are challenging for multi-modal sequences, as they require a
deeper semantic understanding of the data and novel text generation. Although
recurrent neural networks (RNNs) can model the context of time sequences, the
long-term dependencies of multi-modal data usually cause the gradients of
back-propagation through time to vanish in the time domain. Recently, inspired
by the Multiple Time-scale Recurrent Neural Network (MTRNN), an extension of
the Gated Recurrent Unit (GRU) called the Multiple Time-scale Gated Recurrent
Unit (MTGRU) has been proposed to learn long-term dependencies in natural
language processing. In particular, it can also accomplish the abstraction
task for paragraphs, provided that the time constants are well defined. In
this paper, we compare the MTRNN and the MTGRU in terms of their learning
performance as well as their abstract representations at the higher level
(with slower neural activation). We conducted two studies: one on a smaller
data-set (two-dimensional time sequences from non-linear functions) and one on
a relatively large data-set (43-dimensional time sequences from iCub
manipulation tasks with multi-modal data). We conclude that gated recurrent
mechanisms may be necessary for learning long-term dependencies in
high-dimensional multi-modal data-sets (e.g. learning of robot manipulation),
even when natural language commands are not involved, whereas for smaller
learning tasks with simple time sequences, a generic recurrent model such as
the MTRNN is sufficient to accomplish the abstraction task.
Comment: Accepted by IJCNN 201
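As a rough illustration of the mechanism this abstract describes, the sketch below implements one common formulation of an MTGRU cell: a standard GRU whose output is interpolated with the previous hidden state by a fixed time constant tau, so that a layer with a large tau updates slowly and can capture longer-term structure. The class and parameter names are our own assumptions for illustration; the paper's exact gating may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MTGRUCell:
    """Hypothetical multiple time-scale GRU cell. tau = 1 recovers the
    ordinary GRU; larger tau yields slower, low-pass-filtered dynamics."""

    def __init__(self, input_size, hidden_size, tau=1.0, rng=None):
        rng = rng or np.random.default_rng(0)
        s = 1.0 / np.sqrt(hidden_size)
        # Stacked weights for the update (z), reset (r) and candidate gates.
        self.W = rng.uniform(-s, s, (3, hidden_size, input_size))
        self.U = rng.uniform(-s, s, (3, hidden_size, hidden_size))
        self.b = np.zeros((3, hidden_size))
        self.tau = tau

    def step(self, x, h_prev):
        z = sigmoid(self.W[0] @ x + self.U[0] @ h_prev + self.b[0])
        r = sigmoid(self.W[1] @ x + self.U[1] @ h_prev + self.b[1])
        h_cand = np.tanh(self.W[2] @ x + self.U[2] @ (r * h_prev) + self.b[2])
        h_gru = (1.0 - z) * h_prev + z * h_cand      # ordinary GRU update
        # Multiple time-scale interpolation: leak toward the GRU state at 1/tau.
        return (1.0 / self.tau) * h_gru + (1.0 - 1.0 / self.tau) * h_prev

# A slow context layer might use tau = 16 while a fast layer keeps tau = 1.
```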
Understanding of Object Manipulation Actions Using Human Multi-Modal Sensory Data
Object manipulation actions represent an important share of the Activities of
Daily Living (ADLs). In this work, we study how to enable service robots to use
human multi-modal data to understand object manipulation actions, and how they
can recognize such actions when humans perform them during human-robot
collaboration tasks. The multi-modal data in this study consist of videos,
hand motion data, applied forces as represented by the pressure patterns on the
hand, and measurements of the bending of the fingers, collected as human
subjects performed manipulation actions. We investigate two different
approaches. In the first, we show that the multi-modal signal (motion, finger
bending and hand pressure) generated by the action can be decomposed into a set
of primitives that can be seen as its building blocks. These primitives are
used to define 24 multi-modal primitive features, which can in turn serve as an
abstract representation of the multi-modal signal and be employed for action
recognition. In the second approach, visual features are extracted from the
data using a pre-trained deep convolutional neural network for image
classification, and these features are subsequently used to train the
classifier. We also investigate whether adding data from other modalities
produces a statistically significant improvement in classifier performance. We
show that both approaches produce comparable performance. This implies that
image-based methods can successfully recognize human actions during human-robot
collaboration. On the other hand, to provide training data from which the robot
can learn how to perform object manipulation actions, multi-modal data provide
a better alternative.
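The abstract's second approach (pre-trained image features feeding a conventional classifier) follows a standard transfer-learning pattern. A minimal sketch, assuming a ResNet-18 backbone and a logistic-regression classifier since the paper does not name either component:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.linear_model import LogisticRegression

# Pre-trained backbone with the classification head removed, so a
# forward pass yields a 512-d feature vector per frame.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def visual_features(frames):
    """frames: list of PIL images sampled from one action video."""
    batch = torch.stack([preprocess(f) for f in frames])
    return backbone(batch).mean(dim=0).numpy()   # average-pool over time

# Hypothetical training step over labelled videos:
# X = [visual_features(v) for v in train_videos]
# clf = LogisticRegression(max_iter=1000).fit(X, train_labels)
```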
A Comparative Study of Speculative Retrieval for Multi-Modal Data Trails: Towards User-Friendly Human-Vehicle Interactions
In the era of growing developments in Autonomous Vehicles, the importance of Human-Vehicle Interaction has become apparent. However, retrieving in-vehicle drivers' multi-modal data trails by means of embedded sensors has been considered user-unfriendly and impractical. Hence, speculative designs for in-vehicle multi-modal data retrieval are demanded for future personalized and intelligent Human-Vehicle Interaction. In this paper, we explore the feasibility of using facial recognition techniques to build in-vehicle multi-modal data retrieval. We first perform a comprehensive user study to collect relevant data and extra trails through sensors, cameras and a questionnaire. We then build a complete pipeline based on Convolutional Neural Networks to predict values in three particular categories of multi-modal data, namely Heart Rate, Skin Conductance and Vehicle Speed, taking facial expressions as the sole input. We further evaluate and validate the pipeline's effectiveness on the data set, which suggests a promising future for Speculative Designs for Multi-modal Data Retrieval through this approach.
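The paper's exact network is not specified in the abstract; a minimal sketch of the kind of face-to-signal regressor it describes might look like the following, where every layer size, the loss and the model name are assumptions made for illustration:

```python
import torch
import torch.nn as nn

class FaceToSignals(nn.Module):
    """Hypothetical CNN mapping a face crop to three regression targets:
    heart rate, skin conductance and vehicle speed."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(128, 3)   # [heart_rate, skin_conductance, speed]

    def forward(self, x):               # x: (batch, 3, H, W) face crops
        return self.head(self.features(x))

model = FaceToSignals()
loss_fn = nn.MSELoss()                  # regress against sensor ground truth
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```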
A multi-modal data resource for investigating topographic heterogeneity in patient-derived xenograft tumors.
Patient-derived xenografts (PDXs) are an essential pre-clinical resource for investigating tumor biology. However, cellular heterogeneity within and across PDX tumors can strongly impact the interpretation of PDX studies. Here, we generated a multi-modal, large-scale dataset to investigate PDX heterogeneity in metastatic colorectal cancer (CRC) across tumor models, spatial scales, and genomic, transcriptomic, proteomic and imaging assay modalities. To showcase this dataset, we present analyses assessing sources of PDX variation, including anatomical orientation within the implanted tumor, mouse contribution, and differences between replicate PDX tumors. A unique aspect of our dataset is the deep characterization of intra-tumor heterogeneity via immunofluorescence imaging, which enables investigation of variation across multiple spatial scales, from the subcellular to the whole-tumor level. Our study provides a benchmark data resource for investigating PDX models of metastatic CRC and serves as a template for future quantitative investigations of spatial heterogeneity within and across PDX tumor models.