5,533 research outputs found

    Learning Manipulation under Physics Constraints with Visual Perception

    Full text link
    Understanding physical phenomena is a key competence that enables humans and animals to act and interact under uncertain perception in previously unseen environments containing novel objects and configurations. In this work, we consider the problem of autonomous block stacking and explore solutions to learning manipulation under the physics constraints inherent to the task, using visual perception. Inspired by intuitive physics in humans, we first present an end-to-end learning-based approach that predicts stability directly from appearance, contrasting it with a more traditional model-based approach built on explicit 3D representations and physical simulation. We study the model's behavior alongside an accompanying human-subject test. The model is then integrated into a real-world robotic system to guide the placement of a single wood block into the scene without collapsing the existing tower structure. To further automate consecutive block stacking, we present an alternative approach in which the model learns the physics constraints through interaction with the environment, bypassing the dedicated physics learning of the first part of this work. In particular, we are interested in tasks that require the agent to reach a goal state that may differ on every new trial. We therefore propose a deep reinforcement learning framework that learns policies for stacking tasks parametrized by a target structure.
    Comment: arXiv admin note: substantial text overlap with arXiv:1609.04861, arXiv:1711.00267, arXiv:1604.0006
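    To make the end-to-end idea concrete, here is a minimal sketch (not the authors' code) of a convolutional classifier that maps a tower image directly to a collapse probability, in the spirit of the appearance-based stability prediction described above; the architecture, image size, and label convention are illustrative assumptions.

```python
# Hedged sketch: appearance-based stability prediction as binary
# classification. All hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn

class StabilityNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, 1)  # logit for P(tower collapses)

    def forward(self, img):
        return self.classifier(self.features(img).flatten(1))

model = StabilityNet()
img = torch.randn(8, 3, 128, 128)             # stand-in for tower images
labels = torch.randint(0, 2, (8, 1)).float()  # 1 = collapses, 0 = stable
loss = nn.BCEWithLogitsLoss()(model(img), labels)
loss.backward()                               # one supervised training step
```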

    Digital Image Access & Retrieval

    Get PDF
    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. They fall into three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.

    CGAMES'2009

    Get PDF

    Vision Based Activity Recognition Using Machine Learning and Deep Learning Architecture

    Get PDF
    Human activity recognition, with wide application in fields such as video surveillance, sports, human interaction, and elderly care, has had a strong influence on raising people's standard of living. With the constant development of new architectures and models, and the increase in the computational capability of systems, the adoption of machine learning and deep learning for activity recognition has achieved high performance in recent years. My research goal in this thesis is to design and compare machine learning and deep learning models for activity recognition in videos collected from different media in the field of sports. Human activity recognition (HAR) is the task of automatically recognizing the actions performed by a human from data collected from different sources. Based on the literature review, most data collected for analysis are either time-series data collected through different sensors or video data collected through cameras. We therefore first analyze and compare different machine learning and deep learning architectures on sensor data collected from a smartphone accelerometer placed at different positions on the human body. Without any hand-crafted feature extraction, we found that deep learning architectures outperform most machine learning architectures, and that using multiple sensors yields higher accuracy than a dataset collected from a single sensor. Secondly, since collecting sensor data in real time is not feasible in all fields, such as sports, we study activity recognition using video datasets. For this, we used two state-of-the-art deep learning architectures, previously trained on large annotated datasets, with transfer learning methods for activity recognition on three publicly available sports-related datasets. Extending the study to the different activities performed in a single sport, and to avoid the current trend of using special cameras and expensive setups around the court for data collection, we built our own video dataset from coverage of basketball games distributed through broadcast media. A detailed analysis and experiments based on criteria such as the range of shots taken and scoring activities are presented for 8 different activities using state-of-the-art deep learning architectures for video classification.
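    As an illustration of the transfer-learning recipe the abstract describes, the sketch below adapts an ImageNet-pretrained backbone to a small set of sports-activity classes; the frozen backbone, the 8-class head, and the per-frame treatment are assumptions for illustration, not the thesis code.

```python
# Hedged sketch: transfer learning for activity recognition. A pretrained
# ResNet-18 is reused as a frozen feature extractor and only a new 8-class
# head is trained; class count and input handling are assumptions.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in backbone.parameters():
    p.requires_grad = False                   # keep pretrained features fixed
backbone.fc = nn.Linear(backbone.fc.in_features, 8)  # hypothetical 8 classes

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
frames = torch.randn(4, 3, 224, 224)          # individual video frames; a video
labels = torch.randint(0, 8, (4,))            # model would pool over time
loss = nn.CrossEntropyLoss()(backbone(frames), labels)
loss.backward()
optimizer.step()                              # update only the new head
```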

    Trajectory data mining: A review of methods and applications

    Get PDF
    The increasing use of location-aware devices has led to an increasing availability of trajectory data. As a result, researchers have devoted their efforts to developing analysis methods, including different data mining methods, for trajectories. However, research in this direction has so far produced mostly isolated studies, and we still lack an integrated view of the application problems in trajectory mining that have been solved, the methods used to solve them, and the applications built on the resulting solutions. In this paper, we first discuss generic methods of trajectory mining and the relationships between them. Then, we discuss and classify application problems that have been solved using trajectory data and relate them to the generic mining methods used and the real-world applications based on them. We classify trajectory-mining application problems into major problem groups based on how they are related. This classification of problems can guide researchers in identifying new application problems. The relationships between the methods, together with the associations between application problems and mining methods, can help researchers identify gaps between methods and inspire them to develop new ones. This paper can also guide analysts in choosing a suitable method for a specific problem. Its main contribution is to provide an integrated view relating applications of trajectory data mining and the methods used.
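    One generic trajectory-mining primitive such a survey covers is stay-point detection, which segments a raw GPS track into places where the moving object lingered; the sketch below is a standard textbook formulation with illustrative distance and duration thresholds, not code from the paper.

```python
# Hedged sketch: stay-point detection on a GPS trajectory. A stay point is
# a region the object stayed within d_max metres of for at least t_min
# seconds. Thresholds are illustrative assumptions.
from math import radians, sin, cos, asin, sqrt

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon, t) points."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * asin(sqrt(a))

def stay_points(track, d_max=100.0, t_min=300.0):
    """track: list of (lat, lon, unix_time). Returns (lat, lon, arrive, leave)."""
    stays, i = [], 0
    while i < len(track):
        j = i + 1
        while j < len(track) and haversine_m(track[i], track[j]) <= d_max:
            j += 1
        if track[j - 1][2] - track[i][2] >= t_min:   # lingered long enough
            lats = [p[0] for p in track[i:j]]
            lons = [p[1] for p in track[i:j]]
            stays.append((sum(lats) / len(lats), sum(lons) / len(lons),
                          track[i][2], track[j - 1][2]))
        i = j                                        # continue after this segment
    return stays
```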

    SenseFi: A library and benchmark on deep-learning-empowered WiFi human sensing

    Get PDF
    In recent years, WiFi sensing has been rapidly developed for privacy-preserving, ubiquitous human-sensing applications, enabled by signal processing and deep-learning methods. However, a comprehensive public benchmark for deep learning in WiFi sensing, similar to those available for visual recognition, does not yet exist. In this article, we review recent progress in topics ranging from WiFi hardware platforms to sensing algorithms and propose a new library with a comprehensive benchmark, SenseFi. On this basis, we evaluate various deep-learning models in terms of distinct sensing tasks, WiFi platforms, recognition accuracy, model size, computational complexity, and feature transferability. Extensive experiments are performed whose results provide valuable insights into model design, learning strategy, and training techniques for real-world applications. In summary, SenseFi is a comprehensive benchmark with an open-source library for deep learning in WiFi sensing research that offers researchers a convenient tool to validate learning-based WiFi-sensing methods on multiple datasets and platforms.
    This research is supported by the NTU Presidential Postdoctoral Fellowship, ‘‘Adaptive Multi-modal Learning for Robust Sensing and Recognition in Smart Cities’’ project fund (020977-00001), at the Nanyang Technological University, Singapore.
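    To show the kind of model/task pairing such a benchmark compares, here is a sketch in plain PyTorch (deliberately not the SenseFi API, which is not reproduced here) of a recurrent classifier over WiFi channel state information (CSI) sequences; the tensor shape and the six-class label space are assumptions.

```python
# Hedged sketch: an LSTM over CSI sequences for activity classification.
# 90 subcarriers x 200 time steps and 6 classes are illustrative assumptions.
import torch
import torch.nn as nn

class CSILSTM(nn.Module):
    def __init__(self, n_subcarriers=90, n_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(n_subcarriers, 64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, csi):                  # csi: (batch, time, subcarriers)
        _, (h, _) = self.lstm(csi)
        return self.head(h[-1])              # classify from final hidden state

model = CSILSTM()
csi = torch.randn(16, 200, 90)               # batch of CSI amplitude sequences
pred = model(csi).argmax(dim=1)              # predicted activity per sample
print(pred.shape)                            # torch.Size([16])
```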