
    Use of human gestures for controlling a mobile robot via adaptive CMAC network and fuzzy logic controller

    Mobile robots with manipulators are increasingly applied in extreme and hostile environments to assist or even replace human operators in complex tasks. In addition to autonomous abilities, mobile robots need to support a human–robot interaction control mode that enables human users to easily control or collaborate with them. This paper proposes a system that uses human gestures to control an autonomous mobile robot integrating a manipulator and a video surveillance platform. A human user can control the mobile robot just as one drives an actual vehicle from its driving cab. The proposed system obtains the human's skeleton joint information from a motion sensing input device; this information is then recognized and interpreted into a set of control commands. Depending on the availability of training data and the requirement for real-time performance, this is implemented with an adaptive cerebellar model articulation controller (CMAC) neural network, a finite state machine, a fuzzy controller, and purposely designed gesture recognition and control command generation modules. These algorithms work together to implement the steering and velocity control of the mobile robot in real time. The experimental results demonstrate that the proposed approach can conveniently control a mobile robot using the virtual driving method, with smooth manoeuvring trajectories at various speeds.
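
    A minimal sketch of the fuzzy steering idea described above, assuming the virtual steering wheel angle is derived from two tracked hand joints; the membership functions, joint format, and turn-rate limits below are illustrative assumptions, not the authors' implementation.

```python
import math

def triangular(x, a, b, c):
    """Triangular membership function peaking at b on the interval [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def steering_angle(left_hand, right_hand):
    """Tilt (radians) of the line joining the two hand joints,
    interpreted as the angle of a virtual steering wheel."""
    dx = right_hand[0] - left_hand[0]
    dy = right_hand[1] - left_hand[1]
    return math.atan2(dy, dx)

def fuzzy_turn_rate(angle, max_rate=0.8):
    """Map the wheel tilt to a turn rate via three fuzzy sets
    (LEFT, STRAIGHT, RIGHT) and weighted-average defuzzification."""
    left     = triangular(angle, -math.pi / 2, -math.pi / 4, 0.0)
    straight = triangular(angle, -math.pi / 8,  0.0,         math.pi / 8)
    right    = triangular(angle,  0.0,          math.pi / 4, math.pi / 2)
    weights  = [left, straight, right]
    outputs  = [-max_rate, 0.0, max_rate]   # rule consequents (rad/s)
    total = sum(weights)
    return sum(w * o for w, o in zip(weights, outputs)) / total if total else 0.0

# Example: hands tilted slightly as if turning the wheel to the right.
print(fuzzy_turn_rate(steering_angle((0.3, 1.0), (0.7, 1.1))))  # ~0.36 rad/s
```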

    Translating SIBI (Sign System for Indonesian Gesture) Gesture-to-Text in Real-Time using a Mobile Device

    The SIBI gesture translation framework by Rakun was built using a series of machine learning technologies: MobileNetV2 for feature extraction, a Conditional Random Field for finding the epenthesis (transition) movement frames, and a Long Short-Term Memory network for word classification. This computationally intensive translation system was previously implemented on a personal computer, which lacks portability and accessibility. This study implemented the system on a smartphone using on-device inference: the translation process is embedded in the smartphone to provide lower latency and zero data usage. The system was then improved with a parallel multi-inference method, which reduced the average translation time by 25%. The final mobile SIBI gesture-to-text translation system achieved a word accuracy of 90.560%, a sentence accuracy of 64%, and an average translation time of 20 seconds.
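
    A minimal sketch of the parallel multi-inference idea, with placeholder stages standing in for the MobileNetV2 / CRF / LSTM pipeline; the function names and dummy data are assumptions for illustration, not the framework's actual API or models.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder stages standing in for the pipeline described in the abstract
# (MobileNetV2 features -> CRF epenthesis filtering -> LSTM word classifier).
def extract_features(frames):
    return [sum(f) / len(f) for f in frames]      # dummy per-frame feature

def drop_epenthesis(features):
    return [f for f in features if f > 0.0]       # pretend transition frames are removed

def classify_word(features):
    return "word" if features else ""             # stand-in for the LSTM output

def translate_segment(frames):
    return classify_word(drop_epenthesis(extract_features(frames)))

def translate_sentence(segments, workers=2):
    """Parallel multi-inference: independent gesture segments are translated
    concurrently instead of one after another, the kind of change the study
    credits with cutting the average translation time by about 25%."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(translate_segment, segments))

# Two toy segments, each a list of frames (frames reduced to number lists here).
print(translate_sentence([[[0.2, 0.4]], [[0.1, 0.3]]]))
```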

    Deep Learning-Based Action Recognition

    The classification of human action or behavior patterns is very important for analyzing situations in the field and maintaining social safety. This book focuses on recent research findings on recognizing human action patterns. Technologies for recognizing human action patterns include processing human behavior data for learning, expressing image feature values, extracting spatiotemporal information from images, recognizing human posture, and recognizing gestures. Research on these technologies has recently been conducted using general deep learning network modeling from artificial intelligence, and excellent research results are included in this edition.

    A CSI-Based Human Activity Recognition Using Deep Learning

    The Internet of Things (IoT) has become quite popular due to advancements in information and communications technologies and has revolutionized the research area of Human Activity Recognition (HAR). For the HAR task, vision-based and sensor-based methods can provide richer data, but at the cost of user inconvenience and social constraints such as privacy issues. Due to the ubiquity of WiFi devices, using WiFi for intelligent daily activity monitoring of elderly persons has gained popularity in modern healthcare applications. Channel State Information (CSI), one of the characteristics of WiFi signals, can be utilized to recognize different human activities. We employed a Raspberry Pi 4 to collect CSI data for seven human daily activities, converted the CSI data to images, and then used these images as inputs to a 2D Convolutional Neural Network (CNN) classifier. Our experiments show that the proposed CSI-based HAR outperforms competing methods, including 1D-CNN, Long Short-Term Memory (LSTM), and Bi-directional LSTM, and achieves an accuracy of around 95% for the seven activities.
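
    A minimal sketch, in PyTorch, of the CSI-to-image-to-2D-CNN pipeline described above; the input shape, normalisation, and network layout are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np
import torch
import torch.nn as nn

def csi_to_image(csi_amplitudes):
    """Normalise a CSI amplitude matrix (time steps x subcarriers) into a
    single-channel image tensor in [0, 1]."""
    x = np.abs(csi_amplitudes).astype(np.float32)
    x = (x - x.min()) / (x.max() - x.min() + 1e-8)
    return torch.from_numpy(x).unsqueeze(0)           # shape (1, T, subcarriers)

class CsiCnn(nn.Module):
    """Small 2D CNN with seven output classes, one per daily activity."""
    def __init__(self, num_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Example with random data shaped like 64 time steps of 52 subcarriers.
image = csi_to_image(np.random.randn(64, 52))
logits = CsiCnn()(image.unsqueeze(0))                  # add a batch dimension
print(logits.shape)                                    # torch.Size([1, 7])
```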

    Applications of Silicon Retinas: from Neuroscience to Computer Vision

    Traditional visual sensor technology is firmly rooted in the concept of sequences of image frames. The sequence of stroboscopic images in these "frame cameras" is very different from the information running from the retina to the visual cortex. While conventional cameras have improved in the direction of smaller pixels and higher frame rates, the basics of image acquisition have remained the same. Event-based vision sensors were originally known as "silicon retinas" but are now widely called "event cameras." They are a new type of vision sensor that takes inspiration from the mechanisms developed by nature for the mammalian retina and suggests a different way of perceiving the world. As in the neural system, the sensed information is encoded in a train of spikes, or so-called events, comparable to the action potentials generated in the nerve. Event-based sensors produce sparse and asynchronous output that represents informative changes in the scene. These sensors have advantages in terms of fast response, low latency, high dynamic range, and sparse output. All these characteristics are appealing for computer vision and robotic applications, increasing the interest in this kind of sensor. However, since the sensor's output is very different, algorithms designed for frames need to be rethought and re-adapted.

    This thesis focuses on several applications of event cameras in scientific scenarios. It aims to identify where they can make a difference compared to frame cameras. The presented applications use the Dynamic Vision Sensor (DVS), an event camera developed by the Sensors Group of the Institute of Neuroinformatics, University of Zurich and ETH. To explore some applications in more extreme situations, the first chapters of the thesis focus on the characterization of several advanced versions of the standard DVS. Low light represents a challenging situation for every vision sensor. Taking inspiration from standard Complementary Metal Oxide Semiconductor (CMOS) technology, DVS pixel performance in low-light scenarios can be improved, increasing sensitivity and quantum efficiency, by using back-side illumination. This thesis characterizes the so-called Back Side Illumination DAVIS (BSI DAVIS) camera and shows results from its application to calcium imaging of neural activity. The BSI DAVIS showed better performance in low-light scenes due to its high Quantum Efficiency (QE) of 93% and proved to be the most suitable technology for microscopy applications. The BSI DAVIS allows detecting fast dynamic changes in neural fluorescence imaging using the green fluorescent calcium indicator GCaMP6f.

    Event camera advances have pushed the exploration of event-based cameras in computer vision tasks. Chapters of this thesis focus on two of the most active research areas in computer vision: human pose estimation and hand gesture classification. Both chapters report the datasets collected for the task, fulfilling the continuous need for data for this kind of new technology. The Dynamic Vision Sensor Human Pose dataset (DHP19) is an extensive collection of 33 whole-body human actions from 17 subjects. The chapter presents the first benchmark neural network model for 3D pose estimation using DHP19. The network achieves a mean error of less than 8 mm in 3D space, which is comparable with frame-based Human Pose Estimation (HPE) methods. The gesture classification chapter reports an application running on a mobile device and explores future developments in the direction of embedded, portable, low-power devices for online processing. The sparse output from the sensor suggests using a small model with a reduced number of parameters and low power consumption. The thesis also describes pilot results from two other scientific imaging applications, raindrop size measurement and laser speckle analysis, presented in the appendices.
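
    A minimal sketch of one common way to hand event-camera output to conventional models: accumulating the sparse (x, y, timestamp, polarity) events of a short time window into a fixed-size 2D frame. This is an illustration of the general representation, not necessarily the one used in the thesis; the resolution, window length, and event tuples below are assumptions.

```python
import numpy as np

def events_to_frame(events, width=346, height=260, window_us=10_000, t_start=0):
    """Accumulate events inside one time window into a signed count image."""
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, t, polarity in events:
        if t_start <= t < t_start + window_us:
            frame[y, x] += 1.0 if polarity else -1.0   # ON events add, OFF events subtract
    return frame

# Example: three synthetic events inside the window, one outside it.
events = [(10, 20, 1_000, 1), (10, 20, 2_000, 1), (30, 40, 3_000, 0), (5, 5, 50_000, 1)]
frame = events_to_frame(events)
print(frame[20, 10], frame[40, 30])   # 2.0 -1.0
```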

    Virtual Reality Engine Development

    With the advent of modern graphics and computing hardware and cheaper sensor and display technologies, virtual reality is becoming increasingly popular in the fields of gaming, therapy, training, and visualization. Earlier attempts at popularizing VR technology were plagued by issues of cost, portability, and marketability to the general public. Modern screen technologies make it possible to produce cheap, light head-mounted displays (HMDs) like the Oculus Rift, and modern GPUs make it possible to create and deliver a seamless real-time 3D experience to the user. 3D sensing has found an application in virtual and augmented reality as well, allowing for a higher level of interaction between the real and the simulated. There are still issues that persist, however. Many modern graphics/game engines still do not provide developers with an intuitive or adaptable interface for incorporating these new technologies. Those that do tend to treat VR as a novelty afterthought, and even then only provide tailor-made extensions for specific hardware. The goal of this paper is to design and implement a functional, general-purpose VR engine that uses abstract interfaces for much of the hardware involved, allowing easy extensibility for the developer.
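
    A minimal sketch of the abstract-hardware-interface idea described above, written in Python for brevity rather than in the engine's actual implementation language; the class and method names are illustrative assumptions, not the paper's API.

```python
from abc import ABC, abstractmethod

class StereoDisplay(ABC):
    """Abstract display interface the engine renders against; each concrete
    HMD (or a plain desktop window) is a plug-in implementation."""

    @abstractmethod
    def eye_poses(self):
        """Return the current left/right eye view transforms."""

    @abstractmethod
    def submit(self, left_image, right_image):
        """Present one rendered frame per eye."""

class DesktopMirrorDisplay(StereoDisplay):
    """Fallback implementation so the engine still runs without an HMD."""
    def eye_poses(self):
        identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
        return identity, identity

    def submit(self, left_image, right_image):
        print("presenting mirrored frame")

def render_frame(display: StereoDisplay):
    # The engine only ever talks to the abstract interface, so swapping
    # hardware means swapping the StereoDisplay implementation.
    left_pose, right_pose = display.eye_poses()
    # ... render the scene once per eye using left_pose and right_pose ...
    display.submit(left_image=None, right_image=None)

render_frame(DesktopMirrorDisplay())
```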

    Multi-modal perception and sensor fusion for human-robot collaboration

    In an era where human-robot collaboration is becoming increasingly common, sensor-based approaches are emerging as critical factors in enabling robots to work harmoniously alongside humans. In some scenarios, the use of multiple sensors becomes imperative due to the complexity of certain tasks. This multi-modal approach enhances cognitive capabilities and ensures higher levels of safety and reliability. The integration of diverse sensor modalities not only improves the accuracy of perception but also enables robots to adapt to their surrounding environments and analyze human intentions and commands more effectively. This paper explores sensor fusion and multi-modal approaches within an industrial context. The proposed fusion framework adopts a holistic approach encompassing multiple modalities, integrating object detection, human gesture recognition, and speech recognition techniques. These components are pivotal for human-robot interaction, enabling robots to comprehend their environment and interpret human inputs efficiently. Each modality is individually tested and evaluated within the sensor fusion and multi-modal framework to ensure efficient functionality. Successful industrial experimentation underscores the practicality and relevance of sensor fusion and multi-modality approaches. This work underscores the potential of sensor fusion while emphasizing the ongoing need for exploration and improvement in this field.
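
    A minimal late-fusion sketch of the idea described above, where each modality proposes a command with a confidence and a command is issued only when it has enough combined support; the modality names, threshold, and scoring rule are assumptions for illustration, not the paper's framework.

```python
from dataclasses import dataclass

@dataclass
class ModalityResult:
    source: str        # e.g. "gesture", "speech", "vision"
    command: str       # e.g. "stop", "pick_up"
    confidence: float  # in [0, 1]

def fuse(results, threshold=0.6):
    """Sum per-command confidences across modalities and act on the best
    command only if its combined support clears the threshold."""
    scores = {}
    for r in results:
        scores[r.command] = scores.get(r.command, 0.0) + r.confidence
    command, score = max(scores.items(), key=lambda kv: kv[1])
    return command if score >= threshold else None

# Gesture and speech weakly agree on "stop"; together they clear the threshold.
print(fuse([
    ModalityResult("gesture", "stop", 0.45),
    ModalityResult("speech", "stop", 0.40),
    ModalityResult("vision", "pick_up", 0.30),
]))  # -> "stop"
```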