
    Real-time human detection from depth images with heuristic approach

    The first industrial robot was built in the mid-20th century. The idea of industrial robots was to replace humans on assembly lines, where the tasks were repetitive and easy to do. The benefit of these robots is that they can work around the clock and need only electricity as compensation. Over the years, robots capable of only repetitive tasks have evolved into machines that operate fully autonomously in challenging environments; self-driving cars and service robots that work in customer service are good examples. This progress has mainly been enabled by advancements in artificial intelligence, machine vision, and depth camera technologies. With machine vision and depth perception, robots can construct a fully structured model of the environment around them, which allows them to react properly to sudden changes in their surroundings. In this project, a naive detection algorithm was implemented to separate humans from depth images. The algorithm works by removing the ground plane, after which the floating objects can be separated more easily. The floating objects are further processed, and human detection is then achieved using a heuristic approach. The proposed algorithm works in real time and reliably detects people standing in a relatively open environment. However, because of the naive approach, human-sized items are wrongly detected as humans in some scenarios.
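    As a rough illustration of the pipeline this abstract describes (ground-plane removal, clustering of the remaining points, then a size heuristic), the following is a minimal sketch assuming an Open3D point cloud as input. Every threshold and the bounding-box size check are illustrative assumptions, not the thesis's actual parameters.

```python
# Minimal sketch: heuristic human detection from a depth-derived point cloud.
# Requires Open3D; all thresholds are illustrative, not from the thesis.
import numpy as np
import open3d as o3d

def detect_humans(cloud: o3d.geometry.PointCloud):
    # 1. Remove the ground plane with RANSAC.
    plane, inliers = cloud.segment_plane(distance_threshold=0.05,
                                         ransac_n=3, num_iterations=200)
    objects = cloud.select_by_index(inliers, invert=True)

    # 2. Separate the remaining "floating" objects by Euclidean clustering.
    labels = np.asarray(objects.cluster_dbscan(eps=0.15, min_points=50))

    # 3. Keep clusters whose bounding box roughly matches a standing person
    #    (assumes an upright camera, so the largest extent is height).
    humans = []
    for lbl in set(labels) - {-1}:                     # -1 marks noise
        cluster = objects.select_by_index(np.where(labels == lbl)[0])
        extent = cluster.get_axis_aligned_bounding_box().get_extent()
        width, depth, height = sorted(extent)          # ascending order
        if 1.2 < height < 2.1 and 0.2 < width < 0.9:   # crude size heuristic
            humans.append(cluster)
    return humans
```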

    Peer Attention Modeling with Head Pose Trajectory Tracking Using Temporal Thermal Maps

    Human head pose trajectories can represent a wealth of implicit information, such as areas of attention, body language, potential future actions, and more. This signal is of high value for use in Human-Robot teams due to the implicit information encoded within it. Although team-based tasks require both explicit and implicit communication among peers, large team sizes, noisy environments, distance, and mission urgency can inhibit the frequency and quality of explicit communication. The goal of this thesis is to improve the capabilities of Human-Robot teams by making use of implicit communication. In support of this goal, the following hypotheses are investigated:
    ● Implicit information about a human subject’s attention can be reliably extracted in software by tracking the subject’s head pose trajectory, and
    ● attention can be represented with a 3D temporal thermal map for implicitly determining a subject’s Objects Of Interest (OOIs).
    These hypotheses are investigated by experimentation with a new tool for peer attention modeling by Head Pose Trajectory Tracking using Temporal Thermal Maps (HPT4M). This system allows a robot Observing Agent (OA) to view a human teammate and temporally model their Regions Of Interest (ROIs) by generating a 3D thermal map based on the subject’s head pose trajectory. The findings of this work are that HPT4M can be used by an OA to contribute to a team search mission by implicitly discovering a human subject’s OOI type, mapping the item’s location within the searched space, and labeling the item’s discovery state. This work also discusses some of the discovered limitations of this technology and the hurdles that must be overcome before implementing HPT4M in a reliable real-world system. Finally, the techniques used in this work are provided as an open source Robot Operating System (ROS) node at github.com/HPT4M with the intent that it will aid other developers in the robotics community in improving Human-Robot teams. The proofs of principle and tools developed in this thesis are a foundational platform for deeper investigation in future research on improving Human-Robot teams via implicit communication techniques.
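    As a rough illustration of the temporal thermal map idea, the sketch below accumulates "heat" in a voxel grid along each head-pose ray and decays it over time; the grid representation, the decay factor, and the ray stepping are assumptions for illustration and not the HPT4M implementation. In this toy model, feeding a stream of head poses into update() and reading hottest_region() would yield the current OOI candidate.

```python
# Sketch of a 3D temporal thermal map driven by head-pose rays.
# All parameters and the representation are illustrative assumptions.
import numpy as np

class TemporalThermalMap:
    def __init__(self, shape=(100, 100, 50), voxel=0.1, decay=0.99):
        self.heat = np.zeros(shape)   # accumulated attention per voxel
        self.voxel = voxel            # voxel edge length in metres
        self.decay = decay            # per-update temporal decay

    def update(self, head_pos, gaze_dir, max_range=5.0):
        """Cool the whole map, then heat voxels along the gaze ray.
        head_pos and gaze_dir are 3-element numpy arrays in map coordinates."""
        self.heat *= self.decay
        gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
        for t in np.arange(0.0, max_range, self.voxel):
            idx = tuple(((head_pos + t * gaze_dir) / self.voxel).astype(int))
            if all(0 <= i < s for i, s in zip(idx, self.heat.shape)):
                self.heat[idx] += 1.0

    def hottest_region(self):
        """Return the voxel index with the highest accumulated attention."""
        return np.unravel_index(np.argmax(self.heat), self.heat.shape)
```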

    Human-Robot Collaborations in Industrial Automation

    Technology is changing the manufacturing world. For example, sensors are being used to track inventories from the manufacturing floor up to a retail shelf or a customer’s door. These types of interconnected systems have been called the fourth industrial revolution, also known as Industry 4.0, and are projected to lower manufacturing costs. As industry moves toward these integrated technologies and lower costs, engineers will need to connect these systems via the Internet of Things (IoT). These engineers will also need to design how these connected systems interact with humans. The focus of this Special Issue is the smart sensors used in these human–robot collaborations

    Multimodal machine learning for intelligent mobility

    Scientific problems are solved by finding the optimal solution for a specific task. Some problems can be solved analytically, while others are solved using data-driven methods. The use of digital technologies to improve the transportation of people and goods, referred to as intelligent mobility, is one of the principal beneficiaries of data-driven solutions. Autonomous vehicles are at the heart of the developments that propel intelligent mobility. Due to the high dimensionality and complexity of real-world environments, data-driven solutions need to become commonplace in intelligent mobility, as it is near impossible to manually program decision-making logic for every eventuality. While recent data-driven developments such as deep learning enable machines to learn effectively from large datasets, applications of these techniques within safety-critical systems such as driverless cars remain scarce.
    Autonomous vehicles need to make context-driven decisions autonomously in the different environments in which they operate. The recent literature on driverless vehicle research is heavily focused on road and highway environments but has discounted pedestrianized areas and indoor environments. These unstructured environments tend to have more clutter and change rapidly over time. Therefore, for intelligent mobility to make a significant impact on human life, it is vital to extend its application beyond structured environments. To further advance intelligent mobility, researchers need to take cues from multiple sensor streams and multiple machine learning algorithms so that decisions can be robust and reliable. Only then will machines be able to operate safely in unstructured and dynamic environments. Towards addressing these limitations, this thesis investigates data-driven solutions for crucial building blocks of intelligent mobility: multimodal sensor data fusion, machine learning, multimodal deep representation learning, and their application to intelligent mobility. This work demonstrates that mobile robots can use multimodal machine learning to derive driving policy and therefore make autonomous decisions.
    To facilitate the autonomous decisions necessary for safe driving algorithms, we present algorithms for free-space detection and human activity recognition. Driving these decision-making algorithms are specific datasets collected throughout this study: the Loughborough London Autonomous Vehicle dataset and the Loughborough London Human Activity Recognition dataset. The datasets were collected using an autonomous platform designed and developed in-house as part of this research activity. The proposed framework for free-space detection is based on an active learning paradigm that leverages the relative uncertainty of multimodal sensor data streams (ultrasound and camera). It uses an online learning methodology to continuously update the learnt model whenever the vehicle experiences new environments, enabling an autonomous vehicle to self-learn, evolve, and adapt to environments never encountered before. The results illustrate that this online learning mechanism is superior to one-off training of deep neural networks, which requires large datasets to generalize to unfamiliar surroundings.
    The thesis takes the view that humans should be at the centre of any technological development related to artificial intelligence. Within intelligent mobility, it is imperative that an autonomous vehicle be aware of what humans are doing in its vicinity. Towards improving the robustness of human activity recognition, this thesis proposes a novel algorithm that classifies point-cloud data originating from Light Detection and Ranging (LiDAR) sensors. The proposed algorithm leverages multimodality by using camera data to identify humans and segment the region of interest in the point-cloud data. The corresponding 3-dimensional data is converted to a Fisher Vector representation before being classified by a deep Convolutional Neural Network. The proposed algorithm classifies the indoor activities performed by a human subject with an average precision of 90.3%, and when compared to an alternative point-cloud classifier, PointNet [1], [2], the proposed framework outperformed it on all classes.
    The developed autonomous testbed for data collection and algorithm validation, together with the multimodal data-driven solutions for driverless cars, are the major contributions of this thesis. It is anticipated that these results and the testbed will have significant implications for the future of intelligent mobility by amplifying the development of intelligent driverless vehicles.
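    One step the abstract describes, using camera detections to segment the human's region of interest in the LiDAR point cloud, could look roughly like the sketch below. The calibration matrices and the 2D person detector supplying box are assumed; this is not the thesis's actual code.

```python
# Sketch of camera-guided point-cloud segmentation: project LiDAR points
# into the image and keep those falling inside a person bounding box.
import numpy as np

def segment_roi(points_lidar, K, T_cam_lidar, box):
    """points_lidar: (N, 3) array; K: 3x3 camera intrinsics;
    T_cam_lidar: 4x4 LiDAR-to-camera extrinsics;
    box: (u_min, v_min, u_max, v_max) from a 2D person detector."""
    # Transform into the camera frame using homogeneous coordinates.
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0          # keep points ahead of the camera
    pts_cam = pts_cam[in_front]

    # Pinhole projection into pixel coordinates.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]

    u_min, v_min, u_max, v_max = box
    inside = ((uv[:, 0] >= u_min) & (uv[:, 0] <= u_max) &
              (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max))
    return points_lidar[in_front][inside]  # 3D points on the detected person
```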

    Robot Assisted 3D Block Building to Augment Spatial Visualization Skills in Children - An exploratory study

    The unique social presence of robots can be leveraged in learning situations to increase children’s comfort and engagement while still providing instructional guidance. When and how to intervene to provide feedback on their mistakes is still not fully clear. One effective feedback strategy used by human tutors is to implicitly inform students of their errors rather than explicitly providing corrective feedback. This study explores if and how a social robot can be used to provide implicit feedback to a user performing spatial visualization tasks. We explore the impact of implicit and explicit feedback strategies on users’ learning gains, self-regulation, and perception of the robot during 3D block building tasks in one-on-one child-robot tutoring. We demonstrate a real-time system that tracks the assembly of a 3D block structure using a RealSense RGB-D camera. The system allows three control actions, Add, Remove, and Adjust, on blocks of four basic colors to manipulate the structure in the play area. 3D structures can be authored in the Learning mode for the system to record, and tracking enables the robot to provide selected feedback in the Teaching mode depending on the type of mistake made by the user. The proposed system can detect five types of mistakes: a mistake in the shape, color, orientation, level from the base, or position of a block. The feedback provided by the robot is based on the mistake made by the user; either implicit or explicit feedback, chosen randomly, is narrated by the robot. Various feedback statements are designed to implicitly inform the user of the mistake made, and two robot behaviours, nodding and referential gaze, have been designed to support their effective delivery. We conducted an exploratory study with one participant to evaluate our robot-assisted 3D block building system for augmenting spatial visualization skills. We found that the system was easy to use, and the robot was perceived as trustworthy, fun, and interesting; the robot’s intentions are communicated through its feedback statements and behaviour. Our goal is to explore whether pointing out mistakes in implicit ways can help users self-regulate and scaffold their learning processes.
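    The mistake-detection step described above might be sketched as a simple comparison between the observed block and the authored target. The Block fields, the 180° orientation symmetry, and the check order below are illustrative assumptions, not the study's actual representation.

```python
# Sketch of mistake classification for the five mistake types described
# above. The data model and tolerances are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Block:
    shape: str        # e.g. "2x4" brick
    color: str        # one of the four basic colors
    orientation: int  # rotation about the vertical axis, in degrees
    level: int        # height above the base, in block units
    position: tuple   # (x, y) cell on the play-area grid

def classify_mistake(observed: Block, target: Block) -> str | None:
    """Return the first detected mistake type, or None if the block matches."""
    if observed.shape != target.shape:
        return "shape"
    if observed.color != target.color:
        return "color"
    # Assumes rectangular bricks look identical after a 180-degree turn.
    if observed.orientation % 180 != target.orientation % 180:
        return "orientation"
    if observed.level != target.level:
        return "level"
    if observed.position != target.position:
        return "position"
    return None
```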

    Multimodal agents for cooperative interaction

    Embodied virtual agents offer the potential to interact with a computer in a more natural manner, similar to how we interact with other people. Reaching this potential requires multimodal interaction, including both speech and gesture. This project builds on earlier work at Colorado State University and Brandeis University on just such a multimodal system, referred to as Diana. I designed and developed a new software architecture to directly address some of the difficulties of the earlier system, particularly with regard to asynchronous communication, e.g., interrupting the agent after it has begun to act. Various other enhancements were made to the agent systems, including the model itself, as well as speech recognition, speech synthesis, motor control, and gaze control. Further refactoring and new code were developed to achieve software engineering goals that are not outwardly visible but no less important: decoupling, testability, improved networking, and independence from a particular agent model. This work, combined with the effort of others in the lab, has produced a "version 2" Diana system that is well positioned to serve the lab’s research needs in the future. In addition, in order to pursue new research opportunities related to developmental and intervention science, a "Faelyn Fox" agent was developed. This is a different model, with a simplified cognitive architecture, and a system for defining an experimental protocol (for example, a toy-sorting task) based on Unity’s visual state machine editor. This version too lays a solid foundation for future research.
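    The asynchronous-interruption difficulty mentioned above, preempting an action the agent has already begun, is commonly handled by running the current action as a cancellable task. The sketch below shows that general pattern with Python's asyncio; the names and timings are illustrative, and this is not the Diana system's actual architecture.

```python
# General pattern for an interruptible agent action: a new command
# cancels whatever the agent is currently doing.
import asyncio

class Agent:
    def __init__(self):
        self._action = None  # the currently running action task, if any

    async def _perform(self, name, duration):
        print(f"starting {name}")
        await asyncio.sleep(duration)   # stands in for motor/speech output
        print(f"finished {name}")

    def act(self, name, duration=2.0):
        """Start an action, cancelling whatever the agent was doing."""
        if self._action and not self._action.done():
            self._action.cancel()       # interrupt the in-progress action
        self._action = asyncio.ensure_future(self._perform(name, duration))

async def main():
    agent = Agent()
    agent.act("point at the red block")
    await asyncio.sleep(0.5)
    agent.act("wave")                   # preempts the pointing action
    await asyncio.sleep(3)

asyncio.run(main())
```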

    VSLAM and Navigation System of Unmanned Ground Vehicle Based on RGB-D Camera

    In this thesis, ROS (Robot Operating System) is used as the software platform, and a simple unmanned ground vehicle designed and constructed by the author is used as the hardware platform. The most critical issues in the navigation of unmanned ground vehicles in unknown environments, SLAM (Simultaneous Localization and Mapping) and autonomous navigation, are studied. Through analysis of the principles and structure of visual SLAM, a visual simultaneous localization and mapping algorithm is built and then accelerated through hardware replacement and software optimization. A RealSense D435 is used as the VSLAM camera sensor. The algorithm extracts features from the depth camera data, computes the odometry of the unmanned vehicle by matching features across adjacent images, and then updates the vehicle’s location and map data using that odometry. With the visual SLAM algorithm working normally, the thesis also derives a real-time 2D projection map from the generated 3D map so that it can be used by the navigation algorithm. The thesis then realizes autonomous navigation and obstacle avoidance by controlling the driving speed and direction of the vehicle through a navigation algorithm that uses the 2D projection map. Path planning for unmanned ground vehicles consists of two main parts: global path planning, used to plan the optimal path to the destination, and local path planning, used to control the speed and direction of the UGV. This thesis analyzes and compares Dijkstra’s algorithm and the A* algorithm; considering compatibility with ROS, Dijkstra’s algorithm is used for global path planning, and the DWA (Dynamic Window Approach) algorithm is used for local path planning. Under the control of Dijkstra’s algorithm and the DWA algorithm, the unmanned ground vehicle can automatically plan the optimal path to the target point while avoiding obstacles. The thesis also covers the design and construction of a simple unmanned ground vehicle as an experimental platform, along with a simple control method based on a differential-drive design, and finally realizes autonomous navigation and obstacle avoidance through the visual SLAM and navigation algorithms. Finally, the main work and remaining deficiencies of the thesis are summarized, and the prospects and difficulties of unmanned ground vehicle research are presented.
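    As an illustration of the global-planning step, the following is a minimal sketch of Dijkstra's algorithm on a 2D occupancy grid such as the projection map described above. The grid encoding (0 = free, 1 = obstacle) and the 4-connected neighborhood are assumptions for illustration.

```python
# Minimal Dijkstra global planner on a 2D occupancy grid.
import heapq

def dijkstra(grid, start, goal):
    """grid: 2D list of 0/1; start, goal: (row, col). Returns a path or None."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            path = [node]
            while node in prev:               # walk back to the start
                node = prev[node]
                path.append(node)
            return path[::-1]
        if d > dist.get(node, float("inf")):
            continue                          # stale queue entry
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1.0
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(queue, (nd, (nr, nc)))
    return None  # goal unreachable
```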