31 research outputs found

    Learning of Generalized Manipulation Strategies in Service Robotics

    This thesis contributes to autonomous robotic manipulation. Its core is a novel constraint-based representation of manipulation tasks suited to flexible online motion planning. Interactive learning from natural human demonstrations is combined with parallelized optimization to enable efficient learning of complex manipulation tasks from limited training data. Prior planning results are automatically encoded into the model to reduce planning time and solve the correspondence problem.
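
    The representation itself is not spelled out in the abstract; purely as a generic illustration of constraint-based task specification with online optimization (not the thesis's actual model), a task can be written as a weighted sum of constraint costs over an end-effector pose and handed to an off-the-shelf optimizer. All targets and weights below are hypothetical:

```python
# Minimal generic sketch: a manipulation task expressed as soft constraints over an
# end-effector position [x, y, z], solved by numerical optimization.
# The constraint targets and weights are hypothetical illustrations.
import numpy as np
from scipy.optimize import minimize

def height_constraint(pose, table_z=0.75):
    """Penalize poses below a hypothetical table surface."""
    return max(0.0, table_z - pose[2]) ** 2

def reach_constraint(pose, target=np.array([0.4, 0.1, 0.9])):
    """Penalize distance from a hypothetical grasp target."""
    return float(np.sum((pose - target) ** 2))

def task_cost(pose):
    # A task is a weighted combination of constraint terms.
    return 1.0 * reach_constraint(pose) + 10.0 * height_constraint(pose)

result = minimize(task_cost, x0=np.zeros(3), method="Nelder-Mead")
print("planned pose:", result.x)
```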

    Improving robotic grasping system using deep learning approach

    Traditional robots can only move along pre-planned trajectories, which limits the range of applications in which they can be engaged. Despite its long history, the use of computer vision for grasp prediction and object detection remains an active research area. Generating a full grasp configuration for a target object is the main challenge in planning a successful physical robotic grasp. Integrating computer vision with tactile sensing feedback has given robots the capability to accomplish a variety of tasks. However, recent studies have used tactile sensing with grasp detection models to improve prediction accuracy rather than physical grasp success. This research therefore addresses the problem of detecting slip events for grasped objects of different weights. It aims to develop a deep learning grasp detection model and a slip detection algorithm and to integrate them into one robotic grasping system. Using a proposed four-step data augmentation technique, in which 625 new instances with different grasp labels were generated per original image, the achieved grasping accuracy was 98.2%, exceeding the best reported results by almost 0.5%. In addition, a two-stage transfer-learning technique improved the second-stage results by 0.3% over the first stage. For the physical robot grasp, the proposed seven-dimensional grasp representation allows autonomous prediction of grasp size and depth. The developed model achieved a prediction time of 74.8 milliseconds, which makes it usable in real-time robotic applications. By observing real-time feedback from a force-sensing resistor, the proposed slip detection algorithm responded within 86 milliseconds, allowing the system to maintain its hold on the target object by immediately increasing the grasping force. Integrating the deep learning and slip detection models yielded a significant improvement of 18.4% in experimental grasps conducted on a SCARA robot. In addition, the utilized Zerocross-Canny edge detector improved the robot positioning error by 0.27 mm compared to related studies. These results introduce an innovative robotic grasping system with a Grasp-NoDrop-Place scheme.
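
    As a rough illustration of the slip-handling idea described above (reacting to force-sensing resistor feedback by immediately increasing the grasping force), the sketch below shows a simple threshold-based monitor; the read_fsr and set_grip_force interfaces, thresholds, and increments are hypothetical placeholders, not the thesis's algorithm:

```python
# Hedged sketch of a threshold-based slip monitor, assuming hypothetical
# read_fsr() and set_grip_force() interfaces supplied by the caller.
import time

SLIP_DROP_THRESHOLD = 0.15   # fractional force drop treated as a slip event (assumed value)
FORCE_INCREMENT_N = 0.5      # grip force increase applied on slip (assumed value)

def monitor_grip(read_fsr, set_grip_force, initial_force_n=2.0, period_s=0.01):
    """Poll the force-sensing resistor and tighten the grip when the reading drops sharply."""
    grip_force = initial_force_n
    set_grip_force(grip_force)
    previous = read_fsr()
    while True:
        current = read_fsr()
        if previous > 0 and (previous - current) / previous > SLIP_DROP_THRESHOLD:
            grip_force += FORCE_INCREMENT_N   # react immediately to the detected slip
            set_grip_force(grip_force)
        previous = current
        time.sleep(period_s)
```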

    Breakdown: mechanical dysfunction and anthropomorphism

    Breakdown: Mechanical Dysfunction and Anthropomorphism is a practice-led research project examining the role of mechanical breakdown in the anthropomorphic process. Current theoretical approaches to mechanical breakdown identify it as a homogenous, revelatory event, “a sort of breach opened up by objects” (Baudrillard, 2004: 62). Breakdown challenges this stereotyping and seeks to examine the range of gesture and affect that differing forms of mechanical breakdown exhibit. In doing so it also develops Sherry Turkle’s notion of anthropomorphism as a connective rather than ascriptive process (2005: 351) in the light of Karen Barad’s “performative account of material bodies” (2007: 139). Leading this research is Breakdown, the making, remaking, exhibition and re-exhibition of 36 breaking-machines. These breaking-machines, simple mechanical devices made from reconfigured found materials, approach breakdown and fail during their exhibition. They are then repaired or reconfigured by the artist ‘live’ while still on show. Throughout the research this role of the artist as repairman became a key method. The continual recombination of human and machine responding to the call of breakdown allowed for a more detailed understanding of the gestures of mechanical breakdown. This performative relationship considers the posthuman decentring of the Vitruvian man in the writing of Rosi Braidotti (2013: 2) and Karen Barad’s agential realism (Barad, 2007: 44), both of which insist that the human, rather than being bounded and individual, be considered as part of a dispersed network of interacting parts. The thesis begins by investigating the performative relationship of Breakdown in detail. It describes a machine-human body that is materialised fleetingly by mechanical dysfunction. Through an intimate relationship with one machine, it then goes on to identify a typology of breakdown: seize, play, burnout and cutting loose, concluding that each emits differing expanding and contracting forces around which bodies disperse and coalesce. Finally, employing the flicker of a thaumatrope and the making of the science fiction film robot, the thesis posits that anthropomorphism is an integral element in the dissipation and reformation of human-machine bodies.

    Freeform 3D interactions in everyday environments

    PhD Thesis. Personal computing is continuously moving away from traditional input using mouse and keyboard as new input technologies emerge. Recently, natural user interfaces (NUI) have led to interactive systems that are inspired by our physical interactions in the real world and focus on enabling dexterous freehand input in 2D or 3D. Another recent trend is Augmented Reality (AR), which follows a similar goal of further reducing the gap between the real and the virtual, but predominantly focuses on output by overlaying virtual information onto a tracked real-world 3D scene. Whilst AR and NUI technologies have been developed for both immersive 3D output and seamless 3D input, they have mostly been considered separately. NUI focuses on sensing the user and enabling new forms of input; AR traditionally focuses on capturing the environment around us and enabling new forms of output that are registered to the real world. The output of NUI systems is mainly presented on a 2D display, while the input technologies for AR experiences, such as data gloves and body-worn motion trackers, are often uncomfortable and restricting when interacting in the real world. NUI and AR can be seen as highly complementary, and bringing these two fields together can lead to new user experiences that radically change the way we interact with our everyday environments. The aim of this thesis is to enable real-time, low-latency, dexterous input and immersive output without heavily instrumenting the user. The main challenge is to retain and meaningfully combine the positive qualities attributed to both NUI and AR systems. I review work in the intersecting research fields of AR and NUI, and explore freehand 3D interactions with varying degrees of expressiveness, directness and mobility in various physical settings. A number of technical challenges arise when designing a mixed NUI/AR system, which I address in this work: What can we capture, and how? How do we represent the real in the virtual? And how do we physically couple input and output? This is achieved by designing new systems, algorithms, and user experiences that explore the combination of AR and NUI.

    Enhanced Virtuality: Increasing the Usability and Productivity of Virtual Environments

    With steadily increasing display resolution, more accurate tracking, and falling prices, Virtual Reality (VR) systems are on the verge of establishing themselves successfully on the market. Various tools help developers create complex, multi-user interactions within adaptive virtual environments. However, the spread of VR systems also brings additional challenges: diverse input devices with unfamiliar shapes and button layouts prevent intuitive interaction. Moreover, the limited feature set of existing software forces users to fall back on conventional PC- or touch-based systems. Collaborating with other users at the same location raises challenges regarding the calibration of different tracking systems and collision avoidance. In remote collaboration, interaction is further affected by latency and connection loss. Finally, users have different requirements for the visualization of content within virtual worlds, e.g. size, orientation, color, or contrast. A strict replication of real environments in VR wastes potential and will not make it possible to account for users' individual needs. To address these problems, this thesis presents solutions in the areas of input, collaboration, and augmentation of virtual worlds and users, aimed at increasing the usability and productivity of VR. First, PC-based hardware and software are transferred into the virtual world to retain the familiarity and feature set of existing applications in VR. Virtual proxies of physical devices, e.g. keyboard and tablet, and a VR mode for applications allow the user to carry real-world skills into the virtual world. Furthermore, an algorithm is presented that enables the calibration of multiple co-located VR devices with high accuracy, low hardware requirements, and little effort. Since VR headsets block out the user's real surroundings, the relevance of a full-body avatar visualization for collision avoidance and remote collaboration is demonstrated. In addition, personalized spatial or temporal modifications are presented that increase users' usability, work performance, and social presence. Discrepancies between the virtual worlds that arise from personal adaptations are compensated by avatar redirection methods. Finally, some of the methods and findings are integrated into an exemplary application to demonstrate their practical applicability. This thesis shows that virtual environments can build on real skills and experiences to ensure familiar and straightforward interaction and collaboration between users. Moreover, individual augmentations of virtual content and avatars make it possible to overcome limitations of the real world and to enhance the experience of VR environments.
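
    The abstract mentions an algorithm for calibrating multiple co-located VR devices; as a generic sketch of the underlying idea (not the thesis's method), aligning two tracking systems can be reduced to estimating the rigid transform between corresponding tracked points, e.g. with the Kabsch method:

```python
# Generic illustration: align two co-located tracking systems by estimating the rigid
# transform between corresponding tracked points (Kabsch/Procrustes method).
import numpy as np

def calibrate(points_a: np.ndarray, points_b: np.ndarray):
    """Return rotation R and translation t such that points_b ~= points_a @ R.T + t."""
    centroid_a = points_a.mean(axis=0)
    centroid_b = points_b.mean(axis=0)
    H = (points_a - centroid_a).T @ (points_b - centroid_b)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = centroid_b - R @ centroid_a
    return R, t
```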

    From Line Drawings to Human Actions: Deep Neural Networks for Visual Data Representation

    In recent years, deep neural networks have been very successful in computer vision, speech recognition, and artificial intelligence systems. The rapid growth of data and fast-increasing computational tools provide solid foundations for applications that rely on learning large-scale deep neural networks with millions of parameters. Deep learning approaches have proven able to learn powerful representations of their inputs in various tasks, such as image classification, object recognition, and scene understanding. This thesis demonstrates the generality and capacity of deep learning approaches through a series of case studies, including image matching and human activity understanding. In these studies, I explore combinations of neural network models with existing machine learning techniques and extend the deep learning approach for each task. Four related tasks are investigated: 1) image matching through similarity learning; 2) human action prediction; 3) finger force estimation in manipulation actions; and 4) bimodal learning for human action understanding. Deep neural networks have been shown to be very efficient in supervised learning. Further, in some tasks one would like the features of samples in the same category to lie close to each other, in addition to forming a discriminative representation. Such properties are desirable in a number of applications, such as semantic retrieval, image quality measurement, and social network analysis. My first study develops a similarity learning method based on deep neural networks for matching sketch images to 3D models. For this task, I propose to use a Siamese network to learn similarities of sketches and develop a novel method for sketch-based 3D shape retrieval. The proposed method successfully learns representations of sketch images as well as their similarities, so the 3D shape retrieval problem can then be solved with off-the-shelf nearest neighbor methods. After studying representation learning methods for static inputs, my focus turns to learning representations of sequential data. Specifically, I focus on manipulation actions, because they are widely used in daily life and play an important part in human-robot collaboration systems. Deep neural networks have been shown to be powerful at representing short video clips [Donahue et al., 2015]. However, most existing methods treat action recognition as a classification task, assuming that the inputs are pre-segmented videos and the outputs are category labels. In scenarios such as human-robot collaboration, the ability to predict ongoing human actions at an early stage is highly important. I first address this issue with a fast manipulation action prediction method, and then build an action prediction model based on a Long Short-Term Memory (LSTM) architecture. The proposed approach processes the sequential inputs as continuous signals and keeps updating its prediction of the intended action based on the learned action representations. Further, I study the relationships between visual inputs and the physical information, such as finger forces, involved in manipulation actions. This is motivated by recent studies in cognitive science showing that a subject's intention is strongly related to hand movements during action execution. Human observers can interpret others' actions in terms of movements and forces, which can be used to repeat the observed actions. If a robot system can estimate force feedback, it can learn how to manipulate an object by watching human demonstrations. In this work, finger forces are estimated by watching only the movement of the hands. A modified LSTM model is used to regress the finger forces from video frames. To facilitate this study, a specially designed sensor glove was used to collect finger force data, and a new dataset was collected to provide synchronized streams of videos and finger forces. Last, I investigate the usefulness of physical information in human action recognition, an application of bimodal learning in which both the visual inputs and the additional information are used to learn the action representation. My study demonstrates that combining additional information with the visual inputs steadily improves the accuracy of human action recognition. I extend the LSTM architecture to accept both video frames and sensor data as bimodal inputs to predict the action. A hallucination network is jointly trained to approximate the representations of the additional inputs. During the testing stage, the hallucination network generates approximated representations that are used for classification. In this way, the proposed method does not rely on the additional inputs at test time.
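
    As a hedged illustration of the similarity-learning step described above, the sketch below shows a Siamese embedding trained with a contrastive loss in PyTorch; the feature dimensions, architecture, and margin are illustrative assumptions rather than the thesis's exact configuration:

```python
# Rough sketch of Siamese similarity learning with a contrastive loss (PyTorch).
# All dimensions and hyperparameters are assumed for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Maps an input feature vector (e.g. from a sketch or a rendered view) to an embedding."""
    def __init__(self, in_dim=4096, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, embed_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def contrastive_loss(z1, z2, same_label, margin=0.5):
    """Pull matching pairs together, push non-matching pairs beyond the margin."""
    d = F.pairwise_distance(z1, z2)
    pos = same_label * d.pow(2)
    neg = (1 - same_label) * F.relu(margin - d).pow(2)
    return (pos + neg).mean()

# Toy usage: the two branches share weights, as in a Siamese setup.
net = EmbeddingNet()
sketch_feat = torch.randn(8, 4096)
shape_feat = torch.randn(8, 4096)
labels = torch.randint(0, 2, (8,)).float()   # 1 = same object, 0 = different
loss = contrastive_loss(net(sketch_feat), net(shape_feat), labels)
loss.backward()
```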

    NASA Tech Briefs, April 1991

    Topics: New Product Ideas; NASA TU Services; Electronic Components and Circuits; Electronic Systems; Physical Sciences; Materials; Computer Programs; Mechanics; Machinery; Fabrication Technology; Mathematics and Information Sciences; Life Sciences