266 research outputs found

    Human-robot collaborative task planning using anticipatory brain responses

    Human-robot interaction (HRI) describes scenarios in which both human and robot work as partners, sharing the same environment or complementing each other on a joint task. HRI is characterized by the need for high adaptability and flexibility of robotic systems toward their human interaction partners. One of the major challenges in HRI is task planning with dynamic subtask assignment, which is particularly challenging when subtask choices of the human are not readily accessible by the robot. In the present work, we explore the feasibility of using electroencephalogram (EEG)-based neuro-cognitive measures for online robot learning of dynamic subtask assignment. To this end, we demonstrate in an experimental human subject study, featuring a joint HRI task with a UR10 robotic manipulator, the presence of EEG measures indicative of a human partner anticipating a takeover situation from human to robot or vice versa. The present work further proposes a reinforcement-learning-based algorithm employing these measures as a neuronal feedback signal from the human to the robot for dynamic learning of subtask assignment. The efficacy of this algorithm is validated in a simulation-based study. The simulation results reveal that even with relatively low decoding accuracies, successful robot learning of subtask assignment is feasible, with around 80% choice accuracy among four subtasks within 17 minutes of collaboration. The simulation results further reveal that scaling to more subtasks is feasible and is accompanied mainly by longer robot learning times. These findings demonstrate the usability of EEG-based neuro-cognitive measures to mediate the complex and largely unsolved problem of human-robot collaborative task planning.
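
The idea of learning a subtask assignment from an unreliable decoded feedback signal can be illustrated with a toy simulation. This is a minimal sketch and not the algorithm from the paper: it assumes a single hidden human preference, a binary "agree/disagree" feedback that the decoder flips with probability `1 - decoder_accuracy`, and an epsilon-greedy learner with sample-average value estimates. The function name and all parameters are hypothetical.

```python
import random


def simulate_assignment_learning(n_subtasks=4, decoder_accuracy=0.7,
                                 n_trials=2000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy learner driven by noisy binary feedback (illustrative only)."""
    rng = random.Random(seed)
    preferred = rng.randrange(n_subtasks)   # hidden human choice (assumption)
    value = [0.0] * n_subtasks              # running mean of decoded feedback
    pulls = [0] * n_subtasks
    n_correct = 0
    for _ in range(n_trials):
        if rng.random() < epsilon:          # explore
            choice = rng.randrange(n_subtasks)
        else:                               # exploit current estimates
            choice = max(range(n_subtasks), key=lambda a: value[a])
        true_feedback = 1.0 if choice == preferred else -1.0
        # The decoder reports the wrong label with probability 1 - decoder_accuracy.
        if rng.random() < decoder_accuracy:
            decoded = true_feedback
        else:
            decoded = -true_feedback
        pulls[choice] += 1
        value[choice] += (decoded - value[choice]) / pulls[choice]
        n_correct += int(choice == preferred)
    return preferred, value, n_correct / n_trials


if __name__ == "__main__":
    preferred, value, accuracy = simulate_assignment_learning()
    print("hidden preference:", preferred)
    print("learned values:", [round(v, 2) for v in value])
    print("choice accuracy:", round(accuracy, 3))
```

With a decoder that is right more often than wrong, the expected feedback is positive only for the preferred subtask, so averaging over repeated trials separates it from the others. Lower decoding accuracy or more subtasks mainly means more trials are needed, which matches the qualitative claim in the abstract.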

    Interfaces for Modular Surgical Planning and Assistance Systems

    Modern surgery of the 21st century relies in many aspects on computers or, in a wider sense, digital data processing. Department administration, OR scheduling, billing, and, increasingly, patient data management are performed with the aid of so-called Surgical Information Systems (SIS) or, more generally, Hospital Information Systems (HIS). Computer Assisted Surgery (CAS) summarizes techniques which assist a surgeon in the preparation and conduct of surgical interventions. Today still predominantly based on radiology images, these techniques include the preoperative determination of an optimal surgical strategy and intraoperative systems which aim at increasing the accuracy of surgical manipulations. CAS is a relatively young field of computer science. One of the unsolved "teething troubles" of CAS is the absence of technical standards for the interconnectivity of CAS systems. Current CAS systems are usually "islands of information" with no connection to other devices within the operating room or to hospital-wide information systems. Several workshop reports and individual publications point out that this situation leads to ergonomic, logistic, and economic limitations in hospital work. Perioperative processes are prolonged by the manual installation and configuration of an increasing number of technical devices. Intraoperatively, a large part of the surgeon's attention is absorbed by the requirement to monitor and operate systems. The need for open infrastructures which enable the integration of CAS devices from different vendors, in order to exchange information as well as commands among these devices through a network, has been identified by numerous experts with backgrounds in medicine as well as engineering. This thesis contains two approaches to the integration of CAS systems: - For perioperative data exchange, the specification of new data structures as an amendment to the existing DICOM standard for radiology image management is presented.
The extension of DICOM towards surgical application allows for the seamless integration of surgical planning and reporting systems into DICOM-based Picture Archiving and Communication Systems (PACS) as they are installed in most hospitals for the exchange and long-term archival of patient images and image-related patient data. - For the integration of intraoperatively used CAS devices, such as navigation systems, video image sources, or biosensors, the concept of a surgical middleware is presented. A C++ class library, the TiCoLi, is presented which facilitates the configuration of ad-hoc networks among the modules of a distributed CAS system as well as the exchange of data streams, singular data objects, and commands between these modules. The TiCoLi is the first software library for a surgical field of application to implement all of these services. To demonstrate the suitability of the presented specifications and their implementation, two modular CAS applications are presented which utilize the proposed DICOM extensions for perioperative exchange of surgical planning data as well as the TiCoLi for establishing an intraoperative network of autonomous, yet not independent, CAS modules.
Modern high-performance surgery of the 21st century depends in many ways on computers or, in a broader sense, on digital data processing. Administrative processes, such as scheduling the available technical, spatial, and personnel resources, billing, and, increasingly, the management and archiving of patient data, are carried out economically and efficiently with the help of digital information systems. Within hospital information systems (KIS in German, HIS in English), dedicated information systems are often available for the particular needs of individual departments.
Surgical information systems (CIS in German, SIS in English) primarily cover operation scheduling and inventory management for specifically surgical consumables. While HIS and SIS serve mainly to optimize administrative tasks, the systems of computer-assisted surgery (CAS) serve surgical treatment planning and therapy much more directly. CAS uses methods from robotics, digital image and signal processing, artificial intelligence, and numerical simulation, to name only a few, for patient-specific treatment planning and for intraoperative support of the OR team, above all the surgeon. Above all, advances in the spatial tracking of tools and patients, the availability of three-dimensional radiological images (CT, MRI, ...), and the use of various robotic systems have enabled the highly publicized entry of the computer into the operating room over the past decades. Less prominent, but by no means of lesser practical value, are examples of automated monitoring of clinical measurements such as blood pressure or oxygen saturation. In contrast to the mostly highly distributed and well-interwoven information systems for hospital administration and patient data management, today's CAS systems are usually poorly connected, or not connected at all, to one another and to back-end data stores. A number of scientific publications and interdisciplinary workshops over the past one to two decades have addressed the problems of the everyday use of CAS systems. With increasing emphasis, they have pointed to the lack of infrastructure for networking intraoperatively used CAS systems with one another and with the perioperatively used planning, documentation, and archiving systems.
The resulting negative effects on the efficiency of perioperative workflows (every device must be started manually and fed with the specific data of the next patient), together with the growing attention that the surgeon and the team must devote to monitoring and operating the individual devices, are regarded as one of the "teething troubles" of this relatively young technology and stand in the way of adoption beyond a committed, technophile user group. This thesis presents two approaches to the integration of CAS systems that can be operated in parallel (though, for the sake of interface compatibility, not entirely independently of one another). - For perioperative data exchange, the specification of additional data structures for the transfer of surgical planning data within the DICOM standard, which is widely used in radiological image processing systems, is proposed and demonstrated with two examples. Extending the DICOM standard for perioperative use enables the seamless integration of surgical planning systems into existing "Picture Archiving and Communication Systems" (PACS), which in most cases are based on, or at least compatible with, the DICOM standard. This accounts for the fact that patient-specific surgical planning relies heavily on radiological images, and it ensures that planning results are archived long-term in accordance with applicable regulations and protected against unauthorized access; PACS servers already provide well-proven solutions here. - For the integration of intraoperative CAS systems, such as navigation systems, video image sources, or sensors for monitoring vital parameters, the concept of a "surgical middleware" is presented.
Under the name TiCoLi, a C++ class library was developed that facilitates the configuration of ad-hoc networks during OR preparation by means of plug-and-play mechanisms. Once configured, the TiCoLi enables the exchange of continuous data streams as well as individual data packets and commands between the modules of a distributed CAS application over an Ethernet-based network. The TiCoLi is the first freely available class library that combines these functions specifically for use in a surgical environment. To demonstrate the suitability of the presented specifications and their implementations, two modular CAS applications are presented that use the proposed DICOM extensions for the perioperative exchange of planning results and the TiCoLi for the intraoperative exchange of measurement data under near-real-time requirements.

    Closed-form Continuous-Depth Models

    Continuous-depth neural models, where the derivative of the model's hidden state is defined by a neural network, have enabled strong sequential data processing capabilities. However, these models rely on advanced numerical differential equation (DE) solvers, resulting in significant overhead in terms of both computational cost and model complexity. In this paper, we present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster while exhibiting equally strong modeling abilities compared to their ODE-based counterparts. The models are derived from the analytical closed-form solution of an expressive subset of time-continuous models, thus alleviating the need for complex DE solvers altogether. In our experimental evaluations, we demonstrate that CfC networks outperform advanced recurrent models over a diverse set of time-series prediction tasks, including those with long-term dependencies and irregularly sampled data. We believe our findings open new opportunities to train and deploy rich, continuous neural models in resource-constrained settings, which demand both performance and efficiency. Comment: 17 pages

    Multimodal Sensing and Data Processing for Speaker and Emotion Recognition using Deep Learning Models with Audio, Video and Biomedical Sensors

    The focus of the thesis is on Deep Learning methods and their applications on multimodal data, with a potential to explore the associations between modalities and replace missing and corrupt ones if necessary. We have chosen two important real-world applications that need to deal with multimodal data: 1) Speaker recognition and identification; 2) Facial expression recognition and emotion detection. The first part of our work assesses the effectiveness of speech-related sensory data modalities and their combinations in speaker recognition using deep learning models. First, the role of electromyography (EMG) is highlighted as a unique biometric sensor in improving audio-visual speaker recognition or as a substitute in noisy or poorly-lit environments. Secondly, the effectiveness of deep learning is empirically confirmed through its higher robustness to all types of features in comparison to a number of commonly used baseline classifiers. Not only do deep models outperform the baseline methods, their power increases when they integrate multiple modalities, as different modalities contain information on different aspects of the data, especially between EMG and audio. Interestingly, our deep learning approach is word-independent. Plus, the EMG, audio, and visual parts of the samples from each speaker do not need to match. This increases the flexibility of our method in using multimodal data, particularly if one or more modalities are missing. With a dataset of 23 individuals speaking 22 words five times, we show that EMG can replace the audio/visual modalities, and when combined, significantly improve the accuracy of speaker recognition. The second part describes a study on automated emotion recognition using four different modalities – audio, video, electromyography (EMG), and electroencephalography (EEG). We collected a dataset by recording the 4 modalities as 12 human subjects expressed six different emotions or maintained a neutral expression. 
Three different aspects of emotion recognition were investigated: model selection, feature selection, and data selection. Both generative models (DBNs) and discriminative models (LSTMs) were applied to the four modalities, and from these analyses we conclude that LSTM is better for audio and video together with their corresponding sophisticated feature extractors (MFCC and CNN), whereas DBN is better for both EMG and EEG. By examining these signals at different stages (pre-speech, during-speech, and post-speech) of the current and following trials, we have found that the most effective stages for emotion recognition from EEG occur after the emotion has been expressed, suggesting that the neural signals conveying an emotion are long-lasting.

    Haptics Rendering and Applications

    There has been significant progress in haptic technologies, but the incorporation of haptics into virtual environments is still in its infancy. A wide range of the new society's human activities, including communication, education, art, entertainment, commerce, and science, would forever change if we learned how to capture, manipulate, and reproduce haptic sensory stimuli that are nearly indistinguishable from reality. For the field to move forward, many commercial and technological barriers need to be overcome. By rendering how objects feel through haptic technology, we communicate information that might reflect a desire to speak a physically based language that has never been explored before. Due to constant improvement in haptics technology and increasing levels of research into and development of haptics-related algorithms, protocols, and devices, there is a belief that haptics technology has a promising future.

    Robots learn to behave: improving human-robot collaboration in flexible manufacturing applications

    The abstract is in the attachment.

    ISMCR 1994: Topical Workshop on Virtual Reality. Proceedings of the Fourth International Symposium on Measurement and Control in Robotics

    This symposium on measurement and control in robotics included sessions on: (1) rendering, including tactile perception and applied virtual reality; (2) applications in simulated medical procedures and telerobotics; (3) tracking sensors in a virtual environment; (4) displays for virtual reality applications; (5) sensory feedback, including a virtual environment application with partial gravity simulation; and (6) applications in education, entertainment, technical writing, and animation.

    Liquid Time-constant Networks

    We introduce a new class of time-continuous recurrent neural network models. Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems modulated via nonlinear interlinked gates. The resulting models represent dynamical systems with varying (i.e., liquid) time-constants coupled to their hidden state, with outputs being computed by numerical differential equation solvers. These neural networks exhibit stable and bounded behavior, yield superior expressivity within the family of neural ordinary differential equations, and give rise to improved performance on time-series prediction tasks. To demonstrate these properties, we first take a theoretical approach to find bounds over their dynamics and compute their expressive power by the trajectory length measure in latent trajectory space. We then conduct a series of time-series prediction experiments to demonstrate the approximation capability of Liquid Time-Constant Networks (LTCs) compared to classical and modern RNNs. Code and data are available at https://github.com/raminmh/liquid_time_constant_networks. Comment: Accepted to the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)
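
A linear first-order system modulated by a nonlinear gate, as described above, can be sketched for a single neuron. The sketch assumes the commonly stated LTC form dx/dt = -(1/tau + f) * x + f * A, where f is a sigmoid gate of the state and the input. It advances the state with a semi-implicit Euler step, chosen here as one simple solver and not necessarily the one used by the authors. The function names and weight values are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ltc_step(x, inp, dt, tau, A, w_x, w_i, b):
    """One semi-implicit Euler step of dx/dt = -(1/tau + f) * x + f * A."""
    f = sigmoid(w_x * x + w_i * inp + b)   # gate in (0, 1), depends on state and input
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))


def run_ltc(inputs, dt=0.1, tau=1.0, A=1.0, w_x=0.5, w_i=2.0, b=0.0, x0=0.0):
    """Unroll the single-neuron model over an input sequence."""
    x = x0
    trajectory = []
    for inp in inputs:
        x = ltc_step(x, inp, dt, tau, A, w_x, w_i, b)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    signal = [3.0 * math.sin(0.3 * k) for k in range(200)]
    states = run_ltc(signal)
    print("min state:", min(states), "max state:", max(states))
```

Because the gate multiplies both the decay term and the drive term, the effective time constant tau / (1 + tau * f) changes with the input, which is the "liquid" behavior. In this sketch, a state started at 0 with A > 0 stays within [0, A] at every step, a small-scale illustration of the bounded behavior the abstract mentions.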

    Closed-form continuous-time neural networks

    Continuous-time neural networks are a class of machine learning systems that can tackle representation learning on spatiotemporal decision-making tasks. These models are typically represented by continuous differential equations. However, their expressive power when they are deployed on computers is bottlenecked by numerical differential equation solvers. This limitation has notably slowed down the scaling and understanding of numerous natural physical phenomena such as the dynamics of nervous systems. Ideally, we would circumvent this bottleneck by solving the given dynamical system in closed form. This is known to be intractable in general. Here, we show that it is possible to closely approximate the interaction between neurons and synapses (the building blocks of natural and artificial neural networks) constructed by liquid time-constant networks efficiently in closed form. To this end, we compute a tightly bounded approximation of the solution of an integral appearing in liquid time-constant dynamics that has had no known closed-form solution so far. This closed-form solution impacts the design of continuous-time and continuous-depth neural models. For instance, since time appears explicitly in closed form, the formulation relaxes the need for complex numerical solvers. Consequently, we obtain models that are between one and five orders of magnitude faster in training and inference compared with differential equation-based counterparts. More importantly, in contrast to ordinary differential equation-based continuous networks, closed-form networks can scale remarkably well compared with other deep learning instances. Lastly, as these models are derived from liquid networks, they show good performance in time-series modelling compared with advanced recurrent neural network models.
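
Because time appears explicitly, a closed-form cell can be evaluated at any elapsed time without a solver loop. The sketch below assumes the gated form usually given for CfC units, x(t) = sigmoid(-f * t) * g + (1 - sigmoid(-f * t)) * h. Here f, g, and h are reduced to single affine maps of the state and input, whereas the published models use neural network heads. All names and weights are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cfc_unit(x, inp, t, p):
    """Closed-form update of one unit; t is the elapsed time since the last sample."""
    f = p["wf_x"] * x + p["wf_i"] * inp + p["bf"]              # time-gate pre-activation
    g = math.tanh(p["wg_x"] * x + p["wg_i"] * inp + p["bg"])   # dominates for small f * t
    h = math.tanh(p["wh_x"] * x + p["wh_i"] * inp + p["bh"])   # dominates for large f * t
    gate = sigmoid(-f * t)
    return gate * g + (1.0 - gate) * h


def run_cfc(inputs, elapsed, p, x0=0.0):
    """Process an irregularly sampled sequence: one closed-form evaluation per sample."""
    x = x0
    states = []
    for inp, dt in zip(inputs, elapsed):
        x = cfc_unit(x, inp, dt, p)
        states.append(x)
    return states


if __name__ == "__main__":
    params = dict(wf_x=0.0, wf_i=0.0, bf=1.0,
                  wg_x=0.0, wg_i=1.0, bg=0.0,
                  wh_x=0.0, wh_i=-1.0, bh=0.0)
    print(run_cfc([0.5, -1.0, 2.0], [0.1, 2.5, 0.03], params))
```

Each sample costs one function evaluation regardless of the gap between timestamps, whereas a solver-based model would need more integration steps for longer gaps. That difference is the source of the speed-up the abstract describes.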