
    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? To answer these questions properly and efficiently, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models through sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling.

    Multi-sensor fusion for human-robot interaction in crowded environments

    Robot assistants are becoming a promising solution to challenges associated with the ageing population. Human-Robot Interaction (HRI) allows a robot to understand the intention of humans in an environment and react accordingly. This thesis proposes HRI techniques to facilitate the transition of robots from lab-based research to real-world environments. The HRI aspects addressed in this thesis are illustrated in the following scenario: an elderly person, engaged in conversation with friends, wishes to attract a robot's attention. This composite task consists of many problems. The robot must detect and track the subject in a crowded environment. To engage with the user, it must track their hand movement. Knowledge of the subject's gaze would ensure that the robot does not react to the wrong person. Understanding the subject's group participation would enable the robot to respect existing human-human interaction. Many existing solutions to these problems are too constrained for natural HRI in crowded environments. Some require initial calibration or static backgrounds. Others deal poorly with occlusions, illumination changes, or real-time operation requirements. This work proposes algorithms that fuse multiple sensors to remove these restrictions and improve accuracy over the state of the art. The main contributions of this thesis are: a hand and body detection method, with a probabilistic algorithm for their real-time association when multiple users and hands are detected in crowded environments; an RGB-D sensor-fusion hand tracker, which increases position and velocity accuracy by combining a depth-image-based hand detector with Monte-Carlo updates using colour images; a sensor-fusion gaze estimation system, combining IR and depth cameras on a mobile robot to give better accuracy than traditional visual methods, without the constraints of traditional IR techniques; and a group detection method, based on sociological concepts of static and dynamic interactions, which incorporates real-time gaze estimates to enhance detection accuracy.
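
    As a concrete illustration of the sensor-fusion idea behind the hand tracker, the sketch below implements a minimal particle filter that fuses a depth-based hand detection with a colour-image likelihood. This is an assumed reconstruction for illustration only, not the thesis code; the state layout, noise levels and the colour_likelihood placeholder are all invented.

```python
# Illustrative sketch (not the thesis code): a particle filter that fuses a
# depth-based hand detection with a colour-image likelihood, in the spirit of
# the RGB-D sensor-fusion tracker described above.
import numpy as np

rng = np.random.default_rng(0)

def colour_likelihood(xy):
    """Placeholder: score skin-colour evidence at image position xy in [0, 1]."""
    return np.exp(-0.5 * np.sum((xy - 60.0) ** 2) / 400.0)

def track_step(particles, weights, depth_detection, dt=1.0):
    """One predict/update/resample cycle. State per particle: [x, y, vx, vy]."""
    # Predict: constant-velocity motion model plus Gaussian process noise.
    particles[:, :2] += particles[:, 2:] * dt
    particles += rng.normal(0.0, [2.0, 2.0, 1.0, 1.0], particles.shape)
    # Update: combine depth-detector proximity with colour evidence.
    d2 = np.sum((particles[:, :2] - depth_detection) ** 2, axis=1)
    weights = np.exp(-0.5 * d2 / 25.0)
    weights *= np.array([colour_likelihood(p[:2]) for p in particles])
    weights /= weights.sum() + 1e-12
    # Resample (systematic) to focus particles on high-likelihood regions.
    n = len(weights)
    u = (rng.random() + np.arange(n)) / n
    idx = np.minimum(np.searchsorted(np.cumsum(weights), u), n - 1)
    return particles[idx], np.full(n, 1.0 / n)

particles = rng.normal([60, 60, 0, 0], [10, 10, 2, 2], (500, 4))
weights = np.full(500, 1.0 / 500)
particles, weights = track_step(particles, weights, np.array([62.0, 58.0]))
print("position estimate:", particles[:, :2].mean(axis=0))
```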

    Affective Computing

    This book provides an overview of state-of-the-art research in Affective Computing. It presents new ideas, original results and practical experiences in this increasingly important research field. The book consists of 23 chapters categorized into four sections. Since one of the most important means of human communication is facial expression, the first section of this book (Chapters 1 to 7) presents research on the synthesis and recognition of facial expressions. Given that we use not only the face but also body movements to express ourselves, the second section (Chapters 8 to 11) presents research on the perception and generation of emotional expressions using full-body motion. The third section of the book (Chapters 12 to 16) presents computational models of emotion, as well as findings from neuroscience research. The last section of the book (Chapters 17 to 22) presents applications related to affective computing.

    Audio-coupled video content understanding of unconstrained video sequences

    Unconstrained video understanding is a difficult task. The main aim of this thesis is to recognise the nature of objects, activities and environment in a given video clip using both audio and video information. Traditionally, audio and video information have not been applied together to solve such a complex task, and for the first time we propose, develop, implement and test a new framework of multi-modal (audio and video) data analysis for context understanding and labelling of unconstrained videos. The framework relies on feature selection techniques and introduces a novel algorithm (PCFS) that is faster than the well-established SFFS algorithm. We use the framework to study the benefits of combining audio and video information in a number of different problems. We begin by developing two independent content recognition modules. The first is based on image sequence analysis alone, and uses a range of colour, shape, texture and statistical features from image regions with a trained classifier to recognise the identity of objects, activities and environment present. The second module uses audio information only, and recognises activities and environment. Both of these approaches are preceded by detailed pre-processing to ensure that correct video segments containing both audio and video content are present, and that the developed system can be made robust to changes in camera movement, illumination, random object behaviour, etc. For both audio and video analysis, we use a hierarchical approach of multi-stage classification, so that difficult classification tasks can be decomposed into simpler and smaller tasks. When combining both modalities, we compare fusion techniques at different levels of integration and propose a novel algorithm that combines the advantages of both feature- and decision-level fusion. The analysis is evaluated on a large amount of test data comprising unconstrained videos collected for this work. Finally, we propose a decision correction algorithm which shows that further steps towards effectively combining multi-modal classification information with semantic knowledge generate the best possible results.
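
    To make the fusion terminology concrete, the sketch below shows plain decision-level (late) fusion, where per-class posteriors from independent audio and video classifiers are combined by a weighted sum. This is only the textbook baseline that the thesis builds on; the proposed PCFS algorithm and the hybrid feature/decision-level method are not reproduced here, and the weights and class labels are invented for the demo.

```python
# Illustrative sketch of decision-level fusion: combining per-class posteriors
# from independent audio and video classifiers with a weighted sum.
import numpy as np

def fuse_decisions(p_audio, p_video, w_audio=0.4, w_video=0.6):
    """Weighted-sum (late) fusion of two posterior distributions."""
    fused = w_audio * np.asarray(p_audio) + w_video * np.asarray(p_video)
    return fused / fused.sum()

classes = ["indoor", "street", "beach"]   # invented environment labels
p_audio = [0.2, 0.7, 0.1]   # audio module: traffic noise suggests "street"
p_video = [0.1, 0.5, 0.4]   # video module: agrees, with some beach evidence
fused = fuse_decisions(p_audio, p_video)
print(classes[int(np.argmax(fused))], fused.round(3))
```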

    Robust 3D IMU-LIDAR Calibration and Multi Sensor Probabilistic State Estimation

    Autonomous robots are highly complex systems. In order to operate in dynamic environments, adaptability in their decision-making algorithms is a must. Thus, the internal and external information that robots obtain from sensors is critical for re-evaluating their decisions in real time. Accuracy is key in this endeavor, from both the hardware side and the modeling point of view. To guarantee the highest performance, sensors need to be correctly calibrated. To this end, some parameters are tuned so that the particular realization of a sensor best matches a generalized mathematical model. This step grows in complexity with the integration of multiple sensors, which is generally a requirement for coping with the dynamic nature of real-world applications. This project deals with the calibration of an inertial measurement unit (IMU) and a Light Detection and Ranging device (LiDAR). An offline batch optimization procedure is proposed to optimally estimate the intrinsic and extrinsic parameters of the model. Then, an online state estimation module is proposed that makes use of the aforementioned parameters and the fusion of LiDAR-inertial data for local navigation. Additionally, it incorporates real-time corrections to account for the time-varying nature of the model, essential for coping with continued operation and wear and tear.
    Keywords: sensor fusion, multi-sensor calibration, factor graphs, batch optimization, Gaussian Processes, state estimation, LiDAR-inertial odometry, Error State Kalman Filter, Normal Distributions Transform
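
    As a hedged illustration of one step such a batch calibration might involve, the sketch below recovers the IMU-to-LiDAR extrinsic rotation by aligning angular-velocity samples observed by both sensors, using the closed-form Kabsch/SVD solution. A real pipeline would jointly estimate translation, sensor biases and time offset in a factor graph; everything here, including the synthetic data, is an assumption for demonstration.

```python
# Illustrative sketch (assumed pipeline, not the project's code): closed-form
# estimate of the IMU-to-LiDAR extrinsic rotation from paired angular
# velocities, via the Kabsch/SVD method.
import numpy as np

def extrinsic_rotation(omega_imu, omega_lidar):
    """Least-squares rotation R such that omega_lidar ≈ R @ omega_imu."""
    H = omega_imu.T @ omega_lidar          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    return Vt.T @ D @ U.T                  # proper rotation (det = +1)

# Synthetic check: simulate gyro samples and rotate them by a known extrinsic.
rng = np.random.default_rng(1)
R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R_true *= np.sign(np.linalg.det(R_true))   # force det = +1
w_imu = rng.normal(size=(200, 3))
w_lidar = w_imu @ R_true.T + rng.normal(0.0, 0.01, (200, 3))  # noisy LiDAR rates
R_est = extrinsic_rotation(w_imu, w_lidar)
print("rotation error:", np.linalg.norm(R_est - R_true))
```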

    Computer-Assisted Planning and Robotics in Epilepsy Surgery

    Epilepsy is a severe and devastating condition that affects ~1% of the population. Around 30% of these patients are drug-refractory. Epilepsy surgery may provide a cure in selected individuals with drug-resistant focal epilepsy if the epileptogenic zone can be identified and safely resected or ablated. Stereoelectroencephalography (SEEG) is a diagnostic procedure performed to aid in the delineation of the seizure onset zone when non-invasive investigations are insufficiently informative or discordant. Utilizing a multi-modal imaging platform, a novel computer-assisted planning (CAP) algorithm was adapted, applied and clinically validated for optimizing safe SEEG trajectory planning. In an initial retrospective validation study, 13 patients with 116 electrodes were enrolled and safety parameters were compared between automated CAP trajectories and expert manual plans. The automated CAP trajectories returned statistically significant improvements in all of the compared clinical metrics, including the overall risk score (CAP 0.57 +/- 0.39 (mean +/- SD) and manual 1.00 +/- 0.60, p < 0.001). Assessment of inter-rater variability revealed no difference in external expert surgeon ratings. Both manual and CAP electrodes were rated as feasible in 42.8% (42/98) of cases. CAP was able to provide a feasible electrode in 19.4% (19/98) of cases where the manual method was not feasible, whereas manual planning generated a feasible electrode in 26.5% (26/98) of cases where the alternative generation method was not feasible. Based on the encouraging results from the retrospective analysis, a prospective validation study including an additional 125 electrodes in 13 patients was then undertaken to compare CAP with expert manual plans from two neurosurgeons. The manual plans were performed separately and blind to the CAP trajectories. Computer-generated trajectories were found to carry lower risk scores (absolute difference of 0.04 mm (95% CI -0.42 to 0.01), p = 0.04) and were subsequently implanted in all cases without complication. The pipeline has been fully integrated into the clinical service and has now replaced manual SEEG planning at our institution. Further efforts were then focused on distilling optimal entry and target points for common SEEG trajectories and on applying machine learning methods to develop an active learning algorithm that adapts to individual surgeon preferences. Thirty-two patients were prospectively enrolled in the study. The first 12 patients underwent prospective CAP planning and implantation following the pipeline outlined in the previous study. These patients were used as a training set, and all 108 successfully implanted electrodes were normalized to atlas space to generate ‘spatial priors’ using a K-Nearest Neighbour (K-NN) classifier. A subsequent test set of 20 patients (210 electrodes) was then used to prospectively validate the spatial priors. From the test set, 78% (123/157) of the implanted trajectories passed through both the entry and target spatial priors defined from the training set. To improve the generalizability of the spatial priors to other neurosurgical centres undertaking SEEG, and to take into account the potential for changing institutional practices, an active learning algorithm was implemented. The K-NN classifier was shown to dynamically learn and refine the spatial priors.
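
    A minimal sketch of how such a K-NN spatial prior might be queried and updated is given below. It is illustrative only: the atlas coordinates, the acceptance rule (mean distance to the k nearest previously implanted points) and the 10 mm threshold are invented, not taken from the clinical system.

```python
# Illustrative sketch (assumptions, not the clinical code): checking whether a
# proposed SEEG entry point in atlas space falls within a K-NN "spatial prior"
# learned from previously implanted electrodes of the same trajectory type.
import numpy as np

class SpatialPrior:
    def __init__(self, implanted_points, k=5, threshold_mm=10.0):
        self.points = np.asarray(implanted_points)  # prior points (atlas, mm)
        self.k = k
        self.threshold_mm = threshold_mm

    def accepts(self, candidate):
        """True if the mean distance to the k nearest prior points is small."""
        d = np.linalg.norm(self.points - np.asarray(candidate), axis=1)
        return np.sort(d)[: self.k].mean() <= self.threshold_mm

    def update(self, implanted):
        """Active-learning step: fold a newly implanted point into the prior."""
        self.points = np.vstack([self.points, implanted])

# Invented cluster of 108 prior entry points around an arbitrary atlas location.
prior = SpatialPrior(np.random.default_rng(2).normal([30, -60, 10], 4, (108, 3)))
print(prior.accepts([31.0, -58.0, 9.0]))   # near the cluster -> True
print(prior.accepts([0.0, 0.0, 0.0]))      # far away -> False
```
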
The progressive refinement of CAP SEEG planning outlined in this and previous studies has culminated in an algorithm that not only optimizes the surgical heuristics and risk scores related to SEEG planning but can also learn from previous experience. Overall, safe and feasible trajectory schemes were returned in 30% of the time required for manual SEEG planning. Computer-assisted planning was then applied to optimize laser interstitial thermal therapy (LITT) trajectory planning, a minimally invasive alternative to open mesial temporal resection, focal lesion ablation and anterior two-thirds corpus callosotomy. We describe and validate the first CAP algorithm for mesial temporal LITT ablations for epilepsy treatment. Twenty-five patients who had previously undergone LITT ablation at a single institution, with a median follow-up of 2 years, were included. Trajectory parameters for the CAP algorithm were derived from expert consensus to maximize the distance from vasculature and the ablation of the amygdalohippocampal complex, and to minimize collateral damage to adjacent brain structures whilst avoiding transgression of the ventricles and sulci. Trajectory parameters were also optimized to reduce the drilling angle to the skull and the overall catheter length. Simulated cavities attributable to the CAP trajectories were calculated using a 5-15 mm ablation diameter. In comparison to manually planned and implemented LITT trajectories, CAP resulted in a significant increase in the percentage ablation of the amygdalohippocampal complex (manual 57.82 +/- 15.05% (mean +/- S.D.)) and a significant reduction in the unablated medial hippocampal head depth (manual 4.45 +/- 1.58 mm (mean +/- S.D.), CAP 1.19 +/- 1.37 mm (mean +/- S.D.), p = 0.0001). As LITT ablation of the mesial temporal structures is a novel procedure, there are no established standards for trajectory planning. A data-driven machine learning approach was therefore applied to identify hitherto unknown CAP trajectory parameter combinations. All possible combinations of planning parameters were calculated, culminating in 720 unique combinations per patient. Linear regression and random forest machine learning algorithms were trained on half of the data set (3800 trajectories) and tested on the remaining unseen trajectories (3800 trajectories). Both methods returned good predictive accuracy, with Pearson correlations of ρ = 0.7 and root mean squared errors of 0.13 and 0.12, respectively. The machine learning algorithm revealed that the optimal entry points were centred over the junction of the inferior occipital, middle temporal and middle occipital gyri, and that the optimal target points were anterior and medial translations of the centre of the amygdala. A large multicentre external validation study of 95 patients was then undertaken, comparing the manually planned and implemented trajectories, CAP trajectories targeting the centre of the amygdala, CAP trajectories using the parameters derived from expert consensus, and CAP trajectories using the machine-learning-derived parameters. Three external blinded expert surgeons were then selected to undertake feasibility ratings and preference rankings of the trajectories. CAP-generated trajectories resulted in a significant improvement in many of the planning metrics, notably the risk score (manual 1.3 +/- 0.1 (mean +/- S.D.), CAP 1.1 +/- 0.2 (mean +/- S.D.), p<0.000) and the overall ablation of the amygdala (manual 45.3 +/- 22.2% (mean +/- S.D.), CAP 64.2 +/- 20% (mean +/- S.D.), p<0.000).
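
    The parameter-search experiment can be illustrated with the hedged sketch below: a random forest is trained to predict a trajectory risk score from planning-parameter combinations and evaluated on held-out combinations, mirroring the half/half split described above. The features, the synthetic risk model and all numbers are invented for the demo; the real study reported Pearson correlations of ρ = 0.7.

```python
# Illustrative sketch of the parameter-search idea: train a random forest to
# predict a trajectory risk score from planning-parameter combinations, then
# evaluate it on unseen combinations. All data here are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
# Invented columns: [vessel-distance weight, drilling-angle weight, length weight]
X = rng.uniform(0.0, 1.0, (7600, 3))
# Synthetic "risk": lower when vessel distance is weighted heavily, plus noise.
y = 1.5 - 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0.0, 0.1, len(X))

# Half/half split, echoing the 3800-train / 3800-test design described above.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
rho, _ = pearsonr(model.predict(X_te), y_te)
print(f"Pearson rho on held-out trajectories: {rho:.2f}")
```
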
Blinded external feasibility ratings revealed that manual trajectories were less preferable than CAP-planned trajectories, with an estimated probability of being ranked 4th (lowest) of 0.62. Traditional open corpus callosotomy requires a midline craniotomy, interhemispheric dissection and disconnection of the rostrum, genu and body of the corpus callosum. In cases where drop attacks persist, a completion corpus callosotomy is then performed to disrupt the remaining fibres in the splenium. The emergence of LITT technology has raised the possibility of undertaking this procedure in a minimally invasive fashion, without the need for a craniotomy, using two or three individual trajectories. Early case series have shown LITT anterior two-thirds corpus callosotomy to be safe and efficacious. Whole-brain probabilistic tractography connectomes were generated utilizing 3-Tesla multi-shell imaging data and constrained spherical deconvolution (CSD). Two independent blinded expert neurosurgeons with experience of performing the procedure using LITT then planned the trajectories in each patient following their current clinical practice. Automated trajectories returned a significant reduction in the risk score (manual 1.3 +/- 0.1 (mean +/- S.D.), CAP 1.1 +/- 0.1 (mean +/- S.D.), p<0.000). Finally, we investigate the different methods of surgical implantation for SEEG electrodes. As an initial study, a systematic review and meta-analysis of the literature to date were performed. This revealed that a wide variety of implantation methods, including traditional frame-based, frameless, robotic and custom 3D-printed jigs, were being used in clinical practice. Of concern, all comparative reports from institutions that had changed from one implantation method to another, such as following the introduction of robotic systems, did not undertake parallel-group comparisons. This suggests that patients may have been exposed to risks associated with learning curves and potential harms related to the new device until its efficacy was known. A pragmatic randomized controlled trial of a novel non-CE-marked robotic trajectory guidance system (iSYS1) was then devised. Before clinical implantations began, a series of pre-clinical investigations utilizing 3D-printed phantom heads from previously implanted patients was performed to provide pilot data and to assess the surgical learning curve. The surgeons had comparatively little clinical experience with the new robotic device, which replicates the introduction of such novel technologies into clinical practice. The study confirmed that the learning curve with the iSYS1 device was minimal and that the accuracies and workflow were similar to those of the conventional manual method. The randomized controlled trial is the first of its kind for stereotactic neurosurgical procedures. Thirty-two patients were enrolled, with 16 patients randomized to the iSYS1 intervention arm and 16 patients to the manual implantation arm. The intervention allocation was concealed from the patients; the surgical and research teams could not be blinded. Trial management, independent data monitoring and trial steering committees were convened at four points during the trial (after every 8 patients implanted). Because both methods require a high level of accuracy, the main distinguishing factor would be the time taken to achieve alignment with the prespecified trajectory. The primary outcome for comparison, therefore, was the time for individual SEEG electrode implantation.
Secondary outcomes included the implantation accuracy derived from the post-operative CT scan, and the rates of infection, intracranial haemorrhage and neurological deficit. Overall, 32 patients (328 electrodes) completed the trial (16 in each intervention arm), and the baseline demographics were broadly similar between the two groups. The time for individual electrode implantation was significantly less with the iSYS1 device (median of 3.36 minutes (95% CI 5.72 to 7.07)) than with the PAD (median of 9.06 minutes (95% CI 8.16 to 10.06), p=0.0001). Target point accuracy was significantly greater with the PAD (median of 1.58 mm (95% CI 1.38 to 1.82)) compared with the iSYS1 (median of 1.16 mm (95% CI 1.01 to 1.33), p=0.004). The difference between the target point accuracies is not clinically significant for SEEG but may have implications for procedures, such as deep brain stimulation, that require higher placement accuracy. All of the electrodes achieved their respective intended anatomical targets. In 12 of 16 patients following robotic implantation, and in 10 of 16 following manual PAD implantation, a seizure onset zone was identified and resection recommended. The aforementioned systematic review and meta-analysis were updated to include additional studies published during the trial. In this context, the iSYS1 device entry and target point accuracies were similar to those reported in other published studies of robotic devices, including the ROSA, Neuromate and iSYS1. The PAD accuracies, however, outperformed the previously published results for other frameless stereotaxy methods. In conclusion, the presented studies report the integration and validation of a complex clinical decision-support system into the clinical neurosurgical workflow for SEEG planning. The stereotactic planning platform was further refined by integrating machine learning techniques, and was extended to the optimisation of LITT trajectories for ablation of mesial temporal structures and corpus callosotomy. The platform was then seamlessly integrated with a novel trajectory guidance system to effectively and safely guide the implantation of the SEEG electrodes. Through a single-blinded randomised controlled trial, the iSYS1 device was shown to reduce the time taken for individual electrode insertion. Taken together, this work presents and validates the first fully integrated stereotactic trajectory planning platform that can be used for both SEEG and LITT trajectory planning, followed by surgical implantation through the use of a novel trajectory guidance system.