
    A Framework for Tumor Localization in Robot-Assisted Minimally Invasive Surgery

    Manual palpation of tissue is frequently used in open surgery, e.g., for localization of tumors and buried vessels and for tissue characterization. The overall objective of this work is to explore how tissue palpation can be performed in Robot-Assisted Minimally Invasive Surgery (RAMIS) using the laparoscopic instruments conventionally used in RAMIS. This thesis presents a framework in which a surgical tool is moved teleoperatively in a manner analogous to the repetitive pressing motion of a finger during manual palpation. We interpret the changes in parameters resulting from this motion, such as the applied force and the resulting indentation depth, to accurately determine the variation in tissue stiffness. This approach requires sensorizing the laparoscopic tool for force sensing. In our work, we have used a da Vinci needle driver that was sensorized in our lab at CSTAR for force sensing using Fiber Bragg Gratings (FBG). A computer vision algorithm has been developed for 3D surgical tool-tip tracking using the da Vinci's stereo endoscope. This enables us to measure changes in surface indentation resulting from pressing the needle driver on the tissue. The proposed palpation framework is based on the hypothesis that the indentation depth is inversely proportional to the tissue stiffness when a constant pressing force is applied. This was validated in a telemanipulated setup using the da Vinci surgical system with a phantom in which artificial tumors were embedded to represent areas of different stiffnesses. The high-stiffness regions representing tumors and the low-stiffness regions representing healthy tissue showed average indentation depths of 5.19 mm and 10.09 mm, respectively, while a maximum force of 8 N was maintained during robot-assisted palpation. These indentation depth variations were then distinguished using the k-means clustering algorithm to classify groups of low and high stiffness, and the results were presented in a colour-coded map.
The unique feature of this framework is its use of a conventional laparoscopic tool and minimal redesign of the existing da Vinci surgical setup. Additional work includes a vision-based algorithm for tracking the motion of a tissue surface, such as that of the lung, resulting from respiratory and cardiac motion. The extracted motion information was analyzed to characterize the lung tissue stiffness based on the lateral strain variations as the surface inflates and deflates.
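The depth-based classification step described above can be illustrated with a small sketch. This is not the thesis code: the depth samples are hypothetical values scattered around the reported means of 5.19 mm and 10.09 mm, and the clustering is a plain 1-D k-means with k = 2.

```python
# Illustrative sketch: separate indentation depths into "stiff" (tumor)
# and "soft" (healthy) clusters with 1-D k-means, as in the framework.
# Depth values below are hypothetical, not measurements from the thesis.

def kmeans_1d(values, k=2, iters=50):
    """Lloyd's algorithm for scalar data with a deterministic k=2 init."""
    centers = [min(values), max(values)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[i].append(v)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

depths_mm = [5.0, 5.3, 5.2, 5.4, 9.8, 10.2, 10.1, 10.3]  # hypothetical
centers, clusters = kmeans_1d(depths_mm)
stiff_center = min(centers)  # smaller indentation -> stiffer tissue
soft_center = max(centers)
```

The smaller-depth cluster would be rendered as "tumor" in the colour-coded map, the larger-depth cluster as healthy tissue.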

    Omnidirectional Stereo Vision for Autonomous Vehicles

    Environment perception with cameras is an important requirement for many applications in autonomous vehicles and robots. This work presents a stereoscopic omnidirectional camera system for autonomous vehicles which resolves the problem of a limited field of view and provides a 360° panoramic view of the environment. We present a new projection model for these cameras and show that the camera setup overcomes major drawbacks of traditional perspective cameras in many applications.
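As an illustration of the kind of mapping such a projection model must provide, here is a minimal sketch of a spherical (equirectangular) projection that assigns every viewing direction a pixel in a 360° panorama; the paper's actual projection model may differ substantially.

```python
# Minimal sketch (an assumption, not the paper's model): an equirectangular
# projection maps any 3-D direction to panorama pixel coordinates, so no
# direction falls outside the field of view.
import math

def project_equirect(x, y, z, width, height):
    """Map a 3-D point in the camera frame to panorama pixel coords."""
    azimuth = math.atan2(x, z)                             # [-pi, pi]
    elevation = math.asin(y / math.sqrt(x*x + y*y + z*z))  # [-pi/2, pi/2]
    u = (azimuth / (2 * math.pi) + 0.5) * width   # full 360° horizontally
    v = (elevation / math.pi + 0.5) * height      # 180° vertically
    return u, v

ahead = project_equirect(0.0, 0.0, 1.0, 1000, 500)  # straight ahead
right = project_equirect(1.0, 0.0, 0.0, 1000, 500)  # 90° to the right
```

A perspective camera clips everything outside its frustum; here the point 90° to the side still lands at a valid pixel, which is the drawback being overcome.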

    Pedestrian and Vehicle Detection in Autonomous Vehicle Perception Systems—A Review

    Autonomous Vehicles (AVs) have the potential to solve many traffic problems, such as accidents, congestion and pollution. However, there are still challenges to overcome; for instance, AVs need to accurately perceive their environment to safely navigate in busy urban scenarios. The aim of this paper is to review recent articles on computer vision techniques that can be used to build an AV perception system. AV perception systems need to accurately detect non-static objects and predict their behaviour, as well as to detect static objects and recognise the information they are providing. This paper, in particular, focuses on the computer vision techniques used to detect pedestrians and vehicles. There have been many papers and reviews on pedestrian and vehicle detection so far. However, most past papers reviewed pedestrian or vehicle detection separately. This review aims to present an overview of AV systems in general, and then review and investigate several computer vision detection techniques for pedestrians and vehicles. The review concludes that both traditional and Deep Learning (DL) techniques have been used for pedestrian and vehicle detection; however, DL techniques have shown the best results. Although good detection results have been achieved for pedestrians and vehicles, current algorithms still struggle to detect small, occluded, and truncated objects. In addition, there is limited research on how to improve detection performance in difficult light and weather conditions. Most of the algorithms have been tested on well-recognised datasets such as Caltech and KITTI; however, these datasets have their own limitations. Therefore, this paper recommends that future work be evaluated on newer, more challenging datasets, such as PIE and BDD100K. EPSRC DTP PhD studentship.

    Deep learning based 3D object detection for automotive radar and camera fusion

    Perception in the domain of autonomous vehicles is a key discipline to achieve the automation of Intelligent Transport Systems. Therefore, this Master's Thesis aims to develop a sensor fusion technique for RADAR and camera that creates an enriched representation of the environment for 3D Object Detection using Deep Learning algorithms. To this end, the idea of PointPainting [1] is used as a starting point and is adapted to a growing sensor, the 3+1D RADAR, in which the radar point cloud is aggregated with the semantic information from the camera. Máster Universitario en Ingeniería Industrial (M141).
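A rough sketch of the PointPainting-style fusion step described above: each radar point is projected into the image with a pinhole model and "painted" with the class scores of the pixel it hits. The intrinsics, the toy segmentation map, and the point layout are illustrative assumptions, not details from the thesis.

```python
# Hypothetical sketch of PointPainting-style fusion: radar points gain
# per-pixel semantic class scores from a camera segmentation map.
# All numeric values below are toy examples.

def paint_points(points, seg_scores, fx, fy, cx, cy):
    """Append the class scores of the projected pixel to each point."""
    painted = []
    for x, y, z, doppler in points:
        if z <= 0:                    # point behind the camera plane
            continue
        u = int(fx * x / z + cx)      # pinhole projection, column
        v = int(fy * y / z + cy)      # pinhole projection, row
        if 0 <= v < len(seg_scores) and 0 <= u < len(seg_scores[0]):
            painted.append((x, y, z, doppler) + tuple(seg_scores[v][u]))
    return painted

# Toy 2x4 segmentation map with (car, pedestrian) scores per pixel.
seg = [[(0.9, 0.1)] * 4, [(0.2, 0.8)] * 4]
pts = [(0.0, 0.0, 10.0, 1.5)]         # one radar point straight ahead
painted = paint_points(pts, seg, fx=1.0, fy=1.0, cx=2.0, cy=1.0)
```

The painted points (position, Doppler, plus class scores) would then be fed to a 3D object detector in place of the raw point cloud.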

    Vehicle and Traffic Safety

    The book is devoted to contemporary issues regarding the safety of motor vehicles and road traffic. It presents the achievements of scientists, specialists, and industry representatives in the following selected areas of road transport safety and automotive engineering: active and passive vehicle safety, vehicle dynamics and stability, and testing of vehicles (and their assemblies), including electric cars as well as autonomous vehicles. Selected issues from the area of accident analysis and reconstruction are discussed. The impact on road safety of aspects such as traffic control systems, road infrastructure, and human factors is also considered.

    Data-centric Design and Training of Deep Neural Networks with Multiple Data Modalities for Vision-based Perception Systems

    Advances in computer vision and machine learning have revolutionized the ability to build systems that process and interpret digital data, enabling them to imitate human perception and opening the way to a wide range of applications. In recent years, both disciplines have made significant progress, driven by advances in deep learning techniques. Deep learning is a discipline that uses deep neural networks (DNNs) to teach machines to recognize patterns and make predictions based on data. Perception systems based on deep learning are increasingly common in various fields where humans and machines collaborate to combine their strengths. These fields include automotive, industry, and medicine, where improving safety, supporting diagnosis, and automating repetitive tasks are some of the goals pursued. However, data is one of the key factors behind the success of deep learning algorithms. This dependence on data strongly limits the creation and success of new DNNs. The availability of quality data to solve a specific problem is essential but difficult, or even impracticable, to obtain in most developments. Data-centric artificial intelligence emphasizes the importance of using high-quality data that effectively conveys what a model must learn. Motivated by these challenges and the need for data, this thesis formulates and validates five hypotheses on the acquisition and impact of data in the design and training of DNNs. Specifically, we investigate and propose different methodologies to obtain suitable data for training DNNs in problems with limited access to large-scale data sources.
We explore two possible solutions for obtaining training data, both based on synthetic data generation. First, we investigate the generation of synthetic data using 3D graphics and the impact of different design choices on the accuracy of the resulting DNNs. In addition, we propose a methodology to automate the data generation process and produce varied annotated data by replicating a customized 3D environment from an input configuration file. Second, we propose a generative adversarial network (GAN) that generates annotated images using limited annotated datasets and unannotated data captured in uncontrolled environments.
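The configuration-driven generation idea can be sketched as follows; the config schema and field names are invented for illustration, not taken from the thesis. Because the generator places each object itself, the ground-truth annotation is known by construction.

```python
# Hypothetical sketch of configuration-driven synthetic data generation:
# a declarative config is replicated into randomized scenes, each carrying
# its own annotations "for free". Schema and fields are invented.
import random

def generate_scene(config, rng):
    """Instantiate one randomized scene from a declarative config."""
    scene = []
    for obj in config["objects"]:
        x = rng.uniform(*obj["x_range"])   # randomized placement
        y = rng.uniform(*obj["y_range"])
        # The label is exact because the generator chose the placement.
        scene.append({"class": obj["class"], "x": x, "y": y})
    return scene

config = {"objects": [
    {"class": "car", "x_range": (0, 50), "y_range": (-3, 3)},
    {"class": "pedestrian", "x_range": (0, 20), "y_range": (-5, 5)},
]}
rng = random.Random(0)                     # seeded for reproducibility
scenes = [generate_scene(config, rng) for _ in range(3)]
```

In the thesis the config would drive a full 3D rendering pipeline; here the same principle is shown with bare coordinates.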

    Development of a text reading system on video images

    Since the early days of computer science, researchers have sought to devise a machine which could automatically read text to help people with visual impairments. The problem of extracting and recognising text in document images has been largely resolved, but reading text from images of natural scenes remains a challenge. Scene text can present uneven lighting, complex backgrounds or perspective and lens distortion; it usually appears as short sentences or isolated words and shows a very diverse set of typefaces. However, video sequences of natural scenes provide a temporal redundancy that can be exploited to compensate for some of these deficiencies. Here we present a complete end-to-end, real-time scene text reading system on video images based on perspective-aware text tracking. The main contribution of this work is a system that automatically detects, recognises and tracks text in videos of natural scenes in real-time. The focus of our method is on large text found in outdoor environments, such as shop signs, street names and billboards. We introduce novel efficient techniques for text detection, text aggregation and text perspective estimation. Furthermore, we propose using a set of Unscented Kalman Filters (UKF) to maintain each text region's identity and to continuously track the homography transformation of the text into a fronto-parallel view, thereby being resilient to erratic camera motion and wide baseline changes in orientation. The orientation of each text line is estimated using a method that relies on the geometry of the characters themselves to estimate a rectifying homography. This is done irrespective of the view of the text over a large range of orientations. We also demonstrate a wearable head-mounted device for text reading that encases a camera for image acquisition and a pair of headphones for synthesized speech output. Our system is designed for continuous and unsupervised operation over long periods of time.
It is completely automatic and features quick failure recovery and interactive text reading. It is also highly parallelised in order to maximize the usage of available processing power and to achieve real-time operation. We show comparative results that improve on the current state-of-the-art in correcting perspective deformation of scene text. The end-to-end system performance is demonstrated on sequences recorded in outdoor scenarios. Finally, we also release a dataset of text-tracking videos along with annotated ground truth of the text regions.
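The rectification step described above can be illustrated with a minimal sketch: a 3x3 homography maps text-region corners into a fronto-parallel view. The matrix here is a made-up scaling example; in the actual system the homography would be the quantity continuously tracked by the UKFs.

```python
# Illustrative sketch (not the thesis code): mapping a point through a
# 3x3 planar homography, including the projective division. The matrix
# below is an invented example.

def apply_homography(H, x, y):
    """Map (x, y) through homography H in homogeneous coordinates."""
    xh = H[0][0]*x + H[0][1]*y + H[0][2]
    yh = H[1][0]*x + H[1][1]*y + H[1][2]
    w  = H[2][0]*x + H[2][1]*y + H[2][2]
    return xh / w, yh / w   # divide out the homogeneous scale

# A pure scaling homography: doubles both coordinates.
H = [[2.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 1.0]]
corner = apply_homography(H, 10.0, 5.0)
```

Warping all four corners of a detected text quadrilateral this way, with the tracked rectifying homography, yields the fronto-parallel view fed to the recogniser.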

    Street Surfaces and Boundaries from Depth Image Sequences Using Probabilistic Models

    This thesis presents an approach for the detection and reconstruction of street surfaces and boundaries from depth image sequences. Active driver assistance systems, which monitor and interpret the environment based on vehicle-mounted sensors to support the driver, are a current research focus of the automotive industry. An essential task of these systems is the modeling of the vehicle's static environment. This comprises the determination of the vertical slope and curvature characteristics of the street surface as well as the robust detection of obstacles and, thus, of the free drivable space (alias free-space). In this regard, obstacles of low height, e.g. curbs, are of special interest since they often constitute the first geometric delimiter of the free-space. The usage of depth images acquired from stereo camera systems becomes more important in this context due to the high data rate and affordable price of the sensor. However, recent approaches for object detection are often limited to the detection of objects which are distinctive in height, such as cars and guardrails, or explicitly address the detection of particular object classes. These approaches are usually based on extremely restrictive assumptions, such as planar street surfaces, in order to deal with the high measurement noise. The main contribution of this thesis is the development, analysis and evaluation of an approach which detects the free-space in the immediate maneuvering area in front of the vehicle and explicitly models the free-space boundary by means of a spline curve. The approach considers in particular obstacles of low height (higher than 10 cm) without limitation to particular object classes. Furthermore, the approach has the ability to cope with various slope and curvature characteristics of the observed street surface and is able to reconstruct this surface by means of a flexible spline model.
In order to allow for robust results despite the flexibility of the model and the high measurement noise, the approach employs probabilistic models for the preprocessing of the depth map data as well as for the detection of the drivable free-space. An elevation model is computed from the depth map considering the paths of the optical rays and the uncertainty of the depth measurements. Based on this elevation model, an iterative two-step approach is performed which determines the drivable free-space by means of a Markov Random Field and estimates the spline parameters of the free-space boundary curve and the street surface. Outliers in the elevation data are explicitly modeled. The performance of the overall approach and the influence of key components are systematically evaluated within experiments on synthetic and real-world test scenarios. The results demonstrate the ability of the approach to accurately model the boundary of the drivable free-space as well as the street surface even in complex scenarios with multiple obstacles or strong curvature of the street surface. The experiments further reveal the limitations of the approach, which are discussed in detail.
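The low-obstacle criterion (obstacles higher than 10 cm) can be sketched as a simple thresholding pass over an elevation model. This is an illustrative simplification assuming a locally flat road; the thesis instead fits a flexible spline surface and labels cells with a Markov Random Field.

```python
# Minimal sketch, not the thesis implementation: cells of an elevation
# model are flagged as obstacles when they rise more than 10 cm above an
# assumed flat street surface. Grid values are illustrative.

OBSTACLE_HEIGHT_M = 0.10   # minimum obstacle height from the text

def free_space_mask(elevation_grid, road_height=0.0):
    """Return True for drivable cells, False where an obstacle rises."""
    return [[(h - road_height) <= OBSTACLE_HEIGHT_M for h in row]
            for row in elevation_grid]

# Toy elevation grid (metres): a 12-13 cm curb in the right column.
grid = [[0.00, 0.01, 0.12],
        [0.01, 0.00, 0.13]]
mask = free_space_mask(grid)
```

In the full approach the boundary between the True and False regions is what the spline curve models, and the probabilistic treatment of measurement noise replaces this hard threshold.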