10 research outputs found

    Survey on LiDAR Perception in Adverse Weather Conditions

    Full text link
    Autonomous vehicles rely on a variety of sensors to gather information about their surrounding. The vehicle's behavior is planned based on the environment perception, making its reliability crucial for safety reasons. The active LiDAR sensor is able to create an accurate 3D representation of a scene, making it a valuable addition for environment perception for autonomous vehicles. Due to light scattering and occlusion, the LiDAR's performance change under adverse weather conditions like fog, snow or rain. This limitation recently fostered a large body of research on approaches to alleviate the decrease in perception performance. In this survey, we gathered, analyzed, and discussed different aspects on dealing with adverse weather conditions in LiDAR-based environment perception. We address topics such as the availability of appropriate data, raw point cloud processing and denoising, robust perception algorithms and sensor fusion to mitigate adverse weather induced shortcomings. We furthermore identify the most pressing gaps in the current literature and pinpoint promising research directions.Comment: published at IEEE IV 202

    Non-Rigid Liver Registration for Laparoscopy using Data-Driven Biomechanical Models

    Get PDF
    During laparoscopic liver resection, the limited access to the organ, the small field of view and lack of palpation can obstruct a surgeon’s workflow. Automatic navigation systems could use the images from preoperative volumetric organ scans to help the surgeons find their target (tumors) and risk-structures (vessels) more efficiently. This requires the preoperative data to be fused (or registered) with the intraoperative scene in order to display information at the correct intraoperative position. One key challenge in this setting is the automatic estimation of the organ’s current intra-operative deformation, which is required in order to predict the position of internal structures. Parameterizing the many patient-specific unknowns (tissue properties, boundary conditions, interactions with other tissues, direction of gravity) is very difficult. Instead, this work explores how to employ deep neural networks to solve the registration problem in a data-driven manner. To this end, convolutional neural networks are trained on synthetic data to estimate an organ’s intraoperative displacement field and thus its current deformation. To drive this estimation, visible surface cues from the intraoperative camera view must be supplied to the networks. Since reliable surface features are very difficult to find, the networks are adapted to also find correspondences between the pre- and intraoperative liver geometry automatically. This combines the search for correspondences with the biomechanical behavior estimation and allows the networks to tackle the full non-rigid registration problem in one single step. The result is a model which can quickly predict the volume deformation of a liver, given only sparse surface information. The model combines the advantages of a physically accurate biomechanical simulation with the speed and powerful feature extraction capabilities of deep neural networks. To test the method intraoperatively, a registration pipeline is developed which constructs a map of the liver and its surroundings from the laparoscopic video and then uses the neural networks to fuse the preoperative volume data into this map. The deformed organ volume can then be rendered as an overlay directly onto the laparoscopic video stream. The focus of this pipeline is to be applicable to real surgery, where everything should be quick and non-intrusive. To meet these requirements, a SLAM system is used to localize the laparoscopic camera (avoiding setup of an external tracking system), various neural networks are used to quickly interpret the scene and semi-automatic tools let the surgeons guide the system. Beyond the concrete advantages of the data-driven approach for intraoperative registration, this work also demonstrates general benefits of training a registration system preoperatively on synthetic data. The method lets the engineer decide which values need to be known explicitly and which should be estimated implicitly by the networks, which opens the door to many new possibilities.:1 Introduction 1.1 Motivation 1.1.1 Navigated Liver Surgery 1.1.2 Laparoscopic Liver Registration 1.2 Challenges in Laparoscopic Liver Registration 1.2.1 Preoperative Model 1.2.2 Intraoperative Data 1.2.3 Fusion/Registration 1.2.4 Data 1.3 Scope and Goals of this Work 1.3.1 Data-Driven, Biomechanical Model 1.3.2 Data-Driven Non-Rigid Registration 1.3.3 Building a Working Prototype 2 State of the Art 2.1 Rigid Registration 2.2 Non-Rigid Liver Registration 2.3 Neural Networks for Simulation and Registration 3 Theoretical Background 3.1 Liver 3.2 Laparoscopic Liver Resection 3.2.1 Staging Procedure 3.3 Biomechanical Simulation 3.3.1 Physical Balance Principles 3.3.2 Material Models 3.3.3 Numerical Solver: The Finite Element Method (FEM) 3.3.4 The Lagrangian Specification 3.4 Variables and Data in Liver Registration 3.4.1 Observable 3.4.2 Unknowns 4 Generating Simulations of Deforming Organs 4.1 Organ Volume 4.2 Forces and Boundary Conditions 4.2.1 Surface Forces 4.2.2 Zero-Displacement Boundary Conditions 4.2.3 Surrounding Tissues and Ligaments 4.2.4 Gravity 4.2.5 Pressure 4.3 Simulation 4.3.1 Static Simulation 4.3.2 Dynamic Simulation 4.4 Surface Extraction 4.4.1 Partial Surface Extraction 4.4.2 Surface Noise 4.4.3 Partial Surface Displacement 4.5 Voxelization 4.5.1 Voxelizing the Liver Geometry 4.5.2 Voxelizing the Displacement Field 4.5.3 Voxelizing Boundary Conditions 4.6 Pruning Dataset - Removing Unwanted Results 4.7 Data Augmentation 5 Deep Neural Networks for Biomechanical Simulation 5.1 Training Data 5.2 Network Architecture 5.3 Loss Functions and Training 6 Deep Neural Networks for Non-Rigid Registration 6.1 Training Data 6.2 Architecture 6.3 Loss 6.4 Training 6.5 Mesh Deformation 6.6 Example Application 7 Intraoperative Prototype 7.1 Image Acquisition 7.2 Stereo Calibration 7.3 Image Rectification, Disparity- and Depth- estimation 7.4 Liver Segmentation 7.4.1 Synthetic Image Generation 7.4.2 Automatic Segmentation 7.4.3 Manual Segmentation Modifier 7.5 SLAM 7.6 Dense Reconstruction 7.7 Rigid Registration 7.8 Non-Rigid Registration 7.9 Rendering 7.10 Robotic Operating System 8 Evaluation 8.1 Evaluation Datasets 8.1.1 In-Silico 8.1.2 Phantom Torso and Liver 8.1.3 In-Vivo, Human, Breathing Motion 8.1.4 In-Vivo, Human, Laparoscopy 8.2 Metrics 8.2.1 Mean Displacement Error 8.2.2 Target Registration Error (TRE) 8.2.3 Champfer Distance 8.2.4 Volumetric Change 8.3 Evaluation of the Synthetic Training Data 8.4 Data-Driven Biomechanical Model (DDBM) 8.4.1 Amount of Intraoperative Surface 8.4.2 Dynamic Simulation 8.5 Volume to Surface Registration Network (V2S-Net) 8.5.1 Amount of Intraoperative Surface 8.5.2 Dependency on Initial Rigid Alignment 8.5.3 Registration Accuracy in Comparison to Surface Noise 8.5.4 Registration Accuracy in Comparison to Material Stiffness 8.5.5 Champfer-Distance vs. Mean Displacement Error 8.5.6 In-vivo, Human Breathing Motion 8.6 Full Intraoperative Pipeline 8.6.1 Intraoperative Reconstruction: SLAM and Intraoperative Map 8.6.2 Full Pipeline on Laparoscopic Human Data 8.7 Timing 9 Discussion 9.1 Intraoperative Model 9.2 Physical Accuracy 9.3 Limitations in Training Data 9.4 Limitations Caused by Difference in Pre- and Intraoperative Modalities 9.5 Ambiguity 9.6 Intraoperative Prototype 10 Conclusion 11 List of Publications List of Figures Bibliograph

    Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles

    Get PDF
    Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems.In this thesis, lidar-based obstacle detection and recognition in agricultural environments has been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles.For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen, as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal.The proposed methods serve as a first step towards full autonomy for agricultural vehicles. The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain, when accurate distinctions are made between obstacles and processable vegetation. Future research in the domain has further been facilitated with the release of the multi-modal obstacle dataset, FieldSAFE

    The in vivo functional neuroanatomy and neurochemistry of vibrotactile processing

    Get PDF
    Touch is a sense with which humans are able to actively explore the world around them. Primary somatosensory cortex (S1) processing has been studied to differing degrees at both the macroscopic and microscopic levels in both humans and animals. Both levels of enquiry have their advantages, but attempts to combine the two approaches are still in their infancy. One mechanism that is possibly involved in determining the reponse properties of neurons that are involved in sensory discrimination is inhibition by γ-aminobutyric acid (GABA). Several studies have shown that inhibition is an important mechanism to “tune” the response of neurons. Recently it has become possible to measure the concentration of GABA in vivo using edited Magnetic Resonance Spectroscopy (MRS), whereas magnetoencephalography (MEG) offers the possibility to look at changes in neuromagnetic activation with millisecond accuracy. With these methods we aimed to establish whether in vivo non-invasive neuroimaging can elucidate the underlying neuronal mechanisms of human tactile behaviour and to determine how such findings can be integrated with what is currently known from invasive methods. Edited GABA-MRS has shown that individual GABA concentration in S1 correlates strongly with tactile frequency discrimination. MEG was used to investigate the neuromagnetic correlates of a frequency discrimination paradigm in which we induced adaptation to a 25 Hz frequency. We showed that S1 is driven by the adapting stimulus and shows that neural rhythms are modulated as a result of adaptation. This is the first time that behavioural psychophysics of tactile adaptation has been investigated using complimentary neuroimaging methods. We combined different methods to complement both physiological and behavioural studies of tactile processing in S1 to investigate the factors involved in the neural dynamics of tactile processing and we show that non-invasive studies on humans can be used to understand physiological underpinnings of somatosensory processing.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    The in vivo functional neuroanatomy and neurochemistry of vibrotactile processing

    Get PDF
    Touch is a sense with which humans are able to actively explore the world around them. Primary somatosensory cortex (S1) processing has been studied to differing degrees at both the macroscopic and microscopic levels in both humans and animals. Both levels of enquiry have their advantages, but attempts to combine the two approaches are still in their infancy. One mechanism that is possibly involved in determining the reponse properties of neurons that are involved in sensory discrimination is inhibition by γ-aminobutyric acid (GABA). Several studies have shown that inhibition is an important mechanism to “tune” the response of neurons. Recently it has become possible to measure the concentration of GABA in vivo using edited Magnetic Resonance Spectroscopy (MRS), whereas magnetoencephalography (MEG) offers the possibility to look at changes in neuromagnetic activation with millisecond accuracy. With these methods we aimed to establish whether in vivo non-invasive neuroimaging can elucidate the underlying neuronal mechanisms of human tactile behaviour and to determine how such findings can be integrated with what is currently known from invasive methods. Edited GABA-MRS has shown that individual GABA concentration in S1 correlates strongly with tactile frequency discrimination. MEG was used to investigate the neuromagnetic correlates of a frequency discrimination paradigm in which we induced adaptation to a 25 Hz frequency. We showed that S1 is driven by the adapting stimulus and shows that neural rhythms are modulated as a result of adaptation. This is the first time that behavioural psychophysics of tactile adaptation has been investigated using complimentary neuroimaging methods. We combined different methods to complement both physiological and behavioural studies of tactile processing in S1 to investigate the factors involved in the neural dynamics of tactile processing and we show that non-invasive studies on humans can be used to understand physiological underpinnings of somatosensory processing

    Efficient Analysis in Multimedia Databases

    Get PDF
    The rapid progress of digital technology has led to a situation where computers have become ubiquitous tools. Now we can find them in almost every environment, be it industrial or even private. With ever increasing performance computers assumed more and more vital tasks in engineering, climate and environmental research, medicine and the content industry. Previously, these tasks could only be accomplished by spending enormous amounts of time and money. By using digital sensor devices, like earth observation satellites, genome sequencers or video cameras, the amount and complexity of data with a spatial or temporal relation has gown enormously. This has led to new challenges for the data analysis and requires the use of modern multimedia databases. This thesis aims at developing efficient techniques for the analysis of complex multimedia objects such as CAD data, time series and videos. It is assumed that the data is modeled by commonly used representations. For example CAD data is represented as a set of voxels, audio and video data is represented as multi-represented, multi-dimensional time series. The main part of this thesis focuses on finding efficient methods for collision queries of complex spatial objects. One way to speed up those queries is to employ a cost-based decompositioning, which uses interval groups to approximate a spatial object. For example, this technique can be used for the Digital Mock-Up (DMU) process, which helps engineers to ensure short product cycles. This thesis defines and discusses a new similarity measure for time series called threshold-similarity. Two time series are considered similar if they expose a similar behavior regarding the transgression of a given threshold value. Another part of the thesis is concerned with the efficient calculation of reverse k-nearest neighbor (RkNN) queries in general metric spaces using conservative and progressive approximations. The aim of such RkNN queries is to determine the impact of single objects on the whole database. At the end, the thesis deals with video retrieval and hierarchical genre classification of music using multiple representations. The practical relevance of the discussed genre classification approach is highlighted with a prototype tool that helps the user to organize large music collections. Both the efficiency and the effectiveness of the presented techniques are thoroughly analyzed. The benefits over traditional approaches are shown by evaluating the new methods on real-world test datasets

    Indoor Mapping and Reconstruction with Mobile Augmented Reality Sensor Systems

    Get PDF
    Augmented Reality (AR) ermöglicht es, virtuelle, dreidimensionale Inhalte direkt innerhalb der realen Umgebung darzustellen. Anstatt jedoch beliebige virtuelle Objekte an einem willkĂŒrlichen Ort anzuzeigen, kann AR Technologie auch genutzt werden, um Geodaten in situ an jenem Ort darzustellen, auf den sich die Daten beziehen. Damit eröffnet AR die Möglichkeit, die reale Welt durch virtuelle, ortbezogene Informationen anzureichern. Im Rahmen der vorliegenen Arbeit wird diese Spielart von AR als "Fused Reality" definiert und eingehend diskutiert. Der praktische Mehrwert, den dieses Konzept der Fused Reality bietet, lĂ€sst sich gut am Beispiel seiner Anwendung im Zusammenhang mit digitalen GebĂ€udemodellen demonstrieren, wo sich gebĂ€udespezifische Informationen - beispielsweise der Verlauf von Leitungen und Kabeln innerhalb der WĂ€nde - lagegerecht am realen Objekt darstellen lassen. Um das skizzierte Konzept einer Indoor Fused Reality Anwendung realisieren zu können, mĂŒssen einige grundlegende Bedingungen erfĂŒllt sein. So kann ein bestimmtes GebĂ€ude nur dann mit ortsbezogenen Informationen augmentiert werden, wenn von diesem GebĂ€ude ein digitales Modell verfĂŒgbar ist. Zwar werden grĂ¶ĂŸere Bauprojekt heutzutage oft unter Zuhilfename von Building Information Modelling (BIM) geplant und durchgefĂŒhrt, sodass ein digitales Modell direkt zusammen mit dem realen GebĂ€ude ensteht, jedoch sind im Falle Ă€lterer BestandsgebĂ€ude digitale Modelle meist nicht verfĂŒgbar. Ein digitales Modell eines bestehenden GebĂ€udes manuell zu erstellen, ist zwar möglich, jedoch mit großem Aufwand verbunden. Ist ein passendes GebĂ€udemodell vorhanden, muss ein AR GerĂ€t außerdem in der Lage sein, die eigene Position und Orientierung im GebĂ€ude relativ zu diesem Modell bestimmen zu können, um Augmentierungen lagegerecht anzeigen zu können. Im Rahmen dieser Arbeit werden diverse Aspekte der angesprochenen Problematik untersucht und diskutiert. Dabei werden zunĂ€chst verschiedene Möglichkeiten diskutiert, Indoor-GebĂ€udegeometrie mittels Sensorsystemen zu erfassen. Anschließend wird eine Untersuchung prĂ€sentiert, inwiefern moderne AR GerĂ€te, die in der Regel ebenfalls ĂŒber eine Vielzahl an Sensoren verfĂŒgen, ebenfalls geeignet sind, als Indoor-Mapping-Systeme eingesetzt zu werden. Die resultierenden Indoor Mapping DatensĂ€tze können daraufhin genutzt werden, um automatisiert GebĂ€udemodelle zu rekonstruieren. Zu diesem Zweck wird ein automatisiertes, voxel-basiertes Indoor-Rekonstruktionsverfahren vorgestellt. Dieses wird außerdem auf der Grundlage vierer zu diesem Zweck erfasster DatensĂ€tze mit zugehörigen Referenzdaten quantitativ evaluiert. Desweiteren werden verschiedene Möglichkeiten diskutiert, mobile AR GerĂ€te innerhalb eines GebĂ€udes und des zugehörigen GebĂ€udemodells zu lokalisieren. In diesem Kontext wird außerdem auch die Evaluierung einer Marker-basierten Indoor-Lokalisierungsmethode prĂ€sentiert. Abschließend wird zudem ein neuer Ansatz, Indoor-Mapping DatensĂ€tze an den Achsen des Koordinatensystems auszurichten, vorgestellt

    GPU data structures for graphics and vision

    Get PDF
    Graphics hardware has in recent years become increasingly programmable, and its programming APIs use the stream processor model to expose massive parallelization to the programmer. Unfortunately, the inherent restrictions of the stream processor model, used by the GPU in order to maintain high performance, often pose a problem in porting CPU algorithms for both video and volume processing to graphics hardware. Serial data dependencies which accelerate CPU processing are counterproductive for the data-parallel GPU. This thesis demonstrates new ways for tackling well-known problems of large scale video/volume analysis. In some instances, we enable processing on the restricted hardware model by re-introducing algorithms from early computer graphics research. On other occasions, we use newly discovered, hierarchical data structures to circumvent the random-access read/fixed write restriction that had previously kept sophisticated analysis algorithms from running solely on graphics hardware. For 3D processing, we apply known game graphics concepts such as mip-maps, projective texturing, and dependent texture lookups to show how video/volume processing can benefit algorithmically from being implemented in a graphics API. The novel GPU data structures provide drastically increased processing speed, and lift processing heavy operations to real-time performance levels, paving the way for new and interactive vision/graphics applications.Graphikhardware wurde in den letzen Jahren immer weiter programmierbar. Ihre APIs verwenden das Streamprozessor-Modell, um die massive Parallelisierung auch fĂŒr den Programmierer verfĂŒgbar zu machen. Leider folgen aus dem strikten Streamprozessor-Modell, welches die GPU fĂŒr ihre hohe Rechenleistung benötigt, auch Hindernisse in der Portierung von CPU-Algorithmen zur Video- und Volumenverarbeitung auf die GPU. Serielle DatenabhĂ€ngigkeiten beschleunigen zwar CPU-Verarbeitung, sind aber fĂŒr die daten-parallele GPU kontraproduktiv . Diese Arbeit prĂ€sentiert neue Herangehensweisen fĂŒr bekannte Probleme der Video- und Volumensverarbeitung. Teilweise wird die Verarbeitung mit Hilfe von modifizierten Algorithmen aus der frĂŒhen Computergraphik-Forschung an das beschrĂ€nkte Hardwaremodell angepasst. Anderswo helfen neu entdeckte, hierarchische Datenstrukturen beim Umgang mit den Schreibzugriff-Restriktionen die lange die Portierung von komplexeren Bildanalyseverfahren verhindert hatten. In der 3D-Verarbeitung nutzen wir bekannte Konzepte aus der Computerspielegraphik wie Mipmaps, projektive Texturierung, oder verkettete Texturzugriffe, und zeigen auf welche Vorteile die Video- und Volumenverarbeitung aus hardwarebeschleunigter Graphik-API-Implementation ziehen kann. Die prĂ€sentierten GPU-Datenstrukturen bieten drastisch schnellere Verarbeitung und heben rechenintensive Operationen auf Echtzeit-Niveau. Damit werden neue, interaktive Bildverarbeitungs- und Graphik-Anwendungen möglich
    corecore