Survey on LiDAR Perception in Adverse Weather Conditions
Autonomous vehicles rely on a variety of sensors to gather information about
their surroundings. The vehicle's behavior is planned based on this environment
perception, making its reliability crucial for safety reasons. The active LiDAR
sensor is able to create an accurate 3D representation of a scene, making it a
valuable addition to environment perception for autonomous vehicles. Due to
light scattering and occlusion, the LiDAR's performance changes under adverse
weather conditions like fog, snow, or rain. This limitation has recently
fostered a large body of research on approaches to alleviate the decrease in
perception performance. In this survey, we gather, analyze, and discuss
different aspects of dealing with adverse weather conditions in LiDAR-based
environment perception. We address topics such as the availability of
appropriate data, raw point cloud processing and denoising, robust perception
algorithms, and sensor fusion to mitigate adverse-weather-induced shortcomings.
We furthermore identify the most pressing gaps in the current literature and
pinpoint promising research directions.
Comment: published at IEEE IV 202
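The raw point cloud denoising the survey discusses is often approached with neighborhood-based outlier filters, since snow and rain returns tend to be isolated points. Below is a minimal sketch of one generic variant (radius outlier removal); the radius and neighbor threshold are illustrative assumptions, not values from the survey:

```python
import numpy as np

def radius_outlier_filter(points, radius=0.5, min_neighbors=3):
    """Keep points that have at least `min_neighbors` other points within `radius`.
    Brute-force O(n^2) for clarity; real pipelines would use a KD-tree."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    neighbor_counts = (dist < radius).sum(axis=1) - 1  # subtract self-match
    return points[neighbor_counts >= min_neighbors]

# Dense cluster (a solid object) plus isolated points (snow/rain clutter)
rng = np.random.default_rng(0)
cluster = rng.normal(loc=0.0, scale=0.1, size=(50, 3))
clutter = rng.uniform(low=5.0, high=50.0, size=(10, 3))
cloud = np.vstack([cluster, clutter])
filtered = radius_outlier_filter(cloud, radius=0.5, min_neighbors=3)
```

The isolated clutter points have no close neighbors and are dropped, while the dense cluster survives; weather-specific filters refine exactly this idea with range-dependent radii.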
Non-Rigid Liver Registration for Laparoscopy using Data-Driven Biomechanical Models
During laparoscopic liver resection, the limited access to the organ, the small field of view and lack of palpation can obstruct a surgeon's workflow. Automatic navigation systems could use the images from preoperative volumetric organ scans to help the surgeons find their target (tumors) and risk-structures (vessels) more efficiently. This requires the preoperative data to be fused (or registered) with the intraoperative scene in order to display information at the correct intraoperative position.
One key challenge in this setting is the automatic estimation of the organ's current intra-operative deformation, which is required in order to predict the position of internal structures. Parameterizing the many patient-specific unknowns (tissue properties, boundary conditions, interactions with other tissues, direction of gravity) is very difficult. Instead, this work explores how to employ deep neural networks to solve the registration problem in a data-driven manner. To this end, convolutional neural networks are trained on synthetic data to estimate an organ's intraoperative displacement field and thus its current deformation. To drive this estimation, visible surface cues from the intraoperative camera view must be supplied to the networks. Since reliable surface features are very difficult to find, the networks are adapted to also find correspondences between the pre- and intraoperative liver geometry automatically. This combines the search for correspondences with the biomechanical behavior estimation and allows the networks to tackle the full non-rigid registration problem in one single step. The result is a model which can quickly predict the volume deformation of a liver, given only sparse surface information. The model combines the advantages of a physically accurate biomechanical simulation with the speed and powerful feature extraction capabilities of deep neural networks.
To test the method intraoperatively, a registration pipeline is developed which constructs a map of the liver and its surroundings from the laparoscopic video and then uses the neural networks to fuse the preoperative volume data into this map. The deformed organ volume can then be rendered as an overlay directly onto the laparoscopic video stream. The focus of this pipeline is to be applicable to real surgery, where everything should be quick and non-intrusive. To meet these requirements, a SLAM system is used to localize the laparoscopic camera (avoiding setup of an external tracking system), various neural networks are used to quickly interpret the scene and semi-automatic tools let the surgeons guide the system.
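The voxelized inputs and outputs such networks operate on can be illustrated with a toy scatter of per-point displacement vectors into a regular grid (a hedged sketch; the grid resolution and mean-pooling assignment are illustrative assumptions, not the thesis implementation):

```python
import numpy as np

def voxelize_displacement(points, displacements, grid_size=32, bounds=(-1.0, 1.0)):
    """Scatter per-point displacement vectors into a 3-channel voxel grid.
    Voxels hit by several points store the mean displacement."""
    lo, hi = bounds
    idx = ((points - lo) / (hi - lo) * grid_size).astype(int)
    idx = np.clip(idx, 0, grid_size - 1)
    field = np.zeros((grid_size, grid_size, grid_size, 3))
    counts = np.zeros((grid_size, grid_size, grid_size, 1))
    for (i, j, k), d in zip(idx, displacements):
        field[i, j, k] += d
        counts[i, j, k] += 1
    mask = counts[..., 0] > 0
    field[mask] /= counts[mask]      # average where multiple points landed
    return field, mask

pts = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
disp = np.array([[0.1, 0.0, 0.0], [0.0, 0.2, 0.0]])
field, occupied = voxelize_displacement(pts, disp)
```

A 3D CNN can then consume such grids as input channels and emit a dense 3-channel displacement grid as output, which is the representation the abstract alludes to.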
Beyond the concrete advantages of the data-driven approach for intraoperative registration, this work also demonstrates general benefits of training a registration system preoperatively on synthetic data. The method lets the engineer decide which values need to be known explicitly and which should be estimated implicitly by the networks, which opens the door to many new possibilities.

1 Introduction
1.1 Motivation
1.1.1 Navigated Liver Surgery
1.1.2 Laparoscopic Liver Registration
1.2 Challenges in Laparoscopic Liver Registration
1.2.1 Preoperative Model
1.2.2 Intraoperative Data
1.2.3 Fusion/Registration
1.2.4 Data
1.3 Scope and Goals of this Work
1.3.1 Data-Driven, Biomechanical Model
1.3.2 Data-Driven Non-Rigid Registration
1.3.3 Building a Working Prototype
2 State of the Art
2.1 Rigid Registration
2.2 Non-Rigid Liver Registration
2.3 Neural Networks for Simulation and Registration
3 Theoretical Background
3.1 Liver
3.2 Laparoscopic Liver Resection
3.2.1 Staging Procedure
3.3 Biomechanical Simulation
3.3.1 Physical Balance Principles
3.3.2 Material Models
3.3.3 Numerical Solver: The Finite Element Method (FEM)
3.3.4 The Lagrangian Specification
3.4 Variables and Data in Liver Registration
3.4.1 Observable
3.4.2 Unknowns
4 Generating Simulations of Deforming Organs
4.1 Organ Volume
4.2 Forces and Boundary Conditions
4.2.1 Surface Forces
4.2.2 Zero-Displacement Boundary Conditions
4.2.3 Surrounding Tissues and Ligaments
4.2.4 Gravity
4.2.5 Pressure
4.3 Simulation
4.3.1 Static Simulation
4.3.2 Dynamic Simulation
4.4 Surface Extraction
4.4.1 Partial Surface Extraction
4.4.2 Surface Noise
4.4.3 Partial Surface Displacement
4.5 Voxelization
4.5.1 Voxelizing the Liver Geometry
4.5.2 Voxelizing the Displacement Field
4.5.3 Voxelizing Boundary Conditions
4.6 Pruning Dataset - Removing Unwanted Results
4.7 Data Augmentation
5 Deep Neural Networks for Biomechanical Simulation
5.1 Training Data
5.2 Network Architecture
5.3 Loss Functions and Training
6 Deep Neural Networks for Non-Rigid Registration
6.1 Training Data
6.2 Architecture
6.3 Loss
6.4 Training
6.5 Mesh Deformation
6.6 Example Application
7 Intraoperative Prototype
7.1 Image Acquisition
7.2 Stereo Calibration
7.3 Image Rectification, Disparity- and Depth- estimation
7.4 Liver Segmentation
7.4.1 Synthetic Image Generation
7.4.2 Automatic Segmentation
7.4.3 Manual Segmentation Modifier
7.5 SLAM
7.6 Dense Reconstruction
7.7 Rigid Registration
7.8 Non-Rigid Registration
7.9 Rendering
7.10 Robot Operating System
8 Evaluation
8.1 Evaluation Datasets
8.1.1 In-Silico
8.1.2 Phantom Torso and Liver
8.1.3 In-Vivo, Human, Breathing Motion
8.1.4 In-Vivo, Human, Laparoscopy
8.2 Metrics
8.2.1 Mean Displacement Error
8.2.2 Target Registration Error (TRE)
8.2.3 Chamfer Distance
8.2.4 Volumetric Change
8.3 Evaluation of the Synthetic Training Data
8.4 Data-Driven Biomechanical Model (DDBM)
8.4.1 Amount of Intraoperative Surface
8.4.2 Dynamic Simulation
8.5 Volume to Surface Registration Network (V2S-Net)
8.5.1 Amount of Intraoperative Surface
8.5.2 Dependency on Initial Rigid Alignment
8.5.3 Registration Accuracy in Comparison to Surface Noise
8.5.4 Registration Accuracy in Comparison to Material Stiffness
8.5.5 Chamfer Distance vs. Mean Displacement Error
8.5.6 In-vivo, Human Breathing Motion
8.6 Full Intraoperative Pipeline
8.6.1 Intraoperative Reconstruction: SLAM and Intraoperative Map
8.6.2 Full Pipeline on Laparoscopic Human Data
8.7 Timing
9 Discussion
9.1 Intraoperative Model
9.2 Physical Accuracy
9.3 Limitations in Training Data
9.4 Limitations Caused by Difference in Pre- and Intraoperative Modalities
9.5 Ambiguity
9.6 Intraoperative Prototype
10 Conclusion
11 List of Publications
List of Figures
Bibliography
Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles
Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems.
In this thesis, lidar-based obstacle detection and recognition in agricultural environments has been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles.
For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal.
The proposed methods serve as a first step towards full autonomy for agricultural vehicles. The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain when accurate distinctions are made between obstacles and processable vegetation. Future research in the domain has further been facilitated with the release of the multi-modal obstacle dataset, FieldSAFE.
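The occupancy grid mapping mentioned above is commonly implemented with Bayesian log-odds updates, where repeated consistent detections drive a cell's occupancy probability toward certainty. A minimal sketch of that update rule (the sensor-model probability is an illustrative assumption, not a value from the thesis):

```python
import numpy as np

def logodds_update(grid, cell, p_hit):
    """Bayesian log-odds update of one occupancy-grid cell for a 'hit' observation."""
    grid[cell] += np.log(p_hit / (1.0 - p_hit))
    return grid

grid = np.zeros((10, 10))        # log-odds 0.0 corresponds to probability 0.5
for _ in range(3):               # three consistent obstacle detections in one cell
    logodds_update(grid, (4, 7), p_hit=0.7)

# convert the cell's log-odds back to an occupancy probability
prob = 1.0 - 1.0 / (1.0 + np.exp(grid[4, 7]))
```

Three moderately confident hits (p = 0.7 each) already push the cell to roughly 0.93 occupancy probability, while unobserved cells stay at 0.5; a miss would subtract the same log-odds term.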
The in vivo functional neuroanatomy and neurochemistry of vibrotactile processing
Touch is a sense with which humans are able to actively explore the world around them. Primary somatosensory cortex (S1) processing has been studied to differing degrees at both the macroscopic and microscopic levels in both humans and animals. Both levels of enquiry have their advantages, but attempts to combine the two approaches are still in their infancy. One mechanism that is possibly involved in determining the response properties of neurons involved in sensory discrimination is inhibition by γ-aminobutyric acid (GABA). Several studies have shown that inhibition is an important mechanism to "tune" the response of neurons. Recently it has become possible to measure the concentration of GABA in vivo using edited Magnetic Resonance Spectroscopy (MRS), whereas magnetoencephalography (MEG) offers the possibility to look at changes in neuromagnetic activation with millisecond accuracy. With these methods we aimed to establish whether in vivo non-invasive neuroimaging can elucidate the underlying neuronal mechanisms of human tactile behaviour and to determine how such findings can be integrated with what is currently known from invasive methods. Edited GABA-MRS has shown that individual GABA concentration in S1 correlates strongly with tactile frequency discrimination. MEG was used to investigate the neuromagnetic correlates of a frequency discrimination paradigm in which we induced adaptation to a 25 Hz frequency. We showed that S1 is driven by the adapting stimulus and that neural rhythms are modulated as a result of adaptation. This is the first time that the behavioural psychophysics of tactile adaptation has been investigated using complementary neuroimaging methods.
We combined different methods to complement both physiological and behavioural studies of tactile processing in S1, investigating the factors involved in the neural dynamics of tactile processing, and we show that non-invasive studies on humans can be used to understand the physiological underpinnings of somatosensory processing.
Efficient Analysis in Multimedia Databases
The rapid progress of digital technology has led to a situation
where computers have become ubiquitous tools. Now we can find them
in almost every environment, be it industrial or even private. With
ever increasing performance computers assumed more and more vital
tasks in engineering, climate and environmental research, medicine
and the content industry. Previously, these tasks could only be
accomplished by spending enormous amounts of time and money. By
using digital sensor devices, like earth observation satellites,
genome sequencers or video cameras, the amount and complexity of
data with a spatial or temporal relation have grown enormously. This
has led to new challenges for the data analysis and requires the use
of modern multimedia databases.
This thesis aims at developing efficient techniques for the analysis
of complex multimedia objects such as CAD data, time series and
videos. It is assumed that the data is modeled by commonly used
representations. For example, CAD data is represented as a set of
voxels, while audio and video data are represented as multi-represented,
multi-dimensional time series.
The main part of this thesis focuses on finding efficient methods
for collision queries of complex spatial objects. One way to speed
up those queries is to employ a cost-based decomposition,
which uses interval groups to approximate a spatial object. For
example, this technique can be used for the Digital Mock-Up (DMU)
process, which helps engineers to ensure short product cycles. This
thesis defines and discusses a new similarity measure for time
series called threshold-similarity. Two time series are
considered similar if they expose a similar behavior regarding the
transgression of a given threshold value. Another part of the thesis
is concerned with the efficient calculation of reverse
k-nearest neighbor (RkNN) queries in general metric spaces
using conservative and progressive approximations. The aim of such
RkNN queries is to determine the impact of single objects on the
whole database. At the end, the thesis deals with video
retrieval and hierarchical genre classification of music
using multiple representations. The practical relevance of the
discussed genre classification approach is highlighted with a
prototype tool that helps the user to organize large music
collections.
Both the efficiency and the effectiveness of the presented
techniques are thoroughly analyzed. The benefits over traditional
approaches are shown by evaluating the new methods on real-world
test datasets.
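The threshold-similarity idea described above can be sketched as follows: extract the index intervals where each series exceeds the threshold, then compare those interval sets. This toy version uses the symmetric difference of the above-threshold index sets as a distance; the thesis's actual measure and its efficient query processing are not reproduced here:

```python
def threshold_intervals(series, tau):
    """Return [start, end) index intervals where the series exceeds tau."""
    intervals, start = [], None
    for i, v in enumerate(series):
        if v > tau and start is None:
            start = i
        elif v <= tau and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(series)))
    return intervals

def interval_distance(a, b):
    """Toy distance: size of the symmetric difference of above-threshold index sets."""
    sa = {i for s, e in a for i in range(s, e)}
    sb = {i for s, e in b for i in range(s, e)}
    return len(sa ^ sb)

x = [0, 2, 3, 0, 0, 4, 0]
y = [0, 2, 0, 0, 0, 4, 0]
dx = threshold_intervals(x, tau=1)
dy = threshold_intervals(y, tau=1)
```

Here `x` and `y` differ in their absolute values but cross the threshold at almost the same times, so they count as similar under this notion even where a Euclidean distance would not be small.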
Indoor Mapping and Reconstruction with Mobile Augmented Reality Sensor Systems
Augmented Reality (AR) makes it possible to display virtual, three-dimensional content directly within the real environment. Rather than showing arbitrary virtual objects at an arbitrary location, however, AR technology can also be used to display geodata in situ at the very location the data refer to. AR thus opens up the possibility of enriching the real world with virtual, location-related information. In this work, this variety of AR is defined as "Fused Reality" and discussed in depth.
The practical value offered by this concept of Fused Reality is well demonstrated by its application to digital building models, where building-specific information, for example the course of pipes and cables inside the walls, can be displayed in its correct position on the real object. To realize the outlined concept of an indoor Fused Reality application, some basic conditions must be met. A given building can only be augmented with location-related information if a digital model of that building is available. While larger construction projects today are often planned and executed with the aid of Building Information Modelling (BIM), so that a digital model is created together with the real building, digital models are usually not available for older existing buildings. Creating a digital model of an existing building manually is possible, but involves great effort. If a suitable building model is available, an AR device must furthermore be able to determine its own position and orientation in the building relative to this model in order to display augmentations in the correct position.
In this work, various aspects of the problems outlined above are investigated and discussed. First, different ways of capturing indoor building geometry with sensor systems are discussed. Subsequently, a study is presented on the extent to which modern AR devices, which typically also feature a multitude of sensors, are likewise suited for use as indoor mapping systems. The resulting indoor mapping datasets can then be used to reconstruct building models automatically. For this purpose, an automated, voxel-based indoor reconstruction method is presented. It is furthermore evaluated quantitatively on the basis of four datasets, captured for this purpose, with corresponding reference data. In addition, various ways of localizing mobile AR devices within a building and the corresponding building model are discussed. In this context, the evaluation of a marker-based indoor localization method is also presented. Finally, a new approach for aligning indoor mapping datasets with the axes of the coordinate system is presented.
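Aligning an indoor mapping dataset with the coordinate axes is often done by exploiting the dominant wall directions of man-made interiors (a Manhattan-world assumption). The following is a generic sketch of that idea, not the new approach presented in this work: fold the directions between neighboring points into [0°, 90°), take the histogram peak as the dominant direction, and rotate it onto the x-axis.

```python
import numpy as np

def align_to_axes(points_2d, bins=90):
    """Rotate a 2D point sequence so its dominant direction (mod 90 deg)
    lies on the x-axis. The dominant direction is the peak of a histogram
    of angles between consecutive points."""
    d = np.diff(points_2d, axis=0)
    ang = np.degrees(np.arctan2(d[:, 1], d[:, 0])) % 90.0   # fold into [0, 90)
    hist, edges = np.histogram(ang, bins=bins, range=(0.0, 90.0))
    theta = np.radians(edges[np.argmax(hist)] + 90.0 / bins / 2.0)  # bin center
    c, s = np.cos(-theta), np.sin(-theta)
    R = np.array([[c, -s], [s, c]])
    return points_2d @ R.T, np.degrees(theta)

# a square wall outline rotated by 30 degrees
sq = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]], float)
a = np.radians(30.0)
Rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
aligned, est = align_to_axes(sq @ Rot.T)
```

All four edges of the rotated square fold onto the same angle, so the histogram peak recovers the 30° rotation up to half a bin width, and the aligned outline is axis-parallel again.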
GPU data structures for graphics and vision
Graphics hardware has in recent years become increasingly programmable, and its programming APIs use the stream processor model to expose massive parallelization to the programmer. Unfortunately, the inherent restrictions of the stream processor model, used by the GPU in order to maintain high performance, often pose a problem in porting CPU algorithms for both video and volume processing to graphics hardware. Serial data dependencies which accelerate CPU processing are counterproductive for the data-parallel GPU.
This thesis demonstrates new ways for tackling well-known problems of large scale video/volume analysis. In some instances, we enable processing on the restricted hardware model by re-introducing algorithms from early computer graphics research. On other occasions, we use newly discovered, hierarchical data structures to circumvent the random-access read/fixed write restriction that had previously kept sophisticated analysis algorithms from running solely on graphics hardware. For 3D processing, we apply known game graphics concepts such as mip-maps, projective texturing, and dependent texture lookups to show how video/volume processing can benefit algorithmically from being implemented in a graphics API.
The novel GPU data structures provide drastically increased processing speed, and lift processing-heavy operations to real-time performance levels, paving the way for new and interactive vision/graphics applications.
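The hierarchical reduction pattern behind such GPU data structures can be illustrated on the CPU with a max-mipmap: each level halves the resolution and keeps each 2x2 block's maximum, and every level is a single data-parallel pass with no serial dependencies (a NumPy sketch for illustration only; the thesis's structures run inside a graphics API):

```python
import numpy as np

def mip_pyramid(img):
    """Build a max-mipmap: each level halves resolution, keeping the 2x2 block
    maximum. On a GPU each level is one parallel pass over independent blocks."""
    levels = [img]
    while levels[-1].shape[0] > 1:
        a = levels[-1]
        h, w = a.shape[0] // 2, a.shape[1] // 2
        blocks = a[:2 * h, :2 * w].reshape(h, 2, w, 2)
        levels.append(blocks.max(axis=(1, 3)))   # reduce each 2x2 block
    return levels

img = np.arange(16.0).reshape(4, 4)
pyr = mip_pyramid(img)   # 4x4 -> 2x2 -> 1x1
```

Queries such as "the maximum over a large region" can then be answered from a coarse level in O(1) texture lookups instead of scanning the full-resolution data, which is what lifts such analysis operations to real-time rates.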