4,848 research outputs found
Efficient illumination independent appearance-based face tracking
One of the major challenges that visual tracking algorithms face nowadays is being
able to cope with changes in the appearance of the target during tracking. Linear
subspace models have been extensively studied and are possibly the most popular
way of modelling target appearance. We introduce a linear subspace representation
in which the appearance of a face is represented by the addition of two approxi-
mately independent linear subspaces modelling facial expressions and illumination
respectively. This model is more compact than previous bilinear or multilinear ap-
proaches. The independence assumption notably simplifies system training. We only
require two image sequences. One facial expression is subject to all possible illumina-
tions in one sequence and the face adopts all facial expressions under one particular
illumination in the other. This simple model enables us to train the system with
no manual intervention. We also revisit the problem of efficiently fitting a linear
subspace-based model to a target image and introduce an additive procedure for
solving this problem. We prove that Matthews and Baker’s Inverse Compositional
Approach makes a smoothness assumption on the subspace basis that is equiva-
lent to Hager and Belhumeur’s, which worsens convergence. Our approach differs
from Hager and Belhumeur’s additive and Matthews and Baker’s compositional ap-
proaches in that we make no smoothness assumptions on the subspace basis. In the
experiments conducted we show that the model introduced accurately represents
the appearance variations caused by illumination changes and facial expressions.
We also verify experimentally that our fitting procedure is more accurate and has
better convergence rate than the other related approaches, albeit at the expense of
a slight increase in computational cost. Our approach can be used for tracking a
human face at standard video frame rates on an average personal computer
Efficient Algorithms for Moral Lineage Tracing
Lineage tracing, the joint segmentation and tracking of living cells as they
move and divide in a sequence of light microscopy images, is a challenging
task. Jug et al. have proposed a mathematical abstraction of this task, the
moral lineage tracing problem (MLTP), whose feasible solutions define both a
segmentation of every image and a lineage forest of cells. Their branch-and-cut
algorithm, however, is prone to many cuts and slow convergence for large
instances. To address this problem, we make three contributions: (i) we devise
the first efficient primal feasible local search algorithms for the MLTP, (ii)
we improve the branch-and-cut algorithm by separating tighter cutting planes
and by incorporating our primal algorithms, (iii) we show in experiments that
our algorithms find accurate solutions on the problem instances of Jug et al.
and scale to larger instances, leveraging moral lineage tracing to practical
significance.Comment: Accepted at ICCV 201
Long Range Automated Persistent Surveillance
This dissertation addresses long range automated persistent surveillance with focus on three topics: sensor planning, size preserving tracking, and high magnification imaging.
field of view should be reserved so that camera handoff can be executed successfully before the object of interest becomes unidentifiable or untraceable. We design a sensor planning algorithm that not only maximizes coverage but also ensures uniform and sufficient overlapped camera’s field of view for an optimal handoff success rate. This algorithm works for environments with multiple dynamic targets using different types of cameras. Significantly improved handoff success rates are illustrated via experiments using floor plans of various scales.
Size preserving tracking automatically adjusts the camera’s zoom for a consistent view of the object of interest. Target scale estimation is carried out based on the paraperspective projection model which compensates for the center offset and considers system latency and tracking errors. A computationally efficient foreground segmentation strategy, 3D affine shapes, is proposed. The 3D affine shapes feature direct and real-time implementation and improved flexibility in accommodating the target’s 3D motion, including off-plane rotations. The effectiveness of the scale estimation and foreground segmentation algorithms is validated via both offline and real-time tracking of pedestrians at various resolution levels.
Face image quality assessment and enhancement compensate for the performance degradations in face recognition rates caused by high system magnifications and long observation distances. A class of adaptive sharpness measures is proposed to evaluate and predict this degradation. A wavelet based enhancement algorithm with automated frame selection is developed and proves efficient by a considerably elevated face recognition rate for severely blurred long range face images
High-Speed Vision and Force Feedback for Motion-Controlled Industrial Manipulators
Over the last decades, both force sensors and cameras have emerged as useful sensors for different applications in robotics. This thesis considers a number of dynamic visual tracking and control problems, as well as the integration of these techniques with contact force control. Different topics ranging from basic theory to system implementation and applications are treated. A new interface developed for external sensor control is presented, designed by making non-intrusive extensions to a standard industrial robot control system. The structure of these extensions are presented, the system properties are modeled and experimentally verified, and results from force-controlled stub grinding and deburring experiments are presented. A novel system for force-controlled drilling using a standard industrial robot is also demonstrated. The solution is based on the use of force feedback to control the contact forces and the sliding motions of the pressure foot, which would otherwise occur during the drilling phase. Basic methods for feature-based tracking and servoing are presented, together with an extension for constrained motion estimation based on a dual quaternion pose parametrization. A method for multi-camera real-time rigid body tracking with time constraints is also presented, based on an optimal selection of the measured features. The developed tracking methods are used as the basis for two different approaches to vision/force control, which are illustrated in experiments. Intensity-based techniques for tracking and vision-based control are also developed. A dynamic visual tracking technique based directly on the image intensity measurements is presented, together with new stability-based methods suitable for dynamic tracking and feedback problems. The stability-based methods outperform the previous methods in many situations, as shown in simulations and experiments
A continuum robotic platform for endoscopic non-contact laser surgery: design, control, and preclinical evaluation
The application of laser technologies in surgical interventions has been accepted in the clinical
domain due to their atraumatic properties. In addition to manual application of fibre-guided
lasers with tissue contact, non-contact transoral laser microsurgery (TLM) of laryngeal tumours
has been prevailed in ENT surgery. However, TLM requires many years of surgical training
for tumour resection in order to preserve the function of adjacent organs and thus preserve the
patient’s quality of life. The positioning of the microscopic laser applicator outside the patient
can also impede a direct line-of-sight to the target area due to anatomical variability and limit
the working space. Further clinical challenges include positioning the laser focus on the tissue
surface, imaging, planning and performing laser ablation, and motion of the target area during
surgery. This dissertation aims to address the limitations of TLM through robotic approaches and
intraoperative assistance. Although a trend towards minimally invasive surgery is apparent, no
highly integrated platform for endoscopic delivery of focused laser radiation is available to date.
Likewise, there are no known devices that incorporate scene information from endoscopic imaging
into ablation planning and execution. For focusing of the laser beam close to the target tissue, this
work first presents miniaturised focusing optics that can be integrated into endoscopic systems.
Experimental trials characterise the optical properties and the ablation performance. A robotic
platform is realised for manipulation of the focusing optics. This is based on a variable-length
continuum manipulator. The latter enables movements of the endoscopic end effector in five
degrees of freedom with a mechatronic actuation unit. The kinematic modelling and control of the
robot are integrated into a modular framework that is evaluated experimentally. The manipulation
of focused laser radiation also requires precise adjustment of the focal position on the tissue. For
this purpose, visual, haptic and visual-haptic assistance functions are presented. These support
the operator during teleoperation to set an optimal working distance. Advantages of visual-haptic
assistance are demonstrated in a user study. The system performance and usability of the overall
robotic system are assessed in an additional user study. Analogous to a clinical scenario, the
subjects follow predefined target patterns with a laser spot. The mean positioning accuracy of the
spot is 0.5 mm. Finally, methods of image-guided robot control are introduced to automate laser
ablation. Experiments confirm a positive effect of proposed automation concepts on non-contact
laser surgery.Die Anwendung von Lasertechnologien in chirurgischen Interventionen hat sich aufgrund der atraumatischen Eigenschaften in der Klinik etabliert. Neben manueller Applikation von fasergeführten
Lasern mit Gewebekontakt hat sich die kontaktfreie transorale Lasermikrochirurgie (TLM) von
Tumoren des Larynx in der HNO-Chirurgie durchgesetzt. Die TLM erfordert zur Tumorresektion
jedoch ein langjähriges chirurgisches Training, um die Funktion der angrenzenden Organe zu
sichern und damit die Lebensqualität der Patienten zu erhalten. Die Positionierung des mikroskopis chen Laserapplikators außerhalb des Patienten kann zudem die direkte Sicht auf das Zielgebiet
durch anatomische Variabilität erschweren und den Arbeitsraum einschränken. Weitere klinische
Herausforderungen betreffen die Positionierung des Laserfokus auf der Gewebeoberfläche, die
Bildgebung, die Planung und Ausführung der Laserablation sowie intraoperative Bewegungen
des Zielgebietes. Die vorliegende Dissertation zielt darauf ab, die Limitierungen der TLM durch
robotische Ansätze und intraoperative Assistenz zu adressieren. Obwohl ein Trend zur minimal
invasiven Chirurgie besteht, sind bislang keine hochintegrierten Plattformen für die endoskopische
Applikation fokussierter Laserstrahlung verfügbar. Ebenfalls sind keine Systeme bekannt, die
Szeneninformationen aus der endoskopischen Bildgebung in die Ablationsplanung und -ausführung
einbeziehen. Für eine situsnahe Fokussierung des Laserstrahls wird in dieser Arbeit zunächst
eine miniaturisierte Fokussieroptik zur Integration in endoskopische Systeme vorgestellt. Experimentelle Versuche charakterisieren die optischen Eigenschaften und das Ablationsverhalten. Zur
Manipulation der Fokussieroptik wird eine robotische Plattform realisiert. Diese basiert auf einem
längenveränderlichen Kontinuumsmanipulator. Letzterer ermöglicht in Kombination mit einer
mechatronischen Aktuierungseinheit Bewegungen des Endoskopkopfes in fünf Freiheitsgraden.
Die kinematische Modellierung und Regelung des Systems werden in ein modulares Framework
eingebunden und evaluiert. Die Manipulation fokussierter Laserstrahlung erfordert zudem eine
präzise Anpassung der Fokuslage auf das Gewebe. Dafür werden visuelle, haptische und visuell haptische Assistenzfunktionen eingeführt. Diese unterstützen den Anwender bei Teleoperation
zur Einstellung eines optimalen Arbeitsabstandes. In einer Anwenderstudie werden Vorteile der
visuell-haptischen Assistenz nachgewiesen. Die Systemperformanz und Gebrauchstauglichkeit
des robotischen Gesamtsystems werden in einer weiteren Anwenderstudie untersucht. Analog zu
einem klinischen Einsatz verfolgen die Probanden mit einem Laserspot vorgegebene Sollpfade. Die
mittlere Positioniergenauigkeit des Spots beträgt dabei 0,5 mm. Zur Automatisierung der Ablation
werden abschließend Methoden der bildgestützten Regelung vorgestellt. Experimente bestätigen
einen positiven Effekt der Automationskonzepte für die kontaktfreie Laserchirurgie
Volumetric velocimetry for fluid flows
In recent years, several techniques have been introduced that are capable of extracting 3D three-component velocity fields in fluid flows. Fast-paced developments in both hardware and processing algorithms have generated a diverse set of methods, with a growing range of applications in flow diagnostics. This has been further enriched by the increasingly marked trend of hybridization, in which the differences between techniques are fading. In this review, we carry out a survey of the prominent methods, including optical techniques and approaches based on medical imaging. An overview of each is given with an example of an application from the literature, while focusing on their respective strengths and challenges. A framework for the evaluation of velocimetry performance in terms of dynamic spatial range is discussed, along with technological trends and emerging strategies to exploit 3D data. While critical challenges still exist, these observations highlight how volumetric techniques are transforming experimental fluid mechanics, and that the possibilities they offer have just begun to be explored.SD was partially supported under Grant No. DPI2016-79401-R funded by the Spanish State Research Agency (SRA) and the European Regional Development Fund (ERDF). FC was partially supported by the U.S. National Science Foundation (Chemical, Bioengineering, Environmental, and Transport Systems, Grant No. 1453538)
Event-based Simultaneous Localization and Mapping: A Comprehensive Survey
In recent decades, visual simultaneous localization and mapping (vSLAM) has
gained significant interest in both academia and industry. It estimates camera
motion and reconstructs the environment concurrently using visual sensors on a
moving robot. However, conventional cameras are limited by hardware, including
motion blur and low dynamic range, which can negatively impact performance in
challenging scenarios like high-speed motion and high dynamic range
illumination. Recent studies have demonstrated that event cameras, a new type
of bio-inspired visual sensor, offer advantages such as high temporal
resolution, dynamic range, low power consumption, and low latency. This paper
presents a timely and comprehensive review of event-based vSLAM algorithms that
exploit the benefits of asynchronous and irregular event streams for
localization and mapping tasks. The review covers the working principle of
event cameras and various event representations for preprocessing event data.
It also categorizes event-based vSLAM methods into four main categories:
feature-based, direct, motion-compensation, and deep learning methods, with
detailed discussions and practical guidance for each approach. Furthermore, the
paper evaluates the state-of-the-art methods on various benchmarks,
highlighting current challenges and future opportunities in this emerging
research area. A public repository will be maintained to keep track of the
rapid developments in this field at
{\url{https://github.com/kun150kun/ESLAM-survey}}
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
- …