Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras hold great potential for robotics
and computer vision in scenarios that challenge traditional cameras, such as
low-latency, high-speed, and high-dynamic-range applications. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras, from their working principle to the sensors that are
available and the tasks they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
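As a hedged illustration of the event-stream output described above, the sketch below accumulates hypothetical (t, x, y, polarity) events into a signed brightness-change image; the event tuples and sensor resolution are invented for illustration and do not correspond to any specific camera.

```python
def accumulate_events(events, width, height):
    """Accumulate (t, x, y, polarity) events into a signed
    brightness-change image: +1 per ON event, -1 per OFF event."""
    img = [[0] * width for _ in range(height)]
    for t, x, y, p in events:
        img[y][x] += 1 if p > 0 else -1
    return img

# Hypothetical event stream: timestamps in microseconds, polarity in {+1, -1}.
events = [(10, 2, 3, +1), (15, 2, 3, +1), (20, 5, 1, -1)]
frame = accumulate_events(events, width=8, height=8)
print(frame[3][2])  # 2: two ON events at pixel (x=2, y=3)
print(frame[1][5])  # -1: one OFF event at pixel (x=5, y=1)
```

Accumulating over a time window like this is one of the simplest ways to convert the asynchronous stream into a frame that conventional vision algorithms can consume.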
Optical Flow and Deep Learning Based Approach to Visual Odometry
Visual odometry is a challenging component of simultaneous localization and mapping algorithms. Based on one or two cameras, motion is estimated from features and pixel differences between consecutive sets of frames. A related topic is optical flow, which aims to calculate the distance and direction every pixel moves between consecutive frames of a video sequence. Because of the frame rate of the cameras, changes between subsequent frames are generally small and incremental, so optical flow can be assumed to be proportional to the physical distance moved by an egocentric reference, such as a camera on a vehicle. Combining these two ideas, a visual odometry system using optical flow and deep learning is proposed. Optical flow images are used as input to a convolutional neural network, which predicts a rotation and displacement for each image. The displacements and rotations are applied incrementally in sequence to construct a map of where the camera has traveled. The system is trained and tested on the KITTI visual odometry dataset, and accuracy is measured by the difference between ground-truth and predicted driving trajectories. Different convolutional neural network architecture configurations are tested for accuracy, and the results are compared to other state-of-the-art monocular odometry systems on the same dataset. The average translation error of this system is 10.77%, and the average rotation error is 0.0623 degrees per meter. The system also exhibits at least a 23.796x speedup over the next fastest odometry estimation system.
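The trajectory-construction step described above, applying predicted rotations and displacements incrementally, can be sketched as simple dead reckoning. The per-frame (yaw change, forward distance) pairs below are hypothetical stand-ins for the CNN outputs, which the abstract does not specify in detail.

```python
import math

def integrate_odometry(steps, x=0.0, y=0.0, heading=0.0):
    """Compose per-frame (yaw_change_rad, distance) predictions into a 2D
    trajectory by dead reckoning."""
    path = [(x, y)]
    for dyaw, dist in steps:
        heading += dyaw
        x += dist * math.cos(heading)
        y += dist * math.sin(heading)
        path.append((x, y))
    return path

# Hypothetical CNN outputs: drive 1 m straight, then turn 90 degrees and drive 1 m.
steps = [(0.0, 1.0), (math.pi / 2, 1.0)]
path = integrate_odometry(steps)
print(path[-1])  # approximately (1.0, 1.0)
```

Because each step is composed onto the previous pose, small per-frame prediction errors accumulate along the trajectory, which is why such systems are evaluated by comparing whole driving trajectories against ground truth.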
Enriching remote labs with computer vision and drones
With technological advances, new learning technologies are being developed to contribute to a better learning experience. In particular, remote labs constitute an interesting and practical way to motivate today's students to learn. A student can, at any time and from anywhere, access the remote lab and do his or her lab work. Despite many advantages, remote technologies in education create a distance between the student and the teacher. Without the presence of a teacher, students can face difficulties if no appropriate interventions are taken to help them. In this thesis, we aim to enrich an existing remote electronics lab for engineering students, called "LaboREM" (for remote laboratory), in two ways. First, we enable the student to send high-level commands to a mini-drone available in the remote lab facility. The objective is to examine the front panels of electronic measurement instruments with the camera embedded on the drone. Furthermore, we allow remote student-teacher communication using the drone, in case a teacher is present in the remote lab facility. Finally, the drone has to return home when the mission is over and land on a platform for automatic recharging of its batteries. Second, we propose an automatic system that estimates the affective state of the student (frustrated/confused/flow) in order to take appropriate interventions that ensure good learning outcomes. For example, if the student is having major difficulties, we can try to give hints or reduce the difficulty level of the lab experiment. We propose to do this using visual cues (head-pose estimation and facial expression analysis). Much evidence on the state of the student can be acquired; however, this evidence is incomplete, sometimes inaccurate, and no single cue covers all aspects of the student's state. This is why we propose to fuse the evidence using Dempster-Shafer theory, which allows the fusion of incomplete evidence.
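As a hedged sketch of the Dempster-Shafer fusion mentioned above, the code below applies Dempster's rule of combination to two mass functions over the frame {frustrated, confused, flow}; the mass values and the two evidence sources are invented for illustration and are not taken from the thesis.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for mass functions whose focal
    elements are frozensets of hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, w1), (b, w2) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    # Normalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

F, C, FL = frozenset({"frustrated"}), frozenset({"confused"}), frozenset({"flow"})
ALL = F | C | FL

# Hypothetical evidence: head pose suggests frustration; facial expression
# hesitates between frustration and confusion.
m_pose = {F: 0.6, ALL: 0.4}
m_face = {F | C: 0.7, ALL: 0.3}
fused = dempster_combine(m_pose, m_face)
print(fused[F])  # mass on {"frustrated"} after fusion, approximately 0.6
```

Note how mass assigned to the whole frame (ALL) models incomplete evidence: a source that cannot discriminate simply leaves its belief uncommitted rather than forcing a choice.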
Airborne Navigation by Fusing Inertial and Camera Data
Unmanned aircraft systems (UASs) are often used as measuring systems, so precise knowledge of their position and orientation is required. This thesis provides research in the conception and realization of a system which combines GPS-assisted inertial navigation systems with advances in the area of camera-based navigation. It is shown how these complementary approaches can be used in a joint framework. In contrast to widely used concepts utilizing only one of the two approaches, a more robust overall system is realized. The presented algorithms are based on the mathematical concepts of rigid-body motions. After derivation of the underlying equations, the methods are evaluated in numerical studies and simulations. Based on the results, real-world systems are used to collect data, which is evaluated and discussed. Two approaches for the system calibration, which describes the offsets between the coordinate systems of the sensors, are proposed. The first approach integrates the parameters of the system calibration into the classical bundle adjustment. The optimization is presented in a descriptive, graph-based formulation. It requires a high-precision INS and data from a measurement flight. In contrast to classical methods, a flexible flight course can be used and no cost-intensive ground control points are required. The second approach enables the calibration of inertial navigation systems with low positional accuracy. Line observations are used to optimize the rotational part of the offsets. Knowledge of the offsets between the coordinate systems of the sensors allows transforming measurements bidirectionally. This is the basis for a fusion concept that combines measurements from the inertial navigation system with an approach for visual navigation. As a result, more robust estimates of the vehicle's own position and orientation are achieved. Moreover, the map created from the camera images is georeferenced.
It is shown how this map can be used to navigate an unmanned aerial system back to its starting position in the case of disturbed or failed GPS reception. The high precision of the map allows navigation through previously unexplored areas by taking into consideration the maximal drift of the camera-only navigation. The evaluated concept provides insight into the possibility of robust navigation of unmanned aerial systems with complementary sensors. Constantly increasing computing power allows the evaluation of large amounts of data and the development of new concepts to fuse the information. Future navigation systems will use the data of all available sensors to achieve the best navigation solution at any time.
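A minimal sketch of the coordinate-offset idea above: once the rotation and translation between a camera frame and an INS frame are calibrated, measurements can be transformed in both directions. The rotation angle and lever arm below are invented for illustration, not values from the thesis.

```python
import math

def apply_offset(R, t, p):
    """Transform point p from the camera frame to the INS frame: p' = R p + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def invert_offset(R, t):
    """Invert the rigid-body offset: (R, t) -> (R^T, -R^T t)."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return Rt, t_inv

# Hypothetical calibration: 90-degree yaw offset and a 0.5 m lever arm.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [0.5, 0.0, 0.0]

p_cam = [1.0, 0.0, 0.0]
p_ins = apply_offset(R, t, p_cam)           # camera point expressed in the INS frame
R_inv, t_inv = invert_offset(R, t)
p_back = apply_offset(R_inv, t_inv, p_ins)  # round trip recovers the original point
print(p_ins, p_back)
```

The invertibility shown by the round trip is what the abstract means by transforming measurements "bidirectionally": the same calibrated offset carries camera observations into the INS frame and INS poses into the camera frame.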
Visual and Camera Sensors
This book includes 13 papers published in the Special Issue "Visual and Camera Sensors" of the journal Sensors. The goal of this Special Issue was to invite high-quality, state-of-the-art research papers addressing challenging issues in visual and camera sensors.
On the relationship between neuronal codes and mental models
The superordinate aim of my work towards this thesis
was a better understanding
of the relationship between mental models
and the underlying principles that lead to the self-organization
of neuronal circuitry.
The thesis consists of four individual publications,
which approach this goal from differing perspectives.
While the formation of sparse coding representations in neuronal substrate
has been investigated extensively,
many research questions
on how sparse coding may be exploited for higher cognitive processing
are still open.
The first two studies,
included as chapter 2 and chapter 3,
asked to what extent representations obtained with sparse coding
match mental models.
We identified the following selectivities in sparse coding representations:
with stereo images as input,
the representation was selective for the disparity of image structures,
which can be used to infer the distance of structures to the observer.
Furthermore, it was selective to the predominant orientation in textures,
which can be used to infer the orientation of surfaces.
With optic flow from egomotion as input,
the representation was selective to the direction of egomotion
in 6 degrees of freedom.
Due to the direct relation between selectivity and physical properties,
these representations, obtained with sparse coding,
can serve as early sensory models of the environment.
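As a hedged sketch of sparse coding inference, the code below infers a sparse code for a signal under a fixed dictionary by iterative soft thresholding (ISTA); the tiny orthonormal dictionary and signal are invented for illustration and are far simpler than the stereo-image and optic-flow inputs used in the studies.

```python
def ista_sparse_code(x, D, lam=0.1, eta=0.1, iters=200):
    """Infer a sparse code a for signal x under dictionary D (columns = atoms)
    by iterative soft thresholding on 0.5*||x - D a||^2 + lam*||a||_1."""
    n_atoms = len(D[0])
    a = [0.0] * n_atoms
    for _ in range(iters):
        # Residual r = x - D a, shared by all coefficient updates this iteration.
        r = [xi - sum(D[i][k] * a[k] for k in range(n_atoms))
             for i, xi in enumerate(x)]
        for k in range(n_atoms):
            grad = sum(D[i][k] * r[i] for i in range(len(x)))
            u = a[k] + eta * grad
            # Soft threshold: shrink toward zero by eta * lam.
            a[k] = max(u - eta * lam, 0.0) if u > 0 else min(u + eta * lam, 0.0)
    return a

# Hypothetical dictionary with two orthonormal atoms; the signal matches atom 0.
D = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 0.0]
a = ista_sparse_code(x, D)
print(a)  # close to [0.9, 0.0]: only the matching atom stays active
```

The L1 penalty drives most coefficients to exactly zero, which is what makes the resulting representation selective: each active atom signals the presence of one specific structure, such as a disparity or an orientation.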
The cognitive processes behind spatial knowledge
rest on mental models that represent the environment.
We presented a topological model for wayfinding
in the third study,
included as chapter 4.
It describes a dual population code,
where the first population code encodes places
by means of place fields,
and the second population code encodes motion instructions
based on links between place fields.
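The dual population code described above can be sketched roughly as follows: one code maps positions to place-field activations, and a second code attaches motion instructions to links between place fields. The arena layout, tuning width, and instruction labels are all invented for illustration, not taken from the model in chapter 4.

```python
import math

# Hypothetical place fields: centers in a 2D arena with Gaussian tuning.
PLACE_FIELDS = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0)}
SIGMA = 0.5

def place_code(pos):
    """First population code: activation of each place field at position pos."""
    return {
        name: math.exp(-((pos[0] - cx) ** 2 + (pos[1] - cy) ** 2)
                       / (2 * SIGMA ** 2))
        for name, (cx, cy) in PLACE_FIELDS.items()
    }

# Second population code: motion instructions attached to links between fields.
LINKS = {("A", "B"): "go east", ("B", "C"): "go north"}

def motion_instruction(pos, goal):
    """Read out the instruction on the link from the most active field to goal."""
    code = place_code(pos)
    current = max(code, key=code.get)
    return LINKS.get((current, goal))

print(motion_instruction((0.1, 0.0), "B"))  # near field A, so the A->B link fires
```

Wayfinding then reduces to repeatedly reading out the link from the currently most active place field toward the next field on the route, which is what makes the scheme topological rather than metric.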
We did not focus on an implementation in biological substrate
or on an exact fit to physiological findings.
The model is a biologically plausible, parsimonious method for wayfinding,
which may be close to an intermediate step
of emergent skills in an evolutionary navigational hierarchy.
Our automated testing for visual performance in mice,
included in chapter 5,
is an example of behavioral testing in the perception-action cycle.
The goal of this study was to quantify the optokinetic reflex.
Due to the rich behavioral repertoire of mice,
quantification required many elaborate steps of computational analyses.
Animals and humans are embodied living systems,
and therefore composed of strongly enmeshed modules or entities,
which are also enmeshed with the environment.
In order to study living systems as a whole,
it is necessary to test hypotheses,
for example on the nature of mental models,
in the perception-action cycle.
In summary,
the studies included in this thesis
extend our view on the character of early sensory representations
as mental models,
as well as on high-level mental models
for spatial navigation.
Additionally, the thesis contains an example
of evaluating hypotheses in the perception-action cycle.