Automatic facial analysis for objective assessment of facial paralysis
Facial paralysis is a condition causing decreased movement on one side of the face. A quantitative, objective and reliable assessment system would be an invaluable tool for clinicians treating patients with this condition. This paper presents an approach based on the automatic analysis of patient video data. Facial feature localization and facial movement detection methods are discussed. An algorithm is presented to process the optical flow data to obtain the motion features in the relevant facial regions. Three classification methods are applied to provide quantitative evaluations of regional facial nerve function and the overall facial nerve function based on the House-Brackmann Scale. Experiments show that the Radial Basis Function (RBF) Neural Network has the best performance of the three.
Micro Fourier Transform Profilometry (FTP): 3D shape measurement at 10,000 frames per second
Recent advances in imaging sensors and digital light projection technology
have facilitated rapid progress in 3D optical sensing, enabling 3D surfaces
of complex-shaped objects to be captured with improved resolution and accuracy.
However, due to the large number of projection patterns required for phase
recovery and disambiguation, the maximum frame rates of current 3D shape
measurement techniques are still limited to the range of hundreds of frames per
second (fps). Here, we demonstrate a new 3D dynamic imaging technique, Micro
Fourier Transform Profilometry (FTP), which can capture 3D surfaces of
transient events at up to 10,000 fps based on our newly developed high-speed
fringe projection system. Compared with existing techniques, FTP has the
prominent advantage of recovering an accurate, unambiguous, and dense 3D point
cloud with only two projected patterns. Furthermore, the phase information is
encoded within a single high-frequency fringe image, thereby allowing
motion-artifact-free reconstruction of transient events with temporal
resolution of 50 microseconds. To show FTP's broad utility, we use it to
reconstruct 3D videos of 4 transient scenes: vibrating cantilevers, rotating
fan blades, a bullet fired from a toy gun, and a balloon's explosion triggered by a
flying dart, which were previously difficult or even impossible to capture with
conventional approaches.
Comment: This manuscript was originally submitted on 30th January 1
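The abstract's point that phase is encoded in a single high-frequency fringe image can be illustrated with the classic Fourier transform profilometry step: band-pass the carrier lobe in the spectrum, transform back, and read off the phase. The following is a minimal sketch of that textbook step on one image row, not the paper's full pipeline. The function names, the `carrier_bin` and `half_width` parameters, and the plain O(n^2) DFT are assumptions made for illustration.

```python
import cmath
import math


def dft(signal, inverse=False):
    """Plain O(n^2) discrete Fourier transform (illustrative, not fast)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def ftp_wrapped_phase(row, carrier_bin, half_width):
    """Wrapped phase of one fringe row via classic FTP (hypothetical helper).

    row         : intensities along one image row
    carrier_bin : DFT bin of the fringe carrier frequency (assumed known)
    half_width  : half-width, in bins, of the band-pass window
    """
    n = len(row)
    spectrum = dft(row)
    # Keep only the positive carrier lobe; this discards the DC term
    # (background intensity) and the conjugate lobe.
    filtered = [spectrum[k] if abs(k - carrier_bin) <= half_width else 0
                for k in range(n)]
    analytic = dft(filtered, inverse=True)
    # Remove the carrier so only the surface-induced phase remains.
    return [cmath.phase(analytic[x] * cmath.exp(-2j * math.pi * carrier_bin * x / n))
            for x in range(n)]


# Synthetic fringe row: carrier at bin 8 with a constant 0.5 rad phase offset.
row = [128 + 100 * math.cos(2 * math.pi * 8 * x / 64 + 0.5) for x in range(64)]
phase = ftp_wrapped_phase(row, carrier_bin=8, half_width=3)
```

The recovered phase is wrapped to (-pi, pi]; resolving that ambiguity and converting phase to height are separate steps that this sketch does not attempt.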
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
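The working principle described above (per-pixel brightness changes reported as events carrying time, location and sign) can be sketched with an idealized simulator that turns ordinary frames into an event list. This is only an illustration under stated assumptions: the log-intensity comparison, the `threshold` and `eps` parameters, and the limit of one event per pixel per frame are simplifications chosen here, and `generate_events` is a hypothetical name.

```python
import math


def generate_events(frames, timestamps, threshold=0.2, eps=1e-3):
    """Idealized event generation from grayscale frames (illustrative only).

    frames     : list of 2D lists of non-negative intensities, same shape
    timestamps : one time value per frame
    Returns a list of (t, x, y, polarity) tuples with polarity +1 or -1.
    """
    height, width = len(frames[0]), len(frames[0][0])
    # Per-pixel reference log-intensity, initialised from the first frame.
    ref = [[math.log(frames[0][y][x] + eps) for x in range(width)]
           for y in range(height)]
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        for y in range(height):
            for x in range(width):
                current = math.log(frame[y][x] + eps)
                diff = current - ref[y][x]
                # Assumed model: fire when the log-brightness change
                # reaches the contrast threshold.
                if abs(diff) >= threshold:
                    events.append((t, x, y, 1 if diff > 0 else -1))
                    ref[y][x] = current  # reset the reference level
    return events


# One-row, two-pixel example: pixel 0 brightens, then pixel 1 darkens.
frames = [[[1.0, 1.0]], [[2.0, 1.0]], [[2.0, 0.5]]]
events = generate_events(frames, timestamps=[0.0, 0.001, 0.002])
```

Pixels whose brightness does not change produce no output at all, which is why the event stream is sparse and asynchronous rather than a fixed-rate sequence of images.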
End-to-end Projector Photometric Compensation
Projector photometric compensation aims to modify a projector input image
such that it can compensate for disturbance from the appearance of the projection
surface. In this paper, for the first time, we formulate the compensation
problem as an end-to-end learning problem and propose a convolutional neural
network, named CompenNet, to implicitly learn the complex compensation
function. CompenNet consists of a UNet-like backbone network and an autoencoder
subnet. Such an architecture encourages rich multi-level interactions between the
camera-captured projection surface image and the input image, and thus captures
both photometric and environment information of the projection surface. In
addition, the visual details and interaction information are carried to deeper
layers along the multi-level skip convolution layers. The architecture is of
particular importance for the projector compensation task, for which only a
small training dataset is allowed in practice. Another contribution we make is
a novel evaluation benchmark, which is independent of system setup and thus
quantitatively verifiable. Such a benchmark was not previously available, to the
best of our knowledge, because conventional evaluation requires the
hardware system to actually project the final results. Our key idea, motivated
by our end-to-end problem formulation, is to use a reasonable surrogate to
avoid such a projection process and thus remain setup-independent. Our method is
evaluated carefully on the benchmark, and the results show that our end-to-end
learning solution outperforms state-of-the-art methods both qualitatively and
quantitatively by a significant margin.
Comment: To appear in the 2019 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). Source code and dataset are available at
https://github.com/BingyaoHuang/compenne
Using the discrete Hadamard transform to detect moving objects in surveillance video
In this paper we present an approach to object detection in surveillance video based on detecting moving edges
using the Hadamard transform. The proposed method is characterized by robustness to illumination changes
and ghosting effects and provides high speed detection, making it particularly suitable for surveillance applications.
In addition to presenting an approach to moving edge detection using the Hadamard transform, we
introduce two measures to track edge history, Pixel Bit Mask Difference (PBMD) and History Update Value
(HUV) that help reduce the false detections commonly experienced by approaches based on moving edges.
Experimental results show that the proposed algorithm overcomes the traditional drawbacks of frame differencing
and outperforms existing edge-based approaches in terms of both detection results and computational
complexity.
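For readers unfamiliar with the transform named above, the following minimal sketch shows a fast Walsh-Hadamard transform and one property that makes it convenient for change detection: a uniform brightness offset alters only the first (DC) coefficient. The `ac_change` measure is a hypothetical illustration and is not the PBMD or HUV measure from the paper, whose definitions are not given in this abstract.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform, Hadamard order, unnormalised.

    The input length must be a power of two. Only additions and
    subtractions are used, which keeps the transform cheap.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data


def ac_change(block_prev, block_curr):
    """Sum of absolute differences over the non-DC coefficients.

    Hypothetical change measure for two flattened pixel blocks taken
    from consecutive frames; it ignores coefficient 0 (the block sum).
    """
    prev = fwht(block_prev)
    curr = fwht(block_curr)
    return sum(abs(c - p) for p, c in zip(prev[1:], curr[1:]))


# A uniform +20 brightness shift leaves the non-DC coefficients unchanged,
# while moving the bright pixels within the block does not.
uniform_shift = ac_change([10, 10, 50, 50], [30, 30, 70, 70])
moved_edge = ac_change([10, 10, 50, 50], [10, 50, 50, 10])
```

In this example `uniform_shift` is zero and `moved_edge` is positive, which gives some intuition for why transform-domain comparisons can tolerate global illumination changes better than raw frame differencing.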