Kernel Cross-Correlator
The cross-correlator plays a significant role in many visual perception tasks,
such as object detection and tracking. Going beyond the linear cross-correlator,
this paper proposes a kernel cross-correlator (KCC) that breaks traditional
limitations. First, by introducing the kernel trick, KCC extends linear
cross-correlation to non-linear spaces, making it more robust to signal noise
and distortion. Second, connections to existing works show that KCC provides a
unified solution for correlation filters. Third, KCC is applicable to any
kernel function and is not limited to a circulant structure on the training
data, so it can predict affine transformations with customized properties.
Last, by leveraging the fast Fourier transform (FFT), KCC avoids direct
calculation of kernel vectors, achieving better performance at a reasonable
computational cost. Comprehensive experiments on visual tracking and human
activity recognition using wearable devices demonstrate its robustness,
flexibility, and efficiency. The source
codes of both experiments are released at https://github.com/wang-chen/KCC
Comment: The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)
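The FFT trick that makes correlation filters fast can be illustrated with a minimal sketch (illustrative only, not the paper's KCC implementation): circular cross-correlation computed in the frequency domain matches the direct O(n^2) definition while costing only O(n log n).

```python
import numpy as np

def circular_xcorr_fft(x, z):
    """Circular cross-correlation via the FFT, O(n log n)."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(z)))

def circular_xcorr_direct(x, z):
    """Direct O(n^2) definition: corr[s] = sum_t x[t] * z[(t+s) mod n]."""
    n = len(x)
    return np.array([np.dot(x, np.roll(z, -s)) for s in range(n)])
```

KCC replaces the inner products underlying this operation with kernel evaluations, which is where the non-linearity comes from.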
RED: Deep Recurrent Neural Networks for Sleep EEG Event Detection
During sleep, brain electrical activity exhibits several short events that
appear as distinctive micro-structures in the electroencephalogram (EEG), such
as sleep spindles and K-complexes. These events have been
associated with biological processes and neurological disorders, making them a
research topic in sleep medicine. However, manual detection limits their study
because it is time-consuming and affected by significant inter-expert
variability, motivating automatic approaches. We propose a deep learning
approach based on convolutional and recurrent neural networks for sleep EEG
event detection called Recurrent Event Detector (RED). RED uses one of two
input representations: a) the time-domain EEG signal, or b) a complex
spectrogram of the signal obtained with the Continuous Wavelet Transform (CWT).
Unlike previous approaches, a fixed time window is avoided and temporal context
is integrated to better emulate the visual criteria of experts. When evaluated
on the MASS dataset, our detectors outperform the state of the art in both
sleep spindle and K-complex detection with a mean F1-score of at least 80.9%
and 82.6%, respectively. Although the CWT-domain model obtained performance
similar to that of its time-domain counterpart, the former in principle allows a
more interpretable input representation due to the use of a spectrogram. The
proposed approach is event-agnostic and can be used directly to detect other
types of sleep events.
Comment: 8 pages, 5 figures. In proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN 2020)
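The complex spectrogram of input representation (b) can be sketched with a bank of complex Morlet wavelets; this is an illustrative stand-in for the paper's pipeline, with the wavelet length, scales, and centre frequency `w0` all chosen arbitrarily.

```python
import numpy as np

def morlet(length, scale, w0=5.0):
    """Complex Morlet wavelet sampled at `length` points for one scale."""
    t = (np.arange(length) - length // 2) / scale
    return np.exp(1j * w0 * t) * np.exp(-t ** 2 / 2.0) / np.sqrt(scale)

def cwt_spectrogram(signal, scales, wavelet_len=64):
    """Magnitude of the CWT: one row per scale, same length as the signal."""
    rows = [np.convolve(signal, morlet(wavelet_len, s), mode="same")
            for s in scales]
    return np.abs(np.stack(rows))
```

Keeping the complex coefficients (before the magnitude) would preserve phase, which is the "complex spectrogram" the abstract refers to.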
Error-resilient performance of Dirac video codec over packet-erasure channel
Video transmission over wireless or wired networks requires error-resilient mechanisms, since compressed video bitstreams are sensitive to transmission errors owing to the use of predictive coding and variable-length coding. This paper investigates the performance of a simple, low-complexity error-resilient coding scheme that combines source and channel coding to protect the compressed bitstream of the wavelet-based Dirac video codec over a packet-erasure channel. By partitioning the wavelet transform coefficients of the motion-compensated residual frame into groups and independently processing each group with arithmetic and Forward Error Correction (FEC) coding, Dirac achieves robustness to transmission errors, with video quality degrading gracefully at packet loss rates of up to 30% when compared with conventional FEC-only methods. Simulation results also show that the proposed scheme with multiple partitions can achieve up to 10 dB PSNR gain over the existing un-partitioned format. This paper also compares the error-resilient performance of the proposed scheme with that of H.264 over a packet-erasure channel
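The group-partitioning idea can be sketched with the simplest possible erasure code: one XOR parity packet appended per group, which lets any single erased packet in a group be rebuilt from the survivors. This is a toy stand-in for the paper's FEC; real systems would use stronger codes such as Reed-Solomon.

```python
import numpy as np

def add_parity(group):
    """Append one XOR parity packet to a group of equal-length uint8 packets."""
    parity = np.bitwise_xor.reduce(np.stack(group), axis=0)
    return group + [parity]

def recover(protected_group, lost_idx):
    """Rebuild a single erased packet from the survivors in the group."""
    survivors = [p for i, p in enumerate(protected_group) if i != lost_idx]
    return np.bitwise_xor.reduce(np.stack(survivors), axis=0)
```

Partitioning coefficients into independent groups confines each loss to one group, which is what produces the graceful quality degradation the abstract describes.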
A comprehensive study on light signals of opportunity for subdecimetre unmodulated visible light positioning
Currently, enabling an illumination infrastructure for visible light positioning (VLP) requires a costly retrofit. Intensity-modulation systems not only necessitate changes to the internal LED driving module, but also decrease the LEDs' radiant flux. This hinders the infrastructure's ability to meet the maintained-illuminance standards. Ideally, the LEDs could be left unmodulated, i.e., unmodulated VLP (uVLP). uVLP systems, inherently low-cost, exploit the characteristics of light signals of opportunity (LSOOP) to infer a position. In this paper, it is shown that proper signal processing allows using the LED's characteristic frequency (CF) as a discriminative feature in photodiode (PD)-based received signal strength (RSS) uVLP. This manuscript investigates and compares the aptitude of (future) RSS-based uVLP and VLP systems in terms of their feasibility, cost and accuracy. It demonstrates that CF-based uVLP exhibits an acceptable loss of accuracy compared to (regular) VLP. For point source-like LEDs, uVLP only worsens the trilateration-based median (p50) and 90th percentile (p90) root-mean-square error from 5.3 cm to 7.9 cm (+50%) and from 9.6 cm to 15.6 cm (+62%), respectively, in the 4 m x 4 m room under consideration. A large experimental validation shows that employing a robust model-based fingerprinting localisation procedure, instead of trilateration, further boosts uVLP's p50 and p90 accuracy to 5.0 cm and 10.6 cm. When compared with VLP's p50 = 3.5 cm and p90 = 6.8 cm, uVLP exhibits comparable positioning performance at a significantly lower cost and a higher maintained illuminance, all of which underline uVLP's high adoption potential. With this work, a significant step is taken towards the development of an accurate and low-cost tracking system
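The trilateration step mentioned above can be sketched as a linear least-squares problem (an illustrative sketch: the anchor layout below is made up, and the paper's RSS-to-distance model is not reproduced here). Subtracting the first range equation from the others eliminates the quadratic term in the unknown position.

```python
import numpy as np

def trilaterate(anchors, dists):
    """Least-squares 2-D position from distances to three or more anchors.

    Subtracting the first range equation from the others linearises the
    problem: 2 (a_i - a_0) . p = d_0^2 - d_i^2 + |a_i|^2 - |a_0|^2.
    """
    a0, d0 = anchors[0], dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (d0 ** 2 - dists[1:] ** 2
         + np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2))
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos
```

With four ceiling LEDs as anchors and noise-free distances, the least-squares solution recovers the receiver position exactly; RSS noise is what degrades the p50/p90 figures quoted in the abstract.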
Local Descriptors Optimized for Average Precision
Extraction of local feature descriptors is a vital stage in the solution
pipelines for numerous computer vision tasks. Learning-based approaches improve
performance in certain tasks, but still cannot replace handcrafted features in
general. In this paper, we improve the learning of local feature descriptors by
optimizing the performance of descriptor matching, which is a common stage that
follows descriptor extraction in local feature based pipelines, and can be
formulated as nearest neighbor retrieval. Specifically, we directly optimize a
ranking-based retrieval performance metric, Average Precision, using deep
neural networks. This general-purpose solution can also be viewed as a listwise
learning to rank approach, which is advantageous compared to recent local
ranking approaches. On standard benchmarks, descriptors learned with our
formulation achieve state-of-the-art results in patch verification, patch
retrieval, and image matching.
Comment: 13 pages, 8 figures. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 201
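The retrieval metric being optimized, Average Precision, can be computed for a ranked list of matches as follows (a plain evaluation sketch, not the paper's differentiable surrogate):

```python
import numpy as np

def average_precision(ranked_labels):
    """AP of a ranked list of 0/1 relevance labels (1 = correct match)."""
    labels = np.asarray(ranked_labels, dtype=float)
    if labels.sum() == 0:
        return 0.0
    hits = np.flatnonzero(labels)                      # 0-based ranks of hits
    precision_at_hits = np.cumsum(labels)[hits] / (hits + 1)
    return float(precision_at_hits.mean())
```

AP rewards placing every correct match above every incorrect one, which is why optimizing it directly acts as a listwise learning-to-rank objective.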
Shaped Pupil Lyot Coronagraphs: High-Contrast Solutions for Restricted Focal Planes
Coronagraphs of the apodized pupil and shaped pupil varieties use the
Fraunhofer diffraction properties of amplitude masks to create regions of high
contrast in the vicinity of a target star. Here we present a hybrid coronagraph
architecture in which a binary, hard-edged shaped pupil mask replaces the gray,
smooth apodizer of the apodized pupil Lyot coronagraph (APLC). For any contrast
and bandwidth goal in this configuration, as long as the prescribed region of
contrast is restricted to a finite area in the image, a shaped pupil is the
apodizer with the highest transmission. We relate the starlight cancellation
mechanism to that of the conventional APLC. We introduce a new class of
solutions in which the amplitude profile of the Lyot stop, instead of being
fixed as a padded replica of the telescope aperture, is jointly optimized with
the apodizer. Finally, we describe shaped pupil Lyot coronagraph (SPLC) designs
for the baseline architecture of the Wide-Field Infrared Survey
Telescope-Astrophysics Focused Telescope Assets (WFIRST-AFTA) coronagraph.
These SPLCs help to enable two scientific objectives of the WFIRST-AFTA
mission: (1) broadband spectroscopy to characterize exoplanet atmospheres in
reflected starlight and (2) debris disk imaging.
Comment: 41 pages, 15 figures; published in the JATIS special section on WFIRST-AFTA coronagraph
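The claim that the shaped pupil is the highest-transmission apodizer for a restricted dark zone reflects the fact that such designs can be posed as linear programs: maximise total mask transmission subject to bounds on the diffracted field amplitude over the dark zone. A 1-D toy analogue (all sizes, frequencies, and the contrast level below are arbitrary, and this LP relaxation yields a gray mask rather than a binary one) can be written as:

```python
import numpy as np
from scipy.optimize import linprog

n = 32                                   # pupil samples (toy size)
t = np.arange(n)
dark = np.linspace(6.0, 12.0, 13)        # spatial frequencies to darken
eps = 1e-3                               # field-amplitude bound in the dark zone

# Field at frequency xi: f(xi) = sum_t a_t exp(-2i pi xi t / n) / n.
C = np.cos(2 * np.pi * np.outer(dark, t) / n) / n
S = -np.sin(2 * np.pi * np.outer(dark, t) / n) / n

# |Re f| <= eps and |Im f| <= eps as four banks of linear inequalities.
A_ub = np.vstack([C, -C, S, -S])
b_ub = np.full(A_ub.shape[0], eps)

# Maximise total transmission sum(a) with 0 <= a_t <= 1.
res = linprog(c=-np.ones(n), A_ub=A_ub, b_ub=b_ub, bounds=(0.0, 1.0))
mask = res.x
```

Real shaped pupil design works in 2-D with the telescope aperture geometry and binarity constraints, but the transmission-versus-contrast trade-off has the same linear-programming structure.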
Efficient neural network verification and training
In spite of their highly publicized achievements in disparate applications, neural networks are yet to be widely deployed in safety-critical applications. In fact, fundamental concerns exist about the robustness, fairness, privacy and explainability of deep learning systems. In this thesis, we strive to increase trust in deep learning systems by presenting contributions pertaining to neural network verification and training. First, by designing dual solvers for popular network relaxations, we provide fast and scalable bounds on neural network outputs. In particular, we present two solvers for the convex hull of element-wise activation functions, and two algorithms for a formulation based on the convex hull of the composition of ReLU activations with the preceding linear layer. We show that these methods are significantly faster than off-the-shelf solvers, and improve on the speed-accuracy trade-offs of previous dual algorithms. In order to employ them efficiently for formal neural network verification, we design a massively parallel Branch-and-Bound framework around the bounding algorithms. Our contributions, which we publicly released as part of the OVAL verification framework, improved on the scalability of existing network verifiers, and proved influential for the development of more recent algorithms. Second, we present an intuitive and inexpensive algorithm to train neural networks for verifiability via Branch-and-Bound. Our method is shown to yield state-of-the-art performance on verifying robustness to small adversarial perturbations while reducing the training costs compared to previous algorithms. Finally, we conduct a comprehensive experimental evaluation of specialized training schemes that train networks for multiple tasks at once, showing that they perform on par with a simple baseline. We provide a partial explanation of our surprising results, aiming to spur further research towards the understanding of deep multi-task learning.
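The bounding step that such verifiers build on can be illustrated with the simplest relaxation, interval bound propagation (a baseline sketch, not the thesis's tighter dual solvers): box bounds on the input are pushed through each linear layer by splitting the weights into positive and negative parts, and through ReLU by monotonicity.

```python
import numpy as np

def interval_linear(lo, hi, W, b):
    """Propagate elementwise box bounds through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def interval_relu(lo, hi):
    """ReLU is monotone, so bounds pass through directly."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)
```

These bounds are sound but loose; the tighter convex-hull relaxations described above, combined with Branch-and-Bound over input or activation splits, are what make verification of interesting properties tractable.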