A Wide Dynamic Range CMOS Imager with Extended Shunting Inhibition Image Processing Capabilities
A CMOS imager based on a novel mixed-mode VLSI implementation of biologically inspired shunting inhibition vision models is presented. It can perform a wide range of image processing tasks, such as image enhancement and edge detection, via a programmable shunting inhibition processor. Its most important feature is a gain control mechanism allowing local and global adaptation to the mean input light intensity. This feature is shown to be very suitable for wide dynamic range imagers.
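The shunting inhibition gain control described above divides each pixel by a term driven by its local mean intensity. A minimal software sketch of the idea follows; the paper realizes it in mixed-mode VLSI hardware, and the kernel size and offset constant here are illustrative assumptions, not the chip's parameters:

```python
import numpy as np

def shunting_inhibition(image, k=3, a=1.0):
    """Divisive gain control: O = I / (a + local mean of a k x k window).

    Bright regions are compressed, dim regions preserved, which is what
    extends the usable dynamic range.
    """
    pad = k // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    out = np.empty(image.shape, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            local = padded[i:i + k, j:j + k].mean()
            out[i, j] = image[i, j] / (a + local)
    return out

# A step edge with a 100:1 intensity ratio between its two halves.
img = np.array([[1.0, 1.0, 100.0, 100.0]] * 4)
enhanced = shunting_inhibition(img)
```

On this toy edge the 100:1 input ratio compresses to roughly 2:1 in the output while the edge itself survives, which is the local-adaptation behaviour the abstract describes.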
Hands-on Bayesian Neural Networks -- a Tutorial for Deep Learning Users
Modern deep learning methods constitute incredibly powerful tools to tackle a
myriad of challenging problems. However, since deep learning methods operate as
black boxes, the uncertainty associated with their predictions is often
challenging to quantify. Bayesian statistics offer a formalism to understand
and quantify the uncertainty associated with deep neural network predictions.
This tutorial provides an overview of the relevant literature and a complete
toolset to design, implement, train, use and evaluate Bayesian Neural Networks,
i.e., Stochastic Artificial Neural Networks trained using Bayesian methods. (Comment: 35 pages, 15 figures)
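The core idea of a stochastic network, with weights carrying a posterior distribution rather than point values, can be sketched in a few lines. Here the "network" is a single linear unit and its Gaussian posterior parameters are invented for illustration; a real BNN would learn them, e.g. variationally:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned posterior over one weight: w ~ N(mu, sigma^2).
mu, sigma = 2.0, 0.3

def predict(x, n_samples=5000):
    """Monte Carlo prediction: sample weights, run one forward pass per
    sample, and summarize the resulting predictive distribution."""
    w = rng.normal(mu, sigma, size=n_samples)  # draws from the posterior
    ys = w * x                                 # forward passes
    return ys.mean(), ys.std()                 # prediction + uncertainty

mean, std = predict(3.0)
```

The predictive standard deviation is the quantity the abstract refers to: it grows with posterior uncertainty, giving a calibrated notion of how much to trust each prediction.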
Active-Passive SimStereo -- Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods
In stereo vision, self-similar or bland regions can make it difficult to
match patches between two images. Active stereo-based methods mitigate this
problem by projecting a pseudo-random pattern on the scene so that each patch
of an image pair can be identified without ambiguity. However, the projected
pattern significantly alters the appearance of the image. If this pattern acts
as a form of adversarial noise, it could negatively impact the performance of
deep learning-based methods, which are now the de facto standard for dense
stereo vision. In this paper, we propose the Active-Passive SimStereo dataset
and a corresponding benchmark to evaluate the performance gap between passive
and active stereo images for stereo matching algorithms. Using the proposed
benchmark and an additional ablation study, we show that the feature extraction
and matching modules of twenty selected deep learning-based
stereo matching methods generalize to active stereo without a problem. However,
the disparity refinement modules of three of the twenty architectures (ACVNet,
CascadeStereo, and StereoNet) are negatively affected by the active stereo
patterns due to their reliance on the appearance of the input images. (Comment: 22 pages, 12 figures; accepted to the NeurIPS 2022 Datasets and Benchmarks Track)
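The motivation for active stereo, namely that a pseudo-random projected pattern removes matching ambiguity in bland regions, can be sketched with 1-D sum-of-squared-differences patch matching. The scanline, pattern, and patch width below are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def best_match(left_patch, right_row, patch_w):
    """Return the column in right_row minimizing SSD, plus all costs."""
    costs = [np.sum((right_row[c:c + patch_w] - left_patch) ** 2)
             for c in range(len(right_row) - patch_w + 1)]
    return int(np.argmin(costs)), costs

# Passive case: a bland (constant) scanline. Every candidate position
# matches the patch equally well, so the disparity is ambiguous.
bland = np.ones(40)
_, costs = best_match(bland[10:15], bland, 5)
ambiguous = sum(c == min(costs) for c in costs) > 1

# Active case: project a pseudo-random pattern onto the same scene.
# Each patch becomes unique and the true disparity is recoverable.
true_shift = 7
left = bland + rng.normal(0, 1, size=40)
right = np.roll(left, -true_shift)        # simulated horizontal disparity
col, _ = best_match(left[10:15], right, 5)
disparity = 10 - col
```

This is exactly the property the benchmark probes: whether learned feature extractors, trained mostly on passive texture, still behave well when the "texture" is an injected pattern.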
Transformers in Small Object Detection: A Benchmark and Survey of State-of-the-Art
Transformers have rapidly gained popularity in computer vision, especially in
the field of object recognition and detection. Upon examining the outcomes of
state-of-the-art object detection methods, we noticed that transformers
consistently outperformed well-established CNN-based detectors in almost every
video or image dataset. As transformer-based approaches remain at the
forefront of small object detection (SOD) techniques, this paper aims to
explore the performance benefits offered by such extensive networks and
identify potential reasons for their SOD superiority. Small objects have been
identified as one of the most challenging object types in detection frameworks
due to their low visibility. We aim to investigate potential strategies that
could enhance transformers' performance in SOD. This survey presents a taxonomy
of over 60 research studies on developed transformers for the task of SOD,
spanning the years 2020 to 2023. These studies encompass a variety of detection
applications, including small object detection in generic images, aerial
images, medical images, active millimeter-wave images, underwater images, and
videos. We also compile and present a list of 12 large-scale datasets suitable
for SOD that were overlooked in previous studies and compare the performance of
the reviewed studies using popular metrics such as mean Average Precision
(mAP), Frames Per Second (FPS), number of parameters, and more. Researchers can
keep track of newer studies on our web page, which is available at
https://github.com/arekavandi/Transformer-SOD.
Reinforced Learning for Label-Efficient 3D Face Reconstruction
3D face reconstruction plays a major role in many human-robot interaction systems, from automatic face authentication to human-computer interface-based entertainment. To improve robustness against occlusions and noise, 3D face reconstruction networks are often trained on a set of in-the-wild face images, preferably captured along different viewpoints of the subject. However, collecting the required large amounts of 3D annotated face data is expensive and time-consuming. To address the high annotation cost and due to the importance of training on a useful set, we propose an Active Learning (AL) framework that actively selects the most informative and representative samples to be labeled. To the best of our knowledge, this paper is the first work on tackling active learning for 3D face reconstruction to enable a label-efficient training strategy. In particular, we propose a Reinforcement Active Learning approach in conjunction with a clustering-based pooling strategy to select informative view-points of the subjects. Experimental results on the 300W-LP and AFLW2000 datasets demonstrate that our proposed method is able to 1) efficiently select the most influential view-points for labeling, outperforming several baseline AL techniques, and 2) further improve the performance of a 3D face reconstruction network trained on the full dataset.
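The clustering-based pooling step can be sketched as: cluster the unlabeled pool's features, then query one representative sample per cluster so the labeled set covers distinct view-points rather than near-duplicates. Everything below (2-D synthetic features, a tiny k-means with deterministic initialisation) is an illustrative stand-in, and the paper's RL-based query policy is omitted entirely:

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, k, iters=20):
    # Deterministic init for this sketch: one centre per stride of the pool.
    centers = X[:: len(X) // k][:k].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Unlabeled pool of "view-point features": three well-separated groups
# standing in for three distinct head poses.
pool = np.vstack([rng.normal(c, 0.1, size=(30, 2)) for c in (0.0, 3.0, 6.0)])
centers, labels = kmeans(pool, k=3)

# Pooling: query the sample nearest each cluster centre for labeling,
# giving one representative per pose group.
queries = [int(np.argmin(((pool - c) ** 2).sum(-1))) for c in centers]
picked_clusters = {int(labels[q]) for q in queries}
```

The design point is coverage: a purely uncertainty-driven selector can repeatedly pick near-identical views, while clustering forces the labeling budget across distinct regions of the pool.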
Automatic Hierarchical Classification of Kelps Utilizing Deep Residual Features
Across the globe, remote image data is rapidly being collected for the
assessment of benthic communities from shallow to extremely deep waters on
continental slopes to the abyssal seas. Exploiting this data is presently
limited by the time it takes for experts to identify organisms found in these
images. With this limitation in mind, a large effort has been made globally to
introduce automation and machine learning algorithms to accelerate both
classification and assessment of marine benthic biota. One major issue lies
with organisms that move with swell and currents, like kelps. This paper
presents an automatic hierarchical classification method (local binary
classification as opposed to the conventional flat classification) to classify
kelps in images collected by autonomous underwater vehicles. The proposed kelp
classification approach exploits learned feature representations extracted from
deep residual networks. We show that these generic features outperform the
traditional off-the-shelf CNN features and the conventional hand-crafted
features. Experiments also demonstrate that the hierarchical classification
method outperforms the traditional parallel multi-class classifications by a
significant margin (90.0% vs 57.6% and 77.2% vs 59.0%) on the Benthoz15 and
Rottnest datasets, respectively. Furthermore, we compare different hierarchical
classification approaches and experimentally show that the sibling hierarchical
training approach outperforms the inclusive hierarchical approach by a
significant margin. We also report an application of our proposed method to
study the change in kelp cover over time for annually repeated AUV surveys. (Comment: published in MDPI Sensors)
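The structural difference between flat multi-class and hierarchical "local binary" classification is that the latter walks a label tree and makes a small decision at each node. A toy sketch with hypothetical threshold rules on made-up features (the real method trains a classifier per node on deep residual features, and the class names here are illustrative):

```python
def classify_hierarchical(x):
    """Walk a two-level label tree, one local binary decision per node."""
    # Root node: substrate vs biota.
    if x["coverage"] < 0.2:
        return "substrate"
    # Sibling node under biota: kelp vs other algae. Under the "sibling"
    # training regime, this node only ever discriminates its siblings,
    # never the substrate class already ruled out above.
    if x["blade_score"] > 0.5:
        return "kelp"
    return "other_algae"

samples = [
    {"coverage": 0.1, "blade_score": 0.9},   # bare substrate
    {"coverage": 0.8, "blade_score": 0.7},   # kelp
    {"coverage": 0.6, "blade_score": 0.2},   # non-kelp algae
]
labels = [classify_hierarchical(s) for s in samples]
```

Each node thus solves an easier, more balanced problem than one flat classifier over all leaf classes, which is the intuition behind the margins reported above.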
Deep Image Representations for Coral Image Classification
Healthy coral reefs play a vital role in maintaining biodiversity in tropical marine ecosystems. Remote imaging techniques have facilitated the scientific investigations of these intricate ecosystems, particularly at depths beyond 10 m where SCUBA diving techniques are not time- or cost-efficient. With millions of digital images of the seafloor collected using remotely operated vehicles and autonomous underwater vehicles (AUVs), manual annotation of these data by marine experts is a tedious, repetitive, and time-consuming task. It takes 10–30 min for a marine expert to meticulously annotate a single image. Automated technology to monitor the health of the oceans would allow for transformational ecological outcomes by standardizing methods to detect and identify species. This paper aims to automate the analysis of large available AUV imagery by developing advanced deep learning tools for rapid and large-scale automatic annotation of marine coral species. Such an automated technology would greatly benefit marine ecological studies in terms of cost, speed, and accuracy. To this end, we propose a deep learning based classification method for coral reefs and report the application of the proposed technique to the automatic annotation of unlabeled mosaics of the coral reef in the Abrolhos Islands, W.A., Australia. Our proposed method automatically quantified the coral coverage in this region and detected a decreasing trend in coral population, which is in line with conclusions drawn by marine ecologists.
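The coverage quantification at the end is, at its core, per-patch classification followed by a fraction: label every patch of a mosaic, then report the share labeled coral. A trivial sketch with precomputed stand-in labels (a real run would obtain them from the deep classifier, and the class names are illustrative):

```python
# Stand-in per-patch predictions for one mosaic tile.
patch_labels = ["coral", "sand", "coral", "algae", "coral", "sand"]

# Coral coverage = fraction of patches classified as coral.
coverage = patch_labels.count("coral") / len(patch_labels)
```

Repeating this per survey year is what exposes the coverage trend the abstract reports.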
A High Resolution Color Image Restoration Algorithm for Thin TOMBO Imaging Systems
In this paper, we present a blind image restoration algorithm to reconstruct a high resolution (HR) color image from multiple, low resolution (LR), degraded and noisy images captured by thin (< 1mm) TOMBO imaging systems. The proposed algorithm is an extension of our grayscale algorithm reported in [1] to the case of color images. In this color extension, each Point Spread Function (PSF) of each captured image is assumed to be different from one color component to another and from one imaging unit to the other. For the task of image restoration, we use all spectral information in each captured image to restore each output pixel in the reconstructed HR image, i.e., we use the most efficient global category of point operations. First, the composite RGB color components of each captured image are extracted. A blind estimation technique is then applied to estimate the spectra of each color component and its associated blurring PSF. The estimation process is formulated in a way that significantly minimizes the interchannel cross-correlations and additive noise. The estimated PSFs together with advanced interpolation techniques are then combined to compensate for blur and reconstruct an HR color image of the original scene. Finally, a histogram normalization process adjusts the balance between image color components, brightness and contrast. Simulated and experimental results reveal that the proposed algorithm is capable of restoring HR color images from degraded, LR and noisy observations even at low Signal-to-Noise Energy ratios (SNERs). The proposed algorithm uses FFT and only two fundamental image restoration constraints, making it suitable for silicon integration with the TOMBO imager.
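The restoration core, per-channel FFT-domain inversion of a blurring PSF, can be sketched with a standard (non-blind) Wiener filter. The paper's blind algorithm additionally estimates the PSF per channel and per imaging unit, which is omitted here, and the regularization constant is an arbitrary choice for this sketch:

```python
import numpy as np

def wiener_deconvolve(blurred, psf, k=1e-6):
    """Restore one colour channel by FFT-domain Wiener filtering.

    F_hat = conj(H) * G / (|H|^2 + k): divides out the blur where the
    PSF spectrum H is strong, and regularizes where it is weak.
    """
    H = np.fft.fft2(psf, s=blurred.shape)
    G = np.fft.fft2(blurred)
    F_hat = np.conj(H) * G / (np.abs(H) ** 2 + k)
    return np.real(np.fft.ifft2(F_hat))

# Blur a synthetic channel with a 3x3 box PSF (circular convolution
# via the FFT), then restore it.
rng = np.random.default_rng(0)
channel = rng.random((32, 32))
psf = np.ones((3, 3)) / 9.0
blurred = np.real(np.fft.ifft2(
    np.fft.fft2(channel) * np.fft.fft2(psf, s=channel.shape)))
restored = wiener_deconvolve(blurred, psf)
err = np.abs(restored - channel).mean()
```

Running each RGB component through such a filter with its own estimated PSF, then recombining and normalizing the histogram, mirrors the pipeline the abstract outlines; the FFT-only structure is what makes it attractive for on-chip integration.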