51 research outputs found

    A wide dynamic range CMOS imager with extended shunting inhibition image processing capabilities

    A CMOS imager based on a novel mixed-mode VLSI implementation of biologically inspired shunting inhibition vision models is presented. It can perform a wide range of image processing tasks, such as image enhancement or edge detection, via a programmable shunting inhibition processor. Its most important feature is a gain control mechanism allowing local and global adaptation to the mean input light intensity. This feature is shown to be very suitable for wide dynamic range imagers.

    Hands-on Bayesian Neural Networks -- a Tutorial for Deep Learning Users

    Modern deep learning methods constitute incredibly powerful tools to tackle a myriad of challenging problems. However, since deep learning methods operate as black boxes, the uncertainty associated with their predictions is often challenging to quantify. Bayesian statistics offer a formalism to understand and quantify the uncertainty associated with deep neural network predictions. This tutorial provides an overview of the relevant literature and a complete toolset to design, implement, train, use and evaluate Bayesian Neural Networks, i.e. Stochastic Artificial Neural Networks trained using Bayesian methods. Comment: 35 pages, 15 figures

    Active-Passive SimStereo -- Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods

    In stereo vision, self-similar or bland regions can make it difficult to match patches between two images. Active stereo-based methods mitigate this problem by projecting a pseudo-random pattern on the scene so that each patch of an image pair can be identified without ambiguity. However, the projected pattern significantly alters the appearance of the image. If this pattern acts as a form of adversarial noise, it could negatively impact the performance of deep learning-based methods, which are now the de facto standard for dense stereo vision. In this paper, we propose the Active-Passive SimStereo dataset and a corresponding benchmark to evaluate the performance gap between passive and active stereo images for stereo matching algorithms. Using the proposed benchmark and an additional ablation study, we show that the feature extraction and matching modules of a selection of twenty deep learning-based stereo matching methods generalize to active stereo without a problem. However, the disparity refinement modules of three of the twenty architectures (ACVNet, CascadeStereo, and StereoNet) are negatively affected by the active stereo patterns due to their reliance on the appearance of the input images. Comment: 22 pages, 12 figures, accepted in NeurIPS 2022 Datasets and Benchmarks Track

    Transformers in Small Object Detection: A Benchmark and Survey of State-of-the-Art

    Transformers have rapidly gained popularity in computer vision, especially in the field of object recognition and detection. Upon examining the outcomes of state-of-the-art object detection methods, we noticed that transformers consistently outperformed well-established CNN-based detectors in almost every video or image dataset. While transformer-based approaches remain at the forefront of small object detection (SOD) techniques, this paper aims to explore the performance benefits offered by such extensive networks and identify potential reasons for their SOD superiority. Small objects have been identified as one of the most challenging object types in detection frameworks due to their low visibility. We aim to investigate potential strategies that could enhance transformers' performance in SOD. This survey presents a taxonomy of over 60 research studies on developed transformers for the task of SOD, spanning the years 2020 to 2023. These studies encompass a variety of detection applications, including small object detection in generic images, aerial images, medical images, active millimeter images, underwater images, and videos. We also compile and present a list of 12 large-scale datasets suitable for SOD that were overlooked in previous studies and compare the performance of the reviewed studies using popular metrics such as mean Average Precision (mAP), Frames Per Second (FPS), number of parameters, and more. Researchers can keep track of newer studies on our web page, which is available at https://github.com/arekavandi/Transformer-SOD.

    Reinforced Learning for Label-Efficient 3D Face Reconstruction

    3D face reconstruction plays a major role in many human-robot interaction systems, from automatic face authentication to human-computer interface-based entertainment. To improve robustness against occlusions and noise, 3D face reconstruction networks are often trained on a set of in-the-wild face images, preferably captured along different viewpoints of the subject. However, collecting the required large amounts of 3D annotated face data is expensive and time-consuming. To address the high annotation cost, and given the importance of training on a useful set, we propose an Active Learning (AL) framework that actively selects the most informative and representative samples to be labeled. To the best of our knowledge, this paper is the first work tackling active learning for 3D face reconstruction to enable a label-efficient training strategy. In particular, we propose a Reinforcement Active Learning approach in conjunction with a clustering-based pooling strategy to select informative viewpoints of the subjects. Experimental results on the 300W-LP and AFLW2000 datasets demonstrate that our proposed method is able to 1) efficiently select the most influential viewpoints for labeling, outperforming several baseline AL techniques, and 2) further improve the performance of a 3D Face Reconstruction network trained on the full dataset.

    Automatic Hierarchical Classification of Kelps utilizing Deep Residual Features

    Across the globe, remote image data is rapidly being collected for the assessment of benthic communities from shallow to extremely deep waters on continental slopes to the abyssal seas. Exploiting this data is presently limited by the time it takes for experts to identify organisms found in these images. With this limitation in mind, a large effort has been made globally to introduce automation and machine learning algorithms to accelerate both classification and assessment of marine benthic biota. One major issue lies with organisms that move with swell and currents, like kelps. This paper presents an automatic hierarchical classification method (local binary classification as opposed to the conventional flat classification) to classify kelps in images collected by autonomous underwater vehicles. The proposed kelp classification approach exploits learned feature representations extracted from deep residual networks. We show that these generic features outperform the traditional off-the-shelf CNN features and the conventional hand-crafted features. Experiments also demonstrate that the hierarchical classification method outperforms the traditional parallel multi-class classifications by a significant margin (90.0% vs 57.6% and 77.2% vs 59.0%) on Benthoz15 and Rottnest datasets respectively. Furthermore, we compare different hierarchical classification approaches and experimentally show that the sibling hierarchical training approach outperforms the inclusive hierarchical approach by a significant margin. We also report an application of our proposed method to study the change in kelp cover over time for annually repeated AUV surveys. Comment: MDPI Sensors

    Deep Image Representations for Coral Image Classification

    Healthy coral reefs play a vital role in maintaining biodiversity in tropical marine ecosystems. Remote imaging techniques have facilitated the scientific investigation of these intricate ecosystems, particularly at depths beyond 10 m where SCUBA diving techniques are not time- or cost-efficient. With millions of digital images of the seafloor collected using remotely operated vehicles and autonomous underwater vehicles (AUVs), manual annotation of these data by marine experts is a tedious, repetitive, and time-consuming task. It takes 10–30 min for a marine expert to meticulously annotate a single image. Automated technology to monitor the health of the oceans would allow for transformational ecological outcomes by standardizing methods to detect and identify species. This paper aims to automate the analysis of large available AUV imagery by developing advanced deep learning tools for rapid and large-scale automatic annotation of marine coral species. Such an automated technology would greatly benefit marine ecological studies in terms of cost, speed, and accuracy. To this end, we propose a deep learning based classification method for coral reefs and report the application of the proposed technique to the automatic annotation of unlabeled mosaics of the coral reef in the Abrolhos Islands, W.A., Australia. Our proposed method automatically quantified the coral coverage in this region and detected a decreasing trend in coral population, which is in line with conclusions drawn by marine ecologists.

    A High Resolution Color Image Restoration Algorithm for Thin TOMBO Imaging Systems

    In this paper, we present a blind image restoration algorithm to reconstruct a high resolution (HR) color image from multiple low resolution (LR), degraded and noisy images captured by thin (< 1 mm) TOMBO imaging systems. The proposed algorithm is an extension of our grayscale algorithm reported in [1] to the case of color images. In this color extension, the Point Spread Function (PSF) of each captured image is assumed to be different from one color component to another and from one imaging unit to the other. For the task of image restoration, we use all spectral information in each captured image to restore each output pixel in the reconstructed HR image, i.e., we use the most efficient global category of point operations. First, the composite RGB color components of each captured image are extracted. A blind estimation technique is then applied to estimate the spectra of each color component and its associated blurring PSF. The estimation process is formulated in a way that significantly minimizes the interchannel cross-correlations and additive noise. The estimated PSFs together with advanced interpolation techniques are then combined to compensate for blur and reconstruct an HR color image of the original scene. Finally, a histogram normalization process adjusts the balance between image color components, brightness and contrast. Simulated and experimental results reveal that the proposed algorithm is capable of restoring HR color images from degraded, LR and noisy observations even at low Signal-to-Noise Energy ratios (SNERs). The proposed algorithm uses FFT and only two fundamental image restoration constraints, making it suitable for silicon integration with the TOMBO imager.