384 research outputs found
Underwater Fish Detection with Weak Multi-Domain Supervision
Given a sufficiently large training dataset, it is relatively easy to train a
modern convolutional neural network (CNN) as an image classifier.
However, for the task of fish classification and/or fish detection, if a CNN
is trained to detect or classify particular fish species in particular
background habitats, the same CNN exhibits much lower accuracy when applied to
new/unseen fish species and/or fish habitats. Therefore, in practice, the CNN
needs to be continuously fine-tuned to improve its classification accuracy to
handle new project-specific fish species or habitats. In this work we present a
labelling-efficient method of training a CNN-based fish-detector (the Xception
CNN was used as the base) on relatively small numbers (4,000) of project-domain
underwater fish/no-fish images from 20 different habitats. Additionally, 17,000
known-negative (that is, containing no fish) general-domain (VOC2012) above-water
images were used. Two publicly available fish-domain datasets supplied an
additional 27,000 above-water and underwater positive (fish) images. By using
this multi-domain collection of images, the trained Xception-based binary
(fish/not-fish) classifier achieved 0.17% false-positives and 0.61%
false-negatives on the project's 20,000 negative and 16,000 positive holdout
test images, respectively. The area under the ROC curve (AUC) was 99.94%.
Comment: Published in the 2019 International Joint Conference on Neural
Networks (IJCNN-2019), Budapest, Hungary, July 14-19, 2019,
https://www.ijcnn.org/ , https://ieeexplore.ieee.org/document/885190
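The false-positive and false-negative percentages quoted above are ratios over the negative and positive holdout sets, respectively. The following is a minimal sketch of that arithmetic. The error counts (34 and 98) are hypothetical values chosen only to be consistent with the reported rates; the abstract does not give them.

```python
def error_rates(false_pos, true_neg, false_neg, true_pos):
    """Return (false-positive rate, false-negative rate) as fractions."""
    fpr = false_pos / (false_pos + true_neg)  # share of negatives flagged as fish
    fnr = false_neg / (false_neg + true_pos)  # share of fish images missed
    return fpr, fnr


# Holdout sizes from the abstract; the error counts are assumptions.
negatives, positives = 20_000, 16_000
false_pos = 34  # hypothetical: 34 / 20,000 = 0.17%
false_neg = 98  # hypothetical: 98 / 16,000 is roughly 0.61%

fpr, fnr = error_rates(
    false_pos, negatives - false_pos, false_neg, positives - false_neg
)
print(f"FPR = {fpr:.2%}, FNR = {fnr:.2%}")
```

Reporting the two rates separately matters here because the negative and positive holdout sets differ in size, so a single accuracy figure would hide which kind of error dominates.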
AquaSAM: Underwater Image Foreground Segmentation
The Segment Anything Model (SAM) has revolutionized natural image
segmentation; nevertheless, its performance on underwater images remains
limited. This work presents AquaSAM, the first attempt to extend the success
of SAM to underwater images, with the purpose of creating a versatile method for
the segmentation of various underwater targets. To achieve this, we begin by
classifying and extracting various labels automatically from the SUIM dataset.
Subsequently, we develop a straightforward fine-tuning method to adapt SAM to
general foreground underwater image segmentation. Through extensive experiments
involving eight segmentation tasks, such as human divers, we demonstrate that
AquaSAM outperforms the default SAM model, especially on hard tasks such as coral
reefs. AquaSAM achieves an average improvement of 7.13% in Dice Similarity
Coefficient (DSC) and an average improvement of 8.27% in mIoU on underwater
segmentation tasks.
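For readers unfamiliar with the two metrics named above, the sketch below computes the Dice Similarity Coefficient and the intersection over union (IoU) for a pair of binary masks; mIoU is the mean of the IoU values over classes or tasks. The function name and the toy masks are illustrative assumptions and are not taken from the AquaSAM work.

```python
def dice_and_iou(pred, target):
    """Dice coefficient and IoU for two equally sized binary masks.

    Masks are nested lists of 0/1 values (rows of pixels).
    """
    flat_pred = [p for row in pred for p in row]
    flat_true = [t for row in target for t in row]
    if len(flat_pred) != len(flat_true):
        raise ValueError("masks must have the same number of pixels")

    inter = sum(1 for p, t in zip(flat_pred, flat_true) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    true_area = sum(1 for t in flat_true if t)
    union = pred_area + true_area - inter

    if union == 0:  # both masks empty: treat as perfect agreement
        return 1.0, 1.0
    dice = 2 * inter / (pred_area + true_area)
    iou = inter / union
    return dice, iou


# Toy 2x3 masks (hypothetical data).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

Dice is never lower than IoU for the same pair of masks, which is one reason the two reported improvements are not directly comparable to each other.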
Developing deep learning methods for aquaculture applications
Alzayat Saleh developed a computer vision framework that can aid aquaculture experts in analyzing fish habitats. In particular, he developed a labelling-efficient method of training a CNN-based fish-detector and also developed a model that estimates fish weight directly from an image.
Robust A*-Search Image Segmentation Algorithm for Mine-like Objects Segmentation in SONAR Images
This paper presents a sonar image segmentation method based on the Robust A*-Search Image Segmentation (RASIS) algorithm. RASIS is applied to Mine-Like Objects (MLO) in sonar images, where an object is defined by highlight and shadow regions, i.e. regions of high and low pixel intensities in a side-scan sonar image. RASIS uses a modified A*-Search method, which is usually used in mobile robotics for finding the shortest path when the environment map is predefined and the start/goal locations are known. The RASIS algorithm represents the image segmentation problem as a path-finding problem. The main modification to the original A*-Search is in the cost function, which takes pixel intensities and contour curvature into account in order to navigate the 2D segmentation contour. The proposed method is implemented in Matlab and tested on real MLO images. The MLO image dataset consists of 70 images with a manta mine present and 70 images with a cylinder mine present. The segmentation success rate is obtained by comparing the results against ground truth data provided by a human technician who detects MLOs. The measured overall success rate (highlight and shadow regions) is 91% for manta mines and 81% for cylinder mines.
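To illustrate the idea of treating segmentation as path finding, here is a minimal sketch of plain A* search on a pixel grid where the step cost depends on pixel values. It is not the RASIS algorithm: the cost function, the `cost_weight` parameter, and the toy grid are assumptions, and the contour-curvature term described in the abstract is omitted.

```python
import heapq


def astar_path(image, start, goal, cost_weight=1.0):
    """4-connected A* over a 2D grid of non-negative pixel penalties.

    Entering pixel (r, c) costs 1 + cost_weight * image[r][c], so the
    path prefers low-valued pixels. Returns (path, total_cost).
    """
    rows, cols = len(image), len(image[0])

    def heuristic(node):
        # Manhattan distance: admissible because every step costs >= 1.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(heuristic(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}

    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:  # walk parents back to the start
                path.append(node)
                node = parent[node]
            return path[::-1], cost
        if cost > best_cost[node]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 1.0 + cost_weight * image[nr][nc]
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    parent[(nr, nc)] = node
                    heapq.heappush(
                        open_heap,
                        (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)),
                    )
    return None, float("inf")


# Toy penalty map (hypothetical): 9 marks pixels the contour should avoid.
penalty = [
    [0, 9, 0, 0],
    [0, 9, 0, 9],
    [0, 0, 0, 9],
]
path, total = astar_path(penalty, start=(0, 0), goal=(0, 3))
print(path, total)
```

In this toy grid the search detours around the high-penalty column rather than crossing it, which is the behaviour a segmentation contour needs when the penalty encodes distance from a region boundary.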
Regularization Methods for Image Denoising and Underwater Image Restoration
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Natural Sciences, Department of Mathematical Sciences, 2020. 2. Myungjoo Kang.
In this thesis, we discuss regularization methods for denoising images corrupted by Gaussian or Cauchy noise and for underwater image dehazing. For image denoising, we introduce a second-order extension of structure tensor total variation and propose a hybrid method for additive Gaussian noise. Furthermore, we apply the weighted nuclear norm under a nonlocal framework to remove additive Cauchy noise from images. We adopt the nonconvex alternating direction method of multipliers to solve the problem iteratively. Subsequently, based on the color ellipsoid prior, which is effective for restoring hazy images in the atmosphere, we propose a novel dehazing method adapted to underwater conditions. Because the attenuation rate of light in water varies with wavelength, we apply the color ellipsoid prior only to the green and blue channels and combine it with the intensity map of the red channel to further refine the obtained depth map. Numerical experiments show that the proposed methods outperform other methods both quantitatively and qualitatively.
1 Introduction
1.1 Image denoising for Gaussian and Cauchy noise
1.2 Underwater image dehazing
2 Preliminaries
2.1 Variational models for image denoising
2.1.1 Data-fidelity
2.1.2 Regularization
2.1.3 Optimization algorithm
2.2 Methods for image dehazing in the air
2.2.1 Dark channel prior
2.2.2 Color ellipsoid prior
3 Image denoising for Gaussian and Cauchy noise
3.1 Second-order structure tensor and hybrid STV
3.1.1 Structure tensor total variation
3.1.2 Proposed model
3.1.3 Discretization of the model
3.1.4 Numerical algorithm
3.1.5 Experimental results
3.2 Weighted nuclear norm minimization for Cauchy noise
3.2.1 Variational models for Cauchy noise
3.2.2 Low rank minimization by weighted nuclear norm
3.2.3 Proposed method
3.2.4 ADMM algorithm
3.2.5 Numerical method and experimental results
4 Image restoration in underwater
4.1 Scientific background
4.2 Proposed method
4.2.1 Color ellipsoid prior on underwater
4.2.2 Background light estimation
4.3 Experimental results
5 Conclusion
Appendices
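As background for the dehazing chapters listed above, the sketch below inverts the standard haze image-formation model I = J * t + A * (1 - t), on which priors such as the dark channel prior are built. That the thesis uses exactly this model is an assumption, since the abstract does not state it; the function name, the `t_min` clamp, and the sample values are also hypothetical.

```python
def recover_radiance(observed, background_light, transmission, t_min=0.1):
    """Invert I = J * t + A * (1 - t) for one channel of one pixel.

    observed:         hazy intensity I in [0, 1]
    background_light: background (ambient) light A in [0, 1]
    transmission:     estimated transmission t in (0, 1]
    t_min:            lower clamp on t (assumed value) that limits noise
                      amplification where almost no scene light arrives
    """
    t = max(transmission, t_min)
    return (observed - background_light) / t + background_light


# Round trip on synthetic values: haze a known radiance, then recover it.
true_radiance, ambient, t = 0.2, 0.8, 0.5
hazy = true_radiance * t + ambient * (1 - t)
restored = recover_radiance(hazy, ambient, t)
print(hazy, restored)
```

In a per-channel setting, the transmission would differ between the red channel and the green and blue channels, which matches the abstract's point that attenuation in water depends on wavelength.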
Deep Neural Network Architectures and Learning Methodologies for Classification and Application in 3D Reconstruction
In this work we explore two different scenarios of 3D reconstruction. The first, urban scenes, is approached using a deep learning network trained to identify structurally important classes within aerial imagery of cities. The network was trained using data taken from the ISPRS benchmark dataset of the city of Vaihingen. Using the segmented maps generated by the network, we can reconstruct the scenes more accurately through a process of clustering followed by class-specific model generation. The second scenario is that of underwater scenes. We use two separate networks to first identify caustics and then remove them from a scene. Data was generated synthetically, as real-world datasets for this subject are extremely hard to produce. Using the generated caustic-free image, we can then reconstruct the scene with more precision and accuracy through a process of structure from motion. We investigate different deep learning architectures and parameters for both scenarios. Our results are evaluated to be efficient and effective by comparing them with online benchmarks and alternative reconstruction attempts. We conclude by discussing the limitations of problem-specific datasets and our potential research into the generation of datasets through the use of Generative Adversarial Networks.
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing
Spectral methods for multimodal data analysis
Spectral methods have proven themselves as an important and versatile tool in a wide range of problems in the fields of computer graphics, machine learning, pattern recognition, and computer vision, where many important problems boil down to constructing a Laplacian operator and finding a few of its eigenvalues and eigenfunctions. Classical examples include the computation of diffusion distances on manifolds in computer graphics, Laplacian eigenmaps, and spectral clustering in machine learning. In many cases, one has to deal with multiple data spaces simultaneously. For example, clustering multimedia data in machine learning applications involves various modalities or "views" (e.g., text and images), and finding correspondence between shapes in computer graphics problems is an operation performed between two or more modalities. In this thesis, we develop a generalization of spectral methods to deal with multiple data spaces and apply them to problems from the domains of computer graphics, machine learning, and image processing. Our main construction is based on simultaneous diagonalization of Laplacian operators. We present an efficient numerical technique for computing joint approximate eigenvectors of two or more Laplacians in challenging noisy scenarios, which also appears to be the first general non-smooth manifold optimization method. Finally, we use the relation between joint approximate diagonalizability and approximate commutativity of operators to define a structural similarity measure for images. We use this measure to perform structure-preserving color manipulations of a given image.
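The link between commutativity and joint diagonalizability mentioned above can be illustrated on small graphs: two symmetric matrices share a full set of eigenvectors exactly when they commute, so the size of the commutator AB - BA gives a rough measure of how far two Laplacians are from being jointly diagonalizable. The sketch below is only an illustration of that relation, not the thesis's numerical method; the helper names and the three toy graphs are assumptions.

```python
def laplacian(weights):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain dense matrix product of two nested-list matrices."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def commutator_norm(a, b):
    """Frobenius norm of AB - BA; it is zero exactly when A and B commute."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return sum((ab[i][j] - ba[i][j]) ** 2 for i in range(n) for j in range(n)) ** 0.5


# Three graphs on the same three vertices (hypothetical toy data).
path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]      # edges 0-1 and 1-2
complete_graph = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # all three edges
single_edge = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]     # edge 0-1 only

l_path = laplacian(path_graph)
l_complete = laplacian(complete_graph)
l_edge = laplacian(single_edge)

commuting = commutator_norm(l_path, l_complete)
non_commuting = commutator_norm(l_path, l_edge)
print(commuting, non_commuting)
```

The complete-graph Laplacian commutes with any other Laplacian on the same vertices, so the first value is zero, while the second pair does not commute and yields a positive norm.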