384 research outputs found

Underwater Fish Detection with Weak Multi-Domain Supervision

Author: Bradley Michael
Konovalov Dmitry A.
Marini Simone
Saleh Alzayat
Sankupellay Mangalam
Sheaves Marcus
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

Given a sufficiently large training dataset, it is relatively easy to train a modern convolutional neural network (CNN) as an image classifier. However, for fish classification and/or fish detection, a CNN trained to detect or classify particular fish species in particular background habitats exhibits much lower accuracy when applied to new or unseen fish species and/or habitats. In practice, therefore, the CNN needs to be continuously fine-tuned to handle new project-specific fish species or habitats. In this work we present a labelling-efficient method of training a CNN-based fish detector (with the Xception CNN as the base) on a relatively small number (4,000) of project-domain underwater fish/no-fish images from 20 different habitats. Additionally, 17,000 known-negative (that is, containing no fish) general-domain (VOC2012) above-water images were used. Two publicly available fish-domain datasets supplied an additional 27,000 above-water and underwater positive (fish) images. Using this multi-domain collection of images, the trained Xception-based binary (fish/not-fish) classifier achieved 0.17% false positives and 0.61% false negatives on the project's 20,000 negative and 16,000 positive holdout test images, respectively. The area under the ROC curve (AUC) was 99.94%.
Comment: Published in the 2019 International Joint Conference on Neural Networks (IJCNN-2019), Budapest, Hungary, July 14-19, 2019, https://www.ijcnn.org/ , https://ieeexplore.ieee.org/document/885190
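
To make the reported error rates concrete, the following minimal sketch converts the percentages quoted in the abstract into approximate image counts. The function name `error_counts` is hypothetical, and the resulting counts are only what the rounded percentages imply; the abstract itself does not report them.

```python
def error_counts(n_negative, n_positive, fp_rate_pct, fn_rate_pct):
    """Convert percentage error rates into approximate image counts.

    fp_rate_pct is the share of negative (no-fish) images flagged as fish;
    fn_rate_pct is the share of positive (fish) images that were missed.
    """
    false_positives = round(n_negative * fp_rate_pct / 100)
    false_negatives = round(n_positive * fn_rate_pct / 100)
    return false_positives, false_negatives


# Figures quoted in the abstract: 20,000 negative and 16,000 positive
# holdout images, 0.17% false positives and 0.61% false negatives.
fp, fn = error_counts(20_000, 16_000, 0.17, 0.61)
print(fp, fn)
```

Because the published rates are rounded to two decimals, the true counts may differ from these values by an image or two.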

arXiv.org e-Print Archive

Crossref

ResearchOnline at James Cook University

AquaSAM: Underwater Image Foreground Segmentation

Author: Liu Yutao
Su Jianhao
Xu Muduo
Publication date: 08/08/2023

The Segment Anything Model (SAM) has revolutionized natural image segmentation; nevertheless, its performance on underwater images is still limited. This work presents AquaSAM, the first attempt to extend the success of SAM to underwater images, with the purpose of creating a versatile method for segmenting various underwater targets. To achieve this, we begin by automatically classifying and extracting various labels in the SUIM dataset. Subsequently, we develop a straightforward fine-tuning method to adapt SAM to general foreground underwater image segmentation. Through extensive experiments involving eight segmentation tasks, such as human divers, we demonstrate that AquaSAM outperforms the default SAM model, especially on hard tasks such as coral reefs. AquaSAM achieves an average improvement of 7.13% in Dice Similarity Coefficient (DSC) and 8.27% in mIoU on underwater segmentation tasks.
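
The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists of 0/1; the function name `dice_and_iou` is hypothetical, and this is the textbook definition of the metrics, not the authors' evaluation code.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks (nested lists of 0/1)."""
    intersection = sum(
        p & t
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    pred_area = sum(map(sum, pred))
    target_area = sum(map(sum, target))
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treat as perfect agreement by convention.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
dice, iou = dice_and_iou(pred, target)
print(dice, iou)
```

mIoU is the mean of the per-class IoU values, so a multi-class evaluation would call this once per class mask and average the results.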

arXiv.org e-Print Archive

Developing deep learning methods for aquaculture applications

Author: Saleh Alzayat
Publication date: 01/01/2020

Alzayat Saleh developed a computer vision framework that can aid aquaculture experts in analyzing fish habitats. In particular, he developed a labelling-efficient method of training a CNN-based fish detector, as well as a model that estimates a fish's weight directly from its image.

ResearchOnline at James Cook University

Robust A*-Search Image Segmentation Algorithm for Mine-like Objects Segmentation in SONAR Images

Author: Benjamin Lehmann
Dieter Kraus
Ivan Aleksi
Tomislav Matić
Publication venue: 'Faculty of Electrical Engineering, Computer Science and Information Technology Osijek'
Publication date: 01/01/2020

This paper presents a sonar image segmentation method based on a Robust A*-Search Image Segmentation (RASIS) algorithm. RASIS is applied to Mine-Like Objects (MLOs) in sonar images, where an object is defined by its highlight and shadow regions, i.e. regions of high and low pixel intensity in a side-scan sonar image. RASIS uses a modified A*-Search method, which is usually used in mobile robotics to find the shortest path when the environment map is predefined and the start and goal locations are known. The RASIS algorithm represents the image segmentation problem as a path-finding problem. The main modification to the original A*-Search is in the cost function, which takes pixel intensities and contour curvature into account in order to navigate the 2D segmentation contour. The proposed method is implemented in Matlab and tested on real MLO images. The MLO image dataset consists of 70 images with a manta mine present and 70 images with a cylinder mine present. The segmentation success rate is obtained by comparison with ground truth data provided by the human technician who detects MLOs. The measured overall success rate (highlight and shadow regions) is 91% for manta mines and 81% for cylinder mines.
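
The idea of treating segmentation as path-finding can be sketched with a plain A* search over a pixel grid. This is not the RASIS algorithm: the cost function below (one unit per step plus the intensity of the pixel entered) is an assumption chosen for illustration, the curvature term mentioned in the abstract is omitted, and the name `astar_path` is hypothetical.

```python
import heapq


def astar_path(image, start, goal):
    """Cheapest 4-connected path from start to goal on a grid of non-negative intensities.

    Entering a pixel costs 1 plus its intensity, so the path avoids bright pixels.
    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """
    rows, cols = len(image), len(image[0])

    def heuristic(node):
        # Manhattan distance never overestimates, because every step costs at least 1.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(heuristic(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], cost
        if cost > best_cost[node]:
            continue  # stale heap entry; a cheaper route to this pixel was found
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 1.0 + image[nr][nc]
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    parent[(nr, nc)] = node
                    heapq.heappush(
                        open_heap, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                    )
    return None, float("inf")


# A bright vertical stripe (intensity 9) separates the start from the goal.
image = [
    [0, 9, 0],
    [0, 9, 0],
    [0, 0, 0],
]
path, cost = astar_path(image, (0, 0), (0, 2))
print(path, cost)
```

In this toy grid the search walks around the bright stripe rather than through it. A contour-following variant would replace the step cost with a term that rewards the highlight or shadow boundary and penalizes sharp turns.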

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Regularization Methods for Image Denoising and Underwater Image Restoration

Author: 조중희
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2020

Thesis (Ph.D.), Seoul National University Graduate School, College of Natural Sciences, Department of Mathematical Sciences, 2020. 2. 강명주.

In this thesis, we discuss regularization methods for denoising images corrupted by Gaussian or Cauchy noise and for dehazing underwater images. For image denoising, we introduce a second-order extension of structure tensor total variation and propose a hybrid method for additive Gaussian noise. Furthermore, we apply the weighted nuclear norm within a nonlocal framework to remove additive Cauchy noise from images, and adopt the nonconvex alternating direction method of multipliers to solve the problem iteratively. Subsequently, based on the color ellipsoid prior, which is effective for restoring hazy images in the atmosphere, we propose a novel dehazing method adapted to underwater conditions. Because the attenuation rate of light in water varies with wavelength, we apply the color ellipsoid prior only to the green and blue channels and combine the result with the intensity map of the red channel to further refine the obtained depth map. Numerical experiments show that the proposed methods give superior results compared with other methods, both quantitatively and qualitatively.

Contents: 1 Introduction; 1.1 Image denoising for Gaussian and Cauchy noise; 1.2 Underwater image dehazing; 2 Preliminaries; 2.1 Variational models for image denoising; 2.1.1 Data-fidelity; 2.1.2 Regularization; 2.1.3 Optimization algorithm; 2.2 Methods for image dehazing in the air; 2.2.1 Dark channel prior; 2.2.2 Color ellipsoid prior; 3 Image denoising for Gaussian and Cauchy noise; 3.1 Second-order structure tensor and hybrid STV; 3.1.1 Structure tensor total variation; 3.1.2 Proposed model; 3.1.3 Discretization of the model; 3.1.4 Numerical algorithm; 3.1.5 Experimental results; 3.2 Weighted nuclear norm minimization for Cauchy noise; 3.2.1 Variational models for Cauchy noise; 3.2.2 Low rank minimization by weighted nuclear norm; 3.2.3 Proposed method; 3.2.4 ADMM algorithm; 3.2.5 Numerical method and experimental results; 4 Image restoration in underwater; 4.1 Scientific background; 4.2 Proposed method; 4.2.1 Color ellipsoid prior on underwater; 4.2.2 Background light estimation; 4.3 Experimental results; 5 Conclusion; Appendices

SNU Open Repository and Archive

Deep Neural Network Architectures and Learning Methodologies for Classification and Application in 3D Reconstruction

Author: Forbes Timothy
Publication date: 01/12/2018

In this work we explore two different scenarios of 3D reconstruction. The first, urban scenes, is approached using a deep learning network trained to identify structurally important classes within aerial imagery of cities. The network was trained using data taken from the ISPRS benchmark dataset of the city of Vaihingen. Using the segmented maps generated by the network, we can reconstruct the scenes more accurately through a process of clustering followed by class-specific model generation. The second scenario is that of underwater scenes. We use two separate networks to first identify caustics and then remove them from a scene. Data was generated synthetically, as real-world datasets for this subject are extremely hard to produce. Using the generated caustic-free image, we can then reconstruct the scene with more precision and accuracy through a process of structure from motion. We investigate different deep learning architectures and parameters for both scenarios. Our results are evaluated to be efficient and effective by comparing them with online benchmarks and alternative reconstruction attempts. We conclude by discussing the limitations of problem-specific datasets and our potential research into the generation of datasets through the use of generative adversarial networks.

Concordia University Research Repository

A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

Author: Ball John E.
Anderson Derek T.
Chan Chee Seng
Publication date: 01/01/2017

In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Although remote sensing (RS) poses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws on many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

arXiv.org e-Print Archive

Crossref

FigShare

Spectral methods for multimodal data analysis

Author: Bronstein Michael
Kovnatsky Artiom
Publication date: 10/11/2016

Spectral methods have proven to be an important and versatile tool in a wide range of problems in computer graphics, machine learning, pattern recognition, and computer vision, where many important problems boil down to constructing a Laplacian operator and finding a few of its eigenvalues and eigenfunctions. Classical examples include the computation of diffusion distances on manifolds in computer graphics, Laplacian eigenmaps, and spectral clustering in machine learning. In many cases, one has to deal with multiple data spaces simultaneously. For example, clustering multimedia data in machine learning applications involves various modalities or "views" (e.g., text and images), and finding correspondence between shapes in computer graphics problems is an operation performed between two or more modalities. In this thesis, we develop a generalization of spectral methods to deal with multiple data spaces and apply them to problems from the domains of computer graphics, machine learning, and image processing. Our main construction is based on simultaneous diagonalization of Laplacian operators. We present an efficient numerical technique for computing joint approximate eigenvectors of two or more Laplacians in challenging noisy scenarios, which also appears to be the first general non-smooth manifold optimization method. Finally, we use the relation between joint approximate diagonalizability and approximate commutativity of operators to define a structural similarity measure for images. We use this measure to perform structure-preserving color manipulations of a given image.
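
The link between commutativity and joint diagonalizability can be illustrated with a small sketch: two symmetric matrices share a full set of eigenvectors exactly when they commute, so the size of the commutator AB - BA indicates how far two Laplacians are from being jointly diagonalizable. The code below assumes unweighted graphs given as adjacency matrices with a zero diagonal; the helper names are hypothetical, and this is not the similarity measure defined in the thesis.

```python
def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A for an adjacency matrix with zero diagonal."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def commutator_norm(a, b):
    """Frobenius norm of AB - BA; zero exactly when A and B commute."""
    ab, ba = matmul(a, b), matmul(b, a)
    return sum((x - y) ** 2 for ra, rb in zip(ab, ba) for x, y in zip(ra, rb)) ** 0.5


# Two "views" of the same three nodes: a path 0-1-2 and a single edge 0-1.
path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
single_edge = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

L1, L2 = laplacian(path_graph), laplacian(single_edge)
print(commutator_norm(L1, L1))  # identical structure: the Laplacian commutes with itself
print(commutator_norm(L1, L2))  # differing structure: nonzero commutator
```

A small commutator norm suggests the two operators nearly share an eigenbasis, which is the intuition behind using approximate commutativity as a proxy for structural similarity.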

RERO DOC Digital Library