384 research outputs found

    Underwater Fish Detection with Weak Multi-Domain Supervision

    Given a sufficiently large training dataset, it is relatively easy to train a modern convolutional neural network (CNN) as an image classifier. However, for fish classification and/or fish detection, a CNN trained to detect or classify particular fish species in particular background habitats exhibits much lower accuracy when applied to new/unseen fish species and/or habitats. In practice, therefore, the CNN needs to be continuously fine-tuned to handle new project-specific fish species or habitats. In this work we present a labelling-efficient method of training a CNN-based fish detector (with the Xception CNN as the base) on a relatively small number (4,000) of project-domain underwater fish/no-fish images from 20 different habitats. Additionally, 17,000 known-negative (that is, fish-free) general-domain (VOC2012) above-water images were used. Two publicly available fish-domain datasets supplied an additional 27,000 above-water and underwater positive (fish) images. Using this multi-domain collection of images, the trained Xception-based binary (fish/not-fish) classifier achieved 0.17% false positives and 0.61% false negatives on the project's 20,000 negative and 16,000 positive holdout test images, respectively. The area under the ROC curve (AUC) was 99.94%. Comment: Published in the 2019 International Joint Conference on Neural Networks (IJCNN-2019), Budapest, Hungary, July 14-19, 2019, https://www.ijcnn.org/ , https://ieeexplore.ieee.org/document/885190

    AquaSAM: Underwater Image Foreground Segmentation

    The Segment Anything Model (SAM) has revolutionized natural image segmentation; nevertheless, its performance on underwater images remains limited. This work presents AquaSAM, the first attempt to extend the success of SAM to underwater images, with the aim of creating a versatile method for segmenting various underwater targets. To achieve this, we begin by automatically classifying and extracting various labels in the SUIM dataset. Subsequently, we develop a straightforward fine-tuning method to adapt SAM to general foreground underwater image segmentation. Through extensive experiments involving eight segmentation tasks, such as human divers, we demonstrate that AquaSAM outperforms the default SAM model, especially on hard tasks such as coral reefs. AquaSAM achieves an average improvement of 7.13% in Dice Similarity Coefficient (DSC) and an average improvement of 8.27% in mIoU on underwater segmentation tasks.
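    For readers unfamiliar with the two metrics named in this abstract, the following is a minimal sketch of how the Dice Similarity Coefficient and intersection over union (IoU) are conventionally computed for one pair of binary masks. It is not taken from the AquaSAM paper: the function name `dice_and_iou`, the nested-list mask representation, and the toy masks are assumptions made for illustration. mIoU is then simply the mean of the per-task (or per-class) IoU values.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equally sized binary masks (nested lists of 0/1).

    Hypothetical helper for illustration; not code from the AquaSAM paper.
    """
    flat_pred = [p for row in pred for p in row]
    flat_truth = [t for row in truth for t in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    truth_area = sum(1 for t in flat_truth if t)
    union = pred_area + truth_area - intersection

    if union == 0:
        # Both masks are empty: treat as perfect agreement (a common convention).
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + truth_area)
    iou = intersection / union
    return dice, iou


# Toy 2x3 masks (made up): 3 predicted pixels, 3 true pixels, 2 in common.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

    For a single mask pair the two scores are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.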

    Developing deep learning methods for aquaculture applications

    Alzayat Saleh developed a computer vision framework that can aid aquaculture experts in analyzing fish habitats. In particular, he developed a labelling-efficient method of training a CNN-based fish detector, as well as a model that estimates fish weight directly from an image.

    Robust A*-Search Image Segmentation Algorithm for Mine-like Objects Segmentation in SONAR Images

    This paper presents a sonar image segmentation method based on the Robust A*-Search Image Segmentation (RASIS) algorithm. RASIS is applied to Mine-Like Objects (MLOs) in sonar images, where an object is defined by highlight and shadow regions, i.e. regions of high and low pixel intensity in a side-scan sonar image. RASIS uses a modified A*-Search method, which is commonly used in mobile robotics to find the shortest path when the environment map is predefined and the start and goal locations are known. The RASIS algorithm represents the image segmentation problem as a path-finding problem. The main modification to the original A*-Search is in the cost function, which takes pixel intensities and contour curvature into account in order to navigate the 2D segmentation contour. The proposed method is implemented in Matlab and tested on real MLO images. The MLO image dataset consists of 70 images with a manta mine present and 70 images with a cylinder mine present. The segmentation success rate is obtained by comparison with ground truth data provided by the human technician who detects MLOs. The measured overall success rate (highlight and shadow regions) is 91% for manta mines and 81% for cylinder mines.
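    To make the "segmentation as path finding" idea in this abstract concrete, here is a minimal sketch of plain A* search on a per-pixel cost grid. It is not the RASIS algorithm: the paper's implementation is in Matlab and its cost function also includes contour curvature, which is omitted here. The function name `astar_path`, the 4-connected neighbourhood, and the toy cost grid are all assumptions made for illustration.

```python
import heapq


def astar_path(cost, start, goal):
    """Cheapest 4-connected path on a grid (hypothetical helper, not RASIS).

    Entering cell (r, c) costs cost[r][c], which must be >= 1 so that the
    Manhattan-distance heuristic never overestimates the remaining cost.
    Returns (path, total_cost), or (None, inf) if the goal is unreachable.
    """
    rows, cols = len(cost), len(cost[0])

    def heuristic(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best = {start: 0}   # cheapest known cost to reach each cell
    parent = {}         # back-pointers for path reconstruction
    frontier = [(heuristic(start), 0, start)]

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1], g
        if g > best[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_g = g + cost[nr][nc]
                if new_g < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_g
                    parent[(nr, nc)] = cell
                    heapq.heappush(
                        frontier, (new_g + heuristic((nr, nc)), new_g, (nr, nc))
                    )
    return None, float("inf")


# Toy cost grid (made up): cost 1 marks pixels the contour should follow,
# cost 9 marks pixels it should avoid.
cost = [
    [1, 9, 9, 9],
    [1, 9, 9, 9],
    [1, 1, 1, 1],
]
path, total = astar_path(cost, (0, 0), (2, 3))
print(path, total)
```

    In an actual segmentation setting the cost grid would be derived from the image, for example by assigning low cost to pixels on a strong intensity transition; how RASIS does this is described in the paper itself, not here.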

    μ˜μƒ 작음 μ œκ±°μ™€ μˆ˜μ€‘ μ˜μƒ 볡원을 μœ„ν•œ μ •κ·œν™” 방법

    Thesis (Ph.D.), Seoul National University Graduate School: College of Natural Sciences, Department of Mathematical Sciences, February 2020. Kang Myungjoo. In this thesis, we discuss regularization methods for denoising images corrupted by Gaussian or Cauchy noise and for dehazing underwater images. For image denoising, we introduce a second-order extension of structure tensor total variation and propose a hybrid method for additive Gaussian noise. Furthermore, we apply the weighted nuclear norm in a nonlocal framework to remove additive Cauchy noise from images, and we adopt the nonconvex alternating direction method of multipliers to solve the problem iteratively. Subsequently, based on the color ellipsoid prior, which is effective for restoring hazy images in the atmosphere, we propose a novel dehazing method adapted to underwater conditions. Because the attenuation rate of light in water varies with wavelength, we apply the color ellipsoid prior only to the green and blue channels and combine it with the intensity map of the red channel to further refine the obtained depth map. Numerical experiments show that the proposed methods outperform other methods both quantitatively and qualitatively. Contents: 1 Introduction (1.1 Image denoising for Gaussian and Cauchy noise; 1.2 Underwater image dehazing). 2 Preliminaries (2.1 Variational models for image denoising: 2.1.1 Data-fidelity, 2.1.2 Regularization, 2.1.3 Optimization algorithm; 2.2 Methods for image dehazing in the air: 2.2.1 Dark channel prior, 2.2.2 Color ellipsoid prior). 3 Image denoising for Gaussian and Cauchy noise (3.1 Second-order structure tensor and hybrid STV: 3.1.1 Structure tensor total variation, 3.1.2 Proposed model, 3.1.3 Discretization of the model, 3.1.4 Numerical algorithm, 3.1.5 Experimental results; 3.2 Weighted nuclear norm minimization for Cauchy noise: 3.2.1 Variational models for Cauchy noise, 3.2.2 Low rank minimization by weighted nuclear norm, 3.2.3 Proposed method, 3.2.4 ADMM algorithm, 3.2.5 Numerical method and experimental results). 4 Image restoration in underwater (4.1 Scientific background; 4.2 Proposed method: 4.2.1 Color ellipsoid prior on underwater, 4.2.2 Background light estimation; 4.3 Experimental results). 5 Conclusion. Appendices.

    Deep Neural Network Architectures and Learning Methodologies for Classification and Application in 3D Reconstruction

    In this work we explore two different scenarios of 3D reconstruction. The first, urban scenes, is approached using a deep learning network trained to identify structurally important classes within aerial imagery of cities. The network was trained using data from the ISPRS benchmark dataset of the city of Vaihingen. Using the segmentation maps generated by the network, we can reconstruct the scenes more accurately through clustering followed by class-specific model generation. The second scenario is underwater scenes. We use two separate networks, first to identify caustics and then to remove them from a scene. Data was generated synthetically, as real-world datasets for this subject are extremely hard to produce. Using the generated caustic-free image, we can then reconstruct the scene with greater precision and accuracy through structure from motion. We investigate different deep learning architectures and parameters for both scenarios. Our results are evaluated as efficient and effective by comparing them with online benchmarks and alternative reconstruction attempts. We conclude by discussing the limitations of problem-specific datasets and our potential research into generating datasets with generative adversarial networks.

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

    Spectral methods for multimodal data analysis

    Spectral methods have proven themselves to be an important and versatile tool in a wide range of problems in computer graphics, machine learning, pattern recognition, and computer vision, where many important problems boil down to constructing a Laplacian operator and finding a few of its eigenvalues and eigenfunctions. Classical examples include the computation of diffusion distances on manifolds in computer graphics, Laplacian eigenmaps, and spectral clustering in machine learning. In many cases, one has to deal with multiple data spaces simultaneously. For example, clustering multimedia data in machine learning applications involves various modalities or "views" (e.g., text and images), and finding correspondence between shapes in computer graphics problems is an operation performed between two or more modalities. In this thesis, we develop a generalization of spectral methods to deal with multiple data spaces and apply them to problems from the domains of computer graphics, machine learning, and image processing. Our main construction is based on simultaneous diagonalization of Laplacian operators. We present an efficient numerical technique for computing joint approximate eigenvectors of two or more Laplacians in challenging noisy scenarios, which also appears to be the first general non-smooth manifold optimization method. Finally, we use the relation between joint approximate diagonalizability and approximate commutativity of operators to define a structural similarity measure for images. We use this measure to perform structure-preserving color manipulations of a given image.
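    The link between joint diagonalizability and commutativity mentioned in this abstract can be illustrated with a small sketch. Two symmetric matrices share a full set of eigenvectors exactly when they commute, so the size of the commutator AB - BA is a natural indicator of how far two Laplacians are from being jointly diagonalizable. The code below only computes that commutator norm for toy graph Laplacians; it is not the similarity measure defined in the thesis, and the helper names `matmul` and `commutator_norm` and the example graphs are assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator_norm(a, b):
    """Frobenius norm of AB - BA (hypothetical helper for illustration)."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return math.sqrt(sum((ab[i][j] - ba[i][j]) ** 2
                         for i in range(n) for j in range(n)))


# Laplacian of the path graph 0-1-2.
path_laplacian = [[1, -1, 0],
                  [-1, 2, -1],
                  [0, -1, 1]]

# Same graph with every edge weight doubled: same eigenvectors, so it commutes.
scaled_laplacian = [[2 * x for x in row] for row in path_laplacian]

# Laplacian of a graph with the single edge 0-1: different structure.
edge_laplacian = [[1, -1, 0],
                  [-1, 1, 0],
                  [0, 0, 0]]

same_structure = commutator_norm(path_laplacian, scaled_laplacian)
different_structure = commutator_norm(path_laplacian, edge_laplacian)
print(same_structure, different_structure)
```

    A commutator norm of zero means the two operators can be diagonalized by the same basis; larger values indicate that any shared basis can only diagonalize them approximately.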