8 research outputs found

    Weakly supervised underwater fish segmentation using affinity LCFCN

    Estimating fish body measurements such as length, width, and mass has received considerable research attention due to its potential to boost productivity in marine and aquaculture applications. Some methods rely on manual collection of these measurements using tools like a ruler, which is time-consuming and labour-intensive. Others rely on fully supervised segmentation models to acquire these measurements automatically, but these require per-pixel labels that are also time-consuming to collect: it can take up to 2 minutes per fish to acquire accurate segmentation labels. To address this problem, we propose a segmentation model that can efficiently train on images labeled with point-level supervision, where each fish is annotated with a single click. This labeling scheme takes an average of only 1 second per fish. Our model uses a fully convolutional neural network with one branch that outputs per-pixel scores and another that outputs an affinity matrix. These two outputs are aggregated using a random walk to get the final, refined per-pixel output. The whole model is trained end-to-end using the localization-based counting fully convolutional neural network (LCFCN) loss, and thus we call our method Affinity-LCFCN (A-LCFCN). We conduct experiments on the DeepFish dataset, which contains several fish habitats from north-eastern Australia. The results show that A-LCFCN outperforms a fully supervised segmentation model when the annotation budget is fixed. They also show that A-LCFCN achieves better segmentation results than LCFCN and a standard baseline.
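The random-walk aggregation step described above can be sketched in a few lines. This is an illustrative simplification with hypothetical names (`random_walk_refine`, a dense pairwise affinity matrix), not the authors' implementation, which predicts affinities with a dedicated learned branch:

```python
import numpy as np

def random_walk_refine(scores, affinity, n_steps=3):
    """Refine per-pixel class scores by propagating them along a
    row-normalized affinity (transition) matrix, as in a random walk.

    scores:   (N, C) array, one score vector per pixel (N = H*W).
    affinity: (N, N) non-negative pairwise affinities.
    """
    # Row-normalize affinities into a stochastic transition matrix.
    trans = affinity / affinity.sum(axis=1, keepdims=True)
    refined = scores
    for _ in range(n_steps):
        refined = trans @ refined  # each pixel mixes in its neighbours' scores
    return refined
```

With an identity affinity the scores are left untouched; with uniform affinities every pixel converges to the mean score, so the learned affinities control how far evidence from the clicked points spreads.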

    Dilation-Erosion for Single-Frame Supervised Temporal Action Localization

    To balance annotation labor against the granularity of supervision, single-frame annotation has been introduced in temporal action localization. It provides a rough temporal location for an action but implicitly overstates the supervision from the annotated frame during training, leading to confusion between actions and backgrounds, i.e., action incompleteness and background false positives. To tackle these two challenges, we present the Snippet Classification model and the Dilation-Erosion module. The Dilation-Erosion module first expands the potential action segments with a loose criterion to alleviate the problem of action incompleteness, and then removes background from the potential action segments to alleviate the problem of background false positives. Relying on the single-frame annotation and the output of the snippet classification, the Dilation-Erosion module mines pseudo snippet-level ground truth, hard backgrounds, and evident backgrounds, which in turn further train the Snippet Classification model, forming a cyclic dependency. Furthermore, we propose a new embedding loss to aggregate the features of action instances with the same label and separate the features of actions from backgrounds. Experiments on THUMOS14 and ActivityNet 1.2 validate the effectiveness of the proposed method. Code is publicly available at https://github.com/LingJun123/single-frame-TAL. Comment: 28 pages, 8 figures.
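The dilation and erosion steps can be illustrated on a 1-D binary snippet mask (obtained, for example, by thresholding snippet classification scores). The functions below are a hypothetical sketch of the morphological idea, not the paper's module:

```python
import numpy as np

def dilate(mask, k=1):
    """Binary 1-D dilation: a snippet becomes action if any snippet within
    distance k is action (loose criterion, fights action incompleteness)."""
    out = mask.copy()
    for s in range(1, k + 1):
        out[:-s] |= mask[s:]
        out[s:] |= mask[:-s]
    return out

def erode(mask, k=1):
    """Binary 1-D erosion: a snippet stays action only if all snippets within
    distance k are action (strict criterion, trims background false positives)."""
    out = mask.copy()
    for s in range(1, k + 1):
        out[:-s] &= mask[s:]
        out[s:] &= mask[:-s]
    return out
```

Dilation followed by erosion (a morphological closing) fills small gaps inside an action while shaving off isolated background responses at its borders.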

    Learning Instance Segmentation from Sparse Supervision

    Instance segmentation is an important task in many domains of automatic image processing, such as self-driving cars, robotics, and microscopy data analysis. Recently, deep learning-based algorithms have brought image segmentation close to human performance. However, most existing models rely on dense ground-truth labels for training, which are expensive, time-consuming, and often require experienced annotators. Besides the annotation burden, training complex high-capacity neural networks depends on non-trivial expertise in the choice and tuning of hyperparameters, making the adoption of these models challenging for researchers in other fields. The aim of this work is twofold. The first is to make deep learning segmentation methods accessible to non-specialists. The second is to address the dense annotation problem by developing instance segmentation methods trainable with limited ground-truth data. In the first part of this thesis, I bring state-of-the-art instance segmentation methods closer to non-experts by developing PlantSeg: a pipeline for volumetric segmentation of light microscopy images of biological tissues into cells. PlantSeg comes with a large repository of pre-trained models and delivers highly accurate results on a variety of samples and image modalities. We exemplify its usefulness for answering biological questions in several collaborative research projects. In the second part, I tackle the dense annotation bottleneck by introducing SPOCO, an instance segmentation method that can be trained from just a few annotated objects. It demonstrates strong segmentation performance on challenging natural and biological benchmark datasets at a greatly reduced manual annotation cost and delivers state-of-the-art results on the CVPPP benchmark.
In summary, my contributions enable training of instance segmentation models with limited amounts of labeled data and make these methods more accessible to non-experts, speeding up the process of quantitative data analysis.
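An embedding-based objective of the kind such sparse-supervision methods build on can be sketched as a pull-push loss: pixel embeddings are pulled toward their instance mean, and different instance means are pushed apart. The function name and margin values below are illustrative assumptions, not SPOCO's exact formulation:

```python
import numpy as np

def pull_push_loss(emb, labels, delta_pull=0.1, delta_push=1.0):
    """Discriminative embedding loss sketch.

    emb:    (N, D) pixel embeddings.
    labels: (N,) instance ids; 0 marks unlabeled pixels, which are ignored,
            so only the few annotated objects contribute to the loss.
    """
    ids = [i for i in np.unique(labels) if i != 0]
    means = {i: emb[labels == i].mean(axis=0) for i in ids}
    # Pull term: hinged distance of each embedding to its own instance mean.
    pull = 0.0
    for i in ids:
        d = np.linalg.norm(emb[labels == i] - means[i], axis=1)
        pull += np.mean(np.maximum(0.0, d - delta_pull) ** 2)
    pull /= max(len(ids), 1)
    # Push term: hinged distance between every pair of instance means.
    push, n_pairs = 0.0, 0
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            d = np.linalg.norm(means[ids[a]] - means[ids[b]])
            push += np.maximum(0.0, delta_push - d) ** 2
            n_pairs += 1
    if n_pairs:
        push /= n_pairs
    return pull + push
```

Tight, well-separated instance clusters incur zero loss; overlapping instances are penalized by the push term.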

    ๊ฐ์ฒด ์ธ์‹์˜ ๋ ˆ์ด๋ธ” ํšจ์œจ์  ํ•™์Šต

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2023. Advisor: Sungroh Yoon.
Advances in deep neural network approaches have produced tremendous progress in object recognition tasks, but this progress has come at the cost of annotating a huge amount of training images with explicit localization cues. Using object recognition in real-life applications requires a large variety of object classes and a great deal of labeled data for each class. However, labeling pixel-level annotations of each object class is laborious and hampers the expansion of object classes. The need for such expensive annotations is sidestepped by weakly supervised learning, in which a DNN is trained on images with some form of abbreviated annotation that is cheaper than explicit localization cues. In this dissertation, we study methods of using various forms of weak supervision, i.e., image-level class labels, out-of-distribution data, and bounding box labels. We first study image-level class labels for weakly supervised semantic segmentation.
Most of the weakly supervised methods on image-level class labels depend on attribution maps from a trained classifier, but their focus tends to be restricted to a small discriminative region of the target object. We theoretically discuss the root cause of this problem and propose three novel techniques to address it. However, built on class labels only, the produced localization maps are known to suffer from confusion between foreground and background cues, i.e., spurious correlation. We address the spurious correlation problem by utilizing out-of-distribution data. Finally, methods based on class labels cannot separate different instances of the same class, which is essential for instance segmentation. Therefore, we utilize bounding box labels for weakly supervised instance segmentation, as boxes provide information about individual objects and their locations. Experimental results show that the annotation cost for learning semantic segmentation and instance segmentation can be significantly reduced: on the challenging Pascal VOC dataset, we have achieved 89% of the performance of the fully supervised equivalent by using only class labels, which reduces the label cost by 91%. In addition, we have achieved 96% of the performance of the fully supervised equivalent by using bounding box labels, which reduces the label cost by 83%.
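The attribution maps mentioned above are typically CAM-style: the final convolutional feature maps are weighted by the classifier weights of the target class, so the map highlights the regions that drove the class score. A minimal sketch with a hypothetical `class_activation_map` helper (CAM in general, not the dissertation's proposed techniques):

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """CAM-style attribution for a global-average-pool classifier.

    features:   (K, H, W) feature maps from the last conv layer.
    fc_weights: (C, K) classifier weights, one row per class.
    Returns an (H, W) map normalized to [0, 1].
    """
    # Weighted sum of the K feature maps by the target class's weights.
    cam = np.tensordot(fc_weights[class_idx], features, axes=1)  # (H, W)
    cam = np.maximum(cam, 0.0)  # keep only positive (supporting) evidence
    if cam.max() > 0:
        cam /= cam.max()        # normalize to [0, 1]
    return cam
```

Thresholding such a map gives the rough localization seed that weakly supervised segmentation methods then refine, and its bias toward small discriminative regions is exactly the failure mode discussed above.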
We expect that the methods introduced in this dissertation will be helpful for applying deep learning-based object recognition in a variety of domains and scenarios.
Table of Contents:
1 Introduction
2 Background
  2.1 Object Recognition
  2.2 Weak Supervision
  2.3 Preliminary Algorithms
    2.3.1 Attribution Methods for Image Classifier
    2.3.2 Refinement Techniques of Localization Maps
3 Learning with Image-Level Class Labels
  3.1 Introduction
  3.2 Related Work
    3.2.1 FickleNet: Stochastic Inference Approach
    3.2.2 Other Recent Approaches
  3.3 Anti-Adversarially Manipulated Attribution
    3.3.1 Adversarial Attack
    3.3.2 Proposed Method
    3.3.3 Experiments
    3.3.4 Discussion
    3.3.5 Analysis of Results by Class
  3.4 Reducing Information Bottleneck
    3.4.1 Information Bottleneck
    3.4.2 Motivation
    3.4.3 Proposed Method
    3.4.4 Experiments
  3.5 Summary
4 Learning with Auxiliary Data
  4.1 Introduction
  4.2 Related Work
  4.3 Methods
    4.3.1 Collecting the Hard Out-of-Distribution Data
    4.3.2 Learning with the Hard Out-of-Distribution Data
    4.3.3 Training Segmentation Networks
  4.4 Experiments
    4.4.1 Experimental Setup
    4.4.2 Experimental Results
    4.4.3 Analysis and Discussion
  4.5 Analysis of OoD Collection Process
  4.6 Integrating Proposed Methods
  4.7 Summary
5 Learning with Bounding Box Labels
  5.1 Introduction
  5.2 Related Work
  5.3 Methods
    5.3.1 Revisiting Object Detectors
    5.3.2 Bounding Box Attribution Map
    5.3.3 Training the Segmentation Network
  5.4 Experiments
    5.4.1 Experimental Setup
    5.4.2 Weakly Supervised Instance Segmentation
    5.4.3 Weakly Supervised Semantic Segmentation
    5.4.4 Ablation Study
  5.5 Detailed Analysis of the BBAM
  5.6 Summary
6 Conclusion
  6.1 Dissertation Summary
  6.2 Limitations and Future Direction
Abstract (In Korean)

    River Ice Segmentation under a Limited Compute and Annotation Budget

    River ice segmentation, used to differentiate ice and water, can give valuable information regarding ice cover and ice distribution. These are important factors when evaluating flooding risks caused by ice jams that may harm local ecosystems and infrastructure. Furthermore, discriminating specifically between anchor ice and frazil ice is important in understanding sediment transport and release events that can affect geomorphology and cause landslide risks. Modern deep learning techniques have proven to deliver promising segmentation results; however, they can require hours of expensive manual image labelling, can show poor generalization ability, and can be inefficient when hardware and computing power are limited. As river ice images are often collected in remote locations by unmanned aerial vehicles with limited computation power, we explore the performance-latency trade-offs for river ice segmentation. We propose a novel convolution block inspired by both depthwise separable convolutions and local binary convolutions, giving additional efficiency, parameter savings, and generalization ability to river ice segmentation networks. Our novel convolution block is used in a shallow architecture that has 99.9% fewer trainable parameters, 99% fewer multiply-add operations, and 69.8% less memory usage than a UNet, while achieving virtually the same segmentation performance. We find that this network trains fast and is able to achieve high segmentation performance early in training due to an emphasis on both pixel intensity and texture. When compared to very efficient segmentation networks such as LR-ASPP with a MobileNetV3 backbone, we achieve good performance (mIoU of 64) 91% faster during training on a CPU and an overall mIoU that is 7.7% higher. We also find that our novel convolution block is able to generalize better to new domains such as snowy environments or datasets with varying illumination.
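The idea of combining depthwise separable convolutions with local binary convolutions can be sketched as a block whose depthwise filters are fixed {-1, 0, +1} kernels (contributing no trainable parameters), followed by a learned 1x1 pointwise mix. The function names and the exact block layout below are illustrative assumptions, not the paper's block:

```python
import numpy as np

def make_binary_kernels(channels, k=3, sparsity=0.5, seed=0):
    """Fixed, non-trainable {-1, 0, +1} depthwise kernels, as in local
    binary convolutions; `sparsity` is the fraction of zero weights."""
    rng = np.random.default_rng(seed)
    return rng.choice(
        [-1.0, 0.0, 1.0],
        size=(channels, k, k),
        p=[(1 - sparsity) / 2, sparsity, (1 - sparsity) / 2],
    )

def dw_lbc_block(x, kernels, pointwise):
    """Depthwise pass with fixed binary kernels, ReLU, then a learned 1x1
    pointwise mix (the only trainable part of this sketch).

    x: (C, H, W); kernels: (C, k, k); pointwise: (C_out, C).
    """
    c, h, w = x.shape
    k = kernels.shape[1]
    out_h, out_w = h - k + 1, w - k + 1
    dw = np.empty((c, out_h, out_w))
    for ch in range(c):  # depthwise: one fixed kernel per input channel
        for i in range(out_h):
            for j in range(out_w):
                dw[ch, i, j] = np.sum(x[ch, i:i + k, j:j + k] * kernels[ch])
    dw = np.maximum(dw, 0.0)  # ReLU
    return np.tensordot(pointwise, dw, axes=1)  # (C_out, out_h, out_w)
```

Because the depthwise kernels are fixed, only the `(C_out, C)` pointwise weights need to be learned, which is where the large parameter savings over a standard convolution come from.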
Diving deeper into river ice segmentation with resource constraints, we take on a separate task of training a segmentation model when labelling time is limited. As the ice type, environment, and image quality can vary drastically between rivers of interest, training new segmentation models for new environments can be infeasible due to the laborious task of pixel-wise annotation. We explore a point labelling method leveraging object proposals and a post-processing technique that delivers a 14.6% increase in mIoU compared to a fully supervised UNet with the same labelling budget. Our point labelling method also achieves an mIoU that is only 6.3% lower than that of a fully supervised model with an annotation budget 23x larger.