
    On information captured by neural networks: connections with memorization and generalization

    Despite the popularity and success of deep learning, there is limited understanding of when, how, and why neural networks generalize to unseen examples. Since learning can be seen as extracting information from data, we formally study information captured by neural networks during training. Specifically, we start by viewing learning in the presence of noisy labels from an information-theoretic perspective and derive a learning algorithm that limits label noise information in weights. We then define a notion of unique information that an individual sample provides to the training of a deep network, shedding some light on the behavior of neural networks on examples that are atypical, ambiguous, or belong to underrepresented subpopulations. We relate example informativeness to generalization by deriving nonvacuous generalization gap bounds. Finally, by studying knowledge distillation, we highlight the important role of data and label complexity in generalization. Overall, our findings contribute to a deeper understanding of the mechanisms underlying neural network generalization. Comment: PhD thesis

    Segmentation of Pathology Images: A Deep Learning Strategy with Annotated Data

    Cancer has significantly threatened human life and health for many years. In the clinic, histopathology image segmentation is the gold standard for evaluating predictions of patient prognosis and treatment outcome. Generally, manually labelling tumour regions in hundreds of high-resolution histopathological images is time-consuming and expensive for pathologists. Recently, advancements in hardware and computer vision have allowed deep-learning-based methods to become the mainstream way to segment tumours automatically, significantly reducing the workload of pathologists. However, most current methods rely on large-scale labelled histopathological images. Therefore, this research studies label-effective tumour segmentation methods using deep-learning paradigms to relieve the annotation limitations. Chapter 3 proposes an ensemble framework for fully-supervised tumour segmentation. Usually, the performance of an individually trained network is limited by significant morphological variances in histopathological images. We propose a fully-supervised learning ensemble fusion model that uses both shallow and deep U-Nets, trained with images of different resolutions and subsets of images, for robust predictions of tumour regions. Noise elimination is achieved with Convolutional Conditional Random Fields. Two open datasets are used to evaluate the proposed method: the ACDC@LungHP challenge at ISBI2019 and the DigestPath challenge at MICCAI2019. With a Dice coefficient of 79.7%, the proposed method takes third place in ACDC@LungHP. In DigestPath 2019, the proposed method achieves a Dice coefficient of 77.3%. Well-annotated images are an indispensable part of training fully-supervised segmentation strategies. However, large-scale histopathology images are rarely annotated finely in clinical practice. It is common for labels to be of poor quality or for only a few images to be manually marked by experts. Consequently, fully-supervised methods cannot perform well in these cases.
Chapter 4 proposes a self-supervised contrastive learning method for tumour segmentation. A self-supervised cancer segmentation framework is proposed to reduce label dependency. An innovative contrastive learning scheme is developed to represent tumour features based on unlabelled images. Unlike a normal U-Net, the backbone is a patch-based segmentation network. Additionally, data augmentation and contrastive losses are applied to improve the discriminability of tumour features. A convolutional Conditional Random Field is used to smooth the predictions and eliminate noise. Three labelled and fourteen unlabelled images are collected from a private skin cancer dataset called BSS. Experimental results show that the proposed method achieves better tumour segmentation performance than other popular self-supervised methods. However, when evaluated on the same public dataset as in Chapter 3, the proposed self-supervised method struggles with fine-grained segmentation around tumour boundaries compared with the supervised method we proposed. Chapter 5 proposes a sketch-based weakly-supervised tumour segmentation method. To segment tumour regions precisely with coarse annotations, a sketch-supervised method is proposed, containing a dual CNN-Transformer network and a global normalised class activation map. CNN-Transformer networks simultaneously model global and local tumour features. With the global normalised class activation map, a gradient-based tumour representation can be obtained from the dual network predictions. We invited experts to mark fine and coarse annotations in the private BSS and the public PAIP2019 datasets to facilitate reproducible performance comparisons. Using the BSS dataset, the proposed method achieves 76.686% IoU and 86.6% Dice scores, outperforming state-of-the-art methods. Additionally, the proposed method achieves a Dice gain of 8.372% compared with U-Net on the PAIP2019 dataset.
The thesis presents three approaches to segmenting cancers from histology images: fully-supervised, unsupervised, and weakly supervised methods. This research effectively segments tumour regions based on histopathological annotations and well-designed modules. Our studies comprehensively demonstrate label-effective automatic histopathological image segmentation. Experimental results show that our methods achieve state-of-the-art segmentation performance on private and public datasets. In the future, we plan to integrate more tumour feature representation technologies with other medical modalities and apply them to clinical research.
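
The Dice and IoU scores quoted in this abstract are overlap measures between a predicted mask and a reference mask. As a minimal sketch of the standard definitions (not the thesis code), assuming binary masks stored as nested lists of 0/1 and a hypothetical helper name `dice_and_iou`:

```python
def dice_and_iou(pred, target):
    """Return (dice, iou) for two equal-sized binary masks (nested lists of 0/1)."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_target = [bool(v) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(p and t for p, t in zip(flat_pred, flat_target))
    pred_area = sum(flat_pred)
    target_area = sum(flat_target)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy masks (illustrative values only, not data from the thesis).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.3f} IoU={iou:.3f}")
```

For a single pair of masks the two scores are linked by Dice = 2·IoU / (1 + IoU); scores averaged over many images need not satisfy this relation exactly.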

    Reinforcement learning in large state action spaces

    Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long-term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem for ensuring real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization, and a lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios. This thesis aims to bridge this gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we propose the first results on several different problems: e.g. tensorization of the Bellman equation which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS (Chapter 3), combinatorial generalization results in cooperative MAS (Chapter 5), generalization results on observation shifts (Chapter 7), learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we also shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.

    Understanding and Mitigating Privacy Vulnerabilities in Deep Learning

    Advancements in Deep Learning (DL) have enabled leveraging large-scale datasets to train models that perform challenging tasks at a level that mimics human intelligence. In several real-world scenarios, the data used for training, the trained model, and the data used for inference can be private and distributed across multiple distrusting parties, posing a challenge for training and inference. Several privacy-preserving training and inference frameworks have been developed to address this challenge. For instance, frameworks like federated learning and split learning have been proposed to train a model collaboratively on distributed data without explicitly sharing the private data to protect training data privacy. To protect model privacy during inference, the model owners have adopted a client-server architecture to provide inference services, wherein the end-users are only allowed black-box access to the model’s predictions for their input queries. The goal of this thesis is to provide a better understanding of the privacy properties of the DL frameworks used for privacy-preserving training and inference. While these frameworks have the appearance of keeping the data and model private, the information exchanged during training/inference has the potential to compromise the privacy of the parties involved by leaking sensitive data. We aim to understand if these frameworks are truly capable of preventing the leakage of model and training data in realistic settings. In this pursuit, we discover new vulnerabilities that can be exploited to design powerful attacks that can overcome the limitations of prior works and break the illusion of privacy. Our findings highlight the limitations of these frameworks and underscore the importance of principled techniques to protect privacy. Furthermore, we leverage our improved understanding to design better defenses that can significantly deter the efficacy of an attack. Ph.D

    The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation

    In this paper, we introduce a novel learning scheme named weakly semi-supervised instance segmentation (WSSIS) with point labels for budget-efficient and high-performance instance segmentation. Namely, we consider a dataset setting consisting of a few fully-labeled images and a lot of point-labeled images. Motivated by the observation that the main challenge of semi-supervised approaches derives from the trade-off between false-negative and false-positive instance proposals, we propose a method for WSSIS that can effectively leverage the budget-friendly point labels as a powerful weak supervision source to resolve the challenge. Furthermore, to deal with the hard case where the amount of fully-labeled data is extremely limited, we propose a MaskRefineNet that refines noise in rough masks. We conduct extensive experiments on COCO and BDD100K datasets, and the proposed method achieves promising results comparable to those of the fully-supervised model, even with 50% of the fully labeled COCO data (38.8% vs. 39.7%). Moreover, when using as little as 5% of fully labeled COCO data, our method shows significantly superior performance over the state-of-the-art semi-supervised learning method (33.7% vs. 24.9%). The code is available at https://github.com/clovaai/PointWSSIS. Comment: CVPR 202

    Application of improved you only look once model in road traffic monitoring system

    The present research focuses on developing an intelligent traffic management solution for tracking vehicles on roads. Our proposed work focuses on an improved you only look once (YOLOv4) traffic monitoring system that uses the CSPDarknet53 architecture as its foundation. A DeepSORT methodology for multi-target vehicle detection from traffic video is also part of our research study. We have included features like the Kalman filter, which estimates the unknown state of objects and can track moving targets. The Hungarian technique associates each object with the correct detection across frames. We use an enhanced object detection network design and new data augmentation techniques with YOLOv4, which ultimately aids in traffic monitoring. Until recently, object identification models could either perform quickly or draw conclusions quickly. YOLOv4 was a big improvement, as it achieves very good performance at a very high frame rate (frames per second, FPS). The current study is focused on developing an intelligent video surveillance-based vehicle tracking system that tracks vehicles using a neural network, image-based tracking, and YOLOv4. Real video sequences of road traffic are used to test the effectiveness of the method suggested in the research. Through simulations, it is demonstrated that the suggested technique significantly increases graphics processing unit (GPU) speed and FPS as compared to baseline algorithms.
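
The association step mentioned in this abstract can be illustrated with a small sketch. It assumes that a Kalman filter has already predicted one bounding box per track, and it matches those boxes to new detections by minimising a total cost of 1 - IoU. The Hungarian method solves this assignment problem in polynomial time; the exhaustive search below is a stand-in that returns the same optimal assignment for a handful of boxes. The function names and box values are hypothetical and not taken from the paper.

```python
from itertools import permutations


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_tracks(tracks, detections):
    """Assign each track to a distinct detection, minimising total (1 - IoU).

    Assumes len(tracks) <= len(detections). Brute force over permutations,
    so only suitable for small examples.
    """
    best_cost, best = float("inf"), ()
    for perm in permutations(range(len(detections)), len(tracks)):
        cost = sum(1.0 - box_iou(tracks[i], detections[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return list(enumerate(best))  # [(track_index, detection_index), ...]


# Predicted track boxes (assumed Kalman output) and detections in a new frame.
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11)]
matches = match_tracks(tracks, detections)
print(matches)
```

In a real tracker the brute-force loop would be replaced by a polynomial-time solver, and matches whose IoU falls below a threshold would usually be rejected.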

    Synthetic Aperture Radar (SAR) Meets Deep Learning

    This reprint focuses on the combined application of synthetic aperture radar and deep learning technology. It aims to further promote the development of SAR image intelligent interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor, whose all-day and all-weather working capacity gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecasting, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning represented by convolutional neural networks has promoted significant progress in the computer vision community, e.g., in face recognition, the driverless field and Internet of things (IoT). Deep learning can enable computational models with multiple processing layers to learn data representations with multiple-level abstractions. This can greatly improve the performance of various applications. This reprint provides a platform for researchers to handle the above significant challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews and technical reports.

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalization and chipping, as well as strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation techniques, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery. Comment: 145 pages with 32 figures
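
Two of the pre-processing steps named in this abstract, normalization and chipping, can be sketched briefly. The example assumes a single-band image stored as a nested list, min-max scaling to [0, 1], and square chips cut on a regular grid. The function names, chip size and stride are illustrative choices, not recommendations from the review.

```python
def normalize(image):
    """Min-max scale pixel values to the range [0, 1]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # Assumption: a constant image maps to all zeros.
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / span for v in row] for row in image]


def chip(image, size, stride):
    """Yield (row, col, tile) for every full size x size tile at the given stride.

    Partial tiles at the right and bottom edges are skipped in this sketch.
    """
    height, width = len(image), len(image[0])
    for r in range(0, height - size + 1, stride):
        for c in range(0, width - size + 1, stride):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]


# A toy 4 x 4 single-band image with pixel values 0..15.
image = [[4 * r + c for c in range(4)] for r in range(4)]
normalized = normalize(image)
tiles = list(chip(image, size=2, stride=2))
print(len(tiles), tiles[0])
```

A stride smaller than the chip size produces overlapping chips, which is one common way to reduce edge effects when predictions are stitched back together.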

    ICEBEAR-3D: An Advanced Low Elevation Angle Auroral E region Imaging Radar

    The Ionospheric Continuous-wave E region Bistatic Experimental Auroral Radar (ICEBEAR) is an auroral E region radar which operated from 7 December 2017 until September 2019. During the first two years of operation, ICEBEAR was only capable of spatially locating E region scatter and meteor trail targets in range and azimuth. Elevation angles were not determinable due to its East-West uniform linear receiving antenna array. Measuring elevation angles of targets when viewing from low elevation angles with radar interferometers has been a long-standing problem. Past high latitude radars have attempted to obtain elevation angles of E region targets using North-South baselines, but these attempts have always resulted in erroneous elevation angles being measured in the low elevation regime (0° to ≈30° above the horizon), leaving interesting scientific questions about scatter altitudes in the auroral E region unanswered. The work in this thesis encompasses the design of the ICEBEAR-3D system for the acquisition of these important elevation angles. The receiver antenna array was redesigned using a custom phase error minimization and stochastic antenna location perturbation technique, which produces phase tolerant receiver antenna arrays. The resulting 45-baseline sparse non-uniform coplanar T-shaped array was designed for aperture synthesis radar imaging. Conventional aperture synthesis radar imaging techniques assume point-like incoherent targets and image using a Cartesian basis over a narrow field of view. These methods are incompatible with horizon pointing E region radars such as ICEBEAR. Instead, radar targets were imaged using the Suppressed Spherical Wave Harmonic Transform (Suppressed-SWHT) technique. This imaging method uses precalculated spherical harmonic coefficient matrices to transform the visibilities to brightness maps by direct matrix multiplication.
The undersampled image domain artefacts (dirty beam) were suppressed by the products of differing harmonic order brightness maps. From the images, elevation and azimuth angles of arrival were obtained. Due to the excellent phase tolerance of ICEBEAR, new light was shed on the long-standing low elevation angle problem. This led to the development of the proper phase reference vertical interferometry geometry, which allowed horizon pointing radar interferometers to unambiguously measure elevation angles near the horizon, ultimately resulting in accurate elevation angles from zenith to horizon.