6,104 research outputs found
On information captured by neural networks: connections with memorization and generalization
Despite the popularity and success of deep learning, there is limited
understanding of when, how, and why neural networks generalize to unseen
examples. Since learning can be seen as extracting information from data, we
formally study information captured by neural networks during training.
Specifically, we start with viewing learning in presence of noisy labels from
an information-theoretic perspective and derive a learning algorithm that
limits label noise information in weights. We then define a notion of unique
information that an individual sample provides to the training of a deep
network, shedding some light on the behavior of neural networks on examples
that are atypical, ambiguous, or belong to underrepresented subpopulations. We
relate example informativeness to generalization by deriving nonvacuous
generalization gap bounds. Finally, by studying knowledge distillation, we
highlight the important role of data and label complexity in generalization.
Overall, our findings contribute to a deeper understanding of the mechanisms
underlying neural network generalization. Comment: PhD thesis
Segmentation of Pathology Images: A Deep Learning Strategy with Annotated Data
Cancer has significantly threatened human life and health for many years. In the clinic, histopathology image segmentation is the gold standard for predicting patient prognosis and treatment outcome. Manually labelling tumour regions in hundreds of high-resolution histopathological images is time-consuming and expensive for pathologists. Recently, advances in hardware and computer vision have made deep-learning-based methods the mainstream way to segment tumours automatically, significantly reducing the workload of pathologists. However, most current methods rely on large-scale labelled histopathological images. Therefore, this research studies label-effective tumour segmentation methods using deep-learning paradigms to relieve the annotation limitations. Chapter 3 proposes an ensemble framework for fully-supervised tumour segmentation. Usually, the performance of an individually trained network is limited by significant morphological variance in histopathological images. We propose a fully-supervised ensemble fusion model that uses both shallow and deep U-Nets, trained with images of different resolutions and subsets of images, for robust predictions of tumour regions. Noise elimination is achieved with Convolutional Conditional Random Fields. Two open datasets are used to evaluate the proposed method: the ACDC@LungHP challenge at ISBI2019 and the DigestPath challenge at MICCAI2019. With a Dice coefficient of 79.7%, the proposed method takes third place in ACDC@LungHP. In DigestPath 2019, the proposed method achieves a Dice coefficient of 77.3%. Well-annotated images are an indispensable part of training fully-supervised segmentation strategies. However, large-scale histopathology images are rarely annotated finely in clinical practice. It is common for labels to be of poor quality or for only a few images to be manually marked by experts. Consequently, fully-supervised methods cannot perform well in these cases.
Chapter 4 proposes a self-supervised contrastive learning framework for tumour segmentation that reduces label dependency. An innovative contrastive learning scheme is developed to represent tumour features based on unlabelled images. Unlike a normal U-Net, the backbone is a patch-based segmentation network. Additionally, data augmentation and contrastive losses are applied to improve the discriminability of tumour features. A convolutional Conditional Random Field is used to smooth and eliminate noise. Three labelled and fourteen unlabelled images are collected from a private skin cancer dataset called BSS. Experimental results show that the proposed method achieves better tumour segmentation performance than other popular self-supervised methods. However, when evaluated on the same public datasets as Chapter 3, the proposed self-supervised method struggles with fine-grained segmentation around tumour boundaries compared to the supervised method we proposed. Chapter 5 proposes a sketch-based weakly-supervised tumour segmentation method. To segment tumour regions precisely with coarse annotations, a sketch-supervised method is proposed, containing a dual CNN-Transformer network and a global normalised class activation map. CNN-Transformer networks simultaneously model global and local tumour features. With the global normalised class activation map, a gradient-based tumour representation can be obtained from the dual network predictions. We invited experts to mark fine and coarse annotations in the private BSS and the public PAIP2019 datasets to facilitate reproducible performance comparisons. On the BSS dataset, the proposed method achieves 76.686% IoU and 86.6% Dice scores, outperforming state-of-the-art methods. Additionally, the proposed method achieves a Dice gain of 8.372% compared with U-Net on the PAIP2019 dataset.
The thesis presents three approaches to segmenting cancers from histology images: fully-supervised, unsupervised, and weakly supervised methods. This research effectively segments tumour regions based on histopathological annotations and well-designed modules. Our studies comprehensively demonstrate label-effective automatic histopathological image segmentation. Experimental results show that our methods achieve state-of-the-art segmentation performance on private and public datasets. In the future, we plan to integrate more tumour feature representation technologies with other medical modalities and apply them to clinical research.
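The Dice and IoU scores quoted in the abstract above are standard overlap measures between a predicted mask and a reference mask. As a minimal sketch only, assuming binary masks stored as nested lists of 0/1 (the function name `dice_and_iou` and the empty-mask convention are illustrative assumptions, not taken from the thesis), they could be computed as follows:

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Count pixels that are foreground in both masks, and in each mask alone.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    pred_area = sum(1 for p in flat_pred if p)
    target_area = sum(1 for t in flat_target if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Both masks are empty. Treating this as perfect agreement is a
        # convention assumed here, not something the abstract specifies.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Hypothetical 2x3 masks, for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
```

For a single pair of masks the two scores are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Scores averaged over many images need not satisfy this identity exactly.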
Reinforcement learning in large state action spaces
Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real world deployment of RL systems. However, several challenges limit the applicability of RL to large scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization and lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios.
This thesis aims to bridge the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we propose the first results on several different problems: e.g. tensorization of the Bellman equation which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS (Chapter 3), combinatorial generalization results in cooperative MAS (Chapter 5), generalization results on observation shifts (Chapter 7), and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large scale, real world applications.
Rigorous Experimentation For Reinforcement Learning
Scientific fields make advancements by leveraging the knowledge created by others to push the boundary of understanding. The primary tool in many fields for generating knowledge is empirical experimentation. Although common, generating accurate knowledge from empirical experiments is often challenging due to inherent randomness in execution and confounding variables that can obscure the correct interpretation of the results. As such, researchers must hold themselves and others to a high degree of rigor when designing experiments. Unfortunately, most reinforcement learning (RL) experiments lack this rigor, making the knowledge generated from experiments dubious. This dissertation proposes methods to address central issues in RL experimentation.
Evaluating the performance of an RL algorithm is the most common type of experiment in RL literature. Most performance evaluations are incapable of answering a specific research question and produce misleading results. Thus, the first issue we address is how to create a performance evaluation procedure that holds up to scientific standards.
Despite the prevalence of performance evaluation, these types of experiments produce limited knowledge, e.g., they can only show how well an algorithm worked and not why, and they require significant amounts of time and computational resources. As an alternative, this dissertation proposes that scientific testing, the process of conducting carefully controlled experiments designed to further the knowledge and understanding of how an algorithm works, should be the primary form of experimentation.
Lastly, this dissertation provides a case study using policy gradient methods, showing how scientific testing can replace performance evaluation as the primary form of experimentation. As a result, this dissertation can motivate others in the field to adopt more rigorous experimental practices.
Understanding and Mitigating Privacy Vulnerabilities in Deep Learning
Advancements in Deep Learning (DL) have enabled leveraging large-scale datasets to train models that perform challenging tasks at a level that mimics human intelligence. In several real-world scenarios, the data used for training, the trained model, and the data used for inference can be private and distributed across multiple distrusting parties, posing a challenge for training and inference. Several privacy-preserving training and inference frameworks have been developed to address this challenge. For instance, frameworks like federated learning and split learning have been proposed to train a model collaboratively on distributed data without explicitly sharing the private data to protect training data privacy. To protect model privacy during inference, the model owners have adopted a client-server architecture to provide inference services, wherein the end-users are only allowed black-box access to the model’s predictions for their input queries.
The goal of this thesis is to provide a better understanding of the privacy properties of the DL frameworks used for privacy-preserving training and inference. While these frameworks have the appearance of keeping the data and model private, the information exchanged during training/inference has the potential to compromise the privacy of the parties involved by leaking sensitive data. We aim to understand if these frameworks are truly capable of preventing the leakage of model and training data in realistic settings. In this pursuit, we discover new vulnerabilities that can be exploited to design powerful attacks that can overcome the limitations of prior works and break the illusion of privacy. Our findings highlight the limitations of these frameworks and underscore the importance of principled techniques to protect privacy. Furthermore, we leverage our improved understanding to design better defenses that can significantly deter the efficacy of an attack. Ph.D.
The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation
In this paper, we introduce a novel learning scheme named weakly
semi-supervised instance segmentation (WSSIS) with point labels for
budget-efficient and high-performance instance segmentation. Namely, we
consider a dataset setting consisting of a few fully-labeled images and a lot
of point-labeled images. Motivated by the observation that the main challenge
of semi-supervised approaches derives mainly from the trade-off between
false-negative and false-positive instance proposals, we propose a method for
WSSIS that can
effectively leverage the budget-friendly point labels as a powerful weak
supervision source to resolve the challenge. Furthermore, to deal with the hard
case where the amount of fully-labeled data is extremely limited, we propose a
MaskRefineNet that refines noise in rough masks. We conduct extensive
experiments on COCO and BDD100K datasets, and the proposed method achieves
promising results comparable to those of the fully-supervised model, even with
50% of the fully labeled COCO data (38.8% vs. 39.7%). Moreover, when using as
little as 5% of fully labeled COCO data, our method shows significantly
superior performance over the state-of-the-art semi-supervised learning method
(33.7% vs. 24.9%). The code is available at
https://github.com/clovaai/PointWSSIS. Comment: CVPR 202
Application of improved you only look once model in road traffic monitoring system
The present research focuses on developing an intelligent traffic management solution for tracking vehicles on roads. Our proposed work focuses on an improved you only look once (YOLOv4) traffic monitoring system that uses the CSPDarknet53 architecture as its foundation. The DeepSORT methodology for multi-target vehicle detection from traffic video is also part of our research study. We have included features like the Kalman filter, which estimates the unknown state of objects and can track moving targets. The Hungarian technique matches each object to the correct detection in a frame. We use an enhanced object detection network design and new data augmentation techniques with YOLOv4, which ultimately aids in traffic monitoring. Until recently, object identification models could either run quickly or predict accurately, but not both. YOLOv4 is a big improvement in this respect, as it achieves very good performance at a very high frames per second (FPS). The current study is focused on developing an intelligent video surveillance-based vehicle tracking system that tracks vehicles using a neural network, image-based tracking, and YOLOv4. Real video sequences of road traffic are used to test the effectiveness of the suggested method. Through simulations, it is demonstrated that the suggested technique significantly increases graphics processing unit (GPU) speed and FPS as compared to baseline algorithms.
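The association step mentioned in the abstract above (matching existing tracks to new detections) can be illustrated with a small sketch. The abstract does not state the cost function, so box overlap (IoU) is assumed here. To stay dependency-free, the sketch replaces the Hungarian algorithm with a brute-force search over permutations, which finds the same optimal assignment for a handful of objects but does not scale. The names `iou`, `match_tracks` and `min_iou` are hypothetical.

```python
from itertools import permutations


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_tracks(tracks, detections, min_iou=0.3):
    """Assign each track to at most one detection, maximising total IoU.

    Brute-force stand-in for the Hungarian algorithm: it tries every
    one-to-one assignment, so it is only practical for a few objects.
    Assumes len(tracks) <= len(detections).
    """
    if len(tracks) > len(detections):
        raise ValueError("sketch assumes at least as many detections as tracks")
    best_total, best_perm = -1.0, ()
    for perm in permutations(range(len(detections)), len(tracks)):
        total = sum(iou(tracks[i], detections[j]) for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    # Drop pairs whose overlap is too small to count as the same object.
    return [(i, j) for i, j in enumerate(best_perm)
            if iou(tracks[i], detections[j]) >= min_iou]


# Hypothetical boxes: two predicted track positions, three new detections.
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
matches = match_tracks(tracks, detections)  # list of (track_index, detection_index)
```

In a full tracker the track boxes would typically be Kalman-filter predictions for the current frame, and unmatched detections would start new tracks.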
Synthetic Aperture Radar (SAR) Meets Deep Learning
This reprint focuses on applications that combine synthetic aperture radar with deep learning technology. It aims to further promote the development of SAR image intelligent interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor, whose all-day and all-weather working capacity gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecast, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning represented by convolutional neural networks has promoted significant progress in the computer vision community, e.g., in face recognition, the driverless field and Internet of things (IoT). Deep learning can enable computational models with multiple processing layers to learn data representations with multiple-level abstractions. This can greatly improve the performance of various applications. This reprint provides a platform for researchers to handle the above significant challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews and technical reports.
A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery
Semantic segmentation (classification) of Earth Observation imagery is a
crucial task in remote sensing. This paper presents a comprehensive review of
technical factors to consider when designing neural networks for this purpose.
The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural
Networks (RNNs), Generative Adversarial Networks (GANs), and transformer
models, discussing prominent design patterns for these ANN families and their
implications for semantic segmentation. Common pre-processing techniques for
ensuring optimal data preparation are also covered. These include methods for
image normalization and chipping, as well as strategies for addressing data
imbalance in training samples, and techniques for overcoming limited data,
including augmentation techniques, transfer learning, and domain adaptation. By
encompassing both the technical aspects of neural network design and the
data-related considerations, this review provides researchers and practitioners
with a comprehensive and up-to-date understanding of the factors involved in
designing effective neural networks for semantic segmentation of Earth
Observation imagery. Comment: 145 pages with 32 figures
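Two of the pre-processing steps named in the abstract above, chipping and normalization, can be sketched in a few lines. This is an illustrative example only: the review's own procedures are not given in the abstract, so the tiling policy (a final chip placed flush with the image edge), the per-chip min-max scaling, and the names `chip_origins`, `chip_image` and `min_max_normalize` are all assumptions.

```python
def chip_origins(length, chip, stride):
    """Start offsets of chips along one axis, with a last chip flush to the edge."""
    if chip > length:
        raise ValueError("chip size exceeds image size")
    origins = list(range(0, length - chip + 1, stride))
    if origins[-1] != length - chip:
        origins.append(length - chip)  # cover the remaining border pixels
    return origins


def chip_image(image, chip, stride):
    """Yield (row, col, tile) for square chips cut from a 2D nested-list image."""
    height, width = len(image), len(image[0])
    for r in chip_origins(height, chip, stride):
        for c in chip_origins(width, chip, stride):
            yield r, c, [row[c:c + chip] for row in image[r:r + chip]]


def min_max_normalize(tile):
    """Scale tile values to [0, 1]; a constant tile maps to all zeros."""
    values = [v for row in tile for v in row]
    lo, hi = min(values), max(values)
    scale = (hi - lo) or 1  # avoid division by zero for constant tiles
    return [[(v - lo) / scale for v in row] for row in tile]


# Hypothetical 5x5 single-band image with pixel value = row * 5 + col.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
chips = list(chip_image(image, chip=4, stride=4))
first_tile = min_max_normalize(chips[0][2])
```

Overlapping the last chip with its neighbour keeps every chip the same size without padding. Padding or dropping the border are equally common choices.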
ICEBEAR-3D: An Advanced Low Elevation Angle Auroral E region Imaging Radar
The Ionospheric Continuous-wave E region Bistatic Experimental Auroral Radar (ICEBEAR) is an auroral E region radar which operated from 7 December 2017 until September 2019. During the first two years of operation, ICEBEAR was only capable of spatially locating E region scatter and meteor trail targets in range and azimuth. Elevation angles were not determinable due to its East-West uniform linear receiving antenna array. Measuring elevation angles of targets when viewing from low elevation angles with radar interferometers has been a long-standing problem. Past high latitude radars have attempted to obtain elevation angles of E region targets using North-South baselines, but these attempts have always resulted in erroneous elevation angles being measured in the low elevation regime (0° to ≈30° above the horizon), leaving interesting scientific questions about scatter altitudes in the auroral E region unanswered. The work in this thesis encompasses the design of the ICEBEAR-3D system for the acquisition of these important elevation angles.
The receiver antenna array was redesigned using a custom phase error minimization and stochastic antenna location perturbation technique, which produces phase tolerant receiver antenna arrays. The resulting 45-baseline sparse non-uniform coplanar T-shaped array was designed for aperture synthesis radar imaging. Conventional aperture synthesis radar imaging techniques assume point-like incoherent targets and image using a Cartesian basis over a narrow field of view. These methods are incompatible with horizon pointing E region radars such as ICEBEAR. Instead, radar targets were imaged using the Suppressed Spherical Wave Harmonic Transform (Suppressed-SWHT) technique. This imaging method uses precalculated spherical harmonic coefficient matrices to transform the visibilities to brightness maps by direct matrix multiplication. The undersampled image domain artefacts (dirty beam) were suppressed by the products of differing harmonic order brightness maps. From the images, elevation and azimuth angles of arrival were obtained. Due to the excellent phase tolerance of ICEBEAR, new light was shed on the long-standing low elevation angle problem. This led to the development of the proper phase reference vertical interferometry geometry, which allowed horizon pointing radar interferometers to unambiguously measure elevation angles near the horizon. This ultimately resulted in accurate elevation angles from zenith to horizon.