78 research outputs found
Verification of unsupervised neural networks
Neural networks are at the forefront of machine learning, being responsible for
achievements such as AlphaGo. As they are being deployed in more and more
environments, even in safety-critical ones such as health care, we are naturally
interested in assuring their reliability. However, the discovery of so-called
adversarial attacks on supervised neural networks demonstrated that tiny distortions
in the input space can lead to misclassifications and thus to potentially
catastrophic errors: patients could be diagnosed wrongly, or a car might confuse stop
signs and traffic lights. Thus, ideally, we would like to guarantee that these types
of attacks cannot occur.
In this thesis we extend the research on reliable neural networks to the realm
of unsupervised learning. This includes defining proper notions of reliability,
as well as analyzing and adapting unsupervised neural networks with respect
to this notion. Our definitions of reliability depend on the underlying neural
networks and the problems they are meant to solve. However, in all our cases, we
aim for guarantees on a continuous input space containing infinitely many points.
We therefore go beyond the traditional setting of testing against a finite dataset,
which means that we require specialized tools to actually check a given network for reliability.
We will demonstrate how we can leverage neural network verification for these
purposes. Using neural network verification, however, entails a major challenge:
It does not scale up to large networks. To overcome this limitation, we design a
novel training procedure yielding networks that are both more reliable according
to our definition and more amenable to neural network verification. By
exploiting the piecewise affine structure of our networks, we can locally simplify
them and thus decrease verification runtime significantly. We also take a
perspective that complements a neural network’s training by exploring how we can
repair non-reliable neural network ensembles. With this thesis, we show by example the necessity and the complications of unsupervised neural network verification. It aims to pave the way for more research to come and towards a safe usage of these simple-to-build yet difficult-to-understand models given by unsupervised neural networks.
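The remark about exploiting piecewise affine structure can be illustrated with a small sketch. It is not the thesis's procedure; it only assumes a ReLU layer and uses standard interval arithmetic to find neurons whose activation phase is fixed on a given input box. Such neurons behave as a plain affine map (or as zero) on that box, which is one way a network can be simplified locally. The function names, weights, and box are hypothetical.

```python
def affine_bounds(weights, bias, lower, upper):
    """Interval bounds of W x + b for every x in the box [lower, upper]."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def classify_relu_neurons(weights, bias, lower, upper):
    """Label each neuron 'active', 'inactive', or 'unstable' on the box."""
    lo, hi = affine_bounds(weights, bias, lower, upper)
    labels = []
    for l, u in zip(lo, hi):
        if l >= 0:
            labels.append("active")    # ReLU is the identity on the whole box
        elif u <= 0:
            labels.append("inactive")  # ReLU outputs 0 on the whole box
        else:
            labels.append("unstable")  # both phases are possible
    return labels


# Hypothetical layer with 2 inputs and 3 neurons, checked on the box [-1, 1]^2.
W = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
b = [3.0, -3.0, 0.0]
labels = classify_relu_neurons(W, b, lower=[-1.0, -1.0], upper=[1.0, 1.0])
print(labels)
```

Only the neurons labelled "unstable" require case splitting by a verifier, so the fewer of them remain on a region, the cheaper the check tends to be.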
A comprehensive review of 3D convolutional neural network-based classification techniques of diseased and defective crops using non-UAV-based hyperspectral images
Hyperspectral imaging (HSI) is a non-destructive and contactless technology that provides valuable information about the structure and composition of an object. It has the ability to capture detailed information about the chemical and physical properties of agricultural crops. Due to its wide spectral range, compared with multispectral or RGB-based imaging methods, HSI can be a more effective tool for monitoring crop health and productivity. With the advent of this imaging tool in agrotechnology, researchers can more accurately address issues related to the detection of diseased and defective crops in the agriculture industry. This makes it possible to implement the most suitable and accurate farming solutions, such as irrigation and fertilization, before crops enter a damaged and difficult-to-recover phase of growth in the field. While HSI provides valuable insights into the object under investigation, the limited number of HSI datasets for crop evaluation presently poses a bottleneck. Dealing with the curse of dimensionality presents another challenge due to the abundance of spectral and spatial information in each hyperspectral cube. State-of-the-art methods based on 1D and 2D convolutional neural networks (CNNs) struggle to efficiently extract spectral and spatial information. On the other hand, 3D-CNN-based models have shown significant promise in achieving better classification and detection results by leveraging spectral and spatial features simultaneously. Despite the apparent benefits of 3D-CNN-based models, their usage for classification purposes in this area of research has remained limited. This paper seeks to address this gap by reviewing 3D-CNN-based architectures and the typical deep learning pipeline, including preprocessing and visualization of results, for the classification of hyperspectral images of diseased and defective crops.
Furthermore, we discuss open research areas and challenges when utilizing 3D-CNNs with HSI data.
"This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors."
https://www.sciencedirect.com/science/article/pii/S277237552300145
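To make concrete what "leveraging spectral and spatial features simultaneously" means, here is a minimal sketch of a single 3D convolution (strictly, a cross-correlation, as in most deep learning libraries) over a hyperspectral cube. It is written in plain Python for readability and is not taken from the reviewed paper. The cube layout (bands, height, width), the toy data, and the averaging kernel are assumptions made for illustration.

```python
def conv3d_valid(cube, kernel):
    """'Valid' 3D cross-correlation of a (bands, height, width) cube with a kernel."""
    B, H, W = len(cube), len(cube[0]), len(cube[0][0])
    kb, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for b in range(B - kb + 1):            # slide along the spectral axis
        plane = []
        for i in range(H - kh + 1):        # slide along image rows
            row = []
            for j in range(W - kw + 1):    # slide along image columns
                acc = 0.0
                for db in range(kb):
                    for di in range(kh):
                        for dj in range(kw):
                            acc += cube[b + db][i + di][j + dj] * kernel[db][di][dj]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy cube: 5 spectral bands of 4x4 pixels; the value at (b, i, j) is b + i + j.
cube = [[[float(b + i + j) for j in range(4)] for i in range(4)] for b in range(5)]
# Hypothetical 3x3x3 averaging kernel that mixes neighbouring bands and pixels.
kernel = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]
features = conv3d_valid(cube, kernel)
print(len(features), len(features[0]), len(features[0][0]))
```

A 2D convolution would apply its kernel within each band (or collapse all bands at once), whereas the 3D kernel also slides along the spectral axis, so each output value depends on a local neighbourhood in both space and wavelength.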
Quantum machine learning: a classical perspective
Recently, increased computational power and data availability, as well as
algorithmic advances, have led machine learning techniques to impressive
results in regression, classification, data-generation and reinforcement
learning tasks. Despite these successes, the proximity to the physical limits
of chip fabrication alongside the increasing size of datasets are motivating a
growing number of researchers to explore the possibility of harnessing the
power of quantum computation to speed-up classical machine learning algorithms.
Here we review the literature in quantum machine learning and discuss
perspectives for a mixed readership of classical machine learning and quantum
computation experts. Particular emphasis will be placed on clarifying the
limitations of quantum algorithms, how they compare with their best classical
counterparts and why quantum resources are expected to provide advantages for
learning problems. Learning in the presence of noise and certain
computationally hard problems in machine learning are identified as promising
directions for the field. Practical questions, like how to upload classical
data into quantum form, will also be addressed.
Comment: v3 33 pages; typos corrected and references added
Classical Planning in Deep Latent Space
Current domain-independent, classical planners require symbolic models of the
problem domain and instance as input, resulting in a knowledge acquisition
bottleneck. Meanwhile, although deep learning has achieved significant success
in many fields, the knowledge is encoded in a subsymbolic representation which
is incompatible with symbolic systems such as planners. We propose Latplan, an
unsupervised architecture combining deep learning and classical planning. Given
only an unlabeled set of image pairs showing a subset of transitions allowed in
the environment (training inputs), Latplan learns a complete propositional PDDL
action model of the environment. Later, when a pair of images representing the
initial and the goal states (planning inputs) is given, Latplan finds a plan to
the goal state in a symbolic latent space and returns a visualized plan
execution. We evaluate Latplan using image-based versions of 6 planning
domains: 8-puzzle, 15-puzzle, Blocksworld, Sokoban and two variations of
LightsOut.
Comment: Under review at Journal of Artificial Intelligence Research (JAIR)
Is Evolution an Algorithm? Effects of local entropy in unsupervised learning and protein evolution
The abstract is in the attachment.
Affinity-Based Reinforcement Learning : A New Paradigm for Agent Interpretability
The steady increase in complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states; preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affinity-based RL paradigm in which agents learn strategies that are partially decoupled from reward functions. Unlike entropy regularisation, we regularise the objective function with a distinct action distribution that represents a desired behaviour; we encourage the agent to act according to a prior while learning to maximise rewards. The result is an inherently interpretable agent that solves problems with an intrinsic affinity for certain actions. We demonstrate the utility of our method in a financial application: we learn continuous time-variant compositions of prototypical policies, each interpretable by its action affinities, that are globally interpretable according to customers’ financial personalities.
Our method combines advantages from both constrained RL and preference-based RL: it retains the reward function but generalises the policy to match a defined behaviour, thus avoiding problems such as reward shaping and hacking. Unlike Boolean task composition, our method is a fuzzy superposition of different prototypical strategies to arrive at a more complex, yet interpretable, strategy.
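The idea of regularising the objective with a distinct action distribution can be sketched for a single state as follows. The abstract does not give the exact form of the regulariser, so the KL divergence used here is an assumption, as are the function names, the reward values, and the prior.

```python
import math


def expected_reward(policy, rewards):
    """Expected one-step reward under a discrete action distribution."""
    return sum(p * r for p, r in zip(policy, rewards))


def kl_divergence(policy, prior):
    """KL(policy || prior); assumes every prior probability is positive."""
    return sum(p * math.log(p / q) for p, q in zip(policy, prior) if p > 0)


def affinity_objective(policy, rewards, prior, strength):
    """Reward minus a penalty for deviating from the desired action prior.

    The KL penalty is an assumed stand-in for the paper's regulariser.
    """
    return expected_reward(policy, rewards) - strength * kl_divergence(policy, prior)


rewards = [1.0, 0.0, 0.5]      # hypothetical per-action rewards
prior = [0.1, 0.8, 0.1]        # hypothetical desired behaviour (affinity)
greedy = [0.98, 0.01, 0.01]    # policy that chases reward only
aligned = [0.1, 0.8, 0.1]      # policy that follows the prior exactly

for strength in (0.0, 1.0):
    print(strength,
          affinity_objective(greedy, rewards, prior, strength),
          affinity_objective(aligned, rewards, prior, strength))
```

With strength 0 the reward-seeking policy scores higher; with strength 1 the prior-following policy does. Entropy regularisation would instead pull the policy towards the uniform distribution, whatever behaviour is desired.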
An application of machine learning to statistical physics: from the phases of quantum control to satisfiability problems
This dissertation presents a study of machine learning methods with a focus on applications to statistical and condensed matter physics, in particular the problems of quantum state preparation, spin glasses and constraint satisfiability. We will start by introducing the core principles of machine learning such as overfitting, bias-variance tradeoff and the disciplines of supervised, unsupervised and reinforcement learning. This discussion will be set in the context of recent applications of machine learning to statistical physics and condensed matter physics. We then present the problem of quantum state preparation and show how reinforcement learning along with stochastic optimization methods can be applied to identify and define phases of quantum control. Reminiscent of condensed matter physics, the underlying phases of quantum control are identified via a set of order parameters and further detailed in terms of their universal implications for optimal quantum control. In particular, casting the optimal quantum control problem as an optimization problem, we show that it exhibits a generic glassy phase and establish a connection with the fields of spin-glass physics and constraint satisfiability problems. We then demonstrate how unsupervised learning methods can be used to obtain important information about the complexity of the phases described. We end by presenting a novel clustering framework, termed HAL for hierarchical agglomerative learning, which exploits out-of-sample accuracy estimates of machine learning classifiers to perform robust clustering of high-dimensional data. We show applications of HAL to various clustering problems.
What to Do When Your Discrete Optimization Is the Size of a Neural Network?
Oftentimes, machine learning applications using neural networks involve
solving discrete optimization problems, such as in pruning,
parameter-isolation-based continual learning and training of binary networks.
Still, these discrete problems are combinatorial in nature and are also not
amenable to gradient-based optimization. Additionally, classical approaches
used in discrete settings do not scale well to large neural networks, forcing
scientists and empiricists to rely on alternative methods. Among these, two
main distinct sources of top-down information can be used to lead the model to
good solutions: (1) extrapolating gradient information from points outside of
the solution set, and (2) comparing evaluations between members of a subset of the
valid solutions. We take continuation path (CP) methods to represent using
purely the former and Monte Carlo (MC) methods to represent the latter, while
also noting that some hybrid methods combine the two. The main goal of this
work is to compare both approaches. For that purpose, we first overview the two
classes while also discussing some of their drawbacks analytically. Then, in
the experimental section, we compare their performance, starting with smaller
microworld experiments, which allow more fine-grained control of problem
variables, and gradually moving towards larger problems, including neural
network regression and neural network pruning for image classification, where
we additionally compare against magnitude-based pruning.
Comment: Submitted to JML
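The two sources of information contrasted in this abstract can be illustrated on a toy binary-mask problem. The sketch below is not the paper's experimental setup. A sigmoid relaxation that is gradually sharpened stands in for a continuation path method, and a seeded random bit-flip search stands in for a Monte Carlo method. The loss, the target mask, and all hyperparameters are assumptions chosen so that the optimum is known.

```python
import math
import random


def loss(mask, target):
    """Toy objective: squared distance between a mask and a known optimum."""
    return sum((m - t) ** 2 for m, t in zip(mask, target))


def continuation_path(target, steps=200, lr=0.5):
    """Gradient descent on a relaxed mask sigmoid(beta * s), with beta annealed upward."""
    scores = [0.0] * len(target)
    for step in range(steps):
        beta = 1.0 + 9.0 * step / (steps - 1)  # sharpen the relaxation from 1 to 10
        for i, t in enumerate(target):
            soft = 1.0 / (1.0 + math.exp(-beta * scores[i]))
            # Gradient of (soft - t)^2 with respect to scores[i]; soft lies outside {0, 1}.
            grad = 2.0 * (soft - t) * beta * soft * (1.0 - soft)
            scores[i] -= lr * grad
    return [1 if s > 0 else 0 for s in scores]


def monte_carlo_search(target, steps=200, seed=0):
    """Random single-bit flips, keeping a candidate only if it evaluates better."""
    rng = random.Random(seed)
    mask = [0] * len(target)
    best = loss(mask, target)
    for _ in range(steps):
        candidate = list(mask)
        i = rng.randrange(len(mask))
        candidate[i] = 1 - candidate[i]
        candidate_loss = loss(candidate, target)  # evaluated on a valid binary mask
        if candidate_loss < best:
            mask, best = candidate, candidate_loss
    return mask


target = [1, 0, 1, 1, 0, 1]
cp_mask = continuation_path(target)
mc_mask = monte_carlo_search(target)
print(cp_mask, mc_mask)
```

The first routine only ever evaluates gradients at fractional masks, which are not valid solutions, while the second only compares loss values of valid binary masks. On this separable toy problem both find the optimum, so the sketch shows the mechanics and says nothing about which approach scales better.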