78 research outputs found

    Verification of unsupervised neural networks

    Get PDF
    Neural networks are at the forefront of machine learning being responsible for achievements such as AlphaGo. As they are being deployed in more and more environments - even in safety-critical ones such as health care - we are naturally interested in assuring their reliability. However, the discovery of so-called adver- sarial attacks for supervised neural networks demonstrated that tiny distortions in the input space can lead to misclassifications and thus, to potentially catas- trophic errors: Patients could be diagnosed wrongly, or a car might confuse stop signs and traffic lights. Thus, ideally, we would like to guarantee that these types of attacks cannot occur. In this thesis we extend the research on reliable neural networks to the realm of unsupervised learning. This includes defining proper notions of reliability, as well as analyzing and adapting unsupervised neural networks with respect to this notion. Our definitions of reliability depend on the underlying neural networks and the problems they are meant to solve. However, in all our cases, we aim for guarantees on a continuous input space containing infinitely many points. Therefore we extend the traditional setting of testing against a finite dataset such that we require specialized tools to actually check a given network for reliability. We will demonstrate how we can leverage neural network verification for these purposes. Using neural network verification, however, entails a major challenge: It does not scale up to large networks. To overcome this limitation, we design a novel training procedure yielding networks that are both more reliable according to our definition as well as more amenable for neural network verification. By exploiting the piecewise affine structure of our networks, we can locally simplify them and thus decrease verification runtime significantly. We also take a per- spective that complements a neural network’s training by exploring how we can repair non-reliable neural network ensembles. With this thesis, we paradigmatically show the necessity and the complications of unsupervised neural network verification. It aims to pave the way for more research to come and towards a safe usage of these simple-to-build yet difficult-to-understand models given by unsupervised neural networks

    A comprehensive review of 3D convolutional neural network-based classification techniques of diseased and defective crops using non-UAV-based hyperspectral images

    Get PDF
    Hyperspectral imaging (HSI) is a non-destructive and contactless technology that provides valuable information about the structure and composition of an object. It has the ability to capture detailed information about the chemical and physical properties of agricultural crops. Due to its wide spectral range, compared with multispectral-or RGB-based imaging methods, HSI can be a more effective tool for monitoring crop health and productivity. With the advent of this imaging tool in agrotechnology, researchers can more accurately address issues related to the detection of diseased and defective crops in the agriculture industry. This allows to implement the most suitable and accurate farming solutions, such as irrigation and fertilization, before crops enter a damaged and difficult-to-recover phase of growth in the field. While HSI provides valuable insights into the object under investigation, the limited number of HSI datasets for crop evaluation presently poses a bottleneck. Dealing with the curse of dimensionality presents another challenge due to the abundance of spectral and spatial information in each hyperspectral cube. State-of-the-art methods based on 1D and 2D convolutional neural networks (CNNs) struggle to efficiently extract spectral and spatial information. On the other hand, 3D-CNN-based models have shown significant promise in achieving better classification and detection results by leveraging spectral and spatial features simultaneously. Despite the apparent benefits of 3D-CNN-based models, their usage for classification purposes in this area of research has remained limited. This paper seeks to address this gap by reviewing 3D-CNN-based architectures and the typical deep learning pipeline, including preprocessing and visualization of results, for the classification of hyperspectral images of diseased and defective crops. Furthermore, we discuss open research areas and challenges when utilizing 3D-CNNs with HSI data."This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors."https://www.sciencedirect.com/science/article/pii/S277237552300145

    Quantum machine learning: a classical perspective

    Get PDF
    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets are motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed-up classical machine learning algorithms. Here we review the literature in quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, like how to upload classical data into quantum form, will also be addressed.Comment: v3 33 pages; typos corrected and references adde

    Classical Planning in Deep Latent Space

    Full text link
    Current domain-independent, classical planners require symbolic models of the problem domain and instance as input, resulting in a knowledge acquisition bottleneck. Meanwhile, although deep learning has achieved significant success in many fields, the knowledge is encoded in a subsymbolic representation which is incompatible with symbolic systems such as planners. We propose Latplan, an unsupervised architecture combining deep learning and classical planning. Given only an unlabeled set of image pairs showing a subset of transitions allowed in the environment (training inputs), Latplan learns a complete propositional PDDL action model of the environment. Later, when a pair of images representing the initial and the goal states (planning inputs) is given, Latplan finds a plan to the goal state in a symbolic latent space and returns a visualized plan execution. We evaluate Latplan using image-based versions of 6 planning domains: 8-puzzle, 15-Puzzle, Blocksworld, Sokoban and Two variations of LightsOut.Comment: Under review at Journal of Artificial Intelligence Research (JAIR

    On the edges of clustering

    Get PDF

    Is Evolution an Algorithm? Effects of local entropy in unsupervised learning and protein evolution

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Affinity-Based Reinforcement Learning : A New Paradigm for Agent Interpretability

    Get PDF
    The steady increase in complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states; preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affinity-based RL paradigm in which agents learn strategies that are partially decoupled from reward functions. Unlike entropy regularisation, we regularise the objective function with a distinct action distribution that represents a desired behaviour; we encourage the agent to act according to a prior while learning to maximise rewards. The result is an inherently interpretable agent that solves problems with an intrinsic affinity for certain actions. We demonstrate the utility of our method in a financial application: we learn continuous time-variant compositions of prototypical policies, each interpretable by its action affinities, that are globally interpretable according to customers’ financial personalities. Our method combines advantages from both constrained RL and preferencebased RL: it retains the reward function but generalises the policy to match a defined behaviour, thus avoiding problems such as reward shaping and hacking. Unlike Boolean task composition, our method is a fuzzy superposition of different prototypical strategies to arrive at a more complex, yet interpretable, strategy.publishedVersio

    An application of machine learning to statistical physics: from the phases of quantum control to satisfiability problems

    Full text link
    This dissertation presents a study of machine learning methods with a focus on applications to statistical and condensed matter physics, in particular the problem of quantum state preparation, spin-glass and constraint satisfiability. We will start by introducing the core principles of machine learning such as overfitting, bias-variance tradeoff and the disciplines of supervised, unsupervised and reinforcement learning. This discussion will be set in the context of recent applications of machine learning to statistical physics and condensed matter physics. We then present the problem of quantum state preparation and show how reinforcement learning along with stochastic optimization methods can be applied to identify and define phases of quantum control. Reminiscent of condensed matter physics, the underlying phases of quantum control are identified via a set of order parameters and further detailed in terms of their universal implications for optimal quantum control. In particular, casting the optimal quantum control problem as an optimization problem, we show that it exhibits a generic glassy phase and establish a connection with the fields of spin-glass physics and constraint satisfiability problems. We then demonstrate how unsupervised learning methods can be used to obtain important information about the complexity of the phases described. We end by presenting a novel clustering framework, termed HAL for hierarchical agglomerative learning, which exploits out-of-sample accuracy estimates of machine learning classifiers to perform robust clustering of high-dimensional data. We show applications of HAL to various clustering problems

    What to Do When Your Discrete Optimization Is the Size of a Neural Network?

    Full text link
    Oftentimes, machine learning applications using neural networks involve solving discrete optimization problems, such as in pruning, parameter-isolation-based continual learning and training of binary networks. Still, these discrete problems are combinatorial in nature and are also not amenable to gradient-based optimization. Additionally, classical approaches used in discrete settings do not scale well to large neural networks, forcing scientists and empiricists to rely on alternative methods. Among these, two main distinct sources of top-down information can be used to lead the model to good solutions: (1) extrapolating gradient information from points outside of the solution set (2) comparing evaluations between members of a subset of the valid solutions. We take continuation path (CP) methods to represent using purely the former and Monte Carlo (MC) methods to represent the latter, while also noting that some hybrid methods combine the two. The main goal of this work is to compare both approaches. For that purpose, we first overview the two classes while also discussing some of their drawbacks analytically. Then, on the experimental section, we compare their performance, starting with smaller microworld experiments, which allow more fine-grained control of problem variables, and gradually moving towards larger problems, including neural network regression and neural network pruning for image classification, where we additionally compare against magnitude-based pruning.Comment: Submitted to JML
    • …
    corecore