15 research outputs found

    Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks

    Full text link
    Understanding the implicit regularization imposed by neural network architectures and gradient-based optimization methods is a key challenge in deep learning and AI. In this work we provide sharp results for the implicit regularization imposed by the gradient flow of Diagonal Linear Networks (DLNs) in the over-parameterized regression setting and, potentially surprisingly, link this to the phenomenon of phase transitions in generalized hardness of approximation (GHA). GHA generalizes the phenomenon of hardness of approximation from computer science to, among others, continuous and robust optimization. It is well known that the ℓ^1-norm of the gradient flow of DLNs with tiny initialization converges to the objective function of basis pursuit. We improve upon these results by showing that the gradient flow of DLNs with tiny initialization approximates minimizers of the basis pursuit optimization problem (as opposed to just the objective function), and we obtain new and sharp convergence bounds w.r.t. the initialization size. Non-sharpness of our results would imply that the GHA phenomenon would not occur for the basis pursuit optimization problem -- which is a contradiction -- thus implying sharpness. Moreover, we characterize which ℓ^1 minimizer of the basis pursuit problem is chosen by the gradient flow whenever the minimizer is not unique. Interestingly, this depends on the depth of the DLN.
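
    The ℓ^1 bias described in this abstract can be reproduced in a few lines. The following is a hypothetical toy sketch (not the paper's experiments): gradient descent on a depth-2 DLN, x = u⊙u − v⊙v, with tiny initialization, compared against the minimum-ℓ2-norm interpolant. Problem sizes, seed, step size and iteration count are illustrative assumptions.

```python
import numpy as np

# Toy sketch: depth-2 diagonal linear network x = u*u - v*v trained by
# gradient descent with tiny initialization on an over-parameterized
# regression problem; the iterate is expected to approximate the basis
# pursuit (min ||x||_1 s.t. Ax = b) solution rather than the min-l2 one.
rng = np.random.default_rng(0)
m, n = 30, 60
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[3, 17, 42]] = [1.0, -1.0, 1.0]        # 3-sparse ground truth
b = A @ x_true

alpha = 1e-2                                   # tiny initialization scale
u = alpha * np.ones(n)
v = alpha * np.ones(n)
lr = 0.02
for _ in range(50_000):
    x = u * u - v * v
    r = A @ x - b                              # residual of 0.5*||Ax-b||^2
    g = A.T @ r                                # gradient w.r.t. x
    u -= lr * 2 * u * g                        # chain rule through  u*u
    v += lr * 2 * v * g                        # chain rule through -v*v

x_hat = u * u - v * v
x_ls = np.linalg.pinv(A) @ b                   # minimum-l2-norm interpolant
rel_res = np.linalg.norm(A @ x_hat - b) / np.linalg.norm(b)
print(rel_res, np.abs(x_hat).sum(), np.abs(x_ls).sum())
```

    In runs of this kind the DLN iterate fits the data while keeping a markedly smaller ℓ^1-norm than the least-squares interpolant; deeper parameterizations (powers higher than 2) change which ℓ^1 minimizer is selected when it is non-unique, as the abstract notes.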

    The troublesome kernel: why deep learning for inverse problems is typically unstable

    Full text link
    There is overwhelming empirical evidence that Deep Learning (DL) leads to unstable methods in applications ranging from image classification and computer vision to voice recognition and automated diagnosis in medicine. Recently, a similar instability phenomenon has been discovered when DL is used to solve certain problems in computational science, namely, inverse problems in imaging. In this paper we present a comprehensive mathematical analysis explaining the many facets of the instability phenomenon in DL for inverse problems. Our main results not only explain why this phenomenon occurs, they also shed light on why finding a cure for instabilities is so difficult in practice. Additionally, these theorems show that instabilities are typically not rare events - rather, they can occur even when the measurements are subject to completely random noise - and consequently how easy it can be to destabilise certain trained neural networks. We also examine the delicate balance between reconstruction performance and stability, and in particular, how DL methods may outperform state-of-the-art sparse regularization methods, but at the cost of instability. Finally, we demonstrate a counterintuitive phenomenon: training a neural network may generically not yield an optimal reconstruction method for an inverse problem.

    On instabilities of deep learning in image reconstruction - Does AI come at a cost?

    Full text link
    Deep learning, due to its unprecedented success in tasks such as image classification, has emerged as a new tool in image reconstruction with potential to change the field. In this paper we demonstrate a crucial phenomenon: deep learning typically yields unstable methods for image reconstruction. The instabilities usually occur in several forms: (1) tiny, almost undetectable perturbations, both in the image and sampling domain, may result in severe artefacts in the reconstruction, (2) a small structural change, for example a tumour, may not be captured in the reconstructed image and (3) (a counterintuitive type of instability) more samples may yield poorer performance. Our new stability test, with accompanying algorithms and easy-to-use software, detects the instability phenomena. The test is aimed at researchers to test their networks for instabilities and at government agencies, such as the Food and Drug Administration (FDA), to secure safe use of deep learning methods.
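
    A stability test of the kind described above amounts to searching for a small measurement perturbation that maximizes the change in the reconstruction. A minimal sketch of that idea follows, with an arbitrary random two-layer map standing in for a trained reconstruction network; the map, sizes, budget and step size are all illustrative assumptions, not the paper's software.

```python
import numpy as np

# Projected gradient ascent on 0.5*||f(y+e) - f(y)||^2 over perturbations
# e constrained to a small ball: a sketch of a worst-case stability test.
rng = np.random.default_rng(1)
m, n = 32, 64
W1 = rng.standard_normal((128, m))
W2 = rng.standard_normal((n, 128))

def f(y):
    # Stand-in "reconstruction network": a random two-layer tanh map.
    return W2 @ np.tanh(W1 @ y)

def jacobian_T_times(y, d):
    # Computes J_f(y)^T d for f(y) = W2 tanh(W1 y).
    s = 1.0 - np.tanh(W1 @ y) ** 2
    return W1.T @ (s * (W2.T @ d))

y = rng.standard_normal(m)
delta = 0.05 * np.linalg.norm(y)               # perturbation budget
e = 1e-6 * rng.standard_normal(m)
step = 1e-4                                     # illustrative step size
for _ in range(500):
    d = f(y + e) - f(y)
    e += step * jacobian_T_times(y + e, d)     # ascend the objective
    norm_e = np.linalg.norm(e)
    if norm_e > delta:                         # project back onto the ball
        e *= delta / norm_e

worst = np.linalg.norm(f(y + e) - f(y))
print(np.linalg.norm(e), worst)
```

    The ratio of `worst` to the perturbation norm gives a lower bound on the local worst-case amplification of the map at `y`.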

    On Assessing Trustworthy AI in Healthcare. Machine Learning as a Supportive Tool to Recognize Cardiac Arrest in Emergency Calls

    Get PDF
    Artificial Intelligence (AI) has the potential to greatly improve the delivery of healthcare and other services that advance population health and wellbeing. However, the use of AI in healthcare also brings potential risks that may cause unintended harm. To guide future developments in AI, the High-Level Expert Group on AI set up by the European Commission (EC) recently published ethics guidelines for what it terms “trustworthy” AI. These guidelines are aimed at a variety of stakeholders, especially guiding practitioners toward more ethical and more robust applications of AI. In line with efforts of the EC, AI ethics scholarship focuses increasingly on converting abstract principles into actionable recommendations. However, the interpretation, relevance, and implementation of trustworthy AI depend on the domain and the context in which the AI system is used. The main contribution of this paper is to demonstrate how to use the general AI HLEG trustworthy AI guidelines in practice in the healthcare domain. To this end, we present a best practice of assessing the use of machine learning as a supportive tool to recognize cardiac arrest in emergency calls. The AI system under assessment is currently in use in the city of Copenhagen in Denmark. The assessment is accomplished by an independent team composed of philosophers, policy makers, social scientists, technical, legal, and medical experts. By leveraging an interdisciplinary team, we aim to expose the complex trade-offs and the necessity for such thorough human review when tackling socio-technical applications of AI in healthcare. For the assessment, we use a process to assess trustworthy AI, called Z-Inspection®, to identify specific challenges and potential ethical trade-offs when we consider AI in practice.

    Stability and accuracy in compressive sensing and deep learning

    No full text
    There are currently two paradigm shifts happening in society and scientific computing: (1) Artificial Intelligence (AI) is replacing humans in problem solving, and, (2) AI is replacing the standard algorithms in computational science and engineering. Since reliable numerical calculations are paramount, algorithms for computational science are traditionally based on two pillars: accuracy and stability. Notably, this is true for image reconstruction, which is a mainstay of computational science, providing fundamental tools in medical, scientific and industrial imaging. In this thesis, we demonstrate that the stability pillar is typically absent in current deep learning and AI-based algorithms for image reconstruction, and we explain why this phenomenon occurs for AI-based methods applied both to image reconstruction and to classification in general. This raises two fundamental questions: how reliable are such algorithms when applied in society, and do AI-based algorithms have the unavoidable Achilles heel of instability? We investigate these phenomena, and we introduce a framework designed to demonstrate, investigate and ultimately answer these fundamental questions.

    Coherence estimates between Hadamard matrices and Daubechies wavelets

    No full text
    Traditionally, compressive sensing theory has focused on the three principles of sparsity, incoherence and uniform random subsampling. Research in recent years has shown that these principles yield insufficient results in many practical setups. This has led to the development of the principles of asymptotic sparsity, asymptotic incoherence and multilevel random subsampling. As a result of these principles, the current theory is limited to unitary sampling and sparsifying operators. For large-scale reconstruction, the theory is further restricted to operators whose product can be computed in O(N log N) operations, due to the memory constraints of computers. Accordingly, this has increased the popularity of the Fourier and Hadamard sampling operators for applications where these operators can model the underlying sampling structure. As the sparsifying operator, the wavelet transform has proven to yield satisfactory results in most setups. Since all of these operators need to be unitary, this has restricted us to only consider Daubechies compactly supported orthonormal wavelets. By using wavelets as the sparsifying transform, it has been proven that a Fourier sampling basis will be asymptotically incoherent to a unitary wavelet basis. The same result can easily be computed numerically between a Hadamard sampling basis and a Daubechies wavelet basis; however, a theoretical result establishing this fact has been lacking. The purpose of this text is to provide such a theoretical result.
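
    The coherence in question is easy to examine numerically. The following sketch (assuming the Sylvester-ordered Hadamard matrix and the Haar, i.e. Daubechies-1, wavelet basis) computes the cross-Gram matrix of the two bases. Since both bases contain the constant vector, the global coherence is maximal, which is precisely why the asymptotic/local notion of incoherence discussed above is the relevant one.

```python
import numpy as np

# Global coherence mu(U, W) = sqrt(N) * max_ij |<u_i, w_j>| between the
# Hadamard basis and the Haar wavelet basis on R^N, N a power of two.

def hadamard(N):
    # Sylvester construction, rows normalized to be orthonormal.
    H = np.array([[1.0]])
    while H.shape[0] < N:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(N)

def haar(N):
    # Recursive Haar basis: coarse part is the upsampled smaller basis,
    # the remaining rows are the finest-scale wavelets.
    if N == 1:
        return np.array([[1.0]])
    Hh = haar(N // 2)
    top = np.kron(Hh, [1.0, 1.0])
    bot = np.kron(np.eye(N // 2), [1.0, -1.0])
    W = np.vstack([top, bot])
    return W / np.linalg.norm(W, axis=1, keepdims=True)

N = 16
U, W = hadamard(N), haar(N)
G = U @ W.T                                    # cross-Gram matrix
mu = np.sqrt(N) * np.abs(G).max()
print(mu)                                      # maximal, since both bases
                                               # contain the constant vector
```

    Note that the constant Hadamard row is orthogonal to every genuine (zero-mean) wavelet, so the large inner products concentrate in the coarse corner of the cross-Gram matrix, in line with the asymptotic incoherence picture.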

    On the Unification of Schemes and Software for Wavelets on the Interval

    No full text
    We revisit the construction of wavelets on the interval with various degrees of polynomial exactness, and explain how existing schemes for orthogonal and spline wavelets can be extended to compactly supported delay-normalized wavelets. The contribution differs substantially from previous ones in how results are stated and deduced: linear algebra notation is exploited more heavily, and the use of sums and complicated index notation is reduced. This extended use of linear algebra eases translation to software, and a general open source implementation, which uses the deductions in this paper as a reference, has been developed. Key features of this implementation are its flexibility w.r.t. the length of the input, as well as its generality regarding the wavelet transform.

    Uniform recovery in infinite-dimensional compressed sensing and applications to structured binary sampling

    No full text
    Infinite-dimensional compressed sensing deals with the recovery of analog signals (functions) from linear measurements, often in the form of integral transforms such as the Fourier transform. This framework is well-suited to many real-world inverse problems, which are typically modeled in infinite-dimensional spaces, and where the application of finite-dimensional approaches can lead to noticeable artefacts. Another typical feature of such problems is that the signals are not only sparse in some dictionary, but possess a so-called local sparsity in levels structure. Consequently, the sampling scheme should be designed so as to exploit this additional structure. In this paper, we introduce a series of uniform recovery guarantees for infinite-dimensional compressed sensing based on sparsity in levels and so-called multilevel random subsampling. By using a weighted ℓ^1-regularizer we derive measurement conditions that are sharp up to log factors, in the sense that they agree with the best known measurement conditions for oracle estimators in which the support is known a priori. These guarantees also apply in finite dimensions, and improve existing results for unweighted ℓ^1-regularization. To illustrate our results, we consider the problem of binary sampling with the Walsh transform using orthogonal wavelets. Binary sampling is an important mechanism for certain imaging modalities. Through carefully estimating the local coherence between the Walsh and wavelet bases, we derive the first known recovery guarantees for this problem.
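
    Multilevel random subsampling of the kind used above can be sketched as follows. The dyadic level boundaries and per-level sampling fractions below are illustrative assumptions, not the paper's parameters: frequencies are partitioned into levels [2^(k-1), 2^k), sampled densely at coarse levels and sparsely at fine ones.

```python
import numpy as np

# Sketch of a multilevel random subsampling mask over N frequencies.
def multilevel_mask(N, fractions, rng):
    assert N & (N - 1) == 0                    # N must be a power of two
    idx = [0]                                  # level 0: the DC index
    lo = 1
    for k, frac in enumerate(fractions, start=1):
        hi = min(2 ** k, N)
        size = hi - lo
        m_k = max(1, int(round(frac * size)))  # samples drawn in level k
        idx.extend(rng.choice(np.arange(lo, hi), size=m_k, replace=False))
        lo = hi
    return np.sort(np.array(idx))

rng = np.random.default_rng(2)
N = 256                                         # 8 dyadic levels above DC
fractions = [1, 1, 1, 0.8, 0.6, 0.4, 0.2, 0.1]  # decay toward fine levels
mask = multilevel_mask(N, fractions, rng)
print(len(mask), mask[:8])
```

    With fractions equal to 1 at the coarse levels, the lowest frequencies are always fully sampled, mirroring the structure that the sparsity-in-levels guarantees exploit.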

    The difficulty of computing stable and accurate neural networks: On the barriers of deep learning and Smale's 18th problem

    No full text
    Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, current DL methods typically suffer from instability, even when universal approximation properties guarantee the existence of stable neural networks (NNs). We address this paradox by demonstrating basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities, however, there does not exist any algorithm, even randomised, that can train (or compute) such a NN. For any positive integers K > 2 and L, there are cases where simultaneously: (a) no randomised training algorithm can compute a NN correct to K digits with probability greater than 1/2, (b) there exists a deterministic training algorithm that computes a NN with K-1 correct digits, but any such (even randomised) algorithm needs arbitrarily many training data, (c) there exists a deterministic training algorithm that computes a NN with K-2 correct digits using no more than L training samples. These results imply a classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by establishing sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce Fast Iterative REstarted NETworks (FIRENETs), which we both prove and numerically verify are stable. Moreover, we prove that only O(|log(ε)|) layers are needed for an ε-accurate solution to the inverse problem.