Model-Based Deep Learning

Authors: Dimakis Alexandros G., Eldar Yonina C., Shlezinger Nir, Whang Jay
Publication date: 27/06/2021

Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques. Such model-based methods utilize mathematical formulations that represent the underlying physics, prior information and additional domain knowledge. Simple classical models are useful but sensitive to inaccuracies and may lead to poor performance when real systems display complex or dynamic behavior. On the other hand, purely data-driven approaches that are model-agnostic are becoming increasingly popular as datasets become abundant and the power of modern deep learning pipelines increases. Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance, especially for supervised problems. However, DNNs typically require massive amounts of data and immense computational resources, limiting their applicability for some signal processing scenarios. We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches. Such model-based deep learning methods exploit both partial domain knowledge, via mathematical structures designed for specific problems, and learning from limited data. In this article we survey the leading approaches for studying and designing model-based deep learning systems. We divide hybrid model-based/data-driven systems into categories based on their inference mechanism. We provide a comprehensive review of the leading approaches for combining model-based algorithms with deep learning in a systematic manner, along with concrete guidelines and detailed signal-processing-oriented examples from recent literature. Our aim is to facilitate the design and study of future systems at the intersection of signal processing and machine learning that incorporate the advantages of both domains.

arXiv.org e-Print Archive

Learning-based Optimization for Signal and Image Processing

Author: Liu Jialin
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Incorporating machine learning techniques into optimization problems and solvers is attracting increasing attention. Given a particular type of optimization problem that needs to be solved repeatedly, machine learning techniques can find some features for this category of optimization and develop algorithms with excellent performance. This thesis deals with algorithms and convergence analysis in learning-based optimization in three aspects: learning dictionaries, learning optimization solvers and learning regularizers. Learning dictionaries for sparse coding is significant for signal processing. Convolutional sparse coding is a form of sparse coding with a structured, translation-invariant dictionary. Most convolutional dictionary learning algorithms to date operate in batch mode, requiring simultaneous access to all training images during the learning process, which results in very high memory usage and severely limits the training data size that can be used. I proposed two online convolutional dictionary learning algorithms that offered far better scaling of memory and computational cost than batch methods and provided a rigorous theoretical analysis of these methods. Learning fast solvers for optimization is a rising research topic. In recent years, unfolding iterative algorithms as neural networks has become an empirical success in solving sparse recovery problems. However, its theoretical understanding is still immature, which prevents us from fully utilizing the power of neural networks. I studied unfolded ISTA (Iterative Shrinkage Thresholding Algorithm) for sparse signal recovery and established its convergence. Based on the properties of parameters required by convergence, the model can be significantly simplified and, consequently, has much lower training cost and better recovery performance. Learning regularizers or priors improves the performance of optimization solvers, especially for signal and image processing tasks. Plug-and-play (PnP) is a non-convex framework that integrates modern priors, such as BM3D or deep learning-based denoisers, into ADMM or other proximal algorithms. Although PnP has recently been studied extensively with great empirical success, theoretical analysis addressing even the most basic question of convergence has been insufficient. In this thesis, the theoretical convergence of PnP-FBS and PnP-ADMM was established, without using diminishing stepsizes, under a certain Lipschitz condition on the denoisers. Furthermore, real spectral normalization was proposed for training deep learning-based denoisers to satisfy the proposed Lipschitz condition.
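
For readers unfamiliar with ISTA, the following is a minimal sketch of the classical, non-learned algorithm for the problem min_x 0.5*||Ax - b||^2 + lam*||x||_1. It is not the thesis's unfolded model. In unfolded variants, a fixed number of these iterations is treated as network layers whose parameters are trained. The function names and the pure-Python matrix representation (lists of rows) are assumptions made for illustration.

```python
def soft_threshold(v, tau):
    """Elementwise proximal operator of tau * ||.||_1."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def ista(A, b, lam, step, n_iter=100):
    """Classical ISTA: gradient step on the quadratic term, then shrinkage.

    `step` should not exceed 1 / L, where L is the largest eigenvalue
    of A^T A (an assumption the caller must satisfy).
    """
    x = [0.0] * len(A[0])
    At = [list(col) for col in zip(*A)]  # transpose of A
    for _ in range(n_iter):
        residual = [ri - bi for ri, bi in zip(matvec(A, x), b)]
        grad = matvec(At, residual)  # gradient of 0.5 * ||Ax - b||^2
        x = soft_threshold(
            [xi - step * gi for xi, gi in zip(x, grad)], lam * step
        )
    return x
```

With A equal to the identity and step = 1, the iteration reduces to soft-thresholding b by lam, which gives a quick sanity check.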

eScholarship - University of California

Bayesian methods for inverse problems with point clouds : applications to single-photon lidar

Author: Tachella Julian Andres
Publication venue: Engineering and Physical Sciences
Publication date: 01/11/2019

Single-photon light detection and ranging (lidar) has emerged as a prime candidate technology for depth imaging through challenging environments. This modality relies on constructing, for each pixel, a histogram of time delays between emitted light pulses and detected photon arrivals. The problem of estimating the number of imaged surfaces, their reflectivity and position becomes very challenging in the low-photon regime (which equates to short acquisition times) or relatively high background levels (i.e., strong ambient illumination). In a general setting, a variable number of surfaces can be observed per imaged pixel. The majority of existing methods assume exactly one surface per pixel, simplifying the reconstruction problem so that standard image processing techniques can be easily applied. However, this assumption hinders practical three-dimensional (3D) imaging applications, being restricted to controlled indoor scenarios. Moreover, other existing methods that relax this assumption achieve worse reconstructions, suffering from long execution times and large memory requirements. This thesis presents novel approaches to 3D reconstruction from single-photon lidar data, which are capable of identifying multiple surfaces in each pixel. The resulting algorithms obtain new state-of-the-art reconstructions without strong assumptions about the sensed scene. The models proposed here differ from standard image processing tools, being designed to capture correlations of manifold-like structures. Until now, a major limitation has been the significant amount of time required for the analysis of the recorded data. By combining statistical models with highly scalable computational tools from the computer graphics community, we demonstrate 3D reconstruction of complex outdoor scenes with processing times of the order of 20 ms, where the lidar data was acquired in broad daylight from distances up to 320 m. This has enabled robust, real-time target reconstruction of complex moving scenes, paving the way for single-photon lidar at video rates for practical 3D imaging applications.
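
To make the per-pixel histogram concrete, here is a minimal sketch of the simple single-surface baseline that the abstract contrasts with: bin the photon time delays, take the fullest bin, and convert its round-trip time to depth. This is not the thesis's multi-surface method. The function names, bin width and delay values are assumptions chosen for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def build_histogram(delays_s, bin_width_s, n_bins):
    """Count photon detections per time-delay bin for one pixel."""
    counts = [0] * n_bins
    for t in delays_s:
        k = int(t // bin_width_s)
        if 0 <= k < n_bins:  # ignore delays outside the recorded window
            counts[k] += 1
    return counts


def peak_depth(counts, bin_width_s):
    """Single-surface estimate: depth of the bin with the most photons.

    The delay is a round-trip time, so depth = c * t / 2.
    """
    k = max(range(len(counts)), key=counts.__getitem__)
    t = (k + 0.5) * bin_width_s  # centre of the peak bin
    return SPEED_OF_LIGHT * t / 2.0


# Hypothetical pixel: three signal photons near 10.5 ns, two background photons.
delays = [10.2e-9, 10.4e-9, 10.7e-9, 3.3e-9, 25.1e-9]
hist = build_histogram(delays, bin_width_s=1e-9, n_bins=30)
depth_m = peak_depth(hist, bin_width_s=1e-9)
```

A peak-picking estimator like this breaks down when several surfaces fall in one pixel or when background counts dominate, which is the regime the abstract describes as challenging.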

ROS: The Research Output Service. Heriot-Watt University Edinburgh

Compressive learning: new models and applications

Author: Sheehan Michael Patrick
Publication venue: The University of Edinburgh
Publication date: 13/06/2022

Today’s world is fuelled by data. From self-driving cars through to agriculture, massive amounts of data are used to fit learning models to provide valuable insights and predictions. Such insights come at a significant price as many traditional learning procedures have both memory and computational costs that scale with the size of the data. This quickly becomes prohibitive, even when substantial resources are available. A new way of learning is therefore needed to allow for efficient model fitting in the 21st century. The birth of compressive learning in recent years has provided a novel solution to the bottleneck of learning from big data. Situated at the core of the compressive learning framework is the construction of a so-called sketch. The sketch is a compact representation of the data that provides sufficient information for specific learning tasks. In this thesis we extend the compressive learning framework to a host of new models and applications. In the first part of the thesis, we consider the group of semi-parametric models and demonstrate the unique advantages and challenges associated with creating a compressive learning paradigm for these particular models. Concentrating on the independent component analysis model, we develop a framework of algorithms and theory enabling orders of magnitude of compression with respect to memory complexity compared to existing methods. In the second part of the thesis, we develop a compressive learning framework for the emerging technology of single-photon counting lidar. We demonstrate that forming a sketch of the time-of-flight data circumvents the inherent data-transfer bottleneck of existing lidar techniques. Finally, we extend the compressive lidar technology by developing both an efficient sketch-based detection algorithm that can detect the presence of a surface solely from the sketch and a sketched plug-and-play framework that can integrate existing powerful denoisers that are robust to noisy lidar scenes with low photon counts.
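
As an illustration of what a sketch can look like, the following minimal example uses one common construction from the compressive learning literature: the average of random Fourier features over the dataset. Whether the thesis uses exactly this feature map is an assumption, and the function names and parameters are hypothetical. The point is that the sketch has a fixed size m regardless of the number of data points, and sketches of separate chunks can be merged without revisiting the data.

```python
import cmath
import random


def draw_frequencies(m, dim, scale, seed=0):
    """Draw m random frequency vectors with i.i.d. Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(m)]


def sketch(points, freqs):
    """Empirical mean of exp(i * <omega, x>) over all points, per frequency."""
    n = len(points)
    return [
        sum(cmath.exp(1j * sum(w * x for w, x in zip(omega, p))) for p in points) / n
        for omega in freqs
    ]


def merge(sketch_a, n_a, sketch_b, n_b):
    """Combine the sketches of two chunks into the sketch of their union."""
    total = n_a + n_b
    return [(n_a * a + n_b * b) / total for a, b in zip(sketch_a, sketch_b)]


# Hypothetical 2-D dataset, sketched in one pass and in two chunks.
data = [[0.1, 0.2], [1.0, -0.5], [0.3, 0.3], [-1.2, 0.8], [0.0, 0.0], [2.0, 1.0]]
freqs = draw_frequencies(m=4, dim=2, scale=1.0)
full = sketch(data, freqs)
merged = merge(sketch(data[:2], freqs), 2, sketch(data[2:], freqs), 4)
```

Because the sketch is a plain average, `merged` agrees with `full` up to floating-point error, which is what allows it to be accumulated in a streaming or distributed fashion.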

Edinburgh Research Archive

Accelerating and Privatizing Diffusion Models

Author: Dockhorn Tim
Publication venue: University of Waterloo
Publication date: 16/08/2023

Diffusion models (DMs) have emerged as a powerful class of generative models. DMs offer both state-of-the-art synthesis quality and sample diversity in combination with a robust and scalable learning objective. DMs rely on a diffusion process that gradually perturbs the data towards a normal distribution, while the neural network learns to denoise. Formally, the problem reduces to learning the score function, i.e., the gradient of the log-density of the perturbed data. The reverse of the diffusion process can be approximated by a differential equation, defined by the learned score function, and can therefore be used for generation when starting from random noise. In this thesis, we give a thorough and beginner-friendly introduction to DMs and discuss their history starting from early work on score-based generative models. Furthermore, we discuss connections to other statistical models and lay out applications of DMs, with a focus on image generative modeling. We then present CLD: a new DM based on critically-damped Langevin dynamics. CLD can be interpreted as running a joint diffusion in an extended space, where the auxiliary variables can be considered "velocities" that are coupled to the data variables as in Hamiltonian dynamics. We derive a novel score matching objective for CLD-based DMs and introduce a fast solver for the reverse diffusion process which is inspired by methods from the statistical mechanics literature. The CLD framework provides new insights into DMs and generalizes many existing DMs which are based on overdamped Langevin dynamics. Next, we present GENIE, a novel higher-order numerical solver for DMs. Many existing higher-order solvers for DMs build on finite difference schemes which break down in the large step size limit as approximations become too crude. GENIE, on the other hand, learns neural network-based models for higher-order derivatives whose precision does not depend on the step size. The additional networks in GENIE are implemented as small output heads on top of the neural backbone of the original DM, keeping the computational overhead minimal. Unlike recent sampling distillation methods that fundamentally alter the generation process in DMs, GENIE still solves the true generative differential equation, and therefore naturally enables applications such as encoding and guided sampling. The fourth chapter presents differentially private diffusion models (DPDMs), DMs trained with strict differential privacy guarantees. While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains. Generative models trained on sensitive data with differential privacy guarantees can sidestep this challenge, providing access to synthetic data instead. DPDMs enforce privacy by using differentially private stochastic gradient descent for training. We thoroughly study the design space of DPDMs and propose noise multiplicity, a simple yet powerful modification of the DM training objective tailored to the differential privacy setting. We motivate and show numerically why DMs are better suited for differentially private generative modeling than one-shot generators such as generative adversarial networks or normalizing flows. Finally, we propose to distill the knowledge of large pre-trained DMs into smaller student DMs. Large-scale DMs have achieved unprecedented results across several domains; however, they generally require a large amount of GPU memory and are slow at inference time, making it difficult to deploy them in real-time or on resource-limited devices. In particular, we propose an approximate score matching objective that regresses the student model towards predictions of the teacher DM rather than the clean data as is done in standard DM training. We show that student models outperform the larger teacher model for a variety of compute budgets. Additionally, the student models may also be deployed on GPUs with significantly less memory than was required for the original teacher model.
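
The score function mentioned in the abstract can be made concrete in a toy case where it has a closed form. The sketch below assumes one-dimensional Gaussian data perturbed by independent Gaussian noise, so the perturbed density is again Gaussian and its score is linear in x. It checks the closed form against a finite-difference derivative of the log-density. This is an illustrative setting only, not the thesis's model, and the function names are hypothetical.

```python
import math


def log_density(x, mean, var):
    """Log-density of a 1-D Gaussian N(mean, var)."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def perturbed_score(x, mean, data_var, noise_var):
    """Score (d/dx of the log-density) of Gaussian data plus Gaussian noise.

    Adding independent N(0, noise_var) noise to N(mean, data_var) data
    gives N(mean, data_var + noise_var), whose score is linear in x.
    """
    return -(x - mean) / (data_var + noise_var)


def numerical_score(x, mean, var, h=1e-5):
    """Central finite-difference approximation of the score."""
    return (log_density(x + h, mean, var) - log_density(x - h, mean, var)) / (2.0 * h)


closed_form = perturbed_score(1.3, mean=0.5, data_var=2.0, noise_var=0.25)
finite_diff = numerical_score(1.3, mean=0.5, var=2.25)
```

The score points back towards the mean, and its magnitude shrinks as the noise variance grows. For real data no closed form is available, which is why a neural network is trained to approximate it.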

University of Waterloo's Institutional Repository