Image denoising with multi-layer perceptrons, part 1: comparison with existing algorithms and with bounds
Image denoising can be described as the problem of mapping from a noisy image
to a noise-free image. The best currently available denoising methods
approximate this mapping with cleverly engineered algorithms. In this work we
attempt to learn this mapping directly with plain multi-layer perceptrons (MLPs)
applied to image patches. We will show that by training on large image
databases we are able to outperform the current state-of-the-art image
denoising methods. In addition, our method achieves results that are superior
to one type of theoretical bound and goes a large way toward closing the gap
with a second type of theoretical bound. Our approach is easily adapted to less
extensively studied types of noise, such as mixed Poisson-Gaussian noise, JPEG
artifacts, salt-and-pepper noise and noise resembling stripes, for which we
achieve excellent results as well. We will show that combining a block-matching
procedure with MLPs can further improve the results on certain images. In a
second paper, we detail the training trade-offs and the inner mechanisms of our
MLPs.
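The patch-based setup described in this abstract can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows and a patch denoiser passed in as a function; `denoise_patchwise` and `mean_filter_patch` are hypothetical names, and the mean filter is only a stand-in for a trained MLP, not the authors' model.

```python
def denoise_patchwise(image, patch_size, denoise_patch):
    """Apply denoise_patch to every overlapping patch and average the overlaps."""
    h, w = len(image), len(image[0])
    acc = [[0.0] * w for _ in range(h)]  # sum of patch predictions per pixel
    cnt = [[0] * w for _ in range(h)]    # number of patches covering each pixel
    for top in range(h - patch_size + 1):
        for left in range(w - patch_size + 1):
            # Flatten the patch row by row, as an MLP input vector would be.
            patch = [image[top + r][left + c]
                     for r in range(patch_size) for c in range(patch_size)]
            out = denoise_patch(patch)   # flattened patch in, flattened patch out
            for r in range(patch_size):
                for c in range(patch_size):
                    acc[top + r][left + c] += out[r * patch_size + c]
                    cnt[top + r][left + c] += 1
    return [[acc[y][x] / cnt[y][x] for x in range(w)] for y in range(h)]


def mean_filter_patch(patch):
    """Hypothetical stand-in for a trained MLP: replaces a patch by its mean."""
    m = sum(patch) / len(patch)
    return [m] * len(patch)
```

Averaging the overlapping predictions is one common way to reassemble an image from patch outputs; the sketch assumes the image is at least as large as one patch in both dimensions.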
DeepPCR: Parallelizing Sequential Operations in Neural Networks
Parallelization techniques have become ubiquitous for accelerating inference
and training of deep neural networks. Despite this, several operations are
still performed in a sequential manner. For instance, the forward and backward
passes are executed layer-by-layer, and the output of diffusion models is
produced by applying a sequence of denoising steps. This sequential approach
results in a computational cost proportional to the number of steps involved,
presenting a potential bottleneck as the number of steps increases. In this
work, we introduce DeepPCR, a novel algorithm which parallelizes typically
sequential operations in order to speed up inference and training of neural
networks. DeepPCR is based on interpreting a sequence of steps as the
solution of a specific system of equations, which we recover using the Parallel
Cyclic Reduction algorithm. This reduces the complexity of computing the
sequential operations from $\mathcal{O}(L)$ to $\mathcal{O}(\log_2 L)$ in the number of steps $L$, thus
yielding a speedup for large $L$. To verify the theoretical lower complexity of
the algorithm, and to identify regimes for speedup, we test the effectiveness
of DeepPCR in parallelizing the forward and backward pass in multi-layer
perceptrons, and observe speedups for both the forward and the backward
pass. We additionally showcase the flexibility of
DeepPCR by parallelizing training of ResNets with as many as 1024 layers, and
generation in diffusion models, which makes both training and generation
faster than the sequential
approach.
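To make the idea of replacing a step-by-step computation with a logarithmic number of parallel rounds concrete, here is a minimal sketch. It uses recursive doubling on a scalar affine recurrence z_i = a_i * z_(i-1) + b_i, a simplified relative of cyclic-reduction schemes; it is an illustration under that assumption and does not attempt to reproduce DeepPCR. The function names are hypothetical.

```python
def sequential_recurrence(a, b, z0):
    """Reference: evaluate z_i = a_i * z_(i-1) + b_i one step at a time."""
    z, out = z0, []
    for ai, bi in zip(a, b):
        z = ai * z + bi
        out.append(z)
    return out


def doubling_recurrence(a, b, z0):
    """Same result via recursive doubling; returns (values, number of rounds).

    Entry i holds an affine map z -> A[i] * z + B[i]. Each round composes
    entry i with entry i - shift, so the span it covers doubles. The updates
    inside one round are independent and could run in parallel.
    """
    A, B = list(a), list(b)
    n, shift, rounds = len(A), 1, 0
    while shift < n:
        new_A, new_B = list(A), list(B)
        for i in range(shift, n):
            new_A[i] = A[i] * A[i - shift]
            new_B[i] = A[i] * B[i - shift] + B[i]
        A, B = new_A, new_B
        shift *= 2
        rounds += 1
    # After the last round, entry i maps z0 directly to z_i.
    return [Ai * z0 + Bi for Ai, Bi in zip(A, B)], rounds
```

For a sequence of 8 steps the loop runs 3 rounds instead of 8 sequential updates. The total amount of arithmetic goes up, so the benefit depends on having enough parallel hardware.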
Non-local Neural Networks
Both convolutional and recurrent operations are building blocks that process
one local neighborhood at a time. In this paper, we present non-local
operations as a generic family of building blocks for capturing long-range
dependencies. Inspired by the classical non-local means method in computer
vision, our non-local operation computes the response at a position as a
weighted sum of the features at all positions. This building block can be
plugged into many computer vision architectures. On the task of video
classification, even without any bells and whistles, our non-local models can
compete with or outperform current competition winners on both Kinetics and Charades
datasets. In static image recognition, our non-local models improve object
detection/segmentation and pose estimation on the COCO suite of tasks. Code is
available at https://github.com/facebookresearch/video-nonlocal-net. Comment: CVPR 2018.
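The phrase "response at a position as a weighted sum of the features at all positions" can be sketched as follows. The weighting used here (a softmax over dot-product similarities) is one possible choice and is an assumption of this sketch; any learned projections a trained network would apply are omitted, and `non_local` is a hypothetical name.

```python
import math


def non_local(features):
    """features: list of equal-length vectors, one per position.

    Returns one output vector per position, each a weighted sum of the
    feature vectors at all positions.
    """
    outputs = []
    for xi in features:
        # Similarity of position i to every position j (dot product).
        scores = [sum(p * q for p, q in zip(xi, xj)) for xj in features]
        # Softmax normalisation, shifted by the maximum for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum over all positions, coordinate by coordinate.
        outputs.append([sum(wt * xj[d] for wt, xj in zip(weights, features))
                        for d in range(len(xi))])
    return outputs
```

Because the weights are non-negative and sum to one, every output coordinate lies between the smallest and largest input value of that coordinate.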
Fast and Interpretable Nonlocal Neural Networks for Image Denoising via Group-Sparse Convolutional Dictionary Learning
Nonlocal self-similarity within natural images has become an increasingly
popular prior in deep-learning models. Despite their successful image
restoration performance, such models remain largely uninterpretable due to
their black-box construction. Our previous studies have shown that
interpretable construction of a fully convolutional denoiser (CDLNet), with
performance on par with state-of-the-art black-box counterparts, is achievable
by unrolling a dictionary learning algorithm. In this manuscript, we seek an
interpretable construction of a convolutional network with a nonlocal
self-similarity prior that performs on par with black-box nonlocal models. We
show that such an architecture can be effectively achieved by upgrading the
sparsity prior of CDLNet to a weighted group-sparsity prior. From this
formulation, we propose a novel sliding-window nonlocal operation, enabled by
sparse array arithmetic. In addition to competitive performance with black-box
nonlocal DNNs, we demonstrate that the proposed sliding-window sparse attention
enables inference more than an order of magnitude faster than its
competitors. Comment: 11 pages, 8 figures, 6 tables
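As a rough illustration of what moving from a sparsity prior to a group-sparsity prior means, the sketch below implements group soft-thresholding, the standard proximal operator of the group L2 norm. Whether the paper uses exactly this operator is an assumption of this sketch; `group_soft_threshold` is a hypothetical name, and `tau` stands for the effective threshold (any per-group weight is assumed to be folded into it).

```python
import math


def group_soft_threshold(group, tau):
    """Shrink a whole group of coefficients toward zero by tau in L2 norm.

    Unlike element-wise soft-thresholding, the decision to keep or zero out
    is made jointly for all coefficients in the group.
    """
    norm = math.sqrt(sum(v * v for v in group))
    if norm <= tau:
        return [0.0] * len(group)   # the whole group is switched off
    scale = 1.0 - tau / norm        # common shrinkage factor for the group
    return [scale * v for v in group]
```

A group with norm 5 and threshold 1 keeps its direction and is scaled by 0.8, while the same group with threshold 5 or larger is set to zero as a whole.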
Generating tabular datasets under differential privacy
Machine Learning (ML) is accelerating progress across fields and industries,
but relies on accessible and high-quality training data. Some of the most
important datasets are found in biomedical and financial domains in the form of
spreadsheets and relational databases. But this tabular data is often sensitive
in nature. Synthetic data generation offers the potential to unlock sensitive
data, but generative models tend to memorise and regurgitate training data,
which undermines the privacy goal. To remedy this, researchers have
incorporated the mathematical framework of Differential Privacy (DP) into the
training process of deep neural networks. But this creates a trade-off between
the quality and privacy of the resulting data. Generative Adversarial Networks
(GANs) are the dominant paradigm for synthesising tabular data under DP, but
suffer from unstable adversarial training and mode collapse, which are
exacerbated by the privacy constraints and challenging tabular data modality.
This work optimises the quality-privacy trade-off of generative models,
producing higher quality tabular datasets with the same privacy guarantees. We
implement novel end-to-end models that leverage attention mechanisms to learn
reversible tabular representations. We also introduce TableDiffusion, the first
differentially-private diffusion model for tabular data synthesis. Our
experiments show that TableDiffusion produces higher-fidelity synthetic
datasets, avoids the mode collapse problem, and achieves state-of-the-art
performance on privatised tabular data synthesis. By implementing
TableDiffusion to predict the added noise, we enable it to bypass the
challenges of reconstructing mixed-type tabular data. Overall, the diffusion
paradigm proves vastly more data- and privacy-efficient than the adversarial
paradigm, due to augmented re-use of each data batch and a smoother iterative
training process.
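One widely used way to incorporate differential privacy into neural network training is to clip each example's gradient and add Gaussian noise before averaging, as in DP-SGD. The abstract does not say which mechanism TableDiffusion uses, so the sketch below is an assumption for illustration only; the function names and parameters (`max_norm`, `noise_multiplier`) are hypothetical.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Rescale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    The noise standard deviation is noise_multiplier * max_norm, so it is
    calibrated to the largest influence any single example can have.
    """
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    total = [sum(g[d] for g in clipped) for d in range(dim)]
    noisy = [t + rng.gauss(0.0, noise_multiplier * max_norm) for t in total]
    return [v / len(clipped) for v in noisy]
```

A call such as `private_mean_gradient(grads, 1.0, 1.1, random.Random(0))` would return one noisy averaged gradient. A larger `noise_multiplier` gives stronger privacy at the cost of a noisier update, which is the quality-privacy trade-off the abstract refers to.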
Deep Learning for Recommender Systems
The widespread adoption of the Internet has led to an explosion in the number of choices available to consumers. Users have come to expect personalized content on modern e-commerce, entertainment and social media platforms. Recommender Systems (RS) provide a critical solution to this problem by maintaining user engagement and satisfaction with personalized content.
Traditional RS techniques are often linear, which limits the expressivity required to model complex user-item interactions, and they require extensive handcrafted features from domain experts. Deep learning has demonstrated significant breakthroughs in solving problems that have eluded the artificial intelligence community for many years, advancing state-of-the-art results in domains such as computer vision and natural language processing.
The recommender domain consists of heterogeneous and semantically rich data such as unstructured text (e.g. product descriptions), categorical attributes (e.g. genre of a movie), and user-item feedback (e.g. purchases). Deep learning can automatically capture the intricate structure of user preferences by encoding learned feature representations from high-dimensional data.
In this thesis, we explore five novel applications of deep learning-based techniques to address top-n recommendation. First, we propose Collaborative Memory Network, which unifies the strengths of the latent factor model and neighborhood-based methods, inspired by Memory Networks, to address collaborative filtering with implicit feedback. Second, we propose Neural Semantic Personalized Ranking, a novel probabilistic generative modeling approach that integrates deep neural networks with pairwise ranking for the item cold-start problem. Third, we propose Attentive Contextual Denoising Autoencoder, augmented with a context-driven attention mechanism to integrate arbitrary user and item attributes. Fourth, we propose a flexible encoder-decoder architecture called Neural Citation Network, embodying a powerful max time delay neural network encoder augmented with an attention mechanism and author networks to address context-aware citation recommendation. Finally, we propose a generic framework for conversational movie recommendation that leverages transfer learning to infer user preferences from natural language. Comprehensive experiments validate the effectiveness of all five proposed models against competitive baseline methods and demonstrate the successful adaptation of deep learning-based techniques to the recommendation domain.
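Pairwise ranking with latent factors, mentioned above, can be sketched with a generic loss of the Bayesian Personalized Ranking kind: an item the user interacted with should score higher than one the user did not. This is a minimal illustration under that assumption, not any of the five models from the thesis; the function names are hypothetical.

```python
import math


def score(user_vec, item_vec):
    """Latent-factor score: dot product of user and item embeddings."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


def pairwise_ranking_loss(user_vec, pos_item_vec, neg_item_vec):
    """Negative log-sigmoid of the score margin between a positive and a negative item.

    The loss is small when the positive item outscores the negative one and
    grows as the ordering is violated.
    """
    margin = score(user_vec, pos_item_vec) - score(user_vec, neg_item_vec)
    return math.log1p(math.exp(-margin))  # equals -log(sigmoid(margin))
```

With implicit feedback, negative items are typically sampled from the items a user has not interacted with, since explicit dislikes are not observed.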