39 research outputs found

    A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks

    The Transformer is a deep neural network that employs a self-attention mechanism to comprehend the contextual relationships within sequential data. Unlike conventional neural networks or updated versions of Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM), transformer models excel in handling long dependencies between input sequence elements and enable parallel processing. As a result, transformer-based models have attracted substantial interest among researchers in the field of artificial intelligence. This can be attributed to their immense potential and remarkable achievements, not only in Natural Language Processing (NLP) tasks but also in a wide range of domains, including computer vision, audio and speech processing, healthcare, and the Internet of Things (IoT). Although several survey papers have been published highlighting the transformer's contributions in specific fields, architectural differences, or performance evaluations, there is still a significant absence of a comprehensive survey paper encompassing its major applications across various domains. Therefore, we undertook the task of filling this gap by conducting an extensive survey of proposed transformer models from 2017 to 2022. Our survey encompasses the identification of the top five application domains for transformer-based models, namely: NLP, Computer Vision, Multi-Modality, Audio and Speech Processing, and Signal Processing. We analyze the impact of highly influential transformer-based models in these domains and subsequently classify them based on their respective tasks using a proposed taxonomy. Our aim is to shed light on the existing potential and future possibilities of transformers for enthusiastic researchers, thus contributing to the broader understanding of this groundbreaking technology.

    Intelligent Sensing and Learning for Advanced MIMO Communication Systems


    Neural distribution estimation as a two-part problem

    Given a dataset of examples, distribution estimation is the task of approximating the assumed underlying probability distribution from which those samples were drawn. Neural distribution estimation relies on the powerful function approximation capabilities of deep neural networks to build models for this purpose, and excels when data is high-dimensional and exhibits complex, nonlinear dependencies. In this thesis, we explore several approaches to neural distribution estimation, and present a unified perspective for these methods based on a two-part design principle. In particular, we examine how many models iteratively break down the task of distribution estimation into a series of tractable sub-tasks, before fitting a multi-step generative process which combines solutions to these sub-tasks in order to approximate the data distribution of interest. Framing distribution estimation as a two-part problem provides a shared language in which to compare and contrast prevalent models in the literature, and also allows for discussion of alternative approaches which do not follow this structure. We first present the Autoregressive Energy Machine, an energy-based model which is trained by approximate maximum likelihood through an autoregressive decomposition. The method demonstrates the flexibility of an energy-based model over an explicitly normalized model, and the novel application of autoregressive importance sampling highlights the benefit of an autoregressive approach to distribution estimation which recursively transforms the problem into a series of univariate tasks. Next, we present Neural Spline Flows, a class of normalizing flow models based on monotonic spline transformations which admit both an explicit inverse and a tractable Jacobian determinant. 
Normalizing flows tackle distribution estimation by searching for an invertible map between the data distribution and a more tractable base distribution, and this map is typically constructed as the composition of a series of invertible building blocks. We demonstrate that spline flows can be used to enhance density estimation of tabular data, variational inference in latent variable models, and generative modeling of natural images. The third chapter presents Maximum Likelihood Training of Score-Based Diffusion Models. Generative models based on estimation of the gradient of the logarithm of the probability density---or score function---have recently gained traction as a powerful modeling paradigm, in which the data distribution is gradually transformed toward a tractable base distribution by means of a stochastic process. The paper illustrates how this class of models can be trained by maximum likelihood, resulting in a model which is functionally equivalent to a continuous normalizing flow, and which bridges the gap between two branches of the literature. We also discuss latent-variable generative models more broadly, of which diffusion models are a structured special case. Finally, we present On Contrastive Learning for Likelihood-Free Inference, a unifying perspective for likelihood-free inference methods which perform Bayesian inference using either density estimation or density-ratio estimation. Likelihood-free inference focuses on inference in stochastic simulator models where the likelihood of parameters given observations is computationally intractable, and traditional inference methods fall short. In addition to illustrating the power of normalizing flows as generic tools for density estimation, this chapter also gives us the opportunity to discuss likelihood-free models more broadly. 
These so-called implicit generative models form a large part of the distribution estimation literature under the umbrella of generative adversarial networks, and are distinct in how they treat distribution estimation as a one-part problem.

    Improving and generalizing flow-based generative models with minibatch optimal transport

    Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have been held back by limitations in their simulation-based maximum likelihood training. We introduce the generalized conditional flow matching (CFM) technique, a family of simulation-free training objectives for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, CFM does not require the source distribution to be Gaussian or require evaluation of its density. A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Furthermore, OT-CFM is the first method to compute dynamic OT in a simulation-free way. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks, such as inferring single cell dynamics, unsupervised image translation, and Schrödinger bridge inference. Comment: A version of this paper appeared in the New Frontiers in Learning, Control, and Dynamical Systems workshop at ICML 2023. Title change from v1. Code: https://github.com/atong01/conditional-flow-matchin
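
As an illustration of the simulation-free regression objective described in this abstract, the following minimal sketch computes one training pair for the simplest straight-line case: a point on the path between a source sample and a target sample, and the constant velocity a model would be regressed onto. The names `cfm_pair` and `cfm_loss` are hypothetical, the noise term around the path is omitted, and samples are paired independently rather than through the minibatch optimal transport plan that OT-CFM uses. It is a sketch under these assumptions, not the authors' implementation.

```python
def cfm_pair(x0, x1, t):
    """Return (x_t, u_t) for a straight-line path from x0 to x1.

    x0: source sample, x1: target sample (lists of floats), t in [0, 1].
    x_t is the interpolated point; u_t is the velocity along the path,
    which is constant in t for a straight line.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - a for a, b in zip(x0, x1)]
    return x_t, u_t


def cfm_loss(model, x0, x1, t):
    """Squared-error regression of model(t, x_t) onto the target velocity.

    `model` is any callable (t, x) -> list of floats (an assumption made
    for this sketch; in practice it would be a neural network).
    """
    x_t, u_t = cfm_pair(x0, x1, t)
    v = model(t, x_t)
    return sum((vi - ui) ** 2 for vi, ui in zip(v, u_t))
```

No ODE is simulated during training in this setup: each loss evaluation needs only a sampled pair, a sampled time, and one forward pass of the model.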

    Inductive Program Synthesis via Iterative Forward-Backward Abstract Interpretation

    A key challenge in example-based program synthesis is the gigantic search space of programs. To address this challenge, various works have proposed using abstract interpretation to prune the search space. However, most existing approaches have focused only on forward abstract interpretation, and thus cannot fully exploit the power of abstract interpretation. In this paper, we propose a novel approach to inductive program synthesis via iterative forward-backward abstract interpretation. The forward abstract interpretation computes possible outputs of a program given inputs, while the backward abstract interpretation computes possible inputs of a program given outputs. By iteratively performing the two abstract interpretations in an alternating fashion, we can effectively determine whether any completion of a candidate partial program can satisfy the input-output examples. We apply our approach to a standard formulation, syntax-guided synthesis (SyGuS), thereby supporting a wide range of inductive synthesis tasks. We have implemented our approach and evaluated it on a set of benchmarks from prior work. The experimental results show that our approach significantly outperforms the state-of-the-art approaches thanks to the sophisticated abstract interpretation techniques.
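
The pruning idea in this abstract can be illustrated with a deliberately small sketch over the interval domain. It assumes a hypothetical partial program `x + HOLE`, where `HOLE` is an unknown integer constant known only to lie in some interval. A forward pass bounds the outputs the partial program can produce, a backward pass bounds the hole values compatible with the required output, and the candidate is discarded if either bound is empty. The `Interval` class and all function names are illustrative assumptions; the paper's actual abstract domains and iteration strategy are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: int
    hi: int

    def is_empty(self) -> bool:
        return self.lo > self.hi

    def meet(self, other: "Interval") -> "Interval":
        # Intersection of two intervals (may be empty).
        return Interval(max(self.lo, other.lo), min(self.hi, other.hi))


def forward_add(a: Interval, b: Interval) -> Interval:
    # Forward: possible values of a + b given bounds on a and b.
    return Interval(a.lo + b.lo, a.hi + b.hi)


def backward_add_operand(out: Interval, other: Interval) -> Interval:
    # Backward: if operand + other lies in `out`, then operand lies in out - other.
    return Interval(out.lo - other.hi, out.hi - other.lo)


def feasible(x: int, y: int, hole: Interval) -> bool:
    """Can some completion of `x + HOLE` map input x to output y?"""
    x_iv = Interval(x, x)
    # Forward pass, then intersect with the required output.
    out = forward_add(x_iv, hole).meet(Interval(y, y))
    if out.is_empty():
        return False
    # Backward pass: refine the hole using the required output.
    refined = backward_add_operand(out, x_iv).meet(hole)
    return not refined.is_empty()
```

For example, `feasible(3, 10, Interval(0, 5))` rejects the candidate because the forward bound on the output is [3, 8], while `feasible(3, 7, Interval(0, 5))` keeps it and narrows the hole to the single value 4. With a single operator, one forward and one backward pass already reach a fixpoint; larger partial programs are where alternating the two passes repeatedly would matter.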

    Computer Aided Verification

    This open access two-volume set LNCS 13371 and 13372 constitutes the refereed proceedings of the 34th International Conference on Computer Aided Verification, CAV 2022, which was held in Haifa, Israel, in August 2022. The 40 full papers presented together with 9 tool papers and 2 case studies were carefully reviewed and selected from 209 submissions. The papers were organized in the following topical sections: Part I: Invited papers; formal methods for probabilistic programs; formal methods for neural networks; software verification and model checking; hyperproperties and security; formal methods for hardware, cyber-physical, and hybrid systems. Part II: Probabilistic techniques; automata and logic; deductive verification and decision procedures; machine learning; synthesis and concurrency. This is an open access book.

    Proceedings of the 22nd Conference on Formal Methods in Computer-Aided Design – FMCAD 2022

    The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing.

    Advanced Computational Methods for Oncological Image Analysis

    Cancer is the second most common cause of death worldwide and encompasses highly variable clinical and biological scenarios. Some of the current clinical challenges are (i) early diagnosis of the disease and (ii) precision medicine, which allows for treatments targeted to specific clinical cases. The ultimate goal is to optimize the clinical workflow by combining accurate diagnosis with the most suitable therapies. Toward this, large-scale machine learning research can define associations among clinical, imaging, and multi-omics studies, making it possible to provide reliable diagnostic and prognostic biomarkers for precision oncology. Such reliable computer-assisted methods (i.e., artificial intelligence) together with clinicians’ unique knowledge can be used to properly handle typical issues in evaluation/quantification procedures (i.e., operator dependence and time-consuming tasks). These technical advances can significantly improve result repeatability in disease diagnosis and guide toward appropriate cancer care. Indeed, the need to apply machine learning and computational intelligence techniques has steadily increased to effectively perform image processing operations (such as segmentation, co-registration, classification, and dimensionality reduction) and multi-omics data integration.

    Annales Mathematicae et Informaticae 2021
