
    Deep Learning for Joint Source-Channel Coding of Text

    We consider the problem of joint source and channel coding of structured data such as natural language over a noisy channel. The typical approach to this problem in both theory and practice is to first perform source coding to compress the text and then channel coding to add robustness for transmission across the channel. This approach is optimal in terms of minimizing end-to-end distortion when transmission is over discrete memoryless channels and the block lengths of both the source and channel codes are arbitrarily large. However, this optimality is no longer ensured when documents are of finite length and the encoding length is constrained. We show that in this scenario, lower word error rates can be achieved with a deep-learning-based encoder and decoder. While separate source and channel coding would minimize bit error rates, our approach preserves the semantic information of sentences: it first embeds sentences in a semantic space, where sentences closer in meaning are located closer together, and then performs joint source and channel coding on these embeddings. Comment: accepted for publication in the proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018.
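
    The semantic-space idea can be caricatured with a toy nearest-neighbour sketch (numpy only; the sentences, embeddings, and SNR figures below are illustrative, not the paper's model): transmit an embedding over an AWGN channel and decode to the closest point in the semantic space, so channel errors fall on sentences that are close in meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy semantic space: each sentence is a point; sentences close in meaning
# are close in the space. Values are illustrative.
sentences = ["the cat sat", "a cat sat down", "stock prices rose"]
embeddings = np.array([[1.0, 0.1], [0.9, 0.2], [-1.0, 0.8]])

def transmit(x, snr_db=10.0):
    """Send an embedding over an AWGN channel (the joint-coding step)."""
    noise_power = 10 ** (-snr_db / 10)
    return x + rng.normal(scale=np.sqrt(noise_power), size=x.shape)

def decode(y):
    """Decode to the nearest sentence in the semantic space, so residual
    channel errors land on semantically similar sentences."""
    distances = np.linalg.norm(embeddings - y, axis=1)
    return sentences[int(np.argmin(distances))]

received = transmit(embeddings[0])
recovered = decode(received)  # the sent sentence or a close paraphrase
```

    At moderate SNR the decoder may confuse the two "cat" sentences, but a bit-exact separate-coding scheme would treat such a confusion as a total failure, while here the recovered meaning is nearly unchanged.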

    Neural Joint Source-Channel Coding

    For reliable transmission across a noisy communication channel, classical results from information theory show that it is asymptotically optimal to separate out the source and channel coding processes. However, this decomposition can fall short in the finite bit-length regime, as it requires non-trivial tuning of hand-crafted codes and assumes infinite computational power for decoding. In this work, we propose to jointly learn the encoding and decoding processes using a new discrete variational autoencoder model. By adding noise to the latent codes to simulate the channel during training, we learn to both compress and error-correct given a fixed bit-length and computational budget. We obtain codes that are not only competitive against several separation schemes, but also learn useful robust representations of the data for downstream tasks such as classification. Finally, inference amortization yields an extremely fast neural decoder, almost an order of magnitude faster than standard decoding methods based on iterative belief propagation.
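
    The channel-in-the-loop trick, corrupting latent codes with simulated noise so the decoder learns robustness, can be sketched with a binary symmetric channel; a fixed repetition code stands in for the learned discrete latent code here (all names and parameters are illustrative, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(1)

def channel(bits, flip_prob=0.1):
    """Binary symmetric channel: during training, the latent code is pushed
    through this to simulate transmission noise."""
    flips = (rng.random(bits.shape) < flip_prob).astype(np.uint8)
    return bits ^ flips

def encode(symbol, repeat=5):
    """Toy discrete 'latent code': a repetition code stands in for the
    learned encoder."""
    return np.full(repeat, symbol, dtype=np.uint8)

def decode(code):
    """Majority vote, the simplest error-correcting decoder."""
    return int(code.sum() * 2 > code.size)

noisy = channel(encode(1))
recovered = decode(noisy)
```

    In the paper the repetition code is replaced by a learned discrete latent representation, so redundancy is allocated where the data distribution needs it rather than uniformly.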

    Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features

    This work presents a novel method of exploring human brain-visual representations, with a view towards replicating these processes in machines. The core idea is to learn plausible computational and biological representations by correlating human neural activity and natural images. Thus, we first propose a model, EEG-ChannelNet, to learn a brain manifold for EEG classification. After verifying that visual information can be extracted from EEG data, we introduce a multimodal approach that uses deep image and EEG encoders, trained in a siamese configuration, for learning a joint manifold that maximizes a compatibility measure between visual features and brain representations. We then carry out image classification and saliency detection on the learned manifold. Performance analyses show that our approach satisfactorily decodes visual information from neural signals. This, in turn, can be used to effectively supervise the training of deep learning models, as demonstrated by the high performance of image classification and saliency detection on out-of-training classes. The obtained results show that the learned brain-visual features lead to improved performance and simultaneously bring deep models more in line with cognitive neuroscience work related to visual perception and attention.
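
    The compatibility measure between visual and EEG features that the siamese training maximizes can be sketched as cosine similarity with a hinge loss (a hypothetical minimal form; the paper's actual loss and feature dimensions may differ):

```python
import numpy as np

def compatibility(img_feat, eeg_feat):
    """Compatibility measure between visual and brain features, here cosine
    similarity (the paper's measure may differ)."""
    return float(img_feat @ eeg_feat /
                 (np.linalg.norm(img_feat) * np.linalg.norm(eeg_feat)))

def siamese_hinge_loss(img, eeg_match, eeg_other, margin=0.2):
    """Siamese objective: a matched image/EEG pair should score at least
    `margin` above a mismatched pair."""
    return max(0.0, margin
               - compatibility(img, eeg_match)
               + compatibility(img, eeg_other))
```

    Minimizing this loss over many (image, EEG) pairs pulls the two encoders toward a joint manifold where matched pairs score high and mismatched pairs score low.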

    Multimodal sparse representation learning and applications

    Unsupervised methods have proven effective for discriminative tasks in a single-modality scenario. In this paper, we present a multimodal framework for learning sparse representations that can capture semantic correlation between modalities. The framework can model relationships at a higher level by forcing a shared sparse representation. In particular, we propose the use of a joint dictionary learning technique for sparse coding and formulate the joint representation for concision, for cross-modal representation (in case of a missing modality), and for the union of the cross-modal representations. Given the accelerated growth of multimodal data posted on the Web on sites such as YouTube, Wikipedia, and Twitter, learning good multimodal features is becoming increasingly important. We show that the shared representations enabled by our framework substantially improve the classification performance under both unimodal and multimodal settings. We further show how deep architectures built on the proposed framework are effective for the case of highly nonlinear correlations between modalities. The effectiveness of our approach is demonstrated experimentally in image denoising, multimedia event detection and retrieval on the TRECVID dataset (audio-video), category classification on the Wikipedia dataset (image-text), and sentiment classification on PhotoTweet (image-text).
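
    The shared-code idea can be sketched with a stacked (joint) dictionary and a greedy sparse coder (a toy, seeded example; the paper's dictionary learning procedure is not reproduced here): one sparse code must explain the stacked observation from both modalities at once.

```python
import numpy as np

rng = np.random.default_rng(2)

def matching_pursuit(x, D, n_atoms=2):
    """Greedy sparse coding: repeatedly pick the dictionary atom most
    correlated with the residual and subtract its contribution."""
    code = np.zeros(D.shape[1])
    residual = x.astype(float).copy()
    for _ in range(n_atoms):
        j = int(np.argmax(np.abs(D.T @ residual)))
        step = float(D[:, j] @ residual) / float(D[:, j] @ D[:, j])
        code[j] += step
        residual -= step * D[:, j]
    return code

# Joint dictionary: the first rows model the "image" view, the last rows the
# "text" view, so one shared sparse code must explain both modalities.
D, _ = np.linalg.qr(rng.normal(size=(6, 6)))   # orthonormal toy dictionary
alpha_true = np.zeros(6)
alpha_true[[1, 4]] = [1.5, -0.7]               # shared 2-sparse code
x = D @ alpha_true                             # stacked image+text observation
alpha = matching_pursuit(x, D, n_atoms=2)
```

    Because the code is shared, the rows of `D` belonging to a missing modality can be used to synthesize that modality from a code inferred from the other one, which is the cross-modal use case the abstract mentions.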

    Deep Learning for Wireless Communications

    Existing communication systems exhibit inherent limitations in translating theory to practice when handling the complexity of optimization for emerging wireless applications with high degrees of freedom. Deep learning has strong potential to overcome this challenge via data-driven solutions and to improve the performance of wireless systems in utilizing limited spectrum resources. In this chapter, we first describe how deep learning is used to design an end-to-end communication system using autoencoders. This flexible design effectively captures channel impairments and optimizes transmitter and receiver operations jointly in single-antenna, multiple-antenna, and multiuser communications. Next, we present the benefits of deep learning in spectrum situation awareness, ranging from channel modeling and estimation to signal detection and classification tasks. Deep learning improves performance where model-based methods fail. Finally, we discuss how deep learning applies to wireless communication security. In this context, adversarial machine learning provides novel means to launch and defend against wireless attacks. These applications demonstrate the power of deep learning in providing novel means to design, optimize, adapt, and secure wireless communications.
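
    The autoencoder structure the chapter describes, message in, encoder, channel layer, decoder, message out, can be sketched without any training by letting a fixed 4-point constellation stand in for learned encoder weights (all parameters below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Autoencoder view of a link: message -> encoder -> channel -> decoder.
# A fixed unit-power constellation stands in for a trained encoder network.
constellation = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]]) / np.sqrt(2)

def encoder(msg):
    """Transmitter half of the autoencoder: message index -> channel symbols."""
    return constellation[msg]

def awgn(x, snr_db=10.0):
    """The non-trainable channel layer inserted between encoder and decoder."""
    sigma = np.sqrt(10 ** (-snr_db / 10) / 2)
    return x + rng.normal(scale=sigma, size=x.shape)

def decoder(y):
    """Receiver half: nearest-neighbour demapping back to a message index."""
    return int(np.argmin(np.linalg.norm(constellation - y, axis=1)))

sent = 2
received = decoder(awgn(encoder(sent)))
```

    In the actual autoencoder approach, `encoder` and `decoder` are neural networks trained jointly through the channel layer, so the learned constellation adapts to whatever impairments the channel model contains.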

    Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning

    Mutual information is a core statistical quantity with applications in all areas of machine learning, whether in training density models over multiple data modalities, in maximising the efficiency of noisy transmission channels, or when learning behaviour policies for exploration by artificial agents. Most learning algorithms that involve optimisation of the mutual information rely on the Blahut-Arimoto algorithm, an enumerative algorithm with exponential complexity that is not suitable for modern machine learning applications. This paper provides a new approach for scalable optimisation of the mutual information by merging techniques from variational inference and deep learning. We develop our approach by focusing on the problem of intrinsically motivated learning, where the mutual information forms the definition of a well-known internal drive known as empowerment. Using a variational lower bound on the mutual information, combined with convolutional networks for handling visual input streams, we develop a stochastic optimisation algorithm that allows for scalable information maximisation and empowerment-based reasoning directly from pixels to actions. Comment: Proceedings of the 29th Conference on Neural Information Processing Systems (NIPS 2015).
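
    The variational lower bound can be checked on a tiny discrete example: a uniform binary action observed through a channel that flips it with some probability. The Barber-Agakov bound, I(A; S') >= H(A) + E[log q(a|s')], is tight when the variational decoder q equals the true posterior and strictly below the mutual information otherwise (the numbers below are illustrative):

```python
import numpy as np

def mutual_information(p_flip):
    """Exact I(A; S') for a uniform binary action sent through a channel
    that flips the action with probability p_flip (bits)."""
    h = -p_flip * np.log2(p_flip) - (1 - p_flip) * np.log2(1 - p_flip)
    return 1.0 - h

def ba_lower_bound(p_flip, q_correct):
    """Barber-Agakov bound H(A) + E[log q(a|s')] for a variational decoder
    that assigns probability q_correct to the action actually taken."""
    return (1.0
            + (1 - p_flip) * np.log2(q_correct)
            + p_flip * np.log2(1 - q_correct))
```

    Maximising this bound over the decoder's parameters therefore drives it toward the mutual information itself, which is what makes the bound usable as a scalable surrogate objective for empowerment.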

    Super-Resolution via Deep Learning

    The recent phenomenal interest in convolutional neural networks (CNNs) has made it inevitable for the super-resolution (SR) community to explore their potential. The response has been immense: in the three years since the advent of the pioneering work, so many works have appeared that a comprehensive survey is warranted. This paper surveys the SR literature in the context of deep learning. We focus on three important aspects of multimedia, namely image, video, and multi-dimensional data, especially depth maps. In each case, the relevant benchmarks are first introduced in the form of datasets and state-of-the-art SR methods, excluding deep learning. Next is a detailed analysis of the individual works, each including a short description of the method and a critique of the results with special reference to the benchmarking done. This is followed by a minimal overall benchmarking in the form of comparisons on common datasets, relying on the results reported in the various works.

    Towards an Intelligent Edge: Wireless Communication Meets Machine Learning

    The recent revival of artificial intelligence (AI) is revolutionizing almost every branch of science and technology. Given the ubiquitous smart mobile gadgets and Internet of Things (IoT) devices, it is expected that a majority of intelligent applications will be deployed at the edge of wireless networks. This trend has generated strong interest in realizing an "intelligent edge" to support AI-enabled applications at various edge devices. Accordingly, a new research area, called edge learning, has emerged, which crosses and revolutionizes two disciplines: wireless communication and machine learning. A major theme in edge learning is to overcome the limited computing power, as well as limited data, at each edge device. This is accomplished by leveraging the mobile edge computing (MEC) platform and exploiting the massive data distributed over a large number of edge devices. In such systems, learning from distributed data and communicating between the edge server and devices are two critical and coupled aspects, and their fusion poses many new research challenges. This article advocates a new set of design principles for wireless communication in edge learning, collectively called learning-driven communication. Illustrative examples are provided to demonstrate the effectiveness of these design principles, and unique research opportunities are identified. Comment: submitted to IEEE for possible publication.
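
    One concrete instance of learning from distributed data over such an edge architecture, not named in the article, is federated averaging: devices run local updates on private data and the edge server only aggregates the resulting models. A minimal sketch with a linear model and synthetic data shards (all names and parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

def local_update(w, X, y, lr=0.1, steps=10):
    """Gradient steps one edge device runs on its private data shard
    (a linear least-squares model keeps the sketch short)."""
    for _ in range(steps):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

def federated_average(models, sizes):
    """The edge server aggregates device models, weighted by shard size,
    so raw data never leaves the devices."""
    return np.average(models, axis=0, weights=sizes)

w_true = np.array([2.0, -1.0])
shards = []
for _ in range(2):                      # two edge devices, private data each
    X = rng.normal(size=(50, 2))
    shards.append((X, X @ w_true))

w = np.zeros(2)
for _ in range(10):                     # communication rounds
    local_models = [local_update(w, X, y) for X, y in shards]
    w = federated_average(local_models, [len(y) for _, y in shards])
```

    The communication rounds are exactly where the article's learning-driven communication principles apply: what each device sends, and how often, dominates the system cost.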

    When Provably Secure Steganography Meets Generative Models

    Steganography is the art and science of hiding secret messages in public communication so that the presence of the secret messages cannot be detected. There are two provably secure steganographic frameworks: one is black-box-sampling based and the other is compression based. The former requires a perfect sampler that yields data following the same distribution, and the latter needs explicit distributions of the generative objects. However, these two conditions are too strict, even unrealistic, in the traditional data environment, because it is hard to model the explicit distribution of natural images. With the development of deep learning, generative models bring new vitality to provably secure steganography: they can serve as the black-box sampler or provide the explicit distribution of the generative media. Motivated by this, this paper proposes two types of provably secure stegosystems with generative models. Specifically, we first design a black-box-sampling-based provably secure stegosystem for broad generative models without explicit distributions, such as GANs, VAEs, and flow-based generative models, where the generative network can serve as the perfect sampler. For the compression-based stegosystem, we instead leverage generative models with explicit distributions, such as autoregressive models, where adaptive arithmetic coding plays the role of the perfect compressor: it decompresses the encrypted message bits into generative media, and the receiver can compress the generative media back into the encrypted message bits. To show the effectiveness of our method, we take DFC-VAE, Glow, and WaveNet as instances of generative models and demonstrate the perfectly secure performance of these stegosystems against state-of-the-art steganalysis methods.
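
    The compression-based construction, decompressing encrypted message bits into cover symbols and recompressing them at the receiver, can be sketched in the special case of dyadic symbol probabilities, where the arithmetic coder reduces to a simple prefix-code table (a toy model distribution; real systems pad the bitstring so it always parses completely):

```python
# Model distribution with dyadic probabilities p = (1/2, 1/4, 1/4); for such
# distributions the adaptive arithmetic coder reduces to this prefix-code table.
model = {"a": "0", "b": "10", "c": "11"}

def embed(bits):
    """Sender: 'decompress' the encrypted message bits into cover symbols,
    which are then distributed exactly as samples from the model. Assumes
    `bits` parses completely under the prefix code (real systems pad)."""
    inverse = {code: sym for sym, code in model.items()}
    symbols, i = [], 0
    while i < len(bits):
        for code, sym in inverse.items():
            if bits.startswith(code, i):
                symbols.append(sym)
                i += len(code)
                break
    return symbols

def extract(symbols):
    """Receiver: compress the cover symbols back into the message bits."""
    return "".join(model[s] for s in symbols)

message = "0101101100"
cover = embed(message)
```

    Because the message bits are encrypted (uniformly random), the induced symbol frequencies match the model distribution, which is the source of the provable security; the general non-dyadic case needs a full arithmetic coder.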

    Channel Agnostic End-to-End Learning based Communication Systems with Conditional GAN

    In this article, we use deep neural networks (DNNs) to develop a wireless end-to-end communication system, in which DNNs are employed for all signal-related functionalities, such as encoding, decoding, modulation, and equalization. However, an accurate instantaneous channel transfer function, i.e., the channel state information (CSI), is necessary to compute the gradients for training the DNNs. In many communication systems, the channel transfer function is hard to obtain in advance and varies with time and location. In this article, this constraint is relaxed by developing a channel-agnostic end-to-end system that does not rely on any prior information about the channel. We use a conditional generative adversarial net (GAN) to represent the channel effects, where the encoded signal of the transmitter serves as the conditioning information. In addition, to deal with the time-varying channel, the received signal corresponding to the pilot data can also be added as part of the conditioning information. The simulation results show that the proposed method is effective on additive white Gaussian noise (AWGN) and Rayleigh fading channels, which opens a new door for building data-driven communication systems.
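
    The payoff of the surrogate channel can be sketched directly: the real channel is a black box to the transmitter, but a trained conditional generator is an explicit differentiable function, so gradients can flow through it back to the encoder (the generator below is a hand-set linear stand-in, not a trained GAN; all values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

def real_channel(x):
    """The actual channel: the transmitter can observe its outputs but
    cannot backpropagate through this black box."""
    return 0.8 * x + rng.normal(scale=0.1, size=np.shape(x))

def generator(z, x, a=0.8, sigma=0.1):
    """Stand-in for a trained conditional generator G(z | x): the encoded
    signal x is the conditioning input, z the GAN noise vector."""
    return a * x + sigma * z

# Because G is an explicit differentiable function, the transmitter can
# estimate d(output)/d(input) through the surrogate channel, here by a
# central finite difference on the linear stand-in.
x = np.array([1.0])
z = np.zeros(1)
eps = 1e-6
grad = (generator(z, x + eps) - generator(z, x - eps)) / (2 * eps)
```

    In the actual system this gradient path is what lets the transmitter DNN be trained end-to-end even though the true channel transfer function is never known.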