Deep Learning for Joint Source-Channel Coding of Text
We consider the problem of joint source and channel coding of structured data
such as natural language over a noisy channel. The typical approach to this
problem in both theory and practice involves performing source coding to first
compress the text and then channel coding to add robustness for the
transmission across the channel. This approach is optimal in terms of
minimizing end-to-end distortion with arbitrarily large block lengths of both
the source and channel codes when transmission is over discrete memoryless
channels. However, the optimality of this approach is no longer ensured when
documents are of finite length and the length of the encoding is constrained.
We show that in this scenario we can achieve lower word error rates by
developing a deep learning based encoder and decoder. While the approach of
separate source and channel coding would minimize bit error rates, our approach
preserves semantic information of sentences by first embedding sentences in a
semantic space where sentences closer in meaning are located closer together,
and then performing joint source and channel coding on these embeddings.Comment: accepted for publication in the proceedings of IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP) 201
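The idea described in this abstract can be illustrated with a toy sketch (not the paper's model): sentence embeddings are illustrative random stand-ins for a trained semantic encoder, the embedding crosses an AWGN channel, and the receiver decodes to the nearest known sentence, so small channel noise perturbs meaning only slightly.

```python
import numpy as np

rng = np.random.default_rng(0)

sentences = ["the cat sat", "a dog barked", "it rained today"]
# hypothetical encoder output: random unit vectors standing in for
# learned semantic embeddings
embeddings = rng.normal(size=(len(sentences), 16))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

def transmit(idx, noise_std=0.1):
    """Send sentence idx's embedding through an AWGN channel."""
    return embeddings[idx] + rng.normal(scale=noise_std, size=16)

def decode(received):
    """Decode to the closest known sentence in embedding space."""
    return int(np.argmin(np.linalg.norm(embeddings - received, axis=1)))
```

Because decoding is by proximity in the semantic space, moderate channel noise tends to return the transmitted sentence, or at worst one close in meaning, rather than an arbitrary bit-level corruption.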
Neural Joint Source-Channel Coding
For reliable transmission across a noisy communication channel, classical
results from information theory show that it is asymptotically optimal to
separate out the source and channel coding processes. However, this
decomposition can fall short in the finite bit-length regime, as it requires
non-trivial tuning of hand-crafted codes and assumes infinite computational
power for decoding. In this work, we propose to jointly learn the encoding and
decoding processes using a new discrete variational autoencoder model. By
adding noise into the latent codes to simulate the channel during training, we
learn to both compress and error-correct given a fixed bit-length and
computational budget. We obtain codes that are not only competitive against
several separation schemes, but also learn useful robust representations of the
data for downstream tasks such as classification. Finally, inference
amortization yields an extremely fast neural decoder, almost an order of
magnitude faster compared to standard decoding methods based on iterative
belief propagation.
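The channel-in-the-loop idea above can be sketched minimally: a discrete latent code passes through a simulated binary symmetric channel before decoding, which is what forces the learned code to be robust. The repetition code here is a hypothetical placeholder for the paper's learned discrete VAE encoder/decoder, not its actual model.

```python
import numpy as np

rng = np.random.default_rng(1)

def encode(bits, rep=5):
    # stand-in "encoder": a repetition code over the discrete latent
    return np.repeat(bits, rep)

def channel(code, flip_p=0.02):
    # binary symmetric channel applied to the latent code during training
    flips = (rng.random(code.shape) < flip_p).astype(code.dtype)
    return code ^ flips

def decode(code, rep=5):
    # majority vote per source bit
    return (code.reshape(-1, rep).sum(axis=1) > rep // 2).astype(int)

msg = np.array([1, 0, 1, 1, 0])
noisy = channel(encode(msg))
recovered = decode(noisy)
```

In the paper's setting the noise injection plays the same role as `channel` here: gradients flow through a corrupted latent, so compression and error correction are learned jointly under the fixed bit budget.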
Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features
This work presents a novel method of exploring human brain-visual
representations, with a view towards replicating these processes in machines.
The core idea is to learn plausible computational and biological
representations by correlating human neural activity and natural images. Thus,
we first propose a model, EEG-ChannelNet, to learn a brain manifold for EEG
classification. After verifying that visual information can be extracted from
EEG data, we introduce a multimodal approach that uses deep image and EEG
encoders, trained in a siamese configuration, for learning a joint manifold
that maximizes a compatibility measure between visual features and brain
representations. We then carry out image classification and saliency detection
on the learned manifold. Performance analyses show that our approach
satisfactorily decodes visual information from neural signals. This, in turn,
can be used to effectively supervise the training of deep learning models, as
demonstrated by the high performance of image classification and saliency
detection on out-of-training classes. The obtained results show that the
learned brain-visual features lead to improved performance and simultaneously
bring deep models more in line with cognitive neuroscience work related to
visual perception and attention.
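A minimal sketch of the siamese compatibility idea, with linear projections as hypothetical stand-ins for the deep image and EEG encoders: both modalities map into one joint space, and training would push the compatibility score of matched image/EEG pairs toward its maximum.

```python
import numpy as np

rng = np.random.default_rng(2)

W_img = rng.normal(size=(8, 64))      # stand-in for the image encoder head
W_eeg = rng.normal(size=(8, 128))     # stand-in for the EEG encoder head

def compatibility(img_feat, eeg_feat):
    """Cosine compatibility in the joint 8-dim manifold."""
    a, b = W_img @ img_feat, W_eeg @ eeg_feat
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

score = compatibility(rng.normal(size=64), rng.normal(size=128))
```

A siamese loss would maximize this score for matched pairs and minimize it for mismatched ones, which is what yields the joint manifold used for classification and saliency detection.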
Multimodal sparse representation learning and applications
Unsupervised methods have proven effective for discriminative tasks in a
single-modality scenario. In this paper, we present a multimodal framework for
learning sparse representations that can capture semantic correlation between
modalities. The framework can model relationships at a higher level by forcing
the shared sparse representation. In particular, we propose the use of a joint
dictionary learning technique for sparse coding and formulate the joint
representation to support concise coding, cross-modal representations (when a
modality is missing), and the union of the cross-modal representations. Given
the accelerated
growth of multimodal data posted on the Web such as YouTube, Wikipedia, and
Twitter, learning good multimodal features is becoming increasingly important.
We show that the shared representations enabled by our framework substantially
improve the classification performance under both unimodal and multimodal
settings. We further show how deep architectures built on the proposed
framework are effective for the case of highly nonlinear correlations between
modalities. The effectiveness of our approach is demonstrated experimentally in
image denoising, multimedia event detection and retrieval on the TRECVID
dataset (audio-video), category classification on the Wikipedia dataset
(image-text), and sentiment classification on PhotoTweet (image-text).
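The shared-code idea can be sketched with stacked dictionaries: both modalities are explained by one sparse coefficient vector z, so the joint observation [x_a; x_b] is approximated by [D_a; D_b] z. The dictionaries and the single matching-pursuit step below are illustrative stand-ins, not the paper's learned ones.

```python
import numpy as np

rng = np.random.default_rng(3)

D_a = rng.normal(size=(10, 6))        # dictionary for modality A (hypothetical)
D_b = rng.normal(size=(20, 6))        # dictionary for modality B (hypothetical)
D = np.vstack([D_a, D_b])             # stacked joint dictionary

z_true = np.zeros(6)
z_true[2] = 1.5                       # one active atom shared by both modalities
x = D @ z_true                        # joint observation [x_a; x_b]

# one matching-pursuit step: pick the atom best correlated with x
scores = D.T @ x / (np.linalg.norm(D, axis=0) ** 2)
k = int(np.argmax(np.abs(scores)))
```

Because the same atom index explains both halves of x, the recovered support is shared across modalities; with a modality missing, the corresponding block of D is simply dropped and the same code z is inferred from the remaining rows.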
Deep Learning for Wireless Communications
Existing communication systems exhibit inherent limitations in translating
theory to practice when handling the complexity of optimization for emerging
wireless applications with high degrees of freedom. Deep learning has a strong
potential to overcome this challenge via data-driven solutions and improve the
performance of wireless systems in utilizing limited spectrum resources. In
this chapter, we first describe how deep learning is used to design an
end-to-end communication system using autoencoders. This flexible design
effectively captures channel impairments and optimizes transmitter and receiver
operations jointly in single-antenna, multiple-antenna, and multiuser
communications. Next, we present the benefits of deep learning in spectrum
situation awareness ranging from channel modeling and estimation to signal
detection and classification tasks. Deep learning improves the performance when
the model-based methods fail. Finally, we discuss how deep learning applies to
wireless communication security. In this context, adversarial machine learning
provides novel means to launch and defend against wireless attacks. These
applications demonstrate the power of deep learning in providing novel means to
design, optimize, adapt, and secure wireless communications.
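The autoencoder view of an end-to-end link mentioned above can be sketched as follows, with a fixed orthogonal codebook standing in for the learned transmitter and a nearest-codeword rule standing in for the learned receiver (a real system trains both jointly through the channel).

```python
import numpy as np

rng = np.random.default_rng(4)

M, n = 4, 8                  # 4 possible messages, 8 channel uses
codebook = np.eye(M, n)      # stand-in transmitter: orthogonal unit-power codewords

def receiver(y):
    """Nearest-codeword rule (what a trained decoder approximates)."""
    return int(np.argmin(np.linalg.norm(codebook - y, axis=1)))

msg = 2
y = codebook[msg] + rng.normal(scale=0.2, size=n)   # AWGN channel
```

In the autoencoder formulation, the transmitter mapping and the receiver's decision rule are both neural networks, trained end to end so that the channel impairments are absorbed into the learned constellation.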
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning
The mutual information is a core statistical quantity that has applications
in all areas of machine learning, whether this is in training of density models
over multiple data modalities, in maximising the efficiency of noisy
transmission channels, or when learning behaviour policies for exploration by
artificial agents. Most learning algorithms that involve optimisation of the
mutual information rely on the Blahut-Arimoto algorithm --- an enumerative
algorithm with exponential complexity that is not suitable for modern machine
learning applications. This paper provides a new approach for scalable
optimisation of the mutual information by merging techniques from variational
inference and deep learning. We develop our approach by focusing on the problem
of intrinsically-motivated learning, where the mutual information forms the
definition of a well-known internal drive known as empowerment. Using a
variational lower bound on the mutual information, combined with convolutional
networks for handling visual input streams, we develop a stochastic
optimisation algorithm that allows for scalable information maximisation and
empowerment-based reasoning directly from pixels to actions.

Comment: Proceedings of the 29th Conference on Neural Information Processing
Systems (NIPS 2015)
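The variational lower bound referred to here is, in its generic form, the Barber-Agakov bound, which replaces the intractable posterior with a learned approximation q (in the empowerment setting, X is the action sequence and Y the resulting state):

```latex
I(X;Y) \;=\; H(X) - H(X \mid Y) \;\ge\; H(X) + \mathbb{E}_{p(x,y)}\!\left[\log q(x \mid y)\right]
```

The gap equals the expected KL divergence between the true posterior p(x|y) and q(x|y), so optimising q tightens the bound while keeping every term amenable to stochastic gradient estimation, which is what makes the approach scalable where Blahut-Arimoto is not.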
Super-Resolution via Deep Learning
The recent surge of interest in convolutional neural networks (CNNs) made it
inevitable that the super-resolution (SR) community would explore their
potential. The response has been immense: in the three years since the
pioneering work, enough papers have appeared to warrant a comprehensive
survey. This paper surveys the SR literature in the context of
deep learning. We focus on the three important aspects of multimedia - namely
image, video and multi-dimensions, especially depth maps. In each case, first
relevant benchmarks are introduced in the form of datasets and state of the art
SR methods, excluding deep learning. Next is a detailed analysis of the
individual works, each including a short description of the method and a
critique of the results with special reference to the benchmarking done. This
is followed by a brief overall benchmarking, comparing methods on a common
dataset using the results reported in the respective works.
Towards an Intelligent Edge: Wireless Communication Meets Machine Learning
The recent revival of artificial intelligence (AI) is revolutionizing almost
every branch of science and technology. Given the ubiquitous smart mobile
gadgets and Internet of Things (IoT) devices, it is expected that a majority of
intelligent applications will be deployed at the edge of wireless networks.
This trend has generated strong interests in realizing an "intelligent edge" to
support AI-enabled applications at various edge devices. Accordingly, a new
research area, called edge learning, emerges, which crosses and revolutionizes
two disciplines: wireless communication and machine learning. A major theme in
edge learning is to overcome the limited computing power, as well as limited
data, at each edge device. This is accomplished by leveraging the mobile edge
computing (MEC) platform and exploiting the massive data distributed over a
large number of edge devices. In such systems, learning from distributed data
and communicating between the edge server and devices are two critical and
coupled aspects, and their fusion poses many new research challenges. This
article advocates a new set of design principles for wireless communication in
edge learning, collectively called learning-driven communication. Illustrative
examples are provided to demonstrate the effectiveness of these design
principles, and unique research opportunities are identified.

Comment: submitted to IEEE for possible publication
When Provably Secure Steganography Meets Generative Models
Steganography is the art and science of hiding secret messages in public
communication so that the presence of the secret messages cannot be detected.
There are two provably secure steganographic frameworks, one is black-box
sampling based and the other is compression based. The former requires a
perfect sampler which yields data following the same distribution, and the
latter needs explicit distributions of generative objects. However, these two
conditions are too strict, even unrealistic, in the traditional data
environment, because it is hard to model the explicit distribution of natural
images. With
the development of deep learning, generative models bring new vitality to
provably secure steganography, which can serve as the black-box sampler or
provide the explicit distribution of generative media. Motivated by this, this
paper proposes two types of provably secure stegosystems with generative
models. Specifically, we first design a black-box sampling based provably
secure stegosystem for a broad class of generative models without explicit
distributions, such as
GAN, VAE, and flow-based generative models, where the generative network can
serve as the perfect sampler. For the compression based stegosystem, we
leverage
the generative models with explicit distribution such as autoregressive models
instead, where the adaptive arithmetic coding plays the role of the perfect
compressor, decompressing the encrypted message bits into generative media, and
the receiver can compress the generative media into the encrypted message bits.
To show the effectiveness of our method, we take DFC-VAE, Glow, WaveNet as
instances of generative models and demonstrate the security of these
stegosystems against state-of-the-art steganalysis methods.
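A toy instance of the compression-based construction: when the model distribution is uniform over 2^k symbols, adaptive arithmetic coding degenerates to reading k message bits per generated symbol, which makes the embed/extract symmetry easy to see. The symbols here are stand-ins for generative outputs; a real system codes against a learned (e.g. autoregressive) distribution.

```python
symbols = ["A", "B", "C", "D"]   # stand-in generative outputs, uniform model

def embed(bits):
    """Every 2 message bits select one 'generated' symbol."""
    return [symbols[2 * b0 + b1] for b0, b1 in zip(bits[::2], bits[1::2])]

def extract(stego):
    """The receiver recompresses symbols back into message bits."""
    bits = []
    for s in stego:
        i = symbols.index(s)
        bits += [i >> 1, i & 1]
    return bits

msg = [1, 0, 0, 1, 1, 1]
stego = embed(msg)
```

Because each symbol is exactly as likely under the model as under the embedding map, the stego output follows the model distribution, which is the source of the provable security in the uniform case.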
Channel Agnostic End-to-End Learning based Communication Systems with Conditional GAN
In this article, we use deep neural networks (DNNs) to develop a wireless
end-to-end communication system, in which DNNs are employed for all
signal-related functionalities, such as encoding, decoding, modulation, and
equalization. However, an accurate instantaneous channel transfer function,
i.e., the channel state information (CSI), is necessary to compute the
gradients for training the DNNs. In many communication systems, the channel
transfer function is hard to obtain in advance and varies with time and
location. In this article, this constraint is relaxed by developing a channel
agnostic end-to-end system that does not rely on any prior information about
the channel. We use a conditional generative adversarial net (GAN) to represent
the channel effects, where the encoded signal of the transmitter will serve as
the conditioning information. In addition, in order to deal with the
time-varying channel, the received signal corresponding to the pilot data can
also be added as part of the conditioning information. Simulation results show
that the proposed method is effective on additive white Gaussian noise (AWGN)
and Rayleigh fading channels, opening a new door to building data-driven
communication systems.
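The conditioning structure described above can be sketched with a single linear map as a hypothetical stand-in for the generator network: the generator input concatenates a noise vector, the encoded transmit signal, and (for time-varying channels) the received pilots.

```python
import numpy as np

rng = np.random.default_rng(5)

# hypothetical generator: one linear layer in place of the real DNN
W = rng.normal(scale=0.1, size=(8, 8 + 4 + 4))

def channel_generator(tx_signal, pilot_rx, z):
    """Fake received signal conditioned on the encoded tx signal and pilots."""
    cond_input = np.concatenate([z, tx_signal, pilot_rx])
    return W @ cond_input

fake_rx = channel_generator(rng.normal(size=4), rng.normal(size=4), rng.normal(size=8))
```

Once such a generator is adversarially trained to mimic the real channel's conditional output distribution, it provides a differentiable surrogate through which transmitter gradients can flow, removing the need for prior CSI.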