Adversarial Network Bottleneck Features for Noise Robust Speaker Verification
In this paper, we propose a noise robust bottleneck feature representation
which is generated by an adversarial network (AN). The AN includes two
cascade-connected networks, an encoding network (EN) and a discriminative network (DN).
Mel-frequency cepstral coefficients (MFCCs) of clean and noisy speech are used
as input to the EN and the output of the EN is used as the noise robust
feature. The EN and DN are trained in turn, namely, when training the DN, noise
types are selected as the training labels and when training the EN, all labels
are set as the same, i.e., the clean speech label, which aims to make the AN
features invariant to noise and thus achieve noise robustness. We evaluate the
performance of the proposed feature on a Gaussian Mixture Model-Universal
Background Model based speaker verification system, and compare it with MFCC
features of speech enhanced by short-time spectral amplitude minimum mean
square error (STSA-MMSE) and deep neural network-based speech enhancement
(DNN-SE) methods. Experimental results on the RSR2015 database show that the
proposed AN bottleneck feature (AN-BN) dramatically outperforms the STSA-MMSE
and DNN-SE based MFCCs for different noise types and signal-to-noise ratios.
Furthermore, the AN-BN feature is able to improve the speaker verification
performance under the clean condition.
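As an illustration of the alternating label scheme described in this abstract, the following minimal Python sketch builds the training labels for a DN step and for an EN step. The noise-type names, the label indices, and both helper functions are hypothetical and are not taken from the paper.

```python
CLEAN_LABEL = 0  # assumed index of the clean-speech class (hypothetical)

def dn_labels(noise_types, label_index):
    """Labels for a DN training step: the actual noise type of each utterance."""
    return [label_index[t] for t in noise_types]

def en_labels(noise_types):
    """Labels for an EN training step: every utterance is labelled as clean."""
    return [CLEAN_LABEL] * len(noise_types)

# Hypothetical noise classes and a hypothetical mini-batch.
label_index = {"clean": CLEAN_LABEL, "babble": 1, "car": 2}
batch = ["clean", "car", "babble", "car"]

print(dn_labels(batch, label_index))  # [0, 2, 1, 2]
print(en_labels(batch))               # [0, 0, 0, 0]
```

Training the EN against the all-clean labels is intended to push its output toward features from which the DN cannot recover the noise type.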
Channel Covariance Matrix Estimation via Dimension Reduction for Hybrid MIMO MmWave Communication Systems
Hybrid massive MIMO structures with lower hardware complexity and power
consumption have been considered as a potential candidate for millimeter wave
(mmWave) communications. Channel covariance information can be used for
designing transmitter precoders, receiver combiners, channel estimators, etc.
However, hybrid structures allow only a lower-dimensional signal to be
observed, which adds difficulties for channel covariance matrix estimation. In
this paper, we formulate the channel covariance estimation as a structured
low-rank matrix sensing problem via Kronecker product expansion and use a
low-complexity algorithm to solve this problem. Numerical results with uniform
linear arrays (ULA) and uniform squared planar arrays (USPA) are provided to
demonstrate the effectiveness of our proposed method.
Matrix Completion-Based Channel Estimation for MmWave Communication Systems With Array-Inherent Impairments
Hybrid massive MIMO structures with reduced hardware complexity and power
consumption have been widely studied as a potential candidate for millimeter
wave (mmWave) communications. Channel estimators that require knowledge of the
array response, such as those using compressive sensing (CS) methods, may
suffer from performance degradation when array-inherent impairments bring
unknown phase errors and gain errors to the antenna elements. In this paper, we
design matrix completion (MC)-based channel estimation schemes which are robust
against the array-inherent impairments. We first design an open-loop training
scheme that can sample entries from the effective channel matrix randomly and
is compatible with the phase shifter-based hybrid system. Leveraging the
low-rank property of the effective channel matrix, we then design a channel
estimator based on the generalized conditional gradient (GCG) framework and the
alternating minimization (AltMin) approach. The resulting estimator is immune
to array-inherent impairments and can be applied to systems with any array
shape because it does not depend on the array response. In addition, we extend our
design to sample a transformed channel matrix following the concept of
inductive matrix completion (IMC), which can be solved efficiently using our
proposed estimator and achieves similar performance with a lower requirement on
the dynamic range of the transmission power per antenna. Numerical results
demonstrate the advantages of our proposed MC-based channel estimators in terms
of estimation performance, computational complexity and robustness against
array-inherent impairments over the orthogonal matching pursuit (OMP)-based CS
channel estimator.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
Multipartite entanglement purification with quantum nondemolition detectors
We present a scheme for multipartite entanglement purification of quantum
systems in a Greenberger-Horne-Zeilinger state with quantum nondemolition
detectors (QNDs). This scheme does not require controlled-NOT gates, which
cannot be implemented perfectly with linear optical elements at present; it
relies instead on QNDs based on cross-Kerr nonlinearities. It works in two steps, i.e., the
bit-flipping error correction and the phase-flipping error correction. These
two steps can be iterated perfectly with parity checks and simple single-photon
measurements. This scheme does not require the parties to possess sophisticated
single photon detectors. These features may make this scheme more efficient
and feasible than others in practical applications.
Comment: 8 pages, 5 figures
Ranking-based Deep Cross-modal Hashing
Cross-modal hashing has been receiving increasing interest for its low
storage cost and fast query speed in multi-modal data retrieval. However, most
existing hashing methods are based on hand-crafted or raw level features of
objects, which may not be optimally compatible with the coding process.
Besides, these hashing methods are mainly designed to handle simple pairwise
similarity. The complex multilevel ranking semantic structure of instances
associated with multiple labels has not been well explored yet. In this paper,
we propose a ranking-based deep cross-modal hashing approach (RDCMH). RDCMH
first uses the feature and label information of data to derive a
semi-supervised semantic ranking list. Next, to expand the semantic
representation power of hand-crafted features, RDCMH integrates the semantic
ranking information into deep cross-modal hashing and jointly optimizes the
compatible parameters of deep feature representations and of hashing functions.
Experiments on real multi-modal datasets show that RDCMH outperforms other
competitive baselines and achieves the state-of-the-art performance in
cross-modal retrieval applications.
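The low storage cost and fast query speed mentioned in this abstract typically come from comparing compact binary codes by Hamming distance. The Python sketch below ranks image codes against a text query code; all codes and item names are made up for illustration, and the learned hashing functions of RDCMH are not modelled.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")

def rank_by_hamming(query_code: int, database: dict) -> list:
    """Return database item names ordered from nearest to farthest code."""
    return sorted(database, key=lambda name: hamming(query_code, database[name]))

# Hypothetical 8-bit codes for three images and one text query.
image_codes = {
    "img_a": 0b10110010,
    "img_b": 0b10110011,
    "img_c": 0b01001101,
}
text_query_code = 0b10110011

print(rank_by_hamming(text_query_code, image_codes))
# ['img_b', 'img_a', 'img_c']
```

Each comparison is a single XOR followed by a bit count, which is why retrieval over binary codes can stay fast even for large databases.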
Long Text Generation via Adversarial Training with Leaked Information
Automatically generating coherent and semantically meaningful text has many
applications in machine translation, dialogue systems, image captioning, etc.
Recently, Generative Adversarial Nets (GANs) combined with policy gradient,
which use a discriminative model to guide the training of the generative model
as a reinforcement learning policy, have shown promising results in text
generation. However, the scalar guiding signal is only available after the
entire text has been generated and lacks intermediate information about text
structure during the generative process. This limits its success when
the generated text samples are long (more than 20 words). In this
paper, we propose a new framework, called LeakGAN, to address the problem for
long text generation. We allow the discriminative net to leak its own
high-level extracted features to the generative net to further help the
guidance. The generator incorporates such informative signals into all
generation steps through an additional Manager module, which takes the
extracted features of current generated words and outputs a latent vector to
guide the Worker module for next-word generation. Our extensive experiments on
synthetic data and various real-world tasks with Turing test demonstrate that
LeakGAN is highly effective in long text generation and also improves the
performance in short text generation scenarios. More importantly, without any
supervision, LeakGAN would be able to implicitly learn sentence structures only
through the interaction between Manager and Worker.
Comment: 14 pages, AAAI 201
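One way to picture how a latent vector from the Manager could guide the Worker's next-word choice is sketched below in Python. The dot-product combination, the function names, and the toy numbers are assumptions made for illustration only; the abstract does not specify how the two modules are combined.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def next_word_distribution(worker_outputs, goal):
    """Score each vocabulary word by the dot product of its Worker vector
    with the Manager's latent (goal) vector, then normalise the scores."""
    logits = [sum(o * g for o, g in zip(row, goal)) for row in worker_outputs]
    return softmax(logits)

# Hypothetical 3-word vocabulary with 2-dimensional Worker vectors.
worker_outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

probs_a = next_word_distribution(worker_outputs, goal=[2.0, 0.0])
probs_b = next_word_distribution(worker_outputs, goal=[0.0, 2.0])
print(probs_a.index(max(probs_a)))  # 0
print(probs_b.index(max(probs_b)))  # 1
```

In this toy setting, changing only the goal vector changes which word is most likely, which is the sense in which the latent vector guides generation.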