8,300 research outputs found

Adversarial Network Bottleneck Features for Noise Robust Speaker Verification

Authors: Guo Jun, Ma Zhanyu, Tan Zheng-Hua, Yu Hong
Publication date: 01/01/2017

In this paper, we propose a noise-robust bottleneck feature representation generated by an adversarial network (AN). The AN consists of two cascade-connected networks, an encoding network (EN) and a discriminative network (DN). Mel-frequency cepstral coefficients (MFCCs) of clean and noisy speech are used as input to the EN, and the output of the EN is used as the noise-robust feature. The EN and DN are trained in turn: when training the DN, the noise types are used as the training labels, and when training the EN, all labels are set to the same value, i.e., the clean speech label. This aims to make the AN features invariant to noise and thus achieve noise robustness. We evaluate the proposed feature on a Gaussian Mixture Model-Universal Background Model based speaker verification system and compare it with MFCC features of speech enhanced by short-time spectral amplitude minimum mean square error (STSA-MMSE) and deep neural network-based speech enhancement (DNN-SE) methods. Experimental results on the RSR2015 database show that the proposed AN bottleneck feature (AN-BN) dramatically outperforms the STSA-MMSE and DNN-SE based MFCCs for different noise types and signal-to-noise ratios. Furthermore, the AN-BN feature is able to improve the speaker verification performance under the clean condition.
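
The alternating label scheme described in this abstract can be sketched roughly as follows. This is only an illustration of how the targets differ between the two training steps, not the authors' implementation; the function names, the integer label ids, and the choice of 0 for clean speech are all assumptions made for the example.

```python
# Hypothetical label id for clean speech; the abstract does not specify an encoding.
CLEAN = 0


def dn_targets(noise_type_ids):
    """Targets for the DN training step: the true noise-type label of each frame."""
    return list(noise_type_ids)


def en_targets(noise_type_ids, clean_label=CLEAN):
    """Targets for the EN training step: every frame is labelled as clean speech."""
    return [clean_label] * len(noise_type_ids)


# Hypothetical mini-batch of per-frame noise-type ids (0 = clean, 1..3 = noise types).
batch = [0, 2, 1, 0, 3]
print(dn_targets(batch))  # DN learns to tell the noise types apart
print(en_targets(batch))  # EN is pushed to make every frame look clean to the DN
```

Training the EN against the all-clean targets is what would push its output toward noise invariance, assuming the DN is held fixed during that step.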

arXiv.org e-Print Archive

Crossref

VBN

Channel Covariance Matrix Estimation via Dimension Reduction for Hybrid MIMO MmWave Communication Systems

Authors: Guo Qinghua, Hu Rui, Tong Jun, Xi Jiangtao, Yu Yanguang
Publication date: 01/01/2019

Hybrid massive MIMO structures with lower hardware complexity and power consumption have been considered as a potential candidate for millimeter wave (mmWave) communications. Channel covariance information can be used for designing transmitter precoders, receiver combiners, channel estimators, etc. However, hybrid structures allow only a lower-dimensional signal to be observed, which makes channel covariance matrix estimation more difficult. In this paper, we formulate channel covariance estimation as a structured low-rank matrix sensing problem via Kronecker product expansion and use a low-complexity algorithm to solve it. Numerical results with uniform linear arrays (ULA) and uniform squared planar arrays (USPA) are provided to demonstrate the effectiveness of our proposed method.

arXiv.org e-Print Archive

Research Online

Matrix Completion-Based Channel Estimation for MmWave Communication Systems With Array-Inherent Impairments

Authors: Guo Qinghua, Hu Rui, Tong Jun, Xi Jiangtao, Yu Yanguang
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2018

Hybrid massive MIMO structures with reduced hardware complexity and power consumption have been widely studied as a potential candidate for millimeter wave (mmWave) communications. Channel estimators that require knowledge of the array response, such as those using compressive sensing (CS) methods, may suffer from performance degradation when array-inherent impairments introduce unknown phase and gain errors at the antenna elements. In this paper, we design matrix completion (MC)-based channel estimation schemes that are robust against array-inherent impairments. We first design an open-loop training scheme that can sample entries from the effective channel matrix randomly and is compatible with the phase shifter-based hybrid system. Leveraging the low-rank property of the effective channel matrix, we then design a channel estimator based on the generalized conditional gradient (GCG) framework and the alternating minimization (AltMin) approach. The resulting estimator is immune to array-inherent impairments and, because it does not depend on the array response, can be applied to systems with any array shape. In addition, we extend our design to sample a transformed channel matrix following the concept of inductive matrix completion (IMC); this problem can be solved efficiently using our proposed estimator and achieves similar performance with a lower requirement on the dynamic range of the transmission power per antenna. Numerical results demonstrate the advantages of our proposed MC-based channel estimators in terms of estimation performance, computational complexity and robustness against array-inherent impairments over the orthogonal matching pursuit (OMP)-based CS channel estimator. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

arXiv.org e-Print Archive

Research Online

Multipartite entanglement purification with quantum nondemolition detectors

Authors: Deng Fu-Guo, Sheng Yu-Bo, Wang Tie-Jun, Zhao Bao-Kui, Zhou Hong-Yu
Publication venue: Springer Science and Business Media LLC
Publication date: 22/08/2009

We present a scheme for multipartite entanglement purification of quantum systems in a Greenberger-Horne-Zeilinger state with quantum nondemolition detectors (QNDs). This scheme does not require controlled-not gates, which cannot be implemented perfectly with linear optical elements at present, but instead uses QNDs based on cross-Kerr nonlinearities. It works in two steps, i.e., bit-flipping error correction and phase-flipping error correction. These two steps can be iterated perfectly with parity checks and simple single-photon measurements. This scheme does not require the parties to possess sophisticated single-photon detectors. These features may make this scheme more efficient and feasible than others in practical applications. Comment: 8 pages, 5 figures

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Ranking-based Deep Cross-modal Hashing

Authors: Domeniconi Carlotta, Guo Maozu, Liu Xuanwu, Ren Yazhou, Wang Jun, Yu Guoxian
Publication date: 11/05/2019

Cross-modal hashing has been receiving increasing interest for its low storage cost and fast query speed in multi-modal data retrieval. However, most existing hashing methods are based on hand-crafted or raw-level features of objects, which may not be optimally compatible with the coding process. Moreover, these hashing methods are mainly designed to handle simple pairwise similarity. The complex multilevel ranking semantic structure of instances associated with multiple labels has not yet been well explored. In this paper, we propose a ranking-based deep cross-modal hashing approach (RDCMH). RDCMH first uses the feature and label information of data to derive a semi-supervised semantic ranking list. Next, to expand the semantic representation power of hand-crafted features, RDCMH integrates the semantic ranking information into deep cross-modal hashing and jointly optimizes the compatible parameters of deep feature representations and of hashing functions. Experiments on real multi-modal datasets show that RDCMH outperforms other competitive baselines and achieves state-of-the-art performance in cross-modal retrieval applications.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Long Text Generation via Adversarial Training with Leaked Information

Authors: Cai Han, Guo Jiaxian, Lu Sidi, Wang Jun, Yu Yong, Zhang Weinan
Publication date: 08/12/2017

Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc. Recently, by combining with policy gradient, Generative Adversarial Nets (GAN) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy have shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process. This limits its success when the generated text samples are long (more than 20 words). In this paper, we propose a new framework, called LeakGAN, to address this problem for long text generation. We allow the discriminative net to leak its own high-level extracted features to the generative net to further help the guidance. The generator incorporates such informative signals into all generation steps through an additional Manager module, which takes the extracted features of the currently generated words and outputs a latent vector to guide the Worker module for next-word generation. Our extensive experiments on synthetic data and various real-world tasks with Turing test demonstrate that LeakGAN is highly effective in long text generation and also improves the performance in short text generation scenarios. More importantly, without any supervision, LeakGAN would be able to implicitly learn sentence structures only through the interaction between Manager and Worker. Comment: 14 pages, AAAI 201

arXiv.org e-Print Archive

UCL Discovery