Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses
This paper presents a novel speech phase prediction model which predicts
wrapped phase spectra directly from amplitude spectra by neural networks. The
proposed model is a cascade of a residual convolutional network and a parallel
estimation architecture. The parallel estimation architecture is composed of
two parallel linear convolutional layers and a phase calculation formula,
imitating the process of calculating the phase spectra from the real and
imaginary parts of complex spectra and strictly restricting the predicted phase
values to the principal value interval. To avoid the error expansion issue
caused by phase wrapping, we design anti-wrapping training losses defined
between the predicted wrapped phase spectra and natural ones by activating the
instantaneous phase error, group delay error and instantaneous angular
frequency error using an anti-wrapping function. Experimental results show that
our proposed neural speech phase prediction model outperforms the iterative
Griffin-Lim algorithm and other neural network-based methods in terms of both
reconstructed speech quality and generation speed.
Comment: Accepted by ICASSP 2023. Codes are availabl
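The phase calculation step and the anti-wrapping losses described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not give the anti-wrapping function, so the form |x - 2*pi*round(x / (2*pi))| is an assumption, as are all function names, the use of plain first-order differences along the frequency and time axes for the group delay and instantaneous angular frequency terms, and the equal treatment of the three losses.

```python
import math

TWO_PI = 2.0 * math.pi

def parallel_phase(real_part, imag_part):
    """Phase from the two parallel outputs, treated as pseudo-real and
    pseudo-imaginary parts; atan2 keeps the result in [-pi, pi]."""
    return math.atan2(imag_part, real_part)

def anti_wrap(x):
    """Assumed anti-wrapping function: distance from x to the nearest
    multiple of 2*pi, so errors that differ by full turns count as zero."""
    return abs(x - TWO_PI * round(x / TWO_PI))

def diff_freq(phase):
    """First-order difference along the frequency axis of phase[f][t]."""
    return [[phase[f + 1][t] - phase[f][t] for t in range(len(phase[0]))]
            for f in range(len(phase) - 1)]

def diff_time(phase):
    """First-order difference along the time (frame) axis of phase[f][t]."""
    return [[row[t + 1] - row[t] for t in range(len(row) - 1)] for row in phase]

def mean_anti_wrapped_error(a, b):
    """Mean of anti_wrap(a - b) over all entries of two equal-shaped grids."""
    errs = [anti_wrap(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(errs) / len(errs)

def anti_wrapping_losses(pred, target):
    """Return (instantaneous phase, group delay, instantaneous angular
    frequency) losses between predicted and natural wrapped phase spectra."""
    ip = mean_anti_wrapped_error(pred, target)
    gd = mean_anti_wrapped_error(diff_freq(pred), diff_freq(target))
    iaf = mean_anti_wrapped_error(diff_time(pred), diff_time(target))
    return ip, gd, iaf
```

With this choice, a prediction that differs from the natural phase by an exact multiple of 2*pi incurs (numerically) zero loss, which is the behaviour the abstract attributes to avoiding error expansion from phase wrapping.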
Secure Grouping Protocol Using a Deck of Cards
We consider a problem, which we call secure grouping, of dividing a number of
parties into some subsets (groups) in the following manner: Each party has to
know the other members of his/her group, while he/she may not know anything
about how the remaining parties are divided (except for certain public
predetermined constraints, such as the number of parties in each group). In
this paper, we construct an information-theoretically secure protocol using a
deck of physical cards to solve the problem, which is jointly executable by the
parties themselves without a trusted third party. Despite the non-triviality
and the potential usefulness of the secure grouping, our proposed protocol is
fairly simple to describe and execute. Our protocol is based on algebraic
properties of conjugate permutations. A key ingredient of our protocol is a set
of new techniques for applying multiplication and inverse operations to hidden
permutations (i.e., those encoded by using face-down cards), which may be of
independent interest and may have various other applications.
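The algebraic fact the abstract relies on, that conjugating a permutation preserves its cycle type, can be illustrated with a short sketch. This is not the card protocol itself and offers no security: the permutations here are in the clear, and the reading of cycles as groups, the example sizes, and every function name are assumptions made for illustration only.

```python
import random

def compose(a, b):
    """Return the permutation a∘b, i.e. i -> a[b[i]]."""
    return [a[b[i]] for i in range(len(a))]

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return inv

def conjugate(p, s):
    """Return s∘p∘s^{-1}, the conjugate of p by s."""
    return compose(s, compose(p, inverse(s)))

def cycles(p):
    """Decompose p into its cycles, each listed as the elements it visits."""
    seen, out = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        out.append(cyc)
    return out

def cycle_type(p):
    """Sorted list of cycle lengths; invariant under conjugation."""
    return sorted(len(c) for c in cycles(p))

# Public constraint (assumed example): five parties, groups of sizes 3 and 2.
base = [1, 2, 0, 4, 3]             # cycles (0 1 2)(3 4)
shuffle = list(range(5))
random.Random(0).shuffle(shuffle)  # stands in for a hidden random permutation
grouping = conjugate(base, shuffle)
```

Because the cycles of `grouping` are exactly the cycles of `base` relabelled by `shuffle`, the group sizes stay fixed while membership is randomized; the protocol described in the abstract performs such multiplication and inverse operations on face-down cards instead.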
Explaining noise trader risk: evidence from Chinese stock market
We test for noise trader risk in the Chinese stock market through the interaction between noise traders and information traders by applying the Information-Adjusted Noise Model. Information traders tend to underreact, overreact or increase information pricing error (IPE effects) on the stock market. Consequently, information traders in China drive prices away from their fundamental level rather than correcting the price error. We test our model using data from the Shenzhen Stock Exchange and present evidence that the market is informationally inefficient. The most common violations of information efficiency are overreaction and information pricing error. Liquidity, profitability, size, leverage, capital expenditure, price-to-book ratio, financial crisis, seasonality and market sentiment are used to explain noise trader risk. The existing literature confirms the existence of noise trader risk in developed markets like the US and Australia, but little has been done in emerging markets. The literature suspects the existence of noise trader risk in China, but no ‘direct and appropriate methodology’ has been used to quantify this risk. Furthermore, there is a relatively thin (almost nonexistent) literature on the interaction between noise traders and information traders in the Chinese stock market. The main contributions of this paper are: (1) we show an interaction between noise traders and information traders in terms of overreaction, underreaction and information pricing errors, (2) we show whether there are opportunities to profit from these noise traders, (3) we explain the causes of noise trader risk and (4) we create a database for information arrival in China.
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
Speech bandwidth extension (BWE) refers to widening the frequency bandwidth
range of speech signals so that the speech sounds brighter and
fuller. This paper proposes a generative adversarial network (GAN) based BWE
model with parallel prediction of Amplitude and Phase spectra, named AP-BWE,
which achieves both high-quality and efficient wideband speech waveform
generation. The proposed AP-BWE generator is entirely based on convolutional
neural networks (CNNs). It features a dual-stream architecture with mutual
interaction, where the amplitude stream and the phase stream communicate with
each other and respectively extend the high-frequency components from the input
narrowband amplitude and phase spectra. To improve the naturalness of the
extended speech signals, we employ a multi-period discriminator at the waveform
level and design a pair of multi-resolution amplitude and phase discriminators
at the spectral level, respectively. Experimental results demonstrate that our
proposed AP-BWE achieves state-of-the-art performance in terms of speech
quality for BWE tasks targeting sampling rates of both 16 kHz and 48 kHz. In
terms of generation efficiency, due to the all-convolutional architecture and
all-frame-level operations, the proposed AP-BWE can generate 48 kHz waveform
samples 292.3 times faster than real-time on a single RTX 4090 GPU and 18.1
times faster than real-time on a single CPU. Notably, to our knowledge, AP-BWE
is the first to achieve the direct extension of the high-frequency phase
spectrum, which is beneficial for improving the effectiveness of existing BWE
methods.
Comment: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language
Processin
Automotive Go Global: the internationalization of Chinese automakers through foreign direct investment (2001-2020)
The main aim of this study is to explain the international expansion of Chinese automakers through FDI between 2001 and 2020. To this end, we consider the historical development of the Chinese automotive industry, the quantitative evolution of its investments, the motives behind its overseas operations, and the geographical destination of the invested capital. The starting point of the study was the tracking, recording and systematization of completed FDI operations; we also drew on press articles, information provided by the companies, and secondary literature.
Instituto de Relaciones Internacionale
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
Phase information has a significant impact on speech perceptual quality and
intelligibility. However, existing speech enhancement methods encounter
limitations in explicit phase estimation due to the non-structural nature and
wrapping characteristics of the phase, leading to a bottleneck in enhanced
speech quality. To overcome this issue, we propose
MP-SENet, a novel Speech Enhancement Network which explicitly enhances
Magnitude and Phase spectra in parallel. The proposed MP-SENet adopts a codec
architecture in which the encoder and decoder are bridged by time-frequency
Transformers along both time and frequency dimensions. The encoder aims to
encode time-frequency representations derived from the input distorted
magnitude and phase spectra. The decoder comprises dual-stream magnitude and
phase decoders, directly enhancing magnitude and wrapped phase spectra by
incorporating a magnitude estimation architecture and a phase parallel
estimation architecture, respectively. To train the MP-SENet model effectively,
we define multi-level loss functions, including mean square error and
perceptual metric loss of magnitude spectra, anti-wrapping loss of phase
spectra, as well as mean square error and consistency loss of short-time
complex spectra. Experimental results demonstrate that our proposed MP-SENet
excels in high-quality speech enhancement across multiple tasks, including
speech denoising, dereverberation, and bandwidth extension. Compared to
existing phase-aware speech enhancement methods, it successfully avoids the
bidirectional compensation effect between the magnitude and phase, leading to a
better harmonic restoration. Notably, for the speech denoising task, the
MP-SENet yields a state-of-the-art performance with a PESQ of 3.60 on the
public VoiceBank+DEMAND dataset.
Comment: Submitted to IEEE Transactions on Audio, Speech and Language
Processin