50 research outputs found

    Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses

    This paper presents a novel speech phase prediction model which predicts wrapped phase spectra directly from amplitude spectra by neural networks. The proposed model is a cascade of a residual convolutional network and a parallel estimation architecture. The parallel estimation architecture is composed of two parallel linear convolutional layers and a phase calculation formula, imitating the process of calculating the phase spectra from the real and imaginary parts of complex spectra and strictly restricting the predicted phase values to the principal value interval. To avoid the error expansion issue caused by phase wrapping, we design anti-wrapping training losses defined between the predicted wrapped phase spectra and natural ones by activating the instantaneous phase error, group delay error and instantaneous angular frequency error using an anti-wrapping function. Experimental results show that our proposed neural speech phase prediction model outperforms the iterative Griffin-Lim algorithm and other neural network-based methods in terms of both reconstructed speech quality and generation speed. Comment: Accepted by ICASSP 2023. Codes are availabl
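The two ideas in this abstract (computing a phase from a real and an imaginary part so that it stays in the principal value interval, and passing phase errors through an anti-wrapping function) can be illustrated with a minimal sketch. The function names below are hypothetical, and the specific anti-wrapping form used here (distance to the nearest multiple of 2π) is an assumption, since the abstract does not state the formula. Only the instantaneous phase term is shown; the group delay and instantaneous angular frequency terms would apply the same function to differences along the frequency and time axes.

```python
import math

def wrapped_phase(real: float, imag: float) -> float:
    """Phase of real + j*imag; atan2 keeps the result within [-pi, pi]."""
    return math.atan2(imag, real)

def anti_wrap(x: float) -> float:
    """Assumed anti-wrapping function: distance from x to the nearest multiple of 2*pi."""
    return abs(x - 2.0 * math.pi * round(x / (2.0 * math.pi)))

def instantaneous_phase_loss(pred, target):
    """Mean anti-wrapped error between two equal-length sequences of phases."""
    return sum(anti_wrap(p - t) for p, t in zip(pred, target)) / len(pred)

# Phases 3.1 and -3.1 are numerically far apart (6.2 rad) but lie close
# together on the unit circle; the anti-wrapped error reflects the latter.
raw_error = abs(3.1 - (-3.1))
aw_error = instantaneous_phase_loss([3.1], [-3.1])
```

Without the anti-wrapping step, a plain absolute or squared error would penalise the pair above heavily even though the two phases are nearly identical, which is presumably the "error expansion issue" the abstract refers to.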

    Secure Grouping Protocol Using a Deck of Cards

    We consider a problem, which we call secure grouping, of dividing a number of parties into subsets (groups) in the following manner: each party has to know the other members of his/her group, while he/she may not learn anything about how the remaining parties are divided (except for certain public predetermined constraints, such as the number of parties in each group). In this paper, we construct an information-theoretically secure protocol using a deck of physical cards to solve the problem, which is jointly executable by the parties themselves without a trusted third party. Despite the non-triviality and the potential usefulness of secure grouping, our proposed protocol is fairly simple to describe and execute. Our protocol is based on algebraic properties of conjugate permutations. A key ingredient of our protocol is a set of new techniques for applying multiplication and inverse operations to hidden permutations (i.e., those encoded by face-down cards), which would be of independent interest and would have various potential applications.
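The abstract does not say which properties of conjugate permutations the protocol uses. One standard property, shown here only as a plausible illustration, is that conjugation preserves cycle type: if tau has cycles of the publicly agreed group sizes, then sigma∘tau∘sigma⁻¹ has cycles of the same sizes for any sigma, so its cycles could serve as groups. The sketch below works on plain, visible lists and does not model face-down cards or any security property; all function names are hypothetical.

```python
import random

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return [p[q[i]] for i in range(len(p))]

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return inv

def cycles(p):
    """Decompose permutation p into a list of cycles."""
    seen, out = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        out.append(cyc)
    return out

def conjugate(sigma, tau):
    """Return sigma∘tau∘sigma^{-1}."""
    return compose(sigma, compose(tau, inverse(sigma)))

# tau has one 3-cycle and one 2-cycle: (0 1 2)(3 4).
tau = [1, 2, 0, 4, 3]
sigma = list(range(5))
random.Random(0).shuffle(sigma)   # an arbitrary relabelling of the parties
groups = cycles(conjugate(sigma, tau))
group_sizes = sorted(len(g) for g in groups)
```

Whatever sigma is drawn, `group_sizes` stays equal to the cycle sizes of `tau`, while the membership of each group depends on sigma.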

    The prospects of development of the car carrier industry in China


    Explaining noise trader risk: evidence from Chinese stock market

    We test for noise trader risk in the Chinese stock market through the interaction between noise traders and information traders by applying the Information-Adjusted Noise Model. Information traders tend to underreact, overreact or increase information pricing error (IPE effects) on the stock market. Consequently, information traders in China drive prices away from the fundamental level rather than correcting the pricing error. We test our model using data from the Shenzhen Stock Exchange and present evidence that the market is informationally inefficient. The most common violations of information efficiency are overreaction and information pricing error. Liquidity, profitability, size, leverage, capital expenditure, price-to-book ratio, financial crisis, seasonality and market sentiment are used to explain noise trader risk. The existing literature confirms the existence of noise trader risk in developed markets like the US and Australia, but little has been done in emerging markets. The literature suspects the existence of noise trader risk in China, but no ‘direct and appropriate methodology’ has been used to quantify this risk. Furthermore, there is a relatively thin (almost nonexistent) literature on the interaction between noise traders and information traders in the Chinese stock market. The main contributions of this paper are that (1) we show an interaction between noise traders and information traders in terms of overreaction, underreaction and information pricing errors, (2) we show whether there are opportunities to profit from these noise traders, (3) we explain the causes of noise trader risk and (4) we create a database for information arrival in China.

    Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction

    Speech bandwidth extension (BWE) refers to widening the frequency bandwidth range of speech signals, making the speech sound brighter and fuller. This paper proposes a generative adversarial network (GAN) based BWE model with parallel prediction of Amplitude and Phase spectra, named AP-BWE, which achieves both high-quality and efficient wideband speech waveform generation. The proposed AP-BWE generator is entirely based on convolutional neural networks (CNNs). It features a dual-stream architecture with mutual interaction, where the amplitude stream and the phase stream communicate with each other and respectively extend the high-frequency components from the input narrowband amplitude and phase spectra. To improve the naturalness of the extended speech signals, we employ a multi-period discriminator at the waveform level and design a pair of multi-resolution amplitude and phase discriminators at the spectral level. Experimental results demonstrate that our proposed AP-BWE achieves state-of-the-art performance in terms of speech quality for BWE tasks targeting sampling rates of both 16 kHz and 48 kHz. In terms of generation efficiency, due to the all-convolutional architecture and all-frame-level operations, the proposed AP-BWE can generate 48 kHz waveform samples 292.3 times faster than real-time on a single RTX 4090 GPU and 18.1 times faster than real-time on a single CPU. Notably, to our knowledge, AP-BWE is the first to achieve the direct extension of the high-frequency phase spectrum, which is beneficial for improving the effectiveness of existing BWE methods. Comment: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processin

    Automotive Go Global: the internationalization of Chinese automakers through foreign direct investment (2001-2020)

    The main aim of this study is to explain the international expansion of Chinese automakers through FDI between 2001 and 2020. To this end, we consider the historical development of the Chinese automotive industry, the quantitative evolution of the investments, the motives behind the overseas operations, and the geographical destination of the capital invested. The starting point of the study was the tracking, recording and systematization of completed FDI; we also drew on press articles, information provided by the companies, and secondary literature. Instituto de Relaciones Internacionale

    Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement

    Phase information has a significant impact on speech perceptual quality and intelligibility. However, existing speech enhancement methods encounter limitations in explicit phase estimation due to the non-structural nature and wrapping characteristics of the phase, leading to a bottleneck in enhanced speech quality. To overcome this issue, we propose MP-SENet, a novel Speech Enhancement Network which explicitly enhances Magnitude and Phase spectra in parallel. The proposed MP-SENet adopts a codec architecture in which the encoder and decoder are bridged by time-frequency Transformers along both time and frequency dimensions. The encoder aims to encode time-frequency representations derived from the input distorted magnitude and phase spectra. The decoder comprises dual-stream magnitude and phase decoders, directly enhancing magnitude and wrapped phase spectra by incorporating a magnitude estimation architecture and a phase parallel estimation architecture, respectively. To train the MP-SENet model effectively, we define multi-level loss functions, including mean square error and perceptual metric loss of magnitude spectra, anti-wrapping loss of phase spectra, as well as mean square error and consistency loss of short-time complex spectra. Experimental results demonstrate that our proposed MP-SENet excels in high-quality speech enhancement across multiple tasks, including speech denoising, dereverberation, and bandwidth extension. Compared to existing phase-aware speech enhancement methods, it successfully avoids the bidirectional compensation effect between the magnitude and phase, leading to better harmonic restoration. Notably, for the speech denoising task, MP-SENet achieves state-of-the-art performance with a PESQ of 3.60 on the public VoiceBank+DEMAND dataset. Comment: Submitted to IEEE Transactions on Audio, Speech and Language Processin