DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation
Previous work optimizes only the parameter weights of ASR models under a
fixed-topology architecture. However, the design of a successful model
architecture has always relied on human experience and intuition, and many
architecture-related hyperparameters must be tuned manually.
Therefore, in this paper we propose DARTS-ASR, an ASR approach with efficient
gradient-based architecture search. To examine the generalizability of
DARTS-ASR, we apply our approach to monolingual ASR in many languages
and to a multilingual ASR setting. Following
previous work, we conducted experiments on the multilingual dataset IARPA
BABEL. The experimental results show that our approach outperformed the baseline
fixed-topology architecture, with relative character error rate reductions of
10.2% and 10.0% under the monolingual and multilingual ASR settings, respectively.
Furthermore, we analyze the architectures found by
DARTS-ASR.
Comment: Accepted at INTERSPEECH 202
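The gradient-based architecture search mentioned in the abstract above rests on a continuous relaxation: instead of picking one operation per connection, the outputs of all candidate operations are blended with softmax weights over learnable architecture parameters. The following is a minimal sketch of that forward pass only, in plain Python. The candidate operations, the names `CANDIDATE_OPS`, `mixed_op`, and `discretize`, and the 1-D input are illustrative assumptions, not the search space or code used by DARTS-ASR.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations on a 1-D feature sequence.
def identity(x):
    return list(x)


def avg_pool3(x):
    # Moving average over a window of up to 3 neighbours (truncated at edges).
    return [sum(x[max(0, i - 1): i + 2]) / len(x[max(0, i - 1): i + 2])
            for i in range(len(x))]


def max_pool3(x):
    # Moving maximum over a window of up to 3 neighbours (truncated at edges).
    return [max(x[max(0, i - 1): i + 2]) for i in range(len(x))]


CANDIDATE_OPS = {
    "identity": identity,
    "avg_pool3": avg_pool3,
    "max_pool3": max_pool3,
}


def mixed_op(x, alphas):
    """Softmax-weighted sum of all candidate operations applied to x.

    `alphas` holds one architecture parameter per candidate operation.
    In a real search these would be updated by gradient descent; this
    sketch only evaluates the relaxed (mixed) operation.
    """
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))]


def discretize(alphas):
    """After the search, keep the operation with the largest parameter."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]
```

With all architecture parameters equal, `mixed_op` is a plain average of the three candidate outputs; as one parameter grows during the search, the mixture approaches that single operation, which is what `discretize` finally selects.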
Multi-objective particle swarm optimization algorithm for multi-step electric load forecasting
As energy saving becomes increasingly popular, electric load forecasting has played an increasingly crucial role in power management systems in the last few years. Because of the real-time nature of electricity and the uncertain changes in electric load, achieving accurate and stable electric load forecasts is a challenging task. Many earlier studies have obtained the expected forecasting results by various methods. Considering the stability of time series prediction, a novel combined electric load forecasting model, based on an extreme learning machine (ELM), a recurrent neural network (RNN), and support vector machines (SVMs), is proposed. The combined model first uses the three individual models to forecast the electric load data separately. Because any single model has inevitable disadvantages, the combined model then applies the multi-objective particle swarm optimization algorithm (MOPSO) to optimize the parameters. To verify the capacity of the proposed combined model, 1-step, 2-step, and 3-step forecasts are made on the electric load data of three Australian states: New South Wales, Queensland, and Victoria. The experimental results indicate that, for these three datasets, the combined model outperforms all three individual models used for comparison, which demonstrates its superior accuracy and stability.
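The idea of combining several forecasts and judging the result on both accuracy and stability can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which parameters MOPSO optimizes or how the two objectives are defined, so the weighted sum in `combine`, the use of mean absolute error for accuracy, and the standard deviation of errors for stability are all hypothetical choices. The `dominates` function is the standard Pareto-dominance test that a multi-objective optimizer uses to compare candidates; the particle swarm search itself is not shown.

```python
from statistics import mean, pstdev


def combine(forecasts, weights):
    """Weighted sum of per-model forecasts.

    `forecasts` is a list with one forecast series per model;
    `weights` holds one (assumed) combination weight per model.
    """
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(horizon)]


def objectives(actual, predicted):
    """Return (accuracy, stability); smaller is better for both.

    Assumed definitions: accuracy is the mean absolute error,
    stability is the population standard deviation of the errors.
    """
    errors = [p - a for p, a in zip(predicted, actual)]
    accuracy = mean([abs(e) for e in errors])
    stability = pstdev(errors)
    return accuracy, stability


def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better
```

A candidate weight vector would be scored with `objectives(actual, combine(forecasts, weights))`, and the optimizer would keep only the candidates that no other candidate dominates.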
Revisiting the problem of audio-based hit song prediction using convolutional neural networks
Being able to predict whether a song can be a hit has important
applications in the music industry. Although it is true that the popularity of
a song can be greatly affected by external factors such as social and
commercial influences, the degree to which audio features computed from musical
signals (which we regard as internal factors) can predict song popularity is an
interesting research question on its own. Motivated by the recent success of
deep learning techniques, we attempt to extend previous work on hit song
prediction by jointly learning the audio features and prediction models using
deep learning. Specifically, we experiment with a convolutional neural
network model that takes the primitive mel-spectrogram as the input for feature
learning, a more advanced JYnet model that uses an external song dataset for
supervised pre-training and auto-tagging, and the combination of these two
models. We also consider the inception model to characterize audio
information at different scales. Our experiments suggest that deep structures are
indeed more accurate than shallow structures in predicting the popularity of
either Chinese or Western Pop songs in Taiwan. We also use the tags predicted
by JYnet to gain insights into the results of the different models.
Comment: To appear in the proceedings of 2017 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP)