1,436 research outputs found
Audio style transfer
'Style transfer' among images has recently emerged as a very active research
topic, fuelled by the power of convolutional neural networks (CNNs), and has
fast become a very popular technology in social media. This paper investigates
the analogous problem in the audio domain: how can the style of a
reference audio signal be transferred to target audio content? We propose a flexible
framework for the task, which uses a sound texture model to extract statistics
characterizing the reference audio style, followed by an optimization-based
audio texture synthesis to modify the target content. In contrast to mainstream
optimization-based visual transfer methods, the proposed process is initialized
by the target content instead of random noise and the optimized loss is only
about texture, not structure. These differences proved key for audio style
transfer in our experiments. In order to extract features of interest, we
investigate different architectures, whether pre-trained on other tasks, as
done in image style transfer, or engineered based on the human auditory system.
Experimental results on different types of audio signal confirm the potential
of the proposed approach.
Comment: ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech
and Signal Processing (ICASSP), Apr 2018, Calgary, Canada. IEE
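To make the idea of a texture-only loss concrete, here is a minimal sketch in plain Python. It assumes the style statistics are Gram matrices (time-averaged channel correlations), as is common in image style transfer; the abstract does not specify which statistics its sound texture model uses, so `gram_matrix` and `texture_loss` are hypothetical names for illustration only.

```python
def gram_matrix(features):
    """Time-averaged correlation between every pair of feature channels.

    `features` is a list of channels, each a list of activations over time.
    ASSUMPTION: Gram statistics stand in for the paper's texture statistics.
    """
    n_frames = len(features[0])
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) / n_frames for ch_j in features]
        for ch_i in features
    ]


def texture_loss(candidate_features, reference_features):
    """Sum of squared differences between the two Gram matrices.

    Only texture statistics are compared; there is no content (structure) term.
    """
    g_cand = gram_matrix(candidate_features)
    g_ref = gram_matrix(reference_features)
    return sum(
        (c - r) ** 2
        for row_c, row_r in zip(g_cand, g_ref)
        for c, r in zip(row_c, row_r)
    )
```

Because the statistics are averaged over time, reordering the frames of a signal leaves this loss unchanged. That may be one reason why starting the optimization from the target content, rather than from noise, matters for keeping the content's temporal structure.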
A Convolutional Neural Network model based on Neutrosophy for Noisy Speech Recognition
Convolutional neural networks are sensitive to unknown noise conditions in the
test phase, so their performance degrades on noisy data classification tasks,
including noisy speech recognition. In this research, a new convolutional
neural network (CNN) model with data uncertainty handling, referred to as NCNN
(Neutrosophic Convolutional Neural Network), is proposed for classification
tasks. Here, speech signals are used as input data and their noise is modeled as
uncertainty. Using the speech spectrogram, a definition of
uncertainty is proposed in the neutrosophic (NS) domain. Uncertainty is computed
for each time-frequency point of the speech spectrogram, treated like a pixel. Therefore,
an uncertainty matrix of the same size as the spectrogram is created in the NS domain.
In the next step, a CNN classification model with two parallel paths is proposed.
The speech spectrogram is the input to the first path and the uncertainty matrix
is the input to the second. The outputs of the two paths are combined to compute the final
output of the classifier. To show the effectiveness of the proposed method, it
is compared with a conventional CNN on the isolated words of the Aurora2
dataset. The proposed method achieves an average accuracy of 85.96 with noisy
training data. It is more robust against Car, Airport and Subway noises, with
accuracies of 90, 88 and 81 in test sets A, B and C, respectively. Results show
that the proposed method outperforms the conventional CNN by
6, 5 and 2 percent in test sets A, B and C, respectively.
This means that the proposed method is more robust against noisy data and handles
such data effectively.
Comment: International conference on Pattern Recognition and Image Analysis
(IPRIA 2019
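The step of combining the two paths can be pictured with a small sketch. The abstract does not say how the outputs are combined, so the weighted average of softmax probabilities below is purely an assumption, and `fuse_paths` and `weight` are hypothetical names.

```python
import math


def softmax(scores):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_paths(spectrogram_scores, uncertainty_scores, weight=0.5):
    """Blend the class probabilities of the two paths.

    ASSUMPTION: a weighted average of softmax outputs; the NCNN paper may
    combine its spectrogram path and uncertainty path differently.
    """
    p_spec = softmax(spectrogram_scores)
    p_unc = softmax(uncertainty_scores)
    return [weight * a + (1 - weight) * b for a, b in zip(p_spec, p_unc)]
```

With `weight=1.0` the result reduces to the spectrogram path alone, which gives a simple baseline for judging what the uncertainty path contributes.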
Deep Learning based Recommender System: A Survey and New Perspectives
With the ever-growing volume of online information, recommender systems have
been an effective strategy to overcome such information overload. The utility
of recommender systems cannot be overstated, given their widespread adoption in
many web applications, along with their potential to ameliorate many
problems related to over-choice. In recent years, deep learning has garnered
considerable interest in many research fields such as computer vision and
natural language processing, owing not only to stellar performance but also the
attractive property of learning feature representations from scratch. The
influence of deep learning is also pervasive, recently demonstrating its
effectiveness when applied to information retrieval and recommender systems
research. Evidently, the field of deep learning in recommender systems is
flourishing. This article aims to provide a comprehensive review of recent
research efforts on deep learning based recommender systems. More concretely,
we devise a taxonomy of deep learning based recommendation models
and provide a comprehensive summary of the state of the art. Finally,
we expand on current trends and provide new perspectives pertaining to this
exciting new development of the field.
Comment: The paper has been accepted by ACM Computing Surveys.
https://doi.acm.org/10.1145/328502
Pointwise Convolutional Neural Networks
Deep learning with 3D data such as reconstructed point clouds and CAD models
has attracted great research interest recently. However, the capability of
using point clouds with convolutional neural networks has so far not been fully
explored. In this paper, we present a convolutional neural network for semantic
segmentation and object recognition with 3D point clouds. At the core of our
network is pointwise convolution, a new convolution operator that can be
applied at each point of a point cloud. Our fully convolutional network design,
while being surprisingly simple to implement, can yield competitive accuracy in
both semantic segmentation and object recognition tasks.
Comment: 10 pages, 6 figures, 10 tables. Paper accepted to CVPR 201
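As a rough illustration of a convolution applied at each point, the sketch below averages the features of all points within a radius of the centre point and scales the result by a single weight. This is a simplified assumption with one kernel cell and scalar features; the abstract does not describe the operator's kernel layout, and `pointwise_conv`, `radius` and `weight` are hypothetical names.

```python
import math


def pointwise_conv(points, features, radius, weight):
    """Apply a one-cell neighbourhood convolution at every point.

    points   : list of (x, y, z) tuples
    features : one scalar feature per point
    ASSUMPTION: a single spherical kernel cell with one weight; a practical
    operator would likely use several cells, each with its own weight.
    """
    output = []
    for centre in points:
        # Every point is its own neighbour, so this list is never empty.
        neighbours = [
            f for q, f in zip(points, features) if math.dist(centre, q) <= radius
        ]
        output.append(weight * sum(neighbours) / len(neighbours))
    return output
```

The output has one value per input point, so layers of this kind can be stacked without changing the number of points, which suits per-point labelling such as semantic segmentation.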