TasNet: time-domain audio separation network for real-time, single-channel speech separation
Robust speech processing in multi-talker environments requires effective
speech separation. Recent deep learning systems have made significant progress
toward solving this problem, yet it remains challenging, particularly in
real-time, short-latency applications. Most methods attempt to construct a mask
for each source in a time-frequency representation of the mixture signal, which is
not necessarily an optimal representation for speech separation. In addition,
time-frequency decomposition introduces inherent problems such as
phase/magnitude decoupling and the long time window required to achieve
sufficient frequency resolution. We propose Time-domain Audio Separation
Network (TasNet) to overcome these limitations. We directly model the signal in
the time-domain using an encoder-decoder framework and perform the source
separation on nonnegative encoder outputs. This method removes the frequency
decomposition step and reduces the separation problem to estimating source
masks on the encoder outputs, which are then synthesized by the decoder. Our system
outperforms the current state-of-the-art causal and noncausal speech separation
algorithms, reduces the computational cost of speech separation, and
significantly reduces the minimum required latency of the output. This makes
TasNet suitable for applications where low-power, real-time implementation is
desirable, such as in hearable and telecommunication devices.
Comment: Camera ready version for ICASSP 2018, Calgary, Canada
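The pipeline described in the TasNet abstract above (encode a waveform segment into nonnegative weights, mask those weights per source, decode each masked copy) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the ReLU used for nonnegativity, the softmax across sources, the random matrices, and the names `encode`, `softmax_masks`, `decode`, and `separate` are all hypothetical. In a trained system the mask scores would come from a separation network rather than a random generator.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def encode(segment, basis):
    """Map a waveform segment to nonnegative weights (ReLU is an assumption)."""
    return [max(0.0, w) for w in matvec(basis, segment)]


def softmax_masks(scores):
    """Turn per-source scores into masks that sum to 1 for each encoder output."""
    n_sources, n_basis = len(scores), len(scores[0])
    masks = [[0.0] * n_basis for _ in range(n_sources)]
    for j in range(n_basis):
        exps = [math.exp(scores[s][j]) for s in range(n_sources)]
        total = sum(exps)
        for s in range(n_sources):
            masks[s][j] = exps[s] / total
    return masks


def decode(weights, decoder):
    """Synthesize a segment as a weighted sum of decoder basis signals."""
    length = len(decoder[0])
    return [sum(w * row[t] for w, row in zip(weights, decoder))
            for t in range(length)]


def separate(segment, basis, decoder, scores):
    """Encode once, apply one mask per source, and decode each masked copy."""
    weights = encode(segment, basis)
    masks = softmax_masks(scores)
    return [decode([m * w for m, w in zip(mask, weights)], decoder)
            for mask in masks]


# Hypothetical sizes and random placeholders standing in for learned parameters.
rng = random.Random(0)
SEG_LEN, N_BASIS, N_SOURCES = 8, 16, 2
basis = [[rng.gauss(0, 1) for _ in range(SEG_LEN)] for _ in range(N_BASIS)]
decoder = [[rng.gauss(0, 1) for _ in range(SEG_LEN)] for _ in range(N_BASIS)]
scores = [[rng.gauss(0, 1) for _ in range(N_BASIS)] for _ in range(N_SOURCES)]
segment = [math.sin(0.7 * t) + 0.5 * math.sin(2.1 * t) for t in range(SEG_LEN)]

sources = separate(segment, basis, decoder, scores)
```

Because the masks sum to one for every encoder output and the decoder is linear, the estimated sources in this sketch add up to the decoded mixture. That property follows from the softmax assumption made here, not from anything stated in the abstract.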
Unsupervised Deep Transfer Feature Learning for Medical Image Classification
The accuracy and robustness of image classification with supervised deep
learning are dependent on the availability of large-scale, annotated training
data. However, there is a paucity of annotated data available due to the
complexity of manual annotation. To overcome this problem, a popular approach
is to use transferable knowledge across different domains by: 1) using a
generic feature extractor that has been pre-trained on large-scale general
images (i.e., transfer-learned) but which is not suited to capture characteristics
from medical images; or 2) fine-tuning generic knowledge with a relatively
smaller number of annotated images. Our aim is to reduce the reliance on
annotated training data by using a new hierarchical unsupervised feature
extractor with a convolutional auto-encoder placed atop a pre-trained
convolutional neural network. Our approach constrains the rich and generic
image features from the pre-trained domain to a sophisticated representation of
the local image characteristics from the unannotated medical image domain. Our
approach has a higher classification accuracy than transfer-learned approaches
and is competitive with state-of-the-art supervised fine-tuned methods.
Comment: 4 pages, 1 figure, 3 tables, Accepted (Oral) as IEEE International
Symposium on Biomedical Imaging 201
Underwater target recognition method based on t-SNE and stacked nonnegative constrained denoising autoencoder
1822-1832
Underwater target recognition is a difficult task because of the specific attributes of underwater target radiated noise, such as its low signal-to-noise ratio. In this paper, an input data optimization method and a recognition model were studied. The underwater target radiated noise spectrum was chosen as the original feature. The t-distributed stochastic neighbor embedding (t-SNE) algorithm was used to reduce the dimensionality of the original spectrum segments, which were divided by frequency. The optimal features can be obtained by analyzing their separability. Then the stacked nonnegative constrained denoising autoencoder (SNDAE) model was established to recognize the optimal features. The experimental signal spectra were processed by the above methods. The results show that the recognition accuracy of SNDAE is higher than that of the other methods compared. Moreover, the frequency band of the input with the highest recognition accuracy is approximately the same as the band with the best separability under t-SNE, indicating that the above method can improve recognition accuracy and efficiency.
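The two ingredients named in "nonnegative constrained denoising autoencoder" can be illustrated in isolation. The abstract does not say how either one is implemented, so the sketch below assumes masking noise for the denoising part and a quadratic penalty on negative weights for the constraint. The function names, the random stand-in spectrum, and all hyperparameters are hypothetical.

```python
import random


def corrupt(spectrum, drop_prob, rng):
    """Masking noise (an assumption): zero each input with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in spectrum]


def negativity_penalty(weights, alpha):
    """Quadratic penalty on negative weights; zero when no weight is negative."""
    return 0.5 * alpha * sum(min(0.0, w) ** 2 for row in weights for w in row)


def penalty_step(weights, alpha, lr):
    """One gradient step on the penalty alone; its gradient is alpha * min(0, w)."""
    return [[w - lr * alpha * min(0.0, w) for w in row] for row in weights]


rng = random.Random(0)
spectrum = [rng.random() for _ in range(10)]   # stand-in for one spectrum segment
noisy = corrupt(spectrum, drop_prob=0.3, rng=rng)

weights = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(4)]
original = [row[:] for row in weights]

before = negativity_penalty(weights, alpha=1.0)
for _ in range(50):
    weights = penalty_step(weights, alpha=1.0, lr=0.5)
after = negativity_penalty(weights, alpha=1.0)
```

In actual training the penalty would be added to a reconstruction loss computed from `noisy` against the clean `spectrum`. It is stepped on its own here only to show that it shrinks negative weights toward zero while leaving nonnegative weights untouched.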