Deep transfer learning for improving single-EEG arousal detection
Datasets in sleep science present challenges for machine learning algorithms
due to differences in recording setups across clinics. We investigate two deep
transfer learning strategies for overcoming the channel-mismatch problem that
arises when two datasets do not share exactly the same recording setup, which
degrades the performance of single-EEG models. Specifically, we train a baseline
model on multivariate polysomnography data and subsequently replace the first
two layers to prepare the architecture for single-channel
electroencephalography data. Using a fine-tuning strategy, our model performs
on par with the baseline model (F1 = 0.682 and F1 = 0.694, respectively) and is
significantly better than a comparable single-channel model. Our results are
promising for researchers working with small databases who wish to use deep
learning models pre-trained on larger databases.
Comment: Accepted for presentation at EMBC202
Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge
We propose a fully automatic minutiae extractor, called MinutiaeNet, based on
deep neural networks with compact feature representation for fast comparison of
minutiae sets. Specifically, a first network, called CoarseNet, estimates the
minutiae score map and minutiae orientations using a convolutional neural
network and fingerprint domain knowledge (enhanced image, orientation field,
and segmentation map). Subsequently, another network, called FineNet, refines
the candidate minutiae locations based on the score map. We demonstrate the
effectiveness of using the fingerprint domain knowledge together with the deep
networks. Experimental results on both latent (NIST SD27) and plain (FVC 2004)
public domain fingerprint datasets provide comprehensive empirical support for
the merits of our method. Further, our method finds minutiae sets that are
better in terms of precision and recall in comparison with state-of-the-art on
these two datasets. Given the lack of annotated fingerprint datasets with
minutiae ground truth, the proposed approach to robust minutiae detection will
be useful to train network-based fingerprint matching algorithms as well as for
evaluating fingerprint individuality at scale. MinutiaeNet is implemented in
TensorFlow: https://github.com/luannd/MinutiaeNet
Comment: Accepted to the International Conference on Biometrics (ICB 2018)
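As a rough illustration of the coarse-to-fine pipeline the abstract describes, the sketch below mimics a CoarseNet-style candidate stage (local maxima of a score map above a threshold) followed by a FineNet-style filtering stage that re-scores each candidate. The threshold values, the toy score map, and the `refine_fn` callback are invented for illustration and are not MinutiaeNet's actual implementation:

```python
import numpy as np

def coarse_candidates(score_map, threshold=0.5):
    """Mimic CoarseNet's output stage: pick local maxima of the
    minutiae score map above a threshold as candidate locations."""
    h, w = score_map.shape
    cands = []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = score_map[i - 1:i + 2, j - 1:j + 2]
            if score_map[i, j] >= threshold and score_map[i, j] == patch.max():
                cands.append((i, j))
    return cands

def fine_filter(score_map, candidates, refine_fn):
    """Mimic FineNet: re-score each candidate location with a stronger
    patch-level decision function and keep only the accepted ones."""
    return [(i, j) for (i, j) in candidates if refine_fn(score_map, i, j)]

# Toy score map with one clear peak and one sub-threshold response.
score = np.zeros((8, 8))
score[3, 4] = 0.9
score[3, 3] = 0.4

cands = coarse_candidates(score)
kept = fine_filter(score, cands, lambda s, i, j: s[i, j] > 0.8)
print(cands, kept)  # [(3, 4)] [(3, 4)]
```

In the real system both stages are learned networks operating on image patches; here the second stage is reduced to a callback purely to show the candidate-then-refine control flow.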
Deep Cross-Modal Correlation Learning for Audio and Lyrics in Music Retrieval
Deep cross-modal learning has demonstrated excellent performance in cross-modal multimedia retrieval, where the aim is to learn joint representations between different data modalities. Unfortunately, little research has focused on cross-modal correlation learning in which the temporal structures of different data modalities, such as audio and lyrics, are taken into account. Motivated by the inherently temporal structure of music, we learn the deep sequential correlation between audio and lyrics. In this work, we propose a deep cross-modal correlation learning architecture involving two-branch deep neural networks for the audio modality and the text modality (lyrics). Data from the two modalities are projected into a shared canonical space, where inter-modal canonical correlation analysis is used as the objective function to measure the similarity of temporal structures. This is the first study to use deep architectures for learning the temporal correlation between audio and lyrics. A pre-trained Doc2Vec model followed by fully connected layers is used to represent lyrics. Two significant contributions are made in the audio branch: i) we propose an end-to-end network that learns the cross-modal correlation between audio and lyrics, in which feature extraction and correlation learning are performed simultaneously and the joint representation is learned with temporal structure taken into account; ii) for feature extraction, we represent an audio signal as a short sequence of local summaries (VGG16 features) and apply a recurrent neural network to compute a compact feature that better captures the temporal structure of music audio. Experimental results on using audio to retrieve lyrics and lyrics to retrieve audio verify the effectiveness of the proposed deep correlation learning architecture for cross-modal music retrieval.
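The inter-modal CCA objective mentioned in the abstract can be illustrated with classical (non-deep) canonical correlation analysis on toy features. The feature dimensions, the shared latent factor, and the sample count below are assumptions for demonstration, not the paper's actual audio or lyric representations:

```python
import numpy as np

def canonical_correlations(X, Y, eps=1e-8):
    """Classical CCA: canonical correlations between two views
    X (n, dx) and Y (n, dy), used here as a stand-in for the
    inter-modal CCA objective between audio and lyric features."""
    Xc = X - X.mean(0)
    Yc = Y - Y.mean(0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + eps * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + eps * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten each view; the singular values of the whitened
    # cross-covariance are the canonical correlations.
    Wx = np.linalg.inv(np.linalg.cholesky(Sxx))
    Wy = np.linalg.inv(np.linalg.cholesky(Syy))
    T = Wx @ Sxy @ Wy.T
    return np.linalg.svd(T, compute_uv=False)

rng = np.random.default_rng(0)
z = rng.standard_normal((200, 1))  # shared latent factor across modalities
audio = np.hstack([z, rng.standard_normal((200, 3))])   # toy audio features
lyrics = np.hstack([z, rng.standard_normal((200, 2))])  # toy lyric features

corrs = canonical_correlations(audio, lyrics)
print(corrs[0])  # close to 1.0: the shared factor is recovered
```

In the deep variant described by the abstract, the two feature matrices would instead be the outputs of the audio branch (VGG16 summaries fed through a recurrent network) and the lyrics branch (Doc2Vec plus fully connected layers), with the top canonical correlations maximised by backpropagation through both branches.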