Building high-level features using large scale unsupervised learning

Authors: Chen Kai; Corrado Greg S.; Dean Jeff; Devin Matthieu; Le Quoc V.; Monga Rajat; Ng Andrew Y.; Ranzato Marc'Aurelio
Publication date: 01/01/2012

We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state of the art.
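
As a quick check on the closing figure, a 70% relative improvement that ends at 15.8% implies a previous best of roughly 9.3%. The sketch below is only that arithmetic; the function name `implied_baseline` is illustrative and does not come from the paper.

```python
def implied_baseline(new_accuracy: float, relative_gain: float) -> float:
    """Return the baseline implied by a relative improvement.

    new = old * (1 + relative_gain)  =>  old = new / (1 + relative_gain)
    """
    return new_accuracy / (1.0 + relative_gain)


# 15.8% accuracy after a 70% relative improvement (figures from the abstract).
previous = implied_baseline(15.8, 0.70)
print(f"implied previous accuracy: {previous:.1f}%")
```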

arXiv.org e-Print Archive

CiteSeerX

Unsupervised feature learning for electrocardiogram data using the convolutional variational autoencoder

Author: 윤덕용
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/12/2021

Most existing electrocardiogram (ECG) feature extraction methods rely on rule-based approaches. It is difficult to manually define all ECG features. We propose an unsupervised feature learning method using a convolutional variational autoencoder (CVAE) that can extract ECG features with unlabeled data. We used 596,000 ECG samples from 1,278 patients archived in biosignal databases from intensive care units to train the CVAE. Three external datasets were used for feature validation using two approaches. First, we explored the features without an additional training process. Clustering, latent space exploration, and anomaly detection were conducted. We confirmed that CVAE features reflected the various types of ECG rhythms. Second, we applied CVAE features to new tasks as input data, and used CVAE weights to initialize different models for transfer learning on the classification of 12 types of arrhythmias. The F1 score for arrhythmia classification with extreme gradient boosting was 0.86 using CVAE features only. The F1 score of the model in which weights were initialized with the CVAE encoder was 5% better than that obtained with random initialization. Unsupervised feature learning with CVAE can extract the characteristics of various types of ECGs and can be an alternative to the feature extraction method for ECGs.
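
The abstract does not give the CVAE's loss or architecture. As a minimal sketch of the two ingredients that any variational autoencoder with a Gaussian latent shares, the code below shows the reparameterized sample and the closed-form KL term against a standard normal prior. The function names and the two-dimensional example latent are assumptions for illustration, not the authors' implementation.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# Hypothetical encoder output for one ECG segment (2 latent dimensions).
rng = random.Random(0)
mu, log_var = [1.0, 0.0], [0.0, 0.0]
z = reparameterize(mu, log_var, rng)
print(len(z), kl_to_standard_normal(mu, log_var))
```

In a full model the KL term would be added to a reconstruction loss; the weighting between the two is a design choice the abstract does not specify.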

Yonsei University Medical Library Open Access Repository

The Unreasonable Effectiveness of Deep Features as a Perceptual Metric

Authors: Efros Alexei A.; Isola Phillip; Shechtman Eli; Wang Oliver; Zhang Richard
Publication date: 10/04/2018

While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new dataset of human perceptual similarity judgments. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins on our dataset. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations. Comment: Accepted to CVPR 2018; Code and data available at https://www.github.com/richzhang/PerceptualSimilarit
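
To illustrate why the abstract calls PSNR a simple, shallow function, here is a minimal sketch of it over flat lists of pixel values. It depends only on the per-pixel mean squared error, so it cannot see structure or semantics. The function name and the 8-bit default peak of 255 are assumptions for illustration.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel off by one grey level.
print(round(psnr([0, 0], [1, 1]), 2))
```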

arXiv.org e-Print Archive

Crossref

Unsupervised Feature Learning through Divergent Discriminative Feature Accumulation

Authors: Morse Gregory; Pugh Justin K.; Stanley Kenneth O.; Szerlip Paul A.
Publication date: 09/06/2014

Unlike unsupervised approaches such as autoencoders that learn to reconstruct their inputs, this paper introduces an alternative approach to unsupervised feature learning called divergent discriminative feature accumulation (DDFA) that instead continually accumulates features that make novel discriminations among the training set. Thus DDFA features are inherently discriminative from the start even though they are trained without knowledge of the ultimate classification problem. Interestingly, DDFA also continues to add new features indefinitely (so it does not depend on a hidden layer size), is not based on minimizing error, and is inherently divergent instead of convergent, thereby providing a unique direction of research for unsupervised feature learning. In this paper the quality of its learned features is demonstrated on the MNIST dataset, where its performance confirms that DDFA is indeed a viable technique for learning useful features. Comment: Corrected citation formatting

arXiv.org e-Print Archive

CiteSeerX

Association for the Advancement of Artificial Intelligence: AAAI Publications

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision

Authors: Baluja Shumeet; Leordeanu Marius; Radu Alexandra; Sukthankar Rahul
Publication date: 01/12/2015

Feature selection is essential for effective visual recognition. We propose an efficient joint classifier learning and feature selection method that discovers sparse, compact representations of input features from a vast sea of candidates, with an almost unsupervised formulation. Our method requires only one piece of knowledge, which we call the feature sign: whether or not a particular feature has on average stronger values over positive samples than over negatives. We show how this can be estimated using as few as a single labeled training sample per class. Then, using these feature signs, we extend an initial supervised learning problem into an (almost) unsupervised clustering formulation that can incorporate new data without requiring ground truth labels. Our method works both as a feature selection mechanism and as a fully competitive classifier. It has two important properties, low computational cost and excellent accuracy, especially in difficult cases of very limited training data. We experiment on large-scale recognition in video and show superior speed and performance to established feature selection approaches such as AdaBoost, Lasso, greedy forward-backward selection, and powerful classifiers such as SVM. Comment: arXiv admin note: text overlap with arXiv:1411.771
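
The feature sign defined above can be sketched directly from its definition: compare each feature's mean over positive samples with its mean over negatives. The code below is a minimal illustration, not the authors' method; `feature_signs` and `flip_features` are hypothetical names, and the flipping step assumes features are scaled to [0, 1].

```python
def feature_signs(positives, negatives):
    """Return +1 per feature whose mean is higher on positives, else -1."""
    n_features = len(positives[0])
    signs = []
    for j in range(n_features):
        pos_mean = sum(row[j] for row in positives) / len(positives)
        neg_mean = sum(row[j] for row in negatives) / len(negatives)
        signs.append(1 if pos_mean > neg_mean else -1)
    return signs


def flip_features(sample, signs):
    """Flip negatively signed features so larger values favor the positive class.

    Assumption: every feature lies in [0, 1], so flipping is x -> 1 - x.
    """
    return [x if s > 0 else 1.0 - x for x, s in zip(sample, signs)]


# One labeled sample per class, as in the abstract's most extreme setting.
signs = feature_signs([[0.9, 0.1]], [[0.2, 0.8]])
print(signs)
```

With a single sample per class the estimate is noisy; averaging over more labeled samples simply means passing longer lists.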

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Sparse Coding on Stereo Video for Object Detection

Authors: Kenyon Garrett T.; Lundquist Sheng Y.; Mitchell Melanie
Publication date: 01/05/2017

Deep Convolutional Neural Networks (DCNNs) require millions of labeled training examples for image classification and object detection tasks, which restricts these models to domains where such datasets are available. In this paper, we explore the use of unsupervised sparse coding applied to stereo-video data to help alleviate the need for large amounts of labeled data. We show that replacing a typical supervised convolutional layer with an unsupervised sparse-coding layer within a DCNN allows for better performance on a car detection task when only a limited number of labeled training examples is available. Furthermore, the network that incorporates sparse coding allows for more consistent performance over varying initializations and ordering of training examples when compared to a fully supervised DCNN. Finally, we compare activations between the unsupervised sparse-coding layer and the supervised convolutional layer, and show that the sparse representation exhibits an encoding that is depth selective, whereas encodings from the convolutional layer do not exhibit such selectivity. These results indicate promise for using unsupervised sparse-coding approaches in real-world computer vision tasks in domains with limited labeled training data.
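
The abstract does not name the sparse-coding solver. As a stand-in, the sketch below uses ISTA (iterative shrinkage-thresholding), a standard way to minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 for a fixed dictionary D. The function names, the list-of-atoms dictionary layout, and the toy inputs are all assumptions for illustration.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(x, dictionary, lam, step, n_iters=100):
    """Sparse code of x over `dictionary` (a list of atoms, each len(x) long).

    `step` should not exceed 1 / L, where L is the largest eigenvalue of D^T D.
    """
    n_atoms, dim = len(dictionary), len(x)
    a = [0.0] * n_atoms
    for _ in range(n_iters):
        # Current reconstruction D a and its residual against the input.
        recon = [sum(a[k] * dictionary[k][i] for k in range(n_atoms))
                 for i in range(dim)]
        residual = [xi - ri for xi, ri in zip(x, recon)]
        # Gradient step on the quadratic term, then L1 shrinkage.
        a = [soft_threshold(
                a[k] + step * sum(dictionary[k][i] * residual[i] for i in range(dim)),
                step * lam)
             for k in range(n_atoms)]
    return a


# Toy example: identity dictionary, so the code is soft_threshold(x, lam).
code = ista([3.0, -0.5], [[1.0, 0.0], [0.0, 1.0]], lam=1.0, step=1.0)
print(code)
```

The small second coefficient is driven to exactly zero, which is the sparsity the L1 penalty is meant to produce.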

arXiv.org e-Print Archive

PDXScholar (Portland State University)

Unsupervised Learning of Visual Structure using Predictive Generative Networks

Authors: Cox David; Kreiman Gabriel; Lotter William
Publication date: 15/12/2015

The ability to predict future states of the environment is a central pillar of intelligence. At its core, effective prediction requires an internal model of the world and an understanding of the rules by which the world changes. Here, we explore the internal models developed by deep neural networks trained using a loss based on predicting future frames in synthetic video sequences, using a CNN-LSTM-deCNN framework. We first show that this architecture can achieve excellent performance in visual sequence prediction tasks, including state-of-the-art performance in a standard 'bouncing balls' dataset (Sutskever et al., 2009). Using a weighted mean-squared error and adversarial loss (Goodfellow et al., 2014), the same architecture successfully extrapolates out-of-the-plane rotations of computer-generated faces. Furthermore, despite being trained end-to-end to predict only pixel-level information, our Predictive Generative Networks learn a representation of the latent structure of the underlying three-dimensional objects themselves. Importantly, we find that this representation is naturally tolerant to object transformations, and generalizes well to new tasks, such as classification of static images. Similar models trained solely with a reconstruction loss fail to generalize as effectively. We argue that prediction can serve as a powerful unsupervised loss for learning rich internal representations of high-level object features. Comment: under review as conference paper at ICLR 201

arXiv.org e-Print Archive

DSpace@MIT