Compassionately Conservative Normalized Cuts for Image Segmentation
Image segmentation is a process used in computer vision to partition an image into regions with similar characteristics. One category of image segmentation algorithms is graph-based, where pixels in an image are represented by vertices in a graph and the similarity between pixels is represented by weighted edges. A segmentation of the image can be found by cutting edges between dissimilar groups of pixels in the graph, leaving different clusters or partitions of the data.
A popular graph-based method for segmenting images is the Normalized Cuts (NCuts) algorithm, which quantifies the cost for graph partitioning in a way that biases clusters or segments that are balanced towards having lower values than unbalanced partitionings. This bias is so strong, however, that the NCuts algorithm avoids any singleton partitions, even when vertices are weakly connected to the rest of the graph. For this reason, we propose the Compassionately Conservative Normalized Cut (CCNCut) objective function, which strikes a better compromise between the desire to avoid too many singleton partitions and the notion that all partitions should be balanced.
We demonstrate how CCNCut minimization can be relaxed into the problem of computing Piecewise Flat Embeddings (PFE) and provide an overview of, as well as two efficiency improvements to, the Splitting Orthogonality Constraint (SOC) algorithm previously used to approximate PFE. We then present a new algorithm for computing PFE based on iteratively minimizing a sequence of reweighted Rayleigh quotients (IRRQ) and run a series of experiments comparing CCNCut-based image segmentation via SOC and IRRQ to NCut-based image segmentation on the BSDS500 dataset. Our results indicate that CCNCut-based image segmentation yields more accurate results with respect to ground truth than NCut-based segmentation, and that IRRQ is less sensitive to initialization than SOC.
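For readers unfamiliar with the spectral machinery behind graph-cut segmentation, the sketch below computes a classical two-way Normalized Cut approximation by thresholding the second generalized eigenvector of the graph Laplacian. This is the standard Shi–Malik relaxation, not the CCNCut/IRRQ method described above, and the toy affinity matrix is purely illustrative.

```python
import numpy as np

def ncut_bipartition(W):
    """Approximate a two-way Normalized Cut via the standard spectral
    relaxation: threshold the second-smallest generalized eigenvector of
    (D - W) v = lambda * D v at zero. W must be a symmetric, nonnegative
    affinity matrix with no isolated vertices."""
    d = W.sum(axis=1)
    L = np.diag(d) - W                 # unnormalized graph Laplacian
    # Solve via the symmetric normalized form D^{-1/2} L D^{-1/2}.
    d_isqrt = 1.0 / np.sqrt(d)
    Lsym = d_isqrt[:, None] * L * d_isqrt[None, :]
    vals, vecs = np.linalg.eigh(Lsym)  # eigenvalues in ascending order
    fiedler = d_isqrt * vecs[:, 1]     # map back to generalized eigenvector
    return fiedler >= 0                # boolean cluster labels

# Toy graph: two tightly connected triangles joined by one weak edge.
W = np.array([
    [0.0, 5.0, 5.0, 0.1, 0.0, 0.0],
    [5.0, 0.0, 5.0, 0.0, 0.0, 0.0],
    [5.0, 5.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0, 5.0, 5.0],
    [0.0, 0.0, 0.0, 5.0, 0.0, 5.0],
    [0.0, 0.0, 0.0, 5.0, 5.0, 0.0],
])
labels = ncut_bipartition(W)  # splits vertices {0,1,2} from {3,4,5}
```

In an image-segmentation setting, the vertices would be pixels and the affinities would come from color or texture similarity; the eigenproblem is then solved with sparse solvers rather than dense `eigh`.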
Towards Efficient Lifelong Machine Learning in Deep Neural Networks
Humans continually learn and adapt to new knowledge and environments throughout their lifetimes. Rarely does learning new information cause humans to catastrophically forget previous knowledge. While deep neural networks (DNNs) now rival human performance on several supervised machine perception tasks, they catastrophically forget previous knowledge when updated on changing data distributions. Enabling DNNs to learn new information over time opens the door to new applications, such as self-driving cars that adapt to seasonal changes or smartphones that adapt to changing user preferences. In this dissertation, we propose new methods and experimental paradigms for efficiently training continual DNNs without forgetting. We then apply these methods to several visual and multi-modal perception tasks, including image classification, visual question answering, analogical reasoning, and attribute and relationship prediction in visual scenes.
Are Out-of-Distribution Detection Methods Effective on Large-Scale Datasets?
Supervised classification methods often assume that the train and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers often require the ability to recognize inputs from outside the training set as unknowns. This problem has been studied under multiple paradigms, including out-of-distribution detection and open set recognition. For convolutional neural networks, there have been two major approaches: 1) inference methods to separate knowns from unknowns, and 2) feature space regularization strategies to improve model robustness to outlier inputs. There has been little effort to explore the relationship between the two approaches or to compare their performance directly on anything other than small-scale datasets with at most 100 categories. Using ImageNet-1K and Places-434, we identify novel combinations of regularization and specialized inference methods that perform best across multiple outlier detection problems of increasing difficulty. We found that input perturbation and temperature scaling yield the best performance on large-scale datasets regardless of the feature space regularization strategy. Improving the feature space by regularizing against a background class can be helpful if an appropriate background class can be found, but this is impractical for large-scale image classification datasets.
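The temperature-scaling half of the inference method mentioned above can be sketched as a scaled maximum-softmax score: inputs whose score falls below a validation-chosen threshold are flagged as unknown. The gradient-based input perturbation step, which requires access to the model's gradients, is omitted here, and the temperature and logit values are illustrative rather than tuned.

```python
import numpy as np

def scaled_max_softmax(logits, T=1000.0):
    """Temperature-scaled maximum softmax probability, an ODIN-style
    out-of-distribution score. Dividing logits by a large temperature T
    flattens the softmax, which tends to widen the gap between confident
    in-distribution inputs and ambiguous outliers."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    p = np.exp(z)
    p /= p.sum(axis=-1, keepdims=True)
    return p.max(axis=-1)

# A peaked logit vector (confident known) scores higher than a flat one.
known_score = scaled_max_softmax(np.array([12.0, 1.0, 0.5]))
unknown_score = scaled_max_softmax(np.array([4.1, 4.0, 3.9]))
flagged_as_unknown = unknown_score < known_score
```

In practice the detection threshold is chosen on a held-out validation split, and the score is computed on the (optionally perturbed) network output rather than on hand-written logits.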
SIESTA: Efficient Online Continual Learning with Sleep
In supervised continual learning, a deep neural network (DNN) is updated with an ever-growing data stream. Unlike the offline setting, where data is shuffled, we cannot make any distributional assumptions about the data stream. Ideally, only one pass through the dataset is needed for computational efficiency. However, existing methods are inadequate: they make many assumptions that cannot hold in real-world applications while simultaneously failing to improve computational efficiency. In this paper, we propose a novel continual learning method, SIESTA, based on a wake/sleep training framework that is well aligned with the needs of on-device learning. The major goal of SIESTA is to advance compute-efficient continual learning so that DNNs can be updated using far less time and energy. The principal innovations of SIESTA are: 1) rapid online updates using a rehearsal-free, backpropagation-free, and data-driven network update rule during its wake phase, and 2) expedited memory consolidation using a compute-restricted rehearsal policy during its sleep phase. For memory efficiency, SIESTA adapts latent rehearsal using memory indexing from REMIND. Compared to REMIND and prior art, SIESTA is far more computationally efficient, enabling continual learning on ImageNet-1K in under 2 hours on a single GPU; moreover, in the augmentation-free setting it matches the performance of the offline learner, a milestone critical to driving adoption of continual learning in real-world applications.
Comment: Accepted to TMLR 202