Object-Based Supervised Machine Learning Regional-Scale Land-Cover Classification Using High Resolution Remotely Sensed Data
High spatial resolution (HR) (1–5 m) remotely sensed data in conjunction with supervised machine learning classification are commonly used to construct land-cover classifications. Despite the increasing availability of HR data, most studies investigating HR remotely sensed data and associated classification methods employ relatively small study areas. This work therefore drew on a 2,609 km², regional-scale study in northeastern West Virginia, USA, to investigate a number of core aspects of HR land-cover supervised classification using machine learning. Issues explored include training sample selection, cross-validation parameter tuning, the choice of machine learning algorithm, training sample set size, and feature selection. A geographic object-based image analysis (GEOBIA) approach was used. The data comprised National Agricultural Imagery Program (NAIP) orthoimagery and LIDAR-derived rasters. Stratified-statistical-based training sampling methods were found to generate higher classification accuracies than deliberative-based sampling. Subset-based sampling, in which training data are collected from a small geographic subset area within the study site, did not notably decrease the classification accuracy. For the five machine learning algorithms investigated (support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), and learning vector quantization (LVQ)), increasing the size of the training set typically improved the overall accuracy of the classification. However, RF was consistently more accurate than the other four machine learning algorithms, even when trained from a relatively small training sample set. Recursive feature elimination (RFE), which can be used to reduce the dimensionality of a training set, was found to increase the overall accuracy of both SVM and NEU classification; however, the improvement in overall accuracy diminished as sample size increased.
RFE resulted in only a small improvement in the overall accuracy of RF classification, indicating that RF is generally insensitive to the Hughes Phenomenon. Nevertheless, as feature selection is an optional step in the classification process, and can be discarded if it has a negative effect on classification accuracy, it should be investigated as part of best practice for supervised machine learning land-cover classification using remotely sensed data.
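As an illustration of the stratified training sampling mentioned in this abstract, the following is a minimal sketch of proportional stratified random sampling over labeled image objects. The `(object_id, class_label)` layout, the function name `stratified_sample`, and the "at least one object per class" rule are assumptions made for this example; the abstract does not describe the study's actual sampling procedure in this detail.

```python
import random
from collections import defaultdict


def stratified_sample(objects, fraction, seed=0):
    """Draw a proportional stratified random sample.

    objects:  list of (object_id, class_label) pairs (hypothetical layout).
    fraction: share of each class to draw, between 0 and 1.
    """
    rng = random.Random(seed)

    # Group object ids by their land-cover class (the strata).
    by_class = defaultdict(list)
    for obj_id, label in objects:
        by_class[label].append(obj_id)

    sample = []
    for label in sorted(by_class):
        ids = by_class[label]
        # Assumption: keep at least one object per class so rare classes survive.
        k = max(1, round(fraction * len(ids)))
        sample.extend((obj_id, label) for obj_id in rng.sample(ids, k))
    return sample


# Toy data: 100 "forest" objects and 20 "water" objects.
objects = [(i, "forest") for i in range(100)] + [(i, "water") for i in range(100, 120)]
training = stratified_sample(objects, fraction=0.1)
```

Each class contributes the same share of its objects, so the class proportions of the training set follow those of the full object set.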
Decorrelation of Neutral Vector Variables: Theory and Applications
In this paper, we propose novel strategies for neutral vector variable
decorrelation. Two fundamental invertible transformations, namely serial
nonlinear transformation and parallel nonlinear transformation, are proposed to
carry out the decorrelation. For a neutral vector variable, which is not
multivariate Gaussian distributed, the conventional principal component
analysis (PCA) cannot yield mutually independent scalar variables. With the two
proposed transformations, a highly negatively correlated neutral vector can be
transformed to a set of mutually independent scalar variables with the same
degrees of freedom. We also evaluate the decorrelation performances for the
vectors generated from a single Dirichlet distribution and a mixture of
Dirichlet distributions. The mutual independence is verified with the distance
correlation measurement. The advantages of the proposed decorrelation
strategies are intensively studied and demonstrated with synthesized data and
practical application evaluations.
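To make the idea of a serial decorrelating transformation concrete, here is a minimal sketch for Dirichlet-distributed vectors. It uses the standard ratio transform u_k = x_k / (1 - sum_{j<k} x_j), which follows from the neutrality property; whether this matches the paper's exact definition is an assumption. The function names are hypothetical, and plain Pearson correlation stands in for the distance correlation used in the paper, so a near-zero value here is only a weak check of independence.

```python
import math
import random


def sample_dirichlet(alphas, rng):
    """Sample one Dirichlet vector via normalized gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(g)
    return [v / total for v in g]


def serial_transform(x):
    """Map a K-vector summing to 1 to K-1 ratios u_k = x_k / (1 - sum_{j<k} x_j)."""
    u, remaining = [], 1.0
    for xk in x[:-1]:
        u.append(xk / remaining)
        remaining -= xk
    return u


def inverse_serial_transform(u):
    """Invert serial_transform, recovering the original K-vector."""
    x, remaining = [], 1.0
    for uk in u:
        x.append(uk * remaining)
        remaining -= x[-1]
    x.append(remaining)
    return x


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


rng = random.Random(0)
xs = [sample_dirichlet([2.0, 3.0, 4.0], rng) for _ in range(20000)]
us = [serial_transform(x) for x in xs]

# Correlation between the first two components, before and after the transform.
corr_before = pearson([x[0] for x in xs], [x[1] for x in xs])
corr_after = pearson([u[0] for u in us], [u[1] for u in us])
```

The three-dimensional input maps to two ratios, matching the two degrees of freedom of a vector constrained to sum to one, and `inverse_serial_transform` recovers the original vector.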
AdaComp : Adaptive Residual Gradient Compression for Data-Parallel Distributed Training
Highly distributed training of Deep Neural Networks (DNNs) on future compute
platforms (offering hundreds of TeraOps/s of computational capacity) is expected to
be severely communication constrained. To overcome this limitation, new
gradient compression techniques are needed that are computationally friendly,
applicable to a wide variety of layers seen in Deep Neural Networks and
adaptable to variations in network architectures as well as their
hyper-parameters. In this paper we introduce a novel technique - the Adaptive
Residual Gradient Compression (AdaComp) scheme. AdaComp is based on localized
selection of gradient residues and automatically tunes the compression rate
depending on local activity. We show excellent results on a wide spectrum of
state-of-the-art Deep Learning models in multiple domains (vision, speech,
language), datasets (MNIST, CIFAR10, ImageNet, BN50, Shakespeare), optimizers
(SGD with momentum, Adam) and network parameters (number of learners,
minibatch-size etc.). Exploiting both sparsity and quantization, we demonstrate
end-to-end compression rates of ~200X for fully-connected and recurrent layers,
and ~40X for convolutional layers, without any noticeable degradation in model
accuracies. Comment: IBM Research AI, 9 pages, 7 figures, AAAI18 accepted
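The localized selection of gradient residues described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the function name `adacomp_step`, the selection rule (compare the residue plus twice the latest gradient against the per-bin maximum of the accumulated residue), and the sign-times-scale quantization are all assumptions that the abstract does not spell out.

```python
def adacomp_step(grad, residue, bin_size):
    """One residual-compression step on a flat gradient (lists of floats).

    Returns (sent, new_residue), where sent maps index -> quantized value.
    """
    # Accumulate the latest gradient into the residue carried from earlier steps.
    G = [r + g for r, g in zip(residue, grad)]
    # Assumed selection signal: the latest gradient is counted a second time.
    H = [a + g for a, g in zip(G, grad)]

    # Split the index range into fixed-size bins and find each bin's max |G|.
    bins = [range(s, min(s + bin_size, len(G))) for s in range(0, len(G), bin_size)]
    maxima = [max(abs(G[i]) for i in b) for b in bins]

    # Assumed quantization: every sent value becomes sign(G) * scale.
    scale = sum(maxima) / len(maxima)

    sent = {}
    new_residue = list(G)
    for b, gmax in zip(bins, maxima):
        for i in b:
            if gmax > 0 and abs(H[i]) >= gmax:
                q = scale if G[i] > 0 else -scale if G[i] < 0 else 0.0
                sent[i] = q
                # Keep the quantization error locally for later steps.
                new_residue[i] = G[i] - q
    return sent, new_residue


grad = [0.1, -0.5, 0.2, 0.05, 0.3, -0.1, 0.0, 0.9]
sent, residue = adacomp_step(grad, [0.0] * len(grad), bin_size=4)
```

In this toy run only one entry per bin is transmitted, and everything not sent (including quantization error) stays in `residue`, so no gradient information is discarded; it is only delayed.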