ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks
Hash codes are efficient data representations for coping with the ever
growing amounts of data. In this paper, we introduce a random forest semantic
hashing scheme that embeds tiny convolutional neural networks (CNN) into
shallow random forests, with near-optimal information-theoretic code
aggregation among trees. We start with a simple hashing scheme, where random
trees in a forest act as hashing functions by setting '1' for the visited tree
leaf, and '0' for the rest. We show that traditional random forests fail to
generate hashes that preserve the underlying similarity between the trees,
rendering the random forests approach to hashing challenging. To address this,
we propose to first randomly group arriving classes at each tree split node
into two groups, obtaining a significantly simplified two-class classification
problem, which can be handled by a light-weight CNN weak learner. Such a
random class grouping scheme enforces code uniqueness by making each class
share its code with different classes in different trees. A non-conventional
low-rank loss is further adopted for the CNN weak learners to encourage code
consistency by minimizing intra-class variations and maximizing inter-class
distance for the two random class groups. Finally, we introduce an
information-theoretic approach for aggregating codes of individual trees into a
single hash code, producing a near-optimal unique hash for each class. The
proposed approach significantly outperforms state-of-the-art hashing methods
for image retrieval tasks on large-scale public datasets, and performs at the
level of state-of-the-art image classification techniques while utilizing a
more compact, efficient, and scalable representation. This work proposes a
principled and robust procedure to train and deploy in parallel an ensemble of
light-weight CNNs, instead of simply going deeper.
Comment: Accepted to ECCV 201
PHURIE: Hurricane Intensity Estimation from Infrared Satellite Imagery Using Machine Learning
Automated prediction of hurricane intensity from satellite infrared imagery is a challenging problem with implications in weather forecasting and disaster planning. In this work, a novel machine learning-based method for estimation of intensity or maximum sustained wind speed of tropical cyclones over their life cycle is presented. The approach is based on a support vector regression model over novel statistical features of infrared images of a hurricane. Specifically, the features characterize the degree of uniformity in various temperature bands of a hurricane. Performance of several machine learning methods such as ordinary least squares regression, backpropagation neural networks and XGBoost regression has been compared using these features under different experimental setups for the task. Kernelized support vector regression resulted in the lowest prediction error between true and predicted hurricane intensities (approximately 10 knots or 18.5 km/h), which is better than previously proposed techniques and comparable to SATCON consensus. The performance of the proposed scheme has also been analyzed with respect to errors in annotation of center of the hurricane and aircraft reconnaissance data. The source code and webserver implementation of the proposed method called PHURIE (PIEAS HURricane Intensity Estimator) is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#PHURIE
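The regression setup above can be sketched as follows: kernelized support vector regression over per-band statistics of an infrared image. The feature extraction here (standard deviation of pixels within fixed brightness-temperature bands) is only a plausible stand-in for the paper's uniformity features, and all names and data are illustrative rather than PHURIE's actual code.

```python
import numpy as np
from sklearn.svm import SVR

def band_uniformity_features(img, bands):
    """Std. dev. of pixel temperatures within each band (0 if the band is empty)."""
    feats = []
    for lo, hi in bands:
        vals = img[(img >= lo) & (img < hi)]
        feats.append(vals.std() if vals.size else 0.0)
    return np.array(feats)

rng = np.random.RandomState(0)
bands = [(180, 210), (210, 240), (240, 270)]          # brightness-temperature bands (K)
imgs = [rng.uniform(180, 270, size=(32, 32)) for _ in range(50)]
X = np.stack([band_uniformity_features(im, bands) for im in imgs])
y = rng.uniform(30, 150, size=50)                     # synthetic intensities (knots)

# Kernelized SVR, as in the abstract; hyperparameters are illustrative.
model = SVR(kernel="rbf", C=10.0).fit(X, y)
pred = model.predict(X[:3])
```

In practice, the features would be computed from annotated storm-centered IR imagery, which is why the abstract also analyzes sensitivity to errors in the annotated storm center.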
Learning with incremental iterative regularization
Within a statistical learning setting, we propose and study an iterative regularization
algorithm for least squares defined by an incremental gradient method. In
particular, we show that, if all other parameters are fixed a priori, the number of
passes over the data (epochs) acts as a regularization parameter, and prove strong
universal consistency, i.e. almost sure convergence of the risk, as well as sharp
finite sample bounds for the iterates. Our results are a step towards understanding
the effect of multiple epochs in stochastic gradient techniques in machine learning
and rely on integrating statistical and optimization results.
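The algorithm studied above can be sketched as incremental gradient descent for least squares, where the number of passes over the data (epochs) is the only knob being tuned; the step size, data, and function names here are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def incremental_ls(X, y, epochs, step=0.01):
    """Cyclic incremental gradient for least squares: one example per update."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in range(X.shape[0]):
            residual = X[i] @ w - y[i]
            w -= step * residual * X[i]   # gradient of (1/2)(x_i . w - y_i)^2
    return w

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
w_true = rng.randn(5)
y = X @ w_true + 0.5 * rng.randn(100)     # noisy labels

w1 = incremental_ls(X, y, epochs=1)
w100 = incremental_ls(X, y, epochs=100)
err1 = np.mean((X @ w1 - y) ** 2)
err100 = np.mean((X @ w100 - y) ** 2)
```

More passes drive the training risk down toward the least-squares fit of the noise; stopping after few epochs keeps the iterate close to the initialization, which is exactly the implicit regularization role of the epoch count analyzed in the paper.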
Approximation and Relaxation Approaches for Parallel and Distributed Machine Learning
Large-scale machine learning requires tradeoffs. Commonly, such tradeoffs have led practitioners to choose simpler, less powerful models, e.g. linear models, in order to process more training examples in a limited time. In this work, we introduce parallelism to the training of non-linear models by leveraging a different tradeoff: approximation. We demonstrate various techniques by which non-linear models can be made amenable to larger data sets and significantly more training parallelism by strategically introducing approximation in certain optimization steps.
For gradient boosted regression tree ensembles, we replace precise selection of tree splits with a coarse-grained, approximate split selection, yielding both faster sequential training and a significant increase in parallelism, in the distributed setting in particular. For metric learning with nearest neighbor classification, rather than explicitly train a neighborhood structure we leverage the implicit neighborhood structure induced by task-specific random forest classifiers, yielding a highly parallel method for metric learning. For support vector machines, we follow existing work to learn a reduced basis set with extremely high parallelism, particularly on GPUs, via existing linear algebra libraries.
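The coarse-grained split selection mentioned above for boosted regression trees can be sketched generically: instead of scoring every distinct feature value as a candidate threshold, bucket the feature into a few quantile bins and score only the bin edges. The per-bin statistics can be accumulated independently per worker and merged, which is what makes the scheme parallel-friendly. This is an illustration of the general technique, not the authors' implementation.

```python
import numpy as np

def approx_best_split(x, y, n_bins=8):
    """Return (threshold, SSE reduction), scoring only quantile bin edges."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    parent_sse = ((y - y.mean()) ** 2).sum()
    best_thr, best_gain = None, 0.0
    for t in np.unique(edges):
        left, right = y[x <= t], y[x > t]
        if left.size == 0 or right.size == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if parent_sse - sse > best_gain:
            best_thr, best_gain = t, parent_sse - sse
    return best_thr, best_gain

rng = np.random.RandomState(0)
x = rng.rand(1000)
y = (x > 0.5).astype(float) + 0.1 * rng.randn(1000)   # step-function target
thr, gain = approx_best_split(x, y)
```

With only 8 candidate thresholds instead of up to 1000, the search is much cheaper and the chosen split still lands near the true step at 0.5; exact split selection would gain little accuracy here at far greater sequential cost.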
We believe these optimization tradeoffs are widely applicable wherever machine learning is put into practice in large-scale settings. By carefully introducing approximation, we also introduce significantly higher parallelism and consequently can process more training examples for more iterations than competing exact methods. Although the model is seemingly learned with less precision, this tradeoff often yields noticeably higher accuracy under a restricted training-time budget.