27 research outputs found
ProxQuant: Quantized Neural Networks via Proximal Operators
To make deep neural networks feasible in resource-constrained environments
(such as mobile devices), it is beneficial to quantize models by using
low-precision weights. One common technique for quantizing neural networks is
the straight-through gradient method, which enables back-propagation through
the quantization mapping. Despite its empirical success, little is understood
about why the straight-through gradient method works.
Building upon a novel observation that the straight-through gradient method
is in fact identical to the well-known Nesterov's dual-averaging algorithm on a
quantization constrained optimization problem, we propose a more principled
alternative approach, called ProxQuant, that formulates quantized network
training as a regularized learning problem instead and optimizes it via the
prox-gradient method. ProxQuant does back-propagation on the underlying
full-precision vector and applies an efficient prox-operator in between
stochastic gradient steps to encourage quantizedness. For quantizing ResNets
and LSTMs, ProxQuant outperforms state-of-the-art results on binary
quantization and is on par with state-of-the-art on multi-bit quantization. For
binary quantization, our analysis shows both theoretically and experimentally
that ProxQuant is more stable than the straight-through gradient method (i.e.
BinaryConnect), challenging the indispensability of the straight-through
gradient method and providing a powerful alternative.
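The prox-gradient update described above can be sketched for the binary case. The closed-form prox below (move each weight toward its nearest point in {-1, +1} by at most lam) and all step sizes are illustrative assumptions, not the paper's exact choices:

```python
import numpy as np

def prox_binary(theta, lam):
    """Prox operator for a binary-quantization regularizer
    r(theta) = min(|theta - 1|, |theta + 1|): moves each weight toward
    its nearest point in {-1, +1} by at most lam, without overshooting."""
    target = np.sign(theta)
    target[target == 0] = 1.0          # break ties toward +1
    step = np.minimum(lam, np.abs(target - theta))
    return theta + step * np.sign(target - theta)

# One ProxQuant-style update: SGD on the full-precision weights,
# then a prox step encouraging quantizedness (values are illustrative)
theta = np.array([0.3, -0.8, 1.2, 0.0])
grad = np.array([0.1, -0.2, 0.05, 0.3])
theta = theta - 0.1 * grad          # stochastic gradient step
theta = prox_binary(theta, 0.05)    # prox step toward {-1, +1}
```

Because back-propagation always runs on the full-precision vector, gradients stay informative; increasing lam over training would push the weights toward exactly binary values.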
An adaptive mirror-prox algorithm for variational inequalities with singular operators
Lipschitz continuity is a central requirement for achieving the optimal O(1/T) rate of convergence in monotone, deterministic variational inequalities (a setting that includes convex minimization, convex-concave optimization, nonatomic games, and many other problems). However, in many cases of practical interest, the operator defining the variational inequality may exhibit singularities at the boundary of the feasible region, thereby precluding the use of fast gradient methods that attain this optimal rate (such as Nemirovski's mirror-prox algorithm and its variants). To address this issue, we propose a novel regularity condition, which we call Bregman continuity, that relates the variation of the operator to that of a suitably chosen Bregman function. Leveraging this condition, we derive an adaptive mirror-prox algorithm that attains the optimal O(1/T) rate of convergence in problems with possibly singular operators, without any prior knowledge of the degree of smoothness (the Bregman analogue of the Lipschitz constant). We also show that, under Bregman continuity, the mirror-prox algorithm achieves a convergence rate in stochastic variational inequalities.
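For reference, the classical (non-adaptive) mirror-prox iteration in its Euclidean special case, where the Bregman function is the half squared norm and the prox-mapping reduces to a projection, can be sketched as below; the bilinear toy problem and step size are illustrative, not from the paper:

```python
import numpy as np

def mirror_prox(F, project, x0, step, iters):
    """Euclidean mirror-prox (extra-gradient) for a monotone VI:
    find x* with <F(x*), x - x*> >= 0 for all feasible x.
    Each iteration makes two operator calls: a leading step to x_half,
    then an update using F evaluated at x_half."""
    x = x0.astype(float).copy()
    avg = np.zeros_like(x)
    for t in range(1, iters + 1):
        x_half = project(x - step * F(x))     # leading (extrapolation) step
        x = project(x - step * F(x_half))     # update with operator at x_half
        avg += (x_half - avg) / t             # ergodic average attains O(1/T)
    return avg

# Toy bilinear saddle point min_x max_y x*y over [-1, 1]^2, whose VI
# operator is F(x, y) = (y, -x) and whose unique solution is (0, 0)
F = lambda z: np.array([z[1], -z[0]])
project = lambda z: np.clip(z, -1.0, 1.0)
z = mirror_prox(F, project, np.array([0.9, -0.7]), step=0.5, iters=200)
```

The adaptive, Bregman-continuous variant in the paper replaces the fixed `step` with one learned on the fly and the projection with a general Bregman prox-mapping.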
Fast Polynomial Kernel Classification for Massive Data
In the era of big data, it is highly desired to develop efficient machine
learning algorithms to tackle massive data challenges such as storage
bottleneck, algorithmic scalability, and interpretability. In this paper, we
develop a novel efficient classification algorithm, called fast polynomial
kernel classification (FPC), to conquer the scalability and storage challenges.
Our main tools are a suitably selected feature mapping based on polynomial
kernels and an alternating direction method of multipliers (ADMM) algorithm for
a related non-smooth convex optimization problem. Fast learning rates as well
as feasibility verifications including the convergence of ADMM and the
selection of center points are established to justify theoretical behaviors of
FPC. Our theoretical assertions are verified by a series of simulations and
real data applications. The numerical results demonstrate that FPC
significantly reduces the computational burden and storage memory of the
existing learning schemes such as support vector machines and boosting, without
sacrificing much of their generalization ability.
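The two ingredients named in the abstract, an explicit polynomial-kernel feature map and ADMM on a non-smooth convex objective, can be sketched together. This is not FPC itself: the l1-regularized least-squares loss, the degree-2 map, and all constants below are stand-in assumptions:

```python
import numpy as np

def poly_features(X, degree=2):
    """Explicit degree-2 polynomial feature map (an illustrative stand-in
    for FPC's polynomial-kernel mapping): [1, x, pairwise products]."""
    n, d = X.shape
    feats = [np.ones((n, 1)), X]
    for i in range(d):
        for j in range(i, d):
            feats.append((X[:, i] * X[:, j])[:, None])
    return np.hstack(feats)

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=100):
    """ADMM for the non-smooth convex problem
    min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    d = A.shape[1]
    x = np.zeros(d); z = np.zeros(d); u = np.zeros(d)
    M = np.linalg.inv(A.T @ A + rho * np.eye(d))   # cached quadratic solve
    Atb = A.T @ b
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))                                  # smooth step
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0)  # l1 prox
        u = u + x - z                                                  # dual update
    return z

# Tiny illustration: labels in {-1, +1} from an XOR-like rule that
# needs the product feature x0 * x1 (hypothetical data)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0] * X[:, 1] + 0.1)
w = admm_lasso(poly_features(X), y, lam=0.05)
acc = np.mean(np.sign(poly_features(X) @ w) == y)
```

Because the feature map is explicit, classifying a new point costs a single inner product, which mirrors the storage and scalability gains the abstract claims over kernel-expansion methods.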
Hybrid Advanced Optimization Methods with Evolutionary Computation Techniques in Energy Forecasting
More accurate and precise energy demand forecasts are required when energy decisions are made in a competitive environment. Particularly in the Big Data era, forecasting models are often built from complex combinations of functions, and energy data are complicated, exhibiting seasonality, cyclicity, fluctuation, and dynamic nonlinearity. When such models cannot determine the characteristics and patterns of the data, the result is an over-reliance on informal judgment and higher expenses. Hybridizing optimization methods with superior evolutionary algorithms can deliver important improvements through good parameter determination in the optimization process, which is of great assistance to energy decision-makers. This book aimed to attract researchers with an interest in the research areas described above. Specifically, it sought contributions to the development of hybrid optimization methods (e.g., quadratic programming techniques, chaotic mapping, fuzzy inference theory, quantum computing, etc.) combined with advanced algorithms (e.g., genetic algorithms, ant colony optimization, the particle swarm optimization algorithm, etc.) whose capabilities overcome drawbacks embedded in traditional optimization approaches, and the application of these advanced hybrid approaches to significantly improve forecasting accuracy.
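As a concrete (hypothetical) instance of the hybridization the book discusses, a minimal particle swarm optimizer can tune the smoothing constant of a simple exponential-smoothing forecaster; the demand series, swarm coefficients, and forecasting model below are all illustrative assumptions:

```python
import numpy as np

def pso(f, lo, hi, n=20, iters=60, seed=0):
    """Minimal particle swarm optimization of a scalar parameter in
    [lo, hi]: particles track personal bests and the global best."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, n); v = np.zeros(n)
    pbest = x.copy(); pval = np.array([f(xi) for xi in x])
    gbest = pbest[pval.argmin()]
    for _ in range(iters):
        r1, r2 = rng.random(n), rng.random(n)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        val = np.array([f(xi) for xi in x])
        improved = val < pval
        pbest[improved], pval[improved] = x[improved], val[improved]
        gbest = pbest[pval.argmin()]
    return gbest

# Hypothetical demand series; tune the smoothing constant alpha of
# simple exponential smoothing by minimizing one-step-ahead squared error
demand = np.array([10., 12., 11., 13., 14., 13., 15., 16., 15., 17.])

def sse(alpha):
    level, err = demand[0], 0.0
    for d in demand[1:]:
        err += (d - level) ** 2
        level += alpha * (d - level)
    return err

best_alpha = pso(sse, 0.0, 1.0)
```

The same pattern scales to the richer hybrids the book surveys: the inner objective becomes a full forecasting model's validation error, and the swarm searches its hyperparameters.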
An Integrated Method for Optimizing Bridge Maintenance Plans
Bridges are vital civil infrastructure assets, essential for economic development and public welfare. Their large numbers, deteriorating condition, public demands for safe and efficient transportation networks, and limited maintenance and intervention budgets pose a challenge, particularly when coupled with the need to respect environmental constraints. This state of affairs creates a wide gap between critical needs for intervention actions and tight maintenance and rehabilitation funds. In an effort to meet this challenge, a newly developed integrated method for optimized maintenance and intervention plans for reinforced concrete bridge decks is introduced. The method encompasses the development of five models: surface defects evaluation, corrosion severity evaluation, deterioration modeling, integrated condition assessment, and optimized maintenance plans. These models were automated in a set of standalone computer applications, coded using C#.NET in the MATLAB environment. These computer applications were subsequently combined to form an integrated method for optimized maintenance and intervention plans. Four bridges and a dataset of bridge images were used in testing and validating the developed optimization method and its five models.
The developed models have unique features and demonstrated noticeable performance and accuracy gains over methods used in practice and those reported in the literature. For example, the surface defects detection and evaluation model outperforms widely recognized machine learning and deep learning models, reducing surface defect detection, recognition, and evaluation errors by 56.08%, 20.2%, and 64.23%, respectively. The corrosion evaluation model comprises the design of a standardized amplitude rating system that circumvents the limitations of numerical amplitude-based corrosion maps. For integrated condition assessment, the developed model accomplished consistent improvement over the visual inspection procedures in use by the Ministry of Transportation in Quebec. Similarly, the deterioration model improved prediction accuracy by 60% on average when compared against the most commonly utilized Weibull distribution. The developed multi-objective optimization model yielded 49% and 25% improvement over a genetic algorithm in five-year and twenty-five-year study periods, respectively. Over a thirty-five-year study period, unlike the developed model, classical meta-heuristics failed to find feasible solutions within the assigned constraints. The developed integrated platform is expected to provide an efficient tool that enables decision makers to formulate sustainable maintenance plans that optimize budget allocations and ensure efficient utilization of resources.
Learning in games with continuous action spaces and unknown payoff functions