Stochastic optimization methods for the simultaneous control of parameter-dependent systems
We address the application of stochastic optimization methods for the
simultaneous control of parameter-dependent systems. In particular, we focus on
the classical Stochastic Gradient Descent (SGD) approach of Robbins and Monro,
and on the recently developed Continuous Stochastic Gradient (CSG) algorithm.
We consider the problem of computing simultaneous controls through the
minimization of a cost functional defined as the superposition of individual
costs for each realization of the system. We compare the performance of these
stochastic approaches, in terms of their computational complexity, with that
of the more classical Gradient Descent (GD) and Conjugate Gradient (CG)
algorithms, and we discuss the advantages and disadvantages of each
methodology. In agreement with well-established results in the machine learning
context, we show how the SGD and CSG algorithms can significantly reduce the
computational burden when treating control problems depending on a large number
of parameters. These findings are corroborated by numerical experiments.
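The idea of minimizing a superposed cost by sampling one realization per iteration can be sketched in a few lines. The setup below is our own toy illustration, not the paper's control problem: a hypothetical linear parameter-dependent system A_k u = y_k with N realizations, where a single control u minimizes the averaged quadratic tracking cost, updated with Robbins–Monro diminishing step sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed for illustration): N realizations A_k of a linear system,
# targets y_k generated from a hidden control u_true plus small noise.
# Simultaneous control: minimize J(u) = (1/N) * sum_k ||A_k u - y_k||^2.
N, n = 50, 4
A = 2.0 * np.eye(n) + 0.3 * rng.normal(size=(N, n, n))  # realizations A_k
u_true = rng.normal(size=n)
y = A @ u_true + 0.05 * rng.normal(size=(N, n))         # targets y_k

def cost(u):
    # Averaged cost over all realizations.
    r = A @ u - y
    return float(np.mean(np.sum(r * r, axis=1)))

def sgd(u, steps=3000):
    # Robbins-Monro SGD: at each step sample ONE realization k and take a
    # gradient step on its individual cost, with step sizes 1/(b + t) that
    # decay slowly enough to average out the sampling noise.
    for t in range(steps):
        k = rng.integers(N)
        grad = 2.0 * A[k].T @ (A[k] @ u - y[k])
        u = u - (1.0 / (20.0 + t)) * grad
    return u

u0 = np.zeros(n)
u_sgd = sgd(u0)
print(cost(u0), cost(u_sgd))
```

Each SGD step touches only one realization, whereas a full GD step would require assembling gradients over all N systems; this is the source of the computational savings when N is large.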
Learning and Management for Internet-of-Things: Accounting for Adaptivity and Scalability
Internet-of-Things (IoT) envisions an intelligent infrastructure of networked
smart devices offering task-specific monitoring and control services. The
unique features of IoT include extreme heterogeneity, a massive number of
devices, and unpredictable dynamics partially due to human interaction. These
call for foundational innovations in network design and management. Ideally,
such a design should allow efficient adaptation to changing environments and
low-cost implementation that scales to a massive number of devices, subject to stringent
latency constraints. To this end, the overarching goal of this paper is to
outline a unified framework for online learning and management policies in IoT
through joint advances in communication, networking, learning, and
optimization. From the network architecture vantage point, the unified
framework leverages a promising fog architecture that enables smart devices to
have proximity access to cloud functionalities at the network edge, along the
cloud-to-things continuum. From the algorithmic perspective, key innovations
target online approaches adaptive to different degrees of nonstationarity in
IoT dynamics, and their scalable model-free implementation under limited
feedback that motivates blind or bandit approaches. The proposed framework
aspires to offer a stepping stone that leads to systematic designs and analysis
of task-specific learning and management schemes for IoT, along with a host of
new research directions to build on.
Comment: Submitted on June 15 to Proceedings of the IEEE Special Issue on Adaptive and Scalable Communication Networks
Adaptively parametrized surface wave tomography: methodology and a new model of the European upper mantle
In this study, we aim to close the gap between regional and global traveltime tomography in the context of surface wave tomography of the upper mantle by implementing the principle of adaptive parametrization. Observations of seismic surface waves are a very powerful tool to constrain the 3-D structure of the Earth's upper mantle, including its anisotropy, because they sample this volume efficiently thanks to their sensitivity over a wide depth range along the ray path. On a global scale, surface wave tomography models are often parametrized uniformly, without accounting for inhomogeneities in data coverage and, as a result, in resolution; this leads to effective under- or overparametrization in many areas. If the local resolving power of seismic data is not taken into account when parametrizing the model, features will be smeared and distorted in tomographic maps, with subsequent misinterpretation. Parametrization density has to change locally for models to be robustly constrained without losing any accurate information available in the best sampled regions. We have implemented a new algorithm for upper mantle surface wave tomography, based on adaptive-voxel parametrization, with voxel size defined by both the ‘hit count' (the number of observations sampling the voxel) and the ‘azimuthal coverage' (how well different azimuths with respect to the voxel are covered by the source-station distribution). High image resolution is achieved in regions with dense data coverage, while lower image resolution is kept in regions where data coverage is poorer. This way, parametrization is everywhere tuned to the locally achievable resolution, minimizing both the computational cost and the non-uniqueness of the solution. The spacing of our global grid is locally as small as ∼50 km. We apply our method to derive a new global model of vertically and horizontally polarized shear velocity, with resolution particularly enhanced in the European lithosphere and upper mantle.
We find our new model to resolve lithospheric thickness and radial anisotropy better than earlier results based on the same data. Robust features of our model include, for example, the Trans-European Suture Zone, the Pannonian Basin, thinned lithosphere in the Aegean and Western Mediterranean, possible small-scale mantle upwellings under Iberia and the Massif Central, subduction under the Aegean arc and a very deep cratonic root underneath southern Finland.
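The hit-count criterion for choosing voxel size can be illustrated with a toy 2-D sketch. This is our own simplified illustration, not the paper's algorithm (which also uses azimuthal coverage and operates on a global 3-D grid): starting from a fine grid, a 2×2 block of fine cells is kept only if every cell in it receives at least a minimum number of sample points; otherwise the block is merged into one coarse cell.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ray-sampling points with uneven coverage: the right part of
# the domain (x >= 0.6) is left unsampled on purpose.
fine = 8                                  # fine grid is fine x fine cells
pts = rng.uniform(size=(600, 2))
pts = pts[pts[:, 0] < 0.6]

# 'Hit count': number of sample points falling in each fine cell.
hits, _, _ = np.histogram2d(pts[:, 0], pts[:, 1],
                            bins=fine, range=[[0, 1], [0, 1]])

min_hits = 3
cells = []                                # (x0, y0, size) in fine-grid units
for i in range(0, fine, 2):
    for j in range(0, fine, 2):
        block = hits[i:i + 2, j:j + 2]
        if (block >= min_hits).all():     # well sampled: keep 4 fine cells
            cells += [(i, j, 1), (i + 1, j, 1), (i, j + 1, 1), (i + 1, j + 1, 1)]
        else:                             # poorly sampled: one coarse cell
            cells.append((i, j, 2))

print(len(cells), "cells; coarse:", sum(1 for c in cells if c[2] == 2))
```

The resulting mesh is fine where data are dense and coarse where they are sparse, which is the basic mechanism by which an adaptive parametrization avoids underconstrained voxels without sacrificing detail in well-sampled regions.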
SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
Computer vision is experiencing an AI renaissance, in which machine learning
models are expediting important breakthroughs in academic research and
commercial applications. Effectively training these models, however, is not
trivial due in part to hyperparameters: user-configured values that control a
model's ability to learn from data. Existing hyperparameter optimization
methods are highly parallel but make no effort to balance the search across
heterogeneous hardware or to prioritize searching high-impact spaces. In this
paper, we introduce a framework for massively Scalable Hardware-Aware
Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the
relative complexity of each search space and monitors performance on the
learning task over all trials. These metrics are then used as heuristics to
assign hyperparameters to distributed workers based on their hardware. We first
demonstrate that our framework achieves double the throughput of a standard
distributed hyperparameter optimization framework by optimizing SVM for MNIST
using 150 distributed workers. We then conduct model search with SHADHO over
the course of one week using 74 GPUs across two compute clusters to optimize
U-Net for a cell segmentation task, discovering 515 models that achieve a lower
validation loss than standard U-Net.
Comment: 10 pages, 6 figures
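The core hardware-aware heuristic can be sketched in miniature. The names, complexity scores, and speed ratings below are invented for illustration and this is not SHADHO's actual API: search spaces are ranked by a relative-complexity score and matched to workers ranked by hardware speed, so the most expensive spaces land on the most capable hardware.

```python
# Hypothetical search spaces with relative complexity scores (e.g. derived
# from the number and size of hyperparameter dimensions), and workers with
# relative hardware speed ratings. All values are made up for illustration.
spaces = {"svm_rbf": 3.0, "svm_linear": 1.0, "unet": 9.0}
workers = {"cpu_node": 1.0, "gpu_node": 8.0, "old_cpu": 0.5}

def assign(spaces, workers):
    # Greedy matching: sort both sides in descending order and pair them,
    # so the most complex space goes to the fastest worker.
    ranked_spaces = sorted(spaces, key=spaces.get, reverse=True)
    ranked_workers = sorted(workers, key=workers.get, reverse=True)
    return dict(zip(ranked_workers, ranked_spaces))

print(assign(spaces, workers))
# {'gpu_node': 'unet', 'cpu_node': 'svm_rbf', 'old_cpu': 'svm_linear'}
```

In the full system such scores would be recomputed as trial results arrive, letting the scheduler also prioritize high-impact spaces rather than assigning them statically.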