Initialization of lattice Boltzmann models with the help of the numerical Chapman-Enskog expansion
We extend the applicability of the numerical Chapman-Enskog expansion as a
lifting operator for lattice Boltzmann models to map density and momentum to
distribution functions. In earlier work [Vanderhoydonc et al. Multiscale Model.
Simul. 10(3): 766-791, 2012] such an expansion was constructed in the context
of lifting only the zeroth order velocity moment, namely the density. A lifting
operator is necessary to convert information from the macroscopic to the
mesoscopic scale. This operator is used for the initialization of lattice
Boltzmann models. Given only density and momentum, the goal is to initialize
the distribution functions of lattice Boltzmann models. For this
initialization, the numerical Chapman-Enskog expansion is used in this paper.
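The zeroth-order term of a Chapman-Enskog lifting is the local equilibrium distribution determined by density and momentum. As an illustrative sketch only (not the paper's numerical expansion, which also supplies higher-order correction terms), a standard D2Q9 equilibrium initialization at a single lattice node might look like:

```python
import numpy as np

# D2Q9 lattice: discrete velocities and weights (standard choice)
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)
cs2 = 1/3  # lattice speed of sound squared

def equilibrium(rho, u):
    """Zeroth-order lifting: map density rho and velocity u to the
    nine distribution functions f_i at one lattice node."""
    cu = c @ u                      # c_i . u for each direction
    usq = u @ u
    return rho * w * (1 + cu/cs2 + cu**2/(2*cs2**2) - usq/(2*cs2))

rho, u = 1.0, np.array([0.05, 0.0])
f = equilibrium(rho, u)
# The prescribed moments are recovered: sum f_i = rho, sum c_i f_i = rho*u
assert np.isclose(f.sum(), rho)
assert np.allclose(c.T @ f, rho * u)
```

This only matches the zeroth and first velocity moments; the point of the paper's numerical expansion is to also fix the higher-order, gradient-dependent parts of f that such an equilibrium initialization leaves on the wrong manifold.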
Coarse Grained Computations for a Micellar System
We establish, through coarse-grained computation, a connection between
traditional, continuum numerical algorithms (initial value problems as well as
fixed point algorithms) and atomistic simulations of the Larson model of
micelle formation. The procedure hinges on the (expected) evolution of a few
slow, coarse-grained mesoscopic observables of the MC simulation, and on
(computational) time scale separation between these and the remaining "slaved",
fast variables. Short bursts of appropriately initialized atomistic simulation
are used to estimate the (coarse-grained, deterministic) local dynamics of the
evolution of the observables. These estimates are then in turn used to
accelerate the evolution to computational stationarity through traditional
continuum algorithms (forward Euler integration, Newton-Raphson fixed point
computation). This "equation-free" framework, bypassing the derivation of
explicit, closed equations for the observables (e.g. equations of state) may
provide a computational bridge between direct atomistic / stochastic simulation
and the analysis of its macroscopic, system-level consequences.
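The coarse time-stepping loop described above can be caricatured in a few lines: a short burst of a fine-scale simulator estimates the coarse time derivative, and a forward Euler "projective" step then extrapolates it over a large time interval. The toy fine-scale model (a noisy relaxation, standing in for the Larson Monte Carlo simulation) and all step sizes are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def fine_burst(x0, n_micro=50, dt=0.01):
    """Stand-in for a short burst of atomistic/MC simulation: a noisy
    relaxation toward x* = 1 (the noise plays the 'slaved' fast variables)."""
    x = x0
    for _ in range(n_micro):
        x += dt * (1.0 - x) + 0.02 * np.sqrt(dt) * rng.standard_normal()
    return x

def coarse_projective_step(x, n_micro=50, dt=0.01, DT=0.5):
    """Equation-free forward Euler: run a short fine burst, estimate the
    coarse derivative, then project forward over a large outer step DT."""
    x_end = fine_burst(x, n_micro, dt)
    dxdt = (x_end - x) / (n_micro * dt)   # estimated coarse derivative
    return x_end + DT * dxdt              # projective (outer) Euler step

x = 0.0
for _ in range(20):
    x = coarse_projective_step(x)
print(x)  # should land near the fixed point x* = 1
```

The same estimated derivative could instead feed a Newton-Raphson solve for the coarse fixed point, which is the other continuum algorithm the abstract mentions.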
Numerical extraction of a macroscopic PDE and a lifting operator from a lattice Boltzmann model
Lifting operators play an important role in starting a lattice Boltzmann
model from a given initial density. The density, a macroscopic variable, needs
to be mapped to the distribution functions, mesoscopic variables, of the
lattice Boltzmann model. Several methods proposed as lifting operators have
been tested and discussed in the literature. The best-known approaches are an
analytically derived lifting operator, such as the Chapman-Enskog expansion, and a
numerical method, such as the Constrained Runs algorithm, which arrives at an
implicit expression for the unknown distribution functions with the help of the density.
This paper proposes a lifting operator that alleviates several drawbacks of
these existing methods. In particular, we focus on the computational expense
and the analytical work that needs to be done. The proposed lifting operator, a
numerical Chapman-Enskog expansion, obtains the coefficients of the
Chapman-Enskog expansion numerically. Another important feature of the use of
lifting operators is found in hybrid models. There the lattice Boltzmann model
is spatially coupled with a model based on a more macroscopic description, for
example an advection-diffusion-reaction equation. In one part of the domain
the lattice Boltzmann model is used, while the more macroscopic model covers
the other. Such a hybrid coupling results in missing data at the
interfaces between the different models. A lifting operator is then an
important tool since the lattice Boltzmann model is typically described by more
variables than a model based on a macroscopic partial differential equation.
Comment: submitted to SIAM MM
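The Constrained Runs idea mentioned above can be sketched on a minimal D1Q3 diffusion lattice: repeatedly advance the lattice Boltzmann map while resetting the known density moment, so that the unknown higher-order moments relax toward the slow manifold. This is an illustrative toy, not the scheme analyzed in the paper; the lattice, relaxation rate, and reset rule are all assumptions:

```python
import numpy as np

# D1Q3 BGK lattice for a diffusion problem; velocities -1, 0, +1
v = np.array([-1.0, 0.0, 1.0])
w = np.array([1/6, 2/3, 1/6])
omega = 1.0  # BGK relaxation rate

def lbm_step(f, rho_given=None):
    """One collide-and-stream step on a periodic 1-D lattice.
    If rho_given is set, the density moment is reset afterwards
    (the 'constrained' part of the Constrained Runs idea)."""
    rho = f.sum(axis=0)
    feq = w[:, None] * rho            # zero-velocity diffusive equilibrium
    f = f + omega * (feq - f)         # collision
    f[0] = np.roll(f[0], -1)          # stream v = -1
    f[2] = np.roll(f[2], +1)          # stream v = +1
    if rho_given is not None:
        f += w[:, None] * (rho_given - f.sum(axis=0))  # reset density
    return f

def constrained_runs_lift(rho, n_iter=30):
    """Lifting: start from equilibrium for the given density, then iterate
    constrained steps so the higher moments settle near the slow manifold."""
    f = w[:, None] * rho
    for _ in range(n_iter):
        f = lbm_step(f, rho_given=rho)
    return f

rho = 1.0 + 0.1 * np.sin(2*np.pi*np.arange(32)/32)
f = constrained_runs_lift(rho)
assert np.allclose(f.sum(axis=0), rho)   # density is preserved exactly
```

The numerical Chapman-Enskog lifting the paper proposes avoids this fixed-point iteration by estimating the expansion coefficients directly, which is where the claimed savings in computational expense come from.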
Some of the variables, some of the parameters, some of the times, with some physics known: Identification with partial information
Experimental data often consist of variables measured independently, at
different sampling rates (non-uniform Δt between successive
measurements); and at a specific time point only a subset of all variables may
be sampled. Approaches to identifying dynamical systems from such data
typically use interpolation, imputation or subsampling to reorganize or modify
the training data prior to learning. Partial physical knowledge may
also be available (accurately or approximately), and
data-driven techniques can complement this knowledge. Here we exploit neural
network architectures based on numerical integration methods and physical knowledge to identify the right-hand side of the underlying
governing differential equations. Iterates of such neural-network models allow
for learning from data sampled at arbitrary time points without any data
modification. Importantly, we integrate the network with available partial
physical knowledge in "physics informed gray-boxes"; this enables learning
unknown kinetic rates or microbial growth functions while simultaneously
estimating experimental parameters.
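A minimal caricature of the integrator-templated idea: compose Euler sub-steps of a parameterized right-hand side between successive, non-uniformly spaced observations, and fit the parameter so the iterates match the data. The one-parameter model (a stand-in for a neural network), the true dynamics, the sampling scheme, and the grid search in place of gradient training are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 3, 40))            # irregular sample times
x = 2.0 * np.exp(-0.7 * t)                    # trajectory of dx/dt = -0.7 x

def predict_next(x0, dt, theta, n_sub=20):
    """Integrate the model RHS f(x) = theta*x from one observation to the
    next with n_sub Euler sub-steps (the 'integrator template')."""
    h = dt / n_sub
    for _ in range(n_sub):
        x0 = x0 + h * theta * x0
    return x0

def loss(theta):
    preds = [predict_next(x[i], t[i+1] - t[i], theta)
             for i in range(len(t) - 1)]
    return np.mean((np.array(preds) - x[1:])**2)

# Crude 1-D parameter search in place of gradient-based training
thetas = np.linspace(-2.0, 0.0, 401)
theta_hat = thetas[np.argmin([loss(th) for th in thetas])]
print(theta_hat)  # recovers roughly the true rate -0.7
```

Because each prediction spans exactly the observed interval t[i+1] - t[i], no interpolation or resampling of the data is needed, which is the point the abstract makes.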
Advances in scaling deep learning algorithms
Deep learning algorithms are a new set of powerful methods for machine
learning. The general idea is to combine layers of latent factors into
hierarchies. This usually leads to a higher computational cost and having more
parameters to tune. Thus scaling to larger problems will require not only
reducing their computational cost but also improving regularization and
optimization. This thesis investigates scaling from these three perspectives.
We first study the problem of reducing the computational cost of some deep learning
algorithms. We propose methods to scale restricted Boltzmann machines (RBM) and
denoising auto-encoders (DAE) to very high-dimensional sparse distributions.
This is important for applications of deep learning to natural language
processing. Both methods (Dauphin et al., 2011; Dauphin and Bengio, 2013)
rely on importance sampling to subsample the learning objective of
these models. We show that this greatly reduces the training time, leading to 2
orders of magnitude speed ups on several benchmark datasets without losses in
the quality of the model.
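The importance-sampling idea can be illustrated on a plain squared-error reconstruction: score the few active dimensions of a sparse input exactly, and estimate the contribution of the many inactive ones from a reweighted uniform sample. This is a schematic stand-in for the actual RBM/DAE objectives; the loss, the uniform sampler, and the dimensions are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def subsampled_reconstruction_loss(x, x_hat_fn, n_neg=64):
    """Sketch of a subsampled reconstruction loss for a sparse input:
    exact loss on the (few) nonzero dimensions, plus an unbiased,
    reweighted estimate over a uniform sample of the (many) zeros."""
    nz = np.flatnonzero(x)                         # active dimensions
    zeros = np.setdiff1d(np.arange(x.size), nz)
    neg = rng.choice(zeros, size=n_neg, replace=False)
    loss_nz = np.sum((x_hat_fn(nz) - x[nz])**2)
    loss_z = (zeros.size / n_neg) * np.sum(x_hat_fn(neg)**2)
    return loss_nz + loss_z

# Toy check against the full (unsubsampled) objective
d = 10_000
x = np.zeros(d)
x[rng.choice(d, 20, replace=False)] = 1.0
x_hat = rng.uniform(0, 0.01, d)                    # some reconstruction
full = np.sum((x_hat - x)**2)
est = np.mean([subsampled_reconstruction_loss(x, lambda idx: x_hat[idx])
               for _ in range(200)])
assert abs(est - full) / full < 0.05               # unbiased estimator
```

Only the sampled dimensions ever need to be reconstructed, which is where the reported order-of-magnitude savings for high-dimensional sparse inputs come from.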
Second, we introduce a powerful regularization method for deep neural nets.
Experiments have shown that proper regularization is in many cases crucial to
obtaining good performance out of larger networks (Hinton et al., 2012).
In Rifai et al. (2011), we propose a new regularizer that combines
unsupervised learning and tangent propagation (Simard et al., 1992). The method
exploits several geometrical insights and was able at the time of publication
to reach state-of-the-art results on competitive benchmarks.
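The tangent-propagation ingredient can be sketched as a penalty on the directional derivative of a network along a tangent direction of the data manifold. In Rifai et al. (2011) the tangents come from an unsupervised model (a contractive auto-encoder); here the tiny network, its weights, and the tangent are arbitrary assumptions, and the derivative is estimated by a central finite difference:

```python
import numpy as np

rng = np.random.default_rng(3)
W1 = rng.standard_normal((8, 2))   # toy 2 -> 8 -> 1 network
W2 = rng.standard_normal((1, 8))

def net(x):
    return W2 @ np.tanh(W1 @ x)

def tangent_penalty(x, t, eps=1e-4):
    """Tangent-propagation-style penalty: squared directional derivative
    of the network output along a manifold tangent t, estimated by a
    central finite difference."""
    d = (net(x + eps*t) - net(x - eps*t)) / (2*eps)
    return float(d @ d)

x = np.array([1.0, 0.0])
t = np.array([0.0, 1.0])   # e.g. a tangent of the unit circle at x
print(tangent_penalty(x, t))
```

Adding this penalty to the training loss encourages the network to be invariant along the learned tangent directions while remaining sensitive across the manifold.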
Finally, we consider the problem of optimizing over high-dimensional non-convex
loss surfaces like those found in deep neural nets. Traditionally, the main
difficulty in these problems is considered to be the abundance of local minima.
In Dauphin et al. (2014a) we argue, based on results from
statistical physics, random matrix theory, neural network theory, and empirical
evidence, that the vast majority of critical points
are saddle points, not local minima. We also propose a new optimization method
for such non-convex problems.
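The saddle-point difficulty can be illustrated with the "absolute Hessian" repair used in saddle-free Newton: replace the Hessian's eigenvalues by their absolute values so that negative-curvature directions repel rather than attract. The toy surface and starting point below are assumptions for illustration:

```python
import numpy as np

# f(x, y) = x^2 + (y^2 - 1)^2: minima at (0, +/-1), saddle at (0, 0)
def grad(p):
    x, y = p
    return np.array([2*x, 4*y**3 - 4*y])

def hess(p):
    _, y = p
    return np.diag([2.0, 12*y**2 - 4])

def newton(p, saddle_free=False, n=50):
    for _ in range(n):
        H = hess(p)
        if saddle_free:
            # Saddle-free modification: |eigenvalues| of the Hessian, so
            # negative-curvature directions push away from the saddle
            lam, Q = np.linalg.eigh(H)
            H = Q @ np.diag(np.abs(lam) + 1e-6) @ Q.T
        p = p - np.linalg.solve(H, grad(p))
    return p

p0 = np.array([0.3, 0.1])
print(newton(p0))                    # plain Newton is attracted to the saddle
print(newton(p0, saddle_free=True))  # the modified step reaches a minimum
```

Plain Newton treats the saddle at the origin as just another zero-gradient point and converges to it; the sign-flipped curvature turns the same step into one that escapes along the negative-curvature direction.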
Interior-point methods for PDE-constrained optimization
In the applied sciences, PDEs model an extensive variety of phenomena. Typically, the final goal of a simulation is a system that is optimal in a certain sense. For instance, optimal control problems identify a control to steer a system towards a desired state. Inverse problems seek PDE parameters that are most consistent with measurements. In these optimization problems PDEs appear as equality constraints. PDE-constrained optimization problems are large-scale and often nonconvex. Their numerical solution leads to large ill-conditioned linear systems. In many practical problems, inequality constraints implement technical limitations or prior knowledge.
In this thesis, interior-point (IP) methods are considered to solve nonconvex large-scale PDE-constrained optimization problems with inequality constraints. To cope with the enormous fill-in of direct linear solvers, inexact search directions are allowed in an inexact interior-point (IIP) method. This thesis builds upon the IIP method proposed in [Curtis, Schenk, Wächter, SIAM Journal on Scientific Computing, 2010]. Its SMART tests cope with the lack of inertia information by controlling the Hessian modification, and also specify termination tests for the iterative linear solver.
The original IIP method needs to solve two sparse large-scale linear systems in each optimization step. This is improved to only a single linear system solution in most optimization steps. Within this improved IIP framework, two iterative linear solvers are evaluated: a general-purpose algebraic multilevel incomplete LDL^T preconditioned SQMR method is applied to PDE-constrained optimization problems for optimal server room cooling in three space dimensions and to compute an ambient temperature for optimal cooling. The results show the robustness and efficiency of the IIP method when compared with the exact IP method.
These advantages are even more evident for a reduced-space preconditioned (RSP) GMRES solver which takes advantage of the linear system's structure. This RSP-IIP method is studied on the basis of distributed and boundary control problems originating from superconductivity and from two-dimensional and three-dimensional parameter estimation problems in groundwater modeling. The numerical results exhibit the improved efficiency especially for multiple PDE constraints.
An inverse medium problem for the Helmholtz equation with pointwise box constraints is solved by IP methods. The ill-posedness of the problem is explored numerically and different regularization strategies are compared. The impact of box constraints and the importance of Hessian modification for the optimization algorithm are demonstrated. A real-world seismic imaging problem is solved successfully by the RSP-IIP method.