On the Analysis of Trajectories of Gradient Descent in the Optimization of Deep Neural Networks
Theoretical analysis of the error landscape of deep neural networks has
garnered significant interest in recent years. In this work, we theoretically
study the importance of noise in the trajectories of gradient descent towards
optimal solutions in multi-layer neural networks. We show that adding noise (in
different ways) to a neural network while training increases the rank of the
product of weight matrices of a multi-layer linear neural network. We thus
study how adding noise can assist in reaching a global optimum when the
product matrix is full-rank (under certain conditions). We establish
theoretical connections between the noise injected into the neural network -
whether into the gradients, the architecture, or the inputs/outputs - and the
rank of the product of weight matrices. We corroborate our theoretical
findings with empirical results.
Comment: 4 pages + 1 figure (main, excluding references), 5 pages + 4 figures
(appendix)
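The rank quantity the abstract studies can be probed numerically. The following is a minimal, hypothetical sketch (not the paper's construction): train a three-layer linear network on a rank-deficient target map, with and without Gaussian noise added to the gradients, and compare the numerical rank of the product of the weight matrices. All dimensions, noise levels, and the rank-1 target are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: numerical rank of the product of weight matrices of a
# 3-layer *linear* network, trained with and without Gaussian gradient noise.
# Illustrative only; not the paper's construction.

rng = np.random.default_rng(0)
d, n = 8, 256
W0 = [0.1 * rng.standard_normal((d, d)) for _ in range(3)]
target = np.outer(rng.standard_normal(d), rng.standard_normal(d))  # rank-1 map
X = rng.standard_normal((d, n))
Y = target @ X

def product(W):
    P = np.eye(d)
    for Wi in W:
        P = Wi @ P          # P = W3 @ W2 @ W1
    return P

def grads(W):
    R = product(W) @ X - Y  # residual of the end-to-end linear map
    out = []
    for i in range(len(W)):
        left = np.eye(d)
        for Wj in W[i + 1:]:
            left = Wj @ left
        right = np.eye(d)
        for Wj in W[:i]:
            right = Wj @ right
        out.append(left.T @ R @ (right @ X).T / n)  # dL/dW_i for squared loss
    return out

def train(noise_std, steps=300, lr=0.05):
    W = [Wi.copy() for Wi in W0]
    for _ in range(steps):
        for Wi, Gi in zip(W, grads(W)):
            Wi -= lr * (Gi + noise_std * rng.standard_normal((d, d)))
    return np.linalg.matrix_rank(product(W), tol=1e-6)

print("rank without noise:", train(0.0))
print("rank with noise:   ", train(0.01))
```

Comparing the two printed ranks for different `noise_std` values is a quick empirical probe of the abstract's claim that injected noise affects the rank of the product matrix.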
Boron nitride photocatalysts for solar fuel synthesis
Reshaping our global energy portfolio in light of the rising anthropogenic CO2 emissions
is paramount. Solar fuel production via photocatalysis constitutes a sustainable energy generation route, allowing one to harness the abundance of sunlight for CO2 transformation. In this thesis, we develop a new materials platform for boron nitride (BN) photocatalysts in solar fuel synthesis. We present a proof-of-concept for a porous boron oxynitride (BNO) photocatalyst facilitating gas phase CO2 capture and photoreduction, without doping or cocatalysts. We then present two routes to enhance light harvesting and photoactivity in BN: boron and oxygen doping. Boron doping yielded B-BNO, the first water-stable, photoactive BN material, facilitating liquid phase H2 evolution under deep visible irradiation (λ > 550 nm) and gas phase CO2 photoreduction. In parallel, we demonstrate that tuning the oxygen content in BNO can extend light harvesting into the deep visible region. Using a systematic design-of-experiments process, we tune and predict the chemical, paramagnetic and optoelectronic properties of BNO. We probe the role of free radicals and paramagnetic states in the photochemistry of BNO using a combined experimental, computational and
first-principles approach. The family of BN photocatalysts all exhibit unique paramagnetism, shown to arise from free radicals in isolated OB3 sites, which we unequivocally confirm as the governing state for red-shifted light harvesting and photoactivity in BNO. Finally, we explore
a new avenue in BN photocatalyst design and present the first example of semiconducting
BNO quantum dots for CO2 photoreduction. The evolution rates, quantum efficiencies, and
selectivities of all the BN materials surpassed P25 TiO2 and graphitic carbon nitride -
benchmark photocatalysts in the field. Overall, this thesis opens the door to a radically new
generation of BN-based photocatalysts for solar fuel synthesis.
ADINE: An Adaptive Momentum Method for Stochastic Gradient Descent
Two major momentum-based techniques that have achieved tremendous success in
optimization are Polyak's heavy ball method and Nesterov's accelerated
gradient. A crucial step in all momentum-based methods is the choice of the
momentum parameter, which is always suggested to be set to less than 1.
Although this choice is justified only under very strong theoretical
assumptions, it works well in practice even when the assumptions do not
necessarily hold. In this paper, we propose a new momentum-based method,
ADINE, which relaxes the constraint on the momentum parameter and allows the
learning algorithm to use adaptive higher momentum. We motivate our hypothesis
by experimentally verifying that a higher momentum (greater than 1) can help
escape saddles much faster. Using this motivation, we propose our method,
which helps weigh the previous updates more (by setting the momentum
parameter above 1). We evaluate our proposed algorithm on deep neural
networks and show that ADINE helps the learning algorithm to converge much
faster without compromising on the generalization error.
Comment: 8 + 1 pages, 12 figures, accepted at CoDS-COMAD 201
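A generic heavy-ball sketch can illustrate the idea of letting the momentum parameter adaptively exceed 1. The adaptation rule below (grow the momentum while the loss keeps falling, reset on an uphill step) and the test problem are illustrative assumptions, not the paper's exact ADINE schedule:

```python
import numpy as np

# Heavy-ball descent on an ill-conditioned quadratic with an *adaptive*
# momentum that is allowed to exceed 1.  The schedule here (grow beta while
# the loss falls, reset the momentum on an uphill step) is an illustrative
# assumption, not the exact ADINE rule.

rng = np.random.default_rng(1)
A = np.diag([1.0, 10.0, 100.0])          # condition number 100
f = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w

w = rng.standard_normal(3)
v = np.zeros_like(w)
beta, lr = 0.9, 1e-3
f0 = prev = f(w)
for _ in range(2000):
    v = beta * v - lr * grad(w)          # heavy-ball velocity update
    cand = w + v
    cur = f(cand)
    if cur < prev:                       # accept; momentum may grow past 1
        w, prev = cand, cur
        beta = min(1.05, beta * 1.001)
    else:                                # reject; restart the momentum
        beta, v = 0.9, np.zeros_like(w)

print(f"initial loss {f0:.3e} -> final loss {prev:.3e}")
```

Because uphill candidates are rejected, the loss here is monotone by construction; the point of the sketch is only that a momentum above 1 can be used safely when it is adapted rather than fixed.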
DANTE: Deep AlterNations for Training nEural networks
We present DANTE, a novel method for training neural networks using the
alternating minimization principle. DANTE provides an alternate perspective to
traditional gradient-based backpropagation techniques commonly used to train
deep networks. It utilizes an adaptation of quasi-convexity to cast training a
neural network as a bi-quasi-convex optimization problem. We show that for
neural network configurations with both differentiable (e.g. sigmoid) and
non-differentiable (e.g. ReLU) activation functions, we can perform the
alternations effectively in this formulation. DANTE can also be extended to
networks with multiple hidden layers. In experiments on standard datasets,
neural networks trained using the proposed method were found to be promising
and competitive with traditional backpropagation, both in terms of the
quality of the solution and the training speed.
Comment: 19 pages
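The alternating scheme can be sketched for a one-hidden-layer network: fix one layer, optimize the other, and alternate. The code below is a plain alternating least-squares/gradient variant, assumed for illustration; it is not DANTE's quasi-convex formulation:

```python
import numpy as np

# Alternating-minimization sketch for y ~ W2 @ sigmoid(W1 @ x).
# Fix W1 and solve the (convex) least-squares subproblem for W2 exactly;
# then fix W2 and take a few gradient steps on W1.  Illustrative only;
# DANTE's actual subproblems are cast as quasi-convex optimizations.

rng = np.random.default_rng(2)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n, d_in, d_h, d_out = 200, 5, 8, 2
X = rng.standard_normal((d_in, n))
# A realizable teacher network generates the targets
Y = rng.standard_normal((d_out, d_h)) @ sigmoid(rng.standard_normal((d_h, d_in)) @ X)

W1 = 0.5 * rng.standard_normal((d_h, d_in))
W2 = 0.5 * rng.standard_normal((d_out, d_h))
mse = lambda: np.mean((W2 @ sigmoid(W1 @ X) - Y) ** 2)
mse0 = mse()

for _ in range(20):                      # outer alternations
    # Sub-step 1: a few gradient steps on W1 with W2 fixed
    for _ in range(25):
        H = sigmoid(W1 @ X)
        R = W2 @ H - Y
        W1 -= 0.1 * ((W2.T @ R) * H * (1 - H)) @ X.T / n
    # Sub-step 2: closed-form least squares for W2 with W1 fixed
    H = sigmoid(W1 @ X)
    W2 = np.linalg.lstsq(H.T, Y.T, rcond=None)[0].T

print(f"mse: {mse0:.4f} -> {mse():.4f}")
```

Ending each alternation with the closed-form solve keeps the output layer optimal for the current hidden features, which is what makes the per-layer subproblem tractable in schemes of this kind.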