11 research outputs found
Equilibration of deep neural networks and carrier chirality in Rashba systems
This thesis reports results of studies on the equilibration of two systems and consists of two parts: the first deals with the optimisation of deep neural networks, while the second deals with the decay of non-equilibrium states in strongly Rashba-coupled systems at low temperature.
Deep learning is a conceptually simple, highly effective, and widely used tool, yet why it works so well remains insufficiently understood. The optimisation of deep neural networks with common algorithms such as stochastic gradient descent performs unexpectedly well given the complexity of the underlying high-dimensional, non-convex minimisation problem. The first part of this thesis therefore examines the optimisation procedure from the perspective of statistical physics: the loss function landscape of a deep neural network is interpreted as the counterpart of the potential energy landscape of a molecular system, and the optimisation of the network as its equilibration dynamics. Using landscape exploration tools developed in theoretical chemistry, we resolve the structure of the loss function landscape, from which we draw conclusions about the relaxational dynamics of typical optimisers and, consequently, about deep learning.
The second part investigates how a non-equilibrium charge-carrier chirality distribution in a clean, strongly Rashba-coupled system at low temperature decays over time. We first motivate this analysis with experimental studies of transport properties in Rashba materials at low temperatures and subject to external magnetic fields. We investigate whether chirality imbalances could be the source of those experimental observations and develop a framework that models the behaviour of such a system. We then proceed with a more general theoretical study of the equilibration mechanisms of chirality in low-temperature, strongly Rashba-coupled systems and compute the relaxation timescales of those mechanisms. This thesis is the outcome of doctoral studies conducted at the University of Cambridge with the financial support of the Engineering and Physical Sciences Research Council of the UK.
Archetypal landscapes for deep neural networks
The predictive capabilities of deep neural networks (DNNs) continue to evolve to increasingly impressive levels. However, it is still unclear how training procedures for DNNs succeed in finding parameters that produce good results for such high-dimensional and nonconvex loss functions. In particular, we wish to understand why simple optimization schemes, such as stochastic gradient descent, do not end up trapped in local minima with high loss values that would not yield useful predictions. We explain the optimizability of DNNs by characterizing the local minima and transition states of the loss-function landscape (LFL) along with their connectivity. We show that the LFL of a DNN in the shallow network or data-abundant limit is funneled, and thus easy to optimize. Crucially, in the opposite low-data/deep limit, although the number of minima increases, the landscape is characterized by many minima with similar loss values separated by low barriers. This organization is different from the hierarchical landscapes of structural glass formers and explains why minimization procedures commonly employed by the machine-learning community can navigate the LFL successfully and reach low-lying solutions. A.A.L. was supported by the Winton Program for the Physics of Sustainability. P.C.V. and D.J.W. were supported by the Engineering and Physical Sciences Research Council.
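The landscape picture above can be made concrete with a toy example. The sketch below is purely illustrative, not the authors' methodology or software: it applies plain gradient descent to a one-dimensional double-well "loss", locates its two minima, and measures the barrier (the transition state at x = 0) separating them.

```python
import numpy as np

def loss(x):
    # Toy double-well "landscape": two minima separated by a barrier at x = 0.
    return x**4 - x**2

def grad(x):
    # Analytic derivative of the toy loss.
    return 4 * x**3 - 2 * x

def descend(x, lr=0.01, steps=2000):
    # Plain gradient descent settles into the nearest local minimum.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Starting on either side of the barrier finds the two minima at +/- 1/sqrt(2).
minima = sorted(round(descend(x0), 3) for x0 in (-1.0, 1.0))
barrier = loss(0.0) - loss(minima[0])  # barrier height above the minima
```

In a funneled landscape, or one where many similar minima are separated by low barriers such as this one, a descent-based optimizer reaches a low-lying solution from almost any starting point, which is the intuition the abstract develops for DNN training.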
Perspective: new insights from loss function landscapes of neural networks
Abstract: We investigate the structure of the loss function landscape for neural networks subject to dataset mislabelling, increased training set diversity, and reduced node connectivity, using various techniques developed for energy landscape exploration. The benchmarking models are classification problems for atomic geometry optimisation and hand-written digit prediction. We consider the effect of varying the size of the atomic configuration space used to generate initial geometries and find that the number of stationary points increases rapidly with the size of the training configuration space. We introduce a measure of node locality to limit network connectivity and perturb permutational weight symmetry, and examine how this parameter affects the resulting landscapes. We find that highly reduced systems have low capacity and exhibit landscapes with very few minima. On the other hand, small amounts of reduced connectivity can enhance network expressibility and can yield more complex landscapes. Investigating the effect of deliberate classification errors in the training data, we find that the variance in testing AUC, computed over a sample of minima, grows significantly with the training error, providing new insight into the role of the variance-bias trade-off when training under noise. Finally, we illustrate how the number of local minima for networks with two and three hidden layers, but a comparable number of variable edge weights, increases significantly with the number of layers, and as the number of training data decreases. This work helps shed further light on neural network loss landscapes and provides guidance for future work on neural network training and optimisation.
Au-Ge Alloys for Wide-Range Low-Temperature On-Chip Thermometry
We present results for a Au-Ge alloy that is useful as a resistance-based thermometer from room temperature down to at least 0.2 K. Over a wide range, the electrical resistivity of the alloy shows a logarithmic temperature dependence, which simultaneously retains the sensitivity required for practical thermometry while also maintaining a relatively modest and easily measurable value of resistivity. We characterize the sensitivity of the alloy as a possible thermometer and show that it compares favorably with commercially available temperature sensors. We experimentally identify that the characteristic logarithmic temperature dependence of the alloy stems from Kondo-like behavior induced by the specific heat treatment it undergoes. J.R.A.D., P.C.V., G.J.C., and V.N. acknowledge funding from the Engineering and Physical Sciences Research Council, United Kingdom. G.J.C. and S.E.R. acknowledge funding from the Royal Society, United Kingdom. J.F.O. thanks the Brazilian Agency CNPq. A.D. and S.K-N. acknowledge financial support through a European Research Council Starting Grant (Grant No. ERC-2014-STG-639526, NANOGEN).
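A resistivity that is logarithmic in temperature lends itself to a simple calibration scheme. The sketch below uses entirely synthetic numbers, not data from the paper: it fits a hypothetical calibration of the form rho = a + b*ln(T) and inverts it so a resistivity reading yields a temperature.

```python
import numpy as np

# Hypothetical calibration points (kelvin); resistivity rises logarithmically
# as T falls, the Kondo-like behavior described in the abstract. Values are
# illustrative only.
T = np.array([0.2, 1.0, 4.2, 77.0, 300.0])
rho = 10.0 - 0.5 * np.log(T)  # synthetic data obeying rho = a + b*ln(T)

# The calibration curve is a straight line in ln(T).
b, a = np.polyfit(np.log(T), rho, 1)

def temperature(rho_measured):
    # Invert the calibration to read temperature from a resistivity measurement.
    return np.exp((rho_measured - a) / b)
```

The practical appeal noted in the abstract is visible here: a single logarithmic law spans more than three decades in temperature, so one two-parameter fit covers the whole operating range.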
Color-dependent interactions in the three coloring model
Since it was first discussed by Baxter in 1970, the three coloring model has been studied in several contexts, from frustrated magnetism to superconducting devices and glassiness. In the presence of interactions, when the model is no longer exactly soluble, the phase diagram has been observed to be highly nontrivial. Here we discuss the generic case of "color-dependent" nearest-neighbor interactions between the vertex chiralities. We uncover different critical regimes merging into one another: c=1/2 free fermions combining into c=1 free bosons; c=1 free bosons combining into c=2 critical loop models; as well as three separate c=1/2 critical lines merging at a supersymmetric c=3/2 critical point. When the three coupling constants are tuned to equal one another, transfer-matrix calculations highlight a puzzling regime where the central charge appears to vary continuously from 3/2 to 2. This work was supported in part by Engineering and Physical Sciences Research Council (EPSRC) Grant No. GR/R83712/01, by EPSRC Postdoctoral Research Fellowship EP/G049394/1 (C. Castelnovo), and by EPSRC Grant No. EP/D070643/1 (JJHS). P. Verpoort acknowledges funding by the Studienstiftung des deutschen Volkes.
Research data supporting "Archetypal landscapes for deep neural networks"
This archive contains input and data files to obtain the results published in:
Archetypal landscapes for deep neural networks
P.C. Verpoort, A.A. Lee, D.J. Wales
Accepted in: Proceedings of the National Academy of Sciences of the USA
The folder data_files contains the training data (and, where produced, the testing data) for the energy landscapes reported in the article. The names of the subfolders correspond to the names given to these datasets in the article.
The LJAT19 dataset was created specifically for this publication, and has not been reported elsewhere. The other two datasets were taken from the UCI Machine Learning Repository and can also be obtained from there; links to the original data source (accessed on April 24th, 2020) are provided in the README files in each subfolder. For completeness and because these data had to be processed to serve as inputs for our landscape analysis software, the data files used for the present work are also contained within this archive.
The folder input_files contains the instruction input files for the GMIN, OPTIM and PATHSAMPLE programs. This is just a starting point: further manual refinement of parameters, and additional connectivity jobs, are required to fully reproduce the results presented in the article. A.A.L. was supported by the Winton Program for the Physics of Sustainability. P.C.V. and D.J.W. were supported by the Engineering and Physical Sciences Research Council.
Research data supporting "Long-lived nonequilibrium superconductivity in a noncentrosymmetric Rashba semiconductor"
Electrical resistivity measurements taken in low-temperature cryostats.
Abstract of associated paper in Physical Review B: We report nonequilibrium magnetodynamics in the Rashba superconductor GeTe, which lacks inversion symmetry in the bulk. We find that at low temperature the system exhibits a nonequilibrium state, which decays on timescales that exceed conventional electronic scattering times by many orders of magnitude. This reveals a nonequilibrium magnetoresponse that is asymmetric under magnetic-field reversal and, strikingly, induces a nonequilibrium superconducting state distinct from the equilibrium one. We develop a model of a Rashba system in which nonequilibrium configurations relax on a finite timescale that captures the qualitative features of the data. We also obtain evidence for the slow dynamics in another, nonsuperconducting Rashba system. Our work provides insights into the dynamics of noncentrosymmetric superconductors and Rashba systems in general.
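For context on what extracting a relaxation timescale looks like in practice, here is a minimal, purely illustrative sketch with synthetic numbers (not the measurements or model above): for a signal that relaxes exponentially, ln(signal) is linear in time with slope -1/tau, so a straight-line fit recovers the timescale.

```python
import numpy as np

# Synthetic slowly decaying nonequilibrium signal with tau = 12 (arbitrary
# units); purely illustrative, not data from the archive.
t = np.linspace(0.0, 50.0, 200)
signal = 3.0 * np.exp(-t / 12.0)

# For a pure exponential decay, ln(signal) = ln(A) - t/tau.
slope, _ = np.polyfit(t, np.log(signal), 1)
tau = -1.0 / slope
```

Real decay curves are noisy and may mix several timescales, in which case a direct nonlinear fit of one or more exponentials is the more robust choice; the log-linear fit above is only the simplest version of the idea.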