    Applications of machine learning to studies of quantum phase transitions

    In recent years, machine learning has proven to be a useful tool in quantum many-body physics for detecting phase transitions. Being able to identify phases via machine learning raises the question of how the algorithm learned to classify them, and thus how to interpret the model's predictions. In this thesis we present a study of the transition from a normal insulator to a topological insulator. We study this quantum phase transition in the framework of the Su-Schrieffer-Heeger model. We introduce two deep learning models: a standard convolutional neural network and a model based on deep residual learning. In particular, we focus on the interpretability of the models and their predictions by generating class activation maps (CAM) using a global average pooling (GAP) layer. We demonstrate this technique by applying it to the model both without and with disorder, and we give further analysis of the detection of states using transfer learning from clean to disordered systems. We conclude that the neural network is able to detect edge states when there is no disorder, but is unable to distinguish between edge states and Anderson-localized states once disorder is introduced.
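The CAM construction the abstract describes admits a compact sketch: with a GAP layer, the score for class k is a weighted sum of channel averages, so projecting the classifier weights back onto the last convolutional feature maps yields a spatial heat map. A minimal NumPy illustration (array shapes and function names are our assumptions, not the thesis's code):

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Compute a class activation map (CAM) from the final conv
    feature maps and the dense-layer weights that follow a global
    average pooling (GAP) layer.

    feature_maps : (C, H, W) output of the last conv layer
    fc_weights   : (num_classes, C) GAP-to-class weights
    class_idx    : index of the class to visualize
    """
    # CAM_k(x, y) = sum_c w_{k,c} * f_c(x, y)
    cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=1)
    # Normalize to [0, 1] for visualization as a heat map
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

# Toy example: 3 channels, 4x4 feature maps, 2 classes
rng = np.random.default_rng(0)
fmaps = rng.random((3, 4, 4))
weights = rng.random((2, 3))
cam = class_activation_map(fmaps, weights, class_idx=0)
```

High values in `cam` mark the spatial regions (e.g. lattice sites hosting edge states) that most increased the chosen class score.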

    Reinforcement Learning in Different Phases of Quantum Control

    The ability to prepare a physical system in a desired quantum state is central to many areas of physics such as nuclear magnetic resonance, cold atoms, and quantum computing. Yet, preparing states quickly and with high fidelity remains a formidable challenge. In this work we implement cutting-edge Reinforcement Learning (RL) techniques and show that their performance is comparable to optimal control methods in the task of finding short, high-fidelity driving protocols from an initial to a target state in non-integrable many-body quantum systems of interacting qubits. RL methods learn about the underlying physical system solely through a single scalar reward (the fidelity of the resulting state) calculated from numerical simulations of the physical system. We further show that quantum state manipulation, viewed as an optimization problem, exhibits a spin-glass-like phase transition in the space of protocols as a function of the protocol duration. Our RL-aided approach helps identify variational protocols with nearly optimal fidelity, even in the glassy phase, where optimal state manipulation is exponentially hard. This study highlights the potential usefulness of RL for applications in out-of-equilibrium quantum physics.
    Comment: A legend for the videos referred to in the paper is available at https://mgbukov.github.io/RL_movies
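As a toy illustration of the optimization problem the scalar reward defines (not the paper's many-body setup or its RL agent), one can exhaustively score every bang-bang driving protocol for a single qubit and keep the highest-fidelity sequence. The Hamiltonian, field values, and step count below are illustrative assumptions:

```python
import numpy as np
from itertools import product

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def evolve(psi, h, dt):
    """One time step under the toy Hamiltonian H(h) = -sz - h*sx."""
    H = -sz - h * sx
    vals, vecs = np.linalg.eigh(H)  # H is Hermitian
    U = vecs @ np.diag(np.exp(-1j * vals * dt)) @ vecs.conj().T
    return U @ psi

def fidelity(psi, phi):
    """Reward signal: overlap squared with the target state."""
    return abs(np.vdot(phi, psi)) ** 2

def ground_state(h):
    vals, vecs = np.linalg.eigh(-sz - h * sx)
    return vecs[:, 0]  # eigh sorts eigenvalues ascending

# Drive from the ground state at h = -2 to the one at h = +2
psi0, target = ground_state(-2.0), ground_state(+2.0)

# Score every bang-bang protocol h(t) in {-4, +4}^n_steps.
# An RL agent would instead *learn* which sequences score well
# from this same scalar reward.
n_steps, dt = 6, 0.5
best_F, best_protocol = 0.0, None
for protocol in product([-4.0, 4.0], repeat=n_steps):
    psi = psi0
    for h in protocol:
        psi = evolve(psi, h, dt)
    F = fidelity(psi, target)
    if F > best_F:
        best_F, best_protocol = F, protocol
```

Brute force works only for this tiny protocol space (2^6 sequences); the point of RL in the paper is to search such spaces when enumeration is exponentially out of reach.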

    Learning threshold neurons via the "edge of stability"

    Existing analyses of neural network training often operate under the unrealistic assumption of an extremely small learning rate. This lies in stark contrast to practical wisdom and empirical studies, such as the work of J. Cohen et al. (ICLR 2021), which exhibit startling new phenomena (the "edge of stability" or "unstable convergence") and potential benefits for generalization in the large learning rate regime. Despite a flurry of recent works on this topic, however, the latter effect is still poorly understood. In this paper, we take a step towards understanding genuinely non-convex training dynamics with large learning rates by performing a detailed analysis of gradient descent for simplified models of two-layer neural networks. For these models, we provably establish the edge of stability phenomenon and discover a sharp phase transition for the step size below which the neural network fails to learn "threshold-like" neurons (i.e., neurons with a non-zero first-layer bias). This elucidates one possible mechanism by which the edge of stability can in fact lead to better generalization, as threshold neurons are basic building blocks with useful inductive bias for many tasks.
    Comment: 31 pages, 13 figures, Published at NeurIPS 202
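The stability threshold at the heart of the "edge of stability" picture can already be seen on a quadratic: gradient descent with step size eta on a function of sharpness (curvature) lambda converges iff eta < 2/lambda, and edge-of-stability training hovers around that boundary. A minimal sketch of just the threshold (the paper's actual analysis concerns two-layer networks, not this toy):

```python
import numpy as np

def gd_trajectory(eta, sharpness=1.0, x0=1.0, steps=50):
    """Gradient descent on f(x) = (sharpness/2) * x**2.
    The update x <- (1 - eta*sharpness) * x contracts iff
    eta < 2/sharpness, the classical stability threshold.
    """
    x = x0
    traj = [x]
    for _ in range(steps):
        x = x - eta * sharpness * x  # exact gradient step on a quadratic
        traj.append(x)
    return np.array(traj)

small = gd_trajectory(eta=1.9)  # below 2/sharpness: converges
large = gd_trajectory(eta=2.1)  # above 2/sharpness: oscillates and diverges
```

On a genuine network the sharpness itself evolves during training, which is what makes the large-step-size regime interesting; the quadratic only pins down the 2/eta boundary.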

    Towards Real-World Federated Learning: Empirical Studies in the Domain of Embedded Systems

    Context: Artificial intelligence (AI) has ushered in a new phase of technological revolution and industrial development around the world since the start of the twenty-first century, transforming the way goods and services are produced. AI application technologies are gaining traction, particularly in professional services such as healthcare, education, finance, and security. As big data and cloud computing capabilities have improved, machine learning technologies have increasingly been applied at the production stage. With the growing focus on machine learning applications and the rapid growth of distributed edge devices in industry, we believe that utilizing large numbers of edge devices will become increasingly important. Federated Learning removes the need to upload data centrally to the cloud for processing and maximizes the use of edge devices' computing and storage capabilities. Because data is processed locally, the approach eliminates the need to upload large amounts of local data and reduces data transfer latency. Since Federated Learning does not require centralized data for model training, it is well suited to edge learning scenarios with limited data and privacy concerns. Objective: The purpose of this research is to identify the characteristics and problems of Federated Learning methods, to develop new algorithms and frameworks that can assist companies in making the transition to Federated Learning, and to empirically validate the proposed approaches. Method: To achieve these objectives, we adopted an empirical research approach, with design science as our primary research method. We conducted a literature review, case studies including semi-structured interviews, and simulation experiments in close collaboration with software-intensive companies in the embedded systems domain. Results: We present four major findings in this paper.
First, we present a state-of-the-art review of the empirical results reported in the existing Federated Learning literature. We then categorize those Federated Learning implementations into different application domains, identify their challenges, and propose six open research questions based on the problems identified in the literature. Second, we conduct a case study to explain why companies anticipate Federated Learning as a potential solution to the challenges they encountered when implementing machine learning components. We summarize the services that a comprehensive Federated Learning system must enable in industrial settings. Furthermore, we identify the primary barriers that companies must overcome in order to embrace and transition to Federated Learning. Based on our empirical findings, we propose five requirements for companies implementing reliable Federated Learning systems. Third, we develop and evaluate four architecture alternatives for a Federated Learning system, including centralized, hierarchical, regional, and decentralized architectures. We investigate the trade-off between communication latency, model evolution time, and model classification performance, which is critical for applying our findings to real-world industrial systems. Fourth, we introduce techniques and asynchronous frameworks for end-to-end on-device Federated Learning. The method is validated using a steering wheel angle prediction case. The local models of each edge vehicle can be continuously trained and shared with other vehicles to improve their local model prediction accuracy. Furthermore, we combine the asynchronous Federated Learning approach with Deep Neural Decision Forests and validate our method using important industry use cases in the automotive domain. 
Our findings show that Federated Learning can improve model training speed while lowering communication overhead without sacrificing accuracy, demonstrating that this technique has significant benefits for a wide range of real-world embedded systems. Future Work: In the future, we plan to test our approach in other use cases and to investigate more sophisticated neural networks integrated with our approach. To improve model training performance on resource-constrained edge devices in real-world embedded systems, we intend to design more appropriate aggregation methods and protocols. Furthermore, we intend to use Federated Learning and Reinforcement Learning methods to help edge devices evolve autonomously and fully utilize their computation capabilities.
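The centralized architecture alternative described above can be sketched with plain federated averaging: the server broadcasts a global model, each client runs a few local training steps on its private data, and the server averages the returned weights, so raw data never leaves the device. A toy NumPy sketch under our own assumptions (linear model, synthetic clients; not the thesis's implementation):

```python
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=5):
    """A few full-batch gradient steps of linear regression on one
    client's private data; only the resulting weights leave the device."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fed_avg(client_data, rounds=20, dim=2):
    """Centralized FedAvg: broadcast the global model, train locally,
    then average returned weights (weighted by client dataset size)."""
    w_global = np.zeros(dim)
    sizes = np.array([len(y) for _, y in client_data], dtype=float)
    for _ in range(rounds):
        local = [local_update(w_global.copy(), X, y) for X, y in client_data]
        w_global = np.average(local, axis=0, weights=sizes)
    return w_global

# Toy data: three "edge devices", all sampling y = 3*x1 - 2*x2 + noise
rng = np.random.default_rng(1)
true_w = np.array([3.0, -2.0])
clients = []
for n in (30, 50, 40):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    clients.append((X, y))

w = fed_avg(clients)
```

The hierarchical, regional, and decentralized alternatives studied in the thesis change who performs this aggregation step (regional servers, or the clients themselves) rather than the averaging itself.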