4,090 research outputs found

    Exploiting Structural Properties in the Analysis of High-dimensional Dynamical Systems

    The physical and cyber domains with which we interact are filled with high-dimensional dynamical systems. In machine learning, for instance, the evolution of overparametrized neural networks can be seen as a dynamical system. In networked systems, numerous agents or nodes dynamically interact with each other. A deep understanding of these systems can enable us to predict their behavior, identify potential pitfalls, and devise effective solutions for optimal outcomes. In this dissertation, we will discuss two classes of high-dimensional dynamical systems with specific structural properties that aid in understanding their dynamic behavior. In the first scenario, we consider the training dynamics of multi-layer neural networks. The high dimensionality comes from overparametrization: a typical network has a large depth and hidden layer width. We are interested in the following question regarding convergence: Do network weights converge to an equilibrium point corresponding to a global minimum of our training loss, and how fast is the convergence rate? The key to those questions is the symmetry of the weights, a critical property induced by the multi-layer architecture. Such symmetry leads to a set of time-invariant quantities, called weight imbalance, that restrict the training trajectory to a low-dimensional manifold defined by the weight initialization. A tailored convergence analysis is developed over this low-dimensional manifold, showing improved rate bounds for several multi-layer network models studied in the literature, leading to novel characterizations of the effect of weight imbalance on the convergence rate. In the second scenario, we consider large-scale networked systems with multiple weakly-connected groups. Such a multi-cluster structure leads to a time-scale separation between the fast intra-group interaction due to high intra-group connectivity, and the slow inter-group oscillation, due to the weak inter-group connection. 
We develop a novel frequency-domain network coherence analysis that captures both the coherent behavior within each group and the dynamical interaction between groups, leading to a structure-preserving model-reduction methodology for large-scale dynamic networks with multiple clusters under general node dynamics assumptions.
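
A worked example may help make the time-invariance of weight imbalance concrete. The sketch below assumes the simplest case, a two-layer linear network $W_2 W_1$ trained by gradient flow on a differentiable loss $L$. The symbols $W_1$, $W_2$, $G$, and $D$ are illustrative choices, not notation taken from the dissertation, whose models and definitions may differ.

```latex
\begin{align*}
G &:= \nabla L(W)\big|_{W = W_2 W_1}, \qquad
\dot W_1 = -W_2^\top G, \qquad
\dot W_2 = -G\, W_1^\top, \\
\frac{d}{dt}\bigl(W_1 W_1^\top\bigr)
  &= \dot W_1 W_1^\top + W_1 \dot W_1^\top
   = -W_2^\top G\, W_1^\top - W_1 G^\top W_2, \\
\frac{d}{dt}\bigl(W_2^\top W_2\bigr)
  &= \dot W_2^\top W_2 + W_2^\top \dot W_2
   = -W_1 G^\top W_2 - W_2^\top G\, W_1^\top, \\
\frac{d}{dt} D &= 0, \qquad D := W_1 W_1^\top - W_2^\top W_2 .
\end{align*}
```

Under these assumptions $D$ keeps its initial value for all time, so the training trajectory is confined to the set of weights that share the imbalance fixed at initialization.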

    Techniques for high-multiplicity scattering amplitudes and applications to precision collider physics

    In this thesis, we present state-of-the-art techniques for the computation of scattering amplitudes in Quantum Field Theories. Following an introduction to the topic, we describe a robust framework that enables the calculation of multi-scale two-loop amplitudes directly relevant to modern particle physics phenomenology at the Large Hadron Collider and beyond. We discuss in detail the use of finite fields to bypass the algebraic complexity of such computations, as well as the method of integration-by-parts relations and differential equations. We apply our framework to calculate the two-loop amplitudes contributing to three processes: Higgs boson production in association with a bottom-quark pair, W boson production with a photon and a jet, and lepton-pair scattering with an off-shell and an on-shell photon. Finally, we draw our conclusions and discuss directions for future progress in amplitude computations.

    LIPIcs, Volume 251, ITCS 2023, Complete Volume


    Differentially Private Stochastic Convex Optimization in (Non)-Euclidean Space Revisited

    In this paper, we revisit the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) in Euclidean and general $\ell_p^d$ spaces. Specifically, we focus on three settings that are still far from well understood: (1) DP-SCO over a constrained and bounded (convex) set in Euclidean space; (2) unconstrained DP-SCO in $\ell_p^d$ space; (3) DP-SCO with heavy-tailed data over a constrained and bounded set in $\ell_p^d$ space. For problem (1), for both convex and strongly convex loss functions, we propose methods whose outputs could achieve (expected) excess population risks that are only dependent on the Gaussian width of the constraint set rather than the dimension of the space. Moreover, we also show the bound for strongly convex functions is optimal up to a logarithmic factor. For problems (2) and (3), we propose several novel algorithms and provide the first theoretical results for both cases when $1<p<2$ and $2\leq p\leq \infty$.

    Machine learning applications in search algorithms for gravitational waves from compact binary mergers

    Gravitational waves from compact binary mergers are now routinely observed by Earth-bound detectors. These observations enable exciting new science, as they have opened a new window to the Universe. However, extracting gravitational-wave signals from the noisy detector data is a challenging problem. The most sensitive search algorithms for compact binary mergers use matched filtering, an algorithm that compares the data with a set of expected template signals. As detectors are upgraded and more sophisticated signal models become available, the number of required templates will increase, which can make some sources computationally prohibitive to search for. The computational cost is of particular concern when low-latency alerts should be issued to maximize the time for electromagnetic follow-up observations. One potential solution to reduce computational requirements that has started to be explored in the last decade is machine learning. However, different proposed deep learning searches target varying parameter spaces and use metrics that are not always comparable to existing literature. Consequently, a clear picture of the capabilities of machine learning searches has been sorely missing. In this thesis, we closely examine the sensitivity of various deep learning gravitational-wave search algorithms and introduce new methods to detect signals from binary black hole and binary neutron star mergers at previously untested statistical confidence levels. By using the sensitive distance as our core metric, we allow for a direct comparison of our algorithms to state-of-the-art search pipelines. As part of this thesis, we organized a global mock data challenge to create a benchmark for machine learning search algorithms targeting compact binaries. This way, the tools developed in this thesis are made available to the greater community by publishing them as open source software. 
Our studies show that, depending on the parameter space, deep learning gravitational-wave search algorithms are already competitive with current production search pipelines. We also find that strategies developed for traditional searches can be effectively adapted to their machine learning counterparts. In regions where matched filtering becomes computationally expensive, available deep learning algorithms are also limited in their capability. We find reduced sensitivity to long-duration signals compared to the excellent results for short-duration binary black hole signals.
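
The matched filtering mentioned in this abstract, which compares the data with expected template signals, can be sketched in a few lines. This is a minimal illustration under strong assumptions: the noise is white with unit variance, there is a single template, and the correlation is done in the time domain. Production search pipelines whiten the data with an estimated noise spectrum and scan large template banks. The function name `matched_filter_snr` and the toy chirp are hypothetical and are not taken from the thesis.

```python
import numpy as np

def matched_filter_snr(data, template):
    """Signal-to-noise time series, assuming white unit-variance noise."""
    template = np.asarray(template, dtype=float)
    # Normalise so that pure noise gives unit-variance output.
    norm = np.sqrt(np.sum(template ** 2))
    # Slide the template across the data; 'valid' keeps full overlaps only.
    return np.correlate(data, template, mode="valid") / norm

# Toy example: a short windowed chirp hidden in Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(64)
template = np.sin(2 * np.pi * t * (0.05 + 0.002 * t)) * np.hanning(64)

data = rng.normal(size=1024)
offset = 300                      # sample at which the signal starts
data[offset:offset + 64] += 5.0 * template

snr = matched_filter_snr(data, template)
peak = int(np.argmax(np.abs(snr)))  # most significant candidate start time
```

The cost of this approach grows with the number of templates, since every template needs its own correlation pass over the data. That is the scaling concern the abstract raises.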

    Contactless excitation for electric machines: high temperature superconducting flux pumps

    With the intensification of global warming and climate change, the pace of transformation to an emission-neutral society is accelerating. In various sectors, electrification has become the dominant trend driving this transformation, and electric machines play an important role in the current power generation system. It is widely believed that electric machines with very high power density are essential for future applications, which, however, can hardly be achieved by conventional technologies. Owing to the maturation of second generation (2G) high temperature superconducting (HTS) technologies, it has been recognized that superconducting machines could be a competitive candidate to realize this vision. One significant obstacle that hinders the implementation of superconducting machines is how to provide the required magnetic fields, or in other words, how to energise them appropriately. Conventional direct injection is not suitable for HTS machines, because the current leads would bridge ambient temperature to the cryogenic environment, which can impose a considerable heat load on the system and increase the operational cost. Thus, HTS machines demand an efficient energisation method. As an emerging technology that can accumulate substantial flux in a closed loop without any physical contact, HTS flux pumps have been proposed as a promising solution. Among the existing HTS flux pumps, rotary HTS flux pumps, also known as HTS dynamos, can output a non-zero time-averaged DC voltage and charge the rest of the circuit if a closed loop has been formed. This type of flux pump is often employed together with HTS coils, which can potentially work in the persistent current mode and act like electromagnets with a considerable magnetic field, with a wide range of applications in industry.
The output characteristics of rotary HTS flux pumps have been extensively explored through experiments and finite element method (FEM) simulations, yet the construction of statistical models as an alternative approach to capture key characteristics has not been studied. In this thesis, a 2D FEM program has been developed to model the operation of rotary HTS flux pumps and evaluate the effects of different factors on the output voltage through parameter sweeping and analysis of variance. Typical design considerations, including the operating frequency, air gap, HTS tape width, and remanent flux density, have been investigated. In particular, the bilateral effect of HTS tape width has been discovered and explained by looking at the averaged integration of the electric field over the HTS tape. Based on the data obtained from various simulations, regression analysis has been conducted through a collection of machine learning methods. It has been demonstrated that the output voltage of a rotary HTS flux pump can be obtained promptly with satisfactory accuracy via Gaussian process regression, providing a novel approach for future research and a powerful design tool for industrial applications using rotary HTS flux pumps. To enhance the applicability of the proposed statistical models, an updated FEM program has been built to take more parameters into account. The newly added parameters, namely the rotor radius and the width of the permanent magnet, together with the formerly included ones, should cover all the key design parameters for a rotary HTS flux pump. Based on data collected from the FEM model, a well-trained semi-deep neural network (DNN) model with a back-propagation algorithm has been put forward and validated. The proposed DNN model is capable of quantifying the output voltage of a rotary HTS flux pump instantly with an overall accuracy of 98% with respect to the simulated values when all design parameters are explicitly specified.
The model can characterize the output behaviour of rotary HTS flux pumps by integrating all design parameters, and the output characteristics of rotary HTS flux pumps have been successfully demonstrated and visualized using this model. Compared to conventional time-consuming FEM-based numerical models, the proposed DNN model has the advantages of fast learning, accurate computation, and strong programmability. Therefore, the DNN model can greatly facilitate the design and optimization process for rotary HTS flux pumps. An executable application has been developed based on the DNN model, which is believed to provide a useful tool for learners and designers of rotary HTS flux pumps. A new variant inspired by the working principles of rotary HTS flux pumps has been proposed and termed the stationary wave HTS flux pump. The advantage of this type is that it has a simple structure without any moving components, and it utilises a controllable current-driven electromagnet to provide the required magnetic field. It has been demonstrated that the origin of the output voltage is determined by the asymmetric distribution of the dynamic resistance in the HTS tape, for which the electromagnet must be placed at such a position that its central line is not aligned with that of the HTS tape. A numerical model has been built to simulate the operation of a stationary wave HTS flux pump, based on which the output characteristics and dynamic resistance against various parameters have been investigated. In addition, accurate and reliable statistical models have been proposed to predict the open circuit voltage and effective dynamic resistance by adapting the previously developed machine learning techniques.
The work presented in this PhD thesis can bring more insight into HTS flux pumps as a promising emerging contactless energisation technology, and the proposed statistical models can be particularly useful for the design and optimization of such devices.

    SCQPTH: an efficient differentiable splitting method for convex quadratic programming

    We present SCQPTH: a differentiable first-order splitting method for convex quadratic programs. The SCQPTH framework is based on the alternating direction method of multipliers (ADMM) and the software implementation is motivated by the state-of-the-art solver OSQP: an operator splitting solver for convex quadratic programs (QPs). The SCQPTH software is made available as an open-source Python package and contains many similar features including efficient reuse of matrix factorizations, infeasibility detection, automatic scaling and parameter selection. The forward pass algorithm performs operator splitting in the dimension of the original problem space and is therefore suitable for large scale QPs with $100-1000$ decision variables and thousands of constraints. Backpropagation is performed by implicit differentiation of the ADMM fixed-point mapping. Experiments demonstrate that for large scale QPs, SCQPTH can provide a $1\times - 10\times$ improvement in computational efficiency in comparison to existing differentiable QP solvers.
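
To illustrate the kind of ADMM splitting this abstract refers to, here is a minimal sketch for a box-constrained convex QP: minimize (1/2) x'Px + q'x subject to lower <= x <= upper. It is not the SCQPTH or OSQP implementation. It uses a fixed penalty `rho` and a fixed iteration count, and it omits scaling, infeasibility detection, parameter selection, and differentiation. The function name `admm_box_qp` and all parameter names are hypothetical.

```python
import numpy as np

def admm_box_qp(P, q, lower, upper, rho=1.0, iters=200):
    """Scaled-form ADMM for min 0.5 x'Px + q'x with lower <= x <= upper."""
    n = q.shape[0]
    # Factor P + rho*I once and reuse the factor in every iteration.
    L = np.linalg.cholesky(P + rho * np.eye(n))
    z = np.zeros(n)  # copy of x that is kept inside the box
    u = np.zeros(n)  # scaled dual variable for the constraint x = z
    for _ in range(iters):
        # x-update: solve (P + rho*I) x = rho*(z - u) - q via the factor.
        rhs = rho * (z - u) - q
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: project x + u onto the box.
        z = np.clip(x + u, lower, upper)
        # Dual update: accumulate the residual x - z.
        u = u + x - z
    return z

# Toy problem: the unconstrained minimiser (0.5, 2.5) is clipped to the box.
P = np.diag([2.0, 2.0])
q = np.array([-1.0, -5.0])
sol = admm_box_qp(P, q, 0.0, 1.0)  # approximately [0.5, 1.0]
```

Because the matrix `P + rho*I` does not change between iterations, the factorization is computed once. This is the factorization reuse that the abstract lists as a feature.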

    Investigating Lyman continuum escape fractions of high redshift galaxies during the era of reionization


    Towards Reliable and Accurate Global Structure-from-Motion

    Reconstruction of objects or scenes from sparse point detections across multiple views is one of the most tackled problems in computer vision. Given the coordinates of 2D points tracked in multiple images, the problem consists of estimating the corresponding 3D points and cameras' calibrations (intrinsic and pose), and can be solved by minimizing reprojection errors using bundle adjustment. However, given bundle adjustment's nonlinear objective function and iterative nature, a good starting guess is required to converge to global minima. Global and Incremental Structure-from-Motion methods appear as ways to provide good initializations to bundle adjustment, each with different properties. While Global Structure-from-Motion has been shown to result in more accurate reconstructions compared to Incremental Structure-from-Motion, the latter has better scalability by starting with a small subset of images and sequentially adding new views, allowing reconstruction of sequences with millions of images. Additionally, both Global and Incremental Structure-from-Motion methods rely on accurate models of the scene or object, and under noisy conditions or high model uncertainty might result in poor initializations for bundle adjustment. Recently pOSE, a class of matrix factorization methods, has been proposed as an alternative to conventional Global SfM methods. These methods use VarPro - a second-order optimization method - to minimize a linear combination of an approximation of reprojection errors and a regularization term based on an affine camera model, and have been shown to converge to global minima with a high rate even when starting from random camera calibration estimations. This thesis aims at improving the reliability and accuracy of global SfM through different approaches.
First, by studying conditions for global optimality of point set registration, a point cloud averaging method is proposed that can be used when (incomplete) 3D point clouds of the same scene in different coordinate systems are available. Second, pOSE methods are extended to different Structure-from-Motion problem instances, such as Non-Rigid SfM or radial distortion invariant SfM. Third and finally, the regularization term of pOSE methods is replaced with an exponential regularization on the projective depth of the 3D point estimations, resulting in a loss that achieves reconstructions with accuracy close to bundle adjustment.
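
The reprojection error that bundle adjustment minimizes can be written down directly. The sketch below assumes a standard pinhole camera with intrinsic matrix `K`, rotation `R`, and translation `t`, with no lens distortion. It only evaluates the residuals and does not perform the optimization or the pOSE approximation discussed in the thesis. The function name `reprojection_errors` is hypothetical.

```python
import numpy as np

def reprojection_errors(K, R, t, points_3d, points_2d):
    """Pixel distance between observed 2D points and projected 3D points."""
    cam = points_3d @ R.T + t                 # world frame -> camera frame
    proj = cam @ K.T                          # apply camera intrinsics
    pixels = proj[:, :2] / proj[:, 2:3]       # perspective division
    return np.linalg.norm(pixels - points_2d, axis=1)

# Toy example: identity pose, two points two units in front of the camera.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
points_3d = np.array([[0.0, 0.0, 2.0],
                      [1.0, 0.0, 2.0]])
# The second observation is shifted by one pixel along the horizontal axis.
points_2d = np.array([[320.0, 240.0],
                      [571.0, 240.0]])
errors = reprojection_errors(K, R, t, points_3d, points_2d)
```

Bundle adjustment would sum the squares of these residuals over all cameras and points and minimize the total over the poses, intrinsics, and 3D points. Because that objective is nonlinear, a good initialization matters, as the abstract notes.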