Exploiting Structural Properties in the Analysis of High-dimensional Dynamical Systems
The physical and cyber domains with which we interact are filled with high-dimensional dynamical systems. In machine learning, for instance, the evolution of overparametrized neural networks can be seen as a dynamical system. In networked systems, numerous agents or nodes dynamically interact with each other. A deep understanding of these systems can enable us to predict their behavior, identify potential pitfalls, and devise effective solutions for optimal outcomes. In this dissertation, we will discuss two classes of high-dimensional dynamical systems with specific structural properties that aid in understanding their dynamic behavior.
In the first scenario, we consider the training dynamics of multi-layer neural networks. The high dimensionality comes from overparametrization: a typical network has a large depth and hidden layer width. We are interested in the following question regarding convergence: Do network weights converge to an equilibrium point corresponding to a global minimum of our training loss, and how fast is the convergence rate? The key to those questions is the symmetry of the weights, a critical property induced by the multi-layer architecture. Such symmetry leads to a set of time-invariant quantities, called weight imbalance, that restrict the training trajectory to a low-dimensional manifold defined by the weight initialization. A tailored convergence analysis is developed over this low-dimensional manifold, showing improved rate bounds for several multi-layer network models studied in the literature, leading to novel characterizations of the effect of weight imbalance on the convergence rate.
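As a minimal illustration of the weight imbalance described above, the sketch below assumes the simplest multi-layer model, a two-layer linear network f(x) = W2 W1 x trained with a squared loss. For this model the matrix W1 W1^T - W2^T W2 is exactly constant under gradient flow, and under gradient descent with a small step it drifts only at second order in the step size. The network sizes, the random data, and the names `imbalance` and `train_two_layer_linear` are assumptions made for illustration and are not taken from the dissertation.

```python
import numpy as np

def imbalance(W1, W2):
    """Weight imbalance of a two-layer linear network (hidden x hidden matrix)."""
    return W1 @ W1.T - W2.T @ W2

def loss(W1, W2, X, Y):
    """Squared loss 0.5 * ||W2 W1 X - Y||_F^2."""
    return 0.5 * np.linalg.norm(W2 @ W1 @ X - Y) ** 2

def train_two_layer_linear(X, Y, W1, W2, lr=1e-4, steps=5000):
    """Plain gradient descent; a small step approximates gradient flow."""
    W1, W2 = W1.copy(), W2.copy()
    for _ in range(steps):
        G = (W2 @ W1 @ X - Y) @ X.T   # gradient of the loss w.r.t. the product W2 W1
        g1 = W2.T @ G                 # chain rule: gradient w.r.t. W1
        g2 = G @ W1.T                 # chain rule: gradient w.r.t. W2
        W1 -= lr * g1
        W2 -= lr * g2
    return W1, W2

# Hypothetical toy problem: random inputs and targets
rng = np.random.default_rng(0)
n_in, n_hidden, n_out, n_samples = 4, 6, 3, 20
X = rng.standard_normal((n_in, n_samples))
Y = rng.standard_normal((n_out, n_samples))
W1_0 = 0.5 * rng.standard_normal((n_hidden, n_in))
W2_0 = 0.5 * rng.standard_normal((n_out, n_hidden))

W1_T, W2_T = train_two_layer_linear(X, Y, W1_0, W2_0)

# The weights move and the loss drops, but the imbalance stays almost fixed
drift = np.linalg.norm(imbalance(W1_T, W2_T) - imbalance(W1_0, W2_0))
loss_before, loss_after = loss(W1_0, W2_0, X, Y), loss(W1_T, W2_T, X, Y)
```

Because the imbalance is fixed by the initialization, the trajectory is confined to the set of weights sharing that imbalance, which is the low-dimensional manifold the abstract refers to.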
In the second scenario, we consider large-scale networked systems with multiple weakly-connected groups. Such a multi-cluster structure leads to a time-scale separation between the fast intra-group interaction, due to high intra-group connectivity, and the slow inter-group oscillation, due to the weak inter-group connection. We develop a novel frequency-domain network coherence analysis that captures both the coherent behavior within each group and the dynamical interaction between groups, leading to a structure-preserving model-reduction methodology for large-scale dynamic networks with multiple clusters under general node dynamics assumptions.
Three-dimensional magnetic fields: from coils to reconnection
This thesis is a work divided into two parts on aspects of three-dimensional (3D) magnetic fields: (I) magnetic reconnection treated from a strictly 3D viewpoint and (II) the design of coils for producing the 3D magnetic fields of optimized stellarators.
In astrophysical settings, magnetic fields are generically 3D. 3D divergence-free fields have rich topological structures such as magnetic nulls and chaotic field line structures. Standard reconnection literature identifies magnetic nulls as locations of magnetic reconnection and holds that intense currents build up around them. This idea is explored with a key realization that by placing a vanishingly small sphere around the null, boundary conditions on field lines passing through the sphere may be sorted out. The main result here is (1) the dismissal of the notion that nulls are crucial places for magnetic reconnection and current accumulation, instead identifying separatrices of topological type on the boundaries of null-passing field lines to be crucial. Standard reconnection literature dismisses chaotic flows, yet 3D fields generically have chaotic flows. An inherent property of chaotic flows is exponentiation. The main result here is (2) the identification of exponentiation as a natural mechanism for magnetic reconnection and that the associated current builds up linearly in time, in contradiction to standard results requiring the formation of high-density current sheets.
The magnetic fields of optimized stellarators are intricate, producing complex 3D magnetic surfaces. These fields are conventionally generated by non-planar electromagnetic coils, though these coils are costly to manufacture, slow device assembly, and hinder stellarator maintenance. Part II of this thesis explores methods of stellarator coil simplification that do not involve modular coils. All of this work uses current potentials, which are stream functions of the current sheets that produce magnetic surfaces. We begin with a result found using analytic methods on current potentials that (1) there may be an inherent limitation in the ability of modular coils to produce fields at a distance. This result is not surprising, though further analysis is necessary to work out some complexities of the result.
Next, (2) a novel method to produce localized patches of current potential, representative of patches of current sheets, is developed and used to identify crucial locations of current placement for shaping magnetic surfaces. Most notably, these current sheet patches are able to produce much of the surface shaping while occupying a small fraction of the winding surface, resulting in good open-access stellarator coil configurations. Continuing the trend away from modular coils, (3) helical coils are optimized to support stellarator magnetic fields.
This work agrees with related work on the optimization of helical coils, finding them unsuitable for the precise production of equilibria generated by modular coils. To improve this result, we use coil sets of mixed type, helical coils with windowpane coils or permanent magnets, to mitigate the field error left behind by the helical coils. Finally, (4) a generalized method to cut modular, helical, and windowpane coils out of current potentials and to identify the associated coil currents is developed and used in coil optimization.
Techniques for high-multiplicity scattering amplitudes and applications to precision collider physics
In this thesis, we present state-of-the-art techniques for the computation of scattering amplitudes in Quantum Field Theories. Following an introduction to the topic, we describe a robust framework that enables the calculation of multi-scale two-loop amplitudes directly relevant to modern particle physics phenomenology at the Large Hadron Collider and beyond. We discuss in detail the use of finite fields to bypass the algebraic complexity of such computations, as well as the method of integration-by-parts relations and differential equations. We apply our framework to calculate the two-loop amplitudes contributing to three processes: Higgs boson production in association with a bottom-quark pair, W boson production with a photon and a jet, as well as lepton-pair scattering with an off-shell and an on-shell photon. Finally, we draw our conclusions and discuss directions for future progress of amplitude computations.
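The finite-field idea mentioned above can be sketched on a toy scale: every intermediate quantity is stored as a single integer modulo a prime, so no large rational expressions ever appear, and the final rational number is recovered at the end by rational reconstruction (the extended Euclidean algorithm). This is a generic illustration, not the thesis framework; the prime, the helper names `to_field` and `rational_reconstruct`, and the example expression are assumptions chosen for the sketch.

```python
from fractions import Fraction
from math import gcd, isqrt

PRIME = 2**31 - 1  # a Mersenne prime, chosen here only for illustration

def to_field(q, p=PRIME):
    """Map a rational number to its image in the integers modulo p."""
    # pow(d, -1, p) is the modular inverse (requires Python 3.8+)
    return q.numerator % p * pow(q.denominator, -1, p) % p

def rational_reconstruct(a, p=PRIME):
    """Recover n/d with n = a*d (mod p) and |n|, d <= sqrt(p/2), if it exists."""
    bound = isqrt(p // 2)
    r0, r1 = p, a % p
    t0, t1 = 0, 1
    # Extended Euclid keeps the invariant r_i = t_i * a (mod p)
    while r1 > bound:
        quotient = r0 // r1
        r0, r1 = r1, r0 - quotient * r1
        t0, t1 = t1, t0 - quotient * t1
    if t1 == 0 or abs(t1) > bound or gcd(r1, abs(t1)) != 1:
        raise ValueError("no rational number within the bound")
    return Fraction(r1, t1)  # Fraction normalises the sign of the denominator

# Toy expression (x + y) / (x*y - 1), evaluated exactly and modulo PRIME
x, y = Fraction(3, 7), Fraction(-22, 5)
exact = (x + y) / (x * y - 1)

xm, ym = to_field(x), to_field(y)
value_mod_p = (xm + ym) * pow((xm * ym - 1) % PRIME, -1, PRIME) % PRIME
recovered = rational_reconstruct(value_mod_p)
```

With a single prime the reconstruction only succeeds when numerator and denominator are below roughly the square root of half the prime; larger results would need several primes combined with the Chinese remainder theorem, which this sketch omits.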
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Differentially Private Stochastic Convex Optimization in (Non)-Euclidean Space Revisited
In this paper, we revisit the problem of Differentially Private Stochastic
Convex Optimization (DP-SCO) in Euclidean and general spaces.
Specifically, we focus on three settings that are still far from well
understood: (1) DP-SCO over a constrained and bounded (convex) set in Euclidean
space; (2) unconstrained DP-SCO in space; (3) DP-SCO with
heavy-tailed data over a constrained and bounded set in space. For
problem (1), for both convex and strongly convex loss functions, we propose
methods whose outputs could achieve (expected) excess population risks that are
only dependent on the Gaussian width of the constraint set rather than the
dimension of the space. Moreover, we also show the bound for strongly convex
functions is optimal up to a logarithmic factor. For problems (2) and (3), we
propose several novel algorithms and provide the first theoretical results for
both cases when and
Machine learning applications in search algorithms for gravitational waves from compact binary mergers
Gravitational waves from compact binary mergers are now routinely observed by Earth-bound detectors. These observations enable exciting new science, as they have opened a new window to the Universe.
However, extracting gravitational-wave signals from the noisy detector data is a challenging problem. The most sensitive search algorithms for compact binary mergers use matched filtering, an algorithm that compares the data with a set of expected template signals. As detectors are upgraded and more sophisticated signal models become available, the number of required templates will increase, which can make some sources computationally prohibitive to search for. The computational cost is of particular concern when low-latency alerts should be issued to maximize the time for electromagnetic follow-up observations. One potential solution to reduce computational requirements that has started to be explored in the last decade is machine learning. However, different proposed deep learning searches target varying parameter spaces and use metrics that are not always comparable to existing literature. Consequently, a clear picture of the capabilities of machine learning searches has been sorely missing.
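To make the matched-filtering step concrete, here is a minimal sketch for the simplest possible case: a single known template in white Gaussian noise, filtered in the time domain. Production searches work with coloured detector noise, frequency-domain filtering and large template banks, none of which is modelled here. The toy chirp, the injected amplitude, and the names `matched_filter_snr` and `toy_chirp` are assumptions for illustration only.

```python
import numpy as np

def matched_filter_snr(data, template, noise_sigma):
    """Signal-to-noise time series for one template in white Gaussian noise."""
    # Slide the template over the data: corr[k] = sum_n data[n + k] * template[n]
    corr = np.correlate(data, template, mode="valid")
    # Normalise so that noise-only data gives unit-variance output
    return corr / (noise_sigma * np.linalg.norm(template))

def toy_chirp(n_samples, f_start=0.02, f_end=0.2):
    """Hypothetical windowed chirp (frequencies in cycles per sample)."""
    t = np.arange(n_samples)
    phase = 2 * np.pi * (f_start * t + 0.5 * (f_end - f_start) / n_samples * t**2)
    return np.sin(phase) * np.hanning(n_samples)

rng = np.random.default_rng(1)
noise_sigma = 1.0
template = toy_chirp(256)
data = rng.normal(0.0, noise_sigma, 4096)

# Inject the template at a known offset with an expected SNR of 12
inject_at, target_snr = 1500, 12.0
amplitude = target_snr * noise_sigma / np.linalg.norm(template)
data[inject_at:inject_at + template.size] += amplitude * template

snr = matched_filter_snr(data, template, noise_sigma)
peak_index = int(np.argmax(np.abs(snr)))
peak_snr = float(np.abs(snr[peak_index]))
```

The cost concern raised above follows directly from this structure: one such correlation is needed per template, so the work grows linearly with the size of the template bank.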
In this thesis, we closely examine the sensitivity of various deep learning gravitational-wave search algorithms and introduce new methods to detect signals from binary black hole and binary neutron star mergers at previously untested statistical confidence levels. By using the sensitive distance as our core metric, we allow for a direct comparison of our algorithms to state-of-the-art search pipelines. As part of this thesis, we organized a global mock data challenge to create a benchmark for machine learning search algorithms targeting compact binaries. This way, the tools developed in this thesis are made available to the greater community by publishing them as open source software.
Our studies show that, depending on the parameter space, deep learning gravitational-wave search algorithms are already competitive with current production search pipelines. We also find that strategies developed for traditional searches can be effectively adapted to their machine learning counterparts. In regions where matched filtering becomes computationally expensive, available deep learning algorithms are also limited in their capability. We find reduced sensitivity to long-duration signals compared to the excellent results for short-duration binary black hole signals.
Contactless excitation for electric machines: high temperature superconducting flux pumps
With the intensification of global warming and climate change, the pace of transformation to a neutral-emission society is accelerating. In various sectors, electrification has become the dominant trend driving this movement, and electric machines play an important role in the current power generation system. It is widely believed that electric machines with very high power density are essential for future applications, which, however, can hardly be achieved by conventional technologies. Owing to the maturation of second generation (2G) high temperature superconducting (HTS) technologies, superconducting machines have been recognized as a competitive candidate to realize this vision.
One significant obstacle that hinders the implementation of superconducting machines is how to provide the required magnetic fields, or in other words, how to energise them appropriately. Conventional direct injection is not suitable for HTS machines, because the current leads would bridge ambient temperature to the cryogenic environment, which can impose a considerable heat load on the system and increase the operational cost. Thus, HTS machines demand an efficient energisation method. As an emerging technology that can accumulate substantial flux in a closed loop without any physical contact, HTS flux pumps have been proposed as a promising solution.
Among existing HTS flux pumps, rotary HTS flux pumps, also called HTS dynamos, can output a non-zero time-averaged DC voltage and charge the rest of the circuit once a closed loop has been formed. This type of flux pump is often employed together with HTS coils, which can potentially work in the persistent current mode and act like electromagnets with a considerable magnetic field, with a wide range of applications in industry. The output characteristics of rotary HTS flux pumps have been extensively explored through experiments and finite element method (FEM) simulations, yet the construction of statistical models as an alternative approach to capture key characteristics has not been studied. In this thesis, a 2D FEM program has been developed to model the operation of rotary HTS flux pumps and evaluate the effects of different factors on the output voltage through parameter sweeping and analysis of variance. Typical design considerations, including the operating frequency, air gap, HTS tape width, and remanent flux density, have been investigated. In particular, the bilateral effect of HTS tape width has been discovered and explained by looking at the averaged integration of the electric field over the HTS tape. Based on the data obtained from various simulations, regression analysis has been conducted through a collection of machine learning methods. It has been demonstrated that the output voltage of a rotary HTS flux pump can be obtained promptly with satisfactory accuracy via Gaussian process regression, aiming to provide a novel approach for future research and a powerful design tool for industrial applications using rotary HTS flux pumps.
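A minimal sketch of Gaussian process regression used as a surrogate model is given below. It is not the thesis model: the training data is a synthetic one-dimensional function standing in for simulation outputs, the squared-exponential kernel with fixed hyperparameters is an assumption, and the names `rbf_kernel` and `gp_fit_predict` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_fit_predict(X_train, y_train, X_test, noise=1e-4, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at the test inputs."""
    K = rbf_kernel(X_train, X_train, **kernel_args) + noise * np.eye(len(X_train))
    L = np.linalg.cholesky(K)  # K = L L^T
    # np.linalg.solve is a general solver; a triangular solver would be faster
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_star = rbf_kernel(X_train, X_test, **kernel_args)
    mean = K_star.T @ alpha
    v = np.linalg.solve(L, K_star)
    prior_var = np.diag(rbf_kernel(X_test, X_test, **kernel_args))
    var = prior_var - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Synthetic stand-in for (design parameter, simulated output) pairs
X_train = np.linspace(0.0, 5.0, 15)[:, None]
y_train = np.sin(X_train[:, 0])
X_test = np.array([[1.1], [2.3], [3.7]])

pred_mean, pred_var = gp_fit_predict(X_train, y_train, X_test)
```

Once the Cholesky factor is available, each new prediction costs only a kernel evaluation against the training set, which is why a surrogate of this kind answers far faster than rerunning a simulation. In practice the kernel hyperparameters would be fitted to the data rather than fixed as they are here.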
To enhance the applicability of the proposed statistical models, an updated FEM program has been built to take more parameters into account. The newly added parameters, namely the rotor radius and the width of the permanent magnet, together with the formerly included ones, should cover all the key design parameters for a rotary HTS flux pump. Based on data collected from the FEM model, a well-trained semi-deep neural network (DNN) model with a back-propagation algorithm has been put forward and validated. The proposed DNN model is capable of quantifying the output voltage of a rotary HTS flux pump instantly with an overall accuracy of 98% with respect to the simulated values with all design parameters explicitly specified. The model possesses a powerful ability to characterize the output behaviour of rotary HTS flux pumps by integrating all design parameters, and the output characteristics of rotary HTS flux pumps have been successfully demonstrated and visualized using this model. Compared to conventional time-consuming FEM-based numerical models, the proposed DNN model has the advantages of fast learning, accurate computation, as well as strong programmability. Therefore, the DNN model can greatly facilitate the design and optimization process for rotary HTS flux pumps. An executable application has been developed accordingly based on the DNN model, which is believed to provide a useful tool for learners and designers of rotary HTS flux pumps.
A new variant inspired by the working principles of rotary HTS flux pumps has been proposed and termed the stationary wave HTS flux pump. The advantage of this type is that it has a simple structure without any moving components, and it utilises a controllable current-driven electromagnet to provide the required magnetic field. It has been demonstrated that the origin of the output voltage is determined by the asymmetric distribution of the dynamic resistance in the HTS tape, for which the electromagnet must be placed at such a position that its central line is not aligned with that of the HTS tape. A numerical model has been built to simulate the operation of a stationary wave HTS flux pump, based on which the output characteristics and dynamic resistance against various parameters have been investigated. Besides, accurate and reliable statistical models have been proposed to predict the open circuit voltage and effective dynamic resistance by adapting the previously developed machine learning techniques.
The work presented in this PhD thesis can bring more insight into HTS flux pumps as a promising emerging contactless energisation technology, and the proposed statistical models can be particularly useful for the design and optimization of such devices.
SCQPTH: an efficient differentiable splitting method for convex quadratic programming
We present SCQPTH: a differentiable first-order splitting method for convex
quadratic programs. The SCQPTH framework is based on the alternating direction
method of multipliers (ADMM) and the software implementation is motivated by
the state-of-the-art solver OSQP: an operator splitting solver for convex
quadratic programs (QPs). The SCQPTH software is made available as an
open-source Python package and contains many similar features including
efficient reuse of matrix factorizations, infeasibility detection, automatic
scaling and parameter selection. The forward pass algorithm performs operator
splitting in the dimension of the original problem space and is therefore
suitable for large scale QPs with decision variables and thousands
of constraints. Backpropagation is performed by implicit differentiation of the
ADMM fixed-point mapping. Experiments demonstrate that for large scale QPs,
SCQPTH can provide a improvement in computational
efficiency in comparison to existing differentiable QP solvers.
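The forward pass described in this abstract can be illustrated with a bare-bones ADMM iteration for a QP of the form: minimize 0.5 x^T P x + q^T x subject to l <= A x <= u. The sketch below is not the SCQPTH or OSQP implementation: it assumes fixed penalty parameters `rho` and `sigma`, runs a fixed number of iterations, and leaves out scaling, infeasibility detection, stopping criteria and the backward pass. The function name `admm_qp` and the small test problem are chosen for illustration.

```python
import numpy as np

def admm_qp(P, q, A, l, u, rho=1.0, sigma=1e-6, iters=2000):
    """ADMM for: minimize 0.5 x'Px + q'x subject to l <= Ax <= u (P symmetric PSD)."""
    n, m = P.shape[0], A.shape[0]
    x, z, y = np.zeros(n), np.zeros(m), np.zeros(m)
    # The linear system matrix is constant, so it is factorized once and reused
    K = P + sigma * np.eye(n) + rho * A.T @ A
    L = np.linalg.cholesky(K)
    for _ in range(iters):
        # x-update: solve K x = sigma*x_prev - q + A'(rho*z - y) via the stored factor
        rhs = sigma * x - q + A.T @ (rho * z - y)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        Ax = A @ x
        # z-update: project onto the box [l, u]
        z = np.clip(Ax + y / rho, l, u)
        # dual update on the constraint Ax = z
        y = y + rho * (Ax - z)
    return x, z, y

# Small test problem: x1 + x2 = 1, 0 <= x1 <= 0.7, 0 <= x2 <= 0.7
P = np.array([[4.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
l = np.array([1.0, 0.0, 0.0])
u = np.array([1.0, 0.7, 0.7])

x_opt, z_opt, y_opt = admm_qp(P, q, A, l, u)
objective = 0.5 * x_opt @ P @ x_opt + q @ x_opt
```

The linear system solved in each iteration has the dimension of x, which matches the abstract's remark that the splitting is performed in the dimension of the original problem space. Differentiating through the solver would then amount to implicit differentiation of the fixed point of this iteration, which is not shown here.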
Towards Reliable and Accurate Global Structure-from-Motion
Reconstruction of objects or scenes from sparse point detections across multiple views is one of the most tackled problems in computer vision. Given the coordinates of 2D points tracked in multiple images, the problem consists of estimating the corresponding 3D points and cameras' calibrations (intrinsic and pose), and can be solved by minimizing reprojection errors using bundle adjustment. However, given bundle adjustment's nonlinear objective function and iterative nature, a good starting guess is required to converge to global minima. Global and Incremental Structure-from-Motion methods appear as ways to provide good initializations to bundle adjustment, each with different properties. While Global Structure-from-Motion has been shown to result in more accurate reconstructions compared to Incremental Structure-from-Motion, the latter has better scalability by starting with a small subset of images and sequentially adding new views, allowing reconstruction of sequences with millions of images. Additionally, both Global and Incremental Structure-from-Motion methods rely on accurate models of the scene or object, and under noisy conditions or high model uncertainty might result in poor initializations for bundle adjustment. Recently pOSE, a class of matrix factorization methods, has been proposed as an alternative to conventional Global SfM methods. These methods use VarPro - a second-order optimization method - to minimize a linear combination of an approximation of reprojection errors and a regularization term based on an affine camera model, and have been shown to converge to global minima with a high rate even when starting from random camera calibration estimations. This thesis aims at improving the reliability and accuracy of global SfM through different approaches.
First, by studying conditions for global optimality of point set registration, leading to a point cloud averaging method that can be used when (incomplete) 3D point clouds of the same scene in different coordinate systems are available. Second, by extending pOSE methods to different Structure-from-Motion problem instances, such as Non-Rigid SfM or radial distortion invariant SfM. Third and finally, by replacing the regularization term of pOSE methods with an exponential regularization on the projective depth of the 3D point estimations, resulting in a loss that achieves reconstructions with accuracy close to bundle adjustment.