
    Exploiting Structural Properties in the Analysis of High-dimensional Dynamical Systems

    The physical and cyber domains with which we interact are filled with high-dimensional dynamical systems. In machine learning, for instance, the evolution of overparametrized neural networks can be seen as a dynamical system. In networked systems, numerous agents or nodes dynamically interact with each other. A deep understanding of these systems can enable us to predict their behavior, identify potential pitfalls, and devise effective solutions for optimal outcomes. In this dissertation, we will discuss two classes of high-dimensional dynamical systems with specific structural properties that aid in understanding their dynamic behavior. In the first scenario, we consider the training dynamics of multi-layer neural networks. The high dimensionality comes from overparametrization: a typical network has a large depth and hidden layer width. We are interested in the following question regarding convergence: Do network weights converge to an equilibrium point corresponding to a global minimum of our training loss, and how fast is the convergence rate? The key to those questions is the symmetry of the weights, a critical property induced by the multi-layer architecture. Such symmetry leads to a set of time-invariant quantities, called weight imbalance, that restrict the training trajectory to a low-dimensional manifold defined by the weight initialization. A tailored convergence analysis is developed over this low-dimensional manifold, showing improved rate bounds for several multi-layer network models studied in the literature, leading to novel characterizations of the effect of weight imbalance on the convergence rate. In the second scenario, we consider large-scale networked systems with multiple weakly-connected groups. Such a multi-cluster structure leads to a time-scale separation between the fast intra-group interaction due to high intra-group connectivity, and the slow inter-group oscillation, due to the weak inter-group connection. 
We develop a novel frequency-domain network coherence analysis that captures both the coherent behavior within each group and the dynamical interaction between groups, leading to a structure-preserving model-reduction methodology for large-scale dynamic networks with multiple clusters under general node dynamics assumptions.

    Linear Amplification in Nonequilibrium Turbulent Boundary Layers

    Resolvent analysis is applied to nonequilibrium incompressible adverse pressure gradient (APG) turbulent boundary layers (TBL) and hypersonic boundary layers with high temperature real gas effects, including chemical nonequilibrium. Resolvent analysis is an equation-based, scale-dependent decomposition of the Navier-Stokes equations, linearized about a known mean flow field. The decomposition identifies the optimal response and forcing modes, ranked by their linear amplification. To treat the nonequilibrium APG TBL, a biglobal resolvent analysis approach is used to account for the streamwise and wall-normal inhomogeneities in the streamwise developing flow. For the hypersonic boundary layer in chemical nonequilibrium, the resolvent analysis is constructed using a parallel flow assumption, incorporating N₂, O₂, NO, N, and O as a mixture of chemically reacting gases. Biglobal resolvent analysis is first applied to the zero pressure gradient (ZPG) TBL. Scaling relationships are determined for the spanwise wavenumber and temporal frequency that admit self-similar resolvent modes in the inner layer, mesolayer, and outer layer regions of the ZPG TBL. The APG effects on the inner scaling of the biglobal modes are shown to diminish as their self-similarity improves with increased Reynolds number. An increase in APG strength is shown to increase the linear amplification of the large-scale biglobal modes in the outer region, similar to the energization of large scale modes observed in simulation. The linear amplification of these modes grows linearly with the APG history, measured as the streamwise averaged APG strength, and relates to a novel pressure-based velocity scale. Resolvent analysis is then used to identify the length scales most affected by the high-temperature gas effects in hypersonic TBLs. It is shown that the high-temperature gas effects primarily affect modes localized near the peak mean temperature.
Due to the chemical nonequilibrium effects, the modes can be linearly amplified through changes in chemical concentration, which have non-negligible effects on the higher order modes. Correlations in the components of the small-scale resolvent modes agree qualitatively with similar correlations in simulation data. Finally, efficient strategies for resolvent analysis are presented. These include an algorithm to autonomously sample the large amplification regions using a Bayesian Optimization-like approach and a projection-based method to approximate resolvent analysis through a reduced eigenvalue problem, derived from calculus of variations.

    Space object identification and classification from hyperspectral material analysis

    This paper presents a data processing pipeline designed to extract information from the hyperspectral signature of unknown space objects. The proposed methodology determines the material composition of space objects from single-pixel images. Two techniques are used for material identification and classification: one based on machine learning and the other based on a least-squares match with a library of known spectra. From this information, a supervised machine learning algorithm classifies the object into one of several categories based on the materials detected on the object. The behaviour of the material classification methods is investigated under non-ideal circumstances, to determine the effect of weathered materials and the behaviour when the training library is missing a material that is present in the object being observed. Finally, the paper presents some preliminary results on the identification and classification of space objects.

    Cyclic proof systems for modal fixpoint logics

    This thesis is about cyclic and ill-founded proof systems for modal fixpoint logics, with and without explicit fixpoint quantifiers. Cyclic and ill-founded proof theory allows proofs with infinite branches or paths, as long as they satisfy some correctness conditions ensuring the validity of the conclusion. In this dissertation we design a few cyclic and ill-founded systems: a cyclic one for the weak Grzegorczyk modal logic K4Grz, based on our explanation of the phenomenon of cyclic companionship; and ill-founded and cyclic ones for the full computation tree logic CTL* and the intuitionistic linear-time temporal logic iLTL. All systems are cut-free, and the cyclic ones for K4Grz and iLTL have fully finitary correctness conditions. Lastly, we use a cyclic system for the modal mu-calculus to obtain a proof of the uniform interpolation property for the logic which differs from the original, automata-based one.

    Classical and quantum algorithms for scaling problems

    This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases. We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature. For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers. We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.

    Emergence of number sense through the integration of multimodal information: developmental learning insights from neural network models

    Introduction: Associating multimodal information is essential for human cognitive abilities including mathematical skills. Multimodal learning has also attracted attention in the field of machine learning, and it has been suggested that the acquisition of better latent representation plays an important role in enhancing task performance. This study aimed to explore the impact of multimodal learning on representation, and to understand the relationship between multimodal representation and the development of mathematical skills. Methods: We employed a multimodal deep neural network as the computational model for multimodal associations in the brain. We compared the representations of numerical information, that is, handwritten digits and images containing a variable number of geometric figures learned through single- and multimodal methods. Next, we evaluated whether these representations were beneficial for downstream arithmetic tasks. Results: Multimodal training produced better latent representation in terms of clustering quality, which is consistent with previous findings on multimodal learning in deep neural networks. Moreover, the representations learned using multimodal information exhibited superior performance in arithmetic tasks. Discussion: Our novel findings experimentally demonstrate that changes in acquired latent representations through multimodal association learning are directly related to cognitive functions, including mathematical skills. This supports the possibility that multimodal learning using deep neural network models may offer novel insights into higher cognitive functions.

    Backpropagation Beyond the Gradient

    Automatic differentiation is a key enabler of deep learning: previously, practitioners were limited to models for which they could manually compute derivatives. Now, they can create sophisticated models with almost no restrictions and train them using first-order, i.e., gradient, information. Popular libraries like PyTorch and TensorFlow compute this gradient efficiently, automatically, and conveniently with a single line of code. Under the hood, reverse-mode automatic differentiation, or gradient backpropagation, powers the gradient computation in these libraries. Their entire design centers around gradient backpropagation. These frameworks are specialized around one specific task: computing the average gradient in a mini-batch. This specialization often complicates the extraction of other information like higher-order statistical moments of the gradient, or higher-order derivatives like the Hessian. It limits practitioners and researchers to methods that rely on the gradient. Arguably, this hampers the field from exploring the potential of higher-order information, and there is evidence that focusing solely on the gradient has not led to significant recent advances in deep learning optimization. To advance algorithmic research and inspire novel ideas, information beyond the batch-averaged gradient must be made available at the same level of computational efficiency, automation, and convenience. This thesis presents approaches to simplify experimentation with rich information beyond the gradient by making it more readily accessible. We present an implementation of these ideas as an extension to the backpropagation procedure in PyTorch. Using this newly accessible information, we demonstrate possible use cases by (i) showing how it can inform our understanding of neural network training by building a diagnostic tool, and (ii) enabling novel methods to efficiently compute and approximate curvature information.
First, we extend gradient backpropagation for sequential feedforward models to Hessian backpropagation which enables computing approximate per-layer curvature. This perspective unifies recently proposed block-diagonal curvature approximations. Like gradient backpropagation, the computation of these second-order derivatives is modular, and therefore simple to automate and extend to new operations. Based on the insight that rich information beyond the gradient can be computed efficiently and at the same time, we extend the backpropagation in PyTorch with the BackPACK library. It provides efficient and convenient access to statistical moments of the gradient and approximate curvature information, often at a small overhead compared to computing just the gradient. Next, we showcase the utility of such information to better understand neural network training. We build the Cockpit library that visualizes what is happening inside the model during training through various instruments that rely on BackPACK’s statistics. We show how Cockpit provides a meaningful statistical summary report to the deep learning engineer to identify bugs in their machine learning pipeline, guide hyperparameter tuning, and study deep learning phenomena. Finally, we use BackPACK’s extended automatic differentiation functionality to develop ViViT, an approach to efficiently compute curvature information, in particular curvature noise. It uses the low-rank structure of the generalized Gauss-Newton approximation to the Hessian and addresses shortcomings in existing curvature approximations. Through monitoring curvature noise, we demonstrate how ViViT’s information helps in understanding challenges to make second-order optimization methods work in practice. This work develops new tools to experiment more easily with higher-order information in complex deep learning models.
These tools have impacted works on Bayesian applications with Laplace approximations, out-of-distribution generalization, differential privacy, and the design of automatic differentiation systems. They constitute one important step towards developing and establishing more efficient deep learning algorithms.

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    MASIL: Towards Maximum Separable Class Representation for Few Shot Class Incremental Learning

    Few Shot Class Incremental Learning (FSCIL), with few examples per class for each incremental session, is the realistic setting of continual learning, since obtaining a large number of annotated samples is neither feasible nor cost-effective. We present the framework MASIL as a step towards learning the maximal separable classifier. It addresses the common problems of forgetting old classes and over-fitting to novel classes by learning classifier weights that are maximally separable between classes, forming a simplex Equiangular Tight Frame. We propose the idea of concept factorization, explaining the collapsed features for base session classes in terms of a concept basis, and use these to induce the classifier simplex for few-shot classes. We further add fine-tuning to reduce any error incurred during factorization and train the classifier jointly on base and novel classes without retaining any base class samples in memory. Experimental results on miniImageNet, CIFAR-100 and CUB-200 demonstrate that MASIL outperforms all the benchmarks. Comment: 13 pages, 2 figures, 6 tables