54 research outputs found

    Some phenomenological investigations in deep learning

    The striking empirical success of deep neural networks in machine learning raises a number of theoretical puzzles. For example, why can they generalize to unseen data despite their capacity to fully memorize the training examples? Such puzzles have been the subject of intense research efforts in the past few years, which combine rigorous analysis of simplified systems with empirical studies of phenomenological properties shown to correlate with generalization. The first two articles presented in this thesis contribute to this line of work. They highlight and discuss mechanisms that allow large models to prioritize learning 'simple' functions during training and to adapt their capacity to the complexity of the problem. The third article addresses the long-standing problem of estimating mutual information in high dimension by leveraging the expressiveness and scalability of deep neural networks. It introduces and studies a new class of estimators and presents several applications in unsupervised learning, notably to the improvement of neural generative models.
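    To make the mutual-information estimation idea concrete, the following is a minimal sketch of a generic neural estimator built on the Donsker-Varadhan lower bound (in the spirit of MINE); it is not the estimator class introduced in the thesis, and the network shape and batch-shuffling trick are illustrative assumptions.

```python
import torch
import torch.nn as nn


class StatisticsNetwork(nn.Module):
    """Small MLP T(x, y) used inside the Donsker-Varadhan lower bound."""

    def __init__(self, dim_x, dim_y, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_y, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1))


def dv_lower_bound(T, x, y):
    """Donsker-Varadhan bound: I(X;Y) >= E_joint[T] - log E_marginals[exp(T)]."""
    joint_term = T(x, y).mean()
    # Shuffling y across the batch approximates samples from the product of marginals p(x)p(y).
    y_shuffled = y[torch.randperm(y.shape[0])]
    n = torch.tensor(float(y.shape[0]))
    marginal_term = torch.logsumexp(T(x, y_shuffled), dim=0) - torch.log(n)
    return joint_term - marginal_term.squeeze()
```

    Maximizing this bound over the parameters of T by stochastic gradient ascent yields the estimate; pairs of correlated Gaussians, whose mutual information is known in closed form, are a convenient sanity check.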

    Scalable Learning In Distributed Robot Teams

    Mobile robots are already in use for mapping, agriculture, entertainment, and the delivery of goods and people. As robotic systems continue to become more affordable, large numbers of mobile robots may be deployed concurrently to accomplish tasks faster and more efficiently. Practical deployments of very large teams will require scalable algorithms to enable the distributed cooperation of autonomous agents. This thesis focuses on the three main algorithmic obstacles to the scalability of robot teams: coordination, control, and communication. To address these challenges, we design graph-based abstractions that allow us to apply Graph Neural Networks (GNNs). First, a team of robots must continually coordinate to divide up mission requirements among all agents. We focus on the case studies of exploration and coverage to develop a spatial GNN controller that can coordinate a team of dozens of agents as they visit thousands of landmarks. A routing problem of this size is intractable for existing optimization-based approaches. Second, a robot in a team must be able to execute the trajectory that will accomplish its given sub-task. In large teams with high densities of robots, planning and executing safe, collision-free trajectories requires joint optimization over all agent trajectories, which quickly becomes impractical. We present two approaches to scalable control: a) a controller for flocking that uses delayed communication formalized via a GNN; and b) an inverse optimal planning method that learns from real air traffic data. Third, robot teams may need to operate in harsh environments without existing communication infrastructure, requiring the formation of ad-hoc networks to exchange information. Many algorithms for multi-robot control assume that the low-latency, global state information needed to coordinate agent actions can readily be disseminated among the team. Our approach leverages GNNs to control the connectivity of the ad-hoc network and to provide the data-distribution infrastructure that many multi-robot algorithms require. Finally, this thesis develops a framework for distributed learning to be used when centralized information is unavailable during training. Our approach allows robots to train controllers independently and then share their experiences by composing multiple models represented in a Reproducing Kernel Hilbert Space.
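    As a rough illustration of the graph-based abstraction (not the specific controllers developed in the thesis), the sketch below shows one message-passing step in which each robot mixes its own state with an aggregate of its neighbours' states over the communication graph; the mean aggregation, feature sizes, and toy graph are assumptions.

```python
import numpy as np


def gnn_layer(node_states, adjacency, w_self, w_neigh):
    """One message-passing step over the robot communication graph.

    node_states: (n_robots, d_in) local observations
    adjacency:   (n_robots, n_robots) 0/1 communication graph
    w_self, w_neigh: (d_in, d_out) weight matrices shared by all robots
    """
    # Mean of neighbour states (degree-normalised aggregation).
    deg = adjacency.sum(axis=1, keepdims=True).clip(min=1.0)
    neighbour_mean = (adjacency @ node_states) / deg
    # Each robot combines its own state with the aggregated messages.
    return np.maximum(node_states @ w_self + neighbour_mean @ w_neigh, 0.0)


# Toy example: 4 robots on a line graph with 3-dimensional states.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
states = rng.normal(size=(4, 3))
out = gnn_layer(states, adj, rng.normal(size=(3, 8)), rng.normal(size=(3, 8)))
```

    Stacking several such layers lets information propagate over multiple communication hops while every robot keeps executing the same shared, local rule, which is what allows a controller of this kind to scale with team size.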

    The Shallow and the Deep: A biased introduction to neural networks and old school machine learning

    The Shallow and the Deep is a collection of lecture notes that offers an accessible introduction to neural networks and machine learning in general. However, it was clear from the beginning that these notes would not be able to cover this rapidly changing and growing field in its entirety. The focus lies on classical machine learning techniques, with a bias towards classification and regression. Other learning paradigms and many recent developments in, for instance, Deep Learning are not addressed or only briefly touched upon. Biehl argues that having a solid knowledge of the foundations of the field is essential, especially for anyone who wants to explore the world of machine learning with an ambition that goes beyond the application of some software package to some data set. Therefore, The Shallow and the Deep places emphasis on fundamental concepts and theoretical background. This also involves delving into the history and pre-history of neural networks, where the foundations for most of the recent developments were laid. These notes aim to demystify machine learning and neural networks without losing the appreciation for their impressive power and versatility.

    Role of biases in neural network models


    Optimal aeroelastic trim for rotorcraft with constrained, non-unique trim solutions

    New rotorcraft configurations are emerging, such as the optimal-speed helicopter and the slowed-rotor compound helicopter, which, due to variable rotor speed and redundant lifting components, have non-unique trim solution spaces. The combination of controls and rotor speed that produces the best steady-flight condition is sought among all the possible solutions. This work develops the concept of optimal rotorcraft trim and explores its application to advanced rotorcraft configurations with non-unique, constrained trim solutions. The optimal trim work is based on the nonlinear programming method of the generalized reduced gradient (GRG) and is integrated into a multi-body, comprehensive aeroelastic rotorcraft code. In addition to the concept of optimal trim, two further developments are presented that extend optimal trim to rotorcraft whose rotors operate over a wide range of rotor speeds. The first is the concept of variable rotor speed trim, with special application to rotors operating in steady autorotation. The technique developed herein treats rotor speed as a trim variable and uses a Newton-Raphson iterative method to drive the average rotor torque to zero, simultaneously with the other dependent trim variables. The second additional contribution of this thesis is a novel way to rapidly approximate elastic rotor blade stresses and strains in the aeroelastic trim analysis for use as structural constraints. For rotors that operate over large angular-velocity ranges, rotor resonance and increased flapping conditions are encountered that can drive the maximum cross-sectional stress and strain beyond endurance limits; such conditions must be avoided. The method developed herein captures the maximum cross-sectional stress/strain using the trained response of an artificial neural network (ANN) surrogate as a function of 1-D beam forces and moments. The stresses/strains are computed simultaneously with the optimal trim and are used as constraints in the trim solution. Finally, an optimal trim analysis is applied to a high-speed compound gyroplane configuration, which has two distinct rotor speed control methods, with the purpose of maximizing the vehicle cruise efficiency while maintaining rotor blade strain below endurance-limit values.
    Ph.D. Committee Chair: Dimitri N. Mavris; Committee Co-Chair: Daniel P. Schrage; Committee Member: David A. Peters; Committee Member: Dewey H. Hodges; Committee Member: J.V.R. Prasad
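    The variable rotor-speed trim idea can be sketched as a scalar Newton-Raphson iteration that drives average rotor torque to zero; the placeholder torque function, perturbation size, and tolerances below are assumptions, not the aeroelastic model or the full multi-variable trim loop used in the thesis.

```python
def trim_rotor_speed(avg_torque, omega0, d_omega=0.1, tol=1e-6, max_iter=50):
    """Drive avg_torque(omega) to zero (steady autorotation) by Newton-Raphson."""
    omega = omega0
    for _ in range(max_iter):
        q = avg_torque(omega)
        if abs(q) < tol:
            break
        # Finite-difference sensitivity dQ/d(omega): perturb the trim variable
        # and re-evaluate the (placeholder) torque response.
        dq_domega = (avg_torque(omega + d_omega) - q) / d_omega
        omega -= q / dq_domega
    return omega


# Toy placeholder torque curve (not a rotor model): zero net torque at 30 rad/s.
omega_trim = trim_rotor_speed(lambda w: 0.05 * (w - 30.0), omega0=40.0)
```

    In the actual analysis this update would be carried out simultaneously with the other dependent trim variables inside the comprehensive aeroelastic code, with the ANN stress/strain surrogate supplying the structural constraints.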