12 research outputs found

    Optimizing ANN Architectures using Mixed-Integer Programming

    Full text link
    Over-parameterized networks, where the number of parameters surpass the number of train-ing samples, generalize well on various tasks. However, large networks are computationally expensive in terms of the training and inference time. Furthermore, the lottery ticket hy-pothesis states that a subnetwork of a randomly initialized network can achieve marginal loss after training on a specific task compared to the original network. Therefore, there is a need to optimize the inference and training time, and a potential for more compact neural architectures. We introduce a novel approach “Optimizing ANN Architectures using Mixed-Integer Programming” (OAMIP) to find these subnetworks by identifying critical neurons and re-moving non-critical ones, resulting in a faster inference time. The proposed OAMIP utilizes a Mixed-Integer Program (MIP) for assigning importance scores to each neuron in deep neural network architectures. Our MIP is guided by the impact on the main learning task of the net-work when simultaneously pruning subsets of neurons. In concrete, the optimization of the objective function drives the solver to minimize the number of neurons, to limit the network to critical neurons, i.e., with high importance score, that need to be kept for maintaining the overall accuracy of the trained neural network. Further, the proposed formulation generalizes the recently considered lottery ticket hypothesis by identifying multiple “lucky” subnetworks, resulting in optimized architectures, that not only perform well on a single dataset, but also generalize across multiple ones upon retraining of network weights. Finally, we present a scalable implementation of our method by decoupling the importance scores across layers using auxiliary networks and across di˙erent classes. We demonstrate the ability of OAMIP to prune neural networks with marginal loss in accuracy and generalizability on popular datasets and architectures.Les réseaux sur-paramétrés, où le nombre de paramètres dépasse le nombre de données, se généralisent bien sur diverses tâches. Cependant, les grands réseaux sont coûteux en termes d’entraînement et de temps d’inférence. De plus, l’hypothèse du billet de loterie indique qu’un sous-réseau d’un réseau initialisé de façon aléatoire peut atteindre une perte marginale après l’entrainement sur une tâche spécifique par rapport au réseau de référence. Par conséquent, il est nécessaire d’optimiser le temps d’inférence et d’entrainement, ce qui est possible pour des architectures neurales plus compactes. Nous introduisons une nouvelle approche “Optimizing ANN Architectures using Mixed-Integer Programming” (OAMIP) pour trouver ces sous-réseaux en identifiant les neurones importants et en supprimant les neurones non importants, ce qui permet d’accélérer le temps d’inférence. L’approche OAMIP proposée fait appel à un programme mixte en nombres entiers (MIP) pour attribuer des scores d’importance à chaque neurone dans les architectures de modèles profonds. Notre MIP est guidé par l’impact sur la principale tâche d’apprentissage du réseau en élaguant simultanément les neurones. En définissant soigneusement la fonction objective du MIP, le solveur aura une tendance à minimiser le nombre de neurones, à limiter le réseau aux neurones critiques, c’est-à-dire avec un score d’importance élevé, qui doivent être conservés pour maintenir la précision globale du réseau neuronal formé. De plus, la formulation proposée généralise l’hypothèse des billets de loterie récemment envisagée en identifiant de multiples sous-réseaux “chanceux”. Cela permet d’obtenir des architectures optimisées qui non seulement fonctionnent bien sur un seul ensemble de données, mais aussi se généralisent sur des di˙érents ensembles de données lors du recyclage des poids des réseaux. Enfin, nous présentons une implémentation évolutive de notre méthode en découplant les scores d’importance entre les couches à l’aide de réseaux auxiliaires et entre les di˙érentes classes. Nous démontrons la capacité de notre formulation à élaguer les réseaux de neurones avec une perte marginale de précision et de généralisabilité sur des ensembles de données et des architectures populaires

    Acta Cybernetica : Volume 14. Number 2.

    Get PDF

    Extending the Finite Domain Solver of GNU Prolog

    No full text
    International audienceThis paper describes three significant extensions for the Finite Domain solver of GNU Prolog. First, the solver now supports negative integers. Second, the solver detects and prevents integer overflows from occurring. Third, the internal representation of sparse domains has been redesigned to overcome its current limitations. The preliminary performance evaluation shows a limited slowdown factor with respect to the initial solver. This factor is widely counterbalanced by the new possibilities and the robustness of the solver. Furthermore these results are preliminary and we propose some directions to limit this overhead

    NASA SERC 1990 Symposium on VLSI Design

    Get PDF
    This document contains papers presented at the first annual NASA Symposium on VLSI Design. NASA's involvement in this event demonstrates a need for research and development in high performance computing. High performance computing addresses problems faced by the scientific and industrial communities. High performance computing is needed in: (1) real-time manipulation of large data sets; (2) advanced systems control of spacecraft; (3) digital data transmission, error correction, and image compression; and (4) expert system control of spacecraft. Clearly, a valuable technology in meeting these needs is Very Large Scale Integration (VLSI). This conference addresses the following issues in VLSI design: (1) system architectures; (2) electronics; (3) algorithms; and (4) CAD tools

    Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference

    Get PDF

    Automated Deduction – CADE 28

    Get PDF
    This open access book constitutes the proceeding of the 28th International Conference on Automated Deduction, CADE 28, held virtually in July 2021. The 29 full papers and 7 system descriptions presented together with 2 invited papers were carefully reviewed and selected from 76 submissions. CADE is the major forum for the presentation of research in all aspects of automated deduction, including foundations, applications, implementations, and practical experience. The papers are organized in the following topics: Logical foundations; theory and principles; implementation and application; ATP and AI; and system descriptions

    AVATAR - Machine Learning Pipeline Evaluation Using Surrogate Model

    Get PDF
    © 2020, The Author(s). The evaluation of machine learning (ML) pipelines is essential during automatic ML pipeline composition and optimisation. The previous methods such as Bayesian-based and genetic-based optimisation, which are implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. Therefore, the pipeline composition and optimisation of these methods requires a tremendous amount of time that prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid, and it is unnecessary to execute them to find out whether they are good pipelines. To address this issue, we propose a novel method to evaluate the validity of ML pipelines using a surrogate model (AVATAR). The AVATAR enables to accelerate automatic ML pipeline composition and optimisation by quickly ignoring invalid pipelines. Our experiments show that the AVATAR is more efficient in evaluating complex pipelines in comparison with the traditional evaluation approaches requiring their execution

    Space and Earth Sciences, Computer Systems, and Scientific Data Analysis Support, Volume 1

    Get PDF
    This Final Progress Report covers the specific technical activities of Hughes STX Corporation for the last contract triannual period of 1 June through 30 Sep. 1993, in support of assigned task activities at Goddard Space Flight Center (GSFC). It also provides a brief summary of work throughout the contract period of performance on each active task. Technical activity is presented in Volume 1, while financial and level-of-effort data is presented in Volume 2. Technical support was provided to all Division and Laboratories of Goddard's Space Sciences and Earth Sciences Directorates. Types of support include: scientific programming, systems programming, computer management, mission planning, scientific investigation, data analysis, data processing, data base creation and maintenance, instrumentation development, and management services. Mission and instruments supported include: ROSAT, Astro-D, BBXRT, XTE, AXAF, GRO, COBE, WIND, UIT, SMM, STIS, HEIDI, DE, URAP, CRRES, Voyagers, ISEE, San Marco, LAGEOS, TOPEX/Poseidon, Pioneer-Venus, Galileo, Cassini, Nimbus-7/TOMS, Meteor-3/TOMS, FIFE, BOREAS, TRMM, AVHRR, and Landsat. Accomplishments include: development of computing programs for mission science and data analysis, supercomputer applications support, computer network support, computational upgrades for data archival and analysis centers, end-to-end management for mission data flow, scientific modeling and results in the fields of space and Earth physics, planning and design of GSFC VO DAAC and VO IMS, fabrication, assembly, and testing of mission instrumentation, and design of mission operations center

    Thrust Area Report, Engineering Research, Development and Technology

    Full text link
    corecore