The Shapley Value of Classifiers in Ensemble Games
What is the value of an individual model in an ensemble of binary
classifiers? We answer this question by introducing a class of transferable
utility cooperative games called \textit{ensemble games}. In machine learning
ensembles, pre-trained models cooperate to make classification decisions. To
quantify the importance of models in these ensemble games, we define
\textit{Troupe} -- an efficient algorithm which allocates payoffs based on
approximate Shapley values of the classifiers. We argue that the Shapley value
of models in these games is an effective decision metric for choosing a
high-performing subset of models from the ensemble. Our analytical findings
prove that our Shapley value estimation scheme is precise and scalable; its
performance improves with the size of the dataset and the ensemble. Empirical
results on real-world graph classification tasks demonstrate that our
algorithm produces high-quality estimates of the Shapley value. We find that Shapley
values can be utilized for ensemble pruning, and that adversarial models
receive a low valuation. Complex classifiers are frequently found to be
responsible for both correct and incorrect classification decisions.

Comment: Source code is available here: https://github.com/benedekrozemberczki/shaple
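The Troupe algorithm itself is not reproduced in the abstract, but the underlying idea it approximates, permutation-sampling Monte Carlo estimation of Shapley values with ensemble accuracy as the characteristic function, can be sketched as follows. The majority-vote payoff and the model names here are illustrative assumptions, not the paper's implementation.

```python
# A minimal Monte Carlo sketch of permutation-sampling Shapley values for
# classifiers in a binary ensemble (NOT the authors' Troupe algorithm).
import random

def ensemble_accuracy(members, predictions, labels):
    """Majority-vote accuracy of a subset of ensemble members (payoff v(S))."""
    if not members:
        return 0.0
    correct = 0
    for preds, y in zip(zip(*(predictions[m] for m in members)), labels):
        vote = 1 if sum(preds) * 2 >= len(preds) else 0  # ties break towards 1
        correct += vote == y
    return correct / len(labels)

def shapley_estimates(predictions, labels, num_samples=200, seed=0):
    """Average each model's marginal contribution over random permutations."""
    rng = random.Random(seed)
    models = list(predictions)
    values = {m: 0.0 for m in models}
    for _ in range(num_samples):
        rng.shuffle(models)
        coalition, prev = [], 0.0
        for m in models:
            coalition.append(m)
            acc = ensemble_accuracy(coalition, predictions, labels)
            values[m] += acc - prev  # marginal contribution of m
            prev = acc
    return {m: v / num_samples for m, v in values.items()}
```

Because the marginal contributions along each permutation telescope, the estimates always sum to the grand-coalition payoff (the efficiency axiom), which makes the estimator easy to sanity-check.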
Neural network security and optimization for single-person authentication using electroencephalogram data
Includes bibliographical references. 2022 Fall.

Security is an important concern for devices that use biometric data, so security around authentication must be considered. This is true for brain-computer interfaces (BCIs), which often use electroencephalogram (EEG) data as input and neural network classification to determine their function. EEG data can also serve as a form of biometric authentication, which would contribute to the security of these devices. Neural networks have also used a method known as ablation to improve their efficiency. In light of this, the goal of this research is to determine whether neural network ablation can also be used to improve security by restricting a network's learning capabilities to authenticating only a given target, thereby preventing adversaries from training new data to be authenticated. Data on the change in entropy of the networks' weight values after training were also collected in order to identify patterns in weight distribution. Results from a set of ablated networks were compared with a set of baseline (non-ablated) networks for five targets chosen randomly from a data set of 12 people. The ablated networks maintained accuracy through the ablation process but did not perform as well as the baseline networks. The change in performance between single-target authentication and target-plus-invader authentication was also examined, but no significant differences were found. Furthermore, the change in entropy differed both between baseline and ablated networks and between single-target and target-plus-invader authentication for all networks. Ablation was determined to have potential for security applications that merits further study, and weight distribution was found to correlate to some degree with the complexity of a network's input.
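The two ingredients of this study, ablating weights and measuring the entropy of a weight distribution, can be sketched generically as below. This is a hypothetical magnitude-based ablation and a simple histogram entropy, not the thesis's actual procedure or thresholds.

```python
# A hypothetical sketch of weight ablation and weight-distribution entropy,
# in the spirit of the study above (not its exact method).
import math

def ablate(weights, fraction=0.5):
    """Zero out the smallest-magnitude fraction of a layer's weights."""
    k = int(len(weights) * fraction)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

def weight_entropy(weights, bins=10):
    """Shannon entropy (in bits) of a histogram of weight values."""
    lo, hi = min(weights), max(weights)
    width = (hi - lo) / bins or 1.0  # guard against all-equal weights
    counts = [0] * bins
    for w in weights:
        counts[min(int((w - lo) / width), bins - 1)] += 1
    n = len(weights)
    return -sum(c / n * math.log2(c / n) for c in counts if c)
```

Comparing `weight_entropy` before and after training (or before and after `ablate`) gives the kind of distribution-change signal the study examined.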
Tools for efficient Deep Learning
In the era of Deep Learning (DL), there is a fast-growing demand for building and deploying Deep Neural Networks (DNNs) on various platforms. This thesis proposes five tools to address the challenges for designing DNNs that are efficient in time, in resources and in power consumption.
We first present Aegis and SPGC to address the challenges in improving the memory efficiency of DL training and inference. Aegis makes mixed precision training (MPT) more stable through layer-wise gradient scaling. Empirical experiments show that Aegis can improve MPT accuracy by up to 4\%. SPGC focuses on structured pruning: replacing standard convolution with group convolution (GConv) to avoid irregular sparsity. SPGC formulates GConv pruning as a channel permutation problem and proposes a novel heuristic polynomial-time algorithm. Common DNNs pruned by SPGC achieve up to 1\% higher accuracy than prior work.
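The layer-wise scaling idea behind Aegis can be illustrated with a toy sketch: give each layer its own power-of-two loss scale so that its largest gradient stays inside the representable half-precision range, then unscale before the optimiser step. This is a generic mixed-precision scaling sketch under assumed names (`choose_scale`, `scaled_step`), not Aegis's published algorithm.

```python
# A hypothetical sketch of layer-wise gradient scaling for mixed precision
# training (inspired by, but not reproducing, Aegis).
FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value

def choose_scale(grads, headroom=2.0):
    """Pick a power-of-two scale keeping max|g| * scale * headroom <= FP16_MAX."""
    peak = max(abs(g) for g in grads) or 1.0
    scale = 1.0
    while peak * scale * headroom * 2.0 <= FP16_MAX:
        scale *= 2.0
    return scale

def scaled_step(layer_grads):
    """Scale each layer's gradients independently, then unscale for the update."""
    updates = {}
    for name, grads in layer_grads.items():
        s = choose_scale(grads)
        scaled = [g * s for g in grads]          # what fp16 storage would see
        updates[name] = [g / s for g in scaled]  # unscale before the optimiser
    return updates
```

Because each layer picks its own scale, a layer with tiny gradients is not forced to share the scale of a layer with large ones, which is the instability that a single global loss scale suffers from.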
This thesis also addresses the challenges in the gap between DNN descriptions and executables, with Polygeist for software and POLSCA for hardware. Several novel techniques, e.g. statement splitting and memory partitioning, are explored and used to extend polyhedral optimisation. Polygeist speeds up sequential and parallel software execution by 2.53x and 9.47x respectively on Polybench/C. POLSCA achieves a 1.5x speedup over hardware designs generated directly from high-level synthesis on Polybench/C.
Moreover, this thesis presents Deacon, a framework that generates FPGA-based DNN accelerators with streaming architectures and advanced pipelining techniques, addressing the challenges posed by heterogeneous convolutions and residual connections. Deacon provides fine-grained pipelining, graph-level optimisation, and heuristic exploration via graph colouring. Compared with prior designs, Deacon improves resource/power efficiency by 1.2x/3.5x for MobileNets and 1.0x/2.8x for SqueezeNets.
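The graph-colouring heuristic mentioned for Deacon's design exploration is, in its simplest generic form, a greedy colouring: each node takes the smallest colour not used by its neighbours. The sketch below shows only that generic building block; Deacon's actual formulation and cost model are not reproduced here.

```python
# A generic greedy graph-colouring sketch (an assumed building block, not
# Deacon's actual exploration heuristic).
def greedy_colouring(adjacency):
    """Assign each node the smallest colour unused by its neighbours."""
    colours = {}
    for node in sorted(adjacency):  # deterministic visiting order
        taken = {colours[n] for n in adjacency[node] if n in colours}
        c = 0
        while c in taken:
            c += 1
        colours[node] = c
    return colours
```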
All these tools are open source, and some have already gained public engagement. We believe they can make efficient deep learning applications easier to build and deploy.

Open Access
Graph Filters for Signal Processing and Machine Learning on Graphs
Filters are fundamental in extracting information from data. For time series
and image data that reside on Euclidean domains, filters are the crux of many
signal processing and machine learning techniques, including convolutional
neural networks. Increasingly, modern data also reside on networks and other
irregular domains whose structure is better captured by a graph. To process and
learn from such data, graph filters account for the structure of the underlying
data domain. In this article, we provide a comprehensive overview of graph
filters, including the different filtering categories, design strategies for
each type, and trade-offs between different types of graph filters. We discuss
how to extend graph filters into filter banks and graph neural networks to
enhance the representational power; that is, to model a broader variety of
signal classes, data patterns, and relationships. We also showcase the
fundamental role of graph filters in signal processing and machine learning
applications. We aim for this article to provide a unifying framework for
both beginner and experienced researchers, as well as a common understanding
that promotes collaboration at the intersections of signal processing, machine
learning, and application domains.
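A canonical example of the graph filters surveyed above is the polynomial filter y = Σ_k h_k S^k x, where S is a graph shift operator (e.g. an adjacency or Laplacian matrix), x is a graph signal, and h_k are filter taps. The sketch below is a minimal dense-matrix illustration of that form, not code from the article.

```python
# A minimal sketch of a polynomial graph filter y = sum_k h_k S^k x,
# computed by repeated application of the graph shift operator S.
def matvec(S, x):
    """Dense matrix-vector product Sx."""
    return [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(S))]

def graph_filter(S, coeffs, x):
    """Apply y = h_0 x + h_1 Sx + h_2 S^2 x + ... via repeated shifts."""
    y = [coeffs[0] * xi for xi in x]
    z = x
    for h in coeffs[1:]:
        z = matvec(S, z)  # z now holds S^k x
        y = [yi + h * zi for yi, zi in zip(y, z)]
    return y
```

Because each tap only needs one more application of S, the cost is K shifts for a degree-K filter, which is what makes such filters attractive on large sparse graphs.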
Higher-Order Calculations in Quantum Chromodynamics
In this thesis, several techniques and advances in higher-order Quantum Chromodynamics (QCD) calculations are presented. There is a particular focus on 2-loop 5-point massless QCD amplitudes, which are currently at the frontier of higher-order QCD calculations.
Firstly, we study the Brodsky-Lepage-Mackenzie/Principle of Maximum Conformality (BLM/PMC) method for setting the renormalisation scale, μ_R, in higher-order QCD calculations. We identify three ambiguities in the BLM/PMC procedure and study their numerical impact using the example of the total cross-section for top-pair production at Next-to-Next-to-Leading Order (NNLO) in QCD. The numerical impact of these ambiguities on the BLM/PMC prediction for the cross-section is found to be comparable to the impact of the choice of μ_R in the conventional scale-setting approach.
Secondly, we introduce a novel strategy for solving integration-by-parts (IBP) identities, which are widely used in the computation of multi-loop QCD amplitudes. We implement the strategy in an efficient C++ program and hence solve the IBP identities needed for the computation of any planar 2-loop 5-point massless amplitude in QCD. We also derive representative results for the most complicated non-planar family of integrals.
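The IBP identities mentioned above arise because, in dimensional regularisation, the integral of a total derivative vanishes; schematically, for inverse propagators D_i raised to integer powers a_i and any loop or external momentum v:

```latex
% Integration-by-parts identity: the integral of a total derivative vanishes
% in dimensional regularisation, yielding linear relations among loop integrals.
\int \frac{\mathrm{d}^D k}{(2\pi)^D}\,
  \frac{\partial}{\partial k^{\mu}}
  \left( \frac{v^{\mu}}{D_1^{a_1} D_2^{a_2} \cdots D_n^{a_n}} \right) = 0 .
```

Expanding the derivative produces linear relations among integrals with shifted powers a_i, and solving these systems reduces the enormous set of integrals in an amplitude to a small basis of master integrals.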
Thirdly, we present an automated computational framework to reduce 2-loop 5-point massless amplitudes to a basis of pentagon functions. It uses finite-field evaluation and interpolation techniques, as well as the aforementioned analytical IBP results. We use this to calculate the leading-colour 2-loop QCD amplitude for qq̄→γγγ and then compute the NNLO QCD corrections to 3-photon production at the LHC. This is the first NNLO QCD calculation for a 2→3 process. We compare our predictions with the available 8 TeV measurements from the ATLAS collaboration, and we find that including the NNLO corrections eliminates the existing significant discrepancy with respect to NLO QCD predictions, paving the way for precision phenomenology in this process.
High-Level GPU Programming: Domain-Specific Optimization and Inference
When writing computer software, one is often forced to balance the need for high run-time performance against high programmer productivity. Using a high-level language often cuts development time, but typically at the cost of reduced run-time performance; using a lower-level language, programs can be made very efficient, but at the cost of increased development time. Real-time computer graphics is an area with very high demands on both performance and visual quality. Typically, large portions of such applications are written in lower-level languages and also rely on dedicated hardware, in the form of programmable graphics processing units (GPUs), for handling computationally demanding rendering algorithms. These GPUs are parallel stream processors, specialized towards computer graphics, whose computational performance is more than an order of magnitude higher than that of corresponding CPUs. This has revolutionized computer graphics and also led to GPUs being used to solve more general numerical problems, such as fluid and physics simulation, protein folding, image processing, and databases. Unfortunately, the highly specialized nature of GPUs has also made them difficult to program. In this dissertation we show that GPUs can be programmed at a higher level than current lower-level languages allow, while maintaining performance. By constructing a domain-specific language (DSL), which provides appropriate domain-specific abstractions and user annotations, it is possible to write programs in a more abstract and modular manner. Using knowledge of the domain, the DSL compiler can generate very efficient code. We show experimentally that the performance of our DSLs is equal to that of GPU programs written by hand in current low-level languages. Also, control over the trade-offs between visual quality and performance is retained.
In the papers included in this dissertation, we present domain-specific languages targeted at numerical processing and computer graphics, respectively. These DSLs have been implemented as embedded languages in Python, a dynamic programming language that provides a rich set of high-level features. In this dissertation we show how these features can be used to facilitate the construction of embedded languages.
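The embedding technique described, using Python's dynamic features to host a DSL, is typically built on operator overloading: user expressions construct a program graph instead of computing eagerly, and the DSL compiler then processes that graph. The toy below only illustrates that mechanism; the dissertation's DSLs are far richer, and all class and method names here are illustrative.

```python
# A toy sketch of an embedded DSL in Python: operator overloading builds an
# expression tree that can later be analysed, compiled, or evaluated.
class Expr:
    def __add__(self, other): return Op("+", self, wrap(other))
    def __mul__(self, other): return Op("*", self, wrap(other))

class Var(Expr):
    def __init__(self, name): self.name = name
    def eval(self, env): return env[self.name]
    def __repr__(self): return self.name

class Const(Expr):
    def __init__(self, value): self.value = value
    def eval(self, env): return self.value
    def __repr__(self): return repr(self.value)

class Op(Expr):
    def __init__(self, sym, lhs, rhs): self.sym, self.lhs, self.rhs = sym, lhs, rhs
    def eval(self, env):
        a, b = self.lhs.eval(env), self.rhs.eval(env)
        return a + b if self.sym == "+" else a * b
    def __repr__(self): return f"({self.lhs} {self.sym} {self.rhs})"

def wrap(v):
    """Lift plain Python values into the DSL."""
    return v if isinstance(v, Expr) else Const(v)

x = Var("x")
program = x * 2 + 1  # builds an expression tree; nothing is computed yet
```

A real GPU-targeting DSL would walk such a tree to emit kernel code rather than call `eval`, but the host-language machinery, overloaded operators capturing user code as data, is the same.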