
    Compact: Approximating Complex Activation Functions for Secure Computation

    Secure multi-party computation (MPC) techniques can be used to provide data privacy when users query deep neural network (DNN) models hosted on a public cloud. State-of-the-art MPC techniques can be directly leveraged for DNN models that use simple activation functions (AFs) such as ReLU. However, DNN model architectures designed for cutting-edge applications often use complex and highly non-linear AFs. Designing efficient MPC techniques for such complex AFs is an open problem. Towards this, we propose Compact, which produces piece-wise polynomial approximations of complex AFs to enable their efficient use with state-of-the-art MPC techniques. Compact neither requires nor imposes any restriction on model training and results in near-identical model accuracy. We extensively evaluate Compact on four different machine-learning tasks with DNN architectures that use the popular complex AFs SiLU, GeLU, and Mish. Our experimental results show that Compact incurs negligible accuracy loss compared to DNN-specific approaches for handling complex non-linear AFs. We also incorporate Compact in two state-of-the-art MPC libraries for privacy-preserving inference and demonstrate that Compact provides a 2x-5x speedup in computation compared to the state-of-the-art approximation approach for non-linear functions, while providing similar or better accuracy for DNN models with a large number of hidden layers.
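
    The core idea above, replacing a complex activation with a piece-wise polynomial that only needs additions and multiplications, can be sketched as follows. This is an illustrative fit of GeLU and not the Compact algorithm itself; the interval [-6, 6], the twelve equal-width pieces, and the degree-2 polynomials are assumptions made here, not values from the paper.

        # Minimal sketch of piece-wise polynomial approximation of a complex AF (GeLU).
        # Not the Compact algorithm: breakpoints, interval and degree are illustrative.
        import numpy as np

        def gelu(x):
            # tanh-based GeLU, as used by many DNN frameworks
            return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

        def fit_piecewise(f, breakpoints, degree=2, samples=200):
            """Fit one low-degree polynomial per interval between consecutive breakpoints."""
            pieces = []
            for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
                xs = np.linspace(lo, hi, samples)
                pieces.append((lo, hi, np.polyfit(xs, f(xs), degree)))  # least-squares fit per piece
            return pieces

        def eval_piecewise(pieces, x):
            """Evaluate the piece-wise approximation; clamp outside the fitted range."""
            x = np.asarray(x, dtype=float)
            out = np.empty_like(x)
            for lo, hi, coeffs in pieces:
                mask = (x >= lo) & (x <= hi)
                out[mask] = np.polyval(coeffs, x[mask])
            out[x < pieces[0][0]] = np.polyval(pieces[0][2], pieces[0][0])
            out[x > pieces[-1][1]] = np.polyval(pieces[-1][2], pieces[-1][1])
            return out

        pieces = fit_piecewise(gelu, breakpoints=np.linspace(-6.0, 6.0, 13))
        xs = np.linspace(-6.0, 6.0, 1000)
        print("max abs error:", np.max(np.abs(eval_piecewise(pieces, xs) - gelu(xs))))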

    The General Approximation Theorem

    A general approximation theorem is proved. It uniformly encompasses both the classical Stone theorem and the approximation of functions of several variables by means of superpositions and linear combinations of functions of one variable. The theorem is interpreted as a statement on the universal approximating possibilities ("approximating omnipotence") of an arbitrary nonlinearity. For neural networks, our result states that the neuron activation function must be nonlinear, and nothing else.
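
    The approximating family referred to above, linear combinations of superpositions of a single nonlinear activation, is commonly written as below; the symbols N, c_i, w_i, b_i are the usual free parameters and are notation added here, not taken from the abstract.

        % sigma is the (nonlinear) neuron activation; N, c_i, w_i, b_i are free parameters
        f(x) \;\approx\; \sum_{i=1}^{N} c_i \,\sigma\!\bigl(\langle w_i, x\rangle + b_i\bigr),
        \qquad x \in \mathbb{R}^d .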

    An efficient hardware architecture for a neural network activation function generator

    This paper proposes an efficient hardware architecture for a function generator suitable for an artificial neural network (ANN). A spline-based approximation function is designed that provides a good trade-off between accuracy and silicon area, whilst also being inherently scalable and adaptable for numerous activation functions. This has been achieved by using a minimax polynomial and through optimal placement of the approximating polynomials based on the results of a genetic algorithm. The approximation error of the proposed method compares favourably to all related research in this field. Efficient hardware multiplication circuitry is used in the implementation, which reduces the area overhead and increases the throughput.
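
    The evaluation path of such a spline-based generator can be sketched in software as follows: select the segment containing the input, then evaluate that segment's polynomial with Horner's rule in fixed-point arithmetic, one multiply-add per coefficient, matching the multiplier-based implementation described above. The Q4.12 format, the segment boundaries, and the coefficients below are illustrative assumptions, not the minimax coefficients or genetic-algorithm placements from the paper.

        # Software sketch of a spline-based activation generator's evaluation path.
        # Q4.12 fixed point, segment bounds and coefficients are illustrative only.
        FRAC_BITS = 12
        SCALE = 1 << FRAC_BITS

        def to_fx(x):   return int(round(x * SCALE))
        def from_fx(x): return x / SCALE

        def fx_mul(a, b):
            # fixed-point multiply followed by an arithmetic right shift,
            # as a hardware multiplier plus shifter would do
            return (a * b) >> FRAC_BITS

        # illustrative segments of a sigmoid-like curve on [0, 4): (upper bound, coeffs high -> low)
        SEGMENTS = [
            (to_fx(1.0), [to_fx(-0.004), to_fx(0.25), to_fx(0.50)]),
            (to_fx(2.0), [to_fx(-0.040), to_fx(0.32), to_fx(0.46)]),
            (to_fx(4.0), [to_fx(-0.020), to_fx(0.24), to_fx(0.54)]),
        ]

        def activation_fx(x_fx):
            """Evaluate the piece-wise polynomial at a non-negative fixed-point input."""
            for upper, coeffs in SEGMENTS:
                if x_fx < upper:
                    acc = coeffs[0]
                    for c in coeffs[1:]:      # Horner's rule: one multiply-add per coefficient
                        acc = fx_mul(acc, x_fx) + c
                    return acc
            return to_fx(1.0)                 # saturate beyond the last segment

        print(from_fx(activation_fx(to_fx(1.5))))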

    The necessity of depth for artificial neural networks to approximate certain classes of smooth and bounded functions without the curse of dimensionality

    In this article we study high-dimensional approximation capacities of shallow and deep artificial neural networks (ANNs) with the rectified linear unit (ReLU) activation. In particular, it is a key contribution of this work to reveal that for all $a,b\in\mathbb{R}$ with $b-a\geq 7$ we have that the functions $[a,b]^d\ni x=(x_1,\dots,x_d)\mapsto\prod_{i=1}^d x_i\in\mathbb{R}$ for $d\in\mathbb{N}$ as well as the functions $[a,b]^d\ni x=(x_1,\dots,x_d)\mapsto\sin(\prod_{i=1}^d x_i)\in\mathbb{R}$ for $d\in\mathbb{N}$ can neither be approximated without the curse of dimensionality by means of shallow ANNs nor insufficiently deep ANNs with ReLU activation but can be approximated without the curse of dimensionality by sufficiently deep ANNs with ReLU activation. We show that the product functions and the sine of the product functions are polynomially tractable approximation problems among the approximating class of deep ReLU ANNs with the number of hidden layers being allowed to grow in the dimension $d\in\mathbb{N}$. We establish the above outlined statements not only for the product functions and the sine of the product functions but also for other classes of target functions, in particular, for classes of uniformly globally bounded $C^{\infty}$-functions with compact support on any $[a,b]^d$ with $a\in\mathbb{R}$, $b\in(a,\infty)$. Roughly speaking, in this work we lay open that simple approximation problems such as approximating the sine or cosine of products cannot be solved in standard implementation frameworks by shallow or insufficiently deep ANNs with ReLU activation in polynomial time, but can be approximated by sufficiently deep ReLU ANNs with the number of parameters growing at most polynomially. Comment: 101 pages, 1 figure. arXiv admin note: substantial text overlap with arXiv:2112.1452
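
    One common way to make precise the phrase "approximated without the curse of dimensionality" used above is the parameter bound below; the constant c and the accuracy \varepsilon are notation introduced here for illustration and are not taken from the abstract.

        % there exists c > 0 such that for every dimension d and every accuracy eps in (0,1]
        % some deep ReLU ANN Phi_{d,eps} satisfies
        \#\mathrm{params}\bigl(\Phi_{d,\varepsilon}\bigr) \le c\, d^{c}\, \varepsilon^{-c}
        \qquad\text{and}\qquad
        \sup_{x \in [a,b]^d} \Bigl|\, \Phi_{d,\varepsilon}(x) - \prod_{i=1}^{d} x_i \,\Bigr| \le \varepsilon .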

    Effective Activation Functions for Homomorphic Evaluation of Deep Neural Networks

    CryptoNets and subsequent work have demonstrated the capability of homomorphic encryption (HE) in applications of private artificial intelligence (AI). While convolutional neural networks (CNNs) are primarily composed of linear functions that can be homomorphically evaluated, layers such as the activation layer are non-linear and cannot be homomorphically evaluated. One of the most commonly used alternatives is to approximate these non-linear functions with low-degree polynomials. However, it is difficult to generate efficient approximations, and dataset-specific improvements are often required. This thesis presents a systematic method to construct HE-friendly activation functions for CNNs. We first determine the key properties of a good activation function that contribute to performance by analyzing commonly used functions such as the Rectified Linear Unit (ReLU) and Sigmoid. We then analyze the inputs to the activation layer and search for an optimal range of approximation for the polynomial activation. Based on our findings, we propose a novel weighted polynomial approximation method tailored to this input distribution. Finally, we demonstrate the effectiveness and robustness of our method on three datasets: MNIST, FMNIST, and CIFAR-10.
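
    The weighted-approximation idea described above can be sketched as follows: estimate the distribution of the activation-layer inputs, restrict the fit to the range those inputs actually cover, and weight a low-degree polynomial least-squares fit by the estimated density. The Gaussian input model, the ReLU target, the percentile cut-offs, and the degree-4 polynomial are assumptions for illustration, not the thesis's measured distributions or chosen parameters.

        # Sketch of a density-weighted low-degree polynomial fit of an activation for HE.
        # The Gaussian input model, range cut-offs and degree are illustrative assumptions.
        import numpy as np

        def relu(x):
            return np.maximum(x, 0.0)

        rng = np.random.default_rng(0)
        pre_acts = rng.normal(loc=0.0, scale=2.0, size=10_000)  # stand-in for observed pre-activations

        # restrict the fit to the range the inputs actually cover
        lo, hi = np.percentile(pre_acts, [0.5, 99.5])
        xs = np.linspace(lo, hi, 400)

        # weight each sample point by the estimated input density (Gaussian fit)
        mu, sigma = pre_acts.mean(), pre_acts.std()
        weights = np.exp(-0.5 * ((xs - mu) / sigma) ** 2)

        # low-degree polynomial: only additions and multiplications, so HE-friendly
        coeffs = np.polyfit(xs, relu(xs), deg=4, w=np.sqrt(weights))

        approx = np.polyval(coeffs, xs)
        print("weighted RMSE:", np.sqrt(np.average((approx - relu(xs)) ** 2, weights=weights)))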