QuadraNet: Improving High-Order Neural Interaction Efficiency with Hardware-Aware Quadratic Neural Networks
Recent progress in computer vision-oriented neural network designs is mostly
driven by capturing high-order neural interactions among inputs and features.
A variety of approaches have emerged to accomplish this, such as
Transformers and their variants. However, these interactions generate a large
amount of intermediate state and/or strong data dependency, leading to
considerable memory consumption and computing cost, and therefore compromising
the overall runtime performance. To address this challenge, we rethink the
high-order interactive neural network design with a quadratic computing
approach. Specifically, we propose QuadraNet -- a comprehensive model design
methodology from neuron reconstruction to structural block and eventually to
the overall neural network implementation. Leveraging quadratic neurons'
intrinsic high-order advantages and dedicated computation optimization schemes,
QuadraNet could effectively achieve optimal cognition and computation
performance. Incorporating state-of-the-art hardware-aware neural architecture
search and system integration techniques, QuadraNet could also be well
generalized in different hardware constraint settings and deployment scenarios.
Experiments show that QuadraNet achieves up to 1.5× throughput, 30%
less memory footprint, and similar cognition performance, compared with the
state-of-the-art high-order approaches. Comment: ASP-DAC 2024 Best Paper Nomination
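The abstract does not define the quadratic neuron it builds on. As a rough illustration only, one form often seen in the quadratic-network literature multiplies two linear projections of the input and adds a linear term, y = (w_a·x)(w_b·x) + w_c·x + b. The function and weight names below are hypothetical, and the sketch is not claimed to match QuadraNet's exact formulation.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def quadratic_neuron(x, w_a, w_b, w_c, bias=0.0):
    """Assumed quadratic neuron: product of two linear projections plus a linear term.

    The product dot(w_a, x) * dot(w_b, x) contains second-order terms x_i * x_j,
    which is how a single neuron captures pairwise input interactions.
    """
    return dot(w_a, x) * dot(w_b, x) + dot(w_c, x) + bias


x = [1.0, 2.0]
y = quadratic_neuron(x, w_a=[1.0, 0.0], w_b=[0.0, 1.0], w_c=[1.0, 1.0], bias=0.5)
print(y)  # (1.0 * 2.0) + (1.0 + 2.0) + 0.5 = 5.5
```

With the linear term and bias set to zero, doubling the input quadruples the output, which is the second-order behaviour a purely linear neuron cannot produce.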
Design of an adaptive neural predictive nonlinear controller for nonholonomic mobile robot system based on posture identifier in the presence of disturbance
This paper proposes an adaptive neural predictive nonlinear controller to guide a nonholonomic wheeled mobile robot while tracking trajectories with continuous and non-continuous gradients. The structure of the controller consists of two models that describe the kinematics and dynamics of the mobile robot system and a feedforward neural controller. The models are a modified Elman neural network and a feedforward multi-layer perceptron, respectively. The modified Elman neural network model is trained in off-line and on-line stages to guarantee that the outputs of the model accurately represent the actual outputs of the mobile robot system. The trained neural model acts as the position and orientation identifier. The feedforward neural controller is trained off-line, and its weights are adapted on-line to find the reference torques, which control the steady-state outputs of the mobile robot system. The feedback neural controller is based on the posture neural identifier and a quadratic performance index optimization algorithm to find the optimal torque action in the transient state for N-step-ahead prediction. The general backpropagation algorithm is used to train the feedforward neural controller and the posture neural identifier. Simulation results show the effectiveness of the proposed adaptive neural predictive control algorithm; this is demonstrated by the minimised tracking error and the smoothness of the torque control signal obtained with bounded external disturbances.
Stochastic Training of Neural Networks via Successive Convex Approximations
This paper proposes a new family of algorithms for training neural networks
(NNs). These are based on recent developments in the field of non-convex
optimization, going under the general name of successive convex approximation
(SCA) techniques. The basic idea is to iteratively replace the original
(non-convex, high-dimensional) learning problem with a sequence of (strongly
convex) approximations, which are both accurate and simple to optimize.
Unlike similar ideas (e.g., quasi-Newton algorithms), the
approximations can be constructed using only first-order information of the
neural network function, in a stochastic fashion, while exploiting the overall
structure of the learning problem for a faster convergence. We discuss several
use cases, based on different choices for the loss function (e.g., squared loss
and cross-entropy loss), and for the regularization of the NN's weights. We
experiment on several medium-sized benchmark problems, and on a large-scale
dataset involving simulated physical data. The results show how the algorithm
outperforms state-of-the-art techniques, providing faster convergence to a
better minimum. Additionally, we show how the algorithm can be easily
parallelized over multiple computational units without hindering its
performance. In particular, each computational unit can optimize a tailored
surrogate function defined on a randomly assigned subset of the input
variables, whose dimension can be selected depending entirely on the available
computational power. Comment: Preprint submitted to IEEE Transactions on Neural Networks and
Learning Systems
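The successive convex approximation idea described in this abstract can be sketched on a toy problem. The example below is a deterministic, one-dimensional illustration under stated assumptions, not the paper's stochastic algorithm: the objective `f`, the proximal weight `tau`, and the step size `gamma` are all hypothetical choices. At each iterate the non-convex objective is replaced by the simplest strongly convex surrogate, its first-order linearization plus a quadratic proximal term, which has a closed-form minimizer.

```python
def f(w):
    """Toy non-convex objective (an assumption for illustration)."""
    return w**4 - 3.0 * w**2 + w


def grad_f(w):
    """First derivative of f; the only information the surrogate uses."""
    return 4.0 * w**3 - 6.0 * w + 1.0


def sca_minimize(w0, tau=50.0, gamma=0.5, iterations=500):
    """Minimal SCA loop with a linearization-plus-proximal surrogate.

    Surrogate at w_k:  f(w_k) + grad_f(w_k) * (v - w_k) + (tau / 2) * (v - w_k)**2
    It is strongly convex in v, and its minimizer is w_k - grad_f(w_k) / tau.
    """
    w = w0
    for _ in range(iterations):
        w_hat = w - grad_f(w) / tau   # exact minimizer of the convex surrogate
        w = w + gamma * (w_hat - w)   # move part of the way toward that minimizer
    return w


w_start = 2.0
w_final = sca_minimize(w_start)
print(f(w_start), f(w_final))
```

With this particular surrogate the update reduces to gradient descent with step `gamma / tau`; richer surrogates that keep convex parts of the loss or regularizer intact are what distinguish SCA in general.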
Surrogate modeling approximation using a mixture of experts based on EM joint estimation
An automatic method to combine several local surrogate models is presented. This method is intended to build accurate and smooth approximations of discontinuous functions that are to be used in structural optimization problems. It strongly relies on the Expectation-Maximization (EM) algorithm for Gaussian mixture models (GMM). For the purpose of regression, the inputs are clustered together with their output values by means of parameter estimation of the joint distribution. A local expert is then built (linear, quadratic, artificial neural network, moving least squares) on each cluster. Lastly, the local experts are combined using the Gaussian mixture model parameters found by the EM algorithm to obtain a global model. This method is tested over both mathematical test cases and an engineering optimization problem from aeronautics and is found to improve the accuracy of the approximation.
Optimal Observer Design Using Reinforcement Learning and Quadratic Neural Networks
This paper introduces an innovative approach based on policy iteration (PI),
a reinforcement learning (RL) algorithm, to obtain an optimal observer with a
quadratic cost function. This observer is designed for systems with a given
linearized model and a stabilizing Luenberger observer gain. We utilize
two-layer quadratic neural networks (QNN) for policy evaluation and derive a
linear correction term using the input and output data. This correction term
effectively rectifies inaccuracies introduced by the linearized model employed
within the observer design. A unique feature of the proposed methodology is
that the QNN is trained through convex optimization. The main advantage is that
the QNN's input-output mapping has an analytical expression as a quadratic
form, which can then be used to obtain a linear correction term policy. This is
in stark contrast to the available techniques in the literature that must train
a second neural network to obtain policy improvement. It is proven that the
obtained linear correction term is optimal for linear systems, as both the
value function and the QNN's input-output mapping are quadratic. The proposed
method is applied to a simple pendulum, demonstrating an enhanced correction
term policy compared to relying solely on the linearized model. This shows its
promise for addressing nonlinear systems.
Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces
In practical Bayesian optimization, we must often search over structures with
differing numbers of parameters. For instance, we may wish to search over
neural network architectures with an unknown number of layers. To relate
performance data gathered for different architectures, we define a new kernel
for conditional parameter spaces that explicitly includes information about
which parameters are relevant in a given structure. We show that this kernel
improves model quality and Bayesian optimization results over several simpler
baseline kernels. Comment: 6 pages, 3 figures. Appeared in the NIPS 2013 workshop on Bayesian
optimization
Reactive Planar Manipulation with Convex Hybrid MPC
This paper presents a reactive controller for planar manipulation tasks that
leverages machine learning to achieve real-time performance. The approach is
based on a Model Predictive Control (MPC) formulation, where the goal is to
find an optimal sequence of robot motions to achieve a desired object motion.
Due to the multiple contact modes associated with frictional interactions, the
resulting optimization program suffers from combinatorial complexity when
tasked with determining the optimal sequence of modes.
To overcome this difficulty, we formulate the search for the optimal mode
sequences offline, separately from the search for optimal control inputs
online. Using tools from machine learning, this leads to a convex hybrid MPC
program that can be solved in real-time. We validate our algorithm on a planar
manipulation experimental setup where results show that the convex hybrid MPC
formulation with learned modes achieves good closed-loop performance on a
trajectory tracking problem.