A Machine Learning-based Framework for Optimizing the Operation of Future Networks
5G and beyond networks are not only sophisticated and difficult to manage, but must also satisfy a wide range of stringent performance requirements and adapt quickly to changes in traffic and network state. Advances in machine learning and parallel computing underpin new powerful tools that have the potential to tackle these complex challenges. In this article, we develop a general machine learning-based framework that leverages artificial intelligence to forecast future traffic demands and characterize traffic features. This makes it possible to exploit such traffic insights to improve the performance of critical network control mechanisms, such as load balancing, routing, and scheduling. In contrast to prior works that design problem-specific machine learning algorithms, our generic approach can be applied to different network functions, allowing reuse of existing control mechanisms with minimal modifications. We explain how our framework can orchestrate ML to improve two different network mechanisms. Further, we undertake validation by implementing one of these, mobile backhaul routing, using data collected by a major European operator, and demonstrate a 3× reduction of packet delay compared to traditional approaches. This work is partially supported by the Madrid Regional Government through the TAPIR-CM program (S2018/TCS-4496) and the Juan de la Cierva grant (FJCI-2017-32309). Paul Patras acknowledges the support received from the Cisco University Research Program Fund (2019-197006).
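A minimal sketch of the forecast-then-control pattern the abstract describes (all names and the naive forecaster are illustrative stand-ins, not the paper's framework): a traffic forecaster feeds predicted per-link demand into an existing routing mechanism, which steers flows away from links expected to be congested.

```python
# Illustrative sketch: predicted per-link load drives an existing routing
# decision. The forecaster and network below are hypothetical stand-ins.
def forecast_demand(history):
    """Naive stand-in forecaster: exponential moving average of past demand."""
    alpha, est = 0.5, history[0]
    for x in history[1:]:
        est = alpha * x + (1 - alpha) * est
    return est

def choose_route(routes, link_capacity, link_history):
    """Pick the route whose most-loaded link has the lowest predicted load."""
    predicted = {l: forecast_demand(h) / link_capacity[l]
                 for l, h in link_history.items()}
    return min(routes, key=lambda r: max(predicted[l] for l in r))

capacity = {"a": 10.0, "b": 10.0, "c": 10.0}
history = {"a": [9.0, 9.5, 9.8], "b": [2.0, 2.5, 3.0], "c": [1.0, 1.5, 1.0]}
route = choose_route([("a",), ("b", "c")], capacity, history)
# the route with the lower predicted bottleneck load is chosen
```

Because the forecaster is decoupled from the routing rule, the same predictions could feed load balancing or scheduling with minimal changes, which is the reuse property the abstract emphasizes.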
Sparse MoEs meet Efficient Ensembles
Machine learning models based on the aggregated outputs of submodels, either
at the activation or prediction levels, often exhibit strong performance
compared to individual models. We study the interplay of two popular classes of
such models: ensembles of neural networks and sparse mixture of experts (sparse
MoEs). First, we show that the two approaches have complementary features whose
combination is beneficial. This includes a comprehensive evaluation of sparse
MoEs on uncertainty-related benchmarks. Then, we present Efficient Ensemble of
Experts (E³), a scalable and simple ensemble of sparse MoEs that takes the
best of both classes of models, while using up to 45% fewer FLOPs than a deep
ensemble. Extensive experiments demonstrate the accuracy, log-likelihood,
few-shot learning, robustness, and uncertainty improvements of E³ over
several challenging vision Transformer-based baselines. E³ not only
preserves its efficiency while scaling to models with up to 2.7B parameters,
but also provides better predictive performance and uncertainty estimates for
larger models. Comment: 59 pages, 26 figures, 36 tables. Accepted at TMLR.
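The two ingredients the abstract combines can be sketched in a few lines, assuming a toy top-k-gated MoE layer and prediction-level ensembling (all class and variable names here are illustrative, not the paper's implementation):

```python
# Hypothetical sketch: a tiny top-k sparse MoE "model" plus an ensemble that
# averages member probabilities, mirroring the combination studied above.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class SparseMoE:
    """Route each input to its top-k experts by gate score; others stay idle."""
    def __init__(self, d_in, d_out, n_experts=4, k=2):
        self.k = k
        self.gate = rng.normal(size=(d_in, n_experts))            # router
        self.experts = rng.normal(size=(n_experts, d_in, d_out))  # expert weights

    def __call__(self, x):
        scores = softmax(x @ self.gate)                # (batch, n_experts)
        out = np.zeros((x.shape[0], self.experts.shape[2]))
        for i, row in enumerate(scores):
            topk = np.argsort(row)[-self.k:]           # only k experts compute
            w = row[topk] / row[topk].sum()            # renormalised gate weights
            for j, g in zip(topk, w):
                out[i] += g * (x[i] @ self.experts[j])
        return out

def ensemble_predict(models, x):
    """Prediction-level ensembling: average class probabilities over members."""
    return np.mean([softmax(m(x)) for m in models], axis=0)

x = rng.normal(size=(5, 8))
members = [SparseMoE(8, 3) for _ in range(3)]
probs = ensemble_predict(members, x)   # (5, 3) averaged class probabilities
```

The sparsity is what keeps the cost down: each member activates only k of its experts per input, so ensembling sparse members is cheaper than ensembling dense models of the same capacity.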
EdgeServe: An Execution Layer for Decentralized Prediction
The relevant features for a machine learning task may be aggregated from data
sources collected on different nodes in a network. This problem, which we call
decentralized prediction, creates a number of interesting systems challenges in
managing data routing, placing computation, and time-synchronization. This
paper presents EdgeServe, a machine learning system that can serve
decentralized predictions. EdgeServe relies on a low-latency message broker to
route data through a network to nodes that can serve predictions. EdgeServe
relies on a series of novel optimizations that can tradeoff computation,
communication, and accuracy. We evaluate EdgeServe on three decentralized
prediction tasks: (1) multi-camera object tracking, (2) network intrusion
detection, and (3) human activity recognition. Comment: 13 pages, 8 figures.
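The broker-based routing described above can be illustrated with a toy in-process message broker (this is a generic publish/subscribe sketch, not EdgeServe's actual API): feature messages from several nodes are routed to a serving node, which predicts once all features for a key have arrived.

```python
# Toy decentralized-prediction sketch: a broker routes per-source feature
# messages; the serving node aggregates them by key before predicting.
from collections import defaultdict

class Broker:
    """Minimal in-process pub/sub broker (stand-in for a low-latency broker)."""
    def __init__(self):
        self.subs = defaultdict(list)
    def subscribe(self, topic, fn):
        self.subs[topic].append(fn)
    def publish(self, topic, msg):
        for fn in self.subs[topic]:
            fn(msg)

class ServingNode:
    """Buffers partial features per key; predicts when all sources report."""
    def __init__(self, broker, sources, model):
        self.buf = defaultdict(dict)
        self.sources = set(sources)
        self.model = model
        self.results = {}
        broker.subscribe("features", self.on_message)
    def on_message(self, msg):
        key, src, value = msg["key"], msg["source"], msg["value"]
        self.buf[key][src] = value
        if set(self.buf[key]) == self.sources:   # all features aggregated
            feats = [self.buf[key][s] for s in sorted(self.sources)]
            self.results[key] = self.model(feats)

broker = Broker()
node = ServingNode(broker, ["camera", "network"], model=sum)
broker.publish("features", {"key": 1, "source": "camera", "value": 2.0})
broker.publish("features", {"key": 1, "source": "network", "value": 3.0})
# node.results[1] == 5.0
```

The computation/communication/accuracy trade-off mentioned in the abstract would enter here in choices such as predicting from stale or partial feature sets instead of waiting for every source.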
Zero Time Waste: Recycling Predictions in Early Exit Neural Networks
The problem of reducing processing time of large deep learning models is a
fundamental challenge in many real-world applications. Early exit methods
strive towards this goal by attaching additional Internal Classifiers (ICs) to
intermediate layers of a neural network. ICs can quickly return predictions for
easy examples and, as a result, reduce the average inference time of the whole
model. However, if a particular IC does not decide to return an answer early,
its predictions are discarded, with its computations effectively being wasted.
To solve this issue, we introduce Zero Time Waste (ZTW), a novel approach in
which each IC reuses predictions returned by its predecessors by (1) adding
direct connections between ICs and (2) combining previous outputs in an
ensemble-like manner. We conduct extensive experiments across various datasets
and architectures to demonstrate that ZTW achieves a significantly better
accuracy vs. inference time trade-off than other recently proposed early exit
methods. Comment: Accepted at NeurIPS 2021.
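The recycling idea can be sketched as follows (names and the combination rule are illustrative, not the authors' code): each internal classifier combines its own logits with those of earlier ICs in an ensemble-like running average, and the model exits early once the combined prediction is confident enough.

```python
# Minimal sketch of ZTW-style inference: predecessor IC outputs are reused
# rather than discarded, via a running average of logits.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ztw_style_inference(ic_logits, threshold=0.9):
    """ic_logits: list of per-IC logit vectors, ordered by network depth."""
    combined = np.zeros_like(ic_logits[0])
    for depth, logits in enumerate(ic_logits, start=1):
        combined += logits                   # recycle earlier predictions
        probs = softmax(combined / depth)    # ensemble-like average
        if probs.max() >= threshold:         # confident enough: exit early
            return probs, depth
    return probs, depth                      # fell through to the last IC

# Toy logits for one example at three successive ICs.
logits = [np.array([2.0, 0.1, 0.0]),
          np.array([3.0, 0.2, 0.1]),
          np.array([3.5, 0.3, 0.1])]
probs, exit_depth = ztw_style_inference(logits, threshold=0.8)
```

The point of the running combination is that an IC that is not confident enough to exit still contributes evidence, so its computation is not wasted, in contrast to plain early-exit schemes.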
Optimal control of Beneš optical networks assisted by machine learning
Ihtesham Khan(a), Lorenzo Tunesi(a), Muhammad Umar Masood(a), Enrico Ghillino(b),
Paolo Bardella(a), Andrea Carena(a), and Vittorio Curri(a)
(a) Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, Italy
(b) Synopsys Inc., Executive Blvd 101, Ossining, New York, USA
ABSTRACT
Beneš networks represent an excellent solution for the routing of optical telecom signals in integrated, fully
reconfigurable networks because of their limited number of elementary 2×2 crossbar switches and their
non-blocking properties. Various solutions have been proposed to determine a proper Control State (CS) providing
the required permutation of the input channels; since for a particular permutation, the choice is not unique, the
number of cross-points has often been used to estimate the cost of the routing operation. This work presents an
advanced version of this approach: we deterministically estimate all (or a reasonably large number of) the CSs
corresponding to the permutation requested by the user. After this, the retrieved CSs are exploited by a data-
driven framework to predict the Optical Signal to Noise Ratio (OSNR) penalty for each CS at each output port,
finally selecting the CS providing minimum OSNR penalty. Moreover, three different data-driven techniques are
proposed, and their prediction performance is analyzed and compared.
The proposed approach is demonstrated using an 8×8 Beneš architecture with 20 ring resonator-based crossbar
switches. The dataset of 1000 OSNR realizations is generated synthetically for random combinations of the
CSs using the Synopsys® OptSim™ simulator. The computational cost of the proposed scheme enables its real-time
operation in the field.
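The selection step described above can be sketched in a few lines (the predictor and control states below are stand-ins, not the paper's data-driven models): given the candidate control states retrieved for a requested permutation, predict a per-port OSNR penalty for each and keep the state whose worst-case port penalty is smallest.

```python
# Hedged sketch of CS selection by predicted OSNR penalty.
def select_control_state(candidate_states, predict_penalties):
    """predict_penalties(cs) -> list of per-output-port OSNR penalties (dB)."""
    best_cs, best_cost = None, float("inf")
    for cs in candidate_states:
        cost = max(predict_penalties(cs))  # worst port dominates the routing cost
        if cost < best_cost:
            best_cs, best_cost = cs, cost
    return best_cs, best_cost

# Toy stand-in predictor: penalty grows with the number of 'cross' settings,
# echoing the classical cross-point-count cost heuristic mentioned above.
def toy_predictor(cs):
    crosses = sum(cs)
    return [0.1 * crosses + 0.05 * p for p in range(8)]  # 8 output ports

states = [(0, 1, 0, 0, 1), (1, 1, 1, 0, 1), (0, 0, 1, 0, 0)]
cs, penalty = select_control_state(states, toy_predictor)
```

In the paper's pipeline the `predict_penalties` stand-in would be one of the three trained data-driven models, evaluated on each deterministically enumerated CS.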
TreeCaps: Tree-Structured Capsule Networks for program source code processing
National Research Foundation (NRF) Singapore under its AI Singapore Programme.
Automatic Induction of Neural Network Decision Tree Algorithms
This work presents an approach to automatically induce non-greedy
decision trees constructed from neural network architectures. This construction
can be used to transfer weights when growing or pruning a decision tree,
allowing non-greedy decision tree algorithms to automatically learn and adapt
to the ideal architecture. In this work, we examine the underpinning ideas
within ensemble modelling and Bayesian model averaging which allow our neural
network to asymptotically approach the ideal architecture through weight
transfer. Experimental results demonstrate that this approach improves models
over a fixed set of hyperparameters for decision tree models and decision forest
models. Comment: This is a pre-print of a contribution "Chapman Siu, Automatic
Induction of Neural Network Decision Tree Algorithms." To appear in Computing
Conference 2019 Proceedings. Advances in Intelligent Systems and Computing.
Implementation:
https://github.com/chappers/automatic-induction-neural-decision-tre