Number Sequence Prediction Problems for Evaluating Computational Powers of Neural Networks
Inspired by number series tests to measure human intelligence, we suggest
number sequence prediction tasks to assess neural network models' computational
powers for solving algorithmic problems. We define the complexity and
difficulty of a number sequence prediction task with the structure of the
smallest automaton that can generate the sequence. We suggest two types of
number sequence prediction problems: the number-level and the digit-level
problems. The number-level problems format sequences as 2-dimensional grids of
digits, and the digit-level problems provide a single digit input per time
step. The complexity of a number-level sequence prediction can be defined with
the depth of an equivalent combinatorial logic, and the complexity of a
digit-level sequence prediction can be defined with an equivalent state
automaton for the generation rule. Experiments with number-level sequences
suggest that CNN models are capable of learning the compound operations of
sequence generation rules, but the depths of the compound operations are
limited. For the digit-level problems, simple GRU and LSTM models can solve
some problems with the complexity of finite state automata. Memory augmented
models such as Stack-RNN, Attention, and Neural Turing Machines can solve the
reverse-order task, which has the complexity of a simple pushdown automaton.
However, none of the above can solve general Fibonacci, arithmetic, or geometric
sequence generation problems, which represent the complexity of queue automata or
Turing machines. The results show that our number sequence prediction problems
effectively evaluate machine learning models' computational capabilities.

Comment: Accepted to 2019 AAAI Conference on Artificial Intelligence
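To make the task formats concrete, here is an illustrative sketch (not taken from the paper) of a digit-level task: a general Fibonacci sequence is flattened into a one-digit-per-step token stream, and the model must predict the next token at every step. The separator token `#` and the helper names are assumptions for illustration.

```python
# Illustrative digit-level number sequence prediction task (names are assumed).

def fibonacci(n_terms, a=0, b=1):
    """First n_terms of a general Fibonacci sequence starting from a, b."""
    terms = []
    for _ in range(n_terms):
        terms.append(a)
        a, b = b, a + b
    return terms

def digit_level_stream(terms, sep="#"):
    """Flatten numbers into a single digit-per-time-step token stream."""
    tokens = []
    for t in terms:
        tokens.extend(str(t))
        tokens.append(sep)
    return tokens

def make_example(tokens):
    """(input, target) pairs: predict the next token at every step."""
    return list(zip(tokens[:-1], tokens[1:]))

stream = digit_level_stream(fibonacci(6))  # 0, 1, 1, 2, 3, 5
print(stream)  # ['0', '#', '1', '#', '1', '#', '2', '#', '3', '#', '5', '#']
```

Solving this digit-level task requires the model to carry two previous numbers and perform digit-wise addition with carries, which is what gives it the complexity of a queue automaton rather than a finite state machine.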
Neural Attention Memory
We propose a novel perspective of the attention mechanism by reinventing it
as a memory architecture for neural networks, namely Neural Attention Memory
(NAM). NAM is a memory structure that is both readable and writable via
differentiable linear algebra operations. We explore three use cases of NAM:
memory-augmented neural network (MANN), few-shot learning, and efficient
long-range attention. First, we design two NAM-based MANNs, Long Short-term
Attention Memory (LSAM) and the NAM Turing Machine (NAM-TM), which show better computational
powers in algorithmic zero-shot generalization tasks compared to other
baselines such as differentiable neural computer (DNC). Next, we apply NAM to
the N-way K-shot learning task and show that it is more effective at reducing
false positives compared to the baseline cosine classifier. Finally, we
implement an efficient Transformer with NAM and evaluate it with long-range
arena tasks to show that NAM can be an efficient and effective alternative for
scaled dot-product attention.

Comment: Preprint. Under review
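The abstract describes NAM as a memory that is read and written through differentiable linear algebra. A minimal sketch of that idea, assuming an outer-product key-value form (the paper's exact update rule may differ), looks like this:

```python
import numpy as np

# Sketch of a differentiable key-value memory in the spirit of NAM (assumed
# form: outer-product writes, matrix-vector reads; illustration only).

d_key, d_val = 4, 3
M = np.zeros((d_val, d_key))  # memory matrix

def write(M, key, value):
    """Add an outer-product association value @ key^T to the memory."""
    return M + np.outer(value, key)

def read(M, key):
    """Retrieve the value associated with a key via a matrix-vector product."""
    return M @ key

k1 = np.array([1.0, 0.0, 0.0, 0.0])  # orthonormal keys give exact recall
v1 = np.array([2.0, -1.0, 0.5])
M = write(M, k1, v1)
print(read(M, k1))  # recovers v1
```

Because both operations are linear, gradients flow through reads and writes, which is what lets such a memory be trained end-to-end inside a MANN or an attention layer.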
Defensive ML: Defending Architectural Side-channels with Adversarial Obfuscation
Side-channel attacks that use machine learning (ML) for signal analysis have
become prominent threats to computer security, as ML models easily find
patterns in signals. To address this problem, this paper explores using
Adversarial Machine Learning (AML) methods as a defense at the computer
architecture layer to obfuscate side channels. We call this approach Defensive
ML, and we call the generator that obfuscates the signals the defender. Defensive ML is a
workflow to design, implement, train, and deploy defenders for different
environments. First, we design a defender architecture given the physical
characteristics and hardware constraints of the side channel. Next, we use our
DefenderGAN structure to train the defender. Finally, we apply defensive ML to
thwart two side-channel attacks: one based on memory contention and the other
on application power. The former uses a hardware defender with ns-level
response time that attains a high level of security with half the performance
impact of a traditional scheme; the latter uses a software defender with
ms-level response time that provides better security than a traditional scheme
with only 70% of its power overhead.

Comment: Preprint. Under review
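A toy illustration of the defender idea (this is not the paper's DefenderGAN, just a stand-in that injects random noise): a secret-dependent pattern in a side-channel trace is easy for a correlation-based attacker to detect until the defender masks it.

```python
import numpy as np

# Toy obfuscation example: a "defender" adds noise to a side-channel trace so
# a correlation attacker can no longer recover the secret-dependent pattern.
# The random-noise defender is a simplification; the paper trains a generator.

rng = np.random.default_rng(0)
secret_bit = 1
leak = np.where(np.arange(100) % 2 == secret_bit, 1.0, 0.0)  # secret-dependent signal
trace = leak + 0.1 * rng.standard_normal(100)                # measured trace

def attacker_score(trace, hyp_bit):
    """Correlation between the trace and a hypothesized leakage pattern."""
    hyp = np.where(np.arange(len(trace)) % 2 == hyp_bit, 1.0, 0.0)
    return abs(np.corrcoef(trace, hyp)[0, 1])

def defender(trace, rng):
    """Mask the trace with high-amplitude noise (stand-in for a trained generator)."""
    return trace + 3.0 * rng.standard_normal(len(trace))

print(attacker_score(trace, secret_bit))                 # high: the secret leaks
print(attacker_score(defender(trace, rng), secret_bit))  # much lower after obfuscation
```

In the paper's setting the defender must also respect hardware constraints (ns-level or ms-level response time) and minimize performance and power overhead, which is why the generator is designed and trained per environment rather than emitting arbitrary noise.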
Neural Sequence-to-grid Module for Learning Symbolic Rules
Logical reasoning tasks over symbols, such as learning arithmetic operations
and computer program evaluations, have become challenges to deep learning. In
particular, even state-of-the-art neural networks fail to achieve
\textit{out-of-distribution} (OOD) generalization of symbolic reasoning tasks,
whereas humans can easily extend learned symbolic rules. To resolve this
difficulty, we propose a neural sequence-to-grid (seq2grid) module, an input
preprocessor that automatically segments and aligns an input sequence into a
grid. As our module outputs a grid via a novel differentiable mapping, any
neural network structure taking a grid input, such as ResNet or TextCNN, can be
jointly trained with our module in an end-to-end fashion. Extensive experiments
show that neural networks having our module as an input preprocessor achieve
OOD generalization on various arithmetic and algorithmic problems including
number sequence prediction problems, algebraic word problems, and computer
program evaluation problems while other state-of-the-art sequence transduction
models cannot. Moreover, we verify that our module enhances TextCNN to solve
the bAbI QA tasks without external memory.

Comment: 9 pages, 9 figures, AAAI 202
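A hard, non-differentiable analogue of the sequence-to-grid idea can make the input transformation concrete (the actual module learns a soft, differentiable alignment; the separator, pad symbol, and width below are illustrative choices):

```python
# Non-differentiable analogue of seq2grid (illustration only): segment a flat
# token sequence at separators and align the segments as right-justified rows,
# as one would line up digits for column arithmetic.

def seq_to_grid(tokens, sep="+", pad="0", width=4):
    rows, cur = [], []
    for tok in tokens:
        if tok == sep:
            rows.append(cur)
            cur = []
        else:
            cur.append(tok)
    rows.append(cur)
    # Right-align each segment and pad on the left
    return [[pad] * (width - len(r)) + r for r in rows]

grid = seq_to_grid(list("123+45"))
print(grid)  # [['0', '1', '2', '3'], ['0', '0', '4', '5']]
```

Once the input is a grid, a convolutional network such as TextCNN or ResNet can exploit the spatial alignment (e.g., digits of the same place value sharing a column), which is the property the differentiable module learns to produce end-to-end.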