Probabilistic Deterministic Finite Automata and Recurrent Networks, Revisited
Reservoir computers (RCs) and recurrent neural networks (RNNs) can mimic any
finite-state automaton in theory, and some workers demonstrated that this can
hold in practice. We test the capability of generalized linear models, RCs, and
Long Short-Term Memory (LSTM) RNN architectures to predict the stochastic
processes generated by a large suite of probabilistic deterministic
finite-state automata (PDFA). PDFAs provide an excellent performance benchmark
in that they can be systematically enumerated, the randomness and correlation
structure of their generated processes are exactly known, and their optimal
memory-limited predictors are easily computed. Unsurprisingly, LSTMs outperform
RCs, which outperform generalized linear models. Surprisingly, each of these
methods can fall short of the maximal predictive accuracy by as much as 50%
after training and, when optimized, tend to fall short of the maximal
predictive accuracy by ~5%, even though previously available methods achieve
maximal predictive accuracy with orders-of-magnitude less data. Thus, despite
the representational universality of RCs and RNNs, using them can engender a
surprising predictive gap for simple stimuli. One concludes that there is an
important and underappreciated role for methods that infer "causal states" or
"predictive state representations"
Probabilistic Deterministic Finite Automata and Recurrent Networks, Revisited
Reservoir computers (RCs) and recurrent neural networks (RNNs) can mimic any finite-state automaton in theory, and some workers demonstrated that this can hold in practice. We test the capability of generalized linear models, RCs, and Long Short-Term Memory (LSTM) RNN architectures to predict the stochastic processes generated by a large suite of probabilistic deterministic finite-state automata (PDFA) in the small-data limit according to two metrics: predictive accuracy and distance to a predictive rate-distortion curve. The latter provides a sense of whether or not the RNN is a lossy predictive feature extractor in the information-theoretic sense. PDFAs provide an excellent performance benchmark in that they can be systematically enumerated, the randomness and correlation structure of their generated processes are exactly known, and their optimal memory-limited predictors are easily computed. With less data than is needed to make a good prediction, LSTMs surprisingly lose at predictive accuracy, but win at lossy predictive feature extraction. These results highlight the utility of causal states in understanding the capabilities of RNNs to predict.