Exploring the landscapes of "computing": digital, neuromorphic, unconventional -- and beyond
The acceleration race of digital computing technologies seems to be steering
toward impasses -- technological, economic, and environmental -- a condition
that has spurred research efforts in alternative, "neuromorphic" (brain-like)
computing technologies. Furthermore, for decades the idea of exploiting
nonlinear physical phenomena "directly" for non-digital computing has been
explored under names like "unconventional computing", "natural computing",
"physical computing", or "in-materio computing". This has been taking place in
niches which are small compared to other sectors of computer science. In this
paper I stake out the grounds of how a general concept of "computing" can be
developed which comprises digital, neuromorphic, unconventional and possible
future "computing" paradigms. The main contribution of this paper is a
wide-scope survey of existing formal conceptualizations of "computing". The
survey inspects approaches rooted in three different kinds of background
mathematics: discrete-symbolic formalisms, probabilistic modeling, and
dynamical-systems oriented views. It turns out that different choices of
background mathematics lead to markedly different understandings of what
"computing" is. Across all of this diversity, a unifying coordinate system for
theorizing about "computing" can be distilled. Within these coordinates I
locate anchor points for a foundational formal theory of a future
computing-engineering discipline that includes, but will reach beyond, digital
and neuromorphic computing.
Comment: An extended and carefully revised version of this manuscript has now
(March 2021) been published as "Toward a generalized theory comprising
digital, neuromorphic, and unconventional computing" in the new open-access
journal Neuromorphic Computing and Engineering.
A Neural Lambda Calculus: Neurosymbolic AI meets the foundations of computing and functional programming
Over the last decades, models based on deep neural networks have become the
dominant paradigm in machine learning. Moreover, the use of artificial neural
networks in symbolic learning has recently gained relevance. To study the
capabilities of neural networks in the symbolic AI domain, researchers have
explored the ability of deep neural networks to learn mathematical
constructions, such as addition and multiplication; logical inference, such as
theorem proving; and even the execution of computer programs. The latter is
known to be too complex a task for neural networks; as a result, prior efforts
were not always successful, often requiring the introduction of biased
elements into the learning process and restricting the scope of possible
programs to be executed. In this work, we will analyze the ability of neural
networks to learn how to execute programs as a whole. To do so, we propose a
different approach. Instead of using an imperative programming language, with
complex structures, we use the Lambda Calculus ({\lambda}-Calculus), a simple,
but Turing-Complete mathematical formalism, which serves as the basis for
modern functional programming languages and is at the heart of computability
theory. We introduce an approach that integrates neural learning with the
formalization of the lambda calculus. Finally, since the execution of a
program in {\lambda}-Calculus is based on reductions, we show that it suffices
to learn how to perform these reductions in order to execute any program.
Keywords: Machine Learning, Lambda Calculus, Neurosymbolic AI, Neural Networks,
Transformer Model, Sequence-to-Sequence Models, Computational Models
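To make the reduction-based notion of execution concrete, here is a minimal normal-order beta-reduction interpreter (an illustrative sketch in plain Python, not the paper's neural model; the tuple encoding of terms is invented for illustration). Executing a program amounts to repeatedly rewriting redexes until a normal form is reached, which is exactly the operation the paper proposes to learn.

```python
# Minimal lambda-calculus interpreter: execution = repeated beta-reduction.
# Term encoding: ('var', name) | ('lam', name, body) | ('app', fn, arg)

_fresh = [0]  # counter for generating fresh variable names

def free_vars(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'lam':
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, name, repl):
    """Capture-avoiding substitution of repl for name in t."""
    if t[0] == 'var':
        return repl if t[1] == name else t
    if t[0] == 'app':
        return ('app', subst(t[1], name, repl), subst(t[2], name, repl))
    v, body = t[1], t[2]
    if v == name:                      # bound variable shadows name
        return t
    if v in free_vars(repl):           # rename bound variable to avoid capture
        _fresh[0] += 1
        nv = v + str(_fresh[0])
        body = subst(body, v, ('var', nv))
        v = nv
    return ('lam', v, subst(body, name, repl))

def reduce_step(t):
    """One normal-order (leftmost-outermost) step; None at normal form."""
    if t[0] == 'app':
        fn, arg = t[1], t[2]
        if fn[0] == 'lam':             # beta-redex: (\v. body) arg
            return subst(fn[2], fn[1], arg)
        r = reduce_step(fn)
        if r is not None:
            return ('app', r, arg)
        r = reduce_step(arg)
        if r is not None:
            return ('app', fn, r)
    elif t[0] == 'lam':
        r = reduce_step(t[2])
        if r is not None:
            return ('lam', t[1], r)
    return None

def normalize(t, limit=1000):
    for _ in range(limit):
        r = reduce_step(t)
        if r is None:
            return t
        t = r
    raise RuntimeError('no normal form within step limit')

# Church numerals: ZERO = \f.\x. x, SUCC = \n.\f.\x. f (n f x)
ZERO = ('lam', 'f', ('lam', 'x', ('var', 'x')))
SUCC = ('lam', 'n', ('lam', 'f', ('lam', 'x',
        ('app', ('var', 'f'),
         ('app', ('app', ('var', 'n'), ('var', 'f')), ('var', 'x'))))))
PLUS = ('lam', 'm', ('lam', 'n', ('lam', 'f', ('lam', 'x',
        ('app', ('app', ('var', 'm'), ('var', 'f')),
         ('app', ('app', ('var', 'n'), ('var', 'f')), ('var', 'x')))))))
TWO = normalize(('app', SUCC, ('app', SUCC, ZERO)))
FOUR = normalize(('app', ('app', PLUS, TWO), TWO))  # reduces to Church 4
```

Everything reduces through the same tiny rewriting rule, which is why, as the abstract argues, learning to perform reductions is sufficient to execute arbitrary programs.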
LAST: Scalable Lattice-Based Speech Modelling in JAX
We introduce LAST, a LAttice-based Speech Transducer library in JAX. With an
emphasis on flexibility, ease-of-use, and scalability, LAST implements
differentiable weighted finite state automaton (WFSA) algorithms needed for
training and inference that scale to a large WFSA such as a recognition lattice
over the entire utterance. Despite these WFSA algorithms being well-known in
the literature, new challenges arise from performance characteristics of modern
architectures, and from nuances in automatic differentiation. We describe a
suite of generally applicable techniques employed in LAST to address these
challenges, and demonstrate their effectiveness with benchmarks on TPUv3 and
V100 GPUs.
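The core computation such a library performs can be illustrated, independently of LAST's actual API, by the forward (log-semiring shortest-distance) algorithm over an acyclic recognition lattice. The sketch below is plain Python rather than JAX, and the function name and example lattice are invented for illustration; in LAST the analogous computation is expressed with JAX primitives so that it is differentiable and scales on TPU/GPU.

```python
import math

def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) -- the log-semiring 'plus'."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_weight(num_states, arcs, start, finals):
    """Total log-weight of all start->final paths in an acyclic WFSA.

    arcs: list of (src, dst, log_weight), listed so that every arc into a
    state appears before any arc out of it (topological order).
    """
    alpha = [-math.inf] * num_states
    alpha[start] = 0.0                       # log-semiring 'one'
    for src, dst, w in arcs:
        # accumulate path weight: alpha[dst] (+)  alpha[src] (*) w
        alpha[dst] = logsumexp([alpha[dst], alpha[src] + w])
    return logsumexp([alpha[f] for f in finals])

# Toy diamond lattice: two paths 0->1->3 and 0->2->3, each of probability 0.25
arcs = [
    (0, 1, math.log(0.5)),
    (0, 2, math.log(0.5)),
    (1, 3, math.log(0.5)),
    (2, 3, math.log(0.5)),
]
total = forward_log_weight(4, arcs, 0, [3])  # log(0.25 + 0.25) = log(0.5)
```

Because the whole computation is built from sums, products, and logsumexp, it is differentiable end to end, which is what makes lattice-level training objectives possible in a framework with automatic differentiation.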
Modeling cognition with generative neural networks: The case of orthographic processing
This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this research emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of network can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language for describing both neural networks and structured Bayesian models. Moreover, recent advances make it possible to extend basic network architectures into more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional, feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up sensory evidence. We provide empirical support justifying the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns.
In particular, we perform computational simulations of recognition of handwritten and printed characters belonging to different writing scripts, which are successively combined spatially or temporally in order to build more complex orthographic units such as those constituting English words.
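The top-down/bottom-up combination described above can be illustrated with a toy Bayesian computation (a hedged sketch, not the thesis's actual network models): word context supplies a prior over candidate letters, noisy visual input supplies a likelihood, and Bayes' rule combines the two. The letters and probability values are invented for illustration.

```python
def posterior(prior, likelihood):
    """Combine a top-down prior P(letter | context) with a bottom-up
    likelihood P(evidence | letter) via Bayes' rule; returns the
    normalized posterior P(letter | evidence, context)."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Bottom-up: an ambiguous handwritten shape equally consistent with 'o' and 'a'
likelihood = {'o': 0.5, 'a': 0.5, 'e': 0.0}
# Top-down: a word context like "c_t" favours 'a' ("cat") over 'o' ("cot")
prior = {'o': 0.2, 'a': 0.7, 'e': 0.1}

post = posterior(prior, likelihood)  # context resolves the ambiguity: 'a' wins
```

The same principle, with inference running over learned hierarchical generative models rather than hand-set tables, underlies the orthographic simulations described in the abstract.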