Empiricism without Magic: Transformational Abstraction in Deep Convolutional Neural Networks

Author: Buckner Cameron
Publication date: 01/01/2018

In artificial intelligence, recent research has demonstrated the remarkable potential of Deep Convolutional Neural Networks (DCNNs), which seem to exceed state-of-the-art performance in new domains weekly, especially on the sorts of very difficult perceptual discrimination tasks that skeptics thought would remain beyond the reach of artificial intelligence. However, it has proven difficult to explain why DCNNs perform so well. In philosophy of mind, empiricists have long suggested that complex cognition is based on information derived from sensory experience, often appealing to a faculty of abstraction. Rationalists have frequently complained, however, that empiricists never adequately explained how this faculty of abstraction actually works. In this paper, I tie these two questions together, to the mutual benefit of both disciplines. I argue that the architectural features that distinguish DCNNs from earlier neural networks allow them to implement a form of hierarchical processing that I call “transformational abstraction”. Transformational abstraction iteratively converts sensory-based representations of category exemplars into new formats that are increasingly tolerant to “nuisance variation” in input. Reflecting upon the way that DCNNs leverage a combination of linear and non-linear processing to efficiently accomplish this feat allows us to understand how the brain is capable of bi-directional travel between exemplars and abstractions, addressing longstanding problems in empiricist philosophy of mind. I end by considering the prospects for future research on DCNNs, arguing that rather than simply implementing 80s connectionism with more brute-force computation, transformational abstraction counts as a qualitatively distinct form of processing ripe with philosophical and psychological significance, because it is significantly better suited to depict the generic mechanism responsible for this important kind of psychological processing in the brain.

PhilPapers

Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models

Authors: Bengio Yoshua; Courville Aaron; Pineau Joelle; Serban Iulian V.; Sordoni Alessandro
Publication date: 05/03/2016

We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings. Comment: 8 pages with references; published in AAAI 2016 (Special Track on Cognitive Systems).

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Radical Artificial Intelligence: A Postmodern Approach

Author: Blank Doug
Publication venue: Scholarship, Research, and Creative Work at Bryn Mawr College
Publication date: 01/01/1999

Scholarship, Research, and Creative Work at Bryn Mawr College | Bryn Mawr College Research

Radical Artificial Intelligence: A Postmodern Approach

Authors: Deshpande V. S.; Fleck N. A.; Rubino V.
Publication venue: Scholarship, Research, and Creative Work at Bryn Mawr College
Publication date: 01/01/1999

The dynamic response of end-clamped monolithic beams and sandwich beams has been measured by loading the beams at mid-span using metal foam projectiles. The AISI 304 stainless-steel sandwich beams comprise two identical face sheets and either prismatic Y-frame or corrugated cores. The resistance to shock loading is quantified by the permanent transverse deflection at mid-span of the beams as a function of projectile momentum. The prismatic cores are aligned either longitudinally along the beam length or transversely. It is found that the sandwich beams with a longitudinal core orientation have a higher shock resistance than the monolithic beams of equal mass. In contrast, the performance of the sandwich beams with a transverse core orientation is very similar to that of the monolithic beams. Three-dimensional finite element (FE) simulations are in good agreement with the measured responses. The FE calculations indicate that strain concentrations in the sandwich beams occur at joints within the cores and between the core and face sheets; the level of maximum strain is similar for the Y-frame and corrugated core beams for a given value of projectile momentum. The experimental and FE results taken together reveal that Y-frame and corrugated core sandwich beams of equal mass have similar dynamic performances in terms of rear-face deflection, degree of core compression and level of strain within the beam.

Scholarship, Research, and Creative Work at Bryn Mawr College | Bryn Mawr College Research

Caltech Authors