6,227 research outputs found
The Birth of Pictoriality in Computer Media
The aim of the paper is to follow some milestones of the story of
computer media as far as the notion of pictoriality is concerned. I am
going to describe in the most general way how it happens that two quite
separate technologies as computer machine and pictorial representation
met and since then became almost inseparable
Literal Perceptual Inference
In this paper, I argue that theories of perception that appeal to Helmholtz’s idea of unconscious inference (“Helmholtzian” theories) should be taken literally, i.e. that the inferences appealed to in such theories are inferences in the full sense of the term, as employed elsewhere in philosophy and in ordinary discourse.
In the course of the argument, I consider constraints on inference based on the idea that inference is a deliberate acton, and on the idea that inferences depend on the syntactic structure of representations. I argue that inference is a personal-level but sometimes unconscious process that cannot in general be distinguished from association on the basis of the structures of the representations over which it’s defined. I also critique arguments against representationalist interpretations of Helmholtzian theories, and argue against the view that perceptual inference is encapsulated in a module
Analyzing Input and Output Representations for Speech-Driven Gesture Generation
This paper presents a novel framework for automatic speech-driven gesture
generation, applicable to human-agent interaction including both virtual agents
and robots. Specifically, we extend recent deep-learning-based, data-driven
methods for speech-driven gesture generation by incorporating representation
learning. Our model takes speech as input and produces gestures as output, in
the form of a sequence of 3D coordinates. Our approach consists of two steps.
First, we learn a lower-dimensional representation of human motion using a
denoising autoencoder neural network, consisting of a motion encoder MotionE
and a motion decoder MotionD. The learned representation preserves the most
important aspects of the human pose variation while removing less relevant
variation. Second, we train a novel encoder network SpeechE to map from speech
to a corresponding motion representation with reduced dimensionality. At test
time, the speech encoder and the motion decoder networks are combined: SpeechE
predicts motion representations based on a given speech signal and MotionD then
decodes these representations to produce motion sequences. We evaluate
different representation sizes in order to find the most effective
dimensionality for the representation. We also evaluate the effects of using
different speech features as input to the model. We find that mel-frequency
cepstral coefficients (MFCCs), alone or combined with prosodic features,
perform the best. The results of a subsequent user study confirm the benefits
of the representation learning.Comment: Accepted at IVA '19. Shorter version published at AAMAS '19. The code
is available at
https://github.com/GestureGeneration/Speech_driven_gesture_generation_with_autoencode
Crossing the symbolic threshold: a critical review of Terrence Deacon's The Symbolic Species
Terrence Deacon's views about the origin of language are based on a particular notion of a symbol. While the notion is derived from Peirce's semiotics, it diverges from that source and needs to be investigated on its own terms in order to evaluate the idea that the human species has crossed the symbolic threshold. Deacon's view is defended from the view that symbols in the animal world are widespread and from the extreme connectionist view that they are not even to be found in humans. Deacon's treatment of symbols involves a form of holism, as a symbol needs to be part of a system of symbols. He also appears to take a realist view of symbols. That combination of holism and realism makes the threshold a sharp threshold, which makes it hard to explain how the threshold was crossed. This difficulty is overcome if we take a mild realist position towards symbols, in the style of Dennett. Mild realism allows intermediate stages in the crossing but does not undermine Deacon's claim that the threshold is difficult to cross or the claim that it needs to be crossed quickly
Knowledge-rich Image Gist Understanding Beyond Literal Meaning
We investigate the problem of understanding the message (gist) conveyed by
images and their captions as found, for instance, on websites or news articles.
To this end, we propose a methodology to capture the meaning of image-caption
pairs on the basis of large amounts of machine-readable knowledge that has
previously been shown to be highly effective for text understanding. Our method
identifies the connotation of objects beyond their denotation: where most
approaches to image understanding focus on the denotation of objects, i.e.,
their literal meaning, our work addresses the identification of connotations,
i.e., iconic meanings of objects, to understand the message of images. We view
image understanding as the task of representing an image-caption pair on the
basis of a wide-coverage vocabulary of concepts such as the one provided by
Wikipedia, and cast gist detection as a concept-ranking problem with
image-caption pairs as queries. To enable a thorough investigation of the
problem of gist understanding, we produce a gold standard of over 300
image-caption pairs and over 8,000 gist annotations covering a wide variety of
topics at different levels of abstraction. We use this dataset to
experimentally benchmark the contribution of signals from heterogeneous
sources, namely image and text. The best result with a Mean Average Precision
(MAP) of 0.69 indicate that by combining both dimensions we are able to better
understand the meaning of our image-caption pairs than when using language or
vision information alone. We test the robustness of our gist detection approach
when receiving automatically generated input, i.e., using automatically
generated image tags or generated captions, and prove the feasibility of an
end-to-end automated process
Platonic model of mind as an approximation to neurodynamics
Hierarchy of approximations involved in simplification of microscopic theories, from sub-cellural to the whole brain level, is presented. A new approximation to neural dynamics is described, leading to a Platonic-like model of mind based on psychological spaces. Objects and events in these spaces correspond to quasi-stable states of brain dynamics and may be interpreted from psychological point of view. Platonic model bridges the gap between neurosciences and psychological sciences. Static and dynamic versions of this model are outlined and Feature Space Mapping, a neurofuzzy realization of the static version of Platonic model, described. Categorization experiments with human subjects are analyzed from the neurodynamical and Platonic model points of view
The audiovisual structure of onomatopoeias: An intrusion of real-world physics in lexical creation
Sound-symbolic word classes are found in different cultures and languages worldwide. These words are continuously produced to code complex information about events. Here we explore the capacity of creative language to transport complex multisensory information in a controlled experiment, where our participants improvised onomatopoeias from noisy moving objects in audio, visual and audiovisual formats. We found that consonants communicate movement types (slide, hit or ring) mainly through the manner of articulation in the vocal tract. Vowels communicate shapes in visual stimuli (spiky or rounded) and sound frequencies in auditory stimuli through the configuration of the lips and tongue. A machine learning model was trained to classify movement types and used to validate generalizations of our results across formats. We implemented the classifier with a list of cross-linguistic onomatopoeias simple actions were correctly classified, while different aspects were selected to build onomatopoeias of complex actions. These results show how the different aspects of complex sensory information are coded and how they interact in the creation of novel onomatopoeias.Fil: Taitz, Alan. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; ArgentinaFil: Assaneo, María Florencia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; ArgentinaFil: Elisei, Natalia Gabriela. Universidad de Buenos Aires. Facultad de Medicina; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; ArgentinaFil: Tripodi, Monica Noemi. Universidad de Buenos Aires; ArgentinaFil: Cohen, Laurent. Centre National de la Recherche Scientifique; Francia. Universite Pierre et Marie Curie; Francia. Institut National de la Santé et de la Recherche Médicale; FranciaFil: Sitt, Jacobo Diego. Centre National de la Recherche Scientifique; Francia. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Institut National de la Santé et de la Recherche Médicale; Francia. Universite Pierre et Marie Curie; FranciaFil: Trevisan, Marcos Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física de Buenos Aires; Argentin
- …