Action recognition with unsynchronised multi-sensory data
Action recognition is a multi-faceted problem whose design requires addressing three principal challenges: synchronisation, segmentation and uncertainty. Each of these has specific implications for classification performance, and each admits solutions that mitigate those implications. We subsequently use observations made during the training of an action recognition system to generalise to the challenges encountered in classifying any time-dependent signal.
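The abstract does not include code; as a hedged sketch of the synchronisation challenge it names, the following example resamples two unsynchronised sensor streams onto a shared clock before classification. The stream names, sampling rates and the interpolation choice are illustrative assumptions, not the paper's method.

```python
# Illustrative sketch (not the authors' code): aligning two unsynchronised
# sensor streams onto a common timeline before feeding a classifier.
import numpy as np

def align_streams(t_a, x_a, t_b, x_b, rate_hz=50.0):
    """Resample two 1-D sensor streams onto a common uniform clock."""
    t0 = max(t_a[0], t_b[0])            # start of the overlapping window
    t1 = min(t_a[-1], t_b[-1])          # end of the overlapping window
    t_common = np.arange(t0, t1, 1.0 / rate_hz)
    # Linear interpolation; a real system may first need to estimate an
    # unknown clock offset between the streams (e.g. by cross-correlation).
    return t_common, np.interp(t_common, t_a, x_a), np.interp(t_common, t_b, x_b)

# Hypothetical streams: video-derived motion energy at ~30 Hz and an
# accelerometer at ~100 Hz, whose clocks do not start together.
t_video = np.linspace(0.2, 10.0, 295)
t_accel = np.linspace(0.0, 10.3, 1030)
t, video, accel = align_streams(t_video, np.sin(t_video), t_accel, np.cos(t_accel))
```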
Integrating mid-air haptics into movie experiences
"Seeing is believing, but feeling is the truth." This idiom from the seventeenth-century English clergyman Thomas Fuller gains new momentum in light of the increasing proliferation of haptic technologies that allow people to have various kinds of 'touch' and 'touchless' interactions. Here, we report on the process of creating and integrating touchless feedback (i.e. mid-air haptic stimuli) into short movie experiences (i.e. a one-minute movie format). Based on a systematic evaluation of users' experiences of those haptically enhanced movies, we show evidence for a positive effect of haptic feedback during the first viewing experience, and also for a repeated viewing after two weeks. This opens up a promising design space for content creators and researchers interested in the sensory augmentation of audiovisual content. We discuss our findings and the use of mid-air haptics technologies with respect to their effects on users' emotions, changes in the viewing experience over time, and the effects of synchronisation.
Memory and mental time travel in humans and social robots
From neuroscience, brain imaging and the psychology of memory, we are beginning to assemble an integrated theory of the brain subsystems and pathways that allow the compression, storage and reconstruction of memories for past events, and their use in contextualizing the present and reasoning about the future: mental time travel (MTT). Using computational models, embedded in humanoid robots, we are seeking to test the sufficiency of this theoretical account and to evaluate the usefulness of brain-inspired memory systems for social robots. In this contribution, we describe the use of machine learning techniques (Gaussian process latent variable models) to build a multimodal memory system for the iCub humanoid robot and summarize the results of deploying this system for human-robot interaction. We also outline the further steps required to create a more complete robotic implementation of human-like autobiographical memory and MTT. We propose that generative memory models, such as those that form the core of our robot memory system, can provide a solution to the symbol grounding problem in embodied artificial intelligence. This article is part of the theme issue 'From social brains to social robots: applying neurocognitive insights to human-robot interaction'.
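The abstract identifies Gaussian process latent variable models (GPLVMs) as the core of the memory system. As a hedged sketch of how a GPLVM can act as a generative multimodal memory, the following example uses the GPy library; the data shapes, feature choices and variable names are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): a GPLVM as a generative
# memory, compressing joint multimodal "experiences" into a low-dimensional
# latent space from which both modalities can be reconstructed.
import numpy as np
import GPy

rng = np.random.default_rng(0)
visual = rng.standard_normal((200, 64))   # hypothetical visual features
audio = rng.standard_normal((200, 16))    # hypothetical audio features
experiences = np.hstack([visual, audio])  # one row per remembered event

# Compress the joint observations into a 5-D latent "memory" space.
model = GPy.models.GPLVM(experiences, input_dim=5)
model.optimize(messages=False, max_iters=200)

# Recall: map a stored latent coordinate back through the generative model
# to reconstruct the full multimodal experience (mean prediction).
latent = np.asarray(model.X)[:1]
recalled, _ = model.predict(latent)
recalled_visual, recalled_audio = recalled[:, :64], recalled[:, 64:]
```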
Neuromorphic engineering needs closed-loop benchmarks
Neuromorphic engineering aims to build (autonomous) systems by mimicking biological systems. It is motivated by the observation that biological organisms, from algae to primates, excel at sensing their environment and reacting promptly to their perils and opportunities. Furthermore, they do so more resiliently than our most advanced machines, at a fraction of the power consumption. It follows that the performance of neuromorphic systems should be evaluated in terms of real-time operation, power consumption, and resiliency to real-world perturbations and noise, using task-relevant evaluation metrics. Yet, following in the footsteps of conventional machine learning, most neuromorphic benchmarks rely on recorded datasets that make sensing accuracy the primary measure of performance. Sensing accuracy is but an arbitrary proxy for the system's actual goal: making a good decision in a timely manner. Moreover, static datasets hinder our ability to study and compare the closed-loop sensing and control strategies that are central to survival for biological organisms. This article makes the case for a renewed focus on closed-loop benchmarks involving real-world tasks. Such benchmarks will be crucial in developing and advancing neuromorphic intelligence. The shift towards dynamic, real-world benchmarking tasks should usher in richer, more resilient and more robust artificially intelligent systems in the future.
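As a hedged illustration of the argument, not an implementation from the article, the following sketch shows the skeleton of a closed-loop benchmark in which the score couples task error with decision latency and a crude energy estimate; the toy environment and cost model are assumptions.

```python
# Illustrative sketch: a closed-loop benchmark skeleton. Unlike a static
# dataset, the world keeps moving while the agent deliberates, so slow or
# poor decisions are penalised together rather than sensing accuracy alone.
import time

class ToyTrackingEnv:
    """A target drifts each step; the agent must keep its estimate close."""
    def __init__(self):
        self.target = 0.0
    def step(self, action):
        self.target += 0.1                      # the world keeps moving
        error = abs(action - self.target)       # cost of a late/poor decision
        return self.target + 0.05, error        # noisy observation, task error

def run_benchmark(agent, env, steps=1000, joules_per_second=0.5):
    total_error, total_latency, obs = 0.0, 0.0, 0.0
    for _ in range(steps):
        t0 = time.perf_counter()
        action = agent(obs)                     # decision under time pressure
        total_latency += time.perf_counter() - t0
        obs, error = env.step(action)
        total_error += error
    energy = total_latency * joules_per_second  # crude stand-in for power draw
    return {"task_error": total_error / steps,
            "mean_latency_s": total_latency / steps,
            "energy_J": energy}

print(run_benchmark(lambda obs: obs, ToyTrackingEnv()))
```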
Automated Composition of Picture-Synched Music Soundtracks for Movies
We describe the implementation of and early results from a system that automatically composes picture-synched musical soundtracks for videos and movies. We use the phrase "picture-synched" to mean that the structure of the automatically composed music is determined by visual events in the input movie, i.e. the final music is synchronised to visual events and features such as cut transitions or within-shot key-frame events. Our system combines automated video analysis and computer-generated music-composition techniques to create unique soundtracks in response to the video input, and can be thought of as an initial step in creating a computerised replacement for a human composer writing music to fit the picture-locked edit of a movie. Working only from the video information in the movie, key features are extracted from the input video, using video analysis techniques, which are then fed into a machine-learning-based music generation tool, to compose a piece of music from scratch. The resulting soundtrack is tied to video features, such as scene transition markers and scene-level energy values, and is unique to the input video. Although the system we describe here is only a preliminary proof-of-concept, user evaluations of the output of the system have been positive. To be presented at the 16th ACM SIGGRAPH European Conference on Visual Media Production, London, England, 17th-18th December 2019. 10 pages, 9 figures.
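The paper's pipeline (video analysis feeding a music generator) can be illustrated with a minimal, hedged sketch of its first stage: detecting hard cut transitions by thresholding inter-frame differences with OpenCV. The threshold value and file name are assumptions, and the real system extracts richer features such as scene-level energy values.

```python
# Illustrative sketch (not the authors' system): the video-analysis stage of
# such a pipeline, flagging likely hard cuts from mean inter-frame difference.
import cv2
import numpy as np

def detect_cuts(path, threshold=30.0):
    """Return timestamps (seconds) of likely cut transitions."""
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    cuts, prev, frame_idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None:
            # A large mean absolute difference between consecutive frames
            # is a crude but serviceable signature of a hard cut.
            if np.mean(cv2.absdiff(gray, prev)) > threshold:
                cuts.append(frame_idx / fps)
        prev = gray
        frame_idx += 1
    cap.release()
    return cuts

# These cut times would then become structural markers (e.g. section
# boundaries) for a downstream music-generation tool.
print(detect_cuts("input_movie.mp4"))
```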
Motion seen and understood: interactions between language comprehension and visual perception.
Embodied theories of cognition state that the body plays a central role in cognitive representation. On this account, semantic representations, which constitute the meaning of words and sentences, are simulations of real experience that directly engage sensory and motor systems. This predicts interactions between comprehension and perception at low levels, since both engage the same systems; however, the majority of evidence comes from picture judgements or visuo-spatial attention, so it is not clear which visual processes are implicated. In addition, most of the work has concentrated on sentences rather than single words, although theories predict that the semantics of both should be grounded in simulation. This investigation sought to systematically explore these interactions, using verbs that refer to upwards or downwards motion and sentences derived from the same set of verbs. As well as looking at visuo-spatial attention, we employed tasks routinely used in visual psychophysics that access low levels of motion processing. In this way we were able to separate different levels of visual processing and explore whether interactions between comprehension and perception were present when low-level visual processes were assessed or manipulated. The results from this investigation show that: (1) there are bilateral interactions between low-level visual processes and semantic content (lexical and sentential); (2) interactions are automatic, arising whenever linguistic and visual stimuli are presented in close temporal contiguity; (3) interactions are subject to processes within the visual system, such as perceptual learning and suppression; and (4) the precise content of semantic representations dictates which visual processes are implicated in interactions. The data are best explained by a close connection between semantic representation and perceptual systems: when information from both is available, it is automatically integrated. However, the data do not support the direct and unmediated involvement of the visual system in the semantic representation of motion events. The results suggest a complex relationship between semantic representation and sensory-motor systems that can be explained by combining task-specific processes with either strong or weak embodiment.
The synthetic psychology of the self
Synthetic psychology describes the approach of “understanding through building” applied to the human condition. In this chapter, we consider the specific challenge of synthesizing a robot “sense of self”. Our starting hypothesis is that the human self is brought into being by the activity of a set of transient self-processes instantiated by the brain and body. We propose that we can synthesize a robot self by developing equivalent sub-systems within an integrated biomimetic cognitive architecture for a humanoid robot. We begin the chapter by motivating this work in the context of the criteria for recognizing other minds and the challenge of benchmarking artificial intelligence against human intelligence, and conclude by describing efforts to create a sense of self for the iCub humanoid robot that has ecological, temporally extended, interpersonal and narrative components, set within a multi-layered model of mind.