Comparison between two spatio-temporal organization maps for speech recognition

Author: Alexandre Frédéric
Bougrain Laurent
Neji Ben Salem Zouhour
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/08/2006

http://www.springerlink.com/content/w1q3r7r14857/

In this paper, we compare two biologically inspired models that combine spatio-temporal data coding, representation and processing. These models are based on the Self-Organizing Map (SOM), yielding a Spatio-Temporal Organization Map (STOM). More precisely, the map is trained using two different spatio-temporal algorithms rooted in biological research: the ST-Kohonen and the Time-Organized Map (TOM). These algorithms use two kinds of spatio-temporal data coding. The first is based on the domain of complex numbers, while the second is based on the ISI (Inter Spike Interval). STOM is tested in the field of speech recognition in order to evaluate its performance on such a time-varying application and to show that biological models are capable of giving good results, as stochastic and hybrid ones do.

INRIA a CCSD electronic archive server

Temporal and Spatial Data Mining with Second-Order Hidden Models

Author: Le Ber Florence
Mari Jean-Francois
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

In the frame of designing a knowledge discovery system, we have developed stochastic models based on high-order hidden Markov models. These models are capable of mapping sequences of data into a Markov chain in which the transitions between the states depend on the n previous states, according to the order of the model. We study the process of extracting information from spatial and temporal data by means of an unsupervised classification. To this end, we use a French national database on the land use of a region, named Teruti, which describes the land use in both the spatial and temporal domains. Land-use categories (wheat, corn, forest, ...) are logged every year on each site, the sites being regularly spaced in the region. They constitute a temporal sequence of images in which we look for spatial and temporal dependencies. The temporal segmentation of the data is done by means of a second-order Hidden Markov Model (HMM2) that appears to have very good capabilities to locate stationary segments, as shown in our previous work in speech recognition. The spatial classification is performed by defining a fractal scanning of the images with the help of a Hilbert-Peano curve that introduces a total order on the sites, preserving the neighborhood relation between the sites. We show that the HMM2 performs a classification that is meaningful for the agronomists. Spatial and temporal classification may be achieved simultaneously by means of a two-level HMM2 that measures the a posteriori probability of mapping a temporal sequence of images onto a set of hidden classes.
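The fractal scanning mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a square grid whose side n is a power of two, and the names hilbert_index and scan_order are hypothetical. It uses the standard bitwise construction of the Hilbert curve to turn 2D site coordinates into a 1D order in which consecutive sites are always grid neighbors.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve filling an n x n grid.

    Assumption: n is a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Each quadrant at scale s contributes a block of s*s positions.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def scan_order(n):
    """All cells of the n x n grid, sorted by their Hilbert-curve position."""
    cells = [(x, y) for x in range(n) for y in range(n)]
    return sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))


if __name__ == "__main__":
    # Print the order in which the sites of a 4 x 4 image would be visited.
    for step, cell in enumerate(scan_order(4)):
        print(step, cell)
```

Reading the sites of each yearly image in this order yields a one-dimensional sequence that a sequential model such as an HMM can process, while sites that are adjacent in the sequence remain adjacent in the image.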

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

The role of HG in the analysis of temporal iteration and interaural correlation

Author: Barrett DJK
Hall DA
Publication date: 01/01/2004

Nottingham Trent Institutional Repository (IRep)

A Neural Network Model of Spatio-Temporal Pattern Recognition, Recall and Timing

Author: Mannes Christian
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/02/1992

This paper describes the design of a self-organizing, hierarchical neural network model of unsupervised serial learning. The model learns to recognize, store, and recall sequences of unitized patterns, using either short-term memory (STM) or both STM and long-term memory (LTM) mechanisms. Timing information is learned and recall (both from STM and from LTM) is performed with a learned rhythmical structure. The network, bearing similarities with ART (Carpenter & Grossberg 1987a), learns to map temporal sequences to unitized patterns, which makes it suitable for hierarchical operation. It is therefore capable of self-organizing codes for sequences of sequences. The capacity is only limited by the number of nodes provided. Selected simulation results are reported to illustrate system properties.

National Science Foundation (IRI-9024877

Boston University Institutional Repository (OpenBU)

An original framework for understanding human actions and body language by using deep neural networks

Author: MASSARONI CRISTIANO
Publication date: 28/02/2020

The evolution of the fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, in turn, plays a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first one, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures is proposed; the second module presents a solution based on 2D skeletons and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition using 3D skeletons and Deep Neural Networks (DNNs) is provided. The performance of LSTM-RNNs is explored in depth, due to their ability to model the long-term contextual information of temporal sequences, making them suitable for analysing body movements. All the modules were tested on challenging, well-known datasets, showing remarkable results compared to methods in the current literature.

Archivio della ricerca- Università di Roma La Sapienza

A semantic and language-based representation of an environmental scene

Author: CLARAMUNT Christophe
LE YAOUANC Jean-Marie
SAUX Eric
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

The modeling of a landscape environment is a cognitive activity that requires appropriate spatial representations. The research presented in this paper introduces a structural and semantic categorization of a landscape view based on panoramic photographs that act as a substitute for a given natural environment. Verbal descriptions of a landscape scene provide the modeling input of our approach. This structure-based model identifies the spatial, relational, and semantic constructs that emerge from these descriptions. Concepts in the environment are qualified according to a semantic classification, their proximity and direction to the observer, and the spatial relations that qualify them. The resulting model is represented in a way that constitutes a modeling support for the study of environmental scenes, and a contribution to further research oriented to the mapping of a verbal description onto a geographical information system-based representation.

SAM : Science Arts et Métiers