Learning Generative ConvNets via Multi-grid Modeling and Sampling
This paper proposes a multi-grid method for learning energy-based generative
ConvNet models of images. For each grid, we learn an energy-based probabilistic
model where the energy function is defined by a bottom-up convolutional neural
network (ConvNet or CNN). Learning such a model requires generating synthesized
examples from the model. Within each iteration of our learning algorithm, for
each observed training image, we generate synthesized images at multiple grids
by initializing the finite-step MCMC sampling from a minimal 1 x 1 version of
the training image. The synthesized image at each subsequent grid is obtained
by a finite-step MCMC initialized from the synthesized image generated at the
previous coarser grid. After obtaining the synthesized examples, the parameters
of the models at multiple grids are updated separately and simultaneously based
on the differences between synthesized and observed examples. We show that this
multi-grid method can learn realistic energy-based generative ConvNet models,
and it outperforms the original contrastive divergence (CD) and persistent CD.
Comment: CVPR 201
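The coarse-to-fine sampling described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: it uses 1-D signals instead of images, assumes Langevin dynamics as the finite-step MCMC, and replaces the ConvNet energy with a hypothetical quadratic energy (`toy_grad`). The parameter update that compares synthesized and observed examples is omitted; all function names are illustrative.

```python
import random

def downsample(signal, size):
    """Average-pool a 1-D signal (length divisible by `size`) down to `size` cells."""
    block = len(signal) // size
    return [sum(signal[i * block:(i + 1) * block]) / block for i in range(size)]

def upsample(signal, size):
    """Nearest-neighbour upsample a 1-D signal to `size` cells."""
    factor = size // len(signal)
    return [v for v in signal for _ in range(factor)]

def toy_grad(x):
    """Gradient of a placeholder energy E(x) = 0.5 * sum((x_i - 0.5)^2).
    Assumption: stands in for the gradient of a learned ConvNet energy."""
    return [xi - 0.5 for xi in x]

def langevin(x, grad_energy, steps=10, step_size=0.1):
    """Finite-step Langevin dynamics: x <- x - (s^2 / 2) * dE/dx + s * noise."""
    for _ in range(steps):
        g = grad_energy(x)
        x = [xi - 0.5 * step_size ** 2 * gi + step_size * random.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
    return x

def multigrid_sample(observed, grid_sizes, grad_energies):
    """Return one synthesized example per grid, ordered coarse to fine."""
    current = downsample(observed, 1)  # minimal one-cell version of the example
    samples = []
    for size, grad in zip(grid_sizes, grad_energies):
        current = upsample(current, size)   # initialize from the coarser sample
        current = langevin(current, grad)   # finite-step MCMC at this grid
        samples.append(current)
    return samples

observed = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
samples = multigrid_sample(observed, [2, 4, 8], [toy_grad] * 3)
```

Each grid has its own energy gradient in `grad_energies`, mirroring the abstract's statement that the models at multiple grids are separate.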
Panoramic Annular Localizer: Tackling the Variation Challenges of Outdoor Localization Using Panoramic Annular Images and Active Deep Descriptors
Visual localization is an attractive problem that estimates the camera
location from database images based on the query image. It is a crucial
task for various applications, such as autonomous vehicles, assistive
navigation and augmented reality. The challenging issues of the task lie in
various appearance variations between query and database images, including
illumination variations, dynamic object variations and viewpoint variations. In
order to tackle those challenges, this paper proposes the Panoramic Annular
Localizer, which incorporates a panoramic annular lens and robust deep image
descriptors. The panoramic annular images captured by the single
camera are processed and fed into the NetVLAD network to form the active deep
descriptor, and sequential matching is utilized to generate the localization
result. Experiments carried out on public datasets and in the field
validate the proposed system.
Comment: Accepted by ITSC 201
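As a rough illustration of sequential matching over image descriptors, the sketch below aligns a short query sequence against a database sequence and picks the offset with the lowest summed cosine distance. The abstract does not specify the matching scheme, so the fixed-offset, one-to-one alignment is an assumption, and the short vectors stand in for real NetVLAD descriptors. Function names are hypothetical.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def sequence_match(query_seq, database):
    """Return (best_start, best_cost) for aligning query_seq inside database.

    Assumes query and database frames advance at the same rate, so frame k
    of the query is compared with database frame start + k.
    """
    n, m = len(query_seq), len(database)
    if n == 0 or n > m:
        raise ValueError("query must be non-empty and no longer than the database")
    best_start, best_cost = -1, math.inf
    for start in range(m - n + 1):
        cost = sum(cosine_distance(q, database[start + k])
                   for k, q in enumerate(query_seq))
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_cost

# Toy descriptors: four database frames and a two-frame query.
database = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [[0.0, 1.0], [-1.0, 0.1]]
best_start, best_cost = sequence_match(query, database)
```

Summing distances over a sequence rather than matching single frames is what makes this kind of scheme less sensitive to one badly lit or occluded image.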
Two Heads are Better than One: A Bio-inspired Method for Improving Classification on EEG-ET Data
Classifying EEG data is integral to the performance of Brain Computer
Interfaces (BCI) and their applications. However, external noise often
obstructs EEG data due to its biological nature and complex data collection
process. Especially when dealing with classification tasks, standard EEG
preprocessing approaches extract relevant events and features from the entire
dataset. However, these approaches treat all relevant cognitive events equally
and overlook the dynamic nature of the brain over time. In contrast, we are
inspired by neuroscience studies to use a novel approach that integrates
feature selection and time segmentation of EEG data. When tested on the
EEGEyeNet dataset, our proposed method significantly increases the performance
of Machine Learning classifiers while reducing their respective computational
complexity.
Comment: 6 pages, 3 figures, HCI International 2023 Poste
A Developmental Learning Approach of Mobile Manipulator via Playing
Inspired by infant development theories, a robotic developmental model combined with game elements is proposed in this paper. This model does not require the definition of specific developmental goals for the robot; instead, the developmental goals are implied in the goals of a series of game tasks. The games are organized into a sequence of game modes ordered by task complexity, from simple to complex, and the task complexity is determined by the application of developmental constraints. Given a current mode, the robot switches to a more complicated game mode when it cannot find any new salient stimuli in the current mode. By doing so, the robot gradually achieves its developmental goals by playing different modes of games. In the experiment, the game was instantiated on a mobile robot with the playing task of picking up toys, and the game was designed with a simple game mode and a complex game mode. A developmental algorithm, “Lift-Constraint, Act and Saturate,” is employed to drive the mobile robot to move from the simple mode to the complex one. The experimental results show that the mobile manipulator is able to successfully learn the mobile grasping ability after playing simple and complex games, which is promising for developing robotic abilities to solve complex tasks using games.
IMPROVED DESIGN OF DTW AND GMM CASCADED ARABIC SPEAKER
In this paper, we discuss the design, implementation and assessment of a two-stage Arabic speaker recognition system, which aims to recognize a target Arabic speaker among several people. The first stage uses an improved DTW (Dynamic Time Warping) algorithm and the second stage uses an SA-KM-based GMM (Gaussian Mixture Model). MFCCs (Mel Frequency Cepstral Coefficients) and their difference forms are extracted from the speech samples as acoustic features. DTW provides the three most likely speakers, and these recognition results are then passed to the GMM training processes. A specified similarity assessment algorithm, KL distance, is applied to find the best match with the target speaker. Experimental results show that the text-independent recognition rate of the cascaded system reaches 90 percent.
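The first stage of such a cascade can be sketched with textbook DTW. This is the classic dynamic-programming form, not the improved variant the abstract refers to, and it works on scalar sequences rather than MFCC vectors. The `shortlist` helper and the speaker templates are hypothetical, and the GMM second stage is not shown.

```python
import math

def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences (absolute difference)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def shortlist(query, templates, k=3):
    """Return the k speaker names whose templates are closest to the query."""
    ranked = sorted(templates, key=lambda name: dtw_distance(query, templates[name]))
    return ranked[:k]

# Hypothetical one-dimensional feature tracks, one per enrolled speaker.
templates = {
    "a": [1, 2, 3],
    "b": [5, 5, 5],
    "c": [1, 2, 4],
    "d": [9, 9, 9],
}
candidates = shortlist([1, 2, 3, 3], templates)
```

With `k=3`, the shortlist corresponds to the three candidate speakers that the abstract says are handed to the second stage.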