1,388 research outputs found
Spatio-temporal learning with the online finite and infinite echo-state Gaussian processes
Successful biological systems adapt to change. In this paper, we are principally concerned with adaptive systems that operate in environments where data arrives sequentially and is multivariate in nature, for example, sensory streams in robotic systems. We contribute two reservoir inspired methods: 1) the online echostate Gaussian process (OESGP) and 2) its infinite variant, the online infinite echostate Gaussian process (OIESGP) Both algorithms are iterative fixed-budget methods that learn from noisy time series. In particular, the OESGP combines the echo-state network with Bayesian online learning for Gaussian processes. Extending this to infinite reservoirs yields the OIESGP, which uses a novel recursive kernel with automatic relevance determination that enables spatial and temporal feature weighting. When fused with stochastic natural gradient descent, the kernel hyperparameters are iteratively adapted to better model the target system. Furthermore, insights into the underlying system can be gleamed from inspection of the resulting hyperparameters. Experiments on noisy benchmark problems (one-step prediction and system identification) demonstrate that our methods yield high accuracies relative to state-of-the-art methods, and standard kernels with sliding windows, particularly on problems with irrelevant dimensions. In addition, we describe two case studies in robotic learning-by-demonstration involving the Nao humanoid robot and the Assistive Robot Transport for Youngsters (ARTY) smart wheelchair
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition
Online handwritten Chinese text recognition (OHCTR) is a challenging problem
as it involves a large-scale character set, ambiguous segmentation, and
variable-length input sequences. In this paper, we exploit the outstanding
capability of path signature to translate online pen-tip trajectories into
informative signature feature maps using a sliding window-based method,
successfully capturing the analytic and geometric properties of pen strokes
with strong local invariance and robustness. A multi-spatial-context fully
convolutional recurrent network (MCFCRN) is proposed to exploit the multiple
spatial contexts from the signature feature maps and generate a prediction
sequence while completely avoiding the difficult segmentation problem.
Furthermore, an implicit language model is developed to make predictions based
on semantic context within a predicting feature sequence, providing a new
perspective for incorporating lexicon constraints and prior knowledge about a
certain language in the recognition procedure. Experiments on two standard
benchmarks, Dataset-CASIA and Dataset-ICDAR, yielded outstanding results, with
correct rates of 97.10% and 97.15%, respectively, which are significantly
better than the best result reported thus far in the literature.Comment: 14 pages, 9 figure
Using signal processing, evolutionary computation, and machine learning to identify transposable elements in genomes
About half of the human genome consists of transposable elements (TE's), sequences that have many copies of themselves distributed throughout the genome. All genomes, from bacterial to human, contain TE's. TE's affect genome function by either creating proteins directly or affecting genome regulation. They serve as molecular fossils, giving clues to the evolutionary history of the organism. TE's are often challenging to identify because they are fragmentary or heavily mutated. In this thesis, novel features for the detection and study of TE's are developed. These features are of two types. The first type are statistical features based on the Fourier transform used to assess reading frame use. These features measure how different the reading frame use is from that of a random sequence, which reading frames the sequence is using, and the proportion of use of the active reading frames. The second type of feature, called side effect machine (SEM) features, are generated by finite state machines augmented with counters that track the number of times the state is visited. These counters then become features of the sequence. The number of possible SEM features is super-exponential in the number of states. New methods for selecting useful feature subsets that incorporate a genetic algorithm and a novel clustering method are introduced. The features produced reveal structural characteristics of the sequences of potential interest to biologists. A detailed analysis of the genetic algorithm, its fitness functions, and its fitness landscapes is performed. The features are used, together with features used in existing exon finding algorithms, to build classifiers that distinguish TE's from other genomic sequences in humans, fruit flies, and ciliates. The classifiers achieve high accuracy (> 85%) on a variety of TE classification problems. The classifiers are used to scan large genomes for TE's. In addition, the features are used to describe the TE's in the newly sequenced ciliate, Tetrahymena thermophile to provide information for biologists useful to them in forming hypotheses to test experimentally concerning the role of these TE's and the mechanisms that govern them
Extension of Wirtinger's Calculus to Reproducing Kernel Hilbert Spaces and the Complex Kernel LMS
Over the last decade, kernel methods for nonlinear processing have
successfully been used in the machine learning community. The primary
mathematical tool employed in these methods is the notion of the Reproducing
Kernel Hilbert Space. However, so far, the emphasis has been on batch
techniques. It is only recently, that online techniques have been considered in
the context of adaptive signal processing tasks. Moreover, these efforts have
only been focussed on real valued data sequences. To the best of our knowledge,
no adaptive kernel-based strategy has been developed, so far, for complex
valued signals. Furthermore, although the real reproducing kernels are used in
an increasing number of machine learning problems, complex kernels have not,
yet, been used, in spite of their potential interest in applications that deal
with complex signals, with Communications being a typical example. In this
paper, we present a general framework to attack the problem of adaptive
filtering of complex signals, using either real reproducing kernels, taking
advantage of a technique called \textit{complexification} of real RKHSs, or
complex reproducing kernels, highlighting the use of the complex gaussian
kernel. In order to derive gradients of operators that need to be defined on
the associated complex RKHSs, we employ the powerful tool of Wirtinger's
Calculus, which has recently attracted attention in the signal processing
community. To this end, in this paper, the notion of Wirtinger's calculus is
extended, for the first time, to include complex RKHSs and use it to derive
several realizations of the Complex Kernel Least-Mean-Square (CKLMS) algorithm.
Experiments verify that the CKLMS offers significant performance improvements
over several linear and nonlinear algorithms, when dealing with nonlinearities.Comment: 15 pages (double column), preprint of article accepted in IEEE Trans.
Sig. Pro
- …