Classification and Verification of Online Handwritten Signatures with Time Causal Information Theory Quantifiers
We present a new approach for online handwritten signature classification and
verification based on descriptors stemming from Information Theory. The
proposal uses the Shannon Entropy, the Statistical Complexity, and the Fisher
Information evaluated over the Bandt and Pompe symbolization of the horizontal
and vertical coordinates of signatures. These six features are easy and fast to
compute, and they are the input to a One-Class Support Vector Machine
classifier. The results produced surpass state-of-the-art techniques that
employ higher-dimensional feature spaces which often require specialized
software and hardware. We assess the consistency of our proposal with respect
to the size of the training sample, and we also use it to classify the
signatures into meaningful groups. Comment: Submitted to PLOS On
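As a rough illustration of the Bandt and Pompe symbolization mentioned in this abstract, the sketch below computes a normalized permutation (Shannon) entropy for a single coordinate series. It is not the authors' code: the function names, the embedding dimension `dim=3`, and the stable tie-breaking rule are assumptions made for this example, and the Statistical Complexity and Fisher Information features are not shown.

```python
import math
from collections import Counter


def ordinal_patterns(series, dim=3):
    """Map each window of `dim` consecutive values to its ordinal pattern.

    The pattern is the tuple of indices that sorts the window; ties are
    broken by position (an assumption, since Python's sort is stable).
    """
    patterns = []
    for i in range(len(series) - dim + 1):
        window = series[i:i + dim]
        patterns.append(tuple(sorted(range(dim), key=lambda k: window[k])))
    return patterns


def permutation_entropy(series, dim=3):
    """Shannon entropy of the ordinal-pattern distribution, scaled to [0, 1]."""
    counts = Counter(ordinal_patterns(series, dim))
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Normalize by the maximum entropy, log(dim!), reached when all
    # dim! patterns are equally likely.
    return h / math.log(math.factorial(dim))


# Hypothetical pen coordinates, not real signature data.
x_coords = [1, 2, 3, 2, 1]
pe_x = permutation_entropy(x_coords, dim=3)
```

Under the scheme described in the abstract, the same computation would be repeated for the vertical coordinate, giving one entropy value per axis.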
DNA viewed as an out-of-equilibrium structure
The complexity of the primary structure of human DNA is explored using
methods from nonequilibrium statistical mechanics, dynamical systems theory and
information theory. The use of chi-square tests shows that DNA cannot be
described as a low order Markov chain of order up to . Although detailed
balance seems to hold at the level of purine-pyrimidine notation it fails when
all four basepairs are considered, suggesting spatial asymmetry and
irreversibility. Furthermore, the block entropy does not increase linearly with
the block size, reflecting the long range nature of the correlations in the
human genomic sequences. To probe locally the spatial structure of the chain we
study the exit distances from a specific symbol, the distribution of recurrence
distances and the Hurst exponent, all of which show power law tails and long
range characteristics. These results suggest that human DNA can be viewed as a
non-equilibrium structure maintained in its state through interactions with a
constantly changing environment. Based solely on the exit distance distribution
accounting for the nonequilibrium statistics and using the Monte Carlo
rejection sampling method we construct a model DNA sequence. This method
preserves all long range and short range statistical characteristics of the
original sequence. The model sequence presents the same characteristic
exponents as the natural DNA but fails to capture point-to-point details.
Basic research planning in mathematical pattern recognition and image analysis
Fundamental problems encountered while attempting to develop automated techniques for applications of remote sensing are discussed under the following categories: (1) geometric and radiometric preprocessing; (2) spatial, spectral, temporal, syntactic, and ancillary digital image representation; (3) image partitioning, proportion estimation, and error models in object scene interference; (4) parallel processing and image data structures; and (5) continuing studies in polarization; computer architectures and parallel processing; and the applicability of "expert systems" to interactive analysis.
Modeling user navigation
This paper proposes the use of neural networks as a tool for studying navigation within virtual worlds. Results indicate that the network learned to predict the next step for a given trajectory. Analysis of the hidden layer shows that the network was able to differentiate between two groups of users identified on the basis of their performance on a spatial task. Time series analysis of hidden node activation values and input vectors suggested that certain hidden units become specialised for place and heading, respectively. The benefits of this approach and the possibility of extending the methodology to the study of navigation in Human Computer Interaction applications are discussed.
Eddy current defect response analysis using sum of Gaussian methods
This dissertation is a study of methods to automatically detect and approximate eddy current differential coil defect signatures as a sum of Gaussian functions (SoG). Datasets varying in material, defect size, inspection frequency, and coil diameter were investigated. Dimensionally reduced representations of the defect responses were obtained using common existing reduction methods and novel enhancements to them based on SoG representations. The efficacy of the SoG-enhanced representations was studied using common interpretable Machine Learning (ML) classifier designs, with the SoG representations showing significant improvement in common analysis metrics.
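To make the sum-of-Gaussians idea concrete, the sketch below evaluates a model of the form f(x) = Σ aᵢ · exp(−(x − μᵢ)² / (2σᵢ²)). It is a minimal illustration, not the dissertation's method: the `(amplitude, mean, sigma)` parameterization, the function name, and the two-lobe example values are assumptions, and the fitting step that would estimate these parameters from measured data is not shown.

```python
import math


def sum_of_gaussians(x, components):
    """Evaluate a sum of Gaussians at x.

    `components` is a list of (amplitude, mean, sigma) tuples; this
    layout is an assumption made for the example.
    """
    return sum(
        a * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        for a, mu, sigma in components
    )


# Hypothetical two-lobe shape with opposite-sign amplitudes, loosely
# resembling a differential response; the values are illustrative only.
lobes = [(1.0, -1.0, 0.5), (-1.0, 1.0, 0.5)]
at_center = sum_of_gaussians(0.0, lobes)       # lobes cancel at the midpoint
at_left_peak = sum_of_gaussians(-1.0, lobes)   # dominated by the first lobe
```

Each component contributes three parameters, so a response approximated by k Gaussians is summarized by 3k numbers, which is what makes such a representation a candidate for dimensionality reduction.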
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.