Gait Recognition from Motion Capture Data
Gait recognition from motion capture data, as a pattern classification
discipline, can be improved by the use of machine learning. This paper
contributes to the state-of-the-art with a statistical approach for extracting
robust gait features directly from raw data by a modification of Linear
Discriminant Analysis with Maximum Margin Criterion. Experiments on the CMU
MoCap database show that the suggested method outperforms thirteen relevant
methods based on geometric features and a method to learn the features by a
combination of Principal Component Analysis and Linear Discriminant Analysis.
The methods are evaluated in terms of the distribution of biometric templates
in respective feature spaces expressed in a number of class separability
coefficients and classification metrics. Results also indicate a high
portability of learned features: we can learn which aspects of walking
people generally differ in and extract those as general gait features.
Recognizing people without needing group-specific features is convenient as
particular people might not always provide annotated learning data. As a
contribution to reproducible research, our evaluation framework and database
have been made publicly available. This research makes motion capture
technology directly applicable for human recognition.
Comment: Preprint. Full paper accepted at the ACM Transactions on Multimedia
Computing, Communications, and Applications (TOMM), special issue on
Representation, Analysis and Recognition of 3D Humans. 18 pages. arXiv admin
note: substantial text overlap with arXiv:1701.00995, arXiv:1609.04392,
arXiv:1609.0693
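The Maximum Margin Criterion (MMC) named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard MMC formulation, in which the projection W maximizes tr(W^T (S_b - S_w) W) and is therefore given by the leading eigenvectors of S_b - S_w. The function name `mmc_projection` and the toy data are hypothetical.

```python
import numpy as np

def mmc_projection(X, y, n_components):
    """Return a (d, n_components) projection maximizing tr(W^T (S_b - S_w) W).

    Standard MMC formulation (an assumption, not the paper's exact variant).
    X: (n_samples, d) data matrix, y: (n_samples,) integer class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))  # between-class scatter
    S_w = np.zeros((d, d))  # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
        centered = Xc - mean_c
        S_w += centered.T @ centered
    # S_b - S_w is symmetric, so eigh applies; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(S_b - S_w)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]

# Toy data: two well-separated classes in three dimensions (hypothetical).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 3)), rng.normal(5.0, 1.0, (20, 3))])
y = np.array([0] * 20 + [1] * 20)
W = mmc_projection(X, y, n_components=1)
Z = X @ W  # one-dimensional learned features
```

Unlike classical Linear Discriminant Analysis, this criterion does not require inverting S_w, which is one commonly cited reason for preferring it on high-dimensional data.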
Persistent homology-based gait recognition robust to upper body variations
Gait recognition is nowadays an important biometric
technique for video surveillance tasks, because it can be
used at a distance. However, when upper body movements
are unrelated to the natural dynamics of the gait, for
example when carrying a bag or wearing a coat, the reported results
show low accuracy. With the goal of solving this problem, we
apply persistent homology to extract topological features from
the lowest fourth part of the body silhouettes. To obtain the
features, we modify our previous algorithm for gait recognition,
to improve its efficacy and robustness to variations in the number
of simplices of the gait complex. We evaluate our approach
using the CASIA-B dataset, obtaining a considerable accuracy
improvement of 93.8%, achieving at the same time invariance to
upper body movements unrelated to the dynamics of the gait.
Ministerio de Economía y Competitividad MTM2015-67072-
Topological signature for periodic motion recognition
In this paper, we present an algorithm that computes the topological
signature for a given periodic motion sequence. Such signature consists of a
vector obtained by persistent homology which captures the topological and
geometric changes of the object that models the motion. Two topological
signatures are compared simply by the angle between the corresponding vectors.
With respect to gait recognition, we have tested our method using only the
lowest fourth part of the body's silhouette. In this way, the impact of
variations in the upper part of the body, which are very frequent in real
scenarios, decreases considerably. We have also tested our method using other
periodic motions such as running or jumping. Finally, we formally prove that
our method is robust to small perturbations in the input data and does not
depend on the number of periods contained in the periodic motion sequence.
Comment: arXiv admin note: substantial text overlap with arXiv:1707.0698
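The abstract above states that two topological signatures are compared by the angle between their vectors. A minimal sketch of that comparison follows; the function name `signature_angle` and the example vectors are hypothetical, and the computation of the signatures themselves by persistent homology is not shown.

```python
import numpy as np

def signature_angle(u, v):
    """Angle in radians between two signature vectors of equal length."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Hypothetical signatures: b is a rescaled copy of a, c is orthogonal to a.
a = [1.0, 2.0, 0.0]
b = [2.0, 4.0, 0.0]
c = [0.0, 0.0, 3.0]
angle_ab = signature_angle(a, b)
angle_ac = signature_angle(a, c)
```

Because the angle ignores vector magnitude, `a` and `b` are treated as identical signatures, while `a` and `c` are maximally dissimilar among non-negative vectors.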
Benchmarking Deep Learning Models for Tooth Structure Segmentation.
A wide range of deep learning (DL) architectures with varying depths are available, with developers usually choosing one or a few of them for their specific task in a nonsystematic way. Benchmarking (i.e., the systematic comparison of state-of-the-art architectures on a specific task) may provide guidance in the model development process and may allow developers to make better decisions. However, comprehensive benchmarking has not been performed in dentistry yet. We aimed to benchmark a range of architecture designs for one specific, exemplary case: tooth structure segmentation on dental bitewing radiographs. We built 72 models for tooth structure (enamel, dentin, pulp, fillings, crowns) segmentation by combining 6 different DL network architectures (U-Net, U-Net++, Feature Pyramid Networks, LinkNet, Pyramid Scene Parsing Network, Mask Attention Network) with 12 encoders from 3 different encoder families (ResNet, VGG, DenseNet) of varying depth (e.g., VGG13, VGG16, VGG19). On each model design, 3 initialization strategies (ImageNet, CheXpert, random initialization) were applied, resulting in 216 trained models overall, each trained for up to 200 epochs with the Adam optimizer (learning rate = 0.0001) and a batch size of 32. Our data set consisted of 1,625 human-annotated dental bitewing radiographs. We used a 5-fold cross-validation scheme and quantified model performances primarily by the F1-score. Initialization with ImageNet or CheXpert weights significantly outperformed random initialization (P < 0.05). Deeper and more complex models did not necessarily perform better than less complex alternatives. VGG-based models were more robust across model configurations, while more complex models (e.g., from the ResNet family) achieved peak performances. In conclusion, initializing models with pretrained weights may be recommended when training models for dental radiographic analysis. 
Less complex model architectures may be competitive alternatives if computational resources and training time are restricting factors. Models developed and found superior on nondental data sets may not show this behavior for dental domain-specific tasks.
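The experimental grid described in this abstract (6 architectures, 12 encoders, 3 initialization strategies) can be enumerated as a quick check of the reported counts. This is only an illustrative sketch: the abstract names just three of the twelve encoders, so the encoder entries below are hypothetical placeholders, not the study's actual list.

```python
from itertools import product

architectures = [
    "U-Net", "U-Net++", "Feature Pyramid Networks", "LinkNet",
    "Pyramid Scene Parsing Network", "Mask Attention Network",
]
# Placeholder identifiers: the abstract only names VGG13, VGG16 and VGG19.
encoders = [f"encoder_{i:02d}" for i in range(1, 13)]
initializations = ["ImageNet", "CheXpert", "random"]

# One model design per (architecture, encoder) pair.
model_designs = list(product(architectures, encoders))
# One training run per (architecture, encoder, initialization) triple.
training_runs = list(product(architectures, encoders, initializations))

print(len(model_designs), len(training_runs))  # 72 216
```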
Deep Convolutional Neural Network Ensembles Using ECOC
Deep neural networks have enhanced the performance of decision making systems in many applications, including image understanding, and further gains can be achieved by constructing ensembles. However, designing an ensemble of deep networks is often not very beneficial since the time needed to train the networks is generally very high or the performance gain obtained is not very significant. In this paper, we analyse an error correcting output coding (ECOC) framework for constructing ensembles of deep networks and propose different design strategies to address the accuracy-complexity trade-off. We carry out an extensive comparative study between the introduced ECOC designs and state-of-the-art ensemble techniques such as ensemble averaging and gradient boosting decision trees. Furthermore, we propose a fusion technique that is shown to achieve the highest classification performance.
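As background for the ECOC framework mentioned in this abstract, the sketch below shows generic ECOC decoding: each class is assigned a binary codeword, each base learner predicts one bit, and the predicted class is the codeword nearest in Hamming distance. The code matrix, the function name `ecoc_decode`, and the example bits are hypothetical and do not reproduce the designs proposed in the paper.

```python
import numpy as np

# Hypothetical code matrix: 4 classes (rows), 7 binary base learners (columns).
# Every pair of rows differs in 4 positions, so one flipped bit is corrected.
CODE_MATRIX = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 1],
])

def ecoc_decode(bits):
    """Return the class whose codeword is closest in Hamming distance."""
    bits = np.asarray(bits)
    distances = np.sum(CODE_MATRIX != bits, axis=1)
    return int(np.argmin(distances))

# Codeword of class 2 with the first bit flipped by a faulty base learner.
noisy_bits = [0, 1, 0, 0, 1, 1, 0]
predicted = ecoc_decode(noisy_bits)
```

Here `predicted` is 2: the flipped bit leaves the output at Hamming distance 1 from the class 2 codeword and at least 3 from every other codeword.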
Microcanonical and Canonical Ensembles for fMRI Brain Networks in Alzheimer’s Disease
This paper seeks to advance the state-of-the-art in analysing fMRI data to detect onset of Alzheimer’s disease and identify stages in the disease progression. We employ methods of network neuroscience to represent correlation across fMRI data arrays, and introduce novel techniques for network construction and analysis. In network construction, we vary thresholds in establishing BOLD time series correlation between nodes, yielding variations in topological and other network characteristics. For network analysis, we employ methods developed for modelling statistical ensembles of virtual particles in thermal systems. The microcanonical ensemble and the canonical ensemble are analogous to two different fMRI network representations. In the former case, there is zero variance in the number of edges in each network, while in the latter case the set of networks has a variance in the number of edges. Ensemble methods describe the macroscopic properties of a network by considering the underlying microscopic characterisations which are in turn closely related to the degree configuration and network entropy. When applied to fMRI data in populations of Alzheimer’s patients and controls, our methods demonstrated levels of sensitivity adequate for clinical purposes in both identifying brain regions undergoing pathological changes and in revealing the dynamics of such changes.
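The distinction drawn in this abstract between networks with a fixed number of edges and networks whose edge count varies can be illustrated with two ways of building a graph from a correlation matrix. This is a sketch under assumptions, not the paper's construction: the functions `edges_by_threshold` and `edges_by_count`, the threshold 0.15, and the random stand-in data are all hypothetical.

```python
import numpy as np

def edges_by_threshold(corr, threshold):
    """Adjacency with an edge wherever |correlation| >= threshold.

    The number of edges varies from one network to the next.
    """
    corr = np.asarray(corr, dtype=float)
    adj = (np.abs(corr) >= threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-loops
    return adj

def edges_by_count(corr, n_edges):
    """Adjacency keeping exactly the n_edges strongest correlations.

    Every network built this way has the same number of edges.
    """
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    rows, cols = np.triu_indices(n, k=1)
    strength = np.abs(corr[rows, cols])
    keep = np.argsort(strength)[::-1][:n_edges]
    adj = np.zeros((n, n), dtype=int)
    adj[rows[keep], cols[keep]] = 1
    adj[cols[keep], rows[keep]] = 1
    return adj

# Random stand-in for per-subject time series: 6 regions, 50 time points each.
rng = np.random.default_rng(0)
subjects = [np.corrcoef(rng.normal(size=(6, 50))) for _ in range(5)]
varying = [int(edges_by_threshold(c, 0.15).sum() // 2) for c in subjects]
fixed = [int(edges_by_count(c, 5).sum() // 2) for c in subjects]
```

With a fixed edge budget, `fixed` contains the same count for every subject (zero variance in the number of edges), whereas the counts in `varying` depend on each subject's correlations.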
Phonological Proximity in Costa Rican Sign Language
The study of phonological proximity makes it possible to establish a basis for future decision-making in the treatment of sign languages. Knowing how close a set of signs are makes it easier to organize their study by clustering, as well as to teach the language to third parties based on similarities. In addition, it lays the foundation for strengthening disambiguation modules in automatic recognition systems. To the best of our knowledge, this is the first study of its kind for Costa Rican Sign Language (LESCO, for its Spanish acronym), and it forms the basis for one of the modules of the already operational sign and speech editing system called the International Platform for Sign Language Edition (PIELS). A database of 2665 signs, grouped into eight contexts, is used, and similarity measures are compared using standard statistical formulas to measure their degree of correlation. This corpus will be especially useful in machine learning approaches. In this work, we have proposed an analysis of different similarity measures between signs in order to determine the phonological proximity between them. After analyzing the results obtained, we can conclude that LESCO is a sign language with high levels of phonological proximity, particularly in the orientation and location components, but noticeably lower levels in the form component. We have also concluded, as an outstanding contribution of our research, that automatic recognition systems can base their first prototypes on the contexts or sign domains that map to clusters with lower levels of similarity. 
As mentioned, the results obtained have multiple applications such as in the teaching area or the Natural Language Processing area for automatic recognition tasks.
This work was supported in part by the Spanish Ministry of Science, Innovation and Universities through the Project ECLIPSE-UA under Grant RTI2018-094283-B-C32, the Project INTEGER under Grant RTI2018-094649-B-I00, and partly by the Conselleria de Educación, Investigación, Cultura y Deporte of the Community of Valencia, Spain, within the Project PROMETEO/2018/089