An efficient local binary pattern based plantar pressure optical sensor image classification using convolutional neural networks
The objective of this study was to design and produce highly comfortable shoe products guided by a plantar pressure imaging data-set. Previous studies focused on geometric measurement of plantar size, whereas this research develops a classification technology based on a plantar pressure optical imaging data-set. An improved local binary pattern (LBP) algorithm is used to extract texture-based features and recognize patterns in the data-set, and a model for calculating plantar pressure imaging feature areas is then established. The data-set is classified by a neural network to guide the generation of various shoe-last surfaces. Firstly, the local binary pattern is adapted to the pressure imaging data-set, and the texture-based feature calculation is used to accurately generate the feature point set; this plantar pressure imaging feature point set then guides the free-surface forming of the shoe last. In the plantar imaging experiments, multi-dimensional texture-based features and improved LBP features are learned by a convolutional neural network (CNN) and compared with a 21-input, 3-output two-layer perceptron. Three foot types are investigated: flatfoot (F), the lack of a normal arch or arch collapse; Talipes Equinovarus (TE), in which the forefoot is adducted with calcaneal varus, plantar flexion, or Achilles tendon contracture; and Normal (N). The rotation-invariant LBP (RI-LBP) algorithm with a 10-hidden-layer CNN and 21 texture-based features achieves an 82% accuracy rate, compared with other deep learning methods reported in the literature.
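The rotation-invariant LBP feature extraction described above can be illustrated with a minimal sketch. This is not the paper's improved algorithm, only the classic 8-neighbour RI-LBP idea it builds on: each pixel's comparison pattern with its neighbour ring is mapped to the minimum over all circular bit rotations, and a normalised code histogram serves as the texture feature; all function names here are hypothetical.

```python
import numpy as np

def rotation_invariant_lbp(image):
    """Rotation-invariant LBP codes for a 2D grayscale image.

    Simplified 8-neighbour sketch: each pixel is compared with its
    ring of neighbours, and the 8-bit pattern is mapped to the minimum
    value over all circular bit rotations, making the code invariant
    to image rotation.
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Offsets of the 8 neighbours, ordered circularly around the centre.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            centre = img[i, j]
            bits = [1 if img[i + di, j + dj] >= centre else 0
                    for di, dj in offsets]
            # Minimum over circular rotations -> rotation invariance.
            codes[i - 1, j - 1] = min(
                sum(b << k for k, b in enumerate(bits[r:] + bits[:r]))
                for r in range(8)
            )
    return codes

def lbp_histogram(codes, bins=256):
    """Normalised histogram of LBP codes, usable as a texture feature."""
    hist, _ = np.histogram(codes, bins=bins, range=(0, bins))
    return hist / max(hist.sum(), 1)
```

Because each code is minimised over circular rotations, a 90-degree rotation of the input image leaves the code histogram unchanged, which is the property the RI-LBP features rely on.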
Automatic affective dimension recognition from naturalistic facial expressions based on wavelet filtering and PLS regression
Continuous automatic recognition of affective dimensions from facial expressions in naturalistic contexts is a very challenging but important research topic in human-computer interaction. In this paper, an automatic recognition system is proposed to continuously predict affective dimensions such as Arousal, Valence and Dominance in naturalistic facial expression videos. Firstly, visual and vocal features are extracted from the image frames and audio segments of the videos. Secondly, a wavelet-transform-based digital filtering method is applied to remove irrelevant noise from the feature space. Thirdly, Partial Least Squares regression is used to predict the affective dimensions from the video and audio modalities. Finally, the two modalities are combined in a decision-fusion step to boost overall performance. The proposed method is evaluated on the dataset of the fourth international Audio/Visual Emotion Recognition Challenge (AVEC2014) and performs well compared to other state-of-the-art methods in the affect recognition sub-challenge.
Towards glass-box CNNs
The substantial performance of neural networks in sensitive fields increases the need for interpretable deep learning models. A major challenge is to uncover the multiscale and distributed representations hidden inside the black-box mappings of deep neural networks. Researchers have tried to comprehend them through visual analysis of features, mathematical structures, and other data-driven approaches. Here, we work on implementation invariances of CNN-based representations and present an analytical binary prototype that provides useful insights for large-scale real-life applications. We begin by unfolding a conventional CNN and then repack it into a more transparent representation. Inspired by the achievements of neural networks, we present our findings as a three-layer model. The first is a representation layer that encompasses both the class information (group invariant) and symmetric transformations (group equivariant) of the input images; through these transformations we decrease the intra-class distance and increase the inter-class distance. It is then passed through a dimension-reduction layer followed by a classifier. The proposed representation is compared with the equivariance of AlexNet's internal representation for clearer presentation of the simulation results. We foresee the following immediate advantages of this toy version: i) it contributes to pre-processing of data to increase feature or class separability in large-scale problems; ii) it helps in designing neural architectures that improve classification performance in multi-class problems; and iii) it helps in building interpretable CNNs through scalable functional blocks
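The three-layer structure (representation layer, dimension-reduction layer, classifier) can be sketched as a toy pipeline. This is not the paper's analytical prototype: a rotation-averaged feature vector stands in for the group-invariant representation layer, and PCA plus logistic regression stand in for the dimension-reduction and classifier layers; all names are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def group_invariant_features(images):
    """Representation layer (toy version): average each image's pixel
    vector over the four 90-degree rotations, yielding a code that is
    exactly invariant to that rotation group."""
    feats = []
    for img in images:
        rotations = [np.rot90(img, k).ravel() for k in range(4)]
        feats.append(np.mean(rotations, axis=0))
    return np.array(feats)

# Dimension-reduction layer followed by a classifier, mirroring the
# second and third layers of the three-layer model.
model = make_pipeline(PCA(n_components=8),
                      LogisticRegression(max_iter=1000))
```

Because the representation averages over the whole rotation group, a rotated input produces the identical feature vector, which is the kind of transparent, checkable invariance a glass-box layer is meant to expose.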