28 research outputs found

    DEVELOPMENT OF THE METHOD OF UNSUPERVISED TRAINING OF CONVOLUTIONAL NEURAL NETWORKS BASED ON NEURAL GAS MODIFICATION

    Technologies for computer analysis of visual information based on convolutional neural networks are widely used, but there is still a shortage of working algorithms for continuous unsupervised training and re-training of neural networks in real time, which limits their effectiveness under nonstationarity and a priori uncertainty. In addition, training multi-layer neural networks by backpropagation requires significant computational resources and large amounts of labeled training data, which makes such networks difficult to implement in autonomous systems with limited resources. One approach to reducing the computational complexity of deep machine learning and the risk of overfitting is to use neural gas principles to implement learning during the forward propagation of information, together with sparse coding to increase the compactness and informativeness of the feature representation. The paper considers the use of sparse coding neural gas for training ten layers of the VGG-16 neural network on a selection of data from the ImageNet database. It is also proposed that the effectiveness of feature extractor training be evaluated from the results of so-called information-extreme supervised machine learning of the output classifier. Information-extreme learning is based on the principles of population optimization methods for binary coding of observations and the construction of radial-basis decision rules, optimal according to an information criterion, in the binary Hamming space. The results of physical modeling show that unsupervised learning yields decision rules with an accuracy of up to 96.4%, which is below the 98.7% obtained with supervised learning. However, the absence of error backpropagation in the training algorithm motivates further research toward meta-optimization algorithms that refine the feature extractor's filters and the parameters of the unsupervised training algorithm.
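The abstract above builds on neural gas, so a minimal sketch of the classic rank-based neural gas update may help make the idea concrete. This is the textbook rule, not the sparse coding modification the paper describes; the function name `neural_gas_step` and the parameters `eps` (learning rate) and `lam` (neighborhood range) are illustrative assumptions.

```python
import math

def neural_gas_step(codebook, x, eps=0.1, lam=1.0):
    """Apply one classic neural gas update in place and return the codebook."""
    # Squared Euclidean distance from the input x to every codebook vector.
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in codebook]
    # Rank 0 goes to the closest vector, rank 1 to the next closest, and so on.
    order = sorted(range(len(codebook)), key=lambda i: dists[i])
    for rank, i in enumerate(order):
        # The step size decays exponentially with the distance rank.
        step = eps * math.exp(-rank / lam)
        codebook[i] = [wi + step * (xi - wi) for wi, xi in zip(codebook[i], x)]
    return codebook

# Hypothetical two-unit codebook and a single 2D input.
units = [[0.0, 0.0], [10.0, 10.0]]
neural_gas_step(units, [1.0, 1.0], eps=0.5, lam=1.0)
```

Every unit moves toward the input, but the closest one moves the most. No error signal is propagated backward, which is the property the abstract relies on.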

    Classification of Large-Scale High-Resolution SAR Images with Deep Transfer Learning

    The classification of large-scale high-resolution SAR land cover images acquired by satellites is a challenging task. Its difficulties include semantic annotation that requires expertise, data characteristics that change with imaging parameters or regional differences in the target area, and complex scattering mechanisms that differ from optical imaging. Given a large-scale SAR land cover dataset collected from TerraSAR-X images, with a hierarchical three-level annotation of 150 categories and more than 100,000 patches, three main challenges in automatically interpreting SAR images are addressed: highly imbalanced classes, geographic diversity, and label noise. In this letter, a deep transfer learning method is proposed based on a similarly annotated optical land cover dataset (NWPU-RESISC45). In addition, a top-2 smooth loss function with cost-sensitive parameters is introduced to tackle label noise and class imbalance. The proposed method transfers information efficiently from a similarly annotated remote sensing dataset, performs robustly on highly imbalanced classes, and alleviates the over-fitting caused by label noise. Moreover, the learned deep model generalizes well to other SAR-specific tasks, such as MSTAR target recognition, with a state-of-the-art classification accuracy of 99.46%.

    Vacant Parking Lot Information System Using Transfer Learning and IoT

    Parking information systems have become very important, especially in metropolitan areas, as they help to save time, effort and fuel when searching for parking. This paper offers a novel low-cost deep learning approach that makes it easy to implement vacancy detection at outdoor parking spaces with CCTV surveillance. The proposed method also addresses issues caused by perspective distortion in CCTV images. The architecture consists of three classifiers for checking the availability of parking spaces. They were developed on the TensorFlow platform by re-training a MobileNet model (a pre-trained Convolutional Neural Network (CNN)) using the transfer learning technique. A performance analysis showed 88% accuracy for vacancy detection. An end-to-end application model with Internet of Things (IoT) and an Android application is also presented. Users can interact with the cloud through the Android application to get real-time updates on parking space availability and the parking location. In the future, an autonomous car could use this system as a V2I (Vehicle to Infrastructure) application to decide on the nearest parking space.

    Comparing Data Augmentation Strategies for Deep Image Classification

    Currently, deep learning requires large volumes of training data to fit accurate models. In practice, however, there is often insufficient training data available and augmentation is used to expand the dataset. Historically, only simple forms of augmentation, such as cropping and horizontal flips, were used. More complex augmentation methods have recently been developed, but it is still unclear which techniques are most effective, and at what stage of the learning process they should be introduced. This paper investigates data augmentation strategies for image classification, including the effectiveness of different forms of augmentation, dependency on the number of training examples, and when augmentation should be introduced during training. The most accurate results in all experiments are achieved using random erasing, due to its ability to simulate occlusion. As expected, reducing the number of training examples significantly increases the importance of augmentation, but surprisingly the improvements in generalization from augmentation do not appear to result only from augmentation preventing overfitting. Results also indicate that a learning curriculum that injects augmentation after the initial learning phase has passed is more effective than the standard practice of using augmentation throughout, and that injecting it too late also reduces accuracy. We find that careful augmentation can improve accuracy by +2.83% to 95.85% using a ResNet model on CIFAR-10, with more dramatic improvements seen when there are fewer training examples. Source code is available at https://git.io/fjPP
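To illustrate the random erasing idea mentioned in the abstract above, here is a simplified sketch that overwrites one randomly placed rectangle in a 2D image. It is not the authors' code: the fixed `area_frac`, the roughly square rectangle, the constant `fill` value, and the function name `random_erase` are all assumptions made for brevity, whereas published variants typically sample the area and aspect ratio at random.

```python
import random

def random_erase(image, area_frac=0.25, fill=0, rng=None):
    """Return a copy of a 2D image (list of rows) with one random rectangle overwritten."""
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    # Each side scales with sqrt(area_frac), with a minimum of one pixel.
    eh = max(1, int(h * area_frac ** 0.5))
    ew = max(1, int(w * area_frac ** 0.5))
    # Choose the top-left corner so the rectangle stays inside the image.
    top = rng.randint(0, h - eh)
    left = rng.randint(0, w - ew)
    out = [row[:] for row in image]
    for r in range(top, top + eh):
        for c in range(left, left + ew):
            out[r][c] = fill
    return out

# Hypothetical 8x8 single-channel image filled with ones.
original = [[1] * 8 for _ in range(8)]
erased = random_erase(original, area_frac=0.25, rng=random.Random(0))
```

The erased patch hides part of the object, which is how this augmentation mimics occlusion; the input image itself is left unchanged.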

    The classification of skateboarding trick images by means of transfer learning and machine learning models

    The evaluation of trick execution in skateboarding is commonly carried out manually and subjectively. Panels of judges often rely on their prior experience to assess how well tricks are performed during skateboarding competitions. This way of classifying tricks is not considered practical, particularly for big competitions. Therefore, an objective and unbiased means of evaluating a skateboarder's tricks is non-trivial. This study aims at classifying flat ground tricks, namely Ollie, Kickflip, Pop Shove-it, Nollie Frontside Shove-it, and Frontside 180, through camera vision and a combination of Transfer Learning (TL) and Machine Learning (ML). An amateur skateboarder (23 years of age, with ± 5.0 years' experience) repeatedly executed five tricks of each type on an HZ skateboard on cemented ground, recorded by a YI action camera placed at a distance of 1.26 m. Features are extracted automatically from the images via 18 TL models. The extracted features are then fed into different tuned ML classifiers, namely Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Random Forest (RF). Grid search optimization with five-fold cross-validation was used to tune the hyperparameters of the classifiers evaluated. The data (722 images) was split into training, validation, and testing sets with a stratified ratio of 60:20:20, respectively. The study demonstrated that VGG16 + SVM and VGG19 + RF attained classification accuracy (CA) of 100% and 98%, respectively, on the test dataset, followed by VGG19 + k-NN and DenseNet201 + k-NN, which achieved a CA of 97%. To evaluate the robustness of the developed pipelines, independent testing was carried out on augmented images (2250 images). It was found that VGG16 + SVM, VGG19 + k-NN, and DenseNet201 + RF are able to yield reasonable average CA of 99%, 98%, and 97%, respectively. Based on the robustness evaluation, it can be ascertained that the VGG16 + SVM pipeline is able to classify the tricks exceptionally well. The present study therefore demonstrates that the proposed pipelines may help judges provide a more accurate evaluation of the tricks performed than the traditional method currently applied in competitions.
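The stratified 60:20:20 split described above can be sketched with the standard library alone. This is an illustrative sketch, not the study's code: the function name `stratified_split`, the rounding rule, and the evenly balanced example labels are assumptions (the abstract does not give per-class counts).

```python
import random
from collections import defaultdict

def stratified_split(labels, ratios=(0.6, 0.2, 0.2), seed=0):
    """Return (train, val, test) index lists with per-class proportions close to ratios."""
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * ratios[0])
        n_val = round(len(indices) * ratios[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        # Whatever remains after train and validation goes to the test set.
        test += indices[n_train + n_val:]
    return train, val, test

# Hypothetical balanced label list: 20 images for each of the five tricks.
tricks = ["Ollie", "Kickflip", "Pop Shove-it", "Nollie Frontside Shove-it", "Frontside 180"]
labels = [t for t in tricks for _ in range(20)]
train_idx, val_idx, test_idx = stratified_split(labels)
```

Splitting within each class keeps the class proportions similar across the three subsets, which matters when some tricks have fewer images than others.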

    Classification of skateboarding tricks by synthesizing transfer learning models and machine learning classifiers using different input signal transformations

    Skateboarding made its Olympic debut at the delayed Tokyo 2020 Olympic Games. Conventionally, in competition, scoring is done manually and subjectively by judges who observe the trick executions. However, the complexity of the manoeuvres makes scoring difficult and prone to human error and bias. Therefore, the aim of this study is to classify five skateboarding flat ground tricks: Ollie, Kickflip, Shove-it, Nollie and Frontside 180. This is achieved by using three optimized machine learning models, k-Nearest Neighbor (kNN), Random Forest (RF), and Support Vector Machine (SVM), on features extracted via eighteen transfer learning models. Six amateur skaters performed five tricks on a customized ORY skateboard. The raw data were extracted from the inertial measurement unit (IMU) embedded in the developed device attached to the skateboard. Four types of input images were generated from the IMU-based signals: Fast Fourier Transform (FFT), Continuous Wavelet Transform (CWT), Discrete Wavelet Transform (DWT), and synthesized raw image (RAW). The optimized classifiers were obtained by applying the GridSearch optimization technique to the training dataset with 3-fold cross-validation, using a 4:1:1 split for training, validation and testing, respectively, of 150 transformed images. The CWT and RAW images used with the MobileNet transfer learning model, coupled with the optimized SVM and RF classifiers, exhibited a test accuracy of 100%. To identify the best pipeline, computational time was used to compare the various models. It was concluded that the RAW-MobileNet-optimized-RF approach was the most effective one, with a computational time of 24.796875 seconds. The results of the study revealed that the proposed approach could improve the classification of skateboarding tricks.
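As a rough illustration of the frequency-domain transformation named in the abstract above, the sketch below computes the magnitude spectrum of a one-dimensional signal with a direct discrete Fourier transform, which gives the same values an FFT would, only more slowly. How the study turns such spectra into images is not stated in the abstract; the function name `magnitude_spectrum` and the synthetic test tone standing in for an IMU channel are assumptions.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Return |X[k]| for k = 0..N-1 using a direct O(N^2) discrete Fourier transform."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]

# Hypothetical IMU channel: a pure tone completing 3 cycles over 16 samples.
samples = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]
spectrum = magnitude_spectrum(samples)
```

For this tone the energy concentrates in bin 3 (and its mirror, bin 13), so a spectrum of this kind exposes the periodic structure of a movement that is harder to see in the raw samples.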