
    Data Efficient Learning: Towards Reducing Risk and Uncertainty of Data Driven Learning Paradigm

    The success of Deep Learning in various tasks depends heavily on large amounts of domain-specific annotated data, which are expensive to acquire and may contain varying degrees of noise. In this doctoral journey, our research goal is first to identify and then to tackle the data-related issues that cause significant performance degradation in real-world applications of Deep Learning algorithms. Human Activity Recognition from RGB data is challenging due to the lack of relative motion parameters. To address this issue, we propose a novel framework that introduces skeleton information from RGB data for activity recognition. Through experimentation, we demonstrate that our RGB-only solution surpasses state-of-the-art methods, all of which exploit RGB-D video streams, by a notable margin. The predictive uncertainty of Deep Neural Networks (DNNs) makes them unreliable for real-world deployment. Moreover, the available labeled data may contain noise. We aim to address these two issues holistically by proposing a unified density-driven framework, which can effectively denoise training data as well as avoid predicting on uncertain test data points. Our plug-and-play framework is easy to deploy in real-world applications while achieving superior performance over state-of-the-art techniques. To assess the effectiveness of our proposed framework in a real-world scenario, we experimented with x-ray images from COVID-19 patients. Supervised learning of DNNs inherits the limitation of a very narrow field of view in terms of known data distributions. Moreover, annotating data is costly. Hence, we explore self-supervised Siamese networks to avoid these constraints. Through extensive experimentation, we demonstrate that the self-supervised method performs surprisingly comparably to its supervised counterpart in a real-world use case. We also delve deeper with activation mapping and feature distribution visualization to understand the causality of this method.
    Through our research, we achieve a better understanding of issues relating to data-driven learning, solve some of the core problems of this paradigm, and expose some novel and intriguing research questions to the community.

    A New Supervised t-SNE with Dissimilarity Measure for Effective Data Visualization and Classification

    In this paper, a new version of the supervised t-SNE algorithm is proposed, which uses a dissimilarity measure related to class information. The proposed S-tSNE can be applied to any high-dimensional dataset for visualisation or as a feature extraction step for classification problems. In this study, the S-tSNE is applied to three datasets: MNIST, Chest x-ray, and SEER Breast Cancer. The two-dimensional data generated by the S-tSNE showed better visualization and an improvement in classification accuracy in comparison to the original t-SNE method. The results from k-NN classification models that used the lower-dimensional space generated by the new S-tSNE method showed more than 20% accuracy improvement on all three datasets compared with the t-SNE method. In addition, the classification accuracy using the S-tSNE for feature extraction was even higher than the classification accuracy obtained from the original high-dimensional data.
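The idea of a class-informed dissimilarity can be illustrated with a minimal sketch. The abstract does not give the actual S-tSNE formula, so the rule below (shrinking the Euclidean distance between same-class points by a fixed factor) is only an assumption for illustration; `supervised_dissimilarity` and `same_class_scale` are hypothetical names.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def supervised_dissimilarity(points, labels, same_class_scale=0.5):
    """Pairwise distance matrix in which same-class pairs are pulled closer.

    ASSUMPTION: scaling same-class distances by `same_class_scale` is an
    illustrative stand-in, not the measure defined in the paper.
    """
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = euclidean(points[i], points[j])
            if labels[i] == labels[j]:
                d *= same_class_scale
            matrix[i][j] = matrix[j][i] = d
    return matrix


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
labels = ["a", "a", "b"]
D = supervised_dissimilarity(points, labels)
```

A matrix like `D` could then be handed to any t-SNE implementation that accepts precomputed distances, so that the embedding reflects class membership as well as geometry.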

    A convolutional neural network based deep learning methodology for recognition of partial discharge patterns from high voltage cables

    It is a great challenge to differentiate partial discharge (PD) induced by different types of insulation defects in high-voltage cables. Some types of PD signals have very similar characteristics and are particularly difficult to differentiate, even for the most experienced specialists. To overcome this challenge, a convolutional neural network (CNN)-based deep learning methodology for PD pattern recognition is presented in this paper. First, PD testing for five types of artificial defects in ethylene-propylene-rubber cables is carried out in a high-voltage laboratory to generate signals containing PD data. Second, 3500 sets of PD transient pulses are extracted, and then 33 kinds of PD features are established. The third stage applies a CNN to the data; a typical CNN architecture and the key factors that affect the CNN-based pattern recognition accuracy are described. Factors discussed include the number of network layers, convolutional kernel size, activation function, and pooling method. This paper presents a flowchart of the CNN-based PD pattern recognition method and an evaluation with 3500 sets of PD samples. Finally, the CNN-based pattern recognition results are shown and the proposed method is compared with two more traditional analysis methods, i.e., support vector machine (SVM) and back propagation neural network (BPNN). The results show that the proposed CNN method has higher pattern recognition accuracy than SVM and BPNN, and that the novel method is especially effective for PD type recognition when signals are highly similar, which makes it suitable for industrial applications.
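The factors named in the abstract (kernel size, activation function, pooling method) map onto the three basic operations of one CNN stage. The following is a minimal sketch of such a stage on a 33-element feature vector. The kernel values, the ReLU activation, the pool size, and the toy input are assumptions chosen for illustration, not the configuration used in the paper.

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D convolution as used in CNNs (no kernel flip)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Rectified linear activation: negative responses become zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


# Hypothetical input: one sample with 33 PD features (toy values).
features = [float(i % 5) for i in range(33)]
kernel = [1.0, 0.0, -1.0]  # assumed kernel of size 3

activated = relu(conv1d(features, kernel))
pooled = max_pool(activated, 2)
```

With a kernel of size 3, the 33 inputs yield 31 responses, and pooling with size 2 reduces them to 15. Changing the kernel size or the pool size changes these lengths, which is one reason such factors influence the rest of the architecture.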

    Underwater target recognition method based on t-SNE and stacked nonnegative constrained denoising autoencoder

    Underwater target recognition is a difficult task due to the specific attributes of underwater target radiated noise, the low signal-to-noise ratio, and so on. In this paper, the input data optimization method and the recognition model were researched. The underwater target radiated noise spectrum was chosen as the original feature. The t-distributed stochastic neighbor embedding (t-SNE) algorithm was used to reduce the dimensionality of the original spectrum segments divided by frequency. The optimal features can be obtained by analyzing the separability. Then the stacked nonnegative constrained denoising autoencoder (SNDAE) model was established to recognize the optimal features. The experimental signal spectra were processed by the above methods. The results show that the recognition accuracy of SNDAE is higher than that of the other methods compared. Moreover, the input frequency band with the highest recognition accuracy is approximately the same as the band with the best separability based on t-SNE, indicating that the above method can improve recognition accuracy and efficiency.
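Choosing a frequency band by separability can be sketched as scoring each band's low-dimensional embedding and keeping the best one. The abstract does not say which separability measure was used, so the Fisher-style ratio below (between-class scatter divided by within-class scatter) is an assumption, and the two toy embeddings stand in for real t-SNE output.

```python
def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def separability(embedding, labels):
    """Between-class scatter over within-class scatter (higher is better).

    ASSUMPTION: this Fisher-style ratio is an illustrative choice, not
    necessarily the criterion used in the paper.
    """
    groups = {}
    for point, label in zip(embedding, labels):
        groups.setdefault(label, []).append(point)
    overall = centroid(embedding)
    between = sum(
        len(pts) * sq_dist(centroid(pts), overall) for pts in groups.values()
    )
    within = sum(
        sq_dist(p, centroid(pts)) for pts in groups.values() for p in pts
    )
    return between / within if within > 0 else float("inf")


labels = [0, 0, 1, 1]
# Hypothetical 2-D embeddings of the same samples from two frequency bands.
bands = {
    "band_a": [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)],
    "band_b": [(0.0, 0.0), (0.0, 2.0), (1.0, 0.0), (1.0, 2.0)],
}
scores = {name: separability(emb, labels) for name, emb in bands.items()}
best_band = max(scores, key=scores.get)
```

In this toy case the classes in `band_a` sit far apart relative to their spread, so it scores higher and would be the band passed on to the recognition model.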

    Using Skeleton Correction to Improve Flash Lidar-Based Gait Recognition

    This paper presents GlidarPoly, an efficacious pipeline of 3D gait recognition for flash lidar data based on pose estimation and robust correction of erroneous and missing joint measurements. A flash lidar can provide new opportunities for gait recognition through fast acquisition of depth and intensity data over an extended range of distances. However, flash lidar data are plagued by artifacts, outliers, noise, and sometimes missing measurements, which negatively affect the performance of existing analytics solutions. We present a filtering mechanism that corrects noisy and missing skeleton joint measurements to improve gait recognition. Furthermore, robust statistics are integrated with conventional feature moments to encode the dynamics of the motion. As a comparison, length-based and vector-based features extracted from the noisy skeletons are investigated for outlier removal. Experimental results illustrate the superiority of the proposed methodology in improving gait recognition given noisy, low-resolution flash lidar data.
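Correcting noisy and missing joint measurements can be sketched on a single joint coordinate tracked over time. The abstract does not describe GlidarPoly's actual filter, so the two steps below (linear interpolation of missing frames, then a sliding median to suppress outliers) are a generic stand-in, and the function names and sample track are hypothetical.

```python
from statistics import median


def fill_missing(track):
    """Replace None entries by linear interpolation between valid neighbors.

    Gaps at either end copy the nearest valid value.
    """
    valid = [i for i, v in enumerate(track) if v is not None]
    if not valid:
        raise ValueError("track has no valid measurements")
    filled = list(track)
    for i, value in enumerate(track):
        if value is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = track[right]
        elif right is None:
            filled[i] = track[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = track[left] + t * (track[right] - track[left])
    return filled


def median_filter(track, window=3):
    """Sliding median; the window is truncated at the track boundaries."""
    half = window // 2
    return [
        median(track[max(0, i - half): i + half + 1])
        for i in range(len(track))
    ]


# Hypothetical x-coordinate of one joint: one missing frame, one outlier.
raw = [1.0, None, 3.0, 50.0, 5.0, 6.0]
filled = fill_missing(raw)
smoothed = median_filter(filled, window=3)
```

The median is a simple example of a robust statistic: the single outlier of 50.0 does not survive the filter, whereas a moving average would smear it across neighboring frames.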

    Data analytic tool for clustering identification based on dimensionality reduction of frequency measurements

    © 2019 IEEE. This work presents a data analytic tool for clustering analysis based on Dimensionality Reduction (DR) of power system measurements. The proposed method is applied to frequency measurements of the ENTSO-E dynamic model of continental Europe, and the results are compared with other conventional DR approaches. After considerable reduction of the raw measurements, a phasor metric for identifying coherency groups of generators is proposed. The recommended measure stands out for its simple implementation, ease of interpretation, and fast computation. To illustrate the effectiveness of the clustering approach and the coherency of the metrics, a particular study case following the outage of a representative generation unit in France is presented.