
    Learning Representations Toward the Understanding of Out-of-Distribution for Neural Networks

    Data-driven representations achieve strong generalization performance across diverse information processing tasks. However, this generalization is often limited to test data drawn from the same distribution as the training data, called in-distribution (ID). Moreover, neural networks often make overconfident and incorrect predictions for data outside the training distribution, called out-of-distribution (OOD).

    In this dissertation, we develop representations that characterize OOD data for neural networks and use this characterization to generalize efficiently to OOD. We categorize data-driven representations based on information flow in neural networks and develop novel gradient-based representations. In particular, we use backpropagated gradients to represent what the network has not learned from the data. We comprehensively analyze the capability of gradient-based representations to characterize OOD in comparison with standard activation-based representations, and we apply a regularization technique to the gradient-based representations to further improve this characterization.

    Finally, we develop activation-based representations learned with auxiliary information to generalize efficiently to OOD data. Using an unsupervised learning framework, we learn aligned representations of visual and attribute data. These aligned representations are used to calibrate overconfident predictions toward ID classes, and the resulting generalization performance is validated on generalized zero-shot learning (GZSL). The developed GZSL method, GatingAE, achieves state-of-the-art performance in generalizing to OOD with significantly fewer model parameters than other state-of-the-art methods.
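The gradient-based idea in the abstract can be made concrete with a short sketch. The following PyTorch example scores an input by the norm of the backpropagated gradients of an autoencoder's reconstruction loss: a large gradient means the model would need a large update to fit the input, i.e., the input carries information the network has not learned. All names (TinyAE, gradient_ood_score) and the specific gradient-norm score are illustrative assumptions, not necessarily the dissertation's exact formulation.

# Hypothetical sketch: scoring out-of-distribution (OOD) inputs with the
# backpropagated gradients of an autoencoder's reconstruction loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAE(nn.Module):
    """A small fully connected autoencoder over flattened inputs."""
    def __init__(self, in_dim: int = 784, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, in_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

def gradient_ood_score(model: nn.Module, x: torch.Tensor) -> float:
    """Return the L2 norm of the reconstruction-loss gradients w.r.t. the
    model parameters; larger values suggest the input is farther from what
    the model has learned (a cue for OOD)."""
    model.zero_grad()
    loss = F.mse_loss(model(x), x)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyAE()
    x_id = torch.randn(1, 784) * 0.1   # stand-in for in-distribution data
    x_ood = torch.randn(1, 784) * 3.0  # stand-in for out-of-distribution data
    print("ID score :", gradient_ood_score(model, x_id))
    print("OOD score:", gradient_ood_score(model, x_ood))

In practice such a score would be thresholded on held-out ID data. The dissertation goes further than this sketch: it compares gradient-based representations against activation-based ones and regularizes the gradients to sharpen the OOD characterization.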