10 research outputs found

Ensemble of Loss Functions to Improve Generalizability of Deep Metric Learning methods

Author: Zabihzadeh Davood
Publication date: 02/07/2021

Deep Metric Learning (DML) learns a non-linear semantic embedding from input data that brings similar pairs together while keeping dissimilar data apart. To this end, many methods have been proposed in the last decade, with promising results in various applications. The success of a DML algorithm depends greatly on its loss function. However, no loss function is perfect; each addresses only some aspects of an optimal similarity embedding. Moreover, the generalizability of DML to categories unseen until the test stage is an important matter that existing loss functions do not consider. To address these challenges, we propose novel approaches that combine different losses built on top of a shared deep feature extractor. The proposed ensemble of losses forces the deep model to extract features that are consistent with all of the losses. Since the selected losses are diverse and each emphasizes different aspects of an optimal semantic embedding, our combining methods yield a considerable improvement over any individual loss and generalize well to unseen categories. There is no limitation on the choice of loss functions, and our methods can work with any set of existing ones. In addition, they can optimize each loss function as well as its weight in an end-to-end paradigm, with no need to adjust any hyper-parameter. We evaluate our methods on some popular datasets from the machine vision domain in conventional Zero-Shot-Learning (ZSL) settings. The results are very encouraging and show that our methods outperform all baseline losses by a large margin on all datasets.
Comment: 27 pages, 12 figures
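The idea of combining several losses over one shared embedding can be illustrated with a minimal sketch. The abstract does not state how the weights are parameterized, so everything here is an assumption made for illustration: the two losses chosen (contrastive and triplet), the softmax normalization of the weight logits, and all function names. It is not the paper's method.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(anchor, other, same_class, margin=1.0):
    """Pull same-class pairs together; push other pairs beyond the margin."""
    d = euclidean(anchor, other)
    return d ** 2 if same_class else max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Require the negative to be farther than the positive by the margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def softmax(logits):
    """Turn unconstrained weight logits into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_loss(losses, weight_logits):
    """Weighted sum of the individual losses (assumed combination rule)."""
    weights = softmax(weight_logits)
    return sum(w * l for w, l in zip(weights, losses))

# Toy embeddings, standing in for outputs of the shared feature extractor.
anchor, positive, negative = [0.0, 0.0], [0.3, 0.4], [0.6, 0.8]
losses = [
    contrastive_loss(anchor, positive, True) + contrastive_loss(anchor, negative, False),
    triplet_loss(anchor, positive, negative),
]
combined = ensemble_loss(losses, weight_logits=[0.0, 0.0])  # equal weights
print(combined)
```

In an end-to-end setting, the weight logits would be trainable parameters updated together with the network. Note that a plain softmax weighting like this one can collapse onto the smallest loss, so a real implementation would need some safeguard against that.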

arXiv.org e-Print Archive

MergedNET: A simple approach for one-shot learning in siamese networks based on similarity layers

Author: Atanbori John
Rose Samuel
Publication venue: Elsevier BV
Publication date: 14/10/2022

Classifiers trained on disjoint classes with few labelled data points are used in one-shot learning to identify visual concepts from other classes. Recently, Siamese networks and similarity layers have been used to solve the one-shot learning problem, achieving state-of-the-art performance on visual-character recognition datasets. Various techniques have been developed over the years to improve the performance of these networks on fine-grained image classification datasets. These have focused primarily on improving the loss and activation functions, augmenting visual features, employing multiscale metric learning, and pre-training and fine-tuning the backbone network. We investigate similarity layers for one-shot learning tasks and propose two frameworks for combining these layers into a MergedNet network. On all four datasets used in our experiments, MergedNet outperformed the baselines in classification accuracy, and it generalises to other datasets when trained on miniImageNet.
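To make the notion of combining similarity layers concrete, here is a small sketch. The abstract does not describe the two MergedNet frameworks, so the merge rule used here (concatenating three element-wise similarity layers and scoring them with one linear unit and a sigmoid) is an assumption, as are the hand-picked weights and all function names.

```python
import math

def l1_layer(u, v):
    """Element-wise absolute difference between two embeddings."""
    return [abs(a - b) for a, b in zip(u, v)]

def l2_layer(u, v):
    """Element-wise squared difference between two embeddings."""
    return [(a - b) ** 2 for a, b in zip(u, v)]

def product_layer(u, v):
    """Element-wise product between two embeddings."""
    return [a * b for a, b in zip(u, v)]

def merged_similarity(u, v, weights, bias=0.0):
    """Concatenate the similarity layers and score them with a sigmoid unit."""
    merged = l1_layer(u, v) + l2_layer(u, v) + product_layer(u, v)
    score = sum(w * m for w, m in zip(weights, merged)) + bias
    return 1.0 / (1.0 + math.exp(-score))

def one_shot_classify(query, support, weights):
    """Return the label of the single support example most similar to the query."""
    return max(support, key=lambda label: merged_similarity(query, support[label], weights))

# One support embedding per class (the one-shot setting); values are toy data.
support = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
query = [0.9, 0.1]
# Hypothetical weights: penalise differences, reward agreement.
weights = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0]
predicted = one_shot_classify(query, support, weights)
print(predicted)
```

In a trained Siamese network, the embeddings would come from the twin backbone and the scoring weights would be learned rather than set by hand.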

University of Lincoln Institutional Repository

Development of Techniques to Maximize the Use of Information for Deep Learning-Based Fault Diagnosis

Author: 김명연
Publication venue: Seoul National University Graduate School
Publication date: 01/08/2021

Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Department of Mechanical and Aerospace Engineering, August 2021. 윤병동.
Unexpected failures of mechanical systems can lead to substantial social and financial losses in many industries.
In order to detect and prevent sudden failures and to enhance the reliability of mechanical systems, significant research efforts have been made to develop data-driven fault diagnosis techniques. The purpose of fault diagnosis techniques is to detect and identify the occurrence of abnormal behaviors in the target mechanical systems as early as possible. Recently, deep learning (DL) based fault diagnosis approaches, including the convolutional neural network (CNN) method, have shown remarkable fault diagnosis performance, thanks to their autonomous feature learning ability. Still, several issues remain to be solved in the development of robust and industry-applicable deep learning-based fault diagnosis techniques. First, by stacking the neural network architectures deeper, enriched hierarchical features can be learned, and therefore, improved performance can be achieved. However, due to inefficiency in the gradient information flow and overfitting problems, training becomes more difficult as the model gets deeper. Next, to develop a fault diagnosis model with high performance, it is necessary to obtain sufficient labeled data. However, for mechanical systems that operate in real-world environments, it is not easy to obtain sufficient data and label information. Consequently, novel methods that address these issues should be developed to improve the performance of deep learning-based fault diagnosis techniques.
This dissertation research investigated three research thrusts aimed toward maximizing the use of information to improve the performance of deep learning-based fault diagnosis techniques, specifically: 1) study of the deep learning structure to enhance the gradient information flow within the architecture, 2) study of a robust and discriminative feature learning method under insufficient and noisy data conditions based on parameter transfer and triplet loss, and 3) investigation of a domain adaptation based fault diagnosis method that propagates the label information across different domains. The first research thrust suggests an advanced CNN-based architecture to improve the gradient information flow within the deep learning model. By directly connecting the feature maps of different layers, the diagnosis model can be trained efficiently thanks to enhanced information flow. In addition, the dimension reduction module can also increase the training efficiency by significantly reducing the number of trainable parameters. The second research thrust suggests a parameter transfer and metric learning based fault diagnosis method. The proposed approach facilitates robust and discriminative feature learning to enhance fault diagnosis performance under insufficient and noisy data conditions. The pre-trained model trained using abundant source domain data is transferred and used to develop a robust fault diagnosis method. Moreover, a semi-hard triplet loss function is adopted to learn features with high separability according to the class labels. Finally, the last research thrust proposes a label information propagation strategy to increase the fault diagnosis performance in the unlabeled target domain. The label information obtained from the source domain is transferred and utilized for developing fault diagnosis methods in the target domain.
Simultaneously, the newly devised semantic clustering loss is applied at multiple feature levels to learn discriminative, domain-invariant features. As a result, features that are not only semantically well-clustered but also domain-invariant can be effectively learned.

Table of contents:
Chapter 1 Introduction
1.1 Motivation
1.2 Research Scope and Overview
1.3 Dissertation Layout
Chapter 2 Technical Background and Literature Review
2.1 Fault Diagnosis Techniques for Mechanical Systems
2.1.1 Fault Diagnosis Techniques
2.1.2 Deep Learning Based Fault Diagnosis Techniques
2.2 Transfer Learning
2.3 Metric Learning
2.4 Summary and Discussion
Chapter 3 Direct Connection Based Convolutional Neural Network (DC-CNN) for Fault Diagnosis
3.1 Directly Connected Convolutional Module
3.2 Dimension Reduction Module
3.3 Input Vibration Image Generation
3.4 DC-CNN-Based Fault Diagnosis Method
3.5 Experimental Studies and Results
3.5.1 Experiment and Data Description
3.5.2 Compared Methods
3.5.3 Diagnosis Performance Results
3.5.4 The Number of Trainable Parameters
3.5.5 Visualization of the Learned Features
3.5.6 Robustness of Diagnosis Performance
3.6 Summary and Discussion
Chapter 4 Robust and Discriminative Feature Learning for Fault Diagnosis Under Insufficient and Noisy Data Conditions
4.1 Parameter transfer learning
4.2 Robust Feature Learning Based on the Pre-trained model
4.3 Discriminative Feature Learning Based on the Triplet loss
4.4 Robust and Discriminative Feature Learning for Fault Diagnosis
4.5 Experimental Studies and Results
4.5.1 Experiment and Data Description
4.5.2 Compared Methods
4.5.3 Experimental Results Under Insufficient Data Conditions
4.5.4 Experimental Results Under Noisy Data Conditions
4.6 Summary and Discussion
Chapter 5 A Domain Adaptation with Semantic Clustering (DASC) Method for Fault Diagnosis
5.1 Unsupervised Domain Adaptation
5.2 CNN-based Diagnosis Model
5.3 Learning of Domain-invariant Features
5.4 Domain Adaptation with Semantic Clustering
5.5 Proposed DASC-based Fault Diagnosis Method
5.6 Experimental Studies and Results
5.6.1 Experiment and Data Description
5.6.2 Compared Methods
5.6.3 Scenario I: Different Operating Conditions
5.6.4 Scenario II: Different Rotating Machinery
5.6.5 Analysis and Discussion
5.7 Summary and Discussion
Chapter 6 Conclusion
6.1 Contributions and Significance
6.2 Suggestions for Future Research
References
Abstract in Korean
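The semi-hard triplet loss mentioned in the abstract can be sketched as follows. The abstract does not give the dissertation's mining procedure, so this follows the common definition: a negative is semi-hard when it is farther from the anchor than the positive, but still inside the margin. Choosing the nearest such negative, the margin value, and the function names are all assumptions made for illustration.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def semi_hard_negative(anchor, positive, negatives, margin):
    """Pick the nearest negative with d(a, p) < d(a, n) < d(a, p) + margin."""
    d_ap = distance(anchor, positive)
    candidates = [n for n in negatives if d_ap < distance(anchor, n) < d_ap + margin]
    if not candidates:
        return None
    return min(candidates, key=lambda n: distance(anchor, n))

def semi_hard_triplet_loss(anchor, positive, negatives, margin=0.5):
    """Triplet loss on the mined semi-hard negative; 0 if none qualifies."""
    negative = semi_hard_negative(anchor, positive, negatives, margin)
    if negative is None:
        return 0.0
    # For a semi-hard negative this value always lies strictly between 0 and margin.
    return distance(anchor, positive) - distance(anchor, negative) + margin

# Toy features: one hard, one semi-hard, and one easy negative.
anchor, positive = [0.0, 0.0], [1.0, 0.0]
negatives = [[0.5, 0.0], [1.2, 0.0], [3.0, 0.0]]
loss = semi_hard_triplet_loss(anchor, positive, negatives, margin=0.5)
print(loss)
```

Here only the negative at distance 1.2 qualifies. The one at 0.5 is closer than the positive (a hard negative), and the one at 3.0 already lies beyond the margin and would contribute no loss.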

SNU Open Repository and Archive

Learning from Very Few Samples: A Survey

Author: Gong Pinghua
Lu Jiang
Ye Jieping
Zhang Changshui
Publication date: 12/09/2020

Few sample learning (FSL) is significant and challenging in the field of machine learning. The capability of learning and generalizing successfully from very few samples is a notable demarcation between artificial intelligence and human intelligence: humans can readily form an understanding of something novel from just a single example or a handful of examples, whereas machine learning algorithms typically need hundreds or thousands of supervised samples to guarantee generalization ability. Despite a long history dating back to the early 2000s and widespread attention in recent years with booming deep learning technologies, few surveys or reviews of FSL are available until now. In this context, we extensively review 300+ papers on FSL spanning from the 2000s to 2019 and provide a timely and comprehensive survey of FSL. In this survey, we review the evolution history as well as the current progress of FSL, categorize FSL approaches in principle into generative model based and discriminative model based kinds, and place particular emphasis on meta learning based FSL approaches. We also summarize several recently emerging extensional topics of FSL and review the latest advances on these topics. Furthermore, we highlight the important FSL applications covering many research hotspots in computer vision, natural language processing, audio and speech, reinforcement learning and robotics, data analysis, etc. Finally, we conclude the survey with a discussion of promising trends, in the hope of providing guidance and insights for follow-up research.
Comment: 30 pages

arXiv.org e-Print Archive