Search CORE

2,724 research outputs found

A Literature Review of Fault Diagnosis Based on Ensemble Learning

Author: Cao Tianya
Chen Kairan
Deng Xiaofei
Dong Xiaohui
Jaber Tareq
Mian Zhibao
Tian Yuzhu
Publication venue: Elsevier
Publication date: 10/11/2023
Field of study

The accuracy of fault diagnosis is an important indicator to ensure the reliability of key equipment systems. Ensemble learning integrates different weak learning methods to obtain stronger learning and has achieved remarkable results in the field of fault diagnosis. This paper reviews the recent research on ensemble learning from both technical and field application perspectives. The paper summarizes 87 journals in recent web of science and other academic resources, with a total of 209 papers. It summarizes 78 different ensemble learning based fault diagnosis methods, involving 18 public datasets and more than 20 different equipment systems. In detail, the paper summarizes the accuracy rates, fault classification types, fault datasets, used data signals, learners (traditional machine learning or deep learning-based learners), ensemble learning methods (bagging, boosting, stacking and other ensemble models) of these fault diagnosis models. The paper uses accuracy of fault diagnosis as the main evaluation metrics supplemented by generalization and imbalanced data processing ability to evaluate the performance of those ensemble learning methods. The discussion and evaluation of these methods lead to valuable research references in identifying and developing appropriate intelligent fault diagnosis models for various equipment. This paper also discusses and explores the technical challenges, lessons learned from the review and future development directions in the field of ensemble learning based fault diagnosis and intelligent maintenance

Repository@Hull - Worktribe

Recommended from our members

Learning from Limited Labeled Data for Visual Recognition

Author: Su Jong-Chyi
Publication venue: ScholarWorks@UMass Amherst
Publication date: 20/10/2021
Field of study

Recent advances in computer vision are in part due to the widespread use of deep neural networks. However, training deep networks require enormous amounts of labeled data which can be a bottleneck. In this thesis, we propose several approaches to mitigate this in the context of modern deep networks and computer vision tasks. While transfer learning is an effective strategy for natural image tasks where large labeled datasets such as ImageNet are available, it is less effective for distant domains such as medical images and 3D shapes. Chapter 2 focuses on transfer learning from natural image representations to other modalities. In many cases, cross-modal data can be generated using computer graphics techniques. By forcing the agreement of predictions across modalities, we show that the models are more robust to image degradation, such as lower resolution, grayscale, or line drawings instead of color images in high-resolution. Similarly, we show that 3D shape classifiers learned from multi-view images can be transferred to the models of voxel or point cloud representations. Another line of work has focused on techniques for few-shot learning. In particular, meta-learning approaches explicitly aim to generalize representations by emphasizing transferability to novel tasks. In Chapter 3, we analyze how to improve these techniques by exploiting unlabeled data from related tasks. We show that combining unsupervised objectives with meta-learning objectives can boost the performance of novel tasks. However, we find that small amounts of domain-specific data can be more beneficial than large amounts of generic data. While transfer learning, unsupervised learning, and few-shot learning have been studied in isolation, in practice, one often finds that transfer learning from large labeled datasets is more effective than others. This is partly due to a lack of evaluation on benchmarks that contains challenges such as class imbalance and domain mismatch. In Chapter 4, we explore the role of expert models in the context of semi-supervised learning on a realistic benchmark. Unlike existing semi-supervised benchmarks, our dataset is designed to expose some of the challenges encountered in a realistic setting, such as the fine-grained similarity between classes, significant class imbalance, and domain mismatch between the labeled and unlabeled data. We show that current semi-supervised methods are negatively affected by out-of-class data, and their performance pales compared to a transfer learning baseline. Last, we leverage the coarse labels from a large collection of images to improve semi-supervised learning. In Chapter 5, we show that incorporating hierarchical labels in the taxonomy improves state-of-the-art semi-supervised methods

ScholarWorks@UMass Amherst

Continual learning from stationary and non-stationary data

Author: Korycki Lukasz
Publication venue: VCU Scholars Compass
Publication date: 01/01/2022
Field of study

Continual learning aims at developing models that are capable of working on constantly evolving problems over a long-time horizon. In such environments, we can distinguish three essential aspects of training and maintaining machine learning models - incorporating new knowledge, retaining it and reacting to changes. Each of them poses its own challenges, constituting a compound problem with multiple goals. Remembering previously incorporated concepts is the main property of a model that is required when dealing with stationary distributions. In non-stationary environments, models should be capable of selectively forgetting outdated decision boundaries and adapting to new concepts. Finally, a significant difficulty can be found in combining these two abilities within a single learning algorithm, since, in such scenarios, we have to balance remembering and forgetting instead of focusing only on one aspect. The presented dissertation addressed these problems in an exploratory way. Its main goal was to grasp the continual learning paradigm as a whole, analyze its different branches and tackle identified issues covering various aspects of learning from sequentially incoming data. By doing so, this work not only filled several gaps in the current continual learning research but also emphasized the complexity and diversity of challenges existing in this domain. Comprehensive experiments conducted for all of the presented contributions have demonstrated their effectiveness and substantiated the validity of the stated claims

VCU Scholars Compass

Game Theory Solutions in Sensor-Based Human Activity Recognition: A Review

Author: Masoumi Behrooz
Sharokhzadeh Behrooz
Shayesteh Mohammad Hossein
Publication venue
Publication date: 09/11/2023
Field of study

The Human Activity Recognition (HAR) tasks automatically identify human activities using the sensor data, which has numerous applications in healthcare, sports, security, and human-computer interaction. Despite significant advances in HAR, critical challenges still exist. Game theory has emerged as a promising solution to address these challenges in machine learning problems including HAR. However, there is a lack of research work on applying game theory solutions to the HAR problems. This review paper explores the potential of game theory as a solution for HAR tasks, and bridges the gap between game theory and HAR research work by suggesting novel game-theoretic approaches for HAR problems. The contributions of this work include exploring how game theory can improve the accuracy and robustness of HAR models, investigating how game-theoretic concepts can optimize recognition algorithms, and discussing the game-theoretic approaches against the existing HAR methods. The objective is to provide insights into the potential of game theory as a solution for sensor-based HAR, and contribute to develop a more accurate and efficient recognition system in the future research directions

arXiv.org e-Print Archive

New Methods For Domain Adaptation And Low Data Deep Learning

Author: Chaudhary Muawiz
Publication venue
Publication date: 14/07/2023
Field of study

Real-world data coming from settings like hospital collections for detecting disease experience multiple sources of distributional shifts. These issues affect the performance of diagnostic methods, reducing the quality of service provided and leading to health or economic harm. Deep learning has emerged as a promising method for classification tasks, including diagnostics, and recent progress has led to methods that allow a neural network to adapt network statistics to shifts in specific settings at test time. However, problems arise in these methods adapting to general shifts and domains. In addition, they underperform when data is limited. In our first contribution, we tackle general domain shifts by investigating the key issues leading Test Time Adaptive algorithms to fail under label shift, proposing a means for mitigating these failures. In the second contribution, we tackle few-shot cross-domain adaptation by modifying the affine parameters of the batch norm during few-shot train time, generally enhancing performance. The third contribution parameterizes Scattering Networks, where we enhance a method for low data regimes by providing problem-specific adaptation

Concordia University Research Repository

When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions

Author: Chen Chen
Lyu Lingjuan
Zhuang Weiming
Publication venue
Publication date: 27/06/2023
Field of study

The intersection of the Foundation Model (FM) and Federated Learning (FL) provides mutual benefits, presents a unique opportunity to unlock new possibilities in AI research, and address critical challenges in AI and real-world applications. FL expands the availability of data for FMs and enables computation sharing, distributing the training process and reducing the burden on FL participants. It promotes collaborative FM development, democratizing the process and fostering inclusivity and innovation. On the other hand, FM, with its enormous size, pre-trained knowledge, and exceptional performance, serves as a robust starting point for FL, facilitating faster convergence and better performance under non-iid data. Additionally, leveraging FM to generate synthetic data enriches data diversity, reduces overfitting, and preserves privacy. By examining the interplay between FL and FM, this paper aims to deepen the understanding of their synergistic relationship, highlighting the motivations, challenges, and future directions. Through an exploration of the challenges faced by FL and FM individually and their interconnections, we aim to inspire future research directions that can further enhance both fields, driving advancements and propelling the development of privacy-preserving and scalable AI systems

arXiv.org e-Print Archive

Distributionally Robust Classification on a Data Budget

Author: Feuer Benjamin
Hegde Chinmay
Joshi Ameya
Pham Minh
Publication venue
Publication date: 07/08/2023
Field of study

Real world uses of deep learning require predictable model behavior under distribution shifts. Models such as CLIP show emergent natural distributional robustness comparable to humans, but may require hundreds of millions of training samples. Can we train robust learners in a domain where data is limited? To rigorously address this question, we introduce JANuS (Joint Annotations and Names Set), a collection of four new training datasets with images, labels, and corresponding captions, and perform a series of carefully controlled investigations of factors contributing to robustness in image classification, then compare those results to findings derived from a large-scale meta-analysis. Using this approach, we show that standard ResNet-50 trained with the cross-entropy loss on 2.4 million image samples can attain comparable robustness to a CLIP ResNet-50 trained on 400 million samples. To our knowledge, this is the first result showing (near) state-of-the-art distributional robustness on limited data budgets. Our dataset is available at \url{https://huggingface.co/datasets/penfever/JANuS_dataset}, and the code used to reproduce our experiments can be found at \url{https://github.com/penfever/vlhub/}.Comment: TMLR 2023; openreview link: https://openreview.net/forum?id=D5Z2E8CNs

arXiv.org e-Print Archive

Large-Scale Music Genre Analysis and Classification Using Machine Learning with Apache Spark

Author: Chaudhury M.
Chaudhury M.
Ghazanfar M. A.
Ghazanfar M. A.
Karami A.
Karami A.
Publication venue: 'MDPI AG'
Publication date: 01/01/2022
Field of study

The trend for listening to music online has greatly increased over the past decade due to the number of online musical tracks. The large music databases of music libraries that are provided by online music content distribution vendors make music streaming and downloading services more accessible to the end-user. It is essential to classify similar types of songs with an appropriate tag or index (genre) to present similar songs in a convenient way to the end-user. As the trend of online music listening continues to increase, developing multiple machine learning models to classify music genres has become a main area of research. In this research paper, a popular music dataset GTZAN which contains ten music genres is analysed to study various types of music features and audio signals. Multiple scalable machine learning algorithms supported by Apache Spark, including naïve Bayes, decision tree, logistic regression, and random forest, are investigated for the classification of music genres. The performance of these classifiers is compared, and the random forest performs as the best classifier for the classification of music genres. Apache Spark is used in this paper to reduce the computation time for machine learning predictions with no computational cost, as it focuses on parallel computation. The present work also demonstrates that the perfect combination of Apache Spark and machine learning algorithms reduces the scalability problem of the computation of machine learning predictions. Moreover, different hyperparameters of the random forest classifier are optimized to increase the performance efficiency of the classifier in the domain of music genre classification. The experimental outcome shows that the developed random forest classifier can establish a high level of performance accuracy, especially for the mislabelled, distorted GTZAN dataset. This classifier has outperformed other machine learning classifiers supported by Apache Spark in the present work. The random forest classifier manages to achieve 90% accuracy for music genre classification compared to other work in the same domain

UEL Research Repository at University of East London