
    Labelling of Annotated Condition Monitoring Data Through Technical Language Processing

    We propose a novel approach, technical language labelling, to facilitate supervised intelligent fault diagnosis on unlabelled but annotated industry datasets using technical language processing. Condition monitoring (CM) is vital for high safety and resource efficiency in the green transition and digital transformation of the process industry. Computerised maintenance systems are required to facilitate CM scalability, and learning-based Intelligent Fault Diagnosis (IFD) methods are required to automate maintenance decisions and improve support for human analysts. A major challenge is the lack of labelled datasets from industry and the difficulty of transferring features from labelled lab datasets to unlabelled industry datasets. In this study, we investigate how the fault description annotations and maintenance work orders present in many CM datasets can be understood and used for IFD through Technical Language Processing, based on insights from recent advances in Natural Language Supervision joint pre-training of images and captions. We identify two distinct pipelines, one based on pre-training on large datasets, and one based on a human-centric approach and unsupervised clustering methods to transform annotations into labels, aided by insights from dimensionality reduction and visualisation techniques. Finally, we showcase one example of the small-data fault classification implementation on a CM industry dataset with a Sentence BERT model and conventional signal processing methods. Sets of features are used to overcome data imbalance and label misalignment, and we show that our model can separate sets of cable and sensor fault recordings from sets of bearing-related fault recordings with an F1-score of 92.6%. To our knowledge, this is the first system to create labels for CM data through pre-trained language models without requiring pre-defined taxonomies. 
This work is supported by the Strategic innovation program Process industrial IT and Automation (PiIA), a joint investment of Vinnova, Formas and the Swedish Energy Agency, reference number 2019-02533.

    Labelling of Annotated Condition Monitoring Data Through Technical Language Processing

    We propose a novel approach to facilitate supervised fault diagnosis on unlabelled but annotated industry datasets using human-centric technical language processing and weak supervision. Fault diagnosis through Condition Monitoring (CM) is vital for high safety and resource efficiency in the green transition and digital transformation of the process industry. Learning-based Intelligent Fault Diagnosis (IFD) methods are required to automate maintenance decisions and improve decision support for analysts. A major challenge is the lack of labelled industry datasets, limiting supervised IFD research to lab datasets. However, features learned from lab environments generalise poorly to field environments due to different signal distributions, artificial induction or acceleration of lab faults, and lab set-up properties such as average frequency profiles affecting learned features. In this study, we investigate how the unstructured free text fault annotations and maintenance work orders that are present in many industrial CM systems can be used for IFD through technical language processing, based on recent advances in natural language supervision. We introduce two distinct pipelines, one based on contrastive pre-training on large datasets, and one based on a small-data human-centric approach with unsupervised clustering methods. Finally, we showcase one example of the small-data fault classification implementation on a CM industry dataset with a SentenceBERT language model, kMeans clustering, and conventional signal processing methods. Fault class imbalance and time-shift uncertainty are overcome with weak supervision through aggregates of features, and human-centric clustering is used to integrate technical knowledge with the annotation-based fault classes. We show that our model can separate cable and sensor fault recordings from bearing-related fault recordings with an F1-score of 93.
To our knowledge, this is the first system to classify faults in field industry CM data based only on associated unstructured fault annotations. Funder: Process industrial IT and Automation (PiIA) (2019-02533); Full text license: CC BY; This paper has previously appeared as a manuscript in a thesis; ISBN for host publication: 978-1-936263-29-5; KnowIT FAS
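The idea of turning free-text fault annotations into labels by clustering can be illustrated with a minimal sketch. It is not the authors' implementation. The annotation strings are hypothetical, a normalised bag-of-words vector stands in for the SentenceBERT embedding, and the k-means routine is a plain textbook version with a deterministic farthest-point initialisation chosen here only to keep the example reproducible.

```python
import math
from collections import Counter


def embed(texts):
    """Stand-in for a sentence encoder: L2-normalised bag-of-words vectors.

    Assumption: the abstract uses SentenceBERT; this simpler embedding is
    swapped in so the sketch runs with the standard library only.
    """
    vocab = sorted({word for text in texts for word in text.lower().split()})
    vectors = []
    for text in texts:
        counts = Counter(text.lower().split())
        vec = [float(counts[word]) for word in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical annotations, invented for illustration only.
annotations = [
    "cable fault on sensor",
    "sensor cable replaced",
    "loose sensor cable",
    "bearing damage outer ring",
    "bearing replaced after damage",
    "outer ring bearing fault",
]

labels = kmeans(embed(annotations), k=2)
for text, label in zip(annotations, labels):
    print(label, text)
```

In this toy setting the cluster index assigned to each annotation plays the role of a weak label for the associated recording. With a real sentence encoder, annotations with similar meaning but no shared vocabulary could also be grouped, which a bag-of-words stand-in cannot do.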