
    The Application of Data Analytics Technologies for the Predictive Maintenance of Industrial Facilities in Internet of Things (IoT) Environments

    In industrial production environments, the maintenance of equipment has a decisive influence on costs and on the plannability of production capacities. Unplanned failures during production times in particular cause high costs, unplanned downtimes and possibly additional collateral damage. Predictive Maintenance starts here and tries to predict a possible failure and its cause early enough that its prevention can be prepared and carried out in time. In order to predict malfunctions and failures, the industrial plant with its characteristics, as well as its wear and ageing processes, must be modelled. Such modelling can be done by replicating the plant's physical properties, but this is very complex and requires extensive expert knowledge about the plant and about the wear and ageing processes of each individual component. Neural networks and machine learning make it possible to train such models from data and offer an alternative, especially when very complex and non-linear behaviour is involved. For models to make predictions, as much data as possible about the condition of a plant, its environment, and production planning is needed. In Industrial Internet of Things (IIoT) environments, the amount of available data is constantly increasing: intelligent sensors and highly interconnected production facilities produce a steady stream of data. The sheer volume of data, but also the steady stream in which it is transmitted, place high demands on data processing systems. If a participating system wants to perform live analyses on the incoming data streams, it must be able to process the incoming data at least as fast as the continuous stream delivers it; otherwise, the system falls further and further behind in its processing and thus in its analyses. This also applies to Predictive Maintenance systems, especially if they use complex and computationally intensive machine learning models. If sufficiently scalable hardware resources are available, this may not be a problem at first. However, if this is not the case, or if processing takes place on decentralised units with limited hardware resources (e.g. edge devices), the runtime behaviour and resource requirements of the type of neural network used become an important criterion.
    This thesis addresses Predictive Maintenance systems in IIoT environments using neural networks and Deep Learning, where runtime behaviour and resource requirements are relevant. The question is whether it is possible to achieve better runtimes with similar result quality using a new type of neural network. The focus is on reducing the complexity of the network and improving its parallelisability. Inspired by projects in which complexity was distributed to less complex neural subnetworks by upstream measures, two hypotheses presented in this thesis emerged: a) the distribution of complexity into simpler subnetworks leads to faster processing overall, despite the overhead this creates, and b) a deeper internal structure of a neural cell leads to a less complex network. Within the framework of a qualitative study, an overall impression of Predictive Maintenance applications in IIoT environments using neural networks was developed. Based on the findings, a novel model layout named Sliced Long Short-Term Memory Neural Network (SlicedLSTM) was developed. The SlicedLSTM implements the assumptions made in the aforementioned hypotheses in its inner model architecture.
    Within the framework of a quantitative study, the runtime behaviour of the SlicedLSTM was compared with that of a reference model in laboratory tests. The study uses synthetically generated data from a NASA project for predicting failures of aircraft gas turbine modules. The dataset contains 1,414 multivariate time series with 104,897 samples of test data and 160,360 samples of training data. For the specific application and the data used, it could be shown that the SlicedLSTM delivers faster processing times with similar result accuracy and thus clearly outperforms the reference model in this respect. The hypotheses about the influence of complexity in the internal structure of the neural cells were confirmed by the study carried out in the context of this thesis.
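    The exact internal architecture of the SlicedLSTM is not given in the abstract. As a purely illustrative sketch of hypothesis (a), and not the thesis's implementation, the following code splits the input sensor channels across several small, independently runnable LSTMs whose final states are merged for the failure prediction; the feature count, number of slices, and classification head are placeholders.

        # Hypothetical sketch of the "slicing" idea: several small LSTMs, one per
        # feature slice, instead of one large LSTM over all sensor channels.
        import torch
        import torch.nn as nn

        class SlicedLSTMSketch(nn.Module):
            def __init__(self, n_features=24, n_slices=4, hidden_per_slice=16, n_classes=2):
                super().__init__()
                assert n_features % n_slices == 0
                self.slice_width = n_features // n_slices
                self.slices = nn.ModuleList(
                    [nn.LSTM(self.slice_width, hidden_per_slice, batch_first=True)
                     for _ in range(n_slices)]
                )
                self.head = nn.Linear(n_slices * hidden_per_slice, n_classes)

            def forward(self, x):                          # x: (batch, time, n_features)
                parts = x.split(self.slice_width, dim=2)   # one slice of channels per sub-LSTM
                finals = [lstm(p)[0][:, -1, :] for lstm, p in zip(self.slices, parts)]
                return self.head(torch.cat(finals, dim=1))

        model = SlicedLSTMSketch()
        scores = model(torch.randn(8, 50, 24))             # 8 sequences, 50 steps, 24 sensors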

    Unveiling the frontiers of deep learning: innovations shaping diverse domains

    Deep learning (DL) enables the development of computer models that are capable of learning, visualizing, optimizing, refining, and predicting data. In recent years, DL has been applied in a range of fields, including audio-visual data processing, agriculture, transportation prediction, natural language, biomedicine, disaster management, bioinformatics, drug design, genomics, face recognition, and ecology. To explore the current state of deep learning, it is necessary to investigate the latest developments and applications of deep learning in these disciplines. However, the literature is lacking in exploring the applications of deep learning across all potential sectors. This paper therefore extensively investigates the potential applications of deep learning across all major fields of study, as well as the associated benefits and challenges. As evidenced in the literature, DL offers accurate prediction and analysis, making it a powerful computational tool, and it is able to organize and optimize itself, making it effective at processing raw data. At the same time, deep learning requires massive amounts of data for effective analysis and processing. To handle the challenge of compiling huge amounts of medical, scientific, healthcare, and environmental data for use in deep learning, gated architectures such as LSTMs and GRUs can be utilized. For multimodal learning, shared neurons in the neural network for all activities and specialized neurons for particular tasks are necessary. Comment: 64 pages, 3 figures, 3 tables
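    The last sentence describes hard parameter sharing for multimodal or multi-task learning. As a minimal, assumed illustration (the layer sizes and the two tasks are invented for the example), a shared trunk provides the "neurons for all activities" and separate heads provide the "specialized neurons for particular tasks":

        # Minimal hard parameter-sharing sketch: one shared trunk, task-specific heads.
        import torch
        import torch.nn as nn

        class SharedTrunkMultiTask(nn.Module):
            def __init__(self, in_dim=128, shared_dim=64, n_classes_a=10, n_classes_b=3):
                super().__init__()
                self.shared = nn.Sequential(nn.Linear(in_dim, shared_dim), nn.ReLU())
                self.head_a = nn.Linear(shared_dim, n_classes_a)   # e.g. one modality's labels
                self.head_b = nn.Linear(shared_dim, n_classes_b)   # e.g. another task's labels

            def forward(self, x):
                h = self.shared(x)                      # neurons shared by all tasks
                return self.head_a(h), self.head_b(h)   # specialized per-task neurons

        logits_a, logits_b = SharedTrunkMultiTask()(torch.randn(4, 128))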

    On the effectiveness of speech self-supervised learning for music

    Self-supervised learning (SSL) has shown promising results in various speech and natural language processing applications. However, its efficacy in music information retrieval (MIR) remains largely unexplored. While previous SSL models pre-trained on music recordings may have been mostly closed-source, recent speech models such as wav2vec 2.0 have shown promise in music modelling. Nevertheless, research exploring the effectiveness of applying speech SSL models to music recordings has been limited. We explore the music adaptation of SSL with two distinctive speech-related models, data2vec 1.0 and HuBERT, and refer to them as music2vec and musicHuBERT, respectively. We train 12 SSL models with 95M parameters under various pre-training configurations and systematically evaluate their performance on 13 different MIR tasks. Our findings suggest that training with music data can generally improve performance on MIR tasks, even when models are trained using paradigms designed for speech. However, we identify the limitations of such existing speech-oriented designs, especially in modelling polyphonic information. Based on the experimental results, empirical suggestions are also given for designing future musical SSL strategies and paradigms.
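    As context for how speech SSL models are commonly evaluated on MIR tasks, the following is a hedged probing sketch rather than the authors' pipeline: a frozen wav2vec 2.0 backbone provides features that are mean-pooled over time and fed to a small task head, e.g. for genre classification. The checkpoint name, pooling choice, and 10-class head are assumptions, and the snippet requires the Hugging Face transformers library and downloads the checkpoint on first use.

        # Probing sketch: frozen speech-SSL features + a linear head for an MIR task.
        import torch
        import torch.nn as nn
        from transformers import Wav2Vec2Model

        ssl = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
        ssl.eval()                                      # keep the SSL backbone frozen

        probe = nn.Linear(ssl.config.hidden_size, 10)   # e.g. 10 genre classes

        audio = torch.randn(1, 16000 * 5)               # 5 s of 16 kHz audio (placeholder)
        with torch.no_grad():
            hidden = ssl(audio).last_hidden_state       # (batch, frames, hidden_size)
        logits = probe(hidden.mean(dim=1))              # mean-pool over time, then classify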

    Multiclass classification of electroencephalogram signals for motor imagery tasks using statistical signal processing and deep learning

    Research Interests: Efficient classification of electroencephalogram (EEG) signals is crucial for the development of brain-computer interface systems. However, the complexity and variability of EEG signals pose significant challenges for accurate classification. This study also has social relevance, as it can contribute to the development of assistive brain-computer interfaces, benefiting individuals with severe motor impairments, such as those who have experienced a stroke. These interfaces have the potential to improve the quality of life of these individuals by enabling communication and device control through brain activity. Objectives: This study aimed to compare the performance and computational cost of an artificial neural network using different signal processing techniques for the classification of resting state and left/right wrist movement imagination states from EEG signals. Three statistical signal processing techniques, Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Singular Spectrum Analysis (SSA), were explored in conjunction with a Convolutional Neural Network (CNN) to enhance the classification of EEG signals. Results Obtained: The results revealed that the PCA technique led to a reduction in training time of up to 63.5% without significantly compromising classification accuracy. PCA proved to be a promising approach, capturing relevant information from the EEG signals and improving the CNN's ability to classify accurately. In contrast, neither ICA nor SSA yielded promising results: ICA had negative effects on feature extraction, resulting in decreased classification accuracy by the CNN, while SSA showed consistently low performance across all evaluated metrics, indicating challenges in capturing discriminative information from the motor imagery EEG signals.
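    As a purely illustrative sketch of the PCA-before-CNN pipeline compared in the study (the channel count, window length, component count, and network are assumptions, not the study's actual configuration), PCA first reduces the EEG channel dimension and a small 1D CNN then performs the three-class classification:

        # Illustrative PCA + 1D-CNN pipeline for motor-imagery EEG classification.
        import numpy as np
        import torch
        import torch.nn as nn
        from sklearn.decomposition import PCA

        X = np.random.randn(200, 64, 160)            # 200 epochs, 64 channels, 160 samples
        pca = PCA(n_components=16)
        # Treat every time sample of every epoch as one observation over the 64 channels.
        X_red = pca.fit_transform(X.transpose(0, 2, 1).reshape(-1, 64))
        X_red = X_red.reshape(200, 160, 16).transpose(0, 2, 1)   # (epochs, components, time)

        cnn = nn.Sequential(
            nn.Conv1d(16, 32, kernel_size=5), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, 3),                        # rest / left wrist / right wrist imagery
        )
        logits = cnn(torch.tensor(X_red, dtype=torch.float32))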

    Improving diagnostic procedures for epilepsy through automated recording and analysis of patients’ history

    Transient loss of consciousness (TLOC) is a time-limited state of profound cognitive impairment characterised by amnesia, abnormal motor control, loss of responsiveness, a short duration and complete recovery. Most instances of TLOC are caused by one of three health conditions: epilepsy, functional (dissociative) seizures (FDS), or syncope. There is often a delay before the correct diagnosis is made, and 10-20% of individuals initially receive an incorrect diagnosis. Clinical decision tools based on the endorsement of TLOC symptom lists have been limited to distinguishing between two causes of TLOC. The Initial Paroxysmal Event Profile (iPEP) has shown promise, but was demonstrated to have greater accuracy in distinguishing syncope from epilepsy or FDS than in distinguishing epilepsy from FDS. The objective of this thesis was to investigate whether interactional, linguistic, and communicative differences in how people with epilepsy and people with FDS describe their experiences of TLOC can improve the predictive performance of the iPEP. An online web application was designed that collected information about TLOC symptoms and medical history from patients and witnesses using a binary questionnaire and verbal interaction with a virtual agent (VA). We explored potential methods of automatically detecting these communicative differences, whether the differences were present during an interaction with a VA, to what extent these automatically detectable communicative differences improve the performance of the iPEP, and the acceptability of the application from the perspective of patients and witnesses. Two feature sets, one designed to measure formulation effort and one to detect semantic differences between the two groups, were applied to recordings of previous doctor-patient interactions and predicted the diagnosis with accuracies of 71% and 81%, respectively. Individuals with epilepsy or FDS provided descriptions of TLOC to the VA that were qualitatively similar to those observed in previous research. Both feature sets were effective predictors of the diagnosis when applied to the web application recordings (85.7% and 85.7%). Overall, the accuracy of machine learning models trained for the three-way classification between epilepsy, FDS, and syncope using the iPEP responses collected from patients through the web application was worse than the performance observed in previous research (65.8% vs 78.3%), but the performance was increased by the inclusion of features extracted from the spoken descriptions of TLOC (85.5%). Finally, most participants who provided feedback reported that the online application was acceptable. These findings suggest that it is feasible to differentiate between people with epilepsy and people with FDS using an automated analysis of spoken seizure descriptions. Furthermore, incorporating these features into a clinical decision tool for TLOC can improve the predictive performance by improving the differential diagnosis between these two health conditions. Future research should use the feedback to improve the design of the application and increase the perceived acceptability of the approach.
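    As a hedged illustration of the final modelling step (the thesis's exact features, model family, and validation scheme are not specified in the abstract, so everything below uses assumed placeholder data), questionnaire items and speech-derived features can be concatenated and passed to a multinomial classifier for the three-way diagnosis:

        # Sketch: combine binary questionnaire items with speech-derived features
        # for a three-way epilepsy / FDS / syncope classifier.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        n = 100
        questionnaire = np.random.randint(0, 2, size=(n, 30))   # binary symptom items
        speech_feats = np.random.randn(n, 12)                   # e.g. formulation-effort measures
        y = np.random.randint(0, 3, size=n)                     # epilepsy / FDS / syncope labels

        X = np.hstack([questionnaire, speech_feats])
        clf = LogisticRegression(max_iter=1000)
        print(cross_val_score(clf, X, y, cv=5).mean())          # cross-validated accuracy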

    A Comprehensive Review of Data-Driven Co-Speech Gesture Generation

    Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation and is considered an enabling technology in film, games, virtual social spaces, and for interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models that benefit from the growing availability of data. This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule-based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text, and non-linguistic input. We also chronicle the evolution of the related training datasets in terms of size, diversity, motion quality, and collection method. Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human-like motion; grounding the gesture in the co-occurring speech, in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling these key challenges, as well as the limitations of the approaches, and point toward areas of future development. Comment: Accepted for EUROGRAPHICS 202

    Speech emotion recognition system using deep learning techniques

    The aim is to design a speech emotion recognition system using deep learning techniques to support the diagnosis of depression in patients who visit a psychologist. This degree project focuses on the creation of a "Speech Emotion Recognition System using Deep Learning Techniques". It is based on Artificial Intelligence, in particular on supervised learning with artificial neural networks, which can be used to predict emotions. The need for such a system arises from its potential use in psychology to help detect depressive disorders. To achieve the stated objectives, a waterfall methodology will be employed.

    A method for recognizing human emotional states in images

    The master's thesis describes the development of a method for recognizing human emotional states in images. The method is based on the use of a neural network with a special attention module. The developed method allows an input image of a person to be classified into one of the emotion classes. As a practical component, a software prototype that demonstrates the method's operation was implemented. The prototype was created using the Python programming language and its corresponding libraries.
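    The abstract does not specify which attention mechanism is used, so the following is only an assumed illustration of one common choice: a squeeze-and-excitation style channel-attention block that re-weights CNN feature maps before emotion classification.

        # Illustrative channel-attention module (squeeze-and-excitation style).
        import torch
        import torch.nn as nn

        class ChannelAttention(nn.Module):
            def __init__(self, channels, reduction=8):
                super().__init__()
                self.gate = nn.Sequential(
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(channels, channels // reduction), nn.ReLU(),
                    nn.Linear(channels // reduction, channels), nn.Sigmoid(),
                )

            def forward(self, feats):                  # feats: (batch, channels, H, W)
                w = self.gate(feats).unsqueeze(-1).unsqueeze(-1)
                return feats * w                       # re-weight channels by learned attention

        attn = ChannelAttention(64)
        out = attn(torch.randn(2, 64, 28, 28))         # same shape as the input features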

    Artificial Intelligence for Cognitive Health Assessment: State-of-the-Art, Open Challenges and Future Directions

    The subjectivity and inaccuracy of in-clinic Cognitive Health Assessments (CHA) have led many researchers to explore ways to automate the process to make it more objective and to meet the needs of the healthcare industry. Artificial Intelligence (AI) and machine learning (ML) have emerged as the most promising approaches to automate the CHA process. In this paper, we explore the background of CHA and the extensive research recently undertaken in this domain to provide a comprehensive survey of the state-of-the-art. In particular, a careful selection of significant works published in the literature is reviewed to describe a range of enabling technologies and AI/ML techniques used for CHA, including conventional supervised and unsupervised machine learning, deep learning, reinforcement learning, natural language processing, and image processing techniques. Furthermore, we provide an overview of various means of data acquisition and the benchmark datasets. Finally, we discuss open issues and challenges in using AI and ML for CHA, along with some possible solutions. In summary, this paper presents CHA tools, lists various data acquisition methods for CHA, surveys the relevant technological advancements and the use of AI for CHA, and discusses open issues and challenges in the CHA domain. We hope this first-of-its-kind survey will contribute significantly to identifying research gaps in the complex and rapidly evolving interdisciplinary field of mental health.

    Deep learning algorithms and their relevance: A review

    Nowadays, the most revolutionary area in computer science is deep learning algorithms and models. This paper discusses deep learning and various supervised, unsupervised, and reinforcement learning models. An overview is provided of Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Self-Organizing Maps (SOM), Restricted Boltzmann Machines (RBM), Deep Belief Networks (DBN), Generative Adversarial Networks (GAN), autoencoders, Gated Recurrent Units (GRU), and Bidirectional LSTMs. Various deep-learning application areas are also discussed. ChatGPT, currently the most prominent example, can understand natural language and respond to needs in various ways; it is built using supervised and reinforcement learning techniques. Additionally, the limitations of deep learning are discussed. This paper provides a snapshot of deep learning.