Interpretability-oriented data-driven modelling of bladder cancer via computational intelligence

Author: De Alejandro Montalvo Julio Cesar
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 13/02/2015
White Rose E-theses Online

Detecting Persuasion Attempts on Social Networks: Unearthing the Potential of Loss Functions and Text Pre-Processing in Imbalanced Data Settings

Author: Saias José
Teimas Rúben
Publication venue: MDPI - Electronics
Publication date: 29/10/2023
Repositório Científico da Universidade de Évora

An investigation into machine pattern recognition based on time-frequency image feature extraction using a support vector machine

Author: Li Hongkun
Zhang Zhixin
Zhou Peilin
Publication date: 01/04/2010
This article presents a new method of pattern recognition for machine working conditions, based on time-frequency image (TFI) feature extraction and support vector machines (SVMs). The Hilbert time-frequency spectrum (HTFS) is used to construct TFIs because of its good performance in non-stationary and non-linear signal analysis. Cyclostationarity signal analysis is used as a pre-processing step to improve the performance of the HTFS in constructing TFIs. Feature extraction for TFIs is investigated in detail: the gravity centre and information entropy of the TFIs are used to construct the feature vector for pattern recognition. SVMs then classify the different working conditions from the constructed feature vector, because they perform well even with small samples. Finally, rolling bearing pattern recognition is used as an example to demonstrate the effectiveness of the method. The analysis of the results suggests that the method will contribute to the development of preventative maintenance.
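The two TFI features named in the abstract above can be illustrated with a minimal sketch. It assumes the TFI is a plain 2-D grid of non-negative energies with rows as frequency bins and columns as time bins. The function name `tfi_features` and the choice of Shannon entropy in bits are illustrative assumptions, not details taken from the article.

```python
import math

def tfi_features(image):
    """Return (time centre, frequency centre, entropy) of a non-negative 2-D grid.

    Hypothetical helper: rows are assumed to be frequency bins, columns time bins.
    """
    total = sum(sum(row) for row in image)
    if total <= 0:
        raise ValueError("image must contain positive energy")
    # Normalise energies so they can be treated as a probability distribution.
    probs = [[v / total for v in row] for row in image]
    # Gravity centre: energy-weighted mean of the row and column indices.
    freq_centre = sum(i * p for i, row in enumerate(probs) for p in row)
    time_centre = sum(j * p for row in probs for j, p in enumerate(row))
    # Shannon entropy in bits; empty cells contribute nothing.
    entropy = -sum(p * math.log2(p) for row in probs for p in row if p > 0)
    return time_centre, freq_centre, entropy
```

A feature vector built this way could then be passed to any SVM implementation. Energy concentrated in a few cells gives low entropy, while evenly spread energy gives high entropy.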

University of Strathclyde Institutional Repository

Automatic detection of persuasion attempts on social networks

Author: Teimas Rúben José Ferreira
Publication venue: 'Universidade de Evora'
Publication date: 05/09/2023
The rise of social networks and the increasing amount of time people spend on them have created a perfect place for the dissemination of false narratives, propaganda, and manipulated content. To prevent the spread of disinformation, content moderation is needed; however, doing it manually is infeasible because of the large number of daily posts. This dissertation aims to solve this problem by creating a system for the automatic detection of persuasion techniques, as proposed in a SemEval challenge. We start by reviewing classic machine learning and natural language processing approaches, then move on to more sophisticated deep learning approaches that are better suited to this type of complex problem. The classic machine learning approaches are used to create a baseline for the problem. The proposed architecture, using deep learning techniques, is built on top of a DistilBERT transformer followed by Convolutional Neural Networks. We study how different loss functions, text pre-processing, freezing DistilBERT layers, and hyperparameter search affect the performance of our system. We found that we could optimise our architecture by freezing the first two DistilBERT layers and using asymmetric loss to tackle the class imbalance in the dataset presented. This study resulted in three final models with the same architecture but different parameters: the first showed signs of overfitting, one showed no signs of overfitting but did not seem to converge, and another seemed to converge but yielded the worst performance of the three. They achieved micro F1-scores of 0.551, 0.526 and 0.509 and placed 3rd, 6th and 11th, respectively, in the overall table.
The models can only classify textual elements, as the multimodal component is discussed but not implemented in this iteration. Summary (translated from Portuguese): Automatic detection of persuasion attempts on social networks. The growth of social networks and the increasing time people spend on them have created a perfect place for the dissemination of false narratives, propaganda, and manipulated content. To prevent the spread of disinformation, content moderation is necessary, but doing it manually is infeasible because of the large volume of daily content. This dissertation aims to solve this problem by creating a system for the automatic detection of persuasion techniques, as proposed in a SemEval challenge. We begin by reviewing classic machine learning and natural language processing approaches, then move on to more sophisticated deep learning approaches that are better suited to this type of complex problem. The classic machine learning approaches are used to create a baseline for the problem. The proposed architecture, using deep learning techniques, is built on a DistilBERT transformer followed by convolutional neural networks. We study how the use of different activation functions, text pre-processing, freezing DistilBERT layers, and hyperparameter search affect the performance of our system. We found that we could optimise our architecture by freezing the first two DistilBERT layers and using asymmetric loss to deal with the class imbalance in the dataset presented. This study resulted in three final models with the same architecture but different parameters: the first showed signs of overfitting, one showed no signs of overfitting but does not appear to converge, and another appears to converge but produced the worst performance of the three.
They achieved micro F1-scores of 0.551, 0.526 and 0.509 and placed 3rd, 6th and 11th, respectively, in the overall table. The models can only classify textual elements, since the multimodal component is not implemented in this iteration, only discussed.
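The abstract above credits an asymmetric loss with handling class imbalance. The sketch below assumes this means the commonly used multi-label formulation, in which negatives are down-weighted by a focusing exponent and a probability margin. The function name, the default values of `gamma_pos`, `gamma_neg` and `margin`, and the pure-Python form are illustrative assumptions, not the dissertation's settings or code.

```python
import math

def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses for predicted probabilities.

    Hypothetical sketch: probs are sigmoid outputs in [0, 1], targets are 0 or 1.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive label: focal-style term with its own exponent.
            total -= (1 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin so that
            # very easy negatives contribute exactly zero loss.
            p_m = max(p - margin, 0.0)
            total -= p_m ** gamma_neg * math.log(max(1 - p_m, eps))
    return total
```

With a larger `gamma_neg` than `gamma_pos`, the many easy negatives in an imbalanced dataset contribute little, so the rarer positive labels carry more of the gradient.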

Repositório Científico da Universidade de Évora

Discretisation of conditions in decision rules induced for continuous

Author: Baron Grzegorz
Stańczyk Urszula
Zielosko Beata
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2020
Discretisation procedures are typically implemented as part of the initial pre-processing of data, before knowledge mining is employed. This means that conclusions and observations are based on reduced data, since discretisation usually discards some information. The paper presents a different approach, in which discretisation is executed after data mining. In the described study, decision rules were first induced from real-valued features. Second, the data sets were discretised. In the third step, the categories found for the attributes were used to translate the conditions in the inferred rules into the discrete domain. The properties and performance of the rule classifiers were tested in the domain of stylometric analysis of texts, where writing styles were defined through quantitative attributes of a continuous nature. The experiments show that the proposed processing leads to sets of rules with significantly reduced sizes while maintaining the quality of predictions, and allows many data discretisation methods to be tested at acceptable computational cost.
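The third step described above, translating a real-valued rule condition into the discrete domain, might look roughly like the following. The abstract does not give the paper's actual translation procedure, so this is only an assumed simplification: the threshold is replaced by the index of the interval it falls in, given cut points produced by some discretiser. The names `to_bin` and `discretise_condition` are hypothetical.

```python
from bisect import bisect_left

def to_bin(value, cut_points):
    """Index of the interval containing value.

    cut_points must be sorted; interval k is (cut_points[k-1], cut_points[k]].
    """
    return bisect_left(cut_points, value)

def discretise_condition(attribute, operator, threshold, cut_points):
    """Rewrite (attribute, operator, real threshold) over interval indices."""
    return (attribute, operator, to_bin(threshold, cut_points))

# Example: cut points 2.0 and 4.0 define three intervals with indices 0, 1, 2.
rule = discretise_condition("avg_sentence_length", "<=", 3.7, [2.0, 4.0])
```

Because the rules are induced once and only their conditions are re-expressed, several sets of cut points can be tried without repeating the rule induction.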

Directory of Open Access Journals

Repozytorium Uniwersytetu Śląskiego RE-BUŚ

Explainable AI for Machine Fault Diagnosis: Understanding Features' Contribution in Machine Learning Models for Industrial Condition Monitoring

Author: Brusa Eugenio
Cibrario Luca
Delprete Cristiana
Di Maggio Luigi Gianpio
Publication venue: MDPI
Publication date: 01/01/2023
Although the effectiveness of machine learning (ML) for machine diagnosis has been widely established, the interpretation of the diagnosis outcomes is still an open issue. Machine learning models behave as black boxes; therefore, the contribution of each selected feature to the diagnosis is not transparent to the user. This work investigates the capabilities of the SHapley Additive exPlanation (SHAP) to identify the most important features for fault detection and classification in condition monitoring programs for rotating machinery. The authors analyse the case of medium-sized bearings of industrial interest. Namely, vibration data were collected for different health states from the test rig for industrial bearings available at the Mechanical Engineering Laboratory of Politecnico di Torino. The Support Vector Machine (SVM) and k-Nearest Neighbour (kNN) diagnosis models are explained by means of the SHAP. Accuracies higher than 98.5% are achieved for both models using the SHAP as a criterion for feature selection. It is found that the skewness and the shape factor of the vibration signal have the greatest impact on the models’ outcomes.
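The two features singled out above have standard definitions, sketched below for a vibration signal given as a list of samples. The sketch assumes the population (biased) form of skewness and the usual shape factor, RMS divided by the mean absolute value. The abstract does not state which estimators the authors used, and the function names are illustrative.

```python
import math

def skewness(signal):
    """Third standardised moment, using population (biased) moments."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m3 = sum((x - mean) ** 3 for x in signal) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for a constant signal")
    return m3 / m2 ** 1.5

def shape_factor(signal):
    """Root-mean-square value divided by the mean absolute value."""
    n = len(signal)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    mean_abs = sum(abs(x) for x in signal) / n
    if mean_abs == 0:
        raise ValueError("shape factor is undefined for an all-zero signal")
    return rms / mean_abs
```

A signal with occasional large positive impacts has positive skewness, and the shape factor rises above 1 as the waveform becomes more peaked.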

Directory of Open Access Journals

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Mini-Workshop: Deep Learning and Inverse Problems

Publication venue: Zürich : EMS Publ. House
Publication date: 01/01/2018
Machine learning, and in particular deep learning, offers several data-driven methods to amend the typical shortcomings of purely analytical approaches. The mathematical research on these combined models is presently exploding on the experimental side but still lacking from the theoretical point of view. This workshop addresses the challenge of developing a solid mathematical theory for analyzing deep neural networks for inverse problems.

Repositorium für Naturwissenschaften und Technik

Unsupervised Doppler Radar Based Activity Recognition for e-Healthcare

Author: Jing Yanguo
Karayaneva Yordanka
Li Wenda
Sharifzadeh Sara
Tan Bo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021
Passive radio frequency (RF) sensing and monitoring of human daily activities in elderly care homes is an emerging topic. Micro-Doppler radars are an appealing solution considering their non-intrusiveness, deep penetration, and high-distance range. Unsupervised activity recognition using Doppler radar data has not received attention, in spite of its importance for unlabelled or poorly labelled activities in real scenarios. This study proposes two unsupervised feature extraction methods for human activity monitoring using Doppler streams: a local Discrete Cosine Transform (DCT)-based method and a local entropy-based method. In addition, Convolutional Variational Autoencoder (CVAE) feature extraction is applied to Doppler radar data for the first time. The three feature extraction architectures are compared with the previously used Convolutional Autoencoder (CAE) and with linear feature extraction based on Principal Component Analysis (PCA) and 2DPCA. Unsupervised clustering is performed using K-Means and K-Medoids. The results show the superiority of the DCT-based method, the entropy-based method, and CVAE features over CAE, PCA, and 2DPCA, with gains of more than 5%-20% in average accuracy. With regard to computation time, the two proposed methods are noticeably faster than the existing CVAE. Furthermore, for high-dimensional data visualisation, three manifold learning techniques are considered. The methods are compared for the projection of raw data as well as the encoded CVAE features. All three methods show an improved visualisation ability when applied to the encoded CVAE features.
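The idea of a local DCT-based feature can be sketched under strong simplifying assumptions: a 1-D signal is cut into non-overlapping windows, an unnormalised DCT-II is taken of each window, and only the first few coefficients are kept. The window layout, the number of coefficients, and the function names are illustrative guesses. The abstract does not describe the paper's actual procedure, which operates on Doppler data.

```python
import math

def dct2(x):
    """Unnormalised DCT-II of a sequence."""
    n = len(x)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k) for i, v in enumerate(x))
        for k in range(n)
    ]

def local_dct_features(signal, window, keep):
    """Keep the first `keep` DCT coefficients of each non-overlapping window."""
    features = []
    for start in range(0, len(signal) - window + 1, window):
        features.append(dct2(signal[start:start + window])[:keep])
    return features
```

The resulting per-window vectors could then be concatenated and passed to a clustering routine such as K-Means.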

arXiv.org e-Print Archive

UCL Discovery

Coventry University Pure Portal

Trepo - Institutional Repository of Tampere University

University of Dundee Online Publications