1,243 research outputs found

    Schemes Based on Federated Learning for Decentralized Training in Machine Learning Models

    Standard Machine Learning approaches require large amounts of data, usually centralized in data centers. In these approaches, a single device is responsible for the whole training process. New collaborative approaches allow common models to be trained across different decentralized devices, each one holding local data samples. An example is Federated Learning. In recent years, along with the blooming of Machine Learning based applications and services, ensuring data privacy and security has become a critical obligation. In this work, three training procedures based on Federated Learning were tested: FedAvg, FedADA, and LoADABoost, comparing their performance against a traditional centralized training method. Using public information from written reviews about movies, a neural network algorithm was implemented. The objective of the model was to predict whether a review is positive or negative. Using the F1 Score as the performance metric, the hypothesis to validate was that Federated Learning training methods perform similarly to traditional centralized training methodologies. After implementing the same neural network with the different training methodologies, no major differences in performance were noted, leading to the conclusion that Federated Learning is indeed a similar and viable training methodology
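As an illustration of the aggregation step behind FedAvg and of the F1 metric mentioned in this abstract, the sketch below averages client parameter vectors weighted by local sample counts and computes a binary F1 score. It is a minimal sketch, not the code from this work: the function names `fed_avg` and `f1_score`, the flat-list representation of model parameters, and the toy values are assumptions made for illustration.

```python
def fed_avg(client_weights, client_sizes):
    """Aggregate client parameters as a sample-size-weighted average.

    client_weights: list of equal-length lists of floats (hypothetical
                    flattened model parameters, one list per client).
    client_sizes:   number of local training samples held by each client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def f1_score(y_true, y_pred):
    """Binary F1 score, with label 1 treated as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: two clients, the second holding three times as much data.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
score = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```

In a full training loop this aggregation would run once per communication round, after each client has trained locally starting from the previous global parameters.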

    Automatic information search for countering covid-19 misinformation through semantic similarity

    Master's Thesis in Bioinformatics and Computational Biology. Information quality in social media is an increasingly important issue, and the misinformation problem has become even more critical in the current COVID-19 pandemic, leaving people exposed to false and potentially harmful claims and rumours. Civil society organizations, such as the World Health Organization, have called for global action to promote access to health information and mitigate harm from health misinformation. Consequently, this project aims to counter the spread of the COVID-19 infodemic and its potential health hazards. In this work, we give an overview of the models and methods that have been employed in the NLP field, from its foundations to the latest state-of-the-art approaches. Focusing on deep learning methods, we propose applying multilingual Transformer models based on siamese networks, also called bi-encoders, combined with ensemble and PCA dimensionality reduction techniques. The goal is to counter COVID-19 misinformation by analyzing the semantic similarity between a claim and tweets from a collection gathered from official fact-checkers verified by the International Fact-Checking Network of the Poynter Institute. The number of Internet users increases every year, and the language spoken determines access to information online. For this reason, we make a special effort to apply multilingual models to tackle misinformation across the globe. Regarding semantic similarity, we first evaluate these multilingual ensemble models and improve the result on the STS-Benchmark compared to monolingual and single models. Second, we enhance the interpretability of the models' performance through the SentEval toolkit. Lastly, we compare these models' performance against biomedical models in TREC-COVID task round 1, using the BM25 Okapi ranking method as the baseline. Moreover, we are interested in understanding the ins and outs of misinformation. For that purpose, we extend interpretability using machine learning and deep learning approaches for sentiment analysis and topic modelling. Finally, we developed a dashboard to ease visualization of the results. In our view, the results obtained in this project constitute an excellent initial step toward incorporating multilingualism and will assist researchers and the public in countering COVID-19 misinformation
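The claim-to-tweet matching described in this abstract can be pictured as ranking tweet embeddings by cosine similarity to a claim embedding. The sketch below assumes the embeddings have already been produced by some bi-encoder; the toy two-dimensional vectors and the names `cosine` and `rank_by_similarity` are hypothetical and are not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_u * norm_v)


def rank_by_similarity(claim_vec, tweet_vecs):
    """Return (tweet_index, score) pairs, most similar tweet first."""
    scores = [(i, cosine(claim_vec, v)) for i, v in enumerate(tweet_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings standing in for bi-encoder outputs (assumed values).
claim = [1.0, 0.0]
tweets = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranked = rank_by_similarity(claim, tweets)
```

With real sentence embeddings the same ranking applies unchanged; only the vector dimensionality and the source of the vectors differ.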

    ML-Based User Authentication Through Mouse Dynamics

    Increasing reliance on digital services and the limitations of traditional authentication methods have necessitated the development of more advanced and secure user authentication methods. For user authentication and intrusion detection, mouse dynamics, a form of behavioral biometrics, offers a promising and non-invasive method. This paper presents a comprehensive study on ML-Based User Authentication Through Mouse Dynamics. This project proposes a novel framework integrating sophisticated techniques such as embeddings extraction using Transformer models with cutting-edge machine learning algorithms such as Recurrent Neural Networks (RNN). The project aims to accurately identify users based on their distinct mouse behavior and detect unauthorized access by utilizing the hybrid models. Using a mouse dynamics dataset, the proposed framework’s performance is evaluated, demonstrating its efficacy in accurately identifying users and detecting intrusions. In addition, a comparative analysis with existing methodologies is provided, highlighting the enhancements made by the proposed framework. This paper contributes to the development of more secure, reliable, and user-friendly authentication systems that leverage the power of machine learning and behavioral biometrics, ultimately augmenting the privacy and security of digital services and resources

    Leveraging Deep Learning and Online Source Sentiment for Financial Portfolio Management

    Financial portfolio management describes the task of distributing funds and conducting trading operations on a set of financial assets, such as stocks, index funds, foreign exchange or cryptocurrencies, aiming to maximize the profit while minimizing the loss incurred by said operations. Deep Learning (DL) methods have consistently excelled at various tasks, and automated financial trading is one of the most complex of those. This paper aims to provide insight into various DL methods for financial trading, under both the supervised and reinforcement learning schemes. At the same time, taking into consideration sentiment information regarding the traded assets, we discuss and demonstrate their usefulness through corresponding research studies. Finally, we discuss problems commonly found in training such financial agents and equip the reader with the knowledge necessary to avoid these problems and apply the discussed methods in practice

    Understanding the Role of Nonverbal Tokens in the Spread of Online Information

    Individuals and society continue to suffer as the fake news infodemic continues unabated. Current research has focused largely on the verbal part (plain text) of fake news, while the nuances of nonverbal communication (emojis and other semiotic tokens) remain largely understudied. We explore the relationship between fake news and emojis in this work through two studies. The first study found that information with emojis is retweeted 1.28 times more and liked 1.41 times more than information without them. Additionally, our research finds that tweets with emojis are more common in fake news (49%) than in true news (33%). We also find that emojis are more popular with fake news compared to true news. In our second study, we conducted an online experiment with true and fake news (N=99) to understand how the functional usage (replace/emphasize) of emoji affects the spread of information. We find that when an emoji replaces a verbal token, it is liked less (p0.05)

    Trustworthiness in Social Big Data Incorporating Semantic Analysis, Machine Learning and Distributed Data Processing

    This thesis presents several state-of-the-art approaches constructed for the purpose of (i) studying the trustworthiness of users in Online Social Network platforms, (ii) deriving concealed knowledge from their textual content, and (iii) classifying and predicting the domain knowledge of users and their content. The developed approaches are refined through proof-of-concept experiments, several benchmark comparisons, and appropriate and rigorous evaluation metrics to verify and validate their effectiveness and efficiency, and hence, those of the applied frameworks

    A Model of Online Social Interactions based on Sentiment Analysis and Content Similarity

    In this paper we create a model of human behavior in online communities, based on the network topology and on the communication content. The model contains eleven distinct hypotheses, which validate three intuitions. The first intuition is that the network topology alone fails to clearly distinguish between the users who contribute to the community and the troublemakers. The second intuition is that the content of the messages exchanged in an online community can separate good and insightful contributions from the rest. The third intuition is that there is a delay until the network stabilizes and until standard measures, such as betweenness centrality, can be used accurately. Taken together, these three intuitions are a solid case against indiscriminately using network measures. They also underline the importance of the communication content. We show that the sentiment within the messages, especially antagonism, can significantly alter the community perception. We create a novel sentiment analysis technique to identify antagonistic behavior. We use real world data, taken from the Slashdot discussion forum, to validate our model. All the findings are accompanied by extremely significant t-test p-values
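For readers unfamiliar with betweenness centrality, the network measure named in this abstract, the sketch below computes it for a small unweighted, undirected graph using Brandes' algorithm. This is a generic illustration under stated assumptions, not the paper's code: the adjacency-dictionary input, the function name `betweenness`, and the three-node example graph are all hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm).

    adj: dict mapping each node to an iterable of its neighbours; the
         graph is assumed to be unweighted and undirected.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths.
        order = []                       # nodes in non-decreasing distance
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:          # w seen for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}


# Path graph a - b - c: only b lies between another pair of nodes.
centrality = betweenness({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
```

In the path example, node `b` sits on the single shortest path between `a` and `c`, which is why it is the only node with non-zero centrality.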