6,029 research outputs found

    Exposing and explaining fake news on-the-fly

    Social media platforms enable the rapid dissemination and consumption of information. However, users instantly consume such content regardless of the reliability of the shared data. Consequently, this crowdsourcing model is exposed to manipulation. This work contributes an explainable, online classification method to recognize fake news in real time. The proposed method combines unsupervised and supervised Machine Learning approaches with lexica created online. Profiling is built from creator-, content- and context-based features using Natural Language Processing techniques. The explainable classification mechanism displays in a dashboard the features selected for classification and the prediction confidence. The performance of the proposed solution has been validated with real data sets from Twitter, and the results attain 80% accuracy and macro F-measure. This proposal is the first to jointly provide data stream processing, profiling, classification and explainability. Ultimately, the proposed early detection, isolation and explanation of fake news contribute to increasing the quality and trustworthiness of social media contents. Funding: Xunta de Galicia | Ref. ED481B-2021-118; Xunta de Galicia | Ref. ED481B-2022-093; Fundação para a Ciência e a Tecnologia | Ref. UIDB/50014/2020; Universidade de Vigo/CISU
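    As an illustration only (not the authors' pipeline), the following sketch shows how an incrementally trained text classifier can report a prediction confidence together with the content features that drove the decision. The toy tweets, labels, and a recent scikit-learn API (log_loss SGD, get_feature_names_out) are assumptions.

```python
# Minimal sketch: online text classification with per-prediction explanation.
# Not the system from the paper; toy data, assumes scikit-learn >= 1.1.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier

# Fit the vocabulary once on an initial batch; later batches reuse it.
initial_texts = ["official report confirms economic figures",
                 "shocking secret cure doctors hide from you"]
initial_labels = np.array([0, 1])          # 0 = reliable, 1 = fake (toy labels)

vectorizer = TfidfVectorizer()
X0 = vectorizer.fit_transform(initial_texts)

clf = SGDClassifier(loss="log_loss", random_state=0)
clf.partial_fit(X0, initial_labels, classes=np.array([0, 1]))

def update(texts, labels):
    """Online update on a new mini-batch from the stream."""
    clf.partial_fit(vectorizer.transform(texts), np.asarray(labels))

def explain(text, top_k=5):
    """Predicted label, confidence, and the content features that drove the decision."""
    x = vectorizer.transform([text])
    proba = clf.predict_proba(x)[0]
    contrib = x.toarray()[0] * clf.coef_[0]            # per-feature contribution
    names = vectorizer.get_feature_names_out()
    top = np.argsort(np.abs(contrib))[::-1][:top_k]
    return int(proba.argmax()), float(proba.max()), [
        (names[i], round(float(contrib[i]), 3)) for i in top if contrib[i] != 0
    ]

update(["shocking report claims secret cure"], [1])
print(explain("secret cure they hide from you"))
```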

    Process-Oriented Stream Classification Pipeline: A Literature Review

    Featured Application: Nowadays, many applications and disciplines work on the basis of stream data. Common examples are the IoT sector (e.g., sensor data analysis) and video, image, and text analysis applications (e.g., in social media analytics or astronomy). With our work, we gather different approaches and terminology, and give a broad overview of the topic. Our main target groups are practitioners and newcomers to the field of data stream classification. Due to the rise of continuous data-generating applications, analyzing data streams has gained increasing attention over the past decades. A core research area in stream data is stream classification, which categorizes or detects data points within an evolving stream of observations. Areas of stream classification are diverse, ranging, e.g., from monitoring sensor data to analyzing a wide range of (social) media applications. Research in stream classification is concerned with developing methods that adapt to the changing and potentially volatile data stream. It focuses on individual aspects of the stream classification pipeline, e.g., designing suitable algorithm architectures, efficient train and test procedures, or detecting so-called concept drifts. As a result of the many different research questions and strands, the field is challenging to grasp, especially for beginners. This survey explores, summarizes, and categorizes work within the domain of stream classification and identifies core research threads over the past few years. It is structured based on the stream classification process to facilitate coordination within this complex topic, including common application scenarios and benchmarking data sets. Thus, both newcomers to the field and experts who want to widen their scope can gain (additional) insight into this research area and find starting points and pointers to more in-depth literature on specific issues and research directions in the field.
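    A basic building block in this area is prequential (test-then-train) evaluation of an incremental classifier on an evolving stream. The minimal sketch below illustrates the idea on a synthetic stream with one abrupt concept drift; the data, model choice, and drift placement are illustrative assumptions, not taken from the review.

```python
# Minimal sketch of prequential (test-then-train) stream classification.
# Synthetic data and model choice are illustrative, not from the survey.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic stream: one concept, then an abrupt drift (labels flipped halfway).
X, y = make_classification(n_samples=4000, n_features=10, random_state=1)
y[2000:] = 1 - y[2000:]

clf = SGDClassifier(loss="log_loss", random_state=1)
classes = np.unique(y)
correct, seen = 0, 0

for i, (xi, yi) in enumerate(zip(X, y)):
    xi = xi.reshape(1, -1)
    if seen:                                  # first test on the new example...
        correct += int(clf.predict(xi)[0] == yi)
    clf.partial_fit(xi, [yi], classes=classes)  # ...then train on it
    seen += 1
    if (i + 1) % 1000 == 0:
        print(f"after {i+1:4d} examples: prequential accuracy = {correct/seen:.3f}")
```

    The accuracy typically drops right after the drift point and recovers as the incremental model adapts, which is exactly the behaviour drift-handling methods in this literature try to improve.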

    On-device modeling of user's social context and familiar places from smartphone-embedded sensor data

    Context modeling and recognition are crucial for adaptive mobile and ubiquitous computing. Context-awareness in mobile environments relies on prompt reactions to context changes. However, current solutions focus on limited context information processed on centralized architectures, risking privacy leakage and lacking personalization. On-device context modeling and recognition are emerging research trends, addressing these concerns. Social interactions and visited locations play significant roles in characterizing daily life scenarios. This paper proposes an unsupervised and lightweight approach to model the user's social context and locations directly on the mobile device. Leveraging the ego-network model, the system extracts high-level, semantic-rich context features from smartphone-embedded sensor data. For the social context, the approach utilizes data on physical and cyber social interactions among users and their devices. Regarding location, it prioritizes modeling the familiarity degree of specific locations over raw location data, such as GPS coordinates and proximity devices. The effectiveness of the proposed approach is demonstrated through three sets of experiments, employing five real-world datasets. These experiments evaluate the structure of social and location ego networks, provide a semantic evaluation of the proposed models, and assess mobile computing performance. Finally, the relevance of the extracted features is showcased by the improved performance of three machine learning models in recognizing daily-life situations. Compared to using only features related to physical context, the proposed approach achieves a 3% improvement in AUROC, 9% in Precision, and 5% in Recall.
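    Purely as a sketch of the ego-network intuition (not the paper's models), the snippet below groups contacts into concentric rings by interaction frequency and derives a simple visit-share familiarity score for a place. The ring sizes, interaction counts, and familiarity definition are illustrative assumptions.

```python
# Toy sketch of ego-network rings and a location-familiarity score.
# All numbers and definitions are illustrative, not from the paper.
from collections import Counter

# Toy interaction log: contact -> number of interactions observed on-device.
interactions = Counter({
    "alice": 120, "bob": 45, "carol": 30, "dave": 9,
    "erin": 7, "frank": 2, "grace": 1,
})

def ego_rings(counts, cumulative_sizes=(2, 5, 15)):
    """Split contacts into concentric ego-network rings by interaction frequency.
    The cumulative ring sizes loosely mimic the nested layers of ego-network models."""
    ordered = [c for c, _ in counts.most_common()]
    rings, start = [], 0
    for size in cumulative_sizes:
        rings.append(ordered[start:size])
        start = size
    rings.append(ordered[start:])            # everyone else: weakest ties
    return rings

def familiarity(location_visits, location):
    """Familiarity of a place as its share of all observed visits (a simple proxy)."""
    total = sum(location_visits.values())
    return location_visits.get(location, 0) / total if total else 0.0

print(ego_rings(interactions))
print(familiarity({"home": 50, "office": 30, "gym": 5}, "office"))
```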

    Machine Learning for Financial Prediction Under Regime Change Using Technical Analysis: A Systematic Review

    Recent crises, recessions and bubbles have stressed the non-stationary nature of the financial domain and the presence of drastic structural changes in it. The most recent literature suggests the use of conventional machine learning and statistical approaches in this context. Unfortunately, several of these techniques are unable or slow to adapt to changes in the price-generation process. This study surveys the relevant literature on Machine Learning for financial prediction under regime change, employing a systematic approach. It reviews key papers with a special emphasis on technical analysis. The study discusses the growing number of contributions that are bridging the gap between two separate communities, one focused on data stream learning and the other on economic research. However, it also makes apparent that we are still at an early stage. The range of machine learning algorithms that have been tested in this domain is very wide, but the results of the study do not suggest that any specific technique is currently clearly dominant.
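    The survey does not prescribe a method, but a common baseline in this space combines technical-analysis features with a model retrained on a rolling window so that older regimes are gradually forgotten. The sketch below illustrates that pattern on synthetic prices; the indicators, window length, and data are illustrative assumptions.

```python
# Sketch: technical-analysis features + rolling-window retraining on synthetic prices.
# Everything here (indicators, window, data) is illustrative, not from the survey.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic price series with a regime change (return distribution shifts halfway).
returns = np.concatenate([rng.normal(0.001, 0.01, 500), rng.normal(-0.001, 0.02, 500)])
price = pd.Series(100 * np.exp(np.cumsum(returns)))

# Simple technical-analysis features: moving-average gap and momentum.
feats = pd.DataFrame({
    "ma_gap": price.rolling(5).mean() - price.rolling(20).mean(),
    "momentum": price.pct_change(10),
})
target = (price.shift(-1) > price).astype(int)        # next-step direction
data = pd.concat([feats, target.rename("up")], axis=1).dropna()

window, hits, total = 250, 0, 0
for t in range(window, len(data) - 1):
    train = data.iloc[t - window:t]                   # rolling window: forget old regimes
    model = LogisticRegression().fit(train[["ma_gap", "momentum"]], train["up"])
    pred = model.predict(data.iloc[[t]][["ma_gap", "momentum"]])[0]
    hits += int(pred == data.iloc[t]["up"])
    total += 1

print(f"rolling-window directional accuracy: {hits/total:.3f}")
```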

    Denial of Service in Web-Domains: Building Defenses Against Next-Generation Attack Behavior

    The existing state-of-the-art in the field of application-layer Distributed Denial of Service (DDoS) protection is generally designed, and thus effective, only for static web domains. To the best of our knowledge, our work is the first to study the problem of application-layer DDoS defense in web domains of dynamic content and organization, and for next-generation bot behaviour. In the first part of this thesis, we focus on the following research tasks: 1) we identify the main weaknesses of the existing application-layer anti-DDoS solutions as proposed in the research literature and in industry, 2) we obtain a comprehensive picture of current-day as well as next-generation application-layer attack behaviour, and 3) we propose novel techniques, based on a multidisciplinary approach that combines offline machine learning algorithms and statistical analysis, for the detection of suspicious web visitors in static web domains. Then, in the second part of the thesis, we propose and evaluate a novel anti-DDoS system that detects a broad range of application-layer DDoS attacks, both in static and dynamic web domains, through the use of advanced data mining techniques. The key advantage of our system relative to other systems that resort to challenge-response tests (such as CAPTCHAs) in combating malicious bots is that our system minimizes the number of these tests presented to valid human visitors while succeeding in preventing most malicious attackers from accessing the web site. The results of the experimental evaluation of the proposed system demonstrate effective detection of current and future variants of application-layer DDoS attacks.
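    As a loose illustration of the general idea of scoring visitor sessions and challenging only the most suspicious ones (not the system proposed in the thesis), the sketch below trains an Isolation Forest on synthetic human-like sessions and flags outlying sessions for a challenge-response test. All features, distributions, and thresholds are made-up assumptions.

```python
# Sketch: score web sessions and challenge only suspicious ones (illustrative only).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(7)

# Per-session features: [requests/min, mean inter-request time (s), share of dynamic-page hits].
human_sessions = np.column_stack([
    rng.normal(6, 2, 500).clip(0.5),
    rng.normal(10, 3, 500).clip(0.5),
    rng.uniform(0.1, 0.6, 500),
])
bot_sessions = np.column_stack([
    rng.normal(120, 30, 20).clip(1),
    rng.normal(0.5, 0.2, 20).clip(0.05),
    rng.uniform(0.8, 1.0, 20),
])

model = IsolationForest(contamination=0.05, random_state=7).fit(human_sessions)

def triage(session):
    """Only sessions the model considers anomalous get a challenge (e.g., a CAPTCHA)."""
    score = model.decision_function([session])[0]     # lower = more anomalous
    return "challenge" if score < 0 else "allow"

flagged = sum(triage(s) == "challenge" for s in bot_sessions)
passed = sum(triage(s) == "allow" for s in human_sessions)
print(f"bots challenged: {flagged}/20, humans allowed without challenge: {passed}/500")
```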

    A survey on machine learning for recurring concept drifting data streams

    The problem of concept drift has gained a lot of attention in recent years. This aspect is key in many domains that exhibit non-stationary behaviour as well as cyclic patterns and structural breaks affecting their generative processes. In this survey, we review the relevant literature on dealing with regime changes in the behaviour of continuous data streams. The study starts with a general introduction to the field of data stream learning, describing recent works on passive or active mechanisms to adapt to or detect concept drifts, frequent challenges in this area, and related performance metrics. Then, different supervised and unsupervised approaches such as online ensembles, meta-learning and model-based clustering that can be used to deal with seasonalities in a data stream are covered. The aim is to point out new research trends and future research directions for machine learning techniques on data streams that can cope with shifts and recurrences in continuous, near-real-time learning scenarios.
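    One recurring idea in this literature is to keep a pool of previously learned models and reactivate the one that best matches the current data when a known concept recurs. The sketch below illustrates that pattern with a simple probe-accuracy rule; the synthetic concepts, probe size, and 0.8 threshold are illustrative assumptions, not a specific method from the survey.

```python
# Sketch: reuse stored models when a concept recurs (illustrative only).
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

def make_concept(flip, n=600, seed=0):
    """Linear concept on 5 features; `flip` inverts the labels to simulate a new concept."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, (1 - y if flip else y)

# Stream with a recurring concept: A, then B, then A again.
segments = [make_concept(False, seed=1), make_concept(True, seed=2), make_concept(False, seed=3)]

pool = []                                    # previously learned models, one per concept

for X, y in segments:
    X_probe, y_probe = X[:100], y[:100]      # small probe window at the start of the segment
    best, best_acc = None, 0.0
    for m in pool:                           # does any stored model already fit this concept?
        acc = accuracy_score(y_probe, m.predict(X_probe))
        if acc > best_acc:
            best, best_acc = m, acc
    if best is not None and best_acc > 0.8:
        model = best                         # recurring concept: reuse the stored model
        print(f"reused stored model (probe accuracy {best_acc:.2f})")
    else:
        model = SGDClassifier(loss="log_loss", random_state=0)
        model.partial_fit(X_probe, y_probe, classes=np.array([0, 1]))
        pool.append(model)
        print("new concept: trained and stored a fresh model")
    model.partial_fit(X[100:], y[100:])      # keep adapting on the rest of the segment
```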

    Exploiting Emergence of New Topics via Anomaly Detection: A Survey

    Detecting and generating new concepts has attracted much attention in data mining. The emergence of new topics in news data is a major challenge, and the problem can be framed as “finding breaking news”. In the past, the emergence of new stories was detected and followed up by domain experts, but manually reading stories and identifying anomalous content is a critical and time-consuming task, and mapping such anomalies to the relevant stories requires excellent knowledge of the news and of older concepts. Automatically modeling breaking news is therefore of great interest in data mining. The anomalies in published news are the basic clues for inferring the emergence of a new story. These anomalies are keywords or phrases that do not match the overall concept of the news item. They are then processed and mapped to the stories in which they do not behave as anomalies. After this mapping, the topics linked by the anomalies can be combined into a new concept, which can eventually be modeled as an emerging story. We survey techniques that can be used to model such new concepts efficiently. News classification, anomaly detection, and concept detection and generation are among the techniques that collectively form the basis of modeling breaking news. We also discuss data sources that can be processed and used as input stories or news for modeling the emergence of new stories.
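    As a toy illustration of the underlying intuition (flag terms whose frequency in the latest batch of news deviates sharply from their history), the sketch below computes a simple smoothed burst score per term. The corpus and scoring rule are made-up assumptions, not a method from the survey.

```python
# Sketch: burst-score candidate terms for an emerging story (illustrative only).
from collections import Counter
import math
import re

def term_counts(docs):
    counts = Counter()
    for doc in docs:
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    return counts

# Historical news batches vs. the newest batch (toy corpus).
history = [
    "markets steady as trade talks continue",
    "local elections scheduled for next month",
    "weather warning issued for coastal regions",
]
latest = [
    "volcano eruption forces evacuation of island town",
    "eruption ash cloud disrupts flights across region",
]

hist, new = term_counts(history), term_counts(latest)
hist_total, new_total = sum(hist.values()), sum(new.values())

def burst_score(term):
    """Log-ratio of a term's smoothed frequency in the new batch vs. its history."""
    p_new = (new[term] + 1) / (new_total + 1)
    p_old = (hist[term] + 1) / (hist_total + 1)
    return math.log(p_new / p_old)

emerging = sorted(new, key=burst_score, reverse=True)[:5]
print("candidate emerging-topic terms:", emerging)
```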