1,110 research outputs found
Social Fingerprinting: detection of spambot groups through DNA-inspired behavioral modeling
Spambot detection in online social networks is a long-lasting challenge
involving the study and design of detection techniques capable of efficiently
identifying ever-evolving spammers. Recently, a new wave of social spambots has
emerged, with advanced human-like characteristics that allow them to go
undetected even by current state-of-the-art algorithms. In this paper, we show
that efficient spambot detection can be achieved via an in-depth analysis of
their collective behaviors, exploiting the digital DNA technique for modeling
the behaviors of social network users. Inspired by its biological counterpart,
in the digital DNA representation the behavioral lifetime of a digital account
is encoded in a sequence of characters. Then, we define a similarity measure
for such digital DNA sequences. We build upon digital DNA and the similarity
between groups of users to characterize both genuine accounts and spambots.
Leveraging such characterization, we design the Social Fingerprinting
technique, which is able to discriminate between spambots and genuine accounts in
both a supervised and an unsupervised fashion. We finally evaluate the
effectiveness of Social Fingerprinting and we compare it with three
state-of-the-art detection algorithms. Among the peculiarities of our approach
is the possibility to apply off-the-shelf DNA analysis techniques to study
online users' behaviors and to efficiently rely on a limited number of
lightweight account characteristics.
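The encoding and group similarity described in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the alphabet or the similarity measure, so both are assumptions here: the action-to-character mapping `ALPHABET` is hypothetical, and the similarity is taken to be the length of the longest substring common to all sequences in a group, normalized by the shortest sequence.

```python
from typing import Iterable, List

# Hypothetical alphabet: one character per action type (an assumption,
# not taken from the abstract).
ALPHABET = {"tweet": "A", "reply": "C", "retweet": "T"}


def encode(actions: Iterable[str]) -> str:
    """Encode a chronological list of actions as a digital DNA string."""
    return "".join(ALPHABET[a] for a in actions)


def longest_common_substring(sequences: List[str]) -> str:
    """Return the longest substring shared by all sequences (brute force)."""
    if not sequences:
        return ""
    shortest = min(sequences, key=len)
    # Try candidate lengths from longest to shortest, so the first hit wins.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in sequences):
                return candidate
    return ""


def group_similarity(sequences: List[str]) -> float:
    """Assumed similarity: shared-substring length over shortest sequence length."""
    if not sequences:
        return 0.0
    shortest_len = min(len(s) for s in sequences)
    if shortest_len == 0:
        return 0.0
    return len(longest_common_substring(sequences)) / shortest_len
```

Under this assumption, a group of accounts that repeat the same action pattern scores close to 1, while accounts with unrelated timelines score close to 0. The brute-force search is only suitable for short sequences.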
Better Safe Than Sorry: An Adversarial Approach to Improve Social Bot Detection
The arms race between spambots and spambot-detectors is made of several cycles
(or generations): a new wave of spambots is created (and new spam is spread),
new spambot filters are derived and old spambots mutate (or evolve) to new
species. Recently, with the diffusion of the adversarial learning approach, a
new practice is emerging: deliberately manipulating target samples in order to
build stronger detection models. Here, we manipulate generations of Twitter
social bots, to obtain - and study - their possible future evolutions, with the
aim of eventually deriving more effective detection techniques. In detail, we
propose and experiment with a novel genetic algorithm for the synthesis of
online accounts. The algorithm allows us to create synthetic evolved versions of
current state-of-the-art social bots. Results demonstrate that synthetic bots
do escape current detection techniques. However, they give all the needed
elements to improve such techniques, making possible a proactive approach for
the design of social bot detection systems.
Comment: This is the pre-final version of a paper accepted @ 11th ACM
Conference on Web Science, June 30-July 3, 2019, Boston, U
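The idea of evolving synthetic accounts with a genetic algorithm can be sketched generically. This is not the authors' algorithm: the abstract gives no details, so the account representation (a string over a hypothetical action alphabet), the toy detector score, and all parameters below are assumptions chosen only to show the select, crossover, mutate loop.

```python
import random
from typing import Callable, List

ALPHABET = "ACT"  # hypothetical action alphabet (assumption)


def toy_detector_score(seq: str) -> float:
    """Toy stand-in for a detector: share of the sequence covered by its longest run."""
    if not seq:
        return 0.0
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest / len(seq)


def mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Replace each character with a random one with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch for ch in seq)


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover of two equal-length sequences (length >= 2)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(population: List[str], score: Callable[[str], float],
           generations: int = 20, mutation_rate: float = 0.05,
           seed: int = 0) -> List[str]:
    """Evolve sequences toward a lower score; survivors are kept unchanged."""
    rng = random.Random(seed)
    pop = list(population)
    for _ in range(generations):
        pop.sort(key=score)  # lower score = harder for the toy detector
        survivors = pop[: max(2, len(pop) // 2)]
        children = [
            mutate(crossover(*rng.sample(survivors, 2), rng), mutation_rate, rng)
            for _ in range(len(pop) - len(survivors))
        ]
        pop = survivors + children
    return sorted(pop, key=score)
```

Because the best half of each generation is carried over unchanged, the best score can never get worse between generations. In an adversarial setting, the evolved sequences would then be fed back as training examples for the detector.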
LENTA: Longitudinal Exploration for Network Traffic Analysis from Passive Data
In this work, we present LENTA (Longitudinal Exploration for Network Traffic Analysis), a system that supports network analysts in identifying traffic generated by services and applications running on the web. For URLs observed in an operational network, LENTA simplifies the analyst's job by letting them inspect a few hundred clusters instead of the original hundreds of thousands of individual URLs. We implement a self-learning methodology in which the system grows its knowledge, which is used in turn to automatically associate traffic with previously observed services and to identify new traffic generated by possibly suspicious applications. This approach lets analysts easily observe changes in network traffic and identify new services and unexpected activities. We follow a data-driven approach and run LENTA on traces collected both in ISP networks and directly on hosts via proxies. We analyze traffic in 24-hour batches. Big data solutions are used to enable horizontal scalability and meet performance requirements. We show that LENTA allows analysts to clearly understand which services are running on their network, possibly highlighting malicious traffic and changes over time, greatly simplifying the view and understanding of the network traffic
The Relationship Between Disclosing Purchase Information and Reputation Systems in Electronic Markets
In this work we investigate how the introduction of the Verified Purchase (VP) badge on Amazon.com affected both review helpfulness and product ratings. We first conduct a propensity score matching study and find that, all else equal, camera VP reviews are on average ranked 7 positions higher than non-VP reviews, while book VP reviews are on average ranked 11 positions higher than non-VP reviews. Next, we use a natural experiment setting to identify whether the entry of the VP feature had an effect on (1) overall review helpfulness (both VP and non-VP reviews), and (2) average product rating. Our results show that the introduction of VP caused an increase in review helpfulness of 7.7% for books and 1.7% for electronics. Furthermore, it caused on average an increase of 20 and 18 positions in the ranks of book and electronics products, respectively
Machine Learning Techniques for Credit Card Fraud Detection
The term “fraud” often brings credit card fraud to mind. With the significant increase in credit card transactions, credit card fraud has risen sharply in recent years. Fraud detection should therefore include monitoring the spending behavior of each customer in order to identify, prevent, and detect unwanted behavior. Because the credit card is the predominant payment method for both online and regular purchases, credit card fraud is rising rapidly. Fraud detection is concerned not only with capturing fraudulent practices but also with discovering them as quickly as possible, because fraud costs businesses millions of dollars, is rising over time, and greatly affects the worldwide economy. In this paper we present 14 different techniques, showing how data mining techniques can be successfully combined to obtain high fraud coverage with a high or low false rate, the advantages and disadvantages of each technique, and the data sets used by researchers
Machine Learning and other Computational-Intelligence Techniques for Security Applications
The abstract is in the attachment
Behavior Drift Detection Based on Anomalies Identification in Home Living Quantitative Indicators
The diffusion of home automation and smart homes provides an interesting opportunity to implement elderly monitoring. This is a valid new technological support that allows seniors to age in place, by means of a detection system that reports potential anomalies. Monitoring has been implemented by means of Complex Event Processing on live streams of home automation data: this allows the behavior of the house inhabitant to be analyzed through quantitative indicators. Different kinds of quantitative indicators for monitoring and behavior drift detection have been identified and implemented using the Esper complex event processing engine. The chosen solution not only lets us run the queries “online”, but also enables “offline” (re-)execution for testing and a posteriori analysis. Indicators were developed on both real-world data and realistic simulations. Tests were made on a dataset of 180 days: the obtained results show that it is possible to detect behavior changes for an evaluation of a person's condition
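Anomaly identification on a quantitative indicator, as described in the abstract above, can be illustrated with a minimal sketch. The paper implements its indicators as queries in the Esper engine; the Python below is a substitute technique, not that implementation. It assumes a daily indicator (for example, hours of sleep) and flags a day whose value deviates from the preceding window by more than a z-score threshold. The window length and threshold are hypothetical defaults.

```python
from statistics import mean, stdev
from typing import List, Sequence


def detect_anomalies(values: Sequence[float], window: int = 7,
                     threshold: float = 3.0) -> List[int]:
    """Return indices whose value deviates strongly from the preceding window.

    `window` (>= 2) and `threshold` are assumed parameters, not taken from the paper.
    """
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Constant baseline: any different value counts as an anomaly.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies
```

Because the function works on a stored sequence, the same logic can be re-run offline on historical data, which mirrors the online and offline execution the abstract mentions. A sustained behavior drift, as opposed to a single outlier, would need a further step such as counting anomalies over a longer period.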