Search CORE

2,879 research outputs found

Symbiotic data mining for personalized spam filtering

Author: Cortez Paulo
Lopes Clotilde
Rio Miguel
Rocha Miguel
Sousa Pedro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2009
Field of study

Unsolicited e-mail (spam) is a severe problem due to intrusion of privacy, online fraud, viruses and time spent reading unwanted messages. To solve this issue, Collaborative Filtering (CF) and Content-Based Filtering (CBF) solutions have been adopted. We propose a new CBF-CF hybrid approach called Symbiotic Data Mining (SDM), which aims at aggregating distinct local filters in order to improve filtering at a personalized level using collaboration while preserving privacy. We apply SDM to spam e-mail detection and compare it with a local CBF filter (i.e. Naive Bayes). Several experiments were conducted by using a novel corpus based on the well known Enron datasets mixed with recent spam. The results show that the symbiotic strategy is competitive in performance when compared to CBF and also more robust to contamination attacks.Fundação para a Ciência e a Tecnologia (FCT) - PTDC/EIA/64541/2006

Universidade do Minho: RepositoriUM

Crossref

Security Evaluation of Support Vector Machines in Adversarial Environments

Author: A. Barth
A. Christmann
A.A. Cárdenas
B. Biggio
B. Biggio
B. Biggio
B. Nelson
B. Schölkopf
B.I.P. Rubinstein
C. Cortes
G. Cauwenberghs
H. Drucker
H. Lütkepohl
K. Chaudhuri
L. Breiman
L. Sweeney
M. Barreno
M. Brückner
M. Kloft
P. Laskov
R. Young
R.N. Rodrigues
R.O. Duda
V.N. Vapnik
Publication venue
Publication date: 01/01/2014
Field of study

Support Vector Machines (SVMs) are among the most popular classification techniques adopted in security applications like malware detection, intrusion detection, and spam filtering. However, if SVMs are to be incorporated in real-world security systems, they must be able to cope with attack patterns that can either mislead the learning algorithm (poisoning), evade detection (evasion), or gain information about their internal parameters (privacy breaches). The main contributions of this chapter are twofold. First, we introduce a formal general framework for the empirical evaluation of the security of machine-learning systems. Second, according to our framework, we demonstrate the feasibility of evasion, poisoning and privacy attacks against SVMs in real-world security problems. For each attack technique, we evaluate its impact and discuss whether (and how) it can be countered through an adversary-aware design of SVMs. Our experiments are easily reproducible thanks to open-source code that we have made available, together with all the employed datasets, on a public repository.Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector Machine Applications

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Cagliari

Archivio istituzionale della ricerca - Università di Genova

Profiling user activities with minimal traffic traces

Author: J Zhang
L Sweeney
T Fawcett
TTT Nguyen
Publication venue
Publication date: 07/04/2015
Field of study

Understanding user behavior is essential to personalize and enrich a user's online experience. While there are significant benefits to be accrued from the pursuit of personalized services based on a fine-grained behavioral analysis, care must be taken to address user privacy concerns. In this paper, we consider the use of web traces with truncated URLs - each URL is trimmed to only contain the web domain - for this purpose. While such truncation removes the fine-grained sensitive information, it also strips the data of many features that are crucial to the profiling of user activity. We show how to overcome the severe handicap of lack of crucial features for the purpose of filtering out the URLs representing a user activity from the noisy network traffic trace (including advertisement, spam, analytics, webscripts) with high accuracy. This activity profiling with truncated URLs enables the network operators to provide personalized services while mitigating privacy concerns by storing and sharing only truncated traffic traces. In order to offset the accuracy loss due to truncation, our statistical methodology leverages specialized features extracted from a group of consecutive URLs that represent a micro user action like web click, chat reply, etc., which we call bursts. These bursts, in turn, are detected by a novel algorithm which is based on our observed characteristics of the inter-arrival time of HTTP records. We present an extensive experimental evaluation on a real dataset of mobile web traces, consisting of more than 130 million records, representing the browsing activities of 10,000 users over a period of 30 days. Our results show that the proposed methodology achieves around 90% accuracy in segregating URLs representing user activities from non-representative URLs

arXiv.org e-Print Archive

Crossref

Let Your CyberAlter Ego Share Information and Manage Spam

Author: Boykin P. Oscar
Kong Joseph S.
Rezaei Behnam A.
Roychowdhury Vwani P.
Sarshar Nima
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/05/2005
Field of study

Almost all of us have multiple cyberspace identities, and these {\em cyber}alter egos are networked together to form a vast cyberspace social network. This network is distinct from the world-wide-web (WWW), which is being queried and mined to the tune of billions of dollars everyday, and until recently, has gone largely unexplored. Empirically, the cyberspace social networks have been found to possess many of the same complex features that characterize its real counterparts, including scale-free degree distributions, low diameter, and extensive connectivity. We show that these topological features make the latent networks particularly suitable for explorations and management via local-only messaging protocols. {\em Cyber}alter egos can communicate via their direct links (i.e., using only their own address books) and set up a highly decentralized and scalable message passing network that can allow large-scale sharing of information and data. As one particular example of such collaborative systems, we provide a design of a spam filtering system, and our large-scale simulations show that the system achieves a spam detection rate close to 100%, while the false positive rate is kept around zero. This system has several advantages over other recent proposals (i) It uses an already existing network, created by the same social dynamics that govern our daily lives, and no dedicated peer-to-peer (P2P) systems or centralized server-based systems need be constructed; (ii) It utilizes a percolation search algorithm that makes the query-generated traffic scalable; (iii) The network has a built in trust system (just as in social networks) that can be used to thwart malicious attacks; iv) It can be implemented right now as a plugin to popular email programs, such as MS Outlook, Eudora, and Sendmail.Comment: 13 pages, 10 figure

arXiv.org e-Print Archive

Crossref

Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning

Author: Biggio Battista
Roli Fabio
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018
Field of study

Learning-based pattern classifiers, including deep networks, have shown impressive performance in several application domains, ranging from computer vision to cybersecurity. However, it has also been shown that adversarial input perturbations carefully crafted either at training or at test time can easily subvert their predictions. The vulnerability of machine learning to such wild patterns (also referred to as adversarial examples), along with the design of suitable countermeasures, have been investigated in the research field of adversarial machine learning. In this work, we provide a thorough overview of the evolution of this research area over the last ten years and beyond, starting from pioneering, earlier work on the security of non-deep learning algorithms up to more recent work aimed to understand the security properties of deep learning algorithms, in the context of computer vision and cybersecurity tasks. We report interesting connections between these apparently-different lines of work, highlighting common misconceptions related to the security evaluation of machine-learning algorithms. We review the main threat models and attacks defined to this end, and discuss the main limitations of current work, along with the corresponding future challenges towards the design of more secure learning algorithms.Comment: Accepted for publication on Pattern Recognition, 201

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Cagliari

Archivio istituzionale della ricerca - Università di Genova

Privacy-Friendly Collaboration for Cyber Threat Mitigation

Author: Brito Alex
De Cristofaro Emiliano
Freudiger Julien
Publication venue
Publication date: 01/03/2017
Field of study

Sharing of security data across organizational boundaries has often been advocated as a promising way to enhance cyber threat mitigation. However, collaborative security faces a number of important challenges, including privacy, trust, and liability concerns with the potential disclosure of sensitive data. In this paper, we focus on data sharing for predictive blacklisting, i.e., forecasting attack sources based on past attack information. We propose a novel privacy-enhanced data sharing approach in which organizations estimate collaboration benefits without disclosing their datasets, organize into coalitions of allied organizations, and securely share data within these coalitions. We study how different partner selection strategies affect prediction accuracy by experimenting on a real-world dataset of 2 billion IP addresses and observe up to a 105% prediction improvement.Comment: This paper has been withdrawn as it has been superseded by arXiv:1502.0533

arXiv.org e-Print Archive

CiteSeerX

Recommended from our members

Toward practical and private online services

Author: Gupta Trinabh
Publication venue
Publication date: 31/01/2018
Field of study

Today's common online services (social networks, media streaming, messaging, email, etc.) bring convenience. However, these services are susceptible to privacy leaks. Certainly, email snooping by rogue employees, email server hacks, and accidental disclosures of user ratings for movies are some sources of private information leakage. This dissertation investigates the following question: Can we build systems that (a) provide strong privacy guarantees to the users, (b) are consistent with existing commercial and policy regimes, and (c) are affordable? Satisfying all three requirements simultaneously is challenging, as providing strong privacy guarantees usually necessitates either sacrificing functionality, incurring high resource costs, or both. Indeed, there are powerful cryptographic protocols---private information retrieval (PIR), and secure two-party computation (2PC)---that provide strong guarantees but are orders of magnitude more expensive than their non-private counterparts. This dissertation takes these protocols as a starting point and then substantially reduces their costs by tailoring them using application-specific properties. It presents two systems, Popcorn and Pretzel, built on this design ethos. Popcorn is a Netflix-like media delivery system, that provably hides, even from the content distributor (for example, Netflix), which movie a user is watching. Popcorn tailors PIR protocols to the media domain. It amortizes the server-side overhead of PIR by batching requests from the large number of concurrent users retrieving content at any given time; and, it forms large batches without introducing playback delays by leveraging the properties of media streaming. Popcorn is consistent with the prevailing commercial regime (copyrights, etc.), and its per-request dollar cost is 3.87 times that of a non-private system. The other system described in this dissertation, Pretzel, is an email system that encrypts emails end-to-end between senders and intended recipients, but allows the email service provider to perform content-based spam filtering and targeted advertising. Pretzel refines a 2PC protocol. It reduces the resource consumption of the protocol by replacing the underlying encryption scheme with a more efficient one, applying a packing technique to conserve invocations of the encryption algorithm, and pruning the inputs to the protocol. Pretzel's costs, versus a legacy non-private implementation, are estimated to be up to 5.4 times for the email provider, with additional but modest client-side requirements. Popcorn and Pretzel have fundamental connections. For instance, the cryptographic protocols in both systems securely compute vector-matrix products. However, we observe that differences in the vector and matrix dimensions lead to different system designs. Ultimately, both systems represent a potentially appealing compromise: sacrifice some functionality to build in strong privacy properties at affordable costs.Computer Science

Texas ScholarWorks

Opportunistic mobile social networks: architecture, privacy, security issues and future directions

Author: T. Gireesh Kumar
Vidhya Lakshmi Vimitha R.
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/04/2019
Field of study

Mobile Social Networks and its related applications have made a very great impact in the society. Many new technologies related to mobile social networking are booming rapidly now-a-days and yet to boom. One such upcoming technology is Opportunistic Mobile Social Networking. This technology allows mobile users to communicate and exchange data with each other without the use of Internet. This paper is about Opportunistic Mobile Social Networks, its architecture, issues and some future research directions. The architecture and issues of Opportunistic Mobile Social Networks are compared with that of traditional Mobile Social Networks. The main contribution of this paper is regarding privacy and security issues in Opportunistic Mobile Social Networks. Finally, some future research directions in Opportunistic Mobile Social Networks have been elaborated regarding the data's privacy and security

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science