Design and implementation of robust systems for secure malware detection
Malicious software (malware) has increased significantly in number and effectiveness
during the past years. Until 2006, such software was mostly used to disrupt
network infrastructures or to show off coders’ skills. Nowadays, malware constitutes a
major source of economic profit, and is very difficult to detect. Thousands of
novel variants are released every day, and modern obfuscation techniques are used to
ensure that signature-based anti-malware systems are not able to detect such threats.
This tendency has also appeared on mobile devices, with Android being the most targeted
platform. To counteract this phenomenon, the scientific community has developed
many approaches that attempt to increase the resilience of anti-malware systems.
Most of these approaches rely on machine learning, and have also become very popular
in commercial applications. However, attackers are now knowledgeable about these
systems, and have started preparing their countermeasures. This has led to an arms
race between attackers and developers. Novel systems are progressively built to tackle
attacks that grow more and more sophisticated. For this reason, developers increasingly
need to anticipate the attackers’ moves. This means that defense systems
should be built proactively, i.e., by introducing security design principles in their
development. The main goal of this work is to show that such a proactive approach can
be employed on a number of case studies. To do so, I adopted a global methodology that
can be divided into two steps. First, understanding what the vulnerabilities of current
state-of-the-art systems are (this anticipates the attacker’s moves). Then, developing novel
systems that are robust to these attacks, or suggesting research guidelines with which
current systems can be improved. This work presents two main case studies, concerning
the detection of PDF and Android malware. The idea is to show that a proactive approach
can be applied to both the x86 and mobile worlds. The contributions provided on
these two case studies are manifold. With respect to PDF files, I first develop novel attacks
that can empirically and optimally evade current state-of-the-art detectors. Then,
I propose solutions that can increase the robustness of such
detectors against known and novel attacks. With respect to the Android case study,
I first show how current signature-based tools and academically developed systems are
weak against empirical obfuscation attacks, which can be easily employed without particular
knowledge of the targeted systems. Then, I examine a possible strategy to build a
machine learning detector that is robust against both empirical obfuscation and optimal
attacks. Finally, I show how proactive approaches can also be employed to develop
systems that are not aimed at detecting malware, such as mobile fingerprinting systems.
In particular, I propose a methodology to build a powerful mobile fingerprinting system,
and examine possible attacks with which users might be able to evade it, thus preserving
their privacy. To provide the aforementioned contributions, I co-developed (in cooperation
with the researchers at PRALab and Ruhr-Universität Bochum) various systems:
a library to perform optimal attacks against machine learning systems (AdversariaLib),
a framework for automatically obfuscating Android applications, a system for the robust
detection of JavaScript malware inside PDF files (LuxOR), a robust machine learning system
for the detection of Android malware, and a system to fingerprint mobile devices. I
also contributed to the development of Android PRAGuard, a dataset containing many empirical
obfuscation attacks against the Android platform. Finally, I entirely developed Slayer
NEO, an evolution of a previous system for the detection of PDF malware. The results
attained by using the aforementioned tools show that it is possible to proactively build
systems that predict possible evasion attacks. This suggests that a proactive approach
is crucial to building systems that provide concrete security against general and evasion
attacks.
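As an illustration of what an optimal evasion attack against a machine learning detector can look like, the following is a minimal sketch only; it is not the AdversariaLib API and not necessarily the method used in this work. It assumes a hypothetical linear detector over binary features, with a sample flagged as malicious when its score is at least zero, and an attacker who can only add features (an assumption made here for simplicity). The weights, bias, and function names are all invented for the example.

```python
from typing import List, Sequence, Tuple


def score(weights: Sequence[float], bias: float, x: Sequence[int]) -> float:
    """Linear decision score; a score >= 0 is treated as 'malicious' (assumed convention)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def evade_by_feature_addition(
    weights: Sequence[float],
    bias: float,
    x: Sequence[int],
    max_changes: int,
) -> Tuple[List[int], List[int]]:
    """Greedily switch on absent features with the most negative weights.

    For a linear model this greedy choice gives the lowest score reachable
    with at most `max_changes` added features. The loop stops early once
    the sample is already classified as benign.
    """
    x_adv = list(x)
    # Candidate features: currently absent and pushing the score toward 'benign'.
    candidates = sorted(
        (i for i, xi in enumerate(x_adv) if xi == 0 and weights[i] < 0),
        key=lambda i: weights[i],
    )
    changed: List[int] = []
    for i in candidates[:max_changes]:
        if score(weights, bias, x_adv) < 0:
            break  # already evades the detector
        x_adv[i] = 1
        changed.append(i)
    return x_adv, changed


if __name__ == "__main__":
    # Hypothetical detector and sample, invented for illustration.
    w = [2.0, 1.5, -1.0, -3.0, -0.5]
    b = -0.5
    sample = [1, 1, 0, 0, 0]
    adv, added = evade_by_feature_addition(w, b, sample, max_changes=2)
    print("original score:", score(w, b, sample))
    print("added features:", added, "new score:", score(w, b, adv))
```

A proactive defense, in the sense described above, would evaluate a detector against this kind of attacker before deployment and, for example, limit how much the decision can depend on features that are cheap to add.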
Last-Mile TLS Interception: Analysis and Observation of the Non-Public HTTPS Ecosystem
Transport Layer Security (TLS) is one of the most widely deployed cryptographic protocols on the Internet; it provides confidentiality, integrity, and a certain degree of authenticity of the communications between clients and servers. Following Snowden's revelations on US surveillance programs, the adoption of TLS has steadily increased. However, encrypted traffic prevents legitimate inspection. Therefore, security solutions such as personal antiviruses and enterprise firewalls may intercept encrypted connections in search of malicious or unauthorized content. These TLS proxies (a.k.a. middleboxes) thus break the end-to-end property of TLS for arguably laudable reasons, yet they may pose a security risk. While TLS clients and servers have been analyzed to some extent, such proxies have remained unexplored until recently. We propose a framework for analyzing client-end TLS proxies, and apply it to 14 consumer antivirus and parental control applications as they break end-to-end TLS connections. Overall, the security of TLS connections was systematically worsened compared to the guarantees provided by modern browsers. Next, we explore the non-public HTTPS ecosystem, composed of locally-trusted proxy-issued certificates, from the user's perspective and from several countries in residential and enterprise settings. We focus our analysis on the long tail of interception events. We characterize the customers of network appliances, ranging from small/medium businesses and institutes to hospitals, hotels, resorts, insurance companies, and government agencies. We also discover regional cases of traffic interception malware/adware that mostly rely on the same Software Development Kit (i.e., NetFilter). Our scanning and analysis techniques allow us to identify more middleboxes and intercepting apps than previously found from privileged server vantage points looking at billions of connections.
We further perform a longitudinal study over six years of the evolution of a prominent traffic-intercepting adware found in our dataset: Wajam. We expose the TLS interception techniques it has used and the weaknesses it has introduced on hundreds of millions of user devices. This study also (re)opens the neglected problem of privacy-invasive adware, by showing how adware sometimes evolves to be stronger than even advanced malware and poses significant detection and reverse-engineering challenges. Overall, whether beneficial or not, TLS interception often has detrimental impacts on security without the end-user being alerted.
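One simple way to observe interception from the client side is to compare the certificate a client actually receives with the one the server is known to present. The sketch below illustrates that idea only; it is not the framework proposed in this work. The reference fingerprint is assumed to come from a trusted vantage point, and all function names are hypothetical. A mismatch does not prove interception, since servers legitimately rotate certificates and content delivery networks may serve different ones.

```python
import hashlib
import ssl


def fingerprint_sha256(der_cert: bytes) -> str:
    """Return the lowercase hex SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def looks_intercepted(observed_der: bytes, expected_fingerprint: str) -> bool:
    """True if the observed leaf certificate differs from the expected one.

    `expected_fingerprint` may be written with or without colons, in any case.
    """
    normalized = expected_fingerprint.lower().replace(":", "")
    return fingerprint_sha256(observed_der) != normalized


def fetch_leaf_certificate(host: str, port: int = 443) -> bytes:
    """Fetch the server's leaf certificate as seen from this machine.

    Uses only the standard library; requires network access. The certificate
    is retrieved without validation, which is what we want here: a locally
    trusted proxy certificate would otherwise validate silently.
    """
    pem = ssl.get_server_certificate((host, port))
    return ssl.PEM_cert_to_DER_cert(pem)


if __name__ == "__main__":
    # Offline demonstration with placeholder bytes instead of real certificates.
    genuine = b"placeholder for the genuine DER certificate"
    proxy_issued = b"placeholder for a proxy-issued DER certificate"
    reference = fingerprint_sha256(genuine)
    print("direct connection flagged:", looks_intercepted(genuine, reference))
    print("proxied connection flagged:", looks_intercepted(proxy_issued, reference))
```

In practice, `fetch_leaf_certificate` would supply the observed certificate, and the issuer of a mismatching certificate could then be inspected to identify the intercepting product.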