223 research outputs found
Impact Of Content Features For Automatic Online Abuse Detection
Online communities have gained considerable importance in recent years due to
the increasing number of people connected to the Internet. Moderating user
content in online communities is mainly performed manually, and reducing the
workload through automatic methods is of great financial interest for community
maintainers. Often, the industry uses basic approaches such as bad words
filtering and regular expression matching to assist the moderators. In this
article, we consider the task of automatically determining if a message is
abusive. This task is complex since messages are written in a non-standardized
way, including spelling errors, abbreviations, community-specific codes...
First, we evaluate the system that we propose using standard features of online
messages. Then, we evaluate the impact of the addition of pre-processing
strategies, as well as original specific features developed for the community
of an online in-browser strategy game. We finally propose to analyze the
usefulness of this wide range of features using feature selection. This work
can lead to two possible applications: 1) automatically flag potentially
abusive messages to draw the moderator's attention on a narrow subset of
messages ; and 2) fully automate the moderation process by deciding whether a
message is abusive without any human intervention
PowerDrive: Accurate De-Obfuscation and Analysis of PowerShell Malware
PowerShell is nowadays a widely-used technology to administrate and manage
Windows-based operating systems. However, it is also extensively used by
malware vectors to execute payloads or drop additional malicious contents.
Similarly to other scripting languages used by malware, PowerShell attacks are
challenging to analyze due to the extensive use of multiple obfuscation layers,
which make the real malicious code hard to be unveiled. To the best of our
knowledge, a comprehensive solution for properly de-obfuscating such attacks is
currently missing. In this paper, we present PowerDrive, an open-source, static
and dynamic multi-stage de-obfuscator for PowerShell attacks. PowerDrive
instruments the PowerShell code to progressively de-obfuscate it by showing the
analyst the employed obfuscation steps. We used PowerDrive to successfully
analyze thousands of PowerShell attacks extracted from various malware vectors
and executables. The attained results show interesting patterns used by
attackers to devise their malicious scripts. Moreover, we provide a taxonomy of
behavioral models adopted by the analyzed codes and a comprehensive list of the
malicious domains contacted during the analysis
Using HTML5 to Prevent Detection of Drive-by-Download Web Malware
The web is experiencing an explosive growth in the last years. New
technologies are introduced at a very fast-pace with the aim of narrowing the
gap between web-based applications and traditional desktop applications. The
results are web applications that look and feel almost like desktop
applications while retaining the advantages of being originated from the web.
However, these advancements come at a price. The same technologies used to
build responsive, pleasant and fully-featured web applications, can also be
used to write web malware able to escape detection systems. In this article we
present new obfuscation techniques, based on some of the features of the
upcoming HTML5 standard, which can be used to deceive malware detection
systems. The proposed techniques have been experimented on a reference set of
obfuscated malware. Our results show that the malware rewritten using our
obfuscation techniques go undetected while being analyzed by a large number of
detection systems. The same detection systems were able to correctly identify
the same malware in its original unobfuscated form. We also provide some hints
about how the existing malware detection systems can be modified in order to
cope with these new techniques.Comment: This is the pre-peer reviewed version of the article: \emph{Using
HTML5 to Prevent Detection of Drive-by-Download Web Malware}, which has been
published in final form at \url{http://dx.doi.org/10.1002/sec.1077}. This
article may be used for non-commercial purposes in accordance with Wiley
Terms and Conditions for Self-Archivin
Defeating Opaque Predicates Statically through Machine Learning and Binary Analysis
International audienceWe present a new approach that bridges binary analysis techniques with machine learning classification for the purpose of providing a static and generic evaluation technique for opaque predicates, regardless of their constructions. We use this technique as a static automated deobfuscation tool to remove the opaque predicates introduced by obfuscation mechanisms. According to our experimental results, our models have up to 98% accuracy at detecting and deob-fuscating state-of-the-art opaque predicates patterns. By contrast, the leading edge deobfuscation methods based on symbolic execution show less accuracy mostly due to the SMT solvers constraints and the lack of scalability of dynamic symbolic analyses. Our approach underlines the efficiency of hybrid symbolic analysis and machine learning techniques for a static and generic deobfuscation methodology
Sciduction: Combining Induction, Deduction, and Structure for Verification and Synthesis
Even with impressive advances in automated formal methods, certain problems
in system verification and synthesis remain challenging. Examples include the
verification of quantitative properties of software involving constraints on
timing and energy consumption, and the automatic synthesis of systems from
specifications. The major challenges include environment modeling,
incompleteness in specifications, and the complexity of underlying decision
problems.
This position paper proposes sciduction, an approach to tackle these
challenges by integrating inductive inference, deductive reasoning, and
structure hypotheses. Deductive reasoning, which leads from general rules or
concepts to conclusions about specific problem instances, includes techniques
such as logical inference and constraint solving. Inductive inference, which
generalizes from specific instances to yield a concept, includes algorithmic
learning from examples. Structure hypotheses are used to define the class of
artifacts, such as invariants or program fragments, generated during
verification or synthesis. Sciduction constrains inductive and deductive
reasoning using structure hypotheses, and actively combines inductive and
deductive reasoning: for instance, deductive techniques generate examples for
learning, and inductive reasoning is used to guide the deductive engines.
We illustrate this approach with three applications: (i) timing analysis of
software; (ii) synthesis of loop-free programs, and (iii) controller synthesis
for hybrid systems. Some future applications are also discussed
Evaluation Methodologies in Software Protection Research
Man-at-the-end (MATE) attackers have full control over the system on which
the attacked software runs, and try to break the confidentiality or integrity
of assets embedded in the software. Both companies and malware authors want to
prevent such attacks. This has driven an arms race between attackers and
defenders, resulting in a plethora of different protection and analysis
methods. However, it remains difficult to measure the strength of protections
because MATE attackers can reach their goals in many different ways and a
universally accepted evaluation methodology does not exist. This survey
systematically reviews the evaluation methodologies of papers on obfuscation, a
major class of protections against MATE attacks. For 572 papers, we collected
113 aspects of their evaluation methodologies, ranging from sample set types
and sizes, over sample treatment, to performed measurements. We provide
detailed insights into how the academic state of the art evaluates both the
protections and analyses thereon. In summary, there is a clear need for better
evaluation methodologies. We identify nine challenges for software protection
evaluations, which represent threats to the validity, reproducibility, and
interpretation of research results in the context of MATE attacks
Deobfuscation of VBScript-based Malware
VBScript je desktopový a webový skriptovací jazyk, který je často využíván škodlivým softwarem. Autoři tohoto software se často snaží skrýt jeho pravou funkcionalitu a zabránit ostatním ve čtení jeho kódu pomocí obfuskací. Tato práce se zaměřuje na analýzu těchto obfuskací, prozkoumání způsobů, jak je překonat, a implementaci nástroje schopného zlepšit čitelnost obfuskovaných programů pomocí statických a dynamických deobfuskačních metod.VBScript is a desktop and web based scripting language that is often used by malicious software. Authors of such sofware often attempt to conceal its true functionality and prevent others from reading the source code by using obfuscations. This thesis focuses on analyzing these obfuscations, exploring ways of reverting them and implementing a tool to improve readability of obfuscated programs using both static and dynamic deobfuscation methods
- …