763 research outputs found

    On the possibility of practically obfuscating programs - Towards a unified perspective of code protection

    Get PDF
    International audienceBarak et al. gave a first formalization of obfuscation, describing an obfuscator OO as an efficient, probabilistic "compiler" that takes in input a program PP and produces a new program O(P)O(P) that has the same functionality as PP but is unintelligible. This mean that any result an obfuscated program can compute is actually computable given only an input/output access (called oracle access) to the program PP: we call such results trivial results. On the basis of this informal definition, they suggest a formal definition of obfuscation based on oracle access to programs and show that no obfuscator can exist according to this definition. They also try to relax the definition and show that, even with a restriction to some common classes of programs, there exists no obfuscator. In this work, we show that their definition is inaccurate and lacks a fundamental property, that we formalize by the notion of oracle programs. Oracle programs are an abstract notion which basically refers to perfectly obfuscated programs. We suggest a new definition of obfuscation based on these oracle programs and show that such obfuscators do not exist either. Considering the actual implementations of "obfuscators", we define a new kind of obfuscators, ττ-obfuscators. These are obfuscators that hide non trivial results at least for time ττ. By restricting the ττ-requirement to deobfuscation (that is outputting an intelligible program when fed with an obfuscated program in input), we show that such obfuscators do exist. Practical ττ-obfuscation methods are presented at the end of this paper: we focus more specifically on code protection techniques in a malware context. Based on the fact that a malware may fulfill its action in an amount of time which may be far larger than the analysis time of any automated detection program, these obfuscation methods can be considered as efficient enough to greatly thwart automated analysis and put check on any antivirus software

    Control Flow Graphs as Malware Signatures

    Get PDF
    International audienceThis study proposes a malware detection strategy based on control flow graphs. It carries out experiments to evaluates the false-positive ratios of the proposed methods. Moreover, it presents some insight to establish detection methods sound with respect to some obfuscation techniques

    Evaluation Methodologies in Software Protection Research

    Full text link
    Man-at-the-end (MATE) attackers have full control over the system on which the attacked software runs, and try to break the confidentiality or integrity of assets embedded in the software. Both companies and malware authors want to prevent such attacks. This has driven an arms race between attackers and defenders, resulting in a plethora of different protection and analysis methods. However, it remains difficult to measure the strength of protections because MATE attackers can reach their goals in many different ways and a universally accepted evaluation methodology does not exist. This survey systematically reviews the evaluation methodologies of papers on obfuscation, a major class of protections against MATE attacks. For 572 papers, we collected 113 aspects of their evaluation methodologies, ranging from sample set types and sizes, over sample treatment, to performed measurements. We provide detailed insights into how the academic state of the art evaluates both the protections and analyses thereon. In summary, there is a clear need for better evaluation methodologies. We identify nine challenges for software protection evaluations, which represent threats to the validity, reproducibility, and interpretation of research results in the context of MATE attacks

    Morphological Detection of Malware

    Get PDF
    International audienceIn the field of malware detection, method based on syntactical consideration are usually efficient. However, they are strongly vulnerable to obfuscation techniques. This study proposes an efficient construction of a morphological malware detector based on a syntactic and a semantic analysis, technically on control flow graphs of programs (CFG). Our construction employs tree automata techniques to provide an efficient representation of the CFG database. Next, we deal with classic obfuscation of programs by mutation using a generic graph rewriting engine. Finally, we carry out experiments to evaluate the false-positive ratio of the proposed methods

    Control Flow to Detect Malware

    Get PDF
    International audienceThis study proposes a malware detection strategy based on control flow. It consists in searching in the control flow graph of the analysed program for an induced sub-graph which corresponds to the control flow graphs of a malicious program. The resulting detector is build over a strong theoretical framework.Finally, experiments are carried out in order to evaluates the proposed detection strategy

    The Effect of Code Obfuscation on Authorship Attribution of Binary Computer Files

    Get PDF
    In many forensic investigations, questions linger regarding the identity of the authors of the software specimen. Research has identified methods for the attribution of binary files that have not been obfuscated, but a significant percentage of malicious software has been obfuscated in an effort to hide both the details of its origin and its true intent. Little research has been done around analyzing obfuscated code for attribution. In part, the reason for this gap in the research is that deobfuscation of an unknown program is a challenging task. Further, the additional transformation of the executable file introduced by the obfuscator modifies or removes features from the original executable that would have been used in the author attribution process. Existing research has demonstrated good success in attributing the authorship of an executable file of unknown provenance using methods based on static analysis of the specimen file. With the addition of file obfuscation, static analysis of files becomes difficult, time consuming, and in some cases, may lead to inaccurate findings. This paper presents a novel process for authorship attribution using dynamic analysis methods. A software emulated system was fully instrumented to become a test harness for a specimen of unknown provenance, allowing for supervised control, monitoring, and trace data collection during execution. This trace data was used as input into a supervised machine learning algorithm trained to identify stylometric differences in the specimen under test and provide predictions on who wrote the specimen. The specimen files were also analyzed for authorship using static analysis methods to compare prediction accuracies with prediction accuracies gathered from this new, dynamic analysis based method. Experiments indicate that this new method can provide better accuracy of author attribution for files of unknown provenance, especially in the case where the specimen file has been obfuscated

    Exploiting loop transformations for the protection of software

    Get PDF
    Il software conserva la maggior parte del know-how che occorre per svilupparlo. Poich\ue9 oggigiorno il software pu\uf2 essere facilmente duplicato e ridistribuito ovunque, il rischio che la propriet\ue0 intellettuale venga violata su scala globale \ue8 elevato. Una delle pi\uf9 interessanti soluzioni a questo problema \ue8 dotare il software di un watermark. Ai watermark si richiede non solo di certificare in modo univoco il proprietario del software, ma anche di essere resistenti e pervasivi. In questa tesi riformuliamo i concetti di robustezza e pervasivit\ue0 a partire dalla semantica delle tracce. Evidenziamo i cicli quali costrutti di programmazione pervasivi e introduciamo le trasformazioni di ciclo come mattone di costruzione per schemi di watermarking pervasivo. Passiamo in rassegna alcune fra tali trasformazioni, studiando i loro principi di base. Infine, sfruttiamo tali principi per costruire una tecnica di watermarking pervasivo. La robustezza rimane una difficile, quanto affascinante, questione ancora da risolvere.Software retains most of the know-how required fot its development. Because nowadays software can be easily cloned and spread worldwide, the risk of intellectual property infringement on a global scale is high. One of the most viable solutions to this problem is to endow software with a watermark. Good watermarks are required not only to state unambiguously the owner of software, but also to be resilient and pervasive. In this thesis we base resiliency and pervasiveness on trace semantics. We point out loops as pervasive programming constructs and we introduce loop transformations as the basic block of pervasive watermarking schemes. We survey several loop transformations, outlining their underlying principles. Then we exploit these principles to build some pervasive watermarking techniques. Resiliency still remains a big and challenging open issue

    Garbling Schemes and Applications

    Get PDF
    The topic of this thesis is garbling schemes and their applications. A garbling scheme is a set of algorithms for realizing secure two-party computation. A party called a client possesses a private algorithm as well as a private input and would like to compute the algorithm with this input. However, the client might not have enough computational resources to evaluate the function with the input on his own. The client outsources the computation to another party, called an evaluator. Since the client wants to protect the algorithm and the input, he cannot just send the algorithm and the input to the evaluator. With a garbling scheme, the client can protect the privacy of the algorithm, the input and possibly also the privacy of the output. The increase in network-based applications has arisen concerns about the privacy of user data. Therefore, privacy-preserving or privacy-enhancing techniques have gained interest in recent research. Garbling schemes seem to be an ideal solution for privacy-preserving applications. First of all, secure garbling schemes hide the algorithm and its input. Secondly, garbling schemes are known to have efficient implementations. In this thesis, we propose two applications utilizing garbling schemes. The first application provides privacy-preserving electronic surveillance. The second application extends electronic surveillance to more versatile monitoring, including also health telemetry. This kind of application would be ideal for assisted living services. In this work, we also present theoretical results related to garbling schemes. We present several new security definitions for garbling schemes which are of practical use. Traditionally, the same garbled algorithm can be evaluated once with garbled input. In applications, the same function is often evaluated several times with different inputs. Recently, a solution based on fully homomorphic encryption provides arbitrarily reusable garbling schemes. The disadvantage in this approach is that the arbitrary reuse cannot be efficiently implemented due to the inefficiency of fully homomorphic encryption. We propose an alternative approach. Instead of arbitrary reusability, the same garbled algorithm could be used a limited number of times. This gives us a set of new security classes for garbling schemes. We prove several relations between new and established security definitions. As a result, we obtain a complex hierarchy which can be represented as a product of three directed graphs. The three graphs in turn represent the different flavors of security: the security notion, the security model and the level of reusability. In addition to defining new security classes, we improve the definition of side-information function, which has a central role in defining the security of a garbling scheme. The information allowed to be leaked by the garbled algorithm and the garbled input depend on the representation of the algorithm. The established definition of side-information models the side-information of circuits perfectly but does not model side-information of Turing machines as well. The established model requires that the length of the argument, the length of the final result and the length of the function can be efficiently computable from the side-information function. Moreover, the side-information depends only on the function. In other words, the length of the argument, the length of the final result and the length of the function should only depend on the function. For circuits this is a natural requirement since the number of input wires tells the size of the argument, the number of output wires tells the size of the final result and the number of gates and wires tell the size of the function. On the other hand, the description of a Turing machine does not set any limitation to the size of the argument. Therefore, side-information that depends only on the function cannot provide information about the length of the argument. To tackle this problem, we extend the model of side-information so that side-information depends on both the function and the argument. The new model of side information allows us to define new security classes. We show that the old security classes are compatible with the new model of side-information. We also prove relations between the new security classes.Tämä väitöskirja käsittelee garblausskeemoja ja niiden sovelluksia. Garblausskeema on työkalu, jota käytetään turvallisen kahden osapuolen laskennan toteuttamiseen. Asiakas pitää hallussaan yksityistä algoritmia ja sen yksityistä syötettä, joilla hän haluaisi suorittaa tietyn laskennan. Asiakkaalla ei välttämättä ole riittävästi laskentatehoa, minkä vuoksi hän ei pysty suorittamaan laskentaa itse, vaan joutuu ulkoistamaan laskennan toiselle osapuolelle, palvelimelle. Koska asiakas tahtoo suojella algoritmiaan ja syötettään, hän ei voi vain lähettää niitä palvelimen laskettavaksi. Asiakas pystyy suojelemaan syötteensä ja algoritminsa yksityisyyttä käyttämällä garblausskeemaa. Verkkopohjaisten sovellusten kasvu on herättänyt huolta käyttäjien datan yksityisyyden turvasta. Siksi yksityisyyden säilyttävien tai yksityisyyden suojaa lisäävien tekniikoiden tutkimus on saanut huomiota. Garblaustekniikan avulla voidaan suojata sekä syöte että algoritmi. Lisäksi garblaukselle tiedetään olevan useita tehokkaita toteutuksia. Näiden syiden vuoksi garblausskeemat ovat houkutteleva tekniikka käytettäväksi yksityisyyden säilyttävien sovellusten toteutuksessa. Tässä työssä esittelemme kaksi sovellusta, jotka hyödyntävät garblaustekniikkaa. Näistä ensimmäinen on yksityisyyden säilyttävä sähköinen seuranta. Toinen sovellus laajentaa seurantaa monipuolisempaan monitorointiin, kuten terveyden kaukoseurantaan. Tästä voi olla hyötyä etenkin kotihoidon palveluille. Tässä työssä esitämme myös teoreettisia tuloksia garblausskeemoihin liittyen. Esitämme garblausskeemoille uusia turvallisuusmääritelmiä, joiden tarve kumpuaa käytännön sovelluksista. Perinteisen määritelmän mukaan samaa garblattua algoritmia voi käyttää vain yhdellä garblatulla syötteellä laskemiseen. Käytännössä kuitenkin samaa algoritmia käytetään usean eri syötteen evaluoimiseen. Hiljattain on esitetty tähän ongelmaan ratkaisu, joka perustuu täysin homomorfiseen salaukseen. Tämän ratkaisun ansiosta samaa garblattua algoritmia voi turvallisesti käyttää mielivaltaisen monta kertaa. Ratkaisun haittapuoli kuitenkin on, ettei sille ole tiedossa tehokasta toteutusta, sillä täysin homomorfiseen salaukseen ei ole vielä onnistuttu löytämään sellaista. Esitämme vaihtoehtoisen näkökulman: sen sijaan, että samaa garblattua algoritmia voisi käyttää mielivaltaisen monta kertaa, sitä voikin käyttää vain tietyn, ennalta rajatun määrän kertoja. Tämä näkökulman avulla voidaan määritellä lukuisia uusia turvallisuusluokkia. Todistamme useita relaatioita uusien ja vanhojen turvallisuusmääritelmien välillä. Relaatioiden avulla garblausskeemojen turvallisuusluokille saadaan muodostettua hierarkia, joka koostuu kolmesta komponentista. Tieto, joka paljastuu garblatusta algoritmista tai garblatusta syötteestä riippuu siitä, millaisessa muodossa algoritmi on esitetty, kutsutaan sivutiedoksi. Vakiintunut määritelmä mallintaa loogisen piiriin liittyvää sivutietoa täydellisesti, mutta ei yhtä hyvin Turingin koneeseen liittyvää sivutietoa. Tämä johtuu siitä, että jokainen yksittäinen looginen piiri asettaa syötteensä pituudelle rajan, mutta yksittäisellä Turingin koneella vastaavanlaista rajoitusta ei ole. Parannamme sivutiedon määritelmää, jolloin tämä ongelma poistuu. Uudenlaisen sivutiedon avulla voidaan määritellä uusia turvallisuusluokkia. Osoitamme, että vanhat turvallisuusluokat voidaan esittää uudenkin sivutiedon avulla. Todistamme myös relaatioita uusien luokkien välillä.Siirretty Doriast
    corecore