836 research outputs found

    Strategies for protecting intellectual property when using CUDA applications on graphics processing units

    Get PDF
    Recent advances in the massively parallel computational abilities of graphical processing units (GPUs) have increased their use for general purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse-engineering of software is of great importance within the security and forensics elds, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Uni ed Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the di erent binary output formats which may be encountered from the CUDA compiler, and their implications on reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering

    Combinator evaluation of functional programs with logical variables

    Get PDF
    technical reportA technique is presented that brings logical variables into the scope of the well known Turner method for evaluating normal order functioned programs by S, K, I combinator graph reduction. This extension is illustrated by SASL+LV, an extension of Turner's language SASL in which general expressions serve as formal parameters, and parameter passage is done by unification. The conceptual and practical advantages of such an extension are discussed, as well as semantic pitfalls that arise from the attendant weakening of referential transparency. Only four new combinators (LV, BV, FN and UNIFY) are introduced. The resulting object code is fully upward compatible in the sense that previously compiled SASL object code remains executable with unchanged semantics. However, "read-only" variable usage in SASL-f LV programs requires a "multi-tasking" extension of the customary stack-based evaluation method. Mechanisms are presented for managing this multi-tasking on both single and multi-processor systems. Finally, directions are examined for applying this technique to implementations involving larger granularity combinators, and fuller semantic treatment of logical variables (e.g. accommodation of failing unifications)

    Emulating and evaluating hybrid memory for managed languages on NUMA hardware

    Get PDF
    Non-volatile memory (NVM) has the potential to become a mainstream memory technology and challenge DRAM. Researchers evaluating the speed, endurance, and abstractions of hybrid memories with DRAM and NVM typically use simulation, making it easy to evaluate the impact of different hardware technologies and parameters. Simulation is, however, extremely slow, limiting the applications and datasets in the evaluation. Simulation also precludes critical workloads, especially those written in managed languages such as Java and C#. Good methodology embraces a variety of techniques for evaluating new ideas, expanding the experimental scope, and uncovering new insights. This paper introduces a platform to emulate hybrid memory for managed languages using commodity NUMA servers. Emulation complements simulation but offers richer software experimentation. We use a thread-local socket to emulate DRAM and a remote socket to emulate NVM. We use standard C library routines to allocate heap memory on the DRAM and NVM sockets for use with explicit memory management or garbage collection. We evaluate the emulator using various configurations of write-rationing garbage collectors that improve NVM lifetimes by limiting writes to NVM, using 15 applications and various datasets and workload configurations. We show emulation and simulation confirm each other's trends in terms of writes to NVM for different software configurations, increasing our confidence in predicting future system effects. Emulation brings novel insights, such as the non-linear effects of multi-programmed workloads on NVM writes, and that Java applications write significantly more than their C++ equivalents. We make our software infrastructure publicly available to advance the evaluation of novel memory management schemes on hybrid memories

    A Monitoring System for Runtime Adaptations of Streaming Applications

    Get PDF
    International audienceStreaming languages are adequate for expressing many applications quite naturally and have been proven to be a good approach for taking advantage of the intrinsic parallelism of modern CPU architectures. While numerous works focus on improving the throughput of streaming programs, we rather focus on satisfying quality-of-service requirements of streaming applications executed alongside non-streaming processes. We monitor synchronous dataflow (SDF) programs at runtime both at the application and system levels in order to identify violations of quality-of-service requirements. Our monitoring requires the programmer to provide the expected throughput of its application (e.g 25 frames per second for a video decoder), then takes full benefit from the compilation of the SDF graph to detect bottlenecks in this graph and identify causes among processor or memory overloading. It can then be used to perform dynamic adaptations of the applications in order to optimize the use of computing and memory resources

    Studies on register allocation for instruction-level parallel processors

    Get PDF
    制度:新 ; 文部省報告番号:甲2570号 ; 学位の種類:博士(工学) ; 授与年月日:2008/3/15 ; 早大学位記番号:新472
    corecore