
    Decrypting The Java Gene Pool: Predicting Objects' Lifetimes with Micro-patterns

    Pretenuring long-lived and immortal objects into infrequently or never collected regions reduces garbage collection costs significantly. However, extant approaches either require computationally expensive, application-specific, off-line profiling, or consider only allocation sites common to all programs, i.e. invoked by the virtual machine rather than application programs. In contrast, we show how a simple program analysis, combined with an object lifetime knowledge bank, can be exploited to match both runtime system and application program structure with object lifetimes. The complexity of the analysis is linear in the size of the program, so it need not be run ahead of time. We obtain performance gains of between 6% and 77% in GC time against a generational copying collector for several SPECjvm98 programs.
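
    As a rough illustration of how such a lifetime knowledge bank might be consulted, the sketch below maps a coarse code pattern to a predicted lifetime class and derives pretenuring advice from it. All identifiers (MicroPattern, LifetimeClass, shouldPretenure) and the example mappings are hypothetical and not taken from the paper; they only show the shape of the lookup, assuming Java 9+ for Map.of.

    import java.util.Map;

    /**
     * Illustrative sketch (not the paper's implementation): a lifetime
     * "knowledge bank" keyed by a coarse code pattern, consulted at
     * class-load or JIT time to decide whether an allocation site should
     * pretenure its objects. All names here are invented for the example.
     */
    public class PretenuringSketch {

        enum MicroPattern { IMMUTABLE_DATA, POOL_LIKE, SINK, PLAIN }
        enum LifetimeClass { SHORT, LONG, IMMORTAL }

        // Toy knowledge bank: pattern -> lifetime observed in previously profiled programs.
        static final Map<MicroPattern, LifetimeClass> BANK = Map.of(
                MicroPattern.IMMUTABLE_DATA, LifetimeClass.LONG,
                MicroPattern.POOL_LIKE,      LifetimeClass.IMMORTAL,
                MicroPattern.SINK,           LifetimeClass.SHORT,
                MicroPattern.PLAIN,          LifetimeClass.SHORT);

        /** Pretenure only sites whose predicted lifetime is long or immortal. */
        static boolean shouldPretenure(MicroPattern p) {
            LifetimeClass lc = BANK.getOrDefault(p, LifetimeClass.SHORT);
            return lc != LifetimeClass.SHORT;
        }

        public static void main(String[] args) {
            System.out.println(shouldPretenure(MicroPattern.IMMUTABLE_DATA)); // true
            System.out.println(shouldPretenure(MicroPattern.SINK));           // false
        }
    }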

    Exploiting the Weak Generational Hypothesis for Write Reduction and Object Recycling

    Programming languages with automatic memory management are continuing to grow in popularity due to ease of programming. However, these languages tend to allocate objects excessively, leading to inefficient use of memory and large garbage collection and allocation overheads. The weak generational hypothesis notes that objects tend to die young in languages with automatic dynamic memory management. Much work has been done to optimize allocation and garbage collection algorithms based on this observation. Previous work has largely focused on developing efficient software algorithms for allocation and collection. However, much less work has studied architectural solutions. In this work, we propose and evaluate architectural support for assisting allocation and garbage collection. We first study the effects of languages with automatic memory management on the memory system. As objects often die young, it is likely many objects die while in the processor's caches. Writes of dead data back to main memory are unnecessary, as the data will never be used again. To study this, we develop and present architecture support to identify dead objects while they remain resident in cache and eliminate any unnecessary writes. We show that many writes out of the caches are unnecessary, and can be avoided using our hardware additions. Next, we study the effects of using dead data in cache to assist with allocation and garbage collection. Logic is developed and presented to allow for reuse of cache space found dead to satisfy future allocation requests. We show that dead cache space can be recycled at a high rate, reducing pressure on the allocator and reducing cache miss rates. However, a full implementation of our initial approach is shown to be unscalable. We propose and study limitations to our approach, trading object coverage for scalability. Third, we present a new approach for identifying objects that die young based on a limitation of our previous approach. We show this approach has much lower storage and logic requirements and is scalable, while only slightly decreasing overall object coverage.
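
    The following is a small software model of the writeback-filtering idea, under the assumption that the hardware can associate a dirty cache line with the object that owns it and can be told when that object dies. It is an illustrative sketch, not the proposed architecture; the class and method names are invented for the example.

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    /**
     * Toy model of the hardware idea described above: if the object owning a
     * dirty cache line is already dead when the line is evicted, the
     * writeback to main memory can be dropped.
     */
    public class DeadLineFilter {

        private final Map<Long, Long> lineToObject = new HashMap<>(); // cache line -> owning object id
        private final Set<Long> deadObjects = new HashSet<>();
        private long avoidedWritebacks = 0;

        void recordWrite(long lineAddr, long objectId) { lineToObject.put(lineAddr, objectId); }

        void objectDied(long objectId) { deadObjects.add(objectId); }

        /** Returns true if the dirty line must still be written back to memory. */
        boolean evict(long lineAddr) {
            Long owner = lineToObject.remove(lineAddr);
            if (owner != null && deadObjects.contains(owner)) {
                avoidedWritebacks++;           // the data will never be read again
                return false;
            }
            return true;
        }

        public static void main(String[] args) {
            DeadLineFilter f = new DeadLineFilter();
            f.recordWrite(0x40L, 1);   // object 1 dirties a cache line
            f.objectDied(1);           // object 1 dies while still cached
            System.out.println(f.evict(0x40L));      // false: writeback avoided
            System.out.println(f.avoidedWritebacks); // 1
        }
    }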

    A Survey of Symbolic Execution Techniques

    Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience. The survey has been accepted for publication at ACM Computing Surveys; this is the authors' pre-print copy. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5Fvc
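
    A toy sketch of the core mechanism follows: each explored path carries a path condition over the symbolic input, and a solver is asked for a concrete witness. The paths are hard-coded for a two-branch example and the "solver" is a naive search over a small integer range standing in for an SMT solver, so this only illustrates the idea, not any of the surveyed engines; all names are invented for the example.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.IntPredicate;

    /**
     * Minimal sketch of symbolic execution's core idea: treat input x as a
     * symbol, consider both sides of each branch, collect the path
     * condition, and ask a "solver" for a concrete witness input.
     */
    public class SymbolicSketch {

        record Path(String description, List<IntPredicate> constraints) {}

        /** Program under test:  if (x > 5) { if (x < 10) bug(); }  */
        static List<Path> explore() {
            List<Path> paths = new ArrayList<>();
            paths.add(new Path("x <= 5 (no bug)", List.of(x -> x <= 5)));
            paths.add(new Path("x > 5 && x >= 10 (no bug)", List.of(x -> x > 5, x -> x >= 10)));
            paths.add(new Path("x > 5 && x < 10 (BUG reached)", List.of(x -> x > 5, x -> x < 10)));
            return paths;
        }

        /** Naive stand-in for an SMT solver: search for any satisfying concrete input. */
        static Integer solve(List<IntPredicate> cs) {
            for (int x = -100; x <= 100; x++) {
                final int v = x;
                if (cs.stream().allMatch(p -> p.test(v))) return v;
            }
            return null; // infeasible path
        }

        public static void main(String[] args) {
            for (Path p : explore()) {
                System.out.println(p.description() + " -> witness: " + solve(p.constraints()));
            }
        }
    }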

    Characterization and reduction of memory usage in 64-bit Java Virtual Machines


    Enhancing Mobile Capacity through Generic and Efficient Resource Sharing

    Mobile computing devices are becoming indispensable in every aspect of human life, but diverse hardware limits keep current mobile devices far from ideal for satisfying the performance requirements of modern mobile applications and for being used anytime, anywhere. Mobile Cloud Computing (MCC) could be a viable way to bypass these limits by enhancing mobile capacity through cooperative resource sharing, but it is challenging due to the heterogeneity of mobile devices in both hardware and software. Traditional schemes either restrict sharing to a specific type of hardware resource within individual applications, which requires tremendous reprogramming effort, or disregard the runtime execution pattern and transmit too much unnecessary data, resulting in bandwidth and energy waste. To address these challenges, we present three novel resource sharing frameworks which utilize the various system resources of a remote or personal cloud to enhance mobile capacity in a generic and efficient manner. First, we propose a novel method-level offloading methodology to run the mobile computational workload on the remote cloud CPU; data transmission during offloading is minimized by identifying and selectively migrating only the memory contexts necessary for the method's execution. Second, we present a systematic framework to maximize the mobile performance of graphics rendering with the remote cloud GPU, in which redundant pixels across consecutive frames are reused to reduce the transmitted frame data. Last, we propose to exploit unified mobile OS services and generically interconnect heterogeneous mobile devices into a personal mobile cloud, in which devices complement each other and flexibly share mobile peripherals (e.g., sensors, camera).
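
    The pixel-reuse idea lends itself to a simple illustration: diff the new frame against the previous one and ship only the changed pixels, letting the client patch its cached copy. The sketch below is an assumption-laden approximation (flat ARGB arrays, per-pixel patches, invented Patch/diff/apply names), not the framework's actual protocol.

    import java.util.ArrayList;
    import java.util.List;

    /** Illustrative frame-differencing sketch for reusing redundant pixels across frames. */
    public class FrameDiff {

        record Patch(int index, int argb) {}

        /** Server side: compute the pixels that differ from the previous frame. */
        static List<Patch> diff(int[] prev, int[] next) {
            List<Patch> patches = new ArrayList<>();
            for (int i = 0; i < next.length; i++) {
                if (prev[i] != next[i]) patches.add(new Patch(i, next[i]));
            }
            return patches;
        }

        /** Client side: apply the patches to the locally cached frame. */
        static void apply(int[] cached, List<Patch> patches) {
            for (Patch p : patches) cached[p.index()] = p.argb();
        }

        public static void main(String[] args) {
            int[] prev = {1, 2, 3, 4};
            int[] next = {1, 9, 3, 4};
            List<Patch> patches = diff(prev, next);
            System.out.println("pixels transmitted: " + patches.size() + " of " + next.length); // 1 of 4
            apply(prev, patches); // prev now equals next
        }
    }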

    Achieving High Performance and High Productivity in Next Generational Parallel Programming Languages

    Processor design has turned toward parallelism and heterogeneous cores to achieve performance and energy efficiency. Developers find high-level languages attractive because they use abstraction to offer productivity and portability over hardware complexities. To achieve performance, some modern implementations of high-level languages use work-stealing scheduling for load balancing of dynamically created tasks. Work-stealing is a promising approach for effectively exploiting software parallelism on parallel hardware. A programmer who uses work-stealing explicitly identifies potential parallelism and the runtime then schedules work, keeping otherwise idle hardware busy while relieving overloaded hardware of its burden. However, work-stealing comes with substantial overheads. These overheads arise as a necessary side effect of the implementation and hamper parallel performance. In addition to runtime-imposed overheads, there is a substantial cognitive load associated with ensuring that parallel code is data-race free. This dissertation explores the overheads associated with achieving high-performance parallelism in modern high-level languages. My thesis is that, by exploiting existing underlying mechanisms of managed runtimes and by extending existing language design, high-level languages will be able to deliver productivity and parallel performance at the levels necessary for widespread uptake. The key contributions of my thesis are: 1) a detailed analysis of the key sources of overhead associated with a work-stealing runtime, namely sequential and dynamic overheads; 2) novel techniques to reduce these overheads that use rich features of managed runtimes such as the yieldpoint mechanism, on-stack replacement, dynamic code patching, exception handling support, and return barriers; 3) a comprehensive analysis of the resulting benefits, which demonstrates that work-stealing overheads can be significantly reduced, leading to substantial performance improvements; and 4) a small set of language extensions that achieve both high performance and high productivity with minimal programmer effort. A managed runtime forms the backbone of any modern implementation of a high-level language. Managed runtimes enjoy the benefits of a long history of research and their implementations are highly optimized. My thesis demonstrates that converging these highly optimized features with the expressiveness of high-level languages gives further hope for achieving high performance and high productivity on modern parallel hardware.
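
    For readers unfamiliar with the programming model, the example below uses Java's ForkJoinPool, the JDK's work-stealing scheduler, to expose dynamically created tasks that idle workers can steal. It illustrates only the model the dissertation targets, not the dissertation's runtime modifications (yieldpoints, on-stack replacement, return barriers, and so on).

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    /**
     * Minimal work-stealing example: the programmer exposes potential
     * parallelism with fork()/join(); idle workers steal queued subtasks.
     */
    public class WorkStealingSum extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10_000;
        private final long[] data;
        private final int lo, hi;

        WorkStealingSum(long[] data, int lo, int hi) {
            this.data = data; this.lo = lo; this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= THRESHOLD) {              // small enough: run sequentially
                long sum = 0;
                for (int i = lo; i < hi; i++) sum += data[i];
                return sum;
            }
            int mid = (lo + hi) >>> 1;
            WorkStealingSum left = new WorkStealingSum(data, lo, mid);
            WorkStealingSum right = new WorkStealingSum(data, mid, hi);
            left.fork();                              // make the left half stealable
            long r = right.compute();                 // work on the right half locally
            return r + left.join();
        }

        public static void main(String[] args) {
            long[] data = new long[1_000_000];
            for (int i = 0; i < data.length; i++) data[i] = i;
            long sum = ForkJoinPool.commonPool().invoke(new WorkStealingSum(data, 0, data.length));
            System.out.println(sum); // 499999500000
        }
    }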

    Cyber-security protection techniques to mitigate memory errors exploitation

    Practical experience in software engineering has demonstrated that the goal of building totally fault-free software systems, although desirable, is impossible to achieve. Therefore, it is necessary to incorporate mitigation techniques in the deployed software in order to reduce the impact of latent faults. This thesis makes contributions to three memory corruption mitigation techniques: the stack smashing protector (SSP), address space layout randomisation (ASLR) and automatic software diversification. The SSP is a very effective protection technique against stack buffer overflows, but it is prone to brute force attacks, particularly the dangerous byte-for-byte attack. A novel modification, named RenewSSP, has been proposed which eliminates brute force attacks, can be used in a completely transparent way with existing software and has negligible overheads. There are two kinds of application for which RenewSSP is especially beneficial: networking servers (tested in Apache) and application launchers (tested on Android). ASLR is a generic concept with multiple designs and implementations. In this thesis, the two most relevant ASLR implementations of Linux have been analysed (Vanilla Linux and the PaX patch), and several weaknesses have been found. Taking into account technological improvements in execution support (compilers and libraries), a new ASLR design has been proposed, named ASLR-NG, which maximises entropy, effectively addresses the fragmentation issue and removes a number of identified weaknesses. Furthermore, ASLR-NG is transparent to applications, in that it preserves binary code compatibility and does not add overheads. ASLR-NG has been implemented as a patch to the Linux kernel 4.1. Software diversification is a technique that covers a wide range of faults, including memory errors. The main problem is how to create variants, i.e. programs which behave identically on normal inputs but in which faults manifest differently. A novel form of automatic variant generation has been proposed, using multiple cross-compiler suites and processor emulators. One of the main goals of this thesis is to produce applicable results; therefore, particular emphasis has been placed on the development of real prototypes in parallel with the theoretical study. The results of this thesis are directly applicable to real systems; in fact, some of the results have already been included in real-world products. Marco Gisbert, H. (2015). Cyber-security protection techniques to mitigate memory errors exploitation [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/57806
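
    The arithmetic that makes the byte-for-byte attack so dangerous, and canary renewal so effective, can be sketched in a few lines: against a forking server that reuses the same canary in every child, each byte can be confirmed independently, collapsing the search space from 256^n to at most 256*n guesses. The snippet below only illustrates this well-known background calculation; it is not part of RenewSSP.

    /**
     * Back-of-the-envelope arithmetic behind the byte-for-byte attack that
     * motivates RenewSSP (illustrative only).
     */
    public class CanaryGuessCost {
        public static void main(String[] args) {
            int unknownBytes = 8;                      // e.g. a 64-bit canary
            double wholeCanary = Math.pow(256, unknownBytes);
            long byteForByte = 256L * unknownBytes;
            System.out.printf("guess all at once : up to %.0f attempts%n", wholeCanary);
            System.out.printf("byte-for-byte     : up to %d attempts%n", byteForByte);
            // Renewing the canary in each child (the RenewSSP idea) removes the
            // per-byte feedback, pushing the attacker back to the first cost.
        }
    }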

    A generational and incremental garbage collector handling cycles and large objects using delimited frames

    In recent years, research has been conducted on several techniques related to garbage collection, and several findings central to copying garbage collection have been made. However, improvements are still possible. In this thesis, we introduce new techniques and new algorithms to improve garbage collection. In particular, we introduce a technique that uses delimited frames to mark and trace root pointers. This technique enables efficient computation of the root set. It reuses concepts from two existing techniques, card marking and remembered sets, and uses a bidirectional object layout to improve on them by keeping the memory overhead stable and reducing the work required when tracing pointers. We also present an algorithm for recursively marking reachable objects without using a stack (eliminating the usual memory waste), and we adapt it to implement a depth-first copying garbage collector that improves heap locality. We improve the older-first garbage collection algorithm and its generational version by adding a marking phase that guarantees the collection of all garbage, including cyclic structures spread across several windows. Finally, we introduce a technique for managing large objects. To test our ideas, we designed and implemented a portable and extensible garbage collection framework in the free Java virtual machine SableVM, and within it we implemented semi-space, older-first and generational collection algorithms. Our experiments show that the delimited-frame technique delivers competitive performance on several benchmarks. They also show that, for most benchmarks, our depth-first traversal algorithm improves locality and thus increases performance. Our overall performance measurements show that, using our techniques, a garbage collector can deliver competitive performance and outperform existing garbage collectors on several benchmarks. ______________________________________________________________________________ AUTHOR KEYWORDS: Garbage Collector, Virtual Machine, Java, SableVM
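
    For context, the sketch below shows the textbook remembered-set write barrier that card marking and remembered sets revolve around and that the delimited-frame technique refines: stores that create old-to-young pointers are recorded so a young collection can treat those slots as extra roots without scanning the whole old generation. The Obj class, single reference slot, and method names are simplifications invented for the example, not the thesis's implementation.

    import java.util.HashSet;
    import java.util.Set;

    /** Sketch of the classic remembered-set write barrier used by generational collectors. */
    public class RememberedSetSketch {

        static class Obj {
            final boolean old;   // which generation the object lives in
            Obj field;           // a single reference slot, for simplicity
            Obj(boolean old) { this.old = old; }
        }

        private final Set<Obj> rememberedSet = new HashSet<>();

        /** Write barrier: executed on every reference store src.field = value. */
        void writeField(Obj src, Obj value) {
            src.field = value;
            if (value != null && src.old && !value.old) {
                rememberedSet.add(src);   // old-to-young pointer: remember the source
            }
        }

        public static void main(String[] args) {
            RememberedSetSketch heap = new RememberedSetSketch();
            Obj oldObj = new Obj(true);
            Obj youngObj = new Obj(false);
            heap.writeField(oldObj, youngObj);
            // At the next young collection, oldObj's slot is scanned as an extra root.
            System.out.println(heap.rememberedSet.size()); // 1
        }
    }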

    Workload characterization of JVM languages

    Being developed with a single language in mind, namely Java, the Java Virtual Machine (JVM) is nowadays targeted by numerous programming languages. Automatic memory management, Just-In-Time (JIT) compilation, and adaptive optimizations provided by the JVM make it an attractive target for different language implementations. Even though it is targeted by so many languages, the JVM has been tuned with respect to the characteristics of Java programs only -- heuristics for the garbage collector and compiler optimizations are focused on Java programs. In this dissertation, we aim to contribute to the understanding of the workloads imposed on the JVM by both dynamically-typed and statically-typed JVM languages. We introduce a new set of dynamic metrics and an easy-to-use toolchain for collecting them. We apply our toolchain to applications written in six JVM languages -- Java, Scala, Clojure, Jython, JRuby, and JavaScript. We identify differences and commonalities between the examined languages and discuss their implications. Moreover, we take a close look at one of the most effective compiler optimizations -- method inlining. We present the decision tree of the HotSpot JVM's JIT compiler and analyze how well the JVM performs in inlining the workloads written in different JVM languages.
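
    HotSpot's inlining decisions can be observed directly with the JVM's diagnostic flags, which is a convenient way to see the decision tree in action. The class below is just an illustrative hot loop (not part of the dissertation's toolchain); the flags -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining are standard HotSpot options that print, per call site, whether and why a method was inlined.

    /**
     * A small workload for observing HotSpot inlining decisions. Run with:
     *   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineProbe
     * and the JIT reports each call site, e.g. "inline (hot)" or a reason
     * for refusing such as "too large".
     */
    public class InlineProbe {

        // Small and frequently called: a typical inlining candidate.
        private static int square(int x) {
            return x * x;
        }

        public static void main(String[] args) {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) {   // enough iterations to trigger the JIT
                sum += square(i % 1000);
            }
            System.out.println(sum);
        }
    }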