Decrypting The Java Gene Pool: Predicting Objects' Lifetimes with Micro-patterns
Pretenuring long-lived and immortal objects into infrequently or never-collected regions reduces garbage collection costs significantly. However, extant approaches either require computationally expensive, application-specific, off-line profiling, or consider only allocation sites common to all programs, i.e., those invoked by the virtual machine rather than by application programs. In contrast, we show how a simple program analysis, combined with an object lifetime knowledge bank, can be exploited to match both runtime-system and application-program structure with object lifetimes. The complexity of the analysis is linear in the size of the program, so it need not be run ahead of time. We obtain performance gains of between 6% and 77% in GC time against a generational copying collector for several SPECjvm98 programs.
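The idea of consulting a lifetime knowledge bank at allocation time can be sketched as follows. This is a minimal illustration, not the paper's implementation: the micro-pattern names, bank contents, and space names are invented for the example.

```python
# Hypothetical sketch of pretenuring driven by a lifetime knowledge bank.
# The bank maps a program-structure feature (here, a micro-pattern of the
# allocated class) to a predicted lifetime class; the allocator then picks
# a region collected at a matching frequency. All names are illustrative.

KNOWLEDGE_BANK = {
    "Immutable": "long-lived",
    "Sink": "short-lived",
    "Pool": "immortal",
}

def choose_space(micro_pattern):
    """Map a predicted lifetime class to an allocation space."""
    lifetime = KNOWLEDGE_BANK.get(micro_pattern, "short-lived")
    return {
        "short-lived": "nursery",       # collected frequently
        "long-lived": "mature space",   # collected infrequently
        "immortal": "immortal space",   # never collected
    }[lifetime]

print(choose_space("Pool"))      # immortal space
print(choose_space("Unknown"))   # nursery (default: assume objects die young)
```

Defaulting unknown patterns to the nursery follows the weak generational hypothesis: absent evidence, assume the object dies young.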
Exploiting the Weak Generational Hypothesis for Write Reduction and Object Recycling
Programming languages with automatic memory management are continuing to grow in popularity due to ease of programming. However, these languages tend to allocate objects excessively, leading to inefficient use of memory and large garbage collection and allocation overheads.
The weak generational hypothesis notes that objects tend to die young in languages with automatic dynamic memory management. Much work has been done to optimize allocation and garbage collection algorithms based on this observation. Previous work has largely focused on developing efficient software algorithms for allocation and collection. However, much less work has studied architectural solutions. In this work, we propose and evaluate architectural support for assisting allocation and garbage collection.
We first study the effects of languages with automatic memory management on the memory system. As objects often die young, it is likely that many objects die while in the processor's caches. Writes of dead data back to main memory are unnecessary, as the data will never be used again. To study this, we develop and present architectural support to identify dead objects while they remain resident in cache and eliminate any unnecessary writes. We show that many writes out of the caches are unnecessary and can be avoided using our hardware additions.
Next, we study the effects of using dead data in cache to assist with allocation and garbage collection. Logic is developed and presented to allow for reuse of cache space found dead to satisfy future allocation requests. We show that dead cache space can be recycled at a high rate, reducing pressure on the allocator and reducing cache miss rates. However, a full implementation of our initial approach is shown to be unscalable. We propose and study limitations to our approach, trading object coverage for scalability.
Third, we present a new approach for identifying objects that die young, based on a limitation of our previous approach. We show that this approach has much lower storage and logic requirements and is scalable, while only slightly decreasing overall object coverage.
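The write-elision idea in the first study can be modeled in a few lines. This is a toy software model of the proposed hardware, under the assumption (mine, for illustration) that the cache tracks which resident objects the collector has declared dead; the class and function names are invented.

```python
# Toy model of dead-on-eviction write elision: if every object occupying
# a dirty cache line is already dead when the line is evicted, the
# write-back to main memory can be dropped, since the data will never be
# read again. All names here are illustrative, not the thesis's design.

class CacheLine:
    def __init__(self):
        self.dirty = False
        self.objects = set()   # object ids resident on this line
        self.dead = set()      # subset known dead (e.g. freed by the GC)

def evict(line, writebacks):
    """Evict a line; return True if a write-back was actually performed."""
    if line.dirty and line.objects - line.dead:
        writebacks.append(line)   # some live data remains: must write back
        return True
    return False                  # all resident data dead: write elided

line = CacheLine()
line.dirty = True
line.objects = {1, 2}
line.dead = {1, 2}               # both objects died while still cached
wb = []
print(evict(line, wb))           # False: the write-back was unnecessary
```

The interesting engineering question, which the abstract's scalability discussion addresses, is how much per-line state the real hardware needs to track object liveness at eviction time.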
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
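The core loop the survey describes, enumerating paths, collecting each path's constraints, and asking a solver for a concrete input, can be shown with a deliberately tiny example. The brute-force search below stands in for a real SMT solver (e.g. Z3), and the toy program is invented for illustration.

```python
# Minimal illustration of symbolic execution: each path through a toy
# program is a list of constraints over the symbolic input x; a "solver"
# (here, brute-force search over a small domain, standing in for an SMT
# solver) produces a concrete input that drives execution down the path.

def toy_program_paths():
    # Paths of:  if x > 10: (if x * 2 == 42: BUG else ok1) else ok2
    yield ([lambda x: x > 10, lambda x: x * 2 == 42], "BUG")
    yield ([lambda x: x > 10, lambda x: x * 2 != 42], "ok1")
    yield ([lambda x: x <= 10], "ok2")

def solve(constraints, domain=range(-100, 100)):
    """Stand-in constraint solver: find any model in a small domain."""
    for x in domain:
        if all(c(x) for c in constraints):
            return x
    return None

for constraints, label in toy_program_paths():
    print(label, "reachable with x =", solve(constraints))
# The BUG path is reported reachable with x = 21, without ever having
# guessed 21 as a random test input.
```

This captures why the technique beats random testing for the backdoor example above: the constraint x * 2 == 42 is satisfied by one input in the whole domain, which a solver finds directly.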
The present survey has been accepted for publication at ACM Computing
Surveys. If you are considering citing this survey, we would appreciate if you
could use the following BibTeX entry: http://goo.gl/Hf5Fvc
Comment: This is the authors' pre-print copy.
Enhancing Mobile Capacity through Generic and Efficient Resource Sharing
Mobile computing devices are becoming indispensable in every aspect of human life, but diverse hardware limitations leave current mobile devices far from ideal for satisfying the performance requirements of modern mobile applications and for use anytime, anywhere. Mobile Cloud Computing (MCC) could be a viable way to bypass these limits by enhancing mobile capacity through cooperative resource sharing, but it is challenging due to the heterogeneity of mobile devices in both hardware and software. Traditional schemes either restrict sharing to a specific type of hardware resource within individual applications, which requires tremendous reprogramming effort, or disregard the runtime execution pattern and transmit too much unnecessary data, resulting in wasted bandwidth and energy. To address these challenges, we present three novel designs of resource-sharing frameworks that utilize the various system resources of a remote or personal cloud to enhance mobile capacity in a generic and efficient manner. First, we propose a novel method-level offloading methodology to run mobile computational workloads on a remote cloud CPU. Data transmission during such offloading is minimized by identifying and selectively migrating only the memory contexts necessary for the method execution. Second, we present a systematic framework that maximizes mobile graphics-rendering performance with a remote cloud GPU, reusing redundant pixels across consecutive frames to reduce the transmitted frame data. Last, we propose to exploit unified mobile OS services and generically interconnect heterogeneous mobile devices into a personal mobile cloud, whose devices complement one another and flexibly share mobile peripherals (e.g., sensors, cameras).
Achieving High Performance and High Productivity in Next Generational Parallel Programming Languages
Processor design has turned toward parallelism and heterogeneous
cores to achieve performance and energy efficiency. Developers
find high-level languages attractive because they use abstraction
to offer productivity and portability over hardware complexities.
To achieve performance, some modern implementations of high-level
languages use work-stealing scheduling for load balancing of
dynamically created tasks. Work-stealing is a promising approach
for effectively exploiting software parallelism on parallel
hardware. A programmer who uses work-stealing explicitly
identifies potential parallelism and the runtime then schedules
work, keeping otherwise idle hardware busy while relieving
overloaded hardware of its burden.
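The scheduling discipline described above can be sketched with a per-worker deque. This single-threaded model is only an illustration of the idea (not this dissertation's runtime) and ignores the synchronisation a real work-stealing deque such as Chase-Lev requires.

```python
# Minimal sketch of work-stealing: each worker pushes and pops tasks at
# the bottom of its own deque, while an idle worker steals from the top
# of a victim's deque. The two ends are deliberately different: the
# owner gets LIFO order (good cache locality), the thief gets the oldest
# task, which in divide-and-conquer workloads is often the largest.

from collections import deque

class Worker:
    def __init__(self):
        self.tasks = deque()

    def push(self, task):
        self.tasks.append(task)                 # bottom: owner's end

    def pop(self):
        return self.tasks.pop() if self.tasks else None

    def steal_from(self, victim):
        return victim.tasks.popleft() if victim.tasks else None

owner, thief = Worker(), Worker()
for i in range(4):
    owner.push(f"task{i}")
print(owner.pop())              # task3: owner works newest-first
print(thief.steal_from(owner))  # task0: thief takes the oldest task
```

The overheads the dissertation targets live precisely in what this sketch omits: the memory fences and atomic operations needed to make concurrent pop and steal safe on every task, even when no steal ever happens.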
However, work-stealing comes with substantial overheads. These
overheads arise as a necessary side effect of the implementation
and hamper parallel performance. In addition to runtime-imposed
overheads, there is a substantial cognitive load associated with
ensuring that parallel code is data-race free. This dissertation
explores the overheads associated with achieving high performance
parallelism in modern high-level languages.
My thesis is that, by exploiting existing underlying mechanisms
of managed runtimes; and by extending existing language design,
high-level languages will be able to deliver productivity and
parallel performance at the levels necessary for widespread
uptake.
The key contributions of my thesis are: 1) a detailed analysis of
the key sources of overhead associated with a work-stealing
runtime, namely sequential and dynamic overheads; 2) novel
techniques to reduce these overheads that use rich features of
managed runtimes such as the yieldpoint mechanism, on-stack
replacement, dynamic code-patching, exception handling support,
and return barriers; 3) comprehensive analysis of the resulting
benefits, which demonstrate that work-stealing overheads can be
significantly reduced, leading to substantial performance
improvements; and 4) a small set of language extensions that
achieve both high performance and high productivity with minimal
programmer effort.
A managed runtime forms the backbone of any modern implementation
of a high-level language. Managed runtimes enjoy the benefits of
a long history of research and their implementations are highly
optimized. My thesis demonstrates that converging these highly
optimized features together with the expressiveness of high-level
languages, gives further hope for achieving high performance and
high productivity on modern parallel hardware.
JVM-based Techniques for Improving Java Observability
Observability measures the support of computer systems to accurately capture, analyze, and present (collectively, observe) the internal information about the systems. Observability frameworks play important roles in program understanding, troubleshooting, performance diagnosis, and optimization. However, traditional solutions are either expensive or coarse-grained, consequently compromising their utility in accommodating today's increasingly complex software systems. New solutions are emerging for VM-based languages due to the full control language VMs have over program executions. Existing solutions of this kind, nonetheless, still lack flexibility, have high overhead, or provide limited context information for developing powerful dynamic analyses. In this thesis, we present a VM-based infrastructure, called marker tracing framework (MTF), to address the deficiencies in the existing solutions for providing better observability for VM-based languages. MTF serves as a solid foundation for implementing fine-grained, low-overhead program instrumentation. Specifically, MTF allows analysis clients to: 1) define custom events with rich semantics; 2) specify precisely the program locations where the events should trigger; and 3) adaptively enable/disable the instrumentation at runtime. In addition, MTF-based analysis clients are more powerful by having access to all information available to the VM. To demonstrate the utility and effectiveness of MTF, we present two analysis clients: 1) dynamic typestate analysis with adaptive online program analysis (AOPA); and 2) selective probabilistic calling context analysis (SPCC). In addition, we evaluate the runtime performance of MTF and the typestate client with the DaCapo benchmarks.
The results show that: 1) MTF has acceptable runtime overhead when tracing moderate numbers of marker events; 2) AOPA is highly effective in reducing the event frequency for the dynamic typestate analysis; and 3) language VMs can be exploited to offer greater observability.
Generational and incremental garbage collection handling cycles and large objects using delimited frames
In recent years, research has been conducted on several techniques related to garbage collection. Several discoveries central to copying garbage collection have been made. However, improvements are still possible. In this thesis, we introduce new techniques and new algorithms to improve garbage collection. In particular, we introduce a technique that uses delimited frames to mark and trace root pointers. This technique enables efficient computation of the root set. It reuses concepts from two existing techniques, card marking and remembered sets, and uses a bidirectional object layout to improve on them, stabilizing the memory overhead and reducing the work required when scanning pointers. We also present an algorithm for recursively marking reachable objects without using a stack (eliminating the usual memory waste). We adapt this algorithm to implement a depth-first copying collector and improve heap locality. We improve the older-first garbage collection algorithm and its generational version by adding a marking phase that guarantees the collection of all garbage, including cyclic structures spanning several windows. Finally, we introduce a technique for managing large objects. To test our ideas, we designed and implemented, in the free Java virtual machine SableVM, a portable and extensible framework for garbage collection. Within this framework, we implemented semi-space, older-first, and generational collection algorithms. Our experiments show that the delimited-frame technique delivers competitive performance on several benchmarks.
They also show that, for most benchmarks, our depth-first traversal algorithm improves locality and thereby increases performance. Our measurements of overall performance show that, using our techniques, a garbage collector can deliver competitive performance and surpass existing garbage collectors on several benchmarks. AUTHOR KEYWORDS: garbage collector, virtual machine, Java, SableVM
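This thesis describes marking reachable objects without an auxiliary stack. A classic technique in that family, not necessarily the thesis's exact algorithm, is Deutsch-Schorr-Waite pointer reversal, sketched below for two-child objects: the traversal temporarily reverses the child pointer it follows so it can find its way back, using O(1) extra space plus a small per-object tag.

```python
# Deutsch-Schorr-Waite pointer-reversal marking (illustrative sketch):
# marks every reachable node without a mark stack. Each node carries a
# small state tag recording which child has been visited; the reversed
# links themselves encode the return path.

class Node:
    def __init__(self):
        self.left = self.right = None
        self.marked = False
        self.state = 0          # 0: unvisited, 1: left done, 2: right done

def mark(root):
    """Mark every node reachable from root, restoring all pointers."""
    prev, cur = None, root
    while cur is not None:
        if cur.state == 0:
            cur.marked = True
            cur.state = 1
            nxt = cur.left
            if nxt is not None and not nxt.marked:
                cur.left = prev            # reverse the link we follow
                prev, cur = cur, nxt
        elif cur.state == 1:
            cur.state = 2
            nxt = cur.right
            if nxt is not None and not nxt.marked:
                cur.right = prev
                prev, cur = cur, nxt
        else:                              # both children done: retreat
            if prev is None:
                return
            if prev.state == 1:            # we arrived via prev.left
                prev.left, cur, prev = cur, prev, prev.left
            else:                          # we arrived via prev.right
                prev.right, cur, prev = cur, prev, prev.right
```

Because nodes already marked are never re-entered, the sketch also terminates on shared subgraphs and cycles, which matters for the cyclic structures this thesis sets out to collect.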
Cyber-security protection techniques to mitigate memory errors exploitation
Practical experience in software engineering has demonstrated that the goal of
building totally fault-free software systems, although desirable, is impossible
to achieve. Therefore, it is necessary to incorporate mitigation techniques in
the deployed software, in order to reduce the impact of latent faults.
This thesis makes contributions to three memory corruption mitigation
techniques: the stack smashing protector (SSP), address space layout
randomisation (ASLR) and automatic software diversification.
The SSP is a very effective protection technique used against stack buffer
overflows, but it is prone to brute force attacks, particularly the dangerous
byte-for-byte attack. A novel modification, named RenewSSP, has been proposed
which eliminates brute force attacks, can be used in a completely transparent
way with existing software and has negligible overheads. There are two
different kinds of application for which RenewSSP is especially beneficial:
networking servers (tested in Apache) and application launchers (tested on
Android).
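The byte-for-byte attack that RenewSSP is designed to defeat can be demonstrated with a small simulation. This is my illustrative model, not the thesis's code: it assumes a forking server whose canary stays fixed across children, so each child's crash (or survival) acts as an oracle for one guessed prefix.

```python
# Why a fixed stack canary in a forking server falls to the
# byte-for-byte attack: each canary byte can be confirmed independently,
# because overwriting only a prefix of the canary crashes the child
# exactly when the prefix is wrong. Recovering an n-byte canary then
# costs at most 256 * n guesses instead of 256**n for guessing it whole.

import os

CANARY = os.urandom(8)   # fixed across forks, as in classic SSP

def child_survives(guess_prefix):
    """Oracle model: the forked child survives iff the prefix is right."""
    return CANARY.startswith(guess_prefix)

recovered = b""
attempts = 0
for _ in range(len(CANARY)):
    for byte in range(256):
        attempts += 1
        if child_survives(recovered + bytes([byte])):
            recovered += bytes([byte])
            break

print(recovered == CANARY)          # True
print(attempts <= 256 * len(CANARY))  # True: linear, not exponential
```

RenewSSP's countermeasure, as the abstract describes it, is to deny the attacker this stable oracle by renewing the reference canary, so confirmed prefixes stop being reusable across attempts.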
ASLR is a generic concept with multiple designs and implementations. In this
thesis, the two most relevant ASLR implementations of Linux have been analysed
(Vanilla Linux and PaX patch), and several weaknesses have been found. Taking
into account technological improvements in execution support (compilers and
libraries), a new ASLR design has been proposed, named ASLR-NG, which
maximises entropy, effectively addresses the fragmentation issue and removes a
number of identified weaknesses. Furthermore, ASLR-NG is transparent to
applications, in that it preserves binary code compatibility and does not add
overheads. ASLR-NG has been implemented as a patch to the Linux kernel 4.1.
Software diversification is a technique that covers a wide range of faults,
including memory errors. The main problem is how to create variants,
i.e. programs which have identical behaviours on normal inputs but
where faults manifest differently. A novel form of automatic variant
generation has been proposed, using multiple cross-compiler suites and
processor emulators.
One of the main goals of this thesis is to create applicable results.
Therefore, I have placed particular emphasis on the development of real
prototypes in parallel with the theoretical study. The results of this thesis
are directly applicable to real systems; in fact, some of the results have
already been included in real-world products.

Marco Gisbert, H. (2015). Cyber-security protection techniques to mitigate memory errors exploitation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/57806
Duplo: A framework for OCaml post-link optimisation
We present a novel framework, Duplo, for the low-level post-link optimisation of OCaml programs, achieving a speedup of 7% and a reduction of at least 15% in the code size of widely-used OCaml applications. Unlike existing post-link optimisers, which typically operate on target-specific machine code, our framework operates on a Low-Level Intermediate Representation (LLIR) capable of representing both the OCaml programs and any C dependencies they invoke through the foreign-function interface (FFI). LLIR is analysed, transformed and lowered to machine code by our post-link optimiser, LLIR-OPT. Most importantly, LLIR allows the optimiser to cross the OCaml-C language boundary, mitigating the overhead incurred by the FFI and enabling analyses and transformations in a previously unavailable context. The optimised IR is then lowered to amd64 machine code through the existing target-specific code generator of LLVM, modified to handle garbage collection just as effectively as the native OCaml backend. We equip our optimiser with a suite of SSA-based transformations and points-to analyses capable of capturing the semantics and representing the memory models of both languages, along with a cross-language inliner to embed C methods into OCaml callers. We evaluate the gains of our framework, which can be attributed to both our optimiser and the more sophisticated amd64 backend of LLVM, on a wide range of widely-used OCaml applications, as well as an existing suite of micro- and macro-benchmarks used to track the performance of the OCaml compiler.
EPSRC EP/P020011/1, Cambridge Trust