24 research outputs found

    Speculative Staging for Interpreter Optimization

    Interpreters have a bad reputation for having lower performance than just-in-time compilers. We present a new way of building high-performance interpreters that is particularly effective for executing dynamically typed programming languages. The key idea is to combine speculative staging of optimized interpreter instructions with a novel technique of incrementally and iteratively concerting them at run-time. This paper introduces the concepts behind deriving optimized instructions from existing interpreter instructions---incrementally peeling off layers of complexity. When compiling the interpreter, these optimized derivatives are compiled along with the original interpreter instructions. Therefore, our technique is portable by construction, since it leverages the existing compiler's backend. At run-time we use instruction substitution, from the interpreter's original and expensive instructions to optimized instruction derivatives, to speed up execution. Our technique unites high performance with the simplicity and portability of interpreters---we report that our optimization makes the CPython interpreter up to more than four times faster; our interpreter closes the gap with, and sometimes even outperforms, PyPy's just-in-time compiler.
    Comment: 16 pages, 4 figures, 3 tables. Uses CPython 3.2.3 and PyPy 1.
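    The run-time instruction substitution described in the abstract can be illustrated with a toy stack interpreter in which a generic ADD instruction rewrites itself into a type-specialized derivative once it observes its operands. This is only an illustrative sketch of the general mechanism (often called quickening); all names are hypothetical and none of this is the paper's actual implementation:

```python
# Toy interpreter illustrating run-time instruction substitution:
# a generic instruction replaces itself with a cheaper, type-specialized
# derivative after observing its operands (and substitutes itself back
# if the speculation later fails).

def op_add_generic(frame, pc):
    a, b = frame.stack.pop(), frame.stack.pop()
    # Speculate: both operands were ints, so install the specialized
    # derivative at this bytecode position for future executions.
    if isinstance(a, int) and isinstance(b, int):
        frame.code[pc] = op_add_int
    frame.stack.append(b + a)

def op_add_int(frame, pc):
    # Optimized derivative: assumes int operands; deoptimizes otherwise.
    a, b = frame.stack.pop(), frame.stack.pop()
    if not (isinstance(a, int) and isinstance(b, int)):
        frame.code[pc] = op_add_generic   # speculation failed
    frame.stack.append(b + a)

class Frame:
    def __init__(self, code, stack):
        self.code, self.stack = code, stack

def run(frame):
    for pc, op in enumerate(frame.code):
        op(frame, pc)
    return frame.stack[-1]

frame = Frame([op_add_generic], [1, 2])
run(frame)                          # first run observes two ints...
assert frame.code[0] is op_add_int  # ...and installs the derivative
```

    The substitution happens in place in the code array, so later executions of the same bytecode position dispatch directly to the cheap derivative, which mirrors how the paper's optimized instruction derivatives replace the original expensive instructions at run-time.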

    List Processing in Real Time on a Serial Computer

    Key Words and Phrases: real-time, compacting, garbage collection, list processing, virtual memory, file or database management, storage management, storage allocation, LISP, CDR-coding, reference counting. CR Categories: 3.50, 3.60, 3.73, 3.80, 4.13, 4.22, 4.32, 4.33, 4.35, 4.49.

    This report describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the laboratory's artificial intelligence research is provided in part by the Advanced Research Projects Agency of the Department of Defense under Office of Naval Research contract N00014-75-C-0522.

    A real-time list processing system is one in which the time required by each elementary list operation (CONS, CAR, CDR, RPLACA, RPLACD, EQ, and ATOM in LISP) is bounded by a (small) constant. Classical list processing systems such as LISP do not have this property because a call to CONS may invoke the garbage collector, which requires time proportional to the number of accessible cells to finish. The space requirement of a classical LISP system with N accessible cells under equilibrium conditions is (1.5+μ)N or (1+μ)N, depending upon whether a stack is required for the garbage collector, where μ>0 is typically less than 2. A list processing system is presented which: 1) is real-time--i.e. T(CONS) is bounded by a constant independent of the number of cells in use; 2) requires space (2+2μ)N, i.e. not more than twice that of a classical system; 3) runs on a serial computer without a time-sharing clock; 4) handles directed cycles in the data structures; 5) is fast--the average time for each operation is about the same as with normal garbage collection; 6) compacts--minimizes the working set; 7) keeps the free pool in one contiguous block--objects of nonuniform size pose no problem; 8) uses one-phase incremental collection--no separate mark, sweep, relocate phases; 9) requires no garbage collector stack; 10) requires no "mark bits", per se; 11) is simple--suitable for microcoded implementation. Extensions of the system to handle a user program stack, compact list representation ("CDR-coding"), arrays of non-uniform size, and hash linking are discussed. CDR-coding is shown to reduce memory requirements for N LISP cells to ≈(1+μ)N. Our system is also compared with another approach to the real-time storage management problem, reference counting, and reference counting is shown to be neither competitive with our system when speed of allocation is critical, nor compatible, in the sense that a system with both forms of garbage collection is worse than our pure one.

    MIT Artificial Intelligence Laboratory; Department of Defense Advanced Research Projects Agency
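    The core of the real-time property is that a constant amount of collector work piggybacks on every CONS. Baker's actual system is an incremental copying collector with a read barrier; the sketch below deliberately substitutes a much simpler incremental mark phase (sweeping omitted, names invented) purely to show how bounding the per-allocation work keeps T(CONS) constant regardless of heap size:

```python
# Toy model of real-time collection: every cons() performs at most K
# steps of tracing work, so allocation time is bounded by a constant
# independent of the number of live cells. (Baker's real system is an
# incremental *copying* collector with a read barrier; this sketch uses
# an incremental mark phase instead and omits sweeping.)

K = 3  # collector steps charged to each allocation (the constant bound)

class Cell:
    def __init__(self, car=None, cdr=None):
        self.car, self.cdr, self.marked = car, cdr, False

class Heap:
    def __init__(self):
        self.cells, self.roots, self.gray = [], [], []

    def start_mark_cycle(self):
        for c in self.cells:
            c.marked = False
        self.gray = list(self.roots)
        for r in self.gray:
            r.marked = True

    def mark_step(self):
        # One bounded unit of tracing: scan a single gray cell.
        if self.gray:
            cell = self.gray.pop()
            for child in (cell.car, cell.cdr):
                if isinstance(child, Cell) and not child.marked:
                    child.marked = True
                    self.gray.append(child)

    def cons(self, car, cdr):
        for _ in range(K):          # constant-bounded GC work per CONS
            self.mark_step()
        cell = Cell(car, cdr)
        self.cells.append(cell)
        return cell
```

    Because the mutator funds the collector a few steps at a time, the mark phase finishes over the course of many allocations instead of in one long proportional-time pause.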

    Garbage Collection Algorithms

    This thesis focuses on an implementation of automatic memory management in the C programming language. The mark-sweep method was modified for use in an uncooperative programming language, which does not share data-type information about the memory slots accessible by the mutator. Due to this fact, decisions on pointer identity are conservative, which guarantees safe collector operation: if a value looks sufficiently like a pointer, it is considered a pointer (although it might not actually be one). Mark bits were moved from objects' headers to bitmaps stored in a separate part of memory, to prevent accidental writes to the user's data by the collector. Finally, the usage of the garbage collector was evaluated in practice.
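    The two mechanisms the abstract describes, the conservative "looks like a pointer" test and mark bits held in a side bitmap rather than in object headers, can be sketched as follows. The heap base, size, and alignment are made-up constants for illustration, not the thesis's values:

```python
# Sketch of conservative pointer identification plus a side mark bitmap.
# (1) Any word whose value falls inside the heap range and is properly
#     aligned is conservatively treated as a pointer.
# (2) Mark bits live in a bitmap outside the heap, so marking never
#     writes into the user's objects.

HEAP_BASE, HEAP_SIZE, ALIGN = 0x10000, 0x8000, 8   # illustrative values

mark_bitmap = bytearray(HEAP_SIZE // ALIGN // 8)   # one bit per slot

def looks_like_pointer(word):
    # Conservative: may claim "pointer" for an integer that merely looks
    # like one, but never misses a genuine in-heap pointer.
    return (HEAP_BASE <= word < HEAP_BASE + HEAP_SIZE
            and word % ALIGN == 0)

def set_mark(addr):
    slot = (addr - HEAP_BASE) // ALIGN
    mark_bitmap[slot // 8] |= 1 << (slot % 8)

def is_marked(addr):
    slot = (addr - HEAP_BASE) // ALIGN
    return bool(mark_bitmap[slot // 8] & (1 << (slot % 8)))
```

    Keeping the bitmap in its own region also means a stray collector bug corrupts at worst the mark state, never the mutator's data.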

    Real-Time Garbage Collection Techniques

    Garbage collection refers to an automatic memory management mechanism in which a garbage collector frees the memory regions allocated by the application that the application no longer references. The central basic techniques of garbage collection are reference counting and tracing collection techniques such as mark-sweep collection and copying collection. In real-time and interactive applications, the execution pauses caused by garbage collection must not be too long. In such applications, collection cannot be implemented as a single atomic operation during which program execution is suspended. Instead, garbage collection can be directed at only a part of the program's memory, or the collection is implemented so that it proceeds concurrently with program execution. True real-time collection techniques schedule the collector's execution so that the pauses caused by collection are precisely predictable. The thesis compared Java garbage collectors under different workloads and with heaps of different sizes. The measurements examined the duration of the benchmark runs, the duration of the collection pauses, and the distribution of the pauses over the program's execution. Significant differences were found between the compared collectors. Java's newer G1 collector performs the marking phase over the whole heap concurrently, and the copying phase is directed at only a small part of the program's memory at a time. In the measurements, the G1 collector was only slightly slower than the older Parallel collector, but the G1 collector's pauses were considerably shorter. When a target duration was set for the G1 collector's pauses, the delays were at most a few tens of milliseconds. With the Shenandoah collector, also included in the comparison, which is designed to guarantee especially short pauses, the delays imposed on program execution were only a few milliseconds.

    Memory Cleanup

    The thesis presents the concept of garbage in computer science, the central concepts and basic methods of garbage collection together with their variants, and modern efficient algorithms. The focus, however, is on the achievements, research topics, and research tools of memory management research in the 2000s. These are exploited by the new CBRC garbage collection algorithm presented in the thesis. In addition, the thesis reviews the programmer's responsibilities under automatic memory management and the garbage-collection-aware facilities available to the programmer in certain programming languages and environments (Java, .NET, C++). Keywords and phrases: garbage collection, memory cleanup, memory management, algorithms, programming languages. CR classes: D.3.4, D.4.2, D.3.

    Fast conservative garbage collection


    An Examination of Deferred Reference Counting and Cycle Detection

    Object-oriented programming languages are becoming increasingly important, as are managed runtime systems. An area of importance in such systems is dynamic automatic memory management. A key function of dynamic automatic memory management is detecting and reclaiming discarded memory regions; this is also referred to as garbage collection. A significant proportion of research has been conducted in the field of memory management, and more specifically garbage collection techniques. In the past, adequate comparisons against a range of competing algorithms and implementations have often been overlooked. JMTk is a flexible memory management toolkit, written in Java, which attempts to provide a testbed for such comparisons. This thesis aims to examine the implementation of one algorithm currently available in JMTk: the deferred reference counter. Other research has shown that the reference counter in JMTk performs poorly both in throughput and responsiveness. Several aspects of the reference counter are tested, including the write barrier, allocation cost, increment and decrement processing, and cycle detection. These examinations found the bump-pointer to be 8% faster than the free-list in raw allocation. The cost of the reference-counting write barrier was determined to be 10% on the PPC architecture and 20% on the i686 architecture. Processing increments in the write barrier was found to be up to 13% faster than buffering them until collection time on a uni-processor platform. Cycle detection was identified as a key area of cost in reference counting. In order to improve the performance of the deferred reference counter and to contribute to the JMTk testbed, a new algorithm for detecting cyclic garbage was described. This algorithm is based on a mark-scan approach to cycle detection. Using this algorithm, two new cycle detectors were implemented and compared to the original trial deletion cycle detector. The semi-concurrent cycle detector had the best throughput, outperforming trial deletion by more than 25% on the javac benchmark. The non-concurrent cycle detector had poor throughput, attributed to poor triggering heuristics. Both new cycle detectors had poor pause times. Even so, the semi-concurrent cycle detector had the lowest pause times on the javac benchmark. The work presented in this thesis contributes an evaluation of components of the reference counter and a comparison between approaches to reference counting implementation. Prior to this work, the cost of the reference counter's components had not been quantified. Additionally, past work presented different approaches to reference counting implementation as a whole, instead of individual components.
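    A deferred reference-counting write barrier of the kind examined here can be sketched as follows: pointer stores log increments and decrements into buffers instead of updating counts immediately, and the buffers are processed in batch at collection time. This is an illustrative sketch, not JMTk's implementation; objects are modeled as dicts and referents as plain keys:

```python
# Sketch of a deferred reference-counting write barrier: the barrier
# buffers count adjustments at each pointer store, and the buffers are
# applied in batch at collection time.

from collections import defaultdict

refcount = defaultdict(int)
inc_buffer, dec_buffer = [], []

def write_barrier(obj, field, new_target):
    old_target = obj.get(field)
    if old_target is not None:
        dec_buffer.append(old_target)   # old referent loses a reference
    if new_target is not None:
        inc_buffer.append(new_target)   # new referent gains one
    obj[field] = new_target

def process_buffers():
    # Applied at collection time; increments are processed first, so an
    # object whose count transiently reaches zero is not reclaimed
    # prematurely.
    for t in inc_buffer:
        refcount[t] += 1
    for t in dec_buffer:
        refcount[t] -= 1
        if refcount[t] == 0:
            pass  # candidate for reclamation (and for cycle detection)
    inc_buffer.clear()
    dec_buffer.clear()
```

    Deferral trades immediacy for throughput: the per-store barrier stays short, and the batched processing at collection time is exactly the component whose cost the thesis measures against processing increments eagerly in the barrier.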

    Exploiting the Weak Generational Hypothesis for Write Reduction and Object Recycling

    Programming languages with automatic memory management are continuing to grow in popularity due to ease of programming. However, these languages tend to allocate objects excessively, leading to inefficient use of memory and large garbage collection and allocation overheads. The weak generational hypothesis notes that objects tend to die young in languages with automatic dynamic memory management. Much work has been done to optimize allocation and garbage collection algorithms based on this observation. Previous work has largely focused on developing efficient software algorithms for allocation and collection. However, much less work has studied architectural solutions. In this work, we propose and evaluate architectural support for assisting allocation and garbage collection. We first study the effects of languages with automatic memory management on the memory system. As objects often die young, it is likely many objects die while in the processor's caches. Writes of dead data back to main memory are unnecessary, as the data will never be used again. To study this, we develop and present architectural support to identify dead objects while they remain resident in cache and eliminate any unnecessary writes. We show that many writes out of the caches are unnecessary and can be avoided using our hardware additions. Next, we study the effects of using dead data in cache to assist with allocation and garbage collection. Logic is developed and presented to allow reuse of cache space found dead to satisfy future allocation requests. We show that dead cache space can be recycled at a high rate, reducing pressure on the allocator and reducing cache miss rates. However, a full implementation of our initial approach is shown to be unscalable. We propose and study limitations to our approach, trading object coverage for scalability. Third, we present a new approach for identifying objects that die young based on a limitation of our previous approach. We show this approach has much lower storage and logic requirements and is scalable, while only slightly decreasing overall object coverage.
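    The write-elimination idea can be modeled in software: a write-back cache that consults a "dead" set (which, in the hardware proposal, would be populated when the allocator frees an object whose line may still be resident) and drops the write-back of dirty lines belonging to dead objects. This is purely an illustrative model with invented structure, not the paper's hardware design:

```python
# Software model of dead-write elimination: dirty cache lines whose
# containing object has already died are dropped on eviction instead of
# being written back, since dead data can never be read again.

class Line:
    def __init__(self, addr):
        self.addr, self.dirty = addr, True

class Cache:
    def __init__(self):
        self.lines = {}        # resident lines, keyed by address
        self.dead = set()      # addresses of lines known to hold dead data
        self.writebacks = 0    # count of writes that reach main memory

    def store(self, addr):
        self.lines[addr] = Line(addr)

    def mark_dead(self, addr):
        # Signaled when the object covering addr is freed while its
        # line may still be resident in the cache.
        self.dead.add(addr)

    def evict(self, addr):
        line = self.lines.pop(addr)
        if line.dirty and addr not in self.dead:
            self.writebacks += 1   # only live data reaches memory
```

    In the same spirit, the paper's recycling extension would hand the dead line back to the allocator instead of merely suppressing the write-back.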

    A Generational and Incremental Garbage Collector Handling Cycles and Large Objects Using Delimited Frames

    In recent years, research has been conducted on several techniques related to garbage collection. Several discoveries central to copying garbage collection have been made. However, improvements are still possible. In this thesis, we introduce new techniques and new algorithms to improve garbage collection. In particular, we introduce a technique that uses delimited frames to mark and trace root pointers. This technique allows efficient computation of the root set. It reuses concepts from two existing techniques, card marking and remembered sets, and uses a bidirectional object layout to improve on these concepts by stabilizing the memory overhead and reducing the workload involved in scanning pointers. We also present an algorithm for recursively marking reachable objects without using a stack (eliminating the usual waste of memory). We adapt this algorithm to implement a depth-first copying garbage collector and improve heap locality. We improve the older-first garbage collection algorithm and its generational version by adding a marking phase that guarantees the collection of all garbage, including cyclic structures spread across several windows. Finally, we introduce a technique for managing large objects. To test our ideas, we designed and implemented, in the free Java virtual machine SableVM, a portable and extensible framework for garbage collection. Within this framework, we implemented semi-space, older-first, and generational collection algorithms. Our experiments show that the delimited-frame technique provides competitive performance on several benchmarks. They also show that, for most benchmarks, our depth-first traversal algorithm improves locality and thus increases performance. Our measurements of overall performance show that, using our techniques, a garbage collector can deliver competitive performance and surpass that of existing garbage collectors on several benchmarks. AUTHOR'S KEYWORDS: garbage collector, virtual machine, Java, SableVM
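    Marking reachable objects recursively without a stack, as this abstract describes, is classically done by pointer reversal (the Deutsch-Schorr-Waite algorithm): the link back to the parent is stored by temporarily overwriting the child pointer being followed, and restored when the traversal retreats. Whether the thesis uses exactly this scheme is not stated; below is a minimal sketch of the classical technique for binary nodes:

```python
# Pointer-reversal (Deutsch-Schorr-Waite) marking: traverses an object
# graph with no auxiliary stack by threading parent links through the
# objects themselves, then restoring every pointer on the way back up.

class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right
        self.marked = False
        self.state = 0   # 0: unvisited, 1: left reversed, 2: right done

def mark(root):
    if root is None or root.marked:
        return
    root.marked = True
    prev, cur = None, root
    while cur is not None:
        if cur.state == 0:                 # try to descend left
            cur.state = 1
            nxt = cur.left
            if nxt is not None and not nxt.marked:
                nxt.marked = True
                cur.left = prev            # reverse pointer to parent
                prev, cur = cur, nxt
        elif cur.state == 1:               # try to descend right
            cur.state = 2
            nxt = cur.right
            if nxt is not None and not nxt.marked:
                nxt.marked = True
                cur.right = prev
                prev, cur = cur, nxt
        else:                              # both children done: retreat
            cur.state = 0                  # reset scratch state
            if prev is None:
                cur = None                 # back at the root: finished
            elif prev.state == 1:          # parent's left was reversed
                prev.left, cur, prev = cur, prev, prev.left
            else:                          # parent's right was reversed
                prev.right, cur, prev = cur, prev, prev.right
```

    The traversal uses only a constant number of extra words (plus the small per-object state field), which is what eliminates the marking stack and its worst-case memory overhead.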