
    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature reports results on much smaller graphs, and the work on the Hyperlink graph uses distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly available as the Graph-Based Benchmark Suite (GBBS). Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201
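
    GBBS itself is written in C++, and its actual interfaces are not reproduced here. Purely to illustrate the frontier-based style that such shared-memory graph systems use, the following Java sketch gives a minimal parallel BFS; the CSR graph layout and every name in it are assumptions of this sketch, not the GBBS API.

        import java.util.ArrayList;
        import java.util.List;
        import java.util.concurrent.atomic.AtomicIntegerArray;
        import java.util.stream.Collectors;

        // Illustrative frontier-based parallel BFS over a CSR graph; the layout
        // and all names are hypothetical, not taken from the GBBS C++ interface.
        final class ParallelBfs {
            // offsets has n+1 entries; neighbors of v are edges[offsets[v]..offsets[v+1]).
            static int[] bfsParents(int[] offsets, int[] edges, int source) {
                int n = offsets.length - 1;
                AtomicIntegerArray parent = new AtomicIntegerArray(n);
                for (int v = 0; v < n; v++) parent.set(v, -1);
                parent.set(source, source);
                List<Integer> frontier = List.of(source);
                while (!frontier.isEmpty()) {
                    // Expand every frontier vertex in parallel; the compare-and-set
                    // on parent ensures each vertex is claimed by exactly one thread.
                    frontier = frontier.parallelStream()
                        .flatMap(u -> {
                            List<Integer> discovered = new ArrayList<>();
                            for (int i = offsets[u]; i < offsets[u + 1]; i++) {
                                int v = edges[i];
                                if (parent.compareAndSet(v, -1, u)) discovered.add(v);
                            }
                            return discovered.stream();
                        })
                        .collect(Collectors.toList());
                }
                int[] result = new int[n];
                for (int v = 0; v < n; v++) result[v] = parent.get(v);
                return result;
            }
        }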

    Formal Derivation of Concurrent Garbage Collectors

    Concurrent garbage collectors are notoriously difficult to implement correctly. Previous approaches to the issue of producing correct collectors have mainly been based on posit-and-prove verification or on the application of domain-specific templates and transformations. We show how to derive the upper reaches of a family of concurrent garbage collectors by refinement from a formal specification, emphasizing the application of domain-independent design theories and transformations. A key contribution is an extension to the classical lattice-theoretic fixpoint theorems to account for the dynamics of concurrent mutation and collection. Comment: 38 pages, 21 figures. The short version of this paper appeared in the Proceedings of MPC 201
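
    For reference, the classical result the paper extends is the Knaster–Tarski fixpoint theorem; the concurrency-aware generalization is the paper's own contribution and is not restated here.

        \textbf{Theorem (Knaster--Tarski).} Let $(L, \sqsubseteq)$ be a complete
        lattice and let $f : L \to L$ be monotone. Then the fixpoints of $f$
        form a complete lattice; in particular the least fixpoint exists and is
        \[
          \mathrm{lfp}(f) \;=\; \bigsqcap \{\, x \in L \mid f(x) \sqsubseteq x \,\}.
        \]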

    Making non-volatile memory programmable

    Byte-addressable, non-volatile memory (NVM) is emerging as a revolutionary memory technology that provides persistence, near-DRAM performance, and scalable capacity. By using NVM, applications can directly create and manipulate durable data in place without the need for serialization out to SSDs. Ideally, through NVM, persistent applications will be able to maintain crash-consistency at a minimal cost. However, before this is possible, improvements must be made at both the hardware and software level to support persistent applications. Currently, software support for NVM places too high a burden on the developer, introducing many opportunities for mistakes while also being too rigid for compiler optimizations. Likewise, at the hardware level, too little information is passed to the processor about the instruction-level ordering requirements of persistent applications; this forces the hardware to require the use of coarse fences, which significantly slow down execution.

    To help realize the promise of NVM, this thesis proposes both new software and hardware support that make NVM programmable. From the software side, this thesis proposes a new NVM programming model which relieves the programmer from performing much of the accounting work in persistent applications, instead relying on the runtime to perform these error-prone tasks. Specifically, within the proposed model, the user only needs to provide minimal markings to identify the persistent data set and to ensure data is updated in a crash-consistent manner. Given this new NVM programming model, this thesis next presents an implementation of the model in Java. I call my implementation AutoPersist and build my support into the Maxine research Java Virtual Machine (JVM). In this thesis I describe how the JVM can be changed to support the proposed NVM programming model, including adding new Java libraries, adding new JVM runtime features, and augmenting the behavior of existing Java bytecodes. In addition to being easy to use, another advantage of the proposed model is that it is amenable to compiler optimizations. In this thesis I highlight two profile-guided optimizations: eagerly allocating objects directly into NVM and speculatively pruning control flow to only include expected-to-be-taken paths. I also describe how to apply these optimizations to AutoPersist and show they have a substantial performance impact.

    While designing AutoPersist, I often observed that dependency information known by the compiler cannot be passed down to the underlying hardware; instead, the compiler must insert coarse-grain fences to enforce needed dependencies. This is because current instruction set architectures (ISA) cannot describe arbitrary instruction-level execution ordering constraints. To fix this limitation, I introduce the Execution Dependency Extension (EDE), and describe how EDE can be added to an existing ISA as well as be implemented in current processor pipelines.

    Overall, emerging NVM technologies can deliver programmer-friendly high performance. However, for this to happen, both software and hardware improvements are necessary. This thesis takes steps to address the current software and hardware gaps: I propose new software support to assist in the development of persistent applications and also introduce new instructions which allow for arbitrary instruction-level dependencies to be conveyed and enforced by the underlying hardware. With these improvements, hopefully the dream of programmable high-performance NVM is one step closer to being realized.
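
    The thesis's actual annotations and runtime interface are not reproduced here. The following Java sketch only illustrates the flavor of such a minimal-markings model, in which the programmer marks a persistent root and wraps updates in a failure-atomic region; every name in it (@PersistentRoot, Transaction.run) is invented for illustration.

        // Hypothetical sketch of a durable-root NVM programming model in Java.
        // @PersistentRoot and Transaction are invented names, not AutoPersist's API.
        @interface PersistentRoot {}  // marks a field whose reachable objects live in NVM

        final class Counters {
            // Everything reachable from this root is (conceptually) moved to NVM
            // by the runtime; the programmer does no manual placement.
            @PersistentRoot
            static long[] hits = new long[16];

            static void record(int bucket) {
                // A failure-atomic region: after a crash, either all updates
                // inside it are visible in NVM, or none are.
                Transaction.run(() -> {
                    hits[Math.floorMod(bucket, hits.length)] += 1;
                });
            }
        }

        final class Transaction {
            // Stand-in for runtime support (e.g., logging plus persist fences).
            static void run(Runnable body) {
                body.run();  // a real runtime would log, flush, and fence here
            }
        }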

    Timing Sensitive Dependency Analysis and its Application to Software Security

    I present new techniques for the static analysis of timing-sensitive information flow control in software systems, and apply them to the analysis of concurrent Java programs and of timing side channels in implementations of cryptographic primitives. Information flow control aims to restrict the flow of information (e.g., between the different external interfaces of a software component) according to explicit policies; it can therefore be used to enforce both confidentiality and integrity. The goal of sound static program analysis in this setting is to prove that every execution of a given program complies with the relevant policies. Such a proof requires a security criterion that formalizes under which conditions this is the case. Every formal security criterion implicitly corresponds to a program model and an attacker model. The simplest noninterference criteria, for example, describe only non-interactive programs, i.e., programs that accept input and produce output only at the beginning and end of execution. In the corresponding attacker model, the attacker knows the program but only observes, or supplies, certain (public) inputs and outputs. A program is noninterferent if the attacker can draw no conclusions from these observations about the secret inputs and outputs of terminating executions; from non-terminating executions, however, the attacker is permitted in this model to infer secret inputs. Side channels arise when an attacker can draw conclusions about confidential information from observations of real systems that would be impossible in the formal model. Besides nontermination, typical side channels (i.e., channels unmodeled by many formal security criteria) include power consumption and program execution time. If execution time depends on secret inputs, an attacker can infer those inputs (e.g., the values of individual secret parameters) from the observed execution time. In my dissertation I present new dependency analyses that also account for nontermination and timing channels. For nontermination channels, I present new techniques for computing program dependencies, developing a unifying framework in which both nontermination-sensitive and nontermination-insensitive dependencies arise from mutually dual notions of postdominance. For timing channels, I develop new notions of dependency together with algorithms for computing them. In two applications I substantiate the thesis that timing-sensitive dependencies enable sound static information flow analysis in the presence of timing channels. Based on timing-sensitive dependencies, I design new analyses for concurrent programs; there, timing-sensitive dependencies are relevant even for timing-insensitive attacker models, since internal timing channels between different threads may be externally observable. My implementation for concurrent Java programs builds on the program analysis system JOANA.

    I also present new analyses for timing channels caused by micro-architectural dependencies. As a case study, I examine implementations of AES256 block encryption. In some implementations, data caches make the execution time depend on the key and the ciphertext, so that both can be inferred from observed execution times. For other implementations, my automatic static analysis proves the absence of such channels (assuming a simple concrete cache micro-architecture).
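
    To make the data-cache channel concrete, here is a small illustrative Java sketch (not one of the analyzed AES implementations): in the leaky variant the memory address touched depends on a secret byte, so cache hits and misses correlate with the secret, while the constant-time variant touches every table entry regardless of the secret.

        // Illustrative cache timing channel: which SBOX line is loaded depends
        // on a secret, so access latency correlates with the key material.
        final class TableLookup {
            static final int[] SBOX = new int[256];  // contents elided

            // Leaky: the address touched is a function of the secret byte.
            static int leaky(int secretByte) {
                return SBOX[secretByte & 0xFF];
            }

            // Constant-time style: touch every entry and select arithmetically,
            // so the sequence of addresses is independent of the secret.
            static int constantTime(int secretByte) {
                int s = secretByte & 0xFF, result = 0;
                for (int i = 0; i < 256; i++) {
                    int mask = ((i ^ s) - 1) >> 31;  // all ones iff i == s
                    result |= SBOX[i] & mask;
                }
                return result;
            }
        }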

    Web and Semantic Web Query Languages

    A number of techniques have been developed to facilitate powerful data retrieval on the Web and Semantic Web. Three categories of Web query languages can be distinguished according to the format of the data they can retrieve: XML, RDF, and Topic Maps. This article introduces the spectrum of languages falling into these categories and summarises their salient aspects. The languages are introduced using common sample data and query types. Key aspects of the query languages considered are stressed in a conclusion.
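
    As a concrete instance of the RDF category, here is a minimal SPARQL SELECT query issued from Java, assuming the Apache Jena library is on the classpath; the data file and the FOAF predicate are placeholder choices for this sketch.

        import org.apache.jena.query.*;
        import org.apache.jena.rdf.model.Model;
        import org.apache.jena.rdf.model.ModelFactory;

        // Minimal SPARQL SELECT over an RDF file using Apache Jena.
        public class SparqlExample {
            public static void main(String[] args) {
                Model model = ModelFactory.createDefaultModel();
                model.read("data.rdf");  // placeholder RDF file

                String q = "SELECT ?name WHERE { ?person "
                         + "<http://xmlns.com/foaf/0.1/name> ?name }";
                try (QueryExecution exec =
                         QueryExecutionFactory.create(QueryFactory.create(q), model)) {
                    ResultSet results = exec.execSelect();
                    while (results.hasNext()) {
                        // Print each binding of ?name in the result set.
                        System.out.println(results.next().get("name"));
                    }
                }
            }
        }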

    Reasoning & Querying – State of the Art

    Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the Internet, where keyword search is used in many applications such as search engines, has familiarized casual users with keyword queries as a means of retrieving information. Unlike this easy-to-use style of querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, e.g., in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.

    Subheap-Augmented Garbage Collection

    Automated memory management avoids the tedium and danger of manual techniques. However, as no programmer input is required, no widely available interface exists to permit principled control over sometimes unacceptable performance costs. This dissertation explores the idea that performance-oriented languages should give programmers greater control over where and when the garbage collector (GC) expends effort. We describe an interface and implementation that expose heap partitioning and collection decisions without compromising type safety. We show that our interface allows the programmer to encode a form of reference counting using Hayes' notion of key objects. Preliminary experimental data suggests that our proposed mechanism can avoid the high overheads suffered by tracing collectors in some scenarios, especially with tight heaps. However, for other applications, the costs of applying subheaps, in human effort and runtime overheads, remain daunting.
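
    The dissertation's own interface is not reproduced here. Purely to suggest the shape of such an API, here is a hypothetical Java sketch in which placement and collection of a heap partition are explicit; Subheap, allocateIn, and collect are invented names.

        // Hypothetical subheap interface; all names invented for illustration.
        // The idea: the programmer decides where objects are placed and when a
        // given heap partition is traced, instead of one global collector.
        interface Subheap {
            <T> T allocateIn(java.util.function.Supplier<T> ctor);  // place result here
            void collect();  // trace and reclaim only this subheap now
        }

        final class RequestHandler {
            void handle(Subheap perRequest) {
                byte[] scratch = perRequest.allocateIn(() -> new byte[4096]);
                // ... use scratch for this request only ...
                // End of request: reclaim the whole partition at a known-cheap
                // point, without triggering a full-heap trace.
                perRequest.collect();
            }
        }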