1,638 research outputs found

    ENHANCING CLOUD SYSTEM RUNTIME TO ADDRESS COMPLEX FAILURES

    Get PDF
    As the reliance on cloud systems intensifies in our progressively digital world, understanding and reinforcing their reliability becomes more crucial than ever. Despite impressive advancements in augmenting the resilience of cloud systems, the growing incidence of complex failures now poses a substantial challenge to the availability of these systems. With cloud systems continuing to scale and increase in complexity, failures not only become more elusive to detect but can also lead to more catastrophic consequences. Such failures question the foundational premises of conventional fault-tolerance designs, necessitating the creation of novel system designs to counteract them. This dissertation aims to enhance distributed systems’ capabilities to detect, localize, and react to complex failures at runtime. To this end, this dissertation makes contributions to address three emerging categories of failures in cloud systems. The first part delves into the investigation of partial failures, introducing OmegaGen, a tool adept at generating tailored checkers for detecting and localizing such failures. The second part grapples with silent semantic failures prevalent in cloud systems, showcasing our study findings, and introducing Oathkeeper, a tool that leverages past failures to infer rules and expose these silent issues. The third part explores solutions to slow failures via RESIN, a framework specifically designed to detect, diagnose, and mitigate memory leaks in cloud-scale infrastructures, developed in collaboration with Microsoft Azure. The dissertation concludes by offering insights into future directions for the construction of reliable cloud systems

    Resource-aware scheduling for 2D/3D multi-/many-core processor-memory systems

    Get PDF
    This dissertation addresses the complexities of 2D/3D multi-/many-core processor-memory systems, focusing on two key areas: enhancing timing predictability in real-time multi-core processors and optimizing performance within thermal constraints. The integration of an increasing number of transistors into compact chip designs, while boosting computational capacity, presents challenges in resource contention and thermal management. The first part of the thesis improves timing predictability. We enhance shared cache interference analysis for set-associative caches, advancing the calculation of Worst-Case Execution Time (WCET). This development enables accurate assessment of cache interference and the effectiveness of partitioned schedulers in real-world scenarios. We introduce TCPS, a novel task and cache-aware partitioned scheduler that optimizes cache partitioning based on task-specific WCET sensitivity, leading to improved schedulability and predictability. Our research explores various cache and scheduling configurations, providing insights into their performance trade-offs. The second part focuses on thermal management in 2D/3D many-core systems. Recognizing the limitations of Dynamic Voltage and Frequency Scaling (DVFS) in S-NUCA many-core processors, we propose synchronous thread migrations as a thermal management strategy. This approach culminates in the HotPotato scheduler, which balances performance and thermal safety. We also introduce 3D-TTP, a transient temperature-aware power budgeting strategy for 3D-stacked systems, reducing the need for Dynamic Thermal Management (DTM) activation. Finally, we present 3QUTM, a novel method for 3D-stacked systems that combines core DVFS and memory bank Low Power Modes with a learning algorithm, optimizing response times within thermal limits. This research contributes significantly to enhancing performance and thermal management in advanced processor-memory systems

    Summer/Fall 2023

    Get PDF

    OpenLB User Guide: Associated with Release 1.6 of the Code

    Full text link
    OpenLB is an object-oriented implementation of LBM. It is the first implementation of a generic platform for LBM programming, which is shared with the open source community (GPLv2). Since the first release in 2007, the code has been continuously improved and extended which is documented by thirteen releases as well as the corresponding release notes which are available on the OpenLB website (https://www.openlb.net). The OpenLB code is written in C++ and is used by application programmers as well as developers, with the ability to implement custom models OpenLB supports complex data structures that allow simulations in complex geometries and parallel execution using MPI, OpenMP and CUDA on high-performance computers. The source code uses the concepts of interfaces and templates, so that efficient, direct and intuitive implementations of the LBM become possible. The efficiency and scalability has been checked and proved by code reviews. This user manual and a source code documentation by DoxyGen are available on the OpenLB project website

    Behavior quantification as the missing link between fields: Tools for digital psychiatry and their role in the future of neurobiology

    Full text link
    The great behavioral heterogeneity observed between individuals with the same psychiatric disorder and even within one individual over time complicates both clinical practice and biomedical research. However, modern technologies are an exciting opportunity to improve behavioral characterization. Existing psychiatry methods that are qualitative or unscalable, such as patient surveys or clinical interviews, can now be collected at a greater capacity and analyzed to produce new quantitative measures. Furthermore, recent capabilities for continuous collection of passive sensor streams, such as phone GPS or smartwatch accelerometer, open avenues of novel questioning that were previously entirely unrealistic. Their temporally dense nature enables a cohesive study of real-time neural and behavioral signals. To develop comprehensive neurobiological models of psychiatric disease, it will be critical to first develop strong methods for behavioral quantification. There is huge potential in what can theoretically be captured by current technologies, but this in itself presents a large computational challenge -- one that will necessitate new data processing tools, new machine learning techniques, and ultimately a shift in how interdisciplinary work is conducted. In my thesis, I detail research projects that take different perspectives on digital psychiatry, subsequently tying ideas together with a concluding discussion on the future of the field. I also provide software infrastructure where relevant, with extensive documentation. Major contributions include scientific arguments and proof of concept results for daily free-form audio journals as an underappreciated psychiatry research datatype, as well as novel stability theorems and pilot empirical success for a proposed multi-area recurrent neural network architecture.Comment: PhD thesis cop

    Parallel and Flow-Based High Quality Hypergraph Partitioning

    Get PDF
    Balanced hypergraph partitioning is a classic NP-hard optimization problem that is a fundamental tool in such diverse disciplines as VLSI circuit design, route planning, sharding distributed databases, optimizing communication volume in parallel computing, and accelerating the simulation of quantum circuits. Given a hypergraph and an integer kk, the task is to divide the vertices into kk disjoint blocks with bounded size, while minimizing an objective function on the hyperedges that span multiple blocks. In this dissertation we consider the most commonly used objective, the connectivity metric, where we aim to minimize the number of different blocks connected by each hyperedge. The most successful heuristic for balanced partitioning is the multilevel approach, which consists of three phases. In the coarsening phase, vertex clusters are contracted to obtain a sequence of structurally similar but successively smaller hypergraphs. Once sufficiently small, an initial partition is computed. Lastly, the contractions are successively undone in reverse order, and an iterative improvement algorithm is employed to refine the projected partition on each level. An important aspect in designing practical heuristics for optimization problems is the trade-off between solution quality and running time. The appropriate trade-off depends on the specific application, the size of the data sets, and the computational resources available to solve the problem. Existing algorithms are either slow, sequential and offer high solution quality, or are simple, fast, easy to parallelize, and offer low quality. While this trade-off cannot be avoided entirely, our goal is to close the gaps as much as possible. We achieve this by improving the state of the art in all non-trivial areas of the trade-off landscape with only a few techniques, but employed in two different ways. Furthermore, most research on parallelization has focused on distributed memory, which neglects the greater flexibility of shared-memory algorithms and the wide availability of commodity multi-core machines. In this thesis, we therefore design and revisit fundamental techniques for each phase of the multilevel approach, and develop highly efficient shared-memory parallel implementations thereof. We consider two iterative improvement algorithms, one based on the Fiduccia-Mattheyses (FM) heuristic, and one based on label propagation. For these, we propose a variety of techniques to improve the accuracy of gains when moving vertices in parallel, as well as low-level algorithmic improvements. For coarsening, we present a parallel variant of greedy agglomerative clustering with a novel method to resolve cluster join conflicts on-the-fly. Combined with a preprocessing phase for coarsening based on community detection, a portfolio of from-scratch partitioning algorithms, as well as recursive partitioning with work-stealing, we obtain our first parallel multilevel framework. It is the fastest partitioner known, and achieves medium-high quality, beating all parallel partitioners, and is close to the highest quality sequential partitioner. Our second contribution is a parallelization of an n-level approach, where only one vertex is contracted and uncontracted on each level. This extreme approach aims at high solution quality via very fine-grained, localized refinement, but seems inherently sequential. We devise an asynchronous n-level coarsening scheme based on a hierarchical decomposition of the contractions, as well as a batch-synchronous uncoarsening, and later fully asynchronous uncoarsening. In addition, we adapt our refinement algorithms, and also use the preprocessing and portfolio. This scheme is highly scalable, and achieves the same quality as the highest quality sequential partitioner (which is based on the same components), but is of course slower than our first framework due to fine-grained uncoarsening. The last ingredient for high quality is an iterative improvement algorithm based on maximum flows. In the sequential setting, we first improve an existing idea by solving incremental maximum flow problems, which leads to smaller cuts and is faster due to engineering efforts. Subsequently, we parallelize the maximum flow algorithm and schedule refinements in parallel. Beyond the strive for highest quality, we present a deterministically parallel partitioning framework. We develop deterministic versions of the preprocessing, coarsening, and label propagation refinement. Experimentally, we demonstrate that the penalties for determinism in terms of partition quality and running time are very small. All of our claims are validated through extensive experiments, comparing our algorithms with state-of-the-art solvers on large and diverse benchmark sets. To foster further research, we make our contributions available in our open-source framework Mt-KaHyPar. While it seems inevitable, that with ever increasing problem sizes, we must transition to distributed memory algorithms, the study of shared-memory techniques is not in vain. With the multilevel approach, even the inherently slow techniques have a role to play in fast systems, as they can be employed to boost quality on coarse levels at little expense. Similarly, techniques for shared-memory parallelism are important, both as soon as a coarse graph fits into memory, and as local building blocks in the distributed algorithm

    Value Creation with Extended Reality Technologies - A Methodological Approach for Holistic Deployments

    Get PDF
    Mit zunehmender Rechenkapazität und Übertragungsleistung von Informationstechnologien wächst die Anzahl möglicher Anwendungs-szenarien für Extended Reality (XR)-Technologien in Unternehmen. XR-Technologien sind Hardwaresysteme, Softwaretools und Methoden zur Erstellung von Inhalten, um Virtual Reality, Augmented Reality und Mixed Reality zu erzeugen. Mit der Möglichkeit, Nutzern Inhalte auf immersive, interaktive und intelligente Weise zu vermitteln, können XR-Technologien die Produktivität in Unternehmen steigern und Wachstumschancen eröffnen. Obwohl XR-Anwendungen in der Industrie seit mehr als 25 Jahren wissenschaftlich erforscht werden, gelten nach wie vor als unausgereift. Die Hauptgründe dafür sind die zugrundeliegende Komplexität, die Fokussierung der Forschung auf die Untersuchung spezifische Anwendungsszenarien, die unzu-reichende Wirtschaftlichkeit von Einsatzszenarien und das Fehlen von geeigneten Implementierungsmodellen für XR-Technologien. Grundsätzlich wird der Mehrwert von Technologien durch deren Integration in die Wertschöpfungsarchitektur von Geschäftsmodellen freigesetzt. Daher wird in dieser Arbeit eine Methodik für den Einsatz von XR-Technologien in der Wertschöpfung vorgestellt. Das Hauptziel der Methodik ist es, die Identifikation geeigneter Einsatzszenarien zu ermöglichen und mit einem strukturierten Ablauf die Komplexität der Umsetzung zu beherrschen. Um eine ganzheitliche Anwendbarkeit zu ermöglichen, basiert die Methodik auf einem branchen- und ge-schäftsprozessunabhängigen Wertschöpfungsreferenzmodell. Dar-über hinaus bezieht sie sich auf eine ganzheitliche Morphologie von XR-Technologien und folgt einer iterativen Einführungssequenz. Das Wertschöpfungsmodell wird durch ein vorliegendes Potential, eine Wertschöpfungskette, ein Wertschöpfungsnetzwerk, physische und digitale Ressourcen sowie einen durch den Einsatz von XR-Technologien realisierten Mehrwert repräsentiert. XR-Technologien werden durch eine morphologische Struktur mit Anwendungsmerk-malen und erforderlichen technologischen Ressourcen repräsentiert. Die Umsetzung erfolgt in einer iterativen Sequenz, die für den zu-grundeliegenden Kontext anwendbare Methoden der agilen Soft-wareentwicklung beschreibt und relevante Stakeholder berücksich-tigt. Der Schwerpunkt der Methodik liegt auf einem systematischen Ansatz, der universell anwendbar ist und den Endnutzer und das Ökosystem der betrachteten Wertschöpfung berücksichtigt. Um die Methodik zu validieren, wird der Einsatz von XR-Technologien in zwei industriellen Anwendungsfällen unter realen wirtschaftlichen Bedingungen durchgeführt. Die Anwendungsfälle stammen aus unterschiedlichen Branchen, mit unterschiedlichen XR-Technologiemerkmalen sowie unterschiedlichen Formen von Wert-schöpfungsketten, um die universelle Anwendbarkeit der Methodik zu demonstrieren und relevante Herausforderungen bei der Durch-führung eines XR-Technologieeinsatzes aufzuzeigen. Mit Hilfe der vorgestellten Methodik können Unternehmen XR-Technologien zielgerichtet in ihrer Wertschöpfung einsetzen. Sie ermöglicht eine detaillierte Planung der Umsetzung, eine fundierte Auswahl von Anwendungsszenarien, die Bewertung möglicher Her-ausforderungen und Hindernisse sowie die gezielte Einbindung der relevanten Stakeholder. Im Ergebnis wird die Wertschöpfung mit wirtschaftlichem Mehrwert durch XR-Technologien optimiert

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volum

    Specificity of the innate immune responses to different classes of non-tuberculous mycobacteria

    Get PDF
    Mycobacterium avium is the most common nontuberculous mycobacterium (NTM) species causing infectious disease. Here, we characterized a M. avium infection model in zebrafish larvae, and compared it to M. marinum infection, a model of tuberculosis. M. avium bacteria are efficiently phagocytosed and frequently induce granuloma-like structures in zebrafish larvae. Although macrophages can respond to both mycobacterial infections, their migration speed is faster in infections caused by M. marinum. Tlr2 is conservatively involved in most aspects of the defense against both mycobacterial infections. However, Tlr2 has a function in the migration speed of macrophages and neutrophils to infection sites with M. marinum that is not observed with M. avium. Using RNAseq analysis, we found a distinct transcriptome response in cytokine-cytokine receptor interaction for M. avium and M. marinum infection. In addition, we found differences in gene expression in metabolic pathways, phagosome formation, matrix remodeling, and apoptosis in response to these mycobacterial infections. In conclusion, we characterized a new M. avium infection model in zebrafish that can be further used in studying pathological mechanisms for NTM-caused diseases
    corecore