183 research outputs found

    Many-Task Computing and Blue Waters

    This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain, and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, MTC applications are by definition structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications; in particular, different engineering constraints for hardware and software must be met in order to support them. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O, and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited to MTC applications. However, HPC systems often lack dynamic resource provisioning, are not ideal for task communication via the file system, and have I/O systems that are not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.
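    To make the task-graph characterization concrete, here is a minimal C sketch (not from the report; all task and file names are illustrative) of an MTC workload as discrete tasks whose edges are explicit input and output files, with a naive dispatcher that launches any task whose inputs have already been produced:

    ```c
    #include <stdio.h>
    #include <string.h>

    #define NTASKS   4
    #define MAX_DEPS 2

    struct task {
        const char *cmd;              /* program to dispatch (illustrative) */
        const char *in[MAX_DEPS];     /* input files: the graph edges       */
        const char *out;              /* output file this task produces     */
        int done;
    };

    /* A four-task "diamond": split feeds two filters, which feed a merge. */
    static struct task graph[NTASKS] = {
        { "split",   { NULL,        NULL },    "parts.dat", 0 },
        { "filterA", { "parts.dat", NULL },    "a.dat",     0 },
        { "filterB", { "parts.dat", NULL },    "b.dat",     0 },
        { "merge",   { "a.dat",     "b.dat" }, "final.dat", 0 },
    };

    /* An input is ready when some finished task has produced it. */
    static int ready(const char *file)
    {
        for (int i = 0; i < NTASKS; i++)
            if (graph[i].done && strcmp(graph[i].out, file) == 0)
                return 1;
        return 0;
    }

    int main(void)
    {
        int remaining = NTASKS;
        while (remaining > 0) {
            for (int i = 0; i < NTASKS; i++) {
                struct task *t = &graph[i];
                if (t->done)
                    continue;
                int ok = 1;
                for (int d = 0; d < MAX_DEPS; d++)
                    if (t->in[d] && !ready(t->in[d]))
                        ok = 0;
                if (ok) {         /* dispatch; a real system would exec() */
                    printf("dispatch %s -> %s\n", t->cmd, t->out);
                    t->done = 1;
                    remaining--;
                }
            }
        }
        return 0;
    }
    ```

    The per-task dispatch step in the loop is precisely the overhead the report argues MTC middleware must minimize, since real workloads may contain millions of very short tasks.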

    Contributions for improving debugging of kernel-level services in a monolithic operating system

    Despite the existence of an overwhelming amount of research on the quality of system software, operating systems are still plagued by reliability issues, mainly caused by defects in kernel-level services such as device drivers and file systems. Studies have indeed shown that each release of the Linux kernel contains between 600 and 700 faults, and that the propensity of device drivers to contain errors is up to seven times higher than for any other part of the kernel. These numbers suggest that kernel-level service code is not sufficiently tested and that many faults remain unnoticed or are hard to fix by non-expert programmers, who account for the majority of service developers. This thesis proposes a new approach to the debugging and testing of kernel-level services, focused on the interaction between the services and the core kernel. The approach tackles the issue of safety holes in the implementation of kernel API functions. For Linux, we have instantiated an automated approach, Diagnosys, which relies on static analysis of kernel code to identify, categorize, and expose the different safety holes of API functions that can turn into runtime faults when the functions are used in service code by developers with limited knowledge of the intricacies of kernel code. To illustrate our approach, we have implemented Diagnosys for Linux 2.6.32 and shown its benefits in supporting developers in their testing and debugging tasks.
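    A hedged C sketch of the kind of safety hole the thesis targets (the function and field names below are hypothetical, not drawn from the Linux 2.6.32 API):

    ```c
    struct device_ctx {
        int irq;
    };

    /* Core-kernel API function: it silently assumes ctx is non-NULL. The
     * missing check is the safety hole; the precondition is visible only
     * inside the callee's implementation. */
    int kapi_enable_irq(struct device_ctx *ctx)
    {
        return ctx->irq;            /* kernel oops if a driver passes NULL */
    }

    /* Driver code written without knowledge of that precondition. A tool
     * in the spirit of Diagnosys would expose this hole at the API
     * boundary (e.g., by flagging or logging the unchecked use) before it
     * turns into a runtime crash. */
    int driver_probe(struct device_ctx *ctx)  /* ctx may be NULL at probe */
    {
        return kapi_enable_irq(ctx);
    }
    ```

    Because the precondition lives only in the callee, a non-expert service developer has no local cue that the call can crash; surfacing such holes at the API boundary is the point of the approach.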

    Hypertracing: Tracing through virtualization layers

    Cloud computing enables on-demand access to remote computing resources. It provides dynamic scalability and elasticity with a low upfront cost. As adoption of this computing model grows rapidly, system complexity increases: virtual machines (VMs) running on multiple virtualization layers become very difficult to monitor without interfering with their performance. In this paper, we present hypertracing, a novel method for tracing VMs that uses various paravirtualization techniques to enable efficient monitoring across virtualization boundaries. Hypertracing is a monitoring infrastructure that facilitates seamless trace sharing among host and guests. Our toolchain can detect latencies and their root causes within VMs, even for boot-up and shutdown sequences, whereas existing tools fail to handle these cases. We propose a new hypervisor optimization for handling efficient nested paravirtualization, which allows hypertracing to be enabled in any nested environment without triggering VM-exit multiplication. This is a significant improvement over current monitoring tools, which incur large I/O overhead from activating monitoring within each virtualization layer.
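    A minimal sketch of the shared-memory idea behind paravirtualized trace sharing, assuming a guest-writable ring buffer that the host maps directly (all names here are hypothetical; the paper's actual toolchain is built on real hypervisor interfaces):

    ```c
    #include <stdint.h>

    #define RING_SLOTS 1024

    struct trace_event {
        uint64_t timestamp;
        uint32_t id;
        uint32_t payload;
    };

    struct trace_ring {
        volatile uint64_t head;           /* advanced by the guest */
        volatile uint64_t tail;           /* advanced by the host  */
        struct trace_event slots[RING_SLOTS];
    };

    /* Guest side: lock-free append into memory the host can read without
     * a trap, so no VM exit is taken per event. Dropping on overflow keeps
     * tracing from perturbing the workload being measured. */
    static int guest_trace(struct trace_ring *r, uint64_t ts,
                           uint32_t id, uint32_t payload)
    {
        uint64_t h = r->head;
        if (h - r->tail == RING_SLOTS)
            return -1;                    /* full: drop, never block */
        r->slots[h % RING_SLOTS] = (struct trace_event){
            .timestamp = ts, .id = id, .payload = payload
        };
        /* A real implementation needs a memory barrier here so the host
         * never observes the new head before the slot contents. */
        r->head = h + 1;
        return 0;
    }
    ```

    Publishing events through shared memory rather than trapping per event is what keeps the tracing fast path off the VM-exit path.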

    New interior-point approach for one- and two-class linear support vector machines using multiple variable splitting

    Multiple variable splitting is a general technique for decomposing problems by using copies of variables and additional linking constraints that equate their values. The resulting large optimization problem can be solved with a specialized interior-point method that exploits the problem structure and computes the Newton direction with a combination of direct and iterative solvers (i.e., Cholesky factorizations and preconditioned conjugate gradients for the linear systems related to, respectively, the subproblems and the new linking constraints). The present work applies this method to solving real-world binary classification and novelty (or outlier) detection problems by means of, respectively, two-class and one-class linear support vector machines (SVMs). Unlike previous interior-point approaches for SVMs, which were practical only with low-dimensional points, the new proposal can also deal with high-dimensional data. The new method is compared with state-of-the-art solvers for SVMs, based on either interior-point algorithms (such as SVM-OOPS) or specific algorithms developed by the machine learning community (such as LIBSVM and LIBLINEAR). The computational results show that, for two-class SVMs, the new proposal is competitive not only against previous interior-point methods (and much more efficient than they are with high-dimensional data) but also against LIBSVM, whereas LIBLINEAR generally outperformed the proposal. For one-class SVMs, the new method consistently outperformed all other approaches, in terms of either solution time or solution quality.
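    As a sketch of what the splitting looks like for the two-class case (one plausible formulation, written from the description above; the paper's exact model may differ):

    ```latex
    % Standard two-class linear SVM primal over m points (x_i, y_i):
    \min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i
    \quad\text{s.t.}\quad y_i(w^\top x_i + b) \ge 1-\xi_i,\quad \xi_i \ge 0 .

    % Multiple variable splitting: partition the points into blocks
    % B_1,\dots,B_k, give block j its own copy w_j of w, and add linking
    % constraints equating the copies. The constraint matrix becomes
    % block-angular, which the specialized interior-point method exploits:
    \min_{w_0,\,w_j,\,b,\,\xi}\
        \tfrac{1}{2}\|w_0\|^2 + C\sum_{j=1}^{k}\sum_{i\in B_j}\xi_i
    \quad\text{s.t.}\quad
        y_i(w_j^\top x_i + b) \ge 1-\xi_i \;\; (i\in B_j),\qquad
        w_j = w_0 \;\; (j=1,\dots,k),\qquad \xi \ge 0 .
    ```

    The independent blocks from each B_j are what the Cholesky factorizations handle, while the system coupled by the w_j = w_0 linking constraints is handled by preconditioned conjugate gradients, matching the direct/iterative split described above.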

    Precise garbage collection for C

    Magpie is a source-to-source transformation for C programs that enables precise garbage collection, where precise means that integers are not confused with pointers, and the liveness of a pointer is apparent at the source level. Precise GC is primarily useful for long-running programs and programs that interact with untrusted components. In particular, we have successfully deployed precise GC in the C implementation of a language run-time system that was originally designed to use conservative GC. We also report on our experience in transforming parts of the Linux kernel to use precise GC instead of manual memory management.
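    A conceptual before/after of the kind of rewriting such a transformation performs, using a hypothetical shadow-stack runtime (gc_alloc, gc_push_roots, and gc_pop_roots are illustrative stand-ins, not Magpie's actual API):

    ```c
    #include <stdlib.h>

    struct node { struct node *next; };

    /* Stub runtime so the sketch compiles; a real collector would record
     * the registered addresses and scan them as roots. */
    static void *gc_alloc(size_t n)        { return calloc(1, n); }
    static void  gc_push_roots(int n, ...) { (void)n; }
    static void  gc_pop_roots(int n)       { (void)n; }

    /* Before: the collector cannot tell which locals hold live pointers. */
    struct node *make_pair_before(void)
    {
        struct node *a = gc_alloc(sizeof *a);
        struct node *b = gc_alloc(sizeof *b);  /* may collect: is 'a' live? */
        a->next = b;
        return a;
    }

    /* After: each pointer-holding local is registered as a root, so the
     * liveness of a pointer is apparent at the source level and integer
     * locals are never scanned as if they were pointers. */
    struct node *make_pair_after(void)
    {
        struct node *a = NULL, *b = NULL;
        gc_push_roots(2, &a, &b);              /* shadow-stack frame */
        a = gc_alloc(sizeof *a);
        b = gc_alloc(sizeof *b);               /* collector now sees 'a' */
        a->next = b;
        gc_pop_roots(2);
        return a;
    }
    ```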

    Towards optimization-safe systems: analyzing the impact of undefined behavior

    This paper studies an emerging class of software bugs called optimization-unstable code: code that is unexpectedly discarded by compiler optimizations due to undefined behavior in the program. Unstable code is present in many systems, including the Linux kernel and the Postgres database. The consequences of unstable code range from incorrect functionality to missing security checks. To reason about unstable code, this paper proposes a novel model, which views unstable code in terms of optimizations that leverage undefined behavior. Using this model, we introduce a new static checker called Stack that precisely identifies unstable code. Applying Stack to widely used systems has uncovered 160 new bugs that have been confirmed and fixed by developers. (Supported by the DARPA CRASH program, contract #N66001-10-2-4089, and NSF award CNS-1053143.)
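    A classic instance of the bug class (this snippet illustrates the pattern; it is not quoted from the paper):

    ```c
    #include <stddef.h>

    int access_ok(char *buf, size_t len, char *buf_end)
    {
        /* Intended wraparound check. But pointer arithmetic that overflows
         * is undefined behavior in C, so an optimizing compiler may assume
         * 'buf + len' cannot wrap and silently discard this comparison,
         * removing the security check: optimization-unstable code. */
        if (buf + len < buf)
            return 0;
        return buf + len <= buf_end;
    }

    /* A stable rewrite compares sizes instead of wrapped pointers
     * (assuming buf_end >= buf, as in the original contract). */
    int access_ok_fixed(char *buf, size_t len, char *buf_end)
    {
        return len <= (size_t)(buf_end - buf);
    }
    ```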