9 research outputs found

    A case study for a new invasive extension of Intel's Threading Building Blocks.

    Get PDF
    We study codes deploying multiple MPI ranks to one node where each rank is parallelised with TBB. A static assignment of cores to ranks here is disadvantageous if the load is not perfectly balanced, the runtime is subject to fluctuations or one MPI rank runs through phases with low concurrency. We propose an extension to TBB where developers manually annotate which code parts could exploit further cores. The cores are then dynamically associated with ranks. Our approach is decentralised, lightweight and minimally invasive w.r.t. code modifications. Some brief performance studies suggest that a flexible, permanently changing assignment of cores to compute ranks can outperform a static distribution, while greedily haggling over cores throughout a simulation might perform even better

    Performance analysis for parallel programs from multicore to petascale

    Get PDF
    Cutting-edge science and engineering applications require petascale computing. Petascale computing platforms are characterized by both extreme parallelism (systems of hundreds of thousands to millions of cores) and hybrid parallelism (nodes with multicore chips). Consequently, to effectively use petascale resources, applications must exploit concurrency at both the node and system level --- a difficult problem. The challenge of developing scalable petascale applications is only partially aided by existing languages and compilers. As a result, manual performance tuning is often necessary to identify and resolve poor parallel and serial efficiency. Our thesis is that it is possible to achieve unique, accurate, and actionable insight into the performance of fully optimized parallel programs by measuring them with asynchronous-sampling-based call path profiles; attributing the resulting binary-level measurements to source code structure; analyzing measurements on-the-fly and postmortem to highlight performance inefficiencies; and presenting the resulting context- sensitive metrics in three complementary views. To support this thesis, we have developed several techniques for identifying performance problems in fully optimized serial, multithreaded and petascale programs. First, we describe how to attribute very precise (instruction-level) measurements to source-level static and dynamic contexts in fully optimized applications --- all for an average run-time overhead of a few percent. We then generalize this work with the development of logical call path profiling and apply it to work-stealing-based applications. Second, we describe techniques for pinpointing and quantifying parallel inefficiencies such as parallel idleness, parallel overhead and lock contention in multithreaded executions. Third, we show how to diagnose scalability bottlenecks in petascale applications by scaling our our measurement, analysis and presentation tools to support large-scale executions. Finally, we provide a coherent framework for these techniques by sketching a unique and comprehensive performance analysis methodology. This work forms the basis of Rice University's HPCTOOLKIT performance tools

    3rd Many-core Applications Research Community (MARC) Symposium. (KIT Scientific Reports ; 7598)

    Get PDF
    This manuscript includes recent scientific work regarding the Intel Single Chip Cloud computer and describes approaches for novel approaches for programming and run-time organization

    Reliable Software for Unreliable Hardware - A Cross-Layer Approach

    Get PDF
    A novel cross-layer reliability analysis, modeling, and optimization approach is proposed in this thesis that leverages multiple layers in the system design abstraction (i.e. hardware, compiler, system software, and application program) to exploit the available reliability enhancing potential at each system layer and to exchange this information across multiple system layers

    Enhancements to jml and its extended static checking technology

    Get PDF
    Formal methods are useful for developing high-quality software, but to make use of them, easy-to-use tools must be available. This thesis presents our work on the Java Modeling Language (JML) and its static verification tools. A main contribution is Offline User-Assisted Extended Static Checking (OUA-ESC), which is positioned between the traditional, fully automatic ESC and interactive Full Static Program Verification (FSPV). With OUA-ESC, automated theorem provers are used to discharge as many Verification Conditions (VCs) as possible, then users are allowed to provide Isabelle/HOL proofs for the sub-VCs that cannot be discharged automatically. Thus, users are able to take advantage of the full power of Isabelle/HOL to manually prove the system correct, if they so choose. Exploring unproven sub-VCs with Isabelle's ProofGeneral has also proven very useful for debugging code and their specifications. We also present syntax and semantics for monotonic non-null references, a common category that has not been previously identified. This monotonic non-null modifier allows some fields previously declared as nullable to be treated like local variables for nullity flow analysis. To support this work, we developed JML4, an Eclipse-based Integration Verification Environment (IVE) for the Java Modeling Language. JML4 provides integration of JML into all of the phases of the Eclipse JDT's Java compiler, makes use of external API specifications, and provides native error reporting. The verification techniques initially supported include a Non-Null Type System (NNTS), Runtime Assertion Checking (RAC), and Extended Static Checking (ESC); and verification tools to be developed by other researchers can be incorporated. JML4 was adopted by the JML4 community as the platform for their combined research efforts. ESC4, JML4's ESC component, provides other novel features not found before in ESC tools. Multiple provers are used automatically, which provides a greater coverage of language constructs that can be verified. Multi-threaded generation and distributed discharging of VCs, as well as a proof-status caching strategy, greatly speed up this CPU-intensive verification technique. VC caches are known to be fragile, and we developed a simple way to remove some of that fragility. These features combine to form the first IVE for JML, which will hopefully bring the improved quality promised by formal methods to Java developer

    Proceedings, MSVSCC 2015

    Get PDF
    The Virginia Modeling, Analysis and Simulation Center (VMASC) of Old Dominion University hosted the 2015 Modeling, Simulation, & Visualization Student capstone Conference on April 16th. The Capstone Conference features students in Modeling and Simulation, undergraduates and graduate degree programs, and fields from many colleges and/or universities. Students present their research to an audience of fellow students, faculty, judges, and other distinguished guests. For the students, these presentations afford them the opportunity to impart their innovative research to members of the M&S community from academic, industry, and government backgrounds. Also participating in the conference are faculty and judges who have volunteered their time to impart direct support to their students’ research, facilitate the various conference tracks, serve as judges for each of the tracks, and provide overall assistance to this conference. 2015 marks the ninth year of the VMASC Capstone Conference for Modeling, Simulation and Visualization. This year our conference attracted a number of fine student written papers and presentations, resulting in a total of 51 research works that were presented. This year’s conference had record attendance thanks to the support from the various different departments at Old Dominion University, other local Universities, and the United States Military Academy, at West Point. We greatly appreciated all of the work and energy that has gone into this year’s conference, it truly was a highly collaborative effort that has resulted in a very successful symposium for the M&S community and all of those involved. Below you will find a brief summary of the best papers and best presentations with some simple statistics of the overall conference contribution. Followed by that is a table of contents that breaks down by conference track category with a copy of each included body of work. Thank you again for your time and your contribution as this conference is designed to continuously evolve and adapt to better suit the authors and M&S supporters. Dr.Yuzhong Shen Graduate Program Director, MSVE Capstone Conference Chair John ShullGraduate Student, MSVE Capstone Conference Student Chai

    A Case Study for a New Invasive Extension of Intel's Threading Building Blocks

    Get PDF
    We study codes deploying multiple MPI ranks to one node where each rank is parallelised with TBB. A static assignment of cores to ranks here is disadvantageous if the load is not perfectly balanced, the runtime is subject to fluctuations or one MPI rank runs through phases with low concurrency. We propose an extension to TBB where developers manually annotate which code parts could exploit further cores. The cores are then dynamically associated with ranks. Our approach is decentralised, lightweight and minimally invasive w.r.t. code modifications. Some brief performance studies suggest that a flexible, permanently changing assignment of cores to compute ranks can outperform a static distribution, while greedily haggling over cores throughout a simulation might perform even better

    CIMODE 2016: 3º Congresso Internacional de Moda e Design: proceedings

    Get PDF
    O CIMODE 2016 é o terceiro Congresso Internacional de Moda e Design, a decorrer de 9 a 12 de maio de 2016 na cidade de Buenos Aires, subordinado ao tema : EM--‐TRAMAS. A presente edição é organizada pela Faculdade de Arquitetura, Desenho e Urbanismo da Universidade de Buenos Aires, em conjunto com o Departamento de Engenharia Têxtil da Universidade do Minho e com a ABEPEM – Associação Brasileira de Estudos e Pesquisa em Moda.info:eu-repo/semantics/publishedVersio
    corecore