528 research outputs found

    Benchmarking Memory Management Capabilities within ROOT-Sim

    In parallel discrete event simulation, the simulation model is partitioned into objects that execute events concurrently on different CPUs and/or multiple CPU cores. In this context, run-time support for logical-time synchronization across the simulation objects plays a central role in determining the effectiveness of a given parallel simulation environment. In this paper we present an experimental evaluation of the memory management capabilities offered by the ROme OpTimistic Simulator (ROOT-Sim). This is an open source parallel simulation environment that transparently supports optimistic synchronization via recoverability (based on incremental log/restore techniques) of any type of memory operation affecting the state of simulation objects, i.e., memory allocation, deallocation and update operations. The experimental study is based on a synthetic benchmark that mimics different read/write patterns inside the dynamic memory map associated with the state of simulation objects. This allows a sensitivity analysis of the time and space effects of the memory management subsystem while varying the type and the locality of the accesses associated with event processing
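
    The recoverability idea in this abstract can be illustrated with a toy undo log. The sketch below is not ROOT-Sim's implementation; the class `IncrementalState` and its methods are hypothetical names, and it assumes events are applied in nondecreasing timestamp order so that the log stays sorted. It shows only how logging each overwritten value lets a rollback restore an earlier state without copying the full state at every event.

```python
class IncrementalState:
    """Toy simulation-object state with an undo log (illustrative only)."""

    _MISSING = object()  # marks "key did not exist before this update"

    def __init__(self):
        self.data = {}
        self._log = []  # entries: (timestamp, key, previous value)

    def write(self, timestamp, key, value):
        # Allocation or update: log only the overwritten entry,
        # not a full copy of the state.
        self._log.append((timestamp, key, self.data.get(key, self._MISSING)))
        self.data[key] = value

    def free(self, timestamp, key):
        # Deallocation: remember the value so it can be restored.
        self._log.append((timestamp, key, self.data.pop(key, self._MISSING)))

    def rollback(self, timestamp):
        """Undo every update logged with a timestamp greater than `timestamp`."""
        while self._log and self._log[-1][0] > timestamp:
            _, key, old = self._log.pop()
            if old is self._MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old


state = IncrementalState()
state.write(1, "a", 10)
state.write(2, "b", 20)
state.write(3, "a", 30)   # speculative update
state.free(4, "b")        # speculative deallocation
state.rollback(2)         # a straggler event with timestamp 2 arrives
print(state.data)
```

    After the rollback, the state again holds the values written by the events at timestamps 1 and 2. The log cost grows with the number of updates rather than with the state size, which is the kind of time/space trade-off the benchmark in the abstract is meant to expose.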

    Principled software microengineering

    Resource-aware Programming in a High-level Language - Improved performance with manageable effort on clustered MPSoCs

    Until 2001, Moore's and Dennard's laws meant that execution speed doubled every 18 months thanks to improved CPUs. Today, concurrency is the dominant means of acceleration, from supercomputers to mobile devices. However, more recent phenomena such as "dark silicon" increasingly hinder further acceleration through hardware. To achieve further speedups, software must also become more aware of its hardware resources. Connected with this phenomenon is increasingly heterogeneous hardware. Supercomputers integrate accelerators such as GPUs. Mobile SoCs (e.g., in smartphones) integrate more and more capabilities. Exploiting special-purpose hardware is a well-known way to reduce energy consumption, another important aspect that must be weighed against raw speed. For example, supercomputers are also rated by "performance per watt". Currently, low-level systems programmers are used to thinking about hardware, while the typical high-level programmer prefers to abstract from the platform as far as possible (e.g., the cloud). "High-level" does not mean that hardware is irrelevant, but that it can be abstracted away. If you develop a Java application for Android, the battery can be an important consideration. At some point, however, high-level languages must also become resource-aware in order to improve speed or energy consumption. I worked on these problems within the Transregio "Invasive Computing". In my dissertation I present a framework for making high-level-language applications resource-aware and thereby improving performance. This could, for example, yield higher efficiency or faster execution for the system as a whole. A core idea is that applications do not optimize themselves. Instead, they pass all information to the operating system. 
The operating system has a global view and makes decisions about the resources. We call this process "invasion". The application's task is to adapt to these decisions, not to make any itself. The challenge is to define a language in which applications communicate resource constraints and performance information. Such a language must be expressive enough for complex information, extensible to new resource types, and pleasant for the programmer to use. The central contributions of this dissertation are: a theoretical model of resource management that describes the essence of the resource-aware framework, justifies the correctness of the operating system's decisions with respect to an application's constraints, and proves my theses of efficiency and speedup in theory; a framework and a compilation path for resource-aware programming in the high-level language X10, which we evaluated by implementing applications from high-performance computing and measuring a speedup of 5x; and a memory consistency model for the X10 programming language, since this is a necessary step towards a formal semantics that links the theoretical model with the concrete implementation. In summary, I show that resource-aware programming in high-level languages on future many-core architectures is feasible with manageable effort and improves performance

    Towards Intelligent Runtime Framework for Distributed Heterogeneous Systems

    Scientific applications strive for increased memory and computing performance, requiring massive amounts of data and time to produce results. Applications utilize large-scale, parallel computing platforms with advanced architectures to accommodate their needs. However, developing performance-portable applications for modern, heterogeneous platforms requires considerable effort and expertise in both the application and systems domains. This is especially relevant for unstructured applications whose workflow is not statically predictable due to their heavily data-dependent nature. One possible solution to this problem is the introduction of an intelligent Domain-Specific Language (iDSL) that transparently helps to maintain correctness, hides the idiosyncrasies of low-level hardware, and scales applications. An iDSL includes domain-specific language constructs, a compilation toolchain, and a runtime providing task scheduling, data placement, and workload balancing across and within heterogeneous nodes. In this work, we focus on the runtime framework. We introduce a novel design and extension of a runtime framework, the Parallel Runtime Environment for Multicore Applications. In response to the ever-increasing intra/inter-node concurrency, the runtime system supports efficient task scheduling and workload balancing at both levels while allowing the development of custom policies. Moreover, the new framework provides abstractions supporting the utilization of heterogeneous distributed nodes consisting of CPUs and GPUs and is extensible to other devices. We demonstrate that by utilizing this work, an application (or the iDSL) can scale its performance on heterogeneous exascale-era supercomputers with minimal effort. A future goal for this framework (out of the scope of this thesis) is to be integrated with machine learning to improve its decision-making and performance further. 
As a bridge to this goal, since the framework is under development, we experiment with data from Nuclear Physics Particle Accelerators and demonstrate the significant improvements achieved by utilizing machine learning in the hit-based track reconstruction process

    Runko: Modern multi-physics toolbox for simulating plasma

    Runko is a new open-source plasma simulation framework implemented in C++ and Python. It is designed to function as an easy-to-extend general toolbox for simulating astrophysical plasmas with different theoretical and numerical models. Computationally intensive low-level "kernels" are written in modern C++14, taking advantage of polymorphic classes, multiple inheritance, and template metaprogramming. High-level functionality is operated with Python3 scripts. This hybrid program design ensures fast code and ease of use. The framework has a modular object-oriented design that allows the user to easily add new numerical algorithms to the system. The code can be run on various computing platforms ranging from laptops (shared-memory systems) to massively parallel supercomputer architectures (distributed-memory systems). The framework also supports heterogeneous multi-physics simulations in which different physical solvers can be combined and run simultaneously. Here we report on the first results from the framework's relativistic particle-in-cell (PIC) module. Using the PIC module, we simulate decaying relativistic kinetic turbulence in suddenly stirred magnetically-dominated pair plasma. We show that the resulting particle distribution can be separated into a thermal part that forms the turbulent cascade and a separate decoupled non-thermal particle population that acts as an energy sink for the system. Comment: 17 pages, 6 figures. Comments welcome! Code available from https://github.com/natj/runk

    Semantic Integration Platform for Web Widgets’ Communication

    The semantic integration platform for web widgets' communication is a framework that enables communication between loosely coupled web components (web widgets) in a mashup-type web application. Mashup applications use and combine existing web resources, mainly various widgets. Current mashup platforms do not support message exchange between widgets when the widgets are created and maintained by different developers and do not understand each other's messages. This considerably limits the creation of more complex mashup applications, because data from different sources (for example, third-party widgets) cannot be combined, for instance to build applications in which different widgets exchange data interactively. This master's thesis proposes a solution that semantically integrates the data sent by different widgets so that the widgets can share data with each other. The data elements of all messages exchanged by the widgets are bound to ontology terms, which makes message content machine-readable and machine-understandable, so that all the necessary data elements can be collected from the sent messages and used to compose new messages. This makes it possible to combine data sent by different widgets, create new messages from data combined from different sources, and forward those messages to the widgets for which the data is useful. The solution for facilitating collaboration between widgets was implemented using the OpenAjax Hub framework, a central message hub to which different widgets can attach and through which they can exchange messages. Using the hub lets widgets exchange messages, but it does not solve the problem that arises when widgets use different data formats and structures for their messages. As a solution, the thesis implemented a standalone widget, the Transformer Widget, which connects to the hub, aggregates data elements from all messages the widgets send out, and generates new messages from the aggregated data elements that can be sent to the widgets able to receive and use the collected data. The thesis defined a set of rules for describing message content and mapping each data element in a message to an ontology term. From these rules, application configurations can be created that allow the Transformer Widget to interpret messages sent by widgets, collect data elements from them, and generate new messages from those elements. The usability of the Transformer Widget was tested on a sample application consisting of three widgets that could not communicate with each other directly. The aim of the test was to determine whether the Transformer Widget can be used in scenarios where existing widgets do not understand each other's messages. During the test, a message configuration was created for the sample application describing the semantics and structure of the messages sent by the widgets. This allowed the Transformer Widget to transform the data into a form the widgets understand: when one widget sent a message containing data that another widget could use, the Transformer Widget generated a new message from that data and forwarded it to that widget. The test of the sample application was successful and confirmed the usefulness of the Transformer Widget for solving such problems.
Semantic integration platform for web widgets communication is a framework for providing collaboration capabilities between loosely coupled Web components (called widgets) in a mashup-like Web application. 
Mashups are Web applications that allow reuse of existing resources by combining different widgets that use data from various sources on the Web. Current mashup platforms do not support collaboration between different widgets, especially if those widgets have been developed and maintained by different vendors and are not able to interpret messages sent by other widgets. This limits the creation of more sophisticated mashups, in which independent Web widgets would interactively share and exchange data with each other. This thesis tries to solve the problem of making the data published by a widget in a Web application available to all the other widgets in that application so that the widgets can share information with each other. That makes it possible to interactively combine data from various sources and allows collaboration between loosely coupled components in a Web application. The thesis proposes a solution for aggregating data from messages sent by different widgets to generate new messages for widgets that can use the combined data. The main problem is collecting useful data from the exchanged messages and transforming the collected data into new messages using the various data structures and formats interpretable by the widgets that are interested in that data. Integrating and sharing data from various sources is the main research problem in the field of semantic integration, and this thesis proposes one solution for sharing data between independent Web widgets in a Web application. The solution proposed in the thesis is built on the OpenAjax Hub framework, which provides the means for Web widgets to exchange messages with each other. OpenAjax Hub provides a central hub that allows messaging between widgets that connect to the hub and subscribe to and publish related messages. 
The problem arises from the use of different data structures in messages exchanged by widgets developed by different vendors. Even though the widgets can use the hub for exchanging messages, the content of the exchanged messages remains unknown to the widgets, and they are not able to interpret the messages that are exchanged through the hub. The solution proposed and implemented in this thesis is a JavaScript application built as a Web widget that is connected to the OpenAjax messaging hub in a Web application and transforms all the exchanged data so that it is interpretable by every widget in the application. The proposed widget is called the Transformer Widget, and it uses semantic integration to transform data. The Transformer Widget is connected to the hub like the other widgets and, while staying invisible, listens to all of the messages exchanged by other widgets. Using preconfigured mappings that help the Transformer Widget identify data elements and their meanings in messages, it aggregates data elements to generate new messages that can be sent to widgets. Mappings describe the structure and the semantics of the messages that are exchanged through the hub. Mappings contain descriptions of atomic data elements in the messages, where each atomic data element is matched with a term in an ontology that describes the meaning of that particular data element. This allows automatic understanding of the content of the exchanged messages regardless of which data structures are used in the messages. It is then possible to collect atomic data elements from the received messages to generate new messages that can be sent to the widgets that can interpret those messages. This solution makes it possible to build complex Web applications (i.e. mashups) using independent Web components (e.g. widgets) that can collaborate and share data with each other
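
The mapping-based transformation described in this abstract can be sketched as follows. This is a minimal illustration in Python rather than the thesis's JavaScript widget, and it does not use the OpenAjax Hub API; the `Transformer` class, the topic names, the message fields and the ontology terms (`geo:lat`, `geo:long`) are all hypothetical. It assumes each mapping pairs a field name in a message with an ontology term, and that a new message is generated as soon as every term it needs has been seen.

```python
from dataclasses import dataclass, field

# Hypothetical mapping shape: topic -> {field name in the message: ontology term}.
Mapping = dict[str, dict[str, str]]


@dataclass
class Transformer:
    """Toy stand-in for the Transformer Widget (illustrative only)."""

    incoming: Mapping  # how to read data elements from received messages
    outgoing: Mapping  # how to build messages for receiving widgets
    known: dict[str, object] = field(default_factory=dict)  # term -> latest value

    def on_message(self, topic: str, payload: dict) -> list[tuple[str, dict]]:
        """Record the data elements of one message, then build every
        outgoing message whose required ontology terms are all known."""
        for key, term in self.incoming.get(topic, {}).items():
            if key in payload:
                self.known[term] = payload[key]
        generated = []
        for out_topic, fields in self.outgoing.items():
            if all(term in self.known for term in fields.values()):
                body = {key: self.known[term] for key, term in fields.items()}
                generated.append((out_topic, body))
        return generated


transformer = Transformer(
    incoming={"map.click": {"lat": "geo:lat", "lng": "geo:long"}},
    outgoing={"weather.query": {"latitude": "geo:lat", "longitude": "geo:long"}},
)
messages = transformer.on_message("map.click", {"lat": 58.38, "lng": 26.72})
print(messages)
```

Here a map widget and a weather widget name the same coordinates differently; because both field sets map to the same ontology terms, the transformer can rebuild the data in the structure the receiver expects. In a real deployment the returned messages would be published back to the hub instead of being printed.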

    Chapter Networking Applications for Embedded Systems

    Embedded system