11 research outputs found

    Kemari: A Portable High Performance Fortran System for Distributed Memory Parallel Processors


    Medical microprocessor systems

    The practical classes and laboratory work in the discipline "Medical Microprocessor Systems" are performed using the Texas Instruments microprocessor programming environment (Code Composer Studio), digital signal processors of the Texas Instruments DSK6400 family, and models of electrical equipment in the LabVIEW 2010 graphical programming environment. The laboratory workshop on programming and constructing medical microprocessor systems, presented in the tutorial, helps students accumulate and effectively apply the material of the theoretical course at all stages of the educational process, which is important for the preparation of master's students and is a necessary link in acquiring practical knowledge of the fundamentals of biomedical electronics.

    Design and implementation of an array language for computational science on a heterogeneous multicore architecture

    The packing of multiple processor cores onto a single chip has become a mainstream solution to fundamental physical issues relating to the microscopic scales employed in the manufacture of semiconductor components. Multicore architectures provide lower clock speeds per core, while aggregate floating-point capability continues to increase. Heterogeneous multicore chips, such as the Cell Broadband Engine (CBE) and modern graphics chips, also address the related issue of an increasing mismatch between high processor speeds and huge latency to main memory. Such chips tackle this memory wall through the provision of addressable caches, increased bandwidth to main memory, and fast thread context switching. An associated cost is often reduced functionality of the individual accelerator cores and increased complexity in their programming. This dissertation investigates the application of a programming language supporting the first-class use of arrays, and capable of automatically parallelising array expressions, to the heterogeneous multicore domain of the CBE, as found in the Sony PlayStation 3 (PS3). The language is a pre-existing and well-documented proper subset of Fortran, known as the 'F' programming language. A bespoke compiler, referred to as E, is developed to support this aim, and written in the Haskell programming language. The output of the compiler is an extended C++ dialect known as Offload C++, which targets the PS3. A significant feature of this language is its use of multiple, statically typed address spaces. By focusing on generic, polymorphic interfaces for both the generated and hand-constructed code, a number of interesting design patterns relating to memory locality are introduced. A suite of medium-sized (100-700 lines), real-world benchmark programs is used to evaluate the performance, correctness, and scalability of the compiler technology.
Absolute speedup values, well in excess of one, are observed for all of the programs. The work ultimately demonstrates that an array language can significantly reduce the effort expended to utilise a parallel heterogeneous multicore architecture, while retaining high performance. A substantial, related advantage of using standard 'F' is that any Fortran compiler can create debuggable, competitively performing serial programs.
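The central idea of the dissertation, automatically parallelising a whole-array expression, can be sketched in miniature (in Python rather than the thesis's F-to-Offload C++ pipeline; all names here are illustrative). A compiler for an array language can partition the index space of an expression such as c = a + 2*b across worker threads, with no change to the user-visible whole-array syntax:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_array_expr(a, b, workers=4):
    """Evaluate c[i] = a[i] + 2*b[i] by partitioning the index
    space across worker threads, mimicking how an array-language
    compiler can parallelise one whole-array expression."""
    n = len(a)
    c = [0] * n
    chunk = (n + workers - 1) // workers  # ceil(n / workers)

    def eval_chunk(start):
        # Each worker evaluates the expression over its own slice.
        for i in range(start, min(start + chunk, n)):
            c[i] = a[i] + 2 * b[i]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Exiting the with-block waits for all chunks to finish.
        list(pool.map(eval_chunk, range(0, n, chunk)))
    return c
```

The point of the sketch is that the data decomposition is derived mechanically from the expression's index space, which is what makes array expressions such an attractive unit of automatic parallelisation.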

    A new parallelisation technique for heterogeneous CPUs

    Parallelisation has moved into mainstream compilers in recent years, and the demand for parallelising tools that can do a better job of automatic parallelisation is higher than ever. During the last decade, considerable attention has been focused on developing programming tools that support both explicit and implicit parallelism to keep up with the power of the new multicore technology. Yet success in developing automatic parallelising compilers has been limited, mainly due to the complexity of the analysis required to exploit available parallelism and to manage other parallelisation measures such as data partitioning, alignment, and synchronisation. This dissertation investigates developing a programming tool that automatically parallelises operations on large data structures on a heterogeneous architecture, and whether a high-level programming language compiler can use this tool to exploit implicit parallelism and realise the performance potential of modern multicore technology. The work involved the development of a fully automatic parallelisation tool, called VSM, that completely hides the underlying details of general-purpose heterogeneous architectures. The VSM implementation provides direct and simple access for users to parallelise array operations on the Cell's accelerators without the need for any annotations or process directives. This work also involved extending the Glasgow Vector Pascal compiler to work with the VSM implementation as a single compiler system. The resulting compiler system, called VP-Cell, takes a single source code and parallelises array expressions automatically. Several experiments were conducted using Vector Pascal benchmarks to demonstrate the validity of the VSM approach. The VP-Cell system achieved significant runtime performance gains on a single accelerator compared with the master processor, and near-linear speedups for code run across the Cell's accelerators.
Though VSM was designed mainly for developing parallelising compilers, it also showed considerable performance when running C code on the Cell's accelerators.
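The VSM idea of hiding accelerator details behind ordinary-looking array operations, with no annotations in user code, can be illustrated with a toy array type (a Python sketch only; the class and pool below are illustrative stand-ins, not the actual VSM interface). The user writes a plain `+` between arrays; the parallel dispatch happens behind the operator:

```python
from concurrent.futures import ThreadPoolExecutor

class ParArray:
    """Toy stand-in for a VSM-style array type: ordinary-looking
    array operations are executed in parallel behind the scenes,
    so user code needs no annotations or directives."""
    POOL = ThreadPoolExecutor(max_workers=4)  # stands in for the accelerators

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        n = len(self.data)
        chunk = max(1, (n + 3) // 4)  # split work into up to 4 chunks
        out = [0] * n

        def work(start):
            # Each "accelerator" adds its own slice of the arrays.
            for i in range(start, min(start + chunk, n)):
                out[i] = self.data[i] + other.data[i]

        list(ParArray.POOL.map(work, range(0, n, chunk)))
        return ParArray(out)
```

From the user's perspective, `ParArray([1, 2]) + ParArray([3, 4])` is indistinguishable from a serial addition, which is exactly the property the abstract attributes to VSM: implicit parallelism with the architecture completely hidden.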

    A Model for the Efficient Parallelisation of Algorithms on Complex, Dynamic Data Structures

    Modern computation-intensive algorithms, for example adaptive numerical solution methods for partial differential equations, often operate on complex, dynamic data structures. Implementing such algorithms on distributed-memory parallel computers by means of data partitioning raises numerous problems (e.g. load balancing). In this work, the new parallel programming model Dynamic Distributed Data (DDD) was developed, which supports the parallelisation effort from the design of the distributed data structures through to the creation of portable, parallel, and efficient program code. The DDD concept is based on a graph-based formal model: the data structure of the respective program (e.g. unstructured meshes) is formally mapped onto a distributed graph consisting of several local graphs. The formal model serves as a specification of the programming model and at the same time defines the key terms used in this work. The system architecture of DDD-based applications follows a layered model, at whose core lies the DDD program library. It offers functions for the dynamic definition of distributed data types and for the management of local objects. In the overlap regions of the local graphs, abstract communication functions are available in the form of so-called interfaces. The essential novelty compared with nearly all existing work, however, is the ability to modify the distributed graph dynamically; this makes it possible, for example, to implement dynamic load balancing or mesh-generation methods simply and efficiently. Arbitrarily complex data topologies can thus be created, migrated, and removed dynamically.
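The DDD notion of a distributed graph built from local graphs with overlap regions can be sketched as follows (a minimal Python illustration under assumed names; `LocalGraph` and `exchange_interface` are hypothetical analogues of DDD's distributed types and interface communication, not its actual API). Each partition owns some nodes and keeps ghost copies of neighbouring partitions' boundary nodes; an interface operation propagates owned values into the ghost copies:

```python
class LocalGraph:
    """One partition's piece of a DDD-style distributed graph:
    owned nodes with data, plus ghost copies of boundary nodes
    owned by other partitions (the overlap region)."""
    def __init__(self, owned, ghosts):
        self.values = {n: 0.0 for n in owned}   # owned node data
        self.ghosts = {n: 0.0 for n in ghosts}  # overlap copies

def exchange_interface(partitions):
    """Hypothetical analogue of a DDD interface operation: copy each
    owned node's current value into every ghost copy held by the
    other partitions, keeping the overlap regions consistent."""
    for p in partitions:
        for n in p.ghosts:
            for q in partitions:
                if n in q.values:       # q owns node n
                    p.ghosts[n] = q.values[n]
```

In a real distributed-memory setting this exchange would be message passing rather than shared access, and DDD's distinctive contribution is that the partitioning itself (which nodes are owned where) may change at runtime, enabling dynamic load balancing.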