24 research outputs found

    Anàlisi de metodologies docents a l’aula catalana: la classe magistral i la classe invertida

    This work began as an effort to evaluate the different teaching technologies and methodologies that can be used in the classroom. In this context, we analysed which of these technologies and methodologies are actually applied in Catalan schools and what difficulties teachers face when applying them. Finally, we carried out a field study with one of the methodologies in order to learn first-hand how it works inside a classroom. The work is therefore divided into three studies, ranging from the most theoretical and abstract analysis to the most practical and specific one. The three studies that make up the work are the following. The first part is a review of the literature of the last fifteen years on the different teaching technologies and methodologies that have been applied to teaching, from primary school to university, including compulsory and post-compulsory secondary education. From this study we have extracted a library of the technologies and methodologies that we found most interesting and that have had the greatest impact on teaching. The second part is a field study carried out with a sample of 46 teachers from all over Catalonia, from primary school to university, to determine the real situation in schools regarding the use of ICT and the application of the methodologies studied in the first part of the work. This study is based on surveys, using a questionnaire for data collection, a non-probabilistic sample and a cross-sectional analysis. The third part is a field study on the application of the flipped-classroom methodology to an advanced vocational training course. Reflective practice was used throughout the course in order to improve the methodology. Of the techniques studied, we chose the flipped classroom because it is one of the newest and least used, and, in our view, the one with the greatest potential. To analyse how the methodology performed we used, on the one hand, the different assessment instruments applied during the course and, on the other, a study based on student satisfaction surveys. A questionnaire was also used for data collection and, again, the sampling was non-probabilistic. This was also a cross-sectional study, carried out at the end of the course in which the methodology was applied.

    High-level compiler analysis for OpenMP

    Nowadays, applications from dissimilar domains, such as high-performance computing and high-integrity systems, require levels of performance that can only be achieved by means of sophisticated heterogeneous architectures. However, the complex nature of such architectures hinders the production of efficient code at acceptable levels of time and cost. Moreover, the need for exploiting parallelism adds complications of its own (e.g., deadlocks and race conditions). In this context, compiler analysis is fundamental for optimizing parallel programs. There is, however, a trade-off between complexity and profit: low-complexity analyses (e.g., reaching definitions) provide information that may be insufficient for many relevant transformations, and complex analyses based on mathematical representations (e.g., the polyhedral model) give accurate results at a high computational cost. A range of parallel programming models providing different levels of programmability, performance and portability enables the exploitation of current architectures. However, OpenMP has proved to have many advantages over its competitors: 1) it delivers levels of performance comparable to highly tunable models such as CUDA and MPI, and better robustness than low-level libraries such as Pthreads; 2) the extensions included in the latest specification meet the characteristics of current heterogeneous architectures (i.e., the coupling of a host processor to one or more accelerators, and the capability of expressing fine-grained, both structured and unstructured, and highly dynamic task parallelism); 3) OpenMP is widely implemented by several chip (e.g., Kalray MPPA, Intel) and compiler (e.g., GNU, Intel) vendors; and 4) although the model currently lacks resiliency and reliability mechanisms, many works, including this thesis, pursue their introduction into the specification. This thesis addresses the study of compiler analysis techniques for OpenMP with two main purposes: 1) to enhance the programmability and reliability of OpenMP, and 2) to prove OpenMP a suitable model to exploit parallelism in safety-critical domains. In particular, the thesis focuses on the tasking model because it offers the flexibility to tackle the parallelization of algorithms with load imbalance, recursion and kernels based on uncountable loops. Additionally, recent works have proved the time-predictability of this model, shortening the distance towards its introduction in safety-critical domains. To enable the analysis of applications using the OpenMP tasking model, the first contribution of this thesis is the extension of a set of classic compiler techniques with support for OpenMP. As a basis for including reliability mechanisms, the second contribution consists of the development of a series of algorithms to statically detect situations involving OpenMP tasks which may lead to a loss of performance, non-deterministic results or run-time failures. A well-known problem of parallel processing related to compilers is the static scheduling of a program represented by a directed graph. Although the literature on static scheduling techniques is extensive, the work related to the generation of the task graph at compile time is very scant. Compilers are limited by the knowledge they can extract, which depends on the application and the programming model.
    The third contribution of this thesis is the generation of a predicated task dependency graph for OpenMP that can be interpreted by the runtime in such a way that the cost of solving dependences is reduced to a minimum. With the previous contributions as a basis for determining the functional safety of OpenMP, the final contribution of this thesis is the adaptation of OpenMP to the safety-critical domain considering two directions: 1) indicating how OpenMP can be safely used in such a domain, and 2) integrating OpenMP into Ada, a language widely used in the safety-critical domain.
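
    To make the situations targeted by these analyses concrete, the following minimal OpenMP tasking sketch (not taken from the thesis) shows three tasks whose ordering is fully described by depend clauses, so a compiler can derive the task dependency graph statically and can flag the data race that would appear if the clauses were removed.

        /* Minimal sketch, not code from the thesis: with the depend clauses, the
         * task dependency graph T1 -> T2 -> T3 is derivable at compile time;
         * without them, T1 and T2 race on 'x' and T3 races on 'y', which is the
         * kind of non-deterministic situation static analysis can report. */
        #include <stdio.h>

        int main(void)
        {
            int x = 0, y = 0;

            #pragma omp parallel
            #pragma omp single
            {
                #pragma omp task depend(out: x)                /* T1: produces x             */
                x = 42;

                #pragma omp task depend(in: x) depend(out: y)  /* T2: consumes x, produces y */
                y = x + 1;

                #pragma omp task depend(in: y)                 /* T3: consumes y             */
                printf("y = %d\n", y);
            }
            return 0;
        }

    Built with any OpenMP 4.0+ compiler (e.g., gcc -fopenmp), the program deterministically prints 43.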

    Una eina per verificar propietats en xarxes de Petri usant àlgebra lineal

    The code of the tool developed cannot be uploaded through this application. Anyone interested should contact the author.

    Compiler Analysis and its application to OmpSs

    Nowadays, productivity is the buzzword in every area of computer science. Several metrics have been defined to measure the productivity of any type of system; some of the most important are performance, programmability, cost and power usage. From architects to programmers, improving productivity has become an important aspect of any development, and programming models play an important role in this. Thanks to the expressiveness of a high-level representation that is not tied to any particular architecture, and the extra level of abstraction they provide over architecture-specific programming languages, programming models aim to be a cornerstone in the enhancement of productivity. OmpSs is a programming model developed at the Barcelona Supercomputing Center, built on top of the Mercurium compiler and the Nanos++ runtime library, which aims to exploit task-level parallelism and heterogeneous architectures. This model covers many productivity aspects, such as programmability, defining simple directives that can be integrated into sequential codes without restructuring them to obtain parallelism, and performance, allowing these directives to target multiple architectures and to express asynchronous parallelism. Nonetheless, the convenient design of a programming model and the use of a powerful architecture are not, by themselves, enough to achieve good productivity. Compilers are crucial in the communication between these two components: they are meant to exploit both the underlying architectures and the programmers' code, and analyses and optimizations are the techniques that can produce better transformations. Therefore, we have focused our work on enhancing the productivity of OmpSs by implementing a set of high-level analyses and optimizations in the Mercurium compiler. They address two directions: obtaining better performance by improving code generation, and improving the programmability of the programming model by relieving the programmer of some tedious and error-prone tasks. Since Mercurium is a source-to-source compiler, we have applied these analyses on a high-level representation; this is important because they are architecture independent and can therefore be useful for any target device in the back-end transformations.
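
    As a rough illustration of the directive-based style mentioned above, the sketch below (not code from the thesis, and using OmpSs clause names as accepted by the Mercurium compiler) annotates two statements of a sequential program as tasks; the in/out clauses let the Nanos++ runtime order them without any restructuring of the original code.

        /* Hedged OmpSs sketch: 'out(v)' / 'in(v)' are OmpSs dependence clauses
         * compiled by Mercurium; a plain OpenMP compiler would instead need
         * depend(out: v)/depend(in: v) inside a parallel/single region. */
        #include <stdio.h>

        int main(void)
        {
            int v;

            #pragma omp task out(v)   /* producer task: the runtime records that it writes v   */
            v = 42;

            #pragma omp task in(v)    /* consumer task: scheduled only after the producer ends */
            printf("computed %d\n", v);

            #pragma omp taskwait      /* wait for both tasks before returning                  */
            return 0;
        }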

    OpenMP static TDG runtime implementation and its usage in heterogeneous computing

    OpenMP, the standard for shared-memory parallel programming, offers the possibility of parallelizing sequential programs on accelerators through the target directive. However, CUDA Graphs, a new and efficient CUDA feature, are not yet supported. In this work, we present an automatic transformation of the OpenMP task dependency graph (TDG) into a CUDA Graph, increasing the programmability of the latter.
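
    The sketch below (an illustration, not the paper's benchmark code) shows the kind of OpenMP source such a transformation starts from: asynchronous target tasks whose depend clauses form a small, statically known TDG that can be recorded and replayed on the accelerator.

        /* Illustrative only: two offloaded regions linked by a dependence on 'a'
         * form a two-node task dependency graph (init -> scale). */
        #include <stdio.h>
        #define N 1024

        int main(void)
        {
            static float a[N], b[N];

            #pragma omp target teams distribute parallel for \
                    map(tofrom: a) depend(out: a) nowait
            for (int i = 0; i < N; ++i)
                a[i] = (float)i;                  /* TDG node 1 */

            #pragma omp target teams distribute parallel for \
                    map(to: a) map(from: b) depend(in: a) depend(out: b) nowait
            for (int i = 0; i < N; ++i)
                b[i] = 2.0f * a[i];               /* TDG node 2, ordered after node 1 */

            #pragma omp taskwait                  /* wait for the whole graph */
            printf("b[1] = %.1f\n", b[1]);
            return 0;
        }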

    OpenMP to CUDA graphs: a compiler-based transformation to enhance the programmability of NVIDIA devices

    Heterogeneous computing is increasingly being used in a diversity of computing systems, ranging from HPC to the real-time embedded domain, to cope with the performance requirements. Due to the variety of accelerators, e.g., FPGAs and GPUs, the use of high-level parallel programming models is desirable to exploit their performance capabilities while maintaining an adequate productivity level. In that regard, OpenMP is a well-known high-level programming model that incorporates powerful task and accelerator models capable of efficiently exploiting structured and unstructured parallelism in heterogeneous computing. This paper presents a novel compiler transformation technique that automatically transforms OpenMP code into CUDA graphs, combining the programmability benefits of a high-level programming model such as OpenMP with the performance benefits of a low-level programming model such as CUDA. Evaluations have been performed on two NVIDIA GPUs from the HPC and embedded domains, i.e., the V100 and the Jetson AGX, respectively. This work has been supported by the EU H2020 project AMPERE under grant agreement No. 871669.
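
    For reference, the host-side sketch below shows the CUDA Graphs mechanics that such a transformation targets (it is not the code the compiler generates): a sequence of asynchronous operations is captured once into a graph, instantiated, and then relaunched with a single call, which is where the gain over issuing the operations one by one comes from.

        /* Hedged sketch of CUDA Graph capture and replay in plain C using the
         * CUDA runtime API; memset/memcpy stand in for the kernel nodes that the
         * transformation would create for each offloaded OpenMP region. */
        #include <cuda_runtime.h>
        #include <stdio.h>

        int main(void)
        {
            const size_t n = 1 << 20;
            float *dev = NULL, *host = NULL;
            cudaStream_t stream;
            cudaGraph_t graph;
            cudaGraphExec_t exec;

            cudaStreamCreate(&stream);
            cudaMalloc((void **)&dev, n * sizeof(float));
            cudaMallocHost((void **)&host, n * sizeof(float));

            /* Record the node sequence once. */
            cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
            cudaMemsetAsync(dev, 0, n * sizeof(float), stream);              /* node 1 */
            cudaMemcpyAsync(host, dev, n * sizeof(float),
                            cudaMemcpyDeviceToHost, stream);                 /* node 2 */
            cudaStreamEndCapture(stream, &graph);

            /* Instantiate once, replay many times (pre-CUDA-12 signature shown). */
            cudaGraphInstantiate(&exec, graph, NULL, NULL, 0);
            for (int it = 0; it < 10; ++it)
                cudaGraphLaunch(exec, stream);
            cudaStreamSynchronize(stream);

            printf("host[0] = %.1f\n", host[0]);

            cudaGraphExecDestroy(exec);
            cudaGraphDestroy(graph);
            cudaFree(dev);
            cudaFreeHost(host);
            cudaStreamDestroy(stream);
            return 0;
        }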

    Enabling Ada and OpenMP runtimes interoperability through template-based execution

    The growing trend to support parallel computation to enable the performance gains of recent hardware architectures is increasingly present in more conservative domains, such as safety-critical systems. Applications such as autonomous driving require levels of performance only achievable by fully leveraging the potential parallelism in these architectures. To address this requirement, the Ada language, designed for safety and robustness, is considering the inclusion of parallel features in the next revision of the standard (Ada 202X). Recent works have motivated the use of OpenMP, a de facto standard in high-performance computing, to enable parallelism in Ada, showing the compatibility of the two models and proposing static analysis to enhance reliability. This paper summarizes these previous efforts towards the integration of OpenMP into Ada to exploit its benefits in terms of portability, programmability and performance, while providing the safety benefits of Ada in terms of correctness. The paper extends those works by proposing and evaluating an application transformation that enables the OpenMP and Ada runtimes to operate (under certain restrictions) as if they were integrated. The objective is to allow Ada programmers to naturally experiment with and evaluate the benefits of parallelizing concurrent Ada tasks with OpenMP, while ensuring compliance with both specifications. This work was supported by the Spanish Ministry of Science and Innovation under contract TIN2015-65316-P, by the European Union's Horizon 2020 Research and Innovation Programme under grant agreements No. 611016 and No. 780622, and by the FCT (Portuguese Foundation for Science and Technology) within the CISTER Research Unit (CEC/04234).

    A toolchain to verify the parallelization of OmpSs-2 applications

    Programming models for task-based parallelization based on compile-time directives are very effective at uncovering the parallelism available in HPC applications. Despite that, the process of correctly annotating complex applications is error-prone and may hinder the general adoption of these models. In this paper, we target the OmpSs-2 programming model and present a novel toolchain able to detect parallelization errors coming from non-compliant OmpSs-2 applications. Our toolchain verifies compliance with the OmpSs-2 programming model using local task analysis to deal with each task separately, and structural induction to extend the analysis to the whole program. To improve the effectiveness of our tools, we also introduce some ad-hoc verification annotations, which can be used manually or automatically to disable the analysis of specific code regions. Experiments run on a sample of representative kernels and applications show that our toolchain can be successfully used to verify the parallelization of complex real-world applications. This project is supported by the European Union's Horizon 2020 research and innovation programme under grant agreements No. 754304 (DEEP-EST) and No. 871669 (AMPERE), by the project HPCEUROPA3 (INFRAIA-2016-1-730897), by the Ministry of Economy of Spain through the Severo Ochoa Center of Excellence Program (SEV-2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by the Generalitat de Catalunya (2017-SGR-1481).
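
    As a hedged illustration of what a non-compliant annotation looks like (this is not a test case from the paper), the OmpSs-2 fragment below declares an out(x) dependence on the producer task but omits the matching in(x) on the consumer, so nothing orders the two tasks and the annotations do not reflect the data actually accessed; this is the class of error such a toolchain is meant to report.

        /* Hedged OmpSs-2 sketch ('#pragma oss' syntax, built with the OmpSs-2
         * toolchain): the second task reads x without declaring in(x). */
        #include <stdio.h>

        int main(void)
        {
            int x = 0;

            #pragma oss task out(x)        /* producer: declares that it writes x     */
            x = 42;

            #pragma oss task shared(x)     /* BUG: reads x but declares no in(x)      */
            printf("x = %d\n", x);         /* nothing orders this after the producer  */

            #pragma oss taskwait
            return 0;
        }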

    Experiences on the characterization of parallel applications in embedded systems with Extrae/Paraver

    Cutting-edge functionalities in embedded systems require the use of parallel architectures to meet their performance requirements. This imposes the introduction of a new layer in the software stacks of embedded systems: the parallel programming model. Unfortunately, the tools used to analyze embedded systems fall short of characterizing the performance of parallel applications at the parallel programming model level and correlating it with information about non-functional requirements such as real-time behaviour, energy and memory usage. HPC tools, like Extrae, are designed with that level of abstraction in mind, but their main focus is on performance evaluation. Overall, providing insightful information about the performance of parallel embedded applications at the parallel programming model level, and relating it to the non-functional requirements, is of paramount importance to fully exploit the performance capabilities of parallel embedded architectures. This paper contributes to the state of the art of analysis tools for embedded systems by: (1) analyzing the particular constraints of embedded systems compared to HPC systems (e.g., static setting, restricted memory, limited drivers) when it comes to supporting HPC analysis tools; (2) porting Extrae, a powerful tracing tool from the HPC domain, to the GR740 platform, an SoC used in the space domain; and (3) augmenting Extrae with new features needed to correlate the parallel execution with the following non-functional requirements: energy, temperature and memory usage. Finally, the paper presents the usefulness of Extrae to characterize OpenMP applications and their non-functional requirements, evaluating different aspects of the applications running on the GR740. This work has been partially funded by the HP4S (High Performance Parallel Payload Processing for Space) project under ESA-ESTEC ITI contract No. 4000124124/18/NL/CRS.
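
    A minimal sketch of how this kind of correlation can be approached at the application level, assuming Extrae's user-events API (the extrae_user_events.h header with Extrae_init, Extrae_event and Extrae_fini) and arbitrary event type/value numbers chosen for illustration: user events mark the phases of an OpenMP code so that, in Paraver, parallel activity can be lined up with externally sampled metrics such as energy, temperature or memory usage.

        /* Hedged example: event type 1000 with values 1/0 marks entry/exit of a
         * compute phase; the numbers are arbitrary and only serve as trace markers. */
        #include <stdio.h>
        #include "extrae_user_events.h"

        static double compute_phase(void)
        {
            double acc = 0.0;
            #pragma omp parallel for reduction(+: acc)
            for (int i = 0; i < 1000000; ++i)
                acc += i * 0.5;
            return acc;
        }

        int main(void)
        {
            Extrae_init();              /* start tracing (also possible via LD_PRELOAD) */

            Extrae_event(1000, 1);      /* enter the instrumented phase */
            double r = compute_phase();
            Extrae_event(1000, 0);      /* leave the instrumented phase */

            Extrae_fini();              /* flush the trace for Paraver  */
            printf("result = %f\n", r);
            return 0;
        }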

    Heuristic-based task-to-thread mapping in multi-core processors

    OpenMP can be used in real-time applications to enhance system performance. However, the predictability of OpenMP applications is still a challenge. This paper investigates heuristics for mapping OpenMP task graphs onto the underlying threads, for the development of time-predictable OpenMP programs. The approach is based on a global scheduling queue as well as per-thread allocation queues, and is divided into a scheduling phase and an allocation phase. In the former, OpenMP task-parts are discovered from the OpenMP task graph and placed in the scheduling queue; an appropriate allocation queue is then selected for each task-part using one of four heuristic algorithms. In the latter, the best task-part is selected from the allocation queue to be allocated to, and executed by, an idle thread. Preliminary simulation results show that the new method outperforms BFS and WFS in terms of scheduling time and idle time. This work has been co-funded by the European Commission through the AMPERE project (H2020 grant agreement No. 745601).
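
    The sketch below is a schematic of the two-phase queue structure described above, not the authors' implementation: task-parts start in a global scheduling queue, a heuristic (here, a simple shortest-queue choice standing in for the four heuristics of the paper) assigns each one to a per-thread allocation queue, and idle threads then take work only from their own queue.

        /* Schematic sketch in plain C; the TaskPart fields and the shortest-queue
         * heuristic are illustrative assumptions, not the paper's algorithms. */
        #include <stdio.h>

        #define NTHREADS  4
        #define QUEUE_CAP 64

        typedef struct { int id; int cost; } TaskPart;
        typedef struct { TaskPart items[QUEUE_CAP]; int len; } Queue;

        static Queue sched_q;               /* global scheduling queue      */
        static Queue alloc_q[NTHREADS];     /* per-thread allocation queues */

        static void push(Queue *q, TaskPart tp) { q->items[q->len++] = tp; }

        /* Scheduling phase: place each task-part in the least-loaded allocation queue. */
        static void schedule_phase(void)
        {
            for (int i = 0; i < sched_q.len; ++i) {
                int best = 0;
                for (int t = 1; t < NTHREADS; ++t)
                    if (alloc_q[t].len < alloc_q[best].len)
                        best = t;
                push(&alloc_q[best], sched_q.items[i]);
            }
            sched_q.len = 0;
        }

        /* Allocation phase: an idle thread picks the next task-part from its own queue. */
        static int allocate_next(int thread_id, TaskPart *out)
        {
            Queue *q = &alloc_q[thread_id];
            if (q->len == 0)
                return 0;
            *out = q->items[--q->len];
            return 1;
        }

        int main(void)
        {
            for (int i = 0; i < 8; ++i)
                push(&sched_q, (TaskPart){ .id = i, .cost = 1 + i % 3 });

            schedule_phase();

            TaskPart tp;
            for (int t = 0; t < NTHREADS; ++t)
                while (allocate_next(t, &tp))
                    printf("thread %d runs task-part %d\n", t, tp.id);
            return 0;
        }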