33 research outputs found

    Efficient implementation of the Chinese Remainder Theorem

    Study, design and implementation of new parallel software approaches to the Chinese Remainder Theorem on a range of parallel platforms and architectures. Some of the typical designs cause well-known storage and memory-management problems at run time when handling large integers. This problem motivates the design of new, more efficient and refined methods that resolve these issues in an original and scalable way. The results obtained not only substantially improve on the sequential implementation, but also represent an advance in efficiency, performance and scalability over other currently available alternatives.
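For orientation, the sequential reconstruction that the abstract uses as its baseline can be sketched as follows. This is a minimal textbook version, not the authors' implementation: the function name `crt` and the sample moduli are illustrative assumptions, and the moduli are assumed to be pairwise coprime. Note that the full product `M` and each partial product `M // m` are materialised as big integers, which is the kind of memory pressure the abstract alludes to.

```python
from math import prod


def crt(residues, moduli):
    """Return the unique x in [0, M) with x % m_i == r_i for every pair.

    Assumes the moduli are pairwise coprime (hypothetical helper, textbook
    construction). Requires Python 3.8+ for pow(a, -1, m).
    """
    if len(residues) != len(moduli):
        raise ValueError("residues and moduli must have the same length")
    M = prod(moduli)  # full product of all moduli (a big integer)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m            # product of all the other moduli
        inv = pow(Mi, -1, m)   # modular inverse; ValueError if not coprime
        x += r * Mi * inv      # each term is independent of the others
    return x % M


# Classic small instance: x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7)
print(crt([2, 3, 2], [3, 5, 7]))  # 23
```

Each loop iteration depends only on its own residue and modulus, so the terms can in principle be computed independently and summed afterwards.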

    Hybrid algorithms for efficient Cholesky decomposition and matrix inverse using multicore CPUs with GPU accelerators

    The use of linear algebra routines is fundamental to many areas of computational science, yet their implementation in software still forms the main computational bottleneck in many widely used algorithms. In machine learning and computational statistics, for example, the use of Gaussian distributions is ubiquitous, and routines for calculating the Cholesky decomposition, matrix inverse and matrix determinant must often be called many thousands of times for common algorithms, such as Markov chain Monte Carlo. These linear algebra routines consume most of the total computational time of a wide range of statistical methods, and any improvements in this area will therefore greatly increase the overall efficiency of algorithms used in many scientific application areas. The importance of linear algebra algorithms is clear from the substantial effort that has been invested over the last 25 years in producing low-level software libraries such as LAPACK, which generally optimise these linear algebra routines by breaking up a large problem into smaller problems that may be computed independently. The performance of such libraries is, however, strongly dependent on the specific hardware available. LAPACK was originally developed for single-core processors with a memory hierarchy, whereas modern-day computers often consist of mixed architectures, with large numbers of parallel cores and graphics processing units (GPUs) being used alongside traditional CPUs. The challenge lies in making optimal use of these different types of computing units, which generally have very different processor speeds and types of memory. In this thesis we develop novel low-level algorithms that may be generally employed in blocked linear algebra routines, which automatically optimise themselves to take full advantage of the variety of heterogeneous architectures that may be available. We present a comparison of our methods with MAGMA, the state-of-the-art open source implementation of LAPACK designed specifically for hybrid architectures, and demonstrate that a speed increase of up to 400% may be obtained using our novel algorithms, specifically when running commonly used Cholesky matrix decomposition, matrix inverse and matrix determinant routines.
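To make concrete how the routines named in the abstract relate, here is a minimal sketch of an unblocked, textbook Cholesky factorisation with the log-determinant derived from the factor. It is not the thesis's blocked or hybrid CPU/GPU algorithm; the function names `cholesky` and `log_det` and the sample matrix are illustrative assumptions, and the input is assumed to be symmetric positive definite.

```python
import math


def cholesky(a):
    """Return lower-triangular L with a == L * L^T (textbook, unblocked).

    `a` is a list of rows and is assumed symmetric positive definite.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: what remains of a[j][j] after earlier columns
        s = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j
        for i in range(j + 1, n):
            t = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = t / L[j][j]
    return L


def log_det(a):
    """log(det(a)) via the factor: det(a) = (product of diag(L)) ** 2."""
    L = cholesky(a)
    return 2.0 * sum(math.log(L[i][i]) for i in range(len(L)))


A = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
print(cholesky(A))           # rows [2, 0, 0], [6, 1, 0], [-8, 5, 3]
print(math.exp(log_det(A)))  # approximately 36
```

Working with the log-determinant rather than the determinant itself avoids overflow for large matrices, which matters when evaluating Gaussian densities.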

    Tools and Algorithms for the Construction and Analysis of Systems

    This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. A total of 41 full papers presented in the proceedings were carefully reviewed and selected from 141 submissions. The set also contains 7 tool papers, 6 tool demo papers, and 9 SV-COMP competition papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers.

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

    ReCoSoC is intended to be an annual meeting to present and discuss accumulated expertise as well as state-of-the-art research on SoC-related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion-transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability.

    Continuous Electrode Inertial Electrostatic Confinement Fusion

    The NIAC Phase I project on Inertial Electrostatic Confinement was a continuation of early stage research that was funded by an NSTRF. The student on the project, Andrew Chap, was funded by the NSTRF from Fall 2013 through the Summer of 2017, and then was funded on the NIAC through the completion of his PhD. A significant amount of work targeting the plasma confinement physics was the focus of his NSTRF, and over the course of that effort he developed a number of analyses and computational tools that leveraged GPU parallelization. A detailed discussion of these models can be found in his dissertation, which has been included as Appendix D in this report. As a requirement for the NSTRF, Andrew's full dissertation was submitted at the end of the program. Having developed the computational tools, a substantial amount of simulation and analyses leveraging those tools were conducted during the Fall of 2017, under the auspices of the NIAC funded research. Much of this work targeted optimization of the confinement fields, investigating their structure and the possible advantages of having them be time-varying. The results of these simulations can also be found in Appendix D. One of the main results from this research is that the density of ions electrostatically confined within the system can indeed be increased by several orders of magnitude by optimizing the radial potential distribution, and by dynamically varying these fields to maintain compressed ion bunches. An electron population can also be confined within the core by a static radial cusped magnetic field, which helps to support a greater ion density within the core. The issue with the confinement mechanism is that as the ion densities are increased toward fusion-relevant levels, the electrostatic forces generated by the confined electron population become so great that the ions are no longer energetic enough to leave the device core. As their excursions into the outer channels are diminished, the mechanism that is used to maintain their non-thermal velocity distributions becomes ineffective, and eventually the ions become fully confined within the core, where they thermalize. A possible fix to the problem comes by discarding the active ion control (a main pillar of the concept) but retaining the structure of the permanent magnet confinement of the electron population. Such cusped field confinement has been the focus of other IEC approaches (e.g. Polywell), but the high transparency of the permanent magnet structure lends itself to better ion extraction and power conversion (a second pillar of the concept). The question then becomes whether any influence on the ion evolution within the core can be achieved to slow the thermalization of the ions. Such approaches have been studied in highly idealized analytic models, but face major criticisms within the literature. While this is a possible path forward, the uncertainty in the approach did not warrant committing NIAC Phase II resources to investigating the concept at this time.

    Enabling Independent Communication for FPGAs in High Performance Computing

    JTIT

    kwartalni