3,140 research outputs found

    Offloading strategies for Stencil kernels on the KNC Xeon Phi architecture: Accuracy versus performance

    The ever-increasing computational requirements of HPC and service-provider applications are becoming a great challenge for hardware and software designers. These requirements are reaching levels where isolated development in either field is not enough to meet the challenge. A holistic view of computational thinking is therefore the only way to succeed in real scenarios. However, this is not a trivial task, as it requires, among other things, hardware-software codesign. On the hardware side, most high-throughput computers are designed for heterogeneity, where accelerators (e.g. Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), etc.) are connected to the host CPUs through a high-bandwidth bus such as PCI-Express. Applications, whether via programmers, compilers, or runtimes, must orchestrate data movement, synchronization, and so on among devices with different compute and memory capabilities. This increases programming complexity and may reduce overall application performance. This article evaluates different offloading strategies to leverage heterogeneous systems, based on several cards with first-generation Xeon Phi coprocessors (Knights Corner). We use an 11-point 3-D Stencil kernel that models heat dissipation as a case study. Our results reveal substantial performance improvements when using several accelerator cards. Additionally, we show that computing an approximate result by reducing the communication overhead can yield 23% performance gains for double-precision data sets.

    The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is jointly supported by the Fundación Séneca (Agencia Regional de Ciencia y Tecnología, Región de Murcia) under grants 15290/PI/2010 and 18946/JLI/13 and by the Spanish MINECO, as well as European Commission FEDER funds, under grants TIN2015-66972-C5-3-R and TIN2016-78799-P (AEI/FEDER, UE). MH was supported by a research grant from the PRODEP under the Professional Development Program for Teachers (UAGro-197), México.

    Hernández, M.; Cebrián, JM.; Cecilia-Canales, JM.; García, JM. (2020). Offloading strategies for Stencil kernels on the KNC Xeon Phi architecture: Accuracy versus performance. International Journal of High Performance Computing Applications. 34(2):199-297. https://doi.org/10.1177/1094342017738352

    The Kramer sampling theorem revisited

    The classical Kramer sampling theorem provides a method for obtaining orthogonal sampling formulas. It has also been the cornerstone of a significant mathematical literature on sampling theorems associated with differential and difference problems. In this work we provide, in a unified way, new and old generalizations of this result corresponding to various settings; all these generalizations are illustrated with examples. All the situations considered in the paper share a basic approach: the functions to be sampled are obtained by duality in a separable Hilbert space through a Hilbert-space-valued kernel K defined on an appropriate domain.

    This work has been supported by the grant MTM2009–08345 from the Spanish Ministerio de Ciencia e Innovación (MICINN).

    Pattern-wavelength coarsening from topological dynamics in silicon nanofoams

    We report the experimental observation of a submicron cellular structure on the surface of silicon targets eroded by an ion plasma. Analysis by atomic force microscopy allows us to assess the time evolution and show that the system can be described quantitatively by the convective Cahn-Hilliard equation, found in the study of domain coarsening for a large class of driven systems. The space-filling trait of the ensuing pattern relates it to evolving foams. Through this connection, we are actually able to derive the coarsening law for the pattern wavelength from the nontrivial topological dynamics of the cellular structure. Thus, the study of the topological properties of patterns in nonvariational spatially extended systems emerges as complementary to morphological approaches to their challenging coarsening properties.

    This work has been partially supported by MICINN and MEC (Spain) via Grants No. IS2009-12964-C05-01, No. FIS2009-12964-C05-03, No. FIS2012-38866-C05-01, No. IS2012-38866-C05-01-05, No. MAT2011-27470-C02-02, and No. CSD2009-00013.

    On Some Sampling-Related Frames in U-Invariant Spaces

    This paper is concerned with the characterization as frames of some sequences in U-invariant spaces of a separable Hilbert space H, where U denotes a unitary operator defined on H; the dual frames having the same form are also found. This general setting includes, in particular, shift-invariant or modulation-invariant subspaces in L2(R), where these frames are intimately related to the generalized sampling problem. We also deal with some related perturbation problems. In doing so, we need the unitary operator to belong to a continuous group of unitary operators.

    Psychometric properties of an emotional adjustment computerized adaptive test

    This work describes the psychometric properties of a computerized adaptive test (CAT) for measuring emotional adjustment. A review of the Item Response Theory (IRT) literature indicates that IRT has been used mainly to assess achievement and ability rather than personality variables. Nevertheless, several studies in recent years have successfully applied IRT to personality assessment instruments. Even so, few works have examined the characteristics of an IRT-based computerized adaptive test for measuring a personality trait such as emotional adjustment. Our results show that the CAT is efficient for assessing emotional adjustment: it provides a valid and accurate measurement while using fewer items than the emotional adjustment scales of the most widely established instruments.

    Percolation-based precursors of transitions in extended systems

    Abrupt transitions are ubiquitous in the dynamics of complex systems. Finding precursors, i.e. early indicators of their arrival, is fundamental in many areas of science ranging from electrical engineering to climate. However, obtaining warnings of an approaching transition well in advance remains an elusive task. Here we show that a functional network, constructed from spatial correlations of the system’s time series, experiences a percolation transition well before the actual system reaches a bifurcation point, due to the collective phenomena leading to the global change. Concepts from percolation theory are then used to introduce early warning precursors that anticipate the system’s tipping point. We illustrate the generality and versatility of our percolation-based framework with model systems experiencing different types of bifurcations and with Sea Surface Temperature time series associated with the El Niño phenomenon.

    V.R.-M. was supported by the European Commission Marie-Curie ITN program (FP7-320 PEOPLE-2011-ITN) through the LINC project (Grant no. 289447). We also acknowledge support from FEDER and the Spanish Ministry of Economy and Competitiveness (MINECO) through the project INTENSE@COSYP (FIS2012-30634) and from the European Commission through project LASAGNE (FP7-ICT- 318132). J.J.R. acknowledges funding from the Ramón y Cajal program of MINECO. EUR 1,165 APC fee funded by the EC FP7 Post-Grant Open Access Pilot. Peer reviewed.

    The cyanobacterial ribosomal-associated protein LrtA from Synechocystis sp. PCC 6803 is an oligomeric protein in solution with chameleonic sequence properties

    The LrtA protein of Synechocystis sp. PCC 6803 intervenes in cyanobacterial post-stress survival and in stabilizing 70S ribosomal particles. It belongs to the hibernation promoting factor (HPF) family of proteins, involved in protein synthesis. In this work, we studied the conformational preferences and stability of isolated LrtA in solution. At physiological conditions, as shown by hydrodynamic techniques, LrtA was involved in a self-association equilibrium. As indicated by Nuclear Magnetic Resonance (NMR), circular dichroism (CD) and fluorescence, the protein acquired a folded, native-like conformation between pH 6.0 and 9.0. However, that conformation was not very stable, as suggested by thermal and chemical denaturations followed by CD and fluorescence. Theoretical studies of its highly charged sequence suggest that LrtA had a Janus sequence, with a context-dependent fold. Our modelling and molecular dynamics (MD) simulations indicate that the protein adopted the same fold observed in other members of the HPF family ( - - - - - ) at its N-terminal region (residues 1–100), whereas the C terminus (residues 100–197) appeared disordered and collapsed, supporting the overall percentage of secondary structure obtained by CD deconvolution. Thus, LrtA has a chameleonic sequence and is the first member of the HPF family found to be involved in a self-association equilibrium when isolated in solution.

    Funding: Ministerio de Economía y Competitividad CTQ2015-64445-R; Ministerio de Economía y Competitividad BIO2016-78020-R; Ministerio de Economía y Competitividad FIS2014-52212-R; Ministerio de Economía y Competitividad BIO2016-75634-P; Fundación Séneca 19353/PI/1

    Synthesis of TiB2-Ni3B Nanocomposite Powders by Mechanical Alloying

    A combination of good oxidation resistance, thermal stability, hardness, and high strength is of great interest in engineering, and these properties can be obtained with the Ni-Ti-B ternary system. Mechanical alloying (MA) is an alternative and cheaper method for synthesizing this kind of metal-ceramic material compared with the traditional melt-and-quench process. The transformation sequence of all the mixtures showed the formation of the (γ-Ni) phase with a nodular morphology, together with the TiB2 phase (needle morphology), which became more evident as the titanium content increased (M2 and M3 mixtures) after 24 h of milling. Thermal activation of the milled powders showed the nucleation and growth of Ni3B (O boride) and TiB2 (Hex) as the main phases after heat treatment, with the TiB2 phase (thin-flake morphology) nucleating on the Ni3B matrix. Ternary alloying by MA took place under a metastable equilibrium, offering the possibility of forming glassy alloys at compositions that are not accessible by melting or quenching techniques.