47,744 research outputs found

    Architectural support for task dependence management with flexible software scheduling

    Get PDF
    The growing complexity of multi-core architectures has motivated a wide range of software mechanisms to improve the orchestration of parallel executions. Task parallelism has become a very attractive approach thanks to its programmability, portability and potential for optimizations. However, with the expected increase in core counts, finer-grained tasking will be required to exploit the available parallelism, which will increase the overheads introduced by the runtime system. This work presents Task Dependence Manager (TDM), a hardware/software co-designed mechanism to mitigate runtime system overheads. TDM introduces a hardware unit, denoted Dependence Management Unit (DMU), and minimal ISA extensions that allow the runtime system to offload costly dependence tracking operations to the DMU and to still perform task scheduling in software. With lower hardware cost, TDM outperforms hardware-based solutions and enhances the flexibility, adaptability and composability of the system. Results show that TDM improves performance by 12.3% and reduces EDP by 20.4% on average with respect to a software runtime system. Compared to a runtime system fully implemented in hardware, TDM achieves an average speedup of 4.2% with 7.3x less area requirements and significant EDP reductions. In addition, five different software schedulers are evaluated with TDM, illustrating its flexibility and performance gains.This work has been supported by the RoMoL ERC Advanced Grant (GA 321253), by the European HiPEAC Network of Excellence, by the Spanish Ministry of Science and Innovation (contracts TIN2015-65316-P, TIN2016-76635-C2-2-R and TIN2016-81840-REDT), by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), and by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 671697 and No. 671610. M. Moretó has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship number JCI-2012-15047.Peer ReviewedPostprint (author's final draft

    Phase synchronization of instrumental music signals

    Full text link
    Signal analysis is one of the finest scientific techniques in communication theory. Some quantitative and qualitative measures describe the pattern of a music signal, vary from one to another. Same musical recital, when played by different instrumentalists, generates different types of music patterns. The reason behind various patterns is the psychoacoustic measures - Dynamics, Timber, Tonality and Rhythm, varies in each time. However, the psycho-acoustic study of the music signals does not reveal any idea about the similarity between the signals. For such cases, study of synchronization of long-term nonlinear dynamics may provide effective results. In this context, phase synchronization (PS) is one of the measures to show synchronization between two non-identical signals. In fact, it is very critical to investigate any other kind of synchronization for experimental condition, because those are completely non identical signals. Also, there exists equivalence between the phases and the distances of the diagonal line in Recurrence plot (RP) of the signals, which is quantifiable by the recurrence quantification measure tau-recurrence rate. This paper considers two nonlinear music signals based on same raga played by two eminent sitar instrumentalists as two non-identical sources. The psycho-acoustic study shows how the Dynamics, Timber, Tonality and Rhythm vary for the two music signals. Then, long term analysis in the form of phase space reconstruction is performed, which reveals the chaotic phase spaces for both the signals. From the RP of both the phase spaces, tau-recurrence rate is calculated. Finally by the correlation of normalized tau-recurrence rate of their 3D phase spaces and the PS of the two music signals has been established. The numerical results well support the analysis

    Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems

    Full text link
    Two emerging hardware trends will dominate the database system technology in the near future: increasing main memory capacities of several TB per server and massively parallel multi-core processing. Many algorithmic and control techniques in current database technology were devised for disk-based systems where I/O dominated the performance. In this work we take a new look at the well-known sort-merge join which, so far, has not been in the focus of research in scalable massively parallel multi-core data processing as it was deemed inferior to hash joins. We devise a suite of new massively parallel sort-merge (MPSM) join algorithms that are based on partial partition-based sorting. Contrary to classical sort-merge joins, our MPSM algorithms do not rely on a hard to parallelize final merge step to create one complete sort order. Rather they work on the independently created runs in parallel. This way our MPSM algorithms are NUMA-affine as all the sorting is carried out on local memory partitions. An extensive experimental evaluation on a modern 32-core machine with one TB of main memory proves the competitive performance of MPSM on large main memory databases with billions of objects. It scales (almost) linearly in the number of employed cores and clearly outperforms competing hash join proposals - in particular it outperforms the "cutting-edge" Vectorwise parallel query engine by a factor of four.Comment: VLDB201

    SKIRT: hybrid parallelization of radiative transfer simulations

    Full text link
    We describe the design, implementation and performance of the new hybrid parallelization scheme in our Monte Carlo radiative transfer code SKIRT, which has been used extensively for modeling the continuum radiation of dusty astrophysical systems including late-type galaxies and dusty tori. The hybrid scheme combines distributed memory parallelization, using the standard Message Passing Interface (MPI) to communicate between processes, and shared memory parallelization, providing multiple execution threads within each process to avoid duplication of data structures. The synchronization between multiple threads is accomplished through atomic operations without high-level locking (also called lock-free programming). This improves the scaling behavior of the code and substantially simplifies the implementation of the hybrid scheme. The result is an extremely flexible solution that adjusts to the number of available nodes, processors and memory, and consequently performs well on a wide variety of computing architectures.Comment: 21 pages, 20 figure

    New Concept of PLC Modems: Multi-Carrier System for Frequency Selective Slow-Fading Channels Based on Layered SCCC Turbocodes

    Get PDF
    The article introduces a novel concept of a PLC modem as a complement to the existing G3 and PRIME standards for communications using medium- or high-voltage overhead or cable lines. The proposed concept is based on the fact that the levels of impulse noise and frequency selectivity are lower on high-voltage lines than on low-voltage ones. Also, the demands for “cost-effective” circuitry design are not so crucial as in the case of modems for low-voltage level. In contract to these positive conditions, however, there is the need to overcome much longer distances and to take into account low SNR on the receiving side. With respect to the listed reasons, our concept makes use of MCM, instead of OFDM. The assumption of low SNR is compensated through the use of an efficient channel coding based on a serially concatenated turbo code. In addition, MCM offers lower latency and PAPR compared to OFDM. Therefore, when using MCM, it is possible to excite the line with higher power. The proposed concept has been verified during experimental transmission of testing data over a real, 5 km long, 22kV overhead line

    Method and apparatus for a single channel digital communications system

    Get PDF
    A method and apparatus are described for synchronizing a received PCM communications signal without requiring a separate synchronizing channel. The technique provides digital correlation of the received signal with a reference signal, first with its unmodulated subcarrier and then with a bit sync code modulated subcarrier, where the code sequence length is equal in duration to each data bit
    corecore