9 research outputs found

    OpenSHMEM Application Programming Interface, v1.0 Final

    This document defines the elements of the OpenSHMEM Application Programming Interface. The purpose of the OpenSHMEM API is to provide programmers with a standard interface for writing parallel programs using C, C++ and Fortran with one-sided communication.

    The cooperative parallel: A discussion about run-time schedulers for nested parallelism

    Nested parallelism is a well-known parallelization strategy for exploiting irregular parallelism in HPC applications. The strategy also fits critical real-time embedded systems, which are composed of a set of concurrent functionalities; there, nested parallelism can be used to further exploit the parallelism within each functionality. However, current run-time implementations of nested parallelism can produce inefficiencies and load imbalance. Moreover, in critical real-time embedded systems, they may lead to incorrect executions due to, for instance, a non-work-conserving scheduler. In both cases, the reason is that the teams of OpenMP threads are a black box to the scheduler, i.e., the scheduler that assigns OpenMP threads and tasks to the set of available computing resources is agnostic to the internal execution of each team. This paper proposes a new run-time scheduler that considers dynamic information about the OpenMP threads and tasks running within several concurrent teams, i.e., concurrent parallel regions. This information may include the existence of OpenMP threads waiting at a barrier and the priority of tasks ready to execute. By making the concurrent parallel regions cooperate, the shared computing resources can be better controlled, and a work-conserving, priority-driven scheduler can be guaranteed. Peer reviewed. Postprint (author's final draft).

    A gradient based facile HPLC method for simultaneous estimation of antioxidants extracted from tea powder

    A new simple, rapid and precise RP-HPLC method was developed for the extraction and quantitative estimation of caffeine (C), (−)-epigallocatechin gallate (EGCG), (+)-catechin (Ct), (−)-epicatechin (EC), and (−)-epicatechin gallate (ECG), collectively named Tea Powder Bioactives (TPBAs), extracted from tea powder using different ratios of ethanol:water. The simultaneous determination of TPBAs was performed using the UV spectrophotometric method, which employs the absorbance at 205 nm (λmax of caffeine and polyphenols). The method is a gradient-based HPLC method with a flow rate of 0.8 mL/min using an Inertsil ODS 100 × 4.6 mm, 3 μm column, with methanol and ammonium dihydrogen phosphate (pH 2.8) as the mobile phase. The method was validated in terms of specificity, precision, linearity, accuracy, limit of quantification (LOQ), and limit of detection (LOD). The linearity of the proposed method was investigated for concentrations ranging between 0.5 and 60 μg/mL, with regression coefficient R² = 0.999–1.0. The method estimates all the TPBAs simultaneously with enhanced precision and linearity as per the ICH guidelines. Also, to confirm each individual TPBA, the antioxidant property of each TPBA was analyzed and found to be commensurate with previous reports.

    A Comparison of the Scalability of OpenMP Implementations

    OpenMP implementations must exploit current and upcoming hardware for performance. Overhead must be controlled and kept to a minimum to avoid low performance at scale. Previous work has shown that overheads do not scale favourably in commonly used OpenMP implementations. Focusing on synchronization overhead, this work analyses the overhead of core OpenMP runtime library components for the GNU and LLVM compilers, reflecting on the implementations' source code and algorithms. In addition, this work investigates the implementations' capability to handle the CPU-internal NUMA structure observed in recent Intel CPUs. Using a custom benchmark designed to expose the synchronization overhead of OpenMP regardless of user code, substantial differences between the two implementations are observed. In summary, the LLVM implementation can be considered more scalable than the GNU implementation, but the GNU implementation yields lower overhead at lower thread counts on some occasions. Neither implementation reacts to the system architecture, although the effects of the internal NUMA structure on the overhead can be observed.
