24 research outputs found

    Inter-cluster Thread-to-core Mapping and DVFS on Heterogeneous Multi-cores

    Heterogeneous multi-core platforms that contain different types of cores, organized as clusters, are emerging, e.g. ARM's big.LITTLE architecture. These platforms often need to deal with multiple applications, having different performance requirements, executing concurrently. This leads to varying and mixed workloads (e.g. compute- and memory-intensive) due to resource sharing. Run-time management is required to adapt to such performance requirements and workload variabilities and to achieve energy efficiency. Moreover, the management becomes challenging when the applications are multi-threaded and the heterogeneity needs to be exploited. Existing run-time management approaches do not efficiently exploit cores situated in different clusters simultaneously (referred to as inter-cluster exploitation) or the DVFS potential of cores; exploiting both is the aim of this paper. Such exploitation might help to satisfy the performance requirement while achieving energy savings at the same time. Therefore, in this paper, we propose a run-time management approach that first selects a thread-to-core mapping based on the performance requirements and resource availability. Then, it applies online adaptation by adjusting the voltage-frequency (V-f) levels to achieve energy optimization without trading off application performance. For thread-to-core mapping, offline profiled results are used, which contain the performance and energy characteristics of applications when executed on the heterogeneous platform using different types of cores in various possible combinations. For an application, the thread-to-core mapping process defines the number of cores used and their types, where the cores are situated in different clusters. The online adaptation process classifies the inherent workload characteristics of concurrently executing applications, incurring a lower overhead than existing learning-based approaches, as demonstrated in this paper. The workload is classified using the metric Memory Reads Per Instruction (MRPI). The adaptation process proactively selects an appropriate V-f pair for a predicted workload. Subsequently, it monitors the workload prediction error and performance loss, quantified by instructions per second (IPS), and adjusts the chosen V-f to compensate. We validate the proposed run-time management approach on a hardware platform, the Odroid-XU3, with various combinations of multi-threaded applications from the PARSEC and SPLASH benchmarks. Results show an average improvement in energy efficiency of up to 33% compared to existing approaches while meeting the performance requirements.
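
    The adaptation loop described in this abstract (classify by MRPI, pick a V-f level for the predicted workload, then compensate when IPS falls short) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the V-f table, the MRPI thresholds, the exponential-average predictor and all function names are assumptions made for the example.

```python
# Hypothetical V-f table (MHz, volts), ordered from lowest to highest frequency.
VF_LEVELS = [(600, 0.90), (1000, 1.00), (1400, 1.10), (1800, 1.20)]

# Hypothetical MRPI thresholds, highest first. A higher MRPI is treated as a
# more memory-bound workload, which is assumed to tolerate a lower frequency.
MRPI_BINS = [0.05, 0.02, 0.01]


def mrpi(memory_reads, instructions):
    """Memory Reads Per Instruction for one sampling interval."""
    return memory_reads / instructions if instructions else 0.0


def predict_mrpi(previous_prediction, observed, alpha=0.5):
    """Assumed predictor: exponentially weighted average of past MRPI."""
    return alpha * observed + (1.0 - alpha) * previous_prediction


def select_level(predicted_mrpi):
    """Map a predicted MRPI to an index into VF_LEVELS."""
    for idx, threshold in enumerate(MRPI_BINS):
        if predicted_mrpi >= threshold:
            return idx
    return len(VF_LEVELS) - 1  # compute-bound: highest frequency


def compensate(level, measured_ips, required_ips):
    """Step up one V-f level when measured IPS misses the requirement."""
    if measured_ips < required_ips and level < len(VF_LEVELS) - 1:
        return level + 1
    return level
```

    In this sketch a memory-bound interval (MRPI of 0.06) selects the lowest level, and a missed IPS target raises the level by one step at most per interval.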

    Evolutionary History of Tissue Kallikreins

    The gene family of human kallikrein-related peptidases (KLKs) encodes proteins with diverse and pleiotropic functions in normal physiology as well as in disease states. Currently, the most widely known KLK is KLK3, or prostate-specific antigen (PSA), which has applications in clinical diagnosis and monitoring of prostate cancer. The KLK gene family encompasses the largest contiguous cluster of serine proteases in humans that is not interrupted by non-KLK genes. This unique characteristic makes KLKs ideal for evolutionary studies aiming to infer the direction and timing of gene duplication events. Previous studies on the evolution of KLKs were restricted to mammals, and KLKs were suggested to have emerged about 150 million years ago (mya). In order to elucidate the evolutionary history of KLKs, we performed comprehensive phylogenetic analyses of KLK homologous proteins in multiple genomes, including those that have been completed recently. Interestingly, we were able to identify novel reptilian, avian and amphibian KLK members, which allowed us to trace the emergence of KLKs to 330 mya. We suggest that a series of duplication and mutation events gave rise to the KLK gene family. The prominent feature of the KLK family is that it consists of tandemly and uninterruptedly arrayed genes in all species under investigation. The chromosomal co-localization in a single cluster distinguishes KLKs from trypsin and other trypsin-like proteases, which are spread across different genetic loci. All the defining features of the KLKs were further found to be conserved in the novel KLK protein sequences. The study of this unique family will further assist in selecting new model organisms for functional studies of proteolytic pathways involving KLKs.

    Fully integrated 533 MHz programmable switched current PLL in 0.012 mm²

    A 533 MHz programmable phase-locked loop is designed for DDR applications using a switched current filter and implicit phase detection. The use of switched current technology allows a fully integrated loop filter that is much smaller than equivalent integrated passive filters; as a result, the circuit occupies only 0.012 mm² on a 0.12 µm, 1.2 V digital CMOS process.

    Online Fault Tolerance Technique for TSV-Based 3-D-IC


    A new analytical model for predicting SWCNT band-gap from geometrical properties

    In this paper we present a complete analytical model that predicts the band-gap (E_g) of single-walled carbon nanotubes (SWCNTs) directly from their diameter (d) and chiral angle (theta). The proposed analytical model is based on two mathematical expressions that have been derived by curve-fitting the output of the third-nearest-neighbor Tight-Binding (TB) method in conjunction with the zone-folding technique. Tests performed on the model demonstrated that 82% of a set of both metallic and semiconducting CNTs were accurately distinguished. In addition, the maximum band-gap error recorded for the semiconducting tubes was 10%. The model was also verified against previously published experimental data, where 17 out of 21 tubes were correctly predicted. Finally, it is shown that the proposed model computes E_g 10^5 times faster than the third-nearest-neighbor TB method with zone-folding. The outcome of this work offers a fast and accurate technique for engineers who are seeking to simulate CNT-based devices and want to ascertain the CNT's electronic properties with respect to the geometrical variation manifested in their synthesis process.
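
    For orientation, the geometrical inputs named in this abstract (diameter and chiral angle) follow from the chiral indices (n, m) by standard textbook relations. The sketch below computes them and adds the simple zone-folding metallicity rule and a first-order nearest-neighbor band-gap estimate. It is not the curve-fitted model of the paper; the hopping energy of 2.7 eV is an assumed, commonly quoted value, and the function names are chosen for the example.

```python
import math

A_LATTICE_NM = 0.246  # graphene lattice constant in nm
A_CC_NM = 0.142       # carbon-carbon bond length in nm
GAMMA0_EV = 2.7       # assumed nearest-neighbor hopping energy in eV


def diameter_nm(n, m):
    """Tube diameter d from the chiral indices (n, m)."""
    return A_LATTICE_NM * math.sqrt(n * n + n * m + m * m) / math.pi


def chiral_angle_deg(n, m):
    """Chiral angle: 0 degrees for zigzag (n, 0), 30 degrees for armchair (n, n)."""
    return math.degrees(math.atan2(math.sqrt(3) * m, 2 * n + m))


def is_metallic(n, m):
    """Simple zone-folding rule: metallic when (n - m) is a multiple of 3."""
    return (n - m) % 3 == 0


def band_gap_ev(n, m):
    """First-order estimate E_g = 2 * a_cc * gamma0 / d for semiconducting tubes."""
    if is_metallic(n, m):
        return 0.0
    return 2.0 * A_CC_NM * GAMMA0_EV / diameter_nm(n, m)
```

    The simple rule ignores curvature effects, which is one reason a fitted third-nearest-neighbor model such as the one described above can classify and size band-gaps differently for small-diameter tubes.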

    Thermal-aware SoC test scheduling with test set partitioning and interleaving

    High temperature has become a major problem for system-on-chip testing. In order to reduce the test application time while keeping the temperatures of the cores under test within safe ranges, a thermal-aware test scheduling technique is required. This paper presents an approach to minimize the test application time and, at the same time, prevent the temperatures of cores under test from going beyond given limits. We employ test set partitioning to divide test sets into shorter test sequences, and add cooling periods between test sequences so that overheating can be avoided. Moreover, test sequences from different test sets are interleaved, such that the cooling periods and the bandwidth of the test bus can be utilized for test data transportation, and hence the test application time can be reduced. The test scheduling problem is formulated as a combinatorial optimization problem, and we use constraint logic programming (CLP) to build the optimization model and find the optimal solution. As the optimization time of the CLP-based approach increases exponentially with the problem size, we also propose a heuristic which generates longer test schedules but requires substantially shorter optimization time. Experimental results have shown the efficiency of the proposed approach.
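
    The combination of partitioning and interleaving described in this abstract can be illustrated with a small greedy scheduler. This is a sketch under strong simplifying assumptions and is neither the CLP model nor the heuristic of the paper: a fixed maximum sequence length and a fixed cooling period stand in for a real thermal model, the test bus carries one sequence at a time, and the function names are hypothetical.

```python
def partition(test_length, max_seq_len):
    """Split one test set into sequences no longer than max_seq_len."""
    full, rest = divmod(test_length, max_seq_len)
    return [max_seq_len] * full + ([rest] if rest else [])


def interleave(tests, max_seq_len, cooling):
    """Greedy single-bus schedule.

    tests maps a core name to its total test length. After each sequence the
    core must cool for `cooling` time units, during which another core's
    sequence may use the bus. Returns (schedule, total_time), where schedule
    is a list of (start, end, core) tuples.
    """
    pending = {core: partition(length, max_seq_len) for core, length in tests.items()}
    ready_at = {core: 0 for core in tests}
    now = 0
    schedule = []
    while any(pending.values()):
        candidates = [c for c in pending if pending[c] and ready_at[c] <= now]
        if not candidates:
            # Every remaining core is still cooling: idle until the first is ready.
            now = min(ready_at[c] for c in pending if pending[c])
            continue
        # Prefer the core with the most test data still to apply.
        core = max(candidates, key=lambda c: sum(pending[c]))
        length = pending[core].pop(0)
        schedule.append((now, now + length, core))
        now += length
        ready_at[core] = now + cooling
    return schedule, now
```

    With two cores of length 20, sequences of length 10 and a cooling period of 10, each core's cooling period is filled by the other core's sequence, so the total time is 40 rather than the 60 needed when the cores are tested one after the other.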

    Learning transfer-based adaptive energy minimization in embedded systems
