9 research outputs found

    Leakage-Aware Multiprocessor Scheduling

    Full text link

    Limiting the Number of Dirty Cache Lines

    No full text
    Caches often employ write-back instead of write-through, since write-back avoids unnecessary transfers for multiple writes to the same block. For several reasons, however, it is undesirable for a significant number of cache lines to be marked “dirty”. Energy-efficient cache organizations, for example, often apply techniques that resize, reconfigure, or turn off (parts of) the cache. In such cache organizations, dirty lines have to be written back before the cache is reconfigured. The delay imposed by these write-backs or the required additional logic and buffers can significantly reduce the attained energy savings. A cache organization called the clean/dirty cache (CD-cache) is proposed that combines the properties of write-back and write-through. It avoids unnecessary transfers for recurring writes, while restricting the number of dirty lines to a hard limit. Detailed experimental results show that the CD-cache reduces the number of dirty lines significantly, while achieving similar or better performance. We also use the CD-cache to implement cache decay. Experimental results show that the CD-cache attains similar or higher performance than a normal decay cache, while using a significantly less complex design.
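
The idea of capping the number of dirty lines can be illustrated with a toy model. The sketch below is not the CD-cache design from the paper; it only assumes a write-back cache that cleans its least recently written dirty line whenever a hard limit would be exceeded. The class name `DirtyLimitedCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class DirtyLimitedCache:
    """Toy write-back model that never holds more than max_dirty dirty lines.

    Assumption: the victim chosen for cleaning is the least recently
    written dirty line. The actual CD-cache organization may differ.
    """

    def __init__(self, max_dirty):
        if max_dirty < 1:
            raise ValueError("max_dirty must be at least 1")
        self.max_dirty = max_dirty
        self.dirty = OrderedDict()  # line address -> True, oldest write first
        self.writebacks = 0         # transfers to the next memory level

    def write(self, line):
        if line in self.dirty:
            # Recurring write to an already dirty line: no transfer needed.
            self.dirty.move_to_end(line)
            return
        if len(self.dirty) == self.max_dirty:
            # Hard limit reached: write back the oldest dirty line first.
            self.dirty.popitem(last=False)
            self.writebacks += 1
        self.dirty[line] = True

    def flush(self):
        """Write back every dirty line (e.g. before resizing); return the count."""
        count = len(self.dirty)
        self.writebacks += count
        self.dirty.clear()
        return count
```

In this model the cost of a reconfiguration is bounded by `max_dirty` write-backs, while repeated writes to the same line still cost nothing until that line is cleaned.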

    Dynamic Techniques to Reduce Memory Traffic in Embedded Systems

    No full text
    Memory transfers, in particular from/to off-chip memories, consume a significant amount of power. In order to reduce the amount of off-chip memory traffic, one or more levels of cache can be employed, located on the same die as the processor core. For performance, energy, and cost reasons, it is expedient that the on-chip cache is small and direct-mapped. Small, direct-mapped caches, however, generally produce much more traffic than needed. The purpose of this paper is two-fold. First, to measure how much traffic is generated by small, direct-mapped caches and what the minimal amount of traffic is. This yields an upper bound on the amount of traffic that can be saved by utilizing the on-chip memory more effectively. Second, we survey some techniques that can be deployed to reduce the amount of traffic produced by direct-mapped caches and present results for some of these techniques.

    Reducing Traffic Generated by Conflict Misses in Caches

    No full text
    Off-chip memory accesses are a major source of power consumption in embedded processors. In order to reduce the amount of traffic between the processor and the off-chip memory as well as to hide the memory latency, nearly all embedded processors have a cache on the same die as the processor core. Because small caches dissipate less power and are cheaper than large caches, a small cache is preferable to a large cache. Furthermore, because set-associative caches consume more power than direct-mapped caches, a direct-mapped cache is preferable to a set-associative one. Small, direct-mapped caches generally incur many conflict misses, however. In this paper we propose and evaluate a structure called the Conflict Detection Table (CDT). This table can be used to determine if a memory access is expected to hit the cache. If a hit is expected and a miss occurs, then a conflict is detected and appropriate action can be taken. In addition, we propose two cache structures that employ this technique: the Bypass in Case of Conflict (BCC) cache and the Sub-block in Case of Conflict (SCC) cache. The BCC cache bypasses the cache when a conflict is detected, whereas the SCC cache fetches a sub-block of the missing cache block in such a case. Experimental results using several embedded workloads show that the BCC and SCC cache reduce the amount of traffic significantly in many cases. Furthermore, overall they incur the same number of cache misses as the direct-mapped cache. This shows that the BCC and SCC cache reduce the amount of power consumed with a negligible reduction in performance.
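
The bypass-on-conflict behavior can be sketched in a few lines. The abstract does not say how the CDT is organized, so the sketch below assumes it is a small table of recently accessed block addresses: a block found there is expected to hit, and an expected hit that misses is treated as a conflict. The class name `BypassOnConflictCache` and its parameters are hypothetical.

```python
from collections import OrderedDict


class BypassOnConflictCache:
    """Toy direct-mapped cache that bypasses on a detected conflict.

    Assumption: the CDT is modeled as a bounded table of recently
    accessed blocks. The paper's actual CDT may be organized differently.
    """

    def __init__(self, num_sets, cdt_entries):
        self.num_sets = num_sets
        self.lines = [None] * num_sets  # resident block address per set
        self.cdt = OrderedDict()        # recently accessed blocks, oldest first
        self.cdt_entries = cdt_entries
        self.hits = 0
        self.misses = 0
        self.bypasses = 0

    def access(self, block):
        index = block % self.num_sets
        expected_hit = block in self.cdt

        # Record the access in the CDT, evicting the oldest entry if full.
        self.cdt[block] = True
        self.cdt.move_to_end(block)
        if len(self.cdt) > self.cdt_entries:
            self.cdt.popitem(last=False)

        if self.lines[index] == block:
            self.hits += 1
            return "hit"

        self.misses += 1
        if expected_hit:
            # Expected hit but missed: conflict detected, leave the cache as is.
            self.bypasses += 1
            return "bypass"

        # Ordinary miss: fetch the block and replace the resident line.
        self.lines[index] = block
        return "fill"
```

With two blocks that map to the same set and are accessed alternately, the model stops replacing the resident line once the conflict is detected, so one of the two blocks keeps hitting instead of both evicting each other.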

    Leakage-Aware Multiprocessor Scheduling for Low Power

    No full text
    It is expected that (single chip) multiprocessors will increasingly be deployed to realize high-performance embedded systems. Because in current technologies the dynamic power consumption dominates the static power dissipation, an effective technique to reduce energy consumption is to employ as many processors as possible in order to finish the tasks as early as possible, and to use the remaining time before the deadline (the slack) to apply voltage scaling. We refer to this heuristic as Schedule and Stretch (S&S). However, since the static power consumption is expected to become more significant, this approach will no longer be efficient when leakage current is taken into account. In this paper, we first show for which combinations of leakage current, supply voltage, and clock frequency the static power consumption dominates the dynamic power dissipation. These results imply that, at a certain point, it is no longer advantageous from an energy perspective to employ as many processors as possible. Thereafter, a heuristic is presented to schedule the tasks on a number of processors that minimizes the total energy consumption. Experimental results obtained using a public task graph benchmark set show that our leakage-aware scheduling algorithm reduces the total energy consumption by up to 24% for tight deadlines (1.5x the critical path length) and by up to 67% for loose deadlines (8x the critical path length) compared to S&S.
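
Why using as many processors as possible stops paying off can be seen in a deliberately simplified energy model. The sketch below is not the paper's heuristic. It assumes perfectly parallel work, a fixed static power per active processor, dynamic power that grows with the cube of the clock frequency, and every processor stretched to finish exactly at the deadline. The function names and parameters are hypothetical.

```python
def total_energy(n_procs, work, deadline, static_power, dyn_coeff):
    """Energy of n_procs processors that share `work` cycles until `deadline`.

    Assumptions (not from the paper): perfectly parallel work, constant
    static power per processor, dynamic power = dyn_coeff * frequency**3.
    """
    freq = work / (n_procs * deadline)  # slowest frequency that meets the deadline
    dynamic_power = dyn_coeff * freq ** 3
    return n_procs * deadline * (static_power + dynamic_power)


def best_processor_count(max_procs, work, deadline, static_power, dyn_coeff):
    """Return the processor count in 1..max_procs with the lowest total energy."""
    return min(
        range(1, max_procs + 1),
        key=lambda n: total_energy(n, work, deadline, static_power, dyn_coeff),
    )
```

With `static_power` set to zero the model always prefers the maximum number of processors, which matches the S&S intuition. Once static power is non-zero, the static term grows linearly with the processor count and the minimum moves to a smaller count.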

    Trade-Offs Between Voltage Scaling and Processor Shutdown for Low-Energy Embedded Multiprocessors

    No full text
    When peak performance is unnecessary, Dynamic Voltage Scaling (DVS) can be used to reduce the dynamic power consumption of embedded multiprocessors. In future technologies, however, static power consumption is expected to increase significantly. Then it will be more effective to limit the number of employed processors, and use a combination of DVS and processor shutdown. Scheduling heuristics are presented that determine the best trade-off between these three techniques: DVS, processor shutdown, and finding the optimal number of processors. Experimental results show that our approach reduces the total energy consumption by up to 25% for tight deadlines and by up to 57% for loose deadlines compared to DVS. We also compare the energy consumed by our scheduling algorithm to two lower bounds, and show that our best approach leaves little room for improvement.

    Structural Alterations of MET Trigger Response to MET Kinase Inhibition in Lung Adenocarcinoma Patients

    No full text
    Purpose: We sought to investigate the clinical response to MET inhibition in patients diagnosed with structural MET alterations and to characterize their functional relevance in cellular models. Experimental Design: Patients were selected for treatment with crizotinib upon results of hybrid capture-based next-generation sequencing. To confirm the clinical observations, we analyzed cellular models that express these MET kinase alterations. Results: Three individual patients were identified to harbor alterations within the MET receptor. Two patients showed genomic rearrangements, leading to a gene fusion of KIF5B or STARD3NL and MET. One patient diagnosed with an EML4-ALK rearrangement developed a MET kinase domain duplication as a resistance mechanism to ceritinib. All 3 patients showed a partial response to crizotinib that effectively inhibits MET and ALK among other kinases. The results were further confirmed using orthogonal cellular models. Conclusions: Crizotinib leads to a clinical response in patients with MET rearrangements. Our functional analyses together with the clinical data suggest that these structural alterations may represent actionable targets in lung cancer patients. (C) 2017 AACR

    Swarm Intelligence-Enhanced Detection of Non-Small-Cell Lung Cancer Using Tumor-Educated Platelets

    No full text
    Blood-based liquid biopsies, including tumor-educated blood platelets (TEPs), have emerged as promising biomarker sources for non-invasive detection of cancer. Here we demonstrate that particle-swarm optimization (PSO)-enhanced algorithms enable efficient selection of RNA biomarker panels from platelet RNA sequencing libraries (n = 779). This resulted in accurate TEP-based detection of early- and late-stage non-small-cell lung cancer (n = 518 late-stage validation cohort, accuracy, 88%; AUC, 0.94; 95% CI, 0.92-0.96; p < 0.001; n = 106 early-stage validation cohort, accuracy, 81%; AUC, 0.89; 95% CI, 0.83-0.95; p < 0.001), independent of age of the individuals, smoking habits, whole-blood storage time, and various inflammatory conditions. PSO enabled selection of gene panels to diagnose cancer from TEPs, suggesting that swarm intelligence may also benefit the optimization of diagnostics readout of other liquid biopsy biosources.