
459 research outputs found

Power-aware caches for GPGPUs

Author: Saghir Ahsan
Publication date: 01/01/2015

In this thesis, we propose two optimization techniques to reduce power consumption in L1 caches (data, texture and constant), shared memory and the L2 cache. The first optimization technique targets static power. Evaluation of GPGPU applications shows that once a cache block is accessed by a thread, it takes several hundred clock cycles until the same block is accessed again. This long inter-access interval can be used to put cache cells into drowsy mode and reduce static power. While drowsy cells reduce static power, they increase access time, because the voltage of a cache cell in drowsy mode must be raised before the block can be accessed. To mitigate the performance impact of drowsy cells, we propose a novel technique called coarse grained drowsy mode. In coarse grained drowsy mode, we partition each cache into regions of consecutive cache blocks and wake up a region upon cache access. Due to the temporal and spatial locality of cache accesses, this method dramatically reduces the performance impact caused by drowsy cells. The second optimization technique relies on branch divergence in GPGPUs. The execution model in GPGPUs is Single Instruction Multiple Thread (SIMT), which means processing cores execute the same instruction with different data for GPGPU threads. The SIMT execution model may result in divergence of threads when a control instruction is executed. GPGPUs execute branch instructions in two phases. In the first phase, threads in the taken path are active and the rest are idle. In the second phase, threads in the not-taken path are executed and the rest are idle. Contemporary GPGPUs access all portions of cache blocks, although some threads are idle due to branch divergence. We propose accessing only the portions of cache blocks corresponding to active threads. By disabling unnecessary sections of cache blocks, we are able to reduce the dynamic power of caches.
Our results show that on average, the two optimization techniques together reduce power of caches by up to 98% and 15% for static and dynamic power, respectively.
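The two ideas in this abstract can be illustrated with a toy model. The sketch below is not the thesis's implementation: the class and function names, the region size, the idle threshold, the one-cycle wake-up penalty, and the mapping of consecutive threads to consecutive sectors of a block are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RegionDrowsyCache:
    """Toy model of coarse grained drowsy mode: blocks wake up a region at a time."""
    num_blocks: int
    region_size: int       # consecutive blocks per region (assumed parameter)
    drowsy_after: int      # idle cycles before a region turns drowsy (assumed)
    wake_penalty: int = 1  # assumed extra cycles to raise a drowsy region's voltage
    last_access: Dict[int, int] = field(default_factory=dict)  # region -> cycle

    def access(self, block: int, cycle: int) -> int:
        """Return the extra latency paid by this access (0 if the region is awake)."""
        if not 0 <= block < self.num_blocks:
            raise IndexError("block out of range")
        region = block // self.region_size
        last = self.last_access.get(region)
        # A region that was never touched, or has been idle too long, is drowsy.
        asleep = last is None or cycle - last > self.drowsy_after
        self.last_access[region] = cycle
        return self.wake_penalty if asleep else 0


def active_sectors(active_mask: int, warp_size: int = 32,
                   threads_per_sector: int = 8) -> List[int]:
    """Sectors of a cache block that serve at least one active thread.

    Assumes thread i of the warp reads word i of the block, so each sector
    corresponds to a contiguous group of threads.
    """
    sector_mask = (1 << threads_per_sector) - 1
    sectors = []
    for s in range(warp_size // threads_per_sector):
        if (active_mask >> (s * threads_per_sector)) & sector_mask:
            sectors.append(s)
    return sectors
```

With `RegionDrowsyCache(num_blocks=64, region_size=8, drowsy_after=100)`, a first access to block 3 pays the wake-up penalty, while an access to block 5 ten cycles later does not, because both blocks share a region that is already awake. For a divergent warp whose active mask is `0x000000FF`, `active_sectors` returns `[0]`, so only one of the four sectors would need to be enabled.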

Lakehead University Knowledge Commons

Microarchitectural techniques to reduce energy consumption in the memory hierarchy

Author: Ghosh Mrinmoy
Publication venue: Georgia Institute of Technology
Publication date: 03/04/2009

This thesis states that dynamic profiling of the memory reference stream can improve energy and performance in the memory hierarchy. The research presented in this thesis provides multiple instances of using lightweight hardware structures to profile the memory reference stream. The objective of this research is to develop microarchitectural techniques to reduce energy consumption at different levels of the memory hierarchy. Several simple and implementable techniques were developed as a part of this research. One of the techniques identifies and eliminates redundant refresh operations in DRAM and reduces DRAM refresh power. Another reduces leakage energy in L2 and higher level caches for multiprocessor systems. The emphasis of this research has been to develop several techniques for obtaining energy savings in caches using a simple hardware structure called the counting Bloom filter (CBF). CBFs have been used to predict L2 cache misses and obtain energy savings by not accessing the L2 cache on a predicted miss. A simple extension of this technique allows CBFs to do way-estimation of set associative caches to reduce energy in cache lookups. Another technique uses CBFs to track addresses in a virtual cache and reduce false synonym lookups. Finally, this thesis presents a technique to reduce dynamic power consumption in level one caches using significance compression. The significant energy and performance improvements demonstrated by the techniques presented in this thesis suggest that this work will be of great value for designing memory hierarchies of future computing platforms.
Ph.D. Committee Chair: Lee, Hsien-Hsin S.; Committee Member: Chatterjee, Abhijit; Committee Member: Mukhopadhyay, Saibal; Committee Member: Pande, Santosh; Committee Member: Yalamanchili, Sudhaka
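The miss-prediction use of a counting Bloom filter described in this abstract can be sketched in software as follows. This is a minimal illustration, not the thesis's hardware design: the class name, the number of counters, the number of hash functions, and the use of `hashlib.blake2b` as the hash are assumptions (a hardware CBF would use much simpler hash functions).

```python
import hashlib


class CountingBloomFilter:
    """Tracks which line addresses may be present in a cache."""

    def __init__(self, num_counters=1024, num_hashes=3):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, address):
        # Derive num_hashes counter indexes from the address (hash choice is illustrative).
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(f"{i}:{address}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % len(self.counters)

    def insert(self, address):
        """Call when a line is filled into the cache."""
        for idx in self._indexes(address):
            self.counters[idx] += 1

    def remove(self, address):
        """Call when a previously inserted line is evicted."""
        for idx in self._indexes(address):
            self.counters[idx] -= 1

    def predicts_miss(self, address):
        """True means the line is definitely absent, so the lookup can be skipped."""
        return any(self.counters[idx] == 0 for idx in self._indexes(address))
```

Because every inserted address increments all of its counters, a zero counter proves the address is not in the cache. A predicted miss is therefore always a real miss, while a predicted hit may still be a false positive that costs one ordinary lookup. Counters (rather than single bits) are what make `remove` possible on eviction.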

Scholarly Materials And Research @ Georgia Tech

Human Health Engineering Volume II

Publication venue: MDPI AG
Publication date: 11/01/2022

In this Special Issue on “Human Health Engineering Volume II”, we invited submissions exploring recent contributions to the field of human health engineering, i.e., technology for monitoring the physical or mental health status of individuals in a variety of applications. Contributions could focus on sensors, wearable hardware, algorithms, or integrated monitoring systems. We organized the different papers according to their contributions to the main parts of the monitoring and control engineering scheme applied to human health applications, namely papers focusing on measuring/sensing physiological variables, papers highlighting health-monitoring applications, and examples of control and process management applications for human health. In comparison to biomedical engineering, we envision that the field of human health engineering will also cover applications for healthy humans (e.g., sports, sleep, and stress), and thus not only contribute to the development of technology for curing patients or supporting chronically ill people, but also to more general disease prevention and optimization of human well-being

Directory of Open Access Books (DOAB)

Evolution of diaphragmatic function in children under mechanical ventilation

Author: Crulli Benjamin
Publication date: 01/12/2019

Introduction: Diaphragmatic dysfunction is highly prevalent in adult critical care and is associated with worse outcomes. There is at present no recognized method to assess diaphragmatic function in children under mechanical ventilation (MV) and no study describing its evolution over time in this population. Methods: In this work, we assessed the contractile function of the diaphragm in children under invasive MV in the pediatric intensive care unit (PICU) and in the operating room (OR). This was done by simultaneously recording airway pressure at the endotracheal tube (Paw) and electrical activity of the diaphragm (EAdi) over consecutive spontaneous breaths during brief airway occlusion maneuvers. In order to account for central respiratory drive, a neuro-mechanical efficiency ratio (NME, Paw/EAdi) was first computed and then validated using variability analysis. Diaphragmatic function was then compared between the two populations, and its evolution over time in the PICU group was described. Results: Median NME was the most reliable measure of diaphragmatic function, with a coefficient of variation of 23.7% and 21.1% in the PICU and OR groups, respectively. NME in the PICU group after 21 hours of MV (1.80 cmH2O/μV, IQR 1.25–2.39) was significantly lower than in the OR group (3.65 cmH2O/μV, IQR 3.45–4.24, p = 0.015). In the PICU group, NME did not decrease significantly over time under MV (correlation coefficient -0.011, p = 0.133). Conclusion: Diaphragmatic function can be measured at the bedside of children under MV using brief airway occlusions. Diaphragm efficiency was significantly higher in healthy controls than in a cohort of critically ill children, but it was stable over time under MV in this group with preserved respiratory drive. In the future, the relative contributions of critical illness and mechanical ventilation to diaphragmatic function should be better characterized before evaluating potential interventions aimed at protecting the diaphragm.
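The NME ratio and the coefficient of variation mentioned in this abstract can be computed as in the sketch below. The function names are hypothetical, the inputs are assumed to be the magnitudes of the pressure swing and of the EAdi peak for each occluded breath, and the coefficient of variation is assumed to use the usual definition (sample standard deviation divided by the mean); the abstract does not specify these details.

```python
from statistics import mean, median, stdev


def nme_per_breath(paw_cmh2o, eadi_uv):
    """NME = Paw / EAdi for each occluded breath, in cmH2O per microvolt."""
    return [paw / eadi for paw, eadi in zip(paw_cmh2o, eadi_uv)]


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Made-up illustrative values for three breaths, not data from the study.
paw = [9.0, 12.0, 10.0]   # magnitude of airway pressure swing, cmH2O
eadi = [5.0, 6.0, 4.0]    # peak diaphragm electrical activity, microvolts

nme = nme_per_breath(paw, eadi)
summary_nme = median(nme)              # the abstract reports the median NME
cv = coefficient_of_variation(nme)     # variability across breaths, in percent
```

Taking the median across breaths limits the influence of a single atypical occlusion, which fits the abstract's finding that the median NME was the most reliable measure.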

Dépôt Institutionnel Numérique

Brain attack : a new approach to stroke and transient ischaemic attack

Author: Hand Peter James
Publication venue: The University of Edinburgh
Publication date: 01/01/2002

Edinburgh Research Archive

Energy Efficiency and Performance in Multiprocessors Systems on Chip

Author: Kadjo David
Publication date: 21/09/2015

As process technology shrinks, the transistor count on CPUs has increased. The breakdown of Dennard scaling has led to diminishing returns in terms of performance per power, a trend which promises to impact future CPU designs. This breakdown is due in part to the increase in static power driven by transistor leakage. We now have constrained energy and power budgets. Thus, energy consumption has to be justified by an increase in performance. Simultaneously, architects have shifted to chip multiprocessor (CMP) designs with a large shared last level cache (LLC) to mitigate the cost of long latency off-chip memory access. A primary reason for that shift is the power efficiency of CMPs. Additionally, technology scaling has allowed the integration of platform components on the chip, a design referred to as multiprocessor system on chip (MpSoC). This integration improves system performance, as the communication latency between the components is reduced. Memory subsystems are essential to CPU performance. Larger caches provide the CPU faster access to a larger data set. Consequently, the size of last level caches has increased, making them a significant source of leakage power dissipation. We propose a technique to facilitate power gating a partition of the LLC by migrating the blocks with high temporal locality to a live partition, thus reducing the performance impact. Given the high latency of memory subsystems, prefetching improves CPU performance by speculating on future memory accesses and requesting the data ahead of the demand. In the context of CMPs running multiple concurrent processes, prefetching accuracy is critical to prevent cache pollution effects. Furthermore, given the current energy-constrained environment, there is a need for lightweight prefetchers with high accuracy. To this end, we present BFetch, a lightweight and accurate prefetcher driven by control flow predictions and effective address speculation. MpSoCs have mostly been used in mobile devices.
The energy constraint is more pronounced in MpSoC-based, battery powered mobile devices. The need to address the energy consumption in MpSoCs is further accentuated by the proliferation of mobile devices. This dissertation presents a framework to optimize the energy in MpSoCs. The proposed framework minimizes the energy consumption while meeting performance and power budget constraints. We first apply this framework to the CPU and then extend it to accommodate the GPU.
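The abstract does not describe how the framework is built, so the sketch below only illustrates the stated objective: minimize energy while meeting a performance constraint and a power budget. The `OperatingPoint` type, the candidate settings, and the exhaustive search are assumptions for illustration, not the dissertation's method.

```python
from typing import NamedTuple, Optional, Sequence


class OperatingPoint(NamedTuple):
    name: str
    power_w: float      # average power drawn at this setting, in watts
    throughput: float   # work units completed per second


def pick_min_energy(points: Sequence[OperatingPoint],
                    min_throughput: float,
                    power_budget_w: float) -> Optional[OperatingPoint]:
    """Return the feasible point with the lowest energy per work unit, if any."""
    feasible = [p for p in points
                if p.throughput >= min_throughput and p.power_w <= power_budget_w]
    if not feasible:
        return None
    # Energy per work unit = power / throughput (joules per unit).
    return min(feasible, key=lambda p: p.power_w / p.throughput)


# Hypothetical settings; the numbers are made up for the example.
candidates = [
    OperatingPoint("low", 1.0, 10.0),
    OperatingPoint("mid", 2.0, 25.0),
    OperatingPoint("high", 4.0, 40.0),
]
choice = pick_min_energy(candidates, min_throughput=20.0, power_budget_w=3.0)
```

Here `choice` is the "mid" setting: "low" misses the performance target and "high" exceeds the power budget. The example also shows why the fastest setting is not automatically the most energy-efficient one, since energy per unit of work depends on both power and throughput.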

Texas A&M Repository

Detection of Driver Drowsiness and Distraction Using Computer Vision and Machine Learning Approaches

Author: Okon Ofonime Dominic
Publication date: 27/05/2019

Drowsiness and distracted driving are leading factors in most car crashes and near-crashes. This research study investigates the application of both conventional computer vision and deep learning approaches to the detection of drowsiness and distraction in drivers. In the first part of this MPhil research study, conventional computer vision approaches were studied to develop a robust drowsiness and distraction detection system based on yawning detection, head pose detection and eye blinking detection. These algorithms were implemented using existing human crafted features. Experiments were performed for detection and classification with small image datasets to evaluate and measure the performance of the system. It was observed that the use of human crafted features together with a robust classifier such as an SVM gives better performance in comparison to previous approaches. Though the results were satisfactory, there are many drawbacks and challenges associated with conventional computer vision approaches, such as the definition and extraction of human crafted features, which make these conventional algorithms subjective in nature and less adaptive in practice. In contrast, deep learning approaches automate the feature selection process and can be trained to learn the most discriminative features without any human input. In the second half of this research study, the use of deep learning approaches for the detection of distracted driving was investigated. One observed advantage of the applied methodology for distraction detection is that CNN enhancement contributes to better pattern recognition accuracy and to the ability to learn features from various regions of a human body simultaneously.
We compared the performance of four convolutional deep net architectures (AlexNet, ResNet, MobileNet and NASNet), investigated triplet training, and explored the impact of combining a support vector classifier (SVC) with a trained deep net. The images used in our experiments with the deep nets are from the State Farm Distracted Driver Detection dataset hosted on Kaggle, each of which captures the entire body of a driver. The best results were obtained with the NASNet trained using triplet loss and combined with an SVC. It was observed that one of the advantages of deep learning approaches is their ability to learn discriminative features from various regions of a human body simultaneously. This ability has enabled deep learning approaches to reach human-level accuracy.
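The triplet training mentioned in this abstract relies on a triplet loss. The sketch below shows one common formulation on plain Python lists; the margin value, the use of unsquared Euclidean distance, and the function names are assumptions, since the abstract does not state which variant the study used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings: anchor and positive share a class, negative does not.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])   # negative already far away
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])   # negative too close
```

In this example `easy` is 0.0 and `hard` is 0.5, so only the second triplet would contribute a training signal. Embeddings trained this way cluster by class, which is one plausible reason a separate classifier such as an SVC can then be fitted on top of them.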

University of Hertfordshire Research Archive

Acceptability of wind-induced vibrations in tall buildings’ office environments

Author: Heshmati Kaveh
Publication date: 22/06/2022

OPUS

Speculative Techniques for Memory Hierarchy Management

Author: Albarakat Laith
Publication date: 20/05/2021

The “Memory Wall” [1] is the gap in performance between the processor and the main memory. Over the last 30 years, computer architects have added multiple levels of cache to fill this gap. Cache levels that are closer to the processors are smaller and faster, while the levels that are farther from the processors are bigger and slower. However, the processors are still exposed to the latency of DRAM on misses. Therefore, speculative memory management techniques such as prefetching are used in modern microprocessors to bridge this gap in performance. First, we propose Synchronization-aware Hardware Prefetching for Chip Multiprocessors, a novel hardware data prefetching scheme designed for shared-memory, multi-threaded workloads. This is the first work we are aware of to characterize the causes of poor prefetching performance in shared-memory, multi-threaded applications. These are the inability to prefetch beyond synchronization points and the tendency to prefetch shared data before it has been written. SB-Fetch is a low-complexity, low-overhead prefetcher design that addresses both issues. Second, we propose a new prefetching algorithm, Set-Level Adaptive Prefetching for Compressed Caches (SLAP-CC), which seeks to address this problem by varying the prefetching aggressiveness based on how much effective capacity is available in each set. The contributions of this work are to characterize the increase in, and per-set variability of, cache efficiency which typical cache compression schemes create, and to propose a new prefetching scheme, SLAP-CC, designed to leverage this cache efficiency variability. Third, we propose a new scheduling mechanism that predicts the hard-to-prefetch loads at issue time and preemptively schedules them for execution as soon as they are ready, to allow the cache hierarchy to start the miss handling mechanism sooner. Such a scheduling mechanism reduces the miss penalty on the instructions that depend on hard-to-prefetch loads.
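The idea of varying prefetch aggressiveness with the effective capacity available in each set can be sketched as follows. This is a hypothetical policy, not the SLAP-CC algorithm itself: the set geometry, the linear mapping from free bytes to prefetch degree, and the function name are assumptions.

```python
LINE_BYTES = 64   # assumed uncompressed line size
WAYS = 8          # assumed data capacity of a set, in uncompressed lines


def prefetch_degree(compressed_sizes, max_degree=4):
    """Number of lines to prefetch into a set, given its resident lines' compressed sizes."""
    capacity = WAYS * LINE_BYTES
    free = max(0, capacity - sum(compressed_sizes))
    # Scale linearly with free space, rounding down: a full set gets no prefetches.
    return free * max_degree // capacity


full_set = prefetch_degree([64] * 8)    # incompressible data, no spare room
half_set = prefetch_degree([32] * 8)    # lines compress 2:1
empty_set = prefetch_degree([])         # nothing resident yet
```

Under these assumptions `full_set` is 0, `half_set` is 2 and `empty_set` is 4. Sets whose lines compress well leave room for speculative data, so prefetching there is less likely to evict useful lines than prefetching into a set that is already full.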

Texas A&M Repository

Proceedings of the Human Factors and Ergonomics Society Europe Chapter 2016 Annual Conference: Human Factors and User Needs in Transport, Control, and the Workplace

Publication venue: HFES
Publication date: 15/06/2017

ARTS repository - University of Groningen