253 research outputs found

    A Priority-based Fair Queuing (PFQ) Model for Wireless Healthcare System

    Healthcare is a very active research area, primarily because the growing elderly population leads to an increasing number of emergency situations that require urgent action. In recent years, some wireless networked medical devices have been equipped with sensors that measure and report a patient's vital signs remotely. The most important sensors are heart beat rate (ECG), pressure and glucose sensors. However, the strict requirements and real-time nature of medical applications make appropriate Quality of Service (QoS), that is, fast and accurate delivery of a patient's measurements in a reliable e-Health ecosystem, extremely important. As the population of older adults (65 years and above) has grown due to advances in medicine and medical care over the last two decades, high QoS and a reliable e-Health ecosystem have become a major challenge in healthcare, especially for patients who require continuous monitoring and attention. Moreover, predictions indicate that the elderly population in developing countries will reach approximately 2 billion by 2050, at which point the available medical staff will be unable to cope with this growth and with the emergency cases that need immediate intervention. In addition, limited network capacity, congestion, and the enormous increase in devices, applications and IoT traffic sharing the available communication networks add a further layer of challenges to the e-Health ecosystem, such as time constraints and the quality of the measurements and signals reaching healthcare centres. Hence this research tackles the delay and jitter parameters in e-Health M2M wireless communication and succeeds in reducing them in comparison to currently available models.
The novelty of this research lies in a new priority queuing model, "Priority-Based Fair Queuing" (PFQ), in which a new priority level and concept, the "Patient's Health Record" (PHR), is developed and integrated with the Priority Parameter (PP) values of each sensor to add a second level of priority. The results and data analysis performed on the PFQ model under different scenarios simulating a real M2M e-Health environment show that PFQ outperforms the widely used current models First In First Out (FIFO) and Weighted Fair Queuing (WFQ). For ECG sensor data in emergency cases, PFQ decreased delay and jitter by 83.32% and 75.88% respectively in comparison to FIFO, and by 46.65% and 60.13% in comparison to WFQ. Similarly, for the pressure sensor the improvements were 82.41% and 71.5% in comparison to FIFO, and 68.43% and 73.36% in comparison to WFQ. Data transmission also improved for the glucose sensor, by 80.85% and 64.7% in comparison to FIFO, and by 92.1% and 83.17% in comparison to WFQ. However, data transmission for non-emergency cases was negatively affected under PFQ, scoring higher rates than under FIFO and WFQ, since PFQ tends to give higher priority to emergency cases. Thus, a derivative of the PFQ model, "Priority-Based Fair Queuing-Tolerated Delay" (PFQ-TD), was developed to balance data transmission between emergency and non-emergency cases by taking into account the delay that emergency cases can tolerate. PFQ-TD balances the two classes fairly, reducing the total average delay and jitter of emergency and non-emergency cases for all sensors and keeping them within acceptable standards. PFQ-TD improved the overall average delay and jitter across emergency and non-emergency cases and all sensors by 41% and 84% respectively in comparison to the PFQ model.
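
As a rough illustration of the two-level priority idea in the abstract above, the following Python sketch orders packets first by an emergency flag (standing in for the PHR-derived level) and then by a per-sensor priority value (standing in for PP). The class name, the numeric priority values and the tie-breaking by arrival order are assumptions made for illustration; the abstract does not say how PHR and PP are actually combined, and this strict-priority sketch is not the PFQ model itself.

```python
import heapq
import itertools

# Hypothetical per-sensor priority values (lower value is served first).
SENSOR_PRIORITY = {"ecg": 0, "pressure": 1, "glucose": 2}


class TwoLevelPriorityQueue:
    """Strict two-level priority queue: emergency flag, then sensor priority."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # unique tie-breaker, keeps FIFO order

    def push(self, sensor, emergency, payload):
        # Level 1: emergency packets (0) sort before non-emergency ones (1).
        # Level 2: the sensor's priority value.
        # Level 3: arrival order, so equal-priority packets stay first-in first-out.
        key = (0 if emergency else 1, SENSOR_PRIORITY[sensor], next(self._arrival))
        heapq.heappush(self._heap, (key, sensor, payload))

    def pop(self):
        _key, sensor, payload = heapq.heappop(self._heap)
        return sensor, payload

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    q = TwoLevelPriorityQueue()
    q.push("glucose", False, "g1")
    q.push("ecg", False, "e1")
    q.push("glucose", True, "g2")
    q.push("ecg", True, "e2")
    while len(q):
        print(q.pop())
```

Because the ordering is strict, non-emergency packets wait for as long as any emergency packet is queued, which is consistent with the penalty for non-emergency traffic that the abstract reports and that PFQ-TD is said to address.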

    Evaluating memory energy efficiency in parallel I/O workloads


    An accurate prefetching policy for object oriented systems

    PhD Thesis. In the latest high-performance computers, there is a growing requirement for accurate prefetching (AP) methodologies for advanced object management schemes in virtual memory and migration systems. The major issue in achieving this goal is finding a simple way of accurately predicting the objects that will be referenced in the near future and grouping them so that they can be fetched at the same time. The basic notion of AP involves building a relationship for logically grouping related objects and prefetching them, rather than using their physical grouping and relying on demand fetching, as existing restructuring or grouping schemes do. In this way, AP tries to overcome some of the shortcomings of physical grouping methods. Prefetching also makes use of the properties of object-oriented languages to build inter- and intra-object relationships as a means of logical grouping. This thesis describes how this relationship can be established at compile time and how it can be used for accurate object prefetching in virtual memory systems. In addition, AP performs control flow and data dependency analysis to reinforce the relationships and to find the dependencies of a program. The user program is decomposed into prefetching blocks which contain all the information needed for block prefetching, such as long branches and function calls at major branch points. The proposed prefetching scheme is implemented by extending a C++ compiler and evaluated on a virtual memory simulator. The results show a significant reduction in both the number of page faults and memory pollution. In particular, AP can suppress many page faults that occur during transition phases, which other ways of fetching cannot handle. AP can be applied to local and distributed virtual memory systems to reduce the fault rate by fetching groups of objects at the same time, consequently lessening operating system overheads. British Counci

    Improving the SLLC Efficiency by exploiting reuse locality and adjusting prefetch

    From smartphones to laptops, electronic systems that include chip multiprocessors (CMPs) are overwhelmingly present in our daily lives. CMPs contain several cores or CPUs that must be fed with data from memory. But the rate at which the cores of a CMP need data is much higher than the rate at which memory can supply it. In fact, this difference has been growing practically since the day both devices were conceived. This performance difference has come to be called "the memory gap". While this gap was widening, programming languages offered programmers memory models that gave access to a practically infinite space which, moreover, could be accessed instantaneously. But the size of any hardware structure is closely tied to its access time, which grows with the size of the structure being accessed. To resolve this apparent contradiction, computer architects placed intermediate memories between the CPUs and the large but slow main memory. These intermediate memories are called cache memories, or simply caches. Because of the large difference between processor speed and main memory speed, today's CMPs are equipped with a cache hierarchy of two or three levels. The caches close to the processor hold only a few kilobytes (between 4 and 64), accessible in one or a few clock cycles, whereas those farther from the processor can hold several megabytes and have an access time of several tens of cycles.
Programs exhibit a property called locality during execution, expressed along spatial and temporal axes. Temporal locality is the property that a program will reuse data it used recently; the more recently it used them, the more likely it is to do so again. Spatial locality is the property that a program will tend to use data located close in the memory space to data it used recently. Cache memories have traditionally been designed to exploit locality. Specifically, temporal locality is exploited through a suitable replacement policy, while spatial locality is exploited by having each cache block hold several data items or words. A further way to exploit more spatial locality is the technique called prefetching. The replacement policy critically influences the cache hit rate. In a CMP with a cache hierarchy, temporal locality is exploited in the levels closest to the cores. As a result, many of the blocks inserted into the shared last-level cache (SLLC) are single-use; that is, they will not experience any further hit for as long as they remain in the SLLC. However, blocks that do experience a hit in the SLLC will usually experience many more. Therefore, having the replacement policy base its decisions on exploiting temporal locality is an invalid assumption for the SLLC. On the contrary, this behaviour indicates that the SLLC replacement policy should be based on reuse rather than on temporal locality (a footnote in the original explains that, although the DRAE does not define it, the Spanish verb reusar and its derived forms are used as a synonym for "to use again"). Hardware prefetching aims to load data into the cache before the processor requests them. The effectiveness of this technique in reducing average memory access latency has been widely demonstrated.
Prefetching works especially well in the memory hierarchies of uniprocessor systems, where there is only one data stream between the processor and memory. However, when prefetching is used in a multiprocessor system where different applications run at the same time, the prefetches associated with one core can interfere with data loaded into the cache by another core, evicting another application's contents and hurting its performance. A mechanism is therefore needed to regulate the prefetching associated with each core, with the goal of improving overall system performance. Every SLLC miss causes an access to main memory, which is off-chip. In addition, main memory is built from DRAM chips. Both factors increase its access latency, which is added to every access that misses in the SLLC and thus penalizes the average memory access latency. The SLLC hit rate is therefore a critical factor in achieving optimal average memory access latency. This thesis focuses on the efficiency of the two aspects discussed above: the efficiency of prefetching and the efficiency of the replacement policy. The main contributions of this thesis are the following: 1) We state a property called reuse locality, which says that i) cache blocks that have been used more than once have a high probability of being used many times in the future, and ii) recently reused cache blocks are more useful than blocks reused earlier. We argue in this thesis that the SLLC access pattern exhibits reuse locality.
2) We propose two replacement algorithms able to exploit reuse locality, Least-recently reused (LRR) and Not-recently reused (NRR). These two new algorithms are modifications of two very well-known ones: Least-recently used (LRU) and Not-recently used (NRU). Those algorithms were designed to exploit temporal locality, whereas ours exploit reuse locality. The proposed modifications add no hardware overhead with respect to the baseline algorithms. The thesis shows that our algorithms consistently improve on the performance of the originals. 3) We propose a novel SLLC design called Reuse Cache, in which the cache's tag and data arrays are decoupled. Only blocks that have shown reuse are stored in the data array. The tag array is used to detect reuse and maintain coherence. This structure allows the size of the data array to be reduced drastically. As an example, a Reuse Cache with a tag array equivalent to that of a conventional 4 MB cache and a 1 MB data array has the same average performance as a conventional 8 MB cache, with storage savings of around 84%. 4) A low-cost controller called ABS that can adjust the prefetch aggressiveness associated with each core of a CMP with the aim of improving overall system performance. The controller runs independently in each SLLC bank and collects local metrics. To optimize global system performance it searches for the optimal combination of prefetch aggressiveness values, using a hill-climbing search strategy to infer that combination.
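
To make the notion of reuse-based replacement more concrete, here is a minimal Python sketch of a single cache set that keeps one "reused" bit per block and prefers to evict blocks that have never been hit. It is only an illustration of the general idea under stated assumptions: the class name, the rule for clearing bits when every block has been reused, and the oldest-first choice among candidates are all hypothetical, and the abstract does not give enough detail to claim this matches the LRR or NRR algorithms of the thesis.

```python
class ReuseBitSet:
    """One cache set that prefers to evict blocks that have not shown reuse."""

    def __init__(self, ways):
        self.ways = ways
        # tag -> reused flag; a dict preserves insertion order (oldest first).
        self.lines = {}

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then inserted)."""
        if tag in self.lines:
            self.lines[tag] = True  # a hit while resident counts as reuse
            return True
        if len(self.lines) >= self.ways:
            self._evict()
        self.lines[tag] = False  # newly inserted blocks start as not reused
        return False

    def _evict(self):
        # Prefer the oldest block that has never been reused.
        victim = next((t for t, reused in self.lines.items() if not reused), None)
        if victim is None:
            # Every block has shown reuse: clear all bits (assumed ageing rule)
            # and fall back to evicting the oldest block.
            for tag in self.lines:
                self.lines[tag] = False
            victim = next(iter(self.lines))
        del self.lines[victim]


if __name__ == "__main__":
    cache_set = ReuseBitSet(ways=2)
    for tag in "AABCDA":
        print(tag, "hit" if cache_set.access(tag) else "miss")
```

In the trace `A A B C D A`, block A is hit once and then survives the single-use blocks B, C and D, so the final access to A hits. A two-way LRU set would have evicted A when C arrived, which is the kind of behaviour that motivates basing replacement on reuse.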

    Flashing up the storage hierarchy

    The focus of this thesis is on systems that employ both flash and magnetic disks as storage media. Considering the widely disparate I/O costs of flash disks currently on the market, our approach is a cost-aware one: we explore techniques that exploit the I/O costs of the underlying storage devices to improve I/O performance. We also study the asymmetric I/O properties of magnetic and flash disks and propose algorithms that take advantage of this asymmetry. Our work is geared towards database systems; however, most of the ideas presented in this thesis can be generalised to any data-intensive application. For the case of low-end, inexpensive flash devices with large capacities, we propose using them at the same level of the memory hierarchy as magnetic disks. In such setups, we study the problem of data placement, that is, on which type of storage medium each data page should be stored. We present a family of online algorithms that can be used to dynamically decide the optimal placement of each page. Our algorithms adapt to changing workloads for maximum I/O efficiency. We found that substantial performance benefits can be gained with such a design, especially for queries touching large sets of pages with read-intensive workloads. Moving one level higher in the storage hierarchy, we study the problem of buffer allocation in databases that store data across multiple storage devices. We present our novel approach to per-device memory allocation, under which both the I/O costs of the storage devices and the cache behaviour of the data stored on each medium determine the size of the main memory buffers that will be allocated to each device. Towards informed decisions, we found that the ability to predict the cache behaviour of devices under various cache sizes is of paramount importance. In light of this, we study the problem of efficiently tracking the hit ratio curve for each device and introduce a low-overhead technique that provides high accuracy.
The price and performance characteristics of high-end flash disks make them perfectly suitable for use as caches between the main memory and the magnetic disk(s) of a storage system. In this context, we primarily focus on the problem of deciding which data should be placed in the flash cache of a system: how the data flows from one level of the memory hierarchy to the others is crucial for the performance of such a system. Considering such decisions, we found that the I/O costs of the flash cache play a major role. We also study several implementation issues such as the optimal size of flash pages and the properties of the page directory of a flash cache. Finally, we explore sorting in external memory using external merge-sort, as the latter employs access patterns that can take full advantage of the I/O characteristics of flash memory. We study the problem of sorting hierarchical data, as such is necessary for a wide variety of applications including archiving scientific data and dealing with large XML datasets. The proposed algorithm efficiently exploits the hierarchical structure in order to minimize the number of disk accesses and optimise the utilization of available memory. Our proposals are not specific to sorting over flash memory: the presented techniques are highly efficient over magnetic disks as well
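
The hit ratio curve mentioned in the abstract above can be illustrated with the classic LRU stack-distance computation (Mattson's algorithm). This is a textbook technique swapped in for illustration, not the low-overhead method the thesis introduces; the function name and the toy trace are assumptions. The sketch computes, for every buffer size up to `max_size`, the fraction of accesses that an LRU buffer of that size would serve as hits.

```python
from collections import Counter


def hit_ratio_curve(trace, max_size):
    """Return {buffer_size: hit_ratio} for an LRU buffer, sizes 1..max_size."""
    stack = []             # LRU stack, most recently used page at the end
    distances = Counter()  # stack distance -> number of accesses at that distance

    for page in trace:
        if page in stack:
            # Stack distance: position of the page counted from the MRU end.
            # An LRU buffer of at least this many pages would hit here.
            distance = len(stack) - stack.index(page)
            distances[distance] += 1
            stack.remove(page)
        stack.append(page)  # the page becomes the most recently used

    total = len(trace)
    curve = {}
    hits = 0
    for size in range(1, max_size + 1):
        hits += distances[size]  # hits are cumulative in the buffer size
        curve[size] = hits / total if total else 0.0
    return curve


if __name__ == "__main__":
    for size, ratio in hit_ratio_curve([1, 2, 1, 3, 2, 1], max_size=3).items():
        print(size, round(ratio, 3))
```

This direct version scans the stack on every access, so its cost grows with the number of distinct pages; that overhead is exactly why lower-cost tracking techniques are of interest when the curve is needed per device at run time.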

    Re-Architecting Mass Storage Input/Output for Performance and Efficiency

    The semantics and fundamental structure of modern operating system I/O systems date from the mid-1960s to the mid-1970s, a period when computing power and memory capacity were a mere fraction of today's systems. Engineering tradeoffs made in the past enshrine the resource availability context of computing at that time. Deconstructing the semantics of the I/O infrastructure allows a re-examination of long-standing design decisions in the context of today's greater processing and memory resources. The re-examination allows changes to several widespread paradigms to improve efficiency and performance.

    Memory management in a distributed system of single address space operating systems supporting quality of service

    The choices provided by an operating system to the application developer for managing memory come in two forms: no choice at all, with the operating system making all decisions about managing memory; or the choice to implement virtual memory management specific to the individual application. The second of these choices is, for all intents and purposes, the same as the first: no choice at all. For many application developers, the cost of implementing a customised virtual memory management system is just too high. The result is that, regardless of the level of flexibility available, the developer ends up using the system-provided default. Further exacerbating the problem is the tendency for operating system developers to be extremely unimaginative when providing that same default. Advancements in virtual memory techniques such as prefetching, remote paging, compressed caching, and user-level page replacement, coupled with the provision of user-level virtual memory management, should have heralded a new era of choice and an application-centric approach to memory management. Unfortunately, this has failed to materialise. This dissertation describes the design and implementation of the Heracles virtual memory management system. The Heracles approach is one of inclusion rather than exclusion. The main goal of Heracles is to provide an extensible environment that is configurable to the extent of providing application-centric memory management without the need for application developers to implement their own. However, should the application developer wish to provide a more specialised implementation for all or any part of Heracles, the system is constructed around well-defined interfaces that allow new implementations to be "plugged in" where required. The result is a virtual memory management hierarchy that is highly configurable, highly flexible, and can be adapted at run-time to meet new phases in the application's behaviour.
Furthermore, different parts of an application's address space can have different hierarchies associated with managing its memory
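
The "plugged in" interfaces described in the abstract above can be pictured with a small Python sketch in which each memory region takes a replacement policy object behind a common interface. Every name here (`ReplacementPolicy`, `FifoPolicy`, `LruPolicy`, `Region`) is hypothetical and chosen for illustration; the abstract does not describe the actual Heracles interfaces.

```python
from abc import ABC, abstractmethod
from collections import OrderedDict


class ReplacementPolicy(ABC):
    """Hypothetical plug-in interface for choosing which page to evict."""

    @abstractmethod
    def touch(self, page):
        """Record an access to a resident page."""

    @abstractmethod
    def evict(self):
        """Remove and return the page chosen as victim."""


class FifoPolicy(ReplacementPolicy):
    """Stands in for a system-provided default: evict the oldest resident page."""

    def __init__(self):
        self._pages = OrderedDict()

    def touch(self, page):
        self._pages.setdefault(page, None)  # order is fixed at first touch

    def evict(self):
        page, _ = self._pages.popitem(last=False)
        return page


class LruPolicy(FifoPolicy):
    """An application-supplied alternative: evict the least recently used page."""

    def touch(self, page):
        self._pages[page] = None
        self._pages.move_to_end(page)  # every access refreshes the page


class Region:
    """Part of an address space managed with its own replacement policy."""

    def __init__(self, frames, policy):
        self.frames = frames
        self.policy = policy
        self.resident = set()

    def access(self, page):
        """Return True if the access caused a page fault."""
        fault = page not in self.resident
        if fault:
            if len(self.resident) >= self.frames:
                self.resident.discard(self.policy.evict())
            self.resident.add(page)
        self.policy.touch(page)
        return fault


if __name__ == "__main__":
    heap = Region(frames=2, policy=LruPolicy())    # one region uses LRU
    stack = Region(frames=2, policy=FifoPolicy())  # another keeps the default
    print([heap.access(p) for p in "abaca"])
    print([stack.access(p) for p in "abaca"])
```

Giving each `Region` its own policy object mirrors the statement that different parts of an address space can be managed differently, while an application that supplies nothing simply keeps the default.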

    Fourth NASA Goddard Conference on Mass Storage Systems and Technologies

    This report contains copies of all those technical papers received in time for publication just prior to the Fourth Goddard Conference on Mass Storage and Technologies, held March 28-30, 1995, at the University of Maryland, University College Conference Center, in College Park, Maryland. This series of conferences continues to serve as a unique medium for the exchange of information on topics relating to the ingestion and management of substantial amounts of data and the attendant problems involved. This year's discussion topics include new storage technology, stability of recorded media, performance studies, storage system solutions, the National Information Infrastructure (Infobahn), the future for storage technology, and lessons learned from various projects. There also will be an update on the IEEE Mass Storage System Reference Model Version 5, on which the final vote was taken in July 1994.