Search CORE

69 research outputs found

NVSwap: Latency-Aware Paging Using Non-Volatile Main Memory

Author: Wu Yekang
Publication venue: Washington State University
Publication date: 01/01/2021

Page relocation (paging) from DRAM to swap devices is an important task of the virtual memory system in an operating system. Existing Linux paging mechanisms have two main deficiencies: (1) they may incur high I/O latency due to write interference on solid-state disks and an aggressive memory page reclaiming rate under high memory pressure, and (2) they do not provide a predictable latency bound for latency-sensitive applications because they cannot control the allocation of system resources among concurrent processes sharing swap devices. In this thesis, we present the design and implementation of a latency-aware paging mechanism called NVSwap. It supports a hybrid swap space using both regular secondary storage devices (e.g., solid-state disks) and non-volatile main memory (NVMM). This design is more cost-effective than using only NVMM as swap space. Furthermore, NVSwap uses NVMM as a persistent paging buffer to serve page-out requests and hide the latency of paging between the regular swap device and DRAM. It supports in-situ paging for pages in the persistent paging buffer, avoiding the slow I/O path. Finally, NVSwap allows users to specify latency bounds for individual processes or a group of related processes, and it enforces the bounds by dynamically controlling the allocation of NVMM and the memory page reclaiming rate among scheduling units. We have implemented a prototype of NVSwap in Linux kernel 3.16.74. Our results demonstrate that NVSwap reduces paging latency by up to 99% and provides performance guarantees and isolation among concurrent applications sharing swap devices.

Washington State University institutional repository

Enhancing the Programmability of Cloud Object Storage

Author: Sampé Domenech Josep
Publication venue: Universitat Rovira i Virgili
Publication date: 20/11/2018

In a world that is increasingly dependent on technology, digital data is generated at an unprecedented scale. This leads companies that require large amounts of storage space, such as Netflix or Dropbox, to use cloud object storage solutions, mainly thanks to their built-in characteristics, such as simplicity, scalability and high availability. However, cloud object stores face three main challenges: 1) Flexible management of multi-tenant workloads. Commonly, cloud object stores are multi-tenant systems, meaning that all tenants share the same system resources, which can lead to interference problems. Furthermore, it is complex to manage heterogeneous storage policies at massive scale. 2) Data self-management. Cloud object stores do not offer much flexibility regarding data self-management by tenants. Typically, they are rigid systems, which prevents tenants from handling the specific requirements of their objects. 3) Elastic computation close to the data. Placing computations close to the data can be useful to reduce data transfers. The challenge here is how to achieve elasticity in those computations without provoking resource contention and interference in the storage layer. In this thesis, we present three novel research contributions that address these challenges. Firstly, we introduce the first Software-defined Storage (SDS) architecture for cloud object stores that separates the control plane from the data plane, making it possible to manage multi-tenant workloads in a flexible and dynamic way, for example by applying different bandwidth service levels to different tenants. Secondly, we designed a novel policy abstraction called the microcontroller, which transforms common objects into smart objects and enables tenants to programmatically manage their behavior. For example, a content-level access control microcontroller can be attached to a specific object to filter its content depending on who is accessing it. Finally, we present the first elastic, data-driven serverless computing platform, which mitigates the resource contention problem of placing computation close to the data.

Tesis Doctorals en Xarxa

High performance Monte Carlo computation for finance risk data analysis

Author: Zhao Yu
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2013

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Finance risk management has been playing an increasingly important role in the finance sector, both to analyse finance data and to prevent potential crises. It has been widely recognised that Value at Risk (VaR) is an effective method for finance risk management and evaluation. This thesis conducts a comprehensive review of a number of VaR methods and discusses in depth their strengths and limitations. Among these VaR methods, Monte Carlo simulation and analysis has proven to be the most accurate VaR method in finance risk evaluation due to its strong modelling capabilities. However, one major challenge in Monte Carlo analysis is its high computing complexity of O(n²). To speed up the computation in Monte Carlo analysis, this thesis parallelises Monte Carlo using the MapReduce model, which has become a major software programming model in support of data-intensive applications. MapReduce consists of two functions, Map and Reduce. The Map function segments a large data set into small data chunks and distributes these chunks among a number of computers for parallel processing, with one Mapper processing one data chunk on a computing node. The Reduce function collects the results generated by these Map nodes (Mappers) and generates an output. The parallel Monte Carlo is evaluated initially in a small-scale MapReduce experimental environment, and subsequently in a large-scale simulation environment. Both experimental and simulation results show that the MapReduce-based parallel Monte Carlo is much faster than the sequential Monte Carlo in computation, while the accuracy level is maintained. In data-intensive applications, moving huge volumes of data among the computing nodes can incur high communication overhead. To address this issue, this thesis further considers data locality in the MapReduce-based parallel Monte Carlo, and evaluates the impact of data locality on computation performance.
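
The Map and Reduce roles described in this abstract can be illustrated with a minimal single-machine sketch. This is not the thesis's implementation: the function names, the normal-returns model, and all parameter values are assumptions chosen for illustration, and Python's built-in `map` stands in for the distribution of chunks to cluster nodes.

```python
import math
import random
from functools import reduce

def map_simulate(chunk):
    """Mapper (hypothetical): simulate one chunk of one-day losses."""
    seed, n_paths, mu, sigma = chunk
    rng = random.Random(seed)  # a per-chunk seed keeps chunks reproducible
    # Assumed model: returns are normal(mu, sigma); a loss is a negated return.
    return [-rng.gauss(mu, sigma) for _ in range(n_paths)]

def reduce_merge(acc, losses):
    """Reducer (hypothetical): collect the losses produced by every mapper."""
    acc.extend(losses)
    return acc

def value_at_risk(losses, confidence):
    """Empirical VaR: the loss at the requested quantile of the sample."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, math.ceil(confidence * len(ordered)) - 1)
    return ordered[index]

def parallel_var(n_chunks=8, paths_per_chunk=5000,
                 mu=0.0005, sigma=0.02, confidence=0.99):
    """Split the simulation into chunks, map over them, then reduce."""
    chunks = [(seed, paths_per_chunk, mu, sigma) for seed in range(n_chunks)]
    mapped = map(map_simulate, chunks)  # a cluster would run these in parallel
    all_losses = reduce(reduce_merge, mapped, [])
    return value_at_risk(all_losses, confidence)
```

Because each chunk carries its own seed, the mappers are independent of one another, which is the property that lets a framework schedule them on different nodes. Calling `parallel_var()` returns the estimated 99% one-day VaR as a fraction of portfolio value under the assumed model.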

Brunel University Research Archive

Sistemas interativos e distribuídos para telemedicina

Author: Monteiro Eriksson Jorge Melicio
Publication venue: Universidade de Aveiro
Publication date: 20/04/2017

Doctorate in Computer Science. During the last decades, healthcare organizations have increasingly relied on information technologies to improve their services. At the same time, the need to optimize resources, both professionals and equipment, has promoted the emergence of telemedicine solutions. Technologies including cloud computing, mobile computing, web systems and distributed computing can be used to facilitate the creation of medical communities and to promote telemedicine services and real-time collaboration. In addition, many features that have become commonplace in social networks, such as data sharing, message exchange, discussion forums and videoconferencing, also have the potential to foster collaboration in the health sector. The main objective of this research work was to investigate computational solutions that promote the sharing of clinical data and facilitate the creation of collaborative workflows in radiology. By exploring current Web and mobile computing technologies, we designed a ubiquitous solution for medical imaging visualization and developed a collaborative system for radiology based on cloud computing technology. To extract more information from data, we investigated several methodologies, such as text mining, semantic representation and content-based image retrieval. Finally, to ensure patient privacy and to streamline data sharing in collaborative environments, we propose a machine learning methodology to anonymize medical images.

Repositório Institucional da Universidade de Aveiro

Improving Data Management and Data Movement Efficiency in Hybrid Storage Systems

Author: Ge Xiongzi
Publication date: 01/07/2017

University of Minnesota Ph.D. dissertation. July 2017. Major: Computer Science. Advisor: David Du. 1 computer file (PDF); ix, 116 pages. In the big data era, large volumes of continuously generated data drive the emergence of high-performance, large-capacity storage systems. To reduce the total cost of ownership, storage systems are built in a more composite way with many different types of emerging storage technologies and devices, including Storage Class Memory (SCM), Solid State Drives (SSD), Shingled Magnetic Recording (SMR) drives, Hard Disk Drives (HDD), and even off-premise cloud storage. To make better use of each type of storage, industry has provided multi-tier storage that dynamically places hot data in the faster tiers and cold data in the slower tiers. Data movement happens within a single device as well as between devices connected via various networks. Toward improving data management and data movement efficiency in such hybrid storage systems, this work makes the following contributions. To bridge the large semantic gap between applications and modern storage systems, passing a small piece of useful information (I/O access hints) from upper layers to the block storage layer may greatly improve application performance or ease data management in heterogeneous storage systems. We present and develop a generic and flexible framework, called HintStor, to execute and evaluate various I/O access hints on heterogeneous storage systems with minor modifications to the kernel and applications. The design of HintStor contains a new application/user-level interface, a file system plugin and a block storage data manager. With HintStor, storage systems composed of various storage devices can perform pre-devised data placement, space reallocation and data migration policies assisted by the added access hints.
Each storage device or technology has its own price-performance tradeoffs and idiosyncrasies with respect to the workload characteristics it prefers to support. To explore the internal access patterns and thus efficiently place data on storage systems with fully connected differential pools (that is, data can move from one device to any other device instead of moving tier by tier, and each pool consists of storage devices of a particular type), we propose a chunk-level storage-aware workload analyzer framework, abbreviated as ChewAnalyzer. With ChewAnalyzer, the storage manager can adequately distribute and move the data chunks across different storage pools. To reduce the duplicate content transferred between local storage devices and devices in remote data centers, an inline Network Redundancy Elimination (NRE) process with a Content-Defined Chunking (CDC) policy can obtain a higher Redundancy Elimination (RE) ratio than fixed-size chunking, but may require considerably more computation. We build an inline NRE appliance that incorporates an improved FPGA-based scheme to speed up CDC processing. To efficiently utilize the hardware resources, the whole NRE process is handled by a Virtualized NRE (VNRE) controller. The uniqueness of this VNRE lies in its ability to exploit the redundancy patterns of different TCP flows and customize the chunking process to achieve a higher RE ratio.
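
To make the Content-Defined Chunking (CDC) idea mentioned in this abstract concrete, the following is a minimal software sketch. It is not the dissertation's FPGA-based scheme: it uses a simple Gear-style rolling hash as a stand-in, and the function names, the random lookup table, and the mask and size limits are all assumptions chosen for illustration.

```python
import random

def build_gear_table(seed=0):
    """Hypothetical 256-entry table of random 32-bit values, one per byte value."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(256)]

GEAR = build_gear_table()

def cdc_chunks(data, mask=0x3FF, min_size=64, max_size=4096):
    """Split bytes into chunks whose boundaries depend on the content.

    A boundary is declared when the low bits of the rolling hash are all
    zero (roughly once per 1024 bytes for mask=0x3FF), subject to the
    assumed minimum and maximum chunk sizes.
    """
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Gear-style update: shifting ages out old bytes, so the masked
        # bits depend only on the most recent bytes of the stream.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because boundaries are chosen from the bytes themselves rather than from fixed offsets, inserting a few bytes near the start of a stream tends to disturb only the nearby chunks, while later chunks usually realign and can be recognized as duplicates. The per-byte hashing is also the extra computational cost relative to fixed-size chunking that the abstract refers to.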

University of Minnesota Digital Conservancy

Millimeter-wave Wireless LAN and its Extension toward 5G Heterogeneous Networks

Authors: Kusano Hideyuki, Miyamoto Shinichi, Mizukami Makoto, Mohamed Ehab Mahmoud, Namba Shinobu, Peng Hailan, Rezagah Roya, Sakaguchi Kei, Shirakata Naganori, Takahashi Kazuaki, Takinami Koji, Yamamoto Toshiaki
Publication date: 01/01/2015

Millimeter-wave (mmw) frequency bands, especially the 60 GHz unlicensed band, are considered a promising solution for gigabit short-range wireless communication systems. IEEE standard 802.11ad, also known as WiGig, standardizes the use of the 60 GHz unlicensed band for wireless local area networks (WLANs). With this mmw WLAN, multi-Gbps rates can be achieved to support bandwidth-intensive multimedia applications. Exhaustive search along with beamforming (BF) is usually used to overcome 60 GHz channel propagation loss and accomplish data transmissions in such mmw WLANs. Because of the short transmission range and high susceptibility to path blocking, multiple mmw access points (APs) should be used to fully cover a typical target environment for future high-capacity multi-Gbps WLANs. Therefore, coordination among mmw APs is needed to overcome packet collisions resulting from uncoordinated exhaustive search BF and to increase the total capacity of mmw WLANs. In this paper, we first present the current status of mmw WLANs with our developed WiGig AP prototype. Then, we highlight the need for coordinated transmissions among mmw APs as a key enabler for future high-capacity mmw WLANs. Two different types of coordinated mmw WLAN architecture are introduced. One is a distributed antenna type architecture that realizes centralized coordination, while the other is an autonomous coordination scheme assisted by legacy Wi-Fi signaling. Moreover, two heterogeneous network (HetNet) architectures are introduced to efficiently extend the coordinated mmw WLANs for use in future 5th Generation (5G) cellular networks.
Comment: 18 pages, 24 figures, accepted, invited paper

arXiv.org e-Print Archive

Crossref

Effective Use of SSDs in Database Systems

Author: Ghodsnia Pedram
Publication venue: University of Waterloo
Publication date: 03/05/2018

With the advent of solid state drives (SSDs), the storage industry has experienced a revolutionary improvement in I/O performance. Compared to traditional hard disk drives (HDDs), SSDs benefit from shorter I/O latency, better power efficiency, and cheaper random I/Os. Because of these superior properties, SSDs are gradually replacing HDDs. For decades, database management systems have been designed, architected, and optimized based on the performance characteristics of HDDs. In order to utilize the superior performance of SSDs, new methods should be developed, some database components should be redesigned, and architectural decisions should be revisited. In this thesis, three novel methods are proposed to exploit the new capabilities of modern SSDs to improve the performance of database systems. The first is a new method for using SSDs as a fully persistent second-level memory buffer pool. This method uses SSDs as a supplementary storage device to improve transactional throughput and to reduce checkpoint and recovery times. A prototype of the proposed method is compared with its closest existing competitor. The second considers the impact of the parallel I/O capability of modern SSDs on the database query optimizer. It is shown that a query optimizer that is unaware of the parallel I/O capability of SSDs can make significantly sub-optimal decisions. In addition, a practical method for making the query optimizer parallel-I/O-aware is introduced and evaluated empirically. The third is an SSD-friendly external merge sort. This sorting technique performs better than other common external sorting techniques. It also improves the SSD's lifespan by reducing the number of write operations required during sorting.

University of Waterloo's Institutional Repository

A stack cache for the C-processor

Author: Paalman J.A.H.
Publication date: 01/01/1990

Repository TU/e

Pure OAI Repository