498 research outputs found
Designing Low Cost Error Correction Schemes for Improving Memory Reliability
abstract: Memory systems are becoming increasingly error-prone, and thus guaranteeing their reliability is a major challenge. In this dissertation, new techniques to improve the reliability of both 2D and 3D dynamic random access memory (DRAM) systems are presented. The proposed schemes have higher reliability than current systems but with lower power, better performance and lower hardware cost.
First, a low overhead solution that improves the reliability of commodity DRAM systems with no change in the existing memory architecture is presented. Specifically, five erasure and error correction (E-ECC) schemes are proposed that provide at least Chipkill-Correct protection for x4 (Schemes 1, 2 and 3), x8 (Scheme 4) and x16 (Scheme 5) DRAM systems. All schemes have superior error correction performance due to the use of strong symbol-based codes. In addition, the use of erasure codes extends the lifetime of the 2D DRAM systems.
Next, two error correction schemes are presented for 3D DRAM memory systems. The first scheme is a rate-adaptive, two-tiered error correction scheme (RATT-ECC) that provides strong reliability (10^10x) reduction in raw FIT rate) for an HBM-like 3D DRAM system that services CPU applications. The rate-adaptive feature of RATT-ECC enables permanent bank failures to be handled through sparing. It can also be used to significantly reduce the refresh power consumption without decreasing the reliability and timing performance.
The second scheme is a two-tiered error correction scheme (Config-ECC) that supports different sized accesses in GPU applications with strong reliability. It addresses the mismatch between data access size and fixed sized ECC scheme by designing a product code based flexible scheme. Config-ECC is built around a core unit designed for 32B access with a simple extension to support 64B and 128B accesses. Compared to fixed 32B and 64B ECC schemes, Config-ECC reduces the failure in time (FIT) rate by 200x and 20x, respectively. It also reduces the memory energy by 17% (in the dynamic mode) and 21% (in the static mode) compared to a state-of-the-art fixed 64B ECC scheme.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201
How Can Optical Communications Shape the Future of Deep Space Communications? A Survey
With a large number of deep space (DS) missions anticipated by the end of
this decade, reliable and high capacity DS communications systems are needed
more than ever. Nevertheless, existing DS communications technologies are far
from meeting such a goal. Improving current DS communications systems does not
only require system engineering leadership but also, very crucially, an
investigation of potential emerging technologies that overcome the unique
challenges of ultra-long DS communications links. To the best of our knowledge,
there has not been any comprehensive surveys of DS communications technologies
over the last decade. Free space optical (FSO) technology is an emerging DS
technology, proven to acquire lower communications systems size, weight, and
power (SWaP) and achieve a very high capacity compared to its counterpart radio
frequency (RF) technology, the current used DS technology. In this survey, we
discuss the pros and cons of deep space optical communications (DSOC).
Furthermore, we review the modulation, coding, and detection, receiver, and
protocols schemes and technologies for DSOC. We provide, for the very first
time, thoughtful discussions about implementing orbital angular momentum (OAM)
and quantum communications (QC) for DS. We elaborate on how these technologies
among other field advances, including interplanetary network, and RF/FSO
systems improve reliability, capacity, and security and address related
implementation challenges and potential solutions. This paper provides a
holistic survey in DSOC technologies gathering 200+ fragmented literature and
including novel perspectives aiming to setting the stage for more developments
in the field.Comment: 17 pages, 8 Figure
Architectural Techniques to Enable Reliable and Scalable Memory Systems
High capacity and scalable memory systems play a vital role in enabling our
desktops, smartphones, and pervasive technologies like Internet of Things
(IoT). Unfortunately, memory systems are becoming increasingly prone to faults.
This is because we rely on technology scaling to improve memory density, and at
small feature sizes, memory cells tend to break easily. Today, memory
reliability is seen as the key impediment towards using high-density devices,
adopting new technologies, and even building the next Exascale supercomputer.
To ensure even a bare-minimum level of reliability, present-day solutions tend
to have high performance, power and area overheads. Ideally, we would like
memory systems to remain robust, scalable, and implementable while keeping the
overheads to a minimum. This dissertation describes how simple cross-layer
architectural techniques can provide orders of magnitude higher reliability and
enable seamless scalability for memory systems while incurring negligible
overheads.Comment: PhD thesis, Georgia Institute of Technology (May 2017
Recommended from our members
Strong, thorough, and efficient memory protection against existing and emerging DRAM errors
Memory protection is necessary to ensure the correctness of data in the presence of unavoidable faults. As such, large-scale systems typically employ Error Correcting Codes (ECC) to trade off redundant storage and bandwidth for increased reliability. Single Device Data Correction (SDDC) ECC mechanisms are required to meet the reliability demands of servers and large-scale systems by tolerating even severe faults that disable an entire memory chip. In the future, however, stronger memory protection will be required due to increasing levels of system integration, shrinking process technology, and growing transfer rates. The energy-efficiency of memory protection is also important as DRAM already consumes a significant fraction of system energy budget. This dissertation develops a novel set of ECC schemes to provide strong, safe, flexible, and thorough protection against existing and emerging types of DRAM errors. This research also reduces energy consumption of such protection while only marginally impacting performance. First, this dissertation develops Bamboo ECC, a technique with strongerthan-SDDC correction and very safe detection capabilities (≥ 99.999994% of data errors with any severity are detected). Bamboo ECC changes ECC layout based on frequent DRAM error patterns, and can correct concurrent errors from multiple devices and all but eliminates the risk of silent data corruption. Also, Bamboo ECC provides flexible configurations to enable more adaptive graceful downgrade schemes in which the system continues to operate correctly after even severe chip faults, albeit at a reduced capacity to protect against future faults. These strength, safety, and flexibility advantages translate to a significantly more reliable memory sub-system for future exascale computing. Then, this dissertation focuses on emerging error types from scaling process technology and increasing data bandwidth. As DRAM process technology scales down to below 10nm, DRAM cells are becoming more vulnerable to errors from an imperfect manufacturing process. At the same time, DRAM signal transfers are getting more susceptible to timing and electrical noises as DRAM interfaces keep increasing signal transfer rates and decreasing I/O voltage levels. With individual DRAM chips getting more vulnerable to errors, industry and academia have proposed mechanisms to tolerate these emerging types of errors; yet they are inefficient because they rely on multiple levels of redundancy in the case of cell errors and ad-hoc schemes with suboptimal protection coverage for transmission errors. Active Guardband ECC and All-Inclusive ECC make systematic use of ECC and existing mechanisms to provide thorough end-to-end protection without requiring redundancy beyond what is common today. Finally, this dissertation targets the energy efficiency of memory protection. Frugal ECC combines ECC with fine-grained compression to provide versatile and energy-efficient protection. Frugal ECC compresses main memory at cache-block granularity, using any left over space to store ECC information. Frugal ECC allows more energy-efficient memory configurations while maintaining SDDC protection. Its tailored compression scheme minimizes insufficiently compressed blocks and results in acceptable performance overhead. The strong, thorough, and efficient protection described by this dissertation may allow for more aggressive design of future computing systems with larger integration, finer process technology, higher transfer rates, and better energy efficiencyElectrical and Computer Engineerin
Secure and efficient storage of multimedia: content in public cloud environments using joint compression and encryption
The Cloud Computing is a paradigm still with many unexplored areas ranging from the
technological component to the de nition of new business models, but that is revolutionizing the way we design, implement and manage the entire infrastructure of information technology.
The Infrastructure as a Service is the delivery of computing infrastructure, typically a virtual data center, along with a set of APIs that allow applications, in an automatic way, can control the resources they wish to use. The choice of the service provider and how it applies to their business model may lead to higher or lower cost in the operation and maintenance of applications near the suppliers.
In this sense, this work proposed to carry out a literature review on the topic of Cloud
Computing, secure storage and transmission of multimedia content, using lossless compression, in public cloud environments, and implement this system by building an application that manages data in public cloud environments (dropbox and meocloud).
An application was built during this dissertation that meets the objectives set. This system provides the user a wide range of functions of data management in public cloud environments, for that the user only have to login to the system with his/her credentials, after performing the login, through the Oauth 1.0 protocol (authorization protocol) is generated an access token, this token is generated only with the consent of the user and allows the application to get access to data/user les without having to use credentials. With this token the framework can now operate and unlock the full potential of its functions. With this application
is also available to the user functions of compression and encryption so that user can make the most of his/her cloud storage system securely. The compression function works using the compression algorithm LZMA being only necessary for the user to choose the les to be compressed.
Relatively to encryption it will be used the encryption algorithm AES (Advanced Encryption Standard) that works with a 128 bit symmetric key de ned by user.
We build the research into two distinct and complementary parts: The rst part consists
of the theoretical foundation and the second part is the development of computer application where the data is managed, compressed, stored, transmitted in various environments of cloud computing. The theoretical framework is organized into two chapters, chapter 2 - Background
on Cloud Storage and chapter 3 - Data compression.
Sought through theoretical foundation demonstrate the relevance of the research, convey some of the pertinent theories and input whenever possible, research in the area. The second part of the work was devoted to the development of the application in cloud environment.
We showed how we generated the application, presented the features, advantages, and
safety standards for the data. Finally, we re ect on the results, according to the theoretical
framework made in the rst part and platform development.
We think that the work obtained is positive and that ts the goals we set ourselves
to achieve. This research has some limitations, we believe that the time for completion was scarce and the implementation of the platform could bene t from the implementation of other features.In future research it would be appropriate to continue the project expanding the capabilities
of the application, test the operation with other users and make comparative tests.A Computação em nuvem é um paradigma ainda com muitas áreas por explorar que
vão desde a componente tecnológica à definição de novos modelos de negócio, mas que está
a revolucionar a forma como projetamos, implementamos e gerimos toda a infraestrutura da
tecnologia da informação.
A Infraestrutura como Serviço representa a disponibilização da infraestrutura computacional,
tipicamente um datacenter virtual, juntamente com um conjunto de APls que permitirá
que aplicações, de forma automática, possam controlar os recursos que pretendem utilizar_ A
escolha do fornecedor de serviços e a forma como este aplica o seu modelo de negócio poderão
determinar um maior ou menor custo na operacionalização e manutenção das aplicações junto
dos fornecedores.
Neste sentido, esta dissertação propôs· se efetuar uma revisão bibliográfica sobre a
temática da Computação em nuvem, a transmissão e o armazenamento seguro de conteúdos
multimédia, utilizando a compressão sem perdas, em ambientes em nuvem públicos, e implementar
um sistema deste tipo através da construção de uma aplicação que faz a gestão dos
dados em ambientes de nuvem pública (dropbox e meocloud).
Foi construída uma aplicação no decorrer desta dissertação que vai de encontro aos objectivos
definidos. Este sistema fornece ao utilizador uma variada gama de funções de gestão
de dados em ambientes de nuvem pública, para isso o utilizador tem apenas que realizar o login
no sistema com as suas credenciais, após a realização de login, através do protocolo Oauth 1.0
(protocolo de autorização) é gerado um token de acesso, este token só é gerado com o consentimento
do utilizador e permite que a aplicação tenha acesso aos dados / ficheiros do utilizador
~em que seja necessário utilizar as credenciais. Com este token a aplicação pode agora operar e
disponibilizar todo o potencial das suas funções. Com esta aplicação é também disponibilizado
ao utilizador funções de compressão e encriptação de modo a que possa usufruir ao máximo
do seu sistema de armazenamento cloud com segurança. A função de compressão funciona
utilizando o algoritmo de compressão LZMA sendo apenas necessário que o utilizador escolha os
ficheiros a comprimir. Relativamente à cifragem utilizamos o algoritmo AES (Advanced Encryption
Standard) que funciona com uma chave simétrica de 128bits definida pelo utilizador.
Alicerçámos a investigação em duas partes distintas e complementares: a primeira parte
é composta pela fundamentação teórica e a segunda parte consiste no desenvolvimento da aplicação
informática em que os dados são geridos, comprimidos, armazenados, transmitidos em
vários ambientes de computação em nuvem. A fundamentação teórica encontra-se organizada
em dois capítulos, o capítulo 2 - "Background on Cloud Storage" e o capítulo 3 "Data Compression",
Procurámos, através da fundamentação teórica, demonstrar a pertinência da investigação. transmitir algumas das teorias pertinentes e introduzir, sempre que possível, investigações
existentes na área. A segunda parte do trabalho foi dedicada ao desenvolvimento da
aplicação em ambiente "cloud". Evidenciámos o modo como gerámos a aplicação, apresentámos
as funcionalidades, as vantagens. Por fim, refletimos sobre os resultados , de acordo com o
enquadramento teórico efetuado na primeira parte e o desenvolvimento da plataforma.
Pensamos que o trabalho obtido é positivo e que se enquadra nos objetivos que nos propusemos
atingir. Este trabalho de investigação apresenta algumas limitações, consideramos que
o tempo para a sua execução foi escasso e a implementação da plataforma poderia beneficiar
com a implementação de outras funcionalidades. Em investigações futuras seria pertinente dar continuidade ao projeto ampliando as potencialidades da aplicação, testar o funcionamento
com outros utilizadores e efetuar testes comparativos.Fundação para a Ciência e a Tecnologia (FCT
Combining symbiotic simulation systems with enterprise data storage systems for real-time decision-making
This is the author accepted manuscript. The final version is available from Taylor & Francis via the DOI in this recordA symbiotic simulation system (S3) enables interactions between a physical system and its computational model representation. To support operational decisions, an S3 uses real-time data from the physical system, which is gathered via sensors and saved in an enterprise data storage system (EDSS). Both real-time and historical data are then used as inputs to the different components of an S3. This paper proposes a generic system architecture for an S3 and discusses its integration within EDSSs. The paper also reviews the literature on S3 and analyses how these systems can be used for real-time decision-making.Erasmus
Fault-tolerant satellite computing with modern semiconductors
Miniaturized satellites enable a variety space missions which were in the past infeasible, impractical or uneconomical with traditionally-designed heavier spacecraft. Especially CubeSats can be launched and manufactured rapidly at low cost from commercial components, even in academic environments. However, due to their low reliability and brief lifetime, they are usually not considered suitable for life- and safety-critical services, complex multi-phased solar-system-exploration missions, and missions with a longer duration. Commercial electronics are key to satellite miniaturization, but also responsible for their low reliability: Until 2019, there existed no reliable or fault-tolerant computer architectures suitable for very small satellites. To overcome this deficit, a novel on-board-computer architecture is described in this thesis.Robustness is assured without resorting to radiation hardening, but through software measures implemented within a robust-by-design multiprocessor-system-on-chip. This fault-tolerant architecture is component-wise simple and can dynamically adapt to changing performance requirements throughout a mission. It can support graceful aging by exploiting FPGA-reconfiguration and mixed-criticality. Experimentally, we achieve 1.94W power consumption at 300Mhz with a Xilinx Kintex Ultrascale+ proof-of-concept, which is well within the powerbudget range of current 2U CubeSats. To our knowledge, this is the first COTS-based, reproducible on-board-computer architecture that can offer strong fault coverage even for small CubeSats.European Space AgencyComputer Systems, Imagery and Medi
- …