268 research outputs found

    Gestión de jerarquías de memoria híbridas a nivel de sistema

    Get PDF
    Tesis inédita de la Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadoras y Automática y de Ku Leuven, Arenberg Doctoral School, Faculty of Engineering Science, leída el 11/05/2017.In electronics and computer science, the term ‘memory’ generally refers to devices that are used to store information that we use in various appliances ranging from our PCs to all hand-held devices, smart appliances etc. Primary/main memory is used for storage systems that function at a high speed (i.e. RAM). The primary memory is often associated with addressable semiconductor memory, i.e. integrated circuits consisting of silicon-based transistors, used for example as primary memory but also other purposes in computers and other digital electronic devices. The secondary/auxiliary memory, in comparison provides program and data storage that is slower to access but offers larger capacity. Examples include external hard drives, portable flash drives, CDs, and DVDs. These devices and media must be either plugged in or inserted into a computer in order to be accessed by the system. Since secondary storage technology is not always connected to the computer, it is commonly used for backing up data. The term storage is often used to describe secondary memory. Secondary memory stores a large amount of data at lesser cost per byte than primary memory; this makes secondary storage about two orders of magnitude less expensive than primary storage. There are two main types of semiconductor memory: volatile and nonvolatile. Examples of non-volatile memory are ‘Flash’ memory (sometimes used as secondary, sometimes primary computer memory) and ROM/PROM/EPROM/EEPROM memory (used for firmware such as boot programs). Examples of volatile memory are primary memory (typically dynamic RAM, DRAM), and fast CPU cache memory (typically static RAM, SRAM, which is fast but energy-consuming and offer lower memory capacity per are a unit than DRAM). Non-volatile memory technologies in Si-based electronics date back to the 1990s. Flash memory is widely used in consumer electronic products such as cellphones and music players and NAND Flash-based solid-state disks (SSDs) are increasingly displacing hard disk drives as the primary storage device in laptops, desktops, and even data centers. The integration limit of Flash memories is approaching, and many new types of memory to replace conventional Flash memories have been proposed. The rapid increase of leakage currents in Silicon CMOS transistors with scaling poses a big challenge for the integration of SRAM memories. There is also the case of susceptibility to read/write failure with low power schemes. As a result of this, over the past decade, there has been an extensive pooling of time, resources and effort towards developing emerging memory technologies like Resistive RAM (ReRAM/RRAM), STT-MRAM, Domain Wall Memory and Phase Change Memory(PRAM). Emerging non-volatile memory technologies promise new memories to store more data at less cost than the expensive-to build silicon chips used by popular consumer gadgets including digital cameras, cell phones and portable music players. These new memory technologies combine the speed of static random-access memory (SRAM), the density of dynamic random-access memory (DRAM), and the non-volatility of Flash memory and so become very attractive as another possibility for future memory hierarchies. The research and information on these Non-Volatile Memory (NVM) technologies has matured over the last decade. These NVMs are now being explored thoroughly nowadays as viable replacements for conventional SRAM based memories even for the higher levels of the memory hierarchy. Many other new classes of emerging memory technologies such as transparent and plastic, three-dimensional(3-D), and quantum dot memory technologies have also gained tremendous popularity in recent years...En el campo de la informática, el término ‘memoria’ se refiere generalmente a dispositivos que son usados para almacenar información que posteriormente será usada en diversos dispositivos, desde computadoras personales (PC), móviles, dispositivos inteligentes, etc. La memoria principal del sistema se utiliza para almacenar los datos e instrucciones de los procesos que se encuentre en ejecución, por lo que se requiere que funcionen a alta velocidad (por ejemplo, DRAM). La memoria principal está implementada habitualmente mediante memorias semiconductoras direccionables, siendo DRAM y SRAM los principales exponentes. Por otro lado, la memoria auxiliar o secundaria proporciona almacenaje(para ficheros, por ejemplo); es más lenta pero ofrece una mayor capacidad. Ejemplos típicos de memoria secundaria son discos duros, memorias flash portables, CDs y DVDs. Debido a que estos dispositivos no necesitan estar conectados a la computadora de forma permanente, son muy utilizados para almacenar copias de seguridad. La memoria secundaria almacena una gran cantidad de datos aun coste menor por bit que la memoria principal, siendo habitualmente dos órdenes de magnitud más barata que la memoria primaria. Existen dos tipos de memorias de tipo semiconductor: volátiles y no volátiles. Ejemplos de memorias no volátiles son las memorias Flash (algunas veces usadas como memoria secundaria y otras veces como memoria principal) y memorias ROM/PROM/EPROM/EEPROM (usadas para firmware como programas de arranque). Ejemplos de memoria volátil son las memorias DRAM (RAM dinámica), actualmente la opción predominante a la hora de implementar la memoria principal, y las memorias SRAM (RAM estática) más rápida y costosa, utilizada para los diferentes niveles de cache. Las tecnologías de memorias no volátiles basadas en electrónica de silicio se remontan a la década de1990. Una variante de memoria de almacenaje por carga denominada como memoria Flash es mundialmente usada en productos electrónicos de consumo como telefonía móvil y reproductores de música mientras NAND Flash solid state disks(SSDs) están progresivamente desplazando a los dispositivos de disco duro como principal unidad de almacenamiento en computadoras portátiles, de escritorio e incluso en centros de datos. En la actualidad, hay varios factores que amenazan la actual predominancia de memorias semiconductoras basadas en cargas (capacitivas). Por un lado, se está alcanzando el límite de integración de las memorias Flash, lo que compromete su escalado en el medio plazo. Por otra parte, el fuerte incremento de las corrientes de fuga de los transistores de silicio CMOS actuales, supone un enorme desafío para la integración de memorias SRAM. Asimismo, estas memorias son cada vez más susceptibles a fallos de lectura/escritura en diseños de bajo consumo. Como resultado de estos problemas, que se agravan con cada nueva generación tecnológica, en los últimos años se han intensificado los esfuerzos para desarrollar nuevas tecnologías que reemplacen o al menos complementen a las actuales. Los transistores de efecto campo eléctrico ferroso (FeFET en sus siglas en inglés) se consideran una de las alternativas más prometedores para sustituir tanto a Flash (por su mayor densidad) como a DRAM (por su mayor velocidad), pero aún está en una fase muy inicial de su desarrollo. Hay otras tecnologías algo más maduras, en el ámbito de las memorias RAM resistivas, entre las que cabe destacar ReRAM (o RRAM), STT-RAM, Domain Wall Memory y Phase Change Memory (PRAM)...Depto. de Arquitectura de Computadores y AutomáticaFac. de InformáticaTRUEunpu

    A Review of Low-end, Middle-end and High-end IoT Devices

    Get PDF
    Internet of Things (IoT) devices play a crucial role in the overall development of IoT in providing countless applications in various areas. Due to the increasing interest and rapid technological growth of sensor technology, which have certainly revolutionized the way we live today, a need to provide a detailed analysis of the embedded platforms and boards is consequential. This paper presents a comprehensive survey of the recent and most-widely used commercial and research embedded systems and boards in different classification emphasizing their key attributes including processing and memory capabilities, security features, connectivity and communication interfaces, size, cost and appearance, operating system (OS) support, power specifications and battery life and listing some interesting projects for each device. Through this exploration and discussion, readers can have an overall understanding on this area and foster more subsequent studies

    Benchmarking ARM-based Application Integrated Systems

    Get PDF
    ARM Holding Ltd. is in a very good position; by owning a very prolific instruction set architecture (named ARM) and licensing it to other companies that compete with one another, they are always winners. Though ARM is used throughout the domain covered by embedded systems; one particular market that they have cornered is in the application cores used in mobile handset devices. Licensees oftentimes make their own alterations to the core, add other system components, and print or package them together into devices that this thesis will refer to as integrated systems. Though finding very specific architectural details between the cores themselves is as simple as going onto the website, true performance comparisons stemming from credible benchmarks of the integrated systems that contain them are not as forthcoming. This thesis aims to explore the realities of benchmarking ARM-based application integrated systems. By looking at what kind of benchmarks are available and what kind of tests have been done in the past it should become apparent what, if anything, needs to be developed to provide thorough comparisons between these devices. During the course of this investigation, topics such as the ambiguity in integrated system methods, the future of application integrated systems, and the process in actually selecting and running a benchmark on a sample system shall be explored.School of Electrical & Computer Engineerin

    Verification of SD/MMC Controller IP Using UVM

    Get PDF
    Wide spread IP reuse in SoC Designs has enabled meteoric development of derivative designs. Several hardware block IPs are integrated together to reduce production costs, time-to-fab/timeto- market and achieve higher levels of productivity. These block IPs must be verified independently before shipping to ensure proper working and conformance to protocols that they are implementing. But, since the application of these IPs will vary from SoC to SoC, the verification environment must consider the important features and functions that are critical for that application. This may mean, revamping the entire testbench to verify the application critical features. Verification takes a major chunk of the total time of the manufacturing cycle. Thus, Verification IPs are created that can be re-used by making minor modifications to the existing test bench. In this project, an Open Cores IP – “SD/MMC Card Controller” (written in Verilog) is re-used by adding an interrupt line and card-detect feature and is verified using Universal Verification Methodology (UVM). The SD/MMC Card Controller has Wishbone as the Host Controller and SPI Master as the Core Controller. The test environment is layered and can be reused. This means, if this IP is re-designed to be controlled by another Host Controller (AXI for example), the verification environment can be re-used by inserting the BFM of that host controller. This paper discusses SD/MMC, Wishbone bus and SPI protocols, along with SD/MMC Controller and UVM based test-bench architecture

    Performance Evaluation of In-storage Processing Architectures for Diverse Applications and Benchmarks

    Get PDF
    University of Minnesota Ph.D. dissertation.June 2018. Major: Electrical/Computer Engineering. Advisor: David Lilja. 1 computer file (PDF); vii, 100 pages.As we inch towards the future, the storage needs of the world are going to be massive and diversied. To tackle the needs of the next generation, the storage systems are required to be studied and require innovative solutions. These solutions need to solve multitude of issues involving high power consumption of traditional systems, manageability, easy scaling out, and integration into existing systems. Therefore, we need to rethink the new technologies from the ground up. To keep the energy signature under control we devised a new architecture called Storage Processing Unit (SPU). For the modeling of this architecture we incorporate a processing element inside the storage medium to limit the data movement between the storage device and the host processor. This resulted in a hierarchal architecture which required an extensive design space exploration along with in-depth study of the applications. We found this new architecture to provide energy savings from 11-423X and gave performance gains from 4-66X for applications including k-means, Sparse BLAS, and others. Moreover, to understand the diverse nature of the applications and newer technologies, we tried the concept of in-storage processing for unstructured data. This type of data is demonstrating huge amount of growth and would continue to do so. Seagate's new class of drives - Kinetic Drives, address the rise of unstructured data. They have a processing element inside disk drives that execute LevelDB, a key-value store. We evaluated this off-the-shelf device using micro and macro benchmarks for an in-depth throughput and latency benchmarking. We observed sequential write throughput of 63 MB/sec and sequential read throughput of 78 MB/sec for 1 MB value sizes. We tested several unique features including P2P transfer that takes place in a Kinetic Drive. These new class of drives outperformed traditional servers workloads for several test cases. Finally, large number of these devices are needed for huge amounts of data. To demonstrate that Kinetic Drives reduce the management complexity for large-scale deployment, we conducted a study. We allocated large amounts of data on Kinetic Drives and then evaluated the performance of the system for migration of data amongst drives. Previously developed key indexing schemes were evaluated which gave important insights into their performance differences. Based on this study we can conclude that efficient mapping of key-value pairs to drives could be obtained. This lead to an understanding of the trade-offs between the number of empty drives and mapping of different key ranges to different drives. In conclusion, in-storage processing architectures bring an interesting aspect where processing is moved closer to the data. This leads to a paradigm shift which often results in a major software and hardware architectural changes. Furthermore, the new architectures have the potential to perform better than the traditional systems but require easy integration with the existing systems

    Architecting Energy Efficient Servers.

    Full text link
    This dissertation investigates how energy efficient servers can be architected using current and future technology. We leverage recent trends in packaging and device technology to deliver low power and high throughput. Specifically at the package level, this dissertation looks at 3D stacking technology that has emerged as a promising solution in achieving energy efficiency by delivering high throughput at a low cost. It shows how one would leverage this new technology into a datacenter. 3D stacking technology can be used to implement a simple, low-power, high-performance chip multiprocessor suitable for throughput processing. Our proposed architecture leveraging this technology, PicoServer, employs 3D technology to bond one die containing several simple slow processing cores to multiple memory dies sufficient for a primary memory. The multiple memory dies are composed of DRAM. 3D stacking technology also enables wide low-latency buses between processors and memory. These remove the need for an L2 cache allowing its area to be re-allocated to additional simple cores. The additional cores allow the clock frequency to be lowered without impairing throughput. Lower clock frequency along with the integration of non-volatile memory in turn reduces power and means that thermal constraints, a concern with 3D stacking, are easily satisfied. The PicoServer architecture targets server applications,which exhibit a high degree of thread level parallelism. An architecture targeted to efficient throughput is ideal for this application domain. At the memory device level, this dissertation investigates how the system memory could be re-architected to reduce the rising power consumption of system memory and disk drives. Flash memory has emerged as a strong candidate to reduce system memory power while remaining cost effective than conventional system memory. This dissertation discusses how Flash could be integrated at the system level and provides insights on the architectural support for Flash in servers. Our architecture uses a two level disk cache composed of a relatively small DRAM, which includes a primary disk cache, and a Flash based secondary disk cache. Further, based on our observations, we found that the Flash based disk caches should be split into a read optimized disk cache and write optimized disk cache.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/57602/2/tkgil_1.pd

    Advances of mobile forensic procedures in Firefox OS

    Get PDF
    The advancement of smartphone technology has attracted many companies in developing mobile operating system (OS). Mozilla Corporation recently released Linux-based open source mobile OS, named Firefox OS. The emergence of Firefox OS has created new challenges, concentrations and opportunities for digital investigators. In general, Firefox OS is designed to allow smartphones to communicate directly with HTML5 applications using JavaScript and newly introduced WebAPI. However, the used of JavaScript in HTML5 applications and solely no OS restriction might lead to security issues and potential exploits. Therefore, forensic analysis for Firefox OS is urgently needed in order to investigate any criminal intentions. This paper will present an overview and methodology of mobile forensic procedures in forensically sound manner for Firefox OS

    Doctor of Philosophy

    Get PDF
    dissertationWith the explosion of chip transistor counts, the semiconductor industry has struggled with ways to continue scaling computing performance in line with historical trends. In recent years, the de facto solution to utilize excess transistors has been to increase the size of the on-chip data cache, allowing fast access to an increased portion of main memory. These large caches allowed the continued scaling of single thread performance, which had not yet reached the limit of instruction level parallelism (ILP). As we approach the potential limits of parallelism within a single threaded application, new approaches such as chip multiprocessors (CMP) have become popular for scaling performance utilizing thread level parallelism (TLP). This dissertation identifies the operating system as a ubiquitous area where single threaded performance and multithreaded performance have often been ignored by computer architects. We propose that novel hardware and OS co-design has the potential to significantly improve current chip multiprocessor designs, enabling increased performance and improved power efficiency. We show that the operating system contributes a nontrivial overhead to even the most computationally intense workloads and that this OS contribution grows to a significant fraction of total instructions when executing several common applications found in the datacenter. We demonstrate that architectural improvements have had little to no effect on the performance of the OS over the last 15 years, leaving ample room for improvements. We specifically consider three potential solutions to improve OS execution on modern processors. First, we consider the potential of a separate operating system processor (OSP) operating concurrently with general purpose processors (GPP) in a chip multiprocessor organization, with several specialized structures acting as efficient conduits between these processors. Second, we consider the potential of segregating existing caching structures to decrease cache interference between the OS and application. Third, we propose that there are components within the OS itself that should be refactored to be both multithreaded and cache topology aware, which in turn, improves the performance and scalability of many-threaded applications
    corecore