Search CORE

41 research outputs found

Operating System Support for High-Performance Solid State Drives

Author: Bjørling Matias
Publication venue: IT-Universitetet i København
Publication date: 01/01/2016
Field of study

The IT University of Copenhagen's Repository

Emerging research directions in computer science : contributions from the young informatics faculty in Karlsruhe

Author: Kounev Samuel
Pankratius Victor
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2010
Field of study

In order to build better human-friendly human-computer interfaces, such interfaces need to be able to perceive the user: his location, identity, activities and, in particular, his interaction with others and the machine. Only with these perception capabilities can smart systems (for example, human-friendly robots or smart environments) become possible. In my research I'm thus focusing on the development of novel techniques for the visual perception of humans and their activities, in order to facilitate perceptive multimodal interfaces, humanoid robots and smart environments. My work includes research on person tracking, person identification, recognition of pointing gestures, estimation of head orientation and focus of attention, as well as audio-visual scene and activity analysis. Application areas are human-friendly humanoid robots, smart environments, content-based image and video analysis, as well as safety- and security-related applications. This article gives a brief overview of my ongoing research activities in these areas.

KITopen

An automated OpenCL FPGA compilation framework targeting a configurable, VLIW chip multiprocessor

Author: Samuel J. Parker
Publication venue
Publication date: 01/01/2015
Field of study

Modern systems-on-chip augment their baseline CPU with coprocessors and accelerators to increase overall computational capacity and power efficiency, and have thus evolved into heterogeneous systems. Several languages have been developed to enable this paradigm shift, including CUDA and OpenCL. This thesis discusses a unified compilation environment that enables heterogeneous system design through the use of OpenCL and a customised VLIW chip multiprocessor (CMP) architecture, known as the LE1. An LLVM compilation framework was researched and a prototype developed to enable the execution of OpenCL applications on the LE1 CPU. The framework fully automates the compilation flow and supports work-item coalescing to better utilise the CPU cores and alleviate the effects of thread divergence. This thesis discusses in detail both the software stack and the target hardware architecture, and evaluates the scalability of the proposed framework on a highly precise cycle-accurate simulator. This is achieved through the execution of 12 benchmarks across 240 different machine configurations, as well as further results utilising an incomplete development branch of the compiler. It is shown that the problems generally scale well with the LE1 architecture up to eight cores, at which point the memory system becomes a serious bottleneck. Results demonstrate superlinear performance on certain benchmarks (x9 for the bitonic sort benchmark with 8 dual-issue cores), with further improvements from compiler optimisations (x14 for bitonic with the same configuration).
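The work-item coalescing mentioned in the abstract above can be illustrated with a minimal sketch. This is the general idea only, not the thesis's LLVM implementation: the kernel body is written for one work-item, and a wrapper loop runs all work-items of a work-group in sequence on one core. The names `vector_add_kernel` and `run_coalesced` are hypothetical, and the sketch assumes the global size is a multiple of the local size.

```python
def vector_add_kernel(gid, a, b, out):
    """Kernel body for a single work-item, identified by its global id."""
    out[gid] = a[gid] + b[gid]


def run_coalesced(kernel, global_size, local_size, *args):
    """Execute each work-group as a plain loop over its work-items.

    Assumes global_size is a multiple of local_size (illustrative only).
    """
    for group in range(global_size // local_size):  # one pass per work-group
        for lid in range(local_size):               # coalesced work-items
            kernel(group * local_size + lid, *args)


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
out = [0] * 4
run_coalesced(vector_add_kernel, 4, 2, a, b, out)
print(out)
```

In a real compiler the inner loop would be generated around the kernel body at compile time, so that one core handles a whole work-group instead of needing one hardware thread per work-item.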

Loughborough University Institutional Repository

Designing systems for emerging memory technologies

Author: Kwon Youngjin, Ph. D.
Publication venue
Publication date: 19/09/2018
Field of study

Emerging memory technologies open new challenges in system software: diversity and large capacity. Non-volatile memory (NVM) technologies will have excellent performance, byte-addressability, and large capacity, blurring the line between traditional volatile DRAM and non-volatile storage. NVM diverges from DRAM in significant ways, such as limited write bandwidth. It is likely that the future storage market will be diversified, with DRAM, NVM, SSD, and hard disk. Unfortunately, current file systems, built on top of old design ideas, cannot provide an efficient way to take advantage of the different storage media. Strata is a cross-media file system that fundamentally redesigns file systems to leverage the different strengths of storage technologies while compensating for their weaknesses. Modern applications such as large-scale machine learning and graph analytics want to load huge datasets into memory for fast computation. For these workloads, merely adding more RAM to a machine reaches a point of diminishing returns for performance, because their poor spatial locality causes them to suffer high virtual-to-physical memory translation costs. NVM will make this problem worse because it provides cheaper cost-per-capacity than DRAM. Ingens, an efficient memory management system, addresses the shortcomings in modern operating systems and hypervisors that underlie these excessive address translation overheads, and redesigns huge page memory systems to make huge pages widely used in practice.

Computer Science

Texas ScholarWorks

Analysis and optimization of storage IO in distributed and massive parallel high performance systems

Author: Sayed Mohamed Salem el-
Publication venue
Publication date: 01/01/2011
Field of study

Although Moore’s law ensures the increase in computational power, IO performance appears to be left behind. This minimizes the benefits gained from increased computational power, since processors have to idle for a long time waiting for IO. Another factor that slows IO communication is the increased parallelism required in today’s computations. Most modern processing units are built from multiple weak cores. Since IO has low parallelism, the weak cores will decrease system performance. Furthermore, to avoid the added delay of external storage, future High Performance Computing (HPC) systems will employ Active Storage Fabrics (ASF). These embed storage directly into large HPC systems. Single HPC node IO performance will therefore require optimization. This can only be achieved with a full understanding of the IO stack operations. The analysis of the IO stack under the new conditions of multi-core and massive parallelism leads to some important conclusions. The IO stack is generally built for single devices and is heavily optimized for HDD. Two main optimization approaches are taken. The first is optimizing the IO stack to accommodate parallelism. The IO analysis shows that a design based on several storage devices operating in parallel is the best approach for parallelism in the IO stack. A parallel IO device with a unified storage space is introduced. The unified storage space allows for optimal division of functions among resources for both read and write. The design also avoids the overhead of large parallel file systems by making limited changes to a conventional file system. Furthermore, the interface of the IO stack is not changed by the design. This is an important restriction that avoids application rewrites. The implementation of such a design is shown to result in an increase in performance. The second approach is optimizing the IO stack for Solid State Drives (SSD). The optimization for the new storage technology demanded further analysis. This analysis shows that the IO stack requires revision on many levels for optimal accommodation of SSD. File system preallocation of free blocks is used as an example. Preallocation is important for data contiguity on HDD. However, due to the fast random access of SSD, preallocation represents an overhead. Through careful analysis of the block allocation algorithms, preallocation is removed. As an additional optimization approach, IO compression is suggested for future work. It can utilize idle cores during an IO transaction to perform on-the-fly IO data compression.

Digital signal processing application based on residue number system

Author: Rolko Maroš
Publication venue: Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií
Publication date: 01/01/2011
Field of study

This work deals with the residue number system and its applications in digital circuits. The first part covers the VHDL design of different adder types in the residue number system and their comparison with standard adders. The second part covers the VHDL implementation of an image processor that computes in the residue number system, together with its performance analysis. The text describes the design procedure and presents the results of the analyses.
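As a brief illustration of the residue number system referred to in the abstract above, the following minimal sketch shows why addition is attractive in hardware: each residue channel is added independently, with no carry between channels. The moduli (3, 5, 7) and the function names are assumptions chosen for illustration and are not taken from the thesis, which works in VHDL rather than Python.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; representable range is 3 * 5 * 7 = 105.
MODULI = (3, 5, 7)


def to_rns(x, moduli=MODULI):
    """Encode an integer as its residues modulo each modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Add two RNS numbers channel by channel; no carries cross channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Decode residues back to an integer via the Chinese remainder theorem."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = big_m // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        total += r * partial * pow(partial, -1, m)
    return total % big_m


s = rns_add(to_rns(17), to_rns(30))
print(s, from_rns(s))
```

The result is only correct while the true sum stays below the product of the moduli, which is why practical designs pick moduli sets large enough for the intended dynamic range.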

Digital library of Brno University of Technology

National Repository of Grey Literature

Towards Compression at All Levels of the Memory Hierarchy (Vers la Compression à Tous les Niveaux de la Hiérarchie de la Mémoire)

Author: Rodrigues Carvalho Daniel
Publication venue: HAL CCSD
Publication date: 09/04/2021
Field of study

Hardware compression techniques are typically simplifications of software compression methods. They must, however, comply with area, power and latency constraints. This study unveils the challenges of adopting compression in memory design. The goal of this analysis is not to summarize proposals, but to highlight the solutions they employ to handle those challenges. An in-depth description of the main characteristics of multiple methods is provided, as well as criteria that can be used as a basis for the assessment of such schemes. Typically, these schemes are not very efficient, and those that do compress well decompress slowly. This work explores their granularity to redefine their perspectives and improve their efficiency, through a concept called Region-Chunk compression. Its goal is to achieve a low (good) compression ratio and fast decompression latency. The key observation is that by further sub-dividing the chunks of data being compressed one can reduce data duplication. This concept can be applied to several previously proposed compressors, resulting in a reduction of their average compressed size. In particular, a single-cycle-decompression compressor is boosted to reach a compressibility level competitive with state-of-the-art proposals. Finally, to increase the probability of successfully co-allocating compressed lines, Pairwise Space Sharing (PSS) is proposed. PSS can be applied orthogonally to compaction methods at no extra latency penalty, and with a cost-effective metadata overhead. The proposed system (Region-Chunk+PSS) further enhances the normalized average cache capacity by 2.7% (geometric mean), while featuring short decompression latency.

INRIA a CCSD electronic archive server

THE EMOTIONS OF PUBLIC HOUSING POLICY: A CRITICAL HUMANIST EXPLORATION OF HOPE VI

Author: Hostetter Ellen
Publication venue: UKnowledge
Publication date: 01/01/2008
Field of study

Homeownership and Opportunity for People Everywhere VI (HOPE VI) is dramatically changing the face of public housing. The HOPE VI program proposes to replace barracks-style and high-rise apartments with a new public housing landscape built on the planning principles of New Urbanism: small-scale developments of single-family homes and townhouses with front lawns and porches. Academic and governmental analyses of HOPE VI have used economic, political, and social perspectives to analyze this significant financial investment, radical landscape alteration, and change in residents' lives. This dissertation analyzes the process of HOPE VI and its attendant landscapes using a critical humanist perspective focused on the human, emotional dimension of public housing policy. By bringing together geography, psychology, sociology, and philosophy literatures on emotion with geographic literatures on critical humanism and the cultural landscape, this dissertation shows that specific emotions such as disgust, fear, shame, and enjoyment permeate, shape, and direct public housing policy and appearance in different places and across time. More specifically, the dissertation shows that 1) disgust, fear, shame, and enjoyment constitute both the political and economic logic essential to HOPE VI and 2) disgust, fear, shame, and enjoyment are articulated through and crystallized in reactions to the public housing landscape: its aesthetic and social context. The overall contribution of the project is, first, to challenge the binaries that often structure academic and governmental analyses of HOPE VI, including rational-emotional, outsiders-residents, creation-implementation, and national-local. In challenging these binaries, the project offers an alternative way to think about and understand HOPE VI and housing policy. Second, the dissertation contributes to the methods literature by exploring how to analyze emotion through discourse analysis and how to ask people about emotions.

University of Kentucky