58,889 research outputs found
A new approach to collaborative frameworks using shared objects
Multi-user graphical applications currently require the creation of a set of interface objects to maintain each participating display. The concept of shared objects allows a single object instance to be used in multiple contexts concurrently. This provides a novel way of reducing collaborative overheads by requiring the maintenance of only a single set of interface objects. The paper presents the concept of a shared-object collaborative framework and illustrates how the concept can be incorporated into an existing object-oriented toolkit
Cache Equalizer: A Cache Pressure Aware Block Placement Scheme for Large-Scale Chip Multiprocessors
This paper describes Cache Equalizer (CE), a novel distributed cache management scheme for large scale chip multiprocessors (CMPs). Our work is motivated by large asymmetry in cache sets usages. CE decouples the physical locations of cache blocks from their addresses for the sake of reducing misses caused by destructive interferences. Temporal pressure at the on-chip last-level cache, is continuously collected at a group (comprised of cache sets) granularity, and periodically recorded at the memory controller to guide the placement process. An incoming block is consequently placed at a cache group that exhibits the minimum pressure. CE provides Quality of Service (QoS) by robustly offering better performance than the baseline shared NUCA cache. Simulation results using a full-system simulator demonstrate that CE outperforms shared NUCA caches by an average of 15.5% and by as much as 28.5% for the benchmark programs we examined. Furthermore, evaluations manifested the outperformance of CE versus related CMP cache designs
OpenPING: A Reflective Middleware for the Construction of Adaptive Networked Game Applications
The emergence of distributed Virtual Reality (VR) applications
that run over the Internet has presented networked game
application designers with new challenges. In an environment
where the public internet streams multimedia data and is
constantly under pressure to deliver over widely heterogeneous
user-platforms, there has been a growing need that distributed VR
applications be aware of and adapt to frequent variations in their
context of execution. In this paper, we argue that in contrast to
research efforts targeted at improvement of scalability, persistence
and responsiveness capabilities, much less attempts have been
aimed at addressing the flexibility, maintainability and
extensibility requirements in contemporary distributed VR
platforms. We propose the use of structural reflection as an
approach that not only addresses these requirements but also
offers added value in the form of providing a framework for
scalability, persistence and responsiveness that is itself flexible,
maintainable and extensible. We also present an adaptive
middleware platform implementation called OpenPING1 that
supports our proposal in addressing these requirements
H2O: An Autonomic, Resource-Aware Distributed Database System
This paper presents the design of an autonomic, resource-aware distributed
database which enables data to be backed up and shared without complex manual
administration. The database, H2O, is designed to make use of unused resources
on workstation machines. Creating and maintaining highly-available, replicated
database systems can be difficult for untrained users, and costly for IT
departments. H2O reduces the need for manual administration by autonomically
replicating data and load-balancing across machines in an enterprise.
Provisioning hardware to run a database system can be unnecessarily costly as
most organizations already possess large quantities of idle resources in
workstation machines. H2O is designed to utilize this unused capacity by using
resource availability information to place data and plan queries over
workstation machines that are already being used for other tasks. This paper
discusses the requirements for such a system and presents the design and
implementation of H2O.Comment: Presented at SICSA PhD Conference 2010 (http://www.sicsaconf.org/
AliEnFS - a Linux File System for the AliEn Grid Services
Among the services offered by the AliEn (ALICE Environment
http://alien.cern.ch) Grid framework there is a virtual file catalogue to allow
transparent access to distributed data-sets using various file transfer
protocols. (AliEn File System) integrates the AliEn file catalogue as
a new file system type into the Linux kernel using LUFS, a hybrid user space
file system framework (Open Source http://lufs.sourceforge.net). LUFS uses a
special kernel interface level called VFS (Virtual File System Switch) to
communicate via a generalised file system interface to the AliEn file system
daemon. The AliEn framework is used for authentication, catalogue browsing,
file registration and read/write transfer operations. A C++ API implements the
generic file system operations. The goal of AliEnFS is to allow users easy
interactive access to a worldwide distributed virtual file system using
familiar shell commands (f.e. cp,ls,rm ...) The paper discusses general aspects
of Grid File Systems, the AliEn implementation and present and future
developments for the AliEn Grid File System.Comment: 9 pages, 12 figure
Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs
Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia. Special interest is around Convolutional Neural Networks (CNN), which take inspiration from the hierarchical structure of the visual cortex, to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged with memory bottlenecks that require many convolution and fully-connected layers demanding large amount of communication for parallel computation. Multi-core CPU based solutions have demonstrated their inadequacy for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance but they consume high power and also have memory constraints due to inconsistencies between cache and main memory. FPGA design solutions are also actively being explored, which allow implementing the memory hierarchy using embedded BlockRAM. This boosts the parallel use of shared memory elements between multiple processing units, avoiding data replicability and inconsistencies. This makes FPGAs potentially powerful solutions for real-time classification of CNNs. Both Altera and Xilinx have adopted OpenCL co-design framework from GPU for FPGA designs as a pseudo-automatic development solution. In this paper, a comprehensive evaluation and comparison of Altera and Xilinx OpenCL frameworks for a 5-layer deep CNN is presented. Hardware resources, temporal performance and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization and more compact boards. Altera provides multi-platforms tools, mature design community and better execution times
- …