AccaSim: a Customizable Workload Management Simulator for Job Dispatching Research in HPC Systems
We present AccaSim, a simulator for workload management in HPC systems.
Thanks to AccaSim's scalability to large workload datasets, support for easy
customization, and practical automated tools to aid experimentation, users can
easily represent various real HPC systems, develop novel advanced dispatchers
and evaluate them in a convenient way across different workload sources.
AccaSim is thus an attractive tool for conducting job dispatching research in
HPC systems.
Comment: 27 pages
A Survey of Techniques for Improving Security of GPUs
The graphics processing unit (GPU), although a powerful performance booster, also
has many security vulnerabilities. Due to these, the GPU can act as a
safe haven for stealthy malware and the weakest 'link' in the security 'chain'.
In this paper, we present a survey of techniques for analyzing and improving
GPU security. We classify the works on key attributes to highlight their
similarities and differences. More than informing users and researchers about
GPU security techniques, this survey aims to increase their awareness about GPU
security vulnerabilities and potential countermeasures.
NSML: Meet the MLaaS platform with a real-world case study
The boom in deep learning has led many companies and academic institutions to
competitively adopt machine learning based approaches. However,
existing machine learning frameworks offer limited support for
collaboration and management of both data and models. We propose NSML, a
machine learning as a service (MLaaS) platform, to meet these demands. NSML
lets machine learning work be launched easily on an NSML cluster and provides a
collaborative environment that supports development at enterprise scale.
Finally, NSML users can deploy their own commercial services with an NSML cluster.
In addition, NSML furnishes convenient visualization tools which assist the
users in analyzing their work. To verify the usefulness and accessibility of
NSML, we performed some experiments with common examples. Furthermore, we
examined the collaborative advantages of NSML through three competitions with
real-world use cases.
TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments
Deep neural networks (DNNs) have become core computation components within
low latency Function as a Service (FaaS) prediction pipelines, including image
recognition, object detection, natural language processing, speech synthesis,
and personalized recommendation pipelines. Cloud computing, as the de-facto
backbone of modern computing infrastructure for both enterprise and consumer
applications, has to be able to handle user-defined pipelines of diverse DNN
inference workloads while maintaining isolation and latency guarantees, and
minimizing resource waste. The current solution for guaranteeing isolation
within FaaS is suboptimal -- suffering from "cold start" latency. A major cause
of such inefficiency is the need to move large amounts of model data within and
across servers. We propose TrIMS as a novel solution to address these issues.
Our proposed solution consists of a persistent model store across the GPU, CPU,
local storage, and cloud storage hierarchy, an efficient resource management
layer that provides isolation, and a succinct set of application APIs and
container technologies for easy and transparent integration with FaaS, Deep
Learning (DL) frameworks, and user code. We demonstrate our solution by
interfacing TrIMS with the Apache MXNet framework and demonstrate up to 24x
speedup in latency for image classification models and up to 210x speedup for
large models. We achieve up to 8x system throughput improvement.
Comment: In Proceedings CLOUD 201
Criteria and Approaches for Virtualization on Modern FPGAs
Modern field programmable gate arrays (FPGAs) can produce high performance in
a wide range of applications, and their computational capacity is becoming
abundant in personal computers. Regardless of this fact, FPGA virtualization is
an emerging research field. Nowadays, challenges in this research area come
not only from technical difficulties but also from the ambiguous standards of
virtualization. In this paper, we introduce novel criteria of FPGA
virtualization and discuss several approaches to accomplish those criteria. In
addition, we present and describe in detail the specific FPGA virtualization
architecture that we developed on Intel Arria 10 FPGA. We evaluate our solution
with a combination of applications and microbenchmarks. The result shows that
our virtualization solution can provide a full abstraction of the FPGA device from
both user and developer perspectives while maintaining reasonable performance
compared to a native FPGA.
Vulnerable GPU Memory Management: Towards Recovering Raw Data from GPU
In this paper, we show that the security threats that come with the existing GPU
memory management strategy are overlooked, which opens a back door for
adversaries to freely break memory isolation: these threats enable adversaries
without any privilege on a computer to directly recover the raw memory data left by
previous processes. More importantly, such attacks work not
only on normal multi-user operating systems, but also on cloud computing platforms.
To demonstrate the seriousness of such attacks, we recovered original data
directly from GPU memory residues left by exited commodity applications,
including Google Chrome, Adobe Reader, GIMP, and Matlab. The results show that,
because of the vulnerable memory management strategy, the commodity applications in
our experiments are all affected.
Towards Predictable Real-Time Performance on Multi-Core Platforms
Cyber-physical systems (CPS) integrate sensing, computing, communication and
actuation capabilities to monitor and control operations in the physical
environment. A key requirement of such systems is the need to provide
predictable real-time performance: the timing correctness of the system should
be analyzable at design time with a quantitative metric and guaranteed at
runtime with high assurance. This requirement of predictability is particularly
important for safety-critical domains such as automobiles, aerospace, defense,
manufacturing and medical devices.
The work in this dissertation focuses on the challenges arising from the use
of modern multi-core platforms in CPS. Even as of today, multi-core platforms
are rarely used in safety-critical applications primarily due to the temporal
interference caused by contention on various resources shared among processor
cores, such as caches, memory buses, and I/O devices. Such interference is hard
to predict and can significantly increase task execution time, e.g., up to 12x
on commodity quad-core platforms. To address the problem of ensuring timing
predictability on multi-core platforms, we develop novel analytical and systems
techniques in this dissertation. Our proposed techniques theoretically bound
temporal interference that tasks may suffer from when accessing shared
resources. Our techniques also involve software primitives and algorithms for
real-time operating systems and hypervisors, which significantly reduce the
degree of the temporal interference. Specifically, we tackle the issues of
cache and memory contention, locking and synchronization, interrupt handling,
and access control for computational accelerators such as GPGPUs, all of which
are crucial to achieving predictable real-time performance on a modern
multi-core platform.
Comment: This is the Ph.D. dissertation of the author
Virtualization Technologies and Cloud Security: advantages, issues, and perspectives
Virtualization technologies allow multiple tenants to share physical
resources with a degree of security and isolation that cannot be guaranteed by
mere containerization. Further, virtualization allows protected transparent
introspection of Virtual Machine activity and content, thus supporting
additional control and monitoring. These features provide an explanation,
although partial, of why virtualization has been an enabler for the flourishing
of cloud services. Nevertheless, security and privacy issues are still present
in virtualization technology and hence in Cloud platforms. As an example, even
hardware virtualization protection/isolation is far from being perfect and
uncircumventable, as recently discovered vulnerabilities show. The objective of
this paper is to shed light on current virtualization technology and its
evolution from a security point of view, with a focus on its
applications to the Cloud setting.
Comment: arXiv admin note: text overlap with arXiv:1702.07521 by other authors
A Distributed Multi-agent Market Place for HPC Compute Cycle Resource Trading
Computer simulation is finding a role in an increasing number of scientific
disciplines, concomitant with the rise in available computing power. Realizing
this inevitably requires access to computational power beyond the desktop,
making use of clusters, supercomputers, data repositories, networks and
distributed aggregations of these resources. Accessing one such resource
entails a number of usability and security problems; when multiple
geographically distributed resources are involved, the difficulty is
compounded.
This presents the user with the problem of how to gain access to suitable
resources to run their workloads as they need them. In this paper we present
our solution to this problem: a resource trading platform that allows users to
purchase access to resources within a distributed e-infrastructure. We present
the implementation of this Resource Allocation Market Place as a distributed
multi-agent system, and show how it provides a highly flexible, efficient tool
to schedule workflows across high performance computing resources.