Security, Performance and Energy Trade-offs of Hardware-assisted Memory Protection Mechanisms
The deployment of large-scale distributed systems, e.g., publish-subscribe
platforms, that operate over sensitive data using the infrastructure of public
cloud providers is nowadays heavily hindered by the growing lack of trust
toward cloud operators. Although purely software-based solutions exist to
protect the confidentiality of data and the processing itself, such as
homomorphic encryption schemes, their performance is far from practical under
real-world workloads.
This practical experience report describes the performance trade-offs of two
novel hardware-assisted memory protection mechanisms currently available on the
market to tackle this problem, namely AMD SEV and Intel SGX.
Specifically, we implement a publish/subscribe use-case and evaluate the
impact of these memory protection mechanisms on performance. This paper
reports on the experience gained while building this system, in particular
when having to cope with the technical limitations imposed by SEV and SGX.
By means of micro- and macro-benchmarks, we exhibit several trade-offs that
provide valuable insights in terms of latency, throughput, processing time
and energy requirements.
Comment: European Commission Project: LEGaTO - Low Energy Toolset for
Heterogeneous Computing (EC-H2020-780681)
Speculative Segmented Sum for Sparse Matrix-Vector Multiplication on Heterogeneous Processors
Sparse matrix-vector multiplication (SpMV) is a central building block for
scientific software and graph applications. Recently, heterogeneous processors
composed of different types of cores attracted much attention because of their
flexible core configuration and high energy efficiency. In this paper, we
propose a compressed sparse row (CSR) format based SpMV algorithm utilizing
both types of cores in a CPU-GPU heterogeneous processor. We first
speculatively execute segmented sum operations on the GPU part of a
heterogeneous processor and generate possibly incorrect results. The CPU part
of the same chip is then triggered to rearrange the predicted partial sums into
a correct result vector. On three heterogeneous processors from Intel, AMD and
NVIDIA, using 20 sparse matrices as a benchmark suite, the experimental results
show that our method obtains significant performance improvements over the best
existing CSR-based SpMV algorithms. The source code of this work is
downloadable at https://github.com/bhSPARSE/Benchmark_SpMV_using_CSR
Comment: 22 pages, 8 figures, Published at Parallel Computing (PARCO)
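The segmented-sum building block of the approach above can be sketched as
follows. This is a simplified sequential illustration: in the paper, the GPU
computes the segmented sums speculatively in fixed-size tiles and the CPU then
rearranges the partial sums per row, whereas here both steps are merged for
clarity, and all function names are our own.

```python
import numpy as np

def spmv_segmented_sum(row_ptr, col_idx, vals, x):
    """CSR SpMV computed via a segmented sum over element-wise products.

    A hypothetical single-pass sketch of the technique; the paper's version
    splits the speculative segmented sum (GPU) from the rearrangement (CPU).
    """
    products = vals * x[col_idx]             # one product per nonzero
    # Segment heads: positions in the nonzero stream where a new row starts.
    heads = np.zeros(len(vals), dtype=bool)
    nonempty = row_ptr[:-1] < row_ptr[1:]    # rows with at least one nonzero
    heads[row_ptr[:-1][nonempty]] = True
    seg_id = np.cumsum(heads) - 1            # segment index of each nonzero
    partial = np.bincount(seg_id, weights=products)
    y = np.zeros(len(row_ptr) - 1)
    y[nonempty] = partial                    # scatter segment sums to rows
    return y
```

The segment-head flags are what make the sum "segmented": each row's nonzeros
form one segment, so empty rows simply contribute no segment at all.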
Multi-aspect, robust, and memory exclusive guest os fingerprinting
Precise fingerprinting of an operating system (OS) is critical to many security and forensics applications in the cloud, such as virtual machine (VM) introspection, penetration testing, guest OS administration, kernel dump analysis, and memory forensics. The existing OS fingerprinting techniques primarily inspect network packets or CPU states, and they all fall short in precision and usability. As the physical memory of a VM is always present in all these applications, in this article we present OS-Sommelier+, a multi-aspect, memory-exclusive approach for precise and robust guest OS fingerprinting in the cloud. It works as follows: given a physical memory dump of a guest OS, OS-Sommelier+ first uses a code-hash-based approach, from the kernel code aspect, to determine the guest OS version. If the code hash approach fails, OS-Sommelier+ then uses a kernel-data-signature-based approach, from the kernel data aspect, to determine the version. We have implemented a prototype system and tested it with a number of Linux kernels. Our evaluation results show that the code hash approach is faster but can only fingerprint known kernels, while the data signature approach complements it and can fingerprint even unknown kernels.
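The code-hash aspect described above can be sketched as follows: hash each
page of the region believed to hold kernel code in the memory dump, then look
the digests up in a database of known kernel versions. The page size, hash
choice, and all names here are illustrative assumptions, not OS-Sommelier+'s
actual design details.

```python
import hashlib

PAGE = 4096  # assumed page granularity for hashing

def page_hashes(kernel_code: bytes) -> set:
    """Digest each page of a (presumed) kernel code region."""
    return {hashlib.sha256(kernel_code[i:i + PAGE]).hexdigest()
            for i in range(0, len(kernel_code), PAGE)}

def fingerprint(dump_code: bytes, known: dict):
    """Return the known kernel version whose page-hash set best
    overlaps the dump, together with the overlap score in [0, 1]."""
    dump = page_hashes(dump_code)
    best, score = None, 0.0
    for version, hashes in known.items():
        overlap = len(dump & hashes) / max(len(hashes), 1)
        if overlap > score:
            best, score = version, overlap
    return best, score
```

As the abstract notes, such a lookup is fast but inherently limited to kernels
already present in the signature database, which is why a data-signature
fallback is needed for unknown kernels.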
Programmability and Performance of Parallel ECS-based Simulation of Multi-Agent Exploration Models
While the traditional objective of parallel/distributed simulation techniques has mainly been to improve performance and make very large models tractable, more recent research trends have targeted complementary aspects, such as the "ease of programming". Along this line, a recent proposal called Event and Cross State (ECS) synchronization stands as a solution that breaks the traditional programming rules of Parallel Discrete Event Simulation (PDES) systems, where the application code processing a specific event is only allowed to access the state (namely the memory image) of the target simulation object. With ECS, the programmer is allowed to write ANSI-C event handlers capable of accessing (in either read or write mode) the state of any simulation object included in the simulation model. Correct concurrent execution of events, e.g., on top of multi-core machines, is guaranteed by ECS with no intervention by the programmer, who is in practice exposed to a sequential-style programming model where events are processed one at a time and can access the current memory image of the whole simulation model, namely the collection of the states of all involved objects. This can strongly simplify the development of specific models, e.g., by avoiding the need to pass state information across concurrent objects in the form of events. In this article we investigate both programmability and performance aspects related to developing/supporting a multi-agent exploration model on top of the ROOT-Sim PDES platform, which supports ECS.
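The sequential-style programming model that ECS exposes can be sketched as
follows, in Python rather than the ANSI-C handlers the platform actually uses.
All class and function names here are hypothetical; the point is only that a
handler for one object's event may directly touch another object's state,
something classic PDES forbids.

```python
class Agent:
    """State of one simulation object (an exploring agent)."""
    def __init__(self, cell: int):
        self.cell = cell

class Region:
    """State of another simulation object (a region of the map)."""
    def __init__(self):
        self.visited = set()

def handle_move(agent: Agent, regions: dict, target_cell: int) -> None:
    """Event handler written as if events run one at a time.

    It updates the agent *and* the target region's state directly
    (a cross-state access); under ECS the runtime, not shown here,
    would make the concurrent execution of such handlers transparent.
    """
    agent.cell = target_cell
    regions[target_cell].visited.add(id(agent))
```

Without cross-state access, the same effect would require scheduling an extra
"mark visited" event at the region object and passing the agent's identity
inside it.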
System Issues in Multi-agent Simulation of Large Crowds
Crowd simulation is a complex and challenging domain. Crowds demonstrate many complex behaviours and are consequently difficult to model for realistic simulation systems. Analyzing crowd dynamics has been an active area of research, and efforts have been made to develop models that explain crowd behaviour. In this paper we describe an agent-based simulation of crowds, built on a continuous field force model. Our simulation can handle movement of crowds over complex terrains, and we have been able to simulate scenarios such as the clogging of exits during emergency evacuations. The focus of this paper, however, is on the scalability issues of such a multi-agent crowd simulation system. We believe that scalability is an important criterion for rescue simulation systems: to realistically model a disaster scenario for a large city, the system should ideally scale up to hundreds of thousands of agents. We discuss the attempts made so far to meet this challenge and try to identify the architectural and system constraints that limit scalability. Thereafter we propose a novel technique which could be used to richly simulate huge crowds.
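A continuous force-field crowd update of the general kind described above can
be sketched as follows. This is a generic social-force-style illustration, not
the paper's exact model: the parameters, names, and the simple inverse-square
repulsion are all our own assumptions.

```python
import numpy as np

def step(pos, vel, goal, dt=0.1, k_goal=1.0, k_rep=0.5, radius=1.0):
    """Advance all agents one time step under a simple continuous
    force field: attraction toward each agent's goal plus pairwise
    repulsion from neighbours closer than `radius`.

    pos, vel, goal: (n, 2) arrays of positions, velocities, goals.
    """
    n = len(pos)
    force = k_goal * (goal - pos)              # attraction toward goals
    for i in range(n):                         # O(n^2) pairwise repulsion
        d = pos[i] - pos                       # vectors from others to agent i
        dist = np.linalg.norm(d, axis=1)
        near = (dist > 0) & (dist < radius)    # exclude self, far agents
        if near.any():
            force[i] += k_rep * (d[near] / dist[near, None] ** 2).sum(axis=0)
    vel = vel + dt * force
    pos = pos + dt * vel
    return pos, vel
```

The O(n^2) neighbour loop is exactly where the scalability concerns the
abstract raises come from: simulating hundreds of thousands of agents requires
spatial partitioning or distribution rather than this brute-force form.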
NIDS in Airgapped LANs--Does it Matter?
This paper presents an assessment of the methods and benefits of adding network intrusion detection systems (NIDS) to certain high-security, airgapped, isolated local area networks. The proposed network architecture was empirically tested via a series of simulated network attacks on a virtualized network. The results show that implementing NIDS alongside host-based measures doubled the chances of an analyst receiving a specific, appropriately severe alert compared to host-based measures alone. Further, including NIDS quadrupled the likelihood of the analyst receiving a high-severity alert in response to the simulated attack attempt. Despite a tendency to assume that networks without cross-boundary traffic do not require boundary defense measures, such measures can significantly improve the efficiency of incident response operations on these networks.
Dendritic Cells for Anomaly Detection
Artificial immune systems, more specifically the negative selection
algorithm, have previously been applied to intrusion detection. The aim of this
research is to develop an intrusion detection system based on a novel concept
in immunology, the Danger Theory. Dendritic Cells (DCs) are antigen presenting
cells, key to the activation of the human immune system; they collect signals
from the host tissue and correlate these signals with proteins known as
antigens. In algorithmic terms,
individual DCs perform multi-sensor data fusion based on time-windows. The
whole population of DCs asynchronously correlates the fused signals with a
secondary data stream. The behaviour of human DCs is abstracted to form the DC
Algorithm (DCA), which is implemented using an immune inspired framework,
libtissue. This system is used to detect context switching for a basic machine
learning dataset and to detect outgoing portscans in real-time. Experimental
results show a significant difference between an outgoing portscan and normal
traffic.
Comment: 8 pages, 10 tables, 4 figures, IEEE Congress on Evolutionary
Computation (CEC2006), Vancouver, Canada
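The multi-sensor fusion performed by an individual DC, as described above, can
be sketched as follows. The signal weights, the migration threshold, and all
names are illustrative assumptions rather than the DCA's published values.

```python
class DendriticCell:
    """One DC: fuses 'danger' and 'safe' input signals over its
    lifetime window, then classifies the antigens it sampled."""

    def __init__(self, threshold=10.0):
        self.threshold = threshold
        self.csm = 0.0       # cumulative co-stimulation: spends lifespan
        self.mature = 0.0    # accumulated evidence for a danger context
        self.semi = 0.0      # accumulated evidence for a safe context
        self.antigens = []   # data items sampled in this window

    def sample(self, antigen, danger: float, safe: float) -> None:
        """Fuse one observation: record the antigen, update signals."""
        self.antigens.append(antigen)
        self.csm += danger + safe
        self.mature += danger
        self.semi += safe

    def migrated(self) -> bool:
        """The cell migrates (closes its time window) once the
        cumulative signal exceeds its threshold."""
        return self.csm >= self.threshold

    def context(self) -> int:
        """1 = anomalous (mature context), 0 = normal (semi-mature)."""
        return 1 if self.mature > self.semi else 0
```

A population of such cells, each with a different threshold, yields the
asynchronous, overlapping time windows the abstract refers to: the same
antigen is judged by several cells whose windows closed at different moments.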