ScaRR: Scalable Runtime Remote Attestation for Complex Systems
The introduction of remote attestation (RA) schemes has allowed academia and
industry to enhance the security of their systems. The commercial products
currently available enable only the validation of static properties, such as
application fingerprints, and do not handle runtime properties, such as
control-flow correctness. This limitation has pushed researchers towards new
approaches, collectively called runtime RA. However, these mainly target
embedded devices, which share very few features with complex systems such as
virtual machines in a cloud. Naively deploying runtime RA schemes designed for
embedded devices on complex systems raises scalability problems, such as the
representation of complex control-flows or a slow verification phase.
In this work, we present ScaRR: the first Scalable Runtime Remote attestation
scheme for complex systems. Thanks to its novel control-flow model, ScaRR
enables the deployment of runtime RA on any application, regardless of its
complexity, while also achieving good performance. We implemented ScaRR and
tested it on the SPEC CPU 2017 benchmark suite. We show that ScaRR can validate
on average 2M control-flow events per second, clearly outperforming existing
solutions.
Comment: 14 pages
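As a toy illustration of the idea behind runtime RA (hypothetical, not ScaRR's actual control-flow model or measurement format), the sketch below has a verifier check reported control-flow transitions against a precomputed set of valid edges, flagging any run that takes an edge outside the legitimate control-flow graph:

```python
# Hypothetical set of valid control-flow edges, e.g. derived from offline
# static analysis of the attested application's control-flow graph.
VALID_EDGES = {("entry", "check"), ("check", "ok"), ("check", "fail"), ("ok", "exit")}

def attest(events):
    """Return True iff every reported (src, dst) transition is a valid edge."""
    return all(edge in VALID_EDGES for edge in events)

# A benign run follows only known edges; a hijacked run introduces a new one.
benign = [("entry", "check"), ("check", "ok"), ("ok", "exit")]
tampered = [("entry", "check"), ("check", "hijacked")]
print(attest(benign))    # valid control flow
print(attest(tampered))  # control-flow violation detected
```

The scalability challenge the abstract mentions comes from the fact that, for a real complex system, the set of valid edges and the stream of events are both enormous, which is what ScaRR's control-flow model is designed to handle.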
ShareJIT: JIT Code Cache Sharing across Processes and Its Practical Implementation
Just-in-time (JIT) compilation coupled with code caching is widely used to
improve performance in dynamic programming language implementations. These code
caches, along with the associated profiling data for the hot code, however,
consume significant amounts of memory. Furthermore, they incur extra JIT
compilation time for their creation. On Android, the current standard JIT
compiler and its code caches are not shared among processes---that is, the
runtime system maintains a private code cache, and its associated data, for
each runtime process. However, applications running on the same platform tend
to share many libraries. Sharing cached code across multiple
applications and multiple processes can lead to a reduction in memory use. It
can directly reduce compile time. It can also reduce the cumulative amount of
time spent interpreting code. All three of these effects can improve actual
runtime performance.
In this paper, we describe ShareJIT, a global code cache for JITs that can
share code across multiple applications and multiple processes. We implemented
ShareJIT in the context of the Android Runtime (ART), a widely used,
state-of-the-art system. To increase sharing, our implementation constrains the
amount of context that the JIT compiler can use to optimize the code. This
exposes a fundamental tradeoff: increased specialization to a single process'
context decreases the extent to which the compiled code can be shared. In
ShareJIT, we limit some optimizations to increase shareability. To evaluate
ShareJIT, we tested 8 popular Android apps in a total of 30 experiments.
ShareJIT improved overall performance by 9% on average, while decreasing memory
consumption by 16% on average and JIT compilation time by 37% on average.
Comment: OOPSLA 201
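The core tradeoff described above can be sketched in a few lines (a hypothetical model, not ART's or ShareJIT's actual implementation): if the cache key deliberately excludes per-process context, a second process can reuse code the first process already compiled, at the cost of less specialized code:

```python
# Hypothetical global code cache shared across processes: keyed only by
# (library, method), with no per-process context, so entries are reusable.
shared_cache = {}
compile_count = 0  # tracks how many real compilations were paid for

def get_compiled(library, method):
    global compile_count
    key = (library, method)  # context-insensitive key enables sharing
    if key not in shared_cache:
        compile_count += 1   # only the first requester pays the JIT cost
        shared_cache[key] = f"native code for {library}.{method}"
    return shared_cache[key]

a = get_compiled("libfoo", "hot_loop")  # "process A" triggers compilation
b = get_compiled("libfoo", "hot_loop")  # "process B" reuses the cached code
```

Including per-process profiling context in the key would allow more aggressive specialization but would make `a` and `b` distinct entries, which is exactly the shareability-versus-specialization tension the paper describes.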
FastMPJ: a scalable and efficient Java message-passing library
This is a post-peer-review, pre-copyedit version of an article published in Cluster Computing. The final authenticated version is available online at: https://doi.org/10.1007/s10586-014-0345-4
[Abstract] The performance and scalability of communications are key for high performance computing (HPC) applications in the current multi-core era. Despite the significant benefits (e.g., productivity, portability, multithreading) of Java for parallel programming, its poor communications support has hindered its adoption in the HPC community. This paper presents FastMPJ, an efficient message-passing in Java (MPJ) library, boosting Java for HPC by: (1) providing high-performance shared memory communications using Java threads; (2) taking full advantage of high-speed cluster networks (e.g., InfiniBand) to provide low-latency and high-bandwidth communications; (3) including a scalable collective library with topology-aware primitives, automatically selected at runtime; (4) avoiding Java data buffering overheads through zero-copy protocols; and (5) implementing the most widely extended MPI-like Java bindings for highly productive development. The comprehensive performance evaluation on representative testbeds (InfiniBand, 10 Gigabit Ethernet, Myrinet, and shared memory systems) has shown that FastMPJ communication primitives rival native MPI implementations, significantly improving the efficiency and scalability of Java HPC parallel applications.
Ministerio de Educación y Ciencia; AP2010-4348
Ministerio de Economía y Competitividad; TIN2010-16735
Xunta de Galicia; CN2012/211
Xunta de Galicia; GRC2013/05
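To make the MPI-like point-to-point model concrete, here is a minimal sketch of blocking send/receive between two threads over a shared-memory channel (illustrative only; these are not FastMPJ's bindings, and a queue stands in for its zero-copy transports):

```python
# Minimal MPI-like point-to-point messaging between threads, with a queue
# playing the role of the shared-memory communication channel.
import queue
import threading

channel = queue.Queue()

def send(msg):
    channel.put(msg)          # deliver the message to the channel

def recv():
    return channel.get()      # block until a message arrives

# "Rank 0" sends an array; "rank 1" (the main thread) receives it.
sender = threading.Thread(target=lambda: send([1.0, 2.0, 3.0]))
sender.start()
data = recv()                 # blocks until the sender's message arrives
sender.join()
```

A real MPJ library additionally layers rank addressing, message tags, collectives, and network transports (e.g., InfiniBand) beneath this same send/receive abstraction.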
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
the increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the results obtained, a fine-grained classification of the different
approaches, and, finally, the influential papers in the field.
Comment: version 5.0 (updated September 2018). Preprint version of our accepted
journal article at ACM CSUR 2018 (42 pages). This survey will be updated
quarterly (send me your newly published papers to be added in subsequent
versions). History: Received November 2016; Revised August 2017; Revised
February 2018; Accepted March 2018
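The optimization-selection problem the survey covers can be illustrated with a tiny search sketch (all names and costs below are hypothetical; a learned model would replace the hand-written cost function, and real optimization spaces are far too large to enumerate):

```python
# Toy optimization-selection: score every subset of flags with a stand-in
# cost model and pick the cheapest. Flag names and costs are invented.
from itertools import combinations

FLAGS = ["inline", "unroll", "vectorize"]

def runtime(flags):
    # Hypothetical cost model: each flag helps on its own, but unrolling
    # and vectorizing together interact badly on this "program".
    cost = 100 - 10 * len(flags)
    if "unroll" in flags and "vectorize" in flags:
        cost += 25
    return cost

candidates = [set(c) for r in range(len(FLAGS) + 1)
              for c in combinations(FLAGS, r)]
best = min(candidates, key=runtime)
```

The non-additive interaction is why simple per-flag heuristics fail and why machine-learning models that predict the performance of whole flag combinations (or whole phase orders) are attractive.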
A distributed authentication architecture and protocol
Most user authentication methods rely on a single verifier stored at a central location within the information system. Such storage of sensitive information presents a single point of compromise from a security perspective. If this system is compromised, it poses a direct threat to users' digital identities, since the verifier can be extracted from the system. This paper proposes a distributed authentication environment in which there is no such single point of compromise. We propose an architecture that does not rely on a single verifier to authenticate users, but rather a distributed authentication architecture in which several authentication servers participate in authenticating a user. We consider an authentication environment in which the user authentication process is distributed among independent servers.
Each server independently performs its own authentication of the user, for example by asking the user to complete a challenge in order to prove their claim to a digital identity. The proposed architecture allows each server to use any authentication factor. We provide a security analysis of the proposed architecture and protocol, which shows they are secure against the attacks chosen in the analysis.
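The multi-server idea can be sketched as follows (a hypothetical challenge-response scheme, not the paper's actual protocol): each server holds only its own verifier for its own factor, and authentication succeeds only if the user answers every server's challenge, so compromising one server's verifier store is not enough to impersonate the user:

```python
# Hypothetical distributed challenge-response authentication: every server
# must independently accept the user's response for login to succeed.
import hashlib

def h(secret, challenge):
    """Response/verifier function: hash of secret bound to the challenge."""
    return hashlib.sha256((secret + challenge).encode()).hexdigest()

# Each server stores only the verifier for its own factor (values invented).
servers = [
    {"challenge": "nonce-1", "verifier": h("password123", "nonce-1")},  # knowledge factor
    {"challenge": "nonce-2", "verifier": h("otp-998877", "nonce-2")},   # possession factor
]

def authenticate(responses):
    """Accept only if every server's check passes."""
    return all(h(secret, srv["challenge"]) == srv["verifier"]
               for srv, secret in zip(servers, responses))
```

Because no single machine stores enough to reconstruct the user's full set of secrets, an attacker who extracts one server's verifier still fails the other server's independent check.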
Reification: A Process to Configure Java Realtime Processors
Real-time systems impose stringent requirements on both the processor and the software application. The primary concerns are speed and the predictability of execution times. In all real-time applications the developer must identify and calculate the worst-case execution times (WCET) of their software. In almost all cases, processor design complexity complicates the WCET analysis. Design features that affect this analysis include caches and instruction pipelining. With both caches and pipelining, the time taken by a particular instruction can vary depending on cache and pipeline contents. When calculating the WCET, the developer must therefore ignore the speed advantages of these enhancements and use the normal, unaccelerated instruction timings.
This investigation concerns a Java processor targeted to run within an FPGA environment (a Java soft chip) supporting Java real-time applications. The investigation focuses on a simple processor design that allows straightforward WCET analysis. The design has no cache and no instruction pipeline enhancements, yet achieves higher performance than existing designs that include them.
The investigation centers on a process that translates Java bytecodes and folds the translated codes into a modified Harvard Micro Controller (HMC). The modifications include better alignment with the application code and take advantage of the FPGA's parallel capability. A prototyped ontology is used in which the top-level categories defined by Sowa are expanded to support the process.
The proposed HMC and process are used to produce the investigation's results. Performance testing using the Sobel edge detection algorithm compares the results with the only other Java processor claiming real-time capabilities.
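The payoff of a cache-less, pipeline-free design for WCET analysis can be shown with a small sketch (instruction names, cycle counts, and the loop bound below are all hypothetical): when every instruction has a fixed, data-independent cycle cost, the WCET of a path is simply a sum, with no cache-miss or pipeline-stall cases to enumerate:

```python
# Toy WCET computation for a processor with fixed instruction timings:
# no cache and no pipeline means every instruction always costs the same.
CYCLES = {"load": 2, "add": 1, "mul": 3, "branch": 2}  # invented timings

def wcet(path, loop_bound=1):
    """WCET of a straight-line path executed loop_bound times."""
    return loop_bound * sum(CYCLES[op] for op in path)

body = ["load", "mul", "add", "branch"]   # 2 + 3 + 1 + 2 = 8 cycles/iteration
worst = wcet(body, loop_bound=64)         # 8 * 64 = 512 cycles
```

On a processor with caches or pipelining, each instruction's cost would instead be an interval depending on cache and pipeline state, which is precisely the analysis burden the proposed design avoids.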