54 research outputs found
APHID: Anomaly Processor in Hardware for Intrusion Detection
The Anomaly Processor in Hardware for Intrusion Detection (APHID) is a step forward in the field of co-processing intrusion detection mechanism. By using small, fast hardware primitives APHID relieves the production CPU from the burden of security processing. These primitives are tightly coupled to the CPU giving them access to critical state information such as the current instruction(s) in execution, the next instruction, registers, and processor state information. By monitoring these hardware elements, APHID is able to determine when an anomalous action occurs within one clock cycle. Upon detection, APHID can force the processor into a corrective state, or a halted state, depending on the required response. APHID primitives also harden the production system against attacks such as Distribute Denial of Service attack and buffer overflow attacks. APHID is designed to be fast and agile, with the ability to create multiple monitors that switch in and out of monitoring with the context switches of the production processor to highly focused coverage over multiple devices and sections of code
Neural network computing using on-chip accelerators
The use of neural networks, machine learning, or artificial intelligence, in its broadest and most controversial sense, has been a tumultuous journey involving three distinct hype cycles and a history dating back to the 1960s. Resurgent, enthusiastic interest in machine learning and its applications bolsters the case for machine learning as a fundamental computational kernel. Furthermore, researchers have demonstrated that machine learning can be utilized as an auxiliary component of applications to enhance or enable new types of computation such as approximate computing or automatic parallelization. In our view, machine learning becomes not the underlying application, but a ubiquitous component of applications. This view necessitates a different approach towards the deployment of machine learning computation that spans not only hardware design of accelerator architectures, but also user and supervisor software to enable the safe, simultaneous use of machine learning accelerator resources.
In this dissertation, we propose a multi-transaction model of neural network computation to meet the needs of future machine learning applications. We demonstrate that this model, encompassing a decoupled backend accelerator for inference and learning from hardware and software for managing neural network transactions can be achieved with low overhead and integrated with a modern RISC-V microprocessor. Our extensions span user and supervisor software and data structures and, coupled with our hardware, enable multiple transactions from different address spaces to execute simultaneously, yet safely. Together, our system demonstrates the utility of a multi-transaction model to increase energy efficiency improvements and improve overall accelerator throughput for machine learning applications
Next generation satellite orbital control system
Selection of the correct software architecture is vital for building successful software-intensive systems. Its realization requires important decisions about the organization of the system and by and large permits or prevents a system\u27s acceptance and quality attributes such as performance and reliability. The correct architecture is essential for program success while the wrong one is a formula for disaster.
In this investigation, potential software architectures for the Next Generation Satellite Orbital Control System (NG-SOCS) are developed from compiled system specifications and a review of existing technologies. From the developed architectures, the recommended architecture is selected based on real-world considerations that face corporations today, including maximizing code reuse, mitigation of project risks and the alignment of the solution with business objectives
On designing hardware accelerator-based systems: interfaces, taxes and benefits
Complementary Metal Oxide Semiconductor (CMOS) Technology scaling has slowed down. One promising approach to sustain the historic performance improvement of computing systems is to utilize hardware accelerators. Today, many commercial computing systems integrate one or more accelerators, with each accelerator optimized to efficiently execute specific tasks.
Over the years, there has been a substantial amount of research on designing hardware accelerators for machine learning (ML) training and inference tasks. Hardware accelerators are also widely employed to accelerate data privacy and security algorithms. In particular, there is currently a growing interest in the use of hardware accelerators for accelerating homomorphic encryption (HE) based privacy-preserving computing.
While the use of hardware accelerators is promising, a realistic end-to-end evaluation of an accelerator when integrated into the full system often reveals that the benefits of an accelerator are not always as expected. Simply assessing the performance of the accelerated portion of an application, such as the inference kernel in ML applications, during performance analysis can be misleading. When designing an accelerator-based system, it is critical to evaluate the system as a whole and account for all the accelerator taxes.
In the first part of our research, we highlight the need for a holistic, end-to-end analysis of workloads using ML and HE applications. Our evaluation of an ML application for a database management system (DBMS) shows that the benefits of offloading ML inference to accelerators depend on several factors, including backend hardware, model complexity, data size, and the level of integration between the ML inference pipeline and the DBMS. We also found that the end-to-end performance improvement is bottlenecked by data retrieval and pre-processing, as well as inference. Additionally, our evaluation of an HE video encryption application shows that while HE client-side operations, i.e., message-to- ciphertext and ciphertext-to-message conversion operations, are bottlenecked by number theoretic transform (NTT) operations, accelerating NTT in hardware alone is not sufficient to get enough application throughput (frame rate per second) improvement. We need to address all bottlenecks such as error sampling, encryption, and decryption in message-to-ciphertext and ciphertext-to-message conversion pipeline.
In the second part of our research, we address the lack of a scalable evaluation infrastructure for building and evaluating accelerator-based systems. To solve this problem, we propose a robust and scalable software-hardware framework for accelerator evaluation, which uses an open-source RISC-V based System-on-Chip (SoC) design called BlackParrot. This framework can be utilized by accelerator designers and system architects to perform an end-to-end performance analysis of coherent and non-coherent accelerators while carefully accounting for the interaction between the accelerator and the rest of the system.
In the third part of our research, we present RISE, which is a full RISC-V SoC designed to efficiently perform message-to-ciphertext and ciphertext-to-message conversion operations. RISE comprises of a BlackParrot core and an efficient custom-designed accelerator tailored to accelerate end-to-end message-to-ciphertext and ciphertext-to-message conversion operations. Our RTL-based evaluation demonstrates that RISE improves the throughput of the video encryption application by 10x-27x for different frame resolutions
High Availability and Scalability of Mainframe Environments using System z and z/OS as example
Mainframe computers are the backbone of industrial and commercial computing, hosting the most relevant and critical data of businesses. One of the most important mainframe environments is IBM System z with the operating system z/OS. This book introduces mainframe technology of System z and z/OS with respect to high availability and scalability. It highlights their presence on different levels within the hardware and software stack to satisfy the needs for large IT organizations
Secure and safe virtualization-based framework for embedded systems development
Tese de Doutoramento - Programa Doutoral em Engenharia Electrónica e de Computadores (PDEEC)The Internet of Things (IoT) is here. Billions of smart, connected devices are proliferating
at rapid pace in our key infrastructures, generating, processing and exchanging
vast amounts of security-critical and privacy-sensitive data. This strong connectivity
of IoT environments demands for a holistic, end-to-end security approach, addressing
security and privacy risks across different abstraction levels: device, communications,
cloud, and lifecycle managment.
Security at the device level is being misconstrued as the addition of features in a
late stage of the system development. Several software-based approaches such as
microkernels, and virtualization have been used, but it is proven, per se, they fail in
providing the desired security level. As a step towards the correct operation of these
devices, it is imperative to extend them with new security-oriented technologies
which guarantee security from the outset.
This thesis aims to conceive and design a novel security and safety architecture
for virtualized systems by 1) evaluating which technologies are key enablers for
scalable and secure virtualization, 2) designing and implementing a fully-featured
virtualization environment providing hardware isolation 3) investigating which "hard
entities" can extend virtualization to guarantee the security requirements dictated by
confidentiality, integrity, and availability, and 4) simplifying system configurability
and integration through a design ecosystem supported by a domain-specific language.
The developed artefacts demonstrate: 1) why ARM TrustZone is nowadays a reference
technology for security, 2) how TrustZone can be adequately exploited for
virtualization in different use-cases, 3) why the secure boot process, trusted execution
environment and other hardware trust anchors are essential to establish and
guarantee a complete root and chain of trust, and 4) how a domain-specific language
enables easy design, integration and customization of a secure virtualized
system assisted by the above mentioned building blocks.Vivemos na era da Internet das Coisas (IoT). Biliões de dispositivos inteligentes
começam a proliferar nas nossas infraestruturas chave, levando ao processamento
de avolumadas quantidades de dados privados e sensíveis. Esta forte conectividade
inerente ao conceito IoT necessita de uma abordagem holística, em que os riscos
de privacidade e segurança são abordados nas diferentes camadas de abstração:
dispositivo, comunicações, nuvem e ciclo de vida.
A segurança ao nível dos dispositivos tem sido erradamente assegurada pela inclusão
de funcionalidades numa fase tardia do desenvolvimento. Têm sido utilizadas diversas
abordagens de software, incluindo a virtualização, mas está provado que estas
não conseguem garantir o nível de segurança desejado. De forma a garantir a correta
operação dos dispositivos, é fundamental complementar os mesmos com novas tecnologias
que promovem a segurança desde os primeiros estágios de desenvolvimento.
Esta tese propõe, assim, o desenvolvimento de uma solução arquitetural inovadora
para sistemas virtualizados seguros, contemplando 1) a avaliação de tecnologias
chave que promovam tal realização, 2) a implementação de uma solução de virtualização
garantindo isolamento por hardware, 3) a identificação de componentes
que integrados permitirão complementar a virtualização para garantir os requisitos
de segurança, e 4) a simplificação do processo de configuração e integração da solução
através de um ecossistema suportado por uma linguagem de domínio específico.
Os artefactos desenvolvidos demonstram: 1) o porquê da tecnologia ARM TrustZone
ser uma tecnologia de referência para a segurança, 2) a efetividade desta tecnologia
quando utilizada em diferentes domínios, 3) o porquê do processo seguro de inicialização,
juntamente com um ambiente de execução seguro e outros componentes de
hardware, serem essenciais para estabelecer uma cadeia de confiança, e 4) a viabilidade
em utilizar uma linguagem de um domínio específico para configurar e integrar
um ambiente virtualizado suportado pelos artefactos supramencionados
ACOTES project: Advanced compiler technologies for embedded streaming
Streaming applications are built of data-driven, computational components, consuming and producing unbounded data streams. Streaming oriented systems have become dominant in a wide range of domains, including embedded applications and DSPs. However, programming efficiently for streaming architectures is a challenging task, having to carefully partition the computation and map it to processes in a way that best matches the underlying streaming architecture, taking into account the distributed resources (memory, processing, real-time requirements) and communication overheads (processing and delay). These challenges have led to a number of suggested solutions, whose goal is to improve the programmer’s productivity in developing applications that process massive streams of data on programmable, parallel embedded architectures. StreamIt is one such example. Another more recent approach is that developed by the ACOTES project (Advanced Compiler Technologies for Embedded Streaming). The ACOTES approach for streaming applications consists of compiler-assisted mapping of streaming tasks to highly parallel systems in order to maximize cost-effectiveness, both in terms of energy and in terms of design effort. The analysis and transformation techniques automate large parts of the partitioning and mapping process, based on the properties of the application domain, on the quantitative information about the target systems, and on programmer directives. This paper presents the outcomes of the ACOTES project, a 3-year collaborative work of industrial (NXP, ST, IBM, Silicon Hive, NOKIA) and academic (UPC, INRIA, MINES ParisTech) partners, and advocates the use of Advanced Compiler Technologies that we developed to support Embedded Streaming.Peer ReviewedPostprint (published version
Virtualization services: scalable methods for virtualizing multicore systems
Multi-core technology is bringing parallel processing capabilities
from servers to laptops and even handheld devices. At the same time,
platform support for system virtualization is making it easier to
consolidate server and client resources, when and as needed by
applications. This consolidation is achieved by dynamically mapping
the virtual machines on which applications run to underlying
physical machines and their processing cores. Low cost processor and
I/O virtualization methods efficiently scaled to different numbers of
processing cores and I/O devices are key enablers of such consolidation.
This dissertation develops and evaluates new methods for scaling
virtualization functionality to multi-core and future many-core systems.
Specifically, it re-architects virtualization functionality to improve
scalability and better exploit multi-core system resources. Results
from this work include a self-virtualized I/O abstraction, which
virtualizes I/O so as to flexibly use different platforms' processing
and I/O resources. Flexibility affords improved performance and resource
usage and most importantly, better scalability than that offered by
current I/O virtualization solutions. Further, by describing system virtualization as a
service provided to virtual machines and the underlying computing platform,
this service can be enhanced to provide new and innovative functionality.
For example, a virtual device may provide obfuscated data to guest operating
systems to maintain data privacy; it could mask differences in device
APIs or properties to deal with heterogeneous underlying resources; or it
could control access to data based on the ``trust' properties of the
guest VM.
This thesis demonstrates that extended virtualization services are
superior to existing operating system or user-level implementations
of such functionality, for multiple reasons. First, this solution
technique makes more efficient use of key performance-limiting resource in
multi-core systems, which are memory and I/O bandwidth. Second, this
solution technique better exploits the parallelism inherent in multi-core
architectures and exhibits good scalability properties, in
part because at the hypervisor level, there is greater control in precisely
which and how resources are used to realize extended virtualization services.
Improved control over resource usage makes it possible to provide
value-added functionalities for both guest VMs and the platform.
Specific instances of virtualization services described in this thesis are the
network virtualization service that exploits heterogeneous processing cores,
a storage virtualization service that provides location transparent access
to block devices by extending
the functionality provided by network virtualization service, a multimedia
virtualization service that allows efficient media device sharing based on semantic
information, and an object-based storage service with enhanced access
control.Ph.D.Committee Chair: Schwan, Karsten; Committee Member: Ahamad, Mustaq; Committee Member: Fujimoto, Richard; Committee Member: Gavrilovska, Ada; Committee Member: Owen, Henry; Committee Member: Xenidis, Jim
- …