59 research outputs found

    PIM-Enclave: Bringing Confidential Computation Inside Memory

    Data-intensive workloads and confidential computing are two prominent research directions shaping the future of cloud computing. Computer architectures are evolving to better accommodate computation over large data. Protecting the computation of sensitive data is also an imperative yet challenging objective; processor-supported secure enclaves serve as the key element in confidential computing in the cloud. However, side-channel attacks are threatening their security boundaries. Current processor architectures spend a considerable portion of their cycles moving data. Near-data computation is a promising approach that minimizes redundant data movement by placing computation inside storage. In this paper, we present a novel design for Processing-In-Memory (PIM) as a data-intensive workload accelerator for confidential computing. Based on our observation that moving computation closer to memory can achieve efficiency of computation and confidentiality of the processed information simultaneously, we study the advantages of confidential computing inside memory. We then explain our security model and programming model developed for PIM-based computation offloading. We combine our findings into a software-hardware co-design, which we call PIM-Enclave. Our design illustrates the advantages of PIM-based confidential computing acceleration. Our evaluation shows PIM-Enclave can provide side-channel-resistant secure computation offloading and run data-intensive applications with negligible performance overhead compared to a baseline PIM model.

    An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

    Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load, but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V, achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op. Comment: 15 pages, 12 figures, accepted for publication in the IEEE Transactions on Circuits and Systems - I: Regular Papers.

    SoK: Confidential Quartet - Comparison of Platforms for Virtualization-Based Confidential Computing

    Confidential computing allows processing sensitive workloads in securely isolated spaces. Following earlier adoption of process-based approaches to isolation, vendors are now enabling hardware and firmware support for virtualization-based confidential computing on several server platforms. Due to variations in the technology stack, threat model, implementation and functionality, the available solutions offer somewhat different capabilities, trade-offs and security guarantees. In this paper we review, compare and contextualize four virtualization-based confidential computing technologies for enterprise server platforms: AMD SEV, ARM CCA, IBM PEF and Intel TDX.

    Plundervolt: Software-based fault injection attacks against Intel SGX


    Sistema de arquivos criptográfico com aceleração especulativa em GPU

    Advisor: Dr. Wagner Machado Nunan Zola. Co-advisor: Dr. Luis Carlos Erpen de Bona. Dissertation (master's) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 03/09/2018. Includes references. Area of concentration: Computer Science. Abstract: Information can be valuable in many situations, including when it is stored in digital format. It is common to find storage systems that try to comply with some basic information security properties. For that purpose, they use cryptographic techniques, mainly symmetric encryption. The use of cryptography may require significant amounts of processing on CPUs. As a result, cryptographic storage systems can become large consumers of processing resources and be impacted by other applications when competing for CPU usage. An alternative to CPU processing is parallel processing using multiple graphics processing units (GPUs). One of the most widely used symmetric encryption algorithms is AES, and its acceleration on GPUs has been extensively studied. One of these studies resulted in the creation of WAES and its library, WAESlib, which allows execution of AES encryption functions on GPUs. The operation of WAES is based on the CTR mode of operation, which consists of rules that guide how encryption algorithms should be applied in order to keep the encryption process safe. The main advantages of CTR mode are that it is fully parallelizable and that it allows the initial step of the encryption process to be carried out in advance, generating encryption masks. In order to benefit from these features, this work explores the use of CTR mode, applying it in the implementation of a cryptographic filesystem named EncFS++. The WAESlib library was used to aid in the implementation process.
In the first part of this work, CTR mode was implemented, and issues related to an essential component of CTR mode known as the nonce were addressed. Techniques were created and implemented to deal with the generation, storage and management of nonces. In the second part, techniques related to the management of the encryption contexts were created and implemented, aiming to perform speculative encryption efficiently by generating the encryption masks on the GPU sufficiently far in advance. Performance analyses were conducted measuring throughput, execution time and latency in the implementation resulting from the first part, as well as throughput and CPU utilization in the implementation of the second. The results of the first part demonstrate that the simple use of CTR mode brings significant performance gains, mainly in write operations. The results of the second part demonstrate that the gains can be extended, including in sequential read operations, with the speculative generation of encryption masks and their processing on the GPU.
In environments that do not use processors with accelerated AES cryptographic functions, the gains in throughput were quite significant and more efficient CPU utilization was obtained.
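
The abstract above describes CTR mode as fully parallelizable and as allowing encryption masks to be generated ahead of time. A minimal sketch of that idea follows, under loudly stated assumptions: it does not use WAESlib or EncFS++, it runs on the CPU only, and HMAC-SHA256 is swapped in as a stand-in for the AES block function because AES is not in the Python standard library. The names `mask_block`, `SpeculativeCTR` and `lookahead` are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
from collections import deque

BLOCK = 32  # bytes per mask block (SHA-256 digest size; real AES-CTR uses 16)


def mask_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Stand-in for AES_k(nonce || counter). This is NOT AES (assumption)."""
    msg = nonce + counter.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()


class SpeculativeCTR:
    """Counter-mode stream that keeps `lookahead` masks precomputed."""

    def __init__(self, key: bytes, nonce: bytes, lookahead: int = 4):
        self.key = key
        self.nonce = nonce
        self.lookahead = lookahead
        self.next_counter = 0
        self.masks = deque()
        self.refill()

    def refill(self) -> None:
        # Masks depend only on key, nonce and counter, never on the data,
        # so they can be produced before any plaintext arrives.
        while len(self.masks) < self.lookahead:
            self.masks.append(
                mask_block(self.key, self.nonce, self.next_counter)
            )
            self.next_counter += 1

    def process(self, data: bytes) -> bytes:
        """XOR data with queued masks; the same call encrypts and decrypts."""
        out = bytearray()
        for start in range(0, len(data), BLOCK):
            if not self.masks:
                self.refill()
            mask = self.masks.popleft()
            chunk = data[start:start + BLOCK]
            out.extend(b ^ m for b, m in zip(chunk, mask))
        self.refill()  # top the queue up again for the next request
        return bytes(out)


if __name__ == "__main__":
    key, nonce = b"k" * 16, b"n" * 8
    ciphertext = SpeculativeCTR(key, nonce).process(b"example file contents")
    recovered = SpeculativeCTR(key, nonce).process(ciphertext)
    print(recovered)
```

Because each mask is independent of the others, the loop in `refill` could in principle be distributed across many workers; here it is sequential to keep the sketch short. A (key, nonce) pair must never be reused for different data, which is why the abstract treats nonce generation and storage as a separate problem.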

    TrustZone based attestation in secure runtime verification for embedded systems

    Integrated master's dissertation in Informatics Engineering. ARM TrustZone is a security extension present on ARM processors, which equip a large share of embedded systems, and it enables the development of hardware-based Trusted Execution Environments (TEEs). This mechanism allows the critical components of an application to execute in an environment that guarantees data confidentiality and code integrity, even when a malicious agent is installed on the device. This project aims to harness TrustZone in the context of a secure runtime verification framework for embedded devices. Specifically, it aims to use existing components, namely ARM Trusted Firmware, which is responsible for the secure boot process of ARM devices, to implement an attestation mechanism that provides proof of secure computation to remote parties. This work has been partially supported by the Portuguese Foundation for Science and Technology (FCT), project REASSURE (PTDC/EEI-COM/28550/2017), co-financed by the European Regional Development Fund (FEDER), through the North Regional Operational Program (NORTE 2020).

    Authentication and Data Protection under Strong Adversarial Model

    We are interested in addressing a series of existing and plausible threats to cybersecurity where the adversary possesses unconventional attack capabilities. In our exploration, such unconventionality includes, but is not limited to, crowd-sourcing, physical/juridical coercion, substantial (but bounded) computational resources, malicious insiders, etc. Our studies show that unconventional adversaries can be counteracted with a special anchor of trust and/or a paradigm shift on a case-specific basis. Complementing cryptography, hardware security primitives are the last defense in the face of co-located (physical) and privileged (software) adversaries, hence serving as the special trust anchor. Examples of hardware primitives are architecture-shipped features (e.g., with CPU or chipsets), security chips or tokens, and certain features on peripheral/storage devices. We also propose changes of paradigm in conjunction with hardware primitives, such as containing attacks instead of counteracting, pretended compliance, and immunization instead of detection/prevention. In this thesis, we demonstrate how our philosophy is applied to cope with several exemplary scenarios of unconventional threats, and elaborate on the prototype systems we have implemented. Specifically, Gracewipe is designed for stealthy and verifiable secure deletion of on-disk user secrets under coercion; Hypnoguard protects in-RAM data when a computer is in sleep (ACPI S3) in case of various memory/guessing attacks; Uvauth mitigates large-scale human-assisted guessing attacks by receiving all login attempts in an indistinguishable manner, i.e., correct credentials in a legitimate session and incorrect ones in a plausible fake session; Inuksuk is proposed to protect user files against ransomware or other authorized tampering. It augments the hardware access control on self-encrypting drives with trusted execution to achieve data immunization.
We have also extended the Gracewipe scenario to a network-based enterprise environment, aiming to address slightly different threats, e.g., malicious insiders. We believe the high-level methodology of these research topics can contribute to advancing security research under strong adversarial assumptions, and to promoting software-hardware orchestration in protecting execution integrity therein.

    Trusted SoC Realization for Remote Dynamic IP Integration

    Nowadays, field-programmable gate arrays (FPGAs) offer enormous computing power and flexibility. They are also often integrated on a single chip with embedded multicore processors, DSP engines, and memory controllers, which makes them suitable for large and complex applications. At the same time, advances in high-level synthesis and the availability of standardized interfaces (such as the Advanced eXtensible Interface 4) have led design houses to develop specialized and novel functionality. All of this has created demand for outsourcing development or licensing FPGA IPs (intellectual property). A pay-per-use IP licensing model in which these IPs are protected from all market participants benefits the IP developers. Moreover, developers of FPGA systems are typically small to medium-sized enterprises, which can benefit from such a licensing model in terms of time to market and cost per unit. In academia and industry there are several IP licensing models and protection solutions that can be deployed, but they suffer from numerous security problems. In some cases, the proposed security measures impose unnecessary resource overhead and restrictions on system developers, i.e., they cannot use essential features of their device. In addition, they neglect two functional challenges: floorplanning the IP on the programmable logic (PL) and generating the IP's end product (the bitstream) independently of the overall design. This thesis proposes a pay-per-use licensing scheme and realizes it using a security framework (SFW) to address all of these challenges. The presented scheme is pragmatic, less restrictive for system developers, and offers security against IP theft.
Furthermore, measures are taken to protect the system from an IP that contains malicious circuitry. The security framework comprises a trusted operating system, a rich operating system, several supporting components (e.g., TrustZone logic and decryption circuits resistant to side-channel attacks (SCA)), and software components, e.g., for bitstream analysis. A device running the SFW can be regarded as a trusted device that can communicate directly with a repository or an IP core developer to acquire IPs in encrypted form. Decryption and authentication of the IP take place on the device, which reduces the attack surface and makes it less vulnerable to IP theft. In addition, plaintext IPs are stored in protected memory of the trusted operating system. The plaintext IP is then analyzed and configured on the programmable logic only if it is authentic and contains no malicious circuitry. The bitstream analysis functionality and the SFW subcomponents enable partitioning of the PL resources into secure and non-secure resources, i.e., extending the concept of the trusted execution environment (TEE) to the PL. This is the first work to extend the TEE concept to the programmable logic. The SCA-resistant decryption circuit mentioned above is an implementation of the Advanced Encryption Standard modified to resist electromagnetic and power-consumption leakage. The protected design has two countermeasures: the first is based on a large number of different implementation variants and on varying target positions during configuration, while the second uses only different implementation variants. These countermeasures are also scalable at run time.
The evaluation also considers the effects of this scalability on area requirements and security strength. Furthermore, the previously mentioned functional challenge of IP floorplanning is addressed by proposing a fine-grained automatic floorplanner that is based on mixed-integer linear programming and supports current FPGA generations with larger and more complex building blocks. The floorplanner maps a set of IPs onto the FPGA by creating precise reconfigurable regions, thereby maximizing the remaining resources available for the overall design. The second functional challenge is that existing tools offer no native functionality for generating IPs in a standalone environment. This challenge is addressed by proposing an independent IP generation approach. Market participants can use this approach to generate the IPs of a design independently of the overall design, without compromising the IPs' compatibility with the overall design.