Automatic Sharing Classification and Timely Push for Cache-coherent Systems
This paper proposes and evaluates Sharing/Timing Adaptive Push (STAP), a dynamic scheme for preemptively sending data from producers to consumers to minimize critical-path communication latency. STAP uses small hardware buffers to dynamically detect sharing patterns and timing requirements. The scheme applies to both intra-node and inter-socket directory-based shared memory networks. We integrate STAP into a MOESI cache-coherence protocol using heuristics to detect different data-sharing patterns, including broadcast, producer/consumer, and migratory-data sharing. Using 12 benchmarks from the PARSEC and SPLASH-2 suites in three different configurations, we show that our scheme significantly reduces communication latency in NUMA systems and achieves an average of 10% performance improvement (up to 46%), with at most 2% on-chip storage overhead. When combined with existing prefetch schemes, STAP either outperforms prefetching or combines with prefetching for improved performance (up to 15% extra) in most cases.
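The sharing-pattern heuristics can be sketched in miniature. The function below is a hypothetical, software-level illustration of how a per-cache-line access history might be classified into the patterns the abstract names; the rules and thresholds are invented for illustration and are not taken from STAP's actual hardware design.

```python
def classify_line(history):
    """Classify a cache line's sharing pattern from its access history.

    history: list of (node_id, op) tuples, op in {'R', 'W'}.
    Thresholds are illustrative, not from the paper.
    """
    writers = {n for n, op in history if op == 'W'}
    readers = {n for n, op in history if op == 'R'}
    if len(writers) > 1 and writers == readers:
        return 'migratory'          # ownership moves from node to node
    if len(writers) == 1 and len(readers) >= 4:
        return 'broadcast'          # one producer, many consumers
    if len(writers) == 1 and 1 <= len(readers) < 4:
        return 'producer/consumer'  # one stable producer, few consumers
    return 'unclassified'
```

A real implementation would track this state in small hardware tables and use the classification to decide which consumers to push updated data to.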
ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
Vision Transformers (ViTs) have achieved state-of-the-art performance on
various vision tasks. However, ViTs' self-attention module is still arguably a
major bottleneck, limiting their achievable hardware efficiency. Meanwhile,
existing accelerators dedicated to NLP Transformers are not optimal for ViTs.
This is because there is a large difference between ViTs and NLP Transformers:
ViTs have a relatively fixed number of input tokens, whose attention maps can
be pruned by up to 90% even with fixed sparse patterns; while NLP Transformers
need to handle input sequences of varying numbers of tokens and rely on
on-the-fly predictions of dynamic sparse attention patterns for each input to
achieve a decent sparsity (e.g., >=50%). To this end, we propose a dedicated
algorithm and accelerator co-design framework dubbed ViTCoD for accelerating
ViTs. Specifically, on the algorithm level, ViTCoD prunes and polarizes the
attention maps to have either denser or sparser fixed patterns for regularizing
two levels of workloads without hurting the accuracy, largely reducing the
attention computations while leaving room for alleviating the remaining
dominant data movements; on top of that, we further integrate a lightweight and
learnable auto-encoder module to enable trading the dominant high-cost data
movements for lower-cost computations. On the hardware level, we develop a
dedicated accelerator to simultaneously coordinate the enforced denser/sparser
workloads and encoder/decoder engines for boosted hardware utilization.
Extensive experiments and ablation studies validate that ViTCoD largely reduces
the dominant data movement costs, achieving speedups of up to 235.3x, 142.9x,
86.0x, 10.1x, and 6.8x over general computing platforms CPUs, EdgeGPUs, GPUs,
and prior-art Transformer accelerators SpAtten and Sanger under an attention
sparsity of 90%, respectively.
Comment: Accepted to HPCA 202
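At its simplest, the prune-and-polarize step sorts attention-map rows into a denser and a sparser workload so each can be handled by a dedicated engine. The sketch below is an illustrative software analogue under an invented density threshold; ViTCoD's actual pruning is learned and accuracy-aware.

```python
def polarize(mask, dense_threshold=0.5):
    """Split fixed attention-mask rows into denser/sparser workloads.

    mask: list of rows, each a list of 0/1 keep-flags for one query token.
    Returns (dense_rows, sparse_rows) as lists of row indices.
    The threshold is illustrative, not from the paper.
    """
    dense, sparse = [], []
    for i, row in enumerate(mask):
        density = sum(row) / len(row)
        (dense if density >= dense_threshold else sparse).append(i)
    return dense, sparse
```

Regularizing the workload this way is what lets the accelerator keep both its denser and sparser engines busy instead of stalling on irregular sparsity.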
Automatically Securing Permission-Based Software by Reducing the Attack Surface: An Application to Android
A common security architecture, called the permission-based security model
(used e.g. in Android and Blackberry), entails intrinsic risks. For instance,
applications can be granted more permissions than they actually need, which we
call a "permission gap". Malware can leverage these unused permissions to
achieve its malicious goals, for instance through code injection. In this
paper, we present an approach to detecting permission gaps using static
analysis. Our prototype implementation in the context of Android shows that the
static analysis must take into account a significant amount of
platform-specific knowledge. Using our tool on two datasets of Android
applications, we found that a non-negligible fraction of them suffer from
permission gaps, i.e., they do not use all the permissions they declare.
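The core of the analysis reduces to a set difference between declared permissions and the permissions actually reachable from the app's API calls. The API-to-permission map below is a toy stand-in for the platform-specific knowledge the abstract mentions; the real Android mapping is far larger and was itself a major part of the work.

```python
# Toy API-to-permission map; the real Android mapping covers thousands of
# API entry points and is the "platform-specific knowledge" the paper needs.
API_PERMISSIONS = {
    'LocationManager.getLastKnownLocation': {'ACCESS_FINE_LOCATION'},
    'Camera.open': {'CAMERA'},
    'SmsManager.sendTextMessage': {'SEND_SMS'},
}

def permission_gap(declared, called_apis):
    """Return permissions that are declared but never required by any call."""
    used = set().union(*(API_PERMISSIONS.get(a, set()) for a in called_apis))
    return declared - used
```

In a real static analysis, `called_apis` would come from a reachability analysis over the app's bytecode rather than a literal list.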
Electronic Medical Record (EMR) Informatics Security
Medical records, once archived on paper and stored in filing cabinets, are now housed in electronic data repositories. While converting medical information to electronic form yields enormous benefits, it also raises new privacy and security concerns. Because medical information is now accessed, stored, processed, and transmitted across multiple organizations, medical informatics security has become essential. In this paper, we examine the evolution of electronic medical records, review relevant legislation, and discuss issues of privacy and security as they relate to medical information.
QoS-aware mechanisms for improving cost-efficiency of datacenters
Warehouse Scale Computers (WSCs) promise high cost-efficiency by amortizing power, cooling, and management overheads. WSCs today host a large variety of jobs falling into two broad categories of performance requirements: latency-critical (LC) and best-effort (BE). Ideally, to fully utilize all hardware resources, WSC operators could simply fill all the nodes with computing jobs. Unfortunately, because colocated jobs contend for shared resources, systems with high loads often experience performance degradation, which negatively impacts the Quality of Service (QoS) for LC jobs. In fact, service providers usually over-provision resources to avoid any interference with LC jobs, leading to significant resource inefficiencies. In this dissertation, I explore opportunities across different system-abstraction layers to improve the cost-efficiency of datacenters by increasing resource utilization of WSCs with little or no impact on the performance of LC jobs. The dissertation has three main components. First, I explore opportunities to improve the throughput of multicore systems by reducing the performance variation of LC jobs. The main insight is that by reshaping the latency distribution curve, the performance headroom of LC jobs can be effectively converted into improved BE throughput. I develop, implement, and evaluate a runtime system that achieves this goal with existing hardware, leveraging the cache-partitioning, per-core frequency-scaling, and thread-masking features of server processors. Evaluation results show that the proposed solution enables 30% higher system throughput compared to solutions proposed in prior work while maintaining at least as good QoS for LC jobs. Second, I study resource contention in near-future heterogeneous memory architectures (HMA). This study is motivated by recent developments in non-volatile memory (NVM) technologies, which enable higher storage density at the cost of lower performance.
To understand the performance and QoS impact of HMAs, I design and implement a performance emulator in the Linux kernel that runs unmodified workloads with high accuracy, low overhead, and complete transparency. I further propose and evaluate multiple data- and resource-management QoS mechanisms, such as locality-aware page admission, occupancy management, and write buffer jailing. Third, I focus on accelerated machine learning (ML) systems. By profiling the performance of production workloads and accelerators, I show that accelerated ML tasks are highly sensitive to main memory interference due to fine-grained interaction between CPU and accelerator tasks. As a result, memory resource contention can significantly decrease the performance and efficiency gains of accelerators. I propose a runtime system that leverages existing hardware capabilities and show 17% higher system efficiency compared to previous approaches. This study further exposes opportunities for future processor architectures.
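The first component's idea of converting LC latency headroom into BE throughput can be sketched as a small feedback loop: grow the best-effort resource share while the latency-critical tail latency stays under its QoS target, and back off on violations. The controller and its thresholds below are illustrative, not the dissertation's actual runtime system.

```python
def adjust_be_share(be_share, lc_p99, qos_target, step=0.05):
    """One control step: resize the BE share from measured LC tail latency.

    be_share: current fraction of resources given to best-effort jobs.
    lc_p99:   measured p99 latency of the latency-critical job.
    All thresholds are illustrative.
    """
    if lc_p99 > qos_target:
        return max(0.0, be_share - 2 * step)  # QoS violation: back off hard
    if lc_p99 < 0.8 * qos_target:
        return min(1.0, be_share + step)      # ample headroom: grow BE share
    return be_share                           # near the target: hold steady
```

In the dissertation's setting, the "share" being resized would be concrete knobs such as cache ways, core frequencies, and thread placement rather than a single scalar.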
Designing the Extended Zero Trust Maturity Model: A Holistic Approach to Assessing and Improving an Organization’s Maturity Within the Technology, Processes and People Domains of Information Security
Zero Trust is an approach to security where implicit trust is removed, forcing applications, workloads, servers and users to verify themselves every time a request is made. Furthermore, Zero Trust means assuming anything can be compromised, and designing networks, identities and systems with this in mind and following the principle of least privilege. This approach to information security has been coined as the solution to the weaknesses of traditional perimeter-based information security models, and adoption is starting to increase. However, the principles of Zero Trust are only applied within the technical domain to aspects such as networks, data and identities in past research. This indicates a knowledge gap, as the principles of Zero Trust could be applied to organizational domains such as people and processes to further strengthen information security, resulting in a holistic approach. To fill this gap, we employed design science research to develop a holistic maturity model for Zero Trust maturity based on these principles: The EZTMM. We performed two systematic literature reviews on Zero Trust and Maturity Model theory respectively and collaborated closely with experts and practitioners on the operational, tactical and strategic levels of six different organizations. The resulting maturity model was anchored in prior Zero Trust and maturity model literature, as well as practitioner and expert experiences and knowledge. The EZTMM was evaluated by our respondent organizations through two rounds of interviews before being used by one respondent organization to perform a maturity assessment of their own organization as a part of our case study evaluation. Each interview round resulted in ample feedback and learning, while the case study allowed us to evaluate and improve on the model in a real-world setting. 
Our contribution is twofold: A fully functional, holistic Zero Trust maturity model with an accompanying maturity assessment spreadsheet (the artifact), and our reflections and suggestions regarding further development of the EZTMM and research on the holistic application of Zero Trust principles for improved information security
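The "never trust, always verify" and least-privilege principles described above can be illustrated with a minimal per-request authorization check; the request attributes and policy shape below are invented for illustration and do not come from the EZTMM.

```python
def authorize(request, policy):
    """Re-verify identity, device posture, and least privilege on every call.

    request: dict with 'identity', 'identity_verified', 'device_compliant',
             and 'action' keys (illustrative attribute names).
    policy:  maps each identity to the minimal set of allowed actions.
    """
    if not request.get('identity_verified'):
        return False                      # never trust: verify the identity
    if not request.get('device_compliant'):
        return False                      # assume compromise: check posture
    allowed = policy.get(request.get('identity'), set())
    return request.get('action') in allowed  # least privilege
```

The thesis's point is that the same discipline of per-request verification and minimal grants can be applied to processes and people, not only to technical requests like this one.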
Measuring and Controlling Multicore Contention in a RISC-V System-on-Chip
Multicore processors started a revolution in modern computing when they were introduced into the commercial and consumer computing space. They delivered significant gains in power, efficiency, and performance at a time when increases in processor frequency and IPC seemed to be plateauing. In safety-critical systems, however, the introduction of multicore processors has exposed difficulties in the certification process. The main obstacle to characterizing real-time multicore systems is the use of shared resources, in particular shared buses.
In this work, we provide tools to ease the characterization of shared-bus timing interference in critical and mixed-criticality systems. Specifically, we combine shared-bus arbitration policies with rate-limiting policies based on the interference inflicted on the critical core, bounding the WCET of the critical task on a multicore system while making a best effort to let the secondary cores progress as much as possible. (Abstract also provided in Spanish and Catalan.)
Andreu Cerezo, P. (2021). Measuring and Controlling Multicore Contention in a RISC-V System-on-Chip. Universitat Politècnica de València. http://hdl.handle.net/10251/173563
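The interference-based rate limiting described here can be sketched as a windowed budget on secondary-core bus accesses: once the interference inflicted on the critical core within the current window exceeds its budget, secondary requests are stalled. The class, budget, and window values below are illustrative, not the thesis's actual mechanism.

```python
class RateLimiter:
    """Windowed budget on secondary-core bus accesses (illustrative)."""

    def __init__(self, budget, window):
        self.budget = budget        # max interfering accesses per window
        self.window = window        # window length in cycles
        self.count = 0
        self.window_start = 0

    def allow(self, cycle):
        """Decide whether a secondary-core access may proceed at `cycle`."""
        if cycle - self.window_start >= self.window:
            self.window_start = cycle
            self.count = 0          # new window: refill the budget
        if self.count < self.budget:
            self.count += 1
            return True             # secondary access proceeds
        return False                # throttle to bound critical-core WCET
```

Choosing the budget per window is exactly the characterization problem the thesis targets: a tight budget gives a tight critical-task WCET bound at the cost of secondary-core progress.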