
    Task Scheduling on the Cloud with Hard Constraints

    Scheduling Bag-of-Tasks (BoT) applications on the cloud can be more challenging than on grid and cluster environments, because a user may impose a budgetary constraint or a deadline on executing the BoT application in order to keep the overall execution costs low. This paper investigates task scheduling on the cloud under two hard constraints: a user-defined budget and a deadline. A heuristic algorithm is proposed and implemented to satisfy the hard constraints while executing the BoT application in a cost-effective manner. The algorithm is evaluated using four scenarios based on the trade-off between performance and the cost of using different cloud resource types. The experimental evaluation confirms the feasibility of the algorithm in satisfying the constraints. The key observation is that using multiple resource types can be a better alternative to using a single type of resource. Comment: Visionary Track of the IEEE 11th World Congress on Services (IEEE SERVICES 2015)
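
    To make the two hard constraints concrete, the following is a minimal sketch and not the heuristic proposed in the paper. It assumes hourly billing rounded up, perfectly divisible work, and made-up resource types; the function name `cheapest_mix` and every number in it are hypothetical. It searches exhaustively over small instance counts for the cheapest mix that meets both the deadline and the budget.

```python
from itertools import product
from math import ceil


def cheapest_mix(n_tasks, types, budget, deadline_h, max_per_type=10):
    """Return (cost, hours, counts) for the cheapest feasible mix, or None.

    types maps a resource-type name to (tasks_per_hour, cost_per_hour).
    Assumptions (illustrative only): work splits evenly across instances
    and every instance is billed per started hour.
    """
    names = list(types)
    best = None
    # Brute force over how many instances of each type to rent.
    for counts in product(range(max_per_type + 1), repeat=len(names)):
        rate = sum(c * types[n][0] for c, n in zip(counts, names))
        if rate == 0:
            continue  # no instances rented
        hours = n_tasks / rate
        if hours > deadline_h:
            continue  # violates the deadline constraint
        hourly_cost = sum(c * types[n][1] for c, n in zip(counts, names))
        cost = ceil(hours) * hourly_cost
        if cost > budget:
            continue  # violates the budget constraint
        if best is None or cost < best[0]:
            best = (cost, hours, dict(zip(names, counts)))
    return best


# Hypothetical resource types: (tasks per hour, price per hour).
example_types = {"small": (10, 1.0), "large": (45, 4.0)}
result = cheapest_mix(100, example_types, budget=10, deadline_h=1)
```

    With these invented numbers, the cheapest feasible plan combines both resource types, which is consistent with the paper's key observation, although a real scheduler would need a heuristic rather than exhaustive search.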

    Autotuning Apache TVM-based Scientific Applications Using Bayesian Optimization

    Apache TVM (Tensor Virtual Machine), an open source machine learning compiler framework designed to optimize computations across various hardware platforms, provides an opportunity to improve the performance of dense matrix factorizations such as LU (Lower Upper) decomposition and Cholesky decomposition on GPUs and AI (Artificial Intelligence) accelerators. In this paper, we propose a new TVM autotuning framework based on Bayesian Optimization and use the TVM tensor expression language to implement linear algebra kernels such as LU, Cholesky, and 3mm. We use these scientific computation kernels to evaluate the effectiveness of our methods on Swing, a GPU cluster at Argonne National Laboratory. We compare the proposed autotuning framework with AutoTVM, the TVM autotuning framework, using four tuners, and find that our framework outperforms AutoTVM in most cases.

    Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM

    We explore the use of the Apache TVM open source framework to automatically generate a family of algorithms that follow the approach taken by popular linear algebra libraries, such as GotoBLAS2, BLIS and OpenBLAS, in order to obtain high-performance blocked formulations of the general matrix multiplication (GEMM). In addition, we fully automate the generation process by also leveraging the Apache TVM framework to derive a complete variety of processor-specific micro-kernels for GEMM. This is in contrast with the convention in high-performance libraries, which hand-encode a single micro-kernel per architecture using assembly code. Overall, the combination of our TVM-generated blocked algorithms and micro-kernels for GEMM 1) improves portability and maintainability and, more broadly, streamlines the software life cycle; 2) provides high flexibility to easily tailor and optimize the solution to different data types, processor architectures, and matrix operand shapes, yielding performance on a par with (or even superior to, for specific matrix shapes) that of hand-tuned libraries; and 3) features a small memory footprint. Comment: 35 pages, 22 figures. Submitted to ACM TOM
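
    As a rough illustration of what a "blocked formulation" of GEMM with a separate micro-kernel looks like, here is a minimal pure-Python sketch. It is not TVM-generated code and not the authors' implementation: the names `gemm_blocked` and `micro_kernel` and the block sizes `mc`, `kc`, `nc` are assumptions chosen for readability, and the packing and register-level tiling used by real libraries are omitted.

```python
def micro_kernel(A, B, C, i0, i1, p0, p1, j0, j1):
    """Update the C[i0:i1, j0:j1] block with A[i0:i1, p0:p1] @ B[p0:p1, j0:j1]."""
    for i in range(i0, i1):
        for j in range(j0, j1):
            acc = C[i][j]
            for p in range(p0, p1):
                acc += A[i][p] * B[p][j]
            C[i][j] = acc


def gemm_blocked(A, B, C, mc=2, kc=2, nc=2):
    """Compute C += A @ B on lists of lists using three blocking loops.

    The loop order (columns of C, then the shared dimension, then rows of C)
    mirrors the outer loops commonly described for GotoBLAS-style GEMM;
    block sizes here are arbitrary illustrative values.
    """
    m, k, n = len(A), len(B), len(B[0])
    for jc in range(0, n, nc):          # block of columns of B and C
        for pc in range(0, k, kc):      # block of the shared dimension
            for ic in range(0, m, mc):  # block of rows of A and C
                micro_kernel(
                    A, B, C,
                    ic, min(ic + mc, m),
                    pc, min(pc + kc, k),
                    jc, min(jc + nc, n),
                )
    return C


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 0], [0, 1], [2, 3]]
C = gemm_blocked(A, B, [[0, 0] for _ in range(3)])
```

    In a production library the block sizes would be tuned to the cache hierarchy and the micro-kernel would be specialized per architecture, which is the part the abstract describes generating automatically.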

    On the feasibility of optimizing ML-based intrusion detection for CAN on real-world hardware platforms

    Master's thesis (Laurea Magistrale). In recent years, the automotive industry has seen a surge in connectivity, thanks to the growing use of Vehicle-to-Vehicle (V2V) technology and the adoption of the Vehicle-to-Infrastructure (V2I) communication model, information systems that enable data exchange between vehicles and road infrastructure. However, alongside the new opportunities they create, the spread of these technologies has enlarged the attack surface and, consequently, the number of potential threats. Communication between Electronic Control Units (ECUs) relies on the Controller Area Network (CAN), a highly reliable and economical network protocol that is nonetheless vulnerable to attacks because it entirely lacks appropriate protective mechanisms. It is therefore necessary to strengthen vehicles' cyber-security systems so that intrusions can be detected. One of the most common security solutions is the Intrusion Detection System (IDS), which must detect attacks in real time while keeping energy consumption moderate, costs low, and hardware requirements within limited computational capabilities. This work evaluates different optimization techniques on various hardware platforms in order to identify their strengths and weaknesses in speeding up the detection time of CAN IDSs. The thesis targets CANnolo, one of the most complete state-of-the-art IDSs.
    In the automotive context, to the best of my knowledge, I am the first to carry out such an evaluation, covering hardware platforms of different grades (CPUs, GPUs and FPGAs) and optimization techniques such as quantization (using the PyTorch quantization library and Vitis AI) and graph-level optimization (using Apache TVM).