1,234 research outputs found

    GPU ํ™˜๊ฒฝ์—์„œ ๋จธ์‹ ๋Ÿฌ๋‹ ์›Œํฌ๋กœ๋“œ์˜ ํšจ์œจ์ ์ธ ์‹คํ–‰

    Get PDF
    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, February 2023. Advisor: ์ „๋ณ‘๊ณค.
    Machine learning (ML) workloads are becoming increasingly important in many types of real-world applications. We attribute this trend to the development of software systems for ML, which have facilitated the widespread adoption of heterogeneous accelerators such as GPUs. Today's ML software stack has made great improvements in terms of efficiency; however, not all use cases are well supported. In this dissertation, we study how to improve the execution efficiency of ML workloads on GPUs from a software system perspective. We identify workloads where current systems for ML are inefficient in utilizing GPUs and devise new system techniques that handle those workloads efficiently. We first present Nimble, an ML execution engine equipped with carefully optimized GPU scheduling. The proposed scheduling techniques improve execution efficiency by up to 22.34×. Second, we propose Orca, an inference serving system specialized for Transformer-based generative models. By incorporating new scheduling and batching techniques, Orca significantly outperforms state-of-the-art systems, delivering a 36.9× throughput improvement at the same level of latency. The last topic of this dissertation is WindTunnel, a framework that translates classical ML pipelines into neural networks, providing GPU training capabilities for classical ML workloads. WindTunnel also allows joint training of pipeline components via backpropagation, resulting in improved accuracy over the original pipeline and neural network baselines.
    Table of Contents:
    Chapter 1 Introduction: Motivation; Dissertation Overview; Previous Publications; Roadmap
    Chapter 2 Background: ML Workloads; The GPU Execution Model; GPU Scheduling in ML Frameworks; Engine Scheduling in Inference Servers; Inference Procedure of Generative Models
    Chapter 3 Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning: Introduction; Motivation; System Design (Ahead-of-time (AoT) Scheduling; Stream Assignment Algorithm); Evaluation (Inference Latency; Impact of Multi-stream Execution; Training Throughput); Summary
    Chapter 4 Orca: A Distributed Serving System for Transformer-Based Generative Models: Introduction; Challenges and Proposed Solutions; Orca System Design (Distributed Architecture; Scheduling Algorithm); Implementation; Evaluation (Engine Microbenchmark; End-to-end Performance); Summary
    Chapter 5 WindTunnel: Towards Differentiable ML Pipelines Beyond a Single Model: Introduction; Pipeline Translation (Translating Arithmetic Operators; Translating Algorithmic Operators: GBDT; Translating Algorithmic Operators for Categorical Features; Fine-Tuning); Implementation; Experiments (Experimental Setup; Overall Performance; Ablation Study); Summary
    Chapter 6 Related Work
    Chapter 7 Conclusion
    Bibliography
    Appendix A Nimble: Proofs on the Stream Assignment Algorithm (Theorems 1-3; Time Complexity Analysis); Evaluation Results on Various GPUs; Evaluation Results on Different Training Batch Sizes
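    The abstract above attributes part of Nimble's speedup to multi-stream GPU execution driven by its stream assignment algorithm. As a minimal, hedged sketch of the general multi-stream idea only (not Nimble's actual scheduler), the PyTorch snippet below runs two independent branches of a model on separate CUDA streams so their kernels can overlap; the function and module names are illustrative.

```python
import torch

# Illustrative sketch only (not Nimble's implementation): run two independent
# branches of a model on separate CUDA streams so their kernels can overlap.
def run_branches_in_parallel(branch_a, branch_b, x):
    if not torch.cuda.is_available():
        # CPU fallback: no stream-level parallelism, just sequential execution.
        return branch_a(x) + branch_b(x)
    cur = torch.cuda.current_stream()
    stream_a, stream_b = torch.cuda.Stream(), torch.cuda.Stream()
    stream_a.wait_stream(cur)  # ensure inputs/weights created on the default stream are ready
    stream_b.wait_stream(cur)
    with torch.cuda.stream(stream_a):
        out_a = branch_a(x)
    with torch.cuda.stream(stream_b):
        out_b = branch_b(x)
    torch.cuda.synchronize()  # join both streams before combining the outputs
    return out_a + out_b

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(8, 256, device=device)
branch_a = torch.nn.Linear(256, 256).to(device)
branch_b = torch.nn.Linear(256, 256).to(device)
print(run_branches_in_parallel(branch_a, branch_b, x).shape)
```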

    Fundamentals

    Get PDF
    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, through summarization and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to their resource requirements and how to enhance their scalability on diverse computing architectures, ranging from embedded systems to large computing clusters.

    NLP-Based Techniques for Cyber Threat Intelligence

    Full text link
    In the digital era, threat actors employ sophisticated techniques for which, often, digital traces in the form of textual data are available. Cyber Threat Intelligence (CTI) encompasses the solutions for data collection, processing, and analysis that are useful for understanding a threat actor's targets and attack behavior. CTI is assuming an increasingly crucial role in identifying and mitigating threats and enabling proactive defense strategies. In this context, natural language processing (NLP), a branch of artificial intelligence, has emerged as a powerful tool for enhancing threat intelligence capabilities. This survey paper provides a comprehensive overview of NLP-based techniques applied in the context of threat intelligence. It begins by describing the foundational definitions and principles of CTI as a major tool for safeguarding digital assets. It then undertakes a thorough examination of NLP-based techniques for CTI data crawling from Web sources, CTI data analysis, Relation Extraction from cybersecurity data, CTI sharing and collaboration, and security threats of CTI. Finally, the challenges and limitations of NLP in threat intelligence are exhaustively examined, including data quality issues and ethical considerations. This survey draws a complete framework and serves as a valuable resource for security professionals and researchers seeking to understand the state-of-the-art NLP-based threat intelligence techniques and their potential impact on cybersecurity.
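    As a small, hedged illustration of the kind of CTI data analysis the survey discusses (invented here, not taken from the survey), the sketch below extracts two common indicators of compromise, IPv4 addresses and CVE identifiers, from report text with regular expressions; production CTI pipelines typically rely on trained NLP models such as named-entity recognizers rather than patterns alone.

```python
import re

# Hypothetical, simplified extractor of indicators of compromise (IoCs)
# from unstructured threat-report text.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,7}\b", re.IGNORECASE)

def extract_iocs(report_text):
    return {
        "ipv4": sorted(set(IPV4_RE.findall(report_text))),
        "cve": sorted({c.upper() for c in CVE_RE.findall(report_text)}),
    }

sample = ("The actor exploited CVE-2021-44228 and staged payloads "
          "on 203.0.113.7 before pivoting to 198.51.100.23.")
print(extract_iocs(sample))
```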

    Cyber Security and Critical Infrastructures 2nd Volume

    Get PDF
    The second volume of the book contains the manuscripts that were accepted for publication in the MDPI Special Topic "Cyber Security and Critical Infrastructure" after a rigorous peer-review process. Authors from academia, government and industry contributed their innovative solutions, consistent with the interdisciplinary nature of cybersecurity. The book contains 16 articles: an editorial that explains the current challenges, innovative solutions, and real-world experiences involving critical infrastructure, and 15 original papers that present state-of-the-art solutions to attacks on critical systems.

    IDEAS-1997-2021-Final-Programs

    Get PDF
    This document records the final program for each of the 26 meetings of the International Database Engineering and Applications Symposium (IDEAS) from 1997 through 2021. These meetings were organized in various locations on three continents. Most of the papers published during these years are in the digital libraries of IEEE (1997-2007) or ACM (2008-2021).

    Technologies and Applications for Big Data Value

    Get PDF
    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part, "Technologies and Methods", contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part, "Processes and Applications", details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses with leading researchers to harness the value of data to benefit society, business, science, and industry. The book is of interest to two primary audiences: first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI; second, practitioners and industry experts engaged in data-driven systems, software design and deployment projects who are interested in employing these advanced methods to address real-world problems.

    Edge Computing for Internet of Things

    Get PDF
    The Internet of Things (IoT) is becoming an established technology, with devices being deployed in homes, workplaces, and public areas at an increasingly rapid rate. IoT devices are the core technology of smart homes, smart cities, and intelligent transport systems, and promise to optimise travel, reduce energy usage, and improve quality of life. As IoT becomes more prevalent, the problem of how to manage the vast volumes and wide variety of data generated, along with their erratic generation patterns, is becoming increasingly clear and challenging. This Special Issue focuses on solving this problem through the use of edge computing. Edge computing offers a solution to managing IoT data by processing it close to the location where it is generated. It allows computation to be performed locally, reducing the volume of data that needs to be transmitted to remote data centres and Cloud storage, and it allows decisions to be made locally without having to wait for Cloud servers to respond.
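    As a rough, hedged sketch of that data-reduction idea (not drawn from any paper in this Special Issue), the snippet below has a hypothetical edge node summarise a window of raw sensor readings locally and make a local alert decision, so only a small summary record needs to reach the cloud; all names and thresholds are invented.

```python
from statistics import mean

# Hypothetical edge-node aggregator: instead of forwarding every raw reading
# to the cloud, summarise each window locally and send only the summary.
def summarise_window(readings, alert_threshold=80.0):
    return {
        "count": len(readings),
        "mean": round(mean(readings), 2),
        "max": max(readings),
        "alert": max(readings) > alert_threshold,  # local decision, no cloud round-trip
    }

raw_window = [21.4, 22.0, 21.9, 85.3, 22.1]   # e.g. temperature samples from one sensor
payload = summarise_window(raw_window)
print(payload)  # one small record is uploaded instead of five raw samples
```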

    Modeling Time-Series and Spatial Data for Recommendations and Other Applications

    Full text link
    With the research directions described in this thesis, we seek to address the critical challenges in designing recommender systems that can understand the dynamics of continuous-time event sequences (CTES). We follow a ground-up approach: first, we address the problems that may arise due to the poor quality of CTES data being fed into a recommender system; later, we handle the task of designing accurate recommender systems. To improve the quality of the CTES data, we address the fundamental problem of overcoming missing events in temporal sequences. Moreover, to provide accurate sequence modeling frameworks, we design solutions for points-of-interest (POI) recommendation, i.e., models that can handle the spatial mobility data of users across POI check-ins and recommend candidate locations for the next check-in. Lastly, we highlight that the capabilities of the proposed models have applications beyond recommender systems, and we extend them to design solutions for large-scale CTES retrieval and human activity prediction. A significant part of this thesis uses the idea of modeling the underlying distribution of CTES via neural marked temporal point processes (MTPP). Traditional MTPP models are stochastic processes that use a fixed formulation to capture the generative mechanism of a sequence of discrete events localized in continuous time. In contrast, neural MTPP models combine the underlying ideas from the point-process literature with modern deep learning architectures. The ability of deep-learning models to act as accurate function approximators has led to a significant gain in the predictive prowess of neural MTPP models. In this thesis, we utilize and present several neural network-based enhancements to current MTPP frameworks for the aforementioned real-world applications.
    Comment: Ph.D. Thesis (2022).
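    For readers unfamiliar with the formalism behind the abstract, the standard log-likelihood of a marked temporal point process over an observation window [0, T] is sketched below; this is the textbook form, not a result specific to the thesis, with \lambda^*_k(t) denoting the conditional intensity of mark k given the event history.

```latex
% Standard MTPP log-likelihood for an event sequence {(t_i, k_i)}_{i=1..N} on [0, T]:
% each observed event contributes the log-intensity of its mark at its time,
% and the integral penalises intensity placed where no event occurred.
\log p\bigl(\{(t_i, k_i)\}_{i=1}^{N}\bigr)
  = \sum_{i=1}^{N} \log \lambda^{*}_{k_i}(t_i)
  \;-\; \int_{0}^{T} \sum_{k} \lambda^{*}_{k}(t)\,\mathrm{d}t
```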
