84 research outputs found

Edge-Cloud Polarization and Collaboration: A Comprehensive Survey for AI

Author: Chu Yunfei
Ji Luo
Jia Kunyang
Kuang Kun
Ma Jianxin
Shen Tao
Tan Ziqi
Wang Feng
Wu Anpeng
Wu Chao
Wu Fei
Yang Hongxia
Yao Jiangchao
Yao Yang
Zhang Fengda
Zhang Jianwei
Zhang Shengyu
Zhou Jingren
Publication date: 23/05/2022

Influenced by the great success of deep learning via cloud computing and the rapid development of edge chips, research in artificial intelligence (AI) has shifted to both computing paradigms, i.e., cloud computing and edge computing. In recent years, we have witnessed significant progress in developing more advanced AI models on cloud servers that surpass traditional deep learning models owing to model innovations (e.g., Transformers, Pretrained families), the explosion of training data, and soaring computing capabilities. However, edge computing, and especially edge and cloud collaborative computing, is still in its infancy and has yet to demonstrate comparable success, owing to resource-constrained IoT scenarios in which very few algorithms have been deployed. In this survey, we conduct a systematic review of both cloud and edge AI. Specifically, we are the first to set up the collaborative learning mechanism for cloud and edge modeling, with a thorough review of the architectures that enable such a mechanism. We also discuss the potential of, and practical experience with, some ongoing advanced edge AI topics, including pretraining models, graph neural networks, and reinforcement learning. Finally, we discuss the promising directions and challenges in this field. Comment: 20 pages, Transactions on Knowledge and Data Engineering

arXiv.org e-Print Archive

Towards Scalable, Private and Practical Deep Learning

Author: Zawad Syed
Publication date: 01/02/2023

Deep Learning (DL) models have drastically improved the performance of Artificial Intelligence (AI) tasks such as image recognition, word prediction, and translation, among many others, on which traditional Machine Learning (ML) models fall short. However, DL models are costly to design, train, and deploy due to their computing and memory demands. Designing DL models usually requires extensive expertise and significant manual tuning effort. Even with the latest accelerators such as the Graphics Processing Unit (GPU) and Tensor Processing Unit (TPU), training DL models can take a prohibitively long time; therefore, training large DL models in a distributed manner is the norm. Massive amounts of data are available thanks to the prevalence of mobile and internet-of-things (IoT) devices. However, regulations such as HIPAA and GDPR limit the access and transmission of personal data to protect security and privacy. Therefore, enabling DL model training in a decentralized but private fashion is urgent and critical. Deploying trained DL models in a real-world environment usually requires meeting Quality of Service (QoS) standards, which makes the adaptability of DL models an important yet challenging matter. In this dissertation, we aim to address the above challenges and take a step towards scalable, private, and practical deep learning. To simplify DL model design, we propose Efficient Progressive Neural-Architecture Search (EPNAS) and FedCust to automatically design model architectures and tune hyperparameters, respectively. To provide efficient and robust distributed training while preserving privacy, we design LEASGD, TiFL, and HDFL. We further study the security aspect of distributed learning by focusing on how data heterogeneity affects backdoor attacks and how to mitigate such threats. Finally, we use super resolution (SR) as an example application to explore model adaptability for cross-platform deployment and dynamic runtime environments. Specifically, we propose the DySR and AdaSR frameworks, which enable SR models to meet QoS by dynamically adapting to available resources instantly and seamlessly without excessive memory overheads.

University of Nevada, Reno ScholarWorks Repository

HIGH-THROUGHPUT AREA-EFFICIENT INTEGER TRANSFORMS FOR VIDEO CODING

Author: DO THI THU TRANG
Publication date: 25/01/2013

Ph.D., Doctor of Philosophy

ScholarBank@NUS

Full Stack Optimization of Transformer Inference: a Survey

Author: Dinh Grace
Genc Hasan
Gholami Amir
Hooper Coleman
Huang Qijing
Kang Minwoo
Keutzer Kurt
Kim Sehoon
Mahoney Michael W.
Shao Yakun Sophia
Wattanawong Thanakul
Yan Ruohan
Publication date: 27/02/2023

Recent advances in state-of-the-art DNN architecture design have been moving toward Transformer models. These models achieve superior accuracy across a wide range of applications. This trend has been consistent over the past several years since Transformer models were originally introduced. However, the amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate, and this has made their deployment in latency-sensitive applications challenging. As such, there has been an increased focus on making Transformer models more efficient, with methods that range from changing the architecture design, all the way to developing dedicated domain-specific accelerators. In this work, we survey different approaches for efficient Transformer inference, including: (i) analysis and profiling of the bottlenecks in existing Transformer architectures and their similarities and differences with previous convolutional models; (ii) implications of Transformer architecture on hardware, including the impact of non-linear operations such as Layer Normalization, Softmax, and GELU, as well as linear operations, on hardware design; (iii) approaches for optimizing a fixed Transformer architecture; (iv) challenges in finding the right mapping and scheduling of operations for Transformer models; and (v) approaches for optimizing Transformer models by adapting the architecture using neural architecture search. Finally, we perform a case study by applying the surveyed optimizations on Gemmini, the open-source, full-stack DNN accelerator generator, and we show how each of these approaches can yield improvements, compared to previous benchmark results on Gemmini. Among other things, we find that a full-stack co-design approach with the aforementioned methods can result in up to 88.7x speedup with a minimal performance degradation for Transformer inference

arXiv.org e-Print Archive

Big Data and Large-scale Data Analytics: Efficiency of Sustainable Scalability and Security of Centralized Clouds and Edge Deployment Architectures

Author: Awaysheh Feras Mahmoud Naji
Publication date: 01/01/2020

One of the significant shifts of the next-generation computing technologies will certainly be in the development of Big Data (BD) deployment architectures. Apache Hadoop, the BD landmark, evolved as a widely deployed BD operating system. Its new features include a federation structure and many associated frameworks, which provide Hadoop 3.x with the maturity to serve different markets. This dissertation addresses two leading issues involved in exploiting BD and large-scale data analytics using the Hadoop platform, namely: (i) scalability, which directly affects system performance and overall throughput, addressed using portable Docker containers; and (ii) security, which spreads the adoption of data protection practices among practitioners, addressed using access controls. An Enhanced MapReduce Environment (EME), the OPportunistic and Elastic Resource Allocation (OPERA) scheduler, the BD Federation Access Broker (BDFAB), and a Secure Intelligent Transportation System (SITS) with a multi-tier architecture for data streaming to the cloud are the main contributions of this thesis.

Repositorio Institucional da Universidade de Santiago de Compostela

Aerospace Medicine and Biology: A continuing bibliography with indexes

This bibliography lists 253 reports, articles, and other documents introduced into the NASA scientific and technical information system in October 1975

NASA Technical Reports Server

Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis

Author: Ben-Nun Tal
Hoefler Torsten
Publication date: 15/09/2018

Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. We present trends in DNN architectures and the resulting implications on parallelization strategies. We then review and model the different types of concurrency in DNNs: from the single operator, through parallelism in network inference and training, to distributed deep learning. We discuss asynchronous stochastic optimization, distributed system architectures, communication schemes, and neural architecture search. Based on those approaches, we extrapolate potential directions for parallelism in deep learning

arXiv.org e-Print Archive

Repository for Publications and Research Data

Kommunikation in der Automation : Beiträge des Jahreskolloquiums KommA 2022

Author: Jasperneite Jürgen
Jumar Ulrich
Technische Hochschule Ostwestfalen-Lippe
Publication venue: Institut für industrielle Informationstechnik - inIT der Technischen Hochschule Ostwestfalen-Lippe
Publication date: 01/01/2022

Publikationen an der Technischen Hochschule Ostwestfalen-Lippe

Air Force Institute of Technology Contributions to Air Force Research and Development, Calendar Year 1987

Author: Air Force Institute of Technology
Publication venue: AFIT Scholar
Publication date: 01/03/1987

From the introduction: The primary mission of the Air Force Institute of Technology (AFIT) is education, but research and consulting are essential integral elements in the process. This report highlights AFIT's contributions to Air Force research and development activities [in 1987].

AFIT Scholar (Air Force Institute of Technology)