208 research outputs found

    Intelligent systems for efficiency and security

    Get PDF
    As computing becomes ubiquitous and personalized, resources like energy, storage and time are becoming increasingly scarce and, at the same time, computing systems must deliver in multiple dimensions, such as high performance, quality of service, reliability, security and low power. Building such computers is hard, particularly when the operating environment is becoming more dynamic, and systems are becoming heterogeneous and distributed. Unfortunately, computers today manage resources with many ad hoc heuristics that are suboptimal, unsafe, and cannot be composed across the computer’s subsystems. Continuing this approach has severe consequences: underperforming systems, resource waste, information loss, and even life endangerment. This dissertation research develops computing systems which, through intelligent adaptation, deliver efficiency along multiple dimensions. The key idea is to manage computers with principled methods from formal control. It is with these methods that the multiple subsystems of a computer sense their environment and configure themselves to meet system-wide goals. To achieve the goal of intelligent systems, this dissertation makes a series of contributions, each building on the previous. First, it introduces the use of formal MIMO (Multiple Input Multiple Output) control for processors, to simultaneously optimize many goals like performance, power, and temperature. Second, it develops the Yukta control system, which uses coordinated formal controllers in different layers of the stack (hardware and operating system). Third, it uses robust control to develop a fast, globally coordinated and decentralized control framework called Tangram, for heterogeneous computers. Finally, it presents Maya, a defense against power side-channel attacks that uses formal control to reshape the power dissipated by a computer, confusing the attacker. The ideas in the dissertation have been demonstrated successfully with several prototypes, including one built along with AMD (Advanced Micro Devices, Inc.) engineers. These designs significantly outperformed the state of the art. The research in this dissertation brought formal control closer to computer architecture and has been well-received in both domains. It has the first application of full-fledged MIMO control for processors, the first use of robust control in computer systems, and the first application of formal control for side-channel defense. It makes a significant stride towards intelligent systems that are efficient, secure and reliable

    Embedded Machine Learning: Emphasis on Hardware Accelerators and Approximate Computing for Tactile Data Processing

    Get PDF
    Machine Learning (ML) a subset of Artificial Intelligence (AI) is driving the industrial and technological revolution of the present and future. We envision a world with smart devices that are able to mimic human behavior (sense, process, and act) and perform tasks that at one time we thought could only be carried out by humans. The vision is to achieve such a level of intelligence with affordable, power-efficient, and fast hardware platforms. However, embedding machine learning algorithms in many application domains such as the internet of things (IoT), prostheses, robotics, and wearable devices is an ongoing challenge. A challenge that is controlled by the computational complexity of ML algorithms, the performance/availability of hardware platforms, and the application\u2019s budget (power constraint, real-time operation, etc.). In this dissertation, we focus on the design and implementation of efficient ML algorithms to handle the aforementioned challenges. First, we apply Approximate Computing Techniques (ACTs) to reduce the computational complexity of ML algorithms. Then, we design custom Hardware Accelerators to improve the performance of the implementation within a specified budget. Finally, a tactile data processing application is adopted for the validation of the proposed exact and approximate embedded machine learning accelerators. The dissertation starts with the introduction of the various ML algorithms used for tactile data processing. These algorithms are assessed in terms of their computational complexity and the available hardware platforms which could be used for implementation. Afterward, a survey on the existing approximate computing techniques and hardware accelerators design methodologies is presented. Based on the findings of the survey, an approach for applying algorithmic-level ACTs on machine learning algorithms is provided. Then three novel hardware accelerators are proposed: (1) k-Nearest Neighbor (kNN) based on a selection-based sorter, (2) Tensorial Support Vector Machine (TSVM) based on Shallow Neural Networks, and (3) Hybrid Precision Binary Convolution Neural Network (BCNN). The three accelerators offer a real-time classification with monumental reductions in the hardware resources and power consumption compared to existing implementations targeting the same tactile data processing application on FPGA. Moreover, the approximate accelerators maintain a high classification accuracy with a loss of at most 5%

    Edge Video Analytics: A Survey on Applications, Systems and Enabling Techniques

    Full text link
    Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. The basic concepts of EVA (e.g., definition, architectures) were not fully elucidated due to the rapid development of this domain. To fill these gaps, we provide a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.Comment: 31 pages, 13 figure

    Enabling AI in Future Wireless Networks: A Data Life Cycle Perspective

    Full text link
    Recent years have seen rapid deployment of mobile computing and Internet of Things (IoT) networks, which can be mostly attributed to the increasing communication and sensing capabilities of wireless systems. Big data analysis, pervasive computing, and eventually artificial intelligence (AI) are envisaged to be deployed on top of the IoT and create a new world featured by data-driven AI. In this context, a novel paradigm of merging AI and wireless communications, called Wireless AI that pushes AI frontiers to the network edge, is widely regarded as a key enabler for future intelligent network evolution. To this end, we present a comprehensive survey of the latest studies in wireless AI from the data-driven perspective. Specifically, we first propose a novel Wireless AI architecture that covers five key data-driven AI themes in wireless networks, including Sensing AI, Network Device AI, Access AI, User Device AI and Data-provenance AI. Then, for each data-driven AI theme, we present an overview on the use of AI approaches to solve the emerging data-related problems and show how AI can empower wireless network functionalities. Particularly, compared to the other related survey papers, we provide an in-depth discussion on the Wireless AI applications in various data-driven domains wherein AI proves extremely useful for wireless network design and optimization. Finally, research challenges and future visions are also discussed to spur further research in this promising area.Comment: Accepted at the IEEE Communications Surveys & Tutorials, 42 page

    並列計算アクセラレータへの効率的なアプリケーションマッピングに関する研究

    Get PDF
    長崎大学学位論文 学位記番号:博(工)甲第3号 学位授与年月日:平成26年3月20日Nagasaki University (長崎大学)課程博

    Internet of Underwater Things and Big Marine Data Analytics -- A Comprehensive Survey

    Full text link
    The Internet of Underwater Things (IoUT) is an emerging communication ecosystem developed for connecting underwater objects in maritime and underwater environments. The IoUT technology is intricately linked with intelligent boats and ships, smart shores and oceans, automatic marine transportations, positioning and navigation, underwater exploration, disaster prediction and prevention, as well as with intelligent monitoring and security. The IoUT has an influence at various scales ranging from a small scientific observatory, to a midsized harbor, and to covering global oceanic trade. The network architecture of IoUT is intrinsically heterogeneous and should be sufficiently resilient to operate in harsh environments. This creates major challenges in terms of underwater communications, whilst relying on limited energy resources. Additionally, the volume, velocity, and variety of data produced by sensors, hydrophones, and cameras in IoUT is enormous, giving rise to the concept of Big Marine Data (BMD), which has its own processing challenges. Hence, conventional data processing techniques will falter, and bespoke Machine Learning (ML) solutions have to be employed for automatically learning the specific BMD behavior and features facilitating knowledge extraction and decision support. The motivation of this paper is to comprehensively survey the IoUT, BMD, and their synthesis. It also aims for exploring the nexus of BMD with ML. We set out from underwater data collection and then discuss the family of IoUT data communication techniques with an emphasis on the state-of-the-art research challenges. We then review the suite of ML solutions suitable for BMD handling and analytics. We treat the subject deductively from an educational perspective, critically appraising the material surveyed.Comment: 54 pages, 11 figures, 19 tables, IEEE Communications Surveys & Tutorials, peer-reviewed academic journa

    New cross-layer techniques for multi-criteria scheduling in large-scale systems

    Get PDF
    The global ecosystem of information technology (IT) is in transition to a new generation of applications that require more and more intensive data acquisition, processing and storage systems. As a result of that change towards data intensive computing, there is a growing overlap between high performance computing (HPC) and Big Data techniques in applications, since many HPC applications produce large volumes of data, and Big Data needs HPC capabilities. The hypothesis of this PhD. thesis is that the potential interoperability and convergence of the HPC and Big Data systems are crucial for the future, being essential the unification of both paradigms to address a broad spectrum of research domains. For this reason, the main objective of this Phd. thesis is purposing and developing a monitoring system to allow the HPC and Big Data convergence, thanks to giving information about behaviors of applications in a system which execute both kind of them, giving information to improve scalability, data locality, and to allow adaptability to large scale computers. To achieve this goal, this work is focused on the design of resource monitoring and discovery to exploit parallelism at all levels. These collected data are disseminated to facilitate global improvements at the whole system, and, thus, avoid mismatches between layers. The result is a two-level monitoring framework (both at node and application level) with a low computational load, scalable, and that can communicate with different modules thanks to an API provided for this purpose. All data collected is disseminated to facilitate the implementation of improvements globally throughout the system, and thus avoid mismatches between layers, which combined with the techniques applied to deal with fault tolerance, makes the system robust and with high availability. On the other hand, the developed framework includes a task scheduler capable of managing the launch of applications, their migration between nodes, as well as the possibility of dynamically increasing or decreasing the number of processes. All these thanks to the cooperation with other modules that are integrated into LIMITLESS, and whose objective is to optimize the execution of a stack of applications based on multi-criteria policies. This scheduling mode is called coarse-grain scheduling based on monitoring. For better performance and in order to further reduce the overhead during the monitorization, different optimizations have been applied at different levels to try to reduce communications between components, while trying to avoid the loss of information. To achieve this objective, data filtering techniques, Machine Learning (ML) algorithms, and Neural Networks (NN) have been used. In order to improve the scheduling process and to design new multi-criteria scheduling policies, the monitoring information has been combined with other ML algorithms to identify (through classification algorithms) the applications and their execution phases, doing offline profiling. Thanks to this feature, LIMITLESS can detect which phase is executing an application and tries to share the computational resources with other applications that are compatible (there is no performance degradation between them when both are running at the same time). This feature is called fine-grain scheduling, and can reduce the makespan of the use cases while makes efficient use of the computational resources that other applications do not use.El ecosistema global de las tecnologías de la información (IT) se encuentra en transición a una nueva generación de aplicaciones que requieren sistemas de adquisición de datos, procesamiento y almacenamiento cada vez más intensivo. Como resultado de ese cambio hacia la computación intensiva de datos, existe una superposición, cada vez mayor, entre la computación de alto rendimiento (HPC) y las técnicas Big Data en las aplicaciones, pues muchas aplicaciones HPC producen grandes volúmenes de datos, y Big Data necesita capacidades HPC. La hipótesis de esta tesis es que hay un gran potencial en la interoperabilidad y convergencia de los sistemas HPC y Big Data, siendo crucial para el futuro tratar una unificación de ambos para hacer frente a un amplio espectro de problemas de investigación. Por lo tanto, el objetivo principal de esta tesis es la propuesta y desarrollo de un sistema de monitorización que facilite la convergencia de los paradigmas HPC y Big Data gracias a la provisión de datos sobre el comportamiento de las aplicaciones en un entorno en el que se pueden ejecutar aplicaciones de ambos mundos, ofreciendo información útil para mejorar la escalabilidad, la explotación de la localidad de datos y la adaptabilidad en los computadores de gran escala. Para lograr este objetivo, el foco se ha centrado en el diseño de mecanismos de monitorización y localización de recursos para explotar el paralelismo en todos los niveles de la pila del software. El resultado es un framework de monitorización en dos niveles (tanto a nivel de nodo como de aplicación) con una baja carga computacional, escalable, y que se puede comunicar con distintos módulos gracias a una API proporcionada para tal objetivo. Todos datos recolectados se difunden para facilitar la realización de mejoras de manera global en todo el sistema, y así evitar desajustes entre capas, lo que combinado con las técnicas aplicadas para lidiar con la tolerancia a fallos, hace que el sistema sea robusto y con una alta disponibilidad. Por otro lado, el framework desarrollado incluye un planificador de tareas capaz de gestionar el lanzamiento de aplicaciones, la migración de las mismas entre nodos, además de la posibilidad de incrementar o disminuir su número de procesos de forma dinámica. Todo ello gracias a la cooperación con otros módulos que se integran en LIMITLESS, y cuyo objetivo es optimizar la ejecución de una pila de aplicaciones en base a políticas multicriterio. Esta funcionalidad se llama planificación de grano grueso. Para un mejor desempeño y con el objetivo de reducir más aún la carga durante la ejecución, se han aplicado distintas optimizaciones en distintos niveles para tratar de reducir las comunicaciones entre componentes, a la vez que se trata de evitar la pérdida de información. Para lograr este objetivo se ha hecho uso de técnicas de filtrado de datos, algoritmos de Machine Learning (ML), y Redes Neuronales (NN). Finalmente, para obtener mejores resultados en la planificación de aplicaciones y para diseñar nuevas políticas de planificación multi-criterio, los datos de monitorización recolectados han sido combinados con nuevos algoritmos de ML para identificar (por medio de algoritmos de clasificación) aplicaciones y sus fases de ejecución. Todo ello realizando tareas de profiling offline. Gracias a estas técnicas, LIMITLESS puede detectar en qué fase de su ejecución se encuentra una determinada aplicación e intentar compartir los recursos de computacionales con otras aplicaciones que sean compatibles (no se produce una degradación del rendimiento entre ellas cuando ambas se ejecutan a la vez en el mismo nodo). Esta funcionalidad se llama planificación de grano fino y puede reducir el tiempo total de ejecución de la pila de aplicaciones en los casos de uso porque realiza un uso más eficiente de los recursos de las máquinas.This PhD dissertation has been partially supported by the Spanish Ministry of Science and Innovation under an FPI fellowship associated to a National Project with reference TIN2016-79637-P (from July 1, 2018 to October 10, 2021)Programa de Doctorado en Ciencia y Tecnología Informática por la Universidad Carlos III de MadridPresidente: Félix García Carballeira.- Secretario: Pedro Ángel Cuenca Castillo.- Vocal: María Cristina V. Marinesc
    corecore