90 research outputs found
Model-based Code Generation Framework for Parallel and Distributed Embedded Systems
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 2. Soonhoi Ha.
While various software development methodologies have been proposed to improve the design productivity and maintainability of software, they usually focus on application software running on a single processing element and do not address the non-functional requirements of an embedded system, such as latency and resource requirements.
In this thesis, we present a model-based software development method for parallel and distributed embedded systems. An application is specified, independently of the hardware platform, as a hierarchy of tasks that follow a given set of rules for communication and synchronization. These rules enable static analysis that catches some software errors at compile time, which reduces the verification effort. A platform-specific program is synthesized automatically once the mapping of tasks onto processing elements is determined.
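The idea of checking a task-level specification before any code runs can be illustrated with a minimal sketch. Everything here is hypothetical: the `TaskGraph` class, the `static_errors` function, and the task and processor names are illustrative assumptions, not the thesis's model or the HOPES API. The sketch only shows the general pattern of validating channels and the task-to-processor mapping statically.

```python
from dataclasses import dataclass, field


@dataclass
class TaskGraph:
    """Hypothetical platform-independent application model (not the HOPES format)."""
    tasks: set[str] = field(default_factory=set)
    # Each channel is a (producer, consumer) pair of task names.
    channels: list[tuple[str, str]] = field(default_factory=list)


def static_errors(graph: TaskGraph, mapping: dict[str, str]) -> list[str]:
    """Return specification errors found without running any task code."""
    errors = []
    for src, dst in graph.channels:
        for endpoint in (src, dst):
            if endpoint not in graph.tasks:
                errors.append(f"channel {src}->{dst}: unknown task {endpoint}")
    for task in sorted(graph.tasks):
        if task not in mapping:
            errors.append(f"task {task} is not mapped to a processing element")
    return errors


# Illustrative specification with two deliberate mistakes.
graph = TaskGraph(
    tasks={"capture", "detect", "display"},
    channels=[("capture", "detect"), ("detect", "display"), ("detect", "log")],
)
mapping = {"capture": "cpu0", "detect": "gpu0"}  # "display" is left unmapped
for message in static_errors(graph, mapping):
    print(message)
```

A real synthesizer would check far richer rules (port rates, synchronization, deadlock freedom); the point of the sketch is only that such checks operate on the model and the mapping, not on running code.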
A program synthesizer is also proposed to generate code that satisfies the platform requirements of parallel and distributed embedded systems. Because multiple models that express dynamic behavior can be composed hierarchically, the synthesizer manages multiple task graphs at different levels of the hierarchy so that tasks run in parallel. The synthesizer also provides methods for managing code for heterogeneous platforms and for generating various communication methods. The viability of the proposed software development method is verified with a real-life surveillance application that runs on six processing elements with three remote communication methods, and with a remote deep learning example that uses heterogeneous multiprocessing components on distributed systems. In addition, measured and estimated development costs show that supporting a new platform or network requires only a small effort.
Since tolerance to unexpected errors is a required feature of many embedded systems, we also support automatic fault-tolerant code generation. Fault tolerance is applied by modifying the task graph according to the selected fault tolerance configuration, so an application developer can easily adopt fault tolerance as a non-functional requirement. To evaluate the effort of supporting fault tolerance, we compare against a manual implementation. The fault tolerance method is also tested with the fault injection tool, both by emulating fault scenarios and by injecting faults randomly.
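One way to picture fault tolerance as a task-graph modification is task replication with a majority voter. The abstract does not say which techniques the thesis uses, so the sketch below is an assumed example only: `replicate_with_voter`, `majority_vote`, and the task names are hypothetical, and channels are modeled as simple (producer, consumer) pairs.

```python
from collections import Counter


def replicate_with_voter(channels, task, copies=3):
    """Rewrite (src, dst) channels so `task` runs as `copies` replicas plus a voter."""
    replicas = [f"{task}_r{i}" for i in range(copies)]
    voter = f"{task}_voter"
    rewritten = []
    for src, dst in channels:
        if dst == task:
            # Fan the input out to every replica.
            rewritten += [(src, r) for r in replicas]
        elif src == task:
            # Downstream tasks now read from the voter instead of the task.
            rewritten.append((voter, dst))
        else:
            rewritten.append((src, dst))
    # Every replica feeds the voter.
    rewritten += [(r, voter) for r in replicas]
    return rewritten


def majority_vote(outputs):
    """Return the most common replica output, as a voter task might compute it."""
    value, _ = Counter(outputs).most_common(1)[0]
    return value


channels = [("capture", "detect"), ("detect", "display")]
print(replicate_with_voter(channels, "detect"))
print(majority_vote([7, 7, 9]))
```

The application code of `detect` is untouched in this scheme; only the graph around it changes, which is what lets a developer select fault tolerance as a configuration rather than implement it by hand.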
Our fault injection tool, which was used to test our fault tolerance method, is another contribution of this thesis. Emulating fault scenarios by intentionally injecting faults is a common way to test and verify the robustness of a system. To emulate faults on an embedded system, we present a run-time fault injection framework that can inject faults into both the kernel and application layers of Linux-based systems. For kernel-level fault injection, two complementary techniques are used: one is based on the Kernel GNU Debugger, and the other uses a hardware breakpoint supported by the ARM architecture. For application-level fault injection, a GDB-based method is used to inject a fault into a remote application. The viability of the proposed fault injection tool is demonstrated by real-life experiments on an ODROID-XU4 system.
Chapter 1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Dissertation Organization
Chapter 2 Background
2.1 HOPES: Hope of Parallel Embedded Software
2.1.1 Software Development Procedure
2.1.2 Components of HOPES
2.2 Universal Execution Model
2.2.1 Task Graph Specification
2.2.2 Dataflow Specification of an Application
2.2.3 Task Code Specification and Generic APIs
2.2.4 Meta-data Specification
Chapter 3 Program Synthesis for Parallel and Distributed Embedded Systems
3.1 Motivational Example
3.2 Program Synthesis Overview
3.3 Program Synthesis from Hierarchically-mixed Models
3.4 Platform Code Synthesis
3.5 Communication Code Synthesis
3.6 Experiments
3.6.1 Development Cost of Supporting New Platforms and Networks
3.6.2 Program Synthesis for the Surveillance System Example
3.6.3 Remote GPU-accelerated Deep Learning Example
3.7 Document Generation
3.8 Related Works
Chapter 4 Model Transformation for Fault-tolerant Code Synthesis
4.1 Fault-tolerant Code Synthesis Techniques
4.2 Applying Fault Tolerance Techniques in HOPES
4.3 Experiments
4.3.1 Development Cost of Applying Fault Tolerance
4.3.2 Fault Tolerance Experiments
4.4 Random Fault Injection Experiments
4.5 Related Works
Chapter 5 Fault Injection Framework for Linux-based Embedded Systems
5.1 Background
5.1.1 Fault Injection Techniques
5.1.2 Kernel GNU Debugger
5.1.3 ARM Hardware Breakpoint
5.2 Fault Injection Framework
5.2.1 Overview
5.2.2 Architecture
5.2.3 Fault Injection Techniques
5.2.4 Implementation
5.3 Experiments
5.3.1 Experiment Setup
5.3.2 Performance Comparison of Two Fault Injection Methods
5.3.3 Bit-flip Fault Experiments
5.3.4 eMMC Controller Fault Experiments
Chapter 6 Conclusion
Bibliography
Summary (in Korean)
Docto
Design and management of image processing pipelines within CPS: Acquired experience towards the end of the FitOptiVis ECSEL Project
Cyber-Physical Systems (CPSs) are dynamic and reactive systems that interact with processes, the environment and, sometimes, humans. They are often distributed, equipped with sensors and actuators, and characterized as smart, adaptive, predictive, and able to react in real time. Image- and video-processing pipelines are a prime source of environmental information for such systems, allowing them to make better decisions according to what they see. Therefore, in FitOptiVis, we are developing novel methods and tools to integrate complex image- and video-processing pipelines. FitOptiVis aims to deliver a reference architecture for describing and optimizing quality and resource management for imaging and video pipelines in CPSs, both at design time and at run time. The architecture is concretized in low-power, high-performance, smart components, and in methods and tools for combined design-time and run-time multi-objective optimization and adaptation within system and environment constraints.
State-of-the-art Assessment For Simulated Forces
Summary of the review of the state of the art in simulated forces, conducted to support the research objectives of Research and Development for Intelligent Simulated Forces.
A Methodology for Extracting Human Bodies from Still Images
Monitoring and surveillance of humans is one of the most prominent applications today, and it is expected to become part of many aspects of our lives, for safety, assisted living, and many other purposes. Many efforts have been made towards automatic and robust solutions, but the general problem is very challenging and remains open. In this PhD dissertation we examine the problem from several perspectives. First, we study the performance of a hardware architecture designed for large-scale surveillance systems. Then, we focus on the general problem of human activity recognition, present an extensive survey of methodologies that deal with this subject, and propose a maturity metric to evaluate them.
Image segmentation is among the most widely used image processing techniques in the field, and we propose a blind metric to evaluate segmentation results with respect to the activity in local regions. Finally, we propose a fully automatic system for segmenting and extracting human bodies from challenging single images, which is the main contribution of the dissertation. Our methodology is a novel bottom-up approach that relies mostly on anthropometric constraints and is facilitated by our research in face, skin, and hand detection. Experimental results and comparison with state-of-the-art methodologies demonstrate the success of our approach.
Information management system study results. Volume 1: IMS study results
The information management system (IMS) special emphasis task was performed as an adjunct to the modular space station study, with the objective of providing extended depth of analysis and design in selected key areas of the information management system. Specific objectives were to: (1) conduct in-depth studies of IMS requirements and design approaches; (2) design and fabricate breadboard hardware for demonstration and verification of design concepts; (3) provide a technological base to identify potential design problems and influence long-range planning; (4) develop hardware and techniques to permit long-duration, low-cost, manned space operations; (5) support SR&T areas where techniques or equipment are considered inadequate; and (6) permit an overall understanding of the IMS as an integrated component of the space station.
Using embedded hardware monitor cores in critical computer systems
The integration of FPGA devices in many different architectures and services makes monitoring and real-time detection of errors an important concern in FPGA system design. A monitor is a tool, or a set of tools, that facilitates analytic measurements in observing a given system. The goal of these observations is usually performance analysis and optimisation, or surveillance of the system. However, System-on-Chip (SoC) based designs leave few points to attach external tools such as logic analysers. Thus, an embedded error detection core that allows observation of critical system nodes (such as processor cores and buses) should enforce the operation of the FPGA-based system, in order to prevent system failures. The core should not interfere with system performance and must ensure timely detection of errors.
This thesis is an investigation into how a robust hardware-monitoring module can be efficiently integrated in a target PCI board (with FPGA-based application processing features) which is part of a critical computing system. [Continues.
Energy-based tuning of convolution neural networks on multi-GPUs
Deep Learning (DL) applications are gaining momentum in the realm of Artificial Intelligence, particularly after GPUs have demonstrated remarkable skills for accelerating their challenging computational requirements. Within this context, Convolutional Neural Network (CNN) models constitute a representative example of success on a wide set of complex applications, particularly on datasets where the target can be represented through a hierarchy of local features of increasing semantic complexity. In most of the real scenarios, the roadmap to improve results relies on CNN settings involving brute force computation, and researchers have lately proven Nvidia GPUs to be one of the best hardware counterparts for acceleration. Our work complements those findings with an energy study on critical parameters for the deployment of CNNs on flagship image and video applications, i.e., object recognition and people identification by gait, respectively. We evaluate energy consumption on four different networks based on the two most popular ones (ResNet/AlexNet), i.e., ResNet (167 layers), a 2D CNN (15 layers), a CaffeNet (25 layers), and a ResNetIm (94 layers) using batch sizes of 64, 128, and 256, and then correlate those with speed-up and accuracy to determine optimal settings. Experimental results on a multi-GPU server endowed with twin Maxwell and twin Pascal Titan X GPUs demonstrate that energy correlates with performance and that Pascal may have up to 40% gains versus Maxwell. Larger batch sizes extend performance gains and energy savings, but we have to keep an eye on accuracy, which sometimes shows a preference for small batches.
We expect this work to provide a preliminary guidance for a wide set of CNN and DL applications in modern HPC times, where the GFLOPS/W ratio constitutes the primary goal.
Ministry of Education of Spain, Grant/Award Number: TIN2013-42253-P and TIN2016-78799-P; Consejería de Economía, Innovación, Ciencia y Empleo, Junta de Andalucía, Grant/Award Number: P12-TIC-1741 and TIC-169
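The energy metrics named in the CNN abstract above can be made concrete with a short sketch. It assumes the usual definitions (energy as average power times run time, and GFLOPS/W as sustained GFLOPS divided by average power); the function names are hypothetical and the numbers are placeholders, not measurements from the paper.

```python
def energy_joules(avg_power_watts: float, seconds: float) -> float:
    """Energy consumed by a run: average power multiplied by wall-clock time."""
    return avg_power_watts * seconds


def gflops_per_watt(total_flops: float, avg_power_watts: float, seconds: float) -> float:
    """Sustained GFLOPS divided by average power draw in watts."""
    gflops = total_flops / seconds / 1e9
    return gflops / avg_power_watts


# Placeholder measurements for two hypothetical runs (not results from the paper).
runs = {
    "gpu_a": {"flops": 4e12, "watts": 200.0, "seconds": 10.0},
    "gpu_b": {"flops": 4e12, "watts": 180.0, "seconds": 8.0},
}
for name, r in runs.items():
    joules = energy_joules(r["watts"], r["seconds"])
    ratio = gflops_per_watt(r["flops"], r["watts"], r["seconds"])
    print(f"{name}: {joules:.0f} J, {ratio:.2f} GFLOPS/W")
```

Because GFLOPS/W reduces to total operations divided by energy, a run that finishes faster at similar power improves both the energy total and the ratio, which is consistent with the abstract's observation that energy correlates with performance.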