114 research outputs found

    Hardware/Software Co-design Applied to Reed-Solomon Decoding for the DMB Standard

    This paper addresses the implementation of Reed-Solomon decoding for battery-powered wireless devices. The scope of the paper is constrained to the Digital Multimedia Broadcasting (DMB) standard. The most critical element of the Reed-Solomon algorithm is implemented on two different reconfigurable hardware architectures: an FPGA and a coarse-grained architecture, the Montium. The remaining parts are executed on an ARM processor. The results of this research show that a co-design of the ARM together with an FPGA or a Montium leads to a substantial decrease in energy consumption. The energy consumption of the syndrome calculation of the Reed-Solomon decoding algorithm is estimated for an FPGA and a Montium by means of simulations; the Montium proves to be the more efficient of the two.
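The syndrome calculation singled out here is a small, regular kernel, which is what makes it a natural candidate for offloading. A minimal software sketch of that kernel, assuming the RS(204,188) outer code over GF(2^8) with field polynomial 0x11d used in the DVB/DMB family of broadcast standards (the paper's exact parameters are not stated here):

```python
# GF(2^8) arithmetic via log/antilog tables, primitive polynomial 0x11d
# (x^8 + x^4 + x^3 + x^2 + 1), as in the DVB/DMB RS(204,188) outer code.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d
for i in range(255, 512):
    EXP[i] = EXP[i - 255]          # doubled table avoids a modulo in gf_mul

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def syndromes(received, n_syn=16):
    # S_i = r(alpha^i), evaluated with Horner's rule over the received bytes
    # (highest-degree coefficient first). RS(204,188) uses 16 syndromes;
    # an all-zero result indicates an error-free codeword.
    out = []
    for i in range(n_syn):
        acc = 0
        for byte in received:
            acc = gf_mul(acc, EXP[i]) ^ byte
        out.append(acc)
    return out
```

Each syndrome is an independent multiply-accumulate recurrence over the same byte stream, which is why the kernel maps well onto parallel datapaths such as the Montium's ALUs.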

    ํšจ์œจ์ ์ธ ์ถ”๋ก ์„ ์œ„ํ•œ ํ•˜๋“œ์›จ์–ด ์นœํ™”์  ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ ๋ฐ ๊ฐ€์†๊ธฐ ์„ค๊ณ„

    Doctoral dissertation (Ph.D.), Seoul National University Graduate School, Department of Electrical and Computer Engineering, College of Engineering, August 2020. Advisor: ์ดํ˜์žฌ. Research on deep learning, currently the most prominent machine learning method, is actively pursued on both the hardware and the software side. To perform inference efficiently while maintaining high accuracy, software-side optimization methods such as mobile-oriented neural network architecture design and trained-model compression are being studied; at the same time, hardware-side research designs accelerators that achieve fast inference and high energy efficiency for an already-trained deep learning model. Going beyond these existing optimization and design methods, this dissertation aims to build a more efficient inference system by applying new hardware design techniques and model transformation methods. First, a more efficient deep learning accelerator was designed by adopting stochastic computing, a new hardware design approach. Stochastic computing is a circuit design method based on probabilistic arithmetic, and its advantage is that it can implement the same arithmetic circuit with far fewer transistors than a conventional binary circuit. In particular, for multiplication, the most frequent operation in deep learning, a binary circuit requires an array multiplier, whereas stochastic computing can implement it with a single AND gate. Prior studies have designed deep learning accelerators based on stochastic computing circuits, but their recognition accuracy lagged far behind that of binary circuits.
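The single-AND-gate multiplication that motivates the stochastic approach can be illustrated in a few lines of simulation. This is a generic unipolar-encoding sketch with independent bitstreams, not the thesis's hardware; all names are illustrative:

```python
import random

def to_stream(p, length, rng):
    # Unipolar stochastic encoding: a value p in [0, 1] becomes a bitstream
    # whose fraction of 1s approximates p (the stochastic number generator's job).
    return [1 if rng.random() < p else 0 for _ in range(length)]

def sc_multiply(sa, sb):
    # Unipolar multiplication: one AND gate per bit position. This only works
    # when the two streams are uncorrelated, which is why sharing stochastic
    # number generators among neurons has to be done carefully.
    return [a & b for a, b in zip(sa, sb)]

def value(stream):
    # Decode a unipolar stream back to a probability estimate.
    return sum(stream) / len(stream)

rng = random.Random(0)
n = 1 << 14
sa = to_stream(0.8, n, rng)
sb = to_stream(0.5, n, rng)
prod = value(sc_multiply(sa, sb))   # close to 0.8 * 0.5 = 0.4
```

Longer streams shrink the estimation error, which is the accuracy/latency trade-off at the heart of stochastic-computing accelerators.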
์ด๋Ÿฌํ•œ ๋ฌธ์ œ๋“ค์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•˜์—ฌ ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์—ฐ์‚ฐ์˜ ์ •ํ™•๋„๋ฅผ ๋” ๋†’์ผ ์ˆ˜ ์žˆ๋„๋ก ๋‹จ๊ทน์„ฑ ๋ถ€ํ˜ธํ™”(Unipolar encoding) ๋ฐฉ๋ฒ•์„ ํ™œ์šฉํ•˜์—ฌ ๊ฐ€์†๊ธฐ๋ฅผ ์„ค๊ณ„ํ•˜์˜€๊ณ , ํ™•๋ฅ  ์ปดํ“จํŒ… ์ˆซ์ž ์ƒ์„ฑ๊ธฐ (Stochastic number generator)์˜ ์˜ค๋ฒ„ํ—ค๋“œ๋ฅผ ์ค„์ด๊ธฐ ์œ„ํ•˜์—ฌ ํ™•๋ฅ  ์ปดํ“จํŒ… ์ˆซ์ž ์ƒ์„ฑ๊ธฐ๋ฅผ ์—ฌ๋Ÿฌ ๊ฐœ์˜ ๋‰ด๋Ÿฐ์ด ๊ณต์œ ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ œ์•ˆํ•˜์˜€๋‹ค. ๋‘ ๋ฒˆ์งธ, ๋” ๋†’์€ ์ถ”๋ก  ์†๋„ ํ–ฅ์ƒ์„ ์œ„ํ•˜์—ฌ ํ•™์Šต๋œ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์„ ์••์ถ•ํ•˜๋Š” ๋ฐฉ๋ฒ• ๋Œ€์‹ ์— ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ๋ฅผ ๋ณ€ํ™˜ ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•˜์˜€๋‹ค. ์„ ํ–‰ ์—ฐ๊ตฌ๋“ค์˜ ๊ฒฐ๊ณผ๋ฅผ ๋ณด๋ฉด, ํ•™์Šต๋œ ๋ชจ๋ธ์„ ์••์ถ•ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ตœ์‹  ๊ตฌ์กฐ๋“ค์— ์ ์šฉํ•˜๊ฒŒ ๋˜๋ฉด ๊ฐ€์ค‘์น˜ ํŒŒ๋ผ๋ฏธํ„ฐ(Weight Parameter)์—๋Š” ๋†’์€ ์••์ถ•๋ฅ ์„ ๋ณด์—ฌ์ฃผ์ง€๋งŒ ์‹ค์ œ ์ถ”๋ก  ์†๋„ ํ–ฅ์ƒ์—๋Š” ๋ฏธ๋ฏธํ•œ ํšจ๊ณผ๋ฅผ ๋ณด์—ฌ์ฃผ์—ˆ๋‹ค. ์‹ค์งˆ์ ์ธ ์†๋„ ํ–ฅ์ƒ์ด ๋ฏธํกํ•œ ๊ฒƒ์€ ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ๊ฐ€ ๊ฐ€์ง€๊ณ  ์žˆ๋Š” ๊ตฌ์กฐ์ƒ์˜ ํ•œ๊ณ„์—์„œ ๋ฐœ์ƒํ•˜๋Š” ๋ฌธ์ œ์ด๊ณ , ์ด๊ฒƒ์„ ํ•ด๊ฒฐํ•˜๋ ค๋ฉด ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ๋ฅผ ๋ฐ”๊พธ๋Š”๊ฒƒ์ด ๊ฐ€์žฅ ๊ทผ๋ณธ์ ์ธ ํ•ด๊ฒฐ์ฑ…์ด๋‹ค. ์ด๋Ÿฌํ•œ ๊ด€์ฐฐ ๊ฒฐ๊ณผ๋ฅผ ํ† ๋Œ€๋กœ ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์„ ํ–‰์—ฐ๊ตฌ๋ณด๋‹ค ๋” ๋†’์€ ์†๋„ ํ–ฅ์ƒ์„ ์œ„ํ•˜์—ฌ ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ๋ฅผ ๋ณ€ํ™˜ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ œ์•ˆํ•˜์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ, ๊ฐ ์ธต๋งˆ๋‹ค ์„œ๋กœ ๋‹ค๋ฅธ ๊ตฌ์กฐ๋ฅผ ๊ฐ€์งˆ ์ˆ˜ ์žˆ๋„๋ก ํƒ์ƒ‰ ๋ฒ”์œ„๋ฅผ ๋” ํ™•์žฅ์‹œํ‚ค๋ฉด์„œ๋„ ํ•™์Šต์„ ๊ฐ€๋Šฅํ•˜๊ฒŒ ํ•˜๋Š” ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ ํƒ์ƒ‰ ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•˜์˜€๋‹ค. ์„ ํ–‰ ์—ฐ๊ตฌ์—์„œ์˜ ์‹ ๊ฒฝ๋ง ๊ตฌ์กฐ ํƒ์ƒ‰์€ ๊ธฐ๋ณธ ๋‹จ์œ„์ธ ์…€(Cell)์˜ ๊ตฌ์กฐ๋ฅผ ํƒ์ƒ‰ํ•˜๊ณ , ๊ทธ ๊ฒฐ๊ณผ๋ฅผ ๋ณต์‚ฌํ•˜์—ฌ ํ•˜๋‚˜์˜ ํฐ ์‹ ๊ฒฝ๋ง์œผ๋กœ ๋งŒ๋“œ๋Š” ๋ฐฉ๋ฒ•์„ ์ด์šฉํ•œ๋‹ค. ํ•ด๋‹น ๋ฐฉ๋ฒ•์€ ํ•˜๋‚˜์˜ ์…€ ๊ตฌ์กฐ๋งŒ ์‚ฌ์šฉ๋˜๊ธฐ ๋•Œ๋ฌธ์— ์œ„์น˜์— ๋”ฐ๋ฅธ ์ž…๋ ฅ ํŠน์„ฑ๋งต(Input Feature Map)์˜ ํฌ๊ธฐ๋‚˜ ๊ฐ€์ค‘์น˜ ํŒŒ๋ผ๋ฏธํ„ฐ์˜ ํฌ๊ธฐ ๋“ฑ์— ๊ด€ํ•œ ์ •๋ณด๋Š” ๋ฌด์‹œํ•˜๊ฒŒ ๋œ๋‹ค. 
๋ณธ ๋…ผ๋ฌธ์€ ์ด๋Ÿฌํ•œ ๋ฌธ์ œ์ ๋“ค์„ ํ•ด๊ฒฐํ•˜๋ฉด์„œ๋„ ์•ˆ์ •์ ์œผ๋กœ ํ•™์Šต์„ ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•˜์˜€๋‹ค. ๋˜ํ•œ, ์—ฐ์‚ฐ๋Ÿ‰๋ฟ๋งŒ์•„๋‹ˆ๋ผ ๋ฉ”๋ชจ๋ฆฌ ์ ‘๊ทผ ํšŸ์ˆ˜์˜ ์ œ์•ฝ์„ ์ฃผ์–ด ๋” ํšจ์œจ์ ์ธ ๊ตฌ์กฐ๋ฅผ ์ฐพ์„ ์ˆ˜ ์žˆ๋„๋ก ๋„์™€์ฃผ๋Š” ํŽ˜๋„ํ‹ฐ(Penalty)๋ฅผ ์ƒˆ๋กœ์ด ๊ณ ์•ˆํ•˜์˜€๋‹ค.Deep learning is the most promising machine learning algorithm, and it is already used in real life. Actually, the latest smartphone use a neural network for better photograph and voice recognition. However, as the performance of the neural network improved, the hardware cost dramatically increases. Until the past few years, many researches focus on only a single side such as hardware or software, so its actual cost is hardly improved. Therefore, hardware and software co-optimization is needed to achieve further improvement. For this reason, this dissertation proposes the efficient inference system considering the hardware accelerator to the network architecture design. The first part of the dissertation is a deep neural network accelerator with stochastic computing. The main goal is the efficient stochastic computing hardware design for a convolutional neural network. It includes stochastic ReLU and optimized max function, which are key components in the convolutional neural network. To avoid the range limitation problem of stochastic numbers and increase the signal-to-noise ratio, we perform weight normalization and upscaling. In addition, to reduce the overhead of binary-to-stochastic conversion, we propose a scheme for sharing stochastic number generators among the neurons in the convolutional neural network. The second part of the dissertation is a neural architecture transformation. The network recasting is proposed, and it enables the network architecture transformation. The primary goal of this method is to accelerate the inference process through the transformation, but there can be many other practical applications. 
The method is based on block-wise recasting; it recasts each source block in a pre-trained teacher network to a target block in a student network. For the recasting, a target block is trained such that its output activation approximates that of the source block. Such block-by-block recasting in a sequential manner transforms the network architecture while preserving accuracy. This method can be used to transform an arbitrary teacher network type into an arbitrary student network type. It can even generate a mixed-architecture network that consists of two or more types of block. Network recasting can generate a network with fewer parameters and/or activations, which reduces the inference time significantly. Naturally, it can also be used for network compression by recasting a trained network into a smaller network of the same type. The third part of the dissertation is a fine-grained neural architecture search. InheritedNAS is a fine-grained architecture search method that starts from the coarse-grained architecture found by a cell-based architecture search. A fine-grained architecture has a very large search space, so it is hard to search directly. A stage-independent search is proposed, which divides the entire network into several stages and trains each stage independently. To break the dependency between stages, a two-point matching distillation method is also proposed. Then, operation pruning is applied to remove unimportant operations; block-wise pruning is used to remove operations rather than node-wise pruning.
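The block-training step described above can be illustrated with a toy version: freeze a "teacher" block and fit a "student" block so that its output activations match the teacher's on the same inputs. The sketch below uses plain linear blocks and gradient descent on the matching loss; the actual method recasts convolutional blocks (e.g. DenseNet to ResNet), and all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))            # activations entering the block

# Frozen "teacher" source block: a fixed linear map standing in for a
# pre-trained block whose output we want to reproduce.
Wt = rng.normal(size=(8, 4))
teacher_out = X @ Wt

# "Student" target block (here the same shape for simplicity), trained so
# that its output activation approximates the source block's output.
Ws = np.zeros((8, 4))
lr = 0.1
for _ in range(500):
    err = X @ Ws - teacher_out           # block-matching residual
    Ws -= lr * X.T @ err / len(X)        # gradient step on the MSE loss

mse = float(np.mean((X @ Ws - teacher_out) ** 2))
```

Because each block is fitted against local targets rather than end-to-end labels, blocks can be recast one at a time in sequence, with a final fine-tuning pass recovering any residual accuracy loss.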
In addition, a hardware-aware latency penalty is proposed that covers not only FLOPs but also memory accesses.
Table of contents:
1 Introduction
1.1 DNN Accelerator with Stochastic Computing
1.2 Neural Architecture Transformation
1.3 Fine-Grained Neural Architecture Search
2 Background
2.1 Stochastic Computing
2.2 Neural Network
2.2.1 Network Compression
2.2.2 Neural Network Accelerator
2.3 Knowledge Distillation
2.4 Neural Architecture Search
3 DNN Accelerator with Stochastic Computing
3.1 Motivation
3.1.1 Multiplication Error on Stochastic Computing
3.1.2 DNN with Stochastic Computing
3.2 Unipolar SC Hardware for CNN
3.2.1 Overall Hardware Design
3.2.2 Stochastic ReLU function
3.2.3 Stochastic Max function
3.2.4 Efficient Average Function
3.3 Weight Modulation for SC Hardware
3.3.1 Weight Normalization for SC
3.3.2 Weight Upscaling for Output Layer
3.4 Early Decision Termination
3.5 Stochastic Number Generator Sharing
3.6 Experiments
3.6.1 Accuracy of CNN using Unipolar SC
3.6.2 Synthesis Result
3.7 Summary
4 Neural Architecture Transformation
4.1 Motivation
4.2 Network Recasting
4.2.1 Recasting from DenseNet to ResNet and ConvNet
4.2.2 Recasting from ResNet to ConvNet
4.2.3 Compression
4.2.4 Block Training
4.2.5 Sequential Recasting and Fine-tuning
4.3 Experiments
4.3.1 Visualization of Filter Reduction
4.3.2 CIFAR
4.3.3 ILSVRC2012
4.4 Summary
5 Fine-Grained Neural Architecture Search
5.1 Motivation
5.1.1 Search Space Reduction Versus Diversity
5.1.2 Hardware-Aware Optimization
5.2 InheritedNAS
5.2.1 Stage Independent Search
5.2.2 Operation Pruning
5.2.3 Entire Search Procedure
5.3 Hardware-aware Penalty Design
5.4 Experiments
5.4.1 Fine-Grained Architecture Search
5.4.2 Penalty Analysis
5.5 Summary
6 Conclusion
Abstract (In Korean)

    A fuzzy logic based dynamic reconfiguration scheme for optimal energy and throughput in symmetric chip multiprocessors

    Embedded systems architectures have traditionally been investigated and designed to achieve greater throughput combined with minimum energy consumption. With the advent of reconfigurable architectures, it is now possible to support algorithms that find optimal solutions for an improved energy and throughput balance. As a result of ongoing research, several online and offline techniques and algorithms have been proposed for hardware adaptation. This paper presents a novel coarse-grained reconfigurable symmetric chip multiprocessor (SCMP) architecture managed by a fuzzy logic engine that balances performance and energy consumption. The architecture incorporates reconfigurable level-1 (L1) caches, power-gated cores, and adaptive on-chip network routers, allowing the leakage energy of inactive components to be minimized. A coarse-grained architecture was selected as the focus of this study because it typically allows faster reconfiguration than fine-grained architectures, making it more feasible for runtime adaptation schemes. The presented architecture is analyzed using a set of OpenMP-based parallel benchmarks, and the results show significant improvements in performance while maintaining minimum energy consumption.
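A fuzzy reconfiguration decision of the kind described can be sketched with a tiny rule base: fuzzify the monitored metrics, fire a few rules, and defuzzify to a cache configuration. The membership functions, rules, and way counts below are invented for illustration and are not the paper's engine:

```python
def tri(x, a, b, c):
    # Triangular membership function with feet at a and c, peak at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_l1_ways(miss_rate, utilization):
    # Toy Mamdani-style rule base (illustrative only):
    #   IF miss rate is high AND utilization is high THEN enlarge the L1
    #   IF miss rate is low  AND utilization is low  THEN shrink the L1
    low_miss = tri(miss_rate, -0.2, 0.0, 0.5)
    high_miss = tri(miss_rate, 0.3, 1.0, 1.2)
    low_util = tri(utilization, -0.2, 0.0, 0.5)
    high_util = tri(utilization, 0.3, 1.0, 1.2)

    grow = min(high_miss, high_util)        # firing strength of each rule
    shrink = min(low_miss, low_util)
    hold = max(0.0, 1.0 - grow - shrink)    # default: keep the current size

    # Weighted-average defuzzification over candidate way counts (1, 4, 8);
    # shrinking corresponds to power-gating the unused ways.
    return (shrink * 1 + hold * 4 + grow * 8) / (shrink + hold + grow)
```

The appeal of a fuzzy engine here is that the decision degrades smoothly between configurations instead of oscillating across a hard threshold.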

    Are coarse-grained overlays ready for general purpose application acceleration on FPGAs?

    Combining processors with hardware accelerators has become the norm, with systems-on-chip (SoCs) ever present in modern compute devices. Heterogeneous programmable system-on-chip platforms, sometimes referred to as hybrid FPGAs, tightly couple general-purpose processors with high-performance reconfigurable fabrics, providing a more flexible alternative. We can now think of a software application with hardware-accelerated portions that are reconfigured at runtime. While such ideas have been explored in the past, modern hybrid FPGAs are the first commercial platforms to enable this move to a more software-oriented view, where reconfiguration enables hardware resources to be shared by multiple tasks in a bigger application. However, while the rapidly increasing logic density and more capable hard resources found in modern hybrid FPGA devices should make them widely deployable, they remain constrained to specialist application domains. This is due both to design-productivity issues and to the lack of a hardware abstraction that eliminates the need to work with platform-specific details, as server and desktop virtualization has done in a more general sense. To allow mainstream adoption of FPGA-based accelerators in general-purpose computing, there is a need to virtualize FPGAs and make them more accessible to application developers who are accustomed to software API abstractions and fast development cycles. In this paper, we discuss the role of overlay architectures in enabling general-purpose FPGA application acceleration.

    Coarse Grained FLS-based Processor with Prognostic Malfunction Feature for UAM Drones using FPGA

    Many overall safety factors need to be considered in the next generation of Urban Air Mobility (UAM) systems, and addressing them can become the anchor point for such technology to gain acceptance for worldwide application. On the other hand, fulfilling the safety requirements of an exponentially increasing number of UAM systems is extremely complicated and requires careful consideration of a variety of issues. One of the key goals of these Unmanned Air Systems (UAS) is to support the launch and control of hundreds of thousands of these advanced drones in the air simultaneously. Given the impracticality of training a corresponding number of expert pilots, this goal can only be realized through safe operation in either fully autonomous or semi-autonomous modes. According to many recent studies, the majority of flight accidents are concentrated in the last three stages of a flight: the initial approach, the final approach, and landing. Therefore, this paper proposes a novel decentralized processing system for enhancing the safety factors during the critical phases of Vertical and/or Short Take-Off and Landing (V/STOL) drones. This is achieved by adopting several processing and control algorithms, such as an Open Fuzzy Logic System (FLS) integrated with a Flight Rules Unit (FRU), FIR filters, and a novel Prognostic Malfunction processing unit. After applying several optimization techniques, this novel coarse-grained Autonomous Landing Guidance Assistance System (ALGAS3) processing architecture was optimized to achieve a maximum computational performance of 70.82 Giga Operations per Second (GOPS). The proposed ALGAS3 system also shows an ultra-low dynamic thermal power dissipation (I/O and core) of 145.4 mW, which is ideal for mobile avionic systems, using an INTEL 5CGXFC9D6F27C7 FPGA chip.
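Of the processing components listed, the FIR filtering stage is the easiest to sketch: each output is a fixed weighted sum of the most recent inputs, which is why it maps cheaply onto FPGA multiply-accumulate fabric. The coefficients below are an illustrative moving average used to smooth a sensor stream, not the ALGAS3 design:

```python
def fir(signal, taps):
    # Direct-form FIR filter: y[n] = sum_k taps[k] * x[n - k].
    out = []
    hist = [0.0] * len(taps)            # delay line, most recent sample first
    for x in signal:
        hist = [x] + hist[:-1]          # shift the new sample in
        out.append(sum(t * h for t, h in zip(taps, hist)))
    return out
```

A 4-tap moving average (`taps = [0.25] * 4`) settles to the input level after four samples, illustrating the fixed, predictable latency that makes FIR stages attractive in safety-critical pipelines.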

    Towards a re-engineering method for web services architectures

    Recent developments in Web technologies – in particular the Web services framework – have greatly enhanced the flexible and interoperable implementation of service-oriented software architectures. Many older Web-based and other distributed software systems will be re-engineered to a Web services-oriented platform. Using an advanced e-learning system as our case study, we investigate central aspects of a re-engineering approach for the Web services platform. Since our aim is to provide components of the legacy system as services on the new platform as well, re-engineering to suit the new development paradigm is as important as re-engineering to suit the new architectural requirements.

    Reconfigurable architectures for beyond 3G wireless communication systems

    • โ€ฆ
    corecore