18 research outputs found

    People tracking by cooperative fusion of RADAR and camera sensors

    Get PDF
    Accurate 3D tracking of objects from monocular camera poses challenges due to the loss of depth during projection. Although ranging by RADAR has proven effective in highway environments, people tracking remains beyond the capability of single sensor systems. In this paper, we propose a cooperative RADAR-camera fusion method for people tracking on the ground plane. Using average person height, joint detection likelihood is calculated by back-projecting detections from the camera onto the RADAR Range-Azimuth data. Peaks in the joint likelihood, representing candidate targets, are fed into a Particle Filter tracker. Depending on the association outcome, particles are updated using the associated detections (Tracking by Detection), or by sampling the raw likelihood itself (Tracking Before Detection). Utilizing the raw likelihood data has the advantage that lost targets are continuously tracked even if the camera or RADAR signal is below the detection threshold. We show that in single target, uncluttered environments, the proposed method entirely outperforms camera-only tracking. Experiments in a real-world urban environment also confirm that the cooperative fusion tracker produces significantly better estimates, even in difficult and ambiguous situations

    Accelerating iterative CT reconstruction algorithms using Tensor Cores

    Get PDF
    Tensor Cores are specialized hardware units added to recent NVIDIA GPUs to speed up matrix multiplication-related tasks, such as convolutions and densely connected layers in neural networks. Due to their specific hardware implementation and programming model, Tensor Cores cannot be straightforwardly applied to other applications outside machine learning. In this paper, we demonstrate the feasibility of using NVIDIA Tensor Cores for the acceleration of a non-machine learning application: iterative Computed Tomography (CT) reconstruction. For large CT images and real-time CT scanning, the reconstruction time for many existing iterative reconstruction methods is relatively high, ranging from seconds to minutes, depending on the size of the image. Therefore, CT reconstruction is an application area that could potentially benefit from Tensor Core hardware acceleration. We first studied the reconstruction algorithm's performance as a function of the hardware related parameters and proposed an approach to accelerate reconstruction on Tensor Cores. The results show that the proposed method provides about 5 x increase in speed and energy saving using the NVIDIA RTX 2080 Ti GPU for the parallel projection of 32 images of size 512 x 512. The relative reconstruction error due to the mixed-precision computations was almost equal to the error of single-precision (32-bit) floating- point computations. We then presented an approach for real-time and memory-limited applications by exploiting the symmetry of the system (i.e., the acquisition geometry). As the proposed approach is based on the conjugate gradient method, it can be generalized to extend its application to many research and industrial fields

    An interactive ImageJ plugin for semi-automated image denoising in electron microscopy

    Get PDF
    The recent advent of 3D in electron microscopy (EM) has allowed for detection of nanometer resolution structures. This has caused an explosion in dataset size, necessitating the development of automated workflows. Moreover, large 3D EM datasets typically require hours to days to be acquired and accelerated imaging typically results in noisy data. Advanced denoising techniques can alleviate this, but tend to be less accessible to the community due to low-level programming environments, complex parameter tuning or a computational bottleneck. We present DenoisEM: an interactive and GPU accelerated denoising plugin for ImageJ that ensures fast parameter tuning and processing through parallel computing. Experimental results show that DenoisEM is one order of magnitude faster than related software and can accelerate data acquisition by a factor of 4 without significantly affecting data quality. Lastly, we show that image denoising benefits visualization and (semi-)automated segmentation and analysis of ultrastructure in various volume EM datasets

    Autonomous Machine์„ ์œ„ํ•œ ์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆผ ์ฒ˜๋ฆฌ์™€ ์„ผ์„œ ํ“จ์ „์„ ์ง€์›ํ•˜๋Š” Splash ํ”„๋กœ๊ทธ๋ž˜๋ฐ ์–ธ์–ด์˜ ์„ค๊ณ„

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ)--์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› :๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€,2020. 2. ํ™์„ฑ์ˆ˜.Autonomous machines have begun to be widely used in various application domains due to recent remarkable advances in machine intelligence. As these autonomous machines are equipped with diverse sensors, multicore processors and distributed computing nodes, the complexity of the underlying software platform is increasing at a rapid pace, overwhelming the developers with implementation details. This leads to a demand for a new programming framework that has an easy-to-use programming abstraction. In this thesis, we present a graphical programming framework named Splash that explicitly addresses the programming challenges that arise during the development of an autonomous machine. We set four design goals to solve the challenges. First, Splash should provide an easy-to-use, effective programming abstraction. Second, it must support real-time stream processing for deep-learning based machine learning intelligence. Third, it must provide programming support for real-time control system of autonomous machines such as sensor fusion and mode change. Finally, it should support performance optimization of software system running on a heterogeneous multicore distributed computing platform. Splash allows programmers to specify genuine, end-to-end timing constraints. Also, it provides a best-effort runtime system that tries to meet the annotated timing constraints and exception handling mechanisms to monitor the violation of such constraints. To implement these runtime mechanisms, Splash provides underlying timing semantics: (1) it provides an abstract global clock that is shared by machines in the distributed system and (2) it supports programmers to write birthmark on every stream data item. Splash offers a multithreaded process model to support concurrent programming. In the multithreaded process model, a programmer can write a multithreaded program using Splash threads we call sthreads. An sthread is a logical entity of independent execution. In addition, Splash provides a language construct named build unit that allows programmers to allocate sthreads to processes and threads of an underlying operating system. Splash provides three additional language semantics to support real-time stream processing and real-time control systems. First, it provides rate control semantics to solve uncontrolled jitter and an unbounded FIFO queue problem due to the variability in communication delay and execution time. Second, it supports fusion semantics to handle timing issues caused by asynchronous sensors in the system. Finally, it provides mode change semantics to meet varying requirements in the real-time control systems. In this paper, we describe each language semantics and runtime mechanism that realizes such semantics in detail. To show the utility of our framework, we have written a lane keeping assist system (LKAS) in Splash as an example. We evaluated rate control, sensor fusion, mode change and build unit-based allocation. First, using rate controller, the jitter was reduced from 30.61 milliseconds to 1.66 milliseconds. Also, average lateral deviation and heading angle is reduced from 0.180 meters to 0.016 meters and 0.043 rad to 0.008 rad, respectively. Second, we showed that the fusion operator works normally as intended, with a run-time overhead of only 7 microseconds on average. Third, the mode change mechanism operated correctly and incurred a run-time overhead of only 0.53 milliseconds. Finally, as we increased the number of build units from 1 to 8, the average end-to-end latency was increased from 75.79 microseconds to 2022.96 microseconds. These results show that the language semantics and runtime mechanisms proposed in this thesis are designed and implemented correctly, and Splash can be used to effectively develop applications for an autonomous machine.๋”ฅ ๋Ÿฌ๋‹ ๊ธฐ๋ฐ˜ machine intelligence์˜ ๋น„์•ฝ์ ์ธ ๋ฐœ์ „์œผ๋กœ ์ธํ•ด autonomous machine๋“ค์ด ๋‹ค์–‘ํ•œ ๋ถ„์•ผ์—์„œ ํ™œ์šฉ๋˜๊ณ  ์žˆ๋‹ค. ์ด๋Ÿฐ ๊ธฐ๊ธฐ๋“ค์€ ๋‹ค์–‘ํ•œ ์„ผ์„œ, ๋ฉ€ํ‹ฐ์ฝ”์–ด ํ”„๋กœ์„ธ์„œ, ๋ถ„์‚ฐ ์ปดํ“จํŒ… ๋…ธ๋“œ๋ฅผ ์žฅ์ฐฉํ•˜๊ณ  ์žˆ๊ธฐ ๋•Œ๋ฌธ์—, ์ด๋“ค์„ ์ง€์›ํ•˜๊ธฐ ์œ„ํ•œ ๊ธฐ๋ฐ˜ ์†Œํ”„ํŠธ์›จ์–ด ํ”Œ๋žซํผ์˜ ๋ณต์žก๋„๋Š” ๋น ๋ฅธ ์†๋„๋กœ ์ฆ๊ฐ€ํ•˜๋Š” ์ถ”์„ธ์ด๋‹ค. ์ด์— ๋”ฐ๋ผ ๊ฐœ๋ฐœ์ž๋“ค์ด ๋ณต์žกํ•œ ์†Œํ”„ํŠธ์›จ์–ด ๊ตฌ์กฐ๋ฅผ ํšจ๊ณผ์ ์œผ๋กœ ๋‹ค๋ฃฐ ์ˆ˜ ์žˆ๋„๋ก ํ•ด์ฃผ๋Š” ํ”„๋กœ๊ทธ๋ž˜๋ฐ ํ”„๋ ˆ์ž„์›Œํฌ์˜ ํ•„์š”์„ฑ์ด ๋Œ€๋‘๋˜๊ณ  ์žˆ๋‹ค. ๋ณธ ํ•™์œ„๋…ผ๋ฌธ์€ autonomous machine์˜ ๊ฐœ๋ฐœ ๊ณผ์ •์—์„œ ๋ฐœ์ƒํ•˜๋Š” ๋ฌธ์ œ๋“ค์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•œ ๊ทธ๋ž˜ํ”ฝ ๊ธฐ๋ฐ˜ ํ”„๋กœ๊ทธ๋ž˜๋ฐ ํ”„๋ ˆ์ž„์›Œํฌ์ธ Splash๋ฅผ ์ œ์•ˆํ•œ๋‹ค. Splash๋ผ๋Š” ์ด๋ฆ„์€ stream processing language for autonomous machine์—์„œ ์•ž์˜ ์„ธ ๋‹จ์–ด์˜ ์ฒซ ๋ฌธ์ž๋“ค์„ ๋”ฐ์„œ ์ง€์–ด์กŒ๋‹ค. ์ด ์ด๋ฆ„์€ ๋ฌผ๊ณผ ๊ฐ™์ด ํ๋ฅด๋Š” ์ŠคํŠธ๋ฆผ ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค๋ฃจ๊ธฐ ์œ„ํ•œ ํ”„๋กœ๊ทธ๋ž˜๋ฐ ์–ธ์–ด์™€ ๋Ÿฐํƒ€์ž„ ์‹œ์Šคํ…œ์„ ๊ฐœ๋ฐœํ•˜๊ฒ ๋‹ค๋Š” ์˜๋„๋ฅผ ๊ฐ€์ง„๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋ณต์žกํ•œ ์†Œํ”„ํŠธ์›จ์–ด ๊ตฌ์กฐ๋ฅผ ํšจ๊ณผ์ ์œผ๋กœ ๋‹ค๋ฃจ๊ธฐ ์œ„ํ•ด ๋„ค ๊ฐ€์ง€ ๋””์ž์ธ ๋ชฉํ‘œ๋ฅผ ์„ค์ •ํ•œ๋‹ค. ์ฒซ์งธ, Splash๋Š” ๊ฐœ๋ฐœ์ž์—๊ฒŒ ์„ธ๋ถ€์ ์ธ ๊ตฌํ˜„ ์ด์Šˆ๋ฅผ ์ˆจ๊ธฐ๊ณ , ์‰ฝ๊ฒŒ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ํ”„๋กœ๊ทธ๋ž˜๋ฐ ์ถ”์ƒํ™”๋ฅผ ์ œ๊ณตํ•˜์—ฌ์•ผ ํ•œ๋‹ค. ๋‘˜์งธ, Splash๋Š” machine intelligence๋ฅผ ์œ„ํ•œ ์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆผ ์ฒ˜๋ฆฌ๋ฅผ ์ง€์›ํ•  ์ˆ˜ ์žˆ์–ด์•ผ ํ•œ๋‹ค. ์…‹์งธ, Splash๋Š” ์‹ค์‹œ๊ฐ„ ์ œ์–ด ์‹œ์Šคํ…œ์—์„œ ๋„๋ฆฌ ์‚ฌ์šฉ๋˜๋Š” ์„ผ์„œ ํ“จ์ „, ๋ชจ๋“œ ๋ณ€๊ฒฝ, ์˜ˆ์™ธ ์ฒ˜๋ฆฌ์™€ ๊ฐ™์€ ๊ธฐ๋Šฅ๋“ค์„ ์œ„ํ•œ ์ง€์›์„ ์ œ๊ณตํ•˜์—ฌ์•ผ ํ•œ๋‹ค. ๋„ท์งธ, Splash๋Š” ์ด๊ธฐ์ข… ๋ฉ€ํ‹ฐ์ฝ”์–ด ๋ถ„์‚ฐ ์ปดํ“จํŒ… ํ”Œ๋žซํผ์—์„œ ์ˆ˜ํ–‰๋˜๋Š” ์†Œํ”„ํŠธ์›จ์–ด ์‹œ์Šคํ…œ์˜ ์„ฑ๋Šฅ ์ตœ์ ํ™”๋ฅผ ์ง€์›ํ•˜์—ฌ์•ผ ํ•œ๋‹ค. Splash๋Š” ์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆผ ์ฒ˜๋ฆฌ๋ฅผ ์œ„ํ•ด ๊ฐœ๋ฐœ์ž๊ฐ€ ํ”„๋กœ๊ทธ๋žจ ์ƒ์— ๋ณธ์งˆ์ ์ธ end-to-end timing constraints๋ฅผ ๋ช…์‹œํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๊ฐœ๋ฐœ์ž๊ฐ€ ๋ช…์‹œํ•œ timing constraints๋ฅผ ์ธ์ง€ํ•˜๊ณ  ์ด๋ฅผ ์ตœ๋Œ€ํ•œ ์ง€์ผœ์ฃผ๋Š” best-effort ๋Ÿฐํƒ€์ž„ ์‹œ์Šคํ…œ๊ณผ timing constraints์˜ ์œ„๋ฐ˜์„ ๋ชจ๋‹ˆํ„ฐ๋งํ•˜๊ณ  ์ฒ˜๋ฆฌํ•ด์ฃผ๋Š” ์˜ˆ์™ธ ์ฒ˜๋ฆฌ ๋ฉ”์ปค๋‹ˆ์ฆ˜์„ ํ•จ๊ป˜ ์ œ๊ณตํ•œ๋‹ค. ์ด๋Ÿฐ ๋Ÿฐํƒ€์ž„ ๋ฉ”์ปค๋‹ˆ์ฆ˜๋“ค์„ ๊ตฌํ˜„ํ•˜๊ธฐ ์œ„ํ•ด Splash๋Š” ๋‘ ๊ฐ€์ง€ ๊ธฐ๋ณธ์ ์ธ timing semantics๋ฅผ ์ œ๊ณตํ•œ๋‹ค. ์ฒซ์งธ, ๋ถ„์‚ฐ ์‹œ์Šคํ…œ ์ƒ์—์„œ ๋ชจ๋“  ๋จธ์‹ ๋“ค์ด ๊ณต์œ ํ•  ์ˆ˜ ์žˆ๋Š” global time base๋ฅผ ์ œ๊ณตํ•œ๋‹ค. ๋‘˜์งธ, Splash ์ƒ์— ๋“ค์–ด์˜ค๋Š” ๋ชจ๋“  ์ŠคํŠธ๋ฆผ ๋ฐ์ดํ„ฐ ์•„์ดํ…œ์— ์ž์‹ ์˜ birthmark๋ฅผ ๊ธฐ๋กํ•˜๋„๋ก ํ•œ๋‹ค. Splash๋Š” ๋™์‹œ์„ฑ ํ”„๋กœ๊ทธ๋ž˜๋ฐ์„ ์ง€์›ํ•˜๊ธฐ ์œ„ํ•œ ๋ฉ€ํ‹ฐ ์“ฐ๋ ˆ๋””๋“œ ์ฒ˜๋ฆฌ ๋ชจ๋ธ์„ ์ œ๊ณตํ•œ๋‹ค. Splash ํ”„๋กœ๊ทธ๋ž˜๋จธ๋Š” sthread๋ผ๋Š” ๋…ผ๋ฆฌ์ ์ธ ์ˆ˜ํ–‰ ๋‹จ์œ„๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ”„๋กœ๊ทธ๋žจ์„ ๊ฐœ๋ฐœํ•  ์ˆ˜ ์žˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  Splash๋Š” sthread๋“ค์„ ์‹ค์ œ ์šด์˜์ฒด์ œ์˜ ์ˆ˜ํ–‰ ๋‹จ์œ„์ธ ํ”„๋กœ์„ธ์Šค์™€ ์“ฐ๋ ˆ๋“œ์—๊ฒŒ ํ• ๋‹นํ•˜๋Š” ๊ณผ์ •์„ ๋•๊ธฐ ์œ„ํ•œ ๋นŒ๋“œ ์œ ๋‹›์ด๋ผ๋Š” language construct๋ฅผ ์ œ๊ณตํ•œ๋‹ค. Splash๋Š” timing semantics์™€ ๋ฉ€ํ‹ฐ ์“ฐ๋ ˆ๋””๋“œ ์ฒ˜๋ฆฌ ๋ชจ๋ธ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆผ ์ฒ˜๋ฆฌ์™€ ์‹ค์‹œ๊ฐ„ ์ œ์–ด ์‹œ์Šคํ…œ์„ ์ง€์›ํ•˜๊ธฐ ์œ„ํ•œ ์„ธ ๊ฐ€์ง€ language semantics๋ฅผ ์ถ”๊ฐ€๋กœ ์ง€์›ํ•œ๋‹ค. ์ฒซ์งธ๋Š” ์ŠคํŠธ๋ฆผ ๋ฐ์ดํ„ฐ์˜ ํ†ต์‹ ์ด๋‚˜ ์ฒ˜๋ฆฌ ์ง€์—ฐ์œผ๋กœ ์ธํ•ด ๋ฐœ์ƒํ•˜๋Š” ์ง€ํ„ฐ๋‚˜ ๋ฐ”์šด๋“œ ๋˜์ง€ ์•Š๋Š” ํ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•œ rate ์ œ์–ด semantics์ด๋‹ค. ๋‘˜์งธ๋Š” ์„ผ์„œ ํ“จ์ „ ๊ณผ์ •์—์„œ ์‹œ๊ฐ„์ ์œผ๋กœ ๋™๊ธฐํ™”๋˜์ง€ ์•Š์€ ์„ผ์„œ ์ž…๋ ฅ๋“ค๋กœ ์ธํ•œ ํƒ€์ด๋ฐ ์ด์Šˆ๋“ค์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•œ ํ“จ์ „ semantics์ด๋‹ค. ๋งˆ์ง€๋ง‰์€ ๊ฐ€๋ณ€์ ์ธ ์ œ์–ด ์‹œ์Šคํ…œ์˜ ์š”๊ตฌ์‚ฌํ•ญ์„ ์ถฉ์กฑ์‹œํ‚ค๊ธฐ ์œ„ํ•ด ์ˆ˜ํ–‰ ๋กœ์ง์˜ ๋ณ€๊ฒฝ์„ ์ง€์›ํ•˜๋Š” ๋ชจ๋“œ ๋ณ€๊ฒฝ semantics์ด๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๊ฐ๊ฐ์˜ language semantics๋ฅผ ๊ตฌ์ฒด์ ์œผ๋กœ ์„ค๋ช…ํ•˜๊ณ , ์ด๋ฅผ ์‹คํ˜„ํ•˜๊ธฐ ์œ„ํ•œ ๋Ÿฐํƒ€์ž„ ๋ฉ”์ปค๋‹ˆ์ฆ˜์„ ์„ค๊ณ„ํ•˜๊ณ  ๊ตฌํ˜„ํ•œ๋‹ค. Splash์˜ ํšจ์šฉ์„ฑ์„ ๊ฒ€์ฆํ•˜๊ธฐ ์œ„ํ•ด์„œ, ๋ณธ ๋…ผ๋ฌธ์€ Splash๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ LKAS ์‘์šฉ์„ ๊ฐœ๋ฐœํ•˜๊ณ  ์ด๋ฅผ Splash ๋Ÿฐํƒ€์ž„ ์‹œ์Šคํ…œ ์ƒ์—์„œ ์ˆ˜ํ–‰์‹œํ‚ค๋ฉฐ ์‹คํ—˜์„ ์ง„ํ–‰ํ•˜์˜€๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” rate ์ œ์–ด ๋ฉ”์ปค๋‹ˆ์ฆ˜, ์„ผ์„œ ํ“จ์ „ ๋ฉ”์ปค๋‹ˆ์ฆ˜, ๋ชจ๋“œ ๋ณ€๊ฒฝ ๋ฉ”์ปค๋‹ˆ์ฆ˜, ๋นŒ๋“œ ์œ ๋‹› ๊ธฐ๋ฐ˜ allocation์„ ๊ฐ๊ฐ ์„ ์ •๋œ ์„ฑ๋Šฅ ์ง€ํ‘œ๋“ค์„ ์‚ฌ์šฉํ•˜์—ฌ ๊ฒ€์ฆํ•˜์˜€๋‹ค. ์ฒซ์งธ, Splash์˜ rate ์ œ์–ด๊ธฐ๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ์ง€ํ„ฐ๊ฐ€ 30.61ms์—์„œ 1.66ms๋กœ ๊ฐ์†Œ๋˜์—ˆ๊ณ , ์ด๋กœ ์ธํ•ด ์ฃผํ–‰ ์ฐจ๋Ÿ‰์˜ ์ธก๋ฉด ํŽธ์ฐจ์™€ ๋ฐฉํ–ฅ๊ฐ์ด ๊ฐ๊ฐ 0.180m์—์„œ 0.016m, 0.043rad์—์„œ 0.008rad์œผ๋กœ ๊ฐœ์„ ๋œ๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธํ•˜์˜€๋‹ค. ๋‘˜์งธ, ์„ผ์„œ ํ“จ์ „์„ ์œ„ํ•ด ์ œ์•ˆ๋œ ํ“จ์ „ ์—ฐ์‚ฐ์ž๊ฐ€ ์„ค๊ณ„๋œ ์˜๋„๋Œ€๋กœ ์ •์ƒ ๋™์ž‘ํ•˜๊ณ , ํ‰๊ท  7us์˜ ๋‚ฎ์€ ์˜ค๋ฒ„ํ—ค๋“œ๋งŒ์„ ์œ ๋ฐœํ•œ๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธํ•˜์˜€๋‹ค. ์…‹์งธ, ๋ชจ๋“œ ๋ณ€๊ฒฝ ๊ธฐ๋Šฅ์˜ ์ •์ƒ ๋™์ž‘์„ ๊ฒ€์ฆํ•˜์˜€๊ณ  ๊ทธ ๊ณผ์ •์—์„œ ๋ฐœ์ƒํ•˜๋Š” ์‹œ๊ฐ„์  ์˜ค๋ฒ„ํ—ค๋“œ๋Š” ํ‰๊ท  0.53ms์— ๋ถˆ๊ณผํ•˜์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ, synthetic workload์— ๋Œ€ํ•ด ์ปดํฌ๋„ŒํŠธ๋“ค์— ๋งคํ•‘๋œ ๋นŒ๋“œ ์œ ๋‹› ๊ฐœ์ˆ˜๋ฅผ 1๊ฐœ, 2๊ฐœ, 4๊ฐœ, 8๊ฐœ๋กœ ์ฆ๊ฐ€์‹œํ‚ด์— ๋”ฐ๋ผ ํ‰๊ท  end-to-end ์ง€์—ฐ ์‹œ๊ฐ„์€ 75.79us, 330.80us, 591.87us, 2022.96us๋กœ ์ฆ๊ฐ€ํ•˜๋Š” ๊ฒƒ์„ ํ™•์ธํ•˜์˜€๋‹ค. ์ด๋Ÿฌํ•œ ๊ฒฐ๊ณผ๋“ค์€ ๋ณธ ๋…ผ๋ฌธ์—์„œ ์ œ์•ˆํ•˜๋Š” language semantics์™€ ๋Ÿฐํƒ€์ž„ ๋ฉ”์ปค๋‹ˆ์ฆ˜๋“ค์ด ์˜๋„๋Œ€๋กœ ์„ค๊ณ„, ๊ตฌํ˜„๋˜์—ˆ๊ณ , ์ด๋ฅผ ํ†ตํ•ด autonomous machine์˜ ์‘์šฉ๋“ค์„ ํšจ๊ณผ์ ์œผ๋กœ ๊ฐœ๋ฐœํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ๊ฒƒ์„ ๋ณด์—ฌ์ค€๋‹ค.Chapter 1 Introduction p.1 1.1 Motivation p.2 1.2 Splash Overview p.5 1.3 Organization of This Dissertation p.9 Chapter 2 Related Work p.10 2.1 Kahn Process Network p.10 2.2 Firing Rule Applied to a Process p.13 2.3 Programming Framework for an Autonomous Machine p.14 2.4 Runtime Software for an Autonomous Machine p.16 2.5 Rate Control p.18 2.5.1 Traffic Shaping p.20 2.5.2 Traffic Policing p.22 2.6 Sensor Fusion p.23 2.6.1 Measurement Fusion p.24 2.6.2 Situation Fusion p.27 2.7 Mode Change p.30 2.7.1 Synchronous Mode Change p.32 2.7.2 Asynchronous Mode Change p.32 Chapter 3 Motivation and Contributions p.34 3.1 Problem Description p.34 3.2 Limitations of Kahn Process Network p.36 3.3 Contributions of this Dissertation p.38 Chapter 4 Underlying Timing Semantics of Splash p.41 4.1 End-to-End Timing Constraints p.41 4.2 Global Time Base and In-order Delivery p.42 4.3 Integrating Three Distinct Computing Models p.43 Chapter 5 Splash Language Constructs p.45 5.1 Processing Component p.46 5.2 Port p.49 5.3 Channel and Clink p.52 5.4 Fusion Operator p.54 5.5 Factory and Mode Change p.60 5.6 Build Unit p.65 5.7 Exception Handling p.67 Chapter 6 Splash Runtime Mechanisms p.69 6.1 Rate Control Mechanism p.69 6.2 Sensor Fusion Mechanism p.70 6.3 Mode Change Mechanism p.77 Chapter 7 Code Generation and Runtime System p.80 7.1 Build Unit-based Allocation p.80 7.2 Code Generation Template p.82 7.3 Splash Runtime System p.84 Chapter 8 Experimental Evaluation p.86 8.1 LKAS Program p.86 8.2 Experimental Environment p.91 8.3 Evaluating Rate Control p.92 8.4 Evaluating Sensor Fusion p.96 8.5 Evaluating Mode Change p.97 8.6 Evaluating Build Unit-based Allocation p.99 Chapter 9 Conclusion p.102 Bibliography p.104 Abstract in Korean p.113Docto

    On the Future of Cloud Engineering

    Get PDF
    Ever since the commercial offerings of the Cloud started appearing in 2006, the landscape of cloud computing has been undergoing remarkable changes with the emergence of many different types of service offerings, developer productivity enhancement tools, and new application classes as well as the manifestation of cloud functionality closer to the user at the edge. The notion of utility computing, however, has remained constant throughout its evolution, which means that cloud users always seek to save costs of leasing cloud resources while maximizing their use. On the other hand, cloud providers try to maximize their profits while assuring service-level objectives of the cloud-hosted applications and keeping operational costs low. All these outcomes require systematic and sound cloud engineering principles. The aim of this paper is to highlight the importance of cloud engineering, survey the landscape of best practices in cloud engineering and its evolution, discuss many of the existing cloud engineering advances, and identify both the inherent technical challenges and research opportunities for the future of cloud computing in general and cloud engineering in particular

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 โ€œHigh-Performance Modelling and Simulation for Big Data Applications (cHiPSet)โ€œ project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction raises to have a better discernment of the domain at hand, their representation gets increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for their members and distinguished guests to openly discuss novel perspectives and topics of interests for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 โ€œHigh-Performance Modelling and Simulation for Big Data Applications (cHiPSet)โ€œ project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction raises to have a better discernment of the domain at hand, their representation gets increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for their members and distinguished guests to openly discuss novel perspectives and topics of interests for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications

    Robust and stable region-of-interest tomographic reconstruction using a robust width prior

    Get PDF
    Region-of-interest computed tomography (ROI CT) aims at reconstructing a region within the field of view by using only ROI-focused projections. The solution of this inverse problem is challenging and methods of tomographic reconstruction that are designed to work with full projection data may perform poorly or fail when applied to this setting. In this work, we study the ROI CT problem in the presence of measurement noise and formulate the reconstruction problem by relaxing data fidelity and consistency requirements. Under the assumption of a robust width prior that provides a form of stability for data satisfying appropriate sparsity-inducing norms, we derive reconstruction performance guarantees and controllable error bounds. Based on this theoretical setting, we introduce a novel iterative reconstruction algorithm from ROI-focused projection data that is guaranteed to converge with controllable error while satisfying predetermined fidelity and consistency tolerances. Numerical tests on experimental data show that our algorithm for ROI CT is competitive with state-of-the-art methods especially when the ROI radius is small
    corecore