2,161 research outputs found

    Research and Education in Computational Science and Engineering

    Get PDF
    Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments---including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers---is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade.Comment: Major revision, to appear in SIAM Revie

    Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond

    Full text link
    In this and a set of companion whitepapers, the USQCD Collaboration lays out a program of science and computing for lattice gauge theory. These whitepapers describe how calculation using lattice QCD (and other gauge theories) can aid the interpretation of ongoing and upcoming experiments in particle and nuclear physics, as well as inspire new ones.Comment: 44 pages. 1 of USQCD whitepapers

    Machine Learning for Fluid Mechanics

    Full text link
    The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from field measurements, experiments and large-scale simulations at multiple spatiotemporal scales. Machine learning offers a wealth of techniques to extract information from data that could be translated into knowledge about the underlying fluid mechanics. Moreover, machine learning algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of past history, current developments, and emerging opportunities of machine learning for fluid mechanics. It outlines fundamental machine learning methodologies and discusses their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that considers data as an inherent part of modeling, experimentation, and simulation. Machine learning provides a powerful information processing framework that can enrich, and possibly even transform, current lines of fluid mechanics research and industrial applications.Comment: To appear in the Annual Reviews of Fluid Mechanics, 202

    Computational Physics on Graphics Processing Units

    Full text link
    The use of graphics processing units for scientific computations is an emerging strategy that can significantly speed up various different algorithms. In this review, we discuss advances made in the field of computational physics, focusing on classical molecular dynamics, and on quantum simulations for electronic structure calculations using the density functional theory, wave function techniques, and quantum field theory.Comment: Proceedings of the 11th International Conference, PARA 2012, Helsinki, Finland, June 10-13, 201

    Cooperative Path-Planning for Multi-Vehicle Systems

    Get PDF
    In this paper, we propose a collision avoidance algorithm for multi-vehicle systems, which is a common problem in many areas, including navigation and robotics. In dynamic environments, vehicles may become involved in potential collisions with each other, particularly when the vehicle density is high and the direction of travel is unrestricted. Cooperatively planning vehicle movement can effectively reduce and fairly distribute the detour inconvenience before subsequently returning vehicles to their intended paths. We present a novel method of cooperative path planning for multi-vehicle systems based on reinforcement learning to address this problem as a decision process. A dynamic system is described as a multi-dimensional space formed by vectors as states to represent all participating vehiclesโ€™ position and orientation, whilst considering the kinematic constraints of the vehicles. Actions are defined for the system to transit from one state to another. In order to select appropriate actions whilst satisfying the constraints of path smoothness, constant speed and complying with a minimum distance between vehicles, an approximate value function is iteratively developed to indicate the desirability of every state-action pair from the continuous state space and action space. The proposed scheme comprises two phases. The convergence of the value function takes place in the former learning phase, and it is then used as a path planning guideline in the subsequent action phase. This paper summarizes the concept and methodologies used to implement this online cooperative collision avoidance algorithm and presents results and analysis regarding how this cooperative scheme improves upon two baseline schemes where vehicles make movement decisions independently

    ๋ณ‘๋ ฌ ๋ฐ ๋ถ„์‚ฐ ์ž„๋ฒ ๋””๋“œ ์‹œ์Šคํ…œ์„ ์œ„ํ•œ ๋ชจ๋ธ ๊ธฐ๋ฐ˜ ์ฝ”๋“œ ์ƒ์„ฑ ํ”„๋ ˆ์ž„์›Œํฌ

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ)--์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› :๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€,2020. 2. ํ•˜์ˆœํšŒ.์†Œํ”„ํŠธ์›จ์–ด ์„ค๊ณ„ ์ƒ์‚ฐ์„ฑ ๋ฐ ์œ ์ง€๋ณด์ˆ˜์„ฑ์„ ํ–ฅ์ƒ์‹œํ‚ค๊ธฐ ์œ„ํ•ด ๋‹ค์–‘ํ•œ ์†Œํ”„ํŠธ์›จ์–ด ๊ฐœ๋ฐœ ๋ฐฉ๋ฒ•๋ก ์ด ์ œ์•ˆ๋˜์—ˆ์ง€๋งŒ, ๋Œ€๋ถ€๋ถ„์˜ ์—ฐ๊ตฌ๋Š” ์‘์šฉ ์†Œํ”„ํŠธ์›จ์–ด๋ฅผ ํ•˜๋‚˜์˜ ํ”„๋กœ์„ธ์„œ์—์„œ ๋™์ž‘์‹œํ‚ค๋Š” ๋ฐ์— ์ดˆ์ ์„ ๋งž์ถ”๊ณ  ์žˆ๋‹ค. ๋˜ํ•œ, ์ž„๋ฒ ๋””๋“œ ์‹œ์Šคํ…œ์„ ๊ฐœ๋ฐœํ•˜๋Š” ๋ฐ์— ํ•„์š”ํ•œ ์ง€์—ฐ์ด๋‚˜ ์ž์› ์š”๊ตฌ ์‚ฌํ•ญ์— ๋Œ€ํ•œ ๋น„๊ธฐ๋Šฅ์  ์š”๊ตฌ ์‚ฌํ•ญ์„ ๊ณ ๋ คํ•˜์ง€ ์•Š๊ณ  ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ์ผ๋ฐ˜์ ์ธ ์†Œํ”„ํŠธ์›จ์–ด ๊ฐœ๋ฐœ ๋ฐฉ๋ฒ•๋ก ์„ ์ž„๋ฒ ๋””๋“œ ์†Œํ”„ํŠธ์›จ์–ด๋ฅผ ๊ฐœ๋ฐœํ•˜๋Š” ๋ฐ์— ์ ์šฉํ•˜๋Š” ๊ฒƒ์€ ์ ํ•ฉํ•˜์ง€ ์•Š๋‹ค. ์ด ๋…ผ๋ฌธ์—์„œ๋Š” ๋ณ‘๋ ฌ ๋ฐ ๋ถ„์‚ฐ ์ž„๋ฒ ๋””๋“œ ์‹œ์Šคํ…œ์„ ๋Œ€์ƒ์œผ๋กœ ํ•˜๋Š” ์†Œํ”„ํŠธ์›จ์–ด๋ฅผ ๋ชจ๋ธ๋กœ ํ‘œํ˜„ํ•˜๊ณ , ์ด๋ฅผ ์†Œํ”„ํŠธ์›จ์–ด ๋ถ„์„์ด๋‚˜ ๊ฐœ๋ฐœ์— ํ™œ์šฉํ•˜๋Š” ๊ฐœ๋ฐœ ๋ฐฉ๋ฒ•๋ก ์„ ์†Œ๊ฐœํ•œ๋‹ค. ์šฐ๋ฆฌ์˜ ๋ชจ๋ธ์—์„œ ์‘์šฉ ์†Œํ”„ํŠธ์›จ์–ด๋Š” ๊ณ„์ธต์ ์œผ๋กœ ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ๋Š” ์—ฌ๋Ÿฌ ๊ฐœ์˜ ํƒœ์Šคํฌ๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ์œผ๋ฉฐ, ํ•˜๋“œ์›จ์–ด ํ”Œ๋žซํผ๊ณผ ๋…๋ฆฝ์ ์œผ๋กœ ๋ช…์„ธํ•œ๋‹ค. ํƒœ์Šคํฌ ๊ฐ„์˜ ํ†ต์‹  ๋ฐ ๋™๊ธฐํ™”๋Š” ๋ชจ๋ธ์ด ์ •์˜ํ•œ ๊ทœ์•ฝ์ด ์ •ํ•ด์ ธ ์žˆ๊ณ , ์ด๋Ÿฌํ•œ ๊ทœ์•ฝ์„ ํ†ตํ•ด ์‹ค์ œ ํ”„๋กœ๊ทธ๋žจ์„ ์‹คํ–‰ํ•˜๊ธฐ ์ „์— ์†Œํ”„ํŠธ์›จ์–ด ์—๋Ÿฌ๋ฅผ ์ •์  ๋ถ„์„์„ ํ†ตํ•ด ํ™•์ธํ•  ์ˆ˜ ์žˆ๊ณ , ์ด๋Š” ์‘์šฉ์˜ ๊ฒ€์ฆ ๋ณต์žก๋„๋ฅผ ์ค„์ด๋Š” ๋ฐ์— ๊ธฐ์—ฌํ•œ๋‹ค. ์ง€์ •ํ•œ ํ•˜๋“œ์›จ์–ด ํ”Œ๋žซํผ์—์„œ ๋™์ž‘ํ•˜๋Š” ํ”„๋กœ๊ทธ๋žจ์€ ํƒœ์Šคํฌ๋“ค์„ ํ”„๋กœ์„ธ์„œ์— ๋งคํ•‘ํ•œ ์ดํ›„์— ์ž๋™์ ์œผ๋กœ ํ•ฉ์„ฑํ•  ์ˆ˜ ์žˆ๋‹ค. ์œ„์˜ ๋ชจ๋ธ ๊ธฐ๋ฐ˜ ์†Œํ”„ํŠธ์›จ์–ด ๊ฐœ๋ฐœ ๋ฐฉ๋ฒ•๋ก ์—์„œ ์‚ฌ์šฉํ•˜๋Š” ํ”„๋กœ๊ทธ๋žจ ํ•ฉ์„ฑ๊ธฐ๋ฅผ ๋ณธ ๋…ผ๋ฌธ์—์„œ ์ œ์•ˆํ•˜์˜€๋Š”๋ฐ, ๋ช…์„ธํ•œ ํ”Œ๋žซํผ ์š”๊ตฌ ์‚ฌํ•ญ์„ ๋ฐ”ํƒ•์œผ๋กœ ๋ณ‘๋ ฌ ๋ฐ ๋ถ„์‚ฐ ์ž„๋ฒ ๋””๋“œ ์‹œ์Šคํ…œ์„์—์„œ ๋™์ž‘ํ•˜๋Š” ์ฝ”๋“œ๋ฅผ ์ƒ์„ฑํ•œ๋‹ค. ์—ฌ๋Ÿฌ ๊ฐœ์˜ ์ •ํ˜•์  ๋ชจ๋ธ๋“ค์„ ๊ณ„์ธต์ ์œผ๋กœ ํ‘œํ˜„ํ•˜์—ฌ ์‘์šฉ์˜ ๋™์  ํ–‰ํƒœ๋ฅผ ๋‚˜ํƒ€๊ณ , ํ•ฉ์„ฑ๊ธฐ๋Š” ์—ฌ๋Ÿฌ ๋ชจ๋ธ๋กœ ๊ตฌ์„ฑ๋œ ๊ณ„์ธต์ ์ธ ๋ชจ๋ธ๋กœ๋ถ€ํ„ฐ ๋ณ‘๋ ฌ์„ฑ์„ ๊ณ ๋ คํ•˜์—ฌ ํƒœ์Šคํฌ๋ฅผ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ๋‹ค. ๋˜ํ•œ, ํ”„๋กœ๊ทธ๋žจ ํ•ฉ์„ฑ๊ธฐ์—์„œ ๋‹ค์–‘ํ•œ ํ”Œ๋žซํผ์ด๋‚˜ ๋„คํŠธ์›Œํฌ๋ฅผ ์ง€์›ํ•  ์ˆ˜ ์žˆ๋„๋ก ์ฝ”๋“œ๋ฅผ ๊ด€๋ฆฌํ•˜๋Š” ๋ฐฉ๋ฒ•๋„ ๋ณด์—ฌ์ฃผ๊ณ  ์žˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ ์ œ์‹œํ•˜๋Š” ์†Œํ”„ํŠธ์›จ์–ด ๊ฐœ๋ฐœ ๋ฐฉ๋ฒ•๋ก ์€ 6๊ฐœ์˜ ํ•˜๋“œ์›จ์–ด ํ”Œ๋žซํผ๊ณผ 3 ์ข…๋ฅ˜์˜ ๋„คํŠธ์›Œํฌ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋Š” ์‹ค์ œ ๊ฐ์‹œ ์†Œํ”„ํŠธ์›จ์–ด ์‹œ์Šคํ…œ ์‘์šฉ ์˜ˆ์ œ์™€ ์ด์ข… ๋ฉ€ํ‹ฐ ํ”„๋กœ์„ธ์„œ๋ฅผ ํ™œ์šฉํ•˜๋Š” ์›๊ฒฉ ๋”ฅ ๋Ÿฌ๋‹ ์˜ˆ์ œ๋ฅผ ์ˆ˜ํ–‰ํ•˜์—ฌ ๊ฐœ๋ฐœ ๋ฐฉ๋ฒ•๋ก ์˜ ์ ์šฉ ๊ฐ€๋Šฅ์„ฑ์„ ์‹œํ—˜ํ•˜์˜€๋‹ค. ๋˜ํ•œ, ํ”„๋กœ๊ทธ๋žจ ํ•ฉ์„ฑ๊ธฐ๊ฐ€ ์ƒˆ๋กœ์šด ํ”Œ๋žซํผ์ด๋‚˜ ๋„คํŠธ์›Œํฌ๋ฅผ ์ง€์›ํ•˜๊ธฐ ์œ„ํ•ด ํ•„์š”๋กœ ํ•˜๋Š” ๊ฐœ๋ฐœ ๋น„์šฉ๋„ ์‹ค์ œ ์ธก์ • ๋ฐ ์˜ˆ์ธกํ•˜์—ฌ ์ƒ๋Œ€์ ์œผ๋กœ ์ ์€ ๋…ธ๋ ฅ์œผ๋กœ ์ƒˆ๋กœ์šด ํ”Œ๋žซํผ์„ ์ง€์›ํ•  ์ˆ˜ ์žˆ์Œ์„ ํ™•์ธํ•˜์˜€๋‹ค. ๋งŽ์€ ์ž„๋ฒ ๋””๋“œ ์‹œ์Šคํ…œ์—์„œ ์˜ˆ์ƒ์น˜ ๋ชปํ•œ ํ•˜๋“œ์›จ์–ด ์—๋Ÿฌ์— ๋Œ€ํ•ด ๊ฒฐํ•จ์„ ๊ฐ๋‚ดํ•˜๋Š” ๊ฒƒ์„ ํ•„์š”๋กœ ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ๊ฒฐํ•จ ๊ฐ๋‚ด์— ๋Œ€ํ•œ ์ฝ”๋“œ๋ฅผ ์ž๋™์œผ๋กœ ์ƒ์„ฑํ•˜๋Š” ์—ฐ๊ตฌ๋„ ์ง„ํ–‰ํ•˜์˜€๋‹ค. ๋ณธ ๊ธฐ๋ฒ•์—์„œ ๊ฒฐํ•จ ๊ฐ๋‚ด ์„ค์ •์— ๋”ฐ๋ผ ํƒœ์Šคํฌ ๊ทธ๋ž˜ํ”„๋ฅผ ์ˆ˜์ •ํ•˜๋Š” ๋ฐฉ์‹์„ ํ™œ์šฉํ•˜์˜€์œผ๋ฉฐ, ๊ฒฐํ•จ ๊ฐ๋‚ด์˜ ๋น„๊ธฐ๋Šฅ์  ์š”๊ตฌ ์‚ฌํ•ญ์„ ์‘์šฉ ๊ฐœ๋ฐœ์ž๊ฐ€ ์‰ฝ๊ฒŒ ์ ์šฉํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•˜์˜€๋‹ค. ๋˜ํ•œ, ๊ฒฐํ•จ ๊ฐ๋‚ด ์ง€์›ํ•˜๋Š” ๊ฒƒ๊ณผ ๊ด€๋ จํ•˜์—ฌ ์‹ค์ œ ์ˆ˜๋™์œผ๋กœ ๊ตฌํ˜„ํ–ˆ์„ ๊ฒฝ์šฐ์™€ ๋น„๊ตํ•˜์˜€๊ณ , ๊ฒฐํ•จ ์ฃผ์ž… ๋„๊ตฌ๋ฅผ ์ด์šฉํ•˜์—ฌ ๊ฒฐํ•จ ๋ฐœ์ƒ ์‹œ๋‚˜๋ฆฌ์˜ค๋ฅผ ์žฌํ˜„ํ•˜๊ฑฐ๋‚˜, ์ž„์˜๋กœ ๊ฒฐํ•จ์„ ์ฃผ์ž…ํ•˜๋Š” ์‹คํ—˜์„ ์ˆ˜ํ–‰ํ•˜์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ ๊ฒฐํ•จ ๊ฐ๋‚ด๋ฅผ ์‹คํ—˜ํ•  ๋•Œ์— ํ™œ์šฉํ•œ ๊ฒฐํ•จ ์ฃผ์ž… ๋„๊ตฌ๋Š” ๋ณธ ๋…ผ๋ฌธ์˜ ๋˜ ๋‹ค๋ฅธ ๊ธฐ์—ฌ ์‚ฌํ•ญ ์ค‘ ํ•˜๋‚˜๋กœ ๋ฆฌ๋ˆ…์Šค ํ™˜๊ฒฝ์œผ๋กœ ๋Œ€์ƒ์œผ๋กœ ์‘์šฉ ์˜์—ญ ๋ฐ ์ปค๋„ ์˜์—ญ์— ๊ฒฐํ•จ์„ ์ฃผ์ž…ํ•˜๋Š” ๋„๊ตฌ๋ฅผ ๊ฐœ๋ฐœํ•˜์˜€๋‹ค. ์‹œ์Šคํ…œ์˜ ๊ฒฌ๊ณ ์„ฑ์„ ๊ฒ€์ฆํ•˜๊ธฐ ์œ„ํ•ด ๊ฒฐํ•จ์„ ์ฃผ์ž…ํ•˜์—ฌ ๊ฒฐํ•จ ์‹œ๋‚˜๋ฆฌ์˜ค๋ฅผ ์žฌํ˜„ํ•˜๋Š” ๊ฒƒ์€ ๋„๋ฆฌ ์‚ฌ์šฉ๋˜๋Š” ๋ฐฉ๋ฒ•์œผ๋กœ, ๋ณธ ๋…ผ๋ฌธ์—์„œ ๊ฐœ๋ฐœ๋œ ๊ฒฐํ•จ ์ฃผ์ž… ๋„๊ตฌ๋Š” ์‹œ์Šคํ…œ์ด ๋™์ž‘ํ•˜๋Š” ๋„์ค‘์— ์žฌํ˜„ ๊ฐ€๋Šฅํ•œ ๊ฒฐํ•จ์„ ์ฃผ์ž…ํ•  ์ˆ˜ ์žˆ๋Š” ๋„๊ตฌ์ด๋‹ค. ์ปค๋„ ์˜์—ญ์—์„œ์˜ ๊ฒฐํ•จ ์ฃผ์ž…์„ ์œ„ํ•ด ๋‘ ์ข…๋ฅ˜์˜ ๊ฒฐํ•จ ์ฃผ์ž… ๋ฐฉ๋ฒ•์„ ์ œ๊ณตํ•˜๋ฉฐ, ํ•˜๋‚˜๋Š” ์ปค๋„ GNU ๋””๋ฒ„๊ฑฐ๋ฅผ ์ด์šฉํ•œ ๋ฐฉ๋ฒ•์ด๊ณ , ๋‹ค๋ฅธ ํ•˜๋‚˜๋Š” ARM ํ•˜๋“œ์›จ์–ด ๋ธŒ๋ ˆ์ดํฌํฌ์ธํŠธ๋ฅผ ํ™œ์šฉํ•œ ๋ฐฉ๋ฒ•์ด๋‹ค. ์‘์šฉ ์˜์—ญ์—์„œ ๊ฒฐํ•จ์„ ์ฃผ์ž…ํ•˜๊ธฐ ์œ„ํ•ด GDB ๊ธฐ๋ฐ˜ ๊ฒฐํ•จ ์ฃผ์ž… ๋ฐฉ๋ฒ•์„ ์ด์šฉํ•˜์—ฌ ๋™์ผ ์‹œ์Šคํ…œ ํ˜น์€ ์›๊ฒฉ ์‹œ์Šคํ…œ์˜ ์‘์šฉ์— ๊ฒฐํ•จ์„ ์ฃผ์ž…ํ•  ์ˆ˜ ์žˆ๋‹ค. ๊ฒฐํ•จ ์ฃผ์ž… ๋„๊ตฌ์— ๋Œ€ํ•œ ์‹คํ—˜์€ ODROID-XU4 ๋ณด๋“œ์—์„œ ์ง„ํ–‰ํ•˜์˜€๋‹ค.While various software development methodologies have been proposed to increase the design productivity and maintainability of software, they usually focus on the development of application software running on a single processing element, without concern about the non-functional requirements of an embedded system such as latency and resource requirements. In this thesis, we present a model-based software development method for parallel and distributed embedded systems. An application is specified as a set of tasks that follow a set of given rules for communication and synchronization in a hierarchical fashion, independently of the hardware platform. Having such rules enables us to perform static analysis to check some software errors at compile time to reduce the verification difficulty. Platform-specific program is synthesized automatically after mapping of tasks onto processing elements is determined. The program synthesizer is also proposed to generate codes which satisfies platform requirements for parallel and distributed embedded systems. As multiple models which can express dynamic behaviors can be depicted hierarchically, the synthesizer supports to manage multiple task graphs with a different hierarchy to run tasks with parallelism. Also, the synthesizer shows methods of managing codes for heterogeneous platforms and generating various communication methods. The viability of the proposed software development method is verified with a real-life surveillance application that runs on six processing elements with three remote communication methods, and remote deep learning example is conducted to use heterogeneous multiprocessing components on distributed systems. Also, supporting a new platform and network requires a small effort by measuring and estimating development costs. Since tolerance to unexpected errors is a required feature of many embedded systems, we also support an automatic fault-tolerant code generation. Fault tolerance can be applied by modifying the task graph based on the selected fault tolerance configurations, so the non-functional requirement of fault tolerance can be easily adopted by an application developer. To compare the effort of supporting fault tolerance, manual implementation of fault tolerance is performed. Also, the fault tolerance method is tested with the fault injection tool to emulate fault scenarios and inject faults randomly. Our fault injection tool, which has used for testing our fault-tolerance method, is another work of this thesis. Emulating fault scenarios by intentionally injecting faults is commonly used to test and verify the robustness of a system. To emulate faults on an embedded system, we present a run-time fault injection framework that can inject a fault on both a kernel and application layer of Linux-based systems. For injecting faults on a kernel layer, two complementary fault injection techniques are used. One is based on Kernel GNU Debugger, and the other is using a hardware breakpoint supported by the ARM architecture. For application-level fault injection, the GDB-based fault injection method is used to inject a fault on a remote application. The viability of the proposed fault injection tool is proved by real-life experiments with an ODROID-XU4 system.Chapter 1 Introduction 1 1.1 Motivation 1 1.2 Contribution 6 1.3 Dissertation Organization 8 Chapter 2 Background 9 2.1 HOPES: Hope of Parallel Embedded Software 9 2.1.1 Software Development Procedure 9 2.1.2 Components of HOPES 12 2.2 Universal Execution Model 13 2.2.1 Task Graph Specification 13 2.2.2 Dataflow specification of an Application 15 2.2.3 Task Code Specification and Generic APIs 21 2.2.4 Meta-data Specification 23 Chapter 3 Program Synthesis for Parallel and Distributed Embedded Systems 24 3.1 Motivational Example 24 3.2 Program Synthesis Overview 26 3.3 Program Synthesis from Hierarchically-mixed Models 30 3.4 Platform Code Synthesis 33 3.5 Communication Code Synthesis 36 3.6 Experiments 40 3.6.1 Development Cost of Supporting New Platforms and Networks 40 3.6.2 Program Synthesis for the Surveillance System Example 44 3.6.3 Remote GPU-accelerated Deep Learning Example 46 3.7 Document Generation 48 3.8 Related Works 49 Chapter 4 Model Transformation for Fault-tolerant Code Synthesis 56 4.1 Fault-tolerant Code Synthesis Techniques 56 4.2 Applying Fault Tolerance Techniques in HOPES 61 4.3 Experiments 62 4.3.1 Development Cost of Applying Fault Tolerance 62 4.3.2 Fault Tolerance Experiments 62 4.4 Random Fault Injection Experiments 65 4.5 Related Works 68 Chapter 5 Fault Injection Framework for Linux-based Embedded Systems 70 5.1 Background 70 5.1.1 Fault Injection Techniques 70 5.1.2 Kernel GNU Debugger 71 5.1.3 ARM Hardware Breakpoint 72 5.2 Fault Injection Framework 74 5.2.1 Overview 74 5.2.2 Architecture 75 5.2.3 Fault Injection Techniques 79 5.2.4 Implementation 83 5.3 Experiments 90 5.3.1 Experiment Setup 90 5.3.2 Performance Comparison of Two Fault Injection Methods 90 5.3.3 Bit-flip Fault Experiments 92 5.3.4 eMMC Controller Fault Experiments 94 Chapter 6 Conclusion 97 Bibliography 99 ์š” ์•ฝ 108Docto

    Hierarchical Control of the ATLAS Experiment

    Get PDF
    Control systems at High Energy Physics (HEP) experiments are becoming increasingly complex mainly due to the size, complexity and data volume associated to the front-end instrumentation. In particular, this becomes visible for the ATLAS experiment at the LHC accelerator at CERN. ATLAS will be the largest particle detector ever built, result of an international collaboration of more than 150 institutes. The experiment is composed of 9 different specialized sub-detectors that perform different tasks and have different requirements for operation. The system in charge of the safe and coherent operation of the whole experiment is called Detector Control System (DCS). This thesis presents the integration of the ATLAS DCS into a global control tree following the natural segmentation of the experiment into sub-detectors and smaller sub-systems. The integration of the many different systems composing the DCS includes issues such as: back-end organization, process model identification, fault detection, synchronization with external systems, automation of processes and supervisory control. Distributed control modeling is applied to the widely distributed devices that coexist in ATLAS. Thus, control is achieved by means of many distributed, autonomous and co-operative entities that are hierarchically organized and follow a finite-state machine logic. The key to integration of these systems lies in the so called Finite State Machine tool (FSM), which is based on two main enabling technologies: a SCADA product, and the State Manager Interface (SMI++) toolkit. The SMI++ toolkit has been already used with success in two previous HEP experiments providing functionality such as: an object-oriented language, a finite-state machine logic, an interface to develop expert systems, and a platform-independent communication protocol. This functionality is then used at all levels of the experiment operation process, ranging from the overall supervision down to device integration, enabling the overall sequencing and automation of the experiment. Although the experience gained in the past is an important input for the design of the detector's control hierarchy, further requirements arose due to the complexity and size of ATLAS. In total, around 200.000 channels will be supervised by the DCS and the final control tree will be hundreds of times bigger than any of the antecedents. Thus, in order to apply a hierarchical control model to the ATLAS DCS, a common approach has been proposed to ensure homogeneity between the large-scale distributed software ensembles of sub-detectors. A standard architecture and a human interface have been defined with emphasis on the early detection, monitoring and diagnosis of faults based on a dynamic fault-data mechanism. This mechanism relies on two parallel communication paths that manage the faults while providing a clear description of the detector conditions. The DCS information is split and handled by different types of SMI++ objects; whilst one path of objects manages the operational mode of the system, the other is to handle eventual faults. The proposed strategy has been validated through many different tests with positive results in both functionality and performance. This strategy has been successfully implemented and constitutes the ATLAS standard to build the global control tree. During the operation of the experiment, the DCS, responsible for the detector operation, must be synchronized with the data acquisition system which is in charge of the physics data taking process. The interaction between both systems has so far been limited, but becomes increasingly important as the detector nears completion. A prototype implementation, ready to be used during the sub-detector integration, has achieved data reconciliation by mapping the different segments of the data acquisition system into the DCS control tree. The adopted solution allows the data acquisition control applications to command different DCS sections independently and prevents incorrect physics data taking caused by a failure in a detector part. Finally, the human-machine interface presents and controls the DCS data in the ATLAS control room. The main challenges faced during the design and development phases were: how to support the operator in controlling this large system, how to maintain integration across many displays, and how to provide an effective navigation. These issues have been solved by combining the functionalities provided by both, the SCADA product and the FSM tool. The control hierarchy provides an intuitive structure for the organization of many different displays that are needed for the visualization of the experiment conditions. Each node in the tree represents a workspace that contains the functional information associated with its abstraction level within the hierarchy. By means of an effective navigation, any workspace of the control tree is accessible by the operator or detector expert within a common human interface layout. The interface is modular and flexible enough to be accommodated to new operational scenarios, fulfil the necessities of the different kind of users and facilitate the maintenance during the long lifetime of the detector of up to 20 years. The interface is in use since several months, and the sub-detector's control hierarchies, together with their associated displays, are currently being integrated into the common human-machine interface
    • โ€ฆ
    corecore