1,828 research outputs found

    A synthesis of logic and bio-inspired techniques in the design of dependable systems

    Get PDF
    Much of the development of model-based design and dependability analysis in the design of dependable systems, including software intensive systems, can be attributed to the application of advances in formal logic and its application to fault forecasting and verification of systems. In parallel, work on bio-inspired technologies has shown potential for the evolutionary design of engineering systems via automated exploration of potentially large design spaces. We have not yet seen the emergence of a design paradigm that effectively combines these two techniques, schematically founded on the two pillars of formal logic and biology, from the early stages of, and throughout, the design lifecycle. Such a design paradigm would apply these techniques synergistically and systematically to enable optimal refinement of new designs which can be driven effectively by dependability requirements. The paper sketches such a model-centric paradigm for the design of dependable systems, presented in the scope of the HiP-HOPS tool and technique, that brings these technologies together to realise their combined potential benefits. The paper begins by identifying current challenges in model-based safety assessment and then overviews the use of meta-heuristics at various stages of the design lifecycle covering topics that span from allocation of dependability requirements, through dependability analysis, to multi-objective optimisation of system architectures and maintenance schedules

    Model-based dependability analysis : state-of-the-art, challenges and future outlook

    Get PDF
    Abstract: Over the past two decades, the study of model-based dependability analysis has gathered significant research interest. Different approaches have been developed to automate and address various limitations of classical dependability techniques to contend with the increasing complexity and challenges of modern safety-critical system. Two leading paradigms have emerged, one which constructs predictive system failure models from component failure models compositionally using the topology of the system. The other utilizes design models - typically state automata - to explore system behaviour through fault injection. This paper reviews a number of prominent techniques under these two paradigms, and provides an insight into their working mechanism, applicability, strengths and challenges, as well as recent developments within these fields. We also discuss the emerging trends on integrated approaches and advanced analysis capabilities. Lastly, we outline the future outlook for model-based dependability analysis

    A synthesis of logic and bio-inspired techniques in the design of dependable systems

    Get PDF
    YesMuch of the development of model-based design and dependability analysis in the design of dependable systems, including software intensive systems, can be attributed to the application of advances in formal logic and its application to fault forecasting and verification of systems. In parallel, work on bio-inspired technologies has shown potential for the evolutionary design of engineering systems via automated exploration of potentially large design spaces. We have not yet seen the emergence of a design paradigm that effectively combines these two techniques, schematically founded on the two pillars of formal logic and biology, from the early stages of, and throughout, the design lifecycle. Such a design paradigm would apply these techniques synergistically and systematically to enable optimal refinement of new designs which can be driven effectively by dependability requirements. The paper sketches such a model-centric paradigm for the design of dependable systems, presented in the scope of the HiP-HOPS tool and technique, that brings these technologies together to realise their combined potential benefits. The paper begins by identifying current challenges in model-based safety assessment and then overviews the use of meta-heuristics at various stages of the design lifecycle covering topics that span from allocation of dependability requirements, through dependability analysis, to multi-objective optimisation of system architectures and maintenance schedules

    Dynamic model-based safety analysis: from state machines to temporal fault trees

    Get PDF
    Finite state transition models such as State Machines (SMs) have become a prevalent paradigm for the description of dynamic systems. Such models are well-suited to modelling the behaviour of complex systems, including in conditions of failure, and where the order in which failures and fault events occur can affect the overall outcome (e.g. total failure of the system). For the safety assessment though, the SM failure behavioural models need to be converted to analysis models like Generalised Stochastic Petri Nets (GSPNs), Markov Chains (MCs) or Fault Trees (FTs). This is particularly important if the transformed models are supported by safety analysis tools.This thesis, firstly, identifies a number of problems encountered in current safety analysis techniques based on SMs. One of the existing approaches consists of transforming the SMs to analysis-supported state-transition formalisms like GSPNs or MCs, which are very powerful in capturing the dynamic aspects and in the evaluation of safety measures. But in this approach, qualitative analysis is not encouraged; here the focus is primarily on probabilistic analysis. Qualitative analysis is particularly important when probabilistic data are not available (e.g., at early stages of design). In an alternative approach though, the generation of combinatorial, Boolean FTs has been applied to SM-based models. FTs are well-suited to qualitative analysis, but cannot capture the significance of the temporal order of events expressed by SMs. This makes the approach potentially error prone for the analysis of dynamic systems. In response, we propose a new SM-based safety analysis technique which converts SMs to Temporal Fault Trees (TFTs) using Pandora — a recent technique for introducing temporal logic to FTs. Pandora provides a set of temporal laws, which allow the significance of the SM temporal semantics to be preserved along the logical analysis, and thereby enabling a true qualitative analysis of a dynamic system. The thesis develops algorithms for conversion of SMs to TFTs. It also deals with the issue of scalability of the approach by proposing a form of compositional synthesis in which system large TFTs can be generated from individual component SMs using a process of composition. This has the dual benefits of allowing more accurate analysis of different sequences of faults, and also helping to reduce the cost of performing temporal analysis by producing smaller, more manageable TFTs via the compositionality.The thesis concludes that this approach can potentially address limitations of earlier work and thus help to improve the safety analysis of increasingly complex dynamic safety-critical systems

    Predictive Reliability and Fault Management in Exascale Systems: State of the Art and Perspectives

    Get PDF
    © ACM, 2020. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Computing Surveys, Vol. 53, No. 5, Article 95. Publication date: September 2020. https://doi.org/10.1145/3403956[EN] Performance and power constraints come together with Complementary Metal Oxide Semiconductor technology scaling in future Exascale systems. Technology scaling makes each individual transistor more prone to faults and, due to the exponential increase in the number of devices per chip, to higher system fault rates. Consequently, High-performance Computing (HPC) systems need to integrate prediction, detection, and recovery mechanisms to cope with faults efficiently. This article reviews fault detection, fault prediction, and recovery techniques in HPC systems, from electronics to system level. We analyze their strengths and limitations. Finally, we identify the promising paths to meet the reliability levels of Exascale systems.This work has received funding from the European Union's Horizon 2020 (H2020) research and innovation program under the FET-HPC Grant Agreement No. 801137 (RECIPE). Jaume Abella was also partially supported by the Ministry of Economy and Competitiveness of Spain under Contract No. TIN2015-65316-P and under Ramon y Cajal Postdoctoral Fellowship No. RYC-2013-14717, as well as by the HiPEAC Network of Excellence. Ramon Canal is partially supported by the Generalitat de Catalunya under Contract No. 2017SGR0962.Canal, R.; Hernández Luz, C.; Tornero-Gavilá, R.; Cilardo, A.; Massari, G.; Reghenzani, F.; Fornaciari, W.... (2020). Predictive Reliability and Fault Management in Exascale Systems: State of the Art and Perspectives. ACM Computing Surveys. 53(5):1-32. https://doi.org/10.1145/3403956S132535Abella, J., Hernandez, C., Quinones, E., Cazorla, F. J., Conmy, P. R., Azkarate-askasua, M., … Vardanega, T. (2015). WCET analysis methods: Pitfalls and challenges on their trustworthiness. 10th IEEE International Symposium on Industrial Embedded Systems (SIES). doi:10.1109/sies.2015.7185039E. Agullo L. Giraud A. Guermouche J. Roman and M. Zounon. 2013. Towards resilient parallel linear Krylov solvers: Recover-restart strategies. INRIA Research Report RR-8324. E. Agullo L. Giraud A. Guermouche J. Roman and M. Zounon. 2013. Towards resilient parallel linear Krylov solvers: Recover-restart strategies. INRIA Research Report RR-8324.Agullo, E., Giraud, L., Salas, P., & Zounon, M. (2016). Interpolation-Restart Strategies for Resilient Eigensolvers. SIAM Journal on Scientific Computing, 38(5), C560-C583. doi:10.1137/15m1042115Al-Qawasmeh, A. M., Pasricha, S., Maciejewski, A. A., & Siegel, H. J. (2015). Power and Thermal-Aware Workload Allocation in Heterogeneous Data Centers. IEEE Transactions on Computers, 64(2), 477-491. doi:10.1109/tc.2013.116ARM. 2017. ARM Reliability Availability and Serviceability (RAS) Specification—ARMv8 for the ARMv8-A Architecture Profile. White paper. Retrieved from https://developer.arm.com/docs/ddi0587/latest. ARM. 2017. ARM Reliability Availability and Serviceability (RAS) Specification—ARMv8 for the ARMv8-A Architecture Profile. White paper. Retrieved from https://developer.arm.com/docs/ddi0587/latest.Avizienis, A., Laprie, J.-C., Randell, B., & Landwehr, C. (2004). Basic concepts and taxonomy of dependable and secure computing. IEEE Transactions on Dependable and Secure Computing, 1(1), 11-33. doi:10.1109/tdsc.2004.2Bautista-Gomez, L., Zyulkyarov, F., Unsal, O., & McIntosh-Smith, S. (2016). Unprotected Computing: A Large-Scale Study of DRAM Raw Error Rate on a Supercomputer. SC16: International Conference for High Performance Computing, Networking, Storage and Analysis. doi:10.1109/sc.2016.54Berrocal, E., Bautista-Gomez, L., Di, S., Lan, Z., & Cappello, F. (2017). Toward General Software Level Silent Data Corruption Detection for Parallel Applications. IEEE Transactions on Parallel and Distributed Systems, 28(12), 3642-3655. doi:10.1109/tpds.2017.2735971M.-A. Breuer and A. D. Friedman. 1976. Diagnosis 8 Reliable Design of Digital Systems. Springer. M.-A. Breuer and A. D. Friedman. 1976. Diagnosis 8 Reliable Design of Digital Systems. Springer.P. Bridges K. Ferreira M. Heroux and M. Hoemmen. 2012. Fault-tolerant linear solvers via selective reliability. ArXiv e-prints June 2012. arXiv:1206.1390 [math.NA]. P. Bridges K. Ferreira M. Heroux and M. Hoemmen. 2012. Fault-tolerant linear solvers via selective reliability. ArXiv e-prints June 2012. arXiv:1206.1390 [math.NA].F. Cappello A. Geist W. Gropp S. Kale B. Kramer and M. Snir. 2014. Toward exascale resilience: 2014 update. Supercomput. Front. Innovat. 1 1 (2014). http://superfri.org/superfri/article/view/14. F. Cappello A. Geist W. Gropp S. Kale B. Kramer and M. Snir. 2014. Toward exascale resilience: 2014 update. Supercomput. Front. Innovat. 1 1 (2014). http://superfri.org/superfri/article/view/14.F. J. Cazorla L. Kosmidis E. Mezzetti C. Hernandez J. Abella and T. Vardanega. 2019. Probabilistic worst-case timing analysis: Taxonomy and comprehensive survey. ACM Comput. Surv. 52 1 Article 14 (Feb. 2019) 35 pages. DOI:https://doi.org/10.1145/3301283 F. J. Cazorla L. Kosmidis E. Mezzetti C. Hernandez J. Abella and T. Vardanega. 2019. Probabilistic worst-case timing analysis: Taxonomy and comprehensive survey. ACM Comput. Surv. 52 1 Article 14 (Feb. 2019) 35 pages. DOI:https://doi.org/10.1145/3301283Chan, C. S., Pan, B., Gross, K., Vaidyanathan, K., & Rosing, T. Š. (2014). Correcting vibration-induced performance degradation in enterprise servers. ACM SIGMETRICS Performance Evaluation Review, 41(3), 83-88. doi:10.1145/2567529.2567555Chantem, T., Hu, X. S., & Dick, R. P. (2011). Temperature-Aware Scheduling and Assignment for Hard Real-Time Applications on MPSoCs. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 19(10), 1884-1897. doi:10.1109/tvlsi.2010.2058873Chen, M. Y., Kiciman, E., Fratkin, E., Fox, A., & Brewer, E. (s. f.). Pinpoint: problem determination in large, dynamic Internet services. Proceedings International Conference on Dependable Systems and Networks. doi:10.1109/dsn.2002.1029005Chen, Z. (2011). Algorithm-based recovery for iterative methods without checkpointing. Proceedings of the 20th international symposium on High performance distributed computing - HPDC ’11. doi:10.1145/1996130.1996142Chen, Z. (2013). Online-ABFT. Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP ’13. doi:10.1145/2442516.2442533Coskun, A. K., Rosing, T. S., Mihic, K., De Micheli, G., & Leblebici, Y. (2006). Analysis and Optimization of MPSoC Reliability. Journal of Low Power Electronics, 2(1), 56-69. doi:10.1166/jolpe.2006.007G. Da Costa A. Oleksiak W. Piatek J. Salom and L. Sisó. 2015. Minimization of costs and energy consumption in a data center by a workload-based capacity management. In Energy Efficient Data Centers S. Klingert M. Chinnici and M. Rey Porto (Eds.). Springer International Publishing Cham 102--119. G. Da Costa A. Oleksiak W. Piatek J. Salom and L. Sisó. 2015. Minimization of costs and energy consumption in a data center by a workload-based capacity management. In Energy Efficient Data Centers S. Klingert M. Chinnici and M. Rey Porto (Eds.). Springer International Publishing Cham 102--119.Cupertino, L., Da Costa, G., Oleksiak, A., Pia¸tek, W., Pierson, J.-M., Salom, J., … Zilio, T. (2015). Energy-efficient, thermal-aware modeling and simulation of data centers: The CoolEmAll approach and evaluation results. Ad Hoc Networks, 25, 535-553. doi:10.1016/j.adhoc.2014.11.002Dally, W. J. (1991). Express cubes: improving the performance of k-ary n-cube interconnection networks. IEEE Transactions on Computers, 40(9), 1016-1023. doi:10.1109/12.83652Dauwe, D., Pasricha, S., Maciejewski, A. A., & Siegel, H. J. (2018). Resilience-Aware Resource Management for Exascale Computing Systems. IEEE Transactions on Sustainable Computing, 3(4), 332-345. doi:10.1109/tsusc.2018.2797890R. I. Davis and A. Burns. 2011. A survey of hard real-time scheduling for multiprocessor systems. ACM Comput. Surv. 43 4 Article 35 (Oct. 2011) 44 pages. DOI:https://doi.org/10.1145/1978802.1978814 R. I. Davis and A. Burns. 2011. A survey of hard real-time scheduling for multiprocessor systems. ACM Comput. Surv. 43 4 Article 35 (Oct. 2011) 44 pages. DOI:https://doi.org/10.1145/1978802.1978814Di, S., & Cappello, F. (2016). Adaptive Impact-Driven Detection of Silent Data Corruption for HPC Applications. IEEE Transactions on Parallel and Distributed Systems, 27(10), 2809-2823. doi:10.1109/tpds.2016.2517639Di, S., Guo, H., Gupta, R., Pershey, E. R., Snir, M., & Cappello, F. (2019). Exploring Properties and Correlations of Fatal Events in a Large-Scale HPC System. IEEE Transactions on Parallel and Distributed Systems, 30(2), 361-374. doi:10.1109/tpds.2018.2864184Di, S., Robert, Y., Vivien, F., & Cappello, F. (2017). Toward an Optimal Online Checkpoint Solution under a Two-Level HPC Checkpoint Model. IEEE Transactions on Parallel and Distributed Systems, 28(1), 244-259. doi:10.1109/tpds.2016.2546248J. Dongarra T. Herault and Y. Robert. 2015. Fault Tolerance Techniques for High-Performance Computing. Springer. J. Dongarra T. Herault and Y. Robert. 2015. Fault Tolerance Techniques for High-Performance Computing. Springer.DOWNING, S., & SOCIE, D. (1982). Simple rainflow counting algorithms. International Journal of Fatigue, 4(1), 31-40. doi:10.1016/0142-1123(82)90018-4Eghbalkhah, B., Kamal, M., Afzali-Kusha, H., Afzali-Kusha, A., Ghaznavi-Ghoushchi, M. B., & Pedram, M. (2015). Workload and temperature dependent evaluation of BTI-induced lifetime degradation in digital circuits. Microelectronics Reliability, 55(8), 1152-1162. doi:10.1016/j.microrel.2015.06.004Gottscho, M., Shoaib, M., Govindan, S., Sharma, B., Wang, D., & Gupta, P. (2017). Measuring the Impact of Memory Errors on Application  Performance. IEEE Computer Architecture Letters, 16(1), 51-55. doi:10.1109/lca.2016.2599513Greenberg, A., Hamilton, J. R., Jain, N., Kandula, S., Kim, C., Lahiri, P., … Sengupta, S. (2011). VL2. Communications of the ACM, 54(3), 95-104. doi:10.1145/1897852.1897877Heroux, M. A., Bartlett, R. A., Howle, V. E., Hoekstra, R. J., Hu, J. J., Kolda, T. G., … Stanley, K. S. (2005). An overview of the Trilinos project. ACM Transactions on Mathematical Software, 31(3), 397-423. doi:10.1145/1089014.1089021Hoffmann, G. A., Trivedi, K. S., & Malek, M. (2007). A Best Practice Guide to Resource Forecasting for Computing Systems. IEEE Transactions on Reliability, 56(4), 615-628. doi:10.1109/tr.2007.909764Hsiao, M. Y., Carter, W. C., Thomas, J. W., & Stringfellow, W. R. (1981). Reliability, Availability, and Serviceability of IBM Computer Systems: A Quarter Century of Progress. IBM Journal of Research and Development, 25(5), 453-468. doi:10.1147/rd.255.0453Hughes, G. F., Murray, J. F., Kreutz-Delgado, K., & Elkan, C. (2002). Improved disk-drive failure warnings. IEEE Transactions on Reliability, 51(3), 350-357. doi:10.1109/tr.2002.802886S. Hukerikar and C. Engelmann. 2017. Resilience design patterns: A structured approach to resilience at extreme scale. Supercomput. Front. Innov. 4 3 (2017). DOI:https://doi.org/10.14529/jsfi170301 S. Hukerikar and C. Engelmann. 2017. Resilience design patterns: A structured approach to resilience at extreme scale. Supercomput. Front. Innov. 4 3 (2017). DOI:https://doi.org/10.14529/jsfi170301Hussain, H., Malik, S. U. R., Hameed, A., Khan, S. U., Bickler, G., Min-Allah, N., … Rayes, A. (2013). A survey on resource allocation in high performance distributed computing systems. Parallel Computing, 39(11), 709-736. doi:10.1016/j.parco.2013.09.009Intel Corporation. [n.d.]. Intel Xeon Processor E7 Family: Reliability Availability and Serviceability. White paper. https://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-family-ras-server-paper.html. Intel Corporation. [n.d.]. Intel Xeon Processor E7 Family: Reliability Availability and Serviceability. White paper. https://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-family-ras-server-paper.html.Jha, S., Formicola, V., Martino, C. D., Dalton, M., Kramer, W. T., Kalbarczyk, Z., & Iyer, R. K. (2018). Resiliency of HPC Interconnects: A Case Study of Interconnect Failures and Recovery in Blue Waters. IEEE Transactions on Dependable and Secure Computing, 15(6), 915-930. doi:10.1109/tdsc.2017.2737537Kiciman, E., & Fox, A. (2005). Detecting Application-Level Failures in Component-Based Internet Services. IEEE Transactions on Neural Networks, 16(5), 1027-1041. doi:10.1109/tnn.2005.853411Kim, T., Sun, Z., Cook, C., Zhao, H., Li, R., Wong, D., & Tan, S. X.-D. (2016). Invited - Cross-layer modeling and optimization for electromigration induced reliability. Proceedings of the 53rd Annual Design Automation Conference. doi:10.1145/2897937.2905010Kurowski, K., Oleksiak, A., Piątek, W., Piontek, T., Przybyszewski, A., & Węglarz, J. (2013). DCworms – A tool for simulation of energy efficiency in distributed computing infrastructures. Simulation Modelling Practice and Theory, 39, 135-151. doi:10.1016/j.simpat.2013.08.007Langou, J., Chen, Z., Bosilca, G., & Dongarra, J. (2008). Recovery Patterns for Iterative Methods in a Parallel Unstable Environment. SIAM Journal on Scientific Computing, 30(1), 102-116. doi:10.1137/040620394J. C. Laprie (Ed.). 1995. Dependability—Its Attributes Impairments and Means. Springer-Verlag Berlin. J. C. Laprie (Ed.). 1995. Dependability—Its Attributes Impairments and Means. Springer-Verlag Berlin.Laprie, J.-C. (s. f.). DEPENDABLE COMPUTING AND FAULT TOLERANCE : CONCEPTS AND TERMINOLOGY. Twenty-Fifth International Symposium on Fault-Tolerant Computing, 1995, ’ Highlights from Twenty-Five Years’. doi:10.1109/ftcsh.1995.532603Lasance, C. J. M. (2003). Thermally driven reliability issues in microelectronic systems: status-quo and challenges. Microelectronics Reliability, 43(12), 1969-1974. doi:10.1016/s0026-2714(03)00183-5Yinglung Liang, Yanyong Zhang, Sivasubramaniam, A., Jette, M., & Sahoo, R. (s. f.). BlueGene/L Failure Analysis and Prediction Models. International Conference on Dependable Systems and Networks (DSN’06). doi:10.1109/dsn.2006.18Lin, T.-T. Y., & Siewiorek, D. P. (1990). Error log analysis: statistical modeling and heuristic trend analysis. IEEE Transactions on Reliability, 39(4), 419-432. doi:10.1109/24.58720Losada, N., González, P., Martín, M. J., Bosilca, G., Bouteiller, A., & Teranishi, K. (2020). Fault tolerance of MPI applications in exascale systems: The ULFM solution. Future Generation Computer Systems, 106, 467-481. doi:10.1016/j.future.2020.01.026Lyons, R. E., & Vanderkulk, W. (1962). The Use of Triple-Modular Redundancy to Improve Computer Reliability. IBM Journal of Research and Development, 6(2), 200-209. doi:10.1147/rd.62.0200M. Médard and S. S. Lumetta. 2003. Network Reliability and Fault Tolerance. American Cancer Society. Retrieved from arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1002/0471219282.eot281. M. Médard and S. S. Lumetta. 2003. Network Reliability and Fault Tolerance. American Cancer Society. Retrieved from arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1002/0471219282.eot281.Moody, A., Bronevetsky, G., Mohror, K., & de Supinski, B. (2010). Detailed Modeling, Design, and Evaluation of a Scalable Multi-level Checkpointing System. doi:10.2172/984082Moor Insights 8 Strategy. 2017. AMD EPYC Brings New RAS Capability. White paper. Retrieved from https://www.amd.com/system/files/2017-06/AMD-EPYC-Brings-New-RAS-Capability.pdf. Moor Insights 8 Strategy. 2017. AMD EPYC Brings New RAS Capability. White paper. Retrieved from https://www.amd.com/system/files/2017-06/AMD-EPYC-Brings-New-RAS-Capability.pdf.Mulas, F., Atienza, D., Acquaviva, A., Carta, S., Benini, L., & De Micheli, G. (2009). Thermal Balancing Policy for Multiprocessor Stream Computing Platforms. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 28(12), 1870-1882. doi:10.1109/tcad.2009.2032372Oleksiak, A., Kierzynka, M., Piatek, W., Agosta, G., Barenghi, A., Brandolese, C., … Janssen, U. (2017). M2DC – Modular Microserver DataCentre with heterogeneous hardware. Microprocessors and Microsystems, 52, 117-130. doi:10.1016/j.micpro.2017.05.019Oxley, M. A., Jonardi, E., Pasricha, S., Maciejewski, A. A., Siegel, H. J., Burns, P. J., & Koenig, G. A. (2018). Rate-based thermal, power, and co-location aware resource management for heterogeneous data centers. Journal of Parallel and Distributed Computing, 112, 126-139. doi:10.1016/j.jpdc.2017.04.015K. O’brien I. Pietri R. Reddy A. Lastovetsky and R. Sakellariou. 2017. A survey of power and energy predictive models in HPC systems and applications. ACM Comput. Surv. 50 3 Article 37 (June 2017) 38 pages. DOI:https://doi.org/10.1145/3078811 K. O’brien I. Pietri R. Reddy A. Lastovetsky and R. Sakellariou. 2017. A survey of power and energy predictive models in HPC systems and applications. ACM Comput. Surv. 50 3 Article 37 (June 2017) 38 pages. DOI:https://doi.org/10.1145/3078811Park, S.-M., & Humphrey, M. (2011). Predictable High-Performance Computing Using Feedback Control and Admission Control. IEEE Transactions on Parallel and Distributed Systems, 22(3), 396-411. doi:10.1109/tpds.2010.100Pfefferman, J. D., & Cernuschi-Frias, B. (2002). A nonparametric nonstationary procedure for failure prediction. IEEE Transactions on Reliability, 51(4), 434-442. doi:10.1109/tr.2002.804733Rangan, K. K., Wei, G.-Y., & Brooks, D. (2009). Thread motion. ACM SIGARCH Computer Architecture News, 37(3), 302-313. doi:10.1145/1555815.1555793Paolo Rech. [n.d.]. Reliability Issues in Current and Future Supercomputers. Retrieved from http://energysfe.ufsc.br/slides/Paolo-Rech-260917.pdf. Paolo Rech. [n.d.]. Reliability Issues in Current and Future Supercomputers. Retrieved from http://energysfe.ufsc.br/slides/Paolo-Rech-260917.pdf.F. Reghenzani G. Massari and W. Fornaciari. 2019. The real-time Linux kernel: A survey on PREEMPT_RT. Comput. Surveys 52 1 Article 18 (Feb. 2019) 36 pages. DOI:https://doi.org/10.1145/3297714 F. Reghenzani G. Massari and W. Fornaciari. 2019. The real-time Linux kernel: A survey on PREEMPT_RT. Comput. Surveys 52 1 Article 18 (Feb. 2019) 36 pages. DOI:https://doi.org/10.1145/3297714F. Salfner M. Lenk and M. Malek. 2010. A survey of online failure prediction methods. ACM Comput. Surv. 42 3 Article 10 (March 2010) 42 pages. DOI:https://doi.org/10.1145/1670679.1670680 F. Salfner M. Lenk and M. Malek. 2010. A survey of online failure prediction methods. ACM Comput. Surv. 42 3 Article 10 (March 2010) 42 pages. DOI:https://doi.org/10.1145/1670679.1670680Salfner, F., Schieschke, M., & Malek, M. (2006). Predicting failures of computer systems: a case study for a telecommunication system. Proceedings 20th IEEE International Parallel & Distributed Processing Symposium. doi:10.1109/ipdps.2006.1639672Shi, L., Chen, H., Sun, J., & Li, K. (2012). vCUDA: GPU-Accelerated High-Performance Computing in Virtual Machines. IEEE Transactions on Computers, 61(6), 804-816. doi:10.1109/tc.2011.112D. P. Siewiorek and R. S. Swarz. 1998. Reliable Computer Systems 3rd ed. A. K. Peters Ltd. D. P. Siewiorek and R. S. Swarz. 1998. Reliable Computer Systems 3rd ed. A. K. Peters Ltd.Singh, S., & Chana, I. (2016). A Survey on Resource Scheduling in Cloud Computing: Issues and Challenges. Journal of Grid Computing, 14(2), 217-264. doi:10.1007/s10723-015-9359-2Slegel, T. J., Averill, R. M., Check, M. A., Giamei, B. C., Krumm, B. W., Krygowski, C. A., … Webb, C. F. (1999). IBM’s S/390 G5 microprocessor design. IEEE Micro, 19(2), 12-23. doi:10.1109/40.755464Sridhar, A., Sabry, M. M., & Atienza, D. (2014). A Semi-Analytical Thermal Modeling Framework for Liquid-Cooled ICs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 33(8), 1145-1158. doi:10.1109/tcad.2014.2323194Sridharan, V., DeBardeleben, N., Blanchard, S., Ferreira, K. B., Stearley, J., Shalf, J., & Gurumurthi, S. (2015). Memory Errors in Modern Systems. ACM SIGARCH Computer Architecture News, 43(1), 297-310. doi:10.1145/2786763.2694348Stathis, J. H. (2018). The physics of NBTI: What do we really know? 2018 IEEE International Reliability Physics Symposium (IRPS). doi:10.1109/irps.2018.8353539Stellner, G. (s. f.). CoCheck: checkpointing and process migration for MPI. Proceedings of International Conference on Parallel Processing. doi:10.1109/ipps.1996.508106Stone, J. E., Gohara, D., & Shi, G. (2010). OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems. Computing in Science & Engineering, 12(3), 66-73. doi:10.1109/mcse.2010.69Subasi, O., Di, S., Bautista-Gomez, L., Balaprakash, P., Unsal, O., Labarta, J., … Cappello, F. (2018). Exploring the capabilities of support vector machines in detecting silent data corruptions. Sustainable Computing: Informatics and Systems, 19, 277-290. doi:10.1016/j.suscom.2018.01.004Tang, D., & Iyer, R. K. (1993). Dependability measurement and modeling of a multicomputer system. IEEE Transactions on Computers, 42(1), 62-75. doi:10.1109/12.192214D. Turnbull and N. Alldrin. 2003. Failure Prediction in Hardware Systems. Tech. rep. University of California San Diego CA. Retrieved from http://www.cs.ucsd.edu/ dturnbul/Papers/ServerPrediction.pdf. D. Turnbull and N. Alldrin. 2003. Failure Prediction in Hardware Systems. Tech. rep. University of California San Diego CA. Retrieved from http://www.cs.ucsd.edu/ dturnbul/Papers/ServerPrediction.pdf.Vilalta, R., Apte, C. V., Hellerstein, J. L., Ma, S., & Weiss, S. M. (2002). Predictive algorithms in the management of computer systems. IBM Systems Journal, 41(3), 461-474. doi:10.1147/sj.413.0461Vinoski, S. (2007). Reliability with Erlang. IEEE Internet Com

    Model transformation for multi-objective architecture optimisation for dependable systems

    Get PDF
    Model-based engineering (MBE) promises a number of advantages for the development of embedded systems. Model-based engineering depends on a common model of the system, which is refined as the system is developed. The use of a common model promises a consistent and systematic analysis of dependability, correctness, timing and performance properties. These benefits are potentially available early and throughout the development life cycle. An important part of model-based engineering is the use of analysis and design languages. The Architecture Analysis and Design Language (AADL) is a new modelling language which is increasingly being used for high dependability embedded systems development. AADL is ideally suited to model-based engineering but the use of new language threatens to isolate existing tools which use different languages. This is a particular problem when these tools provide an important development or analysis function, for example system optimisation. System designers seek an optimal trade-off between high dependability and low cost. For large systems, the design space of alternatives with respect to both dependability and cost is enormous and too large to investigate manually. For this reason automation is required to produce optimal or near optimal designs.There is, however, a lack of analysis techniques and tools that can perform a dependability analysis and optimisation of AADL models. Some analysis tools are available in the literature but they are not able to accept AADL models since they use a different modelling language. A cost effective way of adding system dependability analysis and optimisation to models expressed in AADL is to exploit the capabilities of existing tools. Model transformation is a useful technique to maximise the utility of model-based engineering approaches because it provides a route for the exploitation of mature and tested tools in a new model-based engineering context. By using model transformation techniques, one can automatically translate between AADL models and other models. The advantage of this model transformation approach is that it opens a path by which AADL models may exploit existing non-AADL tools.There is little published work which gives a comprehensive description of a method for transforming AADL models. Although transformations from AADL into other models have been reported only one comprehensive description has been published, a transformation of AADL to petri net models. There is a lack of detailed guidance for the transformation of AADL models.This thesis investigates the transformation of AADL models into the HiP-HOPS modelling language, in order to provide dependability analysis and optimisation. HiP-HOPS is a mature, state of the art, dependability analysis and optimisation tool but it has its own model. A model transformation is defined from the AADL model to the HiP-HOPS model. In addition to the model-to-model transformation, it is necessary to extend the AADL modelling attributes. For cost and dependability optimisation, a new AADL property set is developed for modelling component and system variability. This solves the problem of describing, within an AADL model, the design space of alternative designs. The transformation (with transformation rules written in ATLAS Transformation Language (ATL)) has been implemented as a plug-in for the AADL model development tool OSATE (Open-source AADL Tool Environment). To illustrate the method, the plug-in is used to transform some AADL model case-studies

    On Age-of-Information Aware Resource Allocation for Industrial Control-Communication-Codesign

    Get PDF
    Unter dem Überbegriff Industrie 4.0 wird in der industriellen Fertigung die zunehmende Digitalisierung und Vernetzung von industriellen Maschinen und Prozessen zusammengefasst. Die drahtlose, hoch-zuverlässige, niedrig-latente Kommunikation (engl. ultra-reliable low-latency communication, URLLC) – als Bestandteil von 5G gewährleistet höchste Dienstgüten, die mit industriellen drahtgebundenen Technologien vergleichbar sind und wird deshalb als Wegbereiter von Industrie 4.0 gesehen. Entgegen diesem Trend haben eine Reihe von Arbeiten im Forschungsbereich der vernetzten Regelungssysteme (engl. networked control systems, NCS) gezeigt, dass die hohen Dienstgüten von URLLC nicht notwendigerweise erforderlich sind, um eine hohe Regelgüte zu erzielen. Das Co-Design von Kommunikation und Regelung ermöglicht eine gemeinsame Optimierung von Regelgüte und Netzwerkparametern durch die Aufweichung der Grenze zwischen Netzwerk- und Applikationsschicht. Durch diese Verschränkung wird jedoch eine fundamentale (gemeinsame) Neuentwicklung von Regelungssystemen und Kommunikationsnetzen nötig, was ein Hindernis für die Verbreitung dieses Ansatzes darstellt. Stattdessen bedient sich diese Dissertation einem Co-Design-Ansatz, der beide Domänen weiterhin eindeutig voneinander abgrenzt, aber das Informationsalter (engl. age of information, AoI) als bedeutenden Schnittstellenparameter ausnutzt. Diese Dissertation trägt dazu bei, die Echtzeitanwendungszuverlässigkeit als Folge der Überschreitung eines vorgegebenen Informationsalterschwellenwerts zu quantifizieren und fokussiert sich dabei auf den Paketverlust als Ursache. Anhand der Beispielanwendung eines fahrerlosen Transportsystems wird gezeigt, dass die zeitlich negative Korrelation von Paketfehlern, die in heutigen Systemen keine Rolle spielt, für Echtzeitanwendungen äußerst vorteilhaft ist. Mit der Annahme von schnellem Schwund als dominanter Fehlerursache auf der Luftschnittstelle werden durch zeitdiskrete Markovmodelle, die für die zwei Netzwerkarchitekturen Single-Hop und Dual-Hop präsentiert werden, Kommunikationsfehlerfolgen auf einen Applikationsfehler abgebildet. Diese Modellierung ermöglicht die analytische Ableitung von anwendungsbezogenen Zuverlässigkeitsmetriken wie die durschnittliche Dauer bis zu einem Fehler (engl. mean time to failure). Für Single-Hop-Netze wird das neuartige Ressourcenallokationsschema State-Aware Resource Allocation (SARA) entwickelt, das auf dem Informationsalter beruht und die Anwendungszuverlässigkeit im Vergleich zu statischer Multi-Konnektivität um Größenordnungen erhöht, während der Ressourcenverbrauch im Bereich von konventioneller Einzelkonnektivität bleibt. Diese Zuverlässigkeit kann auch innerhalb eines Systems von Regelanwendungen, in welchem mehrere Agenten um eine begrenzte Anzahl Ressourcen konkurrieren, statistisch garantiert werden, wenn die Anzahl der verfügbaren Ressourcen pro Agent um ca. 10 % erhöht werden. Für das Dual-Hop Szenario wird darüberhinaus ein Optimierungsverfahren vorgestellt, das eine benutzerdefinierte Kostenfunktion minimiert, die niedrige Anwendungszuverlässigkeit, hohes Informationsalter und hohen durchschnittlichen Ressourcenverbrauch bestraft und so das benutzerdefinierte optimale SARA-Schema ableitet. Diese Optimierung kann offline durchgeführt und als Look-Up-Table in der unteren Medienzugriffsschicht zukünftiger industrieller Drahtlosnetze implementiert werden.:1. Introduction 1 1.1. The Need for an Industrial Solution . . . . . . . . . . . . . . . . . . . 3 1.2. Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Related Work 7 2.1. Communications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2. Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.3. Codesign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3.1. The Need for Abstraction – Age of Information . . . . . . . . 11 2.4. Dependability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 3. Deriving Proper Communications Requirements 17 3.1. Fundamentals of Control Theory . . . . . . . . . . . . . . . . . . . . 18 3.1.1. Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.1.2. Performance Requirements . . . . . . . . . . . . . . . . . . . 21 3.1.3. Packet Losses and Delay . . . . . . . . . . . . . . . . . . . . . 22 3.2. Joint Design of Control Loop with Packet Losses . . . . . . . . . . . . 23 3.2.1. Method 1: Reduced Sampling . . . . . . . . . . . . . . . . . . 23 3.2.2. Method 2: Markov Jump Linear System . . . . . . . . . . . . . 25 3.2.3. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 3.3. Focus Application: The AGV Use Case . . . . . . . . . . . . . . . . . . 31 3.3.1. Control Loop Model . . . . . . . . . . . . . . . . . . . . . . . 31 3.3.2. Control Performance Requirements . . . . . . . . . . . . . . . 33 3.3.3. Joint Modeling: Applying Reduced Sampling . . . . . . . . . . 34 3.3.4. Joint Modeling: Applying MJLS . . . . . . . . . . . . . . . . . 34 3.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 4. Modeling Control-Communication Failures 43 4.1. Communication Assumptions . . . . . . . . . . . . . . . . . . . . . . 43 4.1.1. Small-Scale Fading as a Cause of Failure . . . . . . . . . . . . 44 4.1.2. Connectivity Models . . . . . . . . . . . . . . . . . . . . . . . 46 4.2. Failure Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.2.1. Single-hop network . . . . . . . . . . . . . . . . . . . . . . . . 49 4.2.2. Dual-hop network . . . . . . . . . . . . . . . . . . . . . . . . 51 4.3. Performance Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.3.1. Mean Time to Failure . . . . . . . . . . . . . . . . . . . . . . . 54 4.3.2. Packet Loss Ratio . . . . . . . . . . . . . . . . . . . . . . . . . 55 4.3.3. Average Number of Assigned Channels . . . . . . . . . . . . . 57 4.3.4. Age of Information . . . . . . . . . . . . . . . . . . . . . . . . 57 4.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 5. Single Hop – Single Agent 61 5.1. State-Aware Resource Allocation . . . . . . . . . . . . . . . . . . . . 61 5.2. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 5.3. Erroneous Acknowledgments . . . . . . . . . . . . . . . . . . . . . . 67 5.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 6. Single Hop – Multiple Agents 71 6.1. Failure Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 6.1.1. Admission Control . . . . . . . . . . . . . . . . . . . . . . . . 72 6.1.2. Transition Probabilities . . . . . . . . . . . . . . . . . . . . . . 73 6.1.3. Computational Complexity . . . . . . . . . . . . . . . . . . . 74 6.1.4. Performance Metrics . . . . . . . . . . . . . . . . . . . . . . . 75 6.2. Illustration Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 6.3. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 6.3.1. Verification through System-Level Simulation . . . . . . . . . 78 6.3.2. Applicability on the System Level . . . . . . . . . . . . . . . . 79 6.3.3. Comparison of Admission Control Schemes . . . . . . . . . . 80 6.3.4. Impact of the Packet Loss Tolerance . . . . . . . . . . . . . . . 82 6.3.5. Impact of the Number of Agents . . . . . . . . . . . . . . . . . 84 6.3.6. Age of Information . . . . . . . . . . . . . . . . . . . . . . . . 84 6.3.7. Channel Saturation Ratio . . . . . . . . . . . . . . . . . . . . 86 6.3.8. Enforcing Full Channel Saturation . . . . . . . . . . . . . . . 86 6.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 7. Dual Hop – Single Agent 91 7.1. State-Aware Resource Allocation . . . . . . . . . . . . . . . . . . . . 91 7.2. Optimization Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 7.3. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 7.3.1. Extensive Simulation . . . . . . . . . . . . . . . . . . . . . . . 96 7.3.2. Non-Integer-Constrained Optimization . . . . . . . . . . . . . 98 7.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 8. Conclusions and Outlook 105 8.1. Key Results and Conclusions . . . . . . . . . . . . . . . . . . . . . . . 105 8.2. Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 A. DC Motor Model 111 Bibliography 113 Publications of the Author 127 List of Figures 129 List of Tables 131 List of Operators and Constants 133 List of Symbols 135 List of Acronyms 137 Curriculum Vitae 139In industrial manufacturing, Industry 4.0 refers to the ongoing convergence of the real and virtual worlds, enabled through intelligently interconnecting industrial machines and processes through information and communications technology. Ultrareliable low-latency communication (URLLC) is widely regarded as the enabling technology for Industry 4.0 due to its ability to fulfill highest quality-of-service (QoS) comparable to those of industrial wireline connections. In contrast to this trend, a range of works in the research domain of networked control systems have shown that URLLC’s supreme QoS is not necessarily required to achieve high quality-ofcontrol; the co-design of control and communication enables to jointly optimize and balance both quality-of-control parameters and network parameters through blurring the boundary between application and network layer. However, through the tight interlacing, this approach requires a fundamental (joint) redesign of both control systems and communication networks and may therefore not lead to short-term widespread adoption. Therefore, this thesis instead embraces a novel co-design approach which keeps both domains distinct but leverages the combination of control and communications by yet exploiting the age of information (AoI) as a valuable interface metric. This thesis contributes to quantifying application dependability as a consequence of exceeding a given peak AoI with the particular focus on packet losses. The beneficial influence of negative temporal packet loss correlation on control performance is demonstrated by means of the automated guided vehicle use case. Assuming small-scale fading as the dominant cause of communication failure, a series of communication failures are mapped to an application failure through discrete-time Markov models for single-hop (e.g, only uplink or downlink) and dual-hop (e.g., subsequent uplink and downlink) architectures. This enables the derivation of application-related dependability metrics such as the mean time to failure in closed form. For single-hop networks, an AoI-aware resource allocation strategy termed state-aware resource allocation (SARA) is proposed that increases the application reliability by orders of magnitude compared to static multi-connectivity while keeping the resource consumption in the range of best-effort single-connectivity. This dependability can also be statistically guaranteed on a system level – where multiple agents compete for a limited number of resources – if the provided amount of resources per agent is increased by approximately 10 %. For the dual-hop scenario, an AoI-aware resource allocation optimization is developed that minimizes a user-defined penalty function that punishes low application reliability, high AoI, and high average resource consumption. This optimization may be carried out offline and each resulting optimal SARA scheme may be implemented as a look-up table in the lower medium access control layer of future wireless industrial networks.:1. Introduction 1 1.1. The Need for an Industrial Solution . . . . . . . . . . . . . . . . . . . 3 1.2. Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Related Work 7 2.1. Communications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2. Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.3. Codesign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3.1. The Need for Abstraction – Age of Information . . . . . . . . 11 2.4. Dependability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 3. Deriving Proper Communications Requirements 17 3.1. Fundamentals of Control Theory . . . . . . . . . . . . . . . . . . . . 18 3.1.1. Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.1.2. Performance Requirements . . . . . . . . . . . . . . . . . . . 21 3.1.3. Packet Losses and Delay . . . . . . . . . . . . . . . . . . . . . 22 3.2. Joint Design of Control Loop with Packet Losses . . . . . . . . . . . . 23 3.2.1. Method 1: Reduced Sampling . . . . . . . . . . . . . . . . . . 23 3.2.2. Method 2: Markov Jump Linear System . . . . . . . . . . . . . 25 3.2.3. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 3.3. Focus Application: The AGV Use Case . . . . . . . . . . . . . . . . . . 31 3.3.1. Control Loop Model . . . . . . . . . . . . . . . . . . . . . . . 31 3.3.2. Control Performance Requirements . . . . . . . . . . . . . . . 33 3.3.3. Joint Modeling: Applying Reduced Sampling . . . . . . . . . . 34 3.3.4. Joint Modeling: Applying MJLS . . . . . . . . . . . . . . . . . 34 3.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 4. Modeling Control-Communication Failures 43 4.1. Communication Assumptions . . . . . . . . . . . . . . . . . . . . . . 43 4.1.1. Small-Scale Fading as a Cause of Failure . . . . . . . . . . . . 44 4.1.2. Connectivity Models . . . . . . . . . . . . . . . . . . . . . . . 46 4.2. Failure Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.2.1. Single-hop network . . . . . . . . . . . . . . . . . . . . . . . . 49 4.2.2. Dual-hop network . . . . . . . . . . . . . . . . . . . . . . . . 51 4.3. Performance Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.3.1. Mean Time to Failure . . . . . . . . . . . . . . . . . . . . . . . 54 4.3.2. Packet Loss Ratio . . . . . . . . . . . . . . . . . . . . . . . . . 55 4.3.3. Average Number of Assigned Channels . . . . . . . . . . . . . 57 4.3.4. Age of Information . . . . . . . . . . . . . . . . . . . . . . . . 57 4.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 5. Single Hop – Single Agent 61 5.1. State-Aware Resource Allocation . . . . . . . . . . . . . . . . . . . . 61 5.2. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 5.3. Erroneous Acknowledgments . . . . . . . . . . . . . . . . . . . . . . 67 5.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 6. Single Hop – Multiple Agents 71 6.1. Failure Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 6.1.1. Admission Control . . . . . . . . . . . . . . . . . . . . . . . . 72 6.1.2. Transition Probabilities . . . . . . . . . . . . . . . . . . . . . . 73 6.1.3. Computational Complexity . . . . . . . . . . . . . . . . . . . 74 6.1.4. Performance Metrics . . . . . . . . . . . . . . . . . . . . . . . 75 6.2. Illustration Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 6.3. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 6.3.1. Verification through System-Level Simulation . . . . . . . . . 78 6.3.2. Applicability on the System Level . . . . . . . . . . . . . . . . 79 6.3.3. Comparison of Admission Control Schemes . . . . . . . . . . 80 6.3.4. Impact of the Packet Loss Tolerance . . . . . . . . . . . . . . . 82 6.3.5. Impact of the Number of Agents . . . . . . . . . . . . . . . . . 84 6.3.6. Age of Information . . . . . . . . . . . . . . . . . . . . . . . . 84 6.3.7. Channel Saturation Ratio . . . . . . . . . . . . . . . . . . . . 86 6.3.8. Enforcing Full Channel Saturation . . . . . . . . . . . . . . . 86 6.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 7. Dual Hop – Single Agent 91 7.1. State-Aware Resource Allocation . . . . . . . . . . . . . . . . . . . . 91 7.2. Optimization Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 7.3. Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 7.3.1. Extensive Simulation . . . . . . . . . . . . . . . . . . . . . . . 96 7.3.2. Non-Integer-Constrained Optimization . . . . . . . . . . . . . 98 7.4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 8. Conclusions and Outlook 105 8.1. Key Results and Conclusions . . . . . . . . . . . . . . . . . . . . . . . 105 8.2. Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 A. DC Motor Model 111 Bibliography 113 Publications of the Author 127 List of Figures 129 List of Tables 131 List of Operators and Constants 133 List of Symbols 135 List of Acronyms 137 Curriculum Vitae 13

    Review of selection criteria for sensor and actuator configurations suitable for internal combustion engines

    Get PDF
    This literature review considers the problem of finding a suitable configuration of sensors and actuators for the control of an internal combustion engine. It takes a look at the methods, algorithms, processes, metrics, applications, research groups and patents relevant for this topic. Several formal metric have been proposed, but practical use remains limited. Maximal information criteria are theoretically optimal for selecting sensors, but hard to apply to a system as complex and nonlinear as an engine. Thus, we reviewed methods applied to neighboring fields including nonlinear systems and non-minimal phase systems. Furthermore, the closed loop nature of control means that information is not the only consideration, and speed, stability and robustness have to be considered. The optimal use of sensor information also requires the use of models, observers, state estimators or virtual sensors, and practical acceptance of these remains limited. Simple control metrics such as conditioning number are popular, mostly because they need fewer assumptions than closed-loop metrics, which require a full plant, disturbance and goal model. Overall, no clear consensus can be found on the choice of metrics to define optimal control configurations, with physical measures, linear algebra metrics and modern control metrics all being used. Genetic algorithms and multi-criterial optimisation were identified as the most widely used methods for optimal sensor selection, although addressing the dimensionality and complexity of formulating the problem remains a challenge. This review does present a number of different successful approaches for specific applications domains, some of which may be applicable to diesel engines and other automotive applications. For a thorough treatment, non-linear dynamics and uncertainties need to be considered together, which requires sophisticated (non-Gaussian) stochastic models to establish the value of a control architecture
    • …
    corecore