Software Fault Tolerance Techniques for On-Board Computers

By 曹东坡

Abstract

Small satellites have become a focus of aerospace research because of their advantages, including highly integrated functionality, short development cycles, flexible launch modes, and low cost. The on-board computer is the core of a small satellite system. It is subject to special limits on power consumption, volume, weight, and resources, and it must also meet very strict real-time and reliability requirements. Software fault tolerance is an effective means of improving system reliability, but existing software fault tolerance techniques cannot fully meet the practical needs of on-board computers. This thesis therefore studies software fault tolerance techniques on RTEMS, an open source real-time multiprocessor operating system, running on the BM3803, a domestically produced aerospace-grade microprocessor based on the SPARC V8 architecture.

Starting from an analysis of the space environment and its fault characteristics, and targeting fault tolerance requirements such as resistance to SEL, resistance to SEU, and the handling of software defects, the thesis proposes a more comprehensive layered, modular software fault tolerance architecture built on the fault tolerance support of the system platform. First, N-modular redundant loading and a system self-check and recovery mechanism are designed to address cases where the system cannot boot safely or operate normally because of file corruption or hardware faults. Second, a software injection mechanism is implemented to support on-line system upgrades and software updates. Third, a control flow fault tolerance method is proposed that improves exception handling and inserts extended-block assertions, strengthening the ability of RTEMS to handle system-level control flow errors, including erroneous jumps to both non-procedure and procedure targets.

Finally, the effectiveness of the software fault tolerance mechanisms is evaluated by combining experimental tests and simulation results with analysis methods such as combinatorial models and software reliability models. The results show that the proposed software fault tolerance architecture is feasible and improves system reliability to some extent.
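As an illustration of the N-modular redundant loading idea mentioned in the abstract, the following is a minimal sketch for the three-copy case (triple modular redundancy). It is not the thesis's implementation: the function name `tmr_vote`, the byte-buffer interface, and the choice of a bitwise 2-of-3 vote are assumptions made here for illustration only. The sketch shows how a loader could reconstruct a clean image when each bit is corrupted in at most one of the three stored copies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thesis): bitwise 2-of-3 majority vote
 * over three redundant copies of an image.
 * Writes the voted image to `out` and returns the number of bytes in which
 * at least one copy differed from the voted result. */
size_t tmr_vote(const uint8_t *a, const uint8_t *b, const uint8_t *c,
                uint8_t *out, size_t len)
{
    assert(a != NULL && b != NULL && c != NULL && out != NULL);

    size_t repaired = 0;
    for (size_t i = 0; i < len; i++) {
        /* Each output bit is 1 only if it is 1 in at least two copies. */
        uint8_t v = (uint8_t)((a[i] & b[i]) | (a[i] & c[i]) | (b[i] & c[i]));
        if (v != a[i] || v != b[i] || v != c[i]) {
            repaired++;
        }
        out[i] = v;
    }
    return repaired;
}
```

A loader built along these lines would read the three stored copies, call `tmr_vote`, and boot from the voted buffer. A non-zero return value indicates that at least one copy has been corrupted and could be rewritten from the voted image. The vote cannot recover a bit that is flipped in two copies at once, which is why a scheme like this is usually paired with a checksum on the voted result.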

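Topics: Computer software, Computer software::Operating systems and operating environments, Computer application technology, on-board computer, software fault tolerance, single event effect, transient fault, control flow error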
Year: 2010
OAI identifier: oai:ir.iscas.ac.cn:311060/2385
Full text: http://124.16.136.157/handle/3... (external link)