Large service systems today are highly networked in structure. In this thesis, such systems are called networked service systems. Their networked nature has undoubtedly enabled mass-customized services, but it has also created challenges in managing their safety. The safety of service systems is an important issue because of their critical influence on the functioning of society. Traditional safety engineering methods focus on keeping service systems in a safe state, in particular by making them reliable and robust. However, many recent disasters in society show that resilience cannot be left out of safety.
The goal of this thesis is to improve the resilience of networked service systems. Four major pieces of work were performed to achieve this goal. First, a unified definition of service systems was proposed and its relationship to other system concepts was clarified. Based on the new definition, a domain model of service systems was established using the FCBPSS framework, and a computational model was then developed. Second, a definition of resilience for service systems was proposed, on the basis of which the relationship among three safety properties (i.e., reliability, robustness and resilience) was clarified and a framework for resilience analysis was developed. Third, a methodology for measuring the resilience of service systems was proposed, consisting of four measurement axioms and their corresponding mathematical models. The methodology focuses on the potential ability of a service system to create optimal rebalancing solutions. Two typical service systems, a transportation system and an enterprise information system, were used to validate the methodology. Fourth, a methodology for enhancing the resilience of service systems was proposed, which integrates three types of system reconfiguration, namely design, planning and management, together with the corresponding mathematical model. This methodology was validated with an example of a transportation system.
Several conclusions can be drawn from the work above: (1) a service system has the unique characteristic that it meets human demands directly, and its safety relies on the balance between supply and demand; (2) unlike reliability and robustness, the resilience of a service system concerns its ability to rebalance from an imbalanced situation; (3) it makes sense to measure the resilience of a service system only for a particular imbalanced situation and on the basis of an evaluation of rebalancing solutions; and (4) integrating design, planning and management is an effective approach to improving the resilience of a service system.
The contributions of this thesis can be summarized as follows. Scientifically, this work has improved our understanding of service systems and their resilience property; furthermore, it has advanced the state of knowledge in safety science, in particular by responding to two questions: is a service system safe, and how can a service system be made safer? Technologically and methodologically, the work has advanced the knowledge of modeling and optimization of networked service systems, in particular through multiple-layer models and algorithms for integrated decision making on design, planning, and management.