A framework for the elicitation and debugging of formal specifications for Cyber-Physical Systems is presented. The elicitation of specifications is handled through a graphical interface. Two debugging algorithms are presented. The first checks for erroneous or incomplete temporal logic specifications without considering the system. The second can be utilized for the analysis of reactive requirements with respect to system test traces. The specification debugging framework is applied on a number of formal specifications collected through a user study. The user study establishes that requirement errors are common and that the debugging framework can resolve many insidious specification errors.
I. INTRODUCTION
In formal verification of Cyber-Physical Systems (CPS), a system is verified with respect to formal specifications. It has been shown that utilizing formal specifications can lead to improved testing and verification [23], [31], [43]. However, developing formal specifications using logics is a challenging and error-prone task even for experts who have formal mathematical training. Therefore, in practice, system engineers usually define specifications in natural language. Natural language is convenient to use in many stages of system development, but its inherent ambiguity, inaccuracy and inconsistency make it unsuitable for use in defining specifications.
To assist in the elicitation of formal specifications, in [29] , [30] , we presented a graphical formalism and tool VISPEC that can be utilized by both expert and non-expert users. Namely, a user-developed graphical input is translated to a Metric Temporal Logic (MTL) formula. The formal specifications in MTL can be used for testing and verification with tools such as S-TALIRO [5] and Breach [21] .
In [30], the tool was evaluated through a usability study which showed that both expert and non-expert users were able to use the tool to elicit formal specifications. The usability study results also indicated that in many cases the developed specifications were incorrect. Namely, the specifications contained logical inconsistencies or they were (partially) wrong. This raised two questions. First, are these issues artifacts of the graphical user interface? Second, can we automatically detect and report issues with the requirements themselves?
We have created an on-line survey to answer the first question. Namely, we conducted a usability study on Metric Interval Temporal Logic (MITL) by targeting experts in temporal logics. In our on-line survey, we tested how well formal methods experts can translate natural language requirements to MITL. That is, given a set of requirements in natural language, experts were asked to formalize the requirements in MITL. The study is ongoing, but preliminary results indicate that even experts can make errors in their specifications. For example, for the natural language specification "At some time in the first 30 seconds, the vehicle speed (v) will go over 100 and stay above 100 for 20 seconds", the specification ϕ = ◇_[0,30]((v > 100) ⇒ □_[0,20](v > 100)) was provided as an answer by an expert user. Here, ◇_[0,30] stands for "eventually within 30 time units" and □_[0,20] for "always from 0 to 20 time units".
However, the specification ϕ is a tautology, i.e., it evaluates to true no matter what the system behavior is and, thus, the requirement ϕ is invalid. This is because, if at some time point t between 0 and 30 seconds the predicate (v > 100) is false, then the implication (⇒) will trivially evaluate to true at time t and, thus, ϕ will evaluate to true as well. On the other hand, if the predicate (v > 100) is true at all times between 0 and 30 seconds, then the subformula □_[0,20](v > 100) will be true at all times between 0 and 10 seconds. This means that the subformula (v > 100) ⇒ □_[0,20](v > 100) is true at all times between 0 and 10 seconds. Thus, again, ϕ evaluates to true, which means that ϕ is a tautology. This implies that specification issues are not necessarily artifacts of the graphical user interface and that they can occur even for people who are familiar with temporal logics.
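The tautology argument can also be checked mechanically on a scaled-down, discrete-time analogue of ϕ. The sketch below is illustrative only: it assumes integer time steps, shrinks the bounds [0, 30] and [0, 20] to [0, 3] and [0, 2] so that all traces can be enumerated, and abstracts the predicate (v > 100) as a Boolean sequence p. The comparison formula ◇_[0,3] □_[0,2] p is an assumed candidate for the intended requirement, not a formula taken from the survey.

```python
from itertools import product

def always(p, t, lo, hi):
    """Discrete-time []_[lo,hi] p evaluated at step t."""
    return all(p[u] for u in range(t + lo, t + hi + 1))

def expert_formula(p):
    # <>_[0,3] ( p => []_[0,2] p ), evaluated at time 0
    return any((not p[t]) or always(p, t, 0, 2) for t in range(0, 4))

def candidate_formula(p):
    # <>_[0,3] []_[0,2] p (assumed candidate for the intended meaning)
    return any(always(p, t, 0, 2) for t in range(0, 4))

LAST_STEP = 3 + 2  # the formulas never look beyond this index
traces = list(product([False, True], repeat=LAST_STEP + 1))

expert_is_tautology = all(expert_formula(p) for p in traces)
candidate_is_tautology = all(candidate_formula(p) for p in traces)
print(expert_is_tautology, candidate_is_tautology)
```

Under these assumptions every one of the 64 Boolean traces satisfies the expert's formula, whereas the candidate formula is falsified by, for example, the trace in which p never holds.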
This indicates that specification elicitation can be a major issue in testing and verification, since effort can be wasted in checking incorrect requirements or, even worse, the system can pass the incorrect requirements. Clearly, this can lead to a false sense of system correctness, which leads us to the second question: What can be done automatically to prevent specification errors in CPS?
In this work, we have developed a specification development framework that would enable the elicitation and debugging of specifications. The specification debugging algorithm identifies invalid and wrong specifications. Namely, it performs the following in order: 1) Validity detection: the specification is unsatisfiable or a tautology. 2) Redundancy detection: the formula has redundant conjuncts. 3) Vacuity detection: some subformulas do not affect the satisfiability of the formula. This usually indicates some misunderstanding in the requirements.
As a result, specification errors in the elicitation process can be corrected before any test and verification process is initiated. However, some specification issues cannot be detected unless we consider the system, and test the system behaviors with respect to the specification. In addition to the specification debugging, we provide algorithms to detect specification vacuity with respect to signals in order to help the CPS developer find more vacuity issues during system testing. Our framework can help developers correct their specifications and avoid wasted effort on checking incorrect requirements as well as finding more subtle errors during testing.
This paper is an extended version of the conference paper that appeared in MEMOCODE 2015 [20] .
Summary of Contributions:
1) We present a specification debugging algorithm for a fragment of MITL [3] specifications.
2) Using (1) we provide a debugging algorithm for Signal Temporal Logic [39] .
3) We extend Linear Temporal Logic (LTL) [18] vacuity detection algorithms [15] to real-time specifications in MITL.
4) We provide a signal vacuity detection algorithm that reports to the testing team the signals that vacuously satisfy the specification.
5) We present experimental results on specifications that typically appear in CPS specifications.
The above contributions solve the specification correctness problem for VISPEC [30] requirements. The user of VISPEC can benefit from our feedback and fix any reported issues. In addition, we can find potential system issues using algorithms to detect the specification vacuity with respect to signals during testing.
In this paper, we have expanded some examples and added further experimental results which did not appear in [20] . The major new results presented in this paper over the conference version of the paper [20] concern item (4) in the list above and are presented in Sections VI, VII-C, VII-D, and Appendix XI.
II. RELATED WORKS
The challenge of developing formal specifications has been studied in the past. The most relevant works appear in [7] and [44] . In [7] , the authors extend Message Sequence Charts and UML 2.0 Interaction Sequence Diagrams to propose a scenario based formalism called Property Sequence Chart (PSC). The formalism is mainly developed for specifications on concurrent systems. In [44] , PSC is extended to Timed PSC which enables the addition of timing constructs to specifications.
Specification debugging can also be considered in areas such as system synthesis [40] and software verification [4]. In system synthesis, realizability is an important factor, which checks whether the system is implementable given the constraints (environment) and requirements (specification) [22], [34], [17], [40]. Specification debugging can also be considered with respect to the environment for robot motion planning. In [24], [33], the authors considered the problem where the original specification is unsatisfiable with the given environment and robot actions. Then, they relax the specification in order to render it satisfiable in the given domain.
One of the most powerful verification methods is model checking [18], where the model of the system is evaluated with respect to a specification. For example, let us consider model checking with respect to LTL formulas. It is possible that the model satisfies the specification but not in the intended way. This may hide actual problems in the model. These satisfactions are called vacuous satisfactions. Antecedent failure was the first problem that raised vacuity as a serious issue in verification [9], [11]. For example, ϕ = □_[0,5](req ⇒ ◇_[0,10] ack) is interpreted as "if at any time within the first 5 seconds a request happens, then from that moment on, within the next 10 seconds, an acknowledge must happen". Here, ϕ can be vacuously satisfied since it is satisfied by all systems in which a request never happens. In this example, ack does not affect the satisfaction of ϕ in any system with no request [8]. Vacuity can be addressed with respect to a model [10], [6], [37], [26], [36] or without a model [25], [15]. A formula is vacuous when it can be simplified to a smaller equivalent formula. It has been proven in [25] that a specification ϕ is satisfied vacuously in all systems that satisfy it iff ϕ is equivalent to some mutation of it. In [15], the authors provide an algorithmic approach to detecting vacuity and redundancy in LTL specifications. Vacuity with respect to testing was considered in [8], where vacuity in model checking is treated as strong vacuity. In contrast, [8] defined weak vacuity for test suites that vacuously pass LTL monitors, e.g. [27]. The main idea behind the work in [8] is to remove transitions from the LTL specification automata in order to find vacuous passes during testing.
Our work extends [15] and it is applied to a fragment of MITL. We provide a new definition of vacuity with respect to Boolean or real-valued signals. To the best of our knowledge, vacuity of real-time properties such as MITL has not been addressed yet. Although this problem is computationally hard, due to the small size of the formulas, in practice the computation problem is manageable.
III. PRELIMINARIES
In this work, we take a general approach in modeling Cyber-Physical Systems (CPS). In the following, R is the set of real numbers, R_+ is the set of non-negative real numbers, Q is the set of rational numbers, and Q_+ is the set of non-negative rational numbers. Given two sets A and B, B^A is the set of all functions from A to B, i.e., for any f ∈ B^A we have f : A → B. We define 2^A to be the power set of set A. We fix T ∈ R_+ to be the maximum time of a signal.
A. Metric Interval Temporal Logic
Metric Temporal Logic (MTL) was introduced in [35] in order to reason about the quantitative timing properties of Boolean signals. Metric Interval Temporal Logic (MITL) is MTL where the timing constraints are not allowed to be singleton sets [3]. In the rest of the paper, we restrict our focus to a fragment of MITL called Bounded-MITL(◇,□), where the only temporal operators allowed are the Eventually (◇) and Always (□) operators with timing intervals. Formally, the syntax of Bounded-MITL(◇,□) is presented by the following grammar:
Definition 1 (Bounded-MITL(◇,□) syntax):
where AP is the set of atomic propositions and a ∈ AP, ⊤ is True, ⊥ is False. Also, I is a nonsingular interval over Q_+ with defined end-points. The interval I is right-closed.
Definition 2 (Bounded-MITL(◇,□) semantics): Given a time trace µ : [0, T] → 2^AP, t, t′ ∈ R, and an MITL formula φ, the satisfaction relation (µ, t) |= φ is inductively defined:
where (t + I) creates a new interval I′ such that if I = [l, u] then I′ = [l + t, u + t]. A Boolean signal µ satisfies a Bounded-MITL(◇,□) formula φ (denoted by µ |= φ) iff (µ, 0) |= φ.
The implication (⇒) is defined as ψ ⇒ ϕ ≡ ¬ψ ∨ ϕ, and also ⊥ ≡ ¬⊤. In this paper, we assume that a Bounded-MITL(◇,□) formula is in Negation Normal Form (NNF), where the negation operation is only applied to atomic propositions. NNF is easily obtainable by applying DeMorgan's Law, i.e., ¬◇_I ϕ ≡ □_I ¬ϕ and ¬□_I ϕ ≡ ◇_I ¬ϕ. To simplify the presentation, when we mention MITL we mean Bounded-MITL(◇,□). Given MITL formulas ϕ and ψ, ϕ satisfies ψ, denoted by ϕ |= ψ, iff ∀µ. µ |= ϕ ⇒ µ |= ψ.
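For reference, a grammar and a set of satisfaction clauses consistent with the description in Definitions 1 and 2 can be written as follows. This is a sketch that assumes the standard presentation of bounded MITL in negation normal form; in particular, restricting negation to atomic propositions in the grammar and intersecting the shifted interval with the signal domain [0, T] are assumptions, not statements taken from the text above.

```latex
% Requires amsmath and amssymb (for \Diamond and \Box).
\[
\varphi \;::=\; \top \;\mid\; \bot \;\mid\; a \;\mid\; \neg a \;\mid\;
  \varphi_1 \wedge \varphi_2 \;\mid\; \varphi_1 \vee \varphi_2 \;\mid\;
  \Diamond_I\, \varphi_1 \;\mid\; \Box_I\, \varphi_1
\]
\begin{align*}
(\mu, t) \models \top  & \\
(\mu, t) \models a     &\iff a \in \mu(t) \\
(\mu, t) \models \neg a &\iff a \notin \mu(t) \\
(\mu, t) \models \varphi_1 \wedge \varphi_2 &\iff (\mu, t) \models \varphi_1 \text{ and } (\mu, t) \models \varphi_2 \\
(\mu, t) \models \varphi_1 \vee \varphi_2   &\iff (\mu, t) \models \varphi_1 \text{ or } (\mu, t) \models \varphi_2 \\
(\mu, t) \models \Diamond_I\, \varphi_1 &\iff \exists\, t' \in (t + I) \cap [0, T] \text{ such that } (\mu, t') \models \varphi_1 \\
(\mu, t) \models \Box_I\, \varphi_1     &\iff \forall\, t' \in (t + I) \cap [0, T],\; (\mu, t') \models \varphi_1
\end{align*}
```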
B. Signal Temporal Logic
The logic and semantics can be extended to real-valued signals through Signal Temporal Logic (STL) [39] .
Let s : [0, T] → R^m be a real-valued signal, and let P = {p_1, ..., p_n} be a collection of predicates or Boolean functions of the form p_i : R^m → B, where B = {⊤, ⊥} is the set of Boolean values.
We define the STL formula Φ_STL over predicates P using the MITL formula Φ_MITL over the atomic propositions AP. The semantics of STL can be defined using MITL as follows:
1) Define a set AP such that for each p ∈ P, there exists some a_p ∈ AP.
2) For each real-valued signal s we define a µ such that ∀t
The Visual Specification Tool (VISPEC) [30] enables the development of formal specifications for CPS. The graphical formalism enables reasoning on both timing and event sequence occurrence. Consider the specification φ_cps = □_[0,30]((speed > 100) ⇒ □_[0,40](rpm > 4000)). It states that whenever, within the first 30 seconds, the vehicle speed goes over 100, then from that moment on the engine speed (rpm) should always be above 4000 for the next 40 seconds. Here both the sequence and timing of the events are of critical importance. See Fig. 1 for the visual representation of φ_cps. Users develop specifications using a visual formalism which can be translated to an MITL formula. The set of specifications that can be generated from this graphical formalism is a proper subset of the set of MITL specifications. Formally, the following grammar produces the set of formulas that can be expressed by the proposed graphical formalism:
where p is an atomic proposition. In the tool, the atomic propositions are automatically derived from the templates. For example, the formula □_I ◇_I p can be generated using the following parse tree: S → T → C → □_I ◇_I D → □_I ◇_I p.
The graphical formalism was developed with the following goals: a) The user interface is easy to use, i.e., it does not have a steep learning curve; b) The visual representation of the requirements is clear and unambiguous; c) There is a one-to-one mapping between the visual representation of the requirement and the corresponding requirement in MITL. The graphical formalism is mainly composed of the following: 1) Templates; 2) Relationships between templates.
Templates are used to define temporal logic operators, their timing intervals, and the expected signal shape. A template configuration wizard guides the user in the development process. The process is context dependent, where each option selection leads to a potentially different set of options for the next step. After the selection of the temporal operator, the user defines its timing bounds. For specifications with temporal operators such as Eventually Always (◇□) and Repeatedly Often and Finally (□◇), setting the timing bounds may be a challenging task. To clarify this issue, the tool provides a fill-in-the-blanks sentence format to the user. For example, if the operator Eventually Always is selected, the user will have to complete the following sentence with the timing bounds: "Eventually, between ___ and ___ seconds, the signal will become true, and from that point on, will stay true in the next ___ to ___ seconds". The chosen timing intervals are visualized with color shaded regions in the template.
The next step in the process is to define whether the predicate will evaluate to true when the signal is above or below a set threshold. For example, for the Always (□) operator, a signal is selected that is either always above or always below a specified threshold. Once either option is selected, a signal that fits the requirement is automatically generated and presented visually (see Fig. 1).
Relationships between templates enable the development of more complex specifications. The three main relationships between templates are the following:
1) Templates can be placed in a sequence, where the last template is only considered if the previous templates evaluate to true. Formally, this enables the definition of an implication relationship between templates of the form φ ⇒ ψ.
2) Templates can be grouped to establish a conjunction relationship of the form φ ∧ ψ. This is indicated visually by a black box around the templates.
3) Finally, the relative timing relationship enables the definition of: a) Reactive response specifications of the form □(φ ⇒ Mψ); b) Non-strict sequencing specifications of the form N(φ ∧ Mψ), where N and M are temporal operators. This relationship is visually distinct in that the nested template is tabbed in relation to the main template.
The variety of templates and the connections between them allow users to express a wide variety of specifications as presented in Table I.
TABLE I. CLASSES OF SPECIFICATIONS EXPRESSIBLE WITH THE GRAPHICAL FORMALISM
Specification Class | Explanation
Safety | Specifications of the form □φ, used to define specifications where φ should always be true.
Reachability | Specifications of the form ◇φ, used to define specifications where φ should be true at least once in the future (or now).
Stabilization | Specifications of the form ◇□φ, used to define specifications where, at least once, φ should be true and from that point on stay true.
Oscillation | Specifications of the form □◇φ, used to define specifications where it is always the case that, at some point in the future, φ will become true.
Implication | Specifications of the form φ ⇒ ψ, which require that ψ should hold when φ is true.
Reactive Response | Specifications of the form □(φ ⇒ Mψ), where M is a temporal operator, used to define an implicative response between two specifications where the timing of M is relative to the timing of □.
Conjunction | Specifications of the form φ ∧ ψ, used to define the conjunction of two sub-specifications.
Non-strict Sequencing | Specifications of the form N(φ ∧ Mψ), where N and M are temporal operators, used to define a conjunction between two specifications where the timing of M is relative to the timing of N.
IV. MITL ELICITATION FRAMEWORK
Our framework for elicitation of MITL specifications is presented in Fig. 2 . Once a specification is developed using VISPEC, it is translated into STL. Then, we create the corresponding MITL formula from STL. Next, the MITL specification is analyzed by the debugging algorithm which returns an alert to the user in case the specification has inconsistency or correctness issues. The debugging process is explained in detail in the next section.
To enable the debugging of specifications, we must first project the STL predicate expressions into atomic propositions with independent truth valuations. This is very important because the atomic propositions (a ∈ AP ) in MITL are assumed to be independent of each other. This notion of independence is illustrated with the following example. Consider the real-valued signal Speed in Fig. 3 . The boolean abstraction a (resp. b) over the Speed signal is true when the Speed is above 100 (resp. 80). The predicates a and b are related to each other because it is always the case that if Speed > 100 then also Speed > 80. In Fig. 3 , the boolean signals for predicates a and b are represented in the red and blue lines, respectively. It can be seen that red and blue Boolean signals are overlapping which shows the logical dependency between them. However, this logical dependency is not captured if we naively substitute each predicate in the STL formula with a unique atomic proposition. If we lose information about the intrinsic logical dependency between a and b, then the debugging algorithm will not find possible issues. This is because MITL semantics assumes that the atomic propositions are independent of each other.
In order to enable the analysis of STL formulas within our MITL debugging process, we must replace the original predicate expressions with non-overlapping corresponding predicate expressions. For the example illustrated in Fig. 3 , we create a new atomic proposition c which corresponds to 100 ≥ speed > 80 and its signal is represented in green. In addition, we replace the atomic proposition b with the propositional expression a ∨ c since speed > 80 ≡ (speed > 100 ∨ 100 ≥ speed > 80). Now, the dependency between Speed > 100 and Speed > 80 can be preserved because it is always the case that if a (Speed > 100), then a ∨ c (Speed > 80). It can be seen in Fig. 3 that the blue signal b is the disjunction of the red (a) and green (c) signals. By doing this, we have considered not only the syntax but also the semantics of the predicates. The projection of STL to MITL with independent atomic propositions in our implementation is conducted using a brute-force algorithm that runs through all the combinations of predicate expressions to find overlapping areas.
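The projection onto non-overlapping predicates can be illustrated for the simplest case. The sketch below is not the brute-force procedure of the implementation; it assumes all predicates have the form x > c over a single signal, and the names `project_thresholds`, `abstract`, and the atom names `band0`, `band1` are hypothetical. For the thresholds 100 and 80, `band0` plays the role of a (speed > 100) and `band1` the role of c (100 ≥ speed > 80).

```python
def project_thresholds(thresholds):
    """Split predicates 'x > c' into disjoint bands and map each predicate to its bands."""
    cuts = sorted(set(thresholds), reverse=True)      # e.g. [100, 80]
    bands, upper = [], float("inf")
    for i, low in enumerate(cuts):
        bands.append((f"band{i}", low, upper))        # atom is true iff low < x <= upper
        upper = low
    # 'x > c' holds exactly when x lies in a band whose lower edge is >= c.
    mapping = {c: [name for name, low, _ in bands if low >= c] for c in cuts}
    return bands, mapping

def abstract(signal, bands):
    """Boolean abstraction: the set of band atoms that are true at each sample."""
    return [{name for name, low, upper in bands if low < x <= upper} for x in signal]

bands, mapping = project_thresholds([100, 80])
trace = abstract([70, 90, 100, 120], bands)
print(mapping)   # each predicate as a disjunction of band atoms
print(trace)     # at most one band atom is true per sample
```

In this sketch the predicate speed > 80 is mapped to the disjunction of `band0` and `band1`, mirroring the replacement of b by a ∨ c described above.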
V. MITL SPECIFICATION DEBUGGING
Clearly, verifying a system with respect to incorrect specifications is pointless. Therefore, any inconsistencies or other issues with the specification should be resolved. In the following, we present algorithms that can detect inconsistency and correctness issues in specifications. This will help the user in the elicitation of correct specifications.
Our specification debugging process conducts the following checks in this order: 1) Validity, 2) Redundancy, and 3) Vacuity. In brief, validity checking determines whether the specification is satisfiable but not a tautology. Namely, if the specification is unsatisfiable no system can satisfy it and if it is a tautology every system can trivially satisfy it. For example, p ∨ ¬p is a tautology.
Redundancy checking determines whether the specification has no redundant conjunct when the specification is a conjunction of MITL formulas. For example, in the specification p ∧ □_[0,10] p, the first conjunct is redundant. Sometimes redundancy is related to incomplete or erroneous requirements where the user may have wanted to specify something else. Therefore, the user should be notified.
Vacuity checking determines whether the specification has a subformula that does not have any effect on the satisfaction of the specification. For example, ϕ = p ∨ ◇_[0,10] p is vacuous since the first occurrence of p does not affect the satisfaction of ϕ.
Definition 4 (Wrong Specification): A specification which is redundant or vacuous is called wrong.
The reason that we choose the term "wrong" is that although this specification is logically valid, the specification in its current representation does not reflect the intention of the requirement in its natural language form. This is because part of the specification is overshadowed with the other components.
The debugging process is presented in Fig. 4. First, given a specification, a validity check is conducted. If a formula does not pass the validity check, then there is a major problem in the specification and the formula is returned for revision. Therefore, redundancy and vacuity checks are not relevant at that point. Similarly, if the specification is redundant, it has a conjunct that does not have any effect on the satisfaction of the specification, and we return the redundant conjunct for revision. Lastly, if the specification is vacuous, it is returned with the issue for revision by the user.
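The first stage of this process, the validity check, can be sketched as follows. This is an illustration under strong assumptions rather than the procedure used by the framework: time is discretized into integer steps, formulas are encoded as nested tuples (an encoding chosen here for illustration), and satisfiability is decided by brute-force enumeration of short Boolean traces instead of a dedicated MITL satisfiability checker.

```python
from itertools import product

def holds(f, trace, t=0):
    """Evaluate formula f at step t on a trace (list of sets of true propositions)."""
    op = f[0]
    if op == "true":  return True
    if op == "false": return False
    if op == "ap":    return f[1] in trace[t]
    if op == "not":   return not holds(f[1], trace, t)
    if op == "and":   return holds(f[1], trace, t) and holds(f[2], trace, t)
    if op == "or":    return holds(f[1], trace, t) or holds(f[2], trace, t)
    lo, hi, sub = f[1], f[2], f[3]                    # "ev" (eventually) or "alw" (always)
    vals = (holds(sub, trace, u) for u in range(t + lo, t + hi + 1))
    return any(vals) if op == "ev" else all(vals)

def horizon(f):
    """Last time step the formula can inspect when evaluated at step 0."""
    if f[0] in ("ev", "alw"):
        return f[2] + horizon(f[3])
    return max([horizon(g) for g in f[1:] if isinstance(g, tuple)], default=0)

def atoms(f):
    if f[0] == "ap":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:] if isinstance(g, tuple)))

def satisfiable(f):
    """Brute force: try every Boolean trace that is long enough for f."""
    aps, n = sorted(atoms(f)), horizon(f) + 1
    for bits in product([False, True], repeat=len(aps) * n):
        trace = [{a for j, a in enumerate(aps) if bits[i * len(aps) + j]} for i in range(n)]
        if holds(f, trace):
            return True
    return False

def validity(spec):
    """Valid means satisfiable and not a tautology; later checks run only then."""
    if not satisfiable(spec):
        return "unsatisfiable"
    if not satisfiable(("not", spec)):
        return "tautology"
    return "valid"

p = ("ap", "p")
print(validity(("or", p, ("not", p))))    # the tautology example p or not p
print(validity(("and", p, ("not", p))))
print(validity(("ev", 0, 3, p)))
```

Only a specification reported as valid by such a check would be passed on to the redundancy and vacuity checks.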
A. Redundancy Checking
Recall that a specification has a redundancy issue if one of its conjuncts can be removed without affecting the models of the specification. Before we formally present what redundant requirements are, we have to introduce some notation. We consider the specification Φ as a conjunction of k MITL subformulas (ϕ_j):
Φ = ϕ_1 ∧ ϕ_2 ∧ ... ∧ ϕ_k
To simplify the discussion, we will abuse notation and associate a conjunctive formula with the set of its conjuncts. That is:
Φ = {ϕ_j | j = 1, ..., k}
Similarly, {Φ\ϕ_i} represents the specification Φ where the conjunct ϕ_i is removed:
{Φ\ϕ_i} = {ϕ_j | j = 1, ..., k and j ≠ i}
Whether {Φ\ϕ_i} represents a set or a conjunctive formula will be clear from the context. Redundancy in specifications is fairly common in practice due to the incremental additive approach that system engineers take in the development of specifications. In the following, we consider the redundancy removal algorithm provided in [15] for LTL formulas and we extend it to support MITL formulas.
    3: if {Φ\ϕ_i} |= ϕ_i then
    4:     ...
    5: end if
    6: end for
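The redundancy test, which asks whether the remaining conjuncts entail the removed one, can be sketched as below. The code is an illustration, not the algorithm of the paper: it assumes integer time steps, a nested-tuple formula encoding chosen here for convenience, and a brute-force satisfiability check over short traces; the entailment {Φ\ϕ_i} |= ϕ_i is decided by checking that {Φ\ϕ_i} ∧ ¬ϕ_i is unsatisfiable. The example uses p ∧ □_[0,3] p, a scaled-down version of p ∧ □_[0,10] p.

```python
from itertools import product

def holds(f, trace, t=0):
    """Evaluate formula f at step t on a trace (list of sets of true propositions)."""
    op = f[0]
    if op == "true":  return True
    if op == "false": return False
    if op == "ap":    return f[1] in trace[t]
    if op == "not":   return not holds(f[1], trace, t)
    if op == "and":   return holds(f[1], trace, t) and holds(f[2], trace, t)
    if op == "or":    return holds(f[1], trace, t) or holds(f[2], trace, t)
    lo, hi, sub = f[1], f[2], f[3]                    # "ev" or "alw"
    vals = (holds(sub, trace, u) for u in range(t + lo, t + hi + 1))
    return any(vals) if op == "ev" else all(vals)

def horizon(f):
    if f[0] in ("ev", "alw"):
        return f[2] + horizon(f[3])
    return max([horizon(g) for g in f[1:] if isinstance(g, tuple)], default=0)

def atoms(f):
    if f[0] == "ap":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:] if isinstance(g, tuple)))

def satisfiable(f):
    aps, n = sorted(atoms(f)), horizon(f) + 1
    for bits in product([False, True], repeat=len(aps) * n):
        trace = [{a for j, a in enumerate(aps) if bits[i * len(aps) + j]} for i in range(n)]
        if holds(f, trace):
            return True
    return False

def entails(phi, psi):
    """phi |= psi iff phi and not psi is unsatisfiable."""
    return not satisfiable(("and", phi, ("not", psi)))

def conjunction(formulas):
    out = ("true",)
    for f in formulas:
        out = ("and", out, f)
    return out

def redundant_conjuncts(conjuncts):
    """Return every conjunct that is entailed by the remaining ones."""
    found = []
    for i, phi in enumerate(conjuncts):
        rest = conjuncts[:i] + conjuncts[i + 1:]
        if entails(conjunction(rest), phi):
            found.append(phi)
    return found

p = ("ap", "p")
spec = [p, ("alw", 0, 3, p)]          # p and []_[0,3] p
print(redundant_conjuncts(spec))      # only the first conjunct is reported
```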
B. Specification Vacuity Checking
Vacuity detection is used to ensure that all the subformulas of the specification contribute to the satisfaction of the specification. In other words, the vacuity check enables the detection of irrelevant subformulas in the specifications [15]. [0,10](a ∨ c)) where a corresponds to speed > 100 and c corresponds to 100 ≥ speed > 80 is the correct MITL formula corresponding to φ_stl, and it is vacuous. In the following, we provide the definition of MITL vacuity with respect to a signal:
Definition 6 (MITL Vacuity with respect to signal): Given a signal T and an MITL formula ϕ, a subformula ψ of ϕ does not affect the satisfiability of ϕ with respect to T if and only if ψ can be replaced with any subformula θ without changing the satisfiability of ϕ on T. A specification ϕ is satisfied vacuously by T, denoted by T |=_V ϕ, if there exists ψ which does not affect the satisfiability of ϕ on T.
In the following, we extend the framework presented in [15] to support MITL specifications. Let ϕ be a formula in NNF where only predicates can be in the negated form. A literal is defined as a predicate or its negation. For formula ϕ the set of literals of ϕ is denoted by literal(ϕ) and contains all the literals appearing in ϕ. 
The proof of Theorem 1 is a straightforward modification of the proofs given in [15], [37]. We have added the proof in the Appendix (Section IX) for completeness. When the specification has no conjunction (Φ = ϕ), we check the vacuity of the formula with respect to itself. In other words, we check whether the specification satisfies its mutation (ϕ |= ϕ[l ← ⊥] or ϕ |=_V ϕ). Algorithm 2 finds the vacuous subformulas of the specification, similar to [15].
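The mutation-based check ϕ |= ϕ[l ← ⊥] can be sketched as below. As before, this is an illustration under assumptions and not the paper's implementation: integer time steps, a nested-tuple formula encoding, and brute-force satisfiability over short traces. Literal occurrences are identified by their position (path) in the formula, and the example is p ∨ ◇_[0,3] p, a scaled-down version of the vacuous formula p ∨ ◇_[0,10] p.

```python
from itertools import product

def holds(f, trace, t=0):
    op = f[0]
    if op == "true":  return True
    if op == "false": return False
    if op == "ap":    return f[1] in trace[t]
    if op == "not":   return not holds(f[1], trace, t)
    if op == "and":   return holds(f[1], trace, t) and holds(f[2], trace, t)
    if op == "or":    return holds(f[1], trace, t) or holds(f[2], trace, t)
    lo, hi, sub = f[1], f[2], f[3]                    # "ev" or "alw"
    vals = (holds(sub, trace, u) for u in range(t + lo, t + hi + 1))
    return any(vals) if op == "ev" else all(vals)

def horizon(f):
    if f[0] in ("ev", "alw"):
        return f[2] + horizon(f[3])
    return max([horizon(g) for g in f[1:] if isinstance(g, tuple)], default=0)

def atoms(f):
    if f[0] == "ap":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:] if isinstance(g, tuple)))

def satisfiable(f):
    aps, n = sorted(atoms(f)), horizon(f) + 1
    for bits in product([False, True], repeat=len(aps) * n):
        trace = [{a for j, a in enumerate(aps) if bits[i * len(aps) + j]} for i in range(n)]
        if holds(f, trace):
            return True
    return False

def entails(phi, psi):
    return not satisfiable(("and", phi, ("not", psi)))

def literal_paths(f, path=()):
    """Yield the position of every literal occurrence in an NNF formula."""
    if f[0] == "ap" or (f[0] == "not" and f[1][0] == "ap"):
        yield path
    elif f[0] in ("and", "or"):
        yield from literal_paths(f[1], path + (1,))
        yield from literal_paths(f[2], path + (2,))
    elif f[0] in ("ev", "alw"):
        yield from literal_paths(f[3], path + (3,))

def replace_at(f, path, new):
    """Return a copy of f with the subformula at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return f[:i] + (replace_at(f[i], path[1:], new),) + f[i + 1:]

def vacuous_occurrences(phi):
    """Literal occurrences l for which phi |= phi[l <- false]."""
    return [q for q in literal_paths(phi)
            if entails(phi, replace_at(phi, q, ("false",)))]

p = ("ap", "p")
phi = ("or", p, ("ev", 0, 3, p))      # p or <>_[0,3] p
print(vacuous_occurrences(phi))       # path (1,) is the first occurrence of p
```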
VI. SIGNAL VACUITY CHECKING
In the previous section, we addressed specification vacuity without considering the system. However, in many cases specification vacuity depends on the system. Consider the LTL specification ϕ = □(req ⇒ ◇ack). The specification ϕ does not have an inherent vacuity issue [25]. However, if req never happens in the system, then the specification ϕ is vacuously satisfied. As a result, it is important to add vacuity detection in the model checking process. We encounter the same issue when we test signals with respect to STL/MITL specifications.
Algorithm 2 Specification Vacuity Checking
    ...
    for each l ∈ litOccur(ϕ_i) do
        ...
    6: end if
    7: end for
    8: end for
A. Vacuous Signals
Consider the MITL specification ϕ = □_[0,5](req ⇒ ◇_[0,10] ack) as presented in Section II. This formula will pass the MITL Specification Debugging method presented in Section V. However, any signal µ that does not satisfy req at any point in time during the test will vacuously satisfy ϕ. We refer to signals that do not satisfy the antecedent (precondition) of the subformula as vacuous signals. Similarly, these issues arise for STL formulas as well. Consider Task 6 in Table II
1) Antecedent Failure:
For each implication subformula (ϕ ⇒ ψ), the left operand (ϕ) is the precondition (antecedent) of the implication. An antecedent failure mutation is a new formula that is created with the assertion that the precondition (ϕ) never happens. For each precondition ϕ, we create an antecedent failure mutation □_{I_ϕ}(¬ϕ), where I_ϕ is called the effective interval of ϕ.
Definition 9 (Effective Interval):
The effective interval of a subformula is the time interval when the subformula has an impact on the truth value of the whole MITL specification.
In other words, each subformula is evaluated only in the time window that is provided by the effective interval. The effective interval of MITL formulas can be computed recursively using Algorithm 3. To run Algorithm 3, we must provide the parse tree. The algorithm is initialized with the [0,0] interval for the top node of the MITL formula, namely, EIU(ϕ, [0,0]). This is because, according to the semantics of MITL, the value of the whole MITL formula is only important at time zero. In line 8 of Algorithm 3, the operator ⊕ is used to add two intervals endpoint-wise: [l, u] ⊕ [l′, u′] = [l + l′, u + u′]. The effective interval is important for the creation of an accurate antecedent failure mutation. This is because the antecedent can affect the truth value of the MITL formula if it is evaluated in the effective interval. The effective interval acts as a timing window that makes the antecedent observable to an outside observer in the same way it is observed by the MITL specification.
    3: if T |= □_{I_ϕi}(¬ϕ_i) then
    4:     ...
    5: end if
    6: end for
For example, assume that the MITL specification is ϕ = □_[1,2](◇_[3,5] b ⇒ (□_[4,6](c ⇒ d))). The specification ϕ has two antecedents, α_1 = ◇_[3,5] b and α_2 = c, whose effective intervals are [1,2] and [5,8], respectively. As a result, the corresponding antecedent failure mutations are □_[1,2](¬◇_[3,5] b) and □_[5,8](¬c), respectively. Next, we present the second type of mutation.
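The recursive computation of effective intervals and the resulting antecedent failure mutations can be sketched as follows. The function names and the nested-tuple encoding of formulas are hypothetical and chosen for illustration; the sketch assumes closed intervals with numeric end-points and adds them endpoint-wise when descending through a temporal operator, starting from [0,0] at the root.

```python
def antecedents(f, interval=(0, 0)):
    """Yield (antecedent, effective_interval) for every implication in f."""
    op = f[0]
    if op == "ap":
        return
    if op in ("ev", "alw"):
        lo, hi, sub = f[1], f[2], f[3]
        # Descending through a temporal operator shifts the window endpoint-wise.
        yield from antecedents(sub, (interval[0] + lo, interval[1] + hi))
        return
    if op == "imp":
        yield f[1], interval
    for sub in f[1:]:                       # "imp", "and", "or", "not"
        yield from antecedents(sub, interval)

def antecedent_failure_mutations(f):
    """Build []_I(not antecedent) for each antecedent and its effective interval I."""
    return [("alw", lo, hi, ("not", ant)) for ant, (lo, hi) in antecedents(f)]

b, c, d = ("ap", "b"), ("ap", "c"), ("ap", "d")
# phi = []_[1,2]( <>_[3,5] b  =>  []_[4,6]( c => d ) )
phi = ("alw", 1, 2, ("imp", ("ev", 3, 5, b), ("alw", 4, 6, ("imp", c, d))))

for antecedent, window in antecedents(phi):
    print(antecedent, window)
print(antecedent_failure_mutations(phi))
```

For the example formula this yields the windows (1, 2) for ◇_[3,5] b and (5, 8) for c, in agreement with the mutations given above.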
2) Literal Occurrence Removal: This mutation type is generated by recursively substituting the occurrences of literals with ⊥, denoted by ϕ[l ← ⊥] (see Definition 7). To detect vacuous signals, we create a list of all the mutated formulas that are satisfied by the signal. Algorithms for detecting vacuous signals for each of the vacuity types (see Definition 8) are provided in Algorithms 4 and 5. Both algorithms create the list of mutated formulas that are satisfied by the signal T, and if these lists are not empty, then the signal T is vacuous. In Algorithm 4, we check whether the signal satisfies the antecedent failure mutation. In detail, the algorithm returns the list of antecedents (AF_ϕ) that the signal T never satisfies. In Algorithm 5, we check whether the signal satisfies the mutated specification (ϕ_i[l ← ⊥]). Finally, all the mutated formulas are returned to the user (MF_ϕ). To check whether signal T satisfies ϕ's mutations in Algorithm 4 (Line 3) and Algorithm 5 (Line 4), we should use an off-line monitor such as [23].
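The antecedent failure check on a signal can be sketched with a small off-line monitor. The monitor below is a stand-in written for illustration, not the monitor cited above: it assumes integer time steps, Boolean traces given as lists of sets of true propositions, and traces long enough to cover every time window. The specification is □_[0,5](req ⇒ ◇_[0,10] ack) and its antecedent failure mutation is □_[0,5](¬req).

```python
def holds(f, trace, t=0):
    """Discrete-time off-line monitor for the nested-tuple formula encoding."""
    op = f[0]
    if op == "ap":  return f[1] in trace[t]
    if op == "not": return not holds(f[1], trace, t)
    if op == "and": return holds(f[1], trace, t) and holds(f[2], trace, t)
    if op == "or":  return holds(f[1], trace, t) or holds(f[2], trace, t)
    if op == "imp": return (not holds(f[1], trace, t)) or holds(f[2], trace, t)
    lo, hi, sub = f[1], f[2], f[3]                    # "ev" or "alw"
    vals = (holds(sub, trace, u) for u in range(t + lo, t + hi + 1))
    return any(vals) if op == "ev" else all(vals)

def satisfied_mutations(trace, mutations):
    """Return the mutations the trace satisfies; a non-empty list flags a vacuous signal."""
    return [m for m in mutations if holds(m, trace)]

req, ack = ("ap", "req"), ("ap", "ack")
spec = ("alw", 0, 5, ("imp", req, ("ev", 0, 10, ack)))
af_mutations = [("alw", 0, 5, ("not", req))]

quiet = [set() for _ in range(16)]        # req never happens
active = [set() for _ in range(16)]
active[2], active[7] = {"req"}, {"ack"}   # a request at step 2, acknowledged at step 7

print(holds(spec, quiet), satisfied_mutations(quiet, af_mutations))
print(holds(spec, active), satisfied_mutations(active, af_mutations))
```

Both traces satisfy the specification, but only the quiet trace satisfies the mutation, so only that trace would be reported as vacuous.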
In Appendix XI, we prove that for any MITL specification ϕ, which contains one or more disjunction operators (∨) in its NNF, any signal that satisfies ϕ will also satisfy a mutation ϕ i [l ←⊥] for some literal occurrence l. Consequently, any specification which lacks a disjunction operator (∨) in its NNF will not satisfy ϕ i [l ←⊥] for any literal occurrence l. That is, for formulas without any disjunction operator in NNF, we have ϕ i [l ←⊥] ≡⊥ since for any ϕ formula, we have ϕ∧ ⊥≡⊥. Therefore, Algorithm 5 should not be used for formulas without disjunction in NNF.
Antecedent failure is the most critical issue in vacuity analysis. Therefore, when the specification has implication operators, we should only utilize Algorithm 4. We utilize Algorithm 5 to catch potential issues in reactive response specifications where the implication operation was not used. As a result, Algorithm 5 can be used to detect antecedent failure, implicitly.
B. Vacuity Detection in Testing
Detecting vacuous satisfaction of specifications is usually applied on top of model checking tools for finite state systems [10], [37]. Due to the finite state nature of the system, the model checking problem is decidable. However, due to the complex system dynamics of CPS, the model checking problem for CPS is undecidable [2]. Therefore, a formal guarantee about the complete correctness of CPS is impossible, in general. However, CPS are ubiquitous in safety critical systems and the verification of these systems is necessary. Thus, semi-formal verification methods are gaining in popularity [32]. Although we cannot solve the correctness problem with testing and monitoring, we can detect possible errors with respect to the MITL requirements. Vacuity detection in testing is important because vacuous signals do not satisfy the specification for the reason that is intended by the verification engineer.
    6: end if
    7: end for
    8: end for
In Fig. 5 , the testing approach for signal vacuity detection is presented. The input generator creates initial conditions and inputs to the system under test. An example of a test generation technology that implements the architecture in Fig. 5 is presented in [1] . The system executes or simulates to generate an output trace. Then, a monitor checks the trace with respect to the specification and reports to the user whether the system trace satisfies or falsifies the specification (for example [42] , [39] ). Signal vacuity checking is conducted by using Algorithm 4 or 5. Vacuous traces are then reported to the user for further inspection.
VII. EXPERIMENTAL ANALYSIS
All three levels of the correctness analysis of MITL specifications rely on satisfiability checking as the underlying tool [14] . In validity checking, we simply check whether the specification and its negation are satisfiable. In general, in order to check whether ϕ |= ψ, we should check whether ϕ =⇒ ψ is a tautology, that is ∀µ, µ |= ϕ =⇒ ψ. This can be verified by checking whether ¬(ϕ =⇒ ψ) is unsatisfiable.
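The reduction of entailment and validity checks to satisfiability queries can be sketched in a purely propositional setting. This is a simplification: a brute-force enumeration of valuations stands in for an MITL satisfiability solver, and the function names are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Optional, Sequence

Valuation = Dict[str, bool]
Formula = Callable[[Valuation], bool]


def satisfiable(f: Formula, atoms: Sequence[str]) -> bool:
    """Brute-force search for a valuation that makes f true."""
    return any(f(dict(zip(atoms, values)))
               for values in product([False, True], repeat=len(atoms)))


def entails(phi: Formula, psi: Formula, atoms: Sequence[str]) -> bool:
    """phi |= psi holds exactly when (phi and not psi) is unsatisfiable."""
    return not satisfiable(lambda v: phi(v) and not psi(v), atoms)


def validity_issue(phi: Formula, atoms: Sequence[str]) -> Optional[str]:
    """Report a validity issue if phi or its negation is unsatisfiable."""
    if not satisfiable(phi, atoms):
        return "unsatisfiable"
    if not satisfiable(lambda v: not phi(v), atoms):
        return "tautology"
    return None


atoms = ["p", "q"]
p_and_q = lambda v: v["p"] and v["q"]
p_only = lambda v: v["p"]

print(entails(p_and_q, p_only, atoms))   # p ∧ q entails p
print(entails(p_only, p_and_q, atoms))   # p does not entail p ∧ q
print(validity_issue(lambda v: v["p"] or not v["p"], atoms))
```

The same pattern carries over to temporal formulas once `satisfiable` is backed by a suitable solver.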
Recall that ϕ =⇒ ψ is equivalent to ¬ϕ ∨ ψ. So we have to check whether ϕ ∧ ¬ψ is unsatisfiable to conclude that ϕ |= ψ. We use the above reasoning for redundancy checking as well as for vacuity checking. For redundancy checking, {Φ\ϕ i } ∧ ¬ϕ i should be unsatisfiable, in order to reason that {Φ\ϕ i } |= ϕ i . For vacuity checking,
A. MITL Satisfiability
As mentioned earlier, we can check all evaluations of a specification using a satisfiability checker. In order to check whether an MITL formula is satisfiable we use two publicly available tools: qtlsolver 5 and zot 6 . The qtlsolver that we used translates MITL formulas into CLTL-over-clocks [13] , [14] . Constraint LTL (CLTL) is an extension of LTL where predicates are allowed to be assertions on the values of non-Boolean variables [19] . That is, in CLTL, we are allowed to define predicates using relational operators for variables over domains like N and Z. Although satisfiability of CLTL in general is not decidable, some variants of it are decidable [19] .
Fig. 6 . MITL SAT solver from [13] is used for debugging algorithms. [Block diagram: components QTL Solver, Zot, and Z3; stages MITL SAT, CLTLoc SAT, and SMT; outputs Validity, Redundancy, and Vacuity.]
CLTLoc (CLTL-over-clocks) is a variant of CLTL where the clock variables are the only arithmetic variables that are considered in the atomic constraints. It has been proved in [12] that CLTLoc is equivalent to timed automata [18] . Moreover, it can be polynomially reduced to decidable Satisfiability Modulo Theories (SMT) problems, which are solvable by many SMT solvers such as Z3 7 . The satisfiability of CLTLoc is PSPACE-complete [14] and the translation from MITL to CLTLoc can be exponential in the worst case [13] . One additional restriction is that the lower and upper bounds of the intervals of MITL formulas must be integers in order to use the qtlsolver [13] . Therefore, we expect the values to be integers when we analyse MITL formulas. The high level architecture of the MITL SAT solver, which we use for finding validity, redundancy, and vacuity issues, is provided in Fig. 6 .
B. Specification Debugging Results
We utilize the debugging algorithm on a set of specifications developed as part of a usability study for the evaluation of the VISPEC tool [30] . The usability study was conducted on two groups:
1) Non-expert users: These are users who declared that they have little to no experience in working with requirements. The non-expert cohort consists of twenty subjects from the academic community at Arizona State University. Most of the subjects have an engineering background.
2) Expert users: These are users who declared that they have experience working with system requirements. Note that they do not necessarily have experience in writing requirements using formal logics. The expert cohort comprised ten subjects from industry in the Phoenix area.
Each subject received a task list to complete. The task list contained ten tasks related to automotive system specifications. Each task asked the subject to formalize a natural language specification through VISPEC and generate an STL specification. The task list is presented in Table II . Note that the specifications were preprocessed and transformed from the original STL formulas to MITL in order to run the debugging algorithm. For example, specification φ 3 in Table III originally 30] (speed > 100)). The STL predicate expressions (speed > 80), (rpm > 4000), (speed > 100) are mapped into atomic propositions with non-overlapping predicates (Boolean functions) p 1 , p 2 , p 3 . The predicates p 1 , p 2 , p 3 correspond to the following STL representations: p 1 ≡ speed > 100, p 2 ≡ rpm > 4000, and p 3 ≡ 100 ≥ speed > 80.
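The effect of this preprocessing can be sketched as follows. The sketch assumes that an original, possibly overlapping STL predicate is rewritten as a disjunction of the non-overlapping propositions; for instance, speed > 80 corresponds to p 1 ∨ p 3 under the definitions above. The names `atomic_props`, `REWRITE`, and `holds` are hypothetical and not part of VISPEC.

```python
from typing import Dict, Tuple


def atomic_props(speed: float, rpm: float) -> Dict[str, bool]:
    """Evaluate the non-overlapping atomic propositions p1, p2, p3."""
    return {
        "p1": speed > 100,
        "p2": rpm > 4000,
        "p3": 80 < speed <= 100,
    }


# Each original STL predicate as a disjunction of atomic propositions.
REWRITE: Dict[str, Tuple[str, ...]] = {
    "speed > 100": ("p1",),
    "rpm > 4000": ("p2",),
    "speed > 80": ("p1", "p3"),
}


def holds(predicate: str, props: Dict[str, bool]) -> bool:
    """Truth value of an original STL predicate, recovered from p1, p2, p3."""
    return any(props[name] for name in REWRITE[predicate])


for speed in (60.0, 80.0, 90.0, 100.0, 120.0):
    props = atomic_props(speed, rpm=3000.0)
    print(speed, props, holds("speed > 80", props))
```

Because p 1 and p 3 can never hold at the same time, the satisfiability checker does not need to reason about the arithmetic relation between the two speed thresholds.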
In Table III , we present common issues with the developed specifications that our debugging algorithm would have detected and alerted each subject if the tool were available at the time of the study. Note that validity, redundancy and vacuity issues are present in the specifications listed. It should be noted that for specification φ 3 , although finding the error takes a significant amount of time, our algorithm can be used off-line.
Table II (fragment). Task list: task name and natural language specification.
At some point in time in the first 30 seconds, vehicle speed will go over 100 and stay above for 20 seconds.
Oscillation: At every point in time in the first 40 seconds, vehicle speed will go over 100 in the next 10 seconds.
It is not the case that, for up to 40 seconds, the vehicle speed will go over 100 in every 10 second period.
Implication: If, within 40 seconds, vehicle speed is above 100 then within 30 seconds from time 0, engine speed should be over 3000.
Reactive Response: If, at some point in time in the first 40 seconds, vehicle speed goes over 80 then from that point on, for the next 30 seconds, engine speed should be over 4000.
Conjunction: In the first 40 seconds, vehicle speed should be less than 100 and engine speed should be under 4000.
9. Non-strict sequencing: At some point in time in the first 40 seconds, vehicle speed should go over 80 and then from that point on, for the next 30 seconds, engine speed should be over 4000.

In Fig. 7 , we present the runtime overhead of the three-stage debugging algorithm over the specifications collected in the usability study. In the first stage, 87 specifications go through validity checking. Five specifications fail the test and are therefore immediately returned to the user. As a result, 82 specifications go through redundancy checking, where 9 fail the test. Lastly, 73 specifications go through vacuity checking, where 5 specifications have vacuity issues. The remaining 68 specifications pass all the tests. Note that in the figure, two outlier data points are omitted from the vacuity sub-figure for presentation purposes. The two cases were timed at 39,618 sec and 17,421 sec. In both cases, the runtime overhead arose mainly because the zot software took hours to determine that the modified specification is unsatisfiable (both specifications were vacuous). The overall runtime of φ 3 in Table III is 39,645 sec, which includes the runtime of validity and redundancy checking. The runtime overhead of vacuity checking of φ 3 (39,618 sec) can be reduced by half because in vacuity checking we run MITL satisfiability checking for all literal occurrences. In particular, φ 3 has four literal occurrences, and for two of them zot took more than 19,500 sec to determine that the modified specification is unsatisfiable. We can provide an option for early detection: as soon as an issue is found (just one unsatisfiable detection), the software should return the result, which in the case of φ 3 can halve the computation time of the original vacuity detection.
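The staged structure and the early detection option can be sketched as below. The sketch is hypothetical: the stage checks and the `is_unsat` oracle are stubs standing in for calls to the MITL satisfiability solver, and the function names are not taken from the tool.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Check = Callable[[str], bool]  # returns True when the stage finds an issue


def debug_spec(spec: str, stages: Sequence[Tuple[str, Check]]) -> Optional[str]:
    """Run the stages in order and return the name of the first one that fails."""
    for name, has_issue in stages:
        if has_issue(spec):
            return name  # the specification is returned to the user immediately
    return None


def first_unsat(queries: List[str],
                is_unsat: Callable[[str], bool]) -> Tuple[Optional[str], int]:
    """Early detection: stop at the first unsatisfiable query.

    Returns the query found (or None) and the number of solver calls made.
    """
    for calls, query in enumerate(queries, start=1):
        if is_unsat(query):
            return query, calls
    return None, len(queries)


# Stub stages: this hypothetical specification passes validity, fails redundancy.
stages = [
    ("validity", lambda spec: False),
    ("redundancy", lambda spec: True),
    ("vacuity", lambda spec: True),
]
print(debug_spec("spec", stages))

# Four literal-occurrence queries, two of which are (expensively) unsatisfiable.
queries = ["m1", "m2", "m3", "m4"]
slow_unsat = {"m2", "m4"}
print(first_unsat(queries, lambda q: q in slow_unsat))
```

With early detection, the second expensive unsatisfiable query is never issued, which is where the saving for a case like φ 3 would come from.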
The blue circles in Fig. 7 represent the timing performance in each test, categorized by the number of literal occurrences and the number of temporal operators. The red asterisks represent the mean values and the dashed line is the linear interpolation between them. In general, we observe an increase in the average computation time as the number of literal occurrences and temporal operators increases. Ideally, the performance analysis should be conducted over a large set of artificially generated benchmarks, i.e., specification formulas. However, developing such benchmarks is a challenging problem in itself and thus further research is required. All the experimental results in Section VII were obtained on an Intel Xeon X5647 (2.993GHz) machine with 12 GB RAM.
C. LTL Satisfiability
In the previous section, we mentioned that the MITL satisfiability problem is computationally hard. However, LTL satisfiability is in practice solvable faster than MITL satisfiability [38] . We consider how we can use the satisfiability of LTL formulas to decide the satisfiability of MITL formulas. Consider the following fragments of MITL and LTL in NNF: MITL(□):
In Appendix X, we prove that the satisfaction of a formula φ_M ∈ MITL(◇) in NNF is related to the satisfaction of an LTL version of φ_M, called φ_L ∈ LTL(◇), where φ_L is identical to φ_M except that every interval I in φ_M is removed. For example, if φ_M = ◇_[0,10] 
For the always (□) operator, satisfiability is the dual of the eventually operator (◇). Assume that φ_M ∈ MITL(□) contains only the □ operator and φ_L ∈ LTL(□) is the LTL version of φ_M. If φ_L is satisfiable, then φ_M will also be satisfiable.
Based on the above discussion, if the specification φ_M that we intend to test/debug belongs to either of the fragments MITL(◇) or MITL(□), then we can check the satisfiability of its LTL version (φ_L) and decide accordingly:
- If φ_L ∈ LTL(◇) is unsatisfiable, then φ_M is unsatisfiable.
- If φ_L ∈ LTL(□) is satisfiable, then φ_M is satisfiable.
In these two cases, we do not need to run MITL satisfiability checking. As a result, LTL satisfiability checking is useful for validity testing. It may also be useful for redundancy checks. For example, if we have a formula φ = ◇_[0,10] p ∧ □_[0,20] p, we should check the satisfiability of φ′ = □_[0,10] ¬p ∧ □_[0,20] p and φ″ = ◇_[0,10] p ∧ ◇_[0,20] ¬p for redundancy. Although the original formula φ does not belong to either MITL(◇) or MITL(□), its modified NNF versions fit in these fragments and we may benefit from LTL satisfiability checking for φ′ and/or φ″. For vacuity checking, we may on some occasions also be able to use LTL satisfiability if, after manipulating/simplifying the original specification and creating the NNF version, we can categorize the resulting formula into the MITL(◇) or the MITL(□) fragment.
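The interval-removal step and the resulting decision rule can be sketched as follows. The tuple-based formula representation and the function names are assumptions of this example, and the LTL verdict `ltl_sat` is taken as an input rather than computed. The rule for the eventually-only fragment (an unsatisfiable LTL version implies an unsatisfiable MITL formula) is the dual reading suggested by the discussion and is labeled as such in the code.

```python
from typing import Optional, Set, Tuple

# Assumed NNF representation:
#   ("lit", "p"), ("nlit", "p") for a negated literal,
#   ("and", f, g), ("or", f, g),
#   ("F", (lo, hi), f) for eventually, ("G", (lo, hi), f) for always.
Formula = Tuple


def to_ltl(f: Formula) -> Formula:
    """Drop every timing interval, keeping the rest of the formula unchanged."""
    op = f[0]
    if op in ("lit", "nlit"):
        return f
    if op in ("F", "G"):
        return (op, to_ltl(f[2]))
    return (op, to_ltl(f[1]), to_ltl(f[2]))


def temporal_ops(f: Formula) -> Set[str]:
    """Collect the temporal operators that occur in an MITL formula."""
    op = f[0]
    if op in ("lit", "nlit"):
        return set()
    if op in ("F", "G"):
        return {op} | temporal_ops(f[2])
    return temporal_ops(f[1]) | temporal_ops(f[2])


def transfer(f_mitl: Formula, ltl_sat: bool) -> Optional[bool]:
    """Lift an LTL verdict to MITL when the fragment allows it, else None."""
    ops = temporal_ops(f_mitl)
    if ops <= {"G"} and ltl_sat:
        return True    # always-only fragment: LTL satisfiable => MITL satisfiable
    if ops <= {"F"} and not ltl_sat:
        return False   # eventually-only fragment (assumed dual rule)
    return None        # inconclusive: fall back to the MITL solver


phi1 = ("and", ("G", (0, 10), ("nlit", "p")), ("G", (0, 20), ("lit", "p")))
phi2 = ("and", ("F", (0, 10), ("lit", "p")), ("F", (0, 20), ("nlit", "p")))
mixed = ("and", ("F", (0, 10), ("lit", "p")), ("G", (0, 20), ("lit", "p")))

print(to_ltl(phi1))
print(transfer(phi1, ltl_sat=True), transfer(phi2, ltl_sat=False),
      transfer(mixed, ltl_sat=True))
```

A `None` result means the LTL verdict cannot be lifted, so the full MITL satisfiability check is still required.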
In order to check the LTL satisfiability of the modified MITL specifications, we use a model checking tool. Since model checking tools consider LTL over discrete time semantics, we assume that system outputs, i.e. signals, satisfy the finite variability assumption. That is, for every finite time interval, the number of times that any atomic proposition changes valuation is finite as well. The finite variability assumption is satisfied by all physical systems. Therefore, we can use model checking tools for satisfiability checking of LTL formulas. We used the NuSMV 8 tool, which is a well known Symbolic Model Checker [16] . We applied an encoding similar to that in [41] to use NuSMV for satisfiability checking of LTL formulas. In Table IV , we compare the runtime overhead of MITL and LTL satisfiability checking. For the results of the usability study in [30] , we conduct validity and vacuity checking with the LTL satisfiability solver. We remark that in our results in Table IV , all the formulas belong to the MITL(□) fragment. Since we did not find any MITL(◇) formula whose LTL version is unsatisfiable, we could not utilize the LTL satisfiability solver for that fragment.
The first column of Table IV provides the debugging test phase where we used the satisfiability checkers. The second column represents the MITL formulas that we tested using the SAT solver. We omit the LTL formulas from Table IV, since they are identical to the MITL formulas except that they do not contain timing intervals. The predicates p 1 , p 2 , p 3 , p 4 , p 5 of the MITL formulas in Table IV correspond to the following STL representations: p 1 ≡ speed > 100, p 2 ≡ rpm > 4000, p 3 ≡ 100 ≥ speed > 80, p 4 ≡ rpm > 3000, and p 5 ≡ speed > 80. The third and fourth columns represent the runtime overhead of satisfiability checking for MITL specifications and their corresponding LTL versions. The last column represents the speedup of the LTL approach over the MITL approach. It can be seen that the LTL SAT solver (NuSMV) is about 30-300 times faster than the MITL SAT solver (zot). These results indicate that, when applicable, LTL SAT solvers outperform MITL SAT solvers in detecting vacuity and validity issues in specifications.
D. Antecedent Failure Detection
To apply signal vacuity checking we use the S-TALIRO testing framework [1] , [29] . S-TALIRO is a MATLAB toolbox that uses stochastic optimization techniques to search for system inputs for Simulink models which falsify the safety requirements presented in MTL [1] . Falsification based approaches for CPS can help us find subtle bugs in industrial size control systems [31] . If, after using stochastic-based testing and numerical analysis, we cannot find such bugs, then we are more confident that the system works correctly. However, it is concerning if the numerical analysis is mostly based on vacuous signals. If we report vacuous signals to S-TALIRO users, then they will be aware of the vacuity issue. This will help them to focus on the part of the system that causes the generation of vacuous signals. For example, users should find the system conditions that activate the antecedent in case of antecedent failure.
In the following, we illustrate the vacuous signal detection process by using the Automatic Transmission (AT) model provided by MathWorks as a Simulink demo 9 . We introduced a few modifications to the model to make it compatible with the S-TALIRO framework. Further details can be found in [28] . S-TALIRO calls the AT Simulink model in order to generate the output trajectories. The outputs contain two continuous-time real-valued signals: the speed of the engine ω (RPM) and the speed of the vehicle v. In addition, the outputs contain one continuous-time discrete-valued signal gear with four possible values (gear = 1, ..., gear = 4), which indicates the current gear in the auto-transmission controller. S-TALIRO then monitors system trajectories with respect to the requirements provided in Table V . There, in the MITL formulas, we use the shorthand g i to indicate the gear value, i.e. (gear = i) ≡ g i . The simulation time for the system is set to 30 seconds; therefore, we can use bounded MITL formulas for the requirements.
After testing the AT with S-TALIRO, we collected all the system trajectories. Then, we utilized the antecedent failure mutation on the specification to check signal vacuity (Algorithm 4) for each of the formulas that are provided in Table V . We provide the antecedent failure specifications and the signals that satisfy them in Table VI . It can be seen in Table VI that most of the system traces are vacuous signals where the antecedent is not satisfied. This helps the users to consider these issues and identify interesting test cases that can be used to initialize the system tester so that the antecedent is always satisfied.
a) Remark: The work presented here can be modified to debug systems even after a counter example was found. According to Corollary 2, we considered the case where the signal T falsifies the specification ϕ with conjunction operation. We can simplify the requirement in such a way that the sub-requirement generated by iteratively substituting ϕ[l ← ] can also be unsatisfiable. The coverage information of the simplified requirements that are falsified reveals the corresponding conditions for the failure of the CPS. Therefore, system engineers can obtain more informative feedback from the tests, where the sources of the errors can be better located.
VIII. CONCLUSION AND FUTURE WORK
We have presented a specification elicitation and debugging framework that helps expert and non-expert users to produce correct formal specifications. The debugging algorithm enables the detection of logical inconsistencies in MITL and STL specifications. Our algorithm improves the elicitation process by providing feedback to the users on validity, redundancy and vacuity issues. In the future, the specification elicitation and debugging framework will be integrated in the VISPEC tool to simplify MITL and STL specification development for verification of CPS. In addition, we considered vacuity detection with respect to signals. This enables improved analysis since some issues can only be detected when considering both the system and the specification. In the future, we will consider the feasibility of using vacuous signals to improve the counter example generation process and system debugging using signal vacuity.
