Abstract-Model checking is a powerful approach for the formal verification of hardware and software systems. However, this approach suffers from the state space explosion problem, which limits its application to large-scale systems due to space shortage. To overcome this drawback, one of the most effective solutions is to use external memory algorithms. In this paper, we propose an I/O efficient model checking algorithm for large-scale systems. To lower I/O complexity and improve time efficiency, we combine three new techniques: 1) a linear hash-sorting technique; 2) a cached duplicate detection technique; and 3) a dynamic path management technique. We show that the new algorithm has a lower I/O complexity than state-of-the-art I/O efficient model checking algorithms, including detect accepting cycle, maximal accepting predecessors, and iterative-deepening depth-first search. In addition, the experiments show that our algorithm clearly outperforms these three algorithms in runtime on the selected representative benchmarks.
I. INTRODUCTION

MODEL checking is a powerful approach for the formal verification of hardware and software systems. When applicable, it automatically checks whether or not a system satisfies a given specification via detection of counterexamples.
Much effort has been devoted to applying model checking to hardware verification [1]-[4]. However, this approach severely suffers from the state space explosion problem, which renders it inapplicable to large-scale systems due to space shortage [5].
Practical model checking algorithms mainly fall into two types: 1) internal memory algorithms; and 2) external memory algorithms. To overcome the state space explosion problem, internal memory algorithms focus on reducing system size or representation. To this end, many techniques have been introduced, such as partial order reduction [6], symmetry reduction [7], abstraction [8], compositional approach [9], symbolic model checking [10], symbolic trajectory evaluation (STE), automata theory [11], and bounded model checking [12]. Nevertheless, due to the internal memory limitation, internal memory algorithms become inapplicable to large-scale real-life industrial systems.
Compared with internal memory, external memory devices (disks) can provide much larger space. In addition, in the past few years, there has been an enormous increase in the capacity of magnetic disks, with little increase in their cost, resulting in dramatic reductions in the cost per byte. Magnetic disk is about two and a half orders of magnitude cheaper than semiconductor memory [13], which suggests the idea of using external memory in model checking large-scale systems. Because external memory access is orders of magnitude slower than internal memory access [14], [15], the main concern for external memory algorithms is to reduce the number of I/O operations so as to improve their time efficiency.
In this paper, we propose an I/O efficient model checking algorithm for large-scale systems based on nested depth-first search [16], called IOEMC. To lower I/O complexity and improve time efficiency, we combine three new techniques: 1) a linear hash-sorting algorithm denoted by LHS; 2) a cached duplicate detection technique denoted by CDD; and 3) a dynamic path management technique denoted by DPM. The algorithm LHS aims to quickly locate a record in a hash table on disk by the hash value of a state. Whenever the hash table storing visited states in the internal memory is full, it merges the hash table into the sorted hash table in external memory and sorts the new hash table in external memory by a special technique. The I/O complexity of this algorithm is linear in the size of the two hash tables together. The CDD technique allows almost all duplicate detections to be performed in internal memory by efficient management of visited states. Together with LHS, CDD significantly reduces the cost of duplicate detection. The scheme DPM makes the two stacks of the nested depth-first search dynamically share the same memory section and solves the memory dithering problem by efficient management of stacks and states. Here, memory dithering refers to the phenomenon that states are frequently moved into and out of the internal memory (see Section V-D), which may significantly increase the number of I/O operations and thus reduce the efficiency of an algorithm.
To demonstrate the effectiveness of IOEMC, we compare it with state-of-the-art I/O efficient linear temporal logic (LTL) model checking algorithms, including detect accepting cycle (DAC) [14], maximal accepting predecessors (MAP) [17], and iterative-deepening depth-first search (IDDFS) [18]. The complexity comparisons show that IOEMC has lower I/O complexity than DAC, MAP, and IDDFS. Furthermore, the experiments show that the time efficiency of IOEMC is clearly better than that of its competitors.
1063-8210 © 2014 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
The rest of this paper is organized as follows. We describe a running example in Section II. Section III provides some necessary knowledge used in this paper. In Section IV, we introduce the related work. And then, we propose an I/O efficient LTL model checking algorithm for large-scale systems based on LHS, CDD, and DPM in Section V. In Sections VI and VII, we compare our algorithm's I/O complexity and practical performance with that of DAC, MAP, and IDDFS. Section VIII discusses the parameter used in the dynamic path management technique DPM and state number limit. Finally, we conclude this paper in Section IX.
II. EXAMPLE
In order to illustrate the related concepts and our algorithm, we use ITC99, b15(std) as a running example, which is also a benchmark in the experiments of this paper. ITC99, b15(std) is a standard 80386 processor (subset) that has 671 VHDL (VHSIC Hardware Description Language) lines, three processes, 8922 gates, 36 primary inputs, 70 primary outputs, 449 flip-flops, 21 logic-zero, nine logic-one, and 53 018 faults in the complete fault list, where VHSIC is the abbreviation of very high speed integrated circuit. Its RT (register transfer) level VHDL description can be found in [19]. Two properties to be verified for ITC99, b15(std) are as follows.
1) P1: AG(reset = 1 ∧ Pro0@s_3), meaning that whenever the 80386 processor is reset, the system process Pro0 will forever stay at the third state s_3.
2) P2: EF(reset = 1 ∧ Pro0@s_3), meaning that there is one path along which the system process Pro0 will eventually reach the third state s_3 when the 80386 processor is reset.
III. PRELIMINARIES
In this section, we provide a brief introduction of some necessary notions used in this paper. Please refer to [5] and [20] for more details.
A. Model Checking and Automata
Model checking is a technique that checks if a system satisfies the given specification, where the specification is the property that the system needs to satisfy, which is expressed by a logical formula. For example, the verified properties of the benchmark ITC99, b15(std) are expressed by AG(reset = 1 ∧ Pro0@s_3) and EF(reset = 1 ∧ Pro0@s_3). The automata-theoretic approach is one of the most efficient model checking techniques.
Formally, a finite automaton (over finite words) M is a five tuple $(\Sigma, Q, \delta, Q_0, F)$ such as follows.
1) $\Sigma$ is the finite alphabet. The letters in $\Sigma$ are used for transition labels.
2) $Q$ is the finite set of states.
3) $\delta \subseteq Q \times \Sigma \times Q$ is the transition relation; $(q_1, a, q_2) \in \delta$ means that state $q_1$ transits to state $q_2$ through the edge labeled with letter $a$. The letter $a$ is called a transition label.
4) $Q_0 \subseteq Q$ is the set of initial states.
5) $F$ is the set of final states (or accepting states).
We use $L(M)$ to denote the language accepted by M. Suppose the specification (or property) that the system needs to satisfy is expressed by LTL formula $\varphi$, the negation $\neg\varphi$ of the specification is translated into automaton $S = (\Sigma, Q_2, \delta_2, Q_2^0, F_2)$, and the system to be verified is translated into automaton $A = (\Sigma, Q_1, \delta_1, Q_1^0, F_1)$.
According to [5] and [14], model checking is to check whether or not there is an accepting cycle accessible from some initial state in the intersection automaton of A and S, which is denoted by A ∩ S. Here, a cycle is a path whose first and last vertices (states) are the same, and an accepting cycle is a cycle going through some accepting vertex. If there exists such a cycle, then the path consisting of the accepting cycle and the path from some initial state to the cycle is a counterexample; otherwise, the system satisfies the given specification. Thus, transition labels can be ignored, which does not affect the verification results.
The intersection of automata A and S is an automaton that accepts language $L(A) \cap L(S)$. Note that intersection of automata here is different from that of sets. Because accepting states from both automata may appear together only finitely many times even if they appear individually infinitely often [5], setting $F = F_1 \times F_2$ does not work. Hence, we build A ∩ S following the method in [5]. Namely,
$$A \cap S = (\Sigma,\; Q_1 \times Q_2 \times \{0, 1, 2\},\; \delta,\; Q_1^0 \times Q_2^0 \times \{0\},\; Q_1 \times Q_2 \times \{2\}),$$
where $\Sigma$ is the alphabet, $Q_1 \times Q_2 \times \{0, 1, 2\}$ is the state set, $\delta$ is the transition relation, $Q_1^0 \times Q_2^0 \times \{0\}$ is the initial state set, and $Q_1 \times Q_2 \times \{2\}$ is the accepting state set of A ∩ S, respectively. The transition relation of A ∩ S is defined as follows:
$$((r_i, q_j, x),\, a,\, (r_m, q_n, y)) \in \delta$$
if and only if the following conditions hold:
1) $(r_i, a, r_m) \in \delta_1$ and $(q_j, a, q_n) \in \delta_2$, that is, the local components agree with the transitions of A and S;
2) the third component is affected by the accepting conditions of A and S;
3) if $x = 0$ and $r_m \in F_1$, then $y = 1$;
4) if $x = 1$ and $q_n \in F_2$, then $y = 2$;
5) if $x = 2$, then $y = 0$;
6) otherwise, $y = x$.
Fig. 1
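The construction of A ∩ S can be sketched in a few lines. The following is a minimal illustration rather than the authors' implementation: the dictionary layout (`init`, `final`, `delta`) and the function name `intersect` are assumptions made for this example, and only the states reachable from the initial states are generated.

```python
def intersect(A, S):
    """Build the reachable part of the intersection automaton of A and S.

    A and S are dicts (an assumed layout) with keys 'init' (set of states),
    'final' (set of states) and 'delta' (set of (source, letter, target)).
    """
    def next_flag(x, r_m, q_n):
        # Update of the third component, following the rules in the text.
        if x == 0 and r_m in A["final"]:
            return 1
        if x == 1 and q_n in S["final"]:
            return 2
        if x == 2:
            return 0
        return x

    init = {(r, q, 0) for r in A["init"] for q in S["init"]}
    states, delta, work = set(init), set(), list(init)
    while work:
        r, q, x = work.pop()
        for (r1, a, r2) in A["delta"]:
            if r1 != r:
                continue
            for (q1, b, q2) in S["delta"]:
                # Both automata must move on the same letter.
                if q1 != q or b != a:
                    continue
                target = (r2, q2, next_flag(x, r2, q2))
                delta.add(((r, q, x), a, target))
                if target not in states:
                    states.add(target)
                    work.append(target)
    accepting = {s for s in states if s[2] == 2}
    return states, delta, init, accepting


# Tiny example: both automata have one accepting state with a self-loop.
A = {"init": {"r"}, "final": {"r"}, "delta": {("r", "a", "r")}}
S = {"init": {"q"}, "final": {"q"}, "delta": {("q", "a", "q")}}
states, delta, init, accepting = intersect(A, S)
```

In this toy example the third component cycles through 0, 1, and 2, so the product has three states, one of which is accepting.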
B. I/O Complexity Model
Because the access to information stored on an external memory device is orders of magnitude slower than the access to information stored in the internal memory [14], [15], complexities of external memory algorithms are usually measured in terms of the number of I/O operations. Here, an I/O operation is a transfer of data from a disk to internal memory or from internal memory to a disk. For example, for the benchmark ITC99, b15(std), P1, our algorithm costs in total 210 I/O operations when finding one counterexample.
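As a small illustration of counting I/O operations, the sketch below assumes that one I/O operation transfers a block of at most B records, so reading or writing N consecutive records takes ⌈N/B⌉ operations. The function name `scan_ios` is hypothetical and is used only for this example.

```python
def scan_ios(n_records, block_size):
    """Number of block transfers needed to read (or write) n_records
    sequentially when one I/O operation moves block_size records."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Integer ceiling division avoids floating-point rounding for large inputs.
    return (n_records + block_size - 1) // block_size


ios_full_blocks = scan_ios(1000, 100)   # exactly ten full blocks
ios_partial = scan_ios(1001, 100)       # one extra, partially filled block
```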
For complexity analysis of external memory algorithms, a widely used model is the model of Aggarwal and Vitter [20] . In the model, the number of I/O operations is usually described 
IV. RELATED WORK
Different I/O efficient algorithms for LTL model checking have been proposed. Most algorithms are based on non-depth-first search (non-DFS), which includes breadth-first search (BFS) and A* [14], [21]-[27], because they utilize the delayed duplicate detection technique, which is incompatible with DFS [25]. The technique needs to maintain a set of visited states on disk to prevent them from being reexplored. It is based on the observation [21] that a newly generated state does not need to be checked against the state table immediately; one can postpone the checking until an entire level of the search has been explored and then check all states in the level together by linearly reading the table from the disk.
To the best of our knowledge, in the last few years, among this kind of algorithms, DAC [14] , MAP [27] , and IDDFS [18] achieve state-of-the-art performance and represent the most recent advances.
The algorithm DAC [14] adapts an existing non-DFS-based accepting cycle detection algorithm, One Way Catch Them Young [28], to the I/O efficient setting. The algorithm first inserts all reachable vertices into an approximation set. After that, it repeatedly reduces the approximation set until a fixpoint is reached. In detail, vertices violating the condition are gradually removed from the approximation set using two procedures. One procedure removes those vertices from the approximation set that lie outside any cycle. The other removes vertices lying on nonaccepting cycles. Finally, if the approximation set is empty, there is no accepting cycle in the graph; otherwise, the presence of an accepting cycle is ensured. The algorithm is especially useful for verification of large systems with valid properties. However, it needs to construct the whole state space.
Since DAC does not work on-the-fly, Barnat et al. [14] , the same authors of DAC, further proposed an on-the-fly algorithm: MAP algorithm [17] , [27] , which is a revisiting resistant algorithm for I/O efficient LTL model checking. Revisiting resistant graph algorithms are those that significantly reduce the number of expensive I/O operations at the price of reexploration of edges (or vertices) in internal memory. They are actually an improvement on the delayed duplicate detection technique. The idea of the delayed duplicate detection technique is to postpone the duplicate check of single vertex against disk and perform them together in a group, for the reduction of the number of I/O operations. In the case of BFS traversal, the group (also called candidate set) consists typically of a single BFS level. However, if the level is small, the utility of delaying duplicate detection drops down. A possible solution is to maximize the group by exploring more BFS levels at once which will lead to revisiting of vertices due to cycles. However, even though vertex revisits result in performing more (cheap) operations in internal memory, it might significantly reduce the number of expensive I/O operations. Thus, revisiting resistant algorithms are expected to be more I/O efficient than nonresistant ones in practice. The main idea behind the MAP algorithm is based on the fact that each accepting vertex lying on an accepting cycle is its own accepting predecessor. Instead of expensive computing and storing of all accepting predecessors for each (accepting) vertex, the algorithm computes and stores a single representative accepting predecessor for each vertex, namely the maximal one in a suitable ordering of vertices. Experiments showed the algorithm outperformed previous I/O efficient algorithms on invalid LTL properties.
The IDDFS is a 5-bit semiexternal LTL model checking algorithm proposed in [18]. Semiexternal graph algorithms are algorithms in which the vertices but not the edges fit in memory [29]. The IDDFS uses the heuristic EPH to construct a minimal perfect hash function from the vertex set stored on disk, which allows compressing V to 5|V| bits, so that only these 5|V| bits, rather than V itself, need to be stored in internal memory, where V is the vertex set of a graph. Thus, the algorithm can handle spaces that are orders of magnitude larger than internal memory. However, IDDFS still has a limitation on the size of the graph because it needs 5 bits of internal memory for every vertex. This algorithm works on-the-fly by applying an iterative-deepening strategy.
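The 5-bit-per-vertex requirement translates directly into a bound on the graph size. The following back-of-the-envelope sketch assumes a machine with 2 GiB of internal memory entirely available for the bit table (an optimistic assumption) and a state space of 6 000 000 000 vertices, matching the upper end of the benchmark sizes reported in the experiments; the variable and function names are hypothetical.

```python
def iddfs_bits_needed(num_vertices, bits_per_vertex=5):
    """Internal memory (in bits) needed for the per-vertex bit table."""
    return num_vertices * bits_per_vertex


# Assumption: "2G" of memory is read as 2 GiB, all usable for the table.
ram_bits = 2 * 2**30 * 8

# Largest vertex count whose 5-bit table still fits in that memory.
max_vertices = ram_bits // 5

# A state space of six billion vertices needs more bits than are available.
fits = iddfs_bits_needed(6_000_000_000) <= ram_bits
```

Under these assumptions, roughly 3.4 × 10^9 vertices is the ceiling, so a state space of several billion states already exceeds what a 5-bit table can cover.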
V. I/O EFFICIENT MODEL CHECKING FOR LARGE-SCALE SYSTEMS
In this section, we propose an I/O efficient LTL model checking algorithm based on the nested depth-first search [16] , namely IOEMC. The IOEMC incorporates three key ideas: 1) a new linear hash-sorting algorithm LHS; 2) a new duplicate detection technique CDD; and 3) a dynamic search path management scheme DPM. By using linear hash-sorting algorithm and new duplicate detection technique, IOEMC can significantly reduce the time cost of duplicate detection of states. In addition, IOEMC also solves the memory dithering problem by using DPM technique.
We first present related data structures and memory usage. And then, LHS, CDD, and DPM techniques are proposed. Finally, we describe the model checking algorithm IOEMC, which is based on the above techniques. In the following, #(Q) expresses the number of elements in Q, and we assume that the algorithm can read or write B records from or into external memory in each I/O operation.
A. Data Structures and Memory Usage
The IOEMC needs a database DB on disk and two stacks stack 1
Tables tableP1 and tableP2: They store the states on the path in the first DFS and those on the path in the second DFS, respectively.
In our algorithm, the internal memory space is divided into a code segment and a data segment. The data segment is divided into two equal parts: $T_1$ and $T_2$. The first one is further divided into two subparts $T_{11}$ and $T_{12}$, which are of the same size and store two hash tables $H_1$ and $H_2$ for the two DFSs, respectively. Each element in $H_1$ and $H_2$ is a tuple $(h, s)$, where $s$ is a visited state and $h$ is the hash value of the state, and all elements in $H_1$ or $H_2$ are stored in time sequence. The other part $T_2$ is shared by stack 1 and stack 2 in a dynamic way.
Note that the aim of using tuples is to accelerate the search of the disk tables tableDD1 and tableDD2, and to avoid hash collision. By using tuples, we can not only search the disk tables sorted by hash values quickly, but also differentiate two different states even if they share the same hash value.
B. Linear Hash-Sorting Algorithm
In this section, we propose a linear hash-sorting algorithm LHS, which is used to reduce I/O complexity and improve the practical performance of our model checking algorithm.
The hash-sorting problem we consider in this paper is described as follows.
In the following, we again use the example ITC99, b15(std) to illustrate how the hash-sorting algorithm works. Note that every state is denoted by its hash value. Table I shows the states in the internal memory and the states in the table on disk before merging, both sorted in nondecreasing order of hash value. Our aim is to merge those in the internal memory into the table on disk. The last line "−−−" in Table I(b) means that 1000 empty records are appended, where 1000 is equal to the number of states in the internal memory. Here, 100 states can be transferred in a single I/O operation. After sequentially performing the following operations: 1) moving the last 100 states in Table I(b) (which are from 4409 to 5833) into the internal memory; 2) sorting them in the internal memory; and 3) moving the states in the internal memory whose hash values are greater than or equal to 4409 into Table I(b), we obtain the result shown in Table II(a) and (b). This continues until all records in Table I(b) are handled. The final result on the disk is shown in Table II(c).
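The worked example suggests a back-to-front block merge. The sketch below is one reading of that example, not the authors' algorithm listing: Python lists stand in for the disk table, states are represented by integer hash values only, and the names `merge_into_disk_table`, `pending`, and `io_ops` are assumptions introduced here.

```python
import bisect


def merge_into_disk_table(disk_table, memory_table, block):
    """Merge sorted in-memory records into a sorted 'disk' table in place,
    processing the disk table one block at a time from its end."""
    n, m = len(disk_table), len(memory_table)
    table = list(disk_table) + [None] * m   # append m empty records
    pending = sorted(memory_table)          # records still held in memory
    read_end, write_end = n, n + m
    io_ops = 0
    while read_end > 0:
        start = max(0, read_end - block)
        chunk = table[start:read_end]       # one block read from "disk"
        io_ops += 1
        # In-memory records not smaller than the chunk's smallest hash value
        # belong at or after this chunk in the final order.
        split = bisect.bisect_left(pending, chunk[0])
        merged = sorted(chunk + pending[split:])
        pending = pending[:split]
        write_start = write_end - len(merged)
        table[write_start:write_end] = merged
        io_ops += (len(merged) + block - 1) // block   # block writes
        read_end, write_end = start, write_start
    # Whatever is left in memory is smaller than every disk record.
    table[:len(pending)] = pending
    io_ops += (len(pending) + block - 1) // block
    return table, io_ops


table, io_ops = merge_into_disk_table([1, 4, 4, 9, 12], [0, 4, 10, 13], block=2)
```

Because the free gap between the read position and the write position always equals the number of records still pending in memory, the writes never overwrite a disk record that has not yet been read, and every disk block is read once.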
C. Cached Duplicate Detection Technique
In this section, we propose a cached duplicate detection technique CDD, which can significantly improve the performance of IOEMC.
In our duplicate detection technique, the visited states are divided into two groups: 1) the recent states; and 2) the historical states. The recent states are the ones generated most recently and stored in the hash In general, if we select ρ 1 < 0.05, then duplicate detections of almost all states are carried out in the internal memory.
We give an intuitive explanation as follows. Suppose the data segment is allocated 2G of memory, every state needs 500 bits of internal memory, and every hash value needs 12 bits. Then, $H_1$ occupies 0.5G and can hold the $2^{20}$ most recently generated tuples. When $H_1$ is full and the algorithm moves the first $(\#(H) \cdot \rho_1)$ tuples of the hash table $H$ into external memory, the number of tuples in $H_1$ is still close to $2^{20}$ because $\rho_1 < 0.05$. In general, the probability that the currently generated state lies beyond the $2^{20}$ most recently generated states is extremely small. Therefore, for most states, their duplicate detections are executed in the internal memory.
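The capacity figure can be checked with a short calculation. The sketch assumes that "G" denotes $2^{30}$ bits, which is the reading under which the stated capacity of $2^{20}$ tuples follows exactly; all variable names are hypothetical.

```python
GBIT = 2**30                        # assumption: "G" read as 2^30 bits

data_segment_bits = 2 * GBIT        # data segment of 2G
t1_bits = data_segment_bits // 2    # T1 is half of the data segment
h1_bits = t1_bits // 2              # T11 (holding H1) is half of T1
tuple_bits = 500 + 12               # one state plus its hash value

h1_capacity = h1_bits // tuple_bits


def remaining_after_flush(capacity, rho1):
    """Tuples left in memory after the oldest rho1 fraction is moved out."""
    return capacity - int(capacity * rho1)


left = remaining_after_flush(h1_capacity, 0.05)
```

Even at the upper end of the suggested range for ρ1, about 95% of the table stays in internal memory after a flush.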
To verify our argument, we conduct some experiments to determine the proportion of states whose duplicate detections are performed in internal memory. The selected benchmarks and experimental environment are the same as those of Section VII. The experimental results are listed in Table III. The experiments show that, for all models, the ratio of the number of states whose duplicate detections are carried out in internal memory to the number of all generated states is at least 90%. In addition, we also observe that setting ρ 1 = 0.02 yields the best performance for most large-scale models. The main cause is as follows. There is a tradeoff in time consumption in the setting of ρ 1. As the value of ρ 1 decreases, the number of visited states stored in internal memory increases, which saves time for duplicate detection, but this also results in more frequent movement of state blocks from the internal memory to the disk, leading to more time spent on state management, and vice versa.
The above analysis and experimental evidence show that CDD can significantly reduce the cost of duplicate detection due to the fact that almost all duplicate detections of IOEMC are carried out in internal memory and tableDD is sorted by hash values.
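The idea of keeping recent states in memory and flushing only the oldest fraction can be sketched as follows. This is a simplified model and not the authors' code: an `OrderedDict` stands in for the time-ordered hash table, a Python `set` stands in for the sorted disk table, and the class and attribute names are hypothetical. The sketch counts a disk lookup whenever a state is absent from memory and the disk table is nonempty.

```python
from collections import OrderedDict


class CachedDuplicateDetector:
    def __init__(self, capacity, rho1):
        self.capacity = capacity          # tuples the in-memory table can hold
        self.rho1 = rho1                  # fraction flushed when it is full
        self.recent = OrderedDict()       # (hash, state) keys in time order
        self.historical = set()           # stand-in for the disk table
        self.disk_lookups = 0

    def is_duplicate(self, state):
        key = (hash(state), state)
        if key in self.recent:            # resolved in internal memory
            return True
        if self.historical:               # fall back to the disk table
            self.disk_lookups += 1
            if key in self.historical:
                return True
        if len(self.recent) >= self.capacity:
            self._flush_oldest()
        self.recent[key] = None           # record the new state
        return False

    def _flush_oldest(self):
        count = max(1, int(len(self.recent) * self.rho1))
        for _unused in range(count):
            key, _value = self.recent.popitem(last=False)   # oldest first
            self.historical.add(key)


detector = CachedDuplicateDetector(capacity=4, rho1=0.5)
results = [detector.is_duplicate(s) for s in [0, 1, 2, 3, 4, 0, 3]]
```

In this toy run, state 0 has been flushed and is found on "disk", while state 3 is still resolved in memory.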
D. Dynamic Search Path Management
The search path management includes the static and dynamic ones. The static management means the algorithm allocates fixed internal memory sections to two stacks of the nested depth-first search, while the dynamic one means two stacks share the same internal memory section. Thus, the dynamic management can more efficiently make use of the internal memory. In this section, we introduce a scheme for dynamic search path management, called DPM.
During search, when T 2 is full and a new state is generated, in order to avoid memory overflow, we need to move states from the two stacks to DB. However, this may result in the phenomenon that states are frequently moved into and out of the internal memory.
In the following, we analyze how this phenomenon occurs. Suppose we swap M 2 states at a time between T 2 and the disk, where M 2 is the number of states that T 2 can hold. When T 2 is full, if we transfer all states in T 2 to the disk, then T 2 becomes empty. Subsequently, if the algorithm needs to pop a state from stack 1 or stack 2 because of backtracking, then those M 2 states just moved to the disk have to be moved back to the internal memory immediately, and thus T 2 becomes full. Afterward, if another new state is generated and needs to be pushed into stack 1 or stack 2, then the M 2 states have to be moved out of the internal memory again to make space for the new state, and so on. We call such a phenomenon of frequent state movement memory dithering, which can significantly increase disk accesses and thus the algorithm's I/O complexity.
In order to avoid memory dithering, we present an efficient scheme which works as follows. 
When stack 2 is empty, the algorithm works similarly. The corresponding procedure is implemented in the DDB-mem() function and is outlined in Algorithm 3. Note that by reserving some states in both stacks when T 2 is full, we ensure there are always some states in both stacks, which to some extent reduces the memory dithering phenomenon.
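The effect of reserving some states in memory can be illustrated with a single-stack simplification. The sketch below is not the two-stack scheme itself: it models one stack with a memory capacity and a fraction ρ of states spilled when the memory is full, and the names `SpillStack`, `transfers`, and `dither` are assumptions for this example. Setting ρ = 1.0 reproduces the naive policy that moves everything out.

```python
class SpillStack:
    """A stack whose oldest entries are spilled to 'disk' in blocks."""

    def __init__(self, capacity, rho):
        self.capacity = capacity
        self.block = max(1, int(round(capacity * rho)))  # states per spill
        self.mem = []          # in-memory part (top of the stack)
        self.disk = []         # spilled blocks, most recent last
        self.transfers = 0     # block movements in either direction

    def push(self, state):
        if len(self.mem) == self.capacity:
            # Spill the oldest block and keep the newest states in memory.
            self.disk.append(self.mem[:self.block])
            self.mem = self.mem[self.block:]
            self.transfers += 1
        self.mem.append(state)

    def pop(self):
        if not self.mem and self.disk:
            self.mem = self.disk.pop()     # reload the latest spilled block
            self.transfers += 1
        return self.mem.pop()


def dither(rho, capacity=10, cycles=100):
    """Fill the stack, then alternate pushes and pops at the boundary."""
    stack = SpillStack(capacity, rho)
    for i in range(capacity):
        stack.push(i)
    for _ in range(cycles):
        stack.push("x")
        stack.pop()
        stack.pop()
        stack.push("y")
    return stack.transfers


naive_transfers = dither(rho=1.0)      # move everything out when full
reserved_transfers = dither(rho=0.9)   # keep 10% of the states in memory
```

In this workload the naive policy moves a full block on every push/pop pair at the boundary, while the reserving policy spills once and then operates entirely in memory.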
E. Model Checking Algorithm IOEMC
Based on LHS, CDD, and DPM, we design the IOEMC algorithm. The IOEMC is an I/O efficient on-the-fly model checking algorithm based on the nested depth-first search.
We use V 0 to denote the initial state set and F to denote the accepting state set. Also, we assume the state transition graph of the automata is implicitly given from the function successor(x) that generates all successors of the state x.
Algorithm 3 Dynamic Empty-Stack Management
The first DFS, outlined in Algorithm 4, is to search for a path from an initial state to some accepting state by postorder traversal. In each while loop, for the current state x, the algorithm first performs duplicate detection for successor s by using the CDD technique. If s is a new state, then the algorithm puts s into stack 1 and puts the tuple (hash(s), s) into H 1. Before these two operations, the algorithm needs to check whether or not stack 1
The second DFS is to detect an accepting cycle and finally return a counterexample if the system does not satisfy the given specification. The counterexample consists of an accepting cycle and a path to the cycle from some initial state. The second DFS works similarly to the first DFS, and is outlined in Algorithm 5.
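For orientation, the classical in-memory nested depth-first search on which IOEMC builds can be sketched as follows. This sketch deliberately omits LHS, CDD, and DPM and keeps all visited states in Python sets; the function name `has_accepting_cycle` is hypothetical, `successor` is assumed to return an iterable of successor states, and the recursive first search is only suitable for small graphs.

```python
def has_accepting_cycle(initial, successor, accepting):
    """Classical nested DFS: returns True if some accepting state that is
    reachable from an initial state lies on a cycle."""
    visited1, visited2 = set(), set()

    def second_dfs(seed):
        # Look for a path from seed back to seed.
        visited2.add(seed)
        stack = [seed]
        while stack:
            x = stack.pop()
            for s in successor(x):
                if s == seed:
                    return True
                if s not in visited2:
                    visited2.add(s)
                    stack.append(s)
        return False

    def first_dfs(x):
        visited1.add(x)
        for s in successor(x):
            if s not in visited1 and first_dfs(s):
                return True
        # Postorder: start the second search only after all successors
        # of x have been fully explored.
        return x in accepting and second_dfs(x)

    return any(first_dfs(v) for v in initial if v not in visited1)


graph = {0: [1], 1: [2], 2: [1]}
found = has_accepting_cycle([0], lambda x: graph[x], accepting={2})
not_found = has_accepting_cycle([0], lambda x: graph[x], accepting={0})
```

State 2 lies on the cycle 1 → 2 → 1, so an accepting cycle is found when 2 is accepting, whereas state 0 is not on any cycle.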
VI. COMPLEXITY ANALYSIS
In this section, we analyze the I/O complexity of IOEMC. Note that the correctness of IOEMC can be easily proved by following the similar proof line in [16] . In the following, we let N express the number of states of the verified system and M be the number of states that the internal memory can hold.
A. Complexity of Algorithm IOEMC
In this section, we estimate the I/O complexity of IOEMC.
Lemma 2: The total number of I/O operations that dynamic search path management needs in the nested depth-first search is O(scan(N)).
Proof: The worst case is the one with the largest number of I/O operations. Thus, the algorithm should traverse all states and pop them out from the two stacks, and the number of states moved from the internal memory to the disk is equal to the number of states moved from the disk to the internal memory. Thus, when $T_2$ is full, $(M_2 \cdot \rho_2)$ states are moved from the internal memory to the disk, and then $(M_2 \cdot (1 - \rho_2))$ states are continuously popped out from the two stacks one by one and $T_2$ becomes empty, and then $(M_2 \cdot \rho_2)$ states have to be transferred to the internal memory from the disk again; afterward, the algorithm continuously pushes $(M_2 \cdot (1 - \rho_2))$ states into the stacks one by one and $T_2$ becomes full, and then it moves $(M_2 \cdot \rho_2)$ states from the internal memory to the disk again, and this goes on until all states of the system are traversed. In this process, the block of states is moved from the internal memory to the disk
Algorithm 4 First DFS Based on Quick Hash-Sorting Algorithm
Because every movement of a block of states needs $(M_2 \cdot \rho_2 / B)$ I/O operations and the number of states moved from the internal memory to the disk is equal to that from the disk to the internal memory, the whole process costs 2((N − M 2 · ρ 2 )/ O(scan(N)). Because $T_1$ and $T_2$ are of the same size and $T_{11}$ and $T_{12}$ are of the same size, the I/O complexity of algorithm IOEMC is
Algorithm 5 Second DFS Based on Quick Hash-Sorting Algorithm
B. Complexity Comparison
In this section, we compare IOEMC with DAC, MAP, and IDDFS, in terms of I/O complexity.
The DAC is an I/O efficient algorithm for accepting cycle detection proposed in [14]. The [27]. Because |E| is larger than N, the I/O complexity of IOEMC is much lower than that of MAP in the case for candidate set in RAM. In the case for candidate set on disk, from Section III-A, we can observe that |F| is equal to N/3 for the intersection of automata A and S. It follows that the I/O complexity of MAP is $O(N^3)$. Thus, IOEMC outperforms MAP in terms of I/O complexity.
The IDDFS is a semiexternal algorithm proposed in [18] . 
VII. EXPERIMENT
In this section, we compare runtime and allocated disk space of IOEMC with that of DAC, MAP, and IDDFS.
A. Benchmarks
In order to compare the performance of IOEMC with that of DAC, MAP, and IDDFS, we selected benchmarks from [14], [18], [19] and added models Peterson(6), P4 and Szyman.(6), P4. The two added models show the limit on the scale of systems that IDDFS can verify. All selected benchmarks are from the BEEM project [30], which include models with valid properties and models with invalid properties, ranging from less than 50 000 states to more than 6 000 000 000 states. They are typical ones in the literature and serve as a good test bed to assess the efficiency and performance of model checking algorithms.
B. Experimental Setup
The four algorithms have been implemented on top of the DiVine library [31] , providing the state space generator, and the STXXL library [32] , providing the I/O primitives. For IOEMC, we set the parameters ρ 1 = 0.02 and ρ 2 = 0.90.
All experiments were run on a PC with a P4 2.4G CPU, 2G of memory, 400 GB of disk space, and the Linux 9.0 operating system. For each instance, each algorithm is run 100 times. For each algorithm on each instance, we report the average runtime (time) and average disk consumption (disk). The time format is hh:mm:ss (the elapsed hours, minutes, and seconds).
C. Experimental Results
Experimental results on models with valid properties are presented in Table IV. As is clear from Table IV, IOEMC verifies these valid models significantly faster than the other algorithms. On the five benchmarks, namely Elevator2(16), P4, MCS(5), P4, Phils(16, 1), P3, Lamport(5), P4, and ITC99, b15(std), P2, for which all the algorithms verify the validity within 10 h, IOEMC does this two to three times faster than the other algorithms. For the two hard benchmarks Peterson(6), P4 and Szyman.(6), P4, IDDFS fails to handle them due to internal memory shortage, as it is a semiexternal algorithm which needs five extra bits of internal memory for every state. In addition, both DAC and MAP need more than 30 h to verify each of these two benchmarks, while IOEMC only needs 12 and 15 h for Peterson(6), P4 and Szyman.(6), P4, respectively. Nevertheless, as IOEMC needs to store not only each state but also its hash value on disk, it consumes a bit more space than the other algorithms on models with valid properties.
Experimental results on models with invalid properties are reported in Table V. An obvious observation from Table V is that all the algorithms but DAC can find a counterexample for these benchmarks very quickly (within several minutes). Notably, IOEMC dominates other algorithms on four benchmarks, namely, Bakery(5,5),P3, Elevator2(16),P5, Szyman(4),P2, and Lifts(7),P4, in terms of time consumption. For these three benchmarks, IOEMC performs at least two times faster than the best of the other algorithms. Admittedly, for two of the small models, namely, ITC99, b15(std),P1, and Lifts(7),P4, IOEMC is a bit slower than the other three algorithms. This is because the three new techniques used in IOEMC are designed to reduce the number of I/O operations for large models, and have no effect for small models.
VIII. DISCUSSION
A. ρ 2 Parameter
The ρ 2 parameter must be provided in order to execute IOEMC. It influences the performance of IOEMC by controlling the size of state blocks moved into or out of the internal memory.
To observe the parameter's impact on the performance of IOEMC, we run IOEMC with different values of the ρ 2 parameter for each of the used instances. The selected benchmarks and experimental environment are the same as those of Section VII except for models ITC99, b15(std),P2, Szyman.(4),P2, ITC99, b15(std),P1, Elevator2(7),P5, and Lifts(7),P4, because small models do not need to use the DPM technique. The experimental results are reported in Table VI.
From Table VI , we observe that the ρ 2 parameter has an obvious impact on the performance of IOEMC, and there are different optimal values of the parameter between different instances, and the optimal values mainly are between 0.85 and 0.90. The investigation about adjusting the ρ 2 parameter automatically is left for future work.
B. State Number Limit
In this section, we discuss the state number limit over which our approach cannot get a solution in reasonable time. We may regard 24 h as reasonable time, and assume the computer has a 7200-r/min desktop hard disk drive (HDD) and a Serial Advanced Technology Attachment (SATA) bus interface, and every state occupies 100 bits in our approach.
According to [33], as of 2010, a typical 7200-r/min desktop HDD has a disk-to-buffer data transfer rate up to 1030 Mb/s, and a widely used standard for the buffer-to-computer interface is 3.0-Gb/s SATA. Thus, the transfer rate between disk and computer is less than 772 Mb/s. In order to find a counterexample, if the state number to be searched is larger than 772 × 1024 × 1024 × 24 × 3600/100 (<6.68 × $10^{10}$), then our approach cannot carry out this operation successfully in reasonable time (24 h). Thus, a state number limit to our approach is about 6.68 × $10^{10}$.
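The bound is plain arithmetic and can be re-evaluated directly. The sketch below evaluates the expression exactly as written, assuming that one Mb is $2^{20}$ bits (as the factor 1024 × 1024 suggests) and that every state occupies 100 bits; the variable names are hypothetical.

```python
rate_mbit_per_s = 772            # assumed sustained disk-to-computer rate
bits_per_mbit = 1024 * 1024      # assumption: 1 Mb = 2^20 bits
seconds = 24 * 3600              # the 24-h time budget
bits_per_state = 100

# Number of states that can be transferred once within the time budget.
state_limit = rate_mbit_per_s * bits_per_mbit * seconds // bits_per_state
```

Evaluated this way, the expression comes to roughly 7.0 × 10^11 states, so the exponent quoted in the text may deserve a check against the published version.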
IX. CONCLUSION
In this paper, we proposed a linear hash-sorting algorithm LHS, a cached duplicate detection technique CDD, and a dynamic search path management technique DPM. Based on these techniques, we proposed IOEMC, an I/O efficient LTL model checking algorithm for large-scale systems. We have implemented our model checking algorithm and carried out experiments on selected representative benchmarks. The complexity analysis and the experimental results show that IOEMC has lower I/O complexity and clearly better practical performance than state-of-the-art I/O efficient algorithms, including DAC, MAP, and IDDFS. The low I/O complexity and good performance indicate that IOEMC is very promising for verifying large-scale systems efficiently.
