ABSTRACT Due to the centralization of communication in the management of data generated by diverse Internet of Things (IoT) devices, reliability is lacking when data are transferred and stored. Among the errors caused by various behaviors, the Silent Data Corruption (SDC) error, which destroys data stealthily without any error prompt, is one of the most difficult data consistency problems in storage systems, whether traditional multi-control storage, distributed storage, or public cloud storage. For SDC error detection, extracting instruction features to analyze the vulnerabilities of programs or instructions has not yet been fully explored. The literature generally just counts the number of possible features, without mining the inter-characteristics of instructions and the correlation between them. Thus, we propose a method of SDC-causing Error Detection based on Support Vector Regression (SED-SVR) to fully exploit the correlation between data features. Specifically, firstly, we extract instruction features based on the SDC vulnerability of program instructions by analyzing the results of fault injections. Secondly, we establish the instruction SDC vulnerability prediction model based on SVR and propose our SED-SVR model. Thirdly, according to the predicted values of SDC vulnerability, we develop fault tolerance solutions for target programs using different granularities of instruction redundancy. The experimental results show that our SED-SVR has a higher fault detection rate with lower performance overhead.
I. INTRODUCTION
With the rapid development of wireless communication technology and the continuous improvement of information science, the research and application of modern communication are becoming more intelligent. Recently, 5G communication technology has gained broader development prospects due to its ultra-high speed, ultra-low latency, and large number of connected devices. It will be applied in countless fields, such as the Internet of Things (IoT), cloud computing, machine-to-machine communication, subway networks and high-speed rail networks. It can additionally be used in extreme customer services such as virtual reality and augmented reality. However, 5G mobile communication technology, which is still being explored and developed, will bring a series of system reliability and security challenges.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiaojiang Du.
The system reliability of wireless networks has always been significant [1]. Among diverse system errors, data flow corruption is not unusual. When it creeps into backups and remains undetected, it becomes a Silent Data Corruption (SDC) error, which can arise in both hardware and software. Unlike other errors, SDC propagation does not generate system symptoms during program execution, so it is difficult for common detection mechanisms to capture, and it causes the program to output erroneous results. Thus SDC is one of the most difficult data consistency problems and poses a high risk to applications and storage systems, whether traditional multi-control storage, distributed storage, or public cloud storage. It materializes as bit flips in volatile memory, on non-volatile disk, or even within processing cores [2].
Generally, there are two kinds of methods for SDC error detection: program assertions [3] and program redundancy [4]. Program assertion detects data flow errors by inserting assertions into the program, which encode numerical features of the program such as numerical intervals and relationships between data values. The detection cost is not high, but the detection accuracy is unsatisfactory due to poor generality [3], [5]. Program redundancy keeps a copy of the operations in the program and detects SDC errors by comparing the results of the original operations with those of the redundant copies, which can cause high overhead because all the instructions in the program are copied.
VOLUME 7, 2019 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
Recently, the literature has focused on analyzing the vulnerable parts of a program and selectively reinforcing them, including SDC vulnerability analysis based on fault injection [6], on program static analysis [7], and on machine learning [8]. Fault injection methods inject faults into instructions sequentially according to a schedule, which is often costly [6], [9], [10]. Detection methods based on static analysis analyze the features of instructions and the SDC vulnerability of programs without fault injection, but their time cost increases exponentially as the size of the program grows [11], [12]. Machine learning-based SDC vulnerability analysis methods predict the vulnerability by extracting a series of program features, which reduces the performance overhead of error detection but has low general applicability [8], [13].
To address the above challenges, in this paper we analyze the SDC vulnerability of programs to design an error detection mechanism and propose a model of SDC Error Detection based on Support Vector Regression (SED-SVR). Specifically, firstly, according to multiple fault injection observations, we extract the instruction features affecting the SDC vulnerability and establish the SDC vulnerability prediction model for the target program by exploiting the inter-characteristic correlation between features. Secondly, based on the prediction model, we propose the SDC error detection model to selectively add redundancy to instructions in the target program. Thirdly, we implement the proposed SDC error detection method with the Low Level Virtual Machine (LLVM) [14] compiler framework, so that fault tolerance of the program can be realized.
II. RELATED WORK
Data flow error detection methods are mainly divided into hardware-based and software-based technologies. Hardware-based data error detection usually adopts on-chip redundancy, that is, additional hardware circuits are added to detect faults. Although it has a good fault detection rate, it incurs high hardware cost. In contrast, software-based data flow error detection is independent of specific hardware. Such work detects key elements of the program through program analysis for fault tolerance, and has attracted more attention due to its high flexibility and low hardware cost [15] - [18] .
Generally, there were two kinds of software-based methods for SDC error detection: program assertion [3] and program redundancy [4] . The former detected data flow errors by inserting assertions into the program, including numerical features under normal conditions, such as numerical intervals and relationships between values. Program redundancy mainly set a copy of the operation of the program, for detecting the SDC error by comparing the running results of the original program with the redundant copy's.
Specifically, the methods based on program assertion comprise automatic feature extraction [5] and manual feature extraction [3]. Automatic extraction builds assertions by extracting the legal intervals of values in the program to detect errors. Manual feature extraction explores the functional relationships between variables by analyzing the functional logic of the program, which is more complicated. Since program assertion detection suffers from poor generality, its range of application is limited. Traditional redundancy-based detection approaches copy all instructions in the program, resulting in high performance overhead. Recently, research on redundancy methods has focused on analyzing and selectively reinforcing the vulnerable parts of the program for cost-effective error detection, which mainly comprises SDC vulnerability analysis based on fault injection [6], program static analysis [7] and machine learning [8].
The fault injection-based methods selectively inject a subset of faults according to a schedule to reduce the performance cost. For example, Hari et al. proposed a fault point analysis method called Relyzer [6], which aims to compress the fault space and predict the fault outcomes so that only a small number of faults need to be injected. Li et al. proposed an intelligent fault injection framework [19], which could prune faults without fault simulation or find equivalent faults, so as to compress the fault space more effectively and improve fault injection efficiency. Xu et al. proposed a biased injection framework called Critical Fault to remove benign faults by instruction-level vulnerability analysis [20]. Yang N et al. proposed a method called Program Vulnerable Instructions Identification (PVInsiden) to identify program instructions that are likely to cause SDCs [9]. Wang C et al. proposed a neural network detector which could find the SDC errors iterated many times in the program [10].
The methods based on static analysis analyze the SDC vulnerability of programs based on their features. For instance, Yang et al. analyzed the propagation of transient faults by establishing an error flow model [7] and proposed some rules of error propagation. Pattabiraman et al. proposed a framework to identify SDC-vulnerable instructions [21], which used symbols to represent the error states of variables in abstract programs, employed model checking techniques on the abstract programs and analyzed the propagation path of the error. Liu LP et al. proposed an artificial intelligence method based on random forest to identify instructions in the program that could cause SDC errors [11]. Guanpeng L et al. constructed a three-level model which captured error propagation at the static data dependency, control-flow and memory levels based on empirical observations of error propagation [12]. None of these methods requires fault injection when identifying instructions causing SDC errors. However, their time cost increases exponentially as the size of the program grows, so they are rarely applied to large-scale program analysis [11], [12].
J. Gu et al.: Vulnerability Analysis of Instructions for SDC-Causing Error Detection
Recently, machine learning-based SDC analysis methods have predicted vulnerabilities by extracting a series of features of the program. Bronovetsky et al. used Support Vector Machine (SVM) and neural networks to propose a basic framework for modular program fault analysis [8], which analyzed the vulnerability of each code segment and generated vulnerability models. Lu et al. proposed an empirical model to predict SDC vulnerability [15], which extracted static features of instructions during program compilation and used a decision regression tree to predict the vulnerability of storage instructions. Laguna et al. proposed an instruction replication technique [13] for Silent Output Corruption (SOC) errors in scientific computing programs with transient faults, using SVM to determine SOC vulnerability and offer redundancy protection. In summary, although the reliability of networks has always been important [22] and many works detect system errors by predicting vulnerability from a series of extracted program features, they still fall short of satisfactory performance and general applicability.
III. INSTRUCTION SDC VULNERABILITY PREDICTION MODEL
A. DEFINITION OF INSTRUCTION SDC VULNERABILITY
In this paper, the instruction SDC vulnerability is characterized as the probability that a transient fault in the destination register causes an SDC error.
Definition 2: $I_{dynamic}(I_i)$ represents the set of all dynamic execution instances of the static instruction $I_i$, as shown in equation (2), where $I_i^q$ is the $q$-th dynamic instruction of the instruction $I_i$:
Definition 3: The instruction SDC vulnerability is characterized as the probability that a transient fault in the instruction register causes an SDC error in the program, as shown in equation (3). $P_{SDC}(I_i)$ is the SDC vulnerability value of the instruction $I_i$, and P dynamic (I 
P dynamic (I
We use a regression model to predict the SDC vulnerability of instructions by exploring the correlation between features. Therefore, we extract the instruction features during program compilation and derive the vulnerability of each instruction through the trained prediction model. Here, we adopt SVR to establish the SDC vulnerability prediction model, since it has high generalization ability and good nonlinear mapping, with few parameters to adjust [23]. Thus, the prediction function can be expressed as equation (4):
where p is the SDC vulnerability prediction value of the instruction I i represented by the target program instruction vulnerability feature vector F, ω is the weight vector, b is the bias term, and ϕ(F) is a nonlinear function that maps the input instruction feature space to a high dimensional space. Let the sample space be
where l is the number of training instruction samples, F i is the feature vector of the i − th training sample, p i is the SDC vulnerability of the i − th sample. According to the risk minimization of SVR [24] , the optimization problem of the SDC vulnerability prediction function is shown in the equation (5):
where the parameter ε is the deviation allowed in the predicted value. To make the problem solvable, $\xi_i$ ($\xi_i^*$) is the slack variable for deviations above (below) the ε band around the target vulnerability value, and C is the penalty parameter of the error term. Introducing the Lagrange function, where the non-negative $\alpha_i$, $\alpha_i^*$, $t_i$, and $t_i^*$ are Lagrange multipliers, we have:
Differentiate the Lagrange function with respect to ω, $\xi_i$ and $\xi_i^*$, respectively. After the result is substituted into the above optimization problem with the kernel function
, we can get the optimization problem as shown in the equation (7) by the Wolfe duality technique [25] :
By solving the above quadratic programming problem, we can get Lagrange multipliers α i and α * i , and by the KKT (Karush-Kuhn-Tucker) condition [26] - [28] , we can calculate the parameter b:
where x i and y i form a training sample set (
Therefore, we can have the instruction vulnerability prediction model as follows:
Here, we use the Gaussian kernel function [29] :
. When establishing the instruction vulnerability prediction model, there are three parameters to be selected: insensitive loss ε, penalty parameter C, and kernel parameter σ .
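To make the shape of the trained model concrete, the following minimal Python sketch evaluates a kernel-expansion prediction of the standard SVR form, p(F) = Σ (α_i − α_i*) K(F_i, F) + b, with a Gaussian kernel. The function names, the toy support vectors and coefficients, and the 2σ² scaling inside the kernel are assumptions made for illustration; they are not taken from the paper's equations.

```python
import math


def gaussian_kernel(f1, f2, sigma):
    """Gaussian (RBF) kernel; the 2*sigma**2 scaling is an assumed convention."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(f1, f2))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def predict_vulnerability(f, support_vectors, dual_coefs, b, sigma):
    """Evaluate p(F) = sum_i (alpha_i - alpha_i*) * K(F_i, F) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i* for support vector i.
    """
    return b + sum(
        coef * gaussian_kernel(sv, f, sigma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


# Hypothetical toy model: two support vectors in a 2-feature space.
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
dual_coefs = [0.5, -0.25]
p = predict_vulnerability((0.0, 0.0), support_vectors, dual_coefs, b=0.1, sigma=1.0)
```

In practice the multipliers, b, and the three hyperparameters (ε, C, σ) would come from a solver rather than being set by hand.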
IV. FEATURES EXTRACTION OF THE INSTRUCTION SDC VULNERABILITY
In order to effectively predict the vulnerability of instructions, it is necessary to appropriately analyze and extract the features describing the vulnerability, which can be divided into dependent and inherent features of instructions. Both kinds of features can be described in detail at multiple levels. For instance, inherent features can be further divided into features of the instruction type, features of the basic block or the location of the instruction, and features of the function where the instruction is located. In this paper, we propose a novel representation method that describes features along two dimensions: dependent features, such as instruction correlation, and inherent features, such as instruction category. In particular, we introduce the slicing technique, using the instruction as the basic unit of analysis, to extract the instruction features in the program that affect the vulnerability. Note that features unrelated to the vulnerability, such as the module name, are not considered, nor is feature information that cannot be obtained from the LLVM IR instruction [14], such as hardware architecture related information.
A. DEPENDENCY INSTRUCTION FEATURES EXTRACTION
The dependency feature of an instruction can cause an SDC fault to be propagated or changed while the program runs. For instance, a compare/jump instruction can be viewed as having such a dependency feature because the SDC fault will be propagated to the next instruction. Therefore, we need to identify the program statements or control predicates that are affected by the variables in an instruction. Here, we use program slicing [30] to solve this problem. We build the Program Dependency Graph (PDG), which is a directed graph based on the data dependency and control dependency of instructions [31]. Then we slice the program based on the PDG in LLVM [14], and use the program slices to understand and analyze the whole program.
Definition 4: An instruction can be expressed as a six-tuple as shown in equation (10). O S (O d ) is the source (destination) operand of the instruction I . S B (S F ) is the backward (forward) slice of instruction I . Fun (BB) is the function (basic block) in which instruction I is located.
When the program runs, a fault is propagated to other data through data dependencies. Therefore, analyzing the data dependency in the program helps to extract the dependency features of instructions, which can be used for the vulnerability analysis of the instructions. When analyzing the vulnerability of an instruction, the dependency feature of the instruction F dependent is the instruction feature related to the data dependency.
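As an illustration of the forward and backward slices used above, the following Python sketch computes both on a toy dependency graph by plain graph reachability. The instruction names and edges are hypothetical, and the sketch does not use LLVM or reproduce the paper's LLVM Pass.

```python
def build_pdg(edges):
    """Build successor and predecessor maps from (definition, use) edges."""
    succ, pred = {}, {}
    for src, dst in edges:
        succ.setdefault(src, set()).add(dst)
        pred.setdefault(dst, set()).add(src)
    return succ, pred


def reachable(start, adjacency):
    """Return every node reachable from start (start itself excluded)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical toy program: load -> add -> icmp -> br, and add -> store.
edges = [("load", "add"), ("add", "icmp"), ("icmp", "br"), ("add", "store")]
succ, pred = build_pdg(edges)
forward_slice = reachable("add", succ)   # instructions a fault in "add" can reach
backward_slice = reachable("br", pred)   # instructions that "br" depends on
```

A real PDG would also carry control dependency edges; the reachability step stays the same.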
1) DATA DEPENDENCY CHAIN AND FAULT PROPAGATION PATH
The SDC vulnerability of an instruction depends on its result variable in the data dependency chain and on the vulnerability of the instructions at the end of its data dependency chain [32]. According to the fault propagation path, we divide the dependency propagation relationships in the program into three categories, in which the transient fault is propagated into (1) a storage instruction, (2) a branch jump instruction, or (3) a function call instruction. For categories (1) and (3), store instructions and function call instructions are respectively used to analyze the data dependencies of instructions. For category (2), since the execution of the branch jump instruction depends on the result of the comparison instruction, we use the comparison instruction to analyze the data dependency of the instruction. In addition, some instructions terminate the dependency chain. For example, in LLVM IR, instructions that have no destination register, such as the store instruction ''store'' or the branch instruction ''br'', terminate the data dependency chain.
Data dependency analysis generates the propagation path from the target instruction to the instruction at the end of the data dependency chain, which may be a storage instruction, a branch jump instruction or a function call instruction.
The propagation path is shown in equation (11):
In equation (11), I end (I i ) is the data chain end instruction of I i .
In order to extract the instruction features related to the description of data dependencies, we analyze the fault propagation and the instructions at the end of the data chain.
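The idea of a propagation path can be sketched as a walk along def-use edges that stops at the first store, comparison, or call instruction. The Python sketch below is an illustration under assumptions: the opcode set, the `users` map and the instruction identifiers are hypothetical, and comparison instructions stand in for branch jumps as described above.

```python
# Opcodes treated as data dependency chain ends (comparisons stand in for branches).
END_OPCODES = {"store", "icmp", "fcmp", "call"}


def propagation_paths(target, users, opcode):
    """List every def-use path from target to a chain-end instruction."""
    paths = []

    def walk(node, path):
        if node != target and opcode[node] in END_OPCODES:
            paths.append(path)
            return
        for nxt in users.get(node, ()):
            if nxt not in path:  # guard against cycles, e.g. through phi nodes
                walk(nxt, path + [nxt])

    walk(target, [target])
    return paths


# Hypothetical toy program: i1 feeds i2, which feeds a comparison and a store.
opcode = {"i1": "load", "i2": "add", "i3": "icmp", "i4": "store"}
users = {"i1": ["i2"], "i2": ["i3", "i4"]}
paths = propagation_paths("i1", users, opcode)
```

Each returned list is one path from the target instruction to a chain-end instruction.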
2) FAULT MASKING FACTOR
After a transient fault, there are mainly two factors affecting the vulnerability in the data dependence propagation path: whether the fault will be masked in the propagation [32] and whether the fault will cause the program to crash during the propagation [33] . Thus, here we analyze these from the SDC error masking and program crash. Further, we propose the dependency vulnerability factor to measure the impact of the dependency.
During the propagation of a transient fault in the program, some instructions mask the error. For example, the instruction x = y & 0xfff0 masks any error in the low 4 bits of variable y so that it does not propagate to variable x. We call such an instruction an error mask instruction. So if the result of an instruction is used in an error mask instruction, its SDC vulnerability will be reduced.
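The masking effect of an AND instruction can be checked by flipping each bit of the operand in turn and comparing the result with the fault-free value. The Python sketch below does this for the mask 0xfff0 on a 16-bit operand; the operand value and the 16-bit width are arbitrary choices for illustration.

```python
def flip_bit(value, bit):
    """Model a single-bit transient fault by inverting one bit."""
    return value ^ (1 << bit)


def masked_positions(y, mask, width=16):
    """Return the bit positions of y whose flip leaves x = y & mask unchanged."""
    golden = y & mask
    return [bit for bit in range(width) if (flip_bit(y, bit) & mask) == golden]


positions = masked_positions(0xABCD, 0xFFF0)
# If every bit is assumed equally likely to be hit, this is the masked share.
fraction = len(positions) / 16
```

Here the four low bits are masked, so a quarter of the single-bit faults in the operand never reach x.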
Definition 5: P mask (I i ) is defined as the probability that the instruction I i is masked on Path(I i ) when transient faults occur. The calculation method is shown in the equation (12) . The types of error mask instructions are shown in the table 1 [34] .
where DO is the destination operand of I i , and its total number of bits is $N_{BITS}$. P mask_b (bit, I i ) is the probability that the bit in I i will be masked by the error mask instruction, which is determined by the type and operand of the error mask instruction. P error (bit) is the probability that a transient fault occurs at this bit. According to [33], P error (bit) is assumed to be $1/N_{BITS}$, that is, the probability of a transient fault occurring is equal for each bit. I MASK represents the set of error masking instructions that mask one bit or a few bits of a transient fault, including logic operation instructions, shift operation instructions and conversion operation instructions.
Definition 6: For the instruction I i in the program, we define the mask factor MF to indicate the SDC vulnerability affected by the error mask instruction. The larger the MF, the smaller the probability of the error being masked, and the greater the probability of an SDC-causing error occurring after a transient fault occurs. The calculation method is shown in Equation (13):
In equation (13), P mask (I i ) is the probability that the instruction is masked on its propagation path after an error occurs. f SDC (I end (I i )) reflects the SDC vulnerability of I i according to whether the end instruction I end (I i ) of the data dependency chain affects the program output. Transient faults that affect the program output will cause errors in the final results of the program. Thus, if the end instruction I end of the data dependency chain is in the backward slice of the program result output instruction I OUT , f SDC (I end (I i )) is 1; otherwise, it is 0. The calculation method is shown in equation (14).
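The indicator described for equation (14) can be read as a membership test on the backward slice of the output instruction. The Python sketch below illustrates that test on a hypothetical operand map. The product form used for the mask factor, (1 − P_mask) · f_SDC, is only an assumption chosen to agree with the statement that a larger MF means a lower masking probability; it should not be taken as the paper's equation (13).

```python
def backward_slice(node, operands):
    """Collect every instruction that node transitively depends on."""
    seen, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for op in operands.get(cur, ()):
            if op not in seen:
                seen.add(op)
                stack.append(op)
    return seen


def f_sdc(end_instr, output_instr, operands):
    """1 if the chain-end instruction can influence the output, else 0."""
    return 1 if end_instr in backward_slice(output_instr, operands) else 0


def mask_factor(p_mask, end_instr, output_instr, operands):
    # ASSUMPTION: product form, not taken from the paper.
    return (1.0 - p_mask) * f_sdc(end_instr, output_instr, operands)


# Hypothetical dependencies: out <- ld <- st <- add; st2 never reaches out.
operands = {"out": ["ld"], "ld": ["st"], "st": ["add"], "st2": ["add"]}
mf_live = mask_factor(0.25, "st", "out", operands)
mf_dead = mask_factor(0.25, "st2", "out", operands)
```

Under this assumed form, an instruction whose chain end never reaches the output gets a mask factor of zero regardless of its masking probability.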
3) DEPENDENCY FEATURES ANALYSIS FROM END INSTRUCTIONS
In the entire data dependency chain, besides the forward analysis from the head instruction based on the relevant dependencies, it is also possible to step backwards from the end instruction. Note that this method only works for the three types of end instructions described above: store instructions, branch jump instructions and function call instructions. For instructions in categories (1) and (3), an error occurring in an address-related instruction mostly causes the program to crash. 99% of program crashes from transient faults are caused by segmentation faults [33], which occur when the program accesses memory beyond the space allocated to it by the system. The more bits an address variable has, the more bit errors may cause the program to crash. The probability of a crash-causing error due to an address-related variable fault is classified by the number of bits of the variable, and the statistical results are shown in Figure 1(a). As can be seen from Figure 1(a), the larger the number of bits in the address variable, the more bit errors will cause the program to crash. Therefore, in order to measure the impact of program crashes on the instruction SDC vulnerability, we extract the features associated with program crashes, including the number of address-dependent instructions in the forward slice and the number of bits in the destination operand.
For instructions in category (2), the depth of nested loops is an important factor that affects the SDC vulnerability. As shown in Figure 1(b), there is a relationship between instruction vulnerability and the loop depth of the instruction. The deeper the loop, the greater the impact on vulnerability. Therefore, the loop depth Loop depth is chosen as a feature, and its value is 0 if the instruction is not in a loop. Based on the above analysis, we extract the instruction dependent features that affect the SDC vulnerability of the instruction as shown in table 2.
Definition 7: The instruction dependent feature related to data dependence can be characterized as a quadruple as shown in equation (15):
where F dependent indicates whether the target instruction is used in a store instruction, a call instruction, an integer comparison instruction, a floating-point comparison instruction, and an address-related instruction. If so, the corresponding field is marked as 1; otherwise it is marked as 0. CMP = (Loop depth , P branch ) describes the comparison related features when the target instruction or the end instruction is a comparison instruction. Loop depth is the loop depth and P branch is the static branch probability obtained from the branch probability prediction analysis of LLVM. Crash = (Byte dest , Addr num ) describes the crash related features. Byte dest is the size of the destination operand and Addr num is the number of address-related instructions in the forward slice. MF is the mask factor calculated from equation (13).
In order to extract the dependent features of instructions effectively, Algorithm 1 in table 3 gives the specific process of feature extraction. The algorithm first divides all the instructions in the program into three categories according to the data dependence, and generates the propagation path from the target instruction to the end of the data chain according to the instruction type as shown in Equation (11) (step 3-15). Based on the propagation path and the backward slice of the program output instruction, the error mask factor is calculated for each instruction as shown in Equation (13) (step 16-18). Finally, traversing all the instructions according to table 2, the dependent features are extracted by combining the static and dynamic analyses of LLVM (step 19-25).
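One way to picture the dependent feature record is as a flat numeric vector that can be handed to a regression model. The Python sketch below is a hypothetical container: the field names are invented for illustration, and the field order and encoding are assumptions rather than the layout of the paper's table 2.

```python
from dataclasses import dataclass, astuple


@dataclass
class DependentFeatures:
    """Hypothetical flat layout of the dependent features of one instruction."""
    used_in_store: int   # 1 if the result is used by a store instruction
    used_in_call: int    # 1 if the result is used by a call instruction
    used_in_icmp: int    # 1 if used by an integer comparison
    used_in_fcmp: int    # 1 if used by a floating-point comparison
    used_in_addr: int    # 1 if used by an address-related instruction
    loop_depth: int      # 0 when the instruction is not inside a loop
    p_branch: float      # static branch probability
    byte_dest: int       # size of the destination operand
    addr_num: int        # address-related instructions in the forward slice
    mask_factor: float   # MF value

    def to_vector(self):
        """Flatten the record into a list of floats for the regression model."""
        return [float(v) for v in astuple(self)]


example = DependentFeatures(1, 0, 1, 0, 0, 2, 0.5, 4, 3, 0.75)
vector = example.to_vector()
```

Keeping the record typed and flattening it only at the model boundary makes it harder to mix up field positions.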
B. INHERENT INSTRUCTION FEATURES EXTRACTION
Generally, inherent features of an instruction can be used to describe the instruction itself. Also, during the execution of the program, the attributes of the instruction and the basic block or function to which the instruction belongs can reflect the SDC vulnerability. Therefore, by analyzing the instruction, we can extract inherent features F inherent that affect the vulnerability of the instruction.
1) EXTRACTING FEATURES OF INSTRUCTION TYPES
After multiple fault injection experiments, we find that the outcome of the program after a transient fault is highly correlated with the instruction type. For example, when a fault occurs in a load instruction, the probability of a system error is relatively high, while in a comparison instruction, the probability of an SDC error is higher. This will be further described in the Case Study in the Experiment section.
Definition 8: The instruction type can be presented as a vector as shown in equation (16), where the parameters indicate whether the instruction is an integer binary operation, a floating-point binary operation, a comparison instruction, a logic operation, a conversion operation, an address-related operation, a function call instruction, and a memory reading instruction. If so, the corresponding element is marked as 1; otherwise it is marked as 0. Some instruction types are shown in table 4. Fig. 2 shows statistical results categorized by the effect of instruction type on program execution after a fault is injected into the training program. The detailed experimental setup is given in Section VI. In Fig. 2, for binary operations, floating-point operations such as fadd, fsub, fmul and fdiv are less vulnerable because the way they work can mask transient faults. Instructions related to address operations, such as sext and bitcast, have a very high probability of causing a program crash. For integer binary operations and comparison instructions, if a transient fault occurs, it is very likely to propagate and result in an SDC-causing error.
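The instruction type vector is a one-hot style encoding over the eight categories listed above. The Python sketch below shows one possible encoding from LLVM IR opcode names. The opcode-to-category table is a hypothetical, partial mapping written for illustration (it places sext and bitcast with address-related operations, following the text), not the paper's table 4.

```python
# Category order follows the text of Definition 8.
CATEGORIES = [
    "int_binary", "fp_binary", "cmp", "logic",
    "conversion", "address", "call", "load",
]

# Hypothetical, partial opcode table; not the paper's table 4.
OPCODE_CATEGORY = {
    "add": "int_binary", "sub": "int_binary", "mul": "int_binary",
    "fadd": "fp_binary", "fsub": "fp_binary", "fmul": "fp_binary", "fdiv": "fp_binary",
    "icmp": "cmp", "fcmp": "cmp",
    "and": "logic", "or": "logic", "xor": "logic",
    "trunc": "conversion", "fptosi": "conversion",
    "getelementptr": "address", "sext": "address", "bitcast": "address",
    "call": "call",
    "load": "load",
}


def instruction_type_vector(opcode):
    """Return a 0/1 vector with a 1 in the slot of the opcode's category."""
    category = OPCODE_CATEGORY.get(opcode)
    return [1 if name == category else 0 for name in CATEGORIES]


fadd_vector = instruction_type_vector("fadd")
```

Opcodes outside the table, such as store, map to the all-zero vector in this sketch.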
2) EXTRACTING FEATURES OF BASIC BLOCKS AND FUNCTIONS
During the execution of the program, the basic block and function where the instruction is located have an impact on the SDC vulnerability. The experimental results in [15] also show that the ratio of the execution time of the instruction to that of the function in which it is located is an important coefficient in training the vulnerability prediction model. Therefore, in order to describe the impact of the properties of the basic block and function on the vulnerability of the instruction, we extract the six features shown in Table 5.
According to [25], whether a variable receives redundancy protection depends on whether it affects global memory and whether it is a function parameter. In addition, [16] proposed using the fan-out value of a variable as the basis for deciding whether to protect it. Locations with high fan-out are usually the stack or the stack pointer, where faults easily cause the program to crash. Therefore, we choose whether the variable in the instruction is a global variable and the fan-out value of the variable as features.
FIGURE 3. Effects of instruction types on execution results.
Based on the above analysis, we extract instruction inherent features that affect the SDC vulnerability as shown in Equation (17) .
Definition 9: The inherent feature F inherent of the instruction can be characterized as a nine-tuple as shown in Equation (17) : 
V. SDC-CAUSING ERROR DETECTION BASED ON SUPPORT VECTOR REGRESSION
A. WORKFLOW
The workflow of our SED-SVR is shown in figure 3, which includes four phases: fault injection, instruction SDC vulnerability prediction model training, target program instruction SDC vulnerability prediction and instruction redundancy.
1) FAULT INJECTION
In order to collect instruction samples for training the instruction SDC vulnerability prediction model, several typical programs of Mibench benchmark [35] 
2) INSTRUCTION SDC VULNERABILITY PREDICTION MODEL
Based on the analysis of the instruction vulnerability, the instruction features of the training samples are extracted through the implemented LLVM Pass. The SDC vulnerability prediction model is trained on the instruction sample space T = {(F i , p i )}, where F i = (F i.inherent , F i.dependent ) is the feature vector of the i th sample in the training set, including both the inherent and dependent features, and p i is the vulnerability of the i th sample calculated during the fault injection phase. Then, we obtain the vulnerability prediction function based on SVR, as shown in equation (9).
3) TARGET PROGRAM INSTRUCTION SDC VULNERABILITY PREDICTION
We use Clang to convert target programs to LLVM IR instructions to get the target program static instruction set V static = {I 1 , . . . , I j , . . . , I N }, where I j is the j th static instruction in the target program and N is the number of instructions. Through the implemented LLVM Pass, the instructions in V static are analyzed to extract the instruction feature F j = (F j.inherent , F j.dependent ) of each target program instruction. Based on the trained instruction vulnerability prediction model, the vulnerability is predicted and output to the corresponding statistics file. In order to evaluate the prediction accuracy, the SDC vulnerability of the program is defined as follows:
Definition 10: SDC rate is the program SDC vulnerability value given by equation (18), in which p(I i ) is the SDC vulnerability value of the instruction I i , d i is the dynamic execution number of I i , and N p is the dynamic instruction number of the entire program.
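From the symbols named in Definition 10, the program-level value can be read as an execution-weighted average of the per-instruction vulnerabilities, that is, the sum of p(I_i) · d_i divided by N_p. The Python sketch below computes that reading. It is an interpretation of the definition rather than a copy of equation (18), and defaulting N_p to the sum of the d_i is an added assumption.

```python
def sdc_rate(vulnerabilities, dyn_counts, total_dynamic=None):
    """Execution-weighted average of per-instruction SDC vulnerabilities.

    vulnerabilities[i] is p(I_i) and dyn_counts[i] is d_i. When total_dynamic
    (N_p) is omitted it is ASSUMED to equal the sum of the d_i.
    """
    if total_dynamic is None:
        total_dynamic = sum(dyn_counts)
    weighted = sum(p * d for p, d in zip(vulnerabilities, dyn_counts))
    return weighted / total_dynamic


# Hypothetical program with three static instructions.
rate = sdc_rate([0.5, 0.25, 0.0], [2, 4, 2])
```

Weighting by d_i means a mildly vulnerable instruction inside a hot loop can matter more than a highly vulnerable one executed once.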
4) TARGET PROGRAM INSTRUCTION REDUNDANCY
According to the instruction redundancy strategy described in the next sub-section, the instructions in the statistics file are ranked by vulnerability. The top 10%, 20% and 30% of the instructions are duplicated, so that the SDC-causing error detection method is realized at different granularities. Finally, the target program with SDC-causing error detection ability is obtained.
B. INSTRUCTION REDUNDANCY STRATEGY
In the instruction redundancy phase, instructions are processed with different granularity of redundancy. Specific strategies are as follows:
1) SELECT TARGET REDUNDANT INSTRUCTIONS
Sort the instructions by SDC vulnerability and select the instructions to be duplicated according to a given granularity value. The selected instruction set is given by equation (19).
Here, I_sort is the instruction set sorted by instruction vulnerability, I_selected is the set of the first S instructions of I_sort, Z is the redundancy granularity, N is the total number of instructions in the program, and P denotes the instruction vulnerability.
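Equation (19) is not reproduced in this section. The selection step can be sketched as follows, assuming that S is the granularity Z applied to the N instructions and rounded up; the rounding rule, the function name, and the sample vulnerabilities are assumptions made for illustration. The granularity is passed as an integer percentage to avoid floating-point rounding surprises.

```python
def select_redundant(vulnerability, granularity_percent):
    """Return IDs of the most vulnerable instructions.

    vulnerability: dict mapping instruction ID -> SDC vulnerability P.
    granularity_percent: integer Z in percent (for example 10, 20, 30).
    """
    if not 0 <= granularity_percent <= 100:
        raise ValueError("granularity must be between 0 and 100 percent")
    # I_sort: instructions in descending order of vulnerability;
    # ties are broken by ID so the result is deterministic.
    i_sort = sorted(vulnerability, key=lambda i: (-vulnerability[i], i))
    n = len(i_sort)
    # S = ceil(Z * N / 100), computed with integer arithmetic (assumed rule).
    s = (granularity_percent * n + 99) // 100
    return i_sort[:s]


# Hypothetical vulnerabilities for ten instructions.
P = {0: 0.10, 1: 0.90, 2: 0.40, 3: 0.00, 4: 0.70,
     5: 0.20, 6: 0.05, 7: 0.60, 8: 0.30, 9: 0.15}
top10 = select_redundant(P, 10)
top20 = select_redundant(P, 20)
top30 = select_redundant(P, 30)
```

Each larger granularity yields a superset of the smaller one, so raising Z only ever adds protected instructions.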
2) DUPLICATE SELECTED INSTRUCTIONS
The selected instruction set I_selected is duplicated in the program to obtain the redundant instruction set I_dup; store instructions are excluded. This is because most existing computer systems use Error-Correcting Code memory (ECC memory) to ensure that data in the storage module is unaffected by transient faults when it is read and stored.
3) INSERT CHECK INSTRUCTIONS
We traverse all instructions in the redundant instruction set I_dup. If the instruction I_dup,i depends on I_dup,j and i > j, the two have a def-use relationship. In that case, we insert a comparison instruction at the end of the basic block that compares the execution result of the original instruction with that of the inserted redundant instruction. If an instruction in I_dup does not form a def-use relationship with any other, we insert a separate comparison instruction for it. When the program runs, inconsistent results indicate that an SDC-causing error has occurred and that recovery measures need to be taken. If the results are consistent, no error occurred in the execution of this basic block.
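The duplicate-and-compare idea can be illustrated with a toy basic block written in Python rather than LLVM IR. This is a sketch under stated assumptions: the two-instruction def-use chain, the function name, and the way the fault is injected are all hypothetical and are not taken from the authors' LLVM Pass.

```python
def run_block(a, b, flip_bit=None):
    """Run a toy basic block with duplicated instructions and a final check.

    Returns (result, detected). If flip_bit is given, that bit is flipped in
    the result of the first original instruction to mimic a transient fault.
    """
    # Original def-use chain: x is defined, then used to compute y.
    x = a + b
    if flip_bit is not None:
        x ^= 1 << flip_bit  # the fault hits the original copy only
    y = x * 2

    # Duplicated chain, recomputed from the same inputs.
    x_dup = a + b
    y_dup = x_dup * 2

    # Comparison placed at the end of the basic block: a mismatch between
    # the original and the duplicate signals an SDC-causing error.
    detected = y != y_dup
    return y, detected


clean = run_block(3, 4)               # no fault injected
faulty = run_block(3, 4, flip_bit=0)  # lowest bit of x flipped
```

Because y depends on x, a single comparison on the end of the chain also catches the fault injected into x, which is why dependent instructions can share a check.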
VI. EXPERIMENT
To evaluate the effectiveness of SED-SVR, we carry out a set of experiments. The experimental environment is Ubuntu 16.04.1 with an Intel Core i5-4200H CPU and 8 GB of memory. We use LLVM 3.4 to compile the original target program into IR instructions. For fault injection we use the LLFI tool, which is based on the LLVM compiler framework, operates on LLVM IR at the instruction level, and has been verified to accurately simulate hardware register faults [37]. We simulate the impact of transient faults on program execution by randomly injecting faults into dynamic instructions. Because multi-processor platforms are not widely used due to their high cost, we only discuss single-threaded programs. We use the embedded MiBench benchmark suite and select qsort, Dijkstra, and Susan as test programs to evaluate the effectiveness of our method [35].
To simulate SDC errors that occur during data transmission in the network or system while the program is running, we analyze the propagation of errors in the program through fault injection experiments. Following [13], [15], [35], [38], [39], we make the following assumptions for the fault injection experiments: (1) We mainly consider single-bit errors and assume that only one fault occurs in a single execution of the program. (2) We focus on transient faults, so only errors in instruction operands are considered. We evaluate the effectiveness from two aspects: (1) the accuracy of SDC vulnerability prediction; (2) the performance of SDC-causing error detection.
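The fault model stated above (a single bit flip in one instruction operand, once per execution) can be sketched in plain Python. This does not use or imitate the LLFI interface; the function names and the 32-bit default width are assumptions for illustration.

```python
import random


def flip_bit(value, bit, width=32):
    """Flip one bit of an unsigned operand of the given width."""
    if not 0 <= bit < width:
        raise ValueError("bit index outside operand width")
    mask = (1 << width) - 1
    return (value ^ (1 << bit)) & mask


def inject_single_fault(operands, rng, width=32):
    """Flip one random bit in one randomly chosen operand.

    Returns a new operand list and the (index, bit) that was hit, so that
    exactly one fault is injected per simulated execution.
    """
    idx = rng.randrange(len(operands))
    bit = rng.randrange(width)
    faulty = list(operands)
    faulty[idx] = flip_bit(faulty[idx], bit, width)
    return faulty, (idx, bit)


original = [1, 2, 3]
faulty, (idx, bit) = inject_single_fault(original, random.Random(0))
```

Passing the random generator in explicitly keeps an injection campaign reproducible, which helps when the same fault sites need to be replayed against a protected and an unprotected build.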
A. INSTRUCTION SDC VULNERABILITY PREDICTION EXPERIMENT
This sub-section evaluates our method for predicting SDC vulnerability in terms of accuracy, correlation, and time consumption. To train the SDC vulnerability prediction model, we use the ε-support vector regression model [23] and the LIBSVM package [11]. We divide the sample data into several sets; one of them is used for testing and the others for training, and cross-validation is carried out to select the parameters. Through the experiments, we obtain the final insensitive loss ε = 10^-3 and the penalty parameter C = 1000; the kernel function is the Gaussian radial basis function with kernel parameter σ = 0.003. The program SDC vulnerability is computed as shown in equation (18). We compare our proposed method with the SDC-causing error rate calculated by the crash model [23] in the enhanced Program Vulnerability Factor (ePVF) method [33] and with the SDC-causing error rate obtained by the fault injection experiment. The results are shown in Figure 4.
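As a minimal sketch of the two ingredients named above, the ε-insensitive loss and the Gaussian radial basis function kernel can be written in plain Python. The reported constants are reused as defaults. Treating 0.003 as the coefficient gamma in exp(-gamma * ||x - y||^2), the form LIBSVM uses, is an assumption, since the text calls the parameter σ without giving the kernel formula. The function names are illustrative, and the code does not train a model.

```python
import math

EPSILON = 1e-3   # insensitive loss width reported in the text
GAMMA = 0.003    # kernel parameter; interpreting it as gamma is an assumption


def rbf_kernel(x, y, gamma=GAMMA):
    """Gaussian RBF kernel exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def eps_insensitive_loss(y_true, y_pred, epsilon=EPSILON):
    """Zero inside the epsilon tube, linear in the error outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
inside_tube = eps_insensitive_loss(0.5, 0.5005)
outside_tube = eps_insensitive_loss(0.5, 0.6)
```

With ε this small, almost every prediction error is penalized, so the fit is governed mainly by the penalty parameter C and the kernel width.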
As can be seen in Figure 4, the average SDC vulnerability of the three test programs predicted by SED-SVR is 21.3%, while the average SDC vulnerability obtained through the fault injection experiments (abbreviated as FI), which is the real probability of an SDC-causing error occurring in the program, is 19.1%. The predicted SDC rate is therefore slightly higher than the FI value. However, ePVF estimates the program SDC-causing error rate by calculating the probability of a program crash, raising the SDC-causing error rate by 23.8%. As the comparative experimental results show, the SED-SVR method is superior to ePVF. Analysis of our prediction results shows that the deviation arises because some faults are masked by program-specific properties during propagation, such as correctness checks in floating-point operations and branch instructions that do not affect the program output.
In addition, we evaluate the correlation between the SDC vulnerabilities predicted by SED-SVR and the values obtained by fault injection experiments. For selective protection of a program, only the relatively vulnerable instructions are selected [15], [33]. The time cost measures the time required for feature extraction. We compare the prediction correlation and time consumption of SED-SVR, fault injection, and ePVF on the same inputs. The experimental results are shown in Table 6. The fault injection method obtains the exact value of the SDC vulnerability, so its correlation is 1; it is excluded from the time consumption comparison because fault injection takes a long time.
It can also be seen from Table 6 that the time consumption of SED-SVR is 6.1% higher than that of the ePVF method, and its average prediction correlation is 0.914, which is higher than the 0.803 of the ePVF method. Our method extracts all features of the instructions in the program, whereas ePVF models the probability that a program crash will occur after a transient fault and ignores other factors that affect the SDC vulnerability of the program. As a result, the ePVF method predicts SDC vulnerability less accurately.
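The text does not state which correlation coefficient is used. As a sketch, assuming the Pearson coefficient between predicted and fault-injection vulnerabilities, the prediction correlation could be computed as follows; the function name and the sample vectors are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-instruction vulnerabilities.
fi_values = [0.1, 0.4, 0.8, 1.0]   # measured by fault injection
predicted = [0.2, 0.8, 1.6, 2.0]   # a perfectly linear predictor
r = pearson(predicted, fi_values)
```

A ranking-based coefficient would arguably match the top-percentage selection step more closely, since only the order of instructions matters there; which one the reported figures use is not stated.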
B. SDC-CAUSING ERROR DETECTION EXPERIMENT
Here, the SDC-causing error detection rate and performance overhead of the proposed method are evaluated. We randomly inject 3000 faults into the target program to verify the SED-SVR method. Hotpath denotes the target program protected by data flow error detection with redundancy applied to the frequently executed paths in the program, obtained through the LLVM framework; here, we select the top 20% most frequently executed instructions. Figure 6 compares the performance overhead, measured against the execution time of the original target program. The ordinate is the extra execution time, relative to the base time, incurred after the redundant instructions are inserted.
As shown in Figure 5, the average SDC-causing error detection rate of SED-SVR-30 is 91.7%, and that of Hotpath is 89.5%. As shown in Figure 6, the average performance overhead of SED-SVR-30 is 43.7%, and that of Hotpath is 59.9%, which is 16.2 percentage points more than the SED-SVR-30 proposed in this paper. Our method thus reduces the performance overhead while maintaining the detection rate. On the other hand, the difference in the average SDC-causing error detection rate between SED-SVR-20 and SED-SVR-30 is 9.4%.
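The overhead measure described above, extra execution time relative to the unprotected base time, can be written as a one-line computation. The timings in this sketch are hypothetical; only the two average overheads are taken from the text.

```python
def overhead_percent(base_time, protected_time):
    """Extra execution time of the protected program, in percent of base."""
    if base_time <= 0:
        raise ValueError("base time must be positive")
    return (protected_time - base_time) / base_time * 100.0


# Hypothetical timings in seconds: a run that takes 1.5x the base time.
example = overhead_percent(2.0, 3.0)

# Average overheads reported in the text, in percent.
sed_svr_30 = 43.7
hotpath = 59.9
gap_points = hotpath - sed_svr_30   # difference in percentage points
```

The gap between the two methods is a difference of percentages, so it is expressed in percentage points rather than as a relative increase.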
C. CASE STUDY
1) INSTRUCTION CLASSIFICATION ON THE DEPENDENCY CORRELATION
First, the quick sort algorithm is chosen as an example to illustrate how the dependencies of SDC-causing instructions propagate faults in the program. A piece of code from the quick sort algorithm and the corresponding instructions are shown in Figures 7 and 8. In Figure 8, two registers, %4 and %9, are each allocated 32 bits of memory. Once a transient fault occurs in one of them, the erroneous value it stores not only affects the register itself but also continues to propagate through the whole program. If a transient fault occurs in register %4, then, following the successor relationship in the program dependency graph, the store instruction writes the wrong value into register %10, and a data storage error occurs. Register %11 then loads the faulty value from register %10, causing an incorrect jump at the branch instruction. If a transient fault occurs in register %9, register %42 loads the value from register %9 and in turn affects the value of register %43. The erroneous value in register %43 is passed into the qsort function, so the function call instruction receives wrong parameters.
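The two propagation chains described above can be traced with a breadth-first walk over successor edges. The edges in this sketch follow the description in the text, with registers written in LLVM style (%4, %9, and so on); the labels "br" and "call qsort" and the function name are illustrative, and the real program dependency graph in Figure 8 contains more nodes.

```python
from collections import deque

# Successor edges as described in the text; "br" and "call qsort" are
# illustrative labels for the branch and the function call instructions.
SUCCESSORS = {
    "%4": ["%10"],
    "%10": ["%11"],
    "%11": ["br"],
    "%9": ["%42"],
    "%42": ["%43"],
    "%43": ["call qsort"],
}


def affected_by(fault_site, successors=SUCCESSORS):
    """Return the nodes reachable from fault_site, in breadth-first order."""
    seen = {fault_site}
    order = []
    queue = deque([fault_site])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


chain_from_4 = affected_by("%4")
chain_from_9 = affected_by("%9")
```

The first chain ends in a control-flow instruction and the second in a call argument, the two ways a corrupted value is described as reaching program behavior in this example.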
2) SDC VULNERABILITY ANALYSIS OF INSTRUCTIONS
To obtain the SDC vulnerability of the instructions in the program and analyze how it varies with instruction features, we perform the fault injection experiment. We take the code inside the dotted-line box in Figure 7 as an example. The LLVM IR instructions of this code and their SDC vulnerabilities, calculated by equation (3), are given in Table 7. This code also corresponds to the basic block within the dashed box in Figure 8. Instruction SDC vulnerability is strongly related to data dependence. For example, the result of the 56th icmp instruction is only one bit wide, but it determines the direction of the next branch instruction. If a soft error occurs in this bit, the comparison result of icmp must be wrong, so the program eventually jumps in the wrong direction, causing an SDC error.
Through the analysis of the fault injection results, we find that whether a transient fault ultimately causes an SDC-causing error depends heavily on the target program. Therefore, an efficient error detection mechanism should identify the instructions with higher SDC vulnerability in the program and process them redundantly. In the example code above, the 55th call instruction and the 56th icmp instruction, which have higher vulnerability, need to be protected. For the 51st load instruction, which has medium vulnerability, the user can choose whether to harden it according to the performance requirements. The 50th load instruction, the 52nd follow instruction, and the 53rd and 54th getelementptr instructions have low vulnerability and hence do not need protection.
3) SOFT ERROR DETECTION INSTANCE
The results obtained after the fault injection experiment are shown in Table 8. ID, OPCODE, WIDTH, SDC, SYM/CRASH, and TRUE denote, respectively, the instruction sequence number, the instruction operation type, the instruction data width in bits, the number of SDC errors, the number of system errors/crashes, and the number of correct executions of the program. There are 8 static instructions in the sample code. Excluding the last jump instruction, the set of 7 instructions after sequence numbers are assigned is V_static = {I_50, I_51, I_52, I_53, I_54, I_55, I_56}. Although I_56 has the highest vulnerability, Table 8 shows that it has only 1 data bit, so it is not representative as an example. Hence, we use instruction A as an example to introduce the calculation process of SDC vulnerability. It can be seen from Table 8 ...
4) EXAMPLE OF THE FAULT TOLERANCE
Figure 9 gives an example comparing full and partial redundancy for the target program. The gray part marks the redundant instructions. In Figure 9, full redundancy is achieved by duplicating all the instructions in the program and inserting comparison instructions at the end of the basic block. Some of the comparison instructions to be inserted are not shown in the figure due to space limitations. In contrast, the proposed granularity-based redundancy method only duplicates the instructions I_selected = {I_51, I_55, I_56}, which have the high SDC vulnerabilities 0.416667, 0.833333, and 1, respectively, according to Table 8, and compares them at the end of the basic block to detect the occurrence of a transient fault.
VII. CONCLUSION
To address the large time overhead of redundancy-based data flow error detection methods, in this paper we proposed an approach for SDC-causing error detection: SED-SVR.
QIANWEN ZHANG received the M.S. degree in computer science and technology and the master's degree from the Nanjing University of Aeronautics and Astronautics, Jiangsu, China, in 2015 and 2018, respectively.
Her research interests include information security and system security.
