Abstract. Side channel attacks are typically used to obtain the private key of a cryptographic system and pose a serious threat to cryptosystems. How to detect such attacks effectively remains an open problem. In this paper, we propose a way to detect cache-based side channel attacks using performance counters. Two main performance parameters are used: cache miss rate and dTLB miss rate. The results show that a cache-based side channel attack has both a high cache miss rate, in some cases above 99.4%, and a low dTLB miss rate, in some cases below 0.002%. The experiments show that this method can detect cache-based side channel attacks accurately and quickly.
Introduction
A side channel attack is any attack aimed at the physical implementation of a cipher system. Rather than attacking the algorithm itself, it breaks the cipher system by collecting and analyzing side channel information leaked by the system, such as timing, power consumption or sound. Compared with traditional cryptanalysis, it does not rely on powerful mathematical theories and derivations. The cache-based side channel attack [1, 2] is a typical side channel attack that extracts sensitive information from the system by exploiting shared cache memory. Because the CPU cache is shared among virtual machines or processor cores, a spy can use it to 'steal' sensitive information from the victim.
Paper [3] implemented a cache-based attack and showed that the DES algorithm can be broken with a success rate of more than 90% under certain conditions. Paper [4] presented a new cache-based side channel attack that retrieves private encryption keys by exploiting a weakness in the RSA implementation of GnuPG 1.4.13, and paper [5] used the same flush + reload technique to recover ECDSA nonces from the OpenSSL implementation.
Significant progress has been made in cache-based side channel attacks. As the attacks develop rapidly, their detection and defense become increasingly important. Paper [6] implemented a detection system, HexPADS, which collects performance metrics of all running processes measured by hardware performance counters; it uses a single indicator, the cache miss rate. Paper [7] detected cache-based side channel attacks that use the flush + reload technique; it detects only this particular type of attack and also uses a single indicator, the number of LLC accesses.
In this paper, a new method is proposed to detect cache-based side channel attacks by using the CPU performance counters. First, some typical cache-based side channel attacks, ordinary attacks and benign programs are run, and their behavioral characteristics are analyzed with the performance monitoring counters. Comparison and analysis show that a cache-based side channel attack has both a high cache miss rate and a low dTLB miss rate. The detection method is built on these two characteristics and uses the performance counters to measure them.
Preliminaries

CPU Cache and TLB
CPU cache memory is temporary storage between the CPU and main memory. It has a smaller capacity than main memory but a much faster access speed, and it hides memory access latency by caching recently used data. When the CPU needs to read data from main memory, it first sends the memory address of the data to the cache. If the data is already in the CPU cache, it is delivered to the CPU immediately, which is a cache hit. If not, the CPU starts a main memory read cycle to fetch the data from memory and stores the data in the CPU cache at the same time, which is a cache miss and takes considerably longer than a cache hit. On modern Intel processors the CPU cache is divided into three levels. The L1 and L2 caches are private to each core, while the Last Level Cache (LLC or L3) is shared by all cores. It is this shared property of the LLC that provides the prerequisite for the attack: different processes can access the same LLC as long as they run on the same physical machine.
Modern processors use virtual addresses based on a paging mechanism. A process runs with virtual addresses, while data is stored at physical addresses. The page table is a special data structure that holds the correspondence between virtual pages and physical page frames, and it is stored in physical memory. To improve access efficiency, the Translation Lookaside Buffer (TLB), a small cache, stores the most recently used page table entries. The TLB is divided into the iTLB (instruction TLB) and the dTLB (data TLB), and accesses to them also produce TLB hits and TLB misses.
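The hit and miss behavior described above can be illustrated with a toy model. The following is only a minimal sketch of a direct-mapped cache; the class name DirectMappedCache, the 4-line size and the 64-byte line length are assumptions chosen for illustration and do not describe any particular processor.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: each address maps to exactly one cache line."""

    def __init__(self, num_lines=4, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # tag currently stored in each line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a cache hit; on a miss, fill the line and return False."""
        block = address // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag  # the new block replaces whatever was cached here
        self.misses += 1
        return False


cache = DirectMappedCache()
first = cache.access(0x1000)   # cold miss: nothing cached yet
second = cache.access(0x1008)  # same 64-byte line as 0x1000: hit
third = cache.access(0x1100)   # same index, different tag: miss, evicts 0x1000
fourth = cache.access(0x1000)  # the line was evicted: miss again
print(first, second, third, fourth, cache.hits, cache.misses)
```

The third access shows the effect that cache attacks rely on: one access can evict data cached by another, and the eviction is visible later as a miss.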
Performance Counters
Perf [8] is a performance profiling tool built into the Linux kernel source tree (2.6.31 and later). It is based on the principle of event sampling, and its foundation is the performance event. It supports statistical analysis of both kernel and userland code as well as performance indicator analysis for the processor or the operating system, and it is often used to find performance bottlenecks and locate hotspot code.
Perf is accessed through the perf command, which offers a number of subcommands. For example, the subcommand perf list lists all pre-defined events. In addition, the interface function perf_event_open() [9] is provided for programmatic use.
A call to perf_event_open() creates a file descriptor corresponding to the measured event. When multiple performance events need to be measured simultaneously, the file descriptors can be grouped together into an event group. Events can be enabled and disabled via the functions ioctl() or prctl().
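As a complement to calling perf_event_open() directly, the perf command can also be driven from a script. The sketch below builds a perf stat command line for one process and parses its machine-readable output. It is an illustration only, not the tool used in this paper: the helper names are hypothetical, the four event names are assumed to be available on the target machine (perf list shows what is supported), and the sample text is a hand-written line set modelled on the comma-separated layout of perf stat -x, not captured output.

```python
import subprocess

# Event names as commonly shown by `perf list`; availability depends on the
# CPU and kernel, so this selection is an assumption.
EVENTS = ["cache-references", "cache-misses", "dTLB-loads", "dTLB-load-misses"]


def build_perf_command(pid, seconds=1, events=EVENTS):
    """Build a `perf stat` command that counts `events` for `pid` for `seconds`."""
    return ["perf", "stat", "-x", ",", "-e", ",".join(events),
            "-p", str(pid), "--", "sleep", str(seconds)]


def parse_perf_csv(text, sep=","):
    """Parse `perf stat -x` output: first field is the count, third is the event."""
    counts = {}
    for line in text.splitlines():
        fields = line.split(sep)
        if len(fields) < 3 or not fields[0].strip().isdigit():
            continue  # skip blank lines, comments and "<not supported>" rows
        counts[fields[2].strip()] = int(fields[0])
    return counts


def count_events(pid, seconds=1):
    """Run perf (needs Linux, perf installed and sufficient privileges)."""
    result = subprocess.run(build_perf_command(pid, seconds),
                            capture_output=True, text=True, check=False)
    return parse_perf_csv(result.stderr)  # perf stat reports on stderr


# Hand-written sample in the assumed CSV layout, for illustration only.
SAMPLE_OUTPUT = (
    "200000,,cache-references,1000000,100.00,,\n"
    "199000,,cache-misses,1000000,100.00,,\n"
    "<not supported>,,dTLB-loads,0,100.00,,\n"
)
parsed = parse_perf_csv(SAMPLE_OUTPUT)
print(parsed)
```

Counting through the command line is coarser than an event group opened with perf_event_open(), but it needs no system call bindings and is convenient for quick experiments.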
Characteristics Analysis of Cache-based Side Channel Attack
Analysis of Cache-based Side Channel Attack
Cache-based side channel attacks are classified into three typical types: flush + reload, prime + probe and evict + time. All of them exploit the time difference between a cache hit and a cache miss.
The flush + reload attack requires that the spy and the victim share the cache hierarchy and the memory pages. The spy evicts a specified memory block from the shared cache using the clflush instruction. It then waits for a time interval, during which the processor serves the victim. After the wait, the spy reloads the shared memory block and records the reload time. The reload time depends on the cache behavior: a shorter time (a cache hit) means the victim accessed the shared memory block during the wait, while a longer time (a cache miss) means it did not. This timing information can be exploited to infer sensitive information.
The prime + probe attack is based on cache contention between the spy and the victim. The spy first fills the cache sets with prepared data and waits for a time interval. It then rereads the prepared data and records the read time for each cache set. The measured time is again used to infer sensitive information, but here a longer time means the victim accessed memory that maps to that cache set and evicted the spy's data.
Unlike the two attacks above, the evict + time attack measures the time of the entire encryption process. First, the victim performs an encryption operation and the time is recorded. Second, the spy evicts one specific cache set after the encryption. Then the encryption operation is repeated and the time difference is calculated, which reveals whether the evicted cache set is used by the victim or not.
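How the measured times are interpreted differs between the attacks, and a small sketch makes the difference explicit. In flush + reload a fast reload (a cache hit) indicates that the victim touched the shared block, whereas in prime + probe a slow probe (a cache miss) indicates that the victim evicted the spy's data. The function names, the threshold of 150 cycles and the timing traces below are hypothetical values for illustration, not measurements.

```python
def classify_flush_reload(reload_cycles, threshold):
    """Flush + reload: a fast reload (hit) means the victim accessed the block."""
    return [t < threshold for t in reload_cycles]


def classify_prime_probe(probe_cycles, threshold):
    """Prime + probe: a slow probe (miss) means the victim used the cache set."""
    return [t > threshold for t in probe_cycles]


THRESHOLD = 150  # hypothetical hit/miss cut-off in CPU cycles

reload_trace = [60, 300, 280, 70, 310]  # made-up reload times, one per round
probe_trace = [60, 300, 280, 70, 310]   # made-up probe times, one per round

victim_touched_block = classify_flush_reload(reload_trace, THRESHOLD)
victim_used_set = classify_prime_probe(probe_trace, THRESHOLD)

print(victim_touched_block)  # [True, False, False, True, False]
print(victim_used_set)       # [False, True, True, False, True]
```

In practice the threshold has to be calibrated on the target machine, since hit and miss latencies vary between processors.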
Characteristics Capture
From the analysis of the typical attacks above, an obvious common feature is that the spy and the victim access the same cache sets. As a malicious process, the spy must flush or evict specified cache sets periodically. During the waiting time the processor serves the victim, which, with high probability, results in a high cache miss rate for the spy. The dTLB is refreshed when unrelated processes run at the same time. However, while the spy and the victim are running, the addresses they access remain unchanged, so the mapping between virtual and physical addresses remains constant. The corresponding page table entries therefore stay cached in the dTLB, which produces a low dTLB miss rate.
The capture of the cache miss rate and dTLB miss rate, shown in Figure 1, uses the perf_event_open() interface [10] on Ubuntu. First, the requested performance event is checked to see whether the system supports it; it may be any event shown by perf list. After that, the program scans the /proc directory until the directory of the specified process is found. The /proc directory stores a series of special files containing information about the currently running processes. The pid of the process is obtained from that directory and passed to perf_event_open() as an important parameter. Each call of the interface function yields one set of values for the performance events. To obtain multiple sets of data, the function is called in a loop.
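The dTLB argument can be checked with a toy model. The sketch below simulates a small TLB with least-recently-used replacement and compares a process that keeps touching the same two pages, as a spy does, with one that touches a new page on every access. The function name, the 64-entry capacity and the 4096-byte page size are assumptions for illustration and do not model a specific processor.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


def dtlb_miss_rate(addresses, entries=64):
    """Fraction of accesses whose page is not in a toy LRU TLB of `entries` slots."""
    if not addresses:
        return 0.0
    tlb = OrderedDict()  # page number -> True, ordered from oldest to newest use
    misses = 0
    for addr in addresses:
        page = addr // PAGE_SIZE
        if page in tlb:
            tlb.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1
            tlb[page] = True             # miss: load the translation
            if len(tlb) > entries:
                tlb.popitem(last=False)  # evict the least recently used entry
    return misses / len(addresses)


# Spy-like pattern: the same two pages are probed again and again.
spy = [0x70000000 + (i % 2) * PAGE_SIZE for i in range(10_000)]
# Contrasting pattern: every access lands on a page not seen before.
scan = [i * PAGE_SIZE for i in range(10_000)]

spy_rate = dtlb_miss_rate(spy)    # only the first touch of each page misses
scan_rate = dtlb_miss_rate(scan)  # every access misses
print(spy_rate, scan_rate)
```

With a fixed set of pages only the first touch of each page misses, so the miss rate falls as the number of accesses grows, which matches the low dTLB miss rate expected for the spy.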
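The /proc scan can be sketched as follows. This is a minimal illustration rather than the program of Figure 1: the function name find_pids_by_name is hypothetical, and matching the process by the content of /proc/<pid>/comm is an assumption, since the text does not state how the specified process is identified.

```python
import os


def find_pids_by_name(name, proc_root="/proc"):
    """Return the sorted PIDs whose <proc_root>/<pid>/comm equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names correspond to processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                # comm holds the command name, truncated by the kernel to 15 chars
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited meanwhile or is not readable
    return sorted(pids)


# Example use on a Linux system (not executed here):
#     for pid in find_pids_by_name("spy"):
#         print(pid)
```

The proc_root parameter exists only so that the function can be exercised against a directory tree other than the real /proc.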
Characteristics Analysis
To obtain and verify the cache miss rate and dTLB miss rate, several cache-based side channel attacks were implemented [4, 11, 12]. For comparison, a side channel attack, rsa_time_attack by Paul Cristian Pintilie, an ordinary attack, web_timing_attack by Daniel Honig, and two benign processes were selected. Figure 2(a) shows the cache miss rate over one hundred measurements of all the processes, and Figure 3 shows the corresponding dTLB miss rate. Figure 2(a) makes it clear that all four cache-based side channel attacks have a high cache miss rate, and three of them even exceed 99.4%, as displayed in Figure 2(b). Although the remaining one, rowhammer, is somewhat lower, most of its measurements are above 50%. By comparison, all the other processes have a low cache miss rate, below 20%, which is far from that of the attacks. The dTLB miss rates of all the other selected processes are low, below 1%, as presented in Figure 3(a). However, three cache-based side channel attacks have a much lower dTLB miss rate, below 0.002%, which is about one five-hundredth of 1%, as shown in Figure 3(b). Although the remaining one, rowhammer, is slightly higher, its value is still lower than the others, below 0.2%. Figure 2 and Figure 3 together show that all the cache-based side channel attacks have a much higher cache miss rate and a much lower dTLB miss rate than the other processes. Although some processes, such as urlopen in Figure 3(a), also have a low dTLB miss rate, they are not cache-based side channel attacks because their cache miss rate in Figure 2(a) is low. In conclusion, a cache-based side channel attack has both a high cache miss rate and a low dTLB miss rate.
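The two rates discussed above are ratios of counter values. The sketch below shows the arithmetic under the assumption that the cache miss rate is computed as cache-misses over cache-references and the dTLB miss rate as dTLB-load-misses over dTLB-loads; the text does not name the exact events, and the counter values are invented for illustration.

```python
def miss_rate_percent(misses, accesses):
    """Miss rate in percent; 0.0 when no accesses were counted."""
    if accesses == 0:
        return 0.0
    return 100.0 * misses / accesses


# Invented counter values for one measurement interval of one process.
sample = {
    "cache-references": 200_000,
    "cache-misses": 199_000,
    "dTLB-loads": 5_000_000,
    "dTLB-load-misses": 50,
}

cache_rate = miss_rate_percent(sample["cache-misses"], sample["cache-references"])
dtlb_rate = miss_rate_percent(sample["dTLB-load-misses"], sample["dTLB-loads"])

print(f"cache miss rate: {cache_rate:.1f}%")  # cache miss rate: 99.5%
print(f"dTLB miss rate: {dtlb_rate:.3f}%")    # dTLB miss rate: 0.001%
```

These invented values fall in the range reported for the three strongest attacks: a cache miss rate above 99.4% and a dTLB miss rate below 0.002%.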
Detection of Cache-based Side Channel Attack
The detection is implemented by scanning all running processes in /proc and calculating the cache miss rate and dTLB miss rate of each scanned process. It then checks whether the performance event values conform to the characteristics of a cache-based side channel attack. A warning is raised if the values match these characteristics. Experiments show that this method detects cache-based side channel attacks accurately and quickly, including all the attacks above. Compared with using the cache miss rate alone as in paper [6], the method that uses both the cache miss rate and the dTLB miss rate discriminates better between processes whose cache miss rate is relatively high. With the detection tool of paper [6], there was a false alarm for 'firefox', whose cache miss rate occasionally reached about 80%. With the method proposed here, 'firefox' did not trigger a warning because its dTLB miss rate was higher, above 0.02%. Therefore, the new method can reduce the false positive rate to a certain extent.
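The decision step can be sketched as a rule over the two rates. The thresholds below (cache miss rate above 50% and dTLB miss rate below 0.02%) are assumptions picked for illustration from the ranges mentioned in the text; they are not the thresholds of the actual detector and would need tuning. The sample values are likewise invented.

```python
from dataclasses import dataclass

CACHE_MISS_MIN = 50.0  # percent; hypothetical threshold
DTLB_MISS_MAX = 0.02   # percent; hypothetical threshold


@dataclass
class Sample:
    name: str
    cache_miss_pct: float
    dtlb_miss_pct: float


def cache_only_rule(s, cache_min=CACHE_MISS_MIN):
    """Single-indicator rule: flag on a high cache miss rate alone."""
    return s.cache_miss_pct > cache_min


def is_suspicious(s, cache_min=CACHE_MISS_MIN, dtlb_max=DTLB_MISS_MAX):
    """Two-indicator rule: high cache miss rate AND low dTLB miss rate."""
    return s.cache_miss_pct > cache_min and s.dtlb_miss_pct < dtlb_max


samples = [
    Sample("attack-like", 99.5, 0.001),  # invented values
    Sample("browser-like", 80.0, 0.03),  # high cache misses, higher dTLB misses
    Sample("benign-like", 15.0, 0.5),
]

flagged_single = [s.name for s in samples if cache_only_rule(s)]
flagged_both = [s.name for s in samples if is_suspicious(s)]

print(flagged_single)  # ['attack-like', 'browser-like']
print(flagged_both)    # ['attack-like']
```

The browser-like sample shows why the second indicator helps: it is flagged by the cache miss rate alone but passes once the dTLB miss rate is also required to be low.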
Conclusion
A new approach to detecting cache-based side channel attacks is proposed in this paper. In the new method, two parameters, cache miss rate and dTLB miss rate, are selected as the main indicators of the attack. The results show that a cache-based side channel attack has both a high cache miss rate and a low dTLB miss rate. Compared with the method in paper [6], the method here not only detects cache-based side channel attacks accurately and quickly but also reduces the false positive rate to a certain extent. In future work, more attention should be paid to improving the accuracy of detection and to strengthening the defense process.
