TurboCC: A Practical Frequency-Based Covert Channel With Intel Turbo
  Boost by Kalmbach, Manuel et al.
arXiv:2007.07046v1 [cs.CR] 14 Jul 2020
TurboCC: A Practical Frequency-Based Covert Channel With Intel Turbo Boost
Manuel Kalmbach
Karlsruhe Institute of Technology
Karlsruhe, Germany
manuel.kalmbach@kit.edu
Mathias Gottschlag
Karlsruhe Institute of Technology
Karlsruhe, Germany
os@itec.kit.edu
Tim Schmidt
Karlsruhe Institute of Technology
Karlsruhe, Germany
os@itec.kit.edu
Frank Bellosa
Karlsruhe Institute of Technology
Karlsruhe, Germany
os@itec.kit.edu
Abstract—Covert channels are communication channels used
by attackers to transmit information from a compromised
system when the access control policy of the system does
not allow doing so. Previous work has shown that CPU
frequency scaling can be used as a covert channel to transmit
information between otherwise isolated processes. Modern
systems either try to save power or try to operate near
their power limits in order to maximize performance, so
they implement mechanisms to vary the frequency based on
load. Existing covert channels based on this approach are
either easily thwarted by software countermeasures or only
work on completely idle systems.
In this paper, we show how the automatic frequency
scaling provided by Intel Turbo Boost can be used to
construct a covert channel that is both hard to prevent
without significant performance impact and can tolerate
significant background system load. As Intel Turbo Boost
selects the maximum CPU frequency based on the number of
active cores, our covert channel modulates information onto
the maximum CPU frequency by placing load on multiple
additional CPU cores. Our prototype of the covert channel
achieves a throughput of up to 61 bit/s on an idle system
and up to 43 bit/s on a system with 25% utilization.
Index Terms—Covert Channel, Intel Turbo Boost, Security,
Frequency, AMD Precision Boost, POWERT
1. Introduction
Businesses often depend on secure storage of sensitive
information. To prevent inadvertent or malicious transfer
of this information into the public, access control mech-
anisms such as firewalls or separate namespaces in a vir-
tualized environment are used to restrict communication
from sensitive software components. In such a system,
even if an attacker has gained control over sensitive soft-
ware components, the attacker is not able to use the regular
communication channels such as network or inter-process
communication to access the information.
Previous work, however, has identified a wide range
of covert channels which are not covered by traditional
access control. A covert channel transmits information
via a method not intended for information transfer [23].
Covert channels frequently employ hardware implemen-
tation artifacts such as memory access timing [27], [31],
[Figure 1: frequency over time (between min and max) on Core 1 (sender), which transmits “0” “1” “0” “1”, and on Core 2 (receiver).]
Figure 1. The maximum boost frequency depends on the number of
active cores. Increased load on one core is visible as a frequency drop
on other cores, which can be used to transmit information.
[41], CPU temperature [26] or power management char-
acteristics [21] for communication. In the process, in-
creasingly stronger isolation mechanisms such as cache
partitioning [37] or memory bandwidth partitioning [17]
have been developed to prevent unauthorized flow of
information.
Recently, the CPU frequency has been identified as
another mechanism to construct covert channels [2], [21],
[28]. In order to reduce power consumption, operating
systems often reduce the CPU frequency as soon as the
CPU becomes partially or fully idle. Therefore, a process
can modulate the load of the system to trigger intentional
frequency changes. If another process is able to read the
CPU frequency, the two processes can use this mechanism
to exchange information even if they are otherwise not
permitted to do so.
Reading the CPU frequency of the current core of
a process is simple, as the process can use timers to
determine the execution time of a known number of
operations [28]. Modern CPUs, however, decouple the
frequency of individual cores [18]. Therefore, the OS can
prevent the covert channel by placing sensitive processes
on isolated cores: In that case, each process is only able
to read the frequency of the core it is currently running on,
whereas sensitive processes only affect their own core’s
frequency. In this situation, system call interfaces such
as the sysfs file system in Linux can be used to read the
CPU frequency of all cores [2]. Again, the OS can trivially
restrict access to such interfaces, thereby greatly limiting
the viability of existing frequency-based covert channels
in practice.
A property of these modern systems that introduces
new cross-core covert channels is that they operate close to
their thermal limits, so utilizing available power headroom
is necessary to maximize performance. This situation re-
quires power management mechanisms to prevent exceed-
ing power limits, for example, by decreasing the frequency
of one core if another increases its power utilization.
Khatamifard et al. [21] show that this principle can be used
to construct covert channels – which they named POW-
ERT channels – and they calculate the potential through-
put. They demonstrate a covert channel on an Intel system,
albeit without explaining the underlying frequency scaling
implementation which causes the frequency changes. In
fact, their demonstration likely works because Intel Turbo
Boost 2 varies the frequency based on the number of active
cores as shown in Table 1. As shown in Figure 1, if the
transmitter activates an additional core, the receiver is able
to detect the resulting frequency drop. As Khatamifard et
al. only vary the power consumption of a single CPU core
– a “1” is sent by three active cores, a “0” by two active
cores – their prototype fails if just a single additional
core is used by other processes¹. Our experiments show
that even the little additional CPU utilization caused by
virtualization can cause significant throughput reduction
in such a system.
In this paper, we describe the potential of adapting
such covert channels to specific power management mechanisms.
We describe a covert channel designed specifically for Intel
Turbo Boost 2 and show how adaptations make the covert
channel viable even with significantly higher background
load as required for the use in real-world scenarios. Our
experiments show that the covert channel is usable at up
to 37.5% background CPU load which is close to the CPU
utilization achieved by modern data centers [15].
In our prototype, the transmitter activates or deacti-
vates multiple cores synchronously to reduce or increase
the frequency. The receiver executes a timing loop on
another core to determine the current turbo frequency and
employs edge detection on the resulting signal to recover
the transmitted information. As this setup cannot prevent
some of the transmission errors due to interference from
other processes, our prototype implements a packet-based
communication protocol with automatic retransmission of
corrupt packets to ensure a robust communication channel.
On a system with an 8-core Intel server processor, our
prototype provides a throughput of 61 bit/s on an idle
system and of 43 bit/s on a system running at 25%
utilization. Our covert channel is still able to transmit
12 bit/s at a background load of 37.5%. These results
show that CPU load alone is not a viable countermeasure
against frequency-based covert channels.
The remainder of this paper is structured as follows: As
our covert channel depends on Intel Turbo Boost, Section 2
contains a description of the behavior of Turbo Boost.
Then, Sections 3 and 4 describe the design principles and
the implementation of our covert channel. The
following sections then contain an evaluation of our prototype
at different amounts of system utilization (Section 5)
and a discussion of countermeasures, limitations, and the
potential use of AMD Precision Boost to construct covert
channels (Sections 6, 8 and 7, respectively). Finally, we present
related work in Section 9 and conclude in Section 10.
1. If the cooling solution is not able to sustain the specified turbo
frequency, thermal throttling might cause frequency changes in this
situation – our work targets server systems, where that situation is
unlikely.
TABLE 1. TURBO FREQUENCIES OF THE INTEL XEON SILVER 4108
PROCESSOR FOR DIFFERENT NUMBERS OF ACTIVE CORES [39]
Base Frequency | Turbo Frequency / Active Cores
               | 1, 2    | 3, 4    | 5, 6, 7, 8
1.8 GHz        | 3.0 GHz | 2.7 GHz | 2.1 GHz
1.1. Threat Model
As a threat model, we assume a scenario similar to
other covert channel publications described in Section 9.
Here, an attacker is able to execute arbitrary code in
user-space, but access control prevents information flow.
We imagine a scenario where one process has access
to sensitive information, but access control prevents any
further communication with other processes or via the
network. Another process, in contrast, has internet access but
no access to the sensitive information. We assume that the
attacker was able to infiltrate both processes and execute
arbitrary code in their context (in our case, receiver and
transmitter code of the covert channel). We further assume
that the attacker was not able to gain elevated privileges
to circumvent the access control policies. For our covert
channel to work, the two processes have to run on the
same physical CPU package – albeit on arbitrary CPU
cores – and the overall background load has to be low
enough as described in Section 5.3.
2. Intel Turbo Boost
Traditionally, systems employ two types of frequency
scaling: At low load, the operating system reduces the
frequency to conserve energy, and at high load the CPU
opportunistically increases the frequency above its nom-
inal maximum frequency to improve performance when-
ever further thermal headroom and power supply capacity
is available. Similar to POWERT [21], this work exploits
the latter.
Our prototype targets Intel Turbo Boost 2.0, which is
found in most recent Intel CPUs and tries to increase
the active cores’ frequency as much as possible without
violating either power or temperature limits [8]. Our work
exploits the fact that the power consumption depends
on the number of cores which are active, which is why
Turbo Boost selects higher turbo frequencies when more
cores are in a power-saving sleep state (C-state≥C3) [19].
Table 1 lists the frequencies for the Xeon Silver 4108
processor, showing that depending on the active core count
the frequency is automatically increased to up to 3GHz
from a base frequency of 1.8GHz.
Covert channels using the frequency selection to trans-
mit information require frequent frequency changes to
provide high throughput. In the Haswell microarchitec-
ture, a central power control unit (PCU) [5] determines
the highest possible turbo frequency once per millisec-
ond, calculating the maximum CPU frequency from the
temperature of the chip as well as the time spent at
the different power limits [35]. Although our prototype
has been developed on an Intel Skylake-SP CPU, our
measurements confirm the rate of these turbo frequency
changes.
This high frequency switching rate is only viable
because Turbo Boost is integrated into the CPU and is not
controlled by the operating system. The operating system
only has the possibility to activate or deactivate Turbo
Boost by setting the P-states [36]. If P-state P0 is set,
the boost frequencies are activated. At any higher P-state
Turbo Boost is deactivated and the cores’ frequency is
either defined by the given P-state or by Intel’s speed shift
technology [13].
3. Frequency Covert Channel
With Intel Turbo Boost the activity of one core affects
the maximum frequency on other cores, which enables
a covert channel between different cores. Exchanging
information only requires the transmitter to change the
maximum core frequency and the receiver to detect the
changes. As the maximum frequency is affected by a
number of parameters such as the number of active cores,
power limitations, temperature constraints, or even the
type of instructions executed, the transmitter could be
built using different mechanisms. In our work, we focus
on the number of active cores as it provides a simple
mechanism to influence the frequencies on a well-cooled
system, where reaching power or temperature limits might
be difficult.
Activating a single core is not always enough, as
shown in Table 1. We can see that Turbo Boost uses
the same turbo frequency for multiple numbers of active
cores. For example, at most two cores are allowed to
be active to get the highest turbo frequency of 3.0GHz.
Similarly, with three or four active cores the turbo fre-
quency drops to 2.7GHz. In this range, if we assume one
core to be used by a receiver process as described below,
activating a single additional core does not always cause
a frequency change, whereas two freshly activated cores
always will.
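The relationship between active cores and turbo frequency can be sketched as a small lookup. This is only an illustration using the Xeon Silver 4108 values from Table 1; the function names `turbo_mhz` and `wake_changes_frequency` are hypothetical and not part of the prototype.

```c
#include <assert.h>
#include <stdbool.h>

/* Turbo frequency in MHz for a given number of active cores,
 * using the Xeon Silver 4108 values from Table 1. */
int turbo_mhz(int active_cores)
{
    assert(active_cores >= 1 && active_cores <= 8);
    if (active_cores <= 2) return 3000;
    if (active_cores <= 4) return 2700;
    return 2100;
}

/* True if waking `extra` additional cores lowers the turbo frequency
 * when `active` cores (including the receiver) are already busy. */
bool wake_changes_frequency(int active, int extra)
{
    return turbo_mhz(active + extra) < turbo_mhz(active);
}
```

With only the receiver active, `wake_changes_frequency(1, 1)` is false while `wake_changes_frequency(1, 2)` is true, which matches the observation that a single additional core is not always enough.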
To read the current turbo frequency, a receiver has to
utilize its core enough to use the turbo frequency and is
then able to measure its current frequency. Because the
turbo frequency is identical for all cores, the receiver can
determine the current turbo frequency from any core it is
running on.
Note that once more than four cores are already active,
on our system no amount of additional active cores will
cause further frequency reductions as the turbo frequency
already is at the lowest “all-core turbo” level. Therefore,
the covert channel is limited to at least partially idle
systems. Some system utilization is acceptable, though,
as long as fewer than four cores are active.
In all such cases the transmitter needs to ensure
that its frequency changes are distinguishable from the
background noise. In particular, if the system utilization
is not constant, the transmitter needs to trigger a lower
frequency than can be caused by the other processes. If a
larger number of cores are synchronously activated by the
transmitter, the resulting frequency swing is larger and the
resulting frequency is guaranteed to be lower. Therefore,
a larger number of transmitter cores increases the amount
of system utilization that can be tolerated by the covert
channel.
4. Implementation
To determine the potential of a covert channel specif-
ically adapted to Intel Turbo Boost, we implemented a
prototype called TurboCC to transmit information between
processes and virtual machines on a Linux system. Our
implementation consists of two basic parts: The transmit-
ter (described in Section 4.1) changes the turbo frequency
by activating additional cores, and the receiver (described
in Section 4.2) measures the frequency and recovers the
information.
Intermittently, other processes on the system will ac-
tivate additional cores and can potentially cause transmis-
sion errors. Therefore, the covert channel uses a commu-
nication protocol with automatic retransmission of cor-
rupt packets as described in Section 4.3. Unlike other
approaches, we do not use error-correcting codes. Instead,
we only use a CRC checksum to detect corruption, a
decision which is detailed in Section 4.4.
4.1. Transmitter
As shown in Figure 1, our transmitter uses a simple
approach. It transmits a “1”-bit by keeping one or more
cores active and a “0”-bit by letting them sleep for a given
time. In case of a “1”-bit, the additional active cores will
cause Turbo Boost to select a lower turbo frequency which
can be detected by the receiver. A similar principle, albeit
with only one core which is activated, is implemented in
the POWERT prototype [21].
The number of cores activated by the transmitter needs
to be chosen so that the cores deterministically trigger
a specific frequency change that can be recognized by
the receiver. Table 1 shows the frequency levels for an
8-core server system. As already described in Section 3,
if the receiver keeps one core active, two additional
cores are required to switch to the next lower frequency
level assuming that the system is otherwise idle. More
cores are required if other processes generate additional
constant or changing background load, as then another
frequency level might have to be targeted. Particularly, if
more cores are used to send a “1”-bit than are at most
held active by other processes, the resulting frequency
change is distinguishable from background noise. Note
that sometimes fewer cores are sufficient if the transmitter
monitors the resulting frequency and adds more cores only
when required. Doing so, however, potentially delays the
frequency change and therefore reduces the throughput. In
our prototype, we make the number of transmitter cores
configurable so that the attacker can adapt them to the
expected system utilization.
Turbo Boost regards cores as active whenever they
are not in a deep sleep state, i.e., not in a C-state higher
than C3 [19]. The easiest method to force a core to be
active is, as described by Alagappan et al. [2], to keep
the core busy executing code. An attacker can implement
start = RDTSC;
counter = 0;
while (RDTSC - start < time_frame) {
    counter++;
}
Listing 1. To measure the frequency, the receiver measures the amount
of work performed in a fixed amount of time (pseudocode).
this method purely in user space without the need for el-
evated privileges. Our prototype repeatedly executes long
sequences of NOP instructions² to keep the core from
sleeping.
Conversely, to transmit a zero, the core has to go to
sleep to allow other cores to raise the turbo frequency to
the next level. On Intel CPUs, sleep states are entered by
the core when the operating system executes the privileged
MWAIT instruction with the appropriate flags set [20, Vol.
2B 4-159]. User space programs are therefore not able
to directly halt a core by themselves. However, both for
power and performance reasons, modern operating sys-
tems will automatically put idle cores to sleep. Therefore,
our prototype makes sure that all transmitter processes
either execute usleep() or sem_wait() during “0”-
bits so that the corresponding cores – if the operating
system does not schedule any other program on them –
idle and are sent to sleep.
As described above, the activation and deactivation
of multiple cores happens synchronously. Our tests have
shown that spawning new processes each time to generate
additional CPU load induces significant delay, most likely
due to the overhead of forking a new process. Therefore,
our prototype creates a set of child processes at startup
and uses a semaphore to wake them up as required.
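The behavior of a single transmitter worker during one bit can be sketched as follows. This is a simplified illustration, not the prototype's code: it spins on the monotonic clock instead of executing NOP sequences, uses `nanosleep()` in place of `usleep()` or `sem_wait()`, and omits the semaphore that releases all workers at the same time. The names `now_ns` and `hold_bit` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hold one bit on the calling core for bit_ns nanoseconds:
 * busy-spin for a "1" (the core stays active), sleep for a "0"
 * (the OS may put the core into a sleep state).
 * Returns the time actually spent, in nanoseconds. */
uint64_t hold_bit(int bit, uint64_t bit_ns)
{
    assert(bit == 0 || bit == 1);
    uint64_t start = now_ns();
    if (bit) {
        while (now_ns() - start < bit_ns) {
            /* spin: repeated clock reads keep the core busy */
        }
    } else {
        struct timespec req = { (time_t)(bit_ns / 1000000000ull),
                                (long)(bit_ns % 1000000000ull) };
        nanosleep(&req, NULL);
    }
    return now_ns() - start;
}
```

A full transmitter would run one such worker per transmitter core, for example `hold_bit(1, 5000000)` for a "1" at 5 ms per bit, with all workers started together.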
4.2. Receiver
As the transmitter modulates the information onto the
turbo frequency, the receiver needs to be able to determine
the current turbo frequency of its core. Although the
receiver can read the frequency through the sysfs file sys-
tem [2], doing so causes a lot of overhead and can easily
be detected and blocked by the operating system. Instead,
to be independent of the operating system and to achieve a
higher sampling rate, we chose to measure
the current frequency with the loop shown in Listing 1.
This loop counts the operations executed in a fixed amount
of time; this count is proportional to the current frequency.
Note that the current frequency of a core does not need
to be a turbo frequency if the operating system decides
that the utilization of the core is low enough that power-
saving reduced frequencies are preferable. However, the
loop generates enough load for the receiver’s core to use
the turbo frequency.
Listing 1 uses the RDTSC instruction to measure time.
The RDTSC instruction returns the core’s time stamp
counter (TSC) which on recent CPUs is independent of
the current core clock speed [20, Vol. 3B 17.17]. On
modern out-of-order CPUs, however, the resulting time
2. In general, the choice of instructions does not have an impact on
the frequency of other cores – even the frequency reduction caused by
AVX2 and AVX-512 instructions only affects a single core.
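[Figure 2: frequency samples over time with the edge threshold marked; the runs between edges are labeled “1” (1 bit) and “0” (2 bits), and a short outlier is labeled “Ignored”.]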
Figure 2. To recover the data from the signal, the receiver detects edges
and measures the time between edges to determine the number of bits.
Outliers consisting of one or two samples are ignored.
stamp often does not correlate to a single point in the
program, as these CPUs reorder other instructions around
the RDTSC instruction. Such reordering could therefore
cause wrong measurements of the runtime of the timing
loop. RDTSCP provides the functionality of RDTSC, but
waits until all previous instructions have executed [20, Vol.
2B 4-544]. However, using RDTSCP has the disadvantage
that some hypervisors, such as KVM, disallow the use
of this instruction [16]. Therefore, our prototype uses the
LFENCE instruction that serializes all load-from-memory
instructions and thereby prevents most reordering.
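A compilable variant of the timing loop from Listing 1 might look like the sketch below. The placement of the `LFENCE` instructions around `RDTSC` is an assumption, as the text only states that `LFENCE` is used. The fallback branch for non-x86 targets and the names `read_ticks` and `sample_frequency` are likewise hypothetical additions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>
/* Read the time stamp counter, fenced with LFENCE to limit reordering. */
static inline uint64_t read_ticks(void)
{
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}
#else
#include <time.h>
/* Fallback (assumption): nanoseconds from the monotonic clock. */
static inline uint64_t read_ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Count loop iterations completed within `window` ticks.
 * A higher count indicates a higher core frequency. */
uint64_t sample_frequency(uint64_t window)
{
    assert(window > 0);
    uint64_t start = read_ticks();
    uint64_t counter = 0;
    while (read_ticks() - start < window) {
        counter++;
    }
    return counter;
}
```

Calling `sample_frequency()` repeatedly with the same window yields a sequence of samples that can then be compared against a threshold.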
The time stamp counter itself is not only precise, as
required for high bit rates, but also provided directly by the
CPU and therefore largely independent of the operating
system. Even though the time presented by the operating
system would be precise enough, the receiver would have
to rely on the operating system to provide it with the
correct time. The operating system could also potentially
disable the use of the RDTSC instruction, but the instruc-
tion is widely used to measure exact time. In particular,
the instruction is used on Linux systems to implement
clock_gettime() and gettimeofday() without
any system call³, so blocking RDTSC would reduce performance.
Even if this instruction is blocked, though, the receiver
can use the normal system clock at the cost of reduced
throughput. Alternatively, it might
even be possible to generate a high resolution clock by
executing code whose runtime is not proportional to the
CPU frequency (e.g., due to cache-misses), similar to the
techniques described by Kohlbrenner et al. [22].
As the receiver is not perfectly synchronized with the
transmitter, a single frequency measurement per transmit-
ted bit is not sufficient. Our prototype samples the CPU
frequency at a significantly higher rate than the bit rate
of the signal and then uses edge detection to recover the
signal. As shown in Figure 2, edge detection only detects
changes in the signal from “1” to “0” and vice versa, but
not the number of identical bits in-between. Therefore, our
prototype measures the time between the two edges and
divides it by the time per bit in order to get the transmitted
data. The edge detection categorizes samples depending
on whether they are below or above a threshold. In our
prototype, the threshold value is determined manually in
order to increase reproducibility of the results, although,
3. Linux 5.2, arch/x86/entry/vdso/vclock_gettime.c,
l. 127ff
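 0        7 8       15
+----------+----------+
| 10101100 | Seq. No. |
+----------+----------+
| Data ...            |
+---------------------+
| CRC-16              |
+---------------------+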
Figure 3. Each packet consists of the synchronization sequence, a
sequence number, the data and a checksum.
as the turbo frequency levels are discrete, automatic cal-
culation of the threshold is possible.
The main problem of simple edge detection is that
during frequency sampling each single outlier immedi-
ately is recognized as an edge and is therefore detected
as a new bit. Such errors can be caused by, for example,
concurrent signal recovery, interrupt handling on the re-
ceiver’s core, or migration between cores. As these errors
are fairly frequent and each error potentially causes the
receiver to lose synchronization to the sender, the edge
detection algorithm discards changes of the signal which
are significantly shorter than a single bit.
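The signal recovery described in this section can be sketched as a run-length decoder. This is a minimal illustration under several assumptions: samples below the threshold are treated as "1", a level change is accepted only after it persists for more than `max_glitch` samples, and the final run in the buffer is emitted as well. The function names and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append round(run / samples_per_bit) copies of `level` to bits[]. */
static size_t emit_run(uint8_t *bits, size_t cap, size_t out,
                       int level, size_t run, size_t samples_per_bit)
{
    size_t count = (run + samples_per_bit / 2) / samples_per_bit;
    while (count-- > 0 && out < cap)
        bits[out++] = (uint8_t)level;
    return out;
}

/* Decode frequency samples into bits. Returns the number of bits written. */
size_t decode_samples(const uint32_t *s, size_t n, uint32_t threshold,
                      size_t samples_per_bit, size_t max_glitch,
                      uint8_t *bits, size_t cap)
{
    assert(samples_per_bit > 0);
    if (n == 0)
        return 0;
    size_t out = 0;
    int level = s[0] < threshold;  /* currently accepted level */
    size_t run = 0;                /* samples attributed to that level */
    size_t pending = 0;            /* consecutive samples at the other level */
    for (size_t i = 0; i < n; i++) {
        int l = s[i] < threshold;
        if (l == level) {
            run += pending + 1;    /* short outliers are absorbed */
            pending = 0;
        } else if (++pending > max_glitch) {
            /* accepted edge: the time since the last edge gives the bit count */
            out = emit_run(bits, cap, out, level, run, samples_per_bit);
            level = l;
            run = pending;
            pending = 0;
        }
    }
    return emit_run(bits, cap, out, level, run + pending, samples_per_bit);
}
```

With four samples per bit and `max_glitch` set to 2, a single low sample inside a run of high samples does not produce an edge, so the run is still decoded as consecutive "0" bits.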
4.3. Communication Protocol
In addition to the transmission mechanism of the re-
ceiver and transmitter, we added a simple communication
protocol which has to fulfill mainly two tasks. First, the
mechanism described above requires sender and receiver
to be synchronized so that the receiver recognizes when
the sender starts transmitting information. Second, even
though the most common intermittent transmission errors
are covered by our edge detection mechanism, other errors
might still require retransmitting information that was
lost to ensure robust error-free communication. The com-
munication protocol enables the transmitter to determine
whether packets were lost or corrupted and have to be
retransmitted. Figure 3 shows the structure of a single
data packet which contains fields to deal with both these
problems.
As our communication channel does not have any
external synchronization signal, our prototype implements
synchronization between sender and receiver via the trans-
mission of a special bit sequence at the beginning of
every transmission, similar to POWERT [21]. Whenever
the receiver recognizes the corresponding characteristic
frequency changes, it starts recording incoming data. The
bit sequence has to be long and complex enough that it
is unlikely to occur by accident when the scheduler lets
other processes run. Short sequences, like 1100, are too
likely to occur and therefore we chose the longer and
more complex sequence 10101100 for the synchroniza-
tion. Note that occasional misinterpretation of background
noise as the start of the transmission by the receiver is
not particularly problematic as the received data will be
recognized as being erroneous and will be discarded as
described below.
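Detecting the synchronization sequence in the decoded bit stream can be sketched with a sliding 8-bit window. This is an illustrative sketch only; the function name `find_payload_start` and the convention of one bit per array element are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYNC_PATTERN 0xACu  /* 10101100, first transmitted bit is the MSB */

/* Return the index of the first bit after the synchronization sequence,
 * or -1 if the sequence does not occur in bits[0..n). */
long find_payload_start(const uint8_t *bits, size_t n)
{
    uint8_t window = 0;
    for (size_t i = 0; i < n; i++) {
        assert(bits[i] <= 1);
        window = (uint8_t)((window << 1) | bits[i]);
        if (i >= 7 && window == SYNC_PATTERN)
            return (long)(i + 1);
    }
    return -1;
}
```

A false match on background noise only costs one discarded packet, because the checksum of the data that follows will not verify.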
Even if the synchronization header is correctly rec-
ognized, we still have to handle other errors during the
transmission of the data. Similar to most network proto-
cols, our approach uses checksums and a retransmission
mechanism to handle errors in the received data. We add
a CRC-16 checksum to the transmitted data in order to
detect errors.
To let the sender detect a failed transmission, the
receiver verifies the checksum and replies with a short
acknowledgment message to the original sender. If
the sender does not receive that message in time or if the
acknowledgment itself is corrupted, the sender retransmits
the data. During this process, sender and receiver
switch roles to allow the acknowledgment to be
sent back to the original sender. For the acknowledgment
packets, while the underlying transmission mechanism is
fully bidirectional, we use a simplified format without the
possibility to send any additional data back to the original
transmitter. The packet format of our covert channel could
be trivially extended to allow full bidirectional communi-
cation, albeit at slightly higher overhead.
In order to restrict the data that has to be retransmitted
after each error, the sender splits the payload into multiple
packets of fixed size. Each packet contains the synchro-
nization bit sequence, a sequence number to identify the
packet, the fixed size data and the checksum over the
rest of the packet. Similar to reliable network protocols
such as TCP, each acknowledgment packet contains the
sequence number of the last correctly received packet.
The sender only starts transmitting the next packet once a
correct acknowledgment for the previous packet has been
received.
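The sender side of this scheme can be sketched as a small state update. The structure and function names are hypothetical. The sketch assumes an 8-bit sequence number that wraps around, and treats an acknowledgment carrying any other sequence number like a missing one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sender-side state for a stop-and-wait retransmission scheme. */
struct sender_state {
    uint8_t  seq;      /* sequence number of the packet in flight */
    unsigned retries;  /* retransmissions of the current packet so far */
};

/* Process the outcome of waiting for an acknowledgment.
 * ack_ok:  an acknowledgment arrived in time and its checksum verified
 * ack_seq: sequence number carried by that acknowledgment
 * Returns true if the sender may transmit the next packet,
 * false if the current packet has to be retransmitted. */
bool handle_ack(struct sender_state *s, bool ack_ok, uint8_t ack_seq)
{
    assert(s != NULL);
    if (ack_ok && ack_seq == s->seq) {
        s->seq++;      /* wraps around modulo 256 */
        s->retries = 0;
        return true;
    }
    s->retries++;
    return false;
}
```

Because only one packet is in flight at a time, the sender needs no buffering beyond the current packet.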
Note that, due to the low throughput of our covert
channel, the packets are of a fixed size and do not contain
any additional information about the length of the data
carried. It is left to the application to implement correct
padding. Our packets each contain 8 bytes of data as our
experiments have shown this size to be a good compro-
mise between the overhead per transmitted byte and the
amount of retransmitted data.
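The packet layout from Figure 3 can be sketched as follows. The CRC-16 variant is an assumption, since the text does not name one: the sketch uses CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF) and stores the checksum in big-endian order. All identifiers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SYNC_BYTE = 0xAC, DATA_LEN = 8, PACKET_LEN = 1 + 1 + DATA_LEN + 2 };

/* CRC-16/CCITT-FALSE; assumed variant, not confirmed by the paper. */
uint16_t crc16(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Fill out[] with sync byte, sequence number, data and checksum. */
void build_packet(uint8_t seq, const uint8_t data[DATA_LEN],
                  uint8_t out[PACKET_LEN])
{
    assert(data != NULL && out != NULL);
    out[0] = SYNC_BYTE;
    out[1] = seq;
    memcpy(&out[2], data, DATA_LEN);
    uint16_t crc = crc16(out, 2 + DATA_LEN);  /* covers the rest of the packet */
    out[2 + DATA_LEN] = (uint8_t)(crc >> 8);
    out[3 + DATA_LEN] = (uint8_t)(crc & 0xFF);
}

/* Check the sync byte and the checksum of a received packet. */
bool packet_is_valid(const uint8_t pkt[PACKET_LEN])
{
    uint16_t got = (uint16_t)((pkt[PACKET_LEN - 2] << 8) | pkt[PACKET_LEN - 1]);
    return pkt[0] == SYNC_BYTE && crc16(pkt, PACKET_LEN - 2) == got;
}
```

Under these assumptions a packet is 12 bytes, that is 96 bits on the channel for 64 bits of payload.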
4.4. Error Correction
Although the retransmissions of corrupt packets are
sufficient to achieve a robust communication channel, sim-
ple error detection might not be the most efficient method
to handle transmission errors. Especially for higher error
rates, error correction codes might be able to repair some
corrupt data, might reduce the number of retransmissions,
and might thereby increase performance. As, however, the
choice of error handling mechanisms depends on the type
and rate of errors, in the following, we first discuss the
different error types and then show why adding further
error correction is disadvantageous for our covert channel.
Our covert channel is supposed to be usable on a
normal system with all its background tasks which dis-
turb the transmitted signal. As shown in Figure 4, the
transmissions in our covert channel are affected by the
same types of errors as in the cache-based covert channel
described by Maurice et al. [27]. Single bit flip and burst
errors are caused by the scheduler of the operating system
which chooses processes to run on sleeping cores, which
causes the cores to wake up and have an impact on
the turbo frequency. We counted the frequency changes
caused by other cores becoming active during one second
and measured their length. As shown in Table 2, on an idle
system most of these frequency changes are short and will
not cause transmission errors if the time per transmitted
(a) Transmission without any errors. “110001” is transmitted and received.
    Transmitter: 1 1 0 0 0 1
    Receiver:    1 1 0 0 0 1
(b) Only a single bit is flipped, for example, if an interrupt is handled on another core.
    Transmitter: 1 1 0 0 0 1
    Receiver:    1 1 0 1 0 1
(c) A burst error, i.e., a sequence of flipped bits, can be caused by processes on other cores.
    Transmitter: 1 1 0 0 0 1
    Receiver:    1 1 1 1 1 1
(d) Synchronization errors occur if sender or receiver are preempted. All bits from then on are corrupt.
    Transmitter: 1 1 0 0 0 1
    Receiver:    1 1 0 0 0 0 0 1
Figure 4. Possible transmission errors in our covert channel. Our system suffers from similar types of errors as cache-based covert channels [27],
except that we expect burst errors to be more frequent.
bit is at least several milliseconds. Therefore, reducing the
transmission rate provides a simple approach to limit the
number of required retransmissions. It presents a trade-off,
though, between the number of retransmissions and the
maximum achievable throughput. Also, even arbitrarily low
bit rates do not fully
guarantee that no errors occur as there is no bound on
the run time of other processes, so some retransmissions
might be required.
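The effect of the bit duration can be illustrated with the measurements from Table 2. The sketch below simply counts how many of the recorded involuntary frequency changes last at least a given number of milliseconds. It is a rough illustration, not an error model, and the identifiers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Involuntary frequency changes during one second on an idle system (Table 2). */
struct disturbance { unsigned duration_ms; unsigned count; };

static const struct disturbance table2[] = {
    {1, 109}, {2, 5}, {3, 2}, {4, 1}, {6, 1},
};

/* Number of recorded disturbances lasting at least min_ms milliseconds. */
unsigned disturbances_at_least(unsigned min_ms)
{
    assert(min_ms >= 1);
    unsigned total = 0;
    for (size_t i = 0; i < sizeof table2 / sizeof table2[0]; i++)
        if (table2[i].duration_ms >= min_ms)
            total += table2[i].count;
    return total;
}
```

Of the 118 recorded changes, 109 last only 1 ms, and a single one lasts 5 ms or longer, which is consistent with the observation that most disturbances are much shorter than a bit of several milliseconds.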
As error correction codes are able to repair corrupt
data at the receiving end of the channel, they have the
potential to reduce the number of retransmissions which
can have a positive impact on throughput. However, the
codes themselves need to be appended to each packet and cause
overhead. Therefore, the use of error correction codes
also presents a trade-off where the utility depends on
the transmission error rate and the type of the errors.
To determine whether error correction is advantageous
in our setup, we constructed a simplified version of our
covert channel without retransmissions and acknowledge-
ment packets and we recorded the packet content at the
receiver on an idle system. For each packet, we used
the data received to determine whether an error-correcting
code could have successfully repaired the packet. For our
analysis, we chose the widely used Reed-Solomon code
with 4 bytes of parity which is able to recover up to
two bytes of data. Using more parity bytes would allow
recovering more bytes of data but would increase the
packet transmission time. In any case, the Reed-Solomon
code does not remove the need for the CRC-16 checksum,
as the retransmission mechanism would still be required
for robustness. To simplify estimating the transmission
rate, we assume that a retransmission of a packet will
be without errors.
The measured number of corrupted bytes per transmis-
sion shows that most errors could have been corrected at
all transmission rates where the covert channel is viable.
At a high transmission rate of only 5 ms per bit, 29 out
of 40 packets were without any errors, and of the
11 broken packets, 7 could have been fixed by the Reed-
Solomon code. With retransmissions as well as the use of
the Reed-Solomon code, we calculate a theoretical data
rate of 107 bit/s. Without Reed-Solomon we expect a data
rate of approximately 121 bit/s. This analysis shows that,
despite the correctable errors, the overhead of the Reed-
Solomon code would be enough to offset the advantages.
TABLE 2. NUMBER OF INVOLUNTARY FREQUENCY CHANGES
DURING ONE SECOND ALONG WITH THEIR RESPECTIVE DURATION
ON AN IDLE SYSTEM WITH ONLY BACKGROUND SERVICES RUNNING

Duration    Involuntary Frequency Changes
1 ms        109
2 ms        5
3 ms        2
4 ms        1
6 ms        1

With only four broken transmissions, our experiment
showed the lowest error rate at a transmission time of 10ms per
bit. All four of these corrupted transmissions could have
been fixed by the Reed-Solomon code. However, even
in this situation, error correction would not yield a net
advantage. With the Reed-Solomon code, we calculate a
throughput of 59 bit/s, whereas we expect the system to
achieve 70 bit/s without error correction. As none of these
setups promises any advantage due to error correction, our
prototype only employs error detection, retransmitting any
corrupted packet as necessary.
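The estimates above can be reproduced with a short calculation. The following is a minimal sketch, not the authors' evaluation code. It assumes a 104-bit packet (inferred from the 728ms packet time at 7ms per bit quoted in Section 5.4), 80 payload bits per packet (an assumption chosen because it reproduces the reported figures), 40 recorded packets at both bit rates, and, as stated above, that a retransmitted packet always arrives without errors. The function name `expected_throughput` is hypothetical.

```python
def expected_throughput(payload_bits, packet_bits, ms_per_bit, failed, total):
    """Expected error-free throughput in bit/s.

    Every failed packet is sent exactly one more time, because a
    retransmission is assumed to succeed.
    """
    sends_per_packet = 1 + failed / total
    seconds_per_packet = packet_bits * (ms_per_bit / 1000) * sends_per_packet
    return payload_bits / seconds_per_packet


PAYLOAD_BITS = 80    # assumption: 10 payload bytes per packet
PACKET_BITS = 104    # inferred: 728 ms per packet / 7 ms per bit
PARITY_BITS = 4 * 8  # 4 Reed-Solomon parity bytes
TOTAL = 40           # recorded packets (assumed for 10 ms per bit as well)

estimates = {
    # (ms per bit, uses Reed-Solomon): throughput in bit/s
    (5, False): expected_throughput(PAYLOAD_BITS, PACKET_BITS, 5, 11, TOTAL),
    (5, True): expected_throughput(
        PAYLOAD_BITS, PACKET_BITS + PARITY_BITS, 5, 11 - 7, TOTAL),
    (10, False): expected_throughput(PAYLOAD_BITS, PACKET_BITS, 10, 4, TOTAL),
    (10, True): expected_throughput(
        PAYLOAD_BITS, PACKET_BITS + PARITY_BITS, 10, 0, TOTAL),
}

for (ms, with_rs), rate in estimates.items():
    label = "with Reed-Solomon" if with_rs else "CRC-16 only"
    print(f"{ms:>2} ms per bit, {label}: {rate:.1f} bit/s")
```

Under these assumptions the four results round to 121, 107, 70, and 59 bit/s, matching the values quoted above: the 32 extra parity bits cost more transmission time than the avoided retransmissions save.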
5. Evaluation
We conducted an evaluation of our implementation
consisting of three parts: First, in Section 5.2, we measure
the throughput of our prototype on an idle system for
different bit rates and show how, despite the runtime overhead
added by the retransmission protocol, our prototype
achieves a throughput close to the prediction by Khatamifard
et al. [21]. Second, in Section 5.3, we show that
our covert channel is able to achieve sufficient throughput
even in a system with significant background load. Third,
we show in Section 5.4 that our
covert channel is even able to communicate between two
VMs running on the same CPU.
5.1. Setup
As our covert channel is usable in both server and
desktop environments, we chose to use the Intel Xeon
Silver 4108 CPU as our primary target system. Although
this specific CPU is usually found in servers, most recent
desktop CPUs by Intel have a similar architecture with
similar power management. This CPU is based on the
Skylake architecture and has 8 physical cores with a
base frequency of 1.8GHz and a max turbo frequency of
3.0GHz with Turbo Boost 2.0. The system runs Fedora
28 and is sufficiently cooled so that our experiments
do not trigger thermal throttling. For the experiments in
Section 5.2, the system is idle,
i.e., only the terminal and the background tasks of the
operating system are intermittently active.
All experiments are executed with deactivated hyper-
threading to prevent running the transmitter and the re-
ceiver on the same physical core, as our covert channel
was only designed for inter-core communication. If trans-
mitter and receiver are on the same physical core and
share functional units, other covert channels such as the
one described by Wang et al. [37] are available instead.
It is likely that the two approaches can be combined to
create a covert channel that works both for communication
between the hyper threads of a single core and between
different physical cores. For example, sender and receiver
could alternate between the different methods to send
messages until one of them succeeds. We expect such an
approach to become less important, though, as recently
discovered high-risk side channels that require hyper-threading
[3], [33] will likely trigger increasing deployment
of countermeasures. For example, hyper-threading can
be completely deactivated [24] or, in order to reduce the
performance impact, the OS can instead use scheduling
algorithms that prevent applications from different sources
from executing on the same physical core [10].
5.2. Idle Target System
The main goal of our prototype is to provide a practi-
cal covert channel that, in contrast to other approaches,
is optimized for robustness in the face of significant
background CPU load and other sources for transmission
errors. The communication protocol required, however,
potentially introduces significant overhead. In this section,
we show that the solution is still able to provide through-
put comparable to other approaches.
The throughput of a covert channel such as ours de-
pends on several variables, amongst them the overhead of
the communication protocol, the raw transmission time per
bit, and the number of retransmissions caused by transmis-
sion errors. The communication protocol is fixed, but the
bit rate is variable and the number of transmission errors
depends on the bit rate as well as the background load on
the system. To determine the achievable throughput at a
specific bit rate, we use the covert channel to transmit
80 bytes of data while we measure the time between
the start of the transmission and the reception of the last
acknowledgement packet. We repeat this experiment for
different bit rates to determine the optimal throughput.
The results for an idle system are shown as the
blue line in Figure 5a. As transmission errors have an
unpredictable influence on the throughput, we repeated
each test 10 times and averaged the results. The highest
throughput of 61 bit/s is reached at a raw transmission
time of 7ms per bit. Intuitively, the throughput increases
with increasing transmission speed. Below 7ms per bit,
the throughput decreases again, as the signal conditioning
of our prototype – which is supposed to filter out short
frequency changes caused by interrupts on other cores
– fails to detect the short frequency changes caused by
the transmitter. This effect can be seen as an increase in
Figure 5b, which shows the number of retransmissions
per packet during the experiment. For a similar system,
Khatamifard et al. [21] have determined the maximum
throughput to be approximately 120 bit/s. Our system has
significantly lower throughput because of the communication
protocol; it would be similar if we removed our error
detection and communication protocol. Note that further
optimization – e.g., letting the transmitter send multiple
packets in one batch and then letting the receiver send
an accumulative acknowledgement – might improve the
throughput of our prototype.
5.3. Background Load
To demonstrate that our system is able to cope with
significant background load caused by other processes
on the same system, we repeat the experiment from the
last section. In parallel to our prototype, we execute an
instance of the x265 video encoder to generate background
CPU load. We vary the number of threads used by the
video encoder. x265 was chosen as a workload as it
produces a mostly stable background load with some I/O
operations. For workloads with higher variation to cause
transmission errors, two conditions have to be fulfilled:
First, the utilization temporarily needs to be higher than
the maximum supported utilization, and second, the uti-
lization spikes need to be long enough to be detected as a
“1”-bit. Although some cloud workloads violate the latter,
others do not. In addition to the experiments described
below, we conducted an additional test with a kernel
compilation benchmark, where make spawns processes
on varying cores and I/O interrupts are often triggered on
additional cores, yet our prototype was able to transmit
data.
Again, Figure 5 shows the results of these experiments.
The figure shows that the achievable throughput varies
with the utilization of the system: As shown, our covert
channel is able to cope well with up to 25% background
load. Except at the higher bit rates, the throughput is
nearly the same as without background load. At higher bit
rates, spikes in the CPU load cause numerous transmission
errors.
With 37.5% background load, our covert channel
shows a significantly lower data rate. Note that this background
load does not include the processes used by the
sender and receiver to implement the covert channel. The
resulting utilization as seen by the operating system is
therefore higher. At 37.5% background load, on average
three cores are fully utilized by the background processes,
as well as one by the receiver. As shown in Table 1, the
CPU selects the lowest turbo frequency whenever more
than 4 cores are active (see footnote 4). Therefore, any additional
active core will cause the frequency to drop. As we do not prohibit
any background services from executing and as the back-
ground load is not uniform, load fluctuations cause fre-
quent transmission errors. Nevertheless, the covert chan-
nel is still functional despite the significant amount of
background load.
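The arithmetic behind this limit can be sketched as follows. This is an illustration under stated assumptions, not part of the prototype: the background load is assumed to occupy whole cores, the receiver occupies one core, and the lowest turbo frequency applies once more than four cores are active, as described above for the 8-core test system. The function name `cores_until_lowest_turbo` is hypothetical.

```python
def cores_until_lowest_turbo(background_load, total_cores=8,
                             max_cores_above_lowest=4, receiver_cores=1):
    """Number of additional active cores that forces the lowest turbo frequency.

    background_load is a fraction of the whole CPU (0.375 means 37.5%).
    """
    # Cores that are already busy: background tasks plus the receiver.
    busy = round(background_load * total_cores) + receiver_cores
    # The lowest step is selected once busy cores exceed the limit.
    return max(0, max_cores_above_lowest - busy + 1)


for load in (0.0, 0.125, 0.25, 0.375):
    print(f"{load:.1%} background load: "
          f"{cores_until_lowest_turbo(load)} additional core(s)")
```

At 37.5% background load a single stray core, for example one woken by an interrupt, is enough to drop the frequency, which is consistent with the frequent transmission errors observed at that load level.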
Even cloud servers, where cloud providers try to
increase utilization by placing workloads from different
users on one system, are frequently operated at such
utilization levels. For example, Garraghan et al. [15] quote an
average utilization of 28.34% to 55.66% for the Google
cloud data center. In our case, the number of cores used by
4. Note that these values are for the 8-core system on which we
developed our prototype. The number of turbo frequencies is higher
for systems with larger core counts, so the prototype is able to work
in the presence of higher background utilization on such (likely more
representative) systems.
[Figure 5 plots: (a) Data rate in bit/s and (b) Number of retransmissions per packet, both over the transmission time per bit (6 ms to 30 ms), for an idle system and for 12.5%, 25%, and 37.5% background load.]
Figure 5. Data rate and average number of retransmissions required for a transmission of 80 bytes of data with different background loads. With
increasing background load, the number of transmission errors rises, which causes more retransmissions and therefore decreases the data rate. For
example, at 25% background load and 6 milliseconds transmission time per bit, the number of retransmissions per packet is 2.58, which means
that every packet is retransmitted about two and a half times on average before it is delivered successfully.
background tasks only needs to be lower than the number
of active cores of the second to last turbo frequency step,
minus the one core needed for the receiver. In addition,
the operator of a system under attack not only sees the
background load but also the load caused by the covert
channel itself. The sum of both is significantly higher than
the background load itself, thereby potentially deterring
the operator of the system from placing more load on it.
Overall, many server systems are therefore likely operated
at utilization levels where our covert channel is viable.
Furthermore, many server systems use server CPUs
with more cores than the Xeon Silver used in our ex-
periments. For example, the Xeon Gold 6130 changes its
turbo frequency when surpassing 75% load, yielding a
frequency change for each four additional active cores [1].
On such a system, this additional frequency level can
likely be used to transmit information at even higher
background load.
5.4. TurboCC Between VMs
Our covert channel neither needs any special instruc-
tion nor depends on any particular feature of the operating
system and therefore is not limited to communication
between processes within one operating system. As an
example of other setups, we set up two virtual machines
on one host system to show that the channel can also be
used to communicate between virtualized systems. Both
VMs use the KVM hypervisor and are allocated two
virtual cores each. These virtual cores are mapped to
different physical cores to prevent the VMs from running
on the same core and to provide a more realistic scenario.
The VMs are running the same stock Fedora 28 as the
host system. With this setup and the same transmission
time of 7ms as used above, our prototype only reaches
approx. 11 bit/s with approx. 3.2 retransmissions per sent
packet.
To identify possible causes for this increased number
of retransmissions, we looked at the number of interrupts
occurring during a transmission. Handling most interrupts
takes little time, so the OS only briefly activates cores or
briefly suspends processes. Our signal detection algorithm
is able to cope with such short disruptions. Therefore,
we looked at rescheduling interrupts in particular, because
these can cause a process to be suspended much longer
and will eventually cause synchronization errors. Table 3
compares the number of rescheduling interrupts per sec-
ond between two setups with and without virtualization.
As three of the four cores allocated to the VMs experience
approximately four interrupts per second, significantly
more than in the setup without virtualization, we expect
that the lower data rate between VMs is caused by these
rescheduling interrupts. At the selected bit rate of 7ms
per bit, a transmission of one packet takes 728ms. On
average, the transmitter and the receiver will therefore get
preempted roughly three times per packet during a trans-
mission between VMs. These interrupts will eventually
cause transmission errors and therefore retransmissions.
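The estimate of roughly three preemptions per packet follows from the packet duration and the interrupt rates in Table 3. The following is a minimal sketch of this arithmetic, assuming a packet length of 104 bits (inferred from 728ms at 7ms per bit) and a constant interrupt rate; the function names are hypothetical.

```python
BITS_PER_PACKET = 104  # inferred: 728 ms per packet / 7 ms per bit


def packet_duration_s(ms_per_bit, bits=BITS_PER_PACKET):
    """Raw transmission time of one packet in seconds."""
    return bits * ms_per_bit / 1000


def expected_interrupts(rate_per_s, ms_per_bit):
    """Mean number of rescheduling interrupts during one packet."""
    return rate_per_s * packet_duration_s(ms_per_bit)


# Rescheduling interrupt rates per second, taken from Table 3.
rates = {
    "receiver core, processes": 0.1,
    "core of the receiver's VM": 4.26,
    "core of the transmitter's VM": 4.43,
}
for name, rate in rates.items():
    print(f"{name}: {expected_interrupts(rate, 7):.2f} interrupts per packet")
```

Between plain processes the same calculation yields far less than one interrupt per packet, which matches the much lower retransmission count measured without virtualization.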
Note that core 0 has substantially higher reschedul-
ing interrupt rates than any other core. As none of our
prototype’s processes are running on this core, we think
that these interrupts are all caused by the operating system
managing the system. Although these interrupts do cause
additional noise for our covert channel, the rate does not
differ much between the two setups.
6. Countermeasures
As we have shown that an attacker can use our covert
channel to bypass access control, for some setups with
sensitive information it might make sense to implement
countermeasures to prevent such flow of information. In
the following, we discuss a number of potential counter-
measures and their respective drawbacks.
6.1. Restricting Frequency Management
The most obvious countermeasure against frequency-
based covert channels is to prevent load-based frequency
changes. In our case, interaction between the frequencies
of the different cores could be prevented by completely
disabling Turbo Boost. The drawback, however, will in-
variably be reduced performance. On a system with a
desktop CPU based on the Nehalem microarchitecture,
Charles et al. [8] measured 6% lower performance with-
out Turbo Boost. To show the impact on newer CPUs,
TABLE 3. NUMBER OF RESCHEDULING INTERRUPTS DURING A TRANSMISSION OF 100 BYTES BETWEEN PROCESSES AND BETWEEN VMS.
RESCHEDULING INTERRUPTS TRIGGER PREEMPTION AND CAN CAUSE SYNCHRONIZATION ERRORS. THE CORES OF THE RECEIVER’S VM
EXPERIENCE HIGHER INTERRUPT RATES, WHICH IS LIKELY WHY OUR PROTOTYPE ONLY ACHIEVES LOWER THROUGHPUT.

Interrupts per core [interrupts/s]
Core              0        1        2        3      4        5        6      7
processes         84.33    0.1 a    0.03 c   0.13   0        0.03 b   0      0
virtual machines  131.45   4.26 a   1.02 a   0      4.43 d   4.10 d   0.05   0
a Receiver or receiver’s VM   b Transmitter   c Additional active core   d Transmitter’s VM + active core
we executed the Parsec benchmark suite [6] on a Xeon
Silver 4108 server CPU with Turbo Boost enabled as
well as disabled and found an average performance re-
duction of 23% (between 15% and 50% depending on the
benchmark). Note that these numbers include the single-
threaded initialization phase of the benchmarks. Although
disabling Turbo Boost has the potential to improve energy
efficiency [8], the experiments show that disabling Turbo
Boost comes at a significant cost to performance.
As an alternative, it is possible to force the CPU to
limit its turbo frequency to the lowest turbo frequency.
Our tests have shown that by preventing CPU sleep states
higher than C2 the operating system can keep enough
cores active in order to prevent higher turbo frequencies.
As the all-core turbo frequency is higher than the base
frequency of the system, the resulting performance reduc-
tion is lower compared to disabling Turbo Boost. For the
Parsec benchmark suite, we measured an average overhead
of only 5%, which we expect given the parallel nature of
the benchmarks. Single-threaded performance, however,
is expected to experience larger performance reduction.
Also, disabling deeper sleep states greatly increases power
consumption of idle and partially loaded systems. In our
case, the idle power consumption of the CPU increased by
almost 60% when the C-states of all cores were restricted.
6.2. Adding Artificial Noise
A countermeasure with potentially lower performance
impact would be to randomly add more artificial back-
ground load to increase the transmission error rate to the
point where no useful data transfer is possible anymore.
The operating system could randomly wake additional
cores up to mask the effect of the transmitter. To prevent
the attack, the number of additionally active cores needs
to be at least as large as the number of cores controlled by
the transmitter as described in Section 4.1. In practice, the
transmitter core count is likely limited if the transmitter
is within a virtual machine with a limited number of
virtual CPUs. While we still expect measurable energy
and performance overhead due to additional cores being
active, the approach has less overhead compared to the
alternative countermeasures listed above if the transmitter
can only control a limited number of cores.
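The effect of such noise can be illustrated with a toy model. This is an assumption-laden sketch, not a measurement: it assumes that a noise burst always wakes enough cores to lower the turbo frequency, that bursts hit each bit period independently with probability `p_noise`, and that a burst turns a transmitted 0 (high frequency) into a received 1 (low frequency) while leaving 1 bits intact. The 104-bit packet length is inferred from the packet time in Section 5.4, and the even split between 0 and 1 bits is arbitrary.

```python
def packet_success_probability(p_noise, zero_bits):
    """Probability that no 0 bit of a packet is masked by a noise burst."""
    return (1.0 - p_noise) ** zero_bits


def expected_sends(p_noise, zero_bits):
    """Mean number of transmissions until one packet passes the checksum."""
    success = packet_success_probability(p_noise, zero_bits)
    return float("inf") if success == 0.0 else 1.0 / success


ZERO_BITS = 52  # assumption: half of a 104-bit packet consists of 0 bits

for p in (0.0, 0.01, 0.05, 0.10):
    print(f"noise probability {p:.2f}: "
          f"{expected_sends(p, ZERO_BITS):.1f} sends per packet")
```

In this model even a few percent of noisy bit periods multiplies the number of transmissions per packet, which is why random wake-ups can make the channel impractical without keeping additional cores active permanently.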
6.3. Detecting Frequency Covert Channels
As the methods described above all have drawbacks in
terms of performance or energy consumption, it would be
beneficial to limit their usage to when a covert channel is
detected. For that reason, mechanisms to detect this kind
of covert channel are desired. Wu et al. [40] describe a
framework for detecting covert channels that is contin-
uously logging, for example, the system load and then
tries to detect abnormal behavior. Such a technique could
be applied to our covert channel as well. As frequency
changes occur on the order of milliseconds, a sufficiently
high logging rate of the current turbo frequency is required
to detect this covert channel.
7. AMD Precision Boost
The covert channel presented in this work is based on
Intel Turbo Boost 2.0. Recent AMD processors also have
a similar mechanism named AMD Precision Boost [12].
Similar to Turbo Boost, Precision Boost increases the fre-
quency of the CPU above its base frequency, but in a dif-
ferent way. The first version of Precision Boost was doc-
umented to increase the CPU frequency to the maximum
turbo frequency when two cores or less were utilized. If
more than two cores are active, the turbo frequency is
set to a far lower turbo frequency. Precision Boost 2, in
addition, increased the frequency for more than two active
cores even further, depending on the number of active
cores and as long as the thermal headroom allows it [11].
Due to the similar behavior, a similar covert channel can
be implemented the same way on AMD CPUs as on Intel
ones. However, our experiments showed that the reaction
to load changes is very slow. Depending on the number
of active cores, AMD Zen+ cores implementing Precision
Boost 2 can take up to almost 400ms before a core returns
to the maximum frequency after other cores go to sleep.
As expected from these delays, our experiments show that
our covert channel is only able to reach about 1.08 bit/s
on an idle system with an AMD Ryzen 2700X CPU. As
increased CPU load further reduces the throughput, we do
not deem the covert channel to be viable on this specific
CPU. Note that cores from the newer Zen 2 architecture
show faster reaction to load changes. More work has to
be conducted to determine whether the covert channel is
able to provide satisfactory throughput on these CPUs.
8. Discussion
Our evaluation has shown that the covert channel can
be used to transmit information between two cores, and
the throughput is in the same order of magnitude as for
other approaches under similar conditions [2], [21], [28].
Although a throughput of 61 bit/s might seem low, an
attacker is often only interested in small cryptographic
keys and has sufficient time at their disposal to leak the
key even through such a slow communication channel.
As described in the small message theorem [29] even
a few bits of data stolen over a long period can be
enough to be harmful for a system if it is dependent on
keeping them private. The covert channel, like most other
covert channels, has prerequisites regarding the underlying
system and is therefore only usable in certain scenarios.
In our case, the covert channel fails if the load on the
system by other processes is too high, if Turbo Boost is
not enabled, or if the operating system does not place idle
cores in deep sleep states to allow frequencies higher than
the all-core turbo frequency.
The first, system utilization, is where our covert chan-
nel is significantly different from other frequency-based
covert channels. We demonstrate that our covert channel
is still able to achieve usable throughput at up to 37.5%
of background load. As previously described, the average
CPU utilization of modern data centers is low enough that
our covert channel is often still able to transmit data [15].
37.5% background load is the maximum possible for the
covert channel on our system, as in this scenario there
are four cores active including that of the receiver. As the
lowest all-core turbo frequency is chosen at five active
cores and above, any additional core will cause the final
frequency drop.
Our covert channel is able to cope with higher
background noise than other comparable frequency-based
covert channels. For example, Miedl et al. [28] as-
sume a completely idle system with only short bursts of
background load. Similarly, the prototype described by
Khatamifard et al. [21] fails on Intel systems even if only
one additional core is held active by other processes.
The other requirements, that Turbo Boost is enabled
and that idle cores are placed in deep sleep states by the
operating system, are likely fulfilled on most systems.
Disabling deep sleep states restricts the system to the
all-core boost frequency and has significant impact on
power consumption, while disabling Turbo Boost restricts
the system to its base frequency, causing an even higher
performance impact as described above.
9. Related Work
While our covert channel exploits the power manage-
ment of current Intel CPUs to provide a particularly robust
covert channel in the face of background load caused by
other processes, it is far from the first covert channel
based on CPU load. In this section, we review existing
load-based covert channels and describe similarities and
differences.
9.1. Load-Based Covert Channels
The covert channel described in this paper is an
example of a class of covert channels which transmit
information by having the receiver observe the CPU load
generated by the transmitter. In contrast to our covert
channel, most of these covert channels are easily mitigated
without a negative impact on performance.
A basic example for such a covert channel is the orig-
inal CPU load monitoring covert channel by Cioranesco
et al. [9] where the receiver directly observes CPU load
via the sysstat sar command. Access control by the
operating system can be used to deny access to such
commands which trivially mitigates the covert channel.
Another example is the thermal covert channel described
by Masti et al. [26]. This channel uses the effect that
CPU load increases the temperature not only of the loaded
core but also of neighbor cores even if they are otherwise
isolated by software. If processes have user-space access
to a temperature sensor of their current core, the heat
propagation can be used to construct an inter-core covert
channel and even a limited side channel. However, as with
other CPU load monitoring mechanisms provided by the
OS, the operating system can deny user-space access to
the temperature sensor.
9.2. Frequency-Based Covert Channels
Another metric influenced by the load on the system
is the CPU frequency. On modern systems, the frequency
is affected by CPU load in two ways: First, the operating
system – or, in some cases, the CPU itself – lowers the
frequency when load is low to save energy. Second, the
CPU increases the frequency above its nominal frequency
when power headroom is available, as for example imple-
mented by Intel Turbo Boost.
A covert channel exploiting the former has been de-
scribed by Alagappan et al. [2]. Their covert channel
either manually lets the application set the frequency
if the operating system allows – which can be trivially
denied through access control – or varies CPU load to
which the Linux frequency governor reacts by reducing
the CPU frequency when possible. At the receiver’s side,
the CPU frequency is either determined through the sysfs
file system – which can be trivially denied – or through
a timing loop similar to our covert channel. On recent
systems, however, the latter does not provide a cross-core
covert channel anymore. As individual cores can have
different frequencies when operating below the system’s
turbo frequency, measurement on one core does not reflect
another core’s frequency.
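A timing loop of the kind mentioned above measures how long a fixed amount of work takes: the lower the core frequency, the longer the loop runs. The following sketch only illustrates the principle. It is not the receiver of any of the cited prototypes, interpreter overhead makes Python far too noisy for a real channel, and the iteration count and the midpoint threshold are arbitrary assumptions.

```python
import time


def time_fixed_work(iterations=200_000):
    """Return the wall-clock time in ns needed for a fixed busy loop."""
    start = time.perf_counter_ns()
    total = 0
    for i in range(iterations):
        total += i
    return time.perf_counter_ns() - start


def midpoint_threshold(samples_ns):
    """Calibrate a threshold halfway between fastest and slowest sample."""
    return (min(samples_ns) + max(samples_ns)) / 2


def frequency_level(duration_ns, threshold_ns):
    """A longer loop duration indicates a lower clock frequency."""
    return "low" if duration_ns > threshold_ns else "high"


if __name__ == "__main__":
    samples = [time_fixed_work() for _ in range(20)]
    threshold = midpoint_threshold(samples)
    print([frequency_level(s, threshold) for s in samples])
```

Because the measurement runs entirely in user space and needs only a clock, it cannot be blocked by denying access to sysfs; it only reflects the frequency of the core on which the loop executes.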
Other approaches use the same mechanisms. For ex-
ample, the approach described by Miedl et al. [28] uses
varying CPU load to send information and uses a timing
loop to directly measure CPU frequency, whereas the
approach described by Benhani and Bossuet [4] directly
manipulates the CPU clock to achieve significantly higher
bit rates. These approaches therefore suffer from the
same limitations. As our covert channel does not rely
on specific operating system frequency management and
provides cross-core communication in many cases, the
approach complements the existing approaches by providing
a covert channel in situations where none of them
is viable.
The effect exploited by our covert channel is that
modern CPUs implement boost frequencies to provide
increased performance when possible. As modern systems
try to utilize available power headroom to maximize per-
formance, increasing load on one core means that the
system cannot sustain its previous boost frequency and
that other cores might have to reduce their frequency.
Khatamifard et al. [21] show how this effect can be
used to construct a range of covert channels that cannot
be prevented by the operating system without significant
performance impact. They also provide a formal model
for the achievable throughput and describe a prototype
for Intel and ARM systems. On the Intel system, their
prototype achieves a peak channel capacity of 121.6bit/s,
which is similar to our covert channel if we remove error
detection and correction from our prototype. As described
in the introduction, their covert channel is not particularly
optimized for a certain system architecture, though. Their
prototype tries to maximize power consumption when
activating transmitter cores, which is not necessary on
systems such as the one used in our evaluation and which
might make detection of the attack easier. Also, their
prototype fails to transmit information on a sufficiently
cooled system if even just a single additional core is
held active. While assuming an idle system is common
in literature, this situation is highly unlikely in many
practical environments, with modern cloud environments
achieving up to 56% CPU utilization on average [15]. We
describe an optimized covert channel for Intel Turbo Boost
and show how the adaptation to the system allows for a
significantly higher tolerance of the CPU load caused by
other processes.
9.3. Other Covert Channels
Besides the load-based covert channels described
above, a plethora of other mechanisms have been used to
transmit information. For example, the CPU caches [30],
[41], the DRAM row buffers [31], and CPU functional
units [37] are shared resources that can enable covert
channels with significantly higher throughput than the
covert channel described in this work. Other covert chan-
nels often also can be prevented by other countermea-
sures, though – in these cases, cache partitioning, memory
channel isolation, and disabling of hyper-threading can
be used to prevent resource sharing. Our covert channel
or the channels described above do not supersede, but
rather complement other covert channels. Depending on
the targeted system and the countermeasures deployed on
the system, different covert channels can be used.
9.4. Error Correction
In Section 4.4, we described error correction methods
to increase the usable (i.e., error-free) throughput of our
covert channel. The choice of error correction mechanisms
depends on the characteristics of the channel. For cache-
based covert channels, Maurice et al. [27] provide a
detailed characterization. They describe the types of errors
and show that with the right error correction mechanisms
reliable communication with high throughput is possible.
On the physical layer a Berger code is used as an error
detection code, combined with retransmissions of broken
packets and a Reed-Solomon code for error correction on
the data link layer.
Although the underlying mechanisms of the channel
are different, our approach has to cope with similar types
of errors, except that in our case the parallel execution of
other programs on other cores can cause noise to affect
many bits in series. Our approach does not use error-
correcting codes as estimates showed that Reed-Solomon
codes would reduce performance. Instead, our approach
only relies on CRC-16 checksums for data integrity and
on a retransmission scheme for robust transmission.
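Error detection with retransmission can be sketched as follows. The text states only that CRC-16 is used; the polynomial variant (CRC-CCITT as computed by Python's `binascii.crc_hqx`, here with the initial value 0xFFFF), the packet layout with a trailing big-endian checksum, and all helper names are assumptions for illustration. Acknowledgement packets and timing are omitted, and the channel is simulated.

```python
import binascii


def add_crc(payload: bytes) -> bytes:
    """Append a 16-bit checksum to the payload."""
    crc = binascii.crc_hqx(payload, 0xFFFF)
    return payload + crc.to_bytes(2, "big")


def check_crc(packet: bytes):
    """Return the payload if the checksum matches, otherwise None."""
    payload, received = packet[:-2], int.from_bytes(packet[-2:], "big")
    if binascii.crc_hqx(payload, 0xFFFF) == received:
        return payload
    return None


def send_reliably(payload, channel, max_attempts=10):
    """Resend until the receiver side verifies the checksum."""
    for attempt in range(1, max_attempts + 1):
        received = check_crc(channel(add_crc(payload)))
        if received is not None:
            return received, attempt
    raise RuntimeError("packet could not be delivered")


def make_flaky_channel(corrupt_first_n):
    """Simulated channel that flips one bit in the first n packets."""
    state = {"sent": 0}

    def channel(packet: bytes) -> bytes:
        state["sent"] += 1
        if state["sent"] <= corrupt_first_n:
            return bytes([packet[0] ^ 0x01]) + packet[1:]
        return packet

    return channel


data, attempts = send_reliably(b"secret key", make_flaky_channel(1))
print(data, attempts)
```

In this simulation the first packet is corrupted and rejected, so the payload is delivered on the second attempt. No parity bytes are sent on error-free transmissions, which is the property that makes this scheme cheaper than Reed-Solomon at the measured error rates.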
9.5. Countermeasures
In Section 6, we describe countermeasures against our
covert channel. While the countermeasures are in most
cases specific to our approach in particular or at least to
frequency-based approaches in general, there are parallels
to countermeasures against other types of covert channels.
For example, Brumley [7] and Schmidt et al. [32]
describe pollution of cache lines as a countermeasure
against cache-based covert channels. We show that CPU
load noise has a significant impact on the throughput of
our covert channel and is therefore usable as a counter-
measure. In our case, artificial noisy CPU load is effective
because it randomly reduces the frequency of the system.
However, for that reason it potentially has a significant
performance impact.
In the past, reducing the resolution of the available
time sources has been suggested as a countermeasure
against timing-based covert channels as well as side chan-
nels [25]. However, even in highly restricted environments
such as web browsers, where the timer resolution is arti-
ficially lowered, it is possible to achieve high temporal
accuracy either by exploiting other unorthodox timing
sources or by recovering precise timing information from
the intentionally imprecise timer [34].
Countermeasures against cache-based covert channels
include static partitioning of shared caches [38] or the use
of randomized prefetchers in order to disguise memory
accesses [14]. The frequency covert channel equivalent of
these techniques would be a power management mecha-
nism that limits selection of turbo frequencies to subsets
of the total cores and randomizes frequency selection. No
current hardware, however, implements such mechanisms.
10. Conclusion
Covert channels allow attackers to transfer information
even if the access policy of the system prohibits the flow
of information. Previous work has identified changes to
the CPU frequency as a method to transfer information
between different programs either on the same core or on
different cores. However, existing approaches are either
easily thwarted by the operating system at little to no cost
or fail when the system experiences significant CPU load.
In this paper, we describe a covert channel specifically
optimized to exploit Intel Turbo Boost. Our transmitter
varies the number of active cores, and our receiver detects
the resulting changes to the maximum boost frequency. By
using multiple processes in the transmitter, we are able to
increase the load swing so that the resulting covert channel
can tolerate significant CPU load from other processes.
Our prototype achieves a throughput
of 61 bit/s between two processes on an idle system
and 43 bit/s with 25% utilization. Even on a system
with 37.5% background load, our prototype is still able
to successfully transmit 12 bit/s of data. We show that the
available countermeasures against the covert channel have
a significant impact on performance.
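The principle summarized above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not our prototype: the turbo table `TURBO_MHZ` contains hypothetical values, the known 0/1 preamble used to choose a threshold is an assumption made for this example, and the names `max_frequency`, `transmit`, and `decode` are illustrative only.

```python
# Hypothetical turbo table: maximum frequency in MHz by number of active cores.
TURBO_MHZ = {1: 3000, 2: 3000, 3: 2700, 4: 2700,
             5: 2500, 6: 2500, 7: 2100, 8: 2100}


def max_frequency(active_cores):
    """Maximum boost frequency for a given number of active cores."""
    return TURBO_MHZ[min(max(active_cores, 1), 8)]


def transmit(bits, sender_cores, background_cores=0):
    """Return the frequency the receiver observes in each bit interval.

    The receiver's own core is always active. The sender loads
    `sender_cores` additional cores to send a 1 and stays idle to send a 0.
    """
    samples = []
    for bit in bits:
        active = 1 + background_cores + (sender_cores if bit else 0)
        samples.append(max_frequency(active))
    return samples


def decode(samples):
    """Decode samples that start with a known 0, 1 preamble (assumption)."""
    threshold = (samples[0] + samples[1]) / 2
    return [1 if freq < threshold else 0 for freq in samples[2:]]


message = [1, 0, 1, 1, 0]
preamble = [0, 1]

# Four sender cores cross several turbo bins even with two busy background cores.
print(decode(transmit(preamble + message, sender_cores=4, background_cores=2)))

# One sender core stays within the same turbo bin, so the message is lost.
print(decode(transmit(preamble + message, sender_cores=1)))
```

With the hypothetical table, a single additional core does not change the maximum frequency, whereas four additional cores move the system into a lower turbo bin. This mirrors why a larger load swing makes the channel more robust against background load.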
References
[1] Intel Xeon Processor Scalable Family – Specification Update. Intel
Corporation, February 2018.
[2] M. Alagappan, J. Rajendran, M. Doroslovački, and G. Venkataramani.
DFS covert channels on multi-core platforms. In 2017
IFIP/IEEE International Conference on Very Large Scale Integra-
tion (VLSI-SoC), pages 1–6, October 2017.
[3] Alejandro Cabrera Aldaya, Billy Bob Brumley, Sohaib ul Hassan,
Cesar Pereida García, and Nicola Tuveri. Port contention for fun
and profit. In 2019 IEEE Symposium on Security and Privacy (SP),
pages 870–887. IEEE, 2019.
[4] El Mehdi Benhani and Lilian Bossuet. Dvfs as a security
failure of trustzone-enabled heterogeneous soc. arXiv preprint
arXiv:1902.08517, 2019.
[5] Malini K Bhandaru and Eric J DeHaemer. Providing energy
efficient turbo operation of a processor, May 31 2016. US Patent
9,354,689.
[6] Christian Bienia. Benchmarking Modern Multiprocessors. PhD
thesis, Princeton University, January 2011.
[7] Billy Bob Brumley et al. Covert timing channels, caching, and
cryptography. Aalto University, 2011.
[8] J. Charles, P. Jassi, N. S. Ananth, A. Sadat, and A. Fedorova. Eval-
uation of the Intel® Core™ i7 Turbo Boost feature. In 2009 IEEE
International Symposium on Workload Characterization (IISWC),
pages 188–197, October 2009.
[9] J. Cioranesco, H. Ferradi, and D. Naccache. Communicat-
ing Covertly through CPU Monitoring. IEEE Security Privacy,
11(6):71–73, November 2013.
[10] Jonathan Corbet. Many uses for Core scheduling [LWN.net],
September 2019.
[11] Ian Cutress. The amd 2nd gen ryzen deep dive: The 2700x, 2700,
2600x, and 2600 tested.
[12] Sima Dezs. AMDs Zen-based processor lines (Family 17h), 2019.
[13] J. Doweck, W. Kao, A. K. Lu, J. Mandelblat, A. Rahatekar, L. Rap-
poport, E. Rotem, A. Yasin, and A. Yoaz. Inside 6th-Generation
Intel Core: New Microarchitecture Code-Named Skylake. IEEE
Micro, 37(2):52–62, March 2017.
[14] Adi Fuchs and Ruby B Lee. Disruptive prefetching: impact on
side-channel attacks and cache designs. In Proceedings of the
8th ACM International Systems and Storage Conference, page 14.
ACM, 2015.
[15] P. Garraghan, P. Townend, and J. Xu. An Analysis of the Server
Characteristics and Resource Utilization in Google Cloud. In 2013
IEEE International Conference on Cloud Engineering (IC2E),
pages 124–131, March 2013.
[16] Berk Gülmezoğlu, Mehmet Sinan İnci, Gorka Irazoqui, Thomas
Eisenbarth, and Berk Sunar. A Faster and More Realistic
Flush+Reload Attack on AES. In Stefan Mangard and Axel Y.
Poschmann, editors, Constructive Side-Channel Analysis and Se-
cure Design, Lecture Notes in Computer Science, pages 111–126.
Springer International Publishing, 2015.
[17] Akhila Gundu, Gita Sreekumar, Ali Shafiee, Seth Pugsley, Hardik
Jain, Rajeev Balasubramonian, and Mohit Tiwari. Memory band-
width reservation in the cloud to avoid information leakage in
the memory controller. In Proceedings of the Third Workshop
on Hardware and Architectural Support for Security and Privacy,
page 11. ACM, 2014.
[18] Daniel Hackenberg, Robert Schöne, Thomas Ilsche, Daniel Molka,
Joseph Schuchart, and Robin Geyer. An energy efficiency feature
survey of the intel haswell processor. In 2015 IEEE international
parallel and distributed processing symposium workshop, pages
896–904. IEEE, 2015.
[19] Intel. Intel Turbo Boost Technology in Intel Core Microarchitecture
(Nehalem) Based Processors. November 2008.
[20] Intel. Intel® 64 and IA-32 Architectures Software Developer’s
Manual, Combined Volumes: 1, 2a, 2b, 2c, 2d, 3a, 3b, 3c, 3d and
4, May 2018.
[21] S. Karen Khatamifard, Longfei Wang, Amitabh Das, Selcuk Kose,
and Ulya R. Karpuzcu. POWERT Channels: A Novel Class of
Covert Communication Exploiting Power Management Vulnerabil-
ities. In 2019 IEEE International Symposium on High Performance
Computer Architecture (HPCA), pages 291–303, February 2019.
ISSN: 2378-203X, 1530-0897.
[22] David Kohlbrenner and Hovav Shacham. Trusted browsers for
uncertain times. 25th USENIX Security Symposium, 2016.
[23] Butler W. Lampson. A Note on the Confinement Problem. Com-
mun. ACM, 16(10):613–615, October 1973.
[24] Andrew Marshall, Michael Howard, Grant Bugher, and Brian
Harden. Security best practices for developing windows azure
applications. Microsoft Corp, 2010.
[25] Robert Martin, John Demme, and Simha Sethumadhavan. Time-
warp: rethinking timekeeping and performance monitoring mecha-
nisms to mitigate side-channel attacks. ACM SIGARCH Computer
Architecture News, 40(3):118–129, 2012.
[26] Ramya Jayaram Masti, Devendra Rai, Aanjhan Ranganathan,
Christian Muller, Lothar Thiele, and Srdjan Capkun. Thermal
Covert Channels on Multi-core Platforms. page 17.
[27] Clementine Maurice, Manuel Weber, Michael Schwarz, Lukas
Giner, Daniel Gruss, Carlo Alberto Boano, Stefan Mangard, and Kay
Roemer. Hello from the Other Side: SSH
over Robust Cache Covert Channels in the Cloud. In Proceedings
2017 Network and Distributed System Security Symposium, San
Diego, CA, 2017. Internet Society.
[28] Philipp Miedl, Xiaoxi He, Matthias Meyer, Davide Basilio Bar-
tolini, and Lothar Thiele. Frequency Scaling as a Security Threat
on Multicore Systems. IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, pages 1–1, 2018.
[29] I. S. Moskowitz and M. H. Kang. Covert channels-here to stay? In
Proceedings of COMPASS’94 - 1994 IEEE 9th Annual Conference
on Computer Assurance, pages 235–243, June 1994.
[30] Colin Percival. Cache missing for fun and profit, 2005.
[31] Peter Pessl, Daniel Gruss, Clementine Maurice, Michael Schwarz,
and Stefan Mangard. DRAMA: Exploiting DRAM Addressing for
Cross-CPU Attacks. 2016.
[32] Wolfgang Schmidt, Michael Hanspach, and Jörg Keller. A case
study on covert channel establishment via software caches in high-
assurance computing systems. arXiv preprint arXiv:1508.05228,
2015.
[33] Michael Schwarz, Moritz Lipp, Daniel Moghimi, Jo Van Bulck,
Julian Stecklina, Thomas Prescher, and Daniel Gruss. Zom-
bieload: Cross-privilege-boundary data sampling. arXiv preprint
arXiv:1905.05726, 2019.
[34] Michael Schwarz, Clémentine Maurice, Daniel Gruss, and Stefan
Mangard. Fantastic timers and where to find them: high-resolution
microarchitectural attacks in javascript. In International Confer-
ence on Financial Cryptography and Data Security, pages 247–
267. Springer, 2017.
[35] Servermeile. Intel® Turbo Boost Technology | Servermeile Tech-
net. http://technet.servermeile.com/intel-turbo-boost-technology/.
[36] Jons-Tobias Wamhoff, Stephan Diestelhorst, Christof Fetzer,
Patrick Marlier, Pascal Felber, and Dave Dice. The TURBO
Diaries: Application-controlled Frequency Scaling Explained.
page 13, 2014.
[37] Z. Wang and R. B. Lee. Covert and Side Channels Due to Processor
Architecture. In 2006 22nd Annual Computer Security Applications
Conference (ACSAC’06), pages 473–482, December 2006.
[38] Zhenghong Wang and Ruby B Lee. New cache designs for thwart-
ing software cache-based side channel attacks. ACM SIGARCH
Computer Architecture News, 35(2):494–505, 2007.
[39] WikiChip. Xeon Silver 4108 - Intel - WikiChip.
https://en.wikichip.org/wiki/intel/xeon_silver/4108, April 2019.
[40] Jingzheng Wu, Liping Ding, Yanjun Wu, Nasro Min-Allah,
Samee U. Khan, and Yongji Wang. C2detector: a covert channel
detection framework in cloud computing. Security and Communi-
cation Networks, 7(3):544–557, March 2014.
[41] Zhenyu Wu, Zhang Xu, and Haining Wang. Whispers in the hyper-
space: high-bandwidth and reliable covert channel attacks inside
the cloud. IEEE/ACM Transactions on Networking, 23(2):603–
615, 2014.
