Abstract-Today's Cyber-Physical Systems (CPS) are growing in complexity, both in the number of components and in computational power, in order to meet the requirements of today's applications. For this reason, and due to their energy efficiency, multiprocessor system-on-chips (MPSoCs) are becoming ubiquitous. Since these systems are often used in battery-driven and/or small-housing use cases, the correct configuration of their power management techniques is an important factor in system design and can affect the safety of the system. This calls for an adequate test framework that is able to observe the power management functionality of a CPS even before the actual hardware platform is available. Virtual platforms for functional validation, which allow the binary code built for the real target platform to be executed on a generic host computer, are currently being adopted by industry. This work focuses on enhancing industrial OVP virtual platforms with a functional test framework for power management techniques. We demonstrate and evaluate how this framework observes the power management techniques of the system under test. The evaluation uses a Xilinx ZC702 board based on a Xilinx Zynq-7000 MPSoC and its corresponding virtual platform in OVP. Results show that the functional test framework is able to analyze the different modes of operation of the power management techniques of the Xilinx Zynq processing system.
I. INTRODUCTION
The complexity of Cyber-Physical Systems (CPS) is growing with the high computational demand of real-time applications in automotive, avionics and multimedia. More powerful and more extensive components are added to the CPS to meet the defined requirements. One example of this progress is the evolution of the Xilinx Zynq series from the older Zynq-7000 [1] to the newer Zynq UltraScale+ [2]. Both are so-called multiprocessor system-on-chips (MPSoCs) and are able to execute several applications in parallel. Furthermore, they offer various power management features, often configurable per component.
Several projects funded by the European Union, e.g. CONTREX [3] or SafePower [4], analyze the execution of parallel applications with different criticality levels on such devices, with the goal of reducing their overall power consumption. Since these systems are often battery-driven and/or have constraints regarding size, weight and temperature, the configuration of their power management features plays a major role in system design, both to save energy and to reduce the system's temperature. Most power management techniques are directly linked to the system's performance, e.g. dynamic voltage and frequency scaling (DVFS) or power gating of unused components. Especially when safety-critical applications are executed next to non-safety-critical applications, the power management configuration becomes safety-critical as well.
This calls for an adequate method to analyze the correct configuration of the power management features quickly and easily. The use of physical boards has some major disadvantages: they can be expensive, are often not available in early phases of system design, and are difficult to debug due to the lack of internal observability and controllability. Another possibility is to use simulators at transistor or gate level. Their accuracy comes at the cost of simulation speed, which makes them slow for large systems. Virtual platforms are able to simulate the software of a CPS at higher abstraction levels. These simulators are less accurate but faster, especially when simulating at the functional level. One well-known simulator of this kind is Open Virtual Platform's (OVP) OVPsim [5]. Its virtual platforms can reach execution speeds of several million instructions per second (MIPS), at the cost of limited accuracy, especially with respect to timing. OVP is also able to execute the application code of the CPS's real target platform on a generic host computer and enables full system introspection at any time. This has the benefit that the debugged application code can easily be deployed on the target platform, which has made OVP attractive to industry. This paper contributes a functional test framework for the OVPsim simulator that enables the software developer to observe and analyze the power management configuration of a system at early design stages. The functional test framework is evaluated on two real industrial use cases to demonstrate its strengths: a Linux operating system demonstration with activated kernel power management features, and an XtratuM bare-metal hypervisor [6] demonstration, also with integrated power management features. Both demonstrations use a Xilinx ZC702 MPSoC board and its corresponding virtual platform in OVP.
The rest of this paper is structured as follows. Section II discusses related work mainly addressing power and energy models and how they deal with power management techniques. Section III gives an overview of the functional test framework and its internal structure. Section IV adds implementation details, especially how the framework is attached to the simulator and how values are collected and evaluated. Section V describes the setup of our evaluation use cases. Section VI covers the mentioned evaluation and finally Section VII concludes this paper and gives some details about future work.
II. RELATED WORK
To the best of our knowledge, no other scientific work has dealt with functional testing of power management techniques at simulation run-time. In closely related fields, most research efforts have focused on developing simulation models for estimating the power and energy consumption of embedded systems at different abstraction levels.
The work in [7] focuses on modeling MPSoC platforms at the electronic system level in SystemC. The simulation is given a platform configuration that consists of the different system states with their characteristic values, e.g. transition timings or power consumption. The system model executes non-functional code and steps through the corresponding system states to compute the overall power consumption. Due to its high abstraction level, this approach provides only relative figures for comparison, which makes it primarily suitable for design space exploration. The authors of [8] and [9] present trace-based power and energy estimation technologies. Both approaches use instruction-set simulators to simulate the behavior of the target architecture when executing the application. In this way they generate activity traces of the system components. Energy models then compute the energy consumption of each component in a post-processing step from the generated activity traces. Therefore, these approaches do not allow power consumption to be analyzed at run-time. In [10], SoCLib [11] is used to simulate system-on-chip platforms at the cycle-accurate bit-accurate (CABA) level. CABA level means that the simulator is binary compatible with the physical target architecture. This allows the execution of real target binary applications, as OVP [5] does. The simulator outputs activity counter values of each component to feed a cycle-by-cycle consumption simulator that computes the energy consumption. SoCLib is an academic project and does not provide the processor models needed for our work. Thus it is not an alternative to OVP for us.
The following approaches regarding power and energy estimation use OVP to simulate their platforms and to generate inputs for their models. We will take a closer look at these approaches in the following.
In [12] a methodology is described for estimating the power of black-box processor models using OVP-based virtual platforms. The evaluation uses a Texas Instruments OMAP4460 chip, which contains an ARM Cortex-A9 dual-core processor. The same processor is also integrated in the Xilinx Zynq chip used in our work. Since OVP does not provide any cache models, the authors built their own cache model. They also used the SystemC Transaction Level Modeling (TLM) 2.0 capabilities of OVP to connect the individual components of the platform and to observe the communication between them. Since all components are black boxes, the power model is constructed from measurements on the physical system and the values at the observed communication points. Of particular interest to our future work are the timing annotations made in [12], which can be used to reduce the timing error between the physical system and the virtual platform and thus allow an accurate power estimation. OVP assumes that an instruction is executed in one cycle (as also discussed in our previous publication [13]). Because the ARM Cortex-A9 is a superscalar out-of-order processor, it is able to execute multiple instructions at the same time. The results show that one instruction needs on average 368 ps less than the 833 ps (1.2 GHz) cycle time. This means that the real processor executes on average 1.79 instructions per cycle. Further corrections add time for special cases, e.g. multiplications or reads from caches and from external memories.
The authors of [14] present a fast energy evaluation of embedded applications for many-core systems. They use OVP to feed an instruction-driven energy model. Their approach is divided into two parts: the characterization flow of the energy model, and the watchdog module that collects run-time information during simulation. The second part is of interest for our work. Since their approach is instruction-driven, the watchdog captures and disassembles every instruction fetch. For this, the watchdog module uses APIs provided by OVP, namely the ICM API, which was replaced by the OP API at the end of 2016. This API is accessible from the platform description. Informally, this means that the watchdog module is a module of the virtual platform itself. In our implementation we use the VMI RT API (described in Section IV), located in the intercept library, which is closer to the simulation kernel of OVP and thereby speeds up the simulation. The disassembly result is the input to a hash table that returns the estimated energy values for defined instruction classes. The simulation speed remains between 1.6 and 1.8 MIPS.
In [15] Rethinagiri et al. describe a tool for power and energy estimation at system level. As in [14], the tool consists of two parts: the power model development, and the system-level simulator with the power estimator kernel. The goal is to define a generic power modeling approach. The work describes the power modeling methodology in detail, as well as some power models for different processors. In this work, too, OVPsim is one of the virtual platform simulators used, next to SoCLib. As in the other publications mentioned, the simulator generates activity data to feed the power model library at simulation run-time. Implementation details about the connection from the simulator to the power model, and about the power model internals, are not given. However, this publication deals with two power optimization techniques: dynamic slack reclamation (DSR) and workload balancing. The first exploits free processor slack time, which arises when tasks complete early, by extending the execution time of the current task; the processor can therefore decrease its frequency for that task. This calls for the power management technique DVFS to adjust the processor's voltage and frequency. Both parameters are observed by our functional test framework, but in [15] they are only configured by the application executed on the virtual platform and can only be observed indirectly through the power results. In that case, if the accuracy of the power model is not high enough, or the difference in power consumption between two subsequent modes is too small, the change is not traceable.
As mentioned earlier, none of the above research efforts deals with the direct observation of power management techniques of MPSoCs as we do. We have decided to implement our functional test framework for power management techniques in OVP for the following reasons: (1) OVP is commercially proven, provides a rich set of functional processor models and is one of the fastest simulators in its class; (2) as with all dynamic binary translation approaches, different target compilers can be used with all possible optimizations; (3) the target processor debugging tools can be used directly. Since the approaches presented in [12], [14], [15] also use OVP, our functional test framework can be attached to them directly. The other functional-simulation-based approaches in [8], [9], [10] can also benefit from our work, with some effort to collect the needed values in their respective simulators.
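The instructions-per-cycle figure quoted from [12] can be retraced from the two timing values given above. The following short derivation uses only those values; the symbol names are introduced here for illustration.

```latex
t_{\text{cycle}} = \frac{1}{1.2\,\text{GHz}} \approx 833\,\text{ps}, \qquad
t_{\text{instr}} \approx 833\,\text{ps} - 368\,\text{ps} = 465\,\text{ps}, \qquad
\text{IPC} \approx \frac{t_{\text{cycle}}}{t_{\text{instr}}} = \frac{833\,\text{ps}}{465\,\text{ps}} \approx 1.79
```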
III. APPROACH
This section describes the approach of our developed functional test framework for power management techniques. Our work solves the challenge of observing and analyzing the power management techniques of today's MPSoCs by using the OVP [5] simulation environment. Nevertheless, the entire concept except the technical integration into the virtual platform can be applied to other virtual platforms as well, as long as they execute the target binary and provide comparable features for instruction level introspection.
In the following, we will give a brief introduction to OVP, which is based on our older publication [13] , before discussing the typical power management techniques of modern MPSoCs. Afterwards, we present the general concept and the structure of our functional test framework.
A. Open Virtual Platform (OVP)
Imperas [16] provides OVP, a library for modeling and simulating virtual platforms that is meant to be used for embedded software development. The library consists of various components for building virtual platforms and a simulation engine (OVPsim) for executing them. The component library contains models for buses, memory, a large number of specific peripherals, and processor models, e.g. for ARM Cortex-Ax, ARM Cortex-Mx, ARM Cortex-Rx, Xilinx MicroBlaze, Altera Nios II and MIPS 32/64. The processor models are binary equivalent to the corresponding real processors in the sense that they are able to execute real target binaries compiled by the target compiler toolchains. OVP uses a Just-in-Time (JIT) code morphing engine. The JIT engine takes the binary operation code of each instruction as stated in the binary file and translates it into an equivalent instruction or program for the simulation host. By combining models of processors, memories, buses and peripherals in a virtual platform description and using the address map of the target platform, we obtain a fully binary-compatible platform that shows the same functional behavior.
Since OVP only runs instruction-accurate simulations, the included timing model assumes that one instruction is executed in one cycle. The simulator counts the number of executed instructions and divides this value by the configured nominal processor speed in MIPS. This is sufficient for many aspects of software development and test, but there are situations where it is not, for instance the analysis of time-triggered control tasks or power estimation. So far, our functional test framework collects all values needed to feed a run-time power model, which will be developed in future work. A further property of OVP to keep in mind is the serial simulation of the virtual platform: all processor models in the virtual platform are simulated alternately and not in parallel as on the physical platform. The quantum parameter of the simulation core describes how much simulation time may pass in one processor model before the scheduler switches to the next processor model. This is of particular interest for multiprocessor platforms with communication between the models. A quantum that is too large can destroy the functional communication behavior of the simulated application.
OVP has extensive APIs, which are extended with each new release. In particular, the OVP VMI RT API (OVP Virtual Machine Interface Run Time API) is gaining new functionality for connecting time and power models. This API is accessible in the intercept library, which is very closely connected to the simulation engine and to the processor model. As a result, the VMI RT API has a lower impact on the simulation speed than the OP API, which is located in the virtual platform itself and therefore incurs a higher communication effort. For this reason we use it for developing our functional test framework, as shown in Figure 1. The test framework is embedded directly into the intercept library, from where the traces can also be output.
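The instruction-count timing model and the quantum can be illustrated with a small sketch. It is not OVP code: the function names are hypothetical, and the only assumptions are the ones stated in the text, namely one instruction per cycle and a nominal rate given in MIPS.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Simulated time in seconds under the one-instruction-per-cycle assumption:
// executed instructions divided by the nominal rate (1 MIPS = 1e6 instr/s).
double simulatedSeconds(std::uint64_t executedInstructions, double nominalMips) {
    assert(nominalMips > 0.0);
    return static_cast<double>(executedInstructions) / (nominalMips * 1e6);
}

// Number of instructions one processor model may execute before the
// scheduler switches to the next model, for a quantum given in seconds.
std::uint64_t instructionsPerQuantum(double quantumSeconds, double nominalMips) {
    assert(quantumSeconds >= 0.0 && nominalMips > 0.0);
    return static_cast<std::uint64_t>(std::llround(quantumSeconds * nominalMips * 1e6));
}
```

With a nominal rate of 667 MIPS, 667 million executed instructions correspond to one second of simulated time, regardless of how long the real Cortex-A9 would need for them.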
B. Low Power Management Techniques in MPSoCs
This subsection focuses on the Low Power management Techniques (LPTs) of the Xilinx ZC702 board [17], which contains a Zynq-7000 series [1] MPSoC, since we use this board in our evaluation. The board supports most of the commonly used power management techniques. There are two classes of power management techniques: (1) on-chip techniques, and (2) on-board techniques. Because the Xilinx Zynq-7000 architecture contains an ARM Cortex-A9 dual-core processing system (PS) as well as a programmable logic (PL) fabric, we further divide the on-chip techniques into two subclasses. We evaluated the following values and techniques on the real hardware setup. In this work we do not describe how to enter a specific power mode; we only mention the characteristic points by which it can be recognized that a power mode has been entered.
1) On-chip LPTs for the processing system (PS):
Frequency scaling is supported for the processor cores and for the clock of the external DDR-RAM by adjusting the respective clock divider registers. The processor frequency has to be in the range from 21 MHz to 667 MHz (see footnote 1), while the DDR-RAM frequency ranges from 213 MHz to 533 MHz. Additionally, the DDR-RAM can be disabled. Furthermore, the processing system can put each core into a suspend mode, in which the core waits for an event or an interrupt. This is done by executing the corresponding assembler instructions: wfe or wfi to suspend a core, and sev to wake it up from sleep. In this mode the core is active, but its dynamic power consumption is reduced. There are two deeper sleep modes that can be configured via a power state register. In dormant mode the processor core itself is clock-gated, while the contents of the L1 caches are retained. The deepest sleep mode is realized by turning the whole core off, which also disables the L1 caches.
2) On-chip LPTs for the programmable logic (PL): Frequency scaling of the programmable logic is supported for four different clock areas and can be adjusted in a range from 0.1 MHz to 250 MHz. The highest possible frequency depends on the implemented hardware design. Additionally, the programmable logic's clock can be disabled.
3) On-board LPTs over PMBus: The Power Management Bus (PMBus) is an I2C bus protocol for communicating with adjustable power supplies or other devices such as sensors. The ZC702 board is able to adjust ten different voltage rails, not all of which are mentioned here. To enable DVFS, the voltage of the processing system (from 1 V nominal voltage down to 0.86 V [18]) and of the DDR-RAM (from 1.5 V nominal voltage down to 1.276 V [18]) are adjustable. Voltage scaling of the PL is also possible (from 1 V nominal voltage down to 0.86 V [18]) via the PMBus.
1 USB is unstable at frequencies lower than 222 MHz.
4) Available sensors:
The power supplies can also report the current of each voltage rail. Furthermore, the Xilinx Zynq has an internal temperature sensor that is connected to the internal analog-to-digital converter. These sensor values can be useful for implementing an online power management system.
C. Functional Test Framework
This first version of our functional test framework focuses on the power management techniques of the processing system. Support for the power management techniques of the programmable logic will be added in the future. Figure 2 gives an overview of the internal structure of the functional test framework within the OVP intercept library. It consists of five main components: two instances of core models, a platform model, a tracing engine for generating value change dump (VCD) files, and a model of the voltage regulators, which implements the communication behavior of the physical PMBus interface. Except for the virtual voltage regulators, all components in the framework communicate via Timed Value Streams (TVS) [19]. TVS are particularly suitable for transferring values of extra-functional properties from one component to another. A TVS is a FIFO that stores tuples of the pushed data and the corresponding simulation time. When the source component adds new values, the data sink receives a notification to read the new tuple. The core models are the main entry point for each simulation cycle, since the entire data collection is carried out via the processor models of the virtual platform. As mentioned earlier, the processor models are executed alternately by the simulation kernel. As a result, only one core model is active at any time, and we do not have to take care of mutual exclusion. The core models deliver the following data for each core: the number of accesses to the memory and the AXI bus, and accesses to the processor's frequency register, the memory's frequency register and the processor's power state register. To check the suspend state of the cores, the executed instructions are analyzed for the occurrence of wfe or wfi instructions. Finally, the core model defines a timer that executes a periodic function to calculate the CPU load in a defined interval.
If the callback functions detect a change outside this periodic interval, a separate update of the metrics is made at the current simulation time. This ensures that no event is missed by the core models. All core-related metrics (the CPU load, the memory and AXI read/write rates, and the suspend and power states) are processed in the core models and transferred via TVS to the VCD trace engine. All platform-related metrics, i.e. all frequency changes and also changes in power states, are passed directly to the platform model. If not all data is required for the current analysis, the acquisition of individual metrics can be switched off to further accelerate the simulation.
The platform model processes and stores the data received from the core models and also controls access to them as well as to the simulation kernel. For example, if a frequency is changed by the user application, only one core model detects this change and notifies the platform model, which then notifies all other core models to update their local values of the global state. They are thus able to perform correct calculations of CPU loads or read/write rates, values which depend on the frequency. Furthermore, the platform model provides a global simulation time instance that can be used by all components in the framework to push their acquired data into the TVS with the correct event time.
Closely connected to the platform model, the virtual voltage regulators allow the functional test framework to recognize changes of the platform voltages. For this, the executed applications can use the provided PMBus interface as on the real ZC702 board. The model transfers the voltage values to the platform model, since they are platform-related.
The VCD trace engine is notified at each push to its connected TVS and pulls the data to update the trace output file as a VCD trace. This file can be viewed at run-time or stored for further analysis. Further output formats can easily be realized by using the Timed Value Streams. To use the functional test framework, the only thing to do is to configure the simulation kernel to load this specific intercept library dynamically onto the processor models at simulation run-time. It is thus easy to provide the functional test framework to other OVP users and to maintain updates of the functional test framework without the need to modify the virtual platform descriptions themselves.
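The FIFO-with-notification behavior of a Timed Value Stream can be sketched as follows. This is only an illustration of the concept described above, not the TVS implementation of [19]; the class and member names are hypothetical, and a single sink per stream is assumed.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical minimal Timed Value Stream: a FIFO of (time, value) tuples
// that notifies one sink whenever the source pushes a new tuple.
template <typename T>
class TimedValueStream {
public:
    using Tuple = std::pair<double, T>;  // (simulation time in seconds, value)

    void setSink(std::function<void()> onPush) { onPush_ = std::move(onPush); }

    void push(double simTime, const T& value) {
        // Tuples are expected in non-decreasing time order.
        assert(fifo_.empty() || simTime >= fifo_.back().first);
        fifo_.emplace_back(simTime, value);
        if (onPush_) onPush_();  // tell the sink that a new tuple is ready
    }

    bool empty() const { return fifo_.empty(); }

    // Remove and return the oldest tuple.
    Tuple pop() {
        assert(!fifo_.empty());
        Tuple front = fifo_.front();
        fifo_.pop_front();
        return front;
    }

private:
    std::deque<Tuple> fifo_;
    std::function<void()> onPush_;
};
```

In such a design a core model would own the pushing side, for example for the CPU load, while the trace engine registers itself as the sink and pops tuples when notified.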
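The propagation of a frequency change through the platform model can be sketched as a simple observer relation. All type and member names below are hypothetical, and for brevity the sketch updates every attached core model, including the one that detected the change.

```cpp
#include <cassert>
#include <vector>

// Hypothetical core model holding a local copy of the global CPU frequency.
struct CoreModel {
    double cpuFrequencyMHz = 667.0;
};

// Hypothetical platform model: stores the global state and pushes changes
// to every attached core model.
class PlatformModel {
public:
    void attach(CoreModel* core) {
        assert(core != nullptr);
        cores_.push_back(core);
    }

    // Called by the core model whose write callback saw the register change.
    void onCpuFrequencyChanged(double newMHz) {
        assert(newMHz > 0.0);
        cpuFrequencyMHz_ = newMHz;
        for (CoreModel* core : cores_) core->cpuFrequencyMHz = newMHz;
    }

    double cpuFrequencyMHz() const { return cpuFrequencyMHz_; }

private:
    double cpuFrequencyMHz_ = 667.0;
    std::vector<CoreModel*> cores_;
};
```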
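To make the trace output concrete, the following sketch emits a minimal VCD header for one real-valued signal and one value change record. It is an illustration under assumptions, not the trace engine of the framework: the function names, the scope name "framework" and the millisecond timescale are chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Header declaring one real-valued signal with the one-character identifier `id`.
std::string vcdHeader(const std::string& signalName, char id) {
    assert(id >= '!' && id <= '~');  // VCD identifier codes are printable ASCII
    std::ostringstream out;
    out << "$timescale 1ms $end\n"
        << "$scope module framework $end\n"
        << "$var real 64 " << id << ' ' << signalName << " $end\n"
        << "$upscope $end\n"
        << "$enddefinitions $end\n";
    return out.str();
}

// One value change: a timestamp line followed by "r<value> <id>".
std::string vcdChange(std::uint64_t timeMs, double value, char id) {
    assert(id >= '!' && id <= '~');
    std::ostringstream out;
    out << '#' << timeMs << "\nr" << value << ' ' << id << '\n';
    return out.str();
}
```

Writing the header once and then one change record per tuple pulled from a stream yields a file that common waveform viewers can display.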
IV. IMPLEMENTATION
In this section we elaborate on the implementation of the functional test framework and on issues that must be considered when developing the intercept library for multiprocessor platforms. The framework is object-oriented and implemented in C++14, whereas OVP and the intercept library are written in plain C. We have implemented a C wrapper to enable the use of the framework in OVP. Since the processing system of the Zynq-7000 contains an ARM Cortex-A9 dual-core processor, the intercept library itself is dynamically instantiated twice, once for each processor model. In this way, two separate instances of the functional test framework would be created, which is not what we want. To counteract this, the top-level class implements the singleton pattern. The singleton object dynamically instantiates one core model for each constructor call of a processor model, as well as the platform model, the VCD trace engine and the virtual voltage regulator, and connects the TVS at the very end of the initialization phase.
As mentioned, the functional test framework uses the VMI RT API of OVP. The core model makes most of the API calls, since it acquires most of the observed values. To collect the memory and AXI read and write rates, the core model registers read and write callbacks on the corresponding address ranges of the processor and counts the number of accesses. To recognize changes in the processor's or memory's frequency register as well as in the processor's power state register, three more write callbacks are registered on the respective addresses. Furthermore, a fetch callback can be registered to detect wfe and wfi instructions in its callback function. The signature of the read, write and fetch callback functions is:
#define VMI_MEM_WATCH_FN(_NAME) void _NAME( \
    vmiProcessorP processor, Addr address, \
    Uns32 bytes, const void *value, \
    void *userData, Addr VA )
typedef VMI_MEM_WATCH_FN((*vmiMemWatchFn));
Particularly helpful are the address parameter, which gives the location at which the callback was triggered, and the value parameter, which allow the register data or the fetched instruction to be analyzed directly. The model timer mentioned above, which provides a periodic callback for calculating the CPU load and the read and write rates, is configured for each core model:
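The combination of a singleton top-level object with a C-callable wrapper can be sketched as follows. The class name, the wrapper function and the use of an opaque pointer as processor handle are assumptions made for this example and do not reflect the actual framework sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical top-level class shared by all instances of the intercept library.
class TestFramework {
public:
    // Function-local static: constructed exactly once, on first use.
    static TestFramework& instance() {
        static TestFramework framework;
        return framework;
    }

    // Called once per processor model constructor; returns the new core index.
    std::size_t registerCore(const void* processorHandle) {
        assert(processorHandle != nullptr);
        cores_.push_back(processorHandle);
        return cores_.size() - 1;
    }

    std::size_t coreCount() const { return cores_.size(); }

    TestFramework(const TestFramework&) = delete;
    TestFramework& operator=(const TestFramework&) = delete;

private:
    TestFramework() = default;
    std::vector<const void*> cores_;
};

// C-callable wrapper, so that a plain-C intercept library can reach the C++ object.
extern "C" unsigned frameworkRegisterCore(const void* processorHandle) {
    return static_cast<unsigned>(
        TestFramework::instance().registerCore(processorHandle));
}
```

Both instances of the intercept library would then call the wrapper from their constructors and end up registering their core with the same framework object.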
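What the body of a fetch callback might do with the bytes and value arguments can be sketched without any OVP types. The sketch assumes that value points at the fetched instruction bytes, that only 4-byte ARM-state instructions are of interest (Thumb encodings are ignored), and that guest and host are little-endian; the function and enum names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

enum class Hint { None, Wfe, Wfi, Sev };

// Classify a 32-bit ARM-state instruction word. Bits [31:28] hold the
// condition code and are masked out before comparing the hint encodings.
Hint classifyArmHint(std::uint32_t instruction) {
    switch (instruction & 0x0FFFFFFFu) {
        case 0x0320F002u: return Hint::Wfe;
        case 0x0320F003u: return Hint::Wfi;
        case 0x0320F004u: return Hint::Sev;
        default:          return Hint::None;
    }
}

// Shape of a fetch callback body. The processor, address and user data
// arguments of the real signature are omitted in this sketch.
Hint onFetch(std::uint32_t bytes, const void* value) {
    assert(value != nullptr);
    if (bytes != 4) return Hint::None;  // only 4-byte ARM instructions
    std::uint32_t instruction = 0;
    std::memcpy(&instruction, value, sizeof instruction);
    return classifyArmHint(instruction);
}
```

Because such a check runs for every fetched instruction, it is also the most expensive observation in terms of simulation speed.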
The callback function is called every time the model timer hits a specified value, with the given signature:
#define VMI_ICOUNT_FN(_NAME) void _NAME( \
    vmiProcessorP processor, vmiModelTimerP timer, \
    Uns64 iCount, void *userData )
typedef VMI_ICOUNT_FN((*vmiICountFn));
The model timer works with the MIPS parameter of the processor model, to generate a callback when the instruction count of the processor model reaches the defined value. For easier handling the timer can be configured by specifying a delta value:
void vmirtSetModelTimer( vmiModelTimerP modelTimer, Uns64 delta );
To calculate the relative CPU load or the read and write rates the MIPS parameter is needed, since OVP interprets this value in a way like the "frequency" of the processor:
To compute the memory and AXI loads, the number of callbacks is combined with the current frequencies and the interval of the model timer. For the CPU load, OVP provides two API calls: vmirtGetICount returns the number of all available instruction slots, while vmirtGetExecutedICount returns the number of instructions actually executed:
Uns64 {vmirtGetICount/vmirtGetExecutedICount}( vmiProcessorP processor );
When executing an application like Linux, the two values can differ significantly, since many wfe and wfi instructions can occur. By calculating the differences of both values to the last timer callback and dividing the executed instructions by the available instruction slots, we obtain the relative CPU load. When a frequency change of the processors is recognized, the processor model should also adjust its MIPS parameter and thus the virtual "frequency" of the processor. For this, OVP offers the possibility to specify a percentage value, the so-called derate factor:
void vmirtSetDerateFactor( vmiProcessorP processor, Flt64 factor );
If the value is set to 0.0 %, the processor model runs at its configured MIPS rate; if it is set to 100.0 %, the processor model no longer executes any instructions.
The latter case is useful if the core is turned off or in dormant mode. In that way the virtual processor also halts completely.
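The two calculations described in this section can be sketched as plain functions. The function names are hypothetical, and the counter values are assumed to be the results of vmirtGetICount and vmirtGetExecutedICount at two consecutive timer callbacks. The derate calculation assumes that the factor is the percentage by which the nominal rate is reduced, which is consistent with the value of 50.075 % reported for 333 MHz on a 667 MIPS platform.

```cpp
#include <cassert>
#include <cstdint>

// Relative CPU load over one timer interval: executed instructions divided
// by the instruction slots that were available in the same interval.
double cpuLoadPercent(std::uint64_t prevSlots, std::uint64_t curSlots,
                      std::uint64_t prevExecuted, std::uint64_t curExecuted) {
    assert(curSlots > prevSlots && curExecuted >= prevExecuted);
    const double slots = static_cast<double>(curSlots - prevSlots);
    const double executed = static_cast<double>(curExecuted - prevExecuted);
    return 100.0 * executed / slots;
}

// Derate factor in percent for a core running at currentMHz when the
// processor model was configured for nominalMHz (0 % means full speed).
double derateFactorPercent(double nominalMHz, double currentMHz) {
    assert(nominalMHz > 0.0 && currentMHz >= 0.0 && currentMHz <= nominalMHz);
    return 100.0 * (1.0 - currentMHz / nominalMHz);
}
```

A core that is switched off or dormant corresponds to a current frequency of zero and therefore to a derate factor of 100 %.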
V. USE CASE SETUP
We already mentioned that this publication uses the Zynq ZC702 board and its corresponding virtual platform in OVP. In this section we give a short comparison of the physical and virtual platform. Afterwards we present descriptions of two use-cases which will be analyzed in the evaluation section.
A. Physical and Virtual Platform
Our functional test framework focuses on the processing system, the ARM Cortex-A9 dual-core, contained in the Zynq MPSoC. The physical processing system contains dedicated L1 caches as well as a shared L2 cache for the processors. Since OVP provides no cache models of its own and communication has no timing behavior, this is the first difference from the virtual platform. Furthermore, the physical platform has a large number of components attached to the ARM Cortex-A9 dual-core. External, internal and configuration memories, e.g. System-Level Control registers (SLCR), TrustZone control registers, SCU Control and Status, etc., are modeled in the virtual platform, as are timer components, GPIOs and the UART, which provide serial output and input devices for the processor cores. All other interfaces in the virtual platform are modeled by dummy devices, represented by memory areas with no further functional behavior. In this way it is possible to execute the same Linux operating system on the virtual platform as on the physical platform.
B. Linux Use Case
For this use case a Linux kernel 4.9.0 was compiled from the Xilinx repositories. In its configuration we activated the cpufreq functionality and the associated governors. cpufreq is a generic driver for changing processor's clock frequency in Linux via the sysfs, a RAM-based filesystem that allows access to system attributes in the userspace. The governors are different policies how the frequency of the system can be controlled. One of these governors transfers the frequency control directly to the userspace. That means the frequency can be adjusted via terminal commands by the user. The Linux kernel is configured for the frequencies 667 MHz and 333 MHz. The following commands allow to switch to these frequencies (values must be provided in kHz):
$ cd /sys/devices/system/cpu/cpu0/cpufreq/ $ echo 666667 > scaling_setspeed $ echo 333334 > scaling_setspeed
The frequency changes should be visible in the traces of our functional test framework. In addition to cpufreq, the cpuidle functionality was activated in the Linux kernel configuration. cpuidle enables Linux to execute wfi instructions on the processor cores when they are idle. This should also be recognized by the functional test framework.
C. XtratuM Use Case
XtratuM [20] is a hypervisor by fentISS [6]. It supports the Xilinx Zynq-7000 device and is able to execute bare-metal user code in configured partitions. It is designed for real-time embedded systems that must meet safety-critical requirements. The hypervisor includes the functionalities required to build systems based on ARINC 653 [21], AUTOSAR [22] and other standards. For this publication we used XtratuM version 2.0.3 for ARM. The use case scenario consists of three partitions, as shown in Figure 3. The first partition is a so-called system partition, S0, which has higher privileges to access hypervisor functionalities such as the power management techniques. It is also able to control the two user partitions, U1 and U2. The scheduling of the partitions is static and defined in the example's configuration file, which is used to compile the project. Figure 3 shows two complete hyperperiods of the schedule. S0 and U1 each have 50 ms of execution time and 5 ms of static slack time. They are executed on ARM core 0. U2 should have 210 ms of execution time available on ARM core 1 and 10 ms of static slack time. With every second call, however, the system partition S0 should turn off ARM core 1 and reduce the frequency from 667 MHz to 333 MHz and the CPU voltage from 1.0 V to 0.9 V. The execution of U2 should thereby be interrupted. The next call of S0 should reverse this process. This scenario demonstrates a mixed-criticality application, where ARM core 0 processes the safety-critical tasks and ARM core 1 a non-safety-critical task. This non-safety-critical task has to be interrupted to put the MPSoC into a state with lower power consumption, e.g. due to thermal issues.
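The alternating behaviour of the system partition in the XtratuM scenario can be sketched as a two-state model. This is one reading of the scenario description, not the partition code itself: it assumes that the platform starts in the normal state, that every even-numbered call of S0 enters the low-power state, and that every following call restores the normal state. The names `PowerState` and `state_after_call` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    core1_on: bool    # whether ARM core 1 (running U2) is powered
    freq_mhz: int     # CPU frequency in MHz
    vcpu_volt: float  # CPU voltage in V


# Values taken from the scenario description.
NORMAL = PowerState(core1_on=True, freq_mhz=667, vcpu_volt=1.0)
LOW_POWER = PowerState(core1_on=False, freq_mhz=333, vcpu_volt=0.9)


def state_after_call(n: int) -> PowerState:
    """Expected platform state after the n-th call of S0 (n >= 1).

    Assumption: even-numbered calls switch to LOW_POWER, odd-numbered
    calls restore (or keep) NORMAL.
    """
    if n < 1:
        raise ValueError("call index starts at 1")
    return LOW_POWER if n % 2 == 0 else NORMAL


# Expected sequence over the first four calls of S0.
expected = [state_after_call(n) for n in range(1, 5)]
```

A sequence like `expected` could serve as a reference against which the traced core power, frequency and voltage signals are compared.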

VI. EVALUATION RESULTS
A. Linux Use Case
Code 1 shows the user input and terminal output of the processor's UART terminal. After login, we first check whether the userspace governor is selected for the frequency configuration and which frequency is configured. Afterwards we try to switch from 333 MHz to 667 MHz and validate this with another check of the current frequency. The last action is to return to 333 MHz. During the boot phase, Linux switches the frequency to 333 MHz at 0.2 s, and our test framework sets the derate factor to 50.075 %, since the nominal MIPS rate for the platform is 667 MIPS. Shortly before the login, at 1.9 s, Linux seems to try to switch to another frequency, but the divider is invalid; this is the first error. At 185.3 s we successfully set the frequency to 667 MHz, and the derate factor is adapted accordingly. Our manual attempt to return to 333 MHz at 221.3 s also fails. The VCD trace shown in Figure 4 confirms these errors: from 185.3 s onwards the frequency stays at 667 MHz. This was due to a wrong Linux kernel configuration, which led to a wrong frequency divider. Figure 5 shows the same Linux example.
B. XtratuM Use Case
Figure 6 shows the VCD trace results of the XtratuM use case. The simulation of the virtual platform shows the behavior expected from the scenario description. The CPU load of core 0 switches between 0 % and 100 %, since the system partition generates nearly zero load. The power-down of core 1 also works together with the frequency and voltage changes, so that user partition U2 is interrupted as intended. On closer inspection, however, the execution time does not match the scenario: here, one hyperperiod lasts approx. 170 ms, not the 220 ms described in the scenario. This is due to the simple timing model of OVP mentioned in [12].
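The derate factor of 50.075 % reported for the Linux use case can be reproduced with a short calculation. The sketch below assumes that the derate factor is the percentage by which the nominal MIPS rate is reduced and that the MIPS rate scales linearly with the clock frequency (667 MIPS at 667 MHz); the function name `derate_percent` is illustrative.

```python
NOMINAL_MHZ = 667.0  # corresponds to the nominal rate of 667 MIPS


def derate_percent(actual_mhz: float, nominal_mhz: float = NOMINAL_MHZ) -> float:
    """Percentage by which the nominal MIPS rate is reduced.

    Assumption: the MIPS rate is proportional to the clock frequency.
    """
    if not 0.0 < actual_mhz <= nominal_mhz:
        raise ValueError("frequency must be in (0, nominal]")
    return 100.0 * (1.0 - actual_mhz / nominal_mhz)


# 333 MHz on a 667 MHz nominal platform: 100 * (1 - 333/667), about 50.075 %
derate_333 = round(derate_percent(333.0), 3)
# At the nominal frequency no derating is applied.
derate_667 = derate_percent(667.0)
```

Under these assumptions, a successful switch back to 667 MHz corresponds to a derate factor of 0 %.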
The simulation speed of the functional test framework depends on its activated features. In particular, observing the wfe and wfi instructions causes a large slow-down, since every fetched instruction has to be analyzed. In numbers, the simulation runs 26x slower than the physical system if both cores are at full load. Without the recognition of the suspend states, the virtual platform executes 1.3x faster than the physical platform. All tests were performed on an Intel i7-4710MQ (2.5 GHz) with 16 GB RAM running Debian Jessie (Kernel 4.9.11 x86_64).
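To give a feeling for these factors, the following sketch converts a simulated duration into an approximate host run time. It is only an illustration: it assumes that the factors apply uniformly over the whole run, which holds for the 26x figure only if both cores are at full load, and the 221.3 s duration is simply reused from the Linux use case as an example value.

```python
def host_seconds(simulated_s: float, speed_ratio: float) -> float:
    """Host wall-clock time for a given simulated time.

    speed_ratio is simulation speed relative to the physical system:
    1/26 for a 26x slow-down, 1.3 for a 1.3x speed-up.
    """
    if speed_ratio <= 0.0:
        raise ValueError("speed_ratio must be positive")
    return simulated_s / speed_ratio


simulated = 221.3  # seconds of simulated time (example value)

with_suspend_detection = host_seconds(simulated, 1.0 / 26.0)  # about 5754 s
without_suspend_detection = host_seconds(simulated, 1.3)      # about 170 s
```

Under these assumptions, enabling the wfe/wfi recognition turns a run of under three minutes into one of roughly 96 minutes.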
VII. CONCLUSION & FUTURE WORK
Our results show that the functional test framework is able to analyze the different modes of operation regarding the power management techniques of the Xilinx Zynq processing system and helps software developers to find bugs in their power management configuration. The functional test framework found an error when changing the frequency of the ARM processing system in Linux and validated the correct functionality of the provided XtratuM use case.
In our future work we will add a timing model to improve on the accuracy of OVP's simple MIPS timing model, as well as a power model for the ARM processing system, to our functional test framework. Additionally, the observation of further power management techniques will be added, especially for the programmable logic of the Xilinx Zynq MPSoC.
