Abstract-This paper presents a fault-tolerant configuration for the modular multilevel converter (MMC). The procedure is able to detect faults in voltage sensors and semiconductor switching devices, and it can reconfigure the system so that it can keep on operating. Both switch and sensor faults can be detected by comparing the output voltage of a set of submodules (SMs), which is measured by a so-called supervisory sensor, with two calculated reference voltages. Faults in the supervisory sensors are also considered. Sensor faults are overcome by using a measuring technique based on estimates that are periodically updated with the voltage measurements of the supervisory sensors. Additional SMs are included in the arms so that the MMC can bypass a faulty SM and continue operating without affecting the output voltage of the phase-leg. Experimental results obtained from a low-power MMC prototype are presented in order to demonstrate the effectiveness of the proposed techniques.
I. INTRODUCTION

MULTILEVEL converters are power converter topologies suitable for medium- and high-power applications [1], [2]. Among the multilevel converter topologies, the modular multilevel converter (MMC) has become the most attractive topology [3]-[7] for high-voltage direct-current (HVDC) transmission systems [8], [9] and flexible alternating current transmission systems [10]. The main features of the MMC are [6]: 1) its modularity and scalability to different power and voltage levels; 2) its high efficiency; 3) the high quality of the output voltages; and 4) the absence of additional capacitors on the dc link, as the storage is distributed among the capacitors in the submodules (SMs) of the converter.
The general topology of an MMC consists of two arms per phase-leg, where each arm comprises N series-connected identical SMs and a series arm inductor, L. Each SM contains a half-bridge circuit and a capacitor C. The output voltage of each SM equals its capacitor voltage ($v_C$) when the SM is activated, or zero when it is deactivated. The voltage waveforms at the ac side of the MMC can be synthesized by using multiple modulation techniques [6]. Most of them are based on defining the number of SMs to be activated in each of the arms, and the particular SMs activated are determined by a voltage balancing algorithm [11].
The current that flows through each arm consists of half the output current and a circulating current. The circulating current includes some harmonic components and a dc component that is related to the power exchange between the dc and the ac side of the converter. The harmonic components can be eliminated [12] or controlled for further reduction of capacitor voltage ripples [13] , [14] .
Reliability is one of the most important challenges in MMCs, since they include many switching devices, which are the weakest components in power converters [15]. For this reason, the development of fault-tolerant converter topologies and strategies is a relevant research topic nowadays. Multiple studies analyze the reliability of MMCs and provide solutions to faults on the dc side [16]-[18] and ac side [19] of the converter. Control techniques under SM faults have also been investigated based on including additional SMs in the arms of the converter. Redundancy is a characteristic inherent to the modular structure of the MMC, and the number of SMs can be easily increased in order to substitute faulty SMs [20]-[22].
In the case of a component failure, the fault must be detected and localized. In some faults, like an open-circuit (OC) fault in a switching device, the capacitor voltage of the faulty SM may increase, which could cause further damage to the MMC. Given the large number of identical SMs and the symmetrical structure of the converter, localization of a faulty SM is challenging. Some fault detection techniques are based on using additional sensors for each switching device [23] , SM [24] , or using driver modules with integrated fault detection functions [25] . However, these techniques imply a high increase in the converter cost and complexity.
Recently, new fault detection techniques have appeared based on observers and estimators. In [26], a sliding-mode observer-based fault detection method was proposed. The same observer was improved in [27], increasing the robustness and reducing the fault detection time. The method is based on comparing the measured circulating current values with the values calculated by a sliding-mode observer. This detection and localization method is robust and does not require the use of additional sensors; however, it performs relatively slowly (the minimum localization time is 50 ms) and only detects OC faults in the switching devices. In [28], a detection and localization method based on a Kalman filter is presented. The proposed technique compares the measured voltage and current values with the ones estimated by a Kalman filter. The technique is capable of detecting multiple faults at the same time, but it is still slow, with an average time of over 100 ms. Furthermore, it only detects OC faults.
In this paper, a new fault detection and localization technique is presented. The technique is based on dividing each arm into a minimum of two sets of SMs, and adding voltage sensors to measure the output voltage of each set of series-connected SMs. The technique only requires three additional sensors per arm and is capable of detecting and correcting OC faults, short-circuit (SC) faults, and also voltage sensor faults, a kind of fault that has received little attention in the existing literature.
The overall process follows three main steps: detection, localization, and correction. In steady state, the fault detection method supervises the system. Supervision consists of comparing the voltage of the additional sensors with a calculated reference. When the measured and reference voltages do not coincide, a fault is detected and its kind identified. Then, the fault localization method is initiated. This step is composed of multiple algorithms, since each kind of fault requires a different localization process. In summary, localization is based on checking whether the fault is still detected each time an SM is activated or deactivated. When the fault disappears, the fault is localized on the last activated or deactivated SM. Finally, when the fault is localized, the faulty sensor is substituted by an estimation algorithm or the faulty SM is bypassed. The overall technique does not require intensive processing and provides a fast response, detecting and localizing faults generally in less than 5 ms.
In this study, the capacitor voltages are balanced by using the algorithm proposed in [29] . A circulating current controller [14] is implemented to regulate the internal dynamics of the converter as well as to reduce capacitor voltage ripples.
The remainder of this paper is organized as follows. Section II presents the MMC with redundant SMs, and it proposes a location for the sensors that will provide the converter with fault tolerance. Section III defines the operating principles of the detection method. Section IV describes the procedure for locating the faulty SM and/or sensor and for reconfiguring the converter. Section VII presents the experimental results, and Section VIII summarizes the main conclusions of this work.
II. SENSOR REDUNDANT CONFIGURATION AND ADDITIONAL SMS
The topology presented in this paper includes additional sensors and SMs in order to detect and solve two kinds of faults: a fault in a voltage sensor and a fault in an SM switching device. Faulty sensors provide a constant output value that is normally zero, and faults in the SMs can be classified as SC and OC faults in both the upper and lower power switches of the half-bridge SMs. All the faults are considered permanent. Faults in the diodes are not differentiated from faults in the controlled devices (IGBT or MOSFET). The performance of the SM with an SC fault in a diode is the same as with an SC fault in a controlled device. On the other hand, OC faults in the diodes must be detected indirectly. Since the MMC arms are highly inductive circuits, an OC fault in a diode will block the current path, which may cause an SC breakdown in another device.
Each arm is divided into $n_S$ sets of SMs, each one composed of $k$ series-connected SMs. Fault detection is achieved by using additional voltage sensors (see Fig. 1) that measure the output voltage of the sets of SMs, $v_{Sj(x)}$, where $j$ indicates the upper or lower arm ($j = \{u, l\}$) and $x$ the number of the set. The voltage provided by each supervisory set sensor is compared with a reference value for fault detection and is also used to substitute faulty individual SM voltage sensors.
Substituting individual sensors with the supervisory set sensors is based on a new measuring technique [30] that requires only two voltage sensors to acquire all the SM capacitor voltages. When only one SM in the set is activated, the voltage provided by the supervisory set sensor is almost equal to the capacitor voltage of the activated SM. Since the measurements of all the capacitor voltages are not always available, they are estimated with a mathematical model between consecutive actual measures. In this mathematical model, the capacitor voltage values are updated whenever there is an actual measurement available, thus correcting the accumulated error in the estimator.
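The estimate-and-update idea described above can be illustrated with a minimal Python sketch. It is not the estimator of [30]: the class name, the ideal capacitor model dv/dt = s·i/C, and the sign convention (positive arm current charges an activated SM) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CapacitorEstimator:
    """Open-loop estimate of one SM capacitor voltage (hypothetical helper)."""
    capacitance: float  # SM capacitance C in farads
    v_est: float        # current estimate in volts

    def predict(self, state: int, i_arm: float, dt: float) -> float:
        # Assumed ideal model: the capacitor carries the arm current only
        # while the SM is activated (state = 1).
        self.v_est += state * i_arm * dt / self.capacitance
        return self.v_est

    def correct(self, v_set: float, states_in_set: list, own_index: int) -> bool:
        # The set sensor reads this capacitor voltage only when this SM is
        # the single activated SM of its set; then the estimate is reset.
        if sum(states_in_set) == 1 and states_in_set[own_index] == 1:
            self.v_est = v_set
            return True
        return False
```

Between two corrections the estimate drifts with any modeling error; each call to `correct` that returns `True` removes the accumulated drift.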
Faults in the supervisory set sensors are also considered. In order to check their performance and substitute them when faults appear, a third sensor is used: the so-called supervisory arm sensor. This sensor measures the voltage provided by all the series-connected SMs of the arm ($v_{Aj}$), which should be equal to the sum of all the supervisory set sensor voltages, i.e.,

$$v_{Aj} = \sum_{x=1}^{n_S} v_{Sj(x)}.$$
The supervisory arm sensor is the only sensor that cannot be substituted. If a fault appears in this sensor, the system loses the fault localization capability, since when a new fault is detected, it is impossible to check if it has occurred in an SM or individual sensor or in a supervisory set sensor.
The minimum number of sets per arm (nS), and therefore the number of additional supervisory set sensors, is two. The minimum number of sets is defined by the voltage measurement technique used to substitute faulty sensors [30] , and by the supervisory arm sensor checking, which is based on comparing its value with the two different supervisory set sensors. However, the number of sets and sensors can be increased if required. A low number of sensors per arm reduces the cost of the converter and improves its reliability, since the number of devices that can fail is lower. On the other hand, a higher number of sensors per arm requires lower accuracy of the sensors and allows higher noise margins, since the voltage of one SM has to be higher than the error margin of the set sensor. Moreover, the cost and voltage limitations of each sensor should be taken into account when defining the number of sets. Reducing the number of sets reduces the number of sensors and the cost of the measuring system, but it also increases the maximum voltage applied to each of the sensors.
In order to be able to reconfigure the MMC under switch faults, SM redundancy is provided. This redundancy is achieved by adding a number M of SMs to the N basic ones in the arms. In this paper, the redundant SMs are also active, which, in addition to providing redundancy, helps to reduce the capacitor voltage ripples during normal operation [21]. This technique gives the same consideration to all the SMs, but the maximum number of SMs activated in each arm at any time is N out of the N + M available.
When a fault in a switching device is detected, the faulty SM is disabled in the control system, preventing it from being activated, and it is short-circuited by an external device, i.e., a high-speed by-pass switch or thyristor [8] .
A schematic of the proposed fault-tolerant topology is depicted in Fig. 1 .
III. FAULT DETECTION METHOD
The proposed fault detection method is based on comparing the voltage measured by the supervisory set sensors, $v_{Sj(x)}$, with a calculated reference. Two different reference voltages are calculated: one for detecting sensor and OC faults, and another for detecting SC faults. The first reference signal, which is named "expected voltage" ($v_{Sj(x)}^{\mathrm{exp}}$), is calculated as the sum of the voltages of the activated SMs in the set

$$v_{Sj(x)}^{\mathrm{exp}} = \sum_{n \in \text{set } x} s_{j(n)}\, v_{Cj(n)}$$

where $n$ is the number of the SM, $s_{j(n)}$ is the state of the SM (1 when activated, 0 when deactivated), and $v_{Cj(n)}$ is the SM capacitor voltage. The second reference signal, $v_{Sj(x)}^{t}$, is named "theoretical voltage." It consists of the product of the number of activated SMs in the set and the average value of the voltages of all the SMs in the arm

$$v_{Sj(x)}^{t} = \bar{v}_{Cj} \sum_{n \in \text{set } x} s_{j(n)}$$

where

$$\bar{v}_{Cj} = \frac{1}{N+M} \sum_{n=1}^{N+M} v_{Cj(n)}.$$

If the measured voltage $v_{Sj(x)}$ and the calculated reference signals, $v_{Sj(x)}^{\mathrm{exp}}$ and $v_{Sj(x)}^{t}$, are different, a fault is detected. Variables $e_{\mathrm{exp}\,j(x)}$ and $e_{tj(x)}$ represent the difference between the measured voltage and the expected and theoretical voltages, respectively

$$e_{\mathrm{exp}\,j(x)} = v_{Sj(x)} - v_{Sj(x)}^{\mathrm{exp}}, \qquad e_{tj(x)} = v_{Sj(x)} - v_{Sj(x)}^{t}.$$
Due to nonidealities, the error values are always different from zero. For this reason, a fault is considered only when the error values are higher than a threshold value. Threshold values cannot be defined theoretically, since they depend on multiple factors such as voltage sensor noise, sensor accuracy, and voltage drops in the switching devices and connections. For this reason, the threshold values should be adjusted empirically. In this paper, the threshold of the "expected error" has been adjusted to a value of 20% of the capacitor nominal voltage ($V_{dc}/N$). The "theoretical error" is also affected by the capacitor voltage imbalance, as it is calculated from the average capacitor voltage value of the SMs in the arm. Therefore, the threshold value of the theoretical error should be higher than the threshold of the expected error. In this paper, a value of 50% of the capacitor nominal voltage has been adopted.
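As a rough illustration of the comparison described in this section, the following Python sketch computes both errors for one set and applies the 20% and 50% thresholds. The function and argument names are hypothetical, the error is taken as measured minus calculated voltage, and the average is taken over whatever list of arm voltages is passed in.

```python
def detection_errors(v_set, states, v_caps_set, v_caps_arm):
    """Return (expected error, theoretical error) for one set of SMs.

    v_set      : voltage read by the supervisory set sensor
    states     : 1/0 activation state of each SM in the set
    v_caps_set : individual sensor readings of the SMs in the set
    v_caps_arm : individual sensor readings of all SMs in the arm
    """
    v_expected = sum(s * v for s, v in zip(states, v_caps_set))
    v_average = sum(v_caps_arm) / len(v_caps_arm)
    v_theoretical = sum(states) * v_average
    return v_set - v_expected, v_set - v_theoretical


def fault_flags(e_exp, e_theo, v_dc, n_sm, k_exp=0.20, k_theo=0.50):
    """Compare both errors with thresholds given as fractions of V_dc / N."""
    v_nominal = v_dc / n_sm
    return abs(e_exp) > k_exp * v_nominal, abs(e_theo) > k_theo * v_nominal
```

With six SMs at 100 V and one individual sensor stuck at zero, an activated faulty-sensor SM yields an expected error of 100 V but a theoretical error of only about 33 V, so only the first flag is raised.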
Faults are only detected in one specific switching state of the SM (ON or OFF, depending on the kind of fault). Consequently, a significant delay may exist between the moment the fault appears and its detection. In order to reduce this delay, a second mechanism is used for detecting faults. The mechanism, known as alarm indicator, consists in comparing the values of the individual sensors with prefixed limit values at each sampling period. If an individual sensor provides a value too high or too low, the alarm indicator for that SM is activated. In order to distinguish between faults and capacitor voltage ripples that appear during normal operation of the converter, the upper limit is defined near the maximum value allowed to the capacitor voltages and the lower limit is defined close to zero.
Except for faults detected by alarms, not all detected errors correspond to a fault in an individual sensor or SM. Sometimes detected errors correspond to faults in the supervisory set sensors. For this reason, before starting a fault localization process, the correct performance of the supervisory set sensors should be checked.
A. Supervisory Sensor Fault Detection
Differences between the measured and the calculated voltages correspond to faults in SMs or sensors only if the supervisory set sensor measurement is correct. For this reason, the correct performance of the supervisory set sensor is checked at each sampling period.
The correct performance of the supervisory sensors is verified by comparing the sum of the supervisory set sensor voltages $v_{Sj(x)}$ with the voltage of the supervisory arm sensor $v_{Aj}$. If the difference is within a tolerance margin (small errors are expected due to noise and sensor accuracy), then the supervisory set sensor voltages and calculated voltages are compared. Otherwise, the process for localizing the faulty supervisory sensor is initiated.
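A minimal sketch of this per-sample check is given below, assuming a single absolute tolerance in volts. The function name and the returned labels are hypothetical and only indicate which branch of the procedure would follow.

```python
def supervision_step(v_sets, v_arm, tolerance):
    """Decide which branch follows the supervisory sensor check.

    v_sets    : readings of the supervisory set sensors of one arm
    v_arm     : reading of the supervisory arm sensor
    tolerance : accepted mismatch in volts (noise, sensor accuracy)
    """
    mismatch = abs(sum(v_sets) - v_arm)
    if mismatch <= tolerance:
        # Supervisory sensors agree: compare set voltages with the references.
        return "compare_with_references"
    # Supervisory sensors disagree: one of them must be faulty.
    return "localize_supervisory_sensor"
```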
B. Voltage Sensor Fault Detection
In this paper, a voltage sensor with a resistive input impedance is used, and faults are emulated as an OC at the linking cable between the SM and the voltage sensor. Therefore, when a fault appears, the measured voltage decreases rapidly to zero. Most times, the sensor faults are detected through the alarm mechanism, but comparisons of measured and calculated voltages are still required. If a different sensor topology were used, the fault process might be different, but the detection method would not require many modifications.
When an SM with a faulty sensor is activated, a significant difference appears between the expected and the measured voltage. The measured voltage is the sum of the voltages of the activated SMs, including the one with the faulty sensor, whose real capacitor voltage is close to the other SM voltage values. In contrast, the expected voltage considers the voltage measured by the faulty individual sensor, which is lower than the others.
The difference between the measured and the expected value does not appear in the theoretical error. The measured voltage is very similar to the theoretical voltage, as the second one considers the average voltage of all the SMs for the SM with the faulty sensor, which is close to the real value. The variation of the average voltage value due to the faulty sensor is not significant enough to detect an error, since the average is performed by considering all the SMs in the arm.
If the sensor voltage drops too fast or the SM is deactivated when the fault appears, the fault is detected through the minimum voltage alarm. However, a comparison between the measured and calculated voltages is also required in order to differentiate between sensor and SM faults.
C. SM Fault Detection
SM faults considered in this paper are faults in controlled switching devices, which can be both OC and SC faults. Considering that only one fault appears at the same time, there are four kinds of faults, each one with its own dynamics: OC in the upper switch, OC in the lower switch, SC in the upper switch, and SC in the lower switch. OC and SC faults can be detected and differentiated through the error of expected and theoretical voltages. Fig. 2 depicts the schematics of each fault equivalent circuit and current flow path, depending on the SM state and the current direction.
OC faults are detected when the current should pass through the open-circuited semiconductor but, due to the fault, is forced to pass through the opposite switch diode. For example, in an upper switch OC fault, if the current is negative when the SM should be activated, the current cannot circulate through the transistor and is forced to flow through the lower diode. This provides zero volts at the output of the SM instead of the capacitor voltage. In this situation, a negative error appears in both the expected voltage value and the theoretical voltage value, as the measured voltage is lower than the calculated voltages. The flow path of the current can be observed in Fig. 2(a) .
When an OC fault occurs in a lower switch, the opposite happens: when the SM should be deactivated and the current is positive, the OC forces the current to flow through the diode of the upper switch [see Fig. 2(b) ]. In this case, the SM provides the voltage of the capacitor instead of zero volts, causing a positive error in both the expected and theoretical errors.
There is not much difference between the upper and lower switches in SC fault detection; therefore, they are not differentiated. When an SC fault appears and the opposite switch is activated, the capacitor is short-circuited and rapidly discharged. Considering a similar resistance in the short-circuited switch and the on-state switch, the voltage provided at the output is half the capacitor voltage, whose value is very low. SC faults are detected by the theoretical error, as the value provided by the SM (almost zero) is much lower than the calculated one (the average value of all the SMs). However, no error is detected in the expected value, as the measured voltage value is very low, and so it is in the calculated value (although an error exists, it is not significant enough to be considered a fault).
When an SC appears in the upper switch, the capacitor is discharged during the off-state of the SM, but the fault is detected when activating it. Conversely, if an SC fault is produced in the lower switch, the fault is detected in the on-state of the SM, while the capacitor is being discharged. A summary of the SM and sensor faults and their detection methods is shown in Table I .
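The detection rules of Sections III-B and III-C can be condensed into a small decision function. The Python sketch below is one reading of those rules, not a transcription of Table I: the enum and function names are hypothetical, the errors are taken as measured minus calculated voltage, and combinations that the text does not discuss simply fall through to the nearest rule.

```python
from enum import Enum


class Fault(Enum):
    NONE = "none"
    SENSOR = "individual sensor"
    OC_UPPER = "open circuit, upper switch"
    OC_LOWER = "open circuit, lower switch"
    SC = "short circuit"


def classify(e_exp, e_theo, thr_exp, thr_theo):
    """Map the two error signals of one set to a kind of fault."""
    exp_hit = abs(e_exp) > thr_exp
    theo_hit = abs(e_theo) > thr_theo
    if exp_hit and theo_hit:
        # OC faults disturb both errors: negative for the upper switch
        # (output is zero instead of v_C), positive for the lower switch.
        return Fault.OC_UPPER if e_exp < 0 else Fault.OC_LOWER
    if exp_hit:
        # Only the expected error reacts to a faulty individual sensor.
        return Fault.SENSOR
    if theo_hit:
        # Only the theoretical error reacts to a discharged capacitor.
        return Fault.SC
    return Fault.NONE
```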
D. Fault Detection by Alarm Indicator
If the voltage seen by the individual sensor reaches the prefixed limits (i.e., it is too high or too low), an alarm indicator is activated. This indicator facilitates detection and localization of the fault, as the SM where the alarm has been activated is where the fault was produced. When an alarm indicator is triggered in an activated SM (or the origin of the alarm is a lower switch OC fault), the voltage measured by the supervisory set sensor will be different from the expected and theoretical voltages and the kind of fault will be detected immediately. However, if an alarm indicator of a deactivated SM is triggered, the expected and theoretical voltages are not modified, and hence, the kind of fault cannot be detected and corrected. In order to identify the fault as soon as possible, the SM is forced to be activated. Activation is done through the enforced activation method, an algorithm that modifies the activation priority of the SMs without affecting the output voltage of the converter.
E. Enforced Activation Method
The enforced activation method [30] is a technique used for activating or deactivating some specific SMs without affecting the converter output voltage. The method is based on modifying the priority of the SMs in the voltage balancing algorithm. Since the priority of the selected SMs changes, they are activated or deactivated sooner, while the total number of activated SMs in the arm and their duty cycles remain unchanged. This method is used to accelerate some detection and localization algorithms. Priority is modified through the voltage values seen by the voltage balancing algorithm, $v^{*}_{Cj(n)}$, which are increased or decreased depending on the target (SM activation/deactivation). The SMs with modified priority are indicated in the "enforcing vector," which is scaled and added as an offset to the measured capacitor voltage values. When the selected SMs have to be deactivated, the sign of the offset is the same as that of the arm current. However, algorithms like fault detection by alarm indicator and lower switch OC fault localization require activating the specified SMs. In this situation, the offset is added with the sign opposite to that of the arm current. A block diagram of this enforced activation method is depicted in Fig. 3.
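The offset mechanism can be sketched in a few lines of Python. The balancing rule used here (activate the lowest-voltage SMs for positive arm current and the highest-voltage SMs for negative arm current) is a common sorting scheme assumed for illustration; it is not necessarily the algorithm of [29], and all names and the offset value are hypothetical.

```python
def enforced_voltages(v_meas, enforcing, i_arm, offset, activate):
    """Voltages v* handed to the balancing algorithm.

    enforcing : 1 for SMs whose priority is modified, 0 otherwise
    activate  : True to force activation, False to force deactivation
    """
    current_sign = 1.0 if i_arm >= 0 else -1.0
    # Deactivation: offset has the sign of the arm current.
    # Activation: offset has the opposite sign.
    sign = -current_sign if activate else current_sign
    return [v + sign * offset * e for v, e in zip(v_meas, enforcing)]


def select_active(v_star, n_on, i_arm):
    """Assumed sorting-based balancing: pick n_on SMs from the sorted list."""
    order = sorted(range(len(v_star)), key=lambda n: v_star[n],
                   reverse=(i_arm < 0))
    return sorted(order[:n_on])
```

For example, with measured voltages of 100, 101, 102 and 103 V, a positive arm current and two SMs to activate, forcing the activation of the last SM changes the selection from SMs 0 and 1 to SMs 0 and 3, while the number of activated SMs stays at two.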
IV. FAULT LOCALIZATION METHODS
After detecting a fault, the faulty SM or sensor has to be determined. Each kind of fault has its own localization method; therefore, different algorithms have to be executed, depending on the fault origin. A state machine has been implemented to manage all the processes related to fault detection and localization. Its states run from State 0 (Initialization) to State 9 (Fault Detection Inability) and include the Supervision State, the Alarm Detection State, the Supervisory Sensor Fault Localization State (State 3), the Individual Sensor Fault Localization State, the SM Fault Localization States, and the SM Deactivation State, which ensures the deactivation of the faulty SM before returning to the Supervision State. State 9 (Fault Detection Inability) is entered after a supervisory arm sensor fault: the performance of the supervisory sensors cannot be checked, and therefore, it is impossible to identify the kind of fault detected by the Supervision State (unless it comes from an alarm indicator). The state machine diagram is detailed in Fig. 4. The detection system starts with the Initialization State or State 0, which is used only before achieving the steady-state performance of the system. Once the system has started, it stays in the Supervision State, looking for faults in the sensors or in the SMs.
When a fault is detected, the state machine goes to the corresponding localization method. If the fault is detected by an alarm indicator, the detection system changes from Supervision State to Alarm Detection State before starting a localization method. Both Supervisory Sensor Fault Localization and Individual Sensor Fault Localization States localize and correct the fault and return to the Supervision State.
In contrast, the SM Fault Localization States change to the SM Deactivation State after localizing a fault. This state waits for the next capacitor voltage balancing algorithm sampling period before returning to the Supervision State, since the slower sampling period of this algorithm cannot ensure the immediate deactivation of the faulty SM after its localization.
Fault Detection Inability State or State 9 is activated only when a fault has been previously detected in the supervisory arm sensor and a fault is now detected in the Supervision State. The new fault cannot be identified because the correct performance of the supervisory set sensors cannot be checked; consequently, it cannot be corrected. This state finishes the fault detection and localization processes until the faulty sensor is fixed and the system has restarted.
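The transitions described in this section can be sketched as a lookup table. The Python code below is only an approximation of Fig. 4, which is not reproduced here: the event labels are hypothetical, the SM Fault Localization States are merged into a single state, and the state numbering of the paper is not modeled.

```python
from enum import Enum, auto


class State(Enum):
    INITIALIZATION = auto()
    SUPERVISION = auto()
    ALARM_DETECTION = auto()
    SUPERVISORY_SENSOR_LOCALIZATION = auto()
    INDIVIDUAL_SENSOR_LOCALIZATION = auto()
    SM_FAULT_LOCALIZATION = auto()   # merged: one state per fault kind in the text
    SM_DEACTIVATION = auto()
    FAULT_DETECTION_INABILITY = auto()


FAULT_EVENTS = {"supervisory_sensor_fault", "individual_sensor_fault", "sm_fault"}

# (state, event) -> next state; event names are hypothetical labels.
TRANSITIONS = {
    (State.INITIALIZATION, "steady_state"): State.SUPERVISION,
    (State.SUPERVISION, "alarm"): State.ALARM_DETECTION,
    (State.SUPERVISION, "supervisory_sensor_fault"): State.SUPERVISORY_SENSOR_LOCALIZATION,
    (State.SUPERVISION, "individual_sensor_fault"): State.INDIVIDUAL_SENSOR_LOCALIZATION,
    (State.SUPERVISION, "sm_fault"): State.SM_FAULT_LOCALIZATION,
    (State.ALARM_DETECTION, "individual_sensor_fault"): State.INDIVIDUAL_SENSOR_LOCALIZATION,
    (State.ALARM_DETECTION, "sm_fault"): State.SM_FAULT_LOCALIZATION,
    (State.SUPERVISORY_SENSOR_LOCALIZATION, "localized"): State.SUPERVISION,
    (State.INDIVIDUAL_SENSOR_LOCALIZATION, "localized"): State.SUPERVISION,
    (State.SM_FAULT_LOCALIZATION, "localized"): State.SM_DEACTIVATION,
    (State.SM_DEACTIVATION, "balancing_period"): State.SUPERVISION,
}


def next_state(state, event, arm_sensor_failed=False):
    """Return the next state; unknown (state, event) pairs keep the state."""
    if (state is State.SUPERVISION and arm_sensor_failed
            and event in FAULT_EVENTS):
        # Without the arm sensor the kind of fault cannot be identified.
        return State.FAULT_DETECTION_INABILITY
    return TRANSITIONS.get((state, event), state)
```

The Fault Detection Inability state has no outgoing entry in the table, which mirrors the text: it is left only after the faulty sensor is fixed and the system is restarted.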
A. Supervisory Sensor Fault Localization
When a fault in a supervisory sensor is detected, the system changes to State 3 and looks for the faulty supervisory sensor. The localization method consists of comparing each supervisory set sensor with the supervisory arm sensor when their values should be the same, that is, when all the activated SMs are in the compared set and all the SMs of the other sets are deactivated. Under proper operation of the voltage sensors, the supervisory arm sensor and the supervisory set sensor should provide the same voltage value. However, since an error has been detected, one or more of the comparisons will be different and the faulty sensor will be detected. If only one of the comparisons is different, the compared supervisory set sensor is the faulty one. On the contrary, if all the comparisons are different, the faulty sensor is the supervisory arm sensor.
Activation or deactivation of all the SMs in a set is performed through the enforced activation method. In order to deactivate all the SMs in a set, the number of on-state SMs in the arm has to be equal to or lower than the number of SMs in a set. Since this situation is only available during half a period, the enforced activation method will not be activated until the number of activated SMs is low enough, that is, until the modulation signal is positive for the upper arm or negative for the lower arm. This limitation reduces the time during which the enforcement is active and, therefore, the capacitor voltage imbalances.
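The decision rule of this subsection can be written as a short Python sketch. It assumes that one pair of readings has already been collected per set while only that set had activated SMs; the function name, the tolerance argument, and the returned labels are hypothetical.

```python
def locate_supervisory_fault(v_sets, v_arm_isolated, tolerance):
    """Identify the faulty supervisory sensor of one arm.

    v_sets[x]         : set sensor x, read while only set x has activated SMs
    v_arm_isolated[x] : arm sensor, read at the same instant
    Returns ("set", x), ("arm", None), ("undetermined", None) or None.
    """
    mismatches = [x for x, (v_s, v_a) in enumerate(zip(v_sets, v_arm_isolated))
                  if abs(v_s - v_a) > tolerance]
    if not mismatches:
        return None                      # every comparison agrees
    if len(mismatches) == len(v_sets):
        return ("arm", None)             # all comparisons differ
    if len(mismatches) == 1:
        return ("set", mismatches[0])    # a single comparison differs
    return ("undetermined", None)        # case not covered by the text
```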
B. Sensor and SM Fault Localization
The localization methods for sensor faults and SM faults are very similar. In fact, they use the same pattern, but the analyzed variables and the applied solutions are different for each localization method.
The localization method is based on looking for a change in the value of the error when the SM where the fault is located changes its state. As an error has been detected just before activating the localization method, the faulty SM is in the state that causes the error to be larger than the threshold value when the localization process starts. When the error disappears, the last changing SM corresponds to the one with the fault. If the fault has been detected from an alarm or an alarm appears during the localization process, the origin of the alarm is automatically assigned as the faulty SM. A flowchart of the localization method algorithm is depicted in Fig. 5 .
In order to accelerate the process, the enforced activation method is used. Taking into account the dynamics of each fault, the SMs in the set that are in the faulty state (activated or deactivated) are forced to change. Each time one of the targeted SMs changes its state without changing the error significance, that SM is eliminated from the vector of SMs to be forced.
The differences between the localization methods are mainly the checked error and the forced change of state. In the sensor fault localization method, the expected error $e_{\mathrm{exp}\,j(x)}$ is the variable that is tracked for a change, and the activated SMs are forced to be deactivated.
The same changes are searched for in the upper switch OC localization method, but due to the fault dynamics, the error change is checked only when the current is negative. Also, the SMs are eliminated from the enforcing list only when they change with negative current. The dynamic characteristics of OC faults are explained in Section IV-C.
Lower switch OC faults are also localized from a change in the expected error, which is detected when a faulty SM is deactivated. Therefore, the SMs are forced to be activated. Error change detection is only validated when the current is positive.
Finally, SC faults are localized through a change in the theoretical error $e_{tj(x)}$ in both current directions. Due to the rapid discharge of the SMs, SC faults are often localized by an alarm indicator.
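The common pattern shared by these localization methods is an elimination loop, sketched below in Python. The event tuple format and the names are hypothetical; the `valid` flag stands for the fault-specific condition on the arm current direction (for example, negative current for upper switch OC faults, any direction for SC faults).

```python
def localize(events, candidates):
    """Return the index of the faulty SM, or None if it is not found.

    events     : iterable of (sm_index, error_present, valid) tuples, one per
                 state change of an SM; error_present is the error flag
                 observed right after the change
    candidates : SMs of the set that start in the faulty state
    """
    remaining = set(candidates)
    for sm, error_present, valid in events:
        if sm not in remaining or not valid:
            continue  # not a forced SM, or wrong current direction
        if not error_present:
            # The error vanished: the last SM that changed is the faulty one.
            return sm
        # The error persists: this SM is healthy, stop forcing it.
        remaining.discard(sm)
    return None
```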
C. OC Fault Dynamics
OC faults can be detected for only one direction of the arm current. The upper switch OC faults appear only when the current is negative, and lower switch OC faults appear with positive current. This fact reduces the opportunities to localize the fault, as a change in the error variable means that the faulty SM has been deactivated only if the change is produced with the correct current direction.
Moreover, OC faults modify the internal dynamics of the arm currents. In upper switch OC faults (which are detected when the arm current is negative), the fault reduces the voltage generated by the arm. The change in the applied voltage increases the voltage in the arm inductors and the arm current. If the power flows from the dc side to the ac side of the MMC, the arm current is mostly positive. Consequently, the increase in the arm current can change the direction of the arm current from negative to positive. This change in the arm current to positive causes the effects of the fault to disappear, provoking an oscillating dynamic around zero during the negative part of the arm current.
The oscillating dynamics make localizing OC faults very difficult, as the expected error continuously appears and disappears. With the purpose of reducing the oscillating dynamics and detecting the upper switch OC fault, the circulating current reference is modified, forcing a negative arm current. This modification consists of adding a gain to the circulating current reference when it is negative and, hence, increasing the negative differential control signal.
Lower switch OC faults have an opposite effect, reducing the arm current when it is positive. If the power flows from the dc side to the ac side of the converter, the positive part of the arm current is large enough to avoid an oscillating dynamic. However, if the power flows from the ac side to the dc side, an oscillating dynamic may also appear. Therefore, under this kind of fault, the current reference is modified in the opposite direction, forcing a positive arm current. It should be highlighted that the circulating current reference modification is only applied during OC fault localization algorithms.
V. FAULT CORRECTION METHODS
The proposed fault-tolerant system not only detects and localizes the faulty components, but it also reconfigures the converter's operation in order to maintain a proper performance.
When a supervisory set sensor fails, it can simply be disabled and its value will be measured indirectly. By subtracting the voltage of all the correct supervisory set sensors from the supervisory arm sensor, the value of the faulty sensor is obtained. A similar procedure is performed for the supervisory arm sensor, whose value is calculated as the sum of all the supervisory set sensors. However, as explained in Section IV, after a supervisory arm sensor fault the system loses the ability to identify faults not detected by an alarm indicator.
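Both substitutions follow directly from the relation between the arm voltage and the set voltages. A minimal Python sketch, with hypothetical function names, is:

```python
def substitute_set_sensor(v_arm, v_sets, faulty):
    """Rebuild the reading of set sensor `faulty` from the healthy sensors."""
    healthy_sum = sum(v for x, v in enumerate(v_sets) if x != faulty)
    return v_arm - healthy_sum


def substitute_arm_sensor(v_sets):
    """Rebuild the arm sensor reading as the sum of the set sensors."""
    return sum(v_sets)
```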
The supervisory set sensors are also used for correcting faults in individual sensors. The measurement of the faulty sensor is substituted by the output of a robust estimation algorithm [30]. This algorithm calculates the evolution of the capacitor voltages from the values of the switching states and the measured values of the arm current. Moreover, the estimation is periodically corrected with the actual value of the capacitor voltage, which is measured through the supervisory set sensor when the estimated SM is the only activated one in the set.
SM faults are corrected by simply disabling and short-circuiting the SM. Disablement is performed by an algorithm similar to that of the enforced activation method. The voltage introduced to the voltage balancing algorithm is modified by making the faulty SM the one with the lowest priority. Since the technique of "active" redundant SMs [21] is used, the system can maintain its performance without using the faulty SM anymore. Moreover, the faulty SM is short-circuited externally by a switching device that is integrated with a contactor and a thyristor in order to ensure its deactivation [8].
VI. RELIABILITY ANALYSIS
The reliability improvement achieved with the proposed technique can be demonstrated numerically. Assuming different failure rate values for the switching devices and the voltage sensors, the total failure rate of one arm of the MMC has been calculated. The failure rate of the whole SM (λ_SM) has been assumed to be 100 failures per year and 10^5 hours of operation. The failure rates of the individual sensors (λ_I), supervisory set sensors (λ_S), and supervisory arm sensors (λ_A) are assumed to be 10, 15, and 20 failures per year and 10^5 hours, respectively. The supervisory sensors have a higher failure rate due to their higher nominal voltage requirements.
The total failure rate of a system [15], [31], [32] is calculated through the combination of the individual reliability functions R(t). Once the total reliability function is obtained, the mean time to failure (MTTF) is calculated, and the failure rate is obtained as the inverse of the MTTF. The main equations are as follows:
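In their standard textbook form, these relations can be written as below. This is a sketch that assumes each device has a constant failure rate λ (exponential reliability), which the text does not state explicitly; the symbol λ_T for the total failure rate is introduced here only for illustration.

```latex
R(t) = e^{-\lambda t}, \qquad
\mathrm{MTTF} = \int_{0}^{\infty} R(t)\,dt, \qquad
\lambda_T = \frac{1}{\mathrm{MTTF}}
```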
In this paper, three MMC configurations are considered and compared:
1) A basic MMC without any fault detection system (R_Basic(t)). Since this topology is not able to detect and localize faulty devices, the faulty SMs cannot be disabled. Therefore, the whole converter fails after any single fault of an SM or of an individual sensor. The reliability of this configuration is calculated as the product of the reliability functions of all the devices, which is equivalent to adding their failure rates.
2) An MMC with a fault-tolerant system based on estimators [26]-[28] (R_Est(t)). This configuration is able to detect, localize, and bypass up to M faulty SMs, but not individual sensor faults. Therefore, the system fails after M + 1 SM faults or after a single sensor fault. Equations for calculating the reliability of a system with some redundancy are obtained from [15], [31]. The reliability function of all the SMs is multiplied by the reliability function of the sensors, because a single sensor fault causes failure of the entire converter.
3) An MMC with the proposed fault-tolerant system, based on additional sensors (R_Add(t)). The proposed system can tolerate M faulty SMs and the failure of all the individual sensors. However, it loses its fault-tolerant capability after a fault of the supervisory arm sensor or faults of two supervisory set sensors. For the sake of simplicity, these conditions are considered as a failure of the entire system. Therefore, the total reliability function is calculated as the product of the reliability of all the SMs, the reliability of the supervisory set sensors (which can tolerate one fault), and the reliability of the supervisory arm sensor.
The results in Table II demonstrate that the proposed technique greatly reduces the failure rate of the converter. The improvement stems mainly from the tolerance to SM faults, since the estimator-based fault-tolerant MMC also presents a low failure rate. However, tolerance to sensor faults further increases the reliability. The reliability improvement of the proposed technique becomes more significant as the number of SMs increases, since the reliability of the proposed system is independent of the number of individual sensors.
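The three reliability models described above can be compared with a short numerical sketch. It rests on assumptions that are not taken from the paper's equations: constant failure rates (exponential reliability), a binomial model for a group of identical units that tolerates a limited number of faults, N SMs and N individual sensors in the basic MMC, and N + M SMs and individual sensors in the two fault-tolerant configurations. All function names are hypothetical, time is expressed in whatever unit matches the failure rates, and the output is not meant to reproduce Table II.

```python
from math import comb, exp


def tolerant_group(r, n, max_faults):
    """Reliability of n identical units when up to max_faults of them may fail.

    r is the reliability of a single unit at the instant of interest.
    """
    return sum(comb(n, i) * r ** (n - i) * (1.0 - r) ** i
               for i in range(max_faults + 1))


def failure_rate(reliability, horizon, steps=20_000):
    """Return 1 / MTTF, with MTTF approximated by the trapezoidal rule on [0, horizon].

    horizon must be long enough for reliability(t) to have decayed to about zero.
    """
    dt = horizon / steps
    area = 0.5 * (reliability(0.0) + reliability(horizon))
    area += sum(reliability(i * dt) for i in range(1, steps))
    return 1.0 / (area * dt)


def arm_failure_rates(n, m, n_s, lam_sm, lam_i, lam_s, lam_a, horizon):
    """Failure rate of one arm for the three configurations (assumed models)."""

    def r_basic(t):
        # Assumption: n SMs and n individual sensors in series, no redundancy.
        return exp(-n * (lam_sm + lam_i) * t)

    def r_sms(t):
        # n + m SMs, of which up to m may fail.
        return tolerant_group(exp(-lam_sm * t), n + m, m)

    def r_est(t):
        # Any individual sensor fault is fatal (n + m sensors in series).
        return r_sms(t) * exp(-(n + m) * lam_i * t)

    def r_add(t):
        # One supervisory set sensor fault is tolerated; the arm sensor is in series.
        return r_sms(t) * tolerant_group(exp(-lam_s * t), n_s, 1) * exp(-lam_a * t)

    models = (("basic", r_basic), ("est", r_est), ("add", r_add))
    return {name: failure_rate(r, horizon) for name, r in models}


rates = arm_failure_rates(n=7, m=1, n_s=2,
                          lam_sm=100.0, lam_i=10.0, lam_s=15.0, lam_a=20.0,
                          horizon=0.1)
print(rates)
```

With the prototype values N = 7, M = 1, and n_S = 2 and the failure rates assumed in this section, the sketch ranks the failure rates as basic > estimator-based > proposed. The absolute numbers depend on the modeling assumptions listed above.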
VII. EXPERIMENTAL RESULTS
The proposed fault-tolerant topology and detection method have been implemented and tested in a low-power laboratory prototype. It consists of a single-phase MMC connected to an RL load. Each arm is composed of eight SMs: seven basic ones (N = 7) and an additional one (M = 1). Each arm has been divided into two supervisory sets (n_S = 2), each of which supervises four SMs (k = 4). All tests have been performed with a modulation index m_a = 0.7.
The prototype has been implemented using silicon-carbide (SiC) technology, with MOSFET devices CREE CMF20120D and Schottky diodes CREE C4D10120D. The main control and acquisition tasks are implemented on a dSPACE DS1103 platform using ControlDesk software. A picture of the experimental prototype is presented in Fig. 6, and the main data of the prototype are given in Table III.
In order to demonstrate the effectiveness of the proposed technique, the system response has been tested for almost all the considered sensor and SM faults.
A. Supervisory Set Sensor Fault
The first fault is tested in an upper arm supervisory set sensor, v S u (1). The fault is produced by opening a relay that is in series with the sensor. Fig. 7 shows the voltages of the upper arm supervisory set sensors when a fault appears. The voltage v S u (1) drops to zero at time t = 0.04 s, but when the fault is localized shortly afterwards, the measured value is substituted by the calculated value v * S u (1). Fig. 8 depicts the localization process in detail. Fig. 8(a) depicts the supervisory set sensor voltages and the supervisory arm sensor voltage. The fault is detected immediately after it appears, as the sum of the supervisory set sensor voltages no longer equals the supervisory arm sensor voltage. Then, the supervisory sensor fault localization process (State 3) is initiated. The instants when each of the supervisory set sensors and the supervisory arm sensor are compared can be seen in Fig. 8(a). The system state is depicted in Fig. 8(b).
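The detection condition used in this test, a mismatch between the sum of the supervisory set sensor voltages and the supervisory arm sensor voltage, can be sketched as follows. The tolerance parameter is an assumption added here to absorb measurement noise, and the function name and voltages are hypothetical.

```python
def supervisory_fault_detected(arm_voltage, set_voltages, tolerance):
    """Return True when the set sensors no longer add up to the arm sensor reading."""
    return abs(arm_voltage - sum(set_voltages)) > tolerance


# Hypothetical readings before and after one set sensor drops to zero.
print(supervisory_fault_detected(400.0, [200.0, 199.0], tolerance=5.0))
print(supervisory_fault_detected(400.0, [0.0, 199.0], tolerance=5.0))
```

This check only signals that one of the supervisory sensors is faulty; identifying which one requires the localization process of State 3.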
B. Individual Sensor Fault
Figs. 9 and 10 show the experimental results obtained when a fault is forced in an individual sensor. Fig. 9(a) depicts the measured capacitor voltages when a fault appears on sensor v C u (2) at time t = 0.01 s. The fault is rapidly detected and the estimation algorithm substitutes the individual sensor. In Fig. 9(b), it can be seen that the estimated voltage remains close to the other capacitor voltages. To provide a reference value, the voltage v C u (3) is depicted in the same figure. Fig. 10(a) shows the expected and theoretical errors in detail, e exp u (1) and e tu (1). As can be observed, the expected error has a high positive value when the SM is activated, as the measured value is higher than the calculated one. Conversely, the theoretical error remains at a low value. Fig. 10(b) shows the system state: State 2 is activated first in order to identify the origin of the alarm indicator, followed by State 4 to locate and substitute the faulty sensor.
C. OC Fault
The OC SM faults have been tested only for the upper switch. The implemented prototype does not include switching devices to bypass the faulty SMs. Therefore, the SMs can be deactivated, but not externally short-circuited. This fact prevents testing lower switch OC faults, since they cannot be corrected.
Similar to the sensor faults, the OC fault is tested by opening a relay connected in series with the MOSFET device. Results are shown in Figs. 11, 12, and 13. Fig. 11(a) shows the SM capacitor voltages when a fault appears in the SM u (1). The fault is detected at time t = 0.04 s, when the faulty SM is activated with negative current. Fig. 11(b) shows the output voltage of the converter, which becomes slightly distorted between the fault appearance and its correction. Due to the existence of additional SMs, the output voltage is not modified when disabling the faulty SM. The dynamics of the upper arm current can be seen in Fig. 11(c), where the current becomes zero when the fault appears. Due to the modification in the circulating current reference, the current is forced to be negative and the fault is located about 2 ms after its appearance. A detail of the circulating current modification is depicted in Fig. 12.
The details of the fault localization process are depicted in Fig. 13. The expected and theoretical errors (both of which have the same values) are depicted in Fig. 13(a), and the state of the system is depicted in Fig. 13(b). The system changes from State 1 (fault detection) to State 5 (upper switch OC fault localization) at time t = 0.04 s and then changes to State 8 (SM deactivation) at time t = 0.0418 s.
D. SC Fault
SC faults have been tested in both the upper and lower switches. In order to limit the peak current during the tests, SCs have been emulated by connecting a low-value resistance in parallel with the switch. In this paper, a 5-Ω resistor has been used. Fig. 14(a) shows the SM capacitor voltages during an upper switch SC fault. It can be seen that the capacitor of the faulty SM discharges before the fault detection, as the SM is deactivated. When the SM is activated, the fault is detected and immediately located. Then, as the SM is deactivated, its capacitor continues discharging. Fig. 14(b) shows the output and arm currents, in which a distortion appears when the fault is detected. The distortion appears only in the circulating current, without affecting the output current.
The details of the fault detection and localization processes are depicted in Fig. 15. The theoretical and expected errors are shown in Fig. 15(a). It can be seen that the expected error remains near zero while the theoretical error changes. The theoretical error increases before fault detection due to the variation of the average voltage v C avg, but it does not exceed the threshold value. However, when the SM is activated, the theoretical error becomes negative and exceeds the threshold value, whereby the fault is detected. The state of the system is depicted in Fig. 15(b), which shows that the fault is localized (State 7) immediately after its detection, and the system then changes to the SM deactivation state (State 8).
SC faults have also been tested in the lower switch of an SM. Fig. 16 depicts the SM capacitor voltage when an SC fault appears on the lower switch of the SM u (1). This figure shows how the voltage of the faulty SM drops until the fault is detected. Once the fault is corrected, the voltage of the SM capacitor remains at the same value.
VIII. CONCLUSION
In this paper, a strategy for detecting, localizing, and correcting SM and sensor faults in MMCs has been presented. The use of a few external sensors provides fault detection capability and voltage sensor redundancy. Moreover, the use of additional SMs allows the faulty ones to be substituted easily. The detection technique is based on measuring the voltage provided by a set of SMs and comparing it with a calculated reference value. The localization method is based on forcing the deactivation of the suspected SMs until the fault disappears. This method provides robust and fast responses to both SM and sensor faults with minor additional costs. The experimental results demonstrate the effectiveness of the proposed technique in detecting and correcting all considered faults in less than 5 ms, which is much faster than other methods that can be found in the literature.
