Fault Tolerant Power Systems by Nesgaard, Carsten
Fault Tolerant Power Systems
Nesgaard, Carsten; Andersen, Michael A. E.
Publication date: 2004
Document version: Publisher's PDF, also known as Version of record
Citation (APA): Nesgaard, C., & Andersen, M. A. E. (2004). Fault Tolerant Power Systems.
CARSTEN NESGAARD
Fault Tolerant Power Systems
Ph.D. thesis
AUTOMATION
ØrstedDTU
JANUARY 2004
Table of contents 
 
 
1 Introduction
2 Fault tolerance
2.1 State of the art techniques
2.2 Definition
2.3 References
3 Concept clarification and point of origin
3.1 Failure Modes Effects and Criticality Analysis (FMECA)
3.2 Derating
3.3 Statistical distributions and methods
3.4 Redundancy
3.5 Standards
3.6 Research based on applicable evaluation techniques
3.7 References
4 Array-based redundancy
4.1 Redundancy
4.2 Introduction
4.3 System realization
4.4 Array-based control
4.5 Reliability assessment
4.6 Discussion and Summary
4.7 References
5 Digital control of DC-DC converters
5.1 Introduction
5.2 Specifications for digital converter implementation
5.3 Improved design
5.4 Digital converter control reliability
5.5 Experimental verification
5.6 Discussion and summary
5.7 References
6 Load sharing
6.1 Introduction
6.2 Current-based load sharing
6.3 Temperature-based load sharing
6.4 Reliability assessment of the two techniques
6.5 Summary of the theoretical evaluation
6.6 Specifications for the laboratory test setup
6.7 Experimental results
6.8 Theoretical system evaluation
6.9 Discussion and summary
6.10 Patent rights?
6.11 References
7 Thermal droop load sharing
7.1 Introduction
7.2 Specifications for the laboratory implementation
7.3 Experimental results
7.4 Thermal droop load sharing reliability
7.5 Discussion and summary
7.6 References
8 Partners for Advanced Transit and Highways
8.1 Introduction
8.2 PATH mission and advanced transit
8.3 Precision Docking Project
8.4 Discussion and summary
8.5 References
9 Conclusion
10 References (Sorted alphabetically with chapter index)
 
 
 
Appendices 
 
 
A1  List of abbreviations
A2  List of variables
A3  CD contents
A4  Complete connection matrix
A5  Complete schematics
A6  Publications
 
 
 
 
 
 
List of figures 
 
 
Figure 1 : Project diagram
Figure 2 : ‘State of the art techniques’ database
Figure 3 : Successive FMECA procedure flowchart
Figure 4 : Stress vs. strength relationship for electronic components
Figure 5 : Derating effects on resistor failure rate (a) and capacitor failure rate (b)
Figure 6 : Density function in which Q(t) and R(t) are illustrated
Figure 7 : Network reduction example
Figure 8 : Result of network reduction
Figure 9 : Power requirement in percent of total system power capability
Figure 10 : Basic block identification
Figure 11 : Parallel-connection of two blocks might increase the reliability
Figure 12 : Plot of (3-23) with fault ratio as a variable
Figure 13 : Direct link between prior art, project diagram and research work
Figure 14 : 3-dimensional redundant system
Figure 15 : Initial implementation concept
Figure 16 : Averaged and normalized fault distribution for 5 highly reliable space converters
Figure 17 : Detailed fault assessment of PWM and voltage regulation
Figure 18 : Board level system realization
Figure 19 : Block interconnection
Figure 20 : Power system block identification
Figure 21 : Operating point movement after fault occurrence
Figure 22 : Alternative use of power system redundancy
Figure 23 : Possible switch combinations
Figure 24 : Overall system structure
Figure 25 : 8 faults distributed among all 5 converters
Figure 26 : Traditional procedure for establishing maximum number of working converters
Figure 27 : Dataflow diagram for step 1
Figure 28 : Result array
Figure 29 : Real-world implementation of proposed converter topology
Figure 30 : Cross section of Figure 22
Figure 31 : Partial connection matrix for illustration purposes
Figure 32 : Probability of system survival
Figure 33 : PIC 16F877 microcontroller developer kit
Figure 34 : Mixed analog/digital system
Figure 35 : Basic digital converter system
Figure 36 : Digitally controlled buck converter
Figure 37 : First prototype of the digitally controlled converter
Figure 38 : Control system
Figure 39 : Converter and sensing resistor network
Figure 40 : Alternative representation of a buck converter
Figure 41 : Control software dataflow diagram
Figure 42 : Relation between temperature and output current in PWM mode
Figure 43 : Example of parts redundancy
Figure 44 : Inside look at the improved converter
Figure 45 : Temperature distribution used for reliability assessment
Figure 46 : Failure rate vs. temperature
Figure 47 : Survivability as a function of temperature
Figure 48 : Zoomed view of survivability as a function of temperature
Figure 49 : Percent-wise decrease in overall failure rate as a result of analytical redundancy
Figure 50 : Duty cycle in PS mode
Figure 51 : Duty cycle in PWM mode
Figure 52 : Inductor current in PS mode
Figure 53 : Inductor current in PWM mode
Figure 54 : Converter input voltage in PS mode
Figure 55 : Converter input voltage in PWM mode
Figure 56 : Converter output voltage in PS mode
Figure 57 : Converter output voltage in PWM mode
Figure 58 : Gain/phase plot of the converter operated in PS mode
Figure 59 : Gain/phase plot of the converter operated in PWM mode
Figure 60 : Efficiency vs. output current
Figure 61 : Enhanced view of the built-in efficiency hysteresis
Figure 62 : Microcontroller power consumption
Figure 63 : Power system configuration
Figure 64 : Load sharing by means of current sharing
Figure 65 : Current sharing implementation and controller current waveform
Figure 66 : MOSFET RDS(ON) temperature dependency
Figure 67 : Power dissipation caused by convection
Figure 68 : Power dissipation caused by radiation
Figure 69 : Thermal system equivalent
Figure 70 : Load sharing by means of thermal reliability management
Figure 71 : Control parameters at feedback error amplifier
Figure 72 : Temperature sensor mounting
Figure 73 : Power system startup and steady state operation
Figure 74 : Thermal load sharing implementation and signal waveform
Figure 75 : System temperature distribution
Figure 76 : Test setup block diagram
Figure 77 : Simplified power system schematic
Figure 78 : Load sharing schematic for a single converter
Figure 79 : Real-world test setup of two identical buck converters
Figure 80 : Close up view of converter 2
Figure 81 : Ideal redundancy implementation
Figure 82 : Non-Ideal redundancy implementation
Figure 83 : Real-world implementation
Figure 84 : Individual converter current distribution
Figure 85 : Duty cycles of the two converters
Figure 86 : Current sharing
Figure 87 : Temperature measurement of the two switching MOSFET transistors
Figure 88 : Temperature measurements of the two freewheeling diodes
Figure 89 : Initial current distribution of the thermal load sharing technique
Figure 90 : Initial power system efficiency measurement of the thermal load sharing technique
Figure 91 : Power system efficiency of current sharing and semi-droop
Figure 92 : Current distribution while operated by thermal load sharing
Figure 93 : System efficiency for the three different techniques
Figure 94 : Common output voltage
Figure 95 : Output voltage ripple
Figure 96 : Power system input DC voltage
Figure 97 : Power system input AC voltage
Figure 98 : Gain/phase plot of converter 1
Figure 99 : Gain/phase plot of converter 2
Figure 100 : Conduction losses vs. output current
Figure 101 : Switching losses as a function of output current
Figure 102 : Simplified temperature distribution
Figure 103 : Average system temperature
Figure 104 : Simple droop load sharing
Figure 105 : Ideal droop output voltage
Figure 106 : Off-the-shelf converter from CALEX (a) and pin-out configuration (b)
Figure 107 : Basic feedback network
Figure 108 : Feedback voltage
Figure 109 : Output voltage resulting from the feedback voltage shown in Figure 108
Figure 110 : Modified feedback network
Figure 111 : Output characteristic of modified feedback network
Figure 112 : Laboratory test configuration
Figure 113 : Thermal load sharing circuit, droop resistor and OR’ing diode
Figure 114 : Simplified converter feedback and resulting waveforms
Figure 115 : Test setup
Figure 116 : Individual converter voltage droop vs. output current
Figure 117 : Individual converter temperature vs. output current
Figure 118 : Individual converter current sharing
Figure 119 : Individual converter voltage droop vs. output current
Figure 120 : Individual converter current sharing during thermal load sharing
Figure 121 : Average system temperature vs. output current
Figure 122 : Overall system efficiency
Figure 123 : Common power system output voltage at 1A (discontinuous conduction mode)
Figure 124 : Common power system output voltage at 5A (continuous conduction mode)
Figure 125 : Output voltage glitch during a single converter failure
Figure 126 : Converter 3 OR’ing diode voltage drop at 4A
Figure 127 : Converter 2 feedback voltage at 40°C
Figure 128 : Output voltage interval due to component tolerances
Figure 129 : Droop resistor temperature increase above ambient as a function of current
Figure 130 : System unavailability vs. time in years
Figure 131 : Vehicle guidance using magnetometers and magnetic markers
Figure 132 : Magnetic marker and magnetometer
Figure 133 : Power system criticality analysis
Figure 134 : Individual converter realization
Figure 135 : Probability of system survival as a function of time
Figure 136 : Enhanced view of circled time interval shown in Figure 135
Figure 137 : Droop load sharing with over voltage protection circuit
Figure 138 : Waveforms during normal operation
Figure 139 : Waveforms during abnormal operation
Figure 140 : Enhanced view of the on_off voltage during abnormal operation
 
 
List of tables 
 
 
Table 1 : Systematic FMECA information representation
Table 2 : Criticality rating for precision docking project
Table 3 : NASA derating guidelines
Table 4 : List of distributions commonly used in reliability engineering
Table 5 : Summary of applicable analytic expressions
Table 6 : List of transformers and operations used throughout this chapter
Table 7 : Core features of PIC 16F877 microcontroller
Table 8 : Simplified failure modes effects analysis for sensing resistors (initial design)
Table 9 : Simplified failure modes effects analysis for sensing resistors (improved design)
Table 10 : Buck topological matrix
Table 11 : Digital vs. analog control
Table 12 : Two-converter system performance
Table 13 : Summary of advantages and drawbacks of the three load sharing techniques
Table 14 : Reliability data for Texas Instruments controller IC’s
Table 15 : Outer limits for output voltage variation due to component tolerances
Table 16 : Reliability data for comparing the two techniques
Table 17 : Functional Failure Mode Effects and Criticality Analysis for the overall system
Table 18 : Failure modes relating to converter over voltage
 
 
 
 
 
 
 
Preface 
 
 
This Ph.D. thesis documents the research work performed on reliability enhancement techniques in 
partial fulfillment of the requirements for obtaining the Ph.D. degree from the Technical University 
of Denmark. 

 The project has been financed by a scholarship from the Technical University of Denmark. 
Additional expenses, such as seminar and conference attendance, have been financed by Alcatel Space 
Denmark. 
 In addition to being a research project at the Technical University of Denmark, the project 
comprises several major elements, including research and design work at Alcatel Space Denmark, 
a 12-month stay at the University of California at Berkeley, research work for Partners for Advanced 
Transit and Highways (PATH) and frequent visits to International Rectifier’s Hi-Rel facility in Santa 
Clara, CA. 
 
I would like to express my sincerest appreciation for the contributions made by the following 
people/companies: 
 
• My advisor at the Technical University of Denmark, Professor Michael A. E. Andersen for his 
continued support, many comments and design ideas.  
 
• Alcatel Space Denmark for their financial support as well as allowing me to use their facility 
in Ballerup for my research work. Furthermore, I would like to thank Senior Designer Henrik 
Møller from Alcatel Space Denmark for his continued support and help throughout the 
project.   
 
• My advisor at University of California at Berkeley, Professor Seth Sanders for his support 
during my entire stay at UC Berkeley, for the many project meetings and for his help with the 
PATH research work. 
 
• My advisor at the Richmond Field Station - PATH, Program Manager Wei-Bin Zhang for the 
many long and enlightening conversations throughout the project. 
 
• International Rectifier for allowing me to use their facility for research purposes. Without this 
facility the accurate converter temperature measurements could not have been performed. In 
addition, a special thanks to Director of Design Engineering Richard Wallstrom and Senior 
Designer Arturo Arroyo both from IR, Santa Clara for all their help.  
 
• CALEX Mfg. Co., Inc. for sponsoring high quality converters for testing the thermal droop 
load sharing techniques.  
 
• My grandparents, Willy and Esther Madsen, for their financial support. This support made my 
life a lot easier especially for the duration of my stay in the United States, where living 
expenses are a bit higher than in Denmark.  
 
• My father, Jørgen Nesgaard, for his assistance with the statistical analysis as well as the initial 
concept description of the array-based redundancy approach. 
 
• Last, but certainly not least I would like to express my gratitude to my beloved wife, Tina. 
Her patience, forgiveness and support throughout this project have been incredible. 
Abstract 
 
 
 This Ph.D. thesis documents the research work performed in partial fulfillment of the 
requirements for obtaining the Ph.D. degree from the Technical University of Denmark. Among the 
additional requirements for obtaining the Ph.D. degree are teaching activities and course participation, 
both of which have been fulfilled.  
 The topic of this project is fault tolerant power systems although a more universal classification 
would be reliability enhancement techniques for power electronic systems. The research originated 
with a state of the art examination in order to establish a foundation on which the techniques could 
be based. To assist in this examination a database was created. In addition to the database a solid 
theoretical foundation for reliability assessment is established. Both of these topics are covered in 
chapter ‘3 Concept clarification and point of origin’.  
 From the theoretical examinations a more practical approach in system design was taken, which 
led to the array-based redundancy concept presented in chapter ‘4 Array-based redundancy’. 
Although this work is mainly theoretical, suggestions for real-world power system implementations 
are presented and discussed.  
 Based on the promising results from the array-based redundancy, the concept of digital control in 
power electronic systems is further examined in chapter ‘5 Digital control of DC-DC converters’, 
where the major elements are timing issues, software execution speed and analytical redundancy. 
The results in this chapter show that the use of low-cost microcontrollers allows for easy, cheap and 
relatively high performing converter implementations.  
 Having explored the capabilities of digital converter control, the focus of the research work is 
turned towards analog system implementations utilizing parallel-connection as a means to achieve 
high reliability. This work is presented in chapter 6 and chapter 7. Chapter ‘6 Load sharing’ focuses 
on the use of dedicated load share controllers and the information needed for optimum system 
reliability. Chapter ‘7 Thermal droop load sharing’ concerns the implementation of power systems 
for high current/low voltage applications. The techniques presented in this chapter provide a simple 
means for parallel-connecting multiple power converters to form a single high-power system. A 
consequence of the proposed techniques is an equalization of the individual converter temperatures, 
which in turn results in improved overall reliability. 
 Finally, the research work performed at University of California, Berkeley is described in chapter 
8. This work included participation in a major project managed by Partners for Advanced Transit and 
Highways. Among other things this work resulted in examination of real-world problems associated 
with operating heavy machinery in urban areas. 
 The overall conclusion of the work presented in this thesis is that several reliability techniques 
apply to modern power systems. Among the techniques considered during the Ph.D. project, several 
result in significant improvements in system reliability. A technique that unfortunately did not 
provide the anticipated reliability increase is the digital control of DC-DC converters. The overall 
failure rate of the digital controller is simply too high to be compensated for by the additional features 
provided by the increase in ‘intelligence’. 
 Due to the rather high volume of data related to this project it has been decided to include all 
information on a CD, including a pdf-version of this thesis. A detailed table of contents for the CD 
can be found in the appendix. 
 
 
Resume (Abstract in Danish, translated into English) 
 
 
 This report documents the work carried out as part of obtaining the Ph.D. degree from the 
Technical University of Denmark (DTU). The Ph.D. study was financed through a DTU scholarship, a 
sponsorship from Alcatel Space Denmark, financial support from my grandparents Esther and Willy 
Madsen, and grants from COWIfonden, Otto Mønsteds Fond and Knud Højgaards Fond. 
 As society's use of electronic systems increases, so does the need for increased reliability of 
these systems. This means that the field of fault tolerance is becoming an increasingly important 
part of any modern design. This forms the basis of the present research project, which examines 
reliability optimizations in DC-DC converters. Many of the results obtained turn out to be entirely 
general, so these results can also be applied in other areas of power electronics. 
 The report takes generally accepted and widely used techniques as its starting point, thereby 
creating a direct link between known techniques and the research results described in this report. 
Based on the known techniques, a theoretical foundation for reliability assessment of electronic 
systems is established. Since several of the research results differ significantly from the known 
techniques, the theoretical considerations have to be adapted accordingly. This is described in 
chapter 3. 
 Chapter 4 applies the adapted theoretical results to a parallel-connected system with built-in 
N+2 redundancy. The way the system operates differs considerably from traditional 
parallel-connected systems, in that the parallel connection is made at several different levels. 
Reliability calculations confirm the improvements that the technique was expected to bring to the 
overall system. Although the technique described in this chapter is mainly theoretical in nature, 
a practical implementation of a test converter is shown.  
 Because of the positive results from chapter 4, it was decided to examine the possibilities of 
digital control of converters. This work is described in chapter 5, where a simple buck converter 
is implemented with a microcontroller as the controlling element. Since digital controllers 
generally allow intelligent responses to suddenly arising situations, analytical redundancy has 
also been implemented in the buck converter. As a result, certain fault situations can be avoided 
entirely, which increases the reliability of the overall system. Reliability assessments show, 
however, that the many transistors contained in the digital controller reduce the reliability of 
the overall system compared with a corresponding analog realization. As the manufacturing process 
for digital circuits improves, the reliability of digitally controlled converters will increase, 
so that digitally controlled converters will at some point be able to replace corresponding analog 
realizations to advantage, owing to the many additional monitoring functions that a digital 
realization makes possible. 
 After the examination of digital converter control, the focus was turned towards the more 
traditional analog realizations of parallel-connected systems. In this connection, the 
possibilities of improving the reliability of parallel-connected systems were examined. The 
results of this examination/parameter evaluation are described in chapter 6 and chapter 7. 
 The main report concludes with a description of the stay abroad at University of California, 
Berkeley. Among other things, this stay provided the opportunity to participate in a large 
research project sponsored by the Californian department of transportation. The project is managed 
by Partners for Advanced Transit and Highways and focuses on long-term solutions to the growing 
traffic problems on the American highway network. The work on this project is described in 
chapter 8. 
 The appendices to the report give a brief description of the topics considered to fall outside 
the central scope of the main report. In addition, the appendices contain a list of the conference 
and journal publications submitted during the research project. 
 
1 Introduction 
 
 With the ever-increasing dependence on electronic systems, the need for highly reliable power 
systems arises. The trend has been especially noticeable in recent years where the popularization of 
digital computer networks and large Internet servers has altered power system requirements in terms 
of efficiency and reliability. Because system downtime causes losses in sales, customer service, 
etc., the financial consequences of a power failure can be quite significant and are often 
the motivation for initiating new research projects. This project titled ‘Fault Tolerant Power Systems’ 
examines the possibilities of enhancing the overall reliability of a given power system and compares 
the findings to the results of more traditional approaches.  
 When considering highly reliable fault tolerant power systems the word ‘redundancy’ comes to 
mind. Indeed, a true fault tolerant power system is comprised of several subsystems working in 
parallel. Although the principle of parallel operation of two or more similar units cannot be said to 
apply in all situations, the general case will show that applying redundancy in some form enhances 
the overall reliability quite considerably.  
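 
 The effect of redundancy can be illustrated with a minimal numerical sketch. It assumes that the 
units are identical and fail independently, which ignores common-cause failures and the failure 
rate of any paralleling circuitry. The function name and the unit reliability of 0.95 are 
hypothetical and are not taken from this thesis. 

```python
from math import comb


def k_out_of_n_reliability(k: int, n: int, unit_reliability: float) -> float:
    """Probability that at least k of n independent, identical units are working."""
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    if not 0.0 <= unit_reliability <= 1.0:
        raise ValueError("unit reliability must lie in [0, 1]")
    r = unit_reliability
    # Sum the binomial probabilities of exactly i working units, i = k..n.
    return sum(comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(k, n + 1))


UNIT_RELIABILITY = 0.95  # hypothetical value, not taken from the thesis

single = k_out_of_n_reliability(1, 1, UNIT_RELIABILITY)  # no redundancy
duplex = k_out_of_n_reliability(1, 2, UNIT_RELIABILITY)  # one of two units suffices
print(f"single unit: {single:.4f}, two units in parallel: {duplex:.4f}")
```

 Under these assumptions the second unit raises the probability of survival from 0.95 to 0.9975. 
The gain shrinks when the independence assumption does not hold. 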
 This thesis approaches the field of fault tolerance from a system point of view by providing a 
conceptual structure of different concepts most often associated with fault tolerance. Furthermore, 
due to the lack of a common understanding of what is covered by the terms used within the field of fault 
tolerance, a brief introduction of terms and abbreviations used in this thesis is provided. 
 Beginning with a discussion of what constitutes ‘state of the art’, a database containing commonly 
used techniques for reliability improvements in power electronic systems is established. From there 
an introduction to the many different topics of modern reliability engineering is provided. Due to 
some rather unconventional power system propositions presented in this thesis the theoretical 
foundation utilized to evaluate traditional power systems had to be modified in order to account for 
the reliability improvements inherent in the new system realizations. This work is presented in 
chapter ‘3 Concept clarification and point of origin’ and serves as the mathematical foundation for 
the reliability calculations presented in subsequent chapters.  
 Next, the thesis provides a chronological description of the research results and thereby 
incorporates a ‘time of conception’ in relation to the overall project timeframe. The first topic 
covered is the array-based redundancy concept. This research originated from the theoretical results 
obtained during modification of the standard statistical equations. Chapter ‘4 Array-based 
redundancy’ presents this work by introducing the basic array-based concept followed by the 
redundancy management implementation in the power system under consideration. Next, a digital 
approach in the implementation of analytic redundancy is presented. This work is a direct spin-off of 
the work on array-based redundancy. As will be described in chapter ‘5 Digital control of DC-DC 
converters’ the digital approach in controlling simple converter topologies provides several key 
advantages, although the overall system reliability is lower than a similar analog implementation. 
One of these advantages is adaptability, meaning that a digitally controlled converter initially 
designed for a 5V, 10A output is easily adapted, by means of a simple software modification, to 
comply with specifications for a 4V output at a maximum current of 5A. However, since the 
electrical parts, especially the inductor, do not change accordingly, this adaptability will inevitably 
increase ripple and/or noise at the output. While this latter example only serves as an illustration of 
the diversity of the digital control, real-world implementations always require quantification of 
compliance parameters. 
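 
 The ripple remark can be made concrete with a small sketch based on the textbook expression for 
the inductor ripple current of an ideal buck converter in continuous conduction. Only the 5V/10A 
and 4V/5A ratings come from the text above. The input voltage, inductance and switching frequency 
are hypothetical values chosen for illustration, and the sketch considers inductor current ripple 
only, not output noise. 

```python
def buck_ripple_current(v_in: float, v_out: float, inductance: float, f_sw: float) -> float:
    """Peak-to-peak inductor ripple of an ideal buck converter in continuous conduction."""
    duty = v_out / v_in
    return v_out * (1.0 - duty) / (inductance * f_sw)


# Hypothetical operating point; only the two output ratings come from the text.
V_IN = 12.0         # input voltage [V]
INDUCTANCE = 10e-6  # inductor value, unchanged by the software modification [H]
F_SW = 200e3        # switching frequency [Hz]

ripple_ratio = {}
for v_out, i_max in ((5.0, 10.0), (4.0, 5.0)):
    ripple = buck_ripple_current(V_IN, v_out, INDUCTANCE, F_SW)
    ripple_ratio[(v_out, i_max)] = ripple / i_max
    print(f"{v_out:.0f} V / {i_max:.0f} A: ripple {ripple:.2f} A, "
          f"{100 * ripple / i_max:.0f} % of rated current")
```

 With these assumed values the absolute ripple changes only slightly, but the ripple relative to 
the rated output current grows considerably, because the inductor was sized for the original 
rating. 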
 The next topic is presented in chapter ‘6 Load sharing’. This chapter redirects the focus from 
digital systems to the traditional parallel-connected redundant power systems. By adopting a new 
load sharing scheme it is verified both theoretically and experimentally that significant reliability 
improvements are achievable by simply adapting the load sharing control to include temperature 
information. This work is then extended to include alternative load sharing techniques including the 
well-known droop technique. The results of a laboratory implementation of a 3-converter system 
utilizing the new droop procedure are described in chapter ‘7 Thermal droop load sharing’. 
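 
 Why temperature information can matter for load sharing is sketched below for two converters 
with different thermal behaviour. The sketch is not the method of chapters 6 and 7. It assumes an 
Arrhenius temperature dependence of the failure rate, a temperature rise that is linear in the 
converter current, and the sum of the failure-rate multipliers as a crude reliability indicator. 
The activation energy, thermal rise figures and load current are hypothetical. 

```python
from math import exp

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant [eV/K]


def arrhenius_factor(temp_c: float, activation_ev: float = 0.7, ref_c: float = 25.0) -> float:
    """Failure-rate multiplier at temp_c relative to ref_c (Arrhenius model)."""
    t_k = temp_c + 273.15
    ref_k = ref_c + 273.15
    return exp(activation_ev / BOLTZMANN_EV_PER_K * (1.0 / ref_k - 1.0 / t_k))


def total_factor(currents, rise_per_amp, ambient_c=25.0):
    """Sum of failure-rate multipliers, assuming temperature rises linearly with current."""
    temps = [ambient_c + k * i for i, k in zip(currents, rise_per_amp)]
    return sum(arrhenius_factor(t) for t in temps), temps


RISE_PER_AMP = (2.0, 4.0)  # hypothetical thermal rise of each converter [deg C per A]
LOAD_A = 10.0              # hypothetical total load current [A]

equal_current = (LOAD_A / 2, LOAD_A / 2)
# Equal temperature requires k1*I1 == k2*I2 with I1 + I2 == LOAD_A.
i1 = LOAD_A * RISE_PER_AMP[1] / sum(RISE_PER_AMP)
equal_temperature = (i1, LOAD_A - i1)

factor_current, temps_current = total_factor(equal_current, RISE_PER_AMP)
factor_thermal, temps_thermal = total_factor(equal_temperature, RISE_PER_AMP)
print(f"equal current:     temperatures {temps_current}, summed factor {factor_current:.2f}")
print(f"equal temperature: temperatures {temps_thermal}, summed factor {factor_thermal:.2f}")
```

 In this simplified model, sharing the load so that both converters run at the same temperature 
gives a lower summed failure-rate multiplier than sharing the current equally. 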
 So far a brief outline of the work performed during this research project has been presented. A 
graphical illustration of the key elements being presented in this thesis can be seen in Figure 1. 
 
[Figure 1: block diagram. Top level: ‘Power system reliability’ (wide-ranging field). 
‘Units/systems’ level: Space PSU, Military PSU, Backup UPS, PSU + backup + generation, Electric 
power supply (power grid), ‘Regular’ PSU. ‘Reliability enhancement techniques’ level: Hardware 
redundancy, ‘Derating’, Fault tolerance, Fault diagnosis, Thermal management.] 
Figure 1 : Project diagram 
 
 Relatively early in the project it became apparent that in order to examine different reliability 
enhancement techniques in relation to switching power systems, the research work would have to 
originate at a higher system level than the project title allowed. In other words, the research 
considered fault tolerance in power systems as one among several techniques applicable in 
enhancing system reliabilities. This is also depicted in Figure 1 where several techniques are 
presented at the same system level – ‘Reliability enhancement techniques’. The systems in which 
these techniques are applicable are introduced at the next higher level – ‘Units/systems’. Although 
implementation in all blocks at this level is possible it has been chosen to focus on the experimental 
verification of different techniques in the rightmost block ‘Regular PSU’. 
 This thesis is structured in a way that should facilitate user-friendly interpretation and maintain 
the center of attention at the research work by, among other things, providing a list of references 
related to the topics covered at the end of each chapter.  
 The research work presented in this thesis has led to the following publications: 
 
• An Array-based Study of Increased System Lifetime Probability, Applied Power Electronics 
Conference and Exposition 2003, Miami, USA 
• Digitally Controlled Converter with Dynamic Change of Control Law and Power Throughput, 
Power Electronics Specialists Conference 2003, Acapulco, Mexico 
• Efficiency Improvement in Redundant Power Systems by Means of Thermal Load Sharing, 
Applied Power Electronics Conference and Exposition 2004, Anaheim, USA 
• Thermal Droop Load Sharing Automates Power System Reliability Optimization, PELS 
Newsletter article, Second Quarter 2004 
• Optimized Load Sharing Control by Means of Thermal Reliability Management, Power 
Electronics Specialists Conference 2004, Aachen, Germany 
• Experimental Verification of the Thermal Droop Load Sharing, Power Electronics Specialists 
Conference 2004, Aachen, Germany 
• Topological Reliability Analysis of Common Front-end DC/DC Converters for Server 
Applications, International Power Electronics Congress 2004, Celaya, Mexico 
• Report on Power System Reliability, 2003 Annual report – Partners for Advanced Transit and 
Highways 
 
A more detailed list of publications can be found in the appendix to this thesis. 
 
 
 
 
2 Fault tolerance 
 
 In spite of designers’ best efforts to remove or even avoid faults, they are bound to happen in any 
operational system. The severity of the fault consequences is often determined by the application at 
hand and should always be characterized in accordance with the end-user’s needs and requirements. 
Non-critical systems require little or no tolerance towards faults whereas critical systems often 
depend on the continuous operation of all system elements and thus require substantial resilience 
towards faults. Accommodating the latter requirement is accomplished by designing systems that 
make use of reliability enhancement techniques for improved fault handling. This ensures continuous 
system operation, although performance might be at a degraded level.  
 As indicated in section ‘1 Introduction’ the field of fault tolerance is associated with a diffuse set 
of terms and abbreviations often causing misunderstandings. This section provides a generalized 
perception of the concept of fault tolerance in power electronic systems as a theoretical topic. In 
subsequent sections these perceptions will then be applied where possible to the physical 
implementation of fault tolerance in the power system prototypes. A summary of the remaining 
terms and abbreviations used to document the results of this research project can be found in ‘List of 
abbreviations’ at the end of this thesis.  
 Fault tolerance as a concept originated within the field of software engineering in the late 1950’s. 
As digital logic evolved and software became a vital tool in designing new and faster systems, the 
need for software programs capable of managing error-situations arose. This marked the beginning 
of fault tolerance in electrical engineering. Modern fault tolerance consists of key elements found in a 
number of topics that nowadays are considered concepts of their own due to the large amount of 
theoretical work carried out in the past few decades.  
 An example of a concept that originated within the field of fault tolerance is ‘Reliability-centered 
Maintenance’. This concept plays a crucial role when optimizing reliability requirements vs. 
economical consequences of stocking spare units in modern large-scale computer systems. As the 
title of the concept indicates, the focal point is maintenance, which necessitates reachability of the 
system under consideration. Thus, space and avionics systems are likely to fall outside this category. 
 Before a detailed description of fault tolerance can be provided a classification of the two terms 
‘failure’ and ‘fault’ must be established. Examining the literature it seems that over time, failure has 
come to be defined in terms of specified service delivered by a system, thus avoiding the use and 
consequently the definition of every-day words such as malfunction, defect and error. A system is 
said to have a failure if the service it delivers to a user is noncompliant with the system 
specifications. A further constraint often added to the definition of failures is the period of time for 
which the system services are noncompliant. The latter constraint indicates that different degrees of 
failures can be expected, since the period of time a system suffers from a failure can be anything 
from a short transient to a permanent loss of service. Indeed, different degrees of fault tolerance 
exist. Depending upon the application and the customer needs, issues concerning the period of time 
for which the system must endure a given failure, should where possible be explicitly stated in the 
system specifications and corrective actions taken accordingly. 
 Turning the attention towards the classification of faults, it has become almost standard practice to 
define these in terms of failures. In doing this, a fault can be defined as failures in neighboring 
systems interacting with the system under consideration. Here the term ‘neighboring systems’ covers 
everything from a complete external subsystem to a component internal to the system under 
consideration. Thus, faults can be considered without the need to establish a direct connection with a 
failure and thereby facilitate the argumentation for natural fault tolerance – the encounter of faults 
not leading to system failure. Recognizing the natural fault tolerance as a special case of a system 
demonstrating resilience towards faults, it should be noted that due to this definition of faults, a base 
for recognizing any fault as a failure has been provided. The only differentiation between the two 
terms is the level of observation. Assume an observer inspects the internals of a system and identifies 
a faulty component. From the observer’s point of view the system breakdown was caused by a 
component failure, whereas from a system point of view it was caused by a fault. 
 Having described the fundamentals of the terms ‘failure’ and ‘fault’, an examination of the 
previously mentioned key elements that make up a modern fault tolerant system results in 
identification of the following three parameters: 
 
Fault isolation: If critical failure-modes cannot be avoided in the design of a given system, it is 
essential that these failure-modes are continuously monitored in order for fault 
tolerance within the system to be maintained. 
 
Fault detection: If a fault is detected within a given system, the proper precautions must be taken 
by either dynamic replacement or redundancy. This prevents the propagation of 
a fault from its origin at one point within the system to a point where it can have 
a critical effect on a process or a user. 
 
Fault prediction: As opposed to the above-mentioned topics that must be an integrated part of the 
fault tolerant system, a system’s ability to predict faults based on continuous 
measurements of key components is a desirable feature that is made possible 
mainly due to advances in digital controllers. System reaction to a deviation 
from normal operation is often characterized as symptom response; hence, the 
term ‘symptom’ describes the observable effects of a fault.  
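 
 As an illustration of fault prediction from continuous measurements, the sketch below fits a 
straight line to recent samples of a monitored quantity and extrapolates the time remaining until 
a limit is reached. This is one simple way a digital controller could respond to a symptom. It is 
not taken from this thesis, and the function name, the sample values and the limit are 
hypothetical. 

```python
def time_to_threshold(samples, sample_period_s, threshold):
    """Extrapolate a least-squares linear trend through the samples.

    Returns the estimated seconds from the last sample until the trend reaches
    the threshold, 0.0 if it is already reached, or None if the trend is not rising.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    times = [i * sample_period_s for i in range(n)]
    t_mean = sum(times) / n
    y_mean = sum(samples) / n
    denom = sum((t - t_mean) ** 2 for t in times)
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, samples)) / denom
    if slope <= 0.0:
        return None  # no rising trend, so no crossing is predicted
    intercept = y_mean - slope * t_mean
    t_cross = (threshold - intercept) / slope
    return max(0.0, t_cross - times[-1])


# Hypothetical component temperature readings [deg C], one every 10 s, limit 70 deg C.
remaining = time_to_threshold([60.0, 61.0, 62.0, 63.0], 10.0, 70.0)
print(f"estimated time until limit: {remaining:.0f} s")
```

 A real implementation would also have to handle measurement noise and non-linear trends, which 
this sketch ignores. 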
 
 In relation to the concept of fault tolerance are the mathematical tools of reliability evaluation and 
safety assessment. These tools, described in section ‘3.3 Statistical distributions and methods’, are 
used extensively throughout this project as they make up the theoretical foundation for the qualitative 
goals contained in fault tolerant systems. 
 
 
 
  
From the above description it can be seen that the interpretation of the term ‘Fault 
Tolerance’ in this thesis deviates from the common interpretation used for instance 
in control engineering. However, the use of fault tolerance is in agreement with the 
common understanding within the field of power electronics, where fault tolerance 
can be anything from a simple isolation diode to a complete monitoring system. In 
other words, the thesis at hand approaches the concept of fault tolerance with 
reliability enhancement in focus. From the end-user’s point of view the ultimate 
goal of fault tolerance implementation is a dramatic increase in overall reliability. 
 
 
2.1 State of the art techniques 
 
 Prior to beginning the research in reliability improvement techniques, a state of the art 
examination was performed. The results of this examination formed the basis for a database that can 
be found on the CD accompanying this thesis. Since the database is self-explanatory this section only 
provides a short description of the contents as well as the initial thoughts that laid the groundwork for 
the database.  
 The preliminary reliability examinations suggested that a digital approach had to be taken in order 
to significantly improve today’s power converters reliability-wise. Most entries in the database are 
therefore related to digital control of power systems ranging from simple converter control to 
advanced control of high-power systems. Regardless of system complexity, these systems all share 
the same challenges – getting a digital controller to perform all aspects of the required control and 
monitoring. A key part in this common challenge was determination of the achievable sample 
frequency. Since this frequency is directly related to the dynamic performance of the system, overall 
system performance, and system immunity towards noise, it is desirable to implement the system 
with the highest sampling frequency possible. However, control computations, sample and hold time, 
clock frequency and the number of execution cycles, all contribute to a decrease in this frequency. 
Hence, the overall system design is a trade-off between dynamic performance and system monitoring 
features. As can be seen from the entries in the database, the common approach in solving the latter 
trade-off issue is the implementation of compact real-time control routines in high-speed 
microprocessors. Since a microprocessor consumes a considerable amount of power, this particular 
approach is limited to relatively high-power systems. The control realization in low-power 
applications is therefore often established by means of analog control IC’s. Part of the research 
work described in this thesis provides a solution to the problem of digital control of low-power 
systems by implementing full converter control and monitoring features in a low-cost microcontroller 
by means of a look-up table. This work is the topic of chapter ‘5 Digital control of DC-DC 
converters’. 
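 
 The relation between execution cycles and achievable sample frequency can be sketched with a 
simple timing budget. The sketch assumes that sample-and-hold and the control computation happen 
one after the other within each sample period and that one instruction cycle takes one clock 
period. The clock frequency, cycle counts and sample-and-hold time are hypothetical and do not 
describe the controller of chapter 5. 

```python
def max_sample_frequency(f_clk_hz: float, cycles_per_sample: int, sample_hold_s: float) -> float:
    """Upper bound on the sample frequency for sequential sample-and-hold and computation."""
    period_s = sample_hold_s + cycles_per_sample / f_clk_hz
    return 1.0 / period_s


F_CLK_HZ = 20e6       # hypothetical microcontroller clock [Hz]
SAMPLE_HOLD_S = 2e-6  # hypothetical sample-and-hold plus conversion time [s]

# Hypothetical cycle counts: a computed control law versus a table look-up.
f_computed = max_sample_frequency(F_CLK_HZ, 400, SAMPLE_HOLD_S)
f_lookup = max_sample_frequency(F_CLK_HZ, 40, SAMPLE_HOLD_S)
print(f"computed control law: {f_computed / 1e3:.1f} kHz, look-up table: {f_lookup / 1e3:.1f} kHz")
```

 The fewer cycles the control routine needs per sample, the higher the achievable sample 
frequency, which is the motivation for replacing run-time computation with a look-up table. 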
 
 
Figure 2 : ‘State of the art techniques’ database 
 
 Directing the focus back towards the ‘state of the art techniques’ database, Figure 2 shows the 
overall layout. Searches can be performed on all fields for any combination of key words. A list of 
all entries in the database along with a description of the contents of the accompanying CD can be 
found in the appendix. 
 
 
2.2 Definition 
 
 Based on the description given in section ‘2 Fault tolerance’ as well as the explanation given by 
Webopedia [We01], the following definition of fault tolerance has been established: 
 
 
The ability of a system to respond gracefully to an unexpected hardware or software failure. 
 
There are many levels of fault tolerance, the lowest being the ability to continue operation 
in the event of a power failure. 
 
Many fault tolerant computer systems mirror all operations - that is, every operation is 
performed on two or more duplicate systems, so if one fails the other can take over. 
 
 This definition states that the application at hand – a power system – exhibits the lowest level of 
fault tolerance possible, thus being crucial to the entire system as opposed to, for example, a 
surveillance subsystem. 
 
With the definition in mind, the following basic rules of fault tolerance can be deduced: 
1. Know precisely what the system will do when working under both normal and abnormal 
circumstances. 
2. Group fault causes into different classes, thereby identifying and categorizing all critical 
failure-modes. 
3. Determine fault containment regions within the system. This is important since fault 
propagation in any system is to be prevented. 
4. Determine the application failure margins and balance the level of fault tolerance with the 
cost of implementation. 
 
As will become apparent in section ‘3.1 Failure Modes Effects and Criticality Analysis 
(FMECA)’, rules 2 and 3 are deduced from the general awareness of fault tolerance having evolved 
into a concept crucial to most industries within the field of electronics. As a concluding remark, the 
definitions of reliability and availability are provided:  
 
Reliability defines a system’s ability to stay in the operating state without failure. Thus, 
reliability is totally unsuitable as a measure for continuously operated systems that can tolerate 
failures. Availability on the other hand defines the probability of finding a system in the 
operating state at some point in the future. 
 
 In general, these terms are used without regard to their definition and it is common to see 
reliability being used as a measure for a repairable system. Since the meaning of the terms is easily 
deduced from the context of the system being described, there will be no distinguishing between the 
terms throughout this thesis. Hence, reliability is used to describe the probability of system survival 
in all subsequent chapters. 
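 
 The difference between the two terms can be illustrated numerically. The sketch below assumes a 
constant failure rate (exponentially distributed time to failure) for reliability and uses the 
common steady-state expression MTBF / (MTBF + MTTR) for availability. Both are standard textbook 
simplifications, and the failure rate and repair time are hypothetical values. 

```python
from math import exp


def reliability(failure_rate_per_h: float, hours: float) -> float:
    """Probability of operating without failure for the given time (constant failure rate)."""
    return exp(-failure_rate_per_h * hours)


def steady_state_availability(mtbf_h: float, mttr_h: float) -> float:
    """Long-run fraction of time a repairable system is in the operating state."""
    return mtbf_h / (mtbf_h + mttr_h)


FAILURE_RATE_PER_H = 1e-5  # hypothetical, corresponds to an MTBF of 100,000 h
MTTR_H = 24.0              # hypothetical mean time to repair [h]

r_one_year = reliability(FAILURE_RATE_PER_H, 8760.0)
availability = steady_state_availability(1.0 / FAILURE_RATE_PER_H, MTTR_H)
print(f"reliability over one year: {r_one_year:.4f}, availability: {availability:.5f}")
```

 With these values the probability of a full year without failure is noticeably lower than the 
availability, since availability credits the system for being repaired after a failure. 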
 
 
2.3 References 
 
[Sa01]  Architecture and IC Implementation of a Digital VRM Controller, Jinwen Xiao, 
Angel V. Peterchev, Seth R. Sanders, Power Electronics Specialists Conference 2001, 
Vancouver, Canada 
 
[At01]  A Low Cost Digital SVM Modulator with Dead Time Compensation, C. Attaianese, 
D. Capraro, G. Tomasso, Power Electronics Specialists Conference 2001, Vancouver, 
Canada 
 
[Sa02]  Quantization Resolution and Limit Cycling in Digitally Controlled PWM 
Converters, Angel V. Peterchev, Seth R. Sanders, Power Electronics Specialists 
Conference 2001, Vancouver, Canada 
 
[Re01]  Analysis and Design of a Repetitive Predictive-PID Controller for PWM Inverters, 
C. Rech, H. Pinheiro, H. A. Gründling, H. L. Hey, J. R. Pinheiro, Power Electronics 
Specialists Conference 2001, Vancouver, Canada 
 
[Ri01]  A Fault Tolerant Induction Motor Drive System by Using a Compensation Strategy 
on the PWM-VSI Topology, R. L. A. Ribeiro, C. B. Jacobina, E.R. C. da Silva, A. M. N. 
Lima, Power Electronics Specialists Conference 2001, Vancouver, Canada 
 
[Fi01]  MOSFET Failure Modes in the Zero-Voltage-Switched Full-Bridge Switching Mode 
Power Supply Applications, Alexander Fiel, Thomas Wu, Applied Power Electronics 
Conference and Exposition 2001, Anaheim, USA 
 
[Ce01]  A New Distributed Digital Controller for the Next Generation of Power Electronics 
Building Blocks, I. Celanovic, I. Milosavljevic, D. Boroyevich, R. Cooley, J. Guo, 
Applied Power Electronics Conference and Exposition 2000, New Orleans, USA 
 
[Ho01]  Fault Detection Evaluation of Microcontroller Dyad Control System by Fault 
Injection Method, Zeljko Hocenski, Goran Martinovi, Josip Juraj Strossmayer, European 
Conference on Power Electronics and Applications 1999, Lausanne, Switzerland 
 
[Fe01]  Digital Control of a Single-Stage Single-Switch Flyback PFC AC/DC Converter with 
Fast Dynamic Response, Ya-Tsung Feng, Gow-Long Tsai, and Ying-Yu Tzou, Power 
Electronics Specialists Conference 2001, Vancouver, Canada 
 
[To01]  Adaptive, Stable Fuzzy Logic Control for Paralleled DC-DC Converters Current 
Sharing, Bogdan Tomescu, H.F. VanLandingham, Power Electronics Specialists 
Conference 2001, Vancouver, Canada 
 
[Na01]  Stability Analysis of Parallel DC-DC Converters Using a Nonlinear Approach, Sudip 
K. Mazumder, Ali H. Nayfeh, Dushan Borojevic, Power Electronics Specialists 
Conference 2001, Vancouver, Canada 
 
[Ke01]  Generalized Predictive Control (GPC) - Ready for Use in Drive Applications?, 
Kennel R., Linder A., Linke M., Power Electronics Specialists Conference 2001, 
Vancouver, Canada 
 
[We01]  Webopedia, http://www.webopedia.com/ 
 
3. Concept clarification and point of origin  Page 10 
 
3 Concept clarification and point of origin 
 
 The following section forms the basis of common methods and techniques used in reliability
engineering for fault detection, analysis and prevention purposes. Evaluation of such methods and
techniques necessitates the use of several mathematical procedures. Having established a common
evaluation framework by means of mathematical models, a fair system comparison can be instituted.
This section serves as reference for reliability calculations in subsequent chapters in that a
mathematical assessment foundation adapted to the contents of this thesis will be presented.
 Due to the large number of terms and abbreviations used in modern system evaluation this section
also clarifies how these terms are interpreted throughout this thesis. A research basis is
necessary to ensure continuity from concept via ideas to the solution, and it provides the reader with
background information on where the research originated.
 
 
3.1 Failure Modes Effects and Criticality Analysis (FMECA) 
 
 Modern power systems are becoming more and more sophisticated in order to comply with
equipment specifications set forth by an industry that uses complex power systems in either their
end-products or part of the manufacturing process. To cover all possible case scenarios in the event
of system failure any power system design must begin with a Failure Modes Effects and Criticality
Analysis (FMECA).
 The FMECA, developed by the United States Military in November 1949, is a systematic process
for identifying potential design and process failures before they occur, with the intent to eliminate or
minimize the risks associated with them. The effectiveness of the modern FMECA lies in its ability to
allow the designer to interchange terms used in the original Military version MIL-STD-1629, thus
adapting the analysis to cover terms such as customer satisfaction, costs, safety and reliability. The
latter topic, being of vital importance in this thesis, allows for a quantitative system evaluation, hence
facilitating a system comparison in terms of survivability and circuit complexity.
 Due to the adaptability of the FMECA, the analysis can be applied to all levels of the system
design. Unfortunately, this adaptability also prevents the establishment of general FMECA
guidelines, for which reason the analysis has to be modified to each design. However, the successive
procedure flowchart shown in Figure 3 is applicable in most cases concerning highly reliable
electronic equipment.
 
[Flowchart blocks: System perception, System design, FMECA based on current design, Failure effects, System criticality, Corrective actions]
 
Figure 3 : Successive FMECA procedure flowchart 
 
 From Figure 3 it can be seen that the process repeats itself until the output or outcome stops
deviating significantly from the input, essentially being a feedback system. The process begins with
a system perception or very high level description. The FMECA is then performed at this system
level with indication of failure modes and their criticality for system survival. The results are
examined and corrective action towards eliminating unacceptable failure modes can be taken. The
process is then repeated and compared to the results of the previous FMECA. Such a recursive
procedure helps the designer assess different design alternatives with high reliability and safety
potential during the initial system synthesis.
 An example of a lower level FMECA is shown in Table 1 where each component is examined for
likely failure modes. In most FMECAs the rates at which particular failure modes occur are not
included. Since different failure modes most often have different probabilities of occurring, the
inclusion of such information could actually provide a simple solution for obtaining the needed
system reliability. Indeed, as can be seen in the PATH report on power system reliability (see
appendix), failure mode probability considerations can improve system reliability when proper
actions are taken. In this case, the parallel-connection of two diodes decreases the unavailability of
the current path initially formed by one diode by 30%. A detailed description of reliability
enhancement techniques like the latter case will be provided in section ‘3.4 Redundancy’.
 
Block/part    Fault no.   Failure mode           Failure effect     Effect on unit   Criticality
Resistor 45   1           Open circuit           Loss of feedback   Loss of unit     2
-             -           -                      -                  -                -
-             -           -                      -                  -                -
UVP           745         Output short circuit   Loss of UVP        None             5
-             -           -                      -                  -                -
-             -           -                      -                  -                -
Table 1 : Systematic FMECA information representation 
 
 The information in Table 1 enables the designer to determine the function of all system blocks or
parts. In turn, this provides a measure of quantitative system reliability in the initial system design
where unacceptable failure modes easily can be associated with corrective or predictive actions.
Simple failure modes leading to unacceptable system performance can often be
‘designed out’ by changing the system topology and/or modifying monitoring circuitry.
 With reference to Table 1 the individual column headings are identified and described in
accordance with the research work presented in this thesis.
 
Block/part: This column identifies the part or block for which the analysis is being performed.
Since the block/part causing the failure mode is listed in this column, a commonly
used heading for this particular column is ‘Cause’. In initial designs this column can
assist in the synthesis of a functional system block diagram.

Fault no.: This column relates each failure mode with a consecutive number for easy reference
during the synthesis and documentation stage.

Failure mode: This column describes the failure mode under consideration. Common failure
modes include short circuit to common ground and/or mains, open circuit, evaluation
of noise sensitive parts etc. An outline of failure modes to include in a FMECA can
be found in the folder ‘Application notes’ on the accompanying CD.
 
Failure effect: This column provides a verbal description of the local effects of the failure mode
under consideration. Local effects in this context are the effects on adjacent parts or
on block level functionality.

Effect on unit: This column expands the local failure effects description to include the global
system effects that result from the failure mode. Possible effects of fault
propagation would also be included in this column.

Criticality: This column rates the severity of the failure mode in question based on a
predetermined criticality assessment. In the power system evaluation of, for
instance, the precision docking project the criticality ratings shown in Table 2 have
been used.
 
Rating  Severity       Description
1       Very critical  This part or subsystem is essential for human safety. Loss of any function
                       within this category could result in loss of human life.
2       Critical       Loss of this part or subsystem causes system malfunction. Important to
                       continued system operation. System might fail in a non-graceful way.
3       Significant    Loss of this part or subsystem causes important system degradation.
                       Degradation includes loss of important monitoring functionalities. System
                       fails in a controlled and graceful way.
4       Minor          Loss of this part or subsystem causes only minor system degradation.
                       Degradation of secondary supervisory functions unimportant to the primary
                       loads. System does not fail.
5       None           Loss of this part or subsystem has no effect on overall system performance.
                       Might cause a surveillance circuit to lose power. System does not fail.
Table 2 : Criticality rating for precision docking project 
 
 It should be noted that the criticality ratings shown in Table 2 by no means can be considered a 
standard. The rating is adjusted to accommodate end-user specifications in each application. 
However, ratings similar to those shown in Table 2 are often set forth by, for example, the European 
Space Agency (ESA). Since the precision docking project involves heavy machinery operating in 
close proximity to humans the FMECA criticality rating had to include possible loss of human life. 
The precision docking project is intended for implementation in urban areas, for which reason 
system malfunction leading to loss of human life is an unacceptable rating for any subsystem within 
the power system. 
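
The column structure described above can be sketched as a small data structure. The following Python fragment is a minimal illustration only: the class name `FmecaEntry`, the function `needs_corrective_action` and the threshold `worst_acceptable` are hypothetical names introduced here, and the two entries are the rows shown in Table 1.

```python
from dataclasses import dataclass

# Severity labels taken from Table 2 (precision docking project).
SEVERITY = {1: "Very critical", 2: "Critical", 3: "Significant", 4: "Minor", 5: "None"}


@dataclass(frozen=True)
class FmecaEntry:
    block: str
    fault_no: int
    failure_mode: str
    failure_effect: str
    effect_on_unit: str
    criticality: int  # 1 (most severe) to 5 (no effect)


def needs_corrective_action(entries, worst_acceptable=3):
    """Return the entries rated more severe than the acceptable rating.

    The threshold is a hypothetical design choice, not a value from the thesis.
    """
    return [e for e in entries if e.criticality < worst_acceptable]


# The two rows shown in Table 1.
table_1 = [
    FmecaEntry("Resistor 45", 1, "Open circuit", "Loss of feedback", "Loss of unit", 2),
    FmecaEntry("UVP", 745, "Output short circuit", "Loss of UVP", "None", 5),
]

for entry in needs_corrective_action(table_1):
    print(entry.fault_no, entry.block, SEVERITY[entry.criticality])
```

With the assumed threshold only the open circuit fault of the resistor is flagged, since a rating of 2 is more severe than the acceptable rating of 3.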
 In many situations the likelihood that a particular fault occurs and results in a failure mode during
the intended operating lifespan of the equipment is included in the FMECA table. In other words it is
common to include a rating of how likely a particular failure mode is to occur on a scale from 1 to
10. A similar method is adopted in this thesis in that a mathematical component reliability evaluation
and the system FMECA are performed simultaneously. This provides a more detailed picture of the
overall system reliability. The drawback is that the detailed reliability evaluation comes at the cost of
lost system perspective due to the rather complicated calculations required.
 The design of complex power systems involves a large number of failure modes and the use of a
dedicated FMECA software program is highly recommended. Furthermore, a dedicated
FMECA software program has the advantage of recognizing the so-called common mode failures
that might be avoidable by simple circuit modifications. Also, potential downstream
consequences of a particular failure mode can easily be identified and assessed. The latter
information is a fundamental necessity for preventing fault propagation leading to unacceptable
system states.
 The FMECA is performed consistently, at both the functional level and the part level, for all
systems presented in this thesis. From these analyses the concept and reliability evaluation of
‘failure mode based partial parts level redundancy’ arose. A detailed description of the findings is
provided in section ‘8 Partners for Advanced Transit and Highways’ where a theoretical research
project involving a precision docking procedure is described.
 Within the field of reliability engineering a FMECA is commonly referred to as an inductive
bottom-up method of analyzing a system design in order to evaluate the potential for failures. This
interpretation of the FMECA process is questionable since the main idea of performing a FMECA is
to eliminate initial failures at the functional level before proceeding to the design of subsystems and
eventually analyzing and choosing parts. From the latter description, which is commonly accepted
throughout the industry, the correct interpretation of the FMECA process must be a systematic
top-down method of analyzing system designs.
 
 
3.2 Derating 
 
 The ratings on electronic parts and selection of their use for an application environment are a 
matter of concern in all aspects of electronic equipment design. A widely accepted technique for 
increasing the probability of survival for a given system is the so-called ‘Derating’ technique. 
Derating contributes to a conservative design approach incorporating realistic stress levels, thus 
ensuring a higher probability of a flawless equipment life. The point of origin for derating 
considerations lies within the field of statistical analysis. Electronic component strength is a random 
variable that varies from one manufacturer to another or even from one batch of components to 
another from the same manufacturer. Data analysis of randomly selected samples from these 
components reveals that the strength of the individual components can be represented by a statistical 
distribution. Likewise the stress applied to a component is random, changing with temperature, 
electric transients and even application. For this reason stress can also be represented by a statistical 
distribution. Figure 4 shows the probability density functions for both the stress and strength
distributions. In Figure 4a the two probability density functions are well separated, which indicates a
proper use of the component. The probability that a component will fail during its intended operating
life is relatively small as can be seen by the shaded region where the two density functions overlap.
The area of this region (interference) determines the probability of component failure. The larger the 
area, the higher the probability of component failure becomes. A ‘large area’ situation is shown in 
Figure 4b where the strength of the components barely exceeds the applied stress. This is known as 
‘dirty design’ or ‘no-limit design’ and is often applied in very low-cost power supplies manufactured 
mainly in Asia. More and more end-users are becoming aware of this reliability problem and now 
look towards quality design for solutions to their power needs.  
 Based on the above description it should be clear that in order for a component to work properly 
its strength must exceed the applied stress - preferably with a decent margin as shown in Figure 4a. 
 
 
[Figure: two panels, (a) and (b), each plotting the probability density functions of stress and strength against stress/strength, with the mean of each distribution and the interference region marked]
Figure 4 : Stress vs. strength relationship for electronic components 
 
 The curves depicted in Figure 4 by no means represent the correct shape of an actual sample from
a population of components. In fact, it should be expected that the standard deviation of a given
population changes from one type of components to another, hence altering the bell shape of the
distribution.
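
The stress/strength interference can be illustrated numerically. The sketch below assumes that stress and strength are independent and normally distributed, which is only one possible choice given the remark above about the actual shape of the distributions. Under that assumption the probability that the strength falls below the applied stress has a closed form. The function names and all numbers are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def failure_probability(stress_mean, stress_std, strength_mean, strength_std):
    """P(strength < stress) for independent, normally distributed stress and strength."""
    margin_mean = strength_mean - stress_mean
    margin_std = math.sqrt(stress_std ** 2 + strength_std ** 2)
    return normal_cdf(-margin_mean / margin_std)


# Hypothetical numbers in arbitrary units, chosen only to mimic Figure 4a and 4b.
well_separated = failure_probability(50.0, 5.0, 100.0, 10.0)
barely_separated = failure_probability(50.0, 5.0, 60.0, 10.0)
print(f"well separated:   {well_separated:.2e}")
print(f"barely separated: {barely_separated:.2e}")
```

Moving the mean strength closer to the mean stress, as in Figure 4b, raises the computed failure probability by several orders of magnitude in this example.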
 The purpose of derating is to protect against variations in the previously described random 
variables. Proper use of derating prevents small changes in operating conditions from causing large 
increases in failure rates. The amount of derating needed depends on how well the variation in 
operating parameters can be predicted. In most cases it is impossible to accurately describe the 
operating parameters at the time of design, so designers turn to reliability estimates in order to 
determine the system impact of faults and thus the level of derating needed to reduce the risk of these 
faults. As an example, Table 3 presents derating guidelines set forth by NASA. 
 
Component type          Recommended derating level
Capacitor               Max. 60% of rated voltage
Resistor                Max. 60% of rated power
Semiconductor device    Max. 50% of rated power
                        Max. 75% of rated voltage
                        Max. junction temperature of 110°C
Microcircuits           Supply voltage max. 80% of rated voltage
                        Max. 75% of rated power
                        Max. junction temperature of 100°C
Inductive device        Max. 50% of rated voltage
                        Max. 60% of rated temperature
Relays and Connectors   Max. 50% of rated current
Table 3 : NASA derating guidelines 
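
A design check against the electrical limits of Table 3 could look as follows. This is a minimal sketch: the dictionary keys and the function `is_within_derating` are hypothetical names, the temperature limits are left out, and the example values are invented.

```python
# Maximum allowed fraction of the rated value, copied from Table 3.
# Temperature limits are left out of this sketch.
DERATING_LIMITS = {
    ("capacitor", "voltage"): 0.60,
    ("resistor", "power"): 0.60,
    ("semiconductor", "power"): 0.50,
    ("semiconductor", "voltage"): 0.75,
    ("microcircuit", "supply voltage"): 0.80,
    ("microcircuit", "power"): 0.75,
    ("inductive device", "voltage"): 0.50,
    ("relay or connector", "current"): 0.50,
}


def is_within_derating(component, quantity, applied, rated):
    """True if the applied stress does not exceed the derated limit."""
    limit = DERATING_LIMITS[(component, quantity)]
    return applied / rated <= limit


# Hypothetical example: a capacitor rated for 100 V.
print(is_within_derating("capacitor", "voltage", applied=50.0, rated=100.0))
print(is_within_derating("capacitor", "voltage", applied=70.0, rated=100.0))
```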
 
 From the derating recommendations shown in Table 3 it can be seen that finding electronic parts 
rated for the temperature range required for operation in many modern applications is becoming 
harder as semiconductor manufacturers discontinue production of components rated for extended 
temperature ranges. Furthermore, most new and improved semiconductor based microcircuits that 
become available have a very limited temperature range in which proper operation is guaranteed. 
 As components age, their strength characteristics tend to move towards the probability density
function of the applied stress. While this shift takes place over time, the good news is that
the applied stress density function only changes significantly if a component fails. In other words,
the overall probability of system survival decreases as components get older. A similar shift in
strength characteristic can be observed in relatively new components when they are used outside their
designed operating condition or near their rated parameters for long periods of time.
 The derating recommendations shown in Table 3 are applied to all designs in this thesis with the 
exception of the temperature derating of semiconductor devices. A major element in this thesis is the 
examination of system reliability at temperatures ranging from subzero to 175°C for which reason it 
is impossible to comply with the temperature derating recommendation. 
 The reliability of electronic parts is directly related to the stresses caused by the application, 
including both the assembly environment and the operating conditions. The MIL-HDBK-217F 
contains models for calculating the effects of various stresses on the failure rate and thus on 
component reliability. Examples of variables included in the MIL-HDBK-217F are environmental 
factors, temperature and voltage parameters, level of quality and application factors. Additional 
information on the MIL-HDBK-217F can be found in section ‘3.5 Standards’. A graphical 
illustration of the derating effects on failure rate (MIL-HDBK-217F data) can be seen in Figure 5. 
 
[Figure: two surface plots, (a) and (b), of probability of survival against voltage stress (0.0 to 1.0) and temperature (0 to 150); the survival axis spans 0.9985 to 1.0000 in (a) and 0.994 to 1.0000 in (b)]
Figure 5 : Derating effects on resistor failure rate (a) and capacitor failure rate (b) 
 
 Figure 5 comprises a set of curves, each for a specific temperature, with the amount of
derating as a variable. The curves clearly show the correlation between part temperature and applied
derating, which are the two electrical parameters most easily altered by the system designer.
A certain part reliability can be achieved by applying derating, by keeping the
temperature within predetermined limits, or both. In most situations a proper management of the thermal
system characteristics will give rise to the largest increase in reliability. The latter situation can easily
be identified in Figure 5b where the probability of survival drops significantly as the temperature
reaches approximately 100°C.
 In conclusion, it should be noted that derating in this context is by no means the same as electrical 
design within the electrical limits of the part as specified in the manufacturer’s datasheet. In [Mø01] 
the terms ‘absolute limit derating’ and ‘reliability derating’ are introduced to distinguish the two 
topics from one another. However, most equipment designers would agree that designing within the
electrical limits of the parts is simply responsible engineering.
 
 
3.3 Statistical distributions and methods  
 
 In order to have a statistical foundation on which the reliability analysis can be based this section 
introduces a set of commonly used distributions. A short characterization of each distribution will be 
provided including a clarification of the distinct features of each distribution. The distribution used 
throughout this thesis will be introduced and expressions for use in subsequent chapters are deduced.  
 Evaluation of complex systems begins with an analytic reduction of subsystems or blocks within
the system. The failure rate of each of these blocks is then calculated and used in the reliability
assessment of the overall system. Deducing analytic expressions for all system blocks is done by
means of network reduction techniques, which will be introduced at the end of this section.
 
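Distribution   Quantity                  Expression
Poisson        Failure density f(t)      $\frac{(\lambda \cdot t)^x \cdot e^{-\lambda \cdot t}}{x!}$
               Survivor function R(t)    $1 - \sum_{j=0}^{n} \frac{(\lambda \cdot t)^j \cdot e^{-\lambda \cdot t}}{j!}$
               Hazard rate λ(t)          -
               Variance σ²               $\lambda \cdot t$
Gaussian       Failure density f(t)      $\frac{1}{\sigma \cdot \sqrt{2 \cdot \pi}} \cdot e^{-\frac{(t - \mu)^2}{2 \cdot \sigma^2}}$
               Survivor function R(t)    $\int_t^\infty \frac{1}{\sigma \cdot \sqrt{2 \cdot \pi}} \cdot e^{-\frac{(t - \mu)^2}{2 \cdot \sigma^2}}\,dt$
               Hazard rate λ(t)          $\frac{f(t)}{R(t)}$
               Variance σ²               $\sigma^2$
Exponential    Failure density f(t)      $\lambda \cdot e^{-\lambda \cdot t}$
               Survivor function R(t)    $e^{-\lambda \cdot t}$
               Hazard rate λ(t)          $\lambda$
               Variance σ²               $\frac{1}{\lambda^2}$
Weibull        Failure density f(t)      $\frac{\beta \cdot t^{\beta - 1}}{\alpha^\beta} \cdot e^{-(t/\alpha)^\beta}$
               Survivor function R(t)    $e^{-(t/\alpha)^\beta}$
               Hazard rate λ(t)          $\frac{\beta \cdot t^{\beta - 1}}{\alpha^\beta}$
               Variance σ²               $\alpha^2 \cdot \left[\Gamma\left(1 + \frac{2}{\beta}\right) - \Gamma^2\left(1 + \frac{1}{\beta}\right)\right]$
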
Table 4 : List of distributions commonly used in reliability engineering 
 
Poisson distribution  represents the probability of an isolated event occurring a specified 
number of times in a given interval of time or space (discrete) when the 
failure rate in a continuum of time or space is fixed. The occurrence of 
events must be affected by chance alone. 
 
Gaussian distribution  is among the most important and widely used distributions in the field of 
statistics and probability. However, in reliability evaluation the 
distribution is of less significance compared to other continuous 
distributions. 
 
Exponential distribution  is the most widely used distribution in reliability evaluation. Like the 
Poisson distribution the failure rate must be fixed in order for the 
exponential distribution to be applicable. For this reason the exponential 
distribution is often considered a special case of the Poisson distribution.  
 
Weibull distribution  is widely used in statistical analysis of experimental data because it has
no fixed characteristic shape. In other words, the distribution has the ability to
represent a wide variety of experimental data sets as a continuous
function of time.
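
The relationship between the listed distributions can be checked with a few lines of code. The sketch below uses the expressions of Table 4 and shows two things: for β = 1 the Weibull distribution reduces to the exponential distribution with λ = 1/α, and the Poisson probability of zero failures in the interval t equals the exponential survivor function. The function names and the failure rate are hypothetical.

```python
import math


def weibull_hazard(t, alpha, beta):
    """Hazard rate beta * t**(beta - 1) / alpha**beta from Table 4."""
    return beta * t ** (beta - 1.0) / alpha ** beta


def weibull_survivor(t, alpha, beta):
    """Survivor function exp(-(t / alpha)**beta) from Table 4."""
    return math.exp(-((t / alpha) ** beta))


def exponential_survivor(t, lam):
    """Survivor function exp(-lam * t) from Table 4."""
    return math.exp(-lam * t)


def poisson_probability(x, lam, t):
    """Probability of exactly x events in the interval t."""
    return (lam * t) ** x * math.exp(-lam * t) / math.factorial(x)


lam = 2.0e-6          # hypothetical constant failure rate, failures per hour
alpha = 1.0 / lam     # with beta = 1 the Weibull scale parameter equals 1 / lam
for t in (1.0e3, 1.0e5, 1.0e6):
    print(
        t,
        weibull_hazard(t, alpha, 1.0),
        weibull_survivor(t, alpha, 1.0),
        exponential_survivor(t, lam),
        poisson_probability(0, lam, t),
    )
```

For β different from 1 the hazard rate becomes a function of time, which is what allows the Weibull distribution to follow a wide range of data sets.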
 
 Having described a set of commonly used distributions, attention will now be turned towards 
finding a distribution for describing the reliability of the systems and techniques presented in this 
thesis. As will become apparent in a subsequent section the failure rate, calculated using reliability 
data found in the MIL-HDBK-217F, is a constant number for a given set of operating parameters. 
The exponential distribution requires the failure rate or hazard rate, as it is often labeled, to be fixed 
and is therefore very suitable to describe the reliability of electronic parts. This fact combined with 
the relatively simple computations needed for establishing the reliability of a given system makes it 
the most widely used distribution in reliability evaluation of electronic equipment. For this reason the 
exponential distribution will form the basis for all reliability evaluations throughout this thesis.  
 Establishing the parameters associated with the exponential distribution is relatively 
straightforward if referenced to a graphical illustration. Hence, the starting point in this process is the 
representation of the density function as shown in Figure 6. 
 
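[Figure: plot of f(t) = λ·e^(-λ·t) against time, with the value λ marked on the f(t) axis; the area under the curve up to time t is denoted Q(t) and the area beyond t is denoted R(t)]
λ     = failure rate (constant)
Q(t)  = cumulative distribution (unavailability)
R(t)  = survival distribution (reliability)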
 
Figure 6 : Density function in which Q(t) and R(t) are illustrated 
 
 Figure 6 shows both the cumulative failure distribution, represented by the area under the f(t) 
curve denoted Q(t), and the survival function, represented by the area under the f(t) curve denoted 
R(t). An analytic expression for these two distributions can be deduced from the density function: 
 
 $f(t) = \lambda \cdot e^{-\lambda \cdot t}$   (3-1)
 
 Integrating (3-1) from time = 0 to time = t defines the area under the curve f(t) limited in the 
horizontal direction by the two time limits. Since this area represents the cumulative failure 
distribution the analytic expression for this section of the density function becomes: 
 
 $Q(t) = \int_0^t f(t)\,dt = \int_0^t \lambda \cdot e^{-\lambda \cdot t}\,dt = 1 - e^{-\lambda \cdot t}$   (3-2)
 
 In reliability engineering terms Q(t) is often referred to as the system unavailability, since it 
represents the probability of system failure. The other area under the f(t) curve denotes the survival 
distribution or simply the reliability R(t). Finding this area follows the same principles as mentioned 
above, thus integrating from time = t to time = ∞ gives the following analytical expression for R(t): 
 
 $R(t) = 1 - \int_0^t f(t)\,dt \;\Rightarrow\; \int_t^\infty f(t)\,dt = \int_t^\infty \lambda \cdot e^{-\lambda \cdot t}\,dt = e^{-\lambda \cdot t}$   (3-3)
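
As a brief numerical illustration of (3-2) and (3-3), the following sketch evaluates Q(t) and R(t) for a constant failure rate. The failure rate of 500 FIT and the operating time of ten years are hypothetical values chosen for the example, and the function names are not taken from the thesis.

```python
import math

FIT_PER_HOUR = 1.0e-9  # 1 FIT = 1 failure per 10^9 hours


def reliability(t_hours, lam_per_hour):
    """R(t) = exp(-lambda * t), equation (3-3)."""
    return math.exp(-lam_per_hour * t_hours)


def unavailability(t_hours, lam_per_hour):
    """Q(t) = 1 - exp(-lambda * t), equation (3-2)."""
    return -math.expm1(-lam_per_hour * t_hours)


lam = 500 * FIT_PER_HOUR      # hypothetical part failure rate of 500 FIT
t = 10 * 8760                 # ten years of continuous operation, in hours
print(f"R(t) = {reliability(t, lam):.6f}")
print(f"Q(t) = {unavailability(t, lam):.6f}")
```

`math.expm1` is used for Q(t) because the product λ·t is typically very small, in which case computing 1 - exp(-λ·t) directly loses precision.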
 
 Other key elements of the exponential distribution are the expected value and the standard 
deviation. Finding the expected value of a continuous random variable having a range of time = 0 to 
time = ∞ is given by: 
 
 $E(t) = \int_0^\infty t \cdot f(t)\,dt \;\Rightarrow\; \int_0^\infty t \cdot \lambda \cdot e^{-\lambda \cdot t}\,dt = \frac{1}{\lambda}$   (3-4)
 
Similarly, the standard deviation of the exponential distribution can be found: 
 
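 $\sigma^2 = \int_0^\infty t^2 \cdot \lambda \cdot e^{-\lambda \cdot t}\,dt - E^2(t) = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2} \;\Rightarrow\; \sigma = \frac{1}{\lambda}$   (3-5)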
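
The intermediate step behind (3-5) is not shown in the text. One way to obtain the second moment, sketched here with integration by parts and the result of (3-4), is:

```latex
\begin{align*}
\int_0^\infty t^2 \lambda e^{-\lambda t}\,dt
  &= \left[-t^2 e^{-\lambda t}\right]_0^\infty + \int_0^\infty 2t\, e^{-\lambda t}\,dt \\
  &= 0 + \frac{2}{\lambda}\int_0^\infty t\,\lambda e^{-\lambda t}\,dt
   = \frac{2}{\lambda}\cdot E(t) = \frac{2}{\lambda^2} \\
\sigma^2 &= \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}
  \quad\Rightarrow\quad \sigma = \frac{1}{\lambda}
\end{align*}
```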
 
 From (3-4) and (3-5) it can be seen that the expected value and the standard deviation are equal.
This is one of the special properties of the exponential distribution. The expected value calculated in
(3-4) is perhaps better known as the Mean Time To Failure (MTTF), a term often confused with the
Mean Time Between Failures (MTBF). Conceptually there is a significant difference between the
two terms in that the MTTF applies to non-repairable systems whereas the MTBF applies to repairable
systems. However, system repair time is usually very small compared with the operating time, so
the difference between MTTF and MTBF becomes very small. Since this research work only
considers non-repairable systems there will be no distinction between these terms throughout this
thesis.
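
The equality of the expected value and the standard deviation can also be checked numerically. The sketch below integrates the moments of the exponential density with a simple trapezoidal rule. The failure rate, the truncation of the upper limit at 40/λ and the helper name `integrate` are assumptions made for this example.

```python
import math


def integrate(func, upper, steps=200_000):
    """Composite trapezoidal rule on the interval [0, upper]."""
    h = upper / steps
    total = 0.5 * (func(0.0) + func(upper))
    for i in range(1, steps):
        total += func(i * h)
    return total * h


lam = 1.0e-4                 # hypothetical failure rate, failures per hour
upper = 40.0 / lam           # the integrands are negligible beyond 40 / lam


def density(t):
    """Exponential failure density, equation (3-1)."""
    return lam * math.exp(-lam * t)


mttf = integrate(lambda t: t * density(t), upper)
second_moment = integrate(lambda t: t * t * density(t), upper)
sigma = math.sqrt(second_moment - mttf ** 2)
print(mttf, sigma, 1.0 / lam)
```

Both numerical results agree with 1/λ to within the accuracy of the integration.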
 
3.3.1 Combinational aspects 
 
 Most practical reliability evaluations require probabilities to be united in order to calculate the 
overall system reliability. Before this concept can be described any further the level of detail in each 
event must be established. To do this two concepts come to mind – permutations and combinations. 
Permutation represents the number of different ways a set of events can be arranged whereas 
combination represents the number of different ways in which these events can happen without 
regard to the order in which they happen. In electronic applications the concept of combination is
usually of greater importance than permutation, since it is generally necessary to know
which combined events lead to system failure, and of less concern to know in which order the
events occurred. In general, the number of event combinations is equal to or less than the number of
event permutations. The binomial distribution is a mathematical description directly associated with 
combinational issues and is therefore perfectly suited for joining probabilities in system evaluations. 
The binomial distribution can be represented by the general expression: 
 
 $(p + q)^n = \sum_{r=0}^{n} {}_{n}C_{r} \cdot p^r \cdot q^{n-r} = 1$   n = number of trials, r = number of successes   (3-6)
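
The difference between combinations and permutations, and the fact that the terms of (3-6) sum to one, can be illustrated with the Python standard library. The values of n and p below are hypothetical.

```python
import math

n = 4          # number of trials, for instance four parallel-connected converters
p = 0.95       # hypothetical probability of success in each trial
q = 1.0 - p

for r in range(n + 1):
    combinations = math.comb(n, r)    # order of the events is ignored
    permutations = math.perm(n, r)    # order of the events matters
    term = combinations * p ** r * q ** (n - r)
    print(r, combinations, permutations, f"{term:.6f}")

total = sum(math.comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1))
print(total)
```

For every r the number of combinations is equal to or less than the number of permutations, and the binomial terms add up to 1 apart from floating point rounding.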
 
 In order for (3-6) to be applicable there must be a fixed number of trials, each resulting in either
success or failure. Also, each trial must be independent and have the same probability of success.
However, in many real-world situations these prerequisites for the applicability of the binomial 
distributions cannot be met and a direct application of the distribution is not possible. While the 
principles behind the distribution and the concept of combinations still apply, an adapted version of 
the distribution can be established by analytically accounting for the fact that a part failure is a 
mutually exclusive event, meaning that in the event the part fails it cannot continue to successfully 
perform its intended task. Similarly, different failure modes are mutually exclusive events, since a 
part cannot be shorted out and at the same time suffer from an open circuit fault. These observations 
lead to the concept of complementary events, meaning that if one outcome does not occur, the other 
must. In terms of reliability evaluation of electronic systems this indicates that if a part has not failed 
it works.  
 Using these basic observations for adapting the probability theory to real-world system 
evaluations produces an equation similar in concept to the binomial distribution. Adjusting (3-6)
results in the following equation, whose usability is limited to a small number of trials (n):
 
 $(p_x + q_x)^n \Rightarrow (p_1 + q_1) \cdot (p_2 + q_2) \cdot \ldots \cdot (p_n + q_n)$   x = 1, 2, 3, ..., n   (3-7)
 
 Since all parentheses in (3-7) represent different probability states for the system under 
consideration, no general expression for evaluating the reliability can be deduced. Hence, an event 
assessment has to be made in each case in order to properly combine identical system probabilities. 
 Having briefly described the concepts of combining probabilities, the focal point in the remainder 
of this section will be an adaptation of the theory for use in reliability evaluations of parallel-
connected systems. This is done by applying the theory to a set of 4 parallel-connected converters. 
The resulting equations will then be extended to provide the reliability of a system comprised of 5, 3 
and 2 converters. 
 
Parallel-connection of 4 converters – system success requiring at least 3 working converters: 
 
Probability of converter 1 working  : p1  
Probability of converter 1 failed  : q1 = 1 - p1   
Probability of converter 2 working  : p2  
Probability of converter 2 failed : q2 = 1 – p2   
Probability of converter 3 working  : p3  
Probability of converter 3 failed : q3 = 1 – p3   
Probability of converter 4 working  : p4  
Probability of converter 4 failed : q4 = 1 – p4   
 
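Number of combinations:  $(p_1 + q_1) \cdot (p_2 + q_2) \cdot (p_3 + q_3) \cdot (p_4 + q_4)$
  ⇓
 $p_1 \cdot p_2 \cdot p_3 \cdot p_4 + q_1 \cdot p_2 \cdot p_3 \cdot p_4 + p_1 \cdot q_2 \cdot p_3 \cdot p_4 + q_1 \cdot q_2 \cdot p_3 \cdot p_4$
 $+\; p_1 \cdot p_2 \cdot q_3 \cdot p_4 + q_1 \cdot p_2 \cdot q_3 \cdot p_4 + p_1 \cdot q_2 \cdot q_3 \cdot p_4 + q_1 \cdot q_2 \cdot q_3 \cdot p_4$
 $+\; p_1 \cdot p_2 \cdot p_3 \cdot q_4 + q_1 \cdot p_2 \cdot p_3 \cdot q_4 + p_1 \cdot q_2 \cdot p_3 \cdot q_4 + q_1 \cdot q_2 \cdot p_3 \cdot q_4$
 $+\; p_1 \cdot p_2 \cdot q_3 \cdot q_4 + q_1 \cdot p_2 \cdot q_3 \cdot q_4 + p_1 \cdot q_2 \cdot q_3 \cdot q_4 + q_1 \cdot q_2 \cdot q_3 \cdot q_4$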
 
 
 The abovementioned equation includes all $2^4 = 16$ combinations of system success ($p_x$) and failure
($q_x$). Since the system can tolerate a maximum of 1 converter failure, the combinations resulting in
system success are:
 
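 $R_{System} = p_1 \cdot p_2 \cdot p_3 \cdot p_4 + q_1 \cdot p_2 \cdot p_3 \cdot p_4 + p_1 \cdot q_2 \cdot p_3 \cdot p_4 + p_1 \cdot p_2 \cdot q_3 \cdot p_4 + p_1 \cdot p_2 \cdot p_3 \cdot q_4$   (3-8)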
 
 Had the probabilities of converter success and failure for each converter been identical (p1 = p2 =
p3 = p4   ↔   q1 = q2 = q3 = q4), as required for application of the binomial distribution in its basic form, (3-8)
could have been simplified:
 
 $R_{System} = p^4 + 4 \cdot p^3 \cdot q$   (3-9)
 
 The result in (3-9) is often used in the reliability evaluation of parallel-connected systems. 
However, the conditions for which it is derived should be recognized, as a special case of (3-8), and 
used accordingly.  
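
The enumeration leading to (3-8) can be reproduced programmatically. The sketch below lists all success/failure combinations with `itertools.product` and sums the probabilities of those that leave enough converters working. The function name `system_reliability` and the converter reliabilities are hypothetical. The converters are assumed to fail independently, as in the derivation above.

```python
from itertools import product


def system_reliability(p_working, min_working):
    """Sum the probabilities of all success/failure combinations that leave
    at least `min_working` converters operating."""
    total = 0.0
    for states in product((True, False), repeat=len(p_working)):
        if sum(states) < min_working:
            continue
        probability = 1.0
        for works, p in zip(states, p_working):
            probability *= p if works else 1.0 - p
        total += probability
    return total


# Hypothetical, non-identical converter reliabilities p1..p4, as in (3-8).
different = system_reliability([0.99, 0.98, 0.97, 0.96], min_working=3)

# Identical converters: the enumeration should agree with (3-9).
p = 0.98
q = 1.0 - p
identical = system_reliability([p] * 4, min_working=3)
print(different, identical, p ** 4 + 4 * p ** 3 * q)
```

Because the number of combinations grows as 2 to the power of n, the enumeration is practical only for a small number of converters, which matches the limitation stated for (3-7).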
 Deducing the binomial coefficients leading to system success in systems comprised of a different 
number of converters follow the same principles as described in the case of 4 parallel-connected 
converters. Consequently, only the results will be provided.  
 
5-converter system, success requires 4 working converters: 
 
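The total number of combinations is $2^5 = 32$ of which only 6 are valid for system success:

 $R_{System} = p_1 \cdot p_2 \cdot p_3 \cdot p_4 \cdot p_5 + q_1 \cdot p_2 \cdot p_3 \cdot p_4 \cdot p_5 + p_1 \cdot q_2 \cdot p_3 \cdot p_4 \cdot p_5$
 $\qquad\qquad +\; p_1 \cdot p_2 \cdot q_3 \cdot p_4 \cdot p_5 + p_1 \cdot p_2 \cdot p_3 \cdot q_4 \cdot p_5 + p_1 \cdot p_2 \cdot p_3 \cdot p_4 \cdot q_5$   (3-10)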
 
In the special case of identical probabilities this equation becomes: 
 
 $R_{System} = p^5 + 5 \cdot p^4 \cdot q$   (3-11)
 
3-converter system, success requires 2 working converters: 
 
The total number of combinations is $2^3 = 8$ of which only 4 are valid for system success:

 $R_{System} = p_1 \cdot p_2 \cdot p_3 + q_1 \cdot p_2 \cdot p_3 + p_1 \cdot q_2 \cdot p_3 + p_1 \cdot p_2 \cdot q_3$   (3-12)
 
In the special case of identical probabilities this equation becomes: 
 
$$R_{System} = p^3 + 3 \cdot p^2 \cdot q \tag{3-13}$$
 
2-converter system, success requires 1 working converter: 
 
The total number of combinations is $2^2 = 4$ of which only 3 are valid for system success: 
 
$$R_{System} = p_1 \cdot p_2 + q_1 \cdot p_2 + p_1 \cdot q_2 \tag{3-14}$$
 
In the special case of identical probabilities this equation becomes: 
 
$$R_{System} = p^2 + 2 \cdot p \cdot q \tag{3-15}$$
 
 With this set of reliability equations for evaluating parallel-connected systems, the foundation 
for probability combinations has been established. As will become apparent later in this section, 
the reduction of complex networks follows the same rules as the combination of systems with 
different probabilities of success.  
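
 The enumeration behind (3-8) to (3-15) can be checked numerically. The following is a minimal sketch, not part of the original calculations: the function names and the probabilities in the usage lines are illustrative assumptions. It sums the probability of every success/failure combination in which at most one converter has failed and compares the result with the identical-probability special case.

```python
from itertools import product
from math import isclose


def system_reliability(p, max_failures=1):
    """Sum the probabilities of all converter state combinations
    in which at most `max_failures` converters have failed.

    p is a list with the probability of success of each converter.
    """
    total = 0.0
    for state in product((True, False), repeat=len(p)):
        if state.count(False) > max_failures:
            continue  # more failures than the system tolerates
        prob = 1.0
        for works, p_i in zip(state, p):
            prob *= p_i if works else (1.0 - p_i)
        total += prob
    return total


def identical_units(n, p):
    """Special case with identical probabilities: p^n + n * p^(n-1) * q."""
    q = 1.0 - p
    return p**n + n * p ** (n - 1) * q


# Illustrative values only (assumed, not taken from the thesis).
print(system_reliability([0.95, 0.90, 0.90, 0.85]))
print(isclose(system_reliability([0.9] * 4), identical_units(4, 0.9)))
```

 With non-identical probabilities only the enumeration applies; the closed form is valid only when all converters share the same p.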
 
3.3.2 Part failure rate 
 
 Describing individual part failure rates as a function of operating temperature, voltage and/or 
current stress, part quality or any other parameter that affects part reliability requires a set of 
equations. These equations can be found in the MIL-HDBK-217F, but for easy reference, they are 
listed below: 
 
 
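$$
\begin{aligned}
\lambda_{MOSFET} &= \lambda_b \cdot \pi_T \cdot \pi_A \cdot \pi_R \cdot \pi_S \cdot \pi_Q \cdot \pi_E \cdot 1000 \\
\lambda_{IC} &= (C_1 \cdot \pi_T + C_2 \cdot \pi_E) \cdot \pi_Q \cdot \pi_L \cdot 1000 \\
\lambda_{Diode} &= \lambda_b \cdot \pi_T \cdot \pi_S \cdot \pi_C \cdot \pi_Q \cdot \pi_E \cdot 1000 \\
\lambda_{Bipolar} &= \lambda_b \cdot \pi_T \cdot \pi_R \cdot \pi_S \cdot \pi_Q \cdot \pi_E \cdot 1000 \\
\lambda_{Capacitor} &= \lambda_b \cdot \pi_{CV} \cdot \pi_Q \cdot \pi_E \cdot 1000 \\
\lambda_{Resistor} &= \lambda_b \cdot \pi_R \cdot \pi_Q \cdot \pi_E \cdot 1000 \\
\lambda_{Inductor} &= \lambda_b \cdot \pi_C \cdot \pi_Q \cdot \pi_E \cdot 1000
\end{aligned}
$$
 All failure rates are in failures per $10^9$ hours [FIT] 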
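
 The structure shared by these expressions can be sketched in a few lines. The sketch assumes that the base failure rate and the C1/C2 terms are given in failures per 10^6 hours, so that the factor 1000 converts the result to FIT. The function names and all numeric values are illustrative assumptions and are not handbook data.

```python
from math import prod


def part_failure_rate_fit(base_rate, pi_factors):
    """Base failure rate times all pi factors, scaled by 1000 to obtain FIT."""
    return base_rate * prod(pi_factors) * 1000.0


def ic_failure_rate_fit(c1, c2, pi_t, pi_e, pi_q, pi_l):
    """IC model: (C1 * pi_T + C2 * pi_E) * pi_Q * pi_L * 1000."""
    return (c1 * pi_t + c2 * pi_e) * pi_q * pi_l * 1000.0


# Placeholder numbers chosen only to show the call pattern.
resistor_fit = part_failure_rate_fit(0.0037, [1.0, 3.0, 1.0])  # pi_R, pi_Q, pi_E
ic_fit = ic_failure_rate_fit(0.01, 0.002, 1.5, 2.0, 1.0, 1.0)
print(resistor_fit, ic_fit)
```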
 
 A detailed parameter description can be found in the MIL-HDBK-217F for which reason only the 
two parameters of immediate interest at this stage will be described. The first of these parameters is 
the temperature. This parameter affects all electronic parts. As a rule of thumb it can be assumed that 
a 10°C increase in operating temperature doubles the part failure rate [Xp01]. The temperature 
information to be included in the failure rate calculations is denoted by the parameter πT. Also, there 
is a hidden inclusion of temperature information in the base failure rate λb. 
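
 The rule of thumb can be written as a scaling law. The sketch below assumes the doubling holds exactly over the temperature range of interest, which is a simplification; the function name and the example values are illustrative.

```python
def scaled_failure_rate(rate_at_ref, temp_c, ref_temp_c):
    """Scale a failure rate by a factor of 2 per 10 degrees C above the reference."""
    return rate_at_ref * 2.0 ** ((temp_c - ref_temp_c) / 10.0)


# A part rated at 100 FIT at 40 degrees C, operated at 60 degrees C:
print(scaled_failure_rate(100.0, 60.0, 40.0))  # two doublings -> 400.0
```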
 The other parameter is the electrical stress factor denoted by πS. Compared to the temperature, 
this parameter has a significantly lower overall impact on system reliability. A more detailed 
discussion of this topic has already been provided in section ‘3.2 Derating’ and is therefore omitted 
in this section. 
 
3.3.3 Network modeling and reduction 
 
 Obtaining high reliability within a given system can be accomplished in a number of ways. This 
section provides the foundation for examining complex networks by applying network reduction 
techniques to simplify system reliability evaluations. The results of this section will be used in 
conjunction with most reliability evaluations in subsequent chapters particularly in assessment of the 
array-based redundancy concept.  
 Based on concepts derived from set theory, the comparison of different system topologies can be 
done by using network reduction techniques. The analysis originates with a short description of two 
conceptually different types of systems – the series system and the parallel system. From there, more 
complicated analysis techniques involving combined series-parallel systems are derived. 
 
Series systems 
 
 This type of system is particularly simple, since the overall system reliability is merely based on 
the accumulation of all calculated failure rates. Although the individual parts might not be connected 
in series physically this technique also applies to the reliability assessment of the overall system. In 
fact, this method is suggested in MIL-HDBK-217F for doing reliability prediction of electronic 
equipment. In other words, this method accounts for the fact that all parts of the system are needed 
for system success. The mathematical foundation of series system reduction is given by: 
 
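$$R(t) = R_1 \cdot R_2 \cdot \ldots \cdot R_n = e^{-\lambda_1 t} \cdot e^{-\lambda_2 t} \cdot \ldots \cdot e^{-\lambda_n t} = e^{-\lambda_{Total} t} \quad \text{where} \quad \lambda_{Total} = \sum_{i=1}^{n} \lambda_i \tag{3-16}$$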
 
 It can be seen that this equation corresponds to the survival function (3-3) with the exception that 
the failure rate in (3-16) is the accumulated value of all system failure rates. 
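
 As a minimal sketch of (3-16), assuming constant failure rates given in FIT (the function name and the example values are illustrative):

```python
from math import exp, isclose


def series_reliability(failure_rates_fit, hours):
    """Reliability of a series system: exp(-lambda_Total * t).

    failure_rates_fit holds one failure rate per part in failures per 1e9 hours.
    """
    lambda_total = sum(failure_rates_fit) * 1e-9  # failures per hour
    return exp(-lambda_total * hours)


rates = [100.0, 200.0, 300.0]  # assumed part failure rates in FIT
t = 50_000.0                   # assumed mission time in hours
r_total = series_reliability(rates, t)

# The accumulated failure rate gives the same result as multiplying
# the individual part reliabilities.
r_product = 1.0
for rate in rates:
    r_product *= series_reliability([rate], t)
print(isclose(r_total, r_product))
```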
 
Parallel systems  
 
 This type of system is much harder to describe in terms of simple mathematical functions. A 
parallel-connected system has many variables, ranging from the number of parallel-connected blocks 
to the requirements for system integrity in terms of system failure. In this thesis, the purpose of 
parallel-connection is to achieve the so-called N+1 redundancy (redundancy will be discussed in a 
subsequent section). The exception is chapter ‘4 Array-based redundancy’, where N+2 redundancy is 
required, but for network reduction purposes in, for example, chapter ‘6 Load sharing’ a set of 
equations is needed. A generalized expression for reducing a system comprised of n units, where 
only a single unit is needed for success, is given by: 
 
$$R_P(t) = 1 - Q_1 \cdot Q_2 \cdot \ldots \cdot Q_n = 1 - (1 - e^{-\lambda_1 t}) \cdot (1 - e^{-\lambda_2 t}) \cdot \ldots \cdot (1 - e^{-\lambda_n t}) = 1 - \prod_{i=1}^{n} (1 - e^{-\lambda_i t}) \tag{3-17}$$
 
 This expression is very different from the cumulative failure distribution (3-2). In fact, the only 
resemblance is the derivation of the unavailability. Although the result in (3-17) only requires a 
single unit to work for system success, the expression is still applicable in, for example, a 5-unit 
system needing 4 units for system success as long as the valid system states are accounted for during 
the reduction process.  
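
 A minimal sketch of (3-17), assuming constant failure rates in failures per hour and a system that succeeds as long as a single unit works (the function name and the example values are illustrative):

```python
from math import exp, log, isclose


def parallel_reliability(failure_rates_per_hour, hours):
    """Reliability of n parallel units where one working unit is sufficient."""
    all_failed = 1.0
    for lam in failure_rates_per_hour:
        all_failed *= 1.0 - exp(-lam * hours)  # Q_i(t) of each unit
    return 1.0 - all_failed


# Two units that each have R = 0.9 at t = 1000 h (assumed values).
t = 1000.0
lam = -log(0.9) / t
print(isclose(parallel_reliability([lam, lam], t), 0.99))
```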
 The system reduction equations described in (3-16) and (3-17) are limited in their use to simple 
series and/or parallel configurations, which fortunately provides an accurate description in most 
systems including the load sharing techniques and digital approach proposed in this thesis. However, 
in order to accurately describe complex interconnections in multipart systems a different approach 
has to be taken. One such system is the array-based redundancy configuration presented in chapter ‘4 
Array-based redundancy’. In fact, several techniques can be applied for reliability assessment of 
these systems. The technique used in this research work, originally taken from graph theory, is based 
on connection matrices. The effectiveness of this particular technique is that it preserves the 
interconnection information even as the network is reduced.  
 Since the connection matrix technique does not introduce further mathematical expressions that 
would have to be generalized in order to be applicable in a wide variety of situations throughout this 
thesis, the concept of network reduction using connection matrices is introduced by means of a case 
study. The system under consideration can be seen in Figure 7. 
 
[Figure 7 diagram: blocks R1 to R6 interconnecting the nodes A to F between Input (A) and Output (F); R1 = 0.9, R2 = 0.9, R3 = 0.9, R4 = 0.8, R5 = 0.7, R6 = 0.7]
 
Figure 7 : Network reduction example 
 
 A systematic description of the blocks that interconnect the different nodes is achieved by setting 
up the information in a matrix as shown in (3-18). 
 
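$$
\mathbf{A} =
\begin{array}{c|cccccc}
  & A & B & C & D & E & F \\ \hline
A & 1 & R_1 & 0 & 0 & R_4 & R_6 \\
B & 0 & 1 & R_2 & R_2 & 0 & 0 \\
C & 0 & 0 & 1 & 0 & 0 & R_3 \\
D & 0 & 0 & 0 & 1 & 0 & R_5 \\
E & 0 & 0 & 0 & 0 & 1 & R_5 \\
F & 0 & 0 & 0 & 0 & 0 & 1
\end{array}
\tag{3-18}
$$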
 
 In (3-18), a zero indicates that there is no connection between two nodes and unity represents a 
‘virtual’ connection between the node and itself. With the orientation of the matrix information the 
latter representation can be seen to form the principal diagonal.  
 Comparing Figure 7 to the information contained in (3-18) it can be seen that the blocks 
interconnecting the individual nodes are characterized by algebraic variables. This allows for a wide 
range of applications of the connection matrix technique. In this example the information represents 
the probability of each interconnecting block providing a fault free operation for a specified period of 
time.  
 An important advantage of representing the system information in matrix form is that directional 
data is stored within the interconnection information. The system shown in Figure 7 only allows for 
unidirectional flow as indicated by the arrows, thus the information below the principal diagonal is 
zero. Bidirectional flow is used in the array-based redundancy technique and will be described in 
chapter ‘4 Array-based redundancy’. Based on the latter information it can at this point be concluded 
that the matrix entries below the principal diagonal in the array-based redundancy system are non-
zero.  
 Having established the entire connection matrix, the next step is either node removal through 
sequential reduction or matrix multiplication. When system computations are assisted by 
mathematical software the latter method provides the results in the smallest number of computation 
steps, thus being the preferred method in this thesis. Application of the matrix multiplication is 
straightforward as the basic connection matrix is multiplied by itself until the resulting matrix 
remains unchanged.  
 Multiplying (3-18) with itself once results in: 
 
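$$
\mathbf{A}^2 =
\begin{array}{c|cccccc}
  & A & B & C & D & E & F \\ \hline
A & 1 & R_1 & R_1 \cdot R_2 & R_1 \cdot R_2 & R_4 & R_4 \cdot R_5 + R_6 \\
B & 0 & 1 & R_2 & R_2 & 0 & R_2 \cdot R_3 + R_2 \cdot R_5 \\
C & 0 & 0 & 1 & 0 & 0 & R_3 \\
D & 0 & 0 & 0 & 1 & 0 & R_5 \\
E & 0 & 0 & 0 & 0 & 1 & R_5 \\
F & 0 & 0 & 0 & 0 & 0 & 1
\end{array}
\tag{3-19}
$$

This matrix differs from (3-18), thus (3-19) has to be multiplied once again: 

$$
\mathbf{A}^3 =
\begin{array}{c|cccccc}
  & A & B & C & D & E & F \\ \hline
A & 1 & R_1 & R_1 \cdot R_2 & R_1 \cdot R_2 & R_4 & R_1 \cdot R_2 \cdot R_3 + R_1 \cdot R_2 \cdot R_5 + R_4 \cdot R_5 + R_6 \\
B & 0 & 1 & R_2 & R_2 & 0 & R_2 \cdot R_3 + R_2 \cdot R_5 \\
C & 0 & 0 & 1 & 0 & 0 & R_3 \\
D & 0 & 0 & 0 & 1 & 0 & R_5 \\
E & 0 & 0 & 0 & 0 & 1 & R_5 \\
F & 0 & 0 & 0 & 0 & 0 & 1
\end{array}
\tag{3-20}
$$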
 
 From (3-20) it can be seen that all matrix elements remain unchanged except for one. It should be 
noted that the coefficients of the individual matrix elements change as the matrix is multiplied by 
itself. However, in relation to finding the transmission or connection from input to output, these 
coefficients are unimportant and are therefore omitted. Since the entry in (row A; column F) still 
changes as the multiplication process is applied to (3-18), the matrix must be multiplied once 
more. 
 
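$$
\mathbf{A}^4 =
\begin{array}{c|cccccc}
  & A & B & C & D & E & F \\ \hline
A & 1 & R_1 & R_1 \cdot R_2 & R_1 \cdot R_2 & R_4 & R_1 \cdot R_2 \cdot R_3 + R_1 \cdot R_2 \cdot R_5 + R_4 \cdot R_5 + R_6 \\
B & 0 & 1 & R_2 & R_2 & 0 & R_2 \cdot R_3 + R_2 \cdot R_5 \\
C & 0 & 0 & 1 & 0 & 0 & R_3 \\
D & 0 & 0 & 0 & 1 & 0 & R_5 \\
E & 0 & 0 & 0 & 0 & 1 & R_5 \\
F & 0 & 0 & 0 & 0 & 0 & 1
\end{array}
\tag{3-21}
$$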
 
 Comparing (3-21) with (3-20) it becomes clear that during this last multiplication the resulting 
matrix remained unchanged, thus the multiplication process can be stopped and the result can be 
found using either the $\mathbf{A}^3$ or $\mathbf{A}^4$ matrix. Finding the transmission from input (A) to output (F) results 
in the following combinations of block reliabilities: 
 
$$R_{A \rightarrow F} = R_1 \cdot R_2 \cdot R_3 + R_1 \cdot R_2 \cdot R_5 + R_4 \cdot R_5 + R_6 \tag{3-22}$$
 
 It can be seen that the concept of connection matrix theory in applications such as the one 
described above simplifies a given system in a very straightforward way while the system 
information remains unchanged in the original matrix (A). The result (3-22) can be graphically 
illustrated in terms of a single block as shown in Figure 8.  
 
[Figure 8 diagram: a single block λResulting between Input and Output]
 
Figure 8 : Result of network reduction 
 
 Reliability evaluation based on connection matrix techniques provides a simple means for 
reducing complex systems. In chapter ‘4 Array-based redundancy’, this approach is used extensively. 
However, due to the very complicated interconnections in the array-based redundancy concept the 
evaluation results will be limited to a system matrix – the so-called topological matrix and the final 
result. All calculations can be found in the directory ‘Mathematica’ on the accompanying CD. 
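
 The matrix multiplication procedure applied to Figure 7 can be reproduced with a short symbolic sketch. It is not the Mathematica code referred to above. As an assumption made for illustration, each matrix entry is stored as a set of products, and each product as a set of block names, so that the coefficients are dropped automatically as described in the text. All function and variable names are hypothetical.

```python
NODES = "ABCDEF"
ONE = frozenset({frozenset()})  # unity on the principal diagonal (empty product)
ZERO = frozenset()              # no connection between two nodes


def connection_matrix():
    """Build the connection matrix (3-18) for the network in Figure 7."""
    edges = {
        ("A", "B"): "R1", ("A", "E"): "R4", ("A", "F"): "R6",
        ("B", "C"): "R2", ("B", "D"): "R2",
        ("C", "F"): "R3", ("D", "F"): "R5", ("E", "F"): "R5",
    }
    n = len(NODES)
    matrix = [[ONE if i == j else ZERO for j in range(n)] for i in range(n)]
    for (src, dst), name in edges.items():
        matrix[NODES.index(src)][NODES.index(dst)] = frozenset({frozenset({name})})
    return matrix


def multiply(a, b):
    """Matrix product where '+' is set union and '*' merges block names."""
    n = len(a)
    return [
        [
            frozenset(x | y for k in range(n) for x in a[i][k] for y in b[k][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def reduce_network(a):
    """Multiply by the connection matrix until the result stops changing."""
    current, power = a, 1
    while True:
        following = multiply(current, a)
        if following == current:
            return current, power
        current, power = following, power + 1


def transmission(matrix, src, dst):
    """List the products of block reliabilities connecting src to dst."""
    entry = matrix[NODES.index(src)][NODES.index(dst)]
    return sorted("*".join(sorted(term)) for term in entry)


reduced, power = reduce_network(connection_matrix())
print(power)                            # 3, the power after which nothing changes
print(transmission(reduced, "A", "F"))  # the four products of (3-22)
```

 The set representation only tracks which blocks appear in each path. Turning the products of (3-22) into a numerical reliability is a separate step that is not attempted here.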
 
3.3.4 Summary 
 
 For easy reference the expressions deduced in this section applicable to the reliability evaluations 
in this thesis are shown in Table 5. 
 
 
Description | Expression/variable 
Failure rate | $\lambda$ 
Density function | $f(t) = \lambda \cdot e^{-\lambda t}$ 
Survivability function (survivability) | $R(t) = e^{-\lambda t}$ 
Cumulative failure distribution (probability of failure) | $Q(t) = 1 - e^{-\lambda t}$ 
Reliability of series systems | $R(t) = e^{-\lambda_{Total} t}$ where $\lambda_{Total} = \sum_{i=1}^{n} \lambda_i$ 
Reliability of parallel systems | $R_P(t) = 1 - \prod_{i=1}^{n} (1 - e^{-\lambda_i t})$ 
Standard deviation | $\sigma = 1 / \lambda$ 
MTBF = MTTF = expected value (in this thesis only) | $E(t) = 1 / \lambda$ 
Parallel-system with identical probabilities of success | $R_{Parallel\text{-}system} = p^n + n \cdot p^{n-1} \cdot q$ 
N+1 redundant 2-unit parallel-connected system | $R_{System} = p_1 \cdot p_2 + q_1 \cdot p_2 + p_1 \cdot q_2$ 
Table 5 : Summary of applicable analytic expressions 
 
 This section has provided a short coverage of the exponential distribution used in this thesis. 
Several analytical expressions for reliability evaluation of the power systems proposed in succeeding 
chapters have been established and adapted to the application at hand. 
 
 
3.4 Redundancy 
 
 Redundancy is the concept of parallel-connecting multiple units often with identical 
specifications. In its basic form redundancy implementation is very simple and adds several 
reliability elements to the overall system.  
 Adapting the concept of redundancy to the topics in this thesis the parallel-connection of multiple 
units is divided into two distinct areas of applications. The first, mainly for use in systems where an 
increased output current is desirable, makes use of multiple identical units to achieve overall system 
characteristics similar to a large single unit power system. The advantages of this approach range 
from improved mechanical and/or electrical stress characteristics to easy system synthesis and short 
time to market. Furthermore, multiple parallel-connected units have the advantage of providing some 
resilience toward faults although not as absolute as the technique described below. 
 The second area of application is the implementation of highly reliable power systems with fault 
tolerance integrated into the system configuration. This is achieved by semi-isolating the output of 
the individual units from the common output power bus. In turn, this averts faults within any single 
unit from affecting the system performance as seen by the load. In theory, the load would be 
completely unaware of any power system integrity degradation unless intentionally notified by the 
power system. In a real-world setup the latter scenario is virtually impossible to achieve since most 
power faults result in some degree of disturbance on the common output power bus. Hence, the 
system must be designed to keep such disturbances small and short enough to guarantee 
compliance with load specifications at all times.  
 In subsequent chapters both areas of applications will be examined by means of detailed 
reliability calculations and practical test configurations for measuring purposes. The measurements 
will verify that the use of redundancy in different power system configurations increases the overall 
system reliability. This section provides a description of the different types of redundancy and 
presents the mathematical foundation for reliability evaluations of such redundant systems. The 
equations deduced in this section combined with those deduced in the previous section will form the 
basis for most subsequent reliability calculations. 
 Another important consideration is the determination of the number of input sources. According 
to the Uptime Institute, fault tolerant classification of a power system requires the input power to 
come from at least two independent sources (this issue is briefly mentioned in section ‘3.5 
Standards’). In these situations it might be irrelevant to protect the converter inputs with fuses or so-
called front switches. However, in many applications it might be impractical to implement the power 
system by means of two independent input sources. In such situations the input of each converter 
making up the power system must be protected in order to prevent input voltage loss during a single 
point failure. 
 
3.4.1 System redundancy 
 
 This type of redundancy is performed at the highest level possible within the overall power 
system. As will be determined in later sections the use of this type of redundancy usually requires 
some form of load sharing to avoid overstressing single units. However, as load sharing circuitry is 
added to the system, overall complexity and cost increase accordingly. Since the main idea of system 
redundancy is to ensure that the load is provided with adequate power to perform its intended tasks 
to the fullest extent without any degradation, the added circuitry must be analyzed to guarantee fault 
free operation and dynamic performance similar to that inherent in the individual units in order to 
avoid reducing the overall system reliability.  
 When implementing system redundancy for reliability optimization by means of parallel-connected 
converters, the redundancy configuration must first be chosen. The most common approach 
is an N+1 redundant configuration where one extra converter is added to the system. This approach 
enables the system to tolerate one fault while still providing the required output power. The most 
straightforward implementation of such an N+1 redundant system is the design of two identical 
converters each capable of supplying the maximum load current. However, this approach results in a 
100% power ‘overbudget’ – meaning that the available system power is twice that required by the 
specifications. Increasing the number of converter units reduces this power ‘overbudget’. For an N+1 
redundant power system Figure 9 shows the required system power in percent of total system power 
capability as the number of converter units increases. The other curve is an index that takes into 
account the decrease in converter unit cost price, the increase in circuit complexity and the increase 
in load sharing circuitry costs – all a function of the number of converter units. The index is based on 
component cost (per 1000 pieces) and standard load sharing implementation circuitry. It should be 
noted that the index curve in many situations will change as a function of the number of units when 
large scale manufacturing is employed and/or different load sharing techniques are used. 
[Figure 9 plot: x-axis ‘Number of units in N+1 system’ (0 to 12); y-axis ‘Power requirement in % of system capability’ (0 to 60)]
 
Figure 9 : Power requirement in percent of total system power capability 
 
 From Figure 9 it can be seen that the two curves intersect somewhere between 3 and 4 converter 
units. This point is the optimum in the configuration at hand. However, as indicated above this 
optimum point is most likely to shift to either side along the axis of abscissas when other power 
system implementations are considered.  
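
 The power ‘overbudget’ mentioned above can be quantified with a small sketch. It assumes that the N+1 units are identical and that each unit is rated for 1/N of the load, so that any N units can carry the full load. The function names are illustrative, and the sketch does not reproduce the cost index shown in Figure 9.

```python
def power_overbudget_percent(n_required):
    """Installed power beyond the load requirement, in percent of the load.

    n_required is N in an N+1 system of identical units rated at load / N.
    """
    return 100.0 / n_required


def required_share_percent(n_required):
    """Load requirement in percent of the total installed power capability."""
    return 100.0 * n_required / (n_required + 1)


for n in (1, 2, 3, 4):
    print(n + 1, power_overbudget_percent(n), required_share_percent(n))
```

 For N = 1 (two units) the overbudget is 100%, in agreement with the two-converter case described in the text, and it falls as more units share the load.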
 
3.4.2 Parts/block redundancy 
 
 This technique, ultimately leading to power system proposals for the Precision Docking Project 
managed by Partners for Advanced Transit and Highways, copies the methods applicable to system 
redundancy. Thus similar calculations, although modified slightly, apply. 
 In this context the interconnection of two points within a network is denoted by the term ‘block’. 
This use of the term coincides with its use in the array-based redundancy concept described in 
chapter ‘4 Array-based redundancy’ where the term represents a set of parts interconnecting two 
points in a power system. In general, a part with n inputs and m outputs is represented by a block 
with the same number of inputs and outputs. Figure 10 shows the most basic block – a single part. 
 
[Figure 10 diagram: a single part R represented by a block R]
 
Figure 10 : Basic block identification 
 
[Figure 9 legend: one curve is a cost index formed from the complexity index, the LS circuitry index and the price index as a function of the number of units x; the other curve is (P_Max pr. unit(x) - P_Max pr. unit(x+1)) / P_Max pr. unit(x) ⋅ 100. Number of units in N+1 system: N + 1 = 2    ⇒    N = 1]
 
 For block redundancy to be effective the probability of occurrence of the failure modes associated 
with the block under consideration must be different. In other words, the likelihood that a block for 
example fails short must differ from the probability that the same block fails open circuit. Only then 
is it possible to effectively use redundancy principles for improving the reliability of the block in 
question. As an example, consider a block with the following failure modes: 
 
• Block failed open circuit 
• Block failed short circuit 
 
 Taking corrective actions towards the block failure necessitates knowledge of the distribution of 
open circuit failures and short circuit failures. Such information can be found in various part 
standards. The information used in this thesis is the same as Alcatel Space Denmark uses for its 
space equipment for which reason a detailed parts failure mode list cannot be disclosed. However, 
the part under consideration in this example is a diode with the following failure mode probabilities: 
 
• Block failed open circuit  → 65% 
• Block failed short circuit  → 35% 
 
The percentages indicate that, given a block failure, the block fails short circuit in 35% of the 
cases and open circuit in the remaining 65%. Optimizing 
this block can be accomplished by doubling the number of blocks, thus forming a new parallel-
connected block as shown in Figure 11. 
 
[Figure 11 diagram: two original blocks P connected in parallel between the nodes A and B, represented by a new block PNew between A and B]
 
Figure 11 : Parallel-connection of two blocks might increase the reliability 
 
 For the probability assessment it should be noted that the two failure modes are mutually exclusive. 
This information is important when deducing the minimal cut set used to assess the probability of 
block survival. The following equation forms the minimal cut set: 
 
$$R_{New} = P_{Normal\ operation}^{2} + 2 \cdot P_{Normal\ operation} \cdot P_{Open\ circuit} \tag{3-23}$$
 
 Specific reliability calculations will not be provided in this section but can be found in the PATH 
report on power system reliability where similar calculations are performed for a buck freewheeling 
diode. 
 It should be noted that equation (3-23) is equal to the special case of probability combination of 2 
units with identical failure rates (see equation (3-15) in section ‘3.3 Statistical distributions and 
methods’). A graphical illustration of (3-23) is shown in Figure 12 where the block reliability is 
plotted as a function of the ratio of open circuit faults to short circuit faults. It can be seen that if the 
probability of short circuit faults is greater than the probability of open circuit faults the overall block 
reliability is less than the reliability of a single block, thus undermining the concept of parallel-
connection. As the ratio changes towards a larger probability of open circuit faults the system 
becomes, as expected, more and more reliable. 
 
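[Figure 12 plot: reliability (0.92 to 0.98) of a single block and of the parallel-connection of two blocks versus the ratio PO/PS (0.25, 1.00, 4.00), for PWorking = 0.95 and PFailure = 0.05; PO = probability of open circuit faults, PS = probability of short circuit faults]
 
Figure 12 : Plot of (3-23) with fault ratio as a variable 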
 
 The intersection of the two curves in Figure 12 indicates the point where the ratio of open circuit 
faults to short circuit faults is equal to one. In other words, at this particular intersection the 
reliability of the parallel-connected blocks is equal to the reliability of a single block. Thus, in any 
real-world implementation a ratio larger than one is needed in order to benefit from the block 
redundancy. 
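
 The behaviour of (3-23) around the intersection can be checked numerically. The sketch below uses the values given with Figure 12 (PWorking = 0.95, PFailure = 0.05) and the diode failure mode split of 65% open circuit and 35% short circuit. Expressing the split as an open circuit fraction of all failures is a parameterization chosen here for convenience, and the function name is illustrative.

```python
from math import isclose


def parallel_block_reliability(p_working, open_fraction):
    """Evaluate (3-23) for two identical parallel-connected blocks.

    open_fraction is the share of block failures that are open circuit,
    so the ratio PO/PS equals open_fraction / (1 - open_fraction).
    """
    p_open = (1.0 - p_working) * open_fraction
    return p_working**2 + 2.0 * p_working * p_open


single_block = 0.95
equal_split = parallel_block_reliability(single_block, 0.50)   # PO/PS = 1
diode_split = parallel_block_reliability(single_block, 0.65)   # 65% open circuit
short_heavy = parallel_block_reliability(single_block, 0.35)   # 35% open circuit

print(isclose(equal_split, single_block))  # no gain at a ratio of one
print(diode_split > single_block)          # gain when open circuit faults dominate
print(short_heavy < single_block)          # loss when short circuit faults dominate
```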
 
3.4.3 Analytical redundancy 
 
 Analytical redundancy can be characterized as the concept of determining the same parameter by 
means of different monitoring approaches. This means that faults in the overall system may be 
detected on the basis of generation and comparison of information from dissimilar sources. 
 The use of analytical redundancy is limited to the digital approach described in chapter ‘5 Digital 
control of DC-DC converters’. The measurement of parameters in these digital systems is based on 
simple theoretical relations between measurable parameters. As an example, the converter power 
throughput is related to the MOSFET temperature, which prevents the system from shutting down in 
case the output current measurement is lost.  
 The reliability calculations are adapted to the analytical redundancy concept by eliminating the 
probability of failure of certain operating modes. This is accomplished by relating as many system 
parameters as possible to one another, thus maximizing the number of ways the set of control 
variables can be deduced. The buck converters under consideration in chapter ‘5 Digital control of 
DC-DC converters’ are assessed by means of an FMECA that essentially forms the basis for the 
failure mode elimination. The reliability calculations are then performed as described in section ‘3.3 
Statistical distributions and methods’. A more detailed description of analytical redundancy in 
control applications can be found in [Bl01]. 
 In conclusion, it is worth noting that the concept of analytical redundancy would benefit most 
systems if implemented at the correct level. In simple analog controlled systems analytical 
redundancy would serve as a fault indicator by continuously comparing theoretical system 
constraints with actual system behavior. In digital implementations the use of analytical redundancy 
enables the system to respond intelligently to unusual system behavior, thus increasing the overall 
system fault resilience. 
 
 
3.5 Standards 
 
During the past decades a large number of reliability standards have evolved. The following list 
only mentions the most commonly used standards: 
 
• MIL-HDBK-217 is the original worldwide standard for electronic reliability analysis and 
supports both commercial and military grade components. The standard includes two different 
reliability methods – The Part Stress calculation method, which takes the actual temperature 
and stress information into account, and The Parts Count method, which is used for quick 
estimates and early design analysis.  
 
• The Telcordia method, formerly generated by Bell Communications Research (Bellcore), 
provides reliability models for commercial grade electronic components. This method also 
provides for using data from device burn-in, unit burn-in, laboratory testing, and actual field 
testing. 
 
• PRISM was originally developed by the Reliability Analysis Center (RAC), and is being used 
for MTBF prediction and system reliability analysis. PRISM enables the user to select parts 
from both the EPRD and NPRD documents published by RAC and enables the use of 
predecessor data and process grading factors in the reliability analysis. 
 
• CNET 93 is a standard developed by France Telecom that provides reliability models for a 
wide range of components. CNET 93 is a comprehensive model similar to MIL-HDBK-217 
and provides for a detailed stress analysis. 
 
• RDF 2000 is a newer version of the CNET 93 standard, developed by UTE. It uses cycling 
profiles and their applicable phases to provide a completely different basis for failure rate 
calculations. 
 
• HRD5 is a reliability standard developed by British Telecommunications plc that also 
provides models for a wide range of components. In general, HRD5 is similar to CNET 93, 
but provides simpler models and requires fewer data parameters for analysis. 
 
• 299B is based on the Chinese standard GJB/z 299B. After its conception it was translated into 
English by Beijing Yuntong Forever Sci.-Tech. Co. Ltd. 299B is very similar to the MIL-
HDBK-217 reliability standard, thus allowing the user to take actual temperature and stress 
information into account.  
 
Besides being the first worldwide standard the MIL-HDBK-217 is also among the most 
comprehensive standards in terms of reliability data for system evaluations. For this reason the only 
standard used in this thesis is the MIL-HDBK-217. Since there is no direct link between the 
abovementioned standards, an equipment reliability comparison would have to be performed using 
the same standard in order to be applicable. 
 As the concept of reliability becomes more important, new standards specifically addressing 
reliability issues are being developed. One example is the ‘Fault Tolerant Power Compliance 
Specification Ver. 1.2’ established by the members of ‘The Uptime Institute’. According to this 
specification the input of a power system must be comprised of a minimum of two AC power 
sources in order to qualify for the fault tolerant classification. 
 
3.6 Research based on applicable evaluation techniques 
 
 Having provided a short description of the techniques applicable to the current research this 
section seeks to correlate these techniques with the research work performed and the results 
obtained.  
 By setting aside the electrical constraints and considering a system from a strictly statistical point 
of view, the following question can be asked, ‘How is it possible to improve the reliability of a 
redundant power system?’ The theoretical solution to this question lies in a redundant reconfigurable 
system as described in chapter ‘4 Array-based redundancy’. The research in array-based redundancy 
implementation originated from the theoretical results of the reliability evaluation, derived in this 
chapter, and extending these to the point where a reconfigurable system could accurately be 
described. Consequently, the work on the array-based redundancy concept is mainly theoretical 
although a sample converter has been implemented (see Figure 29). Figure 13 shows the link 
between the techniques described in this chapter, and adapted to the application at hand, and the 
research work that resulted. 
 The work in digital control of DC-DC converters is a continuation of the outcome of the array-
based redundancy control but aimed at a more real-world implementation. The FMECA is used for 
assessing the system failure modes and examining the possibilities of eliminating as many of them as 
possible by means of analytical redundancy. This work originated with a student project intended to 
implement a fast and precise digital control of a buck converter. After completion it was clear that a 
relatively high sample frequency was obtainable and the work expanded to include a number of fault 
monitoring tasks that ultimately would eliminate certain failure modes. This work, described as an 
advanced version of the original digitally controlled converter, led to a publication presented at 
Power Electronics Specialists Conference 2003 in Mexico. 
 Having examined the possibilities of simple digital control of converters, the attention was turned 
towards analog system realizations based on system and parts redundancy. Chapters 6 and 7 describe 
this work. Figure 13 shows the link between prior art and the research work in analog reliability 
enhancement systems. 
 At the end of this thesis, the research work for Partners for Advanced Transit and Highways is 
presented. This work utilized the improvement techniques developed in this Ph.D. project to evaluate 
a real-world power system for use in automated busses.  
 
[Figure 13 diagram: the project diagram in Figure 1 (derating, FMECA, network analysis, statistical assessment, redundancy) linked to the research work: topological reliability assessment, array-based logic and graph theory, digital control of DC-DC converters, load sharing, thermal droop load sharing, ‘regular’ PSU and the 2003 annual report for PATH]
 
Figure 13 : Direct link between prior art, project diagram and research work 
 
 Figure 13 shows the direct link between the adapted reliability enhancement techniques and the 
research work presented in this thesis. 
 
 
3.7 References 
 
[Re02] Relex Reliability Software, http://www.relexsoftware.co.uk/ 
 
[Me01]  Statistical Methods for Reliability Data, William Q. Meeker and Luis A. Escobar, 
Wiley series in Probability and Statistics, ISBN 0-471-14328-6 
 
[Ma01]  Electronic Failure Analysis Handbook, Perry L. Martin, McGraw-Hill, 
ISBN 0-07-041044-5 
 
[De01]  Backplane Health Rests on Fault Finding, Tom DeLurio and George Hall, EETimes, 
October 13, 2003, Issue 1291, page 63 and 70. 
 
[Mø01]  Building Reliability into Power Electronic Systems, Jørgen Møltoft, Ørsted – DTU, 
Seminar, August 20th 2002. 
 
[Wo01]  Mathematica – A System for Doing Mathematics by Computer, Stephen Wolfram, 
Addison-Wesley Publishing Company, ISBN 0-201-51502-4 
 
[Mi01]  Reliability Prediction of Electronic Equipment, Military Handbook 217 
 
[Mi02]  Resistors - Selection and use of, Military Handbook 1999 
 
[Ac01]  Electronic Derating for Optimum Performance, Reliability Analysis Center, New 
York 
 
[Ra01]  Power Electronics Handbook, Muhammad H. Rashid, Academic Press series in 
Engineering, ISBN 0-12-581650-2 
 
[Up01]  The Uptime Institute, http://www.upsite.com/ 
 
[Bl01]  Diagnosis and Fault-Tolerant Control, Mogens Blanke, Michel Kinnaert, Jan Lunze 
and Marcel Staroswiecki, Springer, ISBN: 3-540-01056-4 
 
[Xp01]  Reliability in Electronics, XPiQ inc. application note, http://www.xp-iq.com/home.htm 
 
 
 
 
 
 
 
 
 
4. Array-based redundancy  Page 35 
 
4 Array-based redundancy 
 
 This chapter describes an alternative redundancy implementation based on the mathematical 
concept of array-based logic. The point of origin is redundancy at the subsystem level combined with 
redundancy at the top-most system level, meaning that redundancy is achieved at multiple levels 
throughout the system. The chapter introduces the power system in which the theoretical redundancy 
considerations are implemented. Reliability calculations verify that the proposed power system 
topology results in the anticipated increase in overall system reliability. 
 
 
4.1 Redundancy 
 
 Redundancy, as described in the previous chapter, was based on the concept of increasing the 
number of parallel-connected converter boards or parts from input to output, a view that is limited 
by the electrical and physical constraints of the system. From a traditional power system point of 
view this description is precise and adequate in terms of system functionality. However, from a 
strictly statistical point of view, the description is very limited and does not capture the true 
redundancy capabilities of a parallel-connected system. Focusing on the latter issue, the description 
must be expanded to include redundancy concepts in n-dimensional space, whereby the following 
classification can be established: 
 
Redundancy is the concept of increasing the number of alike paths from one point in space to 
another via some form of parallel-connection.  
 
 The reliability of a particular path from point A to point B in an n-dimensional space may or may 
not be dependent on the reliability of other paths originating from point A and ending at point B 
within the same n-dimensional space. Thus, a measurement of the condition of each path in space is 
necessary. In this context, the operation as well as the representation of such parallel-connected 
paths in space have a geometrical structure similar to that of an array. This characteristic has led 
to the examination of possible control schemes for redundancy management within a given power system 
by adopting the use of basic array theoretical operations. An example of a 3-dimensional redundant 
system is shown in Figure 14. 
 
[Figure 14 diagram: (a) blocks stacked in a 3-dimensional array, each labelled with a level number, 
a letter and an index, from 1A1 to 5D3; (b) the three axes (A, B, C, D), (1, 2, 3) and 
(1, 2, 3, 4, 5).]
Figure 14 : 3-dimensional redundant system 
 The 3-dimensional redundant system in Figure 14a works by automatically interconnecting 
individual blocks. The sequence in which the blocks are connected depends on the condition of the 
block in question as well as that of subsequent blocks. This is indicated in Figure 14b by the axes 
(1, 2, 3) and (1, 2, 3, 4, 5). The axis (A, B, C, D), on the other hand, is a fixed sequence of 
interconnections, as each block connects to one and only one subsequent block. Hence, a block 
containing an A in its identification code only connects to blocks containing a B in their 
identification code.  
 A system comprised of several parallel-connected units geometrically arranged in a configuration 
similar to that shown in Figure 14a has been investigated, and a software program for the redundancy 
management has been developed and tested. The result and the basic concept are the focal point of 
this chapter. 
  
 
4.2 Introduction 
 
 Array-based logic originated at the Department of Electric Power Engineering, the Technical 
University of Denmark, in 1978 with the paper “Group Representations of Finite Polyvalent Logic – a 
Case Study Using APL Notation” [Fr03] by Associate Professor Ole I. Franksen. It has since evolved 
into an effective tool for dealing with combinatorial and/or configuration applications. 
The foundation of the technology is a geometrical representation of logic in terms of nested arrays. 
In other words, the array-based concept deals with data objects regarded as arrays. Consequently, all 
calculations are performed on arrays, which implies that systems comprised of large amounts of data 
can often be systematically simplified by the use of array theoretical operations. In general, 
array-based logic can be considered to consist of the following three steps: 
 
• Step one: The establishment of a discrete n-dimensional configuration space using the 
Cartesian product, which ensures completeness. This is accomplished by the use of the tensor 
product ‘OUTER and’, which combines the system propositions and unites them to form one 
conjunctive proposition. 
• Step two: The inference by colligation is the operation of establishing the interconnections of 
the system. In other words, this step finds the solutions that comply with the system 
constraints.  
• Step three: The determination of states by elimination of variables through an or-reduction.  
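
 The three steps can be illustrated with a small sketch. The Python code below is not the Nial 
implementation used in this project; it only mimics the steps on a made-up example. The three 
propositions a, b and c, the constraints and all function names are assumptions chosen for 
illustration. 

```python
from itertools import product

VARIABLES = ("a", "b", "c")  # hypothetical propositions, not taken from the thesis


def configuration_space(n):
    """Step one: complete configuration space as a Cartesian product."""
    return list(product((False, True), repeat=n))


def complies(state):
    """Assumed constraints for illustration: a implies (not b), and c equals a."""
    a, b, c = state
    return (not (a and b)) and (c == a)


def colligate(space):
    """Step two: keep only the states that comply with all constraints."""
    return [state for state in space if complies(state)]


def or_reduce(states, keep):
    """Step three: eliminate every variable not listed in keep. A combination
    of the kept variables survives if at least one compliant state has it."""
    idx = [VARIABLES.index(name) for name in keep]
    return sorted({tuple(state[i] for i in idx) for state in states})


space = configuration_space(len(VARIABLES))  # 8 states
valid = colligate(space)                     # 3 states remain
print(or_reduce(valid, ("a", "c")))          # a and c always agree
```

 Starting from the complete space and eliminating non-compliant states means that no allowable 
state can be overlooked, which is the property the Cartesian product is used for in step one. 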
 
 Having introduced the concept of constraints, it is obvious that prior to completion of the above-
mentioned steps, the constraints of the physical system must be translated into array theoretical 
terms. This is achieved through the use of propositional logic that transforms the system constraints 
into logic operations suitable for array theoretical implementation. A detailed description of the steps 
involved in this transition is provided in section ‘4.4 Array-based control’. 
 Summarizing the above description the basic idea of array-based logic can be expressed, 
according to Franksen, by the following statement: 
 
 Array-based logic explores the consequences of considering truth-values as physical 
measurements. The aim is to formalize logic in accordance with the theoretical structure of 
discrete systems and express this formalization algebraically in array-theoretic terms. 
 
 It should be noted that the concept of array-based logic is based on discrete-time 
implementations. This might be intuitively clear from Franksen’s interpretation of the basic 
perception of array-based logic, but for the sake of conceptual characterization it must be 
emphasized. So, how well-suited is array-based logic for discrete implementation and, more 
importantly, how can it improve response time and system performance in the power system under 
consideration? The answer to these questions can be found in [Mø02], where it is shown that 
array-based logic is very suitable for implementation in computer systems. This adds to the list of 
advantages of using a digital control scheme to monitor redundancy performance and take action 
accordingly. Since array-based logic works on entire arrays, the response time to fault situations 
is finite and much shorter than the equivalent response time of a traditional logic implementation.  
 The overall power system discussed in this chapter is a mixed analog/digital system, where the 
redundancy management and fault identification are performed in a discrete environment while the 
system performance and functionality in terms of current supply to the load is a continuous time 
implementation.  
 
4.2.1 Initial concept 
 
 Based on simplified reliability calculations the initial system concept was comprised of parallel-
connected converter boards, where each individual board was split into two separate blocks. An 
illustration of this ‘split-board’ realization is depicted in Figure 15, where the digital controller also 
can be identified. 
 
[Figure 15 diagram: two parallel converter boards, each showing inrush control, on/off switch, 
filter, power-switch, transformer, rectifier and short circuit protection between In and Out, 
together with a PIC16F877 MCU (+ UC1825) that measures voltage and current (comparator), adjusts 
Vref/duty-cycle and handles redundancy control, current-sharing, power good, start-up and 
shut-down.]
 
Figure 15 : Initial implementation concept 
 
 From the figure it can be seen that the number of power system interconnections doubles although 
the total number of converter boards remains unchanged. This has a positive impact on the overall 
system reliability. However, in order to enhance the reliability significantly, the failure rates of 
the two blocks making up a single converter board must be approximately equal. This means that the 
connection shown in Figure 15 might not be the optimum position for the electronically controlled 
switch. To establish a foundation for the physical switch positioning, an examination of several 
highly reliable converters, produced by Alcatel Space Denmark, was performed. Since reliability is a 
vital parameter in the space industry, the exact failure rates cannot be disclosed. However, the same 
information can be presented in terms of normalized data, which allows for a qualitative system 
assessment that can be used to determine the optimum position for the switch. The result of this 
failure rate examination can be seen in Figure 16. 
[Figure 16 bar chart: horizontal axis from 0 to 20; blocks shown: Post Reg. and Rectification 1, 
Post Reg. and Rectification 2, Post Reg. and Rectification 3, Isolation, Input Filter, Inrush 
control, PWM and Voltage Reg.]
 
Figure 16 : Averaged and normalized fault distribution for 5 highly reliable space converters 
 
 Since the data for the individual converter failure rates were characterized by several individual 
blocks, it was chosen to proceed with data analysis of each block. The result, depicted in Figure 16, 
shows that splitting the individual converters into more blocks than the initial two allows for even 
further reliability improvements. This fact, described in section ‘4.3 System realization’, determines 
the number of blocks chosen for the power system examined in this chapter. 
 According to the Military Handbook 217F, concerning reliability prediction of electronic 
equipment, semiconductor devices, in particular integrated circuits, have a fairly high probability 
of malfunction. Indeed, as depicted in Figure 16, the PWM controller and voltage regulation show a 
much higher accumulated failure rate than all other blocks in the analysis. Focusing on the failure 
rates that characterize this particular block reveals that the integrated circuits are responsible 
for 93% of the accumulated failure rate. A graphical illustration of the failure rate distribution of 
the block ‘PWM and Voltage Reg.’ is shown in Figure 17. 
[Figure 17 bar chart: horizontal axis from 0 to 20; part types shown: IC's, Resistors, Capacitors, 
Diodes, Coils, Transistors, Zener, Transformer.]
 
Figure 17 : Detailed fault assessment of PWM and voltage regulation 
 As previously mentioned, optimized reliability is achieved when each block within the system has 
the same failure rate. Although the final power system deviates slightly from the theoretical 
optimum, the following subsections describe the power system considered in this thesis as well as the 
control scheme developed for the redundancy management. The point of origin for the array-based 
software development is the two keywords ‘fault detection’ and ‘fault isolation’ as described in 
chapter ‘2 Fault tolerance’. 
 
 
4.3 System realization 
 
 As with any system, a redundant power system has both advantages and drawbacks. Among the 
advantages is the possibility of a dramatic increase in reliability at the expense of an increase in 
system dependent parameters such as cost, mass, volume and circuit complexity. Although the 
increase in reliability can be quite high, added cost, mass, volume, and complexity are drawbacks 
that must be considered when deciding which approach to take during the design phase of the power 
system. However, the drawbacks tend to be less important nowadays, since system downtime in case 
of power failure often results in greater losses in sales, customer service etc. 
 As previously mentioned, the implementation of system level redundancy in power systems 
requires each converter board within the overall power system to be equipped with a front switch that 
allows for controlled shut-down of faulty converter boards, since this is the only way the power 
system integrity can be maintained. In other words, the power system must exhibit failure free 
operation at the input as well as at the output. For this reason, most approaches to designing highly 
reliable power systems originate from the ability of the power system to shut down faulty units. 
Focusing on the reliability of redundant systems, it is noteworthy that making a single path system 
redundant generally increases the overall reliability by a factor equal to the reciprocal of the 
initial failure rate of the single path system. This latter fact was briefly discussed in chapter 
‘3 Concept clarification and point of origin’. 
 Suppose the redundant power system had the ability to reconfigure itself during operation. This 
would increase the reliability of the overall system even further and at the same time reduce the 
maintenance requirements, since faulty units could be ‘replaced’ automatically. Because the system 
integrity and configuration are measured continuously, the simple rules for parallel-connection of 
multiple units do not do justice to the true reliability potential of a reconfigurable system. To 
obtain a more truthful measure of the system reliability, one has to adopt the more detailed 
equations provided in chapter ‘3 Concept clarification and point of origin’. Also presented in that 
chapter is an alternative to the rather complex equations, the connection matrix technique, which is 
the technique of choice for the reliability assessments of the system at hand.  
 It has been chosen to examine a power system comprised of 5 identical converter boards 
connected in parallel. In order for the system to perform satisfactorily, a minimum of 3 working 
converters is required at all times. Thus, the system is N + 2 redundant. On a board level each 
converter is designed to shut down in case of a single point failure, whereas the overall power 
system can tolerate 2 failures and still provide the needed power. From a traditional power system 
point of view, two failures reduce the overall power system from a system comprised of 5 
parallel-connected converters to a system comprised of the 3 converters needed to supply the required 
power to the load.  
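
 For reference, the traditional view of the N + 2 system can be written as a k-out-of-n 
calculation. The sketch below assumes independent boards with identical reliability and ignores the 
reconfiguration introduced later in this chapter; the function name and the value 0.9 are 
illustrative assumptions, not data from this project. 

```python
from math import comb


def k_out_of_n_reliability(k: int, n: int, r: float) -> float:
    """Probability that at least k of n independent, identical units work,
    where r is the probability that a single unit works."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# At least 3 of 5 boards must work; 0.9 is an assumed board reliability.
system_reliability = k_out_of_n_reliability(3, 5, 0.9)
print(round(system_reliability, 5))
```

 With these assumed numbers the system reliability is clearly higher than that of a single board, 
which is the effect the N + 2 arrangement is intended to provide. 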
 The power system under consideration approaches the parallel connection of the individual 
converter boards in a way that differs significantly from what has just been described: the 
individual converter boards are split into 5 main blocks. The reason for choosing 5 blocks was a 
combination of the percentage-wise higher increase in system reliability and the fact that 
traditional converter design is often performed in 5 steps, each design step representing a block in 
the power system. Figure 18 shows the system structure where each converter board is comprised of 5 
individual blocks. Also shown in Figure 18 is the direction of power flow from input to output. 
 
[Figure 18 diagram: N + 2 converter boards in parallel, each made up of Block 1 to Block 5 between 
Input and Output, with the power flow running from input to output.]
 
Figure 18 : Board level system realization 
 
 Aligning these main blocks as shown in Figure 19 (ignoring the block ‘PWM controller’), each 
block connects to the previous block on the same converter board as well as to the previous blocks 
on the parallel-connected converter boards. This arrangement of multiple interconnections of 
individual blocks allows for intelligent control of combining blocks for the maximum number of 
working converter boards at any given time. It should be noted that it is not an allowable state to 
have a block deliver power to more than one subsequent block. Thus, the first system constraint is 
that each block is connected to one and only one subsequent block. 
 
[Figure 19 diagram: block labels from Input to Output: Inrush control and On/Off switch, Filter, 
Power-switch, Transformer, Rectifier with S/C protection and Current sharing; interconnection 
switches Switch 1A to Switch 1E; PWM controller with Feedback.]
 
 Figure 19 : Block interconnection 
 
 The connecting devices are chosen to be electronically controlled switches, but could in theory 
also be mechanically operated relays. The reason for choosing electronically controlled switches is 
that the individual connection devices are activated during system operation, which for long 
transition times would require a substantial number of capacitors at the power system output in 
order to comply with the ripple voltage specification. Thus, the timing of the system 
reconfiguration is important but not critical. A transition time of 0.1 ms is estimated to be 
reasonable. Compared to the 140 kHz that has been chosen as the switching frequency for the 
individual converter boards, it is apparent that the transition times of the connection switches are 
far from critical. 
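
 As a rough check of the time scales involved, the transition time can be expressed in converter 
switching periods. The sketch below only restates the two figures quoted above; the variable names 
are illustrative. 

```python
SWITCHING_FREQUENCY_HZ = 140e3  # converter switching frequency quoted in the text
TRANSITION_TIME_S = 0.1e-3      # estimated transition time of a connection switch

switching_period_s = 1 / SWITCHING_FREQUENCY_HZ
periods_per_transition = TRANSITION_TIME_S * SWITCHING_FREQUENCY_HZ

print(f"switching period: {switching_period_s * 1e6:.2f} us")
print(f"transition spans about {periods_per_transition:.0f} switching periods")
```

 A reconfiguration thus spans roughly 14 switching periods of about 7 µs each. 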
 The connection switches are operated rarely, due to the rather low failure rate of the electronic 
components used, and the transition time from one state to another is relatively short. Even so, the 
price paid for using extra switches as connecting devices between the different blocks within the 
power system is an increased failure rate for the individual converter boards and an increase in 
total conduction losses. Furthermore, the overall cost and complexity of the power system are 
increased due to the use of extra switches and associated controlling circuitry. However, if these 
drawbacks at the board level imply a higher probability of continuous operation at the system level, 
the added cost, complexity and losses might be negligible compared to the gain in reliability. Also, 
obtaining a similar reliability for a power system comprised of individual converter boards without 
the interconnecting capability requires more converter boards, which adds to system parameters such 
as volume and mass. 
 Having introduced the basic concept of the array-based redundancy scheme, the foundation for 
realizing the overall power system is in place. By interconnecting the different blocks according to 
the system constraints and following the system structure shown in Figure 18, the power system 
depicted in Figure 20 is obtained. 
 
[Figure 20 diagram: Converter 1 to Converter 5, each running from Input to Output; block labels: 
Inrush control and On/Off switch, Filter, Power-switch, Transformer, Rectifier with S/C protection 
and Current sharing.]
 
Figure 20 : Power system block identification 
 
 From Figure 20 the system structure as well as the characterization of the individual blocks can be 
identified. Although a detailed statistical analysis is required to describe the exact system reliability 
it should be obvious that the concept of combining two defective converter boards to form a working 
converter increases the overall system performance concerning both reliability and efficiency. As 
will be shown in Figure 22, the combination of two converter boards that have failed in different 
locations on a board level, enables an alternative path for the power throughput to be established. 
This lowers the stress on the original 3 converter boards, since the load current now is shared among 
4 converter boards. As a consequence the power system operating point on the efficiency curve tends 
to move towards the optimum operating point as shown in Figure 21. Most importantly, the overall 
power system reliability increases as a result of the newly configured power system comprised of 3 + 
1 working converter boards. 
 
[Figure 21 plot: efficiency (50% to 100%) versus output current (up to Imax), marking the preferable 
operating point and the operating point after two failures.]
 
Figure 21 : Operating point movement after fault occurrence 
 
 From Figure 21 it is easily seen that the reconfiguration of system blocks to form alternative 
paths from input to output by means of semi-defective converter boards increases the overall system 
efficiency. From a reliability point of view, a higher efficiency means lower power dissipation, and 
lower power dissipation means a lower system operating temperature, which in turn gives rise to the 
reliability increase. 
 
4.3.1 Theoretical implementation 
 
 Referring to Figure 19 and Figure 22 this section describes the abbreviations used to identify the 
individual blocks and switches within the power system. 
 
[Figure 22 diagram: Converter 1 to Converter 5 arranged in rows and Block A to Block E in columns, 
with failed blocks marked ‘Failure’, interconnection switches Switch 1A to Switch 1D, and the 
resulting inputs and outputs.]
 
Figure 22 : Alternative use of power system redundancy 
 
 Starting with the blocks it can be seen from Figure 22 that these can be addressed using the 
converter number as row identification and the block letter as column identification. Thus, the first 
faulty location in the power system shown in Figure 22 can be identified as:  
 
 Converter 1, Block B.  
Identification of the interconnection switches is accomplished through the adoption of the following 
notation:  
 
 SXYZ  
 
where S is the notation used in the software to identify a switch, X represents the converter number, 
Y represents the block prior to the switch in question and Z represents the switch position. Hence, a 
switch between block 1A and block 1B set in position 1 gets the identification S1A1. 
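
 The notation can be mirrored in a few lines of code. The helper functions below are hypothetical 
and written in Python rather than Nial; they assume single-digit converter numbers and switch 
positions and block letters A to E, as in the system considered here. 

```python
import re
from typing import Tuple

# S, converter number X, preceding block Y, switch position Z
_SWITCH_PATTERN = re.compile(r"^S(\d)([A-E])(\d)$")


def switch_id(converter: int, block: str, position: int) -> str:
    """Build an identifier of the form SXYZ."""
    return f"S{converter}{block.upper()}{position}"


def parse_switch_id(identifier: str) -> Tuple[int, str, int]:
    """Split an SXYZ identifier into (converter, block, position)."""
    match = _SWITCH_PATTERN.match(identifier.upper())
    if match is None:
        raise ValueError(f"not a valid switch identifier: {identifier!r}")
    converter, block, position = match.groups()
    return int(converter), block, int(position)


print(switch_id(1, "A", 1))  # switch after block 1A, set in position 1
```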
 In order to provide feedback to the redundancy control system, each block in the overall power 
system can take on two different logic values - logic 1 for a working block and logic 0 for a faulty 
block. Since a faulty block is switched off and the redundancy control system continues to check the 
status of the power system, the logic state of any faulty block is latched. This ensures that the 
redundancy control system always gets the correct logic values from each block, even though the 
block in question has failed. Having retrieved all truth-values from the blocks within the power 
system an array containing the retrieved truth-values is generated. This array now forms the basis for 
the calculation process as well as for the representation of the results. A detailed description of the 
array generation and calculations is the topic of section ‘4.4.1 Simulations and program flow’. 
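
 The latching of truth-values can be sketched as follows. This is an illustrative Python fragment 
under the assumption that the status array is a 5 x 5 list of 0/1 values indexed by converter and 
block; it is not the code used in the project, and the function names are made up. 

```python
BLOCKS = "ABCDE"
N_CONVERTERS = 5


def new_status():
    """5 x 5 array of truth-values with every block working (logic 1)."""
    return [[1] * len(BLOCKS) for _ in range(N_CONVERTERS)]


def latch(status, readings):
    """Combine stored and newly read truth-values. Once a block has been
    stored as 0 it stays 0, whatever later readings report."""
    return [
        [old & new for old, new in zip(status_row, reading_row)]
        for status_row, reading_row in zip(status, readings)
    ]


status = new_status()

# First scan: Converter 1, Block B reports a fault.
scan = new_status()
scan[0][BLOCKS.index("B")] = 0
status = latch(status, scan)

# Later scan: the switched-off block no longer reports a fault.
status = latch(status, new_status())
print(status[0])  # the fault in Converter 1, Block B is still recorded
```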
 Turning the attention towards the operation of the power system, the following description 
represents the actions taken by the redundancy control in case of fault detection. Assuming a well 
functioning structure as the starting point, the power system consists of 5 inputs and 5 outputs. 
After a failure within the power system, the redundancy control shuts down the blocks associated 
with the faulty block and leaves the power system comprised of 4 inputs and 4 outputs. Except for the 
faulty block, the inactive blocks now serve as cold spares in case of further failures. Now suppose 
a second fault occurs, for instance due to the increased stress on the remaining active converter 
boards. Since two faults have occurred, it might now be possible to establish an alternative path 
through the power system and thus increase the number of active converter boards from 3 to 4. The 
only situation in which an alternative path cannot be established is when the two faults have 
occurred within the same column. In this case, the power system would consist of only 3 
converter boards, which is the minimum number required to sustain power delivery to the load.  
 Depending on the failure rate of the individual blocks, it would be a rare situation that two 
successive faults occur in the same column; hence the probability of successful system 
reconfiguration is quite high. This indicates that the overall system reliability has increased 
compared to the situation with 5 separate converter boards. 
 Based on this short description of the power system and its operation, the mathematical task of the 
redundancy control can be thought of as a method of finding alternative paths through the power 
system in case of fault occurrence. 
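
 The effect of reconfiguration on the number of working paths can be illustrated with a simplified 
model. The Python sketch below assumes that any working block can be connected to any working block 
in the next column, one-to-one, so that the number of input-to-output paths is limited by the column 
with the fewest working blocks. It is a counting aid under these assumptions, not the array-based 
algorithm itself, and the function names are hypothetical. 

```python
BLOCKS = "ABCDE"


def status_with_faults(faults, n_converters=5):
    """Build the truth-value array; faults is a list of (converter, block)
    pairs such as (1, "B"), with converters numbered from 1."""
    status = [[1] * len(BLOCKS) for _ in range(n_converters)]
    for converter, block in faults:
        status[converter - 1][BLOCKS.index(block)] = 0
    return status


def intact_boards(status):
    """Working converters when blocks cannot be shared between boards."""
    return sum(all(row) for row in status)


def usable_paths(status):
    """Working paths when blocks can be interconnected across boards."""
    return min(sum(column) for column in zip(*status))


different_columns = status_with_faults([(1, "B"), (2, "D")])
same_column = status_with_faults([(1, "B"), (2, "B")])

print(intact_boards(different_columns), usable_paths(different_columns))
print(intact_boards(same_column), usable_paths(same_column))
```

 With faults in different columns the model gives 3 intact boards but 4 usable paths, whereas two 
faults in the same column leave 3 in both cases, in line with the description above. 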
 
 
4.4 Array-based control 
 
 As described in section ‘4.2 Introduction’, the mathematical foundation is a consideration of 
truth-values as physical measurements. The truth-values in the application at hand are the discrete 
values obtained from each block in the power system, upon which the alternative path from input to 
output is calculated. The values obtained from the individual blocks can be considered as an array of 
5 rows and 5 columns (see Figure 24). This array is a measurement of the condition of the overall 
power system and can therefore be used to identify problems within the system. Based on this 
identification, the array-based analysis suggests possible alternative paths through the power 
system. Similar results could have been obtained by using standard digital logic. The reason for not 
implementing the redundancy control using this type of logic is the powerful array concept 
and operations in array theory, which make it easy to expand the redundancy control scheme to 
include an arbitrary number of converter boards and switches. Thus, a formal description of a 
redundant power system comprised of any number of parallel-connected converter boards is 
straightforward, since the added system constraints are almost replicas of existing board level 
constraints. A similar implementation using standard digital logic would require considerable 
recalculation of the power system’s interconnections. 
 With reference to Figure 22, it can be seen that the number of parameters needed to describe the 
power system in question is relatively large. For this reason most system constraints have been 
omitted and the focus is on the determination of switch positions for a single switch. These results 
are then easily extended to include all system switches. 
 Having introduced the fundamentals of the power system, the array-based analysis can be carried 
out. The mathematical tool used in this project is based on the array theory developed by Dr. 
Trenchard More in the 1970’s and later (early 1980’s) implemented in the array-based software 
‘Queens Nested Interactive Language (Q’Nial)’ by Professor Michael Jenkins. This software is 
convenient for rapid design of data manipulation programs because it builds into its predefined 
operations many of the loops that are required in a conventional programming language. This 
automatic looping is also done for defined operations by using built-in second order functions (the 
so-called transformers) that apply an operation to each item of a list. This means that very fast 
algorithms are obtainable, which is of vital importance for maintaining the power system integrity. A 
more detailed introduction to Nial and its capabilities can be found in the reference manual [Je02].  
 Solving the problem at hand by establishing a generalized configuration space by means of the 
Cartesian product of all system parameters would require a tremendous amount of computer 
memory, since the number of possible combinations exceeds 10^30. A different approach has therefore 
been pursued.  
 Using the allowable positions for each switch, an algorithm has been developed that in a 
successive way finds a feasible solution within a finite time interval.  
 A key element in the design of the array-based algorithm is the elimination of unacceptable states 
as opposed to the generation of acceptable states. By establishing all possible solutions to a given 
combinatorial problem and eliminating the states that are noncompliant with the system constraints, 
it is ensured that all allowable states are among the possible solutions. Generating the allowable 
states based on intended system performance imposes an inherent risk of leaving out some of the 
desired solutions and thereby establishing an incomplete set of solutions. The starting point in the 
development of the software is a series of system constraints that limit the number of 
allowable switch combinations. The point of origin in this development is the combination of all 
possible switch positions for a single switch. In Nial terms this can be expressed as: 
 
s1a1:= ol;     
s1a2:= ol;     
s1a3:= ol;     
s1a4:= ol;     
s1a5:= ol; 
s1a:= cart s1a1 s1a2 s1a3 s1a4 s1a5 
 
where cart is the Cartesian product. The result can be seen in Figure 23. 
 
 
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
|ooooo|ooool| |ooloo|oolol| |loooo|loool| |loloo|lolol|
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
|ooolo|oooll| |oollo|oolll| |loolo|looll| |lollo|lolll|
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
|olooo|olool| |olloo|ollol| |llooo|llool| |llloo|lllol|
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
|ololo|ololl| |olllo|ollll| |llolo|lloll| |llllo|lllll|
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
Figure 23 : Possible switch combinations 
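
 The combinations in Figure 23 can be reproduced outside Nial. The Python sketch below is an 
illustrative equivalent of the Cartesian product, using the same o/l notation for the two 
truth-values; the function name is an assumption. 

```python
from itertools import product


def switch_combinations(positions=5):
    """All combinations of o (false) and l (true) over the given number of
    switch positions, i.e. the Cartesian product of the position values."""
    return ["".join(bits) for bits in product("ol", repeat=positions)]


combinations = switch_combinations()
print(len(combinations))                  # 2**5 combinations
print(combinations[0], combinations[-1])  # from ooooo to lllll
```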
 
 The switch positions shown in Figure 23 are the theoretical combinations of two discrete 
values over 5 positions. Since a switch is limited to one position at any given time, the first 
system constraint can be applied to the array of switch solutions. Describing this constraint by 
means of traditional ‘logic-level’ code results in: 
 
If   s1a1 = 1  Then  s1a2 = s1a3 = s1a4 = s1a5 = 0 
 
 This description is now converted into Nial terms, which results in the following lines of source 
code: 
 
a2:= OUTER <=s1a1 (not s1a2);   
a3:= OUTER <=s1a1 (not s1a3); 
a4:= OUTER <=s1a1 (not s1a4);   
a5:= OUTER <=s1a1 (not s1a5); 
 
 Noting the replication of parameter s1a1, the proposition must be united through the operation of 
colligation: 
 
(0 2 4 6) (1) (3) (5) (7) fuse (OUTER and a2 a3 a4 a5) 
 
 Following the above procedure, the remaining system constraints can be added to the source code. 
After a few transformations a list of allowable switch positions based on the system constraints can 
be obtained. The initial 2^5 = 32 switch combinations for any given switch have now been 
reduced to the 6 allowable switch positions that comply with the system constraints: 
 
+-----+-----+-----+-----+-----+-----+
|ooooo|ooool|ooolo|ooloo|olooo|loooo|  (4-1) 
+-----+-----+-----+-----+-----+-----+
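The reduction from 32 combinations to the 6 entries of (4-1) can be reproduced with a short Python sketch. It is not the Nial program of the thesis: it assumes that the only constraint applied is "at most one contact closed" and uses the letters l and o for true and false as in the listings above.

```python
from itertools import product

# Truth-values in the notation of the text: 'l' = true (closed), 'o' = false (open).
TRUTH = "ol"

# Cartesian product of five two-valued contacts: 2**5 = 32 combinations (Figure 23).
all_positions = ["".join(p) for p in product(TRUTH, repeat=5)]

# Assumed constraint: a switch occupies at most one position at a time,
# so at most one of the five contacts may be 'l'.
allowed = [p for p in all_positions if p.count("l") <= 1]

print(len(all_positions))
print(allowed)
```

The filtered list comes out in the same order as the entries of (4-1), starting with the all-open entry.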
 
 The first entry from the left in (4-1) is the NULL solution, where the block subsequent to the 
switch in question is disconnected from all blocks within the power system. The entry to the right in 
(4-1) is the notation used for switch position 1, the second entry from the right is the notation used 
for switch position 2, and so forth. As an example, the entry to the right in (4-1) indicates that the 
left-hand side of the switch is connected to Converter 1 regardless of which converter board the 
right-hand side of the switch is connected to. Thus, a ‘0’ (written o) in any entry in (4-1) indicates a 
disconnection, whereas a ‘1’ (written l) indicates a connection between two blocks.  
 Since the software program decides which blocks to interconnect at all times, the power system 
would, under normal circumstances, connect the individual blocks in a way that enables power 
throughput within the physical boundaries of each converter board as shown in Figure 22. In case of 
multiple failures the algorithm would find a way through the system that ensures a maximum number 
of working converter boards. In other words, by allowing the algorithm to decide which blocks to 
interconnect, the overall power system is no longer comprised of 5 individual converter boards with 
interconnecting switches, but of 5 times 5 blocks that can be connected in a large number of ways. This 
gives rise to an increase in reliability. As a consequence, a highly reliable power system can be built 
with lower volume and mass than conventional power systems, but at the expense of increased circuit 
complexity and a considerably higher cost price. 
 Before providing more detailed descriptions of the redundancy management control, it is worth 
outlining the intended interaction between hardware and software. This is illustrated in Figure 24. 
[Diagram: converter/block array with failures marked, the Nial program, and the converter/switch array]
 
Figure 24 : Overall system structure 
 
 From the system interaction depicted in Figure 24 it is apparent that the array-based software 
determines the system configuration at all times. The converter control is performed by a separate 
analog PWM controller not shown in Figure 24. 
 Examining the power system and the tasks of the redundancy control from a topological point of 
view it should be noted that the system’s topological array has characteristics similar to that of the 
incidence matrix describing electric networks within the field of graph theory. The reason the 
topological array only has similar and not identical characteristics to the incidence matrix is due to 
the unidirectional flow of power through the system (see Figure 18). The classical approach in 
electric network theory using arrays is the bi-directional power flow that uses the numbers 0, 1 and 
-1 to identify the flow direction. Since the power system at hand only allows power to flow in one 
direction the closest match to the incidence matrix is the use of unidirectional circuit elements such 
as semiconductor devices within the electric network itself. From a mathematical point of view this 
adds considerably to the complexity of the system when performing reliability calculations, since the 
system now includes multiple failure modes for each block. Also, the analysis assumes a constraint 
between the blocks ‘PWM controller’ and ‘Power-switch’. This constraint ensures a correct 
connection between the driving PWM controller and the semiconductor device acting as the 
switching element in the converter.  
 
4.4.1 Simulations and program flow  
 
 As an example of the capabilities of the algorithm, let the power system suffer from 8 faults 
located in different places throughout the power system. This is a condition from which a power 
system comprised of standard parallel-connected converter boards could never recover. As seen from 
Figure 25 each row in the power system has suffered at least one failure. 
 
       
 
1 1 0 0 1 
0 0 1 0 1 
1 1 0 1 1 
1 1 1 1 0 
0 1 1 1 1 

Figure 25 : 8 faults distributed among all 5 converters 
 
 The array to the left in Figure 25 contains the system truth-values as they are entered into the system 
array for calculation purposes. In order to preserve the system information contained in the array to 
the left in Figure 25 a temporary solution-array is generated and denoted AA.  
 
     +--+--+--+--+
     |ll|lo|oo|ol|
     +--+--+--+--+
     |oo|ol|lo|ol|
     +--+--+--+--+
AA = |ll|lo|ol|ll|
     +--+--+--+--+
     |ll|ll|ll|lo|
     +--+--+--+--+
     |ol|ll|ll|ll|
     +--+--+--+--+
 
 In a traditional software implementation the steps shown in Figure 26 are taken on a successive 
basis before a complete switch reconnection matrix is established. It should be noted that the 
elements in the matrices shown in Figure 26 only serve as illustration and are therefore not related to 
the matrix shown in Figure 25.  
 Since the procedures in each of the three steps are almost identical the description that follows 
will be limited to illustrate the traditional logic implementation of step 1 followed by the equivalent 
array-based implementation. However, completing the 3 steps only provides part of the solution. The 
remaining elements in the solution-array come from inserting the NULL solution. The approach 
taken in inserting this solution differs from the 3 steps already performed and will therefore be 
introduced at the end of this section. 
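Comparing the truth-value array of Figure 25 with the printed array AA suggests that each entry of AA pairs two neighbouring blocks of the same converter row. The thesis does not state this construction explicitly, so the Python sketch below should be read as an observation on the printed data; the helper names are hypothetical.

```python
# System truth-values from Figure 25 (1 = working block, 0 = failed block).
SYSTEM = [
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
]

def to_symbol(bit):
    """Map 1/0 to the truth-value letters 'l'/'o' used in the listings."""
    return "l" if bit else "o"

def pair_rows(rows):
    """Assumed construction: pair each block with its right-hand neighbour."""
    return [
        [to_symbol(a) + to_symbol(b) for a, b in zip(row, row[1:])]
        for row in rows
    ]

AA = pair_rows(SYSTEM)
fault_count = sum(row.count(0) for row in SYSTEM)

for row in AA:
    print(row)
print(fault_count)
```

Under this assumption the 5 by 5 truth-value array yields the 5 by 4 array AA shown earlier, and the zero count matches the 8 faults of the example.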
 The starting point of the column examination is the answer to the following question:  
 
 Is 11 an element in the first column of the AA matrix?  
[Diagram: three converter/block arrays, labelled Step 1, Step 2 and Step 3]
 
 Figure 26 : Traditional procedure for establishing maximum number of working converters 
 
 The actions taken by a traditional logic implementation in completing step 1 are shown in Figure 
27. The examination starts out by considering the element in position (0,0). If this element differs 
from 11 the algorithm examines the next position (1,0). On the other hand, if the element in position 
(0,0) matches 11 the algorithm finds a solution. This solution is now compared to other solutions in 
column 0 to avoid inserting the same solution multiple times. Since this solution would be the first 
in column 0 it is actually unnecessary to search for previous solutions. However, in order to keep the 
algorithm as short and fast as possible, the algorithm searches for previous solutions in every 
position, even though the solution for position (0,0) is the first. 
 The algorithm now examines the internal counters to see if position (4,0) has been reached. If not, 
the algorithm starts comparing the element in the next position to 11 and the process repeats itself. 
On the other hand, if position (4,0) has been reached the algorithm ends the subroutine and proceeds 
to the next program level, which starts a similar search and insert routine for the row examination in 
step 2. 
 
[Flowchart boxes: Begin in position (0,0) in array 'AA'; Does the combination l l exist?; Find 
solution; Does the solution already exist in column 0?; Find new solution; Choose solution; Has 
position (4,0) been examined?; Proceed to next position in column; Proceed to next program level]
 
Figure 27 : Dataflow diagram for step 1 
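The element-by-element examination described for step 1 can be sketched in Python as follows. This is an illustrative reading of the flowchart, not the original program: the mapping from row index to switch position is an assumption taken from (4-1) in reverse order, and a solution that already exists in the column is simply skipped.

```python
# Column 0 of the array AA from the text ('l' = true, 'o' = false).
COLUMN_0 = ["ll", "oo", "ll", "ll", "ol"]

# Assumed mapping from row index to switch position ((4-1) in reverse order).
POSITION_FOR_ROW = ["loooo", "olooo", "ooloo", "ooolo", "ooool"]

def examine_column(column, position_for_row):
    """Walk the column one element at a time, as the flowchart does."""
    solutions = {}                          # row index -> chosen switch position
    row = 0                                 # begin in position (0,0)
    while True:
        if column[row] == "ll":             # does the combination l l exist?
            candidate = position_for_row[row]
            if candidate not in solutions.values():   # already in column 0?
                solutions[row] = candidate
        if row == len(column) - 1:          # has position (4,0) been examined?
            return solutions
        row += 1                            # proceed to next position in column

step1 = examine_column(COLUMN_0, POSITION_FOR_ROW)
print(step1)
```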
 
 The rather tedious process, shown in Figure 27 and described above, can in Nial terms be 
expressed in the following very compact form: 
 
Q:= ((0 pick (cols AA) EACHLEFT = ll) link o) 
 
Output: lolloo 
 
 The result is shown as truth-values. Due to the number of allowable switch positions a falsehood 
has been attached to the end of the result, by adopting the use of the operation ‘link’. 
 The next step in the array-based realization is the assigning of correct switch positions to the 
entries that returned truth. This is completed through the operation ‘sublist’. 
 
Y:= Q sublist (reverse Res_1) 
 
+-----+-----+-----+
Output: |loooo|ooloo|ooolo|       (4-2) 
+-----+-----+-----+
 In order to insert the correct switch positions into the result array, the positions that returned truth 
must be identified in Index origin 0. 
 
Index:= EACH first (Y EACHLEFT sublist tell (first shape AA)) 
 
Output: 0 2 3 
 
 Finally, the assigned switch positions are inserted into the temporary result array by using the 
operation ‘placeall’. 
 
Y (cart Index 0) placeall AA 
+-----+--+--+--+
|loooo|lo|oo|ol|
+-----+--+--+--+
|oo |ol|lo|ol|
+-----+--+--+--+
Output: Modified AA = |ooloo|lo|ol|ll|
+-----+--+--+--+
|ooolo|ll|ll|lo|
+-----+--+--+--+
|ol |ll|ll|ll|
+-----+--+--+--+
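For readers unfamiliar with Nial, the four array operations above (Q, Y, Index and the final placement) can be mirrored in Python. The functions below are simplified stand-ins for 'sublist' and 'placeall', not their full Nial semantics, and the list named RES_1 is assumed to hold the entries of (4-1) in the order printed.

```python
AA = [
    ["ll", "lo", "oo", "ol"],
    ["oo", "ol", "lo", "ol"],
    ["ll", "lo", "ol", "ll"],
    ["ll", "ll", "ll", "lo"],
    ["ol", "ll", "ll", "ll"],
]

# Allowable switch positions from (4-1); assumed to be what the text calls Res_1.
RES_1 = ["ooooo", "ooool", "ooolo", "ooloo", "olooo", "loooo"]

def sublist(mask, items):
    """Keep the items whose mask entry is true (stand-in for Nial 'sublist')."""
    return [item for keep, item in zip(mask, items) if keep]

# Q: compare every entry of column 0 with 'll', then append one falsehood.
column_0 = [row[0] for row in AA]
q = [entry == "ll" for entry in column_0] + [False]
q_text = "".join("l" if hit else "o" for hit in q)

# Y: switch positions assigned to the entries that returned truth.
y = sublist(q, list(reversed(RES_1)))

# Index: row numbers (index origin 0) of the entries that returned truth.
index = sublist(q, range(len(AA)))

# Stand-in for 'placeall': write each position into column 0 of a copy of AA.
modified = [row[:] for row in AA]
for row_number, position in zip(index, y):
    modified[row_number][0] = position

print(q_text, y, index)
for row in modified:
    print(row)
```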
 
 Following a similar procedure, the rest of the truth-values in the array shown to the left in Figure 
25 are replaced by feasible switch positions. Since the insertion of feasible switch positions only 
provides part of the solution the next and final step in the process is the insertion of the NULL 
solution where applicable. For comparison purposes, a traditional software-based implementation of 
this final step has been implemented. The program is shown below: 
 
 For Y With 0 1 2 3 DO 
   For X With 0 1 2 3 4 DO 
     If (X Y pick AA) = oo Then (AA@(X Y):=pos00) 
     Elseif (X Y pick AA) = lo Then (AA@(X Y):=pos00) 
     Endif; 
   Endfor; 
 Endfor; 
 
 The program is implemented by means of the traditional use of for-if-then loops. Unfortunately, 
this realization requires the examination of each element in the AA matrix separately before a 
complete solution can be established. A similar routine using the array-based approach works on the 
entire AA matrix for each operation or transformer. In Nial terms, the first step in the determination 
of the NULL solution is the generation of a position array, similar to that already described for the 
insertion of switch positions during step 1. 
 
X:= (AA eachleft = oo) or (AA eachleft = lo) 
 
  oolo 
  looo 
Output: oooo 
  oooo 
  oooo 
 
 The switch position defining the NULL solution is now recalled from the allowable switch 
positions found in (4-1). 
 
Y:= looooo sublist Res_1 
 
+-----+
Output: |ooooo|         (4-3) 
+-----+
 
 Next, the positions in the result array, where the NULL solution is being inserted, have to be 
determined. This is achieved through the use of the operation ‘sublist’. The final result array is 
initially a copy of the array shown to the left in Figure 25. As feasible solutions are found these are 
inserted into this copy to eventually form the result array shown in Figure 28. 
 
E:= X sublist (tell shape AA) 
 
+---+---+
Output: |0 2|1 0|         (4-4) 
+---+---+
 
 Having identified the positions where the NULL solution should be inserted the operation 
‘placeall’ is used to carry out this final step. 
 
Y E placeall AA 
 
+-----+--+-----+--+
|ll |lo|ooooo|ol|
+-----+--+-----+--+
|ooooo|ol|lo |ol|
+-----+--+-----+--+
Output: |ll |lo|ol |ll|
+-----+--+-----+--+
|ll |ll|ll |lo|
+-----+--+-----+--+
|ol |ll|ll |ll|
+-----+--+-----+--+
 
 The array shown above is a separate copy of the final result array. The NULL solution is actually 
the last solution to be inserted, but for illustration purposes all other solutions have been removed. 
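The contrast between the element-wise loop and the whole-array formulation of the NULL insertion can be sketched in Python. The function names are hypothetical and only approximate the Nial operations. Applied to the array AA as printed earlier, matching on oo alone gives the two positions listed in (4-4); matching on both oo and lo, as in the loop program, marks further entries.

```python
AA = [
    ["ll", "lo", "oo", "ol"],
    ["oo", "ol", "lo", "ol"],
    ["ll", "lo", "ol", "ll"],
    ["ll", "ll", "ll", "lo"],
    ["ol", "ll", "ll", "ll"],
]
NULL = "ooooo"   # the NULL solution, called pos00 in the loop program

def insert_null_loop(aa, targets=("oo", "lo")):
    """Traditional version: visit every element separately."""
    out = [row[:] for row in aa]
    for y in range(len(out[0])):
        for x in range(len(out)):
            if out[x][y] in targets:
                out[x][y] = NULL
    return out

def mask(aa, targets):
    """Whole-array comparison: True where an entry equals one of the targets."""
    return [[entry in targets for entry in row] for row in aa]

def positions(truth):
    """Row/column addresses of the True entries, in row-major order."""
    return [(r, c) for r, row in enumerate(truth) for c, hit in enumerate(row) if hit]

def place_all(aa, value, where):
    """Return a copy of aa with value written at every address in where."""
    out = [row[:] for row in aa]
    for r, c in where:
        out[r][c] = value
    return out

only_oo = positions(mask(AA, ("oo",)))
both = positions(mask(AA, ("oo", "lo")))
print(only_oo)
print(both)
print(place_all(AA, NULL, only_oo)[0])
```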
 
 The resulting array for the case of 8 faults distributed among all 5 converters has now been 
established. The outcome is shown in Figure 28. 
  
+-----+-----+-----+-----+
|loooo|ooooo|ooooo|ooloo|
+-----+-----+-----+-----+
|ooooo|loooo|ooooo|ooolo|
+-----+-----+-----+-----+
|ooloo|ooooo|olooo|ooool|
+-----+-----+-----+-----+
|ooolo|ooloo|ooolo|ooooo|
+-----+-----+-----+-----+
|ooooo|ooolo|ooool|ooooo|
+-----+-----+-----+-----+
Figure 28 : Result array 
 
 Comparing the array shown to the left in Figure 25 with the result array shown in Figure 28, it is 
obvious that the two arrays are linked through a transformation array. By examining the axes of the 
two arrays, it can be seen that the transformation array is the previously mentioned incidence matrix 
for electric networks.  
 For illustration purposes one of the proposed converters was implemented and the switches were 
simulated by mechanical interconnections. All the blocks mentioned in this chapter can be identified 
in the illustration shown in Figure 29. 
 
 
 Figure 29 : Real-world implementation of proposed converter topology 
 
4.5 Reliability assessment 
 
 The reliability of the power system has to be established by means of either the rather complex 
equations concerning system parallel-connection on an individual basis or the simpler network 
reduction techniques. Due to the large number of states in which the power system can reside the 
individual system calculations become extremely complicated resulting in loss of any insight into the 
relation between survivability of each block and the impact on the overall system. Changing the 
viewpoint from dynamic parts level redundancy to system level survivability makes it possible to 
express the overall system performance concerning reliability as a function of time. This approach 
does not provide any system information during transition from one state to another. However, in 
most cases the figure of merit relevant to customers is the probability of system survival 
within the expected system lifetime. For this reason the proposed system level approach will be 
utilized. 
 When considering reliability assessment several evaluation techniques are applicable as described 
in chapter ‘3 Concept clarification and point of origin’. Due to the complicated interconnection of the 
individual blocks within the power system, a generalized approach focusing on a formal system 
description by means of block reliabilities is desirable. Two approaches comply with the latter desire 
- event trees and connection matrix techniques. Since the power system is comprised of a rather large 
number of blocks, the event tree approach quickly becomes too complex. In contrast, the connection 
matrix technique establishes a matrix representing power flow between system nodes by means of 
block reliabilities. Thus, the obvious approach is the connection matrix technique, which will be used 
throughout the remainder of this chapter. 
 Figure 30 shows a cross section of the power system found in Figure 22.  
[Diagram: two inputs, nodes 1 to 4 and blocks A, B, F, G, H and I]
 
Figure 30 : Cross section of Figure 22 
 
 Representing both power flow and system nodes, the green arrows in Figure 30 are the basis for 
the connection matrix technique. The power flow from one block to another is unidirectional 
whereas the flow to and from a switch is bidirectional. The blocks interconnecting the individual 
nodes are characterized by their probability of providing fault free operation for a specified period of 
time. The establishment of the connection matrix is now straightforward, as the entries of the matrix 
are the probabilities for each block interconnecting two adjacent nodes. Figure 31 shows the 
technique applied to the blocks and nodes found in Figure 30. 
 
 
        1   2   3   4
    1   1   A   0   0
    2   0   1   B   G
    3   0   0   1   0
    4   0   G   0   1
[remaining entries of the figure, including block C, not reproduced]
 
Figure 31 : Partial connection matrix for illustration purposes 
 
 In Figure 31 the blue circles define the unidirectional flow between node 1 and 2 (block A) while 
the red circles define the bidirectional flow between node 2 and 4 (block G - a switch). The complete 
connection matrix for the primary power system operating in a fault free state can be seen at the end 
of the thesis (see Table of contents). 
 The next step in the process is either node removal through sequential reduction or matrix 
multiplication. The latter method - being the easiest to apply - is the method of choice. Application 
of the matrix multiplication is straightforward, as the basic connection matrix is multiplied by itself a 
number of times until the resulting matrix remains unchanged. The transmission from input to output 
is now derived from the matrix, as the entry found in row 1 and the column containing the output 
node.  
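The multiplication step can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from the thesis: the network is a hypothetical 4-node bridge (blocks A, B, C, D and a bidirectional switch S), matrix entries are kept symbolically as sets of block names, products are simplified with Boolean absorption, and the resulting minimal paths are evaluated by inclusion-exclusion with one common block reliability.

```python
from itertools import combinations

ZERO = frozenset()                  # no connection between two nodes
ONE = frozenset({frozenset()})      # connection that needs no block (diagonal)

def path(*blocks):
    """Matrix entry consisting of a single path through the given blocks."""
    return frozenset({frozenset(blocks)})

def absorb(paths):
    """Boolean absorption: drop every path that strictly contains another."""
    return frozenset(p for p in paths if not any(q < p for q in paths))

def multiply(m):
    """Symbolic product m*m: series blocks are united, parallel paths collected."""
    n = len(m)
    return [
        [
            absorb({p | q for k in range(n) for p in m[i][k] for q in m[k][j]})
            for j in range(n)
        ]
        for i in range(n)
    ]

def closure(m):
    """Multiply the matrix by itself until the result remains unchanged."""
    while True:
        nxt = multiply(m)
        if nxt == m:
            return m
        m = nxt

def reliability(paths, p):
    """Inclusion-exclusion over minimal paths, every block with reliability p."""
    paths = list(paths)
    total = 0.0
    for r in range(1, len(paths) + 1):
        for combo in combinations(paths, r):
            blocks = frozenset().union(*combo)
            total += (-1) ** (r + 1) * p ** len(blocks)
    return total

# Hypothetical bridge network: node 1 is the input, node 4 the output.
N = 4
matrix = [[ONE if i == j else ZERO for j in range(N)] for i in range(N)]
matrix[0][1] = path("A")      # unidirectional: 1 -> 2
matrix[0][2] = path("B")      # unidirectional: 1 -> 3
matrix[1][3] = path("C")      # unidirectional: 2 -> 4
matrix[2][3] = path("D")      # unidirectional: 3 -> 4
matrix[1][2] = path("S")      # switch, bidirectional: 2 <-> 3
matrix[2][1] = path("S")

transmission = closure(matrix)[0][N - 1]   # row 1, output column
print(sorted("".join(sorted(p)) for p in transmission))
print(reliability(transmission, 0.9))
```

The entry in row 1 and the output column lists the minimal paths from input to output, from which the transmission probability follows.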
 Based on the solution derived from the connection matrix and on the assumption that the block 
failure rates throughout the power system are identical the following system probability equation can 
be established:  
                                                                                                  
 $P_{\text{Array-based}} = e^{-\left(\frac{\lambda}{5} + \lambda_{\text{Switch}}\right) \cdot t}$   (4-5) 
 
where λSwitch is the failure rate of each switch and λ is the overall failure rate for each converter 
board. From the exponential distribution, described in chapter ‘3 Concept clarification and point of 
origin’, the probability of system survival of a traditional redundant power system can be found: 
 
 $P_{\text{Traditional}} = e^{-\lambda \cdot t}$   (4-6) 
 
 Comparing (4-5) and (4-6) it can be seen that the only difference is the exponent. However, as 
will become apparent, this difference is of great importance when considering redundant systems.  
By means of the equations (3-7) and (3-17) the binomial coefficients for the N+2 redundant system 
can be established: 
                                                                                                  
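 $R_{\text{Traditional}} = 10 \cdot e^{-3 \cdot \lambda \cdot t} - 15 \cdot e^{-4 \cdot \lambda \cdot t} + 6 \cdot e^{-5 \cdot \lambda \cdot t}$   (4-7) 
 
 $R_{\text{Array-based}} = 10 \cdot e^{-3 \cdot \kappa \cdot t} - 15 \cdot e^{-4 \cdot \kappa \cdot t} + 6 \cdot e^{-5 \cdot \kappa \cdot t}$   (4-8) 
 
where κ is equal to: 
 
 $\kappa = \frac{\lambda}{5} + \lambda_{\text{Switch}}$   (4-9) 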
 
 Plotting the two equations reveals the probability of system survival for a given period of time as 
a function of overall converter failure rate.  
 
[Plot: reliability (0.2 to 1.0) versus failure rate (2·10^5 to 12·10^5 FIT)]
 
Figure 32 : Probability of system survival 
 
 The red line in Figure 32 is the system reliability for the array-based approach while the green line 
shows the system reliability for a traditional redundant power system. Comparing the two equations 
(4-7) and (4-8) it can be calculated that the reliability of the array-based approach is worse at 
converter board failure rates below the switch failure rate plus one fifth the converter board failure 
rate. The boundary between the two reliability scenarios can in mathematical terms be expressed as: 
                                                                                                  
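 $-\left(\frac{\lambda}{5} + \lambda_{\text{Switch}}\right) \cdot t = -\lambda \cdot t \;\Rightarrow\; \lambda_{\text{Switch}} = \frac{4}{5} \cdot \lambda$   (4-10) 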
 
 At converter board failure rates below the value given in (4-10) the traditional approach would be 
preferable. However, the converter board failure rate for any power system would by far exceed the 
failure rate of a single switch. For this reason it can be concluded that the array-based approach 
indeed increases the overall reliability of the proposed power system configuration. 
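The comparison between (4-7) and (4-8) and the boundary in (4-10) can be checked numerically. The Python sketch below uses assumed example values for the failure rates and the mission time; they are chosen for illustration only and are not taken from the thesis. The binomial function is an independent cross-check that the coefficients 10, -15 and 6 describe a system needing at least 3 of 5 boards.

```python
import math

def redundant_reliability(rate, t):
    """Common form of (4-7) and (4-8) for a per-board failure rate."""
    x = rate * t
    return 10 * math.exp(-3 * x) - 15 * math.exp(-4 * x) + 6 * math.exp(-5 * x)

def kappa(lam, lam_switch):
    """Effective rate of the array-based system, as in (4-9)."""
    return lam / 5 + lam_switch

def three_of_five(p):
    """Binomial cross-check: at least 3 of 5 units work, each with probability p."""
    return sum(math.comb(5, k) * p**k * (1 - p) ** (5 - k) for k in range(3, 6))

# Assumed example values (hypothetical): rates per hour, time in hours.
LAM = 1.0e-6          # converter board failure rate
LAM_SWITCH = 1.0e-7   # switch failure rate
T = 1.0e5             # time of interest

r_traditional = redundant_reliability(LAM, T)
r_array = redundant_reliability(kappa(LAM, LAM_SWITCH), T)
r_boundary = redundant_reliability(kappa(LAM, 4 / 5 * LAM), T)

print(r_traditional, r_array, r_boundary)
```

With a switch failure rate well below four fifths of the board failure rate the array-based figure is the larger of the two, and at the boundary of (4-10) both expressions coincide.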
 
 
4.6 Discussion and Summary 
 
 An alternative approach in the design of reliable power systems has been presented. Based on 
statistical calculations, using among others the exponential distribution and the connection matrix 
technique, it has been shown that redundancy is the tool to implement when considering highly 
reliable power systems. In fact, it can be deduced from (4-7) and (4-8) that the overall unavailability 
reduction is 88% of that of a traditional power system utilizing single parallel-connected converter 
boards. In other words, the configuration proposed in this chapter allows for elimination of a single 
converter board while still maintaining the same reliability as a traditional system configuration. 
 Based on the dynamic reconfiguration of the system, it has been shown that a higher system 
efficiency is achievable in fault situations compared to other parallel-connection techniques. 
However, for the primary power system working without any faults the proposed power system 
shows lower efficiency than any other configuration due to the many extra switches. The many 
switches have several other drawbacks, among other things the added system cost and overall 
increase in complexity.  
 Having mentioned the main drawbacks, it is worth noting that the proposed power system, besides 
improving the overall system reliability, offers many advantages such as low system 
maintenance costs, consistency and speed of the redundancy management control and overall 
flexibility. The latter allows for easy system expansion in case extra load supply is desirable.  
 In conclusion, the transformers and operations used in the realization of the array-based 
redundancy control are shown in Table 6: 
 
Transformers    Operations 
EACH            cart 
EACHLEFT        fuse 
OUTER           and 
INNER           or 
                link 
                cols 
                sublist 
                reverse 
                first 
                tell 
                second 
                shape 
                place 
                placeall 
Table 6 : List of transformers and operations used throughout this chapter 
 
 A detailed characterization of each element in Table 6 can be found in the help menu for the Nial 
software or by consulting the Nial reference manual. 
 
 
4.7 References 
 
[We02]  Introduction to Graph Theory, Douglas B. West, Prentice Hall, ISBN 0-13-227828-6 
 
[Je01]  Array Theory and Nial, Mike Jenkins and Peter Falster, Research report, Department of 
Electric Power engineering, Technical University of Denmark, August 1999 
 
[Mo01]  Considerations for Array Theory and the Design of Nial, Trenchard More, July 1989, 
Visiting Professor at the Electric Power Engineering Department, Technical University of 
Denmark 
 
 
[Fr01]  A note on Inference by Transitivity, Ole Immanuel Franksen, April 1992, Electric 
Power Engineering Department, Technical University of Denmark 
 
[Fr02]  Basic Assumptions of Array Theory, Ole Immanuel Franksen, February 1996, Electric 
Power Engineering Department, Technical University of Denmark 
 
[Pe01]  An Introduction to Array Theory and Nial, Allan Pedersen, September 1990, Electric 
Power Engineering Department, Technical University of Denmark 
 
[Pe02]  Q’Nial Stand – By, Allan Pedersen and Jens Ulrik Hansen, May 1988, Electric Power 
Engineering Department, Technical University of Denmark 
 
[Mø02]  On the technology of array-based logic, Gert L. Møller, Ph.D. thesis 1995, Electric 
Power Engineering Department, Technical University of Denmark 
 
[Fr03]  Group Representations of Finite Polyvalent Logic – a Case Study Using APL 
Notation, Ole Immanuel Franksen, IFAC VII World Congress, Helsinki, June 1978. 
 
[Fr04]  Colligation or, the logic inference of interconnection, Ole Immanuel Franksen and 
Peter Falster, Mathematics and Computers in Simulation 52 (2000) 1-9. 
 
[Wa01]  Evaluating Performance and Reliability of Automatically Reconfigurable Aerospace 
Systems Using Markov Modeling Techniques, Bruce K. Walker, Department of 
Aerospace Engineering & Engineering Mechanics, University of Cincinnati, OH, USA. 
 
[Mo02]  Notes on the Diagrams, Logic and Operations of Array Theory, Trenchard More, 
Structures and operations in Engineering and Management Systems, The second 
Lerchendal Book, Tapir Publishers. 
 
[Je02]  Q’Nial Reference Manual, Mike A. Jenkins, Nial Systems Limited, Kingston, Ontario, 
Canada, 1985. 
 
[Mi01]  Reliability Prediction of Electronic Equipment, Military Handbook 217 
 
[Ne01]  An array-based study of increased system lifetime probability, Carsten Nesgaard, 
Applied Power electronics Conference and Exposition 2003, Miami, USA 
 
[Si01]  Digital Electronic Switching system, Siemens, A30808-X2751-X-2-7618 
 
 
 
 
 
  
5. Digital control of DC-DC converters  Page 58 
 
5 Digital control of DC-DC converters 
 
 This chapter presents a fully digital converter control implemented by means of a low-cost 
microcontroller. The different aspects in transitioning from an analog design to a digital design are 
considered and design choices based on timing limitations, power consumptions and reliability are 
proposed. These choices then form the basis for two different implementations that utilize look-up 
tables for improved execution speed, multiple control laws for improved efficiency and thermal 
monitoring for improved overall control. The reliability calculations unfortunately show that despite 
the effort in improving different aspects of the converter control the overall system utilizing a digital 
controller is much more likely to fail than a similar design utilizing an analog controller. 
 
 
5.1 Introduction 
 
 The promising results of the array-based redundancy control technique presented in the previous 
chapter motivated further examination of the capabilities of dedicated digital controllers. This 
chapter describes the implementation of a fully digitally controlled converter with built-in 
monitoring features for improved reliability. 
 The majority of digital control circuitry implementations are based on real-time 
computations (see the ‘state of the art techniques’ database). This imposes a significant limitation on 
the obtainable sample frequency, which in turn limits the converter’s dynamic response to load 
changes. In order to be competitive in a commercial market, the digital approach has to provide the 
end-user with features similar to those already offered in fully analog controlled converters. 
Therefore, several optimization techniques were considered before finally settling with the look-up 
table approach described in a subsequent section. 
 Following the theoretical examinations of typical characteristics of modern digitally controlled 
converters, the choice of an appropriate controller had to be made. In order to accommodate the 
specifications for the converter described in this chapter, several controllers were considered based 
on execution speed, power consumption, memory capacity and signal conversion capabilities. The 
controller that had the highest execution speed to power consumption ratio was the 8-bit RISC PIC 
16F877 microcontroller, making it the most suitable candidate for controlling a DC/DC converter. 
This fact combined with a large memory capacity and high-quality development tools provided by 
the device manufacturer made it the microcontroller of choice (in March 2001). 
 Based on equations found in the device datasheet on calculating the acquisition time for a single 
analog to digital conversion, an average value of 20.8 µs for each analog parameter conversion can 
be established. Due to the low system bandwidth that results from this rather long acquisition time, 
converter control and implementation of monitoring features based primarily on periodic sampled 
analog parameters should be avoided. Therefore a trade-off between execution speed, precision, 
complexity and cost is inevitable. If the acquisition time of the ADC embedded in the 
microcontroller is unacceptable due to lack of execution speed, external ADC circuitry can be 
added. Unfortunately, this adds to both cost and complexity of the overall system. Since the focal 
point in this research is reliability issues in digitally controlled converters it was decided not to add 
any external circuitry.  
 The prioritized list shown in Table 7 clarifies some of the core PIC microcontroller features of 
relevance to this work. Further equations and detailed descriptions of the features provided by this 
microcontroller can be found in the manufacturer’s datasheet. 
 
 
Core features:  Use: 
8 k 14-bit word flash memory  
256 E2PROM data memory → Algorithm and look-up table 
Single cycle operations 
20 MHz clock frequency → Converter control 
10-bit PWM module  
8 channel 10-bit A/D converter → Execution speed 
Table 7 : Core features of PIC 16F877 microcontroller 
 
 Having chosen a suitable controller the next step in the development is realization of a 
microcontroller programmer and I/O card. Initially these circuits were designed using Protel’s PCB 
features and fabricated by means of discrete devices. However, it soon became clear that Microchip 
offered entire evaluation kits including manuals, an in-circuit microcontroller programmer and I/O 
card with features for testing the ADC and all RC ports. Since the topic of the research in digitally 
controlled converters did not include realization of programming hardware and debugging circuitry, 
it was decided to acquire this evaluation/developer kit. A picture of the in-circuit programmer (left) 
and I/O card (right) can be seen in Figure 33. 
 
          
Figure 33 : PIC 16F877 microcontroller developer kit 
 
 The mathematics involved in the design of digital systems differs from that associated with 
similar analog implementations and will therefore briefly be introduced in the remainder of this 
section. The point of origin is the mixed analog/digital system shown in Figure 34. 
 
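[Block diagram: r(i) minus c(i) gives e(i); the control law turns e(i) into u(i); the DAC turns 
u(i) into u(t), which drives the process; the process output c(t) is returned through the ADC as 
c(i). The controller side is marked discrete time, the process side continuous time.]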
Figure 34 : Mixed analog/digital system 
 
 
 At each sample instant the controller samples the process output c(t) via the ADC to produce the 
sampled value c(i). The controller then subtracts this from the reference value r(i) to produce the 
error e(i). The error is then manipulated by the system control law to produce the control effort u(i), 
which is sent to the DAC to generate the desired duty cycle. In analytic terms the generation of the 
error signal is given by: 
 
 e(i) = r(i) – c(i)   (5-1) 
 
 The next parameter to establish is the control law. In general, a wide variety of control algorithms 
are applicable, each with its own set of pros and cons. In power electronics, especially in DC-DC 
converters, the most common type of control algorithm is the proportional plus integral plus 
derivative (PID). The variable control effort produced by this algorithm is given by:  
 
 





 $u_C(t) = K \cdot \left[ e(t) + \frac{1}{T_I} \int_0^t e(\tau)\, d\tau + T_D \cdot \frac{de(t)}{dt} \right]$   (5-2) 
 
where K defines the gain, T_I is the integral time and T_D is the derivative time. Since the control 
system in this application is discrete the continuous time equation (5-2) must be converted into a 
discrete time equivalent. This is achievable in a number of ways – for example by means of Tustin’s 
rule or utilization of Euler’s method. The approach taken in this thesis is comprised of two steps – 
the first being the application of Euler’s method, which is given by: 
 
 
 $\dot{x}(i) \cong \frac{x(i+1) - x(i)}{T}$   (5-3) 
 
and second, by adapting the following notation as part of the equation deduction: 
 
 T      =  t_{i+1} - t_i is the sample interval 
 t_i    =  i⋅T applies for constant sample interval 
 i      =  an integer 
 x(i)   =  value of x at time t_i  
 x(i+1) =  value of x at time t_{i+1}  
 
Splitting (5-2) into 3 terms and applying (5-3) where applicable results in the following 3 analytic 
equations: 
 
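 $u(t) = K \cdot e(t)$  →  $u_P(i) = K \cdot e(i)$   (5-4) 

 $u(t) = \frac{K}{T_I} \int_0^t e(\tau)\, d\tau$  →  $u_I(i) = u_I(i-1) + \frac{K \cdot T}{T_I} \cdot e(i)$   (5-5) 

 $u(t) = K \cdot T_D \cdot \frac{de(t)}{dt}$  →  $u_D(i) = \frac{K \cdot T_D}{T} \cdot [e(i) - e(i-1)]$   (5-6) 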
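
 The three discrete terms can be combined into one update step per sample. The following is a minimal sketch in Python, not the assembler implementation used on the microcontroller; the class name, the attribute names and the use of floating point arithmetic are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DiscretePID:
    """Illustrative discrete PID assembled from (5-1) and (5-4) to (5-6)."""
    K: float             # controller gain
    T_I: float           # integral time [s]
    T_D: float           # derivative time [s]
    T: float             # sample interval [s]
    u_I: float = 0.0     # integral term of the previous sample, u_I(i-1)
    e_prev: float = 0.0  # error of the previous sample, e(i-1)

    def update(self, r: float, c: float) -> float:
        """Return the control effort u(i) for reference r(i) and sample c(i)."""
        e = r - c                                             # (5-1)
        u_P = self.K * e                                      # (5-4)
        self.u_I += self.K * self.T / self.T_I * e            # (5-5)
        u_D = self.K * self.T_D / self.T * (e - self.e_prev)  # (5-6)
        self.e_prev = e
        return u_P + self.u_I + u_D
```

 Calling update() once per sample instant returns the sum of the three terms. Setting T_D to zero removes the derivative contribution, which corresponds to a PI control law.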
 
 The proportional term (5-4) determines the controller gain. Converter control is achievable by 
means of a simple proportional (P)-controller through gain adjustments that effectively ensure 
adequate phase margin at the crossover frequency. However, this technique has the drawback of 
lowering the overall dynamic range of the converter considerably, resulting in a poorly performing 
system. Furthermore, the P-controller introduces a small steady-state error and often causes large 
overshoots when reacting to a system change. Applying a term proportional to the integral of the 
error (5-5) reduces the steady-state error by changing the forward system path to a so-called type-1 
system. The system response has now been improved considerably but the large overshoot to system 
changes remains a problem, since the proportional term only reacts to changes and not to the rate of 
change. Applying a term proportional to the derivative of the error (5-6) improves the overall system 
response by damping this overshoot. While the derivative term damps the 
step response and adds speed to the overall control system, it has the drawback of making the control 
loop more sensitive to noise. Although insignificant in the application at hand, the increased 
sensitivity can impose stability challenges in electrical systems and must be carefully considered in 
each design. 
 Since a digital converter has to provide the end-user with the same features as an analog 
realization, the use of a multi-term controller is the only viable solution. The initial real-world 
implementation utilizes a PI controller for the basic converter operation and concentrates on 
optimizing the software execution speed to allow for the implementation of advanced features. The 
improved design changes the control strategy to a PID control law and incorporates additional 
monitoring features. The result is a slightly lower sampling frequency than otherwise achievable with 
the PI control law realization. 
 
 
5.2 Specifications for digital converter implementation 
 
 Based on initial system evaluations concerning reliability vs. converter stress parameters, it 
became apparent that component temperature was the major contributor to poor system reliability, a 
fact that plays a major role in several subsequent chapters of this thesis. Focusing on this particular 
parameter, an evident approach to lowering the overall system temperature in a single path system 
with a fixed heatsink surface area and preselected components is to change the modulation type. This 
approach is intuitively obvious since the power delivered to the load has to pass the power 
components in the converter and the only adjustable parameter is the way the power passes these 
components.  
 It is well-known that hard-switched PWM controlled converters often exhibit poor efficiency at 
light loads, while the efficiency at heavy and full load can be optimized by proper choice of 
parameters such as switching frequency and inductor current ripple. Closer examination of the 
reasons for the poor converter efficiency at light loads reveals that the major contributor is the 
relatively constant switching losses. In chapter ‘6 Load sharing’ it will be shown that the switching 
losses as well as the conduction losses are closely related to component temperature. Therefore the 
term ‘relatively constant switching losses’ holds only with certain qualifications, and only in 
the application at hand, owing to the low power throughput. At the other end of the operating region, at 
heavy loads, the conduction losses are usually dominant and the use of fixed frequency PWM control 
in this region is often considered the best approach. Based on the observations just described it is 
clear that in order to improve the converter efficiency at light loads it would be necessary to 
minimize the switching losses. In turn this would increase the overall system efficiency in this region 
of operation, which causes the average system temperature to decrease, and consequently a higher 
system reliability is obtained. 
 A basic system capable of implementing the abovementioned features is illustrated in Figure 35 
where all relevant subsystems are identifiable – the microcontroller (MCU), the converter (Filter, 
Switches, Rectifier/Filter) and the PC running the development software. It should be pointed out 
that Figure 35 only serves as an illustration of feasible system realizations, since further parameters 
and/or simplifications to the converter itself might have to be included.  
 
[Block diagram: VIn → Filter → Switches → Rectifier/Filter → VOut, with the MCU connected to the converter]
 
Figure 35 : Basic digital converter system 
 
 Combining the abovementioned requirements with the desire to implement a relatively simple 
topology that eventually would facilitate analytical redundancy implementations results in the 
following list of specifications for the converter considered in this chapter: 
 
• Simple converter topology (the focus should be on software execution and reliability) 
 
• Continuous measurements of input voltage, input current, output voltage and output current 
 
• Thermal monitoring and converter shut-down in case of thermal overloading 
 
• Pulse Width Modulation (PWM) control for power throughput above 1.85 W 
 
• Pulse Skipping (PS) control for power throughput below 1.85 W 
 
 To comply with these requirements it was decided to build a simple 5 W buck converter, thus 
maintaining focus on software development and reliability evaluation. Aside from the buck 
converter, a number of measurement circuits were added along with the PIC microcontroller. The 
microcontroller was clocked at 20 MHz, which is the maximum achievable clock frequency. A 
graphical representation of the test circuit is seen in Figure 36. 
 
[Block diagram: 12 V input → Power switch → Filter → 5 V output (1 A max); signals at the PIC16F877 microcontroller: temperature, duty cycle, input current, input voltage, output current, output voltage]
 
Figure 36 : Digitally controlled buck converter 
 
 The buck converter, which is represented by the blocks ‘Power switch’ and ‘Filter’, comprises 
a 200 µH inductor, a 1 mF output filter capacitor, a 1N5811 Schottky freewheeling diode and an 
IRF9530 power MOSFET transistor. Voltage sensing is achieved by means of a simple voltage 
divider, while current sensing at the input and output is achieved by means of sensing resistors (see 
Figure 39). These simple techniques have the disadvantage of resulting in lower overall 
converter efficiency than otherwise obtainable with more sophisticated techniques. However, the 
objective of this research remains clear regardless of which technique is used – that is, an 
examination of promising reliability enhancement techniques. In future designs low-loss 
measurement techniques can easily be implemented, which would then result in increased system 
efficiency.  
 To comply with the requirement of thermal monitoring a temperature-sensing device is mounted 
on the switching MOSFET transistor. The device is a 2-wire digital thermometer from Dallas 
Semiconductor/Maxim and is equipped with an I2C interface. Since the microcontroller supports the 
use of the I2C communication bus, this thermometer is very suitable for the application at hand. In 
fact, using the I2C bus for communication between external devices and the microcontroller allows 
for a faster CPU software execution since several key communication features have been directly 
implemented in the I2C unit. This enables the microcontroller to focus on system control and only 
respond to external devices when data is ready. 
 The implementation of the software for the initial prototype was part of a special course 
supervised by the undersigned. The student was given the system specifications, the buck converter, 
data on the microcontroller and the development software MPLAB. Although MPLAB allows for 
generation of source code using standard C, it was set as a further requirement that the source code 
for the converter control algorithm be written in assembler, because of the more tightly optimized code this yields.  
Weekly meetings throughout the special course ensured that all aspects of the development were 
considered. 
 The test converter, given to the student, that complies with the abovementioned specifications is 
depicted in Figure 37. The toggle switch that can be seen to the left in Figure 37 is used to switch 
between the digital control and an analog reference circuitry. This facilitated an easy comparison of 
the different voltage and current waveforms during the initial controller synthesis. 
 
[Photograph with labels: MOSFET, Thermometer, Interface for PIC16F877, toggle switch selecting analog IC or digital control]
 
Figure 37 : First prototype of the digitally controlled converter 
 
 The software determination of which control law to apply at any given time is based on a set of 
measurements of the input current, input voltage, output current and output voltage. Of these four 
parameters, all of which can be identified in Figure 36, the measurement of the output voltage is the most 
important since the generation of a proper duty cycle depends solely on it. The theoretical 
information about the system states is deduced by means of traditional converter analysis techniques 
combined with a detailed system efficiency assessment. The latter assessment indicates that the 
change from one control law to the other should occur around 1.85 W (equal to 370 mA at the 5 V output). This 
information is stored in the microcontroller memory to form a fixed reference during the transition 
process.  
 After incorporating the converter state information into the control algorithm, it soon became 
clear that using a single point of the operating curve as a mark for control law changes results in 
oscillatory behavior when the converter supplies load currents that vary slightly around 370 mA. 
This oscillatory behavior results in increased high frequency noise and deteriorated 
dynamic converter performance. It was therefore chosen to change the control law from PWM to PS 
when the power level decreases below 1.5 W and from PS to PWM when 
the power level increases above 2 W, thus effectively incorporating hysteresis into the control 
algorithm.  
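
 The hysteresis between the two control laws can be sketched as a small decision function. The two thresholds are those stated above; the function and constant names are hypothetical, and the sketch is written in Python rather than the assembler used on the microcontroller.

```python
PWM, PS = "PWM", "PS"

P_TO_PS_W = 1.5   # below this power level the control law changes from PWM to PS
P_TO_PWM_W = 2.0  # above this power level the control law changes from PS to PWM


def select_control_law(current_law: str, power_w: float) -> str:
    """Return the control law to apply, keeping the current one inside the band."""
    if current_law == PWM and power_w < P_TO_PS_W:
        return PS
    if current_law == PS and power_w > P_TO_PWM_W:
        return PWM
    return current_law
```

 With this structure a load that varies slightly around 1.85 W no longer toggles the control law, since neither threshold is crossed.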
 Initially the PWM control law was implemented in real-time where calculations are performed 
continuously based on the measured output voltage. However, analysis of the program execution 
speed revealed that in order to maintain an acceptable sampling frequency a different implementation 
technique had to be used. Under normal operating conditions it is possible to predict the behavior of 
the converter and therefore in an analytic way calculate the duty cycle needed for proper operation. 
This analytic fact is used to generate a look-up table containing all the information needed for 
continuous converter operation within the specified limits. From the microcontroller datasheet it can 
be seen that accessing the program memory can be achieved in 16 cycles which equates to 4 µs. Of 
the available 8 k program memory only a small segment is used for the actual control software, 
leaving plenty of memory available for other purposes. Implementation of the proposed look-up table 
is simplified by means of a small C program that generates an assembler file containing the 
numerical values. Once the entire set of assembler files is compiled, the look-up table is placed in 
memory along with the control algorithm. This optimized code execution results in a switching 
frequency of 77 kHz and a sampling frequency of 10 kHz when operated in PWM mode. Due to 
asymmetrical skipping of pulses in PS mode the switching frequency is no longer fixed. However, 
the sample frequency remains unchanged. The use of a dedicated control look-up table was found to 
increase the sample frequency from 3 kHz to approximately 12 kHz, thus improving the dynamic 
response considerably. The reason for using a sample frequency of 10 kHz as opposed to the 
achievable 12 kHz is to allow for future implementation of extra monitoring features – a topic in the 
next section. Also shown in the next section is a dataflow diagram of a program run for the improved 
design. Due to the duality of the two designs, a very similar diagram would characterize the system 
discussed in this section. For this reason a program run for the initial design is omitted. 
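
 The generation of the look-up table can be illustrated by a small generator script. The thesis used a small C program for this purpose; the sketch below is written in Python and is not that program. The table length, the mapping from table index to output voltage, the proportional correction and the 8-bit duty value are all hypothetical, as is the choice of emitting one retlw instruction per entry.

```python
V_IN = 12.0          # nominal input voltage [V] (Figure 36)
V_REF = 5.0          # output voltage set point [V] (Figure 36)
N_ENTRIES = 64       # hypothetical table length
V_FULL_SCALE = 10.0  # hypothetical output voltage mapped to the last index [V]
GAIN_PER_VOLT = 0.1  # hypothetical duty-cycle correction per volt of error
DUTY_MAX = 255       # hypothetical 8-bit duty-cycle value


def duty_for_index(index: int) -> int:
    """Precompute the duty value for the output voltage this index represents."""
    v_out = index / (N_ENTRIES - 1) * V_FULL_SCALE
    duty = V_REF / V_IN + GAIN_PER_VOLT * (V_REF - v_out)
    duty = min(max(duty, 0.0), 1.0)  # keep the duty cycle between 0 and 1
    return round(duty * DUTY_MAX)


def lookup_table_asm(label: str = "duty_table") -> str:
    """Return assembler text holding one retlw entry per table index."""
    lines = [f"{label}:"]
    for index in range(N_ENTRIES):
        lines.append(f"    retlw 0x{duty_for_index(index):02X}  ; index {index}")
    return "\n".join(lines) + "\n"
```

 Writing the returned text to a file gives an assembler source that can be built together with the control algorithm. The code that indexes into the table at run time is not shown.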
 
 
5.3 Improved design 
 
 The improved digitally controlled converter uses the same component values as the initial 
prototype, but is implemented on a printed circuit board and has additional features for monitoring 
and control. With the introduction of new control and monitoring parameters, the basic digital 
control system depicted in Figure 38 is used as a reference.  
 
[Block diagram: Plant, Diagnosis, Supervisor and Controller; µP]
 
Figure 38 : Control system 
 
 In Figure 38, the supervisor function, the diagnosis function and the controller make up the 
control algorithm implemented in the improved design. The converter (or plant) is unchanged 
whereby testing of the improved design is simplified since the control software for the initial 
prototype can be implemented directly.  
 The specifications for the initial converter provide a very limited number of the monitoring and 
surveillance features desirable in a highly reliable converter. Features such as fault management and 
prediction would optimize the overall system and are therefore highly desirable. This section extends 
the previous list of specifications to include continuous converter operation during certain falsely 
triggered shut-down events, thereby effectively eliminating a number of failure modes from the 
system. The point of origin of this failure mode elimination is the sensing resistor network shown in 
Figure 39.  
 
[Circuit diagram: converter between VIN and VOUT with input sensing resistors RI1 to RI5 and output sensing resistors RO1 to RO5; sensed node voltages VI1, VI2, VO1 and VO2]
VRI3 = VI1 - VI2, VRO3 = VO1 - VO2
RI3 = RO3 = 1 Ω
RI1 = RI2 = RI4 = RI5 = RO1 = RO2 = RO4 = RO5 = 10 kΩ  
Figure 39 : Converter and sensing resistor network 
 
 The measurements of VI1 and VO2 serve a dual purpose since they are used for the initial input 
voltage sensing, part of the input current sensing, the continuous output voltage sensing and the 
instantaneous output current sensing. A simplified failure modes and effects analysis performed for these 
sensing resistors can be seen in Table 8 for the initial design and in Table 9 for the improved design.  
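
 The way the four sensed node voltages yield voltage, current and power can be sketched as follows. The relations VRI3 = VI1 - VI2 and VRO3 = VO1 - VO2 and the 1 Ω sensing resistors are taken from Figure 39. That each 10 kΩ pair forms a divider halving the node voltage before it reaches the controller is an assumption, as are the function names and the choice of Python.

```python
R_SENSE_OHM = 1.0    # RI3 = RO3 = 1 ohm (Figure 39)
DIVIDER_RATIO = 0.5  # assumed: two equal 10 kohm resistors halve the node voltage


def node_voltage(tap_v: float) -> float:
    """Recover a node voltage from the divider tap voltage seen by the controller."""
    return tap_v / DIVIDER_RATIO


def measurements(tap_vi1: float, tap_vi2: float,
                 tap_vo1: float, tap_vo2: float) -> dict:
    """Deduce voltages, currents and power from the four divider tap voltages."""
    vi1, vi2 = node_voltage(tap_vi1), node_voltage(tap_vi2)
    vo1, vo2 = node_voltage(tap_vo1), node_voltage(tap_vo2)
    i_in = (vi1 - vi2) / R_SENSE_OHM    # VRI3 = VI1 - VI2 across RI3
    i_out = (vo1 - vo2) / R_SENSE_OHM   # VRO3 = VO1 - VO2 across RO3
    return {
        "v_in": vi1, "i_in": i_in, "p_in": vi1 * i_in,
        "v_out": vo2, "i_out": i_out, "p_out": vo2 * i_out,
    }
```

 VI1 and VO2 each appear twice in the calculation, once as a voltage and once as part of a current, which is the dual purpose referred to above.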
 
 
 
Part  Functionality               Failure mode   Local effect               System effect
RI1   Input voltage measurement   Short circuit  Overvoltage at controller  Loss of unit
      Input voltage measurement   Open circuit   Loss of signal (VIN = 0V)  Loss of unit
RI2   Input voltage measurement   Short circuit  Loss of signal (VIN = 0V)  Loss of unit
      Input voltage measurement   Open circuit   Overvoltage at controller  Loss of unit
RI3   Input current measurement   Short circuit  Loss of IIN signal         Loss of unit
      Input current measurement   Open circuit   Loss of converter power    Loss of unit
RI4   Input voltage measurement   Short circuit  Overvoltage at controller  Loss of unit
      Input voltage measurement   Open circuit   Loss of signal (VIN = 0V)  Loss of unit
RI5   Input voltage measurement   Short circuit  Loss of signal (VIN = 0V)  Loss of unit
      Input voltage measurement   Open circuit   Overvoltage at controller  Loss of unit
RO1   Output voltage measurement  Short circuit  Overvoltage at controller  Loss of unit
      Output voltage measurement  Open circuit   Loss of signal (VIN = 0V)  Loss of unit
RO2   Output voltage measurement  Short circuit  Loss of signal (VIN = 0V)  Loss of unit
      Output voltage measurement  Open circuit   Overvoltage at controller  Loss of unit
RO3   Output current measurement  Short circuit  Loss of IOUT signal        Loss of unit
      Output current measurement  Open circuit   Loss of output current     Loss of unit
RO4   Output voltage measurement  Short circuit  Overvoltage at controller  Loss of unit
      Output voltage measurement  Open circuit   Loss of signal (VIN = 0V)  Loss of unit
RO5   Output voltage measurement  Short circuit  Loss of signal (VIN = 0V)  Loss of unit
      Output voltage measurement  Open circuit   Overvoltage at controller  Loss of unit
Table 8 : Simplified failure modes effects analysis for sensing resistors (initial design) 
 
 The data provided in Table 8 clearly shows that all sensing resistor faults lead to system shut-down. 
The analytical redundancy approach utilized in the improved design ignores missing or 
incorrect parameters and instead deduces them based on the parameters available. However, the two 
parameters – output voltage and temperature – remain vital for continued system operation. 
Likewise, operation outside the specified input voltage range (9 V – 12 V) cannot be solved by 
means of analytical redundancy and the converter performance in the event of an undervoltage will 
be at a deteriorated level.  
 
Part  Functionality               Failure mode   Local effect               System effect
RI1   Input voltage measurement   Short circuit  Overvoltage at controller  None
      Input voltage measurement   Open circuit   Loss of signal (VIN = 0V)  None
RI2   Input voltage measurement   Short circuit  Loss of signal (VIN = 0V)  None
      Input voltage measurement   Open circuit   Overvoltage at controller  None
RI3   Input current measurement   Short circuit  Loss of IIN signal         None
      Input current measurement   Open circuit   Loss of converter power    Loss of unit
RI4   Input voltage measurement   Short circuit  Overvoltage at controller  None
      Input voltage measurement   Open circuit   Loss of signal (VIN = 0V)  None
RI5   Input voltage measurement   Short circuit  Loss of signal (VIN = 0V)  None
      Input voltage measurement   Open circuit   Overvoltage at controller  None
RO1   Output voltage measurement  Short circuit  Overvoltage at controller  Loss of unit
      Output voltage measurement  Open circuit   Loss of signal (VIN = 0V)  Loss of unit
RO2   Output voltage measurement  Short circuit  Loss of signal (VIN = 0V)  Loss of unit
      Output voltage measurement  Open circuit   Overvoltage at controller  Loss of unit
RO3   Output current measurement  Short circuit  Loss of IOUT signal        None
      Output current measurement  Open circuit   Loss of output current     Loss of unit
RO4   Output voltage measurement  Short circuit  Overvoltage at controller  Loss of unit
      Output voltage measurement  Open circuit   Loss of signal (VIN = 0V)  Loss of unit
RO5   Output voltage measurement  Short circuit  Loss of signal (VIN = 0V)  Loss of unit
      Output voltage measurement  Open circuit   Overvoltage at controller  Loss of unit
Table 9 : Simplified failure modes effects analysis for sensing resistors (improved design) 
 
 Table 9 provides visible evidence that the analytical approach has effectively removed 10 failure 
modes from the system. Unfortunately, all of the eliminated failure modes concern failing resistors, 
which means that the gain in overall reliability is minimal (see Figure 49 in section ‘5.4 Digital 
converter control reliability’). A more detailed description of electronic part failure rates can be seen 
in chapter ‘3 Concept clarification and point of origin’. In terms of converter designs for commercial 
use, the effort put into designing the control software for the buck converter far exceeds the gain in 
performance and reliability. It would therefore be highly unlikely that analytical redundancy would 
be used for eliminating failure modes with relatively low probability of occurrence. However, the 
technique used in this design clearly verifies the capabilities of the concept regardless of the gain in 
overall reliability.  
 Analyzing the data in Table 9 in more detail provides a theoretical framework for evaluating 
advantages of implementing analytical redundancy in other converter topologies. The starting point 
is a formalized examination of the analytical redundancy realization in the buck converter under 
consideration. The result of this examination is then utilized as verification of the initial concept idea 
of missing parameter deduction. The first step in this examination is an abstract buck converter 
design in compliance with the standard conventions of traditional electric networks. The result, 
which is illustrated in Figure 40, then forms the basis for generating the converter’s connection 
matrix. This matrix is the theoretical end-result that will confirm the advantages of implementing 
analytical redundancy in a standard buck converter. Simultaneously, this matrix is usable in 
determining where analytical redundancy is achievable.  
 The approach taken in determining the necessary parameter deductions is a top-down approach 
where each component is analyzed and characterized by means of theoretical equations and failure 
mode deductions. The technique presented in Figure 40 and Table 10 is a simple but theoretically 
more correct method for deducing analytical redundancy implementations than the method used in the 
initial buck converter design. 
  
[Oriented graph: blocks Q, L, C, D, T, PWM, VO, IO, VI and II between Vin and VOUT, connected by lines numbered 1 to 11]
Figure 40 : Alternative representation of a buck converter  
 
 The oriented graph in Figure 40 has the same inherent semi-directional power flow as the power 
system presented in chapter ‘4 Array-based redundancy’. This means that power and signal flow are 
restricted to the orientation of the arrows. Analyzing the graph in Figure 40 using the electrical 
network approach, the connection matrix shown in Table 10 can be established. Extraction of 
information from this table is accomplished by cross-referencing a given column number with the 
row numbers. In other words, the elements in a given column provide information about which lines 
make up the output line (column number) of the block in question. For example, column 2 (line no. 
2) is made up of information from line 1, line 4 and line 5. 
 
Line  1   2   3   4   5   6   7   8   9   10  11
 1    1   Q   0   0   0   Q   0   0   0   II  VI
 2    0   1   L   0   0   0   IO  0   0   0   0
 3    0   0   1   C   0   0   0   VO  0   0   0
 4    0   D   C   1   0   0   0   0   0   0   0
 5    0   Q   0   0   1   Q   0   0   0   0   0
 6    0   0   0   0   0   1   0   0   T   0   0
 7    0   0   0   0   P   0   1   0   0   0   0
 8    0   0   0   0   P   0   0   1   0   0   0
 9    0   0   0   0   P   0   0   0   1   0   0
 10   0   0   0   0   P   0   0   0   0   1   0
 11   0   0   0   0   P   0   0   0   0   0   1
Table 10 : Buck topological matrix 
 
where P is short for the PWM controller, VO is short for the output voltage sensing circuitry, IO is 
short for the output current sensing circuitry, VI is short for the input voltage sensing circuitry, and II 
is short for the input current sensing circuitry. The remaining matrix entries Q, L, D, C and T define 
the switching MOSFET transistor, the output inductor, the freewheeling diode, the output capacitor 
and the temperature monitoring, respectively.  
 With reference to Table 10, it can be seen that the duty cycle (5) for controlling the switching 
transistor is generated based on five different parameters. Establishing the theoretical relations 
between these converter parameters as well as considering their measurement from an operational 
point of view, it is possible to continue the converter operation in case of certain faults - although at a 
deteriorated level. This increases the overall system reliability as described in section ‘5.4 Digital 
converter control reliability’. 
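
 The column-wise reading of Table 10 can be reproduced with a few lines of code. The sketch below, written in Python, stores the matrix entries exactly as given in Table 10; the function name and the convention of skipping the diagonal entry are assumptions made for illustration.

```python
# Table 10 row by row: "0" means no connection, "1" is the diagonal entry.
ROWS = [
    "1 Q 0 0 0 Q 0 0 0 II VI",
    "0 1 L 0 0 0 IO 0 0 0 0",
    "0 0 1 C 0 0 0 VO 0 0 0",
    "0 D C 1 0 0 0 0 0 0 0",
    "0 Q 0 0 1 Q 0 0 0 0 0",
    "0 0 0 0 0 1 0 0 T 0 0",
    "0 0 0 0 P 0 1 0 0 0 0",
    "0 0 0 0 P 0 0 1 0 0 0",
    "0 0 0 0 P 0 0 0 1 0 0",
    "0 0 0 0 P 0 0 0 0 1 0",
    "0 0 0 0 P 0 0 0 0 0 1",
]
MATRIX = [row.split() for row in ROWS]


def source_lines(column: int) -> list:
    """Return the line numbers (1-based) that make up the given output line."""
    return [
        row + 1
        for row in range(len(MATRIX))
        if row + 1 != column and MATRIX[row][column - 1] != "0"
    ]
```

 For column 2 the function returns lines 1, 4 and 5, and for column 5 it returns lines 7 to 11, which are the five parameters on which the duty cycle is based.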
 The analysis of the simple buck converter shown in Figure 40 is now complete and the analytical 
redundancy improvements have been presented in a mathematical form that facilitates the 
pinpointing of different redundancy locations throughout the converter. Combined with the approach 
taken in determining analytical redundancy in the improved design, the analysis provides a 
theoretical framework and model for future implementations of analytical redundancy in single path 
converters regardless of converter topology. In this context, a single path converter is characterized 
as a converter comprising a single electrical connection between input and output. 
 Turning the attention to the software control of the improved design, Figure 41 illustrates that the 
input voltage specification is of vital importance for the look-up table approach to work. 
Alternatively, the values in the look-up tables could be updated on the fly, extending the usable range 
of the approach and minimizing the size of the look-up tables. A detailed description of an 
implementation using smaller look-up tables is presented in [Pr01].  
 
[Flowchart in two parts. Main: System init; Measure input voltage; If n=100 measure temperature; Measure VOUT, VIN, IOUT, IIN and calculate power; Change in control law; Converter control in 'real-time'; Check temperature and deduce converter state; Shut-down converter. Branch labels: Within spec., Outside spec., Converter OK, Converter failed. Interrupt routine: Timer interrupt (Request sample); ADC interrupt (Sample); Control law.]
 
Figure 41 : Control software dataflow diagram 
 
 The software routine starts by initializing the system. It then checks whether the 
input voltage is within specification. If this condition is not met, the control algorithm operates the 
converter in real-time, with the drawbacks imposed by this type of control. If the 
condition is met, the look-up table approach is used and the routine proceeds to the next step, 
which is a check of a numerical counter variable. If the counter has reached 100, the controller 
measures the temperature and evaluates whether it is within specification. If the temperature 
is above the preset limit, the converter is commanded off. If the temperature is below the preset limit, 
the routine proceeds to the next system state, which determines the correct control law. This 
particular state requires considerable computational effort since it includes acquisition of all external 
parameters followed by several multiplications. Based on the result of this computation, it is 
once again examined whether the converter operates within specification, meaning that the power 
throughput has to be larger than, for example, zero. If this check fails, the routine proceeds to a 
subroutine where the temperature is checked and the current converter state is deduced. If these 
calculations still fail, the converter is commanded off. Conversely, if the deduction of the current 
converter state based on the temperature measurement indicates that the converter is working 
properly, the subroutine ends and execution of the main routine resumes. If, on the other hand, 
the check of all external converter parameters passes, the routine proceeds to the decision state where 
it is determined which control law to apply. Having applied the correct control law, the counter 
variable is increased by one and the routine repeats the main loop. While the main loop performs all 
of the abovementioned steps, program execution is regularly interrupted when the interrupt 
routine has new data acquisitions waiting to be read.  
 Summarizing the above description, the two key routines in the software can be characterized as 
follows: 
 
• The Interrupt routine is responsible for correct converter control 
 
• The Main routine is responsible for temperature measurement, calculation of correct control 
law and type of calculation method (look-up or real-time) 
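
 One pass of the main routine described above can be sketched as a decision function. This is an interpretation of the textual description and Figure 41, written in Python, and not the assembler source. The names are hypothetical. Three points are assumptions: the pass ends once real-time control is selected, the counter restarts after a temperature measurement, and the specification check on the computed power is the one that leads to the state deduction subroutine.

```python
from dataclasses import dataclass

COUNTER_LIMIT = 100  # the temperature is measured when the counter reaches 100


@dataclass
class Readings:
    v_in_ok: bool           # input voltage within specification
    temperature_ok: bool    # temperature below the preset limit
    power_w: float          # power calculated from VOUT, VIN, IOUT and IIN
    state_deduced_ok: bool  # temperature-based deduction says the converter works


def main_pass(counter: int, r: Readings) -> tuple:
    """Return (action, new_counter) for one pass of the main routine."""
    if not r.v_in_ok:
        return "real-time control", counter
    if counter >= COUNTER_LIMIT:
        if not r.temperature_ok:
            return "shut down", counter
        counter = 0  # assumption: the counter restarts after the measurement
    if r.power_w <= 0.0:  # outside specification
        if not r.state_deduced_ok:
            return "shut down", counter
        return "resume main routine", counter
    return "apply control law", counter + 1
```

 The converter control itself is not part of this function, since it is handled by the interrupt routine.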
 
 It should be noted that a source code implementation via subroutines, to which the main program 
guides the instruction pointer, provides a variable response time that depends on the number of 
instructions executed as well as on which instructions are executed. This is unacceptable in a DC/DC 
converter, so an interrupt based program structure, which gives faster and more precise code 
execution, is used for control parameter sampling.  
 When operated outside the specified conditions the generation of the duty cycle is performed in 
real-time, which lowers the sample frequency to 3 kHz. This rather low sample frequency would 
under normal conditions be unacceptable, but instead of simply shutting down the converter in case 
of abnormal operating conditions the proposed control algorithm provides a continuous supply of 
power although at a degraded level. 
 It is well known that converter power throughput is a function of temperature. Manufacturers of 
commercially available converters most often specify the power a given converter can deliver under 
certain operating conditions, making the user responsible for proper cooling and/or control of 
ambient temperature. If the specified values are exceeded, permanent damage to the converter is likely 
to occur. As an extra safety precaution many modern converters have thermal shut-down capabilities, 
which shut down the converter when a preset maximum temperature is reached. Under normal 
circumstances thermal shut-down capability provides little or no warning before the protective 
circuitry shuts down the converter. The proposed control algorithm provides an easy fix to this 
particular problem by incorporating a dynamic change of power throughput as a function of 
temperature. This allows for active temperature management since the temperature at any given time 
can be related to the losses within the converter. Keeping a converter running at maximum power 
throughput during a temperature rise eventually causes the thermal protection to shut down the 
converter. In a stand-alone configuration the dynamic change in power throughput as a function of 
temperature provides little or no advantage over the thermal shut-down protection. However, in load 
priority applications or in redundant configurations the loads with the lowest priority will simply be 
disconnected from the converter, thereby improving the probability of a continuous supply of 
power to the critical loads. Actively controlling the power throughput as a function of temperature is 
the topic of chapter ‘6 Load sharing’. The results derived in that chapter also apply to redundant 
configurations using digitally controlled converters. 
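
 The idea of reducing power throughput with temperature and disconnecting the lowest priority loads first can be sketched as follows. This is not the method derived in chapter ‘6 Load sharing’. The linear derating, both temperature limits, the load list format and all names are hypothetical; only the 5 W rating is taken from this chapter. The sketch is written in Python.

```python
P_RATED_W = 5.0       # rated converter power [W]
T_DERATE_C = 80.0     # hypothetical temperature at which derating starts
T_SHUTDOWN_C = 120.0  # hypothetical thermal shut-down limit


def allowed_power(temp_c: float) -> float:
    """Return the permitted power throughput at the given temperature."""
    if temp_c <= T_DERATE_C:
        return P_RATED_W
    if temp_c >= T_SHUTDOWN_C:
        return 0.0
    return P_RATED_W * (T_SHUTDOWN_C - temp_c) / (T_SHUTDOWN_C - T_DERATE_C)


def shed_loads(loads: list, temp_c: float) -> list:
    """loads holds (name, power_w, priority); priority 0 is the most critical.

    Returns the names of the loads that stay connected.
    """
    kept = sorted(loads, key=lambda load: load[2])  # most critical first
    while kept and sum(power for _, power, _ in kept) > allowed_power(temp_c):
        kept.pop()  # disconnect the lowest priority load still connected
    return [name for name, _, _ in kept]
```

 As the temperature rises the permitted power falls gradually, so the critical load remains supplied after the less important loads have been disconnected.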
 A base for deducing converter states, in the event an external parameter is lost, is established by 
relating the MOSFET transistor temperature to the converter output current. This corresponds to 
relating line 7 and 5 by means of line 9 in Figure 40. A graphical illustration of this process is shown 
in Figure 42 where the mounting of the thermometer on the MOSFET transistor is also depicted. 

[Plot: temperature (axis 0 to 160) versus output current (axis 0 to 1.2); inset showing the thermometer TSense mounted on the MOSFET, no heatsink]
Figure 42 : Relation between temperature and output current in PWM mode 
 
 The curve in Figure 42 is the theoretical correlation between temperature and output current in 
PWM mode. Due to the asymmetrical skipping of pulses a similar correlation in PS mode is a lot 
more complicated to deduce. The simplest way would be to take real-world measurements and use a 
curve fitting algorithm to deduce an analytic equation. This equation can then be implemented in the 
control algorithm along with the PWM correlation data. Since the applicable 
techniques and their impact on overall system reliability can be evaluated just as easily by 
observing a single modulation mode, it has been decided not to implement the correlation curve in 
PS mode. At the system level, this means that the analytical redundancy implementation only works 
in PWM mode. 
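
 Deducing a lost output current measurement from the temperature can be sketched as an interpolation in a stored correlation curve. The sketch is written in Python. The curve points below are hypothetical and are not read from Figure 42, degrees Celsius and ampere are assumed as units, and piecewise linear interpolation is merely one possible choice. In line with the above, the sketch would only apply in PWM mode.

```python
# Hypothetical (output current [A], MOSFET temperature [deg C]) pairs.
# These values are NOT taken from Figure 42.
CURVE = [(0.0, 25.0), (0.25, 35.0), (0.5, 55.0), (0.75, 90.0), (1.0, 140.0)]


def current_from_temperature(temp_c: float) -> float:
    """Estimate the output current from the MOSFET temperature (PWM mode only)."""
    if temp_c <= CURVE[0][1]:
        return CURVE[0][0]
    for (i0, t0), (i1, t1) in zip(CURVE, CURVE[1:]):
        if temp_c <= t1:
            # linear interpolation between two neighbouring curve points
            return i0 + (i1 - i0) * (temp_c - t0) / (t1 - t0)
    return CURVE[-1][0]  # above the last point the estimate is clamped
```

 The estimate can then stand in for the missing current measurement when the power throughput is calculated.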
 From a system point of view the use of analytical redundancy eliminates several failure modes 
and thereby increases the overall reliability. However, in terms of fault tolerance the system is only 
partially resilient, since faults in, for example, the MOSFET transistor still leave the converter in a 
non-operating state. If true fault tolerance is desired, the MOSFET transistor must be replaced by a 
transistor array as shown in Figure 43. The use of this technique is described in more detail in 
chapter ‘3.4.2 Parts/block redundancy’ and in the PATH report in the appendix to this thesis. 
 
[Figure 43 diagram: single transistor versus transistor array; moving to the array gives increased reliability, increased cost and increased complexity]
 
Figure 43 : Example of parts redundancy 
 
 With the above description of the basic features of the system and the implementation techniques 
used in achieving failure mode elimination and converter control, the focus throughout the remainder 
of this chapter will be on system reliability and the experimental measurements that verify the 
converter performance.  
 An image of the improved design can be seen in Figure 44. For easy parameter extraction, the 
input voltage, output voltage, gate drive signal and feedback signal are connected to the BNC 
connectors. The latter signal is used to obtain the gain/phase measurements shown in Figure 58 and 
Figure 59 where short wires are important for minimizing noise pick-up. 
 
    
[Figure 44 images: views (a) and (b) of the improved converter]
Figure 44 : Inside look at the improved converter 
 
5.4 Digital converter control reliability 
 
 While the continuous development in microprocessor technology makes it attractive to replace 
analog circuitry with digital equivalents, the pros and cons in each case should be considered very 
carefully. Analog controllers have the advantage of high bandwidth, no need for data conversion, 
high resolution and fast response to changes in fixed control parameters, while their digital 
counterparts have the advantage of being able to respond to changes in fixed control parameters as 
well as to changes in variables within the converter. Furthermore, digital controllers have the ability 
to react intelligently to the loss of fixed control parameters. By means of the analytic behavior of the 
converter operation, the lost control parameter is usually deducible from the control parameters 
still available. Although different from the technique used in this thesis, one such analytic approach to 
determining immeasurable parameters is presented in [Qi01].  
 Based on the continuous measurement of input power, output power and system temperature, the 
control algorithm improves system reliability by means of analytical redundancy as described in the 
previous section. In terms of reliability this compensates for part of the decrease in system reliability 
incurred when replacing the analog controller with its digital counterpart. An exact figure for the 
reliability decrease due to this replacement can be found using the data given in the Military 
Handbook 217F concerning reliability prediction of electronic equipment: 
 
A commercially available analog controller in a 16-pin DIP package designed for ground-based 
equipment has an MTTF of 1.7 million years. 
 
An 8-bit PIC microcontroller in a 40-pin DIP package designed for ground-based equipment has an 
MTTF of 0.46 million years. 
 
 It is seen that the MTTF decreases by a factor of 3.7 when replacing the analog controller unit 
with a digital controller. Furthermore, the reliability of the software implemented in the digital 
controller has to be considered. In this thesis, it is assumed that once the software is developed, 
tested and debugged it will provide fault free service to the converter it controls. This is an 
approximation justified by the fact that software reliability is a very large topic that would form the 
basis for an entire Ph.D. project. In any case, the approximation does not change the concluding 
results of this chapter and it is therefore assessed that the reliability deviation caused by this 
approximation is minor. Combining these reliability issues it becomes apparent that although 
analytical redundancy and other complex techniques can be applied in a digital controller, the analog 
controller still provides the optimum reliability in simple converter control applications. Having 
mentioned the drawbacks of digital control, it is worth noting that the increase in ‘intelligence’ in 
converter control opens the door to new reliability improvements not possible with analog 
controllers. 
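 The factor of 3.7 quoted above follows directly from the two MTTF figures. The short sketch below reproduces it and also converts each MTTF into an equivalent failure rate. The conversion assumes a constant failure rate (λ = 1/MTTF), consistent with the exponential survival model used later in this chapter, and a year of 8760 hours; the function and constant names are illustrative only.

```python
HOURS_PER_YEAR = 8760.0  # assumption: 365-day year, leap years ignored


def failure_rate_per_hour(mttf_years):
    """Constant-failure-rate assumption: lambda = 1 / MTTF."""
    return 1.0 / (mttf_years * HOURS_PER_YEAR)


MTTF_ANALOG_YEARS = 1.7e6    # analog controller, 16-pin DIP (from the text)
MTTF_DIGITAL_YEARS = 0.46e6  # 8-bit PIC microcontroller, 40-pin DIP (from the text)

mttf_ratio = MTTF_ANALOG_YEARS / MTTF_DIGITAL_YEARS
lambda_analog = failure_rate_per_hour(MTTF_ANALOG_YEARS)
lambda_digital = failure_rate_per_hour(MTTF_DIGITAL_YEARS)

print(f"MTTF ratio (analog / digital): {mttf_ratio:.1f}")  # prints 3.7
print(f"analog controller failure rate:  {lambda_analog:.3e} per hour")
print(f"digital controller failure rate: {lambda_digital:.3e} per hour")
```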
 Although the response to a temperature change is much slower than to a voltage or current change, 
analytical redundancy using the temperature in conjunction with other variables is both applicable 
and feasible. Suppose the circuit for measuring output current fails short (see RO3 in Table 9). Under 
normal circumstances this implies that the efficiency of the converter is zero and the converter has 
failed (see RO3 in Table 8). By applying analytical redundancy the controller is able to determine 
whether the converter has failed or not. The control algorithm applies analytical redundancy by 
relating the converter power throughput to the temperature. Assuming that the previously 
mentioned fault occurs, the controller sustains the power throughput as long as the temperature does 
not exceed the preset limit.  
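 The decision described above can be summarized as a small piece of logic. The following is a hedged sketch rather than the control algorithm of the thesis: the verdict names, the thresholds `current_floor` and `thermal_margin`, and the idea of comparing against an idle temperature are all assumptions introduced for illustration.

```python
from enum import Enum


class Verdict(Enum):
    NORMAL = "normal operation"
    SENSOR_FAULT = "current sensor suspected faulty, sustain power throughput"
    CONVERTER_FAILED = "no power throughput, converter considered failed"
    OVER_TEMPERATURE = "temperature limit exceeded, shut down"


def assess(measured_current, temperature, idle_temperature, temperature_limit,
           current_floor=0.01, thermal_margin=5.0):
    """Cross-check the output current reading against the MOSFET temperature.

    All thresholds are hypothetical and would have to be derived from the
    temperature/current correlation of the actual converter.
    """
    if temperature > temperature_limit:
        return Verdict.OVER_TEMPERATURE
    if measured_current > current_floor:
        return Verdict.NORMAL
    # The current reading is (near) zero. A temperature well above the idle
    # level indicates that power is still flowing, so the sensor is suspected.
    if temperature > idle_temperature + thermal_margin:
        return Verdict.SENSOR_FAULT
    return Verdict.CONVERTER_FAILED


print(assess(0.0, 70.0, idle_temperature=25.0, temperature_limit=120.0))
```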
 In order to compute an accurate converter reliability that accounts for the abovementioned 
scenario, it is necessary to establish a temperature distribution. This is done by means of the 
graphical illustration shown in Figure 45. 
 
 
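[Figure 45 diagram: printed circuit board divided into three temperature zones (TSurface, TSurface - 10°C, TSurface - 30°C); component labels in the figure: 1 resistor, 1 MOSFET, 5 resistors, 2 IC's, 1 inductor, 1 diode, 4 capacitors, 1 resistor, 4 diodes, 2 capacitors, 8 resistors, 3 transistors, 4 capacitors]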
 
Figure 45 : Temperature distribution used for reliability assessment 
 
 Since the converter under consideration is a single path system, the equations used in the 
reliability assessment are directly usable in the form shown in chapter ‘3 Concept clarification and 
point of origin’. Starting out with the temperature dependence of the accumulated failure rate, the 
following curve for the digitally controlled converter can be established. For reference purposes the 
accumulated failure rate of the same converter realized by means of an analog controller is also shown. 
 
[Figure 46 plot: failure rate (axis 2000 to 10000) versus temperature (axis 20 to 120) for the analog design and the digital design]
 
Figure 46 : Failure rate vs. temperature 
 
 From a reliability point of view Figure 46 clearly shows that below a system temperature of 
120°C an analog controller is preferable, while at temperatures above 120°C a digital controller 
should be utilized. However, no system with stringent reliability requirements would operate at 
temperatures above 120°C, thus the digital approach in controlling basic converter topologies simply 
cannot compete reliability-wise with the traditional analog implementations. In more complex and/or 
higher power systems comprised of several individual units the digital approach might still present a 
very attractive alternative to the traditional analog realizations. 
 The accumulated failure rates shown in Figure 46 are now inserted into the equation for the 
probability of system survival as a function of time:  
 
 $R(t) = e^{-\sum \lambda \cdot t}$     (5-7) 
 
which results in the following set of curves for an operating time span of 10,000 hours:  
 
[Figure 47 plot: R(t) (axis 0.2 to 1.0) versus temperature (axis 20 to 120) for the analog design and the digital design]
 
Figure 47 : Survivability as a function of temperature 
 
 The curves depicted in Figure 47 seem to be almost identical. Taking a closer look at the 
survivability throughout the common operating temperature range of 50°C to 90°C reveals that the 
two curves are far from identical. An enhanced view of this operating range is shown in Figure 48. 
 
[Figure 48 plot: R(t) (axis 0.965 to 0.990) versus temperature (axis 60 to 80) for the analog design and the digital design]
 
Figure 48 : Zoomed view of survivability as a function of temperature 
 
 In terms of unavailability the two curves in Figure 48 show that the digital configuration is 36% 
more likely to fail within 10,000 hours at a temperature of 70°C than its analog counterpart. This is a 
considerable reliability decrease that clearly disqualifies the digital approach in simple systems. 
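 To make the link between equation (5-7) and a statement such as "36% more likely to fail" concrete, the sketch below evaluates the survival probability and compares the unavailability of two designs. The failure rates used are hypothetical placeholders and are not read from Figure 46; expressing them in failures per 10^9 hours and interpreting the percentage as the relative increase in 1 - R(t) are both assumptions of this example.

```python
import math

FIT_TO_PER_HOUR = 1e-9  # assumption: rates given in failures per 10^9 hours


def survival_probability(total_failure_rate_fit, hours):
    """Equation (5-7): R(t) = exp(-sum(lambda) * t)."""
    return math.exp(-total_failure_rate_fit * FIT_TO_PER_HOUR * hours)


def relative_unavailability_increase(rate_a_fit, rate_b_fit, hours):
    """Percent by which design B is more likely than design A to fail within `hours`."""
    q_a = 1.0 - survival_probability(rate_a_fit, hours)
    q_b = 1.0 - survival_probability(rate_b_fit, hours)
    return 100.0 * (q_b - q_a) / q_a


# Hypothetical accumulated failure rates (NOT values from the thesis).
ANALOG_FIT = 2000.0
DIGITAL_FIT = 2720.0
HOURS = 10_000.0

print(f"R_analog  = {survival_probability(ANALOG_FIT, HOURS):.4f}")
print(f"R_digital = {survival_probability(DIGITAL_FIT, HOURS):.4f}")
increase = relative_unavailability_increase(ANALOG_FIT, DIGITAL_FIT, HOURS)
print(f"digital design is {increase:.1f}% more likely to fail")
```

 Because both survival probabilities are close to one, the two curves look nearly identical on a full-scale plot even though the relative difference in unavailability is substantial.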
 Having shown the overall system reliability it is informative to consider just how much the 
reliability increases due to the elimination of the 10 failure modes. Since the resistors remain 
physically positioned in the system, they cannot be completely eliminated from the calculations. 
Furthermore, several of the resistors still pose a reliability concern in one of the two failure modes 
associated with all resistors. The technique used in deducing the curve shown in Figure 49 is 
therefore a combination of the simple reliability equation (5-7) and the techniques described in 
chapter ‘3.4.2 Parts/block redundancy’ concerning component joining with the intention to eliminate 
failure modes.  
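[Figure 49 plot: % decrease in λ (axis 0.2 to 1.4) versus temperature (axis 20 to 120), where

$\%\ \text{decrease in}\ \lambda = \frac{\text{Failure rate of initial design} - \text{Failure rate of improved design}}{\text{Failure rate of initial design}} \times 100 = \frac{\text{Improvement}}{\text{Failure rate of initial design}} \times 100$]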
 
Figure 49 : Percent-wise decrease in overall failure rate as a result of analytical redundancy 
 
 The optimum percent-wise decrease in overall system failure rate as a function of temperature is 
equal to 1.54% at a temperature of 71°C, thus being within the normal range of operating 
temperatures. Unfortunately, this percent-wise peak at the mean temperature of normal operation is 
simply due to chance, since elimination of failure modes has very little impact on the waveform of 
sensing resistor failure rates as a function of temperature. The major contributor in this case is the 
physical layout of the resistors in relation to the heat generating MOSFET transistor, hence 
controlled optimum positioning requires a redesign of the converter layout. 
 
 
5.5 Experimental verification 
 
 
Figure 50 : Duty cycle in PS mode 
 
 The very narrow duty cycle pulses in PS mode are seen in Figure 50. Since the skipping of these 
pulses is load dependent, it is very hard to accurately predict the noise caused by activation of the 
MOSFET transistor. Conversely, the symmetrical duty cycle waveform of PWM operated MOSFET 
transistors generates very predictable noise components that can be effectively eliminated by 
designing a proper low pass filter. The effect of the filter used in the improved design can be seen in 
Figure 57, while the duty cycle pulses in PWM mode are depicted in Figure 51. 
 
 
Figure 51 : Duty cycle in PWM mode 
 
 
Figure 52 : Inductor current in PS mode 
 
 Due to the very light load in PS mode the inductor current reaches zero after each switching. This 
asymmetrical activation of the MOSFET transistor is easily identified in Figure 52, which shows the 
inductor current in PS mode.  
 
 
Figure 53 : Inductor current in PWM mode 
 
 Transitioning from the PS mode of operation to the PWM mode, the inductor current becomes 
continuous and the waveform is predictable and symmetrical for constant or slowly varying loads. 
Since the digitally controlled converter is intended as a low power converter, the boundary between 
PS mode and PWM mode is around two watts, which corresponds to an output current of 400 mA.  
 
 
Figure 54 : Converter input voltage in PS mode 
 
 The noise generated at the input by the nonlinear input current pulses is shown in Figure 54 for PS 
mode and in Figure 55 for PWM mode. Since the power throughput is larger in PWM mode, the 
input voltage measured in this region of operation should be expected to exhibit considerable 
noise. Comparing the two figures makes it quite clear that the input noise in PWM mode far 
exceeds that found in PS mode. The only instance where the input voltage decreases in PS mode is 
during transistor conduction and even then the input voltage only decreases by a few millivolts. In 
PWM mode the input voltage ripple is relatively large and consistent. 
 
 
Figure 55 : Converter input voltage in PWM mode 
 
 
 
Figure 56 : Converter output voltage in PS mode 
 
 In Figure 56 it can be seen that the converter output voltage in PS mode has a large ripple 
component. This is partially due to the unpredictable spectral components caused by the random 
skipping of pulses. Another contributing factor to the large ripple is the finite DC gain and rather low 
attenuation at high frequencies. The latter issues will be shown in Figure 58. 
 
 
 
Figure 57 : Converter output voltage in PWM mode 
 
 As shown in Figure 57 the amount of output voltage ripple is greatly reduced in PWM mode. This 
is caused by the predictability of the spectral noise components that enables an efficient design of a 
suitable output filter. Furthermore, as will be verified by the measurement in Figure 59 the converter 
operates as a second order compensated system with high DC gain and continued attenuation of 
spectral components at high frequencies. 
 The following set of measurements examines the bandwidth of the digitally controlled converter 
in both operating modes.  
 
[Figure 58 plot: gain and phase curves with the cross-over frequency marked and a horizontal arrow labelled ‘Gain increase’]
 
Figure 58 : Gain/phase plot of the converter operated in PS mode 
 
 As can be seen in Figure 58 the cross-over frequency is 6.7 kHz and the open loop transfer 
function changes appearance to that of a first order system without compensation. From a stability 
point of view the system remains stable because first order systems are inherently stable 
and thus exhibit a large phase margin at the cross-over frequency. The drawback of such a system is that 
the attenuation at higher frequencies levels off and eventually becomes constant. In this case the 
constant attenuation at higher frequencies is around 9 dB. This fact combined with unpredictable 
spectral components in PS mode coincides with the observation of higher noise at the output shown 
in Figure 56. It should be noted that the cross-over frequency in DCM is higher than the cross-over 
frequency in CCM. This is illustrated in Figure 58 by the horizontal arrow and the text ‘Gain 
increase’. The reason for this increase is the pulse skipping implementation by means of the look-up 
table. In most other systems the opposite phenomenon is usually observed. That is, the cross-over 
frequency decreases as the converter enters DCM. 
 The gain/phase measurement in PWM mode, shown in Figure 59, can be seen to match that of a 
second order compensated system. The continued attenuation at higher frequencies ensures a low 
noise output, which can be identified in Figure 57. 
 
[Figure 59 plot: gain and phase curves with the cross-over frequency marked]
 
Figure 59 : Gain/phase plot of the converter operated in PWM mode 
 
 The starting frequency of the gain/phase plot shown in Figure 59 is changed from 100 Hz to 200 
Hz due to increased noise in the lower region of the frequency sweep. However, from a stability 
point of view this does not change anything and the two gain/phase measurements are still 
comparable. 
 
[Figure 60 plot: efficiency (axis 0 to 90) versus output current (axis 0 to 1.2)]
 
Figure 60 : Efficiency vs. output current 
 
 Figure 60 shows the system efficiency from very light loads to full load. The curve is composed 
of two segments, one for PS mode and one for PWM mode. The segments intersect at 370 mA, where 
the optimum transition from one control law to the other would be. However, due to the oscillatory 
behavior close to this optimum point of transition the previously mentioned hysteresis has been 
implemented. A close-up view of the transition region is shown in Figure 61. 
 
Figure 61 : Enhanced view of the built-in efficiency hysteresis 
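
 The hysteresis around the transition point can be illustrated with a minimal mode selector. The two thresholds below are hypothetical values placed on either side of the 370 mA intersection; the actual hysteresis band of the converter is not restated here and the function names are illustrative.

```python
# Hypothetical thresholds around the 370 mA efficiency intersection.
PS_TO_PWM_THRESHOLD_A = 0.40  # switch PS -> PWM above this output current
PWM_TO_PS_THRESHOLD_A = 0.34  # switch PWM -> PS below this output current


def next_mode(mode, output_current_a):
    """Return the control law to use for the next period.

    Inside the band between the two thresholds the current mode is kept,
    which prevents oscillation between the control laws.
    """
    if mode == "PS" and output_current_a > PS_TO_PWM_THRESHOLD_A:
        return "PWM"
    if mode == "PWM" and output_current_a < PWM_TO_PS_THRESHOLD_A:
        return "PS"
    return mode


def run(currents, initial_mode="PS"):
    """Apply the selector to a sequence of output current samples."""
    modes = []
    mode = initial_mode
    for current in currents:
        mode = next_mode(mode, current)
        modes.append(mode)
    return modes


print(run([0.30, 0.37, 0.41, 0.37, 0.35, 0.33, 0.37]))
```

 Note that a load of 0.37 A maps to PS mode on the way up and to PWM mode on the way down, which is exactly the behavior that suppresses repeated switching near the optimum transition point.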
 
 The final measurement is related to the microcontroller power consumption. After disconnecting 
all external devices, the power consumed by the microcontroller during operation was measured. 
This was achieved by measuring the microcontroller input current and input voltage. The oscilloscope 
then multiplied these two measurements to form the instantaneous power usage. This particular 
measurement was performed for both designs presented in this chapter and the results were identical. 
Therefore only a single measurement is shown. 
[Figure 61 plot: efficiency (axis 70 to 82) versus output current (axis 0.25 to 0.45) with separate PS mode and PWM mode curves]
 
 
            
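[Figure 62 oscilloscope traces: controller voltage (2), controller current (3), power consumption (A)]
Figure 62 : Microcontroller power consumption 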
 
 
5.6 Discussion and summary 
 
 In this chapter a fully digital converter control approach has been presented. The aim was an 
overall increase in converter reliability by means of advanced techniques such as multiple control 
law implementation and analytical redundancy. Two designs were considered. The initial design 
incorporated many of the features used in the improved design, but did not provide the failure mode 
elimination.  
 The realization of the control algorithm and basic buck converter verified that reliability 
optimization by analytic means is indeed possible, although the gain in overall reliability in the buck 
converter was minor. Furthermore, it has been shown that converter control by means of multiple 
control laws is within the timing limits of a standard low-cost microcontroller. Temperature 
measurements allowed for implementation of analytical redundancy, which improves system fault 
resilience, although true fault tolerance is only achievable in hardware redundant converter 
configurations.  
 Implementation of digital control algorithms can be achieved in several ways. One way is to 
initiate a fully digital control design. This approach has advantages in terms of achievable phase and 
gain margins for system stability, but requires a completely different design technique than the more 
well-known redesign approach. The latter approach is based on the traditional methods of analog 
control design and establishes a discrete time model through the use of conversion techniques. The 
converter presented in this chapter utilizes this technique and converts the continuous time control 
equations by means of Euler’s transformation. The dynamic performance achieved with the redesign 
approach shows stable behavior in both DCM and CCM. The transition from one control law to 
another proved to deteriorate the dynamic converter performance and the concept of hysteresis had to 
be implemented. 
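 As an illustration of the redesign approach, the sketch below discretizes a continuous-time PI control law with the backward Euler substitution s ≈ (1 - z^-1)/T. Both the choice of a PI law and the backward variant of Euler's transformation are assumptions made for this example, as are the gains and the sample time; the text above does not specify which form was used.

```python
class DiscretePI:
    """PI controller u = Kp*e + Ki*integral(e), discretized by backward Euler.

    Substituting s ~ (1 - z^-1)/T into Ki/s gives the recursion
    integral[k] = integral[k-1] + T * e[k].
    """

    def __init__(self, kp, ki, sample_time):
        self.kp = kp
        self.ki = ki
        self.sample_time = sample_time
        self.integral = 0.0

    def update(self, error):
        """Advance one sample and return the controller output."""
        self.integral += self.sample_time * error
        return self.kp * error + self.ki * self.integral


# Hypothetical gains and sample time, chosen only to show the recursion.
controller = DiscretePI(kp=2.0, ki=10.0, sample_time=0.001)
outputs = [controller.update(1.0) for _ in range(3)]
print(outputs)
```

 With a constant unit error the output grows by Ki·T per sample on top of the proportional term, which is the discrete counterpart of the continuous-time integrator ramp.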
 As indicated by several entries in the ‘State of the art techniques’ database, the digital control 
approach presents several advantages over the analog implementation. Unfortunately, the reliability 
evaluation in this chapter showed that the digital approach is much more likely to fail than its analog 
counterpart. In other words, despite the promising features of digital control, a long-term feasible 
solution seen from a reliability point of view has to be implemented by means of analog circuitry. As 
new technologies develop at a very fast pace it is not unlikely that digital control of simple 
converters and larger power systems will become just as reliable as fully analog implementations. 
When this time comes, the extra features inherent in all software based control systems can add 
significantly to the list of pros and cons of digital converter control. 
 
So, which approach should be taken in the design of future converters? 
 
 To answer this question properly the term converters needs to be further clarified. If the term 
covers simple converter topologies the answer would be that converter control should be performed 
by means of an analog controller. Conversely, if the term covers complex power systems the answer 
is no longer clear since the digital approach might be able to remove serious failure modes from the 
system. Also, complex systems are usually comprised of a large number of parts and the percent-
wise microcontroller impact on system failure rates tends to decrease, which would allow the 
intelligent features to boost the overall system reliability.  
 This chapter also provided a topological advance in determining how and where analytical 
redundancy can be implemented. This was accomplished by adopting the use of graph theoretical 
evaluation techniques for establishing connection matrices for electrical networks. Following the 
establishment of a converter’s connection matrix, it is possible to maximize the control effort by 
analytically generating overdetermined parameters in the converter. In compliance with the 
traditional rules of electrical networks these parameters can then be used in the deduction of other 
key parameters. Finally, creating a prioritized list of necessary key parameters the benefits of the 
theoretical system optimization can be maximized.  
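 A small example may help to show how a connection matrix yields overdetermined parameters. The sketch below uses a single row of a node-branch incidence matrix and Kirchhoff's current law to recover one branch current from the others. The choice of the buck converter output node, the branch ordering and the sign convention are assumptions made for illustration and do not reproduce the matrices derived in this thesis.

```python
# Hypothetical branch order at the buck converter output node:
# inductor (current enters the node), output capacitor and load (currents leave).
# Sign convention (assumed): -1 enters the node, +1 leaves the node, 0 not connected.
OUTPUT_NODE_ROW = [-1, +1, +1]


def infer_missing_current(incidence_row, branch_currents):
    """Use Kirchhoff's current law (row . currents = 0) to find one unknown.

    `branch_currents` holds the measured values, with None marking the single
    current that has been lost.
    """
    unknown = [k for k, value in enumerate(branch_currents) if value is None]
    if len(unknown) != 1:
        raise ValueError("exactly one branch current must be unknown")
    k = unknown[0]
    if incidence_row[k] == 0:
        raise ValueError("the unknown branch is not connected to this node")
    known_sum = sum(a * i for a, i in zip(incidence_row, branch_currents)
                    if i is not None)
    return -known_sum / incidence_row[k]


# Load current lost: recover it from inductor and capacitor currents.
print(infer_missing_current(OUTPUT_NODE_ROW, [1.0, 0.2, None]))
```

 Any of the three currents can be recovered in the same way, which is what makes the node overdetermined once all three are measured.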
 The design of analytical redundancy in the buck converter considered in this chapter differs 
slightly from the approach just described in that the point of origin was an analysis of each 
component in the system. From there, critical failure modes were considered and characterized 
accordingly.  
 Having described quite a few pros and cons of digital converter control, the following list 
summarizes the most common perception of this particular topic: 
 
        Digital:                          Analog:
Pros:   Noise margin                      Short reaction time
        Temperature stability             High accuracy
        Control algorithms
        Multiple surveillance functions
        Easy adaptation of algorithms

Cons:   Discrete values → bit errors      Noise and temperature sensitive
        Finite sample time                Little or no ‘intelligence’
        Error compensation is complex     Single function circuitry

Table 11 : Digital vs. analog control 
 
 It is seen that although the list of pros for the digital control includes several key features the cons 
continue to impose serious challenges in terms of mirroring the performance of an analog controller. 
 
 
 
5.7 References 
 
 
[Fr05]  Digital Control of Dynamic Systems, Gene F. Franklin, J. David Powell and Michael 
 Workman, Addison-Wesley Longman Inc., ISBN 0-201-82054-4 
 
[Go01]  Control System Design and Simulation, Jack Golten and Andy Verwer, 
 McGraw-Hill, ISBN 0-07-707412-2 
 
[Me01]  Statistical Methods for Reliability Data, William Q. Meeker and Luis A. Escobar,  
 Wiley series in Probability and Statistics, ISBN 0-471-14328-6 
 
[Mi01]  Reliability Prediction of Electronic Equipment, Military Handbook 217 
 
[Ne02]  Digitally Controlled Converter with Dynamic Change of Control Law and Power 
Throughput, Carsten Nesgaard, Nils Nielsen and Michael A. E. Andersen, 
 Power Electronics Specialists Conference 2003, Acapulco, Mexico 
 
[Pi01]  PIC16F87x Datasheet, Microchip Technology Inc., www.microchip.com 
 
[Ba01]  Self-monitoring microcontroller based DC/DC converter, Rune M. Barnkob, Special 
report 2002, Department of Electric Power Engineering, Technical University of 
Denmark 
 
[Qi01]  On the Use of Current Sensors for Control of Power Converters, 
 D. Y. Qiu, S. C. Yip, Henry S. H. Chung, and S. Y. R. Hui, 
 Power Electronics Specialists Conference 2001, Vancouver, British Columbia, Canada 
 
[Ce01]  A New Distributed Digital Controller for the Next Generation of Power Electronics 
Building Blocks, I. Celanovic, I. Milosavljevic, D. Boroyevich, R. Cooley, J. Guo, 
 Applied Power Electronics Conference and Exposition 2000, New Orleans, USA 
 
[Bl01]  Diagnosis and Fault-Tolerant Control, Mogens Blanke, Michel Kinnaert, Jan Lunze 
and Marcel Staroswiecki, Springer, ISBN: 3-540-01056-4 
 
[Pr01]  Design of a Digital PID Regulator Based on Look-Up Tables for Control of High-
Frequency DC-DC Converters, Aleksander Prodic and Dragan Maksimovic, 
 Computers in Power Electronics 2002, Puerto Rico 
 
[Pr02]  Design and Implementation of a Digital PWM Controller for a High-Frequency 
 Switching DC-DC Power Converter, Aleksander Prodic, Dragan Maksimovic and 
 Robert W. Erickson, 
 IECON 2001: The 27th Annual Conference of the IEEE Industrial Electronics Society 
 
6. Load sharing  Page 85 
 
6 Load sharing 
 
 Having described several aspects of digital converter control up until this point, the focus in the 
remainder of this thesis will be on analog power system implementations, where power system in 
this context is shorthand for multiple converter units connected in some form of parallel 
configuration. This chapter describes a new thermal load sharing technique implemented by means of a 
dedicated load share controller IC. The reliability improvements as well as the efficiency 
improvements are discussed and compared to the traditional current sharing technique, which many 
consider the best approach to achieving equal converter stress. As will be shown, the reliability 
improvements caused by the new load sharing technique are significant, considering that the initial 
power system remains unchanged. 
 
 
6.1 Introduction 
 
 With new applications for high-current low-output-voltage power systems emerging nearly every 
day the need for new and cost-efficient power system designs is a matter of course. As output voltage 
levels continue to decrease an approach that seems more and more attractive is the implementation of 
distributed power configurations with point-of-load power conversion. This technique distributes a 
high voltage to all parts of the system, thus minimizing the voltage drops throughout the distribution 
network. Also, the increasing concerns regarding fault tolerance, improved reliability, serviceability 
and redundancy are better addressed with this type of power system configuration through the 
implementation of the needed features directly at the load and thereby protecting the integrity of the 
distribution network. These advantages are well-known throughout the industry for which reason the 
distributed power system approach has been widely adopted as an industry practice to power the next 
generation of information technologies. This widespread use has opened up an opportunity in the 
power supply industry to develop standardized modules that significantly improve both the design 
and manufacturing processes, thus enhancing system performance and reliability while product cycle 
time and costs are reduced. 
 While this seemingly optimal power system configuration provides a wide variety of advantages it 
offers little or no solution to the problem of high-current low-output-voltage conversion at the point-
of-load. A common solution to this problem is parallel-connection of multiple converter units. This 
technique is attractive for a number of reasons. The first and most obvious is that it provides the 
designer with a simple technique for reliability improvements as redundancy quite easily can be 
implemented. Another advantage of this particular technique is that it allows the designer to 
implement large power systems by means of off-the-shelf units, thus minimizing parameters such as 
design time and system cost. However, due to non-ideal parts each converter unit deviates from the 
ideal case, which makes a power system comprised of parallel-connected converters a rather poor 
performing system. To account for these non-ideal parts some form of load sharing is needed, 
whereby it is ensured that each converter in the configuration delivers its share of the total output 
power.  
 In other words, parallel-operation of multiple converters is employed when specifications require 
a highly reliable system designable in a relatively short time frame. However, to make full use of the 
system's potential, load sharing is a must. 
 The remainder of this section provides a general introduction to the design considerations of the 
power system considered in the first part of this chapter. The intention is to implement an N+1 
redundant power system; the first design consideration is therefore to establish the number of 
converters to use in the realization. In the design of an N+1 redundant system, the most 
straightforward implementation is the design of two identical converters each capable of supplying 
the maximum load current. However, this approach results in a 100% power ‘overbudget’ – meaning 
that the available system power is twice that required by the specifications. Increasing the number of 
converter units reduces this power ‘overbudget’, but simultaneously decreases the overall reliability 
due to the increased number of components making up the system. Other parameters that affect the 
selection of the appropriate number of individual converter units to use include system volume, cost, 
and mass to name a few. Since the key concern in this chapter is reliability, the selection of the 
number of converter units to use is a compromise between power ‘overbudget’ and reliability. This 
phenomenon is described in more detail in chapter ‘3 Concept clarification and point of origin’, 
hence the description of this topic will be limited to that already presented. However, the result that 
was obtained in chapter 3 will be used throughout the theoretical evaluation that follows.  
 Based on the intersection of the two curves shown in Figure 9 and the reliability issues concerning 
parallel-connection of multiple converter units described in chapter 3, the power system considered 
in this chapter is comprised of N+1 = 3 parallel-connected buck converters each capable of supplying 
15 A RMS. A system structure that facilitates the required redundancy is depicted in Figure 63, where 
the individual converter currents contributing to the load current (IOUT) are also identifiable. The 
maximum value of the latter parameter is IOUT = 30 A. 
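 The trade-off between the number of converter units and the power ‘overbudget’ can be checked with a few lines of arithmetic. The sketch assumes that each of the N+1 units is rated for exactly the maximum load current divided by N, so that the system still delivers full load with one unit failed; the function names are illustrative.

```python
def unit_rating(max_load_current, total_units):
    """Required rating per unit so that N of the N+1 units carry the full load."""
    n = total_units - 1
    if n < 1:
        raise ValueError("an N+1 redundant system needs at least two units")
    return max_load_current / n


def power_overbudget_percent(total_units):
    """Installed capacity in excess of the required capacity, in percent.

    Installed = (N+1) * I_max / N and required = I_max, so the excess is 1/N.
    """
    n = total_units - 1
    if n < 1:
        raise ValueError("an N+1 redundant system needs at least two units")
    return 100.0 / n


for units in (2, 3, 4):
    print(units, "units:", unit_rating(30.0, units), "A each,",
          power_overbudget_percent(units), "% overbudget")
```

 For two units this reproduces the 100% overbudget mentioned above, and for the three-unit system with a 30 A maximum load it gives the 15 A per-unit rating together with a 50% overbudget under the stated assumption.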
 
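[Figure 63 diagram: input current Iin feeding three parallel converters, Converter 1 (T1), Converter 2 (T2) and Converter 3 (T3), whose output currents I1, I2 and I3 combine into the load current IOUT]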
 
Figure 63 : Power system configuration 
 
 With reference to Figure 63 the individual converter parameters in the parallel-configuration can 
be identified:  
 
 I1, T1 are associated with converter 1  
 I2, T2 are associated with converter 2 
 I3, T3 are associated with converter 3  
 
 These parameters form the basis of the thermal calculations that follow as well as the reliability 
assessments in section ‘6.4 Reliability assessment of the two techniques’. 
 Having determined the power system structure, the next design consideration is the choice of a 
suitable load sharing technique. Since a wide variety of load sharing techniques exists, this particular 
topic is associated with numerous trade-offs. The load sharing technique considered in this chapter 
utilizes a dedicated load share controller IC to ensure stable and optimized current distribution 
among the converters making up the power system. In the next chapter the much simpler droop load 
sharing technique is considered and a new thermal droop technique is introduced. Several other 
techniques can be utilized for load sharing purposes. However, these belong to the more customized 
approaches and the focus in this thesis will be on traditional techniques and how to improve these in 
terms of optimized reliability and/or efficiency.  
 The most commonly used technique, whether implemented by means of a dedicated load share 
controller IC or through the use of the droop characteristics of the converter output, is the current 
sharing technique. This chapter examines the current sharing technique as well as a new thermal load 
sharing technique both implemented and controlled by a dedicated control IC. In each case the pros 
and cons will be discussed and a comparison of the two techniques from a reliability point of view 
will be presented in section ‘6.4 Reliability assessment of the two techniques’. As will become clear 
from this comparison it is a general misconception that the widely used and accepted current sharing 
technique automatically results in optimized power system reliability and efficiency.  
 At the end of the chapter a laboratory implementation controlled by the current-based load sharing 
technique is presented. After several key measurements, presented in section ‘6.7 Experimental 
results’, the current-based load share control is replaced with the new thermal load sharing technique 
that proves to optimize both system reliability and efficiency. For easy experimental comparison, 
measurements of the latter system are also provided in section ‘6.7 Experimental results’. 
 
 
6.2 Current-based load sharing  
 
 Numerous papers and patents exist that describe different implementations of the current-based 
load sharing technique. A common characteristic of these techniques is the use of switch and/or 
output current as a measure of equal distribution of supply currents among the parallel-connected 
converters. Since the switch current is usually used for the converter current mode control, this 
technique has the advantage of eliminating extra current sensing circuitry. However, the 
implementation of this technique using a dedicated load sharing IC is greatly complicated by the 
fact that the outer load sharing loop has a very low cut-off frequency. The switch current is a 
fast-varying signal that requires a feedback system with a very high cut-off frequency for efficient load 
sharing. These contradictory requirements often result in load sharing implementations by means of 
output current sensing, which is a continuous parameter suitable for a slow responding loop. For this 
reason the load sharing scheme considered in this chapter utilizes the latter technique. 
 The main idea behind the current sharing technique is that equal stress and temperature are 
achieved with identical currents through each converter. Theoretically, this results in optimized 
performance and reliability. In the ideal case with identical converter components and identical 
thermal operating surroundings this technique does indeed result in optimized performance and 
reliability. However, the ideal case is very rare and the result of implementing the current sharing 
technique is often less advantageous than predicted by the theoretical models. 
 Turning the attention to the topology of the power system converters, it is assumed that these are 
implemented as buck converters with a single MOSFET switching transistor. In converter 
implementations with multiple MOSFET transistors and/or synchronous rectification the overall 
impact of improper load sharing would be even more profound. This fact becomes clear in section 
‘6.4 Reliability assessment of the two techniques’. In other words, the first part of this chapter 
establishes a foundation for rethinking the ‘obvious’ load sharing approach – the current sharing 
technique. 
 Figure 64 shows the general case where a number of converters are paralleled and forced to 
supply an equal share of the total output current. Since the system under consideration is single point 
failure free, all individual outputs are OR’ed by means of Schottky diodes. 
 
[Diagram: three DC/DC converters, each with its own load control block, connected through a common load sharing bus and supplying a single load]
 
Figure 64 : Load sharing by means of current sharing  
 
 A more detailed illustration of the current sharing technique is depicted to the right in Figure 65 
where it can be seen that high side current sensing is required (in non-isolated systems) as well as 
dual supply rails for the control circuitry. Despite these downsides, the advantages of the current 
sharing technique compared to that of a system without any load sharing are many. Being a simple 
technique to implement, the current sharing technique ensures that no single converter unit is stressed 
to the maximum. This is ensured by preventing each converter from going into current limitation due 
to, for example, small variations in individual converter output voltages. As already mentioned, one 
of the drawbacks of the current sharing technique is the need for output current sensing. The sensing 
of this parameter is typically done by inserting a resistor in series with the converter output. This 
resistor causes additional power loss although it can be kept to a minimum compared to, for example, 
the semiconductor losses of the converter. In turn, the loss that results from inserting this resistor 
heats up the system. Also, in non-isolated systems the current sensing has to be done by differential 
high-side sensing, which adds considerably to the overall converter complexity unless more 
expensive load share controllers that facilitate this type of sensing are used. 
 The illustration of IOUT vs. Temperature in Figure 65 (lower left) is a representation of the 
maximum output current (IMAX) as a function of system temperature. In most converter designs the 
horizontal line (IMAX) determines the maximum safe output current and is often based on output 
current under worst-case temperature conditions. In a later section, it will be shown how the new 
thermal load share technique can be utilized to dynamically change the maximum safe output current 
as a function of temperature. This allows for optimized power redistribution through the system. 
 For now, the center of attention is the theoretical aspects of equal current sharing. Since this 
technique in its basic form is thoroughly described in numerous papers, articles and application notes 
a description of its mode of operation will be limited to that already presented. Thus, focus is 
maintained on the pros and cons of the technique and its impact on overall system temperature and 
power losses.  
 
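[Diagram: converter block diagram (input, power components, current measurement, output, PWM control, load share control) with the current limit signal (ILIM, ILimit) shown vs. time t; high side sensing of ISENSE across RMEAS by an op-amp stage (R1 to R4, +9 V and -9 V supply rails) feeding the LS controller; plot of IOUT vs. temperature showing IMAX and TMAX]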
Figure 65 : Current sharing implementation and controller current waveform  
 
 All electronic parts are associated with parasitic elements that deviate from the ideal-part models 
used in the initial system analysis. A qualitative assessment of these elements reveals that the 
parameter of interest in this context is the MOSFET transistor power losses. Therefore, the following 
calculations concentrate on the parasitic elements associated with the MOSFET 
transistors, a simplification that is justified by the fact that the MOSFET transistors generate the 
most heat. Being the primary source of system heating, the MOSFET transistor is also the primary 
cause of deteriorated system reliability. Apart from integrated circuits, the switching MOSFET 
transistors are among the group of components that exhibit the highest failure rate in any system. 
This fact is described in more detail in chapter ‘3 Concept clarification and point of origin’. 
 Considering the parasitic elements associated with MOSFET transistors, the first and most obvious 
element that comes to mind is the ON-resistance RDS(ON). This parasitic element causes power losses 
when the MOSFET transistor is conducting. In turn, this increases the MOSFET transistor junction 
temperature. Since the ON-resistance is a temperature dependent parameter, the increase in junction 
temperature causes further increase in the ON-resistance which results in an additional junction 
temperature rise. Although several other loss mechanisms are associated with the MOSFET 
transistor the recursive phenomenon just described will form the basis for the following load sharing 
estimations. The other major contributor to MOSFET transistor heating is the switching losses. 
These will be omitted in the initial system evaluation, but will be discussed in more detail as part of 
the laboratory implementation analysis and also form a partial basis for redistributing the load 
current.  
 According to transistor manufacturers’ datasheets, the nominal value of the MOSFET transistor 
ON-resistance can vary by as much as ±30% from one batch of transistors to another. This fact must 
be taken into account when thermal system issues are considered. However, the variation in RDS(ON) 
between transistors from the same batch is usually much smaller.  
 Obtaining measured data points for one of the MOSFET transistors used in the power system 
under consideration and subsequently curve fitting these points, the following RDS(ON) vs. 
temperature curve can be established:  
  
[Plot: RDS(ON) (Ω), axis from 0.025 to 0.150, vs. temperature, axis from -25 to 150]
 
Figure 66 : MOSFET RDS(ON) temperature dependency 
 
 The MOSFET transistor used in this theoretical evaluation is the IRF3315 from International 
Rectifier. Comparing the RDS(ON) vs. temperature curve provided in the datasheet with the real-world 
measurements shows that the physical MOSFET transistor performs better than predicted by the 
datasheet. Therefore the real-world data is used as a reference throughout the remainder of this 
evaluation. Considering Figure 66, it can be seen that the MOSFET ON-resistance increases from 70 
mΩ to 140 mΩ when the junction temperature increases from 25°C to 140°C. The theoretical 
increase over the same temperature range is, according to the device datasheet, 70 mΩ to 157.5 mΩ. 
This is a deviation of more than 11% which verifies the potentially large variations in MOSFET 
transistor ON-resistance. 
 By means of the following calculations, it will be shown that utilization of the current sharing 
technique with the intent to optimize the overall system reliability quite often results in imbalanced 
power loss distribution within the system. The first step in this process is the determination of the 
MOSFET transistor conduction losses. These can be expressed analytically as: 
 
 $$P_{R_{DS(ON)}} = I_{RMS}^{2} \cdot R_{DS(ON)} \tag{6-1}$$
 
 The heat generated by the power loss calculated in (6-1) is transferred from the MOSFET 
transistor housing and heat-sink to the ambient by means of convection and radiation. A 
mathematical description of this heat transfer can be established by the following two equations 
[Mo03]: 
 
 
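 $$P_{Convection} = 1.34 \cdot A \cdot \sqrt[4]{\frac{\left(T_{Surface} - T_{Ambient}\right)^{5}}{h}} \tag{6-2}$$

 $$P_{Radiation} = 5.7 \cdot 10^{-8} \cdot A \cdot \left(T_{Surface}^{4} - T_{Ambient}^{4}\right) \tag{6-3}$$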
 
 The variable ‘h’ in (6-2) defines the height of the heat-sink while the variable ‘A’ in both (6-2) 
and (6-3) denotes the area of the heat-sink. The heat-sink used for all subsequent calculations has the 
dimensions: 20 cm. x 20 cm. A graphical representation of (6-2) and (6-3) is shown in Figure 67 and 
Figure 68 respectively.  
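
 As a rough numerical check of (6-2), the convection term can be evaluated for the stated heat-sink. The sketch below assumes SI units (area in m², height in m), a single 20 cm x 20 cm face with h = 20 cm, and the 40 °C ambient quoted in Figure 67; none of these unit conventions are spelled out in the text, and the function name is purely illustrative. The radiation term (6-3) is left out of the sketch.

```python
def convection_power(t_surface_c, t_ambient_c=40.0, area_m2=0.20 * 0.20, height_m=0.20):
    """Heat removed by natural convection according to (6-2), in watts.

    Assumed units: area in m^2, height in m, temperatures in degC.
    """
    delta_t = t_surface_c - t_ambient_c  # a temperature difference is identical in degC and K
    if delta_t <= 0:
        return 0.0
    return 1.34 * area_m2 * (delta_t ** 5 / height_m) ** 0.25


for t_surface in (60, 80, 100, 120, 140):
    print(f"TSurface = {t_surface:3d} degC -> PConvection = {convection_power(t_surface):5.1f} W")
```

 Under these assumptions a surface temperature of 140 °C corresponds to roughly 25 W of convected heat, which is the order of magnitude shown at the upper end of the Figure 67 axis.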
 
[Plot: PConvection (W), axis from 5 to 25, vs. TSurface (°C), axis from 60 to 140; TAmbient = 40 °C, AHeatsink = 20 cm x 20 cm]
 
Figure 67 : Power dissipation caused by convection 
 
 
[Plot: PRadiation (W), axis from 0.2 to 1.0, vs. TSurface (°C), axis from 60 to 140; TAmbient = 40 °C, AHeatsink = 20 cm x 20 cm]
 
Figure 68 : Power dissipation caused by radiation 
 
 From Figure 67 and Figure 68 it can be seen that the heat transfer from MOSFET transistor to 
ambient is almost solely due to convection. 
 In order to calculate an exact value of the conduction losses, the choice of MOSFET transistor 
ON-resistance deviation must be recognized. For illustration purposes it is chosen to implement the 
three converters in the configuration with a MOSFET transistor of nominal RDS(ON), a MOSFET 
transistor of nominal RDS(ON) + 30% and a MOSFET transistor of nominal RDS(ON) - 30%. In a real-
world implementation this scenario would be extremely rare even though Figure 66 clearly shows 
that a difference in RDS(ON) among the transistors should be expected. 
 The point of origin for the temperature calculations is a thermally stable system that has settled at 
a non-varying temperature. This thermal equilibrium is achieved when heat generation equals heat 
dissipation. To assist in the estimation of MOSFET transistor temperature, the thermal model shown 
in Figure 69 is established.   
 
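[Diagram: heat source PRDS(ON) at junction temperature Tj, thermal resistance Rjc to case temperature Tc, thermal resistance Rcs to TSurface, and PRadiation + PConvection from TSurface to TAmbient]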
Figure 69 : Thermal system equivalent 
 
 Using (6-1), (6-2), (6-3) and the thermal model shown in Figure 69, an exact value for the 
conduction losses and MOSFET temperatures can be found. It should be noted that the following 
calculations assume that the load share controller provides an accurate current sharing of 10 A for 
each converter. 
 
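 $$T_j = 90.7\,^{\circ}\mathrm{C}, \quad T_s = 78.4\,^{\circ}\mathrm{C}, \quad P_{R_{DS(ON)},nom-30\%} = 7.7\,\mathrm{W} \tag{6-4}$$

 $$T_j = 121.4\,^{\circ}\mathrm{C}, \quad T_s = 99.8\,^{\circ}\mathrm{C}, \quad P_{R_{DS(ON)},nom} = 13.5\,\mathrm{W} \tag{6-5}$$

 $$T_j = 173.9\,^{\circ}\mathrm{C}, \quad T_s = 134.9\,^{\circ}\mathrm{C}, \quad P_{R_{DS(ON)},nom+30\%} = 24.4\,\mathrm{W} \tag{6-6}$$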
 
 As expected, the temperature dependency of the MOSFET transistor ON-resistance has a negative 
overall impact that contributes to a significant increase in conduction losses. The calculated 
temperatures indicate that the MOSFET transistor with RDS(ON),nom+30% operates very close to the 
recommended maximum temperature and is therefore very likely to fail. 
 From the results in (6-4), (6-5) and (6-6), the average junction temperature for the 3 MOSFET 
transistors can be found to be 128.7°C while the associated average heat-sink surface temperature is 
104.4°C. These average temperatures as well as the absolute converter heat-sink temperatures are 
used in the reliability assessment provided in section ‘6.4 Reliability assessment of the two 
techniques’. 
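
 The recursive interaction between conduction loss and junction temperature can be illustrated with a simple fixed-point iteration. This is only a sketch under loudly stated assumptions and not the Mathematica algorithm used for the thesis: RDS(ON) is approximated by a straight line through the two points quoted from Figure 66 (70 mΩ at 25 °C, 140 mΩ at 140 °C) instead of the fitted curve, the ±30% deviation is assumed to scale the whole curve, radiation is neglected, and the junction-to-surface resistance of 1.6 K/W is not stated in the text but taken as the ratio (Tj - Ts)/P implied by the results above. All function names are hypothetical, and the printed values are therefore not expected to reproduce (6-4) to (6-6).

```python
def r_ds_on(t_junction_c, scale=1.0, r_25=0.070, r_140=0.140):
    """ON-resistance in ohms: straight line through two points of Figure 66 (assumption)."""
    slope = (r_140 - r_25) / (140.0 - 25.0)
    return scale * (r_25 + slope * (t_junction_c - 25.0))


def surface_temperature(power_w, t_ambient_c=40.0, area_m2=0.04, height_m=0.20):
    """Invert (6-2): surface temperature at which convection alone removes power_w."""
    return t_ambient_c + (power_w * height_m ** 0.25 / (1.34 * area_m2)) ** 0.8


def equilibrium(current_a, scale=1.0, r_th_js=1.6, tol=1e-6, max_iter=200):
    """Iterate Tj -> RDS(ON) -> P -> Ts -> Tj until it settles; returns (Tj, Ts, P)."""
    t_j = 25.0
    for _ in range(max_iter):
        power = current_a ** 2 * r_ds_on(t_j, scale)   # conduction loss, (6-1)
        t_s = surface_temperature(power)               # heat-sink balance, (6-2)
        t_j_new = t_s + r_th_js * power                # Rjc + Rcs of Figure 69 (assumed 1.6 K/W)
        if abs(t_j_new - t_j) < tol:
            return t_j_new, t_s, power
        t_j = t_j_new
    raise RuntimeError("no thermal equilibrium found")


for label, scale in (("nom-30%", 0.7), ("nom", 1.0), ("nom+30%", 1.3)):
    t_j, t_s, p = equilibrium(10.0, scale)
    print(f"{label:8s} Tj = {t_j:6.1f} degC  Ts = {t_s:6.1f} degC  P = {p:5.1f} W")
```

 Even with these simplifications the qualitative picture is the same: with 10 A forced through every converter, the transistor with the highest ON-resistance settles at a markedly higher temperature than the other two.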
 
 
6.3 Temperature-based load sharing 
 
 The thermal load sharing technique proposed in this section compensates for the imbalanced 
power losses that result from implementing the current sharing technique. By monitoring the 
temperature of the heat generating component (or components) the load current supplied by each 
converter in the parallel-configuration can be adjusted to take into account parameters such as 
parasitic elements, physical layout and working environment. 
 Using this technique each converter works at the same temperature, which in turn results in 
identical converter reliability throughout the parallel-configuration. Figure 70 shows the thermal load 
sharing technique in a configuration where the MOSFET transistor heat-sink temperatures are 
monitored and fed back to the control circuitry. This configuration corresponds to the laboratory 
setup considered later in this chapter. Also, included in the figure is the feedback of OR’ing diode 
temperature information. This information is not used in the actual system, but is easily implemented 
to form an average temperature of the active power components in the system. 
 
[Diagram: three DC/DC converters, each with a load control block and temperature sensing (T) of the converter and of its OR’ing diode, connected through a common load sharing bus and supplying a single load]
 
Figure 70 : Load sharing by means of thermal reliability management 
 
 In Figure 70 the current sensing resistor is replaced with a temperature dependent device 
monitoring the temperature of the switching MOSFET transistor and/or the output 
diodes/synchronous rectifiers. The attractiveness of this technique is that it allows for load sharing by 
means of any temperature parameter of interest within the system. Without additional complications, 
the load sharing technique could be modified to account for the PWM controller temperature instead 
of the existing MOSFET transistor temperature. Having decided on a proper location for the 
temperature monitoring component, the information obtained from this device is fed back to the 
system error amplifier as shown in the simplified figure below. 
 
[Diagram: error amplifier with the signals Vref, VOUT, IL, Tswitch and VLoad-share, grouped into typical control parameters and new control parameters; output: error signal]
 
Figure 71 : Control parameters at feedback error amplifier 
 
 It should be noted that a dedicated load share controller IC incorporates the parameters in the green 
circle in the VOUT feedback signal. Hence, the number of parameters directly fed back to the error 
amplifier is lower than that illustrated in Figure 71. However, a discrete realization would feed all the 
signals back to the amplifier to generate an output signal that is compared to the traditional ramp 
signal to form a proper duty cycle. 
 The real-world implementation of the thermal load sharing technique is straightforward, since the 
existing control circuitry employed by, for example, the current sharing technique can be used. The 
temperature sensing device is simply mounted at the most critical location within the converter – in 
this case on the MOSFET transistor housing. The temperature signal is then fed back to the load 
share controller where it replaces the current signal. To ensure a system startup without running one 
or more converters in current limitation, the signal from the current measurement can be combined 
with the temperature information to create a load share controller that initially uses the current 
sharing technique. As the temperatures of the individual converters change, the output current 
information is offset by the temperature signal – thus maximizing system efficiency and reliability. 
This topic will be discussed in more detail in the next section. 
 
[Diagram: converter block diagram (input, power components, current measurement, output, PWM control, load share control) with the temperature sensor TSense, resistors R1 and R2 and a 2.7 V - 20 V supply]
 
Figure 72 : Temperature sensor mounting 
 
 Using the same equations that were used to calculate the MOSFET transistor power losses and 
temperatures in section ‘6.2 Current-based load sharing’, the load distribution for the thermal load 
share technique can now be established. It should be noted that the temperature was a variable in the 
current sharing calculations while the individual load currents were fixed parameters. In the case of 
thermal load sharing, the temperature is fixed while the current distribution among the individual 
converters is the variable. With this information in mind, the following three converter currents can 
be established: 
 
 $$I_{R_{DS(ON)},nom-30\%} = 11.7\,\mathrm{A} \tag{6-7}$$

 $$I_{R_{DS(ON)},nom} = 9.6\,\mathrm{A} \tag{6-8}$$

 $$I_{R_{DS(ON)},nom+30\%} = 8.7\,\mathrm{A} \tag{6-9}$$
 
 The average MOSFET junction temperature that results from distributing the load current as 
calculated in (6-7), (6-8) and (6-9) is 115.5°C. This is 13.2°C lower than the average junction 
temperature using the current sharing technique. Due to the fact that heat dissipation from heat-sink 
to ambient depends on source-to-ambient temperature difference, the average surface temperature 
associated with the 115.5°C junction temperature is 95.7°C. This is 8.7°C lower than the average 
surface temperature in the current sharing case. Since the overall system reliability is a function of 
heat-sink surface temperature it can easily be seen that the probability of system survival is much 
higher in the case of thermal load sharing.  
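
 A first-order feel for this current distribution can be obtained without the full thermal model. The sketch below assumes identical heat-sinks and thermal resistances, so that equal temperature implies equal conduction loss, and assumes that the ±30% deviation scales the whole RDS(ON) curve; both are simplifications introduced here, not statements from the text. Under these assumptions the current in each converter is proportional to 1/sqrt(RDS(ON)).

```python
import math


def thermal_share(total_current_a, r_scale):
    """Currents that equalise I^2 * R when all transistors sit at the same temperature."""
    weights = [1.0 / math.sqrt(k) for k in r_scale]
    total_weight = sum(weights)
    return [total_current_a * w / total_weight for w in weights]


# Relative ON-resistances: nominal -30%, nominal, nominal +30%
currents = thermal_share(30.0, [0.7, 1.0, 1.3])
print([round(i, 2) for i in currents])  # [11.67, 9.76, 8.56]
```

 The simplified result lies in the neighbourhood of the currents in (6-7), (6-8) and (6-9) without being identical to them, since those are based on the measured RDS(ON) curve and the complete thermal model.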
 Among the advantages of the thermal load sharing technique is its ability to optimize the system 
reliability at any given time, its system efficiency enhancing capabilities and its easy and cost-
effective implementation. Another perhaps less obvious advantage is the thermal load sharing 
technique’s ability to control converters with different power ratings. This particular topic is 
discussed in more detail in section ‘6.3.2 Dynamic change in converter power throughput ’.  
 Even though it affects the system very little, a disadvantage of the thermal load sharing 
technique that must be mentioned is the possibility of a slight increase in individual converter failure 
rate due to the inherent failure rate of the thermal monitoring device. However, this drawback is 
more than compensated for by the much lower average system temperature that results from the 
implementation. 
 
6.3.1 Current-based load sharing with temperature offset 
 
 As the previous section has shown, the thermal load sharing lowers the overall system 
temperature which increases the probability of system survival throughout its intended mission. 
Since temperature is a relatively slow-varying parameter, a load sharing implementation based solely 
on the thermal load sharing technique will inevitably be a rather slow-reacting system. To account 
for this fact during startup and load steps, the system can incorporate current information and as the 
temperature increases offset the load current distribution to account for the temperature differences 
among the individual converters. 
 Figure 73 shows a situation where both current and temperature information are used to distribute 
the load current between two converters. Provided the startup is slow enough, the system will 
approximately share the load current equally. As components start to heat up, the current distribution 
is offset by a value proportional to the temperature difference between the two converters. In steady 
state and for slow varying loads the sequence shown to the right in Figure 73 can be expected. 
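
 The steady-state effect of such a temperature offset can be sketched as follows. The function name and the gain of 0.1 A/K are hypothetical values chosen for illustration only; the text does not specify the proportionality constant, and the sketch ignores the loop dynamics entirely.

```python
def share_targets(total_current_a, temperatures_c, gain_a_per_k=0.1):
    """Equal current share, offset in proportion to each converter's deviation
    from the average temperature (hypothetical gain in A/K)."""
    n = len(temperatures_c)
    t_avg = sum(temperatures_c) / n
    return [total_current_a / n - gain_a_per_k * (t - t_avg) for t in temperatures_c]


# Cold start: equal temperatures give equal currents
print(share_targets(20.0, [25.0, 25.0]))
# Later: the cooler converter is asked to supply the larger share
print(share_targets(20.0, [90.0, 110.0]))
```

 Because the temperature deviations sum to zero, the offsets cancel and the total load current is unchanged; only its distribution among the converters shifts.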
 
[Plot: current vs. time for Converter 1 and Converter 2; equal current during startup, temperature-based load sharing in steady state]
 
Figure 73 : Power system startup and steady state operation 
 
 Although temperature-based load sharing is a rather slow-reacting system, it should be kept in 
mind that the outer load sharing loop usually is orders of magnitude lower in cross-over frequency 
than the inner current loop or the voltage regulation loop. In power systems comprised of converters 
with individual switching frequencies around 100 kHz and a converter control loop cross-over 
frequency around 10 kHz the outer load sharing loop will typically have a cross-over frequency 
around 100 to 300 Hz. This means that even optimized designs utilizing current sensing information 
will exhibit slow load control reaction to system startup and/or load transients. 
 
6.3.2 Dynamic change in converter power throughput  
 
 The illustration of IOUT vs. Temperature in Figure 74 (lower right) is a representation of the 
maximum output current (IMAX) as a function of system temperature. As briefly mentioned in section 
‘6.2 Current-based load sharing’, the horizontal line (IMAX) determines the maximum safe output 
current. It is often based on output current under worst-case temperature conditions, and has the 
drawback of unnecessarily limiting the maximum output current at low temperatures (shown by the 
dotted line). By adding temperature information to the optimum current limit signal a dynamically 
changing current limitation is achievable. Figure 74 (left) shows how the thermal load sharing 
technique allows each converter to supply more current at lower temperatures while still limiting the 
current at higher temperatures. This dynamic power limiting capability makes optimum power 
distribution possible in system configurations where similar working conditions for each 
converter are not obtainable. This means that regardless of each converter’s positioning in the overall 
system, the thermal load sharing technique always optimizes the system reliability and maximizes 
each converter’s power throughput capability according to the working environment. 
 
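[Diagram: converter block diagram (input, power components, current measurement, output, PWM control, load share control, LS controller) with a network in which ISENSE is coupled through C1 onto the divided temperature signal VTemp · R2/(R1+R2), giving I'SENSE, which is compared with ILimit; waveforms of these signals vs. time t; plot of IOUT vs. temperature showing IMAX and TMAX]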
 
Figure 74 : Thermal load sharing implementation and signal waveform 
 
 The fact that the thermal load sharing technique operates by means of a general system parameter 
allows for parallel-connection of converters with different power ratings. In other words, a power 
system is easily realized by means of a 50 W and a 100 W converter. Before the 50 W converter 
starts to overheat the load share controller redistributes the load current so the 100 W converter 
supplies the major part. A similar situation in a system employing the current sharing technique 
would quickly cause a converter malfunction. 
 To the upper left in Figure 74 a simple circuit for implementing the proposed dynamic power 
throughput capability is provided. It can be seen that the continuously measured MOSFET transistor 
current is capacitively added to the DC (or slow-varying) temperature signal. The combined signal is 
then fed to the current limit pin (ILIM) at the PWM controller.  
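
 The effect of adding the temperature signal to the current sense signal can be sketched numerically. Every component value below is hypothetical (threshold voltage, sense gain, sensor slope and divider resistors are not given in the text), and the temperature sensor is assumed to deliver a voltage that rises linearly with temperature. The sketch only shows the principle: a fixed ILIM threshold combined with a temperature-dependent offset yields a current limit that falls as the temperature rises.

```python
def allowed_current(temp_c, v_limit=1.0, sense_gain_v_per_a=0.05,
                    r1=10e3, r2=10e3, sensor_v_per_degc=0.010):
    """Current at which (current sense signal + divided temperature signal)
    reaches the fixed ILIM threshold. All parameter values are hypothetical."""
    v_temp = sensor_v_per_degc * temp_c        # assumed linear temperature sensor
    v_offset = v_temp * r2 / (r1 + r2)         # resistive divider R1/R2
    return max(0.0, (v_limit - v_offset) / sense_gain_v_per_a)


for temp in (25, 75, 125):
    print(f"T = {temp:3d} degC -> current limit = {allowed_current(temp):4.1f} A")
```

 With these illustrative values the limit drops from 17.5 A at 25 °C to 7.5 A at 125 °C, whereas a fixed limit would have to be set at the worst-case value for all temperatures.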
 
 
6.4 Reliability assessment of the two techniques 
 
 It has been mentioned in previous chapters that system temperature is the single most important 
parameter in system reliability assessments. Minimizing the temperature rise increases the system 
reliability and quite often also results in better system efficiency. Therefore, in order to assess the 
system reliability the PCB component distribution must be known.  
 When considering the physical layout of a converter, there is a trade-off assessment between the 
thermal aspects of the converter design and the electrical constraints of, for instance, the physical 
distance between MOSFET transistor and controller IC. From a reliability point of view the IC 
should be positioned as far away from the heat generating MOSFET transistor as possible. However, 
from an electrical point of view the IC should be positioned as close to the MOSFET transistor gate 
terminal as possible – in order to minimize the effects of PCB trace inductance. As a compromise the 
layout shown in Figure 75 is chosen for the reliability assessment.  
 
[Diagram: PCB layout with transformer, heat-sink, transistor, ICs and miscellaneous components, together with a temperature vs. distance profile across the PCB marking TSurface, TTransformer, TIC, TEnd of PCB and TAmbient]
 
Figure 75 : System temperature distribution 
 
 Based on the above temperature distribution an assessment of the overall system reliability can be 
established. Using the component data provided in chapter ‘3 Concept clarification and point of 
origin’ the following failure rates (expressed as failures in 10^9 hours) for the three converters in the 
current sharing configuration can be calculated: 
 
 $$\lambda_{R_{DS(ON)},nom-30\%} = 4383\ \mathrm{FIT} \tag{6-10}$$

 $$\lambda_{R_{DS(ON)},nom} = 9042\ \mathrm{FIT} \tag{6-11}$$

 $$\lambda_{R_{DS(ON)},nom+30\%} = 27471\ \mathrm{FIT} \tag{6-12}$$
 
 The probability of survival for each converter is calculated by means of the exponential 
distribution shown in (3-3). By inserting 8760 hours (one year) the following converter probabilities 
can be obtained: 
 
 $$Prob_{R_{DS(ON)},nom-30\%} = 0.9623 \tag{6-13}$$

 $$Prob_{R_{DS(ON)},nom} = 0.9238 \tag{6-14}$$

 $$Prob_{R_{DS(ON)},nom+30\%} = 0.7861 \tag{6-15}$$
 
 Combining the binomial terms for the probability that all converters work and the probability that 
exactly one converter fails results in the following system reliability:  
 
 $$Prob_{System} = 0.9740 \tag{6-16}$$
 
 Expressing this probability in terms of system unavailability, the following probability of annual 
downtime can be established: 
 
 $$Q = 1 - Prob_{System} = 1 - 0.9740 = 0.0260 = 2.60\% \tag{6-17}$$
 
 Performing the same reliability calculations for the thermal load sharing technique provides a 
foundation for a system performance comparison. Since the temperatures in this case are the same 
for all three converters they have identical failure rates: 
 
 $$\lambda_{Thermal} = 7819\ \mathrm{FIT} \Rightarrow Prob_{Thermal} = 0.9338 \tag{6-18}$$
 
Based on (6-18) the overall system reliability can be calculated: 
 
 $$Prob_{System} = 0.9874 \tag{6-19}$$
 
Expressing (6-19) in terms of unavailability: 
 
 $$Q = 1 - Prob_{System} = 1 - 0.9874 = 0.0126 = 1.26\% \tag{6-20}$$
 
 Comparing (6-17) and (6-20) it can easily be seen that the probability of system malfunction for 
the thermal load sharing technique is less than half that of the current sharing technique. Calculating 
the percent-wise decrease in system unavailability it becomes clear that the proposed technique 
reduces the annual downtime probability by 51.6%. This is a significant reduction caused simply by 
considering the parasitic elements of the MOSFET transistors. Had the converters been positioned in 
different working surroundings, the effect could have been even more profound. 
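
 The chain of calculations in this section can be retraced with a few lines of code. The sketch assumes that (3-3) has the usual exponential form exp(-λt), that the failure rates are 4383, 9042 and 27471 FIT for the current sharing case and 7819 FIT per converter for the thermal case, and that the N+1 = 3 system survives as long as at most one converter has failed. The function names are illustrative.

```python
import math
from itertools import product

HOURS_PER_YEAR = 8760.0


def survival(fit, hours=HOURS_PER_YEAR):
    """Probability of survival for a failure rate given in FIT (failures per 1e9 hours)."""
    return math.exp(-fit * 1e-9 * hours)


def at_most_one_failure(probs):
    """Probability that no converter, or exactly one converter, has failed."""
    total = 0.0
    for state in product((True, False), repeat=len(probs)):
        if state.count(False) <= 1:
            p_state = 1.0
            for works, prob in zip(state, probs):
                p_state *= prob if works else 1.0 - prob
            total += p_state
    return total


current_sharing = at_most_one_failure([survival(f) for f in (4383, 9042, 27471)])
thermal_sharing = at_most_one_failure([survival(7819)] * 3)

print(f"{current_sharing:.4f}")  # 0.9740
print(f"{thermal_sharing:.4f}")  # 0.9874
reduction = 1.0 - (1.0 - thermal_sharing) / (1.0 - current_sharing)
print(f"{100 * reduction:.1f} %")  # 51.6 %
```

 Enumerating the working/failed states keeps the sketch valid for converters with unequal survival probabilities, which is the situation in the current sharing case.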
 
 
6.5 Summary of the theoretical evaluation 
 
 The first part of this chapter has provided the theoretical foundation for a new thermal load 
sharing technique that at any given time ensures optimum reliability, performance and efficiency. A 
comparison between the thermal load sharing technique and the common and widely accepted 
current sharing technique has been provided and the pros and cons in each case have been discussed. 
 Reliability estimations provided the analytic evidence that the thermal load sharing technique has 
superior reliability compared to the traditional current sharing technique. Besides superior reliability, 
the advantages of the thermal load sharing technique include minimization of MOSFET transistor 
losses, simple implementation and lower overall system temperature. The simplicity comes from the 
fact that most converters are fitted with a thermal monitoring device for the thermal protection 
circuitry often found in modern converters. Also, since the system is intended for parallel-connection 
of multiple converters, the needed load share controller is already present. In short, all that is needed 
is a network for adjusting the temperature information correctly and feeding it to the load share 
controller. 
 A disadvantage of the thermal load sharing technique is the possibility of a slight increase in 
individual converter failure rate. However, this is more than compensated for by the much lower 
average system temperature that results from the implementation. Also, the dynamics of the 
temperature control will inevitably be a bit slower than those of the current sharing. However, as was 
explained in a previous section, this slower reaction to load changes imposes only 
minor issues, since the outer load sharing loop in any case has a very low cross-over frequency. 
 It should be remembered that all calculations so far are based on the conduction losses associated 
with the switching MOSFET transistors. In a real-world system several components are affected by a 
temperature change, which can be directly related to the system reliability. One example of an 
application where the thermal load sharing technique could be used to great advantage is server 
applications, where rack mounting is a common way of physically positioning power supplies. The 
differences in temperature between the individual converter boards that result from this configuration 
can be quite significant, since one or more power supplies are usually positioned in-between two 
adjoined power supplies while at least one power supply is positioned to one side of the rack. This 
means that the power supply mounted to one side of the rack has better working conditions, thus 
being able to supply a larger part of the total load current. Doing this, the power supplies mounted in 
inferior working environments are alleviated. 
 Having established an algorithm for the current vs. temperature calculations in the mathematical 
software ‘Mathematica’, the power system under consideration is easily modified to a two-converter 
system. The laboratory implementation that follows is comprised of two converters and a 
modification of the theoretical evaluation presented so far would facilitate a qualitative comparison. 
Utilizing the same MOSFET transistors that were used in the three-converter system, the following 
data can be established for a two-converter system:  
 
MOSFET combination     TJ,average   TS,average   TJ,average   TS,average   Decrease in
                       (current)    (current)    (thermal)    (thermal)    Ploss (%)
RDS+30% and RDS,nom    142.5 °C     139.1 °C     135.6 °C     133.7 °C     7.7
RDS+30% and RDS-30%    128.1 °C     125.2 °C     113.7 °C     111.5 °C     27.3
RDS-30% and RDS,nom    106.9 °C     104.9 °C     103.8 °C     102.0 °C     7.4
Table 12 : Two-converter system performance 
 
 From the data in Table 12 it can be concluded that the larger the difference in RDS(ON), the larger 
the decrease in MOSFET transistor power losses becomes. Furthermore, it is apparent that the 
percent-wise gain in overall MOSFET transistor losses, when transitioning from the traditional 
current sharing technique to the thermal load sharing technique, drops as the parasitic elements 
decrease in magnitude. From a system point of view this result is obvious for two reasons. First, the 
temperature dependence of the parasitic elements is very nonlinear. Second, the power loss and 
thereby the temperature increase is lower for small parasitic values. 
6. Load sharing  Page 100 
 
6.6 Specifications for the laboratory test setup 
 
 This section describes the experimental results of the new thermal load sharing technique 
presented in the previous section. From a schematic presenting all key components in the test setup 
to measurements of load sharing currents and overall system efficiency this section will verify that 
thermal load sharing in most cases outperforms the widely used current sharing technique.  
 The power system considered in the previous section was comprised of 3 parallel-connected 
converter units. The theoretical results indicate that the new load sharing technique increases the 
overall system reliability quite significantly based on an unequal distribution of the individual 
converter load currents. From the description associated with the theoretical power system 
evaluation, it can be deduced that the more converter units make up the power system, the better 
the achievable reliability. From a system point of view, this result seems obvious, 
since the more paths the current can take from input to output, the easier it is for the system to 
optimize the system temperatures by balancing the currents through each converter unit.  
 The system considered in this and the following sections is comprised of 2 identical parallel-
connected buck converter units, for which reason it should be expected that the overall reliability 
improvement is less than that obtained in the theoretical power system evaluation. Indeed, as the 
experimental verification will show, the overall system improvement in terms of reliability and 
efficiency is less than that obtained in the 3 converter power system previously described. However, 
the results obtained still provide significant improvements in overall system performance. These 
improvements are explained both analytically and verbally in section ‘6.8 Theoretical system 
evaluation’. 
 Finally, based on reliability calculations using the same equations that were used for the 
theoretical system evaluation in the first part of this chapter, it will become clear that utilization of 
the thermal load sharing technique increases the overall system reliability by lowering the average 
system operating temperature. 
 Each of the two parallel-connected buck converters is capable of supplying a load current of 25 
A at an output voltage of 5 V. A block diagram of the two-converter system can be seen in Figure 76. 
 
             
[Figure: Input feeds Converter 1 and Converter 2 in parallel; each supplies IOUT/2 to the common Output, which carries IOUT]
 
Figure 76 : Test setup block diagram 
 
 The annual downtime for the two-converter system, shown in Figure 76, utilizing the traditional 
current sharing technique can be calculated to 10 minutes and 14 seconds. This downtime is just one 
of the parameters used to compare the two load sharing techniques.  
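 
The quoted annual downtime can be turned into an availability figure with simple arithmetic. The following Python sketch is only an illustration: it assumes a 365-day year, and the function name is hypothetical and not part of the original work.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year


def availability_from_downtime(downtime_s: float) -> float:
    """Return the fraction of a year during which the system is in service."""
    return 1.0 - downtime_s / SECONDS_PER_YEAR


# 10 minutes and 14 seconds of annual downtime, as stated in the text
downtime_s = 10 * 60 + 14
availability = availability_from_downtime(downtime_s)
print(f"Downtime: {downtime_s} s/year, availability: {availability:.6%}")
```

The same function can be applied to the downtime of the thermal load sharing system to compare the two techniques on a common scale.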
 Turning the attention towards a more detailed system outline, Figure 77 shows a simplified 
schematic of the test setup under consideration. Each converter utilizes 4 ICs, a single MOSFET 
transistor and two freewheeling diodes. These active components are mentioned explicitly because 
they are major contributors to the overall converter failure rate. Besides these active components, 
the converters comprise input and output capacitors, the energy storing inductors and a relatively 
large number of small-signal components (not included in Figure 77). 
 
[Figure: schematic of the two parallel-connected buck converters between Input (+16V) and Output (+5V). Labels for each converter: 100 µF, 100 µF, 48 µH, IRFP064, 10 mΩ, 470 µF, RFeedback, RGate, MC3307, UC3902, UC3843, IR2110, PBYR3045]
 
Figure 77 : Simplified power system schematic 
 
The following specification provides the key parameters as well as the components used: 
 
Specifications for each converter: 
 
 Converter wattage: 125 W 
 Input voltage: 16 V 
 Output voltage: 5 V ±5% 
 Switching frequency: 122 kHz 
 
 
Key components for each converter: 
 
 Load share controller: UC3902 
 PWM controller: UC3843 
 High-side driver: IR2110 
 Differential op-amp: MC3307 
 MOSFET transistor: IRFP064 
 Freewheeling diode: PBYR3045 
 Current sense resistor: 10 mΩ 
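 
As a rough cross-check of the specification, the ideal operating point of one converter can be estimated. The Python sketch below assumes a lossless buck converter in continuous conduction mode, which is a simplification introduced here; the 48 µH inductor value is taken from Figure 77 and the function name is hypothetical.

```python
V_IN = 16.0    # input voltage [V], from the specification
V_OUT = 5.0    # output voltage [V], from the specification
F_SW = 122e3   # switching frequency [Hz], from the specification
L_OUT = 48e-6  # inductor value [H], from Figure 77
I_MAX = 25.0   # maximum load current per converter [A]


def ideal_buck_operating_point(v_in, v_out, f_sw, inductance):
    """Ideal CCM buck: duty cycle and peak-to-peak inductor ripple current."""
    duty = v_out / v_in
    ripple = v_out * (1.0 - duty) / (inductance * f_sw)
    return duty, ripple


duty, ripple = ideal_buck_operating_point(V_IN, V_OUT, F_SW, L_OUT)
power = V_OUT * I_MAX  # rated output power of one converter [W]
print(f"D = {duty:.4f}, ripple = {ripple:.3f} A, P = {power:.0f} W")
```

In the real converters the duty cycle is somewhat higher than this ideal value because of conduction and switching losses, which is also why the duty cycle shifts when the load is redistributed.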
 
 Even though every attempt has been made to ensure equal converter layout, performance and 
switching frequency, Figure 85 clearly shows that small differences exist. The timing problem is 
easily fixed by utilizing clock synchronization (however, the UC3843 has no sync pin) while the 
difference in duty cycle is intentional, since this determines the current supplied by each converter. 
The percentage difference in switching frequency is very low and it is therefore estimated that its 
effects are unimportant in relation to testing the two load sharing techniques. The exact 
switching frequencies are 123.4 kHz for converter 2 and 120.1 kHz for converter 1. 
 The load sharing circuitry is implemented following the guidelines provided in [Ti01]. The 
overall load sharing implementation including the differential amplifier can be seen in Figure 78. 
  
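[Figure: load sharing schematic showing the UC3902 load share controller (pins GND, Sense, ADJ, ADJR, COMP, VCC, Share+, Share-), the MC3307 differential amplifier sensing the voltage across the sense resistor between converter and load (+9 V and -9 V supplies) and the UC3843 PWM controller (pins COMP, VFB, ISense, RT/CT, GND, VRef, VCC, Output; +12 V supply) driving the MOSFET. The design equations shown in the figure are:]
 
$$R_{ADJ} = \frac{\Delta V_{Out,max} - I_{Out,max} \cdot R_{Sense}}{I_{ADJ(max)}}$$
 
$$R_{G} = \frac{V_{ADJ(max)}}{I_{ADJ(max)}}$$
 
$$C_{Comp} = \frac{G_{M}}{2 \cdot \pi \cdot f_{Cross}} \cdot \frac{R_{ADJ}}{R_{G}} \cdot \frac{R_{Sense}}{R_{Load}} \cdot A_{CSA} \cdot A_{PWR}$$
 
$$R_{Comp} = \frac{1}{2 \cdot \pi \cdot f_{Cross} \cdot C_{Comp}}$$
 
Figure 78 : Load sharing schematic for a single converter 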
 
 
 
Figure 79 : Real-world test setup of two identical buck converters 
 
 Figure 79 and Figure 80 show images of the laboratory test setup. The large copper baseplate on 
which the two converters are implemented is used as a heat stabilizing mechanism for all small-
signal devices. The physical separation between the MOSFET transistor heat-sinks (24 cm.) prevents 
the two converters from interacting thermally, thus increasing the adjustability of the control system. 
 
[Figure: photograph of converter 2 with the heatsinks, inductor, current sensor and control circuit labelled]
 
Figure 80 : Close up view of converter 2 
 
6.6.1 Prototype limitations during startup 
 
This section describes some of the problems associated with parallel-connection of multiple 
converters. The control circuitry of most converters uses a voltage feedback control scheme to 
stabilize the output voltage and to account for load variations. As long as the converters are operated 
independently, a properly designed control circuitry will ensure a stable output voltage within the 
feedback bandwidth. However, due to component tolerances, the output voltages of seemingly 
identical converters might differ slightly, thus causing severe problems when parallel-connected 
without considering the effects of this voltage difference. Figure 81 shows the ideal case with two 
identical converters connected in parallel. The load sharing in terms of equal converter current 
supply is almost perfect with only minor off-set differences. This type of load sharing is a theoretical 
possibility but in a real-world implementation not likely to work. A more realistic real-world 
implementation of the system is depicted in Figure 82 where converter 1 supplies the entire output 
current as well as charging the output capacitor of converter 2. This occurs because the output 
voltage of converter 1 is slightly higher than the output voltage of converter 2. Since the control 
circuitry of converter 2 monitors the output voltage and detects that there already is sufficient voltage 
available it adjusts the duty cycle to almost zero. The control circuitry of converter 1 performs the 
same task, but in contrast to the control circuitry of converter 2 it adjusts its output voltage to 
account for the demand for load current as well as charging current for the output capacitors of 
converter 2. A graphical view of this situation can be seen in Figure 84b. 
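 
The effect of a small setpoint difference can be illustrated with a simple steady-state model. The Python sketch below is not the model used in this work: it assumes each converter behaves as an ideal voltage source behind a small output resistance, that a converter can only source current, and that the load draws a constant current. All numerical values and names are hypothetical.

```python
def share_currents(v1, v2, r1, r2, i_load):
    """Split i_load between two Thevenin sources feeding a common bus.

    Each converter is assumed unable to sink current, so a negative
    solution is clamped to zero and the other converter carries the load.
    """
    v_bus = (v1 / r1 + v2 / r2 - i_load) / (1.0 / r1 + 1.0 / r2)
    i1 = (v1 - v_bus) / r1
    i2 = (v2 - v_bus) / r2
    if i1 < 0.0:
        return 0.0, i_load
    if i2 < 0.0:
        return i_load, 0.0
    return i1, i2


R_OUT = 1e-3  # hypothetical output resistance of each converter [ohm]

# Identical setpoints: the load is shared equally
ideal = share_currents(5.00, 5.00, R_OUT, R_OUT, 25.0)
# 50 mV lower setpoint on converter 2: converter 1 takes the entire load
offset = share_currents(5.00, 4.95, R_OUT, R_OUT, 25.0)
print(ideal, offset)
```

With a low output resistance, a setpoint difference of only a few tens of millivolts is enough to push the entire load onto one converter, which is the situation described above.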
 
[Figure: DC/DC Converter 1 and DC/DC Converter 2, each with temperature control and output capacitor COut, feed a common bus and load. Both outputs are at VOut and each converter supplies ~ITot/2 of the load current ITot]
 
Figure 81 : Ideal redundancy implementation 
 
[Figure: same system with converter outputs at VOut and VOut - δ (δ = small). One converter supplies ITot + ICharge, where ICharge charges the output capacitor of the other converter]
 
Figure 82 : Non-Ideal redundancy implementation 
 
[Figure: same system with converter outputs at VOut and VOut - δ (δ = small). The two converter currents are ITot/2 - χ and ITot/2 + χ (χ = small)]
 
Figure 83 : Real-world implementation 
 
 The system depicted in Figure 83 illustrates a real-world implementation utilizing OR’ing diodes. 
These diodes serve a dual purpose: besides preventing a single converter fault from propagating 
through the system and possibly shorting out the common output voltage bus, they prevent the 
converter with the highest offset voltage from charging the output capacitors of the 
parallel-connected converters. Hence, the problems described above are eliminated to a certain degree, 
although additional power loss has been added to the system. This latter issue deserves a bit more 
consideration. It is generally recommended that the isolation device in low to medium current 
applications is a Schottky diode, since its relatively low forward voltage drop limits the impact on 
system efficiency and temperature. In high-current applications a MOSFET transistor is recommended. 
Several manufacturers produce OR’ing MOSFET transistors with ON-resistances as low as 3.2 mΩ. The 
voltage drop across such a device is a mere 0.32 V at 100 A output current, as opposed to a 0.75 V 
drop across a Schottky diode at the same current level. It should be noted that successful use of 
MOSFET transistors as OR’ing elements requires implementation techniques that prevent the MOSFET 
body diode from conducting in fault situations.  
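 
The voltage drops quoted above translate directly into dissipated power. The short Python sketch below uses only the figures given in the text (3.2 mΩ, 0.75 V, 100 A); the function names are illustrative, and the diode is simplified to a constant forward voltage drop.

```python
def oring_mosfet_loss(i_out, r_on):
    """Conduction loss of an OR'ing MOSFET modelled as a pure resistance."""
    return i_out ** 2 * r_on


def oring_diode_loss(i_out, v_forward):
    """Conduction loss of an OR'ing diode with a constant forward drop."""
    return i_out * v_forward


I_OUT = 100.0  # output current [A]
R_ON = 3.2e-3  # MOSFET ON-resistance [ohm]
V_F = 0.75     # Schottky forward voltage drop [V]

mosfet_drop = I_OUT * R_ON
mosfet_loss = oring_mosfet_loss(I_OUT, R_ON)
diode_loss = oring_diode_loss(I_OUT, V_F)
print(f"MOSFET: {mosfet_drop:.2f} V, {mosfet_loss:.0f} W; diode: {diode_loss:.0f} W")
```

Under these simplifications the MOSFET dissipates less than half the power of the Schottky diode at 100 A, and its advantage grows at lower currents because its loss scales with the square of the current.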
 The current distribution for the system in Figure 83 is shown in Figure 84c.  
 
[Figure: three panels (a), (b) and (c) showing individual converter current versus output current for DC/DC Converter 1 and DC/DC Converter 2; in panel (c) the two currents are ITot/2 - χ and ITot/2 + χ]
 
Figure 84 : Individual converter current distribution 
 
 From a fault tolerance point of view, the implementation approaches shown in Figure 81 and Figure 
82 would result in system shut-down due to the lack of fault isolation capabilities, whereas the 
implementation in Figure 83 continues to provide service to the load although at a deteriorated level.  
 
 
6.7 Experimental results 
 
 
Figure 85 : Duty cycles of the two converters 
 
Initially the two 25 A buck converters were paralleled and operated in a ‘semi-droop’ manner 
where the load sharing was based on the ‘natural’ output impedance of the converters. This technique 
is very simple, but unfortunately it only rarely results in an acceptable level of 
performance, efficiency and reliability. Indeed, as the efficiency measurement shows, one converter 
supplies almost the entire load current, leaving the other converter in an idle state. To optimize the 
efficiency and system reliability some form of load control is needed. This is implemented by 
utilizing the dedicated load share controller UC3902 from Texas Instruments. 
Since this controller does not allow for high-side differential current sensing, an OP-amp is employed 
to compensate for the lack of this feature. It should be noted that Texas Instruments offers load share 
controllers that allow for high-side differential sensing (like the UC3907), but due to a rather tight 
schedule it was chosen to proceed with the load share controller available at the time of 
implementation, the UC3902. 
 Following the guidelines provided in the load share controller datasheet and associated 
application notes, the current sharing technique was successfully implemented. The two buck 
converters were then operated at nominal output power (12.5 A each) while tuning the current share 
controller. The result of this tuning can be seen in Figure 86. 
 
[Figure: individual converter current (axis 0 to 14) versus output current (axis 0 to 30) for Converter 1 and Converter 2]
 
Figure 86 : Current sharing 
 
 Figure 86 shows that the achievable current sharing is very accurate. The only observable 
deviation from identical current levels is in the 1 A – 7 A range. Since the load share controller 
operates over a fairly large current range a small deviation should be expected.  
 The next set of measurements is performed while each converter operates individually, thus 
allowing for very accurate temperature data to be obtained. The result is shown in Figure 87. 
 
[Figure: temperature (axis 25 to 175) versus output current (axis 5 to 25) for the converter 1 and converter 2 MOSFET transistors]
 
Figure 87 : Temperature measurement of the two switching MOSFET transistors 
 
 The temperature at which the MOSFET transistors will be working is around 70°C, since this 
temperature corresponds to an individual converter output current of 12.5A. From Figure 87 it can be 
seen that converter 1 generally operates at a higher temperature than converter 2. The point where 
the temperatures of the two MOSFET transistors are equal is at an output current of 23.6A. Above 
this very high converter output current the temperature of the MOSFET transistor used in converter 2 
exceeds the temperature of the MOSFET transistor used in converter 1. Part of the explanation for 
the relatively large MOSFET transistor temperature difference is shown in Figure 88, which shows 
the freewheeling diode temperatures. 
 
[Figure: temperature (axis 25 to 150) versus output current (axis 5 to 25) for the converter 1 and converter 2 freewheeling diodes]
 
Figure 88 : Temperature measurements of the two freewheeling diodes 
 
 The temperatures of the freewheeling diodes are also relatively high. Again, it can be seen that 
converter 1 operates at a higher temperature than converter 2. Since an intense mutual heating 
between the two active components takes place, it is difficult to identify the actual self-heating of 
each component. As one component increases the ambient temperature (in the immediate vicinity of 
the power components) the other suffers from increases in parasitic elements resulting in increased 
internal heating. The diode parameters affected by the thermal changes will briefly be discussed in 
section ‘6.8 Theoretical system evaluation’. 
 The following set of measurements was taken by means of an automated test setup capable of 
measuring every parameter of interest over the entire operating range in a matter of seconds. 
However, close examination of these results reveals that a true system efficiency cannot be measured 
in very short time steps, since the temperatures on which the load sharing technique relies need time 
to settle. Furthermore, the parasitic elements in the system also need time to increase to a realistic 
continuous operating level, meaning that a misleadingly high efficiency is easily obtained by taking a 
single measurement at high output currents while the converter effectively operates close to room 
temperature.  
 
[Figure: converter 1 current and converter 2 current (axis 0 to 20) versus output current (axis 0 to 30)]
 
Figure 89 : Initial current distribution of the thermal load sharing technique 
 
 Figure 89 shows the current distribution of the thermal load sharing technique. It is noteworthy 
that the current contribution by converter 1 actually drops when the load current exceeds 12.5A. 
Although highly unusual, this phenomenon is theoretically possible, since external as well as internal 
operating conditions could change abruptly, thus forcing the thermal load share controller to 
redistribute the currents. 
 
[Figure: efficiency (axis 80 % to 94 %) versus output current (axis 0 to 30)]
 
Figure 90 : Initial power system efficiency measurement of the thermal load sharing technique 
 
 Figure 90 shows the initial system efficiency that resulted from taking the measurements shown in 
Figure 89 in a matter of seconds. Since this efficiency exceeded the theoretically obtainable 
efficiency it was clear that a different approach had to be taken. For this reason a new set of 
measurements was obtained. These can be seen in Figure 91 and Figure 93. 
 
[Figure: efficiency (axis 0.4 to 0.9) versus output current (axis 0 to 30) for semi-droop sharing and current sharing]
 
Figure 91 : Power system efficiency of current sharing and semi-droop 
 
 Figure 91 shows the efficiencies of the two techniques for parallel converter operation discussed 
so far. With reference to Figure 91, it can be seen that the semi-droop configuration exhibits higher 
efficiency at low output currents compared to the current sharing approach. This is simply due to 
chance since the converter supplying the majority of the current in this test setup apparently has both 
the highest output voltage and the highest efficiency at light loads. The control circuitry of the other 
converter monitors the common output voltage which is higher than its internal reference voltage, 
and adjusts the duty cycle accordingly. 
 In order to make a fair comparison between the two load sharing techniques, each converter is 
implemented with exactly the same components, the same length of wiring and the same current sensing 
resistors in both cases, although the resistors are not necessary in the thermal load sharing 
situation. Furthermore, since the thermal load sharing technique does not need high-side sensing, the 
added OP-amp and associated passive components could also be removed from the circuit. However, for 
comparison purposes these components remain active during the thermal load sharing implementation. 
 
[Figure: individual converter current (axis 0 to 16) versus output current (axis 0 to 30) for Converter 1 and Converter 2]
 
Figure 92 : Current distribution while operated by thermal load sharing 
 
 As can be seen in Figure 92, the individual converter currents are no longer identical; far from it, 
actually. At lighter loads the difference between the two currents is 1 A, but as the load increases the 
separation between the two converter currents becomes larger. At the nominal output current (25 A) 
the difference between the two converter contributions is 3.1 A. The system efficiency that results 
from this redistribution of converter currents can be seen in Figure 93. A complete schematic of the 
thermal load sharing implementation can be found in the appendix to this thesis. 
 
[Figure: efficiency (axis 0.4 to 0.9) versus output current (axis 0 to 30) for semi-droop sharing, current sharing and thermal sharing]
 
Figure 93 : System efficiency for the three different techniques 
 
 It should be noted that the efficiency of the thermal load sharing follows that of the semi-droop, 
since this causes the lowest system heating. At heavier loads the efficiency of the thermal load 
sharing exceeds that of the current sharing approach by approximately 2%. 
 
 
Figure 94 : Common output voltage 
 
 
 
Figure 95 : Output voltage ripple 
 
 Figure 94 and Figure 95 show the output voltage as a DC measurement and an AC measurement, 
respectively. From Figure 95 it can be seen that the output voltage ripple deviates slightly from the 
expected triangular waveform. This is due to the small difference in switching frequency and the 
constantly altered duty cycle. However, the ripple voltage is clearly within the ±5% voltage variation 
limit set as a requirement for the power system under consideration. The frequency of the output 
voltage ripple is 125 kHz, which is higher than either one of the converter switching frequencies. 
Thus, the output is a partial combination of each converter’s output voltage ripple. 
 
 
Figure 96 : Power system input DC voltage 
 
 
 
Figure 97 : Power system input AC voltage 
 
 Figure 96 and Figure 97 show measurements of the input voltage during a full load situation. It 
can be seen that the voltage spikes that result from the large current pulses are significant and the 
only way to minimize these spikes is by adding more capacitance close to the switching MOSFET 
transistors. In order to ensure continuous PWM controller and high-side driver operation several 
high-quality capacitors were mounted directly on top of the ICs.  
 
[Figure: gain and phase plot; cross-over frequency = 35.3 kHz, phase margin = 54.6°]
 
Figure 98 : Gain/phase plot of converter 1 
 
[Figure: gain and phase plot; cross-over frequency = 39.1 kHz, phase margin = 49.2°]
 
Figure 99 : Gain/phase plot of converter 2 
 
 Although the careful converter layout and design results in similar gain/phase waveforms for the 
two converters, it can still be seen that converter 2 has a cross-over frequency 3.8 kHz higher than 
converter 1. Due to this increase in cross-over frequency the associated phase margin decreases by 
5.4°, thus being very close to the theoretical rule-of-thumb minimum limit of 45°. 
 
 
6.8 Theoretical system evaluation 
 
 This section explains why the current distribution of the thermal load sharing technique results in 
higher overall efficiency. The calculations will be limited to include only high power components – 
meaning, components that are related to the high current path from input to output. Identifying these 
components the following list can be established: 
 
• MOSFET transistors 
• Freewheeling diodes 
• Current measurement resistors 
• Inductors 
• Capacitors 
 
 Based on the subsequent descriptions and computations loss estimations are provided at the end of 
each subsection. These loss estimations verify that a shift in individual converter currents gives rise 
to the efficiency gain predicted in the theoretical evaluation presented in the first part of this chapter. 
It should be noted that the correlation between output current, temperature and power losses in some 
of the power components is very complex. Due to this complexity the following descriptions only 
state the initial loss equations or make a reference to where the equations can be found - otherwise 
the results are shown in terms of graphical illustrations. All calculations and the associated 
algorithms can be found on the accompanying CD in the folder ‘Mathematica’. 
 
6.8.1 MOSFET transistors 
 
 The redistribution of MOSFET transistor losses is the dominant factor in the system efficiency 
improvement. However, as will be shown, the freewheeling diodes and the filter capacitors also 
contribute to a shift in system losses whereas the contributions from the current measurement 
resistors and the inductors are only minor.  
 In the following section, the subscript ‘Current’ is used to denote the losses associated with the 
current sharing technique while the subscript ‘Thermal’ is used to denote the losses associated with the 
thermal load sharing technique. Also, since the system is comprised of two converters, the losses are 
calculated for each converter and are represented by the aforementioned subscript notation followed 
by two numbers. For example the MOSFET transistor conduction losses in the current sharing case 
are denoted ‘PConduction, Current = 5.13 W and 2.30 W’. This indicates that the loss in converter 1 is 5.13 
W and the loss in converter 2 is 2.30 W. Also, as will be shown in Figure 100 and Figure 101 the 
notation RDS(ON) + 2.9mΩ is adopted to indicate the difference in MOSFET transistor ON-resistance for 
the two transistors used in the test system. This value is found by comparing the actual 
measurements to the theoretical loss evaluations based on the nominal RDS(ON) value (transistor 
datasheet). 
 
 The MOSFET transistor conduction losses are found using the equations used in the theoretical 
evaluation presented in section ‘6.2 Current-based load sharing’. Based on this approach the 
following set of loss curves can be established. 
 
[Figure: conduction losses (axis 5 to 20) versus output current (axis 5 to 25); two families of curves, for nominal RDS(ON) and for nominal RDS(ON) + 2.9 mΩ, with an arrow indicating increasing temperature]
 
Figure 100 : Conduction losses vs. output current 
 
 Each curve in Figure 100 represents the conduction losses for a fixed temperature while varying 
the output current. This clearly shows that not only do the conduction losses increase as a function of 
output current, but also as a function of temperature. The latter fact actually has a significant impact 
on the overall MOSFET transistor losses. By relating the conduction losses at a given output current 
to the correct temperature based on heat-sink heat dissipation to the ambient the following results can 
be obtained: 
 
$$P_{Conduction,Current} = 5.13\,\mathrm{W}\ \text{and}\ 2.30\,\mathrm{W} \qquad P_{Conduction,Thermal} = 3.39\,\mathrm{W}\ \text{and}\ 2.92\,\mathrm{W}$$
 
 The process of determining the above losses and temperatures is iterative, meaning that a 
change in one variable results in a change in the other variable. For this reason the curves shown in 
Figure 100 are established by calculating a number of points that relates output current, junction and 
heat-sink temperatures, MOSFET transistor ON-resistance and total MOSFET transistor power loss. 
Using the mathematical tool ‘Mathematica’ these points are then fitted to make up the curves shown 
in Figure 100.  
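 
The mutual dependence between loss and temperature can be illustrated with a fixed-point iteration. The Python sketch below is not the Mathematica algorithm used in this work: it assumes a linear temperature coefficient for the ON-resistance and a single lumped thermal resistance to ambient, and all names and numerical values are hypothetical.

```python
def settle_operating_point(i_rms, r_on_25, alpha, r_th, t_amb=25.0,
                           tol=1e-6, max_iter=100):
    """Iterate between conduction loss and junction temperature.

    i_rms   : RMS transistor current [A]
    r_on_25 : ON-resistance at 25 degC [ohm]
    alpha   : assumed linear temperature coefficient of r_on [1/K]
    r_th    : assumed lumped thermal resistance to ambient [K/W]
    Returns (junction temperature, ON-resistance, conduction loss).
    """
    t_j = t_amb
    for _ in range(max_iter):
        r_on = r_on_25 * (1.0 + alpha * (t_j - 25.0))
        p_loss = i_rms ** 2 * r_on
        t_new = t_amb + r_th * p_loss
        if abs(t_new - t_j) < tol:
            return t_new, r_on, p_loss
        t_j = t_new
    raise RuntimeError("no convergence (possible thermal runaway)")


# Hypothetical example values, not the parameters of the test setup
t_j, r_on, p_loss = settle_operating_point(
    i_rms=10.0, r_on_25=0.010, alpha=0.005, r_th=5.0)
print(f"Tj = {t_j:.2f} degC, RDS(ON) = {r_on * 1e3:.3f} mOhm, P = {p_loss:.3f} W")
```

Repeating such a calculation for a range of output currents yields points of the kind that are fitted to form loss curves like those in Figure 100.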
 MOSFET transistor switching losses are another heat generating factor that must be included in the 
overall MOSFET transistor losses. These losses are found using the procedure provided in [Mo03], 
from which the following graphical representation can be established: 
 
[Figure: switching losses (axis 2 to 14) versus output current (axis 5 to 25) for nominal RDS(ON) and for nominal RDS(ON) + 2.9 mΩ]
 
Figure 101 : Switching losses as a function of output current 
 
 It should be noted that the temperature dependency of the switching losses found in [Ko01] has 
been interpolated and normalized to the switching losses at 25°C at an output current of 12.5 A. 
Also, since the increase in current results in higher conduction losses, the associated MOSFET 
transistor junction temperature increases. In turn, this increases the overall switching losses (as well 
as conduction losses) as a function of output current. This dependency is included in the switching 
loss curves shown in Figure 101. The overall effect of the redistribution of the converter currents 
results in the following set of losses: 
 
$$P_{Switching,Current} = 5.65\,\mathrm{W}\ \text{and}\ 4.19\,\mathrm{W} \qquad P_{Switching,Thermal} = 4.47\,\mathrm{W}\ \text{and}\ 4.30\,\mathrm{W}$$
 
 Considering the decrease in total MOSFET transistor losses, it can be seen that the conduction 
losses account for 1.12W while the switching losses contribute 1.07W to the overall system loss 
reduction. 
 
6.8.2 Freewheeling diodes 
 
 The diode losses are found using the simple equation shown below: 
 
$$P_{Diode} = V_{static} \cdot I_{Diode,avg} + R_{Dynamic} \cdot I_{Diode,RMS}^{2}$$   (6-21) 
 
where Vstatic is the forward voltage drop, RDynamic is the inverse slope of the forward current vs. 
voltage drop. The parameters IDiode,avg and IDiode,RMS denote the average and RMS diode currents 
respectively and can be found using the following two equations: 
 
$$I_{Diode,avg} = I_{Out} \cdot (1 - D)$$     D = Duty-cycle     (6-22) 
 
$$I_{Diode,RMS} = \sqrt{D \cdot \left( I_{Out}^{2} + \frac{\Delta I_{L}^{2}}{12} \right)}$$     ∆IL = Inductor ripple   (6-23) 
 
 As will become apparent, the effect of the diode losses on the overall decrease in system power 
losses is much lower than that of the MOSFET transistors. One reason for this is that the 
forward voltage drop of a typical diode decreases with increasing temperature. However, the 
forward voltage drop also depends on the forward current, increasing as the forward current 
increases. Being in close proximity to the MOSFET transistor heat-sinks, the diodes experience a 
temperature change that is a combination of internal heating, heat transfer from the MOSFET 
transistors and a change in forward current. By taking these parameters into account the following 
values can be found: 
 
$$P_{Diode,Current} = 3.46\,\mathrm{W}\ \text{and}\ 4.44\,\mathrm{W} \qquad P_{Diode,Thermal} = 3.71\,\mathrm{W}\ \text{and}\ 3.86\,\mathrm{W}$$   (6-24) 
 
 In addition to the equations (6-21), (6-22) and (6-23), the abovementioned parameters that form 
the basis for the results in (6-24) are directly deducible from the device datasheet and are therefore 
omitted from this section. The parameter in (6-21) that is affected by the forward voltage drop is 
Vstatic. The overall decrease in diode losses amounts to 0.33W. 
 
6.8.3 Current measurement resistors 
 
 The relationship between current and power loss for this component is almost linear in the current 
range of interest. Thus, the change in efficiency from redistributing the output current is negligible, 
which the following calculations will verify: 
 
The total power loss in the current measurement resistors can be found using: 
 
$$P_{Resistor} = I_{Out}^{2} \cdot R_{Resistor} \cdot \alpha$$   (6-25) 
 
where α denotes the temperature factor for copper. Inserting values for the load sharing scenarios 
gives the following results: 
 
$$P_{Resistor,Current} = 3.218\,\mathrm{W} \qquad P_{Resistor,Thermal} = 3.224\,\mathrm{W}$$   (6-26) 
 
 In (6-26) the two individual converter losses have been combined to form a resistor power loss for 
each of the two load sharing techniques. As can be seen in (6-26) the overall change in resistor 
power loss is negligible. 
  
6.8.4 Inductors 
 
 The inductors are implemented using high flux powder cores from Magnetics. Since the shape of 
this component deviates from that of the current measurement resistors the correlation between 
temperature, wire resistance and DC power loss is no longer linear in the range of operation of 
interest to the measurements. Instead, the following equation for the temperature estimation is used 
[Ma02]: 
 
$$\text{Temperature Rise (°C)} = \left( \frac{\text{Total Power Loss (mW)}}{\text{Surface Area (cm}^{2}\text{)}} \right)^{0.833}$$   (6-27) 
 
Inserting values in order to assess the copper and core losses gives the following result: 
 
$$P_{Inductor,Current} = 3.327\,\mathrm{W} \qquad P_{Inductor,Thermal} = 3.124\,\mathrm{W}$$   (6-28) 
 
 The difference between these two losses accounts for a power loss decrease of 203mW. Similar to 
the current measurement resistors, this decrease has only minor overall impact.  
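 
The temperature estimate in (6-27) is straightforward to evaluate numerically. The Python sketch below only illustrates the formula: the function name is hypothetical, and the loss and surface area in the example are placeholder values, not the data of the inductors used in the test setup.

```python
def inductor_temperature_rise(total_loss_mw: float, surface_area_cm2: float) -> float:
    """Temperature rise in degC from total loss [mW] and surface area [cm^2]."""
    return (total_loss_mw / surface_area_cm2) ** 0.833


# Placeholder example: 1500 mW dissipated over an assumed 40 cm^2 surface
rise = inductor_temperature_rise(1500.0, 40.0)
print(f"Estimated temperature rise: {rise:.1f} degC")
```

Because the exponent is less than one, the temperature rise grows more slowly than the loss itself, which is consistent with the nonlinear behaviour described above.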
 
6.8.5 Capacitors 
 
 The last components that will be considered in this section are the filter capacitors. On the 
assumption that the losses associated with these components are solely caused by the ripple current 
and the internal capacitor ESR the following total power loss can be found: 
 
 P_{Capacitor,\,Current} = 4.511\,\mathrm{W} \qquad P_{Capacitor,\,Thermal} = 4.019\,\mathrm{W}    (6-29) 
 
 The equation used for determining the abovementioned capacitor losses is the same as that used 
for determining the losses in the current measurement resistors. The thermal load sharing causes an 
overall decrease in capacitor losses of 492mW, which is mainly caused by the nonlinear change in 
capacitor ESR as a function of temperature [Po01], [Sa03].  
 
6.8.6 Combined losses 
 
 Combining all of the subtotals above, the result is a total loss reduction of 3.21W. This decrease 
in system losses results in an overall efficiency increase of: 
 
 \frac{P_{Out}}{P_{Loss,\,thermal\,LS} + P_{Out}} - \frac{P_{Out}}{P_{Loss,\,current\,LS} + P_{Out}} = 1.6\%    (6-30) 
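
The structure of (6-30) can be sketched as follows. The loss subtotals are those quoted in sections 6.8.2 to 6.8.5; the output power and the absolute loss levels are hypothetical placeholders, since the actual operating point is not restated in this section.

```python
def efficiency(p_out_w: float, p_loss_w: float) -> float:
    """Efficiency defined as output power over output power plus losses."""
    return p_out_w / (p_out_w + p_loss_w)


def efficiency_gain(p_out_w: float, loss_current_ls_w: float,
                    loss_thermal_ls_w: float) -> float:
    """Left-hand side of (6-30): thermal minus current sharing efficiency."""
    return (efficiency(p_out_w, loss_thermal_ls_w)
            - efficiency(p_out_w, loss_current_ls_w))


# Loss reductions (current sharing minus thermal load sharing), in watts,
# as quoted in sections 6.8.2 to 6.8.5.
subtotals_w = {
    "diodes": 0.33,
    "current measurement resistors": 3.218 - 3.224,
    "inductors": 3.327 - 3.124,
    "capacitors": 4.511 - 4.019,
}
quoted_reduction_w = sum(subtotals_w.values())

# Hypothetical operating point, NOT taken from the thesis.
P_OUT_W = 100.0
LOSS_CURRENT_LS_W = 25.0
LOSS_THERMAL_LS_W = LOSS_CURRENT_LS_W - 3.21  # total reduction quoted above

gain = efficiency_gain(P_OUT_W, LOSS_CURRENT_LS_W, LOSS_THERMAL_LS_W)
print(f"Subtotals of 6.8.2 to 6.8.5: {quoted_reduction_w:.3f} W")
print(f"Efficiency gain at the assumed operating point: {100 * gain:.2f} %")
```

The four subtotals listed here add up to roughly 1 W, so the remainder of the 3.21 W total must stem from the loss contributions evaluated earlier in section 6.8.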
 
 The efficiency increase shown in (6-30) is approximately 0.4% lower than that indicated in Figure 
93. However, additional losses due to changes in diode reverse recovery currents have not been 
included. Also, loss adjustments taking into account the difference in switching frequency have not 
been considered. All in all, this section has provided a theoretical system evaluation resulting in 
analytically derived data that matches the measured values presented in section ‘6.7 Experimental 
results’ relatively accurately. 
 It should be noted that the shift in power losses presented in this section is caused by a number of 
factors, including component variations, differences in ambient/operating temperature and unequal 
thermal contact between power generating components and heatsinks. For this reason the theoretical 
optimum of equal power component losses is seldom achievable in any real-world system. 
 
6.8.7 Reliability of laboratory implementation 
 
 This section briefly introduces the reliability calculations that form the basis for the previously 
mentioned annual downtime. The point of origin is the Military Handbook 217F concerning 
reliability prediction of electronic equipment and the combinatorial aspects presented in chapter ‘3.3 
Statistical distributions and methods’. Following these guidelines the following equation for the N+1 
redundant system can be established:  
 
The total number of combinations is 2² = 4, of which only 3 are valid for system success: 
 
 R_{System} = p_{1} \cdot p_{2} + p_{1} \cdot q_{2} + q_{1} \cdot p_{2}    (6-31) 
 
As described in chapter 3 the special case of (6-31) can be expressed as: 
 
 R_{System} = p^{2} + 2 \cdot p \cdot q    (6-32) 
 
 The latter equation (6-32) can be used in the thermal load sharing situation since both converters 
operate at the same temperature, thus having the same failure rate. 
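
The combinatorial argument behind (6-31) and (6-32) can be checked with a short enumeration. This is a minimal sketch; the function name and the reliability values used in the example are illustrative assumptions and not results from the thesis.

```python
from itertools import product


def redundant_reliability(p_values, min_working=1):
    """Sum the probabilities of all states with at least `min_working` units up.

    p_values : per-converter success probabilities (q = 1 - p is the failure
               probability), assumed statistically independent.
    """
    total = 0.0
    for states in product((True, False), repeat=len(p_values)):
        if sum(states) >= min_working:
            prob = 1.0
            for works, p in zip(states, p_values):
                prob *= p if works else (1.0 - p)
            total += prob
    return total


# Illustrative values only.
p1, p2 = 0.9, 0.8
r_unequal = redundant_reliability([p1, p2])   # general case, as in (6-31)

p = 0.9
q = 1.0 - p
r_equal = redundant_reliability([p, p])       # special case, as in (6-32)
print(r_unequal, r_equal, p**2 + 2 * p * q)
```

For two converters the enumeration visits the 2² = 4 states and discards only the state in which both units have failed, so the result also equals 1 - q1·q2.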
 Having established a theoretical foundation for the reliability assessment the next information 
needed is the temperatures of the individual components. This is a rather complicated task, for which 
reason the simplified thermal model shown in Figure 102 is used in all reliability calculations. This 
model shows the components in close proximity to the MOSFET transistors. Although additional 
components for the load sharing controller, the OP-amp etc. are present, these are assumed to be 
operating at ambient temperature and are not affected by the change in heat-sink temperature. 
 
[Figure: block diagram of three temperature zones, TSurface, TSurface - 10°C and TSurface - 30°C, each populated with groups of resistors, MOSFETs, diodes, IC's, inductors and capacitors]
Figure 102 : Simplified temperature distribution 
 
 It can be seen that the temperature distribution shown in Figure 102 resembles that used in chapter 
‘5 Digital control of DC-DC converters’. This is due to the fact that most converter designs from a 
layout point of view are similar in process. The factors contributing to this layout ‘standard’ are 
described in chapter ‘5 Digital control of DC-DC converters’ and will therefore not be included in 
this section.  
 Based on the temperature distribution in Figure 102, an average annual downtime of 10 minutes 
and 14 seconds is established (see Figure 76) for the current sharing technique. This number takes 
into account the redundancy concepts built into the power system. In other words, the calculations 
indicate the probability of at least one working converter. 
 Reliability calculations for the thermal load sharing technique reveal that due to the redistribution 
of converter currents, an annual downtime of 6 minutes and 11 seconds can be achieved. Compared 
to the results depicted in Figure 76 this is a reduction of almost 40%. This is a significant reduction 
obtained by simply replacing the current sharing information with thermal information from the 
MOSFET transistor.  
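
The quoted reduction can be verified with simple arithmetic. The sketch below converts the two downtime figures to seconds and also expresses them as unavailability; the use of a 365-day year is an assumption made here for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a downtime given as minutes and seconds to seconds."""
    return 60 * minutes + seconds


downtime_current_s = to_seconds(10, 14)   # current sharing technique
downtime_thermal_s = to_seconds(6, 11)    # thermal load sharing technique

reduction = (downtime_current_s - downtime_thermal_s) / downtime_current_s
unavailability_current = downtime_current_s / SECONDS_PER_YEAR
unavailability_thermal = downtime_thermal_s / SECONDS_PER_YEAR

print(f"Downtime reduction: {100 * reduction:.1f} %")
print(f"Unavailability, current sharing: {unavailability_current:.2e}")
print(f"Unavailability, thermal sharing: {unavailability_thermal:.2e}")
```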
 A final verification of the thermal load sharing technique’s advantages over the traditional current 
sharing technique is shown in Figure 103. 
 
[Figure: average system temperature (axis 0 to 80) versus output current (axis 0 to 30) for the current sharing technique and the thermal load sharing technique]
Figure 103 : Average system temperature 
 
 Figure 103 shows the average system temperature as a function of output current. It can be seen 
that the system operated by the thermal load sharing is at a consistently lower temperature than its 
current sharing counterpart. At the extreme ends of the operating range, the temperature difference 
between the two techniques is only 1°C while the difference throughout the normal operating range 
is as high as 3.3°C. This may not seem that impressive, but it should be remembered that the 
temperatures depicted in Figure 103 are average temperatures – meaning that the individual 
converter temperatures in the current sharing implementation still vary by as much as 15°C. A 
temperature difference of this magnitude lowers the converter reliability of the hotter converter 
considerably. 
 
 
6.9 Discussion and summary 
 
 The advantages of using temperature as a control parameter are quite clear. An equal temperature 
distribution among, for example the converter’s switching MOSFET transistors, actually lowers the 
overall system temperature and hereby decreases the unavailability considerably. The first part of 
this chapter introduced the new concept and provided a theoretical evaluation of a power system 
comprised of three individual converters. The type of loss that formed the basis for this evaluation 
was the MOSFET transistor conduction losses. As the last part of this chapter showed, this type of 
loss only provides part of the explanation for the improved performance.  
 The experimental results of a laboratory realization of the new thermal load sharing technique 
showed that using the thermal load sharing technique not only increases the overall system reliability 
as calculated in [Ne03], but also has a positive impact on the system efficiency. The increase in 
efficiency is achieved by redistributing the current supplied by each converter to obtain equal 
thermal conditions as opposed to the current sharing technique’s intent to establish equal currents.  
 It should be noted that further efficiency improvements are achievable if the current measurement 
resistors, not used by the thermal load sharing, are removed. However, for comparison purposes it 
was chosen to leave them in the circuit along with the OP-amps and the associated small-signal 
components. 
 Another less obvious advantage of the temperature control is the system’s ability to route power 
through converter boards mounted in cooler environments and thereby optimize the working 
conditions for converter boards positioned, for example, in between two adjoining converter boards 
giving off heat.   
 Having mentioned several advantages of the thermal load sharing it should also be mentioned that 
the individual converter base failure rate might increase slightly as a result of the added temperature 
sensing device. Furthermore, parameters such as cost and overall power system size should be taken 
into account before it is decided whether or not to use the thermal load sharing technique in a given 
application.  
 A summary of the advantages and drawbacks of the three different load sharing techniques, 
discussed in this chapter, can be found in the following table. 
 
Load sharing technique: Current sharing 
  Advantages: 
    • Approximately equal converter currents during startup and transients 
    • Relatively simple implementation 
    • Lowest possible base failure rate 
  Drawbacks: 
    • Unequal power losses 
    • Low reliability 
    • Low efficiency 
    • Needs approximately equal operating temperatures 
    • Differential sensing required in non-isolated systems 
    • Control circuitry needs dual power supplies unless the more expensive controllers are utilized. These are often internally corrected and/or bootstrapped. 

Load sharing technique: Thermal load sharing 
  Advantages: 
    • Improved reliability 
    • Improved efficiency 
    • Capable of different operating temperatures 
    • Lowest average system temperature 
    • Elimination of current sensing resistor – will increase overall efficiency 
  Drawbacks: 
    • Extra components 
    • Slow response during startup and transients 
    • Slightly higher base failure rate 

Load sharing technique: Current sharing combined with thermal load sharing 
  Advantages: 
    • Approximately equal converter currents during startup and transients 
    • Improved reliability 
    • Capable of different operating temperatures 
    • Dynamic power throughput capability 
    • Medium efficiency 
  Drawbacks: 
    • The most complex implementation of the three techniques 
    • Highest base failure rate 
    • Differential sensing required in non-isolated systems 
    • Control circuitry needs dual power supplies unless the more expensive controllers are utilized. These are often internally corrected and/or bootstrapped. 

Table 13 : Summary of advantages and drawbacks of the three load sharing techniques 
 
 Towards the end of the work presented in this chapter (November 2002) Texas Instruments 
started to publish failure rates for their IC’s on the Internet. The following table lists the failure 
rates and key features of several commonly used IC’s: 
 
Part type   Function                                        Pin no.   Failure rate (FIT)   MTBF (hours)   Temperature (°C) 
UCC3800     Low-Power BiCMOS Current-Mode PWM               8         3.7                  2.703⋅10^8     0 - 70 
UCC35702    Advanced Voltage Mode Pulse Width Modulator     14        3.7                  2.703⋅10^8     0 - 70 
UC3825      High Speed PWM Controller                       16        4.1                  2.439⋅10^8     0 - 70 
UC3843      Current Mode PWM Controller                     8         4.1                  2.439⋅10^8     0 - 70 
UC3902      Load Share Controller                           8         4.1                  2.439⋅10^8     0 - 70 
UC3907      Load Share Controller                           16        4.1                  2.439⋅10^8     0 - 70 
UCC39002    Advanced Loadshare Controller                   8         1.1                  9.498⋅10^8     0 - 55 
Table 14 : Reliability data for Texas Instruments controller IC’s 
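
The failure rate and MTBF columns of Table 14 are related by a simple reciprocal. The sketch below assumes that the failure rates are stated in FIT (failures per 10^9 device hours) and the MTBF in hours; this interpretation is inferred from the numbers in the table and is not stated explicitly by the manufacturer data quoted here.

```python
HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device hours


def mtbf_hours(failure_rate_fit: float) -> float:
    """Convert a constant failure rate in FIT to an MTBF in hours."""
    if failure_rate_fit <= 0:
        raise ValueError("failure rate must be positive")
    return HOURS_PER_FIT_UNIT / failure_rate_fit


def failure_rate_fit(mtbf_h: float) -> float:
    """Convert an MTBF in hours back to a failure rate in FIT."""
    if mtbf_h <= 0:
        raise ValueError("MTBF must be positive")
    return HOURS_PER_FIT_UNIT / mtbf_h


for fit in (3.7, 4.1):
    print(f"{fit} FIT corresponds to an MTBF of {mtbf_hours(fit):.3e} hours")

# The UCC39002 entry: the listed MTBF corresponds to slightly more than 1 FIT,
# which is consistent with the rounded failure rate shown in the table.
print(f"{failure_rate_fit(9.498e8):.2f} FIT")
```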
 
 Considering the failure rates provided in Table 14 it seems as if Texas Instruments generate the 
part failure rates based on manufacturing process alone. In the Military Handbook 217F a large 
number of parameters are considered when generating failure rates, including failure rates for IC’s. 
Under that method, an advanced PWM controller (such as the UCC35702) would have a much higher failure rate than 
the rather old and well-tested UC3843. To clarify this matter Texas Instruments’ power support was 
contacted. Having talked to Mr. Ed Walker from ‘TI Power Support’ it became clear that Texas 
Instruments indeed generate their failure rates based on production process alone. Mr. Walker 
acknowledged that the failure rates did not apply in military or space applications. The only 
application where the provided data should be used is in consumer electronics operating within the 
indicated temperature range.  
 Comparing the data provided by Texas Instruments with, for example, the failure rate for a simple 
resistor calculated by means of the Military Handbook 217F (see section ‘3.3.2 Part failure rate’), it 
becomes clear that they have almost identical failure rates. According to the Military Handbook, a 
resistor operated at 70°C with a derating of 75% has a failure rate of 3.52 FIT. 
 If the data provided by Texas Instruments had been applicable in the work presented in 
this chapter, the overall system reliability could have been optimized differently. For instance, the 
indicated failure rates for the PWM controllers are much lower than the failure rates of the rather 
simple thermal monitoring devices used. This points to the fact that converter layout in the thermal 
load sharing could have been optimized according to the electrical constraints instead of settling for a 
compromise between the thermal issues and the electrical constraints.  
 
 
6.10 Patent rights? 
 
 The results presented in this chapter lead to an examination of patent rights. An external patent 
bureau was hired to perform the prior art search and conclude whether or not a patent could be 
obtained. The conclusion of their report was that a patent indeed was obtainable. Since Alcatel Space 
Denmark has been a part of this research from the very beginning, they were consulted about the 
patent rights. After several meetings, it was decided to proceed with the patent as a joint patent 
between Alcatel Space Denmark and the Technical University of Denmark. The process continued 
for several months before internal changes within the Alcatel organization changed the initial joint 
patent partnership. Due to these changes Alcatel Space Denmark could no longer participate in the 
patent process. Since the Technical University of Denmark is a research institution and has no 
immediate interest in patent rights without having industrial partners, it was decided to stop the 
patent process completely. Another contributing factor to the termination of the patent process was 
the deadline for submission of papers and the final Ph.D. thesis that approached rapidly. Since the 
thermal load sharing is a vital part of the research work it could not be excluded from this thesis due 
to a delayed patent process.  
 
 
6.11 References 
 
[Mi01]  Reliability Prediction of Electronic Equipment, Military Handbook 217 
 
[Ne03]  Optimized load sharing control by means of thermal reliability management,  
 Carsten Nesgaard and Michael A. E. Andersen, 
 Submitted for review at Power Electronics Specialists Conference 2004, Aachen, 
Germany 
 
[Mo03]  Power Electronics, Second Edition, Ned Mohan, Tore M. Undeland and William 
P.Robbins, John Wiley & Sons Inc., ISBN 0-471-58408-8 
 
[Ko01]  Reliability challenges due to excess stress under high frequency switching of power 
devices, Professor Johann W. Kolar, ETH, Zürich, Seminar August 2002 
 
[Us01]  Paralleled DC power supplies sharing loads equally, US patent 4,635,178 
 
[Us02]  System and method of load sharing control for automobile, US patent 5,157,610 
 
[Us03]  Current share circuit for DC to DC converters, US patent 5,521,809 
 
[Ti01]  Texas Instruments Application note U-129,  
 UC3907 Load Share IC Simplifies Parallel Power Supply Design 
 
[Ma02]  High Flux Powder Cores, Magnetics core data, http://www.mag-inc.com/ 
 
[Ne04]  Efficiency improvement in redundant power systems by means of thermal load 
sharing, Carsten Nesgaard and Michael A. E. Andersen,  
 Applied Power electronics Conference and Exposition 2004, Anaheim, USA 
 
[Wa02] Texas Instruments Power Support,  
 Mr. Ed Walker, support@ti.com, phone: 972-644-5580 
 
[Po01] Simple ESR Meter for Electrolytics, Ray Porter  
 TELEVISION Servicing Magazine January and April 1993  
 
[Sa03] Electrolytic Capacitor Life Testing and Prediction, V. A. Sankaran and C.S. Avant, 
Ford Research Laboratory, IEEE Industry Applications Society Annual Meeting, New 
Orleans, Louisiana, October 5-9, 1997  
 
 
 
 
7. Thermal droop load sharing  Page 125 
 
7 Thermal droop load sharing 
 
 This chapter describes an alternative redundancy technique developed while working at the 
Precision Docking Project managed by Partners for Advanced Transit and Highways (PATH). A 
more detailed description of the Precision Docking Project and PATH can be found in chapter 8. The 
load sharing technique presented in this chapter originates from the traditional series resistor droop 
technique. By replacing the resistive droop element with a temperature dependent component, the 
power system load sharing is based on temperature information, which increases the overall 
reliability and efficiency significantly. 
 
 
7.1 Introduction 
 
 Keeping the design of high-current power supplies at a low cost, low circuit complexity and 
relatively high efficiency is a challenge that continues to feed the search for alternative 
implementation methods. One such implementation method often used in non-critical high-current 
power systems is the parallel-connection of several identical converters each capable of supplying 
part of the load current. This approach allows for relatively easy design with little circuit complexity 
beyond that of the individual converters.  
 In order to control the current flow through each converter in this type of configuration, some 
form of load sharing must be applied. In the previous chapter the use of dedicated load share 
controller IC’s for mission critical systems was considered. The added cost and circuit complexity 
resulting from this implementation is justified by the requirements of high system availability. The 
power system for the precision docking project is also a mission critical application. However, the 
volume of power supplies needed in a full scale implementation does not allow for custom designs. 
Also, due to operational constraints such as repair and power system synthesis, a fast and easy access 
to spare parts is required. It is therefore decided to proceed with alternative solutions to these 
apparently contradictory requirements.  
 The technique considered in this chapter is the droop load sharing technique and its reliability 
optimized counterpart, the thermal droop load sharing. The former technique ensures an approximate 
load sharing among the individual converters by distributing an equal current throughout the system 
while the latter technique uses temperature information to adjust the individual converter currents, 
hence intentionally creating an unequal load sharing. The remainder of this section is dedicated to the 
introduction of the basic droop load sharing technique followed by a description of the circuit 
modifications needed to transition to the thermal droop load sharing technique. 
 The basic droop load sharing technique relies on the output characteristics of each converter. In 
the ideal situation a converter provides a constant output voltage over the entire operating range. In 
real-world systems this ideal situation seldom holds and the output voltage will slightly decrease as 
the load current increases. Whether or not this correlation between output voltage and output current 
is sufficient to ensure adequate accuracy for load sharing purposes must be considered in each 
power system design. However, in the typical case, the converter control provides a very tight 
regulation and the voltage drop from no load to full load can be in the order of a few millivolts. One 
technique for ensuring equal load sharing is the modification of the feedback loop to exhibit a lower 
gain within regulation. This technique enhances the effects of the correlation between output voltage 
and output current, thus enabling multiple converters to share a common load. Intentionally lowering 
the gain of a well-functioning system unfortunately changes the dynamic properties of the system 
and as a result the overall dynamic response to, for example, load transients will generally be 
significantly deteriorated. If this degradation of the dynamic properties is unacceptable a different 
technique can be utilized. This technique, which will be used in the power system in this chapter, 
introduces a resistor in series with the converter output, thereby ensuring adequate 
voltage drop for the load sharing implementation. Although the latter technique preserves the high 
bandwidth of the converter control, it introduces additional system losses. Thus, the design is a 
trade-off between efficiency, dynamic response and load sharing accuracy. 
 Following the guidelines provided above, the basic droop load sharing technique comprised of a 
set of converters all connected to the same load by means of small series resistors, can be 
theoretically illustrated as shown in Figure 104. 
 
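[Figure: power supply 1 (V1) and power supply 2 (V2) feed the common load RLOAD at voltage VOUT through the series resistors R1 and R2, which carry the currents I1 and I2]
 
Figure 104 : Simple droop load sharing 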
 
 The correlation between load voltage and individual converter output voltages is given by: 
 
 V_{OUT} = V_{1} - I_{1} \cdot R_{1} = V_{2} - I_{2} \cdot R_{2}   (7-1) 
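
To illustrate how (7-1) determines the current distribution, the following minimal sketch solves the node equation for a set of ideal voltage sources connected to a resistive load through series droop resistors. All names and numerical values are hypothetical; OR'ing diodes and converter current limits are deliberately ignored.

```python
def droop_currents(v_set, r_droop, r_load):
    """Solve V_OUT = V_k - I_k * R_k for all converters feeding one load.

    v_set   : no-load output voltage of each converter in volts
    r_droop : series (droop) resistance of each converter in ohms
    r_load  : load resistance in ohms
    Returns the load voltage and the list of converter currents.
    """
    # Kirchhoff's current law at the load node:
    # sum((V_k - V_OUT) / R_k) = V_OUT / R_load
    conductance_sum = 1.0 / r_load + sum(1.0 / r for r in r_droop)
    v_out = sum(v / r for v, r in zip(v_set, r_droop)) / conductance_sum
    currents = [(v - v_out) / r for v, r in zip(v_set, r_droop)]
    return v_out, currents


# Hypothetical example: a 20 mV set-point mismatch between two converters.
v_out, (i1, i2) = droop_currents(v_set=[5.02, 5.00],
                                 r_droop=[0.05, 0.05],
                                 r_load=0.5)
print(f"V_OUT = {v_out:.3f} V, I1 = {i1:.2f} A, I2 = {i2:.2f} A")
```

With equal droop resistors the current difference equals the set-point mismatch divided by the droop resistance, which shows why a larger droop resistance improves the sharing accuracy at the cost of additional loss.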
 
 For comparison purposes, the traditional droop load sharing is initially implemented with the 
intention of achieving a load sharing as close to the ideal as possible. Following this implementation, 
a set of measurements is taken to verify the design and to serve as a reference for the thermal load 
sharing technique. In section ‘7.2 Specifications for the laboratory implementation’ an analysis of the 
differences between the two load sharing techniques is provided, followed by a verbal discussion of 
the pros and cons. A graphical illustration of the ideal droop load sharing is depicted in Figure 105.  
 
[Figure: output characteristics, VOUT versus IOUT, showing a linear droop centered on the nominal voltage between the upper and lower voltage limits]
 
Figure 105 : Ideal droop output voltage  
 
 The linear correlation between output voltage and output current is shown in Figure 105. The 
upper and lower voltage limits define the window within which the converter would have to operate in order to 
comply with the power system specifications. It is seen that the droop voltage is perfectly centered in 
between these limits. Achieving load sharing by means of the series resistor technique does not allow 
for this centering of the droop voltage, unless the output voltage of the converter can be slightly 
increased. On the assumption that the simple series resistor technique is used in a system that allows 
for a slight increase in output voltage it is worth noting that such a system actually exhibits excellent 
response to load steps. At heavy or full load the output voltage is lower than the nominal voltage, 
meaning that during a load step, from for example 100% load to 50% load, the voltage spike that 
results remains within the specified limits due to the extra margin (indicated with vertical arrows). 
The same situation holds at light loads where the voltage spike would have a negative amplitude, 
thus actually being a very short voltage drop.  
 All in all, the system depicted in Figure 104 provides load sharing, although with regulation and 
current sharing inferior to those of the dedicated load sharing controller presented in chapter 6.  
 
 
7.2 Specifications for the laboratory implementation 
 
 Having described the differences between the two load sharing techniques under consideration, 
this section provides the theoretical background for the proposed load sharing technique as well as 
the data forming the basis for the laboratory setup specifications. 
 Since the droop load sharing technique in the basic form only adds a single external resistor to the 
system it is obvious that the technique is very suitable for parallel-connection of already existing 
converters. As briefly mentioned in the previous section, the power system described in this chapter 
is intended for implementation in automated buses. Designing the power system by means of 
off-the-shelf parts facilitates a short design time and easy repair. Furthermore, system parameters such as 
cost, complexity and power system volume are easily improved by utilizing commercially available 
units. 
 It was therefore decided to implement the load sharing technique in a system comprised of 3 
commercially available converters. The converters are stand-alone hybrid modules – meaning that 
they are not intended for parallel-operation. Using stand-alone converters simplifies the internal 
design of the converters and minimizes the parts count, which in turn increases their reliability. The 
converters were sponsored by CALEX Mfg. Co., Inc. located in Concord, California. The main 
specifications for the converters are: 
 
 Output voltage : 5 V 
 Max output current : 15 A 
 Input voltage range : 36 V – 75 V 
 
 A detailed datasheet can be found on the accompanying CD in the folder ‘Datasheets’. Also 
included on the CD are several application notes that describe the use of CALEX converters and 
their features. 
 
Based on the bus power system specifications, the following set of requirements was established: 
 
 Output voltage : 5 V ± 5% 
 Max load current  : 20 A 
 Voltage droop : 300 mV 
 
A top and bottom view of the CALEX converters is shown in Figure 106 along with their pin-out.  
 
                  
[Figure: (a) top and bottom view of the converter; (b) pin-out with the pins -INPUT, ON/OFF, +INPUT, -OUTPUT, -SENSE, TRIM, +OUTPUT and +SENSE]
 
Figure 106 : Off-the-shelf converter from CALEX (a) and pin-out configuration (b) 
 
 To accommodate the abovementioned requirements, it was decided to implement the power 
system by means of 3 parallel-connected converters. This allows one unit to fail while still providing 
service to the load. Ideally, each converter would then supply a maximum current of approximately 7 
A during normal operation and 10 A in case of a single point failure, thus leaving plenty of margin 
for the uncertainties associated with the droop load sharing. Furthermore, operating the converters at 
a lower current level minimizes the internal temperature rise, which contributes to an increase in 
overall reliability. 
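
The current levels quoted above follow from dividing the maximum load current equally among the working converters. The sketch below assumes ideal, equal sharing, which the droop technique only approximates.

```python
MAX_LOAD_CURRENT_A = 20.0      # from the power system requirements
CONVERTER_LIMIT_A = 15.0       # maximum output current of one converter
NUMBER_OF_CONVERTERS = 3


def per_converter_current(load_current_a: float, units_working: int) -> float:
    """Current per converter under ideal, equal load sharing."""
    if units_working <= 0:
        raise ValueError("at least one converter must be working")
    return load_current_a / units_working


normal = per_converter_current(MAX_LOAD_CURRENT_A, NUMBER_OF_CONVERTERS)
degraded = per_converter_current(MAX_LOAD_CURRENT_A, NUMBER_OF_CONVERTERS - 1)

print(f"Normal operation:  {normal:.2f} A per converter")
print(f"One unit failed:   {degraded:.2f} A per converter")
print(f"Margin to limit after one failure: {CONVERTER_LIMIT_A - degraded:.2f} A")
```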
 Since the converters are encapsulated, the technique proposed in [Ne05] cannot be implemented 
directly. Furthermore, had it been the intention to implement the traditional droop technique by 
means of error amplifier gain modification, this too would have been impossible since access to the 
compensation components would require removal of the converter housing. Due to the fact that there 
is no direct access to the internal parts of the converters a different approach has to be used. 
According to the manufacturer’s datasheet, each converter is fitted with a trim pin that allows for a 
±10% alteration of the output voltage. In terms of achieving accurate droop load sharing, this is 
more than adequate since the technique usually limits the voltage droop to a few tenths of a volt. 
Indeed, this power system seeks to create a droop voltage of 300mV in the range from no load to full 
load, leaving room for additional voltage variations at the output before the ±5% regulation limit is 
reached.  
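
The remaining margin can be estimated as follows. The sketch assumes that the 300 mV droop is centered on the nominal output voltage, as in the ideal characteristic of Figure 105; the function and key names are illustrative.

```python
def droop_window(v_nominal: float, tolerance: float, droop: float) -> dict:
    """Return regulation limits and margins for a centered droop characteristic.

    v_nominal : nominal output voltage in volts
    tolerance : regulation limit as a fraction (0.05 for +/- 5 %)
    droop     : total voltage droop from no load to full load in volts
    """
    upper_limit = v_nominal * (1.0 + tolerance)
    lower_limit = v_nominal * (1.0 - tolerance)
    v_no_load = v_nominal + droop / 2.0     # assumption: droop is centered
    v_full_load = v_nominal - droop / 2.0
    return {
        "upper_limit": upper_limit,
        "lower_limit": lower_limit,
        "v_no_load": v_no_load,
        "v_full_load": v_full_load,
        "upper_margin": upper_limit - v_no_load,
        "lower_margin": v_full_load - lower_limit,
    }


window = droop_window(v_nominal=5.0, tolerance=0.05, droop=0.300)
for name, value in window.items():
    print(f"{name:>13}: {value:.3f} V")
```

Under this assumption roughly 100 mV remains on either side of the droop characteristic for set-point tolerances and load transients.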
 Initiating the thermal droop load sharing analysis with the very simple feedback network shown in 
Figure 107, it becomes clear that the ratio of thermistor resistance (RT) to series resistance (RS) over the intended 
operating range results in a voltage droop that would exceed the specified regulation limits (see 
Figure 108). In fact, the converter output voltage would be at a constant minimum at an ambient 
temperature of 40°C, thus being outside the regulation boundaries (the shaded areas in Figure 109). 
 
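[Figure: feedback network connecting VOUT to the TRIM pin, comprising the thermistor RT, the series resistor RS and the capacitor CT]
 
Figure 107 : Basic feedback network  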
 
 With reference to Figure 107, the individual components are briefly described. RT is a thermistor 
with a room temperature resistance (R25) of 5kΩ and a material constant (β) of 3950, RS is a series 
resistor of 4.9 kΩ used to increase the output voltage at low current levels and CT is a 
stabilizing/noise reducing capacitor of 1nF. With the implementation shown in Figure 107, the 
correlation between temperature and feedback voltage will in general be very non-linear due to the 
characteristic of the thermistor RT. However, as can be seen in Figure 109, the part of the output 
voltage that actually exhibits the droop slope is very close to linear. Nonetheless, the feedback 
network has to be modified in order to achieve regulation and the intended 300mV droop voltage. 
 
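[Figure: feedback voltage VFB (axis 1.5 to 4.5) versus temperature (axis 20 to 100) with TAmbient = 40°C marked, for VOUT = 5 V, β = 3950, R25 = 5.0 kΩ and RS = 4.9 kΩ, plotted from V_{FB} = V_{OUT} \cdot \frac{R_{S}}{R_{S} + R_{25} \cdot e^{\beta \left( \frac{1}{273 + T} - \frac{1}{298} \right)}}]
 
Figure 108 : Feedback voltage  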
 
 The equation relating temperature and thermistor resistance is a commonly used empirical 
approximation for NTC thermistors and is given by: 
 
 
 R_{T} = R_{25} \cdot e^{\beta \left( \frac{1}{T + 273} - \frac{1}{298} \right)}   (7-2) 
 
 It can be found in most thermistor manufacturer’s datasheets and provides a very close estimate of 
thermistor behavior over temperature. The thermistors used in this power system design are 
manufactured by Ametherm located in Carson City, Nevada. 
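
A minimal numerical sketch of (7-2), using the component values stated for the basic feedback network (R25 = 5 kΩ, β = 3950, RS = 4.9 kΩ, VOUT = 5 V). The divider expression is the one annotated in Figure 108; the function names are illustrative, and the temperature is taken in °C with the rounded offsets 273 and 298 used in the equation.

```python
import math


def thermistor_resistance(temp_c: float, r25: float = 5000.0,
                          beta: float = 3950.0) -> float:
    """NTC thermistor resistance in ohms according to (7-2)."""
    return r25 * math.exp(beta * (1.0 / (temp_c + 273.0) - 1.0 / 298.0))


def basic_feedback_voltage(temp_c: float, v_out: float = 5.0,
                           r_s: float = 4900.0) -> float:
    """Feedback voltage of the basic network, as annotated in Figure 108."""
    r_t = thermistor_resistance(temp_c)
    return v_out * r_s / (r_s + r_t)


for temp in (25, 40, 60, 80, 100):
    print(f"T = {temp:3d} C: R_T = {thermistor_resistance(temp):7.0f} ohm, "
          f"V_FB = {basic_feedback_voltage(temp):.2f} V")
```

The exponential drop in thermistor resistance makes the feedback voltage rise steeply with temperature, which is the non-linearity that the modified network is intended to reduce.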
  
[Figure: output voltage VOUT (axis 1.0 to 6.0) versus temperature (axis 20 to 100), with the droop slope and TAmbient = 40°C marked]
Figure 109 : Output voltage resulting from the feedback voltage shown in Figure 108  
 
 The approach taken in modifying the feedback network comes from a common linearization 
technique applicable to non-linear elements. The modified feedback network can be seen in Figure 
110 while the associated output voltage droop can be seen in Figure 111. An analytic expression 
for the system feedback voltage as a function of thermistor temperature is:  
 
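 V_{Feedback} = V_{OUT} \cdot \frac{R_{F2}}{R_{F2} + \dfrac{R_{F1} \cdot \left( R_{25} \cdot e^{\beta \left( \frac{1}{273 + T} - \frac{1}{298} \right)} + R_{S} \right)}{R_{25} \cdot e^{\beta \left( \frac{1}{273 + T} - \frac{1}{298} \right)} + R_{S} + R_{F1}}}   (7-3) 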
 
 The modified feedback network is often referred to as a resistance mode linearization circuit. It 
connects the thermistor and its series resistance (RS) in parallel with the top resistor of the feedback 
network to create a negative temperature coefficient output voltage as required by the basic droop 
technique. 
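
The behavior of the linearization circuit can be sketched numerically. The code below evaluates the ratio of feedback voltage to output voltage for a divider whose top resistor RF1 is paralleled by the branch RS + RT, using the component values given for the modified network (RF1 = 13 kΩ, RF2 = 3.9 kΩ, RS = 4.3 kΩ, R25 = 5.0 kΩ, β = 3950). The function names are illustrative, and how the TRIM pin translates this ratio into an output voltage is converter specific and is not modeled.

```python
import math


def thermistor_resistance(temp_c: float, r25: float = 5000.0,
                          beta: float = 3950.0) -> float:
    """NTC thermistor resistance in ohms (empirical beta model)."""
    return r25 * math.exp(beta * (1.0 / (273.0 + temp_c) - 1.0 / 298.0))


def feedback_ratio(temp_c: float, r_f1: float = 13e3, r_f2: float = 3.9e3,
                   r_s: float = 4.3e3) -> float:
    """V_Feedback / V_OUT for the resistance mode linearization circuit."""
    branch = r_s + thermistor_resistance(temp_c)   # R_S in series with R_T
    r_top = r_f1 * branch / (r_f1 + branch)        # branch in parallel with R_F1
    return r_f2 / (r_f2 + r_top)


for temp in range(40, 161, 20):
    print(f"T = {temp:3d} C: V_Feedback / V_OUT = {feedback_ratio(temp):.4f}")
```

Because RS limits how low the thermistor branch can pull the top resistance, the ratio stays between RF2/(RF2 + RF1) and RF2/(RF2 + RF1‖RS), which compresses the thermistor's exponential characteristic into a narrow range.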
 
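[Figure: feedback divider RF1 and RF2 between VOUT and the TRIM pin, with the thermistor RT, the series resistor RS and the capacitor CT added; RF1 = 13 kΩ, RF2 = 3.9 kΩ, RT,25C = 5.0 kΩ, RS = 4.3 kΩ, CT = 1.0 nF, β = 3950]
 
Figure 110 : Modified feedback network  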
 
 Inserting component values into a mathematical software program the waveform generated by the 
thermistor can be graphically illustrated as a function of temperature as shown in Figure 111.  
 
[Figure: ideal droop output voltage and thermal droop output voltage versus temperature (axis 40 to 160), VOUT axis marked at 4.85, 5.00 (nominal output voltage) and 5.15, with TAmbient = 40°C marked]
 
Figure 111 : Output characteristic of modified feedback network  
 
 The dotted line in Figure 111 is the ideal droop output voltage for the system under consideration, 
while the solid line illustrates the thermal droop output voltage. When the ordinate axis is zoomed 
around the nominal output voltage, it becomes clear that the output voltage droop will exhibit very 
non-linear characteristics in the final design. 
 Focusing on the laboratory implementation, the modified feedback network was implemented and 
mounted at the appropriate converter pins. The thermistors were fitted onto the base-plates with 
small metal brackets. In order to ensure that the applied pressure was approximately equal among all 
converters the metal brackets were fitted using a torque wrench. Figure 112 shows the real-world 
design before the thermistors were mounted on the baseplates. To facilitate easy access to the 
different components as well as the converter pins the entire power system was mounted on a single 
piece of PCB by means of 1 inch spacers.  
 
 
Figure 112 : Laboratory test configuration  
 
 As already mentioned, it was the intention to implement a single point failure free system by 
using OR’ing diodes. These diodes also serve to prevent one converter from charging the output 
capacitor of another converter, thus eliminating the inherent problem of parallel-connecting multiple 
units. This issue was discussed in more detail in section ‘6.6.1 Prototype limitations during startup’ 
where it became clear that the use of OR’ing diodes is almost a necessity. However, the problem in 
this configuration is less critical since the output capacitors of a nonoperating converter would be 
charged through the droop resistor in series with the output. 
 A closer look at the power system is provided in Figure 113 where the OR’ing diode, droop 
resistor, thermistor and feedback circuitry for converter 1 can be identified. The feedback circuit is 
realized using standard E48 2% resistors while the OR’ing diode is a 10TQ045 from International 
Rectifier. The thermistor has a tolerance of 2%. As will be shown in the next section, these
tolerances cause small differences in the converter feedback voltages and thereby create voltage
off-sets on top of those already inherent in each converter.  
 
 
Figure 113 : Thermal load sharing circuit, droop resistor and OR’ing diode 
 
 This completes the implementation process, but before the experimental results are provided the 
overall power system stability is briefly considered. The thermal droop load sharing technique 
dynamically changes the feedback voltage whereby a similar change in output voltage is achieved. In 
other words, the ratio of feedback voltage to output voltage remains constant. From a control theory 
point of view this means that nothing has changed and the converter loop remains stable. This is 
illustrated in Figure 114 with the block K(T) defining a temperature dependent constant that is
continuously added to the measured output voltage to form the combined feedback signal. Therefore, 
as long as the initial converter modules remain stable and unaffected by component temperature drift 
the system will remain stable. It has not been possible to obtain gain/phase plots of the converter
modules. Such plots would have made it possible to verify that the system had enough phase margin at
the cross-over frequency to allow for unforeseen effects of the added feedback network.  
 
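[Figure 114: block diagram of the converter (VIN to VOUT) in which the output of the block K(T) is added to the measured VOUT to form the feedback signal; accompanying waveforms of K(T) and VOUT (relative to VOUT,nom) versus temperature T.]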
 
Figure 114 : Simplified converter feedback and resulting waveforms 
 
 
7.3 Experimental results 
 
 Figure 115 provides an illustration of the initial test setup. To account for the voltage drop across 
the OR’ing diode in series with each output, the remote sensing pin is connected as shown in the 
figure. Otherwise, this voltage drop would have to be included in the calculations. Since the voltage
drop across a diode varies very little as a function of the current that passes through it, this particular
component cannot be used as the system’s droop element.  
 In compliance with the specifications, the total load current is limited to 20A, whereby 
overstressing of the converters in case of malfunction is avoided.  
 
[Figure 115: three converters (Converter 1, 2 and 3) supplied from a common VIn, each with the pins -INPUT, ON/OFF, +INPUT, -OUTPUT, -SENSE, TRIM, +OUTPUT and +SENSE; each output is connected to the common load through its droop resistor (RDroop-1 to RDroop-3) and isolation diode (DIsolation-1 to DIsolation-3).]
 
Figure 115 : Test setup  
 
 The first set of measurements is completed for the traditional droop load sharing implemented 
with a 60 mΩ resistor in series with each converter output. 

[Figure 116: voltage droop (V), 4.0 to 5.0, versus individual converter current (A), 0 to 10; curves for Converter 1, Converter 2 and Converter 3.]
Figure 116 : Individual converter voltage droop vs. output current  
 
 From Figure 116 it can be seen that very small set point differences exist between the individual 
converters. As the individual converter currents increase, the voltage drop across the wires and droop
resistors increases. At no load, converter 1 and converter 3 have approximately equal set point
voltages. However, as the current supplied by each converter increases, the difference between the 
two voltages becomes larger until the full load situation is reached and the difference settles at 80.9 
mV. This rather small voltage difference would normally imply that the converters and associated 
droop resistors would exhibit a fairly good load sharing. However, as will be shown in Figure 118, 
the difference in individual converter current becomes quite large considering that the intention is to 
equalize the current distribution among the converters. As an example consider a load voltage of 
4.956 V. Figure 116 provides the data for computing the resulting load current. The individual 
converter currents at this load voltage are:  
 
 Converter 1 = 1.75 A 
 Converter 2 = 0.95 A 
 Converter 3 = 3.30 A 
 
 Combining these three converter currents it can be seen that the total load current is 6 A. This 
simple example shows that the differences due to the small set point variations are quite large. If a 
higher degree of accuracy is needed, either one of the following two solutions can generally be 
applied. The first is the use of very high precision droop resistors and initial converter set point 
voltage trimming. However, in most droop sharing applications this technique is not a feasible 
solution, since each converter requires accurate trimming. Furthermore, a higher level of precision 
usually has a negative impact on overall system costs. The second solution is the implementation of a 
steeper droop characteristic. Changing the very flat characteristic of the power system output droop 
voltage used in this design facilitates a closer load regulation. This fact is intuitively clear from the 
waveforms shown in Figure 116. Unfortunately, this technique comes with the drawback of 
additional system losses and a larger load voltage variation from no load to full load. The latter fact 
unfortunately violates the system specifications, thus proving to be an unacceptable solution.  
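 The trade-off between droop slope and sharing accuracy can be illustrated with an idealized model. The sketch below assumes identical droop resistors, no wiring resistance and that all converters conduct; the set point voltages are hypothetical values chosen for illustration and are not the measured ones. Under these assumptions each converter obeys V_load = V_set − R_droop · I, so a set point difference ΔV translates into a current difference ΔV / R_droop.

```python
def droop_share(set_points_v, r_droop_ohm, load_current_a):
    """Return (load voltage, per-converter currents) for ideal resistive droop.

    Assumes identical droop resistors, no wiring resistance and that every
    converter conducts (no OR'ing diode is reverse biased).
    """
    n = len(set_points_v)
    mean_set_point = sum(set_points_v) / n
    # Summing I_i = (V_i - V_load) / R over all converters gives the load voltage.
    v_load = mean_set_point - r_droop_ohm * load_current_a / n
    currents = [(v - v_load) / r_droop_ohm for v in set_points_v]
    return v_load, currents


if __name__ == "__main__":
    set_points = [5.00, 4.98, 5.02]   # hypothetical set points, 20 mV apart
    for r_droop in (0.060, 0.120):    # 60 mOhm as in the test setup, then doubled
        v_load, currents = droop_share(set_points, r_droop, 6.0)
        spread = max(currents) - min(currents)
        print(f"R = {r_droop * 1e3:.0f} mOhm: V_load = {v_load:.3f} V, "
              f"I = {[round(i, 2) for i in currents]} A, spread = {spread:.2f} A")
```

 In this model, doubling the droop resistance halves the current spread for the same set point errors, but it also doubles the voltage drop and the resistive loss at a given current, which is the drawback described above.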
 
[Figure 117: temperature (°C), 0 to 120, versus individual converter current (A), 0 to 10; curves for Converter 1, Converter 2 and Converter 3.]
 
Figure 117 : Individual converter temperature vs. output current  
 
 Figure 117 shows the individual converter temperatures as they are operated in a stand-alone 
configuration. Although the converters are very well-designed and generally operate at a lower 
temperature, the data provided in Figure 117 clearly illustrates that large temperature differences 
exist. Closer examination of the phenomenon reveals that the OR’ing diodes and droop resistors 
become very hot and therefore heat up the entire converter system. The largest deviation can be 
found to be 15°C. 

[Figure 118: individual converter current (A), 0 to 10, versus load current (A), 0 to 20; curves for Converter 1, Converter 2 and Converter 3.]
Figure 118 : Individual converter current sharing  
 
 From the current distribution among the individual converters, shown in Figure 118, it is easily 
identified that the use of precision resistors as droop elements provides a relatively good current 
regulation. The large differences in load sharing are due to the initial variation in converter set point 
voltages as shown in Figure 116. It can be seen that converter 3 supplies the majority of the load 
current. This fact combined with its high temperature over the entire operating range decreases the 
overall system reliability and increases the combined average system temperature.  
 The very simple steps involved in changing from the traditional droop load sharing technique to 
the thermal droop load sharing are now carried out. This means that the system from this point 
forward uses temperature information to optimize the power system. Simultaneously, the series 
droop resistors are eliminated, which contributes significantly to an increase in overall system 
efficiency. 
 From Figure 119 it can be seen that the added thermal load sharing circuitry decreased the output 
voltage of converter 3, thus causing it to have the lowest droop voltage of the three converters over 
almost the entire operating range. This, being just the opposite of the scenario in the series resistor 
droop configuration, is caused by component tolerances although much effort has been put into 
finding accurate resistors. Another observation worth mentioning is the voltage slope of converter 3. 
This voltage, as opposed to the voltages of converter 1 and converter 2, is almost a straight line. 
 
[Figure 119: voltage droop (V), 4.5 to 5.5, versus individual converter current (A), 0 to 10; curves for Converter 1, Converter 2 and Converter 3.]
 
Figure 119 : Individual converter voltage droop vs. output current  
 
 The information provided in Figure 119 clearly shows that the added feedback circuit has a 
positive impact on the converter output voltages. With the exception of light loads and full load, the 
three droop voltages are almost equal.  
 Powering up all three converters in the configuration, the individual converter contributions to the 
total load current can be measured. The result is shown in Figure 120. 
 
[Figure 120: individual converter current (A), 0 to 8, versus load current (A), 0 to 20; curves for Converter 1, Converter 2 and Converter 3.]
 
Figure 120 : Individual converter current sharing during thermal load sharing 
 
 The current distribution measurements show that the use of thermal droop load sharing
improves the overall current sharing. However, this is not the main intention of the technique. On the
contrary, the overall aim is an equal operating temperature achieved by means of unequal current
distribution. The improved current distribution in the power system at hand is due to the fact that this
redistribution of the initial converter currents results in a minimized system temperature. It may seem
strange that increasing the set point voltages at no load improves the current distribution. A closer 
look at this phenomenon reveals that the shift in set point voltages does not cause the improvement 
in current sharing. Instead, the reason for the higher degree of current sharing is the steeper slope of 
the individual converter droop voltages. Comparing Figure 116 and Figure 119 this fact is easily 
identified. Another observation that can be made by considering the two figures is that the thermal 
droop load sharing technique eventually becomes worse than the traditional droop method in terms 
of current sharing. This occurs at heavy loads where the curves in Figure 119 are almost horizontal. 
In turn, these small voltage differences cause converter 1 to deliver most of the load current. When
this converter reaches the current limit, converter 2 begins to supply the remainder of the load 
current. In fact, it can be deduced that the maximum current supplied by converter 3 is 7.4 A. If the 
load current limit had been 30 A instead of the 20 A, converter 1 would have supplied 15 A 
(maximum current per converter), converter 2 would have supplied 7.6 A and converter 3 would 
have supplied 7.4 A. This situation is far from viable and would therefore require that the droop 
characteristics of the system are revised if the operational range is extended. 
 While obtaining the data provided in Figure 118 and Figure 120, the temperatures of the
individual converters were monitored. This was accomplished by mounting a thermocouple at each
converter. The continuous measurements provided by these three thermocouples are processed by a
rather advanced thermometer that allows several temperature monitoring devices to be connected
simultaneously. One of the many features provided by this thermometer is an averaging function
that continuously averages the data from the three thermocouples. The result is shown in Figure 121
where the blue curve represents the average temperature for the traditional droop technique and the
purple curve represents the average temperature for the thermal droop technique. 
 
[Figure 121: average system temperature (°C), 0 to 80, versus load current (A), 0 to 20; curves for resistor droop and thermal droop.]
 
Figure 121 : Average system temperature vs. output current  
 
 From Figure 121 it can be seen that the average system temperature is consistently lower
when the power system load sharing is controlled by the thermal droop technique. The
temperature difference increases as the load current increases, which indicates that the temperature
contributions of the droop resistors were significant. At full load the temperature difference reaches
13.9°C. As will be shown in section ‘7.4 Thermal droop load sharing reliability’ this temperature
difference imposes a significant reliability deviation between the two techniques. 
 In terms of system efficiency, the elimination of the dissipative series droop resistors has an 
overall positive impact. The measured efficiency of the two techniques is shown in Figure 122. 
 
[Figure 122: efficiency, 0 to 1, versus output power (W), 0 to 100; curves for traditional droop load sharing and thermal droop load sharing.]
 
Figure 122 : Overall system efficiency  
 
 The next two figures are measurements of the common output voltage under different operating 
conditions. The measurement shown in Figure 123 depicts the output voltage at a total load current 
of 1 A. At this light load the converters operate in discontinuous conduction mode and the built-in 
increase in output voltage at light loads is easily identified. The average output voltage at this load 
current is 5.14 V. Figure 124 shows the common output voltage at a total load current of 5 A. At this 
load current the converters are operating in continuous conduction mode. Inserting numerical values
into the theoretical equations for calculating the common output voltage results in a voltage of 5.0
V. The measurement in Figure 124 has an average value of 5.04 V, which is slightly higher.  
 
 
Figure 123 : Common power system output voltage at 1A (discontinuous conduction mode) 
 
 
Figure 124 : Common power system output voltage at 5A (continuous conduction mode) 
 
 The measurement in Figure 125 shows the common output voltage during a single converter 
failure while supplying a total output current of 5A. The resulting voltage drop is significant 
although the duration is only a few hundred nanoseconds. Since the power system considered in this 
chapter is comprised of 3 hybrid converters without any additional capacitance at the output, the only 
mechanism working to prevent the voltage glitch from happening is the control circuitry of the 
converters. Forming a small capacitor bank at the output will assist the control circuitry in its attempt 
and will greatly reduce the voltage glitch in the event of a converter failure. 
 
 
Figure 125 : Output voltage glitch during a single converter failure  
          
Figure 126 : Converter 3 OR’ing diode voltage drop at 4A 
 
 Figure 126 shows the voltage drop across the OR’ing diode of converter 3 while supplying a
current of 4 A (trace 1: cathode voltage, trace 2: anode voltage, trace A: differential voltage). In terms
of system efficiency, these components contribute significantly to the overall system losses. The
steady state voltage drop at 4 A is found to be 0.37 V. 
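 To put this voltage drop into perspective, the dissipation it implies can be compared with that of a 60 mΩ series droop resistor carrying the same current. The short calculation below is only a sketch using the measured 0.37 V at 4 A and the droop resistor value of the traditional setup; the function names are illustrative.

```python
V_DIODE_V = 0.37      # measured steady-state OR'ing diode drop at 4 A
I_CONVERTER_A = 4.0   # converter current during the measurement
R_DROOP_OHM = 0.060   # series droop resistor of the traditional setup


def diode_loss_w(v_forward_v: float, current_a: float) -> float:
    """Conduction loss of a diode with a given forward drop."""
    return v_forward_v * current_a


def resistor_loss_w(resistance_ohm: float, current_a: float) -> float:
    """Resistive (I^2 * R) loss."""
    return resistance_ohm * current_a ** 2


if __name__ == "__main__":
    print(f"OR'ing diode:   {diode_loss_w(V_DIODE_V, I_CONVERTER_A):.2f} W")
    print(f"Droop resistor: {resistor_loss_w(R_DROOP_OHM, I_CONVERTER_A):.2f} W")
```

 At 4 A the diode dissipates 1.48 W against 0.96 W for the resistor, so the OR’ing diode is the larger of the two loss contributions at this operating point.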
 The last measurement, shown in Figure 127, verifies the feedback voltage of converter 2. 
According to (7-3) this value should be 2.3 V at 40°C using ideal components. Figure 127 clearly 
shows that the measured value is 2.2 V. In other words, there is a voltage difference of 0.1 V, which 
might not seem that important. However, this voltage will off-set the current distribution as shown in 
Figure 116 and Figure 119, and could potentially force a single converter to supply more than its 
share of the load current.  
 
Figure 127 : Converter 2 feedback voltage at 40°C 
 
 In order to examine this discrepancy in more detail, it is necessary to establish the extreme limits of the 
output voltage caused by component tolerances in the feedback network. The first limit is the 
absolute lowest output voltage possible. This occurs when the feedback network constantly provides 
the converter with the highest feedback voltage. In other words, the thermistor to feedback resistor 
(RF2) ratio must be at a minimum. At the other extreme, the feedback voltage would have to be at its 
minimum at all times. This occurs when the thermistor to feedback resistor (RF2) ratio is at a 
maximum. Summarizing these observations the following table can be established: 
 
Component    Maximum output voltage    Minimum output voltage
RF1          -2%                       +2%
RF2          +2%                       -2%
RT,25C       -2%                       +2%
RS           -2%                       +2%
Table 15 : Outer limits for output voltage variation due to component tolerances 
 
A graphical illustration of the two extremes is shown in Figure 128.  
 
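[Figure 128: VOUT (axis marks at 4.9, 5.1 and 5.2 V) versus temperature (axis marks at 50, 70 and 90 °C), showing the two extremes of the output voltage.]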
 
Figure 128 : Output voltage interval due to component tolerances 
 
 Directing focus back to the feedback voltage issue, it can be established that the voltage measured 
at the feedback network of converter 2 (see Figure 127) results in an output voltage of 5.2 V, which 
would cause this converter to supply the majority of the load current. This is indeed the case, as can 
be seen in Figure 120.  
 The combinations of feedback network component tolerances that could cause this deviation from 
the ideal are endless and the only way to establish accurate data is to measure all components. 
However, from an operational point of view these tolerances are just a fact of life and as long as the 
power system performs satisfactorily there is no need to measure every component. As an example of 
component tolerances leading to the abovementioned feedback voltage deviation, the following set 
of values has been deduced: 
  
 RF1 = + 1.18% 
 RF2 = + 1.18% 
 RT,25C = + 1.18% 
 RS = + 0.0% 
 
 This completes the experimental verification and the discussion of the differences between the 
theoretical design in section ‘7.2 Specifications for the laboratory implementation’ and the ‘real-
world’ operation. 
 
 
7.4 Thermal droop load sharing reliability 
 
 The reliability evaluation of the power system at hand is complicated greatly by the fact that 
detailed component data is unobtainable. The only data available is that provided in the converter 
datasheet. It is well-known that the reliability of electronic parts is very dependent on operating 
temperature. Due to the parasitic elements inherent in any real-world component, different reliability 
optimums under given working conditions apply to each component. The proposed thermal droop 
load sharing technique accounts for this fact by using the individual converter temperatures as local 
feedback off-sets. This ensures the lowest overall power system temperature, thus optimizing the 
overall system reliability.  
 The reliability evaluation presented in this section uses the reliability data in the converter 
datasheet by normalizing the MTBF to the worst-case temperature. It is hereafter possible to 
calculate the relative changes in reliability as a function of temperature. It should be noted that the 
degree of accuracy of these calculations depends on the component distribution as well as the ratio of 
passive components to active components. To account for this fact the calculations are based on 
seven active components. Most real-world designs incorporate a smaller number of active 
components, and the results presented below will provide the end-user with even higher system 
reliability. In other words, the assumptions on which the following calculations are based form a 
basis for worst-case reliability improvements (minimum reliability improvement possible). 
 The point of origin is a measurement of the temperature contribution made by the series droop 
resistors. The result, shown in Figure 129, provides the resistor temperature increase above ambient 
temperature as a function of the current passing through it. 

[Figure 129: droop resistor temperature increase above ambient (°C), 0 to 90, versus current through the droop resistor (A), 0 to 10.]
 
Figure 129 : Droop resistor temperature increase above ambient as a function of current 
 
 It can be seen that the temperature curve in Figure 129 has the shape of a second order function, 
which is exactly what should be expected of resistive power losses.  
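 The quadratic shape can be made explicit. Assuming a constant thermal resistance R_θ between the droop resistor and ambient (a symbol introduced here for illustration only, not a measured quantity), the steady-state temperature rise follows directly from the resistive power loss:

```latex
\Delta T(I) = R_{\theta} \cdot P(I) = R_{\theta} \cdot R_{Droop} \cdot I^{2}
```

 Under this assumption, doubling the current through the droop resistor quadruples its temperature rise above ambient.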
 Having established a correlation between converter current and series resistor temperature rise the 
reliability calculations can be performed. The point of origin is the determination of the individual 
converter temperatures at full load current. Fortunately, this is easily done by relating Figure 117 and 
Figure 118. These values are then used to establish a failure rate for each converter and consequently 
a reliability number as a function of time. Next, the same procedure is followed for the thermal droop 
technique and a comparison is possible. The data used is summarized in the following table: 
 
            Series resistor droop technique      Thermal droop technique
Converter   Current   Temperature   FIT          Current   Temperature   FIT
1           6.23 A    60.1°C        178          7.42 A    57.4°C        160
2           5.33 A    57.2°C        158          6.11 A    53.0°C        134
3           8.34 A    94.9°C        802          6.49 A    60.1°C        178
Table 16 : Reliability data for comparing the two techniques 
 
 From the data presented in Table 16, it can be seen that converter 3 operates at a temperature of 
more than 30°C above the other two converters in the configuration when the load sharing is 
achieved by means of the traditional droop technique. This has a very negative impact on the 
associated failure rate (FIT). In fact, the failure rate of converter 3 is 4.8 times higher than the 
average of the other two converters. Inserting this much higher value into the exponential function 
for calculating the overall reliability results in a very high probability of failure for converter 3. In 
turn, this decreases the overall system reliability considerably. The average system temperature that 
results from the data shown in Table 16 is 70.7°C for the series resistor droop technique and 56.8°C 
for the thermal droop technique.  
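 The derived figures quoted above can be reproduced directly from Table 16. The following sketch simply recomputes the average temperatures and the failure rate ratio of converter 3; the dictionary layout and function names are illustrative choices.

```python
from statistics import mean

# Table 16 data per converter: (current in A, temperature in degC, FIT)
SERIES_DROOP = {1: (6.23, 60.1, 178), 2: (5.33, 57.2, 158), 3: (8.34, 94.9, 802)}
THERMAL_DROOP = {1: (7.42, 57.4, 160), 2: (6.11, 53.0, 134), 3: (6.49, 60.1, 178)}


def average_temperature(table) -> float:
    """Mean converter temperature (degC) for one load sharing technique."""
    return mean(temp for _, temp, _ in table.values())


def fit_ratio_converter3(table) -> float:
    """FIT of converter 3 relative to the mean FIT of converters 1 and 2."""
    return table[3][2] / mean([table[1][2], table[2][2]])


if __name__ == "__main__":
    print(f"Series resistor droop: {average_temperature(SERIES_DROOP):.1f} degC")
    print(f"Thermal droop:         {average_temperature(THERMAL_DROOP):.1f} degC")
    print(f"Converter 3 FIT ratio: {fit_ratio_converter3(SERIES_DROOP):.1f}")
```

 The two averages differ by 13.9°C, which agrees with the full load temperature difference read from Figure 121.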
 The unavailability of the two techniques, calculated from the data in Table 16, is illustrated 
graphically in Figure 130. 
  
[Figure 130: unavailability (axis marks from 0.00005 to 0.00035) versus years (1 to 5); curves for the series resistor droop technique and the thermal droop technique.]
 
Figure 130 : System unavailability vs. time in years 
 
 Figure 130 shows the combined system unavailability as a function of years in operation. Based 
on the curves it is quite clear that the traditional series resistor droop technique is much more likely 
to fail than its thermal droop counterpart. The exact percentage decrease in system 
unavailability can be established by dividing the unavailability difference between the two load 
sharing techniques by the unavailability of the traditional series resistor droop load sharing. In 
mathematical terms this decrease can be expressed as: 
 
 
$$\Delta Q = \frac{-100 \cdot \xi}{-2 \cdot e^{-\frac{124611 \cdot t}{12500000}} + e^{-\frac{10731 \cdot t}{1250000}} + e^{-\frac{657 \cdot t}{78125}} + e^{-\frac{4599 \cdot t}{1562500}} - 1}$$   (7-4) 
 
where ξ is given by: 
 
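$$\xi = 2 \cdot e^{-\frac{124611 \cdot t}{12500000}} - e^{-\frac{10731 \cdot t}{1250000}} - e^{-\frac{657 \cdot t}{78125}} - 2 \cdot e^{-\frac{10293 \cdot t}{2500000}} + e^{-\frac{8541 \cdot t}{3125000}} + e^{-\frac{15987 \cdot t}{6250000}}$$   (7-5) 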
 
 Inserting numerical values into (7-4) results in an overall unavailability decrease of 75.04%. This 
is a significant reduction, which to a large extent is caused by lowering the overall system 
temperature by elimination of the droop resistors. Also, the significant drop in the operating 
temperature of converter 3 contributes to the large decrease in overall system unavailability. In fact, 
it can be calculated that in the initial series resistor droop technique implementation, converter 3 has 
a 70% chance of failing while the other two converters approximately split the remaining 30% 
(converter 1 = 16.2% and converter 2 =13.8%).  
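 The structure of (7-4) and (7-5) can be checked numerically. The sketch below rests on several assumptions that are inferred from the form of the equations rather than stated in the text: the system is treated as available when at least two of the three converters are operating, the FIT values are converted to failures per year using 8760 hours per year, and the result is evaluated at t = 5 years. Note also that the exponents in (7-5) correspond to 158 FIT for converter 1 in the thermal droop case (for example 15987/6250000 = 292 · 8.76·10⁻⁶ = (158 + 134) · 8.76·10⁻⁶), whereas Table 16 lists 160 FIT; the sketch uses the value implied by the equation.

```python
import math

HOURS_PER_YEAR = 8760.0


def rate_per_year(fit: float) -> float:
    """Convert FIT (failures per 1e9 hours) to failures per year."""
    return fit * 1e-9 * HOURS_PER_YEAR


def unavailability_2oo3(fits, years: float) -> float:
    """Probability that fewer than two of three converters survive (assumed model)."""
    r1, r2, r3 = (math.exp(-rate_per_year(f) * years) for f in fits)
    reliability = r1 * r2 + r1 * r3 + r2 * r3 - 2.0 * r1 * r2 * r3
    return 1.0 - reliability


def unavailability_decrease_pct(series_fits, thermal_fits, years: float) -> float:
    """Percentage decrease in unavailability relative to the series resistor droop."""
    q_series = unavailability_2oo3(series_fits, years)
    q_thermal = unavailability_2oo3(thermal_fits, years)
    return 100.0 * (q_series - q_thermal) / q_series


if __name__ == "__main__":
    series = (178, 158, 802)    # FIT values, series resistor droop (Table 16)
    thermal = (158, 134, 178)   # FIT values implied by the exponents of (7-5)
    print(f"{unavailability_decrease_pct(series, thermal, 5.0):.2f} %")
```

 Under these assumptions the decrease evaluates to approximately 75% at five years, in line with the value obtained from (7-4).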
 As a concluding remark it is worth noting that the temperature distribution that results from the 
thermal droop load sharing implementation still allows for certain deviations. As opposed to the 
technique described in chapter ‘6 Load sharing’, this technique has no feedback to a common 
controller that effectively equalizes the temperatures. A system incorporating a dedicated controller 
could also be implemented by means of standard off-the-shelf converters. However, such a system 
increases the overall circuit complexity and thereby eliminates the entire idea of the droop technique 
– its simplicity of implementation. 
  
 
7.5 Discussion and summary 
 
 This chapter has presented a very simple technique for implementing reliable power systems 
comprised of several parallel-connected converters. The system is intended for implementation in 
automated busses – a project managed by Partners for Advanced Transit and Highways.  
 Detailed analysis of the design steps involved in the synthesis of a laboratory test power system 
comprised of off-the-shelf converters has been provided. The pros and cons are discussed verbally 
and supported by analytic verifications and experimental measurements. 
 The analysis started by considering the pros and cons of commonly used droop techniques. The 
first technique discussed uses the fact that all converters inherently have a finite output impedance. 
By intentionally lowering the loop gain, the correlation between output voltage and output current 
increases, which is then used to establish load sharing in parallel configurations. Another technique, 
which is the technique used in this thesis, is the series resistor droop technique. This technique is 
even simpler than the loop gain changing technique but comes at the cost of lower overall efficiency. 
However, since the latter technique only introduces a single resistor at the output of each converter, 
the dynamic response remains unchanged. Which of the two techniques to apply depends on the 
application. In the application at hand there is no direct access to the internal parts of the converters, 
thus using the loop gain changing technique is impossible.  
 Following these considerations, a detailed examination of the steps involved in transitioning from 
the traditional series resistor droop technique to the new thermal droop technique is provided. The 
proposed thermal droop load sharing alters the traditional droop techniques by adding an external 
temperature dependent feedback circuit. Since power system simplicity is an important factor in this 
design, the external feedback circuit is kept as simple as possible.  
 Having established a theoretical foundation for comparison of the two techniques, a power system 
comprised of three parallel-connected converters was implemented. Extensive measurements 
verified the theoretical results while comments and theoretical interpretations of the laboratory 
measurements provided a base for future improvements and/or modifications. 
 As the reliability calculations have shown, the decrease in unavailability using the thermal droop 
load sharing technique is quite significant. The technique lowers the overall system temperature at 
full load by almost 14°C while maintaining the efficiency of the loop gain changing technique and 
the dynamic response of the series resistor droop technique. In other words, the thermal droop load 
sharing technique combines the best properties of two commonly used techniques and 
simultaneously adds a reliability optimization feature to the system. 
 
 
7.6 References 
 
[Ne05]  Thermal Droop Load Sharing Automates Power System Reliability Optimization, 
Carsten Nesgaard and Seth R. Sanders, Power electronics Society Newsletter, Second 
quarter 2004 
 
[Jp01]  JP patent 2000358371 
 
[Ne06]  Experimental verification of the thermal droop load sharing, Carsten Nesgaard and 
Michael A. E. Andersen, Submitted for review at Power Electronics Specialists 
Conference 2004, Aachen, Germany 
 
[Ro01]  When It Comes To Compact PCI Supplies, Standards Are Helping, Lazar Rozenblat 
and Paul Kingsepp, Todd Products Corp., web-article. 
 
[Ca01]  75Watt QH Single Series DC/DC Converters, CALEX datasheets, www.calex.com 
 
 
 
 
8. Partners for Advanced Transit and Highways  Page 147 
 
8 Partners for Advanced Transit and Highways 
 
 This chapter presents the work performed at University of California, Berkeley. Part of this work 
was participation in a large scale project on precision docking procedures, managed by Partners for 
Advanced Transit and Highways. The main outcome of this work has already been presented in 
chapter 7. Therefore, this chapter provides a brief introduction to the project as well as a special 
feature that had to be added to the power system to avoid over voltage situations during single point 
failures.   
 
 
8.1 Introduction 
 
 Partners for Advanced Transit and Highways (PATH) was established in 1986 as a 
multi-disciplinary research program with staff, faculty and students statewide. Currently 45 full-time 
staff members, 50 faculty members and 90 Ph.D. students are involved with the research program. Due 
to the large number of people as well as the large number of annual project proposals, PATH research is 
divided into 4 programs: 
 
• Policy and Behavioral Research 
• Transportation Safety Research 
• Traffic Operations Research 
• Transit Operations Research 
 
 The precision docking project, which will be described in a subsequent section, belongs to the 
Transit Operations Research program. The 4 PATH research programs have gained wide acceptance 
and relations to universities, industry and public agencies include (only major contributors are 
mentioned):  
 
University relations:  University of California, Berkeley; University of California, Irvine; 
University of California, Los Angeles; California State University, San 
Diego; University of California, Davis and University of California, 
Riverside. 
 
Public agencies:  California Department of Transportation; Metropolitan Transportation 
Commission; California Highway Patrol; City of Irvine and City of Los 
Angeles. 
 
Industry relations:  Rockwell International; Jet Propulsion Laboratory; General Motors; Ford 
Motor Company and Lockheed Martin.  
 
From the above list it can be seen that there is a widespread interest in transit solutions and the 
market aspects these solutions facilitate.  
 
 
 
 
 
8.2 PATH mission and advanced transit 
 
 The PATH mission is to establish long-term solutions to the growing US surface transportation 
problems, particularly California’s increasingly congested highways. PATH focuses on long-term and 
high impact solutions that have the potential to radically change modern transportation issues. 
 Among the many PATH projects that exist, a project that could have a high impact on the daily 
commute in large US cities is the so-called Automated Bus-Rapid-Transit (A-BRT). This system is 
an alternative to the well-known and widely used subways or rapid transit systems. The common 
denominator in these latter systems is the need for tracks, which significantly limits the local 
availability. The newly developed A-BRT system uses busses for transporting passengers via 
existing roadways, which allows the system to operate in most areas, even the suburbs. This fact 
allows for relatively easy implementation in large scale configurations as soon as the guidance and 
safety technology has been tested and approved for urban implementation.  
 One of the many technological solutions developed by PATH is autonomous vehicle 
communications, where vehicles and the surrounding environment communicate to facilitate smooth 
and safe autonomous transit. The communication from roadway to the bus is unidirectionally 
achieved by magnetic markers embedded in the pavement. Onboard magnetometers sense the 
changes in magnetic field strength and adjust the lateral control accordingly. The principles in the 
magnetic marker vehicle guidance (implemented in a passenger car) can be seen in Figure 131.  
 
 
Figure 131 : Vehicle guidance using magnetometers and magnetic markers 
 
 The magnetic markers (shown to the right in Figure 132) are embedded in the center of the lane 
with an average distance of 1.2 meters. By alternating the magnetic polarities, the markers allow for 
a binary coding used to indicate roadway characteristics. As the roadway or conditions change, the 
guidance system adjusts speed and lateral position accordingly. Field tests have shown that the 
system provides very accurate lane keeping (far better than a human driver) and smooth lane 
changing.  
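
As a minimal sketch of how alternating marker polarities can carry binary information, the following Python fragment decodes a sequence of sensed polarities into bits. The polarity-to-bit mapping (N = 1, S = 0), the most-significant-bit-first ordering and all function names are assumptions made for illustration; only the 1.2 m average spacing is taken from the text.

```python
MARKER_SPACING_M = 1.2  # average marker spacing stated in the text

# Assumed mapping for illustration only: 'N' polarity = 1, 'S' polarity = 0.
POLARITY_TO_BIT = {"N": 1, "S": 0}


def decode_markers(polarities):
    """Convert a sequence of sensed polarities ('N'/'S') into a list of bits."""
    return [POLARITY_TO_BIT[p] for p in polarities]


def bits_to_int(bits):
    """Interpret the bits as an unsigned integer, most significant bit first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def distance_covered(marker_count):
    """Approximate distance between the first and last marker in metres."""
    return max(marker_count - 1, 0) * MARKER_SPACING_M


code = decode_markers("NSNN")
print(code, bits_to_int(code), distance_covered(len(code)))
```

How the resulting code word maps to a particular roadway characteristic is not specified here and would depend on the coding scheme actually used.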
 
 
Figure 132 : Magnetic marker and magnetometer 
 
 A project that relies on the magnetic markers for guidance and positioning is the precision 
docking project, which seeks to address a number of issues in public bus transit. In the past, low-floor 
busses and so-called kneeling busses have provided major advances in improving bus 
accessibility, especially for elderly, vision impaired and disabled people. However, these systems 
rely on the bus driver to make perfect stops at each docking platform. If the bus pulls up too far from 
the curb, the horizontal gap between bus door and curb requires passengers to first step onto the 
pavement before stepping onto the bus, thus causing dangerous situations where a passenger could 
fall between the bus and docking platform. At the other extreme, a bus driver who hits the curb 
during docking causes unnecessary wear on the right hand side bus tires, thus requiring extensive 
maintenance. To assist the driver in the docking procedure, PATH initiated the precision docking 
project that ultimately led to a stable system capable of repeatedly docking within millimeters of 
the curb. The contributions made to this project during the Ph.D. work are discussed in the 
remainder of this chapter and include the material presented in chapter 7. A series of video 
sequences that verify the docking procedure implementations in both regular vehicles and transit 
busses is provided on the accompanying CD. 
 
 
8.3 Precision Docking Project 
 
 The research work presented in this section documents the work performed for PATH at the 
Richmond Field Station near San Francisco, California. A large part of the work performed is 
presented in chapter “7 Thermal droop load sharing”. 
 Most researchers associated with the precision docking project are specialized in control theory 
and have therefore been focusing on the control issues. Consequently, very little attention has been 
paid to the power system. Since the system is intended for implementation in urban areas the 
probability of failure has to be minimized. Furthermore, with the potential loss of human life in the 
event of a system failure the consequences of all failure modes have to be known in order to assess the 
possible scenarios and determine which corrective actions to take. Failure modes that cannot be 
‘designed-out’ have to fail in a way that would leave the system in a known safe state. 
 With these requirements in mind, a number of power system realizations were considered. At a 
project meeting relatively early in the design phase it was decided that the figure of merit in terms of 
Mean Time Between Failures (MTBF) should be around 1⋅10^6 hours. Since the power system design 
would have to be implemented in a large number of systems, it was also decided that the power 
system should mainly be comprised of commercially available components, preferably entire 
subsystems that would speed up the implementation process in future systems. 
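
To give a feeling for what an MTBF target of this size implies, the sketch below converts the MTBF into a constant failure rate and a probability of failure over a mission time. It assumes the exponential (constant hazard rate) model used elsewhere in this thesis; the one-year mission time and the function names are assumptions chosen for illustration.

```python
import math

MTBF_HOURS = 1e6  # target figure of merit stated in the text


def failure_rate(mtbf_hours):
    """Constant failure rate (failures per hour) for a given MTBF."""
    return 1.0 / mtbf_hours


def survival_probability(mtbf_hours, mission_hours):
    """R(t) = exp(-t / MTBF) under the constant-hazard-rate assumption."""
    return math.exp(-mission_hours / mtbf_hours)


def failure_probability(mtbf_hours, mission_hours):
    """1 - R(t), computed with expm1 to keep precision for small t / MTBF."""
    return -math.expm1(-mission_hours / mtbf_hours)


one_year = 8760  # hours of continuous operation, an assumed mission time
print(failure_rate(MTBF_HOURS))
print(survival_probability(MTBF_HOURS, one_year))
print(failure_probability(MTBF_HOURS, one_year))
```

Under these assumptions the probability of a failure during one year of continuous operation is slightly below 1%.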
 
8.3.1 Power system configuration: 
 
 In order to establish the criticality of the power system, a general outline of the system structure 
had to be established. The first step in this process is the identification of individual subsystems. 
From there each subsystem is rated according to its task in relation to the overall system performance 
– meaning that subsystems being used for critical operations would get a high priority in this rating. 
The resulting system structure is shown in Figure 133 while Table 17 shows the criticality rating of 
the individual subsystems. 
[Figure 133, block diagram: Battery 1 and Battery 2 on the supply side, each connected through short-circuit (S/C) protection and a front switch to the load side. Load blocks, each annotated with supply voltage and power rating: DVI, DVI monitor, service brake controller, control computer, radar, lidar, V-V com., VDS, GPS, V-R com. and magnetometers. Legend: Very critical, Critical, Significant, Minor, None.]
 
Figure 133 : Power system criticality analysis 
 
The individual blocks are rated according to the following criticality list: 
 
1  Very critical (This block is essential for human safety) 
2  Critical (Loss of this block causes system malfunction) 
3  Significant (Loss of this block causes important system degradation) 
4  Minor (Loss of this block causes only minor system degradation) 
5  None (Loss of this block has no effect on overall system performance; it might cause a 
surveillance circuit to lose power) 
 
Based on the above list, the following Functional Failure Mode Effects and Criticality Analysis is 
established: 
 
Block                     Functional effect                 Voltage level  Criticality  Power rating
Control computer          System malfunction                24 V           2            25 W
Differential GPS system   Loss of exact location            9-36 V         4            2.5 W
Driver vehicle interface  Loss of control                   12 V           1            6 W
Lidar                     Deteriorated avoidance system     12 V           3            20 W
Magnetometers             Loss of guidance                  9-36 V         2            20 W
Radar                     Deteriorated avoidance system     12 V           3            20 W
Safety monitor computer   Loss of control                   24 V           1            12 W
Service brake controller  System malfunction                -              2            -
Steering actuator         Loss of steering                  12 V           2            100 W (500 W)
Vehicle dynamics sensor   Loss of motion detection          9-30 V         4            3 W
V-R communication         Loss of roadside communication    12 V           5            5 W
V-V communication         Loss of avoidance communication   12 V           3            2 W
Table 17 : Functional Failure Mode Effects and Criticality Analysis for the overall system 
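
The criticality ratings lend themselves to simple bookkeeping. The following sketch stores the rows of Table 17 as data and lists the blocks, and their summed average power, at or above a chosen criticality level. The data structure and function names are illustrative assumptions; each table row is counted once, the service brake controller is skipped in the sum because no power rating is given, and the result is not meant as a statement about the installed load.

```python
# Rows of Table 17: (block, criticality, average power in W or None if unspecified).
FMECA = [
    ("Control computer", 2, 25.0),
    ("Differential GPS system", 4, 2.5),
    ("Driver vehicle interface", 1, 6.0),
    ("Lidar", 3, 20.0),
    ("Magnetometers", 2, 20.0),
    ("Radar", 3, 20.0),
    ("Safety monitor computer", 1, 12.0),
    ("Service brake controller", 2, None),
    ("Steering actuator", 2, 100.0),  # 500 W peak
    ("Vehicle dynamics sensor", 4, 3.0),
    ("V-R communication", 5, 5.0),
    ("V-V communication", 3, 2.0),
]


def blocks_at_or_above(criticality):
    """Names of blocks rated at least as severe as `criticality` (1 = most severe)."""
    return [name for name, rating, _ in FMECA if rating <= criticality]


def power_at_or_above(criticality):
    """Summed average power of those blocks, skipping unspecified entries."""
    return sum(p for _, rating, p in FMECA if rating <= criticality and p is not None)


print(blocks_at_or_above(1))
print(power_at_or_above(2))
print(power_at_or_above(5))
```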
 
 It should be noted that the common power bus illustrated by a single wire in Figure 133 is actually 
comprised of several individual wires connected to two different power sources: a battery and the 
bus generator. Images of the major system components described above can be found on the 
accompanying CD in the folder: /PATH/Pictures/Docking Bus Components/ 
 
8.3.2 Preliminary topology evaluation: 
 
 Due to the severity of certain system malfunctions, a high degree of overall system reliability is 
required. To further increase the overall reliability, fault resilience is built into the electrical design, 
resulting in a single point failure free system. The proposed converter design that complies with the 
latter requirement is shown in Figure 134. 
 
[Figure 134, block diagram: front switch or fuse, input filter, buck converter, resonant converter and output filter, together with PWM 1, PWM 2, OVP and OVP latch blocks.]
 
Figure 134 : Individual converter realization 
 
 The system is comprised of a buck converter followed by a resonant converter operated at a 50%-
50% duty cycle - thus serving as a DC-DC transformer. To ensure that no single failure can short out 
the input power bus, a front switch or a fuse should be inserted in series with each individual 
converter. If a front switch is used it would serve as protection of the input power bus as well as a 
current limiter during system startup and/or converter replacement. Controlled converter shut-down 
in case of fault occurrence is ensured by the built-in latch, which also prevents the system from 
operating in a state where one or more converters are trying to restart after being shut-down (hiccup 
mode). On the other hand, if a fuse protection is utilized the system no longer has an inherent current 
limiter and the fuse design would have to account for the peak inrush current during startup. 
 The next step in the design is the determination of the number of converters to parallel-connect in 
the overall power system. As discussed in previous chapters, the number of converter units 
comprising the overall power system should be kept to a minimum. As an example, using the data 
for the power system at hand, an N+1 redundant system comprised of 4 converter units is 40% more 
likely to fail at any given time than an N+1 redundant power system comprised of 3 converter units. 
The same tendency holds when transitioning from a 3 converter system to a 2 converter system. 
However, due to the percent-wise larger increase in component count in the latter case, the 
probability of system failure is 65% higher in an N+1 redundant 3 converter system than in an N+1 
redundant 2 converter system. From these calculations, it can be seen that as the number of converter 
units increases, a smaller and smaller gain in reliability is achieved when substituting an X unit system 
with an X-1 unit system. 
 In order to calculate the reliability improvement obtainable with different numbers of converters in 
the parallel configuration at hand, the following set of equations (modified versions of (3-7)) is 
established. 
 
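 P_{1-2} = \left( e^{\lambda_1 T} + e^{\lambda_2 T} - 1 \right) e^{-(\lambda_1 + \lambda_2) T}
 P_{2-3} = \left( e^{\lambda_1 T} + e^{\lambda_2 T} + e^{\lambda_3 T} - 2 \right) e^{-(\lambda_1 + \lambda_2 + \lambda_3) T}       (8-1)
 P_{3-4} = \left( e^{\lambda_1 T} + e^{\lambda_2 T} + e^{\lambda_3 T} + e^{\lambda_4 T} - 3 \right) e^{-(\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4) T}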
 
 The equations are derived using the exponential distribution with a constant hazard rate for all 
components combined with the binomial coefficients for successful system operation. It should be 
noted that the equations allow for reliability calculations of converters with different accumulated 
failure rates. In the special case of equal failure rates, the 3 equations can be further simplified: 
 
 P_{1-2} = \left( 2 e^{\lambda T} - 1 \right) e^{-2 \lambda T}
 P_{2-3} = \left( 3 e^{\lambda T} - 2 \right) e^{-3 \lambda T}   (8-2)
 P_{3-4} = \left( 4 e^{\lambda T} - 3 \right) e^{-4 \lambda T}
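
The survival probabilities of the three N+1 configurations can also be evaluated numerically. The sketch below implements the general form (8-1) and the equal-failure-rate form (8-2), and cross-checks both against a direct enumeration of all states with at most one failed converter. It assumes independent converters with constant failure rates; the function names, the example failure rate and the use of the same failure rate in every configuration are assumptions made for illustration and are not the data used for the figures in this chapter.

```python
import math
from itertools import product


def survival_8_1(rates, t):
    """General form (8-1): (sum_i exp(l_i*t) - (n-1)) * exp(-sum(l)*t)."""
    n = len(rates)
    return (sum(math.exp(lam * t) for lam in rates) - (n - 1)) * math.exp(-sum(rates) * t)


def survival_8_2(n, lam, t):
    """Equal-rate form (8-2): (n*exp(l*t) - (n-1)) * exp(-n*l*t)."""
    return (n * math.exp(lam * t) - (n - 1)) * math.exp(-n * lam * t)


def survival_enumerated(rates, t):
    """Cross-check: add up the probability of every state with at most one failure."""
    rel = [math.exp(-lam * t) for lam in rates]
    total = 0.0
    for state in product((True, False), repeat=len(rel)):
        if state.count(False) <= 1:
            total += math.prod(r if ok else 1.0 - r for r, ok in zip(rel, state))
    return total


lam = 1e-5   # assumed failure rate per converter in 1/h, not a value from the thesis
t = 50_000   # hours
for n in (2, 3, 4):
    print(n, survival_8_1([lam] * n, t), survival_8_2(n, lam, t),
          survival_enumerated([lam] * n, t))
```

With equal rates all three functions agree, and the survival probability decreases as the number of converters in the N+1 configuration grows.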
 
 Plotting the equations as a function of time provides a visual assessment of the different 
configurations: 
 
[Figure 135, plot: probability (axis ticks 0.2 to 1.0) versus time (axis ticks 50000 to 150000 hours) for three configurations: 2 converters - 1 working, 3 converters - 2 working, 4 converters - 3 working.]
 
Figure 135 : Probability of system survival as a function of time 
 
 As expected, Figure 135 shows that the system comprised of 2 converters provides the best 
overall reliability whereas the system comprised of 4 converters performs the worst reliability-wise. 
This result is intuitively clear since a system comprised of 4 converters, of which 3 are required to 
work at all times, has at least twice as many components as a 2 converter system. 
Without changing the redundancy to, for example, N+2, this doubling of components imposes a much 
higher probability of malfunction, thus causing the dramatic decrease in system reliability.  
 In Figure 135 a red circle indicates the normal range of system life. A more detailed view of this 
section can be seen in Figure 136. 
 
[Figure 136, plot: probability (axis ticks 0.994 to 0.999) versus time (axis ticks 2000 to 8000 hours) for three configurations: 2 converters - 1 working, 3 converters - 2 working, 4 converters - 3 working.]
 
Figure 136 : Enhanced view of circled time interval shown in Figure 135 
 
 The curves shown just above the probability functions for the 3 different configurations are the 
system probability of survival for the special case where all converters have the same failure rate. It 
can be seen that equalizing the failure rates results in improved system reliability. Ensuring equal 
failure rate can be accomplished by means of thermally distributing each converter’s current 
contribution.  
 
8.3.3 Output protection 
 
 If a redundant system supplies a common load, it is important to ensure that none of the 
converters fail in a manner that shorts the output power bus, since this will disable the entire power 
system. In other words it is important that each converter is single failure tolerant towards short 
circuiting of the power supply outputs. One way of ensuring this is by inserting fuses in series with 
each converter’s output. Unfortunately, a fuse can sustain several times its nominal current rating for 
prolonged periods of time. Therefore a large current is needed to blow the fuse in a timely manner. 
The current needed to blow a traditional fuse within 1 ms is on the order of 4 times the 
nominal current. This sets a lower limit on the number of parallel-connected converters since the 
remaining converters have to supply the large current needed to blow the fuse of the faulty converter. 
At the same time the remaining converters must maintain the proper current level for the load. 
Further details of pre-arcing time vs. multiples of nominal current can be found in fuse 
manufacturers’ datasheets. 
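
The lower limit on the number of converters can be illustrated with a small calculation. The sketch below assumes that, with one converter faulted, the remaining units must together deliver the load current plus the fuse-clearing current, and that every converter has the same maximum output current. The numerical values in the example and the function name are hypothetical; only the factor of about 4 times nominal current is taken from the text.

```python
import math


def min_converters_for_fuse_clearing(load_current, converter_max_current,
                                     fuse_nominal_current, blow_factor=4.0):
    """Smallest number of parallel converters such that, with one unit faulted,
    the healthy units can supply the load and the fuse-clearing current.

    blow_factor: multiple of the fuse's nominal current needed for fast
    clearing (about 4 for clearing within 1 ms according to the text).
    """
    required = load_current + blow_factor * fuse_nominal_current
    healthy_units = math.ceil(required / converter_max_current)
    return healthy_units + 1  # add the faulted converter itself


# Hypothetical example: 10 A load, 20 A per converter, 5 A fuses.
print(min_converters_for_fuse_clearing(10.0, 20.0, 5.0))
```

A real design would also have to consider current limiting in the healthy converters and the duration for which they can deliver the clearing current, which this sketch ignores.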
 From a reliability point of view, the use of fuses has an overall system impact that results in lower 
converter failure rates. Whether to use fuse protection, or some means of actively limiting the current 
flow to one direction, should be based on system assessments for each particular application. In this 
application a mix of fuses and active semiconductors will be used. The fuses will be used as buffers 
at each converter’s input while each converter’s output is actively OR’ed to the common output 
voltage bus. 
 
8.3.4 Final system 
 
 The results of the analysis in power system reliability show that a highly reliable power system 
can be custom designed to fit the needs of the precision docking project control computers, 
magnetometers, radars etc. However, due to the criticality of the power system built into control 
structures operating in urban environments and the desire to implement future power systems in a 
relatively short timeframe, a reliable solution comprised of off-the-shelf parts was examined. The 
result is the power system described in chapter “7 Thermal droop load sharing”. The main focus of 
this work is getting standard converters to share a common load. The load in this context is a set of 
guidance computers that control the automated driving and docking procedure. Since these are vital 
components within the overall system it is of utmost importance that the supply of power is 
continuous and fault free. 
 Using standard converters, sponsored by CALEX, in a parallel configuration with OR’ing diodes, 
all but one fault within a converter are isolated from the common output power bus (see Figure 112). 
The one fault that could permanently damage the computers is an over voltage generated by any 
one of the converters in the configuration. Since this is an unacceptable failure mode it either has to 
be eliminated or continuously monitored by a protection circuit. Elimination is not possible due to 
the predetermined resistor ratio of the thermal droop load sharing network. An over voltage 
protection circuit is therefore added. This can be seen in Figure 137. 
 
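[Figure 137, schematic with three blocks: thermal droop load sharing, over voltage protection, and a test circuit that shorts out individual resistors (time = t1). Components shown: RT (RT at 25 °C = 5.0 kΩ, β = 3950), RS, RF1, RF2, R1 to R4, Q1 and Q2 (2N3904, 2N3906), C1 (1 nF), C2 (100 pF), C3 (10 pF) and a 74F125. Resistor values shown: 2.2 kΩ, 2.2 kΩ, 5.6 kΩ, 13 kΩ, 3.9 kΩ, 4.3 kΩ and 2.2 kΩ. Node labels: VOutput, output, feedback, buffer, switch, trig and on_off.]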
 
Figure 137 : Droop load sharing with over voltage protection circuit 
 
A basic FMECA for the thermal droop load sharing network components is performed. The result 
can be seen in Table 18.  
 
Part  Failure mode   Failure effect          Criticality
RT    Short circuit  Over voltage situation  2
      Open circuit   None                    4
RS    Short circuit  None                    4
      Open circuit   None                    4
RF1   Short circuit  Over voltage situation  2
      Open circuit   None                    4
RF2   Short circuit  None                    4
      Open circuit   Over voltage situation  2
Table 18 : Failure modes relating to converter over voltage 
 
 The 3 failure modes causing an over voltage situation are examined by means of the test circuit. At 
a predetermined moment the test circuit closes the switch, thus causing a short circuit of its 
terminals. To make sure the over voltage circuit does not trigger prematurely, the test circuit 
incorporates a time delay of 20 µs.  
 
8.3.5 Simulation results 
 
 The set of curves depicted in Figure 138 shows the normal operating mode of the converter. The 
curves show the node voltages at 50% of full load, which is equivalent to 10 A. The buffer (buffer) 
and trigger (trig) voltages are zero, although the curves show some noise in the nanovolt and picovolt 
range. The output voltage (output) is 5 V with a sinusoidal ripple voltage of ±100 mV that represents 
both the natural converter voltage ripple as well as random noise. The fact that the feedback voltage 
is a scaled replica of the output voltage is used to trigger the over voltage protection in case the 
feedback voltage exceeds 2.7V. It should be noted that the non-linear characteristics of the thermistor 
must be taken into account in order for the feedback voltage to be a true scaled replica of the output 
voltage. 
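
The non-linear thermistor characteristic can be sketched with the commonly used beta model for NTC thermistors. The resistance at 25 °C and the β value are the ones given in Figure 137; the assumption that the beta model describes RT adequately over the temperature range of interest, and the function name, are introduced here for illustration.

```python
import math

R25_OHM = 5000.0   # thermistor resistance at 25 °C, from Figure 137
BETA_K = 3950.0    # thermistor beta constant, from Figure 137
T25_K = 298.15     # 25 °C expressed in kelvin


def thermistor_resistance(temp_c):
    """NTC resistance in ohms from the beta model:
    R(T) = R25 * exp(beta * (1/T - 1/T25)), with T in kelvin."""
    temp_k = temp_c + 273.15
    return R25_OHM * math.exp(BETA_K * (1.0 / temp_k - 1.0 / T25_K))


for temp_c in (25, 50, 75, 100):
    print(temp_c, round(thermistor_resistance(temp_c)))
```

The strongly non-linear drop in resistance with temperature is the reason the feedback network cannot be treated as a fixed resistive divider.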
 
[Figure 138, waveform labels: feedback voltage at 50% load; output voltage with noise; ON/OFF voltage for converter shut-down; trigger voltage.]
 
Figure 138 : Waveforms during normal operation 
 
 Close examination of the 3 failure modes leading to over voltages reveals that the exact same 
timing behavior occurs in all situations. For this reason only one set of waveforms is provided. The 
3 failure modes examined are: 
 
• RT short circuit 
• RF1 short circuit 
• RF2 open circuit 
 
Figure 139 shows the node voltages from which the protection circuit response can be observed. 
 
[Figure 139, waveform labels: feedback over voltage; feedback returns to normal; retriggering of over voltage; output voltage with noise; switch activation voltage; buffer voltage.]
 
Figure 139 : Waveforms during abnormal operation 
 
 At the instant the test circuit closes the switch and causes an over voltage situation, the buffer 
voltage (buffer) generates the trigger signal (trig) that activates the ON/OFF latch (on_off). Figure 
140 shows a close up view of the on_off voltage during abnormal system operation. 
 
[Figure 140, waveform labels: triggering of over voltage protection latch; voltage spike; immune to retriggering attempts.]
 
Figure 140 : Enhanced view of the on_off voltage during abnormal operation 
 
 The reaction time from over voltage detection to converter shut-down is 663 ns. This reaction time 
can be minimized at the cost of a larger voltage spike. According to the manufacturer’s datasheet the 
voltage at the ON/OFF pin should be limited to 3 V. Currently a 533 mV voltage spike results from 
the circuit configuration; it can be reduced if more capacitance is added to C1 and C2. Larger 
capacitors result in longer charge times, which in turn prolong the reaction time of the over voltage 
protection. Very fast-reacting protection circuits and low voltage spikes at the converter TRIM input 
during latch triggering are contradictory requirements and a trade-off must be made. In this case a 
relatively fast-reacting protection circuit is essential for system survival, therefore the voltage spike 
that results has to be accepted. 
 Once the over voltage protection has detected an over voltage from the converter, it is 
desirable that the converter never attempts to restart and possibly cause another over voltage 
situation. The over voltage protection latch ensures that retriggering attempts are ignored and the 
converter stays off-line. From the switch voltage (switch) shown in Figure 139, it can be seen that a 
retriggering is attempted 300 µs after the first over voltage situation. Furthermore, although the 
feedback voltage (feedback) returns to normal 50 µs after triggering the over voltage protection, the 
ON/OFF latch remains in the low state, thus keeping the converter off. 
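
The latching behavior described above can be summarized in a small behavioral model. The sketch below is not a circuit simulation: it only captures the logic that a feedback voltage above the 2.7 V threshold trips the latch and that the latch then holds the converter off regardless of later samples. The class name and the sample values are hypothetical.

```python
OVP_THRESHOLD_V = 2.7  # feedback level that trips the protection, from the text


class OverVoltageLatch:
    """Behavioral model: once tripped, the latch holds the converter off."""

    def __init__(self, threshold=OVP_THRESHOLD_V):
        self.threshold = threshold
        self.tripped = False

    def update(self, feedback_v):
        """Process one feedback sample; return True while the converter may run."""
        if feedback_v > self.threshold:
            self.tripped = True
        return not self.tripped


latch = OverVoltageLatch()
# Hypothetical feedback samples: normal, over voltage, normal, retrigger, normal.
samples = [2.5, 3.1, 2.5, 3.1, 2.5]
print([latch.update(v) for v in samples])
```

In this model the converter is enabled only for the first sample; the return of the feedback voltage to normal and the second over voltage both leave the latch state unchanged.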
 
8.4 Discussion and summary 
 
 The research work in power system reliability for the precision docking project began in March 
2003 and continued until the report deadline October 31, 2003.  
 This chapter has provided a brief introduction to PATH and the precision docking project 
concerning the design and implementation of a system that essentially takes over the docking 
procedure when loading and unloading bus passengers. The result of this work is partially described 
in this chapter while the power system developed for field testing is presented in chapter 7. 
 The result of the research work for PATH was described in a status report to the California State 
Department of Transportation. The report can be found in the appendix to this thesis. 
 
 
8.5 References 
 
All images used with the permission of PATH 
 
[Nw01]  Get Ready to Take a Back Seat to a Circuit Board, Newsweek, June 2nd 1997 
 
[Td01]  Building a Smarter Car, The Daily Californian, October 9th 1996 
 
[Ce02]  California’s PATH Project, California Engineer, March 3rd 1995 
 
[Au01]  Buick experiments with hands-free driving, Automotive News, April 21st 1997 
 
[In01]  Magnetic Sensors for Automatic Steering Control, Intellimotion vol. 5 no. 2, 1996 
 
[In02]  Magnetic Markers on I-15 Test Track for NAHSC Demonstration, Intellimotion vol. 
5 no. 4, 1996 
 
[In03]  Automating Bus Docking to Improve Transit Services, Intellimotion vol. 7 no. 2, 1998 
 
[Ne07]  Report on Power System Reliability, Carsten Nesgaard, October 31st 2003 
 
 
9. Conclusion  Page 159 
 
9 Conclusion 
 
 This thesis has presented several techniques for improving system reliability in different 
applications. The point of origin was a theoretical examination of traditional reliability engineering. 
From there, a relatively large set of equations was established for future system evaluations.  
 The first system to benefit from these equations was the array-based redundancy concept 
introduced in chapter ‘4 Array-based redundancy’. The system proposed in this chapter considers 
redundancy from a subsystem level by incorporating redundancy at multiple levels within the 
system. This alternative approach in the design of reliable power systems was based on statistical 
calculations, using among others the exponential distribution and the connection matrix technique. 
The results of these assessments showed that the proposed redundancy concept reduces the overall 
system unavailability by 88% compared to a traditional power system utilizing single parallel-
connected converter boards.  
 In addition, the dynamic reconfiguration of the system proved to have a positive impact on system 
efficiency during fault situations, since the system automatically optimizes the number of working 
electrical paths from input to output. However, for the primary power system working without any 
faults, the proposed power system exhibits lower efficiency than any other configuration due to the 
implementation of subsystem level redundancy. In turn, the failure rate at the board level of the 
proposed system is a bit higher than a similar system implemented by means of system level 
redundancy. Although this drawback seems important it has been shown that a low board level 
failure rate does not necessarily result in optimized system level failure rate. In fact, the proposed 
power system disproves this common conception by achieving the previously mentioned reduction in 
unavailability. Furthermore, low system maintenance costs, consistency and speed of the 
redundancy management control, and overall flexibility are areas in which the proposed 
power system exceeds traditional power system implementations. All in all, the proposed power 
system provides a theoretical foundation for high impact solutions to critical power system designs.  
 Based on the positive results from the array-based redundancy implementation it was decided to 
evaluate the capabilities of digital control in simple DC-DC converter applications. This work, 
presented in chapter ‘5 Digital control of DC-DC converters’, originated with a simple control 
algorithm but advanced to look-up tables, analytical redundancy and multiple control law 
implementations. As the research has shown, the main difference between an analog and a digital 
implementation in terms of fault detection is the level of achievable intelligence. In most analog 
implementations the fault detection and isolation are inherent in the initial system, either by 
choice of topology or designed-in during the synthesis phase. A digital implementation offers the 
same detection possibilities in terms of topology choice but has the added property of allowing for 
intelligent control decisions based on monitoring parameters.  
 Realization of a discrete control algorithm verified that reliability optimization by analytic means 
is possible, although the gain in overall reliability in this particular converter was minor. 
Furthermore, it has been shown that converter control by means of multiple control laws is within the 
timing limits of a standard low-cost microcontroller. Temperature measurements in the test setup 
allowed for implementation of analytical redundancy, which improves system fault resilience, 
although true fault tolerance is only achievable in hardware redundant converter configurations.  
 Unfortunately, the reliability evaluation showed that the digital approach is much more likely to 
fail than its analog counterpart. In other words, despite the promising features of the digital control, a 
long term feasible solution seen from a reliability point of view has to be implemented by means of 
analog circuitry.  
 Having examined several aspects of digital design of converter control, the attention was turned to 
the more traditional power system implementations. Chapter 6 introduces the concept of system level 
redundancy utilizing dedicated load share controller IC’s while chapter 7 establishes the foundation 
of an alternative droop load sharing technique. 
 The analysis in chapter 6 originates with a theoretical examination of a new thermal load sharing 
technique that at any given time ensures optimum reliability, performance and efficiency. A 
comparison between the thermal load sharing technique and the common and widely accepted 
current sharing technique is provided and the pros and cons in each case are discussed. This 
discussion is followed by reliability estimations that provide the analytic evidence that the thermal 
load sharing technique has superior reliability compared to the traditional current sharing technique. 
Besides superior reliability, the advantages of the thermal load sharing technique include 
minimization of MOSFET transistor losses, simple implementation and lower overall system 
temperature. The simplicity comes from the fact that most converters are fitted with a thermal 
monitoring device for the thermal protection circuitry often found in modern converters. Also, since 
the system is intended for parallel-connection of multiple converters the needed load share controller 
is often built directly into each converter. In short, all that is needed is a network for adjusting the 
temperature information correctly and feeding it to the load share controller. These observations make 
the thermal load sharing technique an attractive solution. 
 A disadvantage of the thermal load sharing technique is the possibility of a slight increase in 
individual converter failure rate. However, this drawback is more than compensated for by the much 
lower average system temperature that results from the implementation. Another drawback that 
deserves a bit of attention is the fact that the dynamics of the temperature control is slightly slower 
than that of the current sharing technique. However, the effects of this slower reaction to load 
changes only impose minor issues, since the outer load sharing loop in any case has a very low cross-
over frequency. 
 The advantages of using the temperature as a control parameter are quite clear. An equal 
temperature distribution among, for example, the converter’s switching MOSFET transistors lowers 
the overall system temperature, which in turn decreases the unavailability considerably.  
 In order to verify the theoretical aspects discussed so far, a laboratory setup using the thermal load 
sharing technique was implemented. For comparison purposes the setup was initially operated by 
means of the current sharing technique. Having obtained measurements from both techniques it is 
clear that the new thermal load sharing technique not only increases the overall system reliability as 
predicted by the theoretical assessments, but also has a positive impact on the system efficiency. The 
increase in efficiency is achieved by redistributing the current supplied by each converter to obtain 
equal thermal conditions as opposed to the current sharing technique’s intent to establish equal 
currents. Another less obvious advantage of the thermal load sharing technique is the system’s ability 
to route power through converter boards mounted in cooler environments and thereby optimize the 
working conditions for converter boards positioned, for example, in between two adjoining converter 
boards giving off heat. 
 In conclusion, the proposed load sharing technique optimizes the overall system significantly but 
requires the use of a dedicated load share controller. 
 Chapter 7 on the other hand, presents a very simple technique for implementing reliable power 
systems comprised of multiple parallel-connected converters. The system is intended for 
implementation in automated busses and is closely related to the work presented in chapter 8.  
 Detailed analysis of the design steps involved in the synthesis of a laboratory test power system 
comprised of off-the-shelf converters has been provided. The analysis started by considering the 
pros and cons of commonly used droop techniques. The first technique discussed uses the fact that 
all converters inherently have a finite output impedance. By intentionally lowering the loop gain the 
correlation between output voltage and output current increases, which is then used to establish load 
sharing in parallel configurations. Another technique, which is the technique used in the reference 
circuit, is the series resistor droop technique. This technique is even simpler than the loop gain 
changing technique but comes at the cost of lower overall efficiency. However, since the latter 
technique only introduces a single resistor at the output of each converter, the dynamic response of 
the initial converter remains unchanged.  
 Following these considerations, a detailed examination of the steps involved in transitioning from 
the traditional series resistor droop technique to the new thermal droop technique is provided. The 
proposed thermal droop load sharing alters the traditional droop techniques by adding an external 
temperature dependent feedback circuit. Since power system simplicity is an important factor in this 
design, the external feedback circuit is kept as simple as possible.  
 Having established a theoretical foundation for comparison of the two techniques, a power system 
comprised of three parallel-connected converters was implemented. Extensive measurements 
verified the theoretical results that the proposed thermal droop load sharing decreases the overall 
unavailability quite significantly. Similar to the technique using dedicated load share controllers this 
technique lowers the overall system temperature. At full load the temperature is lowered by almost 
14°C while maintaining the system efficiency of the loop gain changing technique and the dynamic 
response of the series resistor droop technique. In other words, the thermal droop load sharing 
technique combines the best properties of two commonly used techniques and simultaneously adds a 
reliability optimization feature to the system. 
 Finally, chapter 8 presents the work performed at the University of California, Berkeley and the 
participation in a large scale project managed by Partners for Advanced Transit and Highways 
(PATH). Much of this work was presented as a separate topic in chapter 7, for which reason chapter 
8 provides an introduction to PATH and the special features that had to be implemented in the power 
system discussed in chapter 7. Focusing on the latter issue, it was shown that the proposed power 
system potentially could cause unacceptable failure modes as a result of a converter malfunction. 
Since these failure modes could not be eliminated from the design, a monitoring circuit was 
successfully implemented. Extensive simulations verified that the monitoring circuit provided the 
required shut-down features and simultaneously prevented a failed converter from restarting. 
  
 
 
 
Carsten Nesgaard, January 31st 2004 
 
 
 
 
 
 
 
 
 
10 References (Sorted alphabetically with chapter index) 
 
 
 Title, Author, Conference/company/journal/university Chapter
[Ac01]  Electronic Derating for Optimum Performance, Reliability Analysis Center, New York 3 
[At01]  A Low Cost Digital SVM Modulator with Dead Time Compensation, C. Attaianese, D. Capraro, G. Tomasso, Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Au01]  Buick experiments with hands-free driving, Automotive News, April 21st 1997 8 
[Ba01]  Self-monitoring microcontroller based DC/DC converter, Rune M. Barnkob, Special report 2002, Department of Electric Power Engineering, Technical University of Denmark 5 
[Bl01]  Diagnosis and Fault-Tolerant Control, Mogens Blanke, Michel Kinnaert, Jan Lunze and Marcel Staroswiecki, Springer, ISBN: 3-540-01056-4 3,5 
[Ca01] 75Watt QH Single Series DC/DC Converters, CALEX datasheets, www.calex.com 7 
[Ce01]  A New Distributed Digital Controller for the Next Generation of Power Electronics Building Blocks, I. Celanovic, I. Milosavljevic, D. Boroyevich, R. Cooley, J. Guo, Applied Power Electronics Conference and Exposition 2000, New Orleans, USA 2,5 
[Ce02]  California’s PATH Project, California Engineer, March 3rd 1995 8 
[De01]  Backplane Health Rests on Fault Finding, Tom DeLurio and George Hall, EETimes, October 13, 2003, Issue 1291, page 63 and 70. 3 
[Fe01]  Digital Control of a Single-Stage Single-Switch Flyback PFC AC/DC Converter with Fast Dynamic Response, Ya-Tsung Feng, Gow-Long Tsai, and Ying-Yu Tzou, Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Fi01]  MOSFET Failure Modes in the Zero-Voltage-Switched Full-Bridge Switching Mode Power Supply Applications, Alexander Fiel, Thomas Wu, Applied Power Electronics Conference and Exposition 2001, Anaheim, USA 2 
[Fr01]  A note on Inference by Transitivity, Ole Immanuel Franksen, April 1992, Electric Power Engineering Department, Technical University of Denmark 4 
[Fr02]  Basic Assumptions of Array Theory, Ole Immanuel Franksen, February 1996, Electric Power Engineering Department, Technical University of Denmark 4 
[Fr03]  Group Representations of Finite Polyvalent Logic – a Case Study Using APL Notation, Ole Immanuel Franksen, IFAC VII World Congress, Helsinki, June 1978. 4 
[Fr04]  Colligation or, the logic inference of interconnection, Ole Immanuel Franksen and Peter Falster, Mathematics and Computers in Simulation 52 (2000) 1-9. 4 
[Fr05]  Digital Control of Dynamic Systems, Gene F. Franklin, J. David Powell and Michael Workman, Addison-Wesley Longman Inc., ISBN 0-201-82054-4 5 
[Go01]  Control System Design and Simulation, Jack Golten and Andy Verwer, McGraw-Hill, ISBN 0-07-707412-2 5 
[Ho01]  Fault Detection Evaluation of Microcontroller Dyad Control System by Fault Injection Method, Zeljko Hocenski, Goran Martinović, Josip Juraj Strossmayer, European Conference on Power Electronics and Applications 1999, Lausanne, Switzerland 2 
[In01]  Magnetic Sensors for Automatic Steering Control, Intellimotion vol. 5 no. 2, 1996 8 
[In02]  Magnetic Markers on I-15 Test Track for NAHSC Demonstration, Intellimotion vol. 5 no. 4, 1996 8 
[In03]  Automating Bus Docking to Improve Transit Services, Intellimotion vol. 7 no. 2, 1998 8 
[Je01]  Array Theory and Nial, Mike Jenkins and Peter Falster, Research report, Department of Electric Power Engineering, Technical University of Denmark, August 1999 4 
[Je02]  Q’Nial Reference Manual, Mike A. Jenkins, Nial Systems Limited, Kingston, Ontario, Canada, 1985, Report on Power System Reliability 4 
[Jp01]  JP patent 2000358371 7 
[Ke01]  Generalized Predictive Control (GPC) - Ready for Use in Drive Applications?, Kennel R., Linder A., Linke M., Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Ko01]  Reliability challenges due to excess stress under high frequency switching of power devices, Professor Johann W. Kolar, ETH, Zürich, Seminar August 2002 6 
[Ma01]  Electronic Failure Analysis Handbook, Perry L. Martin, McGraw-Hill, ISBN 0-07-041044-5 3 
[Ma02]  High Flux Powder Cores, Magnetics core data, http://www.mag-inc.com/ 6 
[Me01]  Statistical Methods for Reliability Data, William Q. Meeker and Luis A. Escobar, Wiley series in Probability and Statistics, ISBN 0-471-14328-6 3,5 
[Mi01]  Reliability Prediction of Electronic Equipment, Military Handbook 217 3,4,5,6 
[Mi02]  Resistors - Selection and use of, Military Handbook 1999 3 
[Mo01]  Considerations for Array Theory and the Design of Nial, Trenchard More, July 1989, Visiting Professor at the Electric Power Engineering Department, Technical University of Denmark 4 
[Mo02]  Notes on the Diagrams, Logic and Operations of Array Theory, Trenchard More, Structures and Operations in Engineering and Management Systems, The second Lerchendal Book, Tapir Publishers. 4 
[Mo03]  Power Electronics, Second Edition, Ned Mohan, Tore M. Undeland and William P. Robbins, John Wiley & Sons Inc., ISBN 0-471-58408-8 6 
[Mø01]  Building Reliability into Power Electronic Systems, Jørgen Møltoft, Ørsted – DTU, Seminar August 20th 2002. 3 
[Mø02]  On the technology of array-based logic, Gert L. Møller, Ph.D. thesis 1995, Electric Power Engineering Department, Technical University of Denmark 4 
[Na01]  Stability Analysis of Parallel DC-DC Converters Using a Nonlinear Approach, Sudip K. Mazumder, Ali H. Nayfeh, Dushan Borojevic, Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Ne01]  An array-based study of increased system lifetime probability, Carsten Nesgaard, Applied Power Electronics Conference and Exposition 2003, Miami, USA 4 
[Ne02]  Digitally Controlled Converter with Dynamic Change of Control Law and Power Throughput, Carsten Nesgaard, Nils Nielsen and Michael A. E. Andersen, Power Electronics Specialists Conference 2003, Acapulco, Mexico 5 
[Ne03]  Optimized load sharing control by means of thermal reliability management, Carsten Nesgaard and Michael A. E. Andersen, Power Electronics Specialists Conference 2004, Aachen, Germany 6 
[Ne04]  Efficiency improvement in redundant power systems by means of thermal load sharing, Carsten Nesgaard and Michael A. E. Andersen, Applied Power Electronics Conference and Exposition 2004, Anaheim, USA 6 
[Ne05]  Thermal Droop Load Sharing Automates Power System Reliability Optimization, Carsten Nesgaard and Seth R. Sanders, Power Electronics Society Newsletter, Second quarter 2004 7 
[Ne06]  Experimental verification of the thermal droop load sharing, Carsten Nesgaard and Michael A. E. Andersen, Submitted for review at Applied Power Electronics Conference and Exposition 2005, Austin, TX, USA 7 
[Ne07]  Report on Power System Reliability, Carsten Nesgaard, October 31st 2003 8 
[Ne08]  Topological reliability analysis of common front-end DC/DC converters for server applications, Carsten Nesgaard and Michael A. E. Andersen, Digest submitted for review at International Power Electronics Congress 2004, Celaya, México (2,3) 
[Nw01]  Get Ready to Take a Back Seat to a Circuit Board, Newsweek, June 2nd.1997 8 
[Pe01]  An Introduction to Array Theory and Nial, Allan Pedersen, September 1990, Electric Power Engineering Department, Technical University of Denmark 4 
 
[Pe02]  Q’Nial Stand – By, Allan Pedersen and Jens Ulrik Hansen, May 1988, Electric Power Engineering Department, Technical University of Denmark 4 
[Pi01]  PIC16F87x Datasheet, Microchip Technology Inc., www.microchip.com 5 
[Po01]  Simple ESR Meter for Electrolytics, Ray Porter, TELEVISION Servicing Magazine January and April 1993 6 
[Pr01]  Design of a Digital PID Regulator Based on Look-Up Tables for Control of High-Frequency DC-DC Converters, Aleksander Prodic and Dragan Maksimovic, Computers in Power Electronics 2002, Puerto Rico 5 
[Pr02]  Design and Implementation of a Digital PWM Controller for a High-Frequency Switching DC-DC Power Converters, Aleksander Prodic, Dragan Maksimovic and Robert W. Erickson, IECON 2001: The 27th Annual Conference of the IEEE Industrial Electronics Society 5 
[Qi01]  On the Use of Current Sensors for Control of Power Converters, D. Y. Qiu, S. C. Yip, Henry S. H. Chung, and S. Y. R. Hui, Power Electronics Specialists Conference 2001, Vancouver, British Columbia, Canada 5 
[Ra01]  Power Electronics Handbook, Muhammad H. Rashid, Academic Press series in Engineering, ISBN 0-12-581650-2 3 
[Re01]  Analysis and Design of a Repetitive Predictive-PID Controller for PWM Inverters, C. Rech, H. Pinheiro, H. A. Gründling, H. L. Hey, J. R. Pinheiro, Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Re02] Relex Reliability Software, http://www.relexsoftware.co.uk/ 3 
[Ri01]  A Fault Tolerant Induction Motor Drive System by Using a Compensation Strategy on the PWM-VSI Topology, R. L. A. Ribeiro, C. B. Jacobina, E. R. C. da Silva, A. M. N. Lima, Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Ro01]  When It Comes To Compact PCI Supplies, Standards Are Helping, Lazar Rozenblat and Paul Kingsepp, Todd Products Corp., web-article. 7 
[Sa01]  Architecture and IC Implementation of a Digital VRM Controller, Jinwen Xiao, Angel V. Peterchev, Seth R. Sanders, Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Sa02]  Quantization Resolution and Limit Cycling in Digitally Controlled PWM Converters, Angel V. Peterchev, Seth R. Sanders, Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Sa03]  Electrolytic Capacitor Life Testing and Prediction, V. A. Sankaran and C. S. Avant, Ford Research Laboratory, IEEE Industry Applications Society Annual Meeting, New Orleans, Louisiana, October 5-9, 1997 6 
[Si01]  Digital Electronic Switching system, Siemens, A30808-X2751-X-2-7618 4 
[Td01]  Building a Smarter Car, The Daily Californian, October 9th 1996 8 
[Ti01]  Texas Instruments Application note U-129, UC3907 Load Share IC Simplifies Parallel Power Supply Design 6 
[To01]  Adaptive, Stable Fuzzy Logic Control for Paralleled DC-DC Converters Current Sharing, Bogdan Tomescu, H. F. VanLandingham, Power Electronics Specialists Conference 2001, Vancouver, Canada 2 
[Up01]  The Uptime Institute, http://www.upsite.com/ 3 
[Us01]  Paralleled DC power supplies sharing loads equally, US patent 4,635,178 6 
[Us02]  System and method of load sharing control for automobile, US patent 5,157,610 6 
[Us03]  Current share circuit for DC to DC converters, US patent 5,521,809 6 
[Wa01]  Evaluating Performance and Reliability of Automatically Reconfigurable Aerospace Systems Using Markov Modeling Techniques, Bruce K. Walker, Department of Aerospace Engineering & Engineering Mechanics, University of Cincinnati, OH, USA. 4 
[Wa02]  Texas Instruments Power Support, Mr. Ed Walker, support@ti.com,  phone: 972-644-5580 6 
 
[We01]  Webopedia, http://www.webopedia.com/ 2 
[We02]  Introduction to Graph Theory, Douglas B. West, Prentice Hall, ISBN 0-13-227828-6 4 
[Wo01]  Mathematica – A System for Doing Mathematics by Computer, Stephen Wolfram, Addison-Wesley Publishing Company, ISBN 0-201-51502-4 3 
[Xp01]  Reliability in Electronics, XPiQ inc. application note, http://www.xp-iq.com/home.htm 3 

 
A1  List of abbreviations 
 
 
This section presents a short list of terms and abbreviations used throughout this thesis. 
 
Abbreviation Description 
Fault tolerance A technique for enhancing overall system reliability/availability. 
Fault resilience Not easily susceptible to faults. 
Redundancy A technique for enhancing overall system reliability/availability and/or improving the overall system power capabilities. 
Derating A technique for enhancing overall system reliability/availability by accurately controlling the stress each part of the system has to tolerate. 
FMECA Short for Failure Modes Effects and Criticality Analysis. Used in classification of potential system failure modes. 
Reliability The probability that an electronic system operating under specified conditions will perform satisfactorily for a given period of time. 
Availability Defines the probability of finding a system in the operating state at some point in the future. 
Unavailability Is often used in both repairable and non-repairable systems to define the probability of system malfunction. 
Topological matrix Alternative representation of system interconnections. 
NIAL Nested Interactive Array Language – a software tool that 
Failure effect The effect a certain fault will have on the system. Also used to examine potential fault propagation causes. 
Criticality A rating system used to prioritize the severity of system faults. 
MTBF Mean Time Between Failures. Applies to systems that are repaired and returned to service. 
MTTF Mean Time To Failure. Applies to parts/components that are discarded once failed. 
MIL-HDBK-217F Comprehensive standard for reliability prediction of electronic equipment. Used throughout this thesis.  
Droop Simple load sharing technique. The abbreviation refers to the output characteristic of the individual power supplies. 
PATH Partners for Advanced Transit and Highways. US research program for developing long-term solutions to the ever-increasing surface transportation problems. 
PCB Printed circuit board. 
ADC Analog to digital converter. 
DAC Digital to analog converter. 
MPLAB Microchip’s integrated development environment for programming PIC microcontrollers. 
I/O card Circuit for applying signals at the input pins and monitoring the output pins of the microcontroller. 
 
A2  List of variables 
 
 
This section presents a short list of variables used throughout this thesis. 
 
Variable Description 
λx Failure rate of a part or system. Often expressed in failures per one billion hours (FIT) - subscript used to denote a specific subsystem or component 
σ Standard deviation 
σ² Variance 
πx Stress factor - subscript used to denote a specific type of stress 
R(t) Survivability function (survivability) – often referred to as reliability 
Q(t) Cumulative failure distribution (probability of failure) 
f(t) Density function 
E(t) Expected value 
px = Probx Probability of success - subscript used to denote specific subsystems (for example in parallel configurations) 
qx Probability of failure, q = 1 - p - subscript used to denote specific subsystems 
n Number of trials 
r Number of successes in n trials 
Rx Same as R(t) - subscript used to denote the reliability of a specific subsystem or component 
t Time 
Tx Temperature in °C (unless otherwise noted) – subscript: s = surface and j = junction 
e(i) Error signal 
r(i) Reference value 
c(i) Process output 
u(i) Control effort produced by a controller 
Ix Current - subscript used to denote currents associated with a specific subsystem 
Px Power generation or power dissipation - subscript used to denote a specific subsystem or component 
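
As an illustration of how several of the variables above relate, the sketch below assumes a constant failure rate (the exponential model commonly used with handbook predictions such as MIL-HDBK-217F). Under that assumption R(t) = exp(-λt), Q(t) = 1 - R(t) and MTTF = 1/λ. The function names and the 1000 FIT value are illustrative assumptions and are not taken from the thesis.

```python
import math


def fit_to_per_hour(fit):
    """Convert a failure rate in FIT (failures per 1e9 hours) to failures per hour."""
    return fit * 1e-9


def reliability(failure_rate_per_hour, hours):
    """R(t) for a constant failure rate: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate_per_hour * hours)


def failure_probability(failure_rate_per_hour, hours):
    """Q(t) = 1 - R(t), the cumulative probability of failure."""
    return 1.0 - reliability(failure_rate_per_hour, hours)


def mttf_hours(failure_rate_per_hour):
    """Mean time to failure for a constant failure rate: MTTF = 1 / lambda."""
    return 1.0 / failure_rate_per_hour


# Illustrative value only: a part rated at 1000 FIT observed for one year.
lam = fit_to_per_hour(1000.0)
r_one_year = reliability(lam, 8760.0)
q_one_year = failure_probability(lam, 8760.0)
```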
 
 
A3  CD contents 
 
 
Below is a graphical illustration of the CD contents. The arrows indicate subfolders within the top 
level folder. Folders without any arrows only contain files at the same level.  
 
 
 
 
Descriptions of folders containing material that needs further clarification are provided below. This 
applies especially to the PATH folder, which contains a lot of demonstration material.  
 
 
 
PATH → Video: 
 
PATH video sequences relevant to the work on reliable power systems 
 
 
 
 
Filename:  Bus Platoon 
Location: Interstate 15, San Diego, California, August 2003 
Description: This demonstration shows the concept of an automated ’virtual train’ of busses. 
Wirelessly coupling busses together provides for very high passenger volumes in 
high-density corridors. For example, a 3 bus ‘virtual train’ would enable 70,000 
passengers to be transported per hour.  
  The busses ran at a separation of 40 m and 15 m and smoothly performed the 
automatic transitions between these two target separations. The target separation was 
monitored by the radar as well as the lidar mounted in the front bumper of the bus. 
 
 
 
 
 
Filename:  Bus Docking 
Location: Caltrans HOV Control Yard, San Diego, California, August 2003 
Description: This docking test demonstrates the automatic docking system’s ability to consistently 
and accurately dock at a loading platform. Docking at a platform with a gap of less 
than an inch between the bus floor and the platform eases the difficulty of getting on 
and off the bus especially for elderly and disabled persons. The docking is fully 
automated, thus leaving steering, lane change and stopping to the control system. A 
real-world system deployment would probably only use the automated steering and 
leave the stopping to the driver who is more aware of pedestrians and traffic. 
  
 
 
 
 
 
Filename:  Automated Car Docking 
Location: East Liberty, Ohio, July 1999 
Description: Vehicle demonstration at the Transportation Research Center of Ohio on July 26-28 
1999. This demonstration was the first to use surface-mounted magnetic markers for 
guidance, rather than the permanent magnets placed in holes below the road surface. 
The new neodymium magnetic markers were encased in a plastic hemisphere, 75 mm 
in diameter and 35 mm high. These hemispheres, placed in the center of the lane on 
top of the pavement, made it possible to set up the field test on relatively short notice 
since there was no need for drilling holes. Numerous demonstrations showed that the 
test vehicle was able to dock with a lateral accuracy of 5 mm and a longitudinal 
accuracy of 30.5 cm. 
 
 
 
 
 
Filename:  Automated Docking Demonstration 
Location: Houston, Texas, September, 1998 
Description: Automated vehicle docking test at the Houston Pinemont Park and Ride bus station in 
the first week of September 1998. After optimizing the vehicle system to ignore 
relatively strong magnetic background noises at the bus station the test was 
successfully completed. The test objective was to automatically steer the vehicle 
laterally across one lane and park it parallel to the curb, at 1.5 cm. distance, without 
ever touching the curb. During 10 consecutive fully automated docking runs the 
desired 1.5 cm. lateral distance from the curb was achieved consistently with a single 
docking deviation of 5 mm. 
 
 
 
 
 
 
Filename:  Automated Docking Procedure 
Location: Richmond Field Station, California, May 1998 
Description: One of the initial tests on the precision docking project using magnetic markers. The 
docking procedure is implemented in a Buick passenger car and tested at the 
Richmond Field Station in California in May 1998. The car follows an S-shaped 
trajectory, similar to that of a bus approaching a curbside bus stop, with a precision 
better than 1 cm.   
 
 
PATH → Pictures → Docking Bus Components 
 
Description: These images provide a visual illustration of the major components used in the 
precision docking project. A short description of the individual images follows: 
 
Accelerometer Shows the accelerometer mounted underneath the bus. 
Actuator 1 Shows the actuator mounted on the steering wheel. From a safety point of view this 
component as well as its power supply must be highly reliable. 
Actuator 2 Shows an enhanced view of the steering actuator. 
Brake Shows the brake system used. 
Bus Shows the bus in which the system is implemented. 
Computer Shows the bus computer with all cable interconnections. In relation to the power 
system this is one of the most crucial components. 
Doppler Radar Shows the Doppler radar mounted at the front of the bus. This component is used 
for determining obstacles in the bus’s path. 
FrontMag Shows the magnetometers mounted underneath the bus at the front. 
Gyro Shows the fiber optic gyro used. 
Lidar Shows the distance measuring lidar embedded in the front bumper. 
PC Shows one of the control computers. 
RearMag Shows the magnetometers mounted underneath the bus at the rear. 
Status Lights Shows a series of status lights used to determine the overall system state. 
 
 
PATH → Pictures → Richmond Field Station Docking June 2003 
 
Description: System test at the Richmond Field Station test facility near San Francisco. The test 
demonstrates the docking system's ability to accurately guide the bus to a safe stop 
at a simulated curb-side platform without touching the curb. Touching the curb 
during docking results in excess tire wear; thus an automated docking procedure 
could prolong the interval between tire changes.  
 
 
 
PATH → Pictures → Washington Docking Test June 2003 
 
Description: These images were taken during a field test in Washington. The test demonstrates 
the precision docking system in automatic control of the 40 foot bus. The docking 
is similar to the video sequence showing a docking test in San Diego described 
above. In the pictures two different platforms can be seen. These represent an in-line 
platform at a bus terminal and a curb-side platform. In both cases the gap 
between platform and bus is less than an inch, and the bus never touches the curb. 
 
 
State of the art → Database 
 
 
 
Screen dump of the ‘State of the art techniques’ database 
 
Using the built-in Access functions it is possible to search on all fields within the database. 
 
 
 
 
 
 
A4  Complete connection matrix 
 
 
 
 From the connection matrix shown on the previous page it is apparent that due to the array 
structure of the power system a general symmetry exists (this was the reason for using the array-
based approach). This helps in establishing the connection matrix, but provides little or no assistance 
in the matrix multiplication. 
 The connection matrix considered in this section is established for the primary power system 
operating in a fault free state. If a particular block or switch fails, the associated entries in the 
connection matrix are set to zero, thereby disabling any connections that might have been used in 
the previous power system configuration. In other words, each power system reconfiguration 
requires modifications to the initial connection matrix. 
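
 The zeroing of entries described above can be illustrated with a small sketch. It is not the thesis's connection matrix: the four-node system (input, two converter blocks, output), the function names and the use of a Boolean matrix product to test whether the output is still reachable are all assumptions made for illustration.

```python
def disable_block(connection, index):
    """Return a copy of the matrix with the row and column of a failed block zeroed."""
    size = len(connection)
    return [
        [0 if index in (row, col) else connection[row][col] for col in range(size)]
        for row in range(size)
    ]


def boolean_product(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some k links i to k and k to j."""
    size = len(a)
    return [
        [int(any(a[i][k] and b[k][j] for k in range(size))) for j in range(size)]
        for i in range(size)
    ]


def is_connected(connection, source, target):
    """True if target can be reached from source through the remaining connections."""
    size = len(connection)
    reach = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(size):
        step = boolean_product(reach, connection)
        reach = [
            [int(reach[i][j] or step[i][j]) for j in range(size)]
            for i in range(size)
        ]
    return bool(reach[source][target])


# Hypothetical system: node 0 = input, 1 and 2 = converter blocks, 3 = output.
fault_free = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
one_failed = disable_block(fault_free, 1)
both_failed = disable_block(one_failed, 2)
```

 In this hypothetical example the output remains reachable after the first converter block fails and becomes unreachable once both blocks have failed.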
 
 
A5  Complete schematics 
 
Schematic for initial digital converter in chapter 5 
 
 
Schematic for improved digital converter in chapter 5 
 
 
 
 
 
 
 
 
Schematic for load sharing implementation in chapter 6 
 
 
A6  Publications 
 
 
Publications: 
 
The following publications are all spin-offs of the research work described in this thesis. 
 
An Array-based Study of Increased System Lifetime Probability 
Applied Power Electronics Conference and Exposition 2003, Miami Beach, USA 
Page: 15 – 21; Author: Carsten Nesgaard; Reference: [Ne01]; Chapter: 4 

Digitally Controlled Converter with Dynamic Change of Control Law and Power Throughput 
Power Electronics Specialists Conference 2003, Acapulco, Mexico 
Page: 22 – 27; Author: Carsten Nesgaard, Michael A.E. Andersen, Nils Nielsen; Reference: [Ne02]; Chapter: 5 

Efficiency improvement in redundant power systems by means of thermal load sharing 
Applied Power Electronics Conference and Exposition 2004, Anaheim, USA 
Page: 28 – 34; Author: Carsten Nesgaard, Michael A.E. Andersen; Reference: [Ne04]; Chapter: 6 

Thermal droop load sharing automates power system reliability optimization 
Power Electronics Society Newsletter, Second quarter 2004 
Page: 35 – 36; Author: Carsten Nesgaard, Seth R. Sanders; Reference: [Ne05]; Chapter: 7 

Optimized load sharing control by means of thermal reliability management 
Power Electronics Specialists Conference 2004, Aachen, Germany 
Page: 37 – 42; Author: Carsten Nesgaard, Michael A.E. Andersen; Reference: [Ne03]; Chapter: 6 

Experimental Verification of the Thermal Droop Load Sharing 
Camera ready paper submitted for review at Applied Power Electronics Conference and Exposition 2005, Austin, TX, USA 
Page: 43 – 48; Author: Carsten Nesgaard, Michael A.E. Andersen; Reference: [Ne06]; Chapter: 7 

Topological reliability analysis of common front-end DC/DC converters for server applications 
Digest submitted for review at International Power Electronics Congress 2004, Celaya, México 
Page: 49 – 53; Author: Carsten Nesgaard, Michael A.E. Andersen; Reference: [Ne08]; Chapter: (2, 3) 

Report on Power System Reliability 
Concluding project report for the precision docking project, PATH 
Page: 54 – 70; Author: Carsten Nesgaard; Reference: [Ne07]; Chapter: 8 
 
 
Presentation material for some of the abovementioned publications can be found on the 
accompanying CD in the folder ‘Presentations’. 
 
 
 
 
Other Ph.D. presentations: 
 
Fejltolerante Power Systemer 
Project description published in ‘Alcatal’ – Alcatel Space Denmark’s internal newsletter (in Danish) 
 
Project status 
Presentation prepared for visit at UC Berkeley 
 
Small-signal converter modeling and frequency dependent behavior in controller synthesis 
Qualifying examination 
 
State space averaging in power electronics 
Presentation for internal seminar at International Rectifier 
 
Fault Tolerant Power Systems 
Project description for the Department of Electrical Power Engineering web-site 
 
Research at Alcatel Space Denmark 
Presentation, for prospective interns and graduate students, about Alcatel Space Denmark’s 
contributions to the current research project.  
 
Optimized load sharing control by means of thermal reliability management 
Presentation of a reliability enhancing invention – described in chapter ‘6 Load sharing’ 
 
 
Several of the abovementioned presentations are not included with this report.  
 
All publications can be found on the accompanying CD.  
 
 
An array-based study of increased 
system lifetime probability  
 
 
Carsten Nesgaard, Member IEEE 
Department of Electrical Power Engineering 
Technical University of Denmark 
2800  Kongens Lyngby 
Email: carsten@nesgaard.com 
 
 
 Abstract: Society’s increased dependence on electronic 
systems calls for highly reliable power supplies comprised of 
multiple converters working in parallel. This paper describes 
a redundancy control scheme based on array technology 
that increases the overall reliability considerably and 
thereby ensures a stable supply voltage. 
 
I.  INTRODUCTION 
 
 With the ever-increasing dependence on reliable 
electronic systems, the area of highly reliable power 
systems is rapidly expanding. When considering highly 
reliable fault tolerant power systems the word ‘redundancy’ 
comes to mind. Indeed, a true fault tolerant power system is 
comprised of several converters working in parallel. This 
paper describes part of an ongoing project of building a 
prototype of a fault tolerant power system with N + 2 
redundancy. The paper gives a short description of the 
mathematics used, the power system in question and the 
choice of control scheme. Since each of these short 
descriptions can form the basis for an entire paper, the focal 
point in this presentation is the redundancy control within 
the overall power system.  
 
II.  ARRAY-BASED LOGIC 
 
 Originating at the Department of Electric Power 
Engineering, the Technical University of Denmark in 1978 
with the paper ”Group Representations of Finite Polyvalent 
Logic – a Case Study Using APL Notation” by associate 
professor Ole I. Franksen [4], the array-based logic has 
evolved into an effective tool when dealing with 
combinatorial and/or configuration applications. The 
foundation of the technology is a geometrical 
representation of logic in terms of nested arrays. In other 
words, the array-based concept deals with data objects 
regarded as arrays [7]. Consequently, all calculations are 
performed on arrays which implies that systems comprised 
of large amounts of data often can be systematically 
simplified by the use of array theoretical operations. In 
general, the array-based logic can be considered to consist 
of the following three steps: 
 Step one is the establishment of a discrete n-dimensional 
configuration space using the Cartesian product, which 
ensures completeness. This is accomplished by the use of 
the tensor product OUTER and [8], which combines the 
system propositions and unites them to form one 
conjunctive proposition. 
 Step two is the inference by colligation, which is the 
operation of establishing the interconnections of the 
system. In other words this step finds the solutions that 
comply with the system constraints.  
 Step three is the determination of states by elimination 
of variables through an or-reduction.  
 Having introduced the concept of constraints, it is 
obvious that prior to completion of the above-mentioned 
steps, the constraints of the physical system must be 
translated into array theoretical terms. This is achieved 
through the use of propositional logic that transforms the 
system constraints into logic operations suitable for array 
theoretical implementation. An example of this translation 
and implementation is shown in section ‘V. ARRAY-BASED 
CONTROL SCHEME’. 
 Summarizing the above description the basic idea of 
array-based logic can be expressed, according to Franksen, 
as: 
 Array-based logic explores the consequences of 
considering truth-values as physical measurements. The 
aim is to formalize logic in accordance with the theoretical 
structure of discrete systems and express this formalization 
algebraically in array-theoretic terms. 
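
 The three steps listed above can be loosely illustrated in a general-purpose language. The sketch below is an assumption-laden analogy, not the Nial implementation used in this work: it enumerates the configuration space with a Cartesian product, keeps only the states that satisfy all constraints (standing in for colligation), and projects the result onto selected variables (standing in for the or-reduction). The two-converter example and all names are hypothetical.

```python
from itertools import product


def configuration_space(domains):
    """Step one: every combination of variable values (Cartesian product)."""
    names = list(domains)
    return [
        dict(zip(names, values))
        for values in product(*(domains[name] for name in names))
    ]


def colligate(space, constraints):
    """Step two: keep the states that comply with all system constraints."""
    return [state for state in space if all(rule(state) for rule in constraints)]


def or_reduce(valid_states, keep):
    """Step three: eliminate variables by projecting onto the ones to keep.

    A combination of the kept variables is legal if at least one valid state
    contains it, which corresponds to an OR over the eliminated variables.
    """
    return sorted({tuple(state[name] for name in keep) for state in valid_states})


# Hypothetical system: the output is powered if either converter works.
domains = {
    "a_ok": [False, True],
    "b_ok": [False, True],
    "out_ok": [False, True],
}
constraints = [lambda s: s["out_ok"] == (s["a_ok"] or s["b_ok"])]

space = configuration_space(domains)       # 8 candidate states
valid = colligate(space, constraints)      # 4 states satisfy the constraint
a_vs_out = or_reduce(valid, ["a_ok", "out_ok"])
```

 In this example the reduced relation between a_ok and out_ok contains every combination except a working converter A with an unpowered output, which is the state the constraint rules out.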
 
III.  REDUNDANCY AS RELIABILITY ENHANCEMENT  
 
 Increasing the reliability of a power system can be 
achieved in a number of ways, some more successful than 
others. A simple way of increasing the overall system 
reliability is using high quality components with low 
failure rates. Although the overall circuit complexity 
remains the same, the cost for high quality components is 
considerably higher than that of an implementation using 
commercial components. Also, the reliability gain using 
these high quality components is only moderate compared 
to other reliability enhancement techniques. 
Presented at Applied Power Electronics Conference and Exposition 2003, Miami, USA, February 2003 
 
 For the reasons mentioned above, by far the most 
common way of increasing the reliability is the use of 
redundancy. Once the choice of redundancy has been made 
the level of redundancy needed must be identified. Ranging 
from the component level all the way to the system level, 
the level of redundancy used in the power system in this 
presentation is what can be characterized as block level 
redundancy. 
 As with any system, a redundant power system has both 
advantages and drawbacks. Among the advantages is the 
possibility of a dramatic increase in reliability at the 
expense of an increase in system dependent parameters 
such as cost, mass, volume and circuit complexity. 
Although the increase in reliability can be quite high, added 
cost, mass, volume, and complexity are drawbacks that 
must be considered when deciding which approach to take 
during the design phase of the power system. However, the 
drawbacks tend to be less important nowadays, since 
system downtime in case of power failure often results in 
greater losses in sales, customer services etc. 
 When implementing system level redundancy in power 
systems, each converter board within the overall power 
system must be equipped with a front switch that allows for 
controlled shutdown of faulty converter boards, since this is 
the only way the power system integrity can be maintained. 
In other words the power system must exhibit failure free 
operation at the input as well as at the output. Due to this 
fact, most approaches to designing highly reliable power 
systems originate from the ability of the overall power 
system to shut down faulty units, whether these consist of 
converter boards, power distribution units etc. Focusing on 
the reliability of redundant systems it is noteworthy that 
making a single path system redundant generally increases 
the overall reliability by a factor equal to the reciprocal of the 
initial failure rate of the single path system, assuming the 
exponential distribution is valid and that the working 
conditions in both cases remain unchanged.  
 Now, suppose the redundant power system had the 
ability to reconfigure itself during operation. Such a system 
would further increase the reliability of the overall power 
system and at the same time reduce the maintenance 
requirements, since faulty units could be ‘replaced’ 
automatically. Because the system integrity and configuration 
are measured continuously during operation, the simple 
exponential distribution for reliability calculations does not 
do justice to the true reliability potential of a 
reconfigurable system. To obtain a more 
truthful measure of the system reliability, one has to adopt 
Markov modeling techniques, which are 
commonly used for modeling dynamic 
systems. Although Markov modeling often includes 
random noise errors within the sensor feedback system, it is 
for simplicity assumed that such errors do not occur within 
the power system at hand. The justification for making this 
simplification is that the focal point in this presentation is 
the redundancy control scheme. Furthermore, since the 
sensor feedback system is discrete with logic truth-values, 
the noise margin is quite large and random noise errors 
would be extremely rare in any case. A detailed 
description of the uses of Markov modeling can be found in the 
literature [6] and will not be dealt with in this presentation. 
 
IV.  THE POWER SYSTEM 
 
 The power system chosen for investigation is comprised of 
5 identical converter boards connected in parallel. As 
indicated in the introduction this is 2 converter boards more 
than needed. Thus, the system is N + 2 redundant. On a 
board level each converter is designed to shut down in case 
of a single point failure, whereas the overall power system 
can tolerate 2 failures and still provide the needed power. 
From a traditional power system point of view two failures 
reduce the overall power system from a system comprised of 
5 parallel-connected converter boards to a system comprised 
of the 3 converter boards needed to supply the required 
power to the load.  
 The proposed power system approaches the parallel 
connection of the individual converter boards in a way that 
differs significantly from what has just been described, by 
splitting the individual converter boards into 5 main blocks. 
Aligning these main blocks as shown in Figure 1 (ignoring 
the block ‘PWM controller’), each block connects to the 
previous block on the same converter board as well as to 
the previous blocks on the parallel-connected converter 
boards. This arrangement of multiple interconnections 
between individual blocks allows intelligent control of how 
blocks are combined, so that the maximum number of 
working converter boards is available at any given time. It 
should be noted that it is not an allowable state to have a 
block deliver power to more than one subsequent block. 
Thus, the first system constraint is that each block is 
connected to one and only one subsequent block. 
 
[Block diagram. Labels from Input to Output: Inrush control, On/Off switch, Filter, Power-switch, Transformer, Rectifier, S/C protection, Current sharing; interconnecting switches Switch 1A to Switch 1E; PWM controller with Feedback] 
 
Figure 1 : Block diagram for converter 1 
 
 The connecting devices are chosen to be electronically 
controlled switches but could in theory also be 
mechanically operated relays. The reason for choosing 
electronically controlled switches (hereafter referred to as 
switches) is the fact that activation of the individual 
interconnecting devices would occur during power system 
operation, which for long switching periods would require 
substantial amounts of capacitors at the output of the power 
system in order to comply with ripple voltage specification. 
Thus, the timing of the switching is of importance although 
not critical. A transition time of 0.1 ms is estimated to be 
reasonable. Compared to the 140 kHz, which has been 
chosen as the switching frequency for the individual 
converter boards, it is apparent that the transition times of 
the interconnecting switches are far from critical. 
 Even though the interconnecting switches are operated 
rarely, owing to the rather low failure rate of the electronic 
components used, and the transition from one state to 
another is fairly quick, the price paid for using extra 
switches as connecting devices between the different 
blocks within the overall power system is an increased 
failure rate for the individual converter boards and an 
increase in total conduction losses. Furthermore, the overall 
cost and complexity of the power system are increased due 
to the use of extra switches. However, if these drawbacks at 
the board level imply a better probability of continuous 
operation at a system level, the added cost, complexity and 
losses might be negligible compared to the gain in 
reliability. Also, it should be noted that obtaining a similar 
reliability for a power system comprised of individual 
converter boards without the interconnecting capability 
would require more converter boards, which adds to system 
parameters such as volume and mass.  
 As mentioned in section ‘III. REDUNDANCY AS 
RELIABILITY ENHANCEMENT’ the statistical description of 
the reliability of the power system should be carried out 
using the Markov modeling approach. However, from a 
system point of view it should be obvious that the process 
of combining two defective converter boards to form one 
working converter board increases the overall system 
performance concerning both reliability and efficiency. In 
Figure 2 it can be seen that by combining two converter 
boards, which have failed in different locations on the 
board level, an alternative path for the power throughput 
can be established. This lowers the stress on the original 3 
converter boards, since the load current now is shared 
among 4 converter boards. As a consequence the power 
system operating point on the efficiency curve tends to 
move towards the optimum operating point as shown in 
Figure 3. Also and most importantly, the overall power 
system reliability increases as a result of the newly 
configured power system comprised of 3 + 1 working 
converter boards. 
 
[Diagram: Converter 1 to Converter 5 drawn as rows between Input and Output, each divided into Block A to Block E (columns) and joined by interconnecting switches (Switch 1A to Switch 1D labelled on Converter 1); five blocks are marked ‘Failure’] 
 
Figure 2 : The 5 parallel connected converter boards 
 
 Referring to Figure 1 and Figure 2 this paragraph 
describes the abbreviations used to identify the individual 
blocks and switches within the power system. Starting with 
the blocks it can be seen from Figure 2 that these can be 
addressed using the converter number as a row 
identification and the block letter as column identification. 
Thus, the first faulty location in the power system shown in 
Figure 2 can be identified as: Converter 1, Block B.  
 Identification of the interconnection switches is 
accomplished through the adoption of the following 
notation: SXYZ  
Where S is the notation used in the software to identify a 
switch. X represents the converter number, Y represents the 
block prior to the switch in question and Z represents the 
switch position. Hence, the switch between block 1A and 
block 1B set in position 1 gets the identification S1A1. 
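
The SXYZ notation can be illustrated with a short Python sketch. The function names `switch_id` and `parse_switch_id` are hypothetical and are not taken from the Nial software; the sketch only assumes that every field is a single character, which holds for the 5 converter boards, the block letters and the 5 switch positions described here.

```python
def switch_id(converter, block, position):
    """Build an identifier of the form SXYZ.

    converter: converter number X (1 to 5)
    block:     letter Y of the block prior to the switch ("A", "B", ...)
    position:  switch position Z (1 to 5)
    """
    return f"S{converter}{block}{position}"


def parse_switch_id(name):
    """Split an identifier such as 'S1A1' into (converter, block, position)."""
    if len(name) != 4 or name[0] != "S":
        raise ValueError(f"not a switch identifier: {name!r}")
    return int(name[1]), name[2], int(name[3])


# The switch between block 1A and block 1B set in position 1:
print(switch_id(1, "A", 1))
print(parse_switch_id("S1A1"))
```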
 
[Plot: Efficiency versus Output current (axis marks 50%, 100% and Imax), with the ‘Preferable operating point’ and the ‘Operating point after two failures’ indicated] 
 
Figure 3 : Example of operating point movement 
 
 In order to provide feedback to the redundancy control 
system, each block in the overall power system can take on 
two different logic values - logic 1 for a working block and 
logic 0 for a faulty block. Since a faulty block is switched 
off and the redundancy control system continues to check 
the status of the overall power system, the logic state of any 
faulty block is latched. This ensures that the redundancy 
control system always gets the correct logic values from 
each block, even though the block in question has failed. 
Having retrieved all truth-values from the blocks within the 
power system an array containing the retrieved truth-values 
is generated. As will become apparent in section ‘V. 
ARRAY-BASED CONTROL SCHEME’ this array forms the basis 
for the calculation process as well as for the representation 
of the results. 
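
The latching behaviour described above can be sketched as follows. This is a minimal illustration in Python and not the implementation used in the project; the class name `BlockStatusArray` and its methods are assumptions made for the example.

```python
class BlockStatusArray:
    """5 x 5 array of truth-values: 1 = working block, 0 = faulty block.

    A fault is latched: once a block has reported a failure, a later
    'working' reading (for instance from a block that has been switched
    off) does not clear the stored 0.
    """

    def __init__(self, converters=5, blocks=5):
        self._status = [[1] * blocks for _ in range(converters)]

    def report(self, converter, block, working):
        """Store a reading; only faults change the stored state."""
        if not working:
            self._status[converter][block] = 0

    def snapshot(self):
        """Return a copy of the array handed to the redundancy control."""
        return [row[:] for row in self._status]


status = BlockStatusArray()
status.report(0, 1, working=False)  # Converter 1, Block B fails
status.report(0, 1, working=True)   # later reading is ignored, fault stays latched
for row in status.snapshot():
    print(row)
```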
 Turning the attention towards the operation of the power 
system the following description represents the actions 
taken by the redundancy control in case of fault occurrence. 
Assuming a well functioning structure as the starting point, 
the power system consists of 5 inputs and 5 outputs. After a 
failure within the power system, the redundancy control 
shuts down the blocks associated with the faulty block and 
leaves the power system comprised of 4 inputs and 4 
outputs. Except for the faulty block the rest of the inactive 
blocks now serve as cold spares for the power system in 
case of further failures. Now suppose a second fault occurs, 
for instance due to the increased stress on the remaining 
active converter boards. Since two faults have occurred it 
might now be possible to establish an alternative path 
through the power system and thus increase the number of 
active converter boards from 3 to 4. The only situation 
that makes the establishment of an alternative path 
impossible is when the two faults have occurred within 
the same block, that is, in the same block column on two 
different converter boards. In this case the power system 
would consist of only 3 converter boards, which is the 
minimum number possible in order to sustain power 
delivery to the load.  
 Depending on the failure rate of the individual blocks it 
would be a rare situation that two successive faults occur in 
the same block, hence the probability of successful system 
reconfiguration is quite high. This indicates that the overall 
system reliability has increased compared to the situation 
with 5 separate converter boards. 
 Based on this short description of the power system and 
its operation, the mathematical task of the redundancy 
control can be thought of as a method of finding alternative 
paths through the power system in case of fault occurrence. 
 
V. ARRAY-BASED CONTROL SCHEME 
 
 As described in section ‘II. ARRAY-BASED LOGIC’ the 
mathematical foundation is a consideration of truth-values 
as physical measurements. The truth-values in the 
application at hand are the discrete values obtained from 
each block in the power system, upon which the alternative 
path from input to output is calculated. With reference to 
Figure 2 it can be seen that the discrete values obtained 
from the individual blocks can be considered as an array of 
5 rows and 5 columns. This array is a measurement of the 
condition of the overall power system and can therefore be 
used to identify problems within the system. Based on this 
identification of problems, the array-based analysis 
suggests possible alternative paths through the power 
system. It should be noted that similar results could have 
been obtained by using standard digital logic. The reason 
for not implementing the redundancy control using this 
type of logic is the powerful array concept and the 
operations in array theory, which make it easy to expand 
the redundancy control scheme to include an arbitrary 
number of converter boards and switches. Thus, a formal 
description of a redundant power system comprised of any 
number of parallel-connected converter boards is 
straightforward, since the added system constraints are 
almost replicas of existing board level constraints. A similar 
implementation using standard digital logic would require 
considerable recalculation of the power system’s 
interconnections. 
 With reference to Figure  and Figure  it can be seen that 
the number of parameters needed to describe the power 
system in question is relatively large. For this reason most 
system constraints have been omitted in this presentation, 
although an example will be given on the following pages. 
 Having introduced the fundamentals of the power 
system, the array-based analysis can be carried out. The 
mathematical tool used in this project is based on the array 
theory developed by Dr. Trenchard More in the 1970’s and 
later (early 1980’s) implemented in the array-based 
software ‘Queens Nested Interactive Language (Q’Nial)’ 
by Professor Michael Jenkins. 
 Solving the problem at hand by establishing a 
generalized configuration space by the use of the Cartesian 
product of all system parameters would require a 
tremendous amount of computer memory, since the number 
of possible combinations exceeds 10^30. A different 
approach has therefore been pursued. Using the allowable 
positions for each switch, an algorithm has been developed 
that in a successive way finds a feasible solution within a 
given amount of time.  
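
The order of magnitude can be checked with a few lines of Python. The count below is an assumption made for illustration: it takes the 20 interconnecting switches of the result array (5 converter boards with 4 switches each) and 5 binary position variables per switch, which may not be the exact parameter set used in the original analysis.

```python
# Assumed parameter count (not stated explicitly in the text):
converters = 5
switches_per_converter = 4      # switches between blocks A-B, B-C, C-D and D-E
positions_per_switch = 5        # one binary variable per switch position

n_switches = converters * switches_per_converter
n_variables = n_switches * positions_per_switch

# Full Cartesian product of all binary variables
full_space = 2 ** n_variables

# Space that remains if each switch is limited to its 6 allowable positions
reduced_space = 6 ** n_switches

print(f"binary variables : {n_variables}")
print(f"full space       : {full_space:.3e}")
print(f"reduced space    : {reduced_space:.3e}")
```

Under these assumptions the full space has 2^100 elements, which is slightly above 10^30, while restricting every switch to its allowable positions before combining them leaves a far smaller space.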
 Looking at the power system and the tasks of the 
redundancy control from a topological point of view it 
should be noted that the system’s topological array has 
characteristics similar to that of the incidence matrix 
describing electric networks within the field of graph 
theory. The reason the topological array only has similar 
and not identical characteristics to the incidence matrix is 
due to the unidirectional flow of power through the system. 
The classical approach in electric network theory using 
arrays is the bi-directional power flow that uses the 
numbers 0, 1 and  –1 to identify the flow direction. Since 
the power system at hand only allows power to flow in one 
direction the closest match to the incidence matrix is the 
use of unidirectional circuit elements such as 
semiconductor devices within the electric network itself. 
From a mathematical point of view this adds considerably 
to the complexity of the system when performing reliability 
calculations, since the system now includes multiple failure 
modes for each block. Also, it should be noted that the 
analysis assumes a constraint between the blocks ‘PWM 
controller’ and ‘Power-switch’. This constraint ensures a 
correct connection between the driving ‘PWM controller’ 
and the semiconductor device making up the switching 
element of the converter.  
 As mentioned above the starting point in the 
development of the software was a series of system 
constraints that would limit the number of allowable switch 
combinations. The implementation example used in this 
presentation is the constraint saying that a switch set in 
position 1 cannot be set in any other position. Describing 
this constraint with a ‘logic-level’ explanation, one obtains: 
 
If   s1a1 = 1  Then  s1a2 = s1a3 = s1a4 = s1a5 = 0 
 
 This description must now be converted into Nial terms, 
which results in the following lines of source code: 
 
a2:= OUTER <=s1a1 (not s1a2);   
a3:= OUTER <=s1a1 (not s1a3); 
a4:= OUTER <=s1a1 (not s1a4);   
a5:= OUTER <=s1a1 (not s1a5); 
 
 Noting the replication of parameter s1a1, the proposition 
must be united through the operation of colligation: 
 
(0 2 4 6) (1) (3) (5) (7) fuse (OUTER and a2 a3 a4 a5)  
 
 Following the above procedure, the remaining system 
constraints can be added to the source code. After a few 
transformations a list of allowable switch positions based 
on the system constraints can be obtained. Note that the 
initial 2^5 = 32 combinatorial switch positions for any 
given switch have now been reduced to the 6 allowable 
switch positions that comply with the system constraints: 
 
+-----+-----+-----+-----+-----+-----+
|ooooo|ooool|ooolo|ooloo|olooo|loooo|          (1)
+-----+-----+-----+-----+-----+-----+
 
 The first entry from the left in (1) is the NULL solution 
where the block subsequent to the switch in question is 
disconnected from all blocks within the power system. The 
entry to the right in (1) is the notation used for switch 
position 1. The second entry from the right is the notation 
used for switch position 2 and so forth. As an example the 
entry to the right in (1) indicates that the left-hand side of 
the switch in question is connected to Converter 1, 
regardless of which converter board the right-hand side of 
the switch is connected to. Thus, an ‘o’ (logic 0) in any entry 
in (1) indicates a disconnection whereas an ‘l’ (logic 1) indicates a connection 
between two blocks. 
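
The reduction from 32 combinations to the 6 entries in (1) can be reproduced by brute force. The sketch below is written in Python rather than Nial and does not use colligation; it simply enumerates all combinations of 5 binary position variables and keeps those in which at most one position is set, which is the combined effect of applying the constraint above to every position. The function name `allowable_positions` is hypothetical.

```python
from itertools import product


def allowable_positions(n_positions=5):
    """Enumerate all 2**n combinations and keep those with at most one 'l'."""
    allowed = []
    for bits in product((0, 1), repeat=n_positions):
        if sum(bits) <= 1:  # a switch set in one position cannot be set in another
            allowed.append("".join("l" if b else "o" for b in bits))
    return allowed


all_combinations = len(list(product((0, 1), repeat=5)))
positions = allowable_positions()
print(all_combinations, "combinations reduced to", len(positions))
print(positions)
```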
 Since the software program decides which blocks to 
interconnect at all times, the power system would under 
normal circumstances interconnect the individual blocks in a 
way that enables power throughput within the physical 
boundaries of each converter board as shown in Figure 2. 
In case of multiple failures the algorithm would find a way 
through the system that ensures a maximum number of 
working converter boards. In other words, allowing the 
algorithm to decide which blocks to interconnect, the 
overall power system is no longer comprised of 5 
individual converter boards with interconnecting switches, 
but 5 times 5 blocks that can be interconnected in an 
enormous number of ways. This gives rise to the previously 
mentioned increase in reliability. As a consequence, a highly 
reliable power system can be built with lower volume and 
mass than conventional power systems, but at the expense 
of increased circuit complexity and considerably higher 
cost. 
 As an example of the capabilities of the algorithm, let the 
power system suffer from 8 faults located in different 
places throughout the power system, a condition from 
which a power system comprised of standard 
parallel-connected converter boards could not recover. As seen 
from Figure 4 each row in the power system has suffered at 
least one failure. 
 
   
 
Figure 4 : 8 faults distributed among all 5 converters 
 
 The array to the right in Figure 4 is the system truth-
values as they are entered into the system array for 
calculation purposes. The software program now performs 
the following tasks in order to establish a path between the 
first two blocks.  
The starting point is the answer to the following question:  
 Does the combination of 1 1 occur in the first two 
columns in the matrix shown to the right in Figure 4? In 
Nial terms this can be expressed in a very compact form: 
 
Q:= ((0 pick (cols AA) EACHLEFT = ll) link o) 
lolloo 
 
 The result is shown as truth-values. It should be noted 
that due to the number of allowable switch positions a 
falsehood has been attached to the end of the result. 
 Assigning the correct switch positions to the entries that 
returned truth is completed through the operation ‘sublist’. 
 
Y:= Q sublist (reverse Res_1) 
 
+-----+-----+-----+
|loooo|ooloo|ooolo|
+-----+-----+-----+
 
 In order to insert the correct switch positions into the 
result array, the positions that returned truth must be 
identified in Index origin 0. 
 
Index:= EACH first (Y EACHLEFT  
 sublist tell (first shape AA)) 
 
0 2 3 
 
 Finally, the assigned switch positions are inserted into 
the result array by using the operation ‘placeall’. 
 
Y (cart Index 0) placeall AA 
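
For readers unfamiliar with Nial, the first step can be paraphrased in Python. This is a sketch of the same idea and not a translation of the Nial operations: the helper names are hypothetical, the result is written into a separate array instead of into AA, and the array AA is the 5 x 5 array of truth-values printed in the source for the 8-fault example.

```python
# Truth-values of the 8-fault example (rows: Converter 1 to 5, columns: Block A to E)
AA = [
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
]
NULL = "ooooo"  # switch disconnected from all blocks


def one_hot(converter_index, n=5):
    """Switch position notation: 'l' at the converter the switch connects to."""
    return "".join("l" if i == converter_index else "o" for i in range(n))


# Does the combination 1 1 occur in the first two columns of a row?
q = [row[0] == 1 and row[1] == 1 for row in AA]

# Rows that returned truth, in index origin 0, and their switch positions
index = [r for r, ok in enumerate(q) if ok]
y = [one_hot(r) for r in index]

# Insert the switch positions into the first column of a result array
result = [[NULL] * 4 for _ in AA]
for r, code in zip(index, y):
    result[r][0] = code

print(index)
print(y)
```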
 
 Following a similar procedure the rest of the truth-values 
in the array shown to the right in Figure 4 are replaced by 
feasible switch positions. The resulting array for the case of 
8 faults distributed among all 5 converters is shown in 
Figure 5. 
 
+-----+-----+-----+-----+
|loooo|ooooo|ooooo|ooloo|
+-----+-----+-----+-----+
|ooooo|loooo|ooooo|ooolo|
+-----+-----+-----+-----+
|ooloo|ooooo|olooo|ooool|
+-----+-----+-----+-----+
|ooolo|ooloo|ooolo|ooooo|
+-----+-----+-----+-----+
|ooooo|ooolo|ooool|ooooo|
+-----+-----+-----+-----+
 
Figure 5 : Result array 
 
System truth-values (the array to the right in Figure 4): 
 
1 1 0 0 1 
0 0 1 0 1 
1 1 0 1 1 
1 1 1 1 0 
0 1 1 1 1 
 
 Comparing the array shown to the right in Figure 4 with 
the result array shown in Figure 5 it is obvious that the two 
arrays are linked through a transformation array. Looking 
at the axes of the two arrays it can be seen that the 
transformation array is the previously mentioned incidence 
matrix for electric networks.  
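
The complete path search for the 8-fault example can be illustrated with a simple pairing rule. The Python sketch below is not the algorithm of the Nial software: it pairs, column by column, the k-th block that currently carries power with the k-th working block of the next column, and it ignores the constraint between ‘PWM controller’ and ‘Power-switch’. For the truth-values of the example this simplified rule happens to reproduce the result array of Figure 5; the function name `route` is hypothetical.

```python
AA = [
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
]
NULL = "ooooo"


def one_hot(converter_index, n=5):
    return "".join("l" if i == converter_index else "o" for i in range(n))


def route(aa):
    """Return (result array, number of complete paths from input to output).

    result[r][c] holds the position of the switch in front of block c + 1 on
    converter r, i.e. which converter's block c feeds it.
    """
    n_rows, n_cols = len(aa), len(aa[0])
    result = [[NULL] * (n_cols - 1) for _ in range(n_rows)]
    # Blocks in the first column that work and therefore carry power
    live = [r for r in range(n_rows) if aa[r][0] == 1]
    for col in range(1, n_cols):
        working = [r for r in range(n_rows) if aa[r][col] == 1]
        # One upstream block feeds one and only one subsequent block
        pairs = list(zip(live, working))
        for src, dst in pairs:
            result[dst][col - 1] = one_hot(src)
        live = [dst for _, dst in pairs]
    return result, len(live)


result, n_paths = route(AA)
for row in result:
    print("|" + "|".join(row) + "|")
print("working paths:", n_paths)
```

With 8 faults the sketch still finds 3 complete paths, which is the minimum number of converter boards needed to supply the load.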
 
VI. RELIABILITY ASSESSMENT 
 
 As indicated in section ‘III. REDUNDANCY AS 
RELIABILITY ENHANCEMENT’ the reliability of the power 
system at hand is best found using the Markov Modeling 
approach. However, due to the large number of states in 
which the power system can reside, the calculations become 
extremely complicated, resulting in the loss of any insight into 
the relation between the survivability of each block and its 
impact on the overall system. Changing the viewpoint from 
dynamic parts level redundancy to system level 
survivability makes it possible to express the overall 
system performance concerning reliability as a function of 
time. It should be noted that this approach does not provide 
any system information during the transition from one state to 
another. However, in most cases the figure of merit 
relevant to customers is the probability of system 
survival within the expected system lifetime. For this 
reason the proposed system level approach will be utilized. 
 When considering reliability assessment of systems 
several evaluation techniques are applicable. However, due 
to the complicated interconnection of the individual blocks 
within the power system a generalized approach focusing 
on a formal system description by means of block 
reliabilities is desirable. Two approaches comply with the 
latter desire - event trees and connection matrix techniques. 
Since the power system at hand is comprised of a rather 
large number of blocks the event tree approach quickly 
becomes too complex. In contrast the connection matrix 
technique establishes a matrix representing power flow 
between system nodes by means of block reliabilities. Thus, 
the obvious approach is the connection matrix technique, 
which will be used throughout the remainder of this 
presentation. 
 Figure 6 shows a cross section of the power system 
found in Figure 2.  
[Diagram: two Input lines; blocks A, B, F, G, H and I; nodes 1 to 4] 
 
Figure 6 : Cross section of Figure 2 
 
Representing both power flow and system nodes the green 
arrows in Figure 6 are the basis for the connection matrix 
technique. It should be noted that flow from one block to 
another is unidirectional whereas the flow to and from a 
switch is bidirectional. The blocks interconnecting the 
individual nodes are characterized by their probability of 
providing fault free operation for a specified period of time. 
The establishment of the connection matrix is now 
straightforward as the entries of the matrix are the 
probabilities for each block interconnecting two adjacent 
nodes. Figure 7 shows the technique applied to the blocks and 
nodes found in Figure 6. 
 
        1  2  3  4
   1    1  A  0  0
   2    0  1  B  G
   3    0  0  1  0
   4    0  G  0  1
 
Figure 7 : Connection matrix for 
 
 In Figure 7 the blue circles define the unidirectional flow 
between node 1 and 2 (block A) while the red circles define 
the bidirectional flow between node 2 and 4 (block G - a 
switch). 
 Having established the entire connection matrix the next 
step is either node removal through sequential reduction or 
matrix multiplication. The latter method is the easiest to 
apply and is therefore preferred. Application of the 
matrix multiplication is straightforward as the basic 
connection matrix is multiplied by itself a number of times 
until the resulting matrix remains unchanged. The 
transmission from input to output is now derived from the 
matrix as the entry found in row ‘node 1’ and the column 
containing the output node.  
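
The repeated multiplication can be sketched symbolically in Python. The example below is an illustration under stated assumptions and not the calculation performed in the project: it uses only the four nodes and the blocks A, B and G that can be read from Figure 7, it represents every matrix entry as a set of path sets, and it applies the Boolean rules X·X = X and X + X·Y = X during multiplication. The names `multiply`, `absorb` and `closure` are hypothetical.

```python
ZERO = frozenset()               # no connection
ONE = frozenset({frozenset()})   # direct connection without any block


def block(name):
    """Matrix entry for a single block between two adjacent nodes."""
    return frozenset({frozenset({name})})


def absorb(terms):
    """Boolean absorption: drop every path set that contains a smaller one."""
    return frozenset(t for t in terms if not any(o < t for o in terms))


def multiply(m1, m2):
    """Boolean matrix product; a product of blocks is the union of their names."""
    n = len(m1)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            terms = set()
            for k in range(n):
                for a in m1[i][k]:
                    for b in m2[k][j]:
                        terms.add(a | b)
            row.append(absorb(terms))
        out.append(row)
    return out


def closure(m):
    """Multiply the matrix by itself until the result remains unchanged."""
    while True:
        nxt = multiply(m, m)
        if nxt == m:
            return m
        m = nxt


A, B, G = block("A"), block("B"), block("G")
M = [
    [ONE,  A,    ZERO, ZERO],  # node 1
    [ZERO, ONE,  B,    G],     # node 2
    [ZERO, ZERO, ONE,  ZERO],  # node 3
    [ZERO, G,    ZERO, ONE],   # node 4
]
T = closure(M)
print("node 1 -> node 3:", [sorted(p) for p in T[0][2]])
print("node 1 -> node 4:", [sorted(p) for p in T[0][3]])
```

Each entry of the final matrix lists the sets of blocks that must all work for power to flow between the two nodes; the transmission probability would then be evaluated from these path sets using the block reliabilities.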
 Based on the solution derived from the connection matrix 
and on the assumption that the block failure rates 
throughout the power system are identical the following 
system probability equation can be established:   
 
 
\[ P_{\text{Array-based}} = e^{-\left(\frac{\lambda}{5} + \lambda_{\text{Switch}}\right) \cdot t} \tag{3} \]
 
 λSwitch is the failure rate of each switch and λ is the 
overall failure rate for each converter board. By means of 
the exponential distribution the probability of system 
survival of a traditional redundant power system can be 
found as: 
 
\[ P_{\text{Traditional}} = e^{-\lambda \cdot t} \tag{4} \]
 
 Comparing (3) and (4) it can be seen that the difference 
is the exponent. As will become apparent this difference is 
of great importance when considering redundant systems.  
 
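By means of the probabilities found in (3) and (4) the 
binomial expansion for the reliability of an N+2 redundant 
system (at least 3 out of 5 converter boards working) can be 
established: 
 
\[ R_{\text{Traditional}} = 10 \cdot e^{-3 \cdot \lambda \cdot t} - 15 \cdot e^{-4 \cdot \lambda \cdot t} + 6 \cdot e^{-5 \cdot \lambda \cdot t} \tag{5} \]
 
\[ R_{\text{Array-based}} = 10 \cdot e^{-3 \cdot \kappa} - 15 \cdot e^{-4 \cdot \kappa} + 6 \cdot e^{-5 \cdot \kappa} \tag{6} \]
 
where κ is equal to: 
 
\[ \kappa = \left(\frac{\lambda}{5} + \lambda_{\text{Switch}}\right) \cdot t \tag{7} \]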
 
 Plotting the two equations reveals the probability of 
system survival for a given period of time as a 
function of overall converter board failure rate.  
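
The coefficients 10, 15 and 6 follow from the binomial distribution. As a short worked derivation, assume that the 5 converter boards fail independently, that each survives with the same probability P, and that the system survives when at least 3 boards work:

```latex
\begin{align*}
R &= \sum_{k=3}^{5} \binom{5}{k} P^{k} (1-P)^{5-k} \\
  &= 10 P^{3} (1-P)^{2} + 5 P^{4} (1-P) + P^{5} \\
  &= 10 P^{3} - 20 P^{4} + 10 P^{5} + 5 P^{4} - 5 P^{5} + P^{5} \\
  &= 10 P^{3} - 15 P^{4} + 6 P^{5}
\end{align*}
```

Inserting the survival probability of (4) for P gives the reliability of the traditional system, and inserting that of (3) gives the reliability of the array-based system.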
 
[Plot: Probability (P), axis marks 0.2 to 1.0, versus Failure rate (FIT), axis marks 2·10^5 to 12·10^5] 
 
Figure 8 : Probability of system survival 
 
 The red line in Figure 8 is the probability of system 
survival for the array-based approach while the green line 
shows the probability of system survival for a traditional 
redundant power system.  
 It should be noted that the reliability of the array-based 
approach is worse at converter board failure rates below the 
switch failure rate plus one fifth the converter board failure 
rate. The boundary between the two reliability scenarios 
can in mathematical terms be expressed as: 
 
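\[ -\left(\frac{\lambda}{5} + \lambda_{\text{Switch}}\right) \cdot t = -\lambda \cdot t \quad \Rightarrow \quad \lambda = \frac{5}{4} \cdot \lambda_{\text{Switch}} \tag{8} \]
 
 Thus, at converter board failure rates below the value 
given in (8) the traditional approach would be preferable. 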
However, the converter board failure rate for any power 
system would by far exceed the failure rate of a single 
switch. For this reason it can be concluded that the array-
based approach indeed increases the overall reliability of 
the proposed power system configuration. 
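
The comparison can be reproduced numerically. The Python sketch below evaluates the two reliability expressions and the boundary between them; the failure rates and the operating time are arbitrary illustration values (in failures per hour and in hours) and are not data from the original work.

```python
import math


def r_3_of_5(p):
    """Reliability of a system that needs at least 3 of 5 identical units."""
    return 10 * p**3 - 15 * p**4 + 6 * p**5


def r_traditional(lam, t):
    """Traditional N+2 system: each board survives with exp(-lam * t)."""
    return r_3_of_5(math.exp(-lam * t))


def r_array_based(lam, lam_switch, t):
    """Array-based system: exponent (lam / 5 + lam_switch) * t."""
    return r_3_of_5(math.exp(-(lam / 5 + lam_switch) * t))


# Illustration values only (assumptions)
lam_switch = 4e-6   # switch failure rate per hour
t = 1e4             # operating time in hours

boundary = 5 / 4 * lam_switch
for lam in (0.2 * boundary, boundary, 20 * boundary):
    print(f"lambda = {lam:.2e}: "
          f"traditional = {r_traditional(lam, t):.6f}, "
          f"array-based = {r_array_based(lam, lam_switch, t):.6f}")
```

Below the boundary the traditional system has the higher reliability, at the boundary the two expressions coincide, and above it the array-based system has the higher reliability.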
 
 
 
VII.  CONCLUDING REMARKS 
 
 An alternative approach in the design of reliable power 
systems has been presented. Based on statistical 
calculations using, among others, the exponential 
distribution it has been shown that redundancy is the tool to 
implement when considering highly reliable power systems.  
 Also, a control scheme for the redundancy control of the 
power system has been presented. Using the array-based 
logic a well functioning system capable of establishing the 
maximum number of working converter boards possible 
has been implemented.  
 Finally, an assessment of the overall gain in power 
system reliability has been performed. This assessment 
showed that a considerable increase in system survivability 
is possible when the proposed array-based control and 
implementation technique is applied. 
 
ACKNOWLEDGMENT 
 
 The author would like to thank Associate Professor, 
Ph.D. Peter Falster, Department of Electric Power 
Engineering, Technical University of Denmark and Senior 
Designer Henrik Møller, Alcatel Space Denmark for their 
support during this work. 
 
REFERENCES 
 
[1] Military Handbook (MIL-HDBK-217): Reliability Prediction of 
Electronic Equipment. 
 
[2] Mike Jenkins and Peter Falster: “Array Theory and Nial”, report 
1999, Department of Electric Power Engineering, Technical 
University of Denmark. 
 
[3] Gert L. Møller: “On the technology of array-based logic”, Ph.D. 
thesis 1995, Department of Electric Power Engineering, Technical 
University of Denmark. 
 
[4] Ole Immanuel Franksen: ”Group Representations of Finite 
Polyvalent Logic – a Case Study Using APL Notation”, IFAC VII 
World Congress, Helsinki, June 1978. 
 
[5] Ole Immanuel Franksen and Peter Falster: “Colligation or, the logic 
inference of interconnection”, Mathematics and Computers in 
Simulation 52 (2000) 1-9. 
 
[6] Bruce K. Walker: “Evaluating Performance and Reliability of 
Automatically Reconfigurable Aerospace Systems Using Markov 
Modeling Techniques”, Department of Aerospace Engineering & 
Engineering Mechanics, University of Cincinnati, OH, USA. 
 
[7] Trenchard More: “Notes on the Diagrams, Logic and Operations of 
Array Theory”, Structures and operations in Engineering and 
Management Systems, The second Lerchendal Book, Tapir 
Publishers. 
 
[8] Mike A. Jenkins: “Q’Nial Reference Manual”, Nial Systems 
Limited, Kingston, Ontario, Canada, 1985. http://www.nial.com 
 
 
 
Digitally Controlled Converter with Dynamic  
Change of Control Law and Power Throughput 
 
 
Carsten Nesgaard 
Technical University of Denmark 
DK-2800  Kongens Lyngby 
E-mail: cn@oersted.dtu.dk 
 
Michael A. E. Andersen 
Technical University of Denmark 
DK-2800  Kongens Lyngby 
E-mail: ma@oersted.dtu.dk 
 
Nils Nielsen 
Technical University of Denmark 
DK-2800  Kongens Lyngby 
E-mail: nni@oersted.dtu.dk 
 
 
Abstract: With the continuous development of faster and 
cheaper microprocessors the field of applications for digital 
control is constantly expanding. Based on this trend the 
paper at hand describes the analysis and implementation of 
multiple control laws within the same controller. Also 
implemented within the control algorithm is a thermal 
monitoring scheme used for assessment of safe converter 
power throughput. An added benefit of this thermal 
monitoring is the possibility of software implemented analytic 
redundancy, which improves system fault resilience. Finally, 
reliability issues concerning the substitution of analog 
controllers with their digital counterparts are considered. 
 The outline of the paper is divided into two segments – 
the first being an experimental analysis of the timing 
behavior by means of code optimization – the second being an 
examination of the dynamics of incorporating two control 
laws using multiple control parameters. 
 
I.  INTRODUCTION 
 
 The continuous evolution of ever faster processors at 
ever increasing performance levels provides a basis for 
application areas ranging from monitoring and supervisory 
circuitry to high speed redundancy control of complex 
power systems [1]. Furthermore, these benefits come at 
ever lower costs, thus making digital control an 
attractive alternative to the traditional analog control 
circuits found in most power electronics applications 
today. 
 The very fact that power electronics has benefited 
greatly from the continuing development in 
microprocessor technology is seen by the numerous papers 
and articles describing among other things the 
implementation of promising new control schemes.  
 This paper examines the possibilities of implementing 
two different control laws within the same low cost 
controller. The change in control law is dynamic and is 
based on the continuous measurement of key parameters. 
The control laws between which the microcontroller 
alternates are a standard pulse width modulation (PWM) 
control law and the simpler pulse skipping (PS) control law. 
Since switching losses are dominant at no load and light 
loads in most converter topologies, the controller applies 
the PS control law when operated in this region. At heavier 
loads and full load the total converter losses are a 
combination of switching losses and conduction losses in 
various converter components. Therefore the PWM control 
law is applied to the converter in this region. 
 Also included in the converter control algorithm is a 
continuous measurement of converter worst-case 
temperature for assessment of converter power throughput 
capability. It is well known that the internal operating 
temperature depends on ambient temperature, power losses 
within the converter and the thermal capabilities of the 
heatsinks used. It is therefore of vital importance for the 
continuous supply of power that the worst-case 
temperature is measured on a regular basis. Based on loss 
analysis the worst-case temperature for the test converter 
is found at the switching transistor, for which reason the 
temperature-monitoring device is mounted on the transistor 
case. A description of the thermal aspects as well as the 
dual implementation of the control algorithm is provided in 
section ‘IV. CONTROL ALGORITHM AND 
MONITORING’. 
 When substituting analog controllers with digital 
equivalents a number of reliability issues must be 
considered. This topic is discussed in section ‘VI. 
RELIABILITY’.  
 Following the reliability discussion, section ‘VII. 
EXPERIMENTAL RESULTS’ provides measurements 
verifying successful implementation of the proposed 
control algorithm. 
 Finally, section ‘VIII. FURTHER WORK’ gives a 
description of the research work done since the original 
digest submission as well as ideas for further work within 
the field of digital controllers in DC/DC converter 
applications.  
 
II.  THE CONTROLLER 
 
 The controller used in the test setup is the 8-bit RISC 
PIC16F877 microcontroller from Microchip [2]. Its core 
features relevant to the application at hand include: 
 
  8K x 14-bit words of flash program memory 
  256 bytes of EEPROM data memory 
  10-bit PWM module 
  8-channel 10-bit A/D converter 
  Single-cycle instructions 
  20 MHz clock frequency 
Presented at Power Electronics Specialists Conference 2003, Acapulco, Mexico, June 2003
 
 These features make the PIC16F877 well suited for 
controlling a DC/DC converter. This, combined with the 
low power consumption and the high-quality development 
tools provided by the device manufacturer, made it the 
microcontroller of choice. 
 Based on the equations in the device datasheet for 
calculating the acquisition time of a single A/D 
conversion, an average value of 20.8 µs for each analog 
parameter conversion can be established. Due to the low 
bandwidth that results from this rather long acquisition 
time, converter control and shifts in control law based 
primarily on periodically sampled analog parameters are to 
be avoided. A trade-off between execution speed, 
precision, complexity and cost is therefore inevitable. If 
the acquisition time of the A/D converter within the 
microcontroller cannot be accepted due to lack of 
execution speed, external A/D conversion circuitry can be 
added. Unfortunately this adds to both the cost and the 
complexity of the overall system. Since the focal point of 
this presentation is an examination of the capabilities of 
the PIC microcontroller in converter control, no external 
circuitry is added. 
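
 The bandwidth limitation can be illustrated with a short calculation. The sketch below is only a rough upper bound: it assumes the four analog parameters are converted one after another and ignores all other program overhead, neither of which is stated in the paper. The function name is chosen for illustration.

```python
# Rough sampling budget for the on-chip A/D converter.
# Assumption: channels are converted sequentially with no other overhead.
T_CONVERSION_S = 20.8e-6   # average time per analog conversion (from the text)
N_CHANNELS = 4             # input/output current and input/output voltage


def max_scan_rate_hz(t_conv_s: float, n_channels: int) -> float:
    """Upper bound on how often n_channels can be sampled in sequence."""
    return 1.0 / (t_conv_s * n_channels)


single_channel_hz = max_scan_rate_hz(T_CONVERSION_S, 1)
full_scan_hz = max_scan_rate_hz(T_CONVERSION_S, N_CHANNELS)

print(f"one channel  : {single_channel_hz / 1e3:.1f} kHz")
print(f"four channels: {full_scan_hz / 1e3:.1f} kHz")
```

 Under these assumptions even the best-case scan rate for all four parameters is far below the switching frequency, which is why control based purely on sampled analog values is avoided.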
 
III.  TEST CIRCUIT 
 
 The test circuit is comprised of a 5W BUCK converter 
designed for continuous conduction mode, a number of 
measurement circuits and a PIC microcontroller clocked at 
20 MHz. A graphical representation of the test circuit is 
seen in Fig. 1. 
 
[Block diagram: 12V input, power switch, filter and 5V output (1A max); the PIC16F877 microcontroller measures temperature, input current, input voltage, output current and output voltage and generates the duty cycle]
 
Fig. 1 : Test setup 
 
 The BUCK converter, which is represented by the 
blocks ‘Power switch’ and ‘Filter’, is comprised of a 200 
µH inductor, a 1 mF output filter capacitor, a 1N5811 
Schottky free-wheeling diode and an IRF9530 power 
MOSFET. Voltage sensing is achieved by means of a 
simple voltage divider, while current sensing is achieved 
by means of a sensing resistor. These simple techniques 
have the disadvantage of resulting in lower overall 
converter efficiency than is obtainable with more 
sophisticated techniques. However, the focal point of 
this presentation remains clear regardless of which 
technique is used.  
 Finally, the temperature-sensing device is a 2-wire 
digital thermometer from Dallas Semiconductor/Maxim. 
It is equipped with an I2C interface suitable for 
communication with the microcontroller. 
 
IV.  CONTROL ALGORITHM AND MONITORING 
 
 The control algorithm and monitoring functions are 
implemented using the development software MPLAB 
from Microchip. Although MPLAB allows for generation 
of source code using standard C, the source code for the 
control algorithm implemented in the test converter is 
written in assembler because this yields more optimized 
code. 
 An initial implementation based on subroutines called 
from the main program revealed that the sample timing 
depended on both the number and the type of instructions 
executed. Since this is an unacceptable situation in a 
DC/DC converter, an interrupt based program structure is 
used for control parameter sampling, which gives faster 
and more precise code execution.  
 The determination of which control law to apply at any 
given time is based on a set of measurements of the input 
current, input voltage, output current and output voltage. 
Of these four parameters, all of which can be identified in 
Fig. 1, the output voltage is the most important, since the 
generation of a proper Duty-cycle depends solely on it. 
The change in control law from PWM to PS occurs when 
the power level decreases below 1.5W, while the change 
from PS to PWM occurs when the power level increases 
above 2W. Thus, the control algorithm incorporates 
hysteresis, which prevents oscillation between the two 
control laws. Detailed calculations show that the optimum 
operating point for changing control law is 1.85W (equal 
to 370mA). However, using this single point of the 
operating curve as the mark for control law changes 
results in oscillatory behavior when the converter supplies 
load currents that vary slightly around 370mA. This 
oscillatory behavior results in increased high frequency 
noise and deteriorated dynamic converter performance, 
which justifies the use of hysteresis.  
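
 The hysteresis rule described above can be sketched as a small state machine. The thresholds of 1.5W and 2W are taken from the text; the function names and the example power sequence are illustrative assumptions, and the actual controller is written in PIC assembler rather than Python.

```python
# Control-law selection with hysteresis (thresholds from the text).
PWM, PS = "PWM", "PS"
PS_ENTRY_W = 1.5    # below this power level: change from PWM to PS
PWM_ENTRY_W = 2.0   # above this power level: change from PS to PWM


def next_control_law(current_law: str, power_w: float) -> str:
    """Return the control law to apply for the measured power level."""
    if current_law == PWM and power_w < PS_ENTRY_W:
        return PS
    if current_law == PS and power_w > PWM_ENTRY_W:
        return PWM
    return current_law  # inside the hysteresis band: keep the current law


def run(powers, start=PWM):
    """Apply the rule to a sequence of power measurements."""
    law, history = start, []
    for power_w in powers:
        law = next_control_law(law, power_w)
        history.append(law)
    return history


# Hypothetical load profile that wanders around the 1.85W optimum.
trace = run([2.5, 1.9, 1.8, 1.4, 1.8, 1.9, 2.1, 1.85])
print(trace)
```

 Power levels between the two thresholds never cause a change, so a load that varies slightly around 1.85W leaves the active control law untouched.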
 Initially the PWM control law was implemented in 
real-time in which calculations are performed continuously 
based on the measured output voltage. However, analysis 
of the program execution speed revealed that in order to 
maintain an acceptable sampling frequency a different 
implementation technique had to be used. Under normal 
operating conditions it is possible to predict the behavior 
of the converter and therefore in an analytic way calculate 
the Duty-cycle needed for proper operation. This analytic 
fact can be used to generate a look-up table containing all 
the information needed for continuous converter operation 
within the specified limits. From the microcontroller 
datasheet it can be seen that accessing the program 
memory can be achieved in 16 cycles, which equates to 4 
µs. Of the available 8K program memory only a small 
segment is used for the actual control software, thus 
leaving plenty of memory available for other purposes. 
Implementation of the proposed look-up table is simplified 
by means of a small C program that generates an assembler 
file containing the numerical values. Once the entire set of 
assembler files is assembled, the look-up table is placed in 
memory along with the control algorithm. The optimized 
code execution made possible by the look-up table results 
in a switching frequency of 77 kHz and a sampling 
frequency of 10 kHz when operated in PWM mode. Due to 
asymmetrical skipping of pulses in PS mode the switching 
frequency is no longer fixed. However, the sample 
frequency remains unchanged. 
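
 The idea of generating the look-up table off-line can be sketched as follows. The paper does not state what the table is indexed by or how its values are computed, so the sketch makes loudly hypothetical choices: it uses the ideal continuous conduction mode relation D = Vout/Vin, an assumed input voltage range, an 8-bit duty scale, and the common `retlw` table idiom for mid-range PIC devices. The authors used a C program; Python is used here only for illustration.

```python
# Off-line generation of a duty-cycle look-up table as assembler text.
# Assumptions (not from the paper): table indexed by input voltage step,
# ideal buck relation D = Vout / Vin, 8-bit duty values, retlw table idiom.
V_OUT = 5.0  # nominal output voltage of the test converter


def duty_table(v_in_min=10.0, v_in_max=14.0, step=0.5, full_scale=255):
    """Return scaled ideal duty-cycle values for a range of input voltages."""
    n_entries = int(round((v_in_max - v_in_min) / step)) + 1
    table = []
    for i in range(n_entries):
        v_in = v_in_min + i * step
        table.append(round(full_scale * V_OUT / v_in))
    return table


def to_assembler(table, label="DutyTable"):
    """Format the table as a computed-jump retlw table."""
    lines = [f"{label}:",
             "    addwf PCL, f    ; jump into the table by the index in W"]
    lines += [f"    retlw d'{value}'" for value in table]
    return "\n".join(lines)


print(to_assembler(duty_table()))
```

 The generated text would be included with the other assembler files, so that the run-time code only performs a table read instead of a calculation.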
 When operated outside the specified conditions the 
generation of the Duty-cycle is performed in real-time 
lowering the sample frequency to 3 kHz. This rather low 
sample frequency would under normal conditions be 
unacceptable, but instead of simply shutting down the 
converter in case of abnormal operating conditions the 
proposed control algorithm provides a continuous supply 
of power although at a degraded level. 
 It is well known that converter power throughput is a 
function of temperature. Manufacturers of commercially 
available converters most often specify the power a given 
converter can deliver under certain operating conditions, 
thus making the user responsible for proper cooling 
and/or control of ambient temperature. If the specified 
values are exceeded, permanent damage to the converter is 
likely to occur. As an extra safety precaution many 
modern converters have thermal shutdown capabilities, 
which shut down the converter when the maximum 
temperature is reached. Since thermal shutdown provides 
little or no warning before the protective circuitry shuts 
down the converter, the proposed control algorithm 
includes a dynamic change of power throughput as a 
function of temperature. This allows for active temperature 
management, since at any given time the temperature can 
be related to the losses within the converter. Keeping a 
converter running at maximum power throughput during a 
temperature rise eventually causes the thermal protection 
to shut down the converter. In a stand-alone configuration 
the dynamic change in power throughput as a function of 
temperature provides little or no advantage over the 
thermal shutdown protection. However, in load priority 
applications or in redundant configurations the loads with 
the lowest priority will simply be disconnected from the 
converter, thereby improving the probability of a 
continuous supply of power to the critical loads. 
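
 A minimal sketch of temperature-dependent power throughput combined with load priorities is given below. The paper gives no derating curve, so the linear derating between 70 and 100 degrees Celsius, the 5W rating used as default, the load names and their priorities are all hypothetical values chosen for illustration.

```python
# Temperature-dependent power budget with priority-based load shedding.
# All numeric limits and load definitions below are hypothetical.
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    power_w: float
    priority: int  # lower number = more critical


def allowed_power_w(temp_c, rated_w=5.0, t_derate_c=70.0, t_max_c=100.0):
    """Full power up to t_derate_c, then linear derating to zero at t_max_c."""
    if temp_c <= t_derate_c:
        return rated_w
    if temp_c >= t_max_c:
        return 0.0
    return rated_w * (t_max_c - temp_c) / (t_max_c - t_derate_c)


def loads_to_keep(loads, budget_w):
    """Keep loads in priority order until the power budget is exhausted."""
    kept, used_w = [], 0.0
    for load in sorted(loads, key=lambda item: item.priority):
        if used_w + load.power_w > budget_w:
            break  # this load and all lower-priority loads are disconnected
        kept.append(load.name)
        used_w += load.power_w
    return kept


loads = [Load("telemetry", 1.0, 2), Load("controller", 2.0, 1),
         Load("heater", 2.0, 3)]
for temp_c in (60.0, 85.0):
    print(temp_c, loads_to_keep(loads, allowed_power_w(temp_c)))
```

 As the temperature rises the budget shrinks gradually, so the least critical loads are dropped first instead of the whole converter shutting down.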
 
V.  EFFICIENCY IMPROVEMENT 
 
 Converter efficiency curves most often have a shape 
similar to that shown in Fig. 2. This is due to the 
distribution of conduction losses and switching losses. 
As mentioned in section ‘I. INTRODUCTION’ the switching 
losses are dominant at light loads, thus causing the 
efficiency in this region of the operating range to decrease 
quite dramatically. Fig. 2 shows the test converter 
efficiency when operated in PWM mode. 
[Plot: efficiency versus output current]

Fig. 2 : Efficiency of PWM control 
 
 Applying the PS control law it is theoretically possible 
to improve the overall efficiency at light loads. Indeed, 
section ‘VII. EXPERIMENTAL RESULTS’ shows that 
this is the case. 
 
VI.  RELIABILITY 
 
 While the continuous development in microprocessor 
technology makes it attractive to replace analog circuitry 
with digital equivalents, the pros and cons in each case 
should be considered very carefully. Analog controllers 
have the advantages of high bandwidth, no need for data 
conversion, high resolution and fast response to changes 
in fixed control parameters. Their digital counterparts 
have the advantage of being able to respond not only to 
changes in the fixed control parameters but also to 
changes in variables within the converter. Furthermore, 
digital controllers have the ability to respond 
intelligently to the loss of fixed control parameters. By 
means of an analytic description of converter operation 
the lost control parameter can often be deduced from the 
control parameters available. Although different from the 
technique used in this presentation, one such analytic 
approach to determining immeasurable parameters is given 
in [3].  
 Based on the continuous measurement of input power, 
output power and system temperature the proposed control 
algorithm improves system reliability by means of a 
technique known as analytical redundancy. Analytical 
redundancy is the concept of determining a parameter 
based on measurements of different variables, thus 
establishing a configuration of variables suitable for 
describing any one of the N variables in case one 
measurement is lost. This technique compensates for some 
of the loss in system reliability when replacing the 
analog controller with its digital counterpart. The loss in 
reliability due to this replacement can be quantified 
using the data given in ‘Military Handbook 217’ [4] 
concerning reliability prediction of electronic equipment: 
A commercially available analog controller in a 
16-pin DIP package designed for ground-based 
equipment has a Mean Time To Failure (MTTF) of 
1.7 million years. 
 
An 8-bit PIC microcontroller in a 40-pin DIP 
package designed for ground-based equipment has 
a Mean Time To Failure (MTTF) of 0.46 million 
years. 
 
 It is seen that the MTTF decreases by a factor of 3.7 
when replacing the analog controller unit with its digital 
counterpart. Furthermore, the reliability of the software 
implemented in the digital controller has to be 
considered. Combining these reliability issues one finds 
that although analytic redundancy and other complex 
techniques can be applied in a digital controller, the 
analog controller still provides the optimum reliability in 
simple converter control. 
 Having mentioned the drawbacks of digital control, 
one should note that the increase in ‘intelligence’ in 
converter control opens the door to new reliability 
improvements not possible with analog controllers. 
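
 The factor of 3.7 follows directly from the two MTTF figures quoted above. The short calculation below also converts them to failure rates, assuming the constant failure rate (exponential) model on which handbook predictions of this kind are based; the use of 8760 hours per year and the FIT unit are conventions introduced here, not values from the paper.

```python
# MTTF comparison of the analog and digital controller (values from the text).
HOURS_PER_YEAR = 8760  # assumption: 365-day year


def failure_rate_fit(mttf_years: float) -> float:
    """Constant failure rate in FIT (failures per 1e9 hours) for a given MTTF."""
    return 1e9 / (mttf_years * HOURS_PER_YEAR)


mttf_analog_years = 1.7e6
mttf_digital_years = 0.46e6

ratio = mttf_analog_years / mttf_digital_years
print(f"MTTF ratio: {ratio:.1f}")
print(f"analog : {failure_rate_fit(mttf_analog_years):.3f} FIT")
print(f"digital: {failure_rate_fit(mttf_digital_years):.3f} FIT")
```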
 Although the response to a temperature change is much 
slower than the response to a voltage or current change, 
analytic redundancy using the temperature in conjunction 
with other variables can be applied. Suppose the circuit 
for measuring output current fails. Under normal 
circumstances this would imply that the efficiency of the 
converter is zero and thus that the converter has failed. By 
applying analytic redundancy the controller is able to 
determine whether the converter has failed or not. The test 
circuit in this presentation applies analytic redundancy by 
relating the converter power throughput to the 
temperature. Assuming the previously mentioned fault 
occurs, the controller would sustain the power throughput 
as long as the current limit is not exceeded.  
The constant monitoring of the system temperature can 
thereafter be used as a means of deciding which control 
law to apply. 
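
 One way to picture this use of temperature is sketched below. It is not the implemented algorithm: it assumes a simple steady-state thermal model with a hypothetical thermal resistance of 20 K/W, a known ambient temperature of 25 degrees Celsius and a nominal efficiency of 0.8, and it compares the measured temperature with the temperature expected under each of the two hypotheses when the output current reading is zero.

```python
# Analytic redundancy sketch: is a zero output-current reading caused by a
# failed sensor or by a failed converter? All model parameters are hypothetical.

def expected_temp_c(p_loss_w, ambient_c=25.0, r_th_k_per_w=20.0):
    """Steady-state temperature for a given power loss (simple thermal model)."""
    return ambient_c + r_th_k_per_w * p_loss_w


def classify_zero_current_reading(p_in_w, temp_c, efficiency=0.8):
    """Pick the hypothesis whose predicted temperature is closest to temp_c."""
    # Hypothesis A: converter healthy, only the normal losses are dissipated.
    t_healthy = expected_temp_c(p_in_w * (1.0 - efficiency))
    # Hypothesis B: converter failed, all input power is dissipated as heat.
    t_failed = expected_temp_c(p_in_w)
    if abs(temp_c - t_healthy) <= abs(temp_c - t_failed):
        return "sensor_fault"
    return "converter_fault"


print(classify_zero_current_reading(p_in_w=5.0, temp_c=47.0))
print(classify_zero_current_reading(p_in_w=5.0, temp_c=120.0))
```

 Because temperature responds slowly, such a check can only confirm or reject a suspected fault after the thermal time constant has elapsed, which matches the remark above about response time.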
 
VII.  EXPERIMENTAL RESULTS 
 
 This section provides measurements for the test 
converter under consideration.  
 As verification of proper Duty-cycle generation in both 
control modes the first set of measurements is the 
MOSFET gate voltage.  
 Fig. 3 shows the MOSFET gate voltage in PWM mode 
and Fig. 4 shows the MOSFET gate voltage in PS mode. 
 
      
Fig. 3 : Converter operating in PWM mode                       
 
 In Fig. 3 it can be seen that the switching frequency of 
the test converter operating in PWM mode is 76.9 kHz, 
which is very close to the expected 77 kHz.            
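
 The measured value is consistent with the PWM period relation given in the PIC16F87x datasheet, period = (PR2 + 1) x 4 x Tosc x (Timer2 prescale). The register setting used in the test converter is not stated in the paper; the sketch below simply searches for the PR2 value that comes closest to 77 kHz at 20 MHz, assuming a prescale of 1.

```python
# PWM frequency of the PIC CCP module as a function of the PR2 register.
# The PR2 value and the prescale of 1 are assumptions, not from the paper.
F_OSC_HZ = 20e6


def pwm_frequency_hz(pr2, prescale=1, f_osc_hz=F_OSC_HZ):
    """PWM period = (PR2 + 1) * 4 * Tosc * (Timer2 prescale)."""
    return f_osc_hz / ((pr2 + 1) * 4 * prescale)


def closest_pr2(target_hz, prescale=1):
    """Return the 8-bit PR2 value whose PWM frequency is nearest target_hz."""
    return min(range(256),
               key=lambda pr2: abs(pwm_frequency_hz(pr2, prescale) - target_hz))


pr2 = closest_pr2(77e3)
print(pr2, f"{pwm_frequency_hz(pr2) / 1e3:.1f} kHz")
```

 With these assumptions the nearest achievable frequency is slightly below 77 kHz because the period can only be set in whole timer counts.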
 
Fig. 4 : Converter operating in PS mode 
 
 Fig. 4 shows the very narrow pulses applied in PS 
mode. Since the pulse skipping is asymmetrical and is 
solely based on the load no fixed switching frequency is 
used. 
 In Fig. 5 the rather low current drawn by the digital 
controller and the temperature sensing device can be seen. 
Low power consumption for the control circuitry adds to 
the list of advantages of using digital controllers in 
converter applications. Replacing the digital controller of 
the test converter with an analog controller causes the total 
control circuitry power consumption to double.  
 
      
Fig. 5 : Control circuit power consumption           
 
 Fig. 6 shows the increase in converter efficiency at 
lighter loads (purple curve). The point where the blue and 
purple curves intersect is the optimum point of change in 
control law from PWM to PS and vice versa. However, as 
mentioned previously, changing at this single point can 
lead to oscillatory converter behavior with very 
undesirable consequences. Therefore the implemented 
control algorithm incorporates hysteresis. An enhanced 
view of this hysteresis loop can be seen in Fig. 7. 
[Plot: efficiency versus output current for PS and PWM control]

Fig. 6 : Efficiency of PS and PWM control 
 
[Plot: efficiency versus output current, enlarged around the hysteresis loop]

Fig. 7 : Enhanced view of hysteresis loop 
 
 
VIII.  FURTHER WORK 
 
 The field of digital control of converters by means of 
low-cost microcontrollers provides many features not 
mentioned in this presentation. Further work should 
address topics such as fault management implementation 
methods in low-cost microcontrollers as well as fault 
prediction schemes embedded in the converter control 
algorithms. 
 Currently a formalized analysis of implementation 
schemes of analytical redundancy in standard converter 
topologies is being carried out. The foundation for this 
work is the establishment of converter topology connection 
matrices in combination with fault propagation analysis. 
Redrawing the test converter in Fig. 1 in accordance with 
graph theoretical principles, the following oriented 
connection graph can be established. 
 
[Oriented connection graph: nodes 1 to 9 between Vin and VOUT, with edges labelled Q, L, C, D, I, V, T and PWM]
 
Fig. 8 : BUCK test converter connection graph 
 
 Analyzing the connection graph in Fig. 8 using the 
graph theoretical approach the following connection 
matrix can be established: 
 
TABLE 1 : BUCK CONNECTION MATRIX 
 
     1  2  3  4  5  6  7  8  9 
 1   0  Q  0  0  0  Q  0  0  0 
 2   0  0  L  0  0  Q  I  0  0 
 3   0  0  0  C  0  0  0  V  0 
 4   0  D  C  0  0  0  0  0  0 
 5   0  Q  0  0  0  Q  0  0  0 
 6   0  0  0  0  0  0  0  0  T 
 7   0  0  0  0  P  0  0  0  0 
 8   0  0  0  0  P  0  0  0  0 
 9   0  0  0  0  P  0  0  0  0 
 
where P is short for the PWM controller, V is short for the 
voltage sensing circuitry and I is short for the current 
sensing circuitry. The remaining matrix entries Q, L, D, C 
and T are identical to the abbreviations commonly used in 
circuit theory. 
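
 The connection matrix lends itself to simple programmatic queries. The sketch below stores TABLE 1 as text and lists, for a given node, the nodes that feed it and the nodes it feeds. It assumes that the row index is the source node and the column index is the destination node; this orientation is not stated explicitly in the paper, and the helper names are illustrative.

```python
# Queries on the BUCK connection matrix of TABLE 1.
# Assumption: entry (row i, column j) is an edge from node i to node j.
MATRIX = [
    "0 Q 0 0 0 Q 0 0 0",
    "0 0 L 0 0 Q I 0 0",
    "0 0 0 C 0 0 0 V 0",
    "0 D C 0 0 0 0 0 0",
    "0 Q 0 0 0 Q 0 0 0",
    "0 0 0 0 0 0 0 0 T",
    "0 0 0 0 P 0 0 0 0",
    "0 0 0 0 P 0 0 0 0",
    "0 0 0 0 P 0 0 0 0",
]
entries = [row.split() for row in MATRIX]


def sources_of(node):
    """Nodes with an edge into `node`, with the edge label."""
    col = node - 1
    return [(r + 1, row[col]) for r, row in enumerate(entries)
            if row[col] != "0"]


def targets_of(node):
    """Nodes reached by an edge out of `node`, with the edge label."""
    row = entries[node - 1]
    return [(c + 1, label) for c, label in enumerate(row) if label != "0"]


print("inputs to node 5 :", sources_of(5))
print("outputs of node 2:", targets_of(2))
```

 Under the stated orientation, node 5 is reached from exactly three nodes through the PWM controller, one for each measured parameter.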
 With reference to TABLE 1 it can be seen that the 
Duty cycle (5) for controlling the switch is generated 
based on 3 different parameters. Establishing the 
theoretical relations between these converter parameters as 
well as considering the measurement of these same 
parameters from an operational point of view it is possible 
to continue the converter operation in case of certain faults 
- although at a deteriorated level. This increases overall 
system reliability as described in section ‘VI. 
RELIABILITY’. In other words the current research is 
expected to provide a theoretical framework and model for 
future implementations of analytic redundancy in single 
path converters. In this context single path converters are 
characterized as converters comprised of a single electrical 
connection between input and output. 
 
IX.  CONCLUSION 
 
 This paper has examined the converter control 
capabilities of a PIC microcontroller. The proposed control 
scheme, based on efficiency improvement at light loads, has 
shown that implementation of multiple control laws is 
indeed within the timing limits of a standard low-cost 
microcontroller. Furthermore, it has been shown that 
temperature measurement allows for implementation of 
analytic redundancy, which improves system fault 
resilience, although true hardware fault tolerance can only 
be achieved in redundant converter configurations. Also, 
it has been shown how a measurement of transistor 
temperature can be used to protect the converter from 
overheating by constantly adjusting the converter power 
throughput.  
 A short description of further work within the field of 
digital controllers has also been given, and the current 
research concerning a formalization of the mathematical 
analysis of analytical redundancy optimization was 
presented. 
 
ACKNOWLEDGEMENT 
 
 The author would like to thank Alcatel Space Denmark 
for sponsoring this work and Rune Moller Barnkob for 
implementing and testing the control algorithm as well as 
the monitoring functions. 
  
REFERENCES 
 
[1] C. Nesgaard, ‘An array-based study of increased 
system lifetime probability’, IEEE-APEC 2003 
  
[2] Microchip, ‘PIC16F87x Datasheet’ 
  
[3] D. Y. Qiu, S. C. Yip, Henry S. H. Chung, and S. Y. R. 
Hui, ‘On the Use of Current Sensors for Control of 
Power Converters’, IEEE-PESC 2001 
 
[4] Military Handbook 217, ‘Reliability prediction of 
electronic equipment’ 
 
[5] I. Celanovic, I. Milosavljevic, D. Boroyevich, R. 
Cooley, J. Guo, ‘A New Distributed Digital Controller 
for the Next Generation of Power Electronics Building 
Blocks’, IEEE-APEC 2000 
 
 
 
Efficiency improvement in redundant power 
systems by means of thermal load sharing 
 
Carsten Nesgaard 
Oersted-DTU, Automation  
 DK-2800 Kongens Lyngby 
Technical University of Denmark 
Email: cn@oersted.dtu.dk 
 
Michael A. E. Andersen 
Oersted-DTU, Automation  
 DK-2800 Kongens Lyngby 
Technical University of Denmark 
Email: ma@oersted.dtu.dk 
 
 
Abstract – The demand for higher output currents at ever 
lower voltage levels is often solved by paralleling multiple 
converters. Provided redundancy is implemented this 
technique, besides being relatively easy to implement, has the 
advantage of improving the overall system reliability. Also, 
the parallel-connection concept forms the basis of a very cost-
effective power system design, since the entire system often 
can be realized using off-the-shelf units. This paper verifies 
experimentally that the use of the thermal load sharing 
technique, proposed in [1], not only increases the overall 
system reliability but also has a positive impact on the system 
efficiency. The latter aspect is achieved by redistributing the 
current throughput of each converter, which in turn results in 
equal thermal conditions as opposed to the current sharing 
technique’s intent to establish equal currents. 
 
I. INTRODUCTION 
 
 This paper describes the experimental results of the new 
thermal load sharing technique presented in [1]. From a 
schematic presenting all key components in the test setup to 
measurements of load sharing currents and overall system 
efficiency this paper will verify that thermal load sharing in 
most cases outperforms the widely used current sharing 
technique.  
 The power system under consideration in [1] is 
comprised of 3 parallel-connected converter units. The 
theoretical results indicate that the new load sharing 
technique increases the overall system reliability quite 
significantly based on an unequal distribution of the 
individual converter load currents. From the description in 
[1] it can be deduced that the more converter units making 
up the power system, the better the achievable results 
concerning reliability. From a system point of view this result 
seems obvious since the more paths the current can take 
from input to output the easier it is for the system to 
optimize the system temperatures by balancing the currents 
through each converter unit.  
 The system considered in this paper is comprised of 2 
identical parallel-connected converter units, for which 
reason it should be expected that the overall reliability 
improvement is less than that found in [1]. Indeed, as the 
experimental verification will show the overall system 
improvement in terms of reliability and efficiency is less 
than that obtained in [1]. However, the results obtained still 
provide significant improvements in overall system 
performance. An analytic as well as verbal explanation of 
these results is provided in section “IV. THEORETICAL 
SYSTEM EVALUATION”. Finally, based on 
reliability calculations, presented in section ”V. 
RELIABILITY ASSESSMENT”, it will become clear that the 
use of thermal load sharing increases the overall system 
reliability by lowering the average system operating 
temperature. 
 
II. THE TEST SYSTEM 
 
 The test system is comprised of 2 parallel-connected 
buck converters each capable of supplying a load current of 
25A at an output voltage of 5V. A block diagram of the 
two-converter system can be seen in figure 1. 
 
             
Converter 1
Converter 2
OutputInput
IOUT/2
IOUT/2
IOUT
 
Figure 1 : Test setup block diagram 
 
 The annual downtime for the two-converter system, 
shown in Figure 1, utilizing the traditional current sharing 
technique can be calculated to 10 minutes and 14 seconds. 
This downtime is one of the parameters used to compare 
the two load sharing techniques. A description of the steps 
involved in calculating this downtime is presented in 
section ”V. RELIABILITY ASSESSMENT”. 
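
 To put the downtime figure in perspective it can be converted to a steady-state availability. The conversion below assumes a 365-day year; the reliability model that produces the 10 minutes and 14 seconds is the one described in the referenced section and is not reproduced here.

```python
# Availability corresponding to an annual downtime of 10 min 14 s.
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

downtime_s = 10 * 60 + 14
unavailability = downtime_s / SECONDS_PER_YEAR
availability = 1.0 - unavailability

print(f"downtime      : {downtime_s} s per year")
print(f"unavailability: {unavailability:.3e}")
print(f"availability  : {availability:.7f}")
```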
 Turning the attention towards a more detailed system 
outline, Figure 2 shows a simplified schematic of the test 
setup under consideration. Each converter utilizes 4 ICs, a 
single MOSFET transistor and two free-wheeling diodes. 
These active components are mentioned explicitly because 
they are major contributors to the overall converter 
failure rate. Besides these active components the 
converters are comprised 
Presented at Applied Power Electronics Conference and Exposition 2004, Anaheim, USA, February 2004 
of input and output capacitors, the energy storing inductors 
and a relatively large number of small-signal components 
(not included in Figure 2).  
[Schematic: two identical converters between common input and output, each with 100 µF input capacitors, IRFP064 MOSFET with RGate, PBYR3045 diodes, 48 µH inductor, 10 mΩ sense resistor, 470 µF output capacitor, RFeedback, and the ICs IR2110, UC3843, UC3902 and MC3307; supply rails +16V and +5V]
 
Figure 2 : Simplified schematic 
 
 The switching frequency, which was chosen to be 122 kHz, 
results in an inductor ripple current of approximately 0.6A. 
 
 
Figure 3 : Differential gate-source voltage for each converter 
 
 Even though every attempt has been made to ensure 
equal converter layout, performance and switching, Figure 3 
clearly shows that small differences exist. The timing 
difference could easily be removed by clock synchronization 
(however, the UC3843 has no sync pin), while the 
difference in duty cycle is intentional, since this determines 
the current supplied by each converter. It is estimated that 
the difference in switching frequency is relatively 
unimportant in relation to testing the two load sharing 
techniques, hence further discussion of this topic is 
omitted.  
 
 
Figure 4 : Test setup 
 Figure 4 shows an image of the real-world test setup. 
The large copper baseplate on which the two converters are 
implemented is used as a heat stabilizing mechanism for all 
small-signal devices. The physical separation between the 
MOSFET transistor heatsinks (24 cm.) prevents the two 
converters from interacting thermally, thus increasing the 
adjustability of the control system.  
 
III. MEASUREMENTS AND SYSTEM DESCRIPTION 
 
Initially the two 25A buck converters were paralleled 
and operated in a ‘semi-droop’ manner where the load 
sharing was based on the ‘natural’ output impedance of the 
converters. This technique is very simple but only rarely 
results in acceptable performance, efficiency and 
reliability. Indeed, as the efficiency measurement shows, 
one converter supplies almost the entire load current, 
leaving the other converter in an idle state. To optimize 
the efficiency and system reliability some form of load 
control is needed. This is achieved by utilizing a 
dedicated load share controller. The load share controller 
used is the UC3902 from Texas Instruments. Since this 
controller does not allow for high-side differential current 
sensing, an op-amp is employed to compensate for the lack 
of this feature. It should be noted that Texas Instruments 
does offer load share controllers that allow for high-side 
differential sensing (like the UC3907), but due to a rather 
tight schedule it was chosen to proceed with the load share 
controller available at the time of implementation, the 
UC3902. 
 Following the guidelines provided in the load share 
controller datasheet and associated application notes the 
current sharing technique was successfully implemented. 
The two buck converters were then operated at nominal 
output power (12.5A each) while tuning the current share 
controller. The result of this tuning can be seen in figure 5. 
[Plot: individual converter current versus output current for Converter 1 and Converter 2]
Figure 5 : Individual converter currents 
 
 Figure 5 shows that the achievable current sharing is 
very accurate. The only observable deviation from identical 
current levels is in the 1A – 7A range. Since the load share 
controller operates over a fairly large current range a small 
deviation should be expected.  
 
 The next set of measurements is performed while each 
converter operates individually, thus allowing for very 
accurate temperature data to be obtained. The result is 
shown in Figure 6. 
[Plot: MOSFET temperature versus output current for Converter 1 and Converter 2]
 
Figure 6 : MOSFET temperature measurements 
 
 The temperature at which the MOSFET transistors will 
be working is around 70°C, since this temperature 
corresponds to an individual converter output current of 
12.5A. From Figure 6 it can be seen that converter 1 
generally operates at a higher temperature than converter 2. 
The point where the temperatures of the two MOSFET 
transistors are equal is at an output current of 23.6A. Above 
this very high converter output current the temperature of 
the MOSFET transistor used in converter 2 exceeds the 
temperature of the MOSFET transistor used in converter 1. 
Part of the explanation for the relatively large MOSFET 
transistor temperature difference is given by Figure 7, 
which shows the free-wheeling diode temperatures. 
 
[Plot: free-wheeling diode temperature versus output current for Converter 1 and Converter 2]
 
Figure 7 : Diode temperature measurements 
 
 The temperatures of the free-wheeling diodes are also 
relatively high. Again, it can be seen that converter 1 
operates at a higher temperature than converter 2. Since an 
intense mutual heating between the two active components 
takes place it is difficult to identify the actual self-heating 
of each component. As one component increases the 
ambient temperature (in the immediate vicinity of the 
power components) the other suffers from increases in 
parasitic elements resulting in increased internal heating. 
Some of the diode parameters that are affected by thermal 
changes are described in section “IV. THEORETICAL 
SYSTEM EVALUATION”. 
[Plot: efficiency vs. output current; curves: Semi-droop sharing efficiency, Current sharing efficiency]
Figure 8 : System efficiency 
 
 Figure 8 shows the efficiencies of the two techniques 
for parallel converter operation discussed so far. With 
reference to Figure 8 it can be seen that the semi-droop 
configuration exhibits higher efficiency at low output 
currents compared to the current sharing approach. This is 
simply due to chance since the converter supplying the 
majority of current in this test setup apparently has the 
highest output voltage (and efficiency at light loads). The 
control circuitry of the other converter monitors the 
common output voltage, which is higher than its internal 
reference voltage, and adjusts the duty cycle accordingly. 
  In order to make a fair comparison between the two 
load sharing techniques, each converter is implemented with 
the exact same components, the same length of wiring and 
the same current sensing resistors in both cases, although the 
resistors are not necessary in the thermal load sharing situation. 
Furthermore, since the thermal load sharing technique does 
not need high-side sensing the added OP-amp and 
associated passive components could also be removed from 
the circuit. However, for comparison purposes these 
components remain active during the thermal load sharing 
implementation. 
[Plot: individual converter current vs. output current; curves: Converter 1, Converter 2]
Figure 9 : Individual converter currents 
 
 As can be seen in Figure 9 the individual converter 
currents are no longer identical – far from it actually. At 
lighter loads the difference between the two currents is 1A, 
but as the load increases the separation between the two 
converter currents becomes larger. At the nominal output 
current (25A) the difference between the two converter 
contributions is 3.1A. The system efficiency that results 
from this redistribution of converter currents can be seen in 
Figure 10.  
[Plot: efficiency vs. output current; curves: Semi-droop sharing efficiency, Current sharing efficiency, Thermal sharing efficiency]
Figure 10 : System efficiency comparison 
 
 It should be noted that the efficiency of the thermal load 
sharing follows that of the semi-droop, since this causes the 
lowest system heating. At heavier loads the efficiency of 
the thermal load sharing exceeds that of the current sharing 
approach by approximately 2%. 
 
 
Figure 11 : Power system output voltage ripple 
 
 From Figure 11 it can be seen that the output voltage 
ripple deviates slightly from the expected triangular 
waveform. This is due to the small difference in switching 
frequency and the constantly altered duty cycle. However, the 
ripple voltage is clearly within the ±5% voltage variation 
limit set as a requirement for the power system under 
consideration. 
 
IV. THEORETICAL SYSTEM EVALUATION 
 
 This section explains why the current distribution of the 
thermal load sharing technique results in higher overall 
efficiency. The calculations will be limited to include only 
high power components – meaning, components that are 
related to the high current path from input to output. 
Identifying these components the following list can be 
established: 
 
 • MOSFET transistors 
 • Free-wheeling diodes 
 • Current measurement resistors 
 • Inductors 
 • Capacitors 
 
 Based on the subsequent descriptions and computations 
loss estimations are provided at the end of each subsection. 
These loss estimations verify that a shift in individual 
converter currents gives rise to the efficiency gain predicted 
in [1]. It should be noted that the correlation between 
output current, temperature and power losses in some of the 
power components is very complex. Due to this complexity 
the following descriptions only state the initial loss 
equations or make a reference to where the equations can 
be found; otherwise the results are shown in 
terms of graphical illustrations.  
 
MOSFET transistors 
 The redistribution of MOSFET transistor losses is the 
dominant factor in the system efficiency improvement. 
However, as will be shown the free-wheeling diodes and 
the filter capacitors also contribute to a shift in system 
losses whereas the contributions from the current 
measurement resistors and the inductors are only minor.  
 In the following section the subscript ‘Current’ is used 
to denote the losses associated with current sharing 
technique while the subscript ‘Thermal’ is used to denote 
the losses associated with thermal load sharing technique. 
Also, since the system is comprised of two converters the 
losses are calculated for each converter and are represented 
by the aforementioned subscript notation followed by two 
numbers. For example the MOSFET transistor conduction 
losses in the current sharing case are denoted 
‘PConduction,Current = 5.13W and 2.30W’. This indicates that the loss in 
converter 1 is 5.13W and the loss in converter 2 is 2.30W. 
Also, as will be shown in Figure 12 and Figure 13 the 
notation RDS(ON) + 2.9mΩ is adopted to indicate the difference 
in MOSFET transistor ON-resistance for the two transistors 
used in the test system. This value is found by comparing 
the actual measurements to the theoretical loss evaluations 
based on the nominal RDS(ON) value (transistor datasheet). 
 The MOSFET transistor conduction losses are found 
using the equations provided in [1]. Based on this approach 
the following set of loss curves can be established.
 
[Plot: conduction losses vs. output current; curve families: Nominal RDS(ON) and Nominal RDS(ON) + 2.9mΩ, with an arrow marking increasing temperature]
 
Figure 12 : Conduction losses vs. output current 
 
 Each curve in Figure 12 represents the conduction 
losses for a fixed temperature while varying the output 
current. This clearly shows that not only do the conduction 
losses increase as a function of output current but also as a 
function of temperature. The latter fact actually has a 
significant impact on the overall MOSFET losses. By 
relating the conduction losses at a given output current to 
the correct temperature based on heat-sink heat dissipation 
to the ambient the following results can be obtained: 
 
 $P_{Conduction,Current} = 5.13\,\mathrm{W} \text{ and } 2.30\,\mathrm{W}$
 $P_{Conduction,Thermal} = 3.39\,\mathrm{W} \text{ and } 2.92\,\mathrm{W}$
 
 The process of determining the above losses and 
temperatures is successive, meaning that a change in one 
variable results in a change in the other variable. For this 
reason the curves shown in Figure 12 are established by 
calculating a number of points that relates output current, 
junction and heat-sink temperatures, MOSFET ON-
resistance and total MOSFET power loss. Using the 
mathematical tool ‘Mathematica’ these points are then fitted 
to make up the curves shown in figure 12.  
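
The successive calculation described above can be sketched as a fixed-point iteration: loss sets temperature, temperature sets ON-resistance, and ON-resistance sets loss again. This is only an illustration and not the Mathematica procedure used for Figure 12. The linear temperature coefficient, the thermal resistance and every numerical value except the 2.9mΩ offset are hypothetical assumptions.

```python
def settle_conduction_loss(i_rms, r_ds_on_25, temp_coeff, r_th, t_ambient,
                           tol=1e-6, max_iter=200):
    """Iterate loss -> temperature -> resistance until the temperature settles.

    All parameters are illustrative assumptions:
      i_rms       RMS MOSFET current in A
      r_ds_on_25  ON-resistance at 25 degC in ohm
      temp_coeff  fractional rise of the ON-resistance per degC (linear model)
      r_th        junction-to-ambient thermal resistance in degC/W
      t_ambient   ambient temperature in degC
    """
    t_junction = t_ambient
    for _ in range(max_iter):
        # ON-resistance at the present junction temperature estimate
        r_ds_on = r_ds_on_25 * (1.0 + temp_coeff * (t_junction - 25.0))
        p_loss = i_rms ** 2 * r_ds_on
        # Temperature that this loss would produce
        t_next = t_ambient + r_th * p_loss
        if abs(t_next - t_junction) < tol:
            return p_loss, t_next
        t_junction = t_next
    raise RuntimeError("no thermal equilibrium found")


# Hypothetical 20 mOhm device, and the same device with 2.9 mOhm added
p_nominal, t_nominal = settle_conduction_loss(10.0, 0.0200, 0.005, 10.0, 25.0)
p_worse, t_worse = settle_conduction_loss(10.0, 0.0229, 0.005, 10.0, 25.0)
print(f"nominal: {p_nominal:.2f} W at {t_nominal:.1f} degC")
print(f"+2.9 mOhm: {p_worse:.2f} W at {t_worse:.1f} degC")
```

Under these assumptions the device with the higher ON-resistance settles at both a higher loss and a higher temperature, which is the coupling that the curves in Figure 12 express.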
 MOSFET transistor switching losses are another heat 
generating factor that must be included in the overall 
MOSFET losses. These losses are found using the 
procedure provided in [2], from which the following 
graphical representation can be established: 
 
[Plot: switching losses vs. output current; curves: Nominal RDS(ON), Nominal RDS(ON) + 2.9mΩ]
 
Figure 13 : Switching losses as a function of output current 
 
 It should be noted that the temperature dependency of 
the switching losses found in [3] has been interpolated 
and normalized to the switching losses at 25°C at an output 
current of 12.5A. Also, since the increase in current results 
in higher conduction losses the associated MOSFET 
transistor junction temperature increases. In turn, this 
increases the overall switching losses (as well as 
conduction losses) as a function of output current. This 
dependency is included in the switching loss curves shown 
in Figure 13. The overall effect of the redistribution of the 
load current gives the following result: 
 
 $P_{Switching,Current} = 5.65\,\mathrm{W} \text{ and } 4.19\,\mathrm{W}$
 $P_{Switching,Thermal} = 4.47\,\mathrm{W} \text{ and } 4.30\,\mathrm{W}$
 
 The decrease in conduction losses amounts to 1.12W 
while the switching losses contribute 1.07W to the overall 
system loss reduction. 
 
Free-wheeling diodes 
 The diode losses can be found using the simple 
equation shown below: 
 
 $P_{Diode} = V_{static} \cdot I_{Diode,avg} + R_{Dynamic} \cdot I_{Diode,RMS}^{2}$
 
where Vstatic is the forward voltage drop and RDynamic is the 
inverse slope of the forward current vs. voltage drop curve. The 
parameters IDiode,avg and IDiode,RMS denote the average and 
RMS diode currents respectively and can be found using 
the following two equations: 
 
 $I_{Diode,avg} = I_{Out} \cdot (1 - D)$     D = Duty-cycle 
 
 






 $I_{Diode,RMS} = \sqrt{(1 - D) \cdot \left( I_{Out}^{2} + \frac{\Delta I_{L}^{2}}{12} \right)}$     ∆IL = Inductor ripple 
 
 As will become apparent, the effect of the diode losses 
on the overall decrease in system power losses is much 
lower than that of the MOSFET transistors. One reason is 
that the forward voltage drop of a typical diode 
decreases with increasing temperature. However, the 
change in forward current also affects the forward voltage 
drop, which increases with 
increasing forward current. Being in close proximity to the 
MOSFET transistor heat-sinks the change in diode 
temperature is a combination of internal heating, heat 
transfer from the MOSFET transistors and a change in 
forward current. By taking these parameters into account 
the following values can be found: 
 
 $P_{Diode,Current} = 3.46\,\mathrm{W} \text{ and } 4.44\,\mathrm{W}$
 $P_{Diode,Thermal} = 3.71\,\mathrm{W} \text{ and } 3.86\,\mathrm{W}$
 
The overall decrease in diode losses amounts to 0.33W
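
The diode loss equation can be evaluated with a few lines of code. The sketch below assumes the usual expression for a diode that conducts during the (1 - D) part of the switching period, and all component values (forward drop, dynamic resistance, ripple, duty cycle) are hypothetical and not taken from the test system.

```python
import math


def diode_loss(i_out, duty, ripple, v_static, r_dynamic):
    """Diode loss: static forward drop plus dynamic resistance contribution.

    i_out      converter output current in A
    duty       MOSFET duty cycle D (the diode conducts during 1 - D)
    ripple     peak-to-peak inductor ripple current in A
    v_static   static forward voltage drop in V (assumed value)
    r_dynamic  dynamic resistance in ohm (assumed value)
    """
    i_avg = i_out * (1.0 - duty)
    i_rms = math.sqrt((1.0 - duty) * (i_out ** 2 + ripple ** 2 / 12.0))
    return v_static * i_avg + r_dynamic * i_rms ** 2


# Hypothetical operating point: 12.5 A, D = 0.5, 2 A ripple
example_loss = diode_loss(12.5, 0.5, 2.0, v_static=0.4, r_dynamic=0.01)
print(f"{example_loss:.3f} W")
```

In the actual system Vstatic and RDynamic also vary with temperature and forward current, as described above, so a fixed pair of values only approximates a single operating point.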
 
Current measurement resistors 
 The relationship between current and power loss for this 
component is almost linear in the current range of interest 
to this paper. Thus, the gain in efficiency from 
redistributing the output current is negligible, which the 
following calculations will verify: 
 
The total power loss in the current measurement resistors 
can be found using: 
 
 $P_{Resistor} = I_{Out}^{2} \cdot R_{Resistor} \cdot \alpha$
 
where α denotes the temperature factor for copper. 
Inserting values for the load sharing scenarios gives the 
following results: 
 
 $P_{Resistor,Current} = 3.218\,\mathrm{W}$     $P_{Resistor,Thermal} = 3.224\,\mathrm{W}$
 
Inductors 
 The inductors are implemented using high flux powder 
cores from Magnetics. Since the shape of this component 
deviates from that of the current measurement resistors the 
correlation between temperature, wire resistance and DC 
power loss is no longer linear: The following equation for 
temperature estimation is used [5]: 
 
 $\text{Temperature Rise (°C)} = \left( \frac{\text{Total Power Loss (mW)}}{\text{Surface Area (cm}^{2}\text{)}} \right)^{0.833}$
 
Inserting values in order to assess the copper and core 
losses gives the following result: 
 
 $P_{Inductor,Current} = 3.327\,\mathrm{W}$     $P_{Inductor,Thermal} = 3.124\,\mathrm{W}$
 
The difference between these two losses accounts for a 
power loss decrease of 203mW. Like the current 
measurement resistors this loss decrease has only minor 
overall impact.  
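
The temperature estimate from [5] is straightforward to evaluate numerically. The following sketch only restates that equation; the loss and surface area in the example are hypothetical values chosen for illustration.

```python
def inductor_temperature_rise(total_loss_mw, surface_area_cm2):
    """Temperature rise in degC from total loss (mW) and surface area (cm^2)."""
    return (total_loss_mw / surface_area_cm2) ** 0.833


# Hypothetical inductor: 1 W of copper and core loss, 40 cm^2 surface area
rise = inductor_temperature_rise(1000.0, 40.0)
print(f"estimated temperature rise: {rise:.1f} degC")
```

Because the exponent is below one, the estimated temperature rise grows more slowly than the power loss.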
 
Capacitors 
 The last components that will be considered in this 
paper are the filter capacitors. On the assumption that the 
losses associated with these components are solely caused 
by the ripple current and the internal capacitor ESR the 
following total power loss can be found: 
 
 $P_{Capacitor,Current} = 4.511\,\mathrm{W}$     $P_{Capacitor,Thermal} = 4.019\,\mathrm{W}$
 
The equation used for determining the abovementioned 
capacitor losses is the same as that used for determining the 
losses in the current measurement resistors. The thermal 
load sharing causes an overall decrease in capacitor losses 
of 492mW.  
 
Summary 
 Combining all the subtotals calculated above results in a 
total loss reduction of 3.21W. This decrease in system 
losses results in an overall efficiency increase by: 
 
 $\frac{P_{Out}}{P_{Out} + P_{Loss,thermal LS}} - \frac{P_{Out}}{P_{Out} + P_{Loss,current LS}} = 1.6\%$
 
 This increase is approximately 0.4% lower than that 
indicated in Figure 10. However, additional losses due to 
changes in diode reverse recovery currents have not been 
included. Also, loss adjustments taking into account the 
difference in switching frequency have not been 
considered. 
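
The subtotals can be tallied as a cross-check. The loss figures below are the ones stated in the preceding subsections. The output power of 125W (5V at the nominal 25A) and the treatment of the listed losses as the complete system loss are assumptions made only for this sketch.

```python
# Per-component losses in W (sum of both converters where two values are given)
losses_current = {
    "mosfet_conduction": 5.13 + 2.30,
    "mosfet_switching": 5.65 + 4.19,
    "diode": 3.46 + 4.44,
    "sense_resistor": 3.218,
    "inductor": 3.327,
    "capacitor": 4.511,
}
losses_thermal = {
    "mosfet_conduction": 3.39 + 2.92,
    "mosfet_switching": 4.47 + 4.30,
    "diode": 3.71 + 3.86,
    "sense_resistor": 3.224,
    "inductor": 3.124,
    "capacitor": 4.019,
}

reduction = {name: losses_current[name] - losses_thermal[name]
             for name in losses_current}
total_reduction = sum(reduction.values())


def efficiency(p_out, p_loss):
    """Efficiency as output power over output power plus losses."""
    return p_out / (p_out + p_loss)


P_OUT = 125.0  # assumed output power in W, not stated in this section
gain = (efficiency(P_OUT, sum(losses_thermal.values()))
        - efficiency(P_OUT, sum(losses_current.values())))

print(f"total loss reduction: {total_reduction:.2f} W")
print(f"efficiency gain: {100.0 * gain:.2f} percentage points")
```

Under these assumptions the tally reproduces the 3.21W loss reduction and gives an efficiency gain of roughly 1.6 percentage points.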
 
V. RELIABILITY ASSESSMENT 
 
 This section briefly introduces the reliability 
calculations that form the basis for the previously 
mentioned annual down-time. The point of origin is the 
Military Handbook 217F concerning reliability prediction 
of electronic equipment. Following the guidelines in this 
handbook and the general derivation techniques of finding 
analytical expressions for areas under a curve results in the 
following equation for R(t): 
 
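 $R(t) = 1 - \int_{0}^{t} f(t)\,dt \;\Rightarrow\; R(t) = \int_{t}^{\infty} f(t)\,dt = \int_{t}^{\infty} \lambda \cdot e^{-\lambda \cdot t}\,dt = e^{-\lambda \cdot t}$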
 
 This equation provides the probability of system 
survival within a given period of time (t). In terms of 
annual down-time R(t) can be rearranged and expressed as: 
 
 $Q(t) = \int_{0}^{t} f(t)\,dt = \int_{0}^{t} \lambda \cdot e^{-\lambda \cdot t}\,dt = 1 - e^{-\lambda \cdot t}$
 
 In reliability engineering terms Q(t) is often referred to 
as the system unavailability, since it represents the 
probability of system failure.
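
The two expressions can be checked numerically. The sketch below evaluates R(t) and Q(t) for a constant failure rate and compares Q(t) with a simple midpoint-rule integration of the failure density. The failure rate of 0.05 per year is a hypothetical value.

```python
import math


def reliability(failure_rate, t):
    """R(t): probability of survival up to time t for a constant failure rate."""
    return math.exp(-failure_rate * t)


def unavailability(failure_rate, t):
    """Q(t): probability of failure within time t."""
    return 1.0 - math.exp(-failure_rate * t)


def integrate_failure_density(failure_rate, t, steps=10000):
    """Midpoint-rule integral of f(t) = lambda * exp(-lambda * t) from 0 to t."""
    h = t / steps
    return sum(failure_rate * math.exp(-failure_rate * (k + 0.5) * h) * h
               for k in range(steps))


LAMBDA = 0.05  # hypothetical failure rate in failures per year
q_closed = unavailability(LAMBDA, 1.0)
q_numeric = integrate_failure_density(LAMBDA, 1.0)
print(f"Q(1 year): closed form {q_closed:.6f}, numerical {q_numeric:.6f}")
```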
 In order to evaluate the overall system reliability a 
combined assessment equation for the N+1 redundant 
system has to be established. This can be accomplished by 
combining the individual converter probabilities as shown 
next: 
 
The total number of combinations is 2² = 4 of which only 3 
are valid for system success: 
 
 $R_{System} = p_1 \cdot p_2 + q_1 \cdot p_2 + p_1 \cdot q_2$
 
In the special case of identical probabilities this equation 
becomes: 
 
 $R_{System} = p^{2} + 2 \cdot p \cdot q$
 
 The latter equation can be used in the thermal load 
sharing situation since both converters operate at the same 
temperature, thus having the same failure rate. 
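
The combination count can be illustrated by enumerating the four converter states. This is a minimal sketch, and the survival probabilities used in the example are hypothetical.

```python
from itertools import product


def system_reliability(p1, p2):
    """Probability that at least one of the two converters works."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return p1 * p2 + q1 * p2 + p1 * q2


def system_reliability_by_enumeration(p1, p2):
    """Same result, obtained by summing the valid state combinations."""
    total = 0.0
    for up1, up2 in product((True, False), repeat=2):  # 2^2 = 4 combinations
        probability = (p1 if up1 else 1.0 - p1) * (p2 if up2 else 1.0 - p2)
        if up1 or up2:                                 # 3 of them are valid
            total += probability
    return total


r_unequal = system_reliability(0.9, 0.8)
r_equal = system_reliability(0.9, 0.9)
print(r_unequal, r_equal)
```

For identical probabilities the first function reduces to p² + 2·p·q, and in both cases the result equals one minus the probability that both converters fail.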
 Having established a theoretical foundation for the 
reliability assessment the next information needed is the 
temperatures of the individual components. This is a rather 
complicated task, for which reason the simplified thermal 
model shown in Figure 14 is used in all reliability 
calculations. This model shows the components in close 
proximity to the MOSFET transistors. Although additional 
components for the load sharing controller, the OP-amp 
etc. are present these are assumed to be operating at 
ambient temperature and are not affected by the change in 
heat-sink temperature. 
[Diagram: temperature zones TSurface, TSurface - 10°C and TSurface - 30°C; component labels: 1 resistor, 1 MOSFET, 5 resistors, 2 ICs, 1 inductor, 2 diodes, 4 capacitors, 1 resistor, 1 diode, 2 capacitors, 8 resistors, 2 ICs, 4 capacitors]
          
Figure 14 : Simplified temperature distribution 
 
 Based on the temperature distribution in Figure 14 an 
average annual downtime of 10 minutes and 14 seconds is 
established (see Figure 1) for the current sharing technique. 
This number takes into account the redundancy concepts 
built into the power system. In other words the calculations 
indicate the probability of at least one working converter. 
 Reliability calculations for the thermal load sharing 
technique reveal that due to the redistribution of converter 
currents an annual downtime of 6 minutes and 11 seconds 
can be achieved. Compared to the results depicted in Figure 
1 this is a reduction of almost 40%. 
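
The stated reduction follows directly from the two downtime figures, as this short check shows.

```python
def downtime_seconds(minutes, seconds):
    """Convert an annual downtime given in minutes and seconds to seconds."""
    return 60 * minutes + seconds


current_sharing = downtime_seconds(10, 14)   # current sharing technique
thermal_sharing = downtime_seconds(6, 11)    # thermal load sharing technique
relative_reduction = (current_sharing - thermal_sharing) / current_sharing
print(f"downtime reduction: {100.0 * relative_reduction:.1f} %")
```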
 A final verification of the thermal load sharing 
technique’s advantages over the traditional current sharing 
technique is shown in Figure 15. 
[Plot: average system temperature vs. output current; curves: Current sharing technique, Thermal load sharing technique]
Figure 15 : Average system temperature 
 Figure 15 shows the average system temperature as a 
function of output current. It can be seen that the system 
operated by the thermal load sharing is at a consistently lower 
temperature than its current sharing counterpart. At the 
extreme ends of the operating range the temperature 
difference between the two techniques is only 1°C while 
the difference throughout the normal operating range is as 
high as 3.3°C. This may not seem that impressive, but it 
should be noted that the temperatures depicted in Figure 15 
are average temperatures – meaning that the individual 
converter temperatures in the current sharing 
implementation vary by as much as 15°C. A temperature 
difference of this magnitude lowers the converter reliability 
of the hotter converter considerably. 
 
VI. CONCLUSION 
 
This paper has provided the experimental results of a real-
world realization of the new thermal load sharing technique 
proposed in [1]. The results show that using the thermal 
load sharing technique not only increases the overall 
system reliability as calculated in [1] but also has a positive 
impact on the system efficiency. The increase in efficiency 
is achieved by redistributing the current supplied by each 
converter to obtain equal thermal conditions as opposed to 
the current sharing technique’s intent to establish equal 
currents.  
 Further efficiency improvements are achievable if the 
current measurement resistors, not used by the thermal load 
sharing, are removed. However, for comparison purposes it 
was chosen to leave them in the circuit along with the OP-
amps and the associated small-signal components. 
 
ACKNOWLEDGMENT 
 
 The authors would like to thank Alcatel Space Denmark 
(ASD) for sponsoring this work, Senior Designer Henrik 
Møller from ASD for his contributions to the real-world 
implementation and Senior Design Engineer Arturo Arroyo 
from International Rectifier for helping with the 
temperature measurements.  
 
REFERENCES 
 
[1]  Optimized load sharing control by means of thermal 
reliability management, Submitted for PESC2004 
[2] Fundamentals of Power Electronics – second edition, 
Robert W. Erickson and Dragan Maksimovic 
[3] Reliability challenges due to excess stress under high 
frequency switching of power devices,  Professor 
Johann W. Kolar, ETH, Zürich 
[4] Reliability prediction of electronic equipment, Military 
Handbook 217-F 
[5]  Magnetics core data for High Flux Powder Cores 
 
 
Thermal droop load sharing automates power systems reliability optimization
 
           Seth Sanders Carsten Nesgaard 
sanders@eecs.berkeley.edu     nesgaard@eecs.berkeley.edu  
 
 The increasing demand for high-current power supplies 
calls for simple implementation techniques that enable 
system designers to meet customer requirements using 
off-the-shelf products, thus optimizing parameters such as cost, 
system complexity and time to market. An approach often 
taken in the synthesis of power systems for non-critical 
applications is the droop-based parallel connection of 
multiple converters. This particular approach allows for a 
cost-effective implementation with sufficient accuracy for 
most modern applications. In fact most droop-based power 
systems can be implemented with an overall current 
imbalance of less than 5% [1]. In cases where very tight 
regulation and load sharing are needed, cost and circuit 
complexity are usually secondary parameters and the use of 
dedicated load share controllers is therefore justified. 
 This article describes an alternative and very simple 
reliability enhancing load sharing technique for parallel-
connected power supplies. The power system under 
consideration is comprised of two identical converters. By 
means of series resistors the two power supplies are 
connected to the same load as shown in Fig. 1.  
[Circuit: power supply 1 (V1) and power supply 2 (V2) feed the common load RLOAD (VOUT) through the series resistors R1 and R2, carrying the currents I1 and I2]
Fig. 1 : Simple droop load sharing 
 
 The correlation between the load voltage (VOUT) and the 
two power supply voltages (V1 and V2) is given by: 
 
 $V_{OUT} = V_1 - I_1 \cdot R_1 = V_2 - I_2 \cdot R_2$   (1) 
 
 From (1) it can be seen that the output characteristics of 
the two power supplies are sloped with decreasing output 
voltage for increasing load current. A graphical illustration 
of (1) can be seen in Fig. 2. 
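
Equation (1) together with the load relation I1 + I2 = VOUT / RLOAD fixes the operating point. The sketch below solves for it directly. The supply voltages and resistor values are hypothetical, and the supplies are treated as ideal voltage sources.

```python
def droop_operating_point(v1, r1, v2, r2, r_load):
    """Solve V_OUT = V1 - I1*R1 = V2 - I2*R2 with I1 + I2 = V_OUT / R_LOAD."""
    v_out = (v1 / r1 + v2 / r2) / (1.0 / r1 + 1.0 / r2 + 1.0 / r_load)
    i1 = (v1 - v_out) / r1
    i2 = (v2 - v_out) / r2
    return v_out, i1, i2


# Hypothetical example: 50 mV set-point mismatch, 20 mOhm droop resistors
v_out, i1, i2 = droop_operating_point(5.05, 0.02, 5.00, 0.02, 0.5)
print(f"V_OUT = {v_out:.3f} V, I1 = {i1:.2f} A, I2 = {i2:.2f} A")
```

The supply with the higher set-point carries the larger share of the load current. Larger series resistors reduce the imbalance at the cost of additional loss and a steeper output characteristic.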
[Plot: output characteristics, VOUT vs. IOUT, with the nominal voltage and the upper and lower voltage limits marked]
Fig. 2 : Converter output characteristics  
 
 Having described the basic operation and characteristics 
of the simple droop load sharing the focus will now be 
turned towards feedback networks, temperature imbalances 
and system reliability. The common feedback network 
shown in Fig. 3a provides a means for stabilizing the 
system output voltage. The feedback voltage generated by 
the resistor network is a scaled replica of the output 
voltage, which allows the controller to account for load 
variations. 
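[Circuits: (a) resistor divider RF1, RF2 from V1 or 2 producing VFB; (b) resistor divider RF1, RF2 from VOUT producing VFB, with the added thermistor RT and series resistor RS]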
Fig. 3 : Feedback networks 
 
 If this scaled replica of the output voltage could be 
offset using system temperature information the series resistor 
in the simple droop load sharing could be eliminated, thus 
improving system efficiency. Fig. 3b shows a network 
modification that incorporates changes in temperature into 
the feedback signal. The temperature dependent device RT 
is an NTC thermistor while RS is a fixed series resistor. 
Deriving an analytical expression for the feedback voltage 
VFB as a function of temperature T results in the following 
rather complicated equation: 
 
 
 $V_{Feedback} = V_{OUT} \cdot \dfrac{R_{F2}}{R_{F2} + \dfrac{R_{F1} \cdot \left( R_{25} \cdot e^{\beta \cdot \left( \frac{1}{273 + T} - \frac{1}{298} \right)} + R_{S} \right)}{R_{25} \cdot e^{\beta \cdot \left( \frac{1}{273 + T} - \frac{1}{298} \right)} + R_{S} + R_{F1}}}$   (2) 
 
where R25 is the nominal thermistor resistance at 25°C and 
β is a material constant provided by the device 
manufacturer. The remaining variables in (2) can be 
identified in Fig. 3b. Inserting values into (1) for a 5V test 
converter results in the output characteristic shown in Fig. 
4, with a 300mV droop from 40°C to 160°C, which equals 
an output current interval from 0A to 7A (see Fig. 5). 
 
[Plot: output voltage VOUT (4.85V to 5.15V, nominal output voltage 5.00V) vs. temperature (40°C to 160°C) at TAmbient = 40°C; curves: Ideal droop output voltage, Thermal droop output voltage]
Fig. 4 : Output characteristic of actual power supply 
 
 As can be seen in Fig. 4 the output characteristic caused 
by the thermal droop load sharing is highly non-linear. This 
is due to the very non-linear resistance vs. temperature 
relation of the thermistor. However, in load sharing 
applications a non-linear output characteristic is relatively 
irrelevant as long as the thermistors used in the converters 
are the same. The proposed implementation depicted in 
Fig. 3b actually imitates one of several recognized ways of 
linearizing non-linear elements, hence linearizing the 
output characteristic to some degree. 
 It should be noted that an increase in output voltage at 
lower temperatures can only be achieved if the resistance 
Submitted for PELS Newsletter, Second Quarter 2004
 
of RF2 is slightly decreased. This can be done by mounting 
a resistor in parallel with RF2.  
 The mode of operation is straightforward: As the 
temperature increases the resistance of RT decreases thus 
generating a slightly larger feedback voltage. In turn this 
forces the controller to slightly decrease the converter 
output voltage hence producing an output characteristic 
similar to that shown in Fig. 2. The slope of the output 
characteristic can be altered by changing the ratio of the 
added resistors.  
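
The mode of operation can be illustrated numerically. The sketch below assumes that the series branch RT + RS is connected in parallel with RF1 and uses the common β model for the NTC thermistor. All resistor values and the β constant are hypothetical.

```python
import math


def thermistor_resistance(t_celsius, r25, beta):
    """NTC resistance from the beta model, with R25 the value at 25 degC."""
    return r25 * math.exp(beta * (1.0 / (273.0 + t_celsius) - 1.0 / 298.0))


def feedback_voltage(v_out, t_celsius, r_f1, r_f2, r_s, r25, beta):
    """Divider output with (RT + RS) assumed in parallel with RF1."""
    r_branch = thermistor_resistance(t_celsius, r25, beta) + r_s
    r_upper = r_f1 * r_branch / (r_f1 + r_branch)
    return v_out * r_f2 / (r_f2 + r_upper)


# Hypothetical network: RF1 = RF2 = 10k, RS = 47k, R25 = 100k, beta = 4000 K
network = dict(r_f1=10e3, r_f2=10e3, r_s=47e3, r25=100e3, beta=4000.0)
v_fb_cool = feedback_voltage(5.0, 40.0, **network)
v_fb_hot = feedback_voltage(5.0, 160.0, **network)
print(f"VFB at 40 degC: {v_fb_cool:.3f} V, at 160 degC: {v_fb_hot:.3f} V")
```

With these assumed values the feedback voltage rises with temperature for a fixed output voltage, so a controller that regulates the feedback voltage to a fixed reference would lower the output voltage as the converter heats up.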
 It is well-known that the reliability of electronic parts is 
very dependent on operating temperature. Due to the 
parasitic elements inherent in any real-world components 
different reliability optimums under given working 
conditions apply to each component. The proposed thermal 
droop load sharing technique accounts for this fact by using 
the individual converter temperatures as local feedback 
offsets. This ensures the lowest overall power system 
temperature, thus optimizing the overall system reliability.  
 
[Plot: MOSFET temperature (25°C to 175°C) and individual converter current vs. output current, at TAmbient = 40°C with AHeatsink = 50mm x 50mm; curves: Converter 1 and Converter 2 temperature and current, with the equal current and equal temperature operating points marked]
Fig. 5 : System temperature vs. output current 
 
 Fig. 5 shows the system temperature caused by 
semiconductor power losses as a function of output current. 
In this example it is assumed that the MOSFET ON-
resistance of converter 1 is 10% larger than that of 
converter 2. From Fig. 5 it can be seen that equal converter 
current (7A each) results in a converter temperature 
difference of almost 10°C. From a reliability point of view 
this leads to an unavailability of converter 1 of almost 
twice that of converter 2. This fact can be seen in Fig. 6 
where the normalized failure rate for the system is 
depicted. 
 
[Plot: normalized failure rate λ (FIT) vs. temperature, annotated with Unavailability (Q) = 1 - e^(-λ·t) for t = 1 year: Converter 1, equal current: Q = 0.048; Converter 2, equal current: Q = 0.034; Thermal droop technique: Q = 0.039; QSystem = 0.0016 and QSystem = 0.0014]
Fig. 6 : System failure rate 
 
 Changing the current vs. output voltage load sharing to 
the thermal droop load sharing equalizes the temperature of 
the two converters resulting in an offset situation where 
converter 1 supplies 6.69A and converter 2 supplies 7.31A. 
This current adjustment lowers the average MOSFET 
temperature from (150+141)/2 = 145.5°C to 145°C (see 
Fig. 5). This temperature difference, although it might 
seem minor, results in an overall system unavailability 
reduction of 12.5%. Test values based on the normalized 
system failure rates can be seen in Fig. 6 where the failure 
rates for each temperature scenario also can be 
identified. It should be noted that as power levels and/or 
system complexity increase the overall system impact of 
the proposed thermal load sharing becomes more pronounced.  
 This simple example has shown that the proposed 
thermal droop load sharing technique lowers the overall 
system unavailability by equalizing the individual 
converter temperatures. Furthermore, the system efficiency 
will increase due to the removal of the droop resistors 
depicted in Fig. 1. 
 
References 
[1] ‘When It Comes To Compact PCI Supplies, Standards 
Are Helping’, Lazar Rozenblat and Paul Kingsepp, 
Todd Products Corp., web-article.  
[2] ‘U-129 UC3907 Load Share IC Simplifies Parallel 
Power Supply Design’, Mark Jordan, TI application 
note 
[3] ‘Efficiency improvement in redundant power systems by 
means of thermal load sharing’, Carsten Nesgaard and 
Michael A. E. Andersen, Technical University of 
Denmark, Accepted for APEC 2004. 
 
Carsten Nesgaard is currently 
completing his Ph.D. from the Technical 
University of Denmark. Since February 
2003 he has been a visiting scholar at the 
University of California, Berkeley where 
he is working with Professor Seth 
Sanders. This work recently resulted in 
the design of a highly reliable power 
system for a precision docking project 
managed by Partners for Advanced 
Transit and Highways.  
 In the past Mr. Nesgaard has worked 
as a design engineer consultant for 
Alcatel Space Denmark where he was part of the power systems 
design team.  
 
 Seth R. Sanders received the S.B. 
degrees in electrical engineering and 
physics and the S.M. and Ph.D. degrees 
in electrical engineering from the 
Massachusetts Institute of Technology, 
Cambridge, in 1981, 1985, and 1989, 
respectively.  
 He was a Design Engineer at the 
Honeywell Test Instruments Division, 
Denver, CO. Since 1989, he has been on 
the faculty of the Department of Electrical 
Engineering and Computer Sciences, 
University of California, Berkeley, where 
he is presently Professor. 
 His research interests are in high frequency power conversion 
circuits and components, in design and control of electric machine 
systems, and in nonlinear circuit and system theory as related to 
the power electronics field. 
 
   Optimized Load Sharing Control by 
   means of Thermal Reliability Management 
 
Carsten Nesgaard 
Oersted-DTU, Automation  
 DK-2800 Kongens Lyngby 
Technical University of Denmark 
Email: cn@oersted.dtu.dk 
Michael A. E. Andersen 
Oersted-DTU, Automation  
 DK-2800 Kongens Lyngby 
Technical University of Denmark 
Email: ma@oersted.dtu.dk 
 
Abstract – As the demand for reliable power systems 
comprised of parallel-connected converter units continues to 
increase the need for optimized load sharing techniques rises 
accordingly. This fact is emphasized by a power system 
minimization trend that tends to shrink the available PCB 
area set aside for the power system. These contradictory 
trends feed the research in topics such as advanced thermal 
management and dynamic thermal management. 
Implementation of the latter topic usually requires the use of 
complicated controllers that continuously monitor the 
thermal working environment and key components within 
the system.  If the thermal stress exceeds preset limits the 
controller reacts according to a predetermined sequence to 
minimize the damaging effects of excessive thermal stress.  
 This paper combines the dynamic thermal management 
with the load sharing. It is hereby effectively ensured that the 
parts count is kept to a minimum while providing a dynamic 
optimization of parameters such as average and absolute 
system temperatures as well as overall system reliability. The 
latter aspect is achieved by redistributing the current 
throughput of each converter, which in turn results in equal 
thermal conditions as opposed to the well-known and widely used 
current sharing technique’s intent to establish equal currents. 
 
I.  INTRODUCTION 
 
 With new applications for high-current low-output-
voltage power systems emerging nearly every day the need 
for new and cost-efficient power system designs is a 
matter of course. As output voltage levels continue to 
decrease an approach that seems more and more attractive 
is the implementation of distributed power configurations 
with point-of-load power conversion. This technique 
distributes a high voltage to all parts of the system, thus 
minimizing the voltage drops throughout the distribution 
network. However, this configuration only solves the 
problem of power losses in the distribution network while 
the problems of high-current low-output-voltage 
conversion at the point-of-load remain a challenge. A 
common solution to the latter problem is parallel-
connection of multiple converter units. This technique is 
attractive for a number of reasons. The first and most 
obvious is that it provides the designer with a simple 
technique for reliability improvements as redundancy quite 
easily can be implemented. Another advantage of this 
particular technique is that it allows the designer to 
implement large power systems by means of off-the-shelf 
units, thus minimizing parameters such as design time and 
system costs. However, due to non-ideal parts each 
converter unit deviates from the ideal case, which makes a 
power system comprised of parallel-connected converters 
a rather poor performing system. To account for the non-
ideal parts some form of load sharing is needed to ensure 
that each converter in the configuration delivers its share 
of the total output power.  
 In other words parallel-operation of multiple converters 
is employed when specifications require a highly reliable 
system, designable within a very short time frame and at 
low costs. However, to make full use of the system’s 
potential load control is a must. 
 The steps involved in designing a power system are 
many and would by no means fit the page limit of this 
paper nor is it the intention of this paper to describe a 
detailed power system realization. However, in order to 
clarify some of the design choices described in subsequent 
sections a very short introduction to the initial design 
considerations will be given.  
 The first design consideration of importance to the N+1 
redundant power system described in this paper is the 
number of converters to use. In the design of an N+1 
redundant system the most straightforward 
implementation is the design of two identical converters 
each capable of supplying the maximum load current. 
However, this approach results in a 100% power 
‘overshoot’ – meaning that the available system power is 
twice that required by the specifications. Increasing the 
number of converter units reduces this power ‘overshoot’. 
For a N+1 redundant power system Figure 1 shows the 
percent-wise decrease in power ‘overshoot’ as the number 
of converter units increases. The other curve is an index 
that takes into account the decrease in converter unit cost 
price, the increase in circuit complexity and the increase in 
load sharing circuitry costs – all a function of the number 
of converter units. The index is based on component cost 
(pr. 1000 pieces) and standard load sharing 
implementation circuitry. It should be noted that the index 
curve in many situations will change as a function of the 
number of units when large scale manufacturing is 
employed and/or different load sharing techniques are 
used. 
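 The power 'overshoot' argument above can be illustrated with a short sketch. It assumes identical converter units and that any x-1 of the x units must carry the full load; the function names are illustrative only, and the second function assumes that the reduction curve of Figure 1 is the relative drop in required per-unit rating when one more unit is added.

```python
def power_overshoot_percent(units: int) -> float:
    """Installed power in excess of the load requirement, in percent.

    Assumes an N+1 system of `units` identical converters in which any
    (units - 1) of them must be able to carry the full load.
    """
    if units < 2:
        raise ValueError("an N+1 redundant system needs at least 2 units")
    per_unit_rating = 1.0 / (units - 1)   # fraction of full load per unit
    installed = units * per_unit_rating   # total installed capacity
    return (installed - 1.0) * 100.0


def overshoot_reduction_percent(units: int) -> float:
    """Relative drop in per-unit rating when going from x to x + 1 units."""
    if units < 2:
        raise ValueError("an N+1 redundant system needs at least 2 units")
    rating_x = 1.0 / (units - 1)
    rating_next = 1.0 / units
    return (rating_x - rating_next) / rating_x * 100.0


for x in range(2, 7):
    print(x, round(power_overshoot_percent(x), 1),
          round(overshoot_reduction_percent(x), 1))
```

 Under these assumptions two units give the 100% 'overshoot' mentioned above, while three units, the configuration used later in this paper, give 50%.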
Presented at Power Electronics Specialists Conference 2004, Aachen, Germany, June 2004
 
[Figure 1 plot: power 'overshoot' reduction in % (axis 0 to 60) versus number of units in the N+1 system (axis 0 to 12)]
 
Figure 1 : Percent-wise decrease in power ‘overshoot’ 
 From Figure 1 it can be seen that the two curves 
intersect somewhere between 3 and 4 converter units. This 
point is the optimum in the configuration at hand. 
However, as indicated above this optimum point is most 
likely to shift to either side along the axis of abscissa when 
other power system implementations are considered.  
 From a reliability point of view the number of 
converter units should be kept to a minimum. As an 
example, using the data for the power system at hand, an 
N+1 redundant system comprised of 4 converter units is 
40% more likely to fail at any given time than an N+1 
redundant power system comprised of 3 converter units. 
The same tendency holds when transitioning from a 3 
converter system to a 2 converter system. However, due to 
the percent-wise larger increase in component count in the 
latter case the probability of system failure is 65% higher 
in an N+1 redundant 3 converter system than in an N+1 
redundant 2 converter system. From these calculations it 
can be seen that as the number of converter units increases 
a smaller and smaller gain in reliability is achieved when 
substituting an X unit system with an X-1 unit system. 
 The next design consideration of importance to this 
paper is the choice of load sharing technique. The most 
commonly used technique is the current sharing technique. 
This paper examines the current sharing technique as well 
as a new thermal load sharing technique. In each case the 
pros and cons will be discussed and a comparison of the 
two techniques will be presented in section “VI.  
RELIABILITY”. 
 
II.  POWER SYSTEM 
 
 Based on the intersection of the two curves shown in 
Figure 1 and the subsequent reliability issues concerning 
parallel-connection of multiple converter units the power 
system in this paper is comprised of N+1=3 parallel-
connected buck converters each capable of supplying 
15 A (RMS) at an output voltage of 5 V. The maximum load 
current IOUT is 30 A.  
 With reference to Figure 2 the individual converter 
parameters in the parallel-configuration can be identified. 
I1, T1 are parameters associated with converter 1; I2, T2 are 
associated with converter 2 and I3, T3 are associated with 
converter 3. These parameters form the basis of the 
thermal calculations as well as the reliability assessments. 
[Figure 2 diagram: Converter 1 (T1), Converter 2 (T2) and Converter 3 (T3) connected in parallel between Iin and IOUT, carrying the currents I1, I2 and I3]
 
 
Figure 2 : N+1 redundant power system 
 
 All calculations are based on the assumption that 
each buck converter is implemented with a single 
MOSFET transistor. In converter implementations with 
multiple MOSFET switches and/or synchronous 
rectification the overall impact of improper load sharing 
would be even more profound, a fact that can be deduced 
from the calculations in section “IV.  EFFECTS OF 
PARASITIC ELEMENTS”. 
 Although strictly theoretical, the following 
calculations establish the foundation for rethinking the 
‘obvious’ load sharing approach, the current sharing 
technique. 
 
III.  CURRENT SHARING 
 
 The most common and widely accepted technique for 
load sharing is the current sharing technique. The idea 
behind the technique is that equal stress and temperature are 
achieved with identical currents through each converter. In 
turn, this should result in optimized performance and 
reliability. In the ideal case with identical converter 
components and identical thermal operating surroundings 
this technique does indeed result in optimized performance 
and reliability. However, the ideal case is very rare and the 
result of implementing the current sharing technique is 
often less advantageous than predicted by the theoretical 
models. 
 Figure 3 shows the general case where a number of 
converters are paralleled and forced to supply an equal 
share of the total output current.  
 
[Figure 3 diagram: three DC/DC converters, each with its own load control block, feeding a common load and interconnected through a load sharing bus]
 
 
Figure 3 : Traditional current sharing technique 
 
 A more detailed illustration of the current sharing 
technique is depicted to the right in Figure 4 – where it can 
be seen that high side current sensing is required (in non-
isolated systems) as well as dual supply rails for the 
control circuitry. 
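[Figure 1 legend. Index curve: $(\text{Complexity index} + \text{LS circuitry index} \cdot x - \text{Price index}(x-1)) \cdot 0.75$. Power 'overshoot' reduction curve: $\dfrac{P_{Max\ pr.\ unit}(x) - P_{Max\ pr.\ unit}(x+1)}{P_{Max\ pr.\ unit}(x)} \cdot 100$]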
 
 The illustration of IOUT vs. Temperature in Figure 4 
(lower left) is a representation of the maximum output 
current (IMAX) as a function of system temperature. In most 
converter designs the horizontal line (IMAX) determines the 
maximum safe output current and is often based on output 
current under worst-case temperature conditions. 
 The current sharing technique is thoroughly described 
in numerous papers, articles and application notes [4] - [7]. 
For this reason the description of this technique will be 
limited to that already presented. As a summary the pros 
and cons of the current sharing technique will briefly be 
discussed.  
 The advantages of the current sharing technique 
compared to that of a system without any load sharing are 
many. Being a simple technique to implement the current 
sharing technique ensures that no single converter unit is 
stressed to the maximum. This is ensured by preventing 
any single converter from going into current limitation – 
due to for instance small variations in individual converter 
output voltages. The main drawback of the current sharing 
technique is the need for output current sensing. Sensing 
the output current is typically done by inserting a resistor 
in series with the converter output. This resistor causes 
additional power loss – although it can be kept to a 
minimum compared to for example the semiconductor 
losses of the converter – thus resulting in system heating.
 
                
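[Figure 4 diagram. Left: converter block diagram with power components, PWM control, load share control, current measurement, input, output and current limit (ILIM), together with a plot of IOUT versus temperature showing ILimit, IMAX and TMAX. Right: high side sensing circuit with ISENSE, RMEAS, an OP-amp with R1 to R4, +9 V and -9 V supply rails, and the LS controller]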
                    Figure 4 : Current sharing implementation and controller current waveform 
 
IV.  EFFECTS OF PARASITIC ELEMENTS 
 
 All electronic parts are associated with parasitic 
elements that deviate from the ideal-part models used in 
the initial system analysis. Since a description of all these 
parasitic elements would form the basis for an entire paper 
this section concentrates on addressing the parasitic 
elements associated with the MOSFET transistors. This 
simplification is justified by the fact that the MOSFET 
transistor in the power system at hand generates the most 
heat. Being the primary source for system heating the 
MOSFET transistor is also the primary cause of 
deteriorated system reliability.  
 To simplify matters even further in order to fit the page 
limit the analysis in this paper focuses on the conduction 
losses caused by the temperature dependent MOSFET ON-
resistance. Although switching losses also depend on 
temperature [3] these losses contribute far less to the load 
sharing temperature deviation than the conduction losses. 
 According to transistor manufacturers’ datasheets the 
nominal value of the MOSFET ON-resistance RDS(ON) can 
vary by as much as ±30% from one batch of transistors to 
another, a fact that must be taken into account when 
thermal system issues are considered. However, the 
variation in RDS(ON) between transistors from the same 
batch is usually much smaller. 
 Figure 5 shows the MOSFET ON-resistance as a 
function of temperature for the MOSFET transistors used 
in the power system design shown in Figure 2. 
[Figure 5 plot: RDS(ON) in Ω (axis 0.025 to 0.150) versus temperature (axis -25 to 150)]
 
Figure 5 : MOSFET RDS(ON) temperature dependency  
 
 In Figure 5 it can be seen that the MOSFET ON-
resistance increases from 70mΩ to 140 mΩ when the 
junction temperature increases from 25°C to 140°C.   
 By means of a simple example this section will show 
that utilization of the current sharing technique with the intent 
to optimize the overall system reliability quite often results 
in imbalanced power loss distribution within the system. 
 MOSFET transistor power generation due to RDS(ON) 
can be expressed by the following equation: 
 
 $P_{R_{DS(ON)}} = I_{RMS}^{2} \cdot R_{DS(ON)}$     (1) 
 
 The heat generated by the power loss calculated in (1) 
is transferred from the MOSFET casing and heat-sink to 
the ambient by means of convection and radiation. A 
mathematical description of this heat transfer can be 
established by the following two equations [2]:
 
 
 $P_{Convection} = 1.34 \cdot A \cdot \sqrt[4]{\dfrac{(T_{Surface} - T_{Ambient})^{5}}{h}}$     (2) 
 
 $P_{Radiation} = 5.7 \cdot 10^{-8} \cdot A \cdot \left(T_{Surface}^{4} - T_{Ambient}^{4}\right)$     (3) 
 
 In (2) the variable ‘h’ is the height of the heat-sink 
while the variable ‘A’ in both (2) and (3) denotes the area 
of the heat-sink. A graphical representation of (2) and (3) 
is shown in Figure 6 and Figure 7 respectively. 
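 As a minimal sketch, the convection expression in (2) can be evaluated numerically. The heat-sink area of 20 cm x 20 cm is taken from the figure annotations; the height of 0.2 m is an assumption made here, since the text does not state it, and the function name is illustrative.

```python
def convection_power(area_m2: float, height_m: float,
                     t_surface_c: float, t_ambient_c: float) -> float:
    """Heat removed by natural convection according to (2), in watts."""
    delta_t = t_surface_c - t_ambient_c
    if delta_t <= 0.0:
        return 0.0  # no heat flows from a surface at or below ambient
    return 1.34 * area_m2 * (delta_t ** 5 / height_m) ** 0.25


AREA = 0.20 * 0.20   # 20 cm x 20 cm heat-sink, in square metres
HEIGHT = 0.20        # assumed vertical height in metres (not stated in the text)

for t_surface in (60, 80, 100, 120, 140):
    print(t_surface, round(convection_power(AREA, HEIGHT, t_surface, 40.0), 1))
```

 With these assumptions the result at a surface temperature of 140°C and an ambient temperature of 40°C is roughly 25 W, which agrees with the upper end of the scale in Figure 6.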
 
[Figure 6 plot: PConvection in W (axis 5 to 25) versus TSurface in °C (axis 60 to 140); TAmbient = 40°C, AHeatsink = 20 cm x 20 cm]
 
Figure 6 : Power dissipation caused by convection 
 
[Figure 7 plot: PRadiation in W (axis 0.2 to 1.0) versus TSurface in °C (axis 60 to 140); TAmbient = 40°C, AHeatsink = 20 cm x 20 cm]
 
Figure 7 : Power dissipation caused by radiation 
 
 From Figure 6 and Figure 7 it can be seen that the heat 
transfer from MOSFET to ambient is almost solely due to 
convection. 
 In order to calculate the conduction losses the choice of 
MOSFET transistor must be recognized. For illustration 
purposes it is chosen to implement the three converters in 
the configuration with a transistor of nominal RDS(ON), a 
transistor of nominal RDS(ON) + 30% and a transistor of 
nominal RDS(ON) - 30% respectively. In a real-world 
implementation this scenario would be extremely rare 
although a difference in RDS(ON) among the three transistors 
should be expected. 
 Thermal equilibrium is obtained when heat generation 
equals heat dissipation. To assist in the estimation of 
MOSFET transistor temperature the thermal model shown 
in Figure 8 is established.   
 
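[Figure 8 diagram: thermal equivalent circuit in which the heat source PRDS(ON) at the junction (Tj) is connected through Rjc to the case (Tc) and through Rcs to the heat-sink surface (TSurface), from where PRadiation + PConvection is dissipated to TAmbient]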
 
Figure 8 : Thermal system equivalent 
 Using (1), (2), (3) and the thermal model shown in 
Figure 8 an exact value for the conduction losses and 
MOSFET temperatures can be found: 
 
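 $T_j = 90.7\,^{\circ}\mathrm{C} \qquad T_s = 78.4\,^{\circ}\mathrm{C} \qquad P_{R_{DS(ON)},\,nom-30\%} = 7.7\,\mathrm{W}$     (4) 
 $T_j = 121.4\,^{\circ}\mathrm{C} \qquad T_s = 99.8\,^{\circ}\mathrm{C} \qquad P_{R_{DS(ON)},\,nom} = 13.5\,\mathrm{W}$     (5) 
 $T_j = 173.9\,^{\circ}\mathrm{C} \qquad T_s = 134.9\,^{\circ}\mathrm{C} \qquad P_{R_{DS(ON)},\,nom+30\%} = 24.4\,\mathrm{W}$     (6) 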
 
 As expected the temperature dependency of the 
MOSFET ON-resistance has a negative overall effect that 
contributes to a significant increase in conduction losses. 
 The calculated temperatures indicate that the MOSFET 
transistor with RDS(ON),nom+30% operates very close to the 
recommended maximum temperature and is thus very 
likely to fail. 
 From the results in (4), (5) and (6) the average junction 
temperature for the 3 MOSFET transistors can be found to 
be 128.7°C while the associated average heat-sink surface 
temperature is 104.4°C.  
 
V.  THERMAL LOAD SHARING 
 
 The proposed thermal load sharing technique 
compensates for the imbalanced power losses that result 
from implementing the current sharing technique. By 
monitoring the temperature of the heat generating 
component (or components) the load current supplied by 
each converter in the parallel-configuration can be 
adjusted to take into account parameters such as parasitic 
elements, physical layout and working environment. 
 Using this technique each converter works at the same 
temperature, which in turn results in identical converter 
reliability in the parallel-configuration. 
 Figure 9 shows the thermal load sharing technique in a 
configuration where the MOSFET transistor heat-sink 
temperatures are monitored and fed back to the control 
circuitry. This configuration corresponds to the system 
considered in this paper.  
 
[Figure 9 diagram: three DC/DC converters, each with a load control block fed by a temperature signal (Temp), supplying a common load and interconnected through a load sharing bus]
 
Figure 9 : Thermal load sharing technique 
 
 The real-world implementation of the thermal load 
sharing technique is straightforward, since the existing 
control circuitry employed by for example the current 
sharing technique can be used. The temperature sensing 
device is simply mounted at the most critical location 
within the converter, in this case on the MOSFET 
transistor casing. The temperature signal is then fed back 
to the load share controller where it replaces the current 
signal. To ensure a system startup without running one or 
more converters into current limitation the signal from the 
current measurement can be combined with the 
temperature information to create a load share controller 
that initially uses the current sharing technique. As the 
temperature of the individual converters changes the output 
current information is offset by the temperature signal, 
thus maximizing system efficiency and reliability. 
 
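[Figure 10 diagram: converter block diagram with power components, PWM control, load share control, current measurement, input and output; the temperature sensor TSense, supplied from 2.7 V to 20 V through R1 and R2, is mounted as part of the power components]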
 
Figure 10 : Temperature sensor mounting 
 
 Using the same equations that were used to calculate 
the MOSFET transistor power losses and temperatures in 
section “IV.  EFFECTS OF PARASITIC ELEMENTS” the load 
distribution for the thermal load share technique can now 
be established. It should be noted that the temperature was 
a variable in the current sharing calculations while the 
individual load currents were fixed parameters. In the case 
of thermal load sharing the temperature is fixed while the 
current distribution among the individual converters is the 
variable. The result is shown below:  
 $I_{R_{DS(ON)},\,nom-30\%} = 11.7\,\mathrm{A}$     (7) 
 $I_{R_{DS(ON)},\,nom} = 9.6\,\mathrm{A}$     (8) 
 $I_{R_{DS(ON)},\,nom+30\%} = 8.7\,\mathrm{A}$     (9) 
 The average MOSFET junction temperature that results 
from distributing the load current as calculated in (7), (8) 
and (9) is 115.5°C. This is 13.2°C lower than the average 
junction temperature using the current sharing technique. 
Due to the fact that heat dissipation from heat-sink to 
ambient depends on source-to-ambient temperature 
difference the average surface temperature associated with 
the 115.5°C junction temperature is 95.7°C. This is 8.7°C 
lower than the average surface temperature in the current 
sharing case. Since the overall system reliability is a 
function of heat-sink surface temperature it can easily be 
seen that the probability of system survival is much better 
in the case of thermal load sharing.  
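 The averages and temperature differences quoted above can be reproduced with a few lines of arithmetic. This is a check of the stated numbers only; it assumes that the load currents in (7), (8) and (9) read 11.7 A, 9.6 A and 8.7 A, which is consistent with the 30 A maximum load current.

```python
junction_current_sharing = [90.7, 121.4, 173.9]  # Tj in degC, from (4)-(6)
surface_current_sharing = [78.4, 99.8, 134.9]    # Ts in degC, from (4)-(6)
thermal_currents = [11.7, 9.6, 8.7]              # A, as read from (7)-(9)

AVG_TJ_THERMAL = 115.5   # degC, stated for thermal load sharing
AVG_TS_THERMAL = 95.7    # degC, stated for thermal load sharing


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


avg_tj = round(mean(junction_current_sharing), 1)
avg_ts = round(mean(surface_current_sharing), 1)
tj_gain = round(avg_tj - AVG_TJ_THERMAL, 1)
ts_gain = round(avg_ts - AVG_TS_THERMAL, 1)
total_current = sum(thermal_currents)

print(avg_tj, avg_ts, tj_gain, ts_gain, round(total_current, 1))
```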
 Among the advantages of the thermal load sharing 
technique is its ability to optimize the system reliability at 
any given time, its system efficiency enhancing 
capabilities and its easy and cost-effective implementation. 
Another, perhaps less obvious, advantage of the thermal 
load sharing technique is its ability to control converters 
with different power ratings. Suppose a power system was 
comprised of multiple high power converters and a single 
low power converter. If the low power converter is being 
over-loaded its temperature would increase causing the 
thermal load share controller to require the remaining 
converters to supply more current, thus alleviating the 
over-loaded converter. A similar situation in a system 
employing the current sharing technique would quickly 
cause a converter malfunction. 
 Even though it has very little effect on the system, a 
disadvantage of the thermal load sharing technique that 
must be mentioned is the possibility of a slight increase 
in individual converter failure rate. However, this 
drawback is more than compensated for by the much lower 
average system temperature that results from the 
implementation. 
 
VI.  RELIABILITY 
 
 It is well known that system temperature is the single 
most important parameter in system reliability 
assessments. Minimizing the temperature rise increases the 
system reliability and quite often also results in better 
system efficiency. Therefore in order to assess the system 
reliability the component distribution of the printed circuit 
board must be known.  
 When considering the physical layout of the converter 
there is a trade-off assessment between the thermal aspects 
of the converter design and the electrical constraints of for 
instance the physical distance between MOSFET and 
controller IC. From a reliability point of view the IC 
should be positioned as far away from the heat generating 
MOSFET as possible. However, from an electrical point of 
view the IC should be positioned as close to the MOSFET 
gate terminal as possible – in order to minimize the effects 
of PCB trace inductance. As a compromise the layout 
shown in Figure 11 is chosen for the reliability assessment.  
 
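[Figure 11 diagram: PCB layout with heatsink, transistor, transformer, ICs and miscellaneous components, above a plot of temperature versus distance along the PCB marked with TSurface, TTransformer, TIC, TEnd of PCB and TAmbient]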
 
Figure 11 : System temperature distribution 
 
 Based on the above temperature distribution an 
assessment of the overall system reliability can be 
established.  
 Using the component data found in [1] the following 
failure rates (expressed as failures in $10^9$ hours) for the 
three converters in the current sharing configuration can be 
calculated:
 
 $\lambda_{R_{DS(ON)},\,nom-30\%} = 4383\ \mathrm{FIT}$   (10) 
 $\lambda_{R_{DS(ON)},\,nom} = 9042\ \mathrm{FIT}$   (11)  
 $\lambda_{R_{DS(ON)},\,nom+30\%} = 27471\ \mathrm{FIT}$   (12) 
The probability of survival for each converter is calculated 
by utilizing the exponential distribution: 
 $\mathrm{Prob} = e^{-\lambda \cdot 10^{-9} \cdot 8760\ \mathrm{hours}}$      (13) 
It should be noted that the calculated probability of 
converter survival is for a period of one year. 
 $\mathrm{Prob}_{R_{DS(ON)},\,nom-30\%} = 0.9623$       (14) 
 $\mathrm{Prob}_{R_{DS(ON)},\,nom} = 0.9238$       (15) 
 $\mathrm{Prob}_{R_{DS(ON)},\,nom+30\%} = 0.7861$       (16) 
 
Combining the binomial terms for the probability that all 
converters work and the probability that exactly one converter 
fails results in the following system reliability:  
 
 $\mathrm{Prob}_{System} = 0.9740$       (17) 
 
Expressing this probability in terms of system 
unavailability the following probability of annual down-
time can be established: 
 
 $P = 1 - \mathrm{Prob}_{System} = 1 - 0.9740 = 0.0260 = 2.60\%$       (18) 
 
Performing the same reliability calculations for the thermal 
load sharing technique provides a foundation for a system 
performance comparison. Since the temperatures in this 
case are the same for all three converters they have 
identical failure rates: 
 
 $\lambda_{Thermal} = 7819\ \mathrm{FIT} \Rightarrow \mathrm{Prob}_{Thermal} = 0.9338$       (19) 
 
Based on (19) the overall system reliability can be 
calculated: 
 
 $\mathrm{Prob}_{System} = 0.9874$       (20) 
 
Expressing (20) in terms of unavailability: 
 
 $P = 1 - \mathrm{Prob}_{System} = 1 - 0.9874 = 0.0126 = 1.26\%$       (21) 
 
Comparing (18) and (21) it can easily be seen that the 
probability of system malfunction for the thermal load 
sharing technique is less than half that of the current 
sharing technique. Calculating the percent-wise decrease 
in system unavailability one finds that the proposed 
technique reduces the annual down-time probability by 
51.6%. This is a significant reduction caused simply by 
considering the parasitic elements of the MOSFET 
transistors. Had the converters been positioned in different 
working surroundings the effect could have been even 
more profound. 
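 The reliability figures of this section can be reproduced with a short sketch. It assumes independent converter failures and a system that survives as long as at most one of the three converters has failed. The failure rates are read from (10) to (12) and (19) as 4383, 9042, 27471 and 7819 FIT, values that reproduce the survival probabilities stated in (14) to (16) and (19). The function names are illustrative.

```python
import math
from itertools import product

HOURS_PER_YEAR = 8760


def survival_probability(failure_rate_fit: float,
                         hours: float = HOURS_PER_YEAR) -> float:
    """Exponential survival probability as in (13); 1 FIT = 1 failure per 1e9 h."""
    return math.exp(-failure_rate_fit * 1e-9 * hours)


def system_reliability(unit_probs, max_failures=1):
    """Probability that at most `max_failures` of the independent units fail."""
    total = 0.0
    for states in product((True, False), repeat=len(unit_probs)):
        if states.count(False) > max_failures:
            continue  # too many failed units: the system is down
        p = 1.0
        for works, prob in zip(states, unit_probs):
            p *= prob if works else (1.0 - prob)
        total += p
    return total


current_sharing = [survival_probability(fit) for fit in (4383, 9042, 27471)]
thermal_sharing = [survival_probability(7819)] * 3

r_current = system_reliability(current_sharing)
r_thermal = system_reliability(thermal_sharing)
reduction = 1.0 - (1.0 - r_thermal) / (1.0 - r_current)

print(round(r_current, 4), round(r_thermal, 4), round(reduction * 100, 1))
```

 Enumerating all working/failed combinations avoids having to assume identical converters, which matters in the current sharing case where the three survival probabilities differ.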
 
VII.  CONCLUSION 
 
This paper has provided the foundation for a new thermal 
load sharing technique that at any given time ensures 
optimum reliability, performance and efficiency. A 
comparison between the thermal load sharing technique 
and the common and widely accepted current sharing 
technique is provided and the pros and cons in each case 
have been discussed. 
 Reliability estimations have been provided as analytic 
evidence of the superior reliability of the thermal load 
sharing technique. Among the advantages of the thermal 
load sharing technique are optimized reliability, 
minimization of MOSFET losses resulting in an increase 
in overall system efficiency and simple implementation. A 
disadvantage of the thermal load sharing technique is the 
possibility of a slight increase in individual converter 
failure rate. However, this fact is by far compensated 
through the much lower average system temperature that 
results from the implementation. 
 
ACKNOWLEDGMENT 
 
 The authors would like to thank Alcatel Space 
Denmark (ASD) for sponsoring this work and Senior 
Designer Henrik Møller from ASD for his comments and 
suggestions throughout this work.  
 
REFERENCES 
 
[1] Reliability prediction of electronic equipment, Military 
Handbook 217-F 
[2] Power Electronics, Second Edition, Mohan, Undeland, 
Robbins 
[3] Reliability challenges due to excess stress under high 
frequency switching of power devices, Johann W. Kolar, 
ETH, Zürich European Power Electronics and Drives 
Conference 2003 
[4] Paralleled DC power supplies sharing loads equally, US 
patent 4,635,178 
[5] System and method of load sharing control for 
automobile, US patent 5,157,610 
[6] Current share circuit for DC to DC converters, 
 US patent 5,521,809 
[7] U-129, UC3907 Load Share IC Simplifies Parallel Power 
Supply Design, Application Note – Texas Instruments 
[8] Efficiency improvement in redundant power systems by 
means of thermal load sharing, Carsten Nesgaard and 
Michael A. E. Andersen, Applied Power Electronics 
Conference and Exposition 2004, Anaheim, USA
 
Experimental Verification of the  
Thermal Droop Load Sharing 
 
 
Carsten Nesgaard 
Oersted-DTU, Automation  
 DK-2800 Kongens Lyngby 
Technical University of Denmark 
Email: cn@oersted.dtu.dk 
Michael A. E. Andersen 
Oersted-DTU, Automation  
 DK-2800 Kongens Lyngby 
Technical University of Denmark 
Email: ma@oersted.dtu.dk 
 
 
Abstract – As the demand for reliable power systems 
comprised of parallel-connected converter units continues to 
increase, the need for optimized, yet simple, load sharing 
techniques increases accordingly. 
 This paper examines the traditional series droop resistor 
technique and its reliability optimized counterpart - the 
thermal droop load sharing. The former technique ensures an 
approximate load sharing among the individual converters by 
distributing an equal current throughout the system while the 
latter technique uses temperature information to adjust the 
individual converter currents, hence intentionally creating an 
unequal load sharing. 
 Following an introduction to the thermal droop load 
sharing technique, a set of measurements of both techniques 
is presented. These measurements verify that the thermal droop load 
sharing technique has a positive impact on the overall system. 
 
I.  INTRODUCTION 
 
 Keeping the design of high-current power supplies at a 
low cost, low circuit complexity, relatively high efficiency 
and with a short time to market is a challenging job for the 
power supply designers. This fact feeds the constant search 
for alternative implementation methods that exceed present 
performance levels. An implementation method often used 
in high-current power systems is the parallel-connection of 
several identical converters each capable of supplying part 
of the load current. This approach allows for relatively easy 
design with little circuit complexity beyond that of the 
individual converters. In addition, a short time to market 
makes this method an attractive solution from a 
management point of view. 
 In order to control the current flow through each 
converter in this type of configuration some form of load 
sharing is usually applied. In mission critical systems or 
power supplies for high availability systems the use of a 
dedicated load share controller IC is often considered the 
only viable solution. The added cost and circuit complexity 
of the dedicated controller IC is justified by the requirement 
of high availability which in many cases cannot be assessed 
in terms of economical aspects. In most other high-current 
applications the use of a simpler load sharing technique is 
often adequate. The technique under consideration in this 
paper is the droop load sharing technique and its reliability 
optimized counterpart the thermal droop load sharing. The 
former technique ensures an approximate load sharing 
among the individual converters by distributing an equal 
current throughout the system while the latter technique 
uses temperature information to adjust the individual 
converter currents, hence intentionally creating an unequal 
load sharing. The implementation of the thermal droop load 
sharing technique follows the guidelines proposed in [2].  
 Lowering the loop gain is another common technique 
for achieving droop load sharing. This increases the overall 
efficiency compared to the series resistor implementation 
but comes at the cost of lower dynamic system capabilities. 
Furthermore, changing the loop gain in prefabricated 
converters is by no means simple if not impossible, thus 
eliminating a key element in the droop load sharing – 
simplicity of implementation. 
 For comparison purposes the traditional droop load 
sharing is initially implemented with the intent to achieve a 
regulation as close to the ideal as possible. Having obtained 
a set of measurements of the initial system setup the load 
sharing technique is altered to include temperature 
information and a second set of measurements is obtained. 
In section ‘III. EXPERIMENTAL RESULTS AND SYSTEM 
DESCRIPTION’ an analysis of the differences in the two load 
sharing techniques is provided followed by a verbal 
discussion of the pros and cons.  
 A graphical illustration of the ideal droop load sharing 
using series resistors is depicted in Figure 1.  
 
[Figure 1 plot: output characteristics, VOUT versus IOUT, showing the droop line between the upper voltage limit and the lower voltage limit around the nominal voltage]
 
Figure 1 : Ideal droop output voltage 
 
II.  THE POWER SYSTEM 
 
 The power system under consideration is comprised of 
3 commercially available converters, each capable of 
supplying 15A at a 5V output voltage. Figure 2 shows a 
photo of the top and bottom of the converters used.
Submitted for review at Applied Power Electronics Conference and Exposition 2005, Austin, USA, March 2005
 
       
Figure 2 : Converter top and bottom view 
 
The specifications set forth for the power system are as 
follows: 
 
 Output voltage : 5 V ± 5% 
 Max load current  : 20 A 
 Voltage droop : 300 mV 
 
 Since the converters are encapsulated the technique 
proposed in [2] cannot be implemented directly. However, 
the converters are fitted with a trim pin that allows for a 
±10% alteration of the output voltage. In terms of 
achieving accurate droop load sharing this is more than 
adequate since the droop load sharing usually limits the 
voltage droop to a few tenths of a volt. Indeed, this power 
system seeks to create a droop voltage of 300mV in the 
range from no load to full load. This would leave room for 
additional voltage variations at the output before the ±5% 
regulation limit is reached. Starting out with the very 
simple feedback network shown in Figure 3 it becomes 
clear that the thermistor resistance ratio over the intended 
operating range results in a voltage droop that would 
exceed the specified regulation limits (see Figure 5). In 
fact, the converter output voltage would be at a constant 
minimum due to an ambient temperature of 40°C, thus 
being outside the regulation boundaries (the shaded areas in 
Figure 5). 
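[Figure 3 diagram: feedback network formed by the thermistor RT, the series resistor RS and the capacitor CT, connected between VOUT and the TRIM pin]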
 
Figure 3 : Basic feedback network 
 
 With reference to Figure 3 the individual components 
are briefly described. RT is a thermistor with a room 
temperature resistance (R25) of 5kΩ and a material constant 
(β) of 3950, RS is a series resistor of 4.9 kΩ used to 
increase the output voltage at low current levels and CT is a 
stabilizing/noise reducing capacitor of 1nF. With the 
implementation shown in Figure 3 the correlation between 
temperature and feedback voltage will in general be very 
non-linear due to the characteristic of the thermistor RT. 
However, as can be seen in Figure 5 the part of the output 
voltage that actually exhibits the droop slope is very close 
to linear. Nonetheless, the feedback network has to be 
modified in order to achieve regulation and the intended 
300mV droop voltage. 
 
[Figure 4 plot: feedback voltage VFB (axis 1.5 to 4.5 V) versus temperature (axis 20 to 100), TAmbient = 40°C. The plot is annotated with the divider expression for VFB in terms of VOUT, RS, R25, β and T, and with the values VOUT = 5 V, β = 3950, R25 = 5.0 kΩ and RS = 7.5 kΩ]
   
Figure 4 : Feedback voltage 
 
[Figure 5 plot: output voltage VOUT (axis 1.0 to 6.0 V) versus temperature (axis 20 to 100), with the droop slope marked; TAmbient = 40°C]
Figure 5 : Output voltage 
 
 The approach taken in modifying the feedback network 
comes from a common linearizing technique applicable to 
non-linear elements. The modified feedback network can be 
seen in Figure 6 while the associated output voltage droop 
can be seen in Figure 7.  
 
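[Figure 6 diagram: modified feedback network between VOUT and the TRIM pin, consisting of RT, RS, CT, RF1 and RF2, with RF1 = 13 kΩ, RF2 = 3.9 kΩ, RT,25C = 5.0 kΩ, RS = 4.3 kΩ, CT = 1.0 nF and β = 3950]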
 
Figure 6 : Modified feedback network 
 
 An analytic expression relating the feedback voltage 
and thermistor temperature equals: 
 
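 $V_{Feedback} = V_{OUT} \cdot \dfrac{R_{F2}}{R_{F2} + \dfrac{R_{F1} \cdot \left(R_{25} \cdot e^{\beta \cdot \left(\frac{1}{273 + T} - \frac{1}{298}\right)} + R_{S}\right)}{R_{25} \cdot e^{\beta \cdot \left(\frac{1}{273 + T} - \frac{1}{298}\right)} + R_{S} + R_{F1}}}$       (1) 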
 
 This equation is used in the theoretical determination of 
the feedback network component values listed in Figure 6. 
A graphical representation of (1) can be seen in Figure 7. 
The dotted line illustrates the ideal droop output 
voltage for the system under consideration while (1) exhibits 
non-linear characteristics, as predicted. From a load 
sharing point of view this non-linear characteristic is 
relatively insignificant as long as all three converters 
exhibit the same non-linear droop output voltage. 
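 A minimal numerical sketch of the feedback expression is given here. It assumes the network topology implied by (1), namely RF1 in parallel with the series connection of RT and RS, and this combination in series with RF2, across which the feedback voltage is taken. It also assumes that the component values of Figure 6 are in kΩ and that VOUT is held at 5 V. The function names are illustrative.

```python
import math

BETA = 3950.0   # thermistor material constant
R25 = 5.0e3     # thermistor resistance at 25 degC, in ohms
RS = 4.3e3      # series resistor, in ohms
RF1 = 13.0e3    # resistor in parallel with RT + RS, in ohms
RF2 = 3.9e3     # resistor across which the feedback voltage is taken, in ohms


def thermistor_resistance(temp_c: float) -> float:
    """Beta-model thermistor resistance, using the constants 273 and 298 of (1)."""
    return R25 * math.exp(BETA * (1.0 / (273.0 + temp_c) - 1.0 / 298.0))


def feedback_voltage(temp_c: float, v_out: float = 5.0) -> float:
    """Feedback voltage of the modified network for a given thermistor temperature."""
    branch = thermistor_resistance(temp_c) + RS   # RT in series with RS
    upper = RF1 * branch / (RF1 + branch)         # paralleled with RF1
    return v_out * RF2 / (RF2 + upper)            # divider with RF2


for temp in (40, 60, 80, 100, 120, 140, 160):
    print(temp, round(feedback_voltage(temp), 3))
```

 Because RF1 bounds the upper branch resistance at low temperatures and RS bounds it at high temperatures, the feedback voltage changes far less steeply with temperature than in the basic network, which is the purpose of the linearizing resistors.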
 
[Figure 7 plot: VOUT (axis 4.85 to 5.15 V, nominal output voltage 5.00 V) versus temperature (axis 40 to 160), showing the ideal droop output voltage and the thermal droop output voltage; TAmbient = 40°C]
 
Figure 7 : Modified feedback network characteristics 
 
 The physical mounting of the thermistors onto the base-
plate was achieved by utilizing small metal brackets. In 
order to ensure approximately equal bracket-to-thermistor 
force among the converters each metal bracket was fitted 
using a torque wrench. 
 
III. EXPERIMENTAL RESULTS AND SYSTEM DESCRIPTION 
 
 In order to avoid overstressing the converters in case of 
malfunction the total load current is limited to 20A, which 
is within the capability of the system. Figure 8 shows the 
real-world test setup for the traditional droop load sharing 
utilizing series resistors as droop elements. To avoid power 
system shut-down in the event of a single converter fault all 
outputs are OR’ed using Schottky diodes.  
 
[Figure 8 schematic: Converters 1, 2 and 3 (pins +INPUT, -INPUT, ON/OFF, +OUTPUT, -OUTPUT, +SENSE, -SENSE, TRIM) supplied from a common input VIn. Each converter output feeds the common load through its own series droop resistor (RDroop-1 to RDroop-3) and isolation diode (DIsolation-1 to DIsolation-3).]
 
Figure 8 : Test setup 
 
 The first set of measurements is completed for the 
traditional droop load sharing implemented with a 60mΩ 
resistor in series with each converter output. 
[Figure 9 plot: voltage droop (V, axis 4 to 5) versus individual converter current (A, axis 0 to 10) for Converters 1, 2 and 3.]
 
Figure 9 : Converter voltage droop vs. output current 
[Figure 10 plot: temperature (°C, axis 0 to 120) versus individual converter current (A, axis 0 to 10) for Converters 1, 2 and 3.]
 
Figure 10 : Converter temperature vs. output current 
 
 Figure 9 shows the droop voltage of each converter 
operated individually as a function of output current. It can 
be seen that the differences in droop voltage are relatively 
small, which implies that the system will exhibit 
relatively good load sharing. Figure 10 shows the 
individual converter temperatures as the droop voltages in 
Figure 9 were obtained. It should be noted that the 
converter temperatures shown in Figure 10 include the 
temperature of the series resistor and the OR’ing diode.  
 The load sharing that results from operating all three 
converters simultaneously is shown in Figure 11. 
 
[Figure 11 plot: individual converter current (A, axis 0 to 10) versus load current (A, axis 0 to 20) for Converters 1, 2 and 3.]
 
Figure 11 : Individual converter current sharing 
 
 From the current distribution among the individual 
converters, shown in Figure 11, it can be seen that 
the use of precision resistors as droop elements provides 
relatively good current regulation. The large differences in 
load sharing are due to the initial variation in converter set 
point voltages as shown in Figure 9. It can be seen that 
converter 3 supplies the majority of the load current. This 
fact combined with its high temperature over the entire 
operating range decreases the overall system reliability and 
increases the combined average system temperature. 
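 
The influence of set point variation on series resistor droop sharing can be sketched with a simple steady-state model. The example below is an assumption-laden illustration, not the measured system: the three set point voltages are hypothetical, all droop resistors are taken as exactly 60 mΩ, and the OR'ing diodes are modelled only as ideal blockers of reverse current (their forward drop is ignored).

```python
def droop_sharing(set_points, r_droop, i_load):
    """Return (bus_voltage, currents) for converters paralleled via equal droop resistors.

    Each converter i supplies I_i = (V_i - V_bus) / R_droop, and the currents
    must sum to the load current. Converters whose set point lies below the
    bus voltage are blocked by their diode and supply no current.
    """
    active = list(range(len(set_points)))
    while True:
        v_bus = (sum(set_points[i] for i in active) - i_load * r_droop) / len(active)
        blocked = [i for i in active if set_points[i] < v_bus]
        if not blocked:
            break
        active = [i for i in active if i not in blocked]
    currents = [(set_points[i] - v_bus) / r_droop if i in active else 0.0
                for i in range(len(set_points))]
    return v_bus, currents


if __name__ == "__main__":
    hypothetical_set_points = [5.02, 4.97, 5.15]  # volts, illustration only
    for load in (1.0, 10.0, 20.0):
        v_bus, currents = droop_sharing(hypothetical_set_points, 0.060, load)
        shares = ", ".join(f"{c:5.2f} A" for c in currents)
        print(f"load {load:4.1f} A: bus {v_bus:.3f} V, currents {shares}")
```

In this model a set point difference of ΔV between two converters translates into a fixed current difference of ΔV divided by the droop resistance, independent of load, so the converter with the highest set point always carries the largest share.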
 The very simple steps involved in changing from the 
traditional droop load sharing technique to the thermal 
droop load sharing are now carried out. From this point 
forward, the system contains temperature information used 
to optimize the power system and simultaneously 
eliminates the series droop resistance, which contributes 
significantly to the low overall efficiency of the ‘series 
droop resistor’ system. 
 Figure 12 shows an image of the real-world 
implementation where the initial droop resistors, the 
thermistors and the feedback network are easily 
identifiable. The two sets of wires (red) are used for 
measuring the effects of thermal droop load sharing without 
having to remove the series droop resistors.  
 
 
Figure 12 : Real-world test configuration 
 
[Figure 13 plot: voltage droop (V, axis 4.5 to 5.5) versus individual converter current (A, axis 0 to 10) for Converters 1, 2 and 3.]
 
Figure 13 : Individual converter voltage droop 
 
 From Figure 13 it can be seen that the added thermal 
load sharing circuitry decreased the output voltage of 
converter 3, thus causing it to have the lowest droop 
voltage of the three converters over almost the entire 
operating range. This, being just the opposite of the 
scenario in the series resistor droop configuration, is caused 
by component tolerances although much effort has been 
put into finding accurate resistors. Another observation 
worth mentioning is the voltage slope of converter 3. This 
voltage, as opposed to the voltages of converter 1 and 
converter 2, is almost a straight line.  
 The information provided in Figure 13 clearly shows 
that the added feedback circuit has a positive impact on the 
converter output voltages. With the exception of light loads 
and full load the three droop voltages are almost equal. 
[Figure 14 plot: individual converter current (A, axis 0 to 8) versus load current (A, axis 0 to 20) for Converters 1, 2 and 3.]
 
Figure 14 : Thermal load sharing converter currents  
 Figure 14 shows the current sharing among the three 
converters while being controlled by the thermal droop load 
sharing technique. The added thermal feedback network 
clearly has a positive impact on the current distribution in 
terms of equal current sharing. It should be noted that this 
scenario is caused by distributing the converter currents 
according to the baseplate temperatures and a different 
power system configuration could actually result in an even 
more unequal current distribution than shown in Figure 11.
 Turning the attention back to the previously mentioned 
tolerance issue, it is necessary to establish the extreme 
limits of the output voltage caused by component tolerances 
in the feedback network in order to examine voltage 
deviations in more detail. The first limit is the absolute 
lowest output voltage possible. This occurs when the 
feedback network constantly provides the converter with 
the highest feedback voltage. In other words the thermistor 
to feedback resistor (RF2) ratio must be at a minimum. At 
the other extreme the feedback voltage would have to be at 
its minimum at all times. This occurs when the thermistor 
to feedback resistor (RF2) ratio is at a maximum. 
Summarizing these observations the following table can be 
established: 
 
Component   VOUT,MAX   VOUT,MIN 
RF1         -2%        +2% 
RF2         +2%        -2% 
RT,25C      -2%        +2% 
RS          -2%        +2% 
Table 1 : Voltage variation due to component tolerances 
 
 Since the combinations of feedback network component 
tolerances that could cause this deviation from the ideal are 
endless, the only way to establish accurate data is to 
measure all components. However, from an operational 
point of view these tolerances are just a fact of life and as 
long as the power system performs satisfactorily there is no 
need to measure every component. As an example of 
component tolerances leading to the abovementioned 
feedback voltage deviation, the following set of values has 
been deduced: 
 
 RF1 = + 1.18% 
 RF2 = + 1.18% 
 RT,25C = + 1.18% 
 RS = + 0.0% 
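 
The effect of such tolerance combinations on the feedback voltage can be evaluated directly. The following sketch is illustrative only: it assumes the same reading of (1) as a divider of RF2 against RF1 in parallel with (RS + RT), applies the two columns of Table 1 and the example set above to the nominal values of Figure 6, and uses a hypothetical baseplate temperature of 60 °C.

```python
import math

# Nominal values listed for Figure 6 (ohms).
NOMINAL = {"r_f1": 13e3, "r_f2": 3.9e3, "r_25": 5.0e3, "r_s": 4.3e3}

# Percentage deviations: the two columns of Table 1 and the example set from the text.
FIRST_COLUMN = {"r_f1": -2.0, "r_f2": +2.0, "r_25": -2.0, "r_s": -2.0}
SECOND_COLUMN = {"r_f1": +2.0, "r_f2": -2.0, "r_25": +2.0, "r_s": +2.0}
EXAMPLE_SET = {"r_f1": 1.18, "r_f2": 1.18, "r_25": 1.18, "r_s": 0.0}


def apply_tolerance(nominal, percent):
    """Scale every nominal value by its percentage deviation."""
    return {name: value * (1.0 + percent[name] / 100.0)
            for name, value in nominal.items()}


def feedback_voltage(temp_c, r_f1, r_f2, r_25, r_s, beta=3950.0, v_out=5.0):
    """Assumed reading of (1): R_F2 against R_F1 in parallel with (R_S + R_T)."""
    r_t = r_25 * math.exp(beta * (1.0 / (temp_c + 273.0) - 1.0 / 298.0))
    branch = r_s + r_t
    upper = r_f1 * branch / (r_f1 + branch)
    return v_out * r_f2 / (r_f2 + upper)


if __name__ == "__main__":
    temp_c = 60.0  # hypothetical baseplate temperature
    cases = [("nominal", dict.fromkeys(NOMINAL, 0.0)),
             ("Table 1, first column", FIRST_COLUMN),
             ("Table 1, second column", SECOND_COLUMN),
             ("example set", EXAMPLE_SET)]
    for label, percent in cases:
        v_fb = feedback_voltage(temp_c, **apply_tolerance(NOMINAL, percent))
        print(f"{label:24s} V_feedback = {v_fb:.4f} V")
```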
 
 The next measurement shown in Figure 15 illustrates 
the common output voltage bus during a single converter 
failure while supplying a total output current of 5A. It can 
be seen that the resulting voltage drop is significant 
although the duration is only a few hundred nanoseconds. 
Since the power system considered in this chapter is 
comprised of 3 hybrid converters without any additional 
capacitance at the output the only mechanism working to 
prevent the voltage glitch from happening is the control 
circuitry of the converters. Forming a small capacitor bank 
at the output will assist the control circuitry in its attempt 
and will greatly reduce the voltage glitch in the event of a 
converter failure. 
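 
A first-order feel for the size of such a capacitor bank can be obtained from C = I·Δt/ΔV. The numbers in this sketch are assumptions chosen for illustration only: the failed converter is taken to have carried one third of the 5 A load, the interval to be bridged is set to 300 ns, and the tolerated dip to 50 mV. None of these three values is taken from the measurement.

```python
def hold_up_capacitance(current_step_a, duration_s, allowed_dip_v):
    """First-order estimate C = I * dt / dV for a capacitor supplying a current step."""
    if allowed_dip_v <= 0.0:
        raise ValueError("allowed_dip_v must be positive")
    return current_step_a * duration_s / allowed_dip_v


if __name__ == "__main__":
    lost_share_a = 5.0 / 3.0   # assumed share of the failed converter
    duration_s = 300e-9        # assumed time until the other converters respond
    allowed_dip_v = 0.050      # assumed acceptable voltage dip
    c = hold_up_capacitance(lost_share_a, duration_s, allowed_dip_v)
    print(f"Estimated bus capacitance: {c * 1e6:.1f} uF")
```

The estimate ignores capacitor ESR and ESL, which in practice influence the response to a glitch of this duration.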
 
 
Figure 15 : Output glitch during converter failure 
 
 Finally, the efficiency of the two techniques is 
measured. This is illustrated in Figure 16 where the lower 
curve is the efficiency of the series droop resistor approach 
and the upper curve is the efficiency of the 
thermal droop load sharing approach. 
[Figure 16 plot: efficiency (axis 0 to 1) versus output power (W, axis 0 to 100) for the two load sharing techniques.]
 
Figure 16 : Overall system efficiency 
 
 
IV. RELIABILITY 
 
 The reliability evaluation of the power system at hand is 
complicated greatly by the fact that detailed component 
data is unobtainable. The only data available are those 
provided in the converter datasheet. It is well-known that 
the reliability of electronic parts is very dependent on 
operating temperature. Due to the parasitic elements 
inherent in any real-world components different reliability 
optimums under given working conditions apply to each 
component. The proposed thermal droop load sharing 
technique accounts for this fact by using the individual 
converter temperatures as local feedback offsets. This 
ensures the lowest overall power system temperature, thus 
optimizing the overall system reliability. Since the 
converters are encapsulated in a very small package it is 
assumed that all parts work at the same temperature; the 
measured baseplate temperature times the thermal 
resistance of 1 mm of aluminum gives the thermal drop across 
the heatsink.  
 The reliability evaluation presented in this section uses 
the reliability data in the converter datasheet by 
normalizing the MTBF to the worst-case temperature. It is 
hereafter possible to calculate the relative changes in 
reliability as a function of temperature. The degree of 
accuracy of these calculations depends on the component 
distribution as well as the ratio of passive components to 
active components. To account for this fact the calculations 
are based on seven active components. Most real-world 
designs incorporate a smaller number of active components 
in simple DC-DC converters, and the results presented 
below will provide the end-user with even higher system 
reliability. In other words, the assumptions on which the 
following calculations are based form a basis for a 
worst-case reliability improvement (minimum reliability 
improvement possible). 
 The point of origin is a measurement of the temperature 
contribution made by the series droop resistors. The result, 
shown in Figure 17, provides the resistor temperature 
increase above ambient temperature as a function of the 
current passing through it. 
[Figure 17 plot: droop resistor temperature increase above ambient (°C, axis 0 to 90) versus current through the droop resistor (A, axis 0 to 10).]
 
Figure 17 : Resistor temperature increase above 
ambient 
 
 It can be seen that the temperature curve in Figure 17 
has the shape of a second order function, which is exactly 
what should be expected of resistive power losses.  
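 
The second order shape follows from P = I²·R. As a minimal sketch, the dissipation of the 60 mΩ droop resistor can be tabulated over the measured current range; relating this power to a temperature rise would additionally require a thermal resistance, which is not given here and is therefore left out.

```python
R_DROOP = 0.060  # ohms, the series droop resistor used in the measurements


def droop_loss_watts(current_a, r_droop=R_DROOP):
    """Resistive power loss P = I^2 * R in a single droop resistor."""
    return current_a ** 2 * r_droop


if __name__ == "__main__":
    for current_a in range(0, 11, 2):
        print(f"{current_a:2d} A -> {droop_loss_watts(current_a):.2f} W")
```

Doubling the current quadruples the dissipation, which is consistent with the curvature described above if the temperature rise is assumed to be proportional to the dissipated power.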
 Having established a correlation between converter 
current and series resistor temperature rise the reliability 
calculations can be performed. The point of origin is the 
determination of the individual converter temperatures at 
full load current. Fortunately, this is easily done by relating 
Figure 10 and Figure 11. These values are then used to 
establish a failure rate for each converter and consequently 
a reliability number as a function of time. Next, the same 
procedure is followed for the thermal droop technique and a 
comparison is possible. The data used is summarized in the 
following table: 
 
 
            Series resistor droop technique      Thermal droop technique 
Converter   Current   Temperature   FIT          Current   Temperature   FIT 
1           6.23 A    60.1°C        178          7.42 A    57.4°C        160 
2           5.33 A    57.2°C        158          6.11 A    53.0°C        134 
3           8.34 A    94.9°C        802          6.49 A    60.1°C        178 
Table 2 : Reliability data for the two techniques
 
 From the data presented in Table 2 it can be seen that 
converter 3 operates at a temperature of more than 30°C 
above the other two converters in the configuration when 
the load sharing is achieved by means of the series resistor 
droop technique. This has a very negative impact on the 
associated failure rate (FIT). In fact, the failure rate of 
converter 3 is 4.8 times higher than the average of the other 
two converters. Inserting this much higher value into the 
exponential function for calculating the overall reliability, 
results in a very high probability of failure for converter 3. 
In turn, this decreases the overall system reliability 
considerably. The average system temperature that results 
from the data shown in Table 2 is 70.7°C for the series 
resistor droop technique and 56.8°C for the thermal droop 
technique.  
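 
The figures quoted above follow directly from Table 2 and can be checked with a few lines of arithmetic. The sketch below only restates the table; the variable and function names are hypothetical.

```python
from statistics import mean

# (current in A, temperature in degC, FIT) per converter, copied from Table 2.
SERIES = {1: (6.23, 60.1, 178), 2: (5.33, 57.2, 158), 3: (8.34, 94.9, 802)}
THERMAL = {1: (7.42, 57.4, 160), 2: (6.11, 53.0, 134), 3: (6.49, 60.1, 178)}


def average_temperature(table):
    """Mean converter temperature for one technique."""
    return mean(temp for _, temp, _ in table.values())


def fit_ratio_to_others(table, converter):
    """Failure rate of one converter relative to the mean of the remaining ones."""
    others = [fit for key, (_, _, fit) in table.items() if key != converter]
    return table[converter][2] / mean(others)


if __name__ == "__main__":
    print(f"Series droop average temperature:  {average_temperature(SERIES):.1f} degC")
    print(f"Thermal droop average temperature: {average_temperature(THERMAL):.1f} degC")
    print(f"Converter 3 FIT ratio (series):    {fit_ratio_to_others(SERIES, 3):.1f}")
```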
 The unavailability of the two techniques, calculated from 
the data in Table 2, is illustrated in Figure 18. 
  
[Figure 18 plot: unavailability (axis 0.00005 to 0.00035) versus years in operation (axis 1 to 5). Curves: series resistor droop technique and thermal droop technique.]
 
Figure 18 : System unavailability vs. time in years 
 
 Figure 18 shows the combined system unavailability as 
a function of years in operation. Based on the curves it is 
quite clear that the traditional series resistor droop 
technique is much more likely to fail than its thermal droop 
counterpart. The exact percentage decrease 
in system unavailability can be established by dividing the 
unavailability difference between the two load sharing 
techniques by the unavailability of the traditional series 
resistor droop load sharing. In mathematical terms this 
decrease can be expressed as: 
 
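\[
\Delta Q = \frac{-100 \cdot \xi}{-2 \cdot e^{-\frac{124611 \cdot t}{12500000}} + e^{-\frac{10731 \cdot t}{1250000}} + e^{-\frac{657 \cdot t}{78125}} + e^{-\frac{4599 \cdot t}{1562500}} - 1} \qquad (2)
\]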
 
where ξ is given by: 
 
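\[
\xi = 2 \cdot e^{-\frac{124611 \cdot t}{12500000}} - e^{-\frac{10731 \cdot t}{1250000}} - e^{-\frac{657 \cdot t}{78125}} - 2 \cdot e^{-\frac{10293 \cdot t}{2500000}} + e^{-\frac{8541 \cdot t}{3125000}} + e^{-\frac{15987 \cdot t}{6250000}}
\]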
 
 Inserting numerical values into (2) results in an overall 
unavailability decrease of 75.04%. This is a significant 
reduction, which to a large extent is caused by the lowering of 
the overall system temperature through elimination of the droop 
resistors. Also, the significant drop in the operating 
temperature of converter 3 contributes to the large decrease 
in overall system unavailability. In fact, it can be calculated 
that, in the initial series resistor droop technique 
implementation, converter 3 has a 70% chance of failing 
while the other two converters approximately split the 
remaining 30% (converter 1 = 16.2% and converter 2 
= 13.8%).  
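 
The calculation behind (2) can be sketched numerically. The example rests on several assumptions that should be kept in mind: the exponents in (2) are read as sums of failure rates, which suggests a system that survives as long as at least two of the three converters work; FIT is taken as failures per 10^9 hours with 8760 hours per year; failure rates are constant; and the FIT values are those listed in Table 2. The function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760.0


def unavailability_2oo3(fits, years):
    """Probability that fewer than two of three converters survive `years`.

    Each converter has reliability exp(-lambda * t) with lambda = FIT * 1e-9 per hour.
    """
    r1, r2, r3 = (math.exp(-fit * 1e-9 * HOURS_PER_YEAR * years) for fit in fits)
    system_reliability = r1 * r2 + r1 * r3 + r2 * r3 - 2.0 * r1 * r2 * r3
    return 1.0 - system_reliability


def percent_decrease(fits_before, fits_after, years):
    """Unavailability difference divided by the unavailability before, in percent."""
    q_before = unavailability_2oo3(fits_before, years)
    q_after = unavailability_2oo3(fits_after, years)
    return 100.0 * (q_before - q_after) / q_before


def failure_share(fits):
    """Share of the total failure rate contributed by each converter, in percent."""
    total = sum(fits)
    return [100.0 * fit / total for fit in fits]


if __name__ == "__main__":
    series_fits = (178, 158, 802)    # Table 2, series resistor droop technique
    thermal_fits = (160, 134, 178)   # Table 2, thermal droop technique
    for years in (1, 3, 5):
        decrease = percent_decrease(series_fits, thermal_fits, years)
        print(f"{years} years: unavailability decrease {decrease:.1f} %")
    print("Failure shares (series):",
          ", ".join(f"{s:.1f} %" for s in failure_share(series_fits)))
```

Under these assumptions the result lands in the same region as the decrease quoted above; the exact figure depends on the FIT values and the time instant inserted.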
 As a concluding remark it is worth noting that the 
temperature distribution that results from the thermal droop 
load sharing implementation still allows for certain 
deviations. As opposed to the technique described in [4], 
this technique has no feedback to a common controller that 
effectively equalizes the temperatures. A system 
incorporating a dedicated controller could also be 
implemented by means of standard off-the-shelf converters. 
However, such a system increases the overall circuit 
complexity and thereby eliminates the entire idea of the 
droop technique: its simplicity of implementation. 
 
V. CONCLUSION 
 
 This paper has provided the experimental verification 
that the new thermal droop load sharing technique enhances 
both system efficiency and overall reliability. Furthermore, 
due to the added thermal feedback network the current 
sharing among the individual converters was improved. 
However, as has been explained this is merely due to 
chance, since the power system adjusts the individual 
converter currents in accordance with the baseplate 
temperature of each converter. 
 In other words, the thermal droop load sharing 
technique combines the efficiency of the lower loop gain 
method with the dynamic capabilities of the series resistor 
method. 
 
ACKNOWLEDGMENT 
 
 The authors would like to thank CALEX for sponsoring 
high-quality converters for the power system and International 
Rectifier for allowing the test measurements to be 
performed at their research lab in Santa Clara. Also, the 
authors would like to thank Professor Seth Sanders from 
UC Berkeley for his contributions to this work.  
 
REFERENCES 
 
[1] ‘When It Comes To Compact PCI Supplies, Standards 
Are Helping’, Lazar Rozenblat and Paul Kingsepp, 
Todd Products Corp., web-article. 
[2] ‘Thermal droop load sharing automates power system 
reliability optimization’, Carsten Nesgaard and Seth R. 
Sanders, submitted for the second quarter PELS 
Newsletter. 
[3] ‘75Watt QH Single Series DC/DC Converters’, 
CALEX data sheets, www.calex.com 
[4] ‘Efficiency improvement in redundant power systems 
by means of thermal load sharing’, Carsten Nesgaard 
and Michael A. E. Andersen, Applied Power 
Electronics Conference and Exposition 2004, 
Anaheim, USA 
 
Topological reliability analysis of common front-end 
DC/DC converters for server applications 
 
Carsten Nesgaard 
Oersted-DTU, Automation  
 DK-2800 Kongens Lyngby 
Technical University of Denmark 
Email: cn@oersted.dtu.dk 
Michael A. E. Andersen 
Oersted-DTU, Automation  
 DK-2800 Kongens Lyngby 
Technical University of Denmark 
Email: ma@oersted.dtu.dk 
 
Introduction 
 
 With the increasing requirements of highly reliable systems the need for power solutions capable of providing the 
same level of reliability increases accordingly. Adopting the technique of redundancy, power system designers have a 
powerful tool for realizing systems with a high degree of reliability. However, in order to maximize the overall reliability 
of such redundant systems dual power sources are often considered a necessity. The search for a suitable power source to 
serve as back-up has led to numerous papers ranging from descriptions of UPS implementations to analysis of the pros 
and cons of the 48V DC telecom bus. One particular solution that everyone seems to agree on is the configuration shown 
in Figure 1 where the 48V DC telecom bus serves as the secondary power source. A thorough description of this telecom 
DC power bus can be found in [1] and [2], for which reason a description hereof is omitted in this presentation. 
 Having found a feasible solution for the secondary power source the search is on for the optimum front-end DC/DC 
topology for conversion of the low bus voltage to the 380VDC input voltage for the DC/DC output stage. During the past 
several years many different topologies and variations hereof have been proposed for this front-end DC/DC converter. 
The emphasis of these topologies has been on improving system efficiency. Although system efficiency is of great 
importance the reason for implementing a redundant configuration in server applications is to minimize down-time. 
Unfortunately, there has been little or no assessment of overall power system reliability as a result of implementing the 
proposed topologies.  
 This paper analyses a boost converter, a cascaded boost converter and a SEPIC converter and finds that many of the 
exotic topologies proposed in numerous papers perform worse than a simple isolated SEPIC converter in terms of 
system survivability. Using this information an isolated SEPIC converter is implemented in a real-world test setup. 
Measurements show that the efficiency of this SEPIC converter is approximately 92% over the entire input voltage range 
(DC-telecom bus : 48V – 75V). 
 
A block diagram of the server power system configuration in question is shown in Figure 1. 
 
[Figure 1 block diagram: an AC/DC front end (90 - 264 VAC) and a DC/DC front end (48 - 75 VDC) both supply the 380 VDC bus, which feeds the DC/DC output stage powering the server.]
 
Figure 1 : Block diagram of the overall power system 
 
 In Figure 1 it can be seen that the power system is comprised of a regular supply (AC/DC front-end) taken from the 
mains and a second supply (DC/DC front-end) taken from the telecom DC power bus. 
 
Reliability 
 
 Reliability is a topic of ever-growing importance in the evaluation of modern systems, whether these are comprised of 
electronic parts, system level building blocks or concern a supply management chain. Originating in fields such as 
military equipment, space products and high-cost computer systems, reliability standards have evolved and are nowadays 
necessary in almost all system applications. The most comprehensive of these standards is the military standard 
(MIL-HDBK-217) which includes failure rates for almost every component found on the commercial market. Following the 
guidelines provided in MIL-HDBK-217 the calculation of single system reliabilities is straightforward, although many 
individual component failure rates are involved. 
 
Submitted for review at International Power Electronics Congress 2004, Celaya, Mexico, October 2004 
 
 As indicated in the introduction, converter evaluations within a given power system are often conducted based on 
efficiency assessments. From a technical point of view this comparison is interesting and much information about the 
system can be obtained. However, from a reliability point of view this system evaluation only provides part of the system 
performance package. Although system efficiency and overall reliability are to some extent related, the number of 
components making up the system, the thermal aspects of the design and the electrical stresses the components have to 
tolerate all have a profound effect on system reliability while system efficiency might be completely unaffected by these 
parameters.  
 Besides the reliability-affecting parameters mentioned above, two additional reliability issues are addressed in 
this paper: system operating temperature and component voltage stress. In reliability engineering it is well known that 
system temperature is the most critical parameter in terms of survivability. However, component voltage stress also has a 
significant effect on system performance. The temperature issues are considered in each of the 3 converter topologies and 
will be described in the following sections. The component deterioration caused by increasing voltage stress will be shown 
graphically but will otherwise not be described in this digest due to the page limitation. In the final paper a short 
description of high voltage stresses will be provided, since this parameter (as well as temperature changes) forms the 
basis for all the reliability calculations leading to the result shown in Figure 7. 
  Two commonly used components that include the working voltage stress in the determination of the failure rates are 
the resistor and the capacitor. A three-dimensional illustration of the reliability effects of varying temperature and voltage 
stress is shown in Figure 2 and Figure 3.  
 
   
[Figure 2 and Figure 3 plots: three-dimensional surfaces of probability of survival versus voltage stress (axis 0.0 to 1.0) and temperature (axis 0 to 150). Survival axis: 0.9985 to 1.0000 for the resistor, 0.994 to 1.0000 for the capacitor.]
 
Figure 2 : Resistor survivability vs. temp. and stress 
Figure 3 : Capacitor survivability vs. temp. and stress 
 
 From Figure 2 and Figure 3 the reliability impact of temperature and voltage stress can be seen. It is clear that the 
overall impact of increasing system temperature is more critical than the component voltage stress. However, as the 
voltage stress approaches 1 (100%) the associated temperature needed to maintain a certain level of reliability decreases 
dramatically. This indicates the importance of a proper trade-off between component selection and allowable power 
losses. 
 Although reliability prediction of electronic equipment is a strong tool it must be kept in mind that the foundation of 
the calculations is based on point estimates of component data. The accuracy of the data in MIL-HDBK-217F is therefore 
only valid for the conditions under which they were obtained. Since it is an impossible task to obtain data for all 
conditions under which a component can be used, extrapolation within a limited range of operating conditions is allowed, 
but must be taken into account when applying the accumulated probability predictions to the electronic equipment under 
consideration. 
 
Boost converter 
 
 Being one of the basic topologies this converter is very simple and many books, papers and articles have described the 
functionality of this converter. Therefore this section will only provide the necessary parameters and descriptions to allow 
for the loss and reliability assessments. 
 
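[Figure 4 schematic: boost converter with input Vin, inductor L1, switch Q1, diode D1, output capacitor C1 and output VOut.]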
 
Figure 4 : Boost converter 
 
 The duty-cycle needed to boost the minimum input voltage to the 380VDC is approximately 0.88. This rather large 
duty-cycle increases the stresses on the semiconductor devices, which in turn results in deteriorated system reliability. 
The exact component stresses of this topology will be included in the final paper, where they will form the basis for a set 
of loss curves and the system reliability evaluation.  
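 
The quoted duty-cycle can be reproduced from the ideal boost relation VOut / VIn = 1 / (1 - D). The sketch below assumes a lossless converter in continuous conduction, so it yields a slightly lower value than a real converter would need; the function name is hypothetical.

```python
def boost_duty_cycle(v_in, v_out):
    """Ideal continuous-conduction boost converter: V_out / V_in = 1 / (1 - D)."""
    if not 0.0 < v_in <= v_out:
        raise ValueError("a boost converter requires 0 < v_in <= v_out")
    return 1.0 - v_in / v_out


if __name__ == "__main__":
    # Input range of the DC telecom bus and the 380 VDC output.
    for v_in in (48.0, 60.0, 75.0):
        print(f"V_in = {v_in:4.0f} V -> D = {boost_duty_cycle(v_in, 380.0):.3f}")
```

At the minimum input voltage of 48 V the ideal relation gives a duty-cycle of roughly 0.87, close to the approximate value stated above.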
 
Cascade boost converter 
 
 As specified in the previous section the boost converter is a poor choice when large step-up ratios are needed.  In the 
application at hand the step-up process requires a duty-cycle very close to 1, which increases the semiconductor stresses 
considerably. A topology that has been considered by many to be the best solution for the DC/DC front-end converter 
is the cascade boost converter [8].  This topology, depicted in Figure 5, reduces the duty-cycle, which in turn reduces 
the stress on each individual component although the accumulated component stress remains the same [6]. 
 
[Figure 5 schematic: two cascaded boost stages between Vin and VOut. First stage: L1, Q1, D1, C1. Second stage: L2, Q2, D2, C2.]
 
Figure 5 : Cascade Boost converter 
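 
The reduction in duty-cycle can be illustrated with the ideal boost relation applied per stage. The sketch assumes, purely for illustration, that the total step-up ratio is split equally between the two stages and that both stages are lossless; the actual split used in [8] is not stated here.

```python
def cascade_boost_duty_cycles(v_in, v_out, stages=2):
    """Duty-cycle per stage when the total gain is split equally (assumption).

    Each ideal stage provides a gain of 1 / (1 - D), so the per-stage gain is
    the `stages`-th root of the overall ratio.
    """
    gain_per_stage = (v_out / v_in) ** (1.0 / stages)
    return [1.0 - 1.0 / gain_per_stage] * stages


if __name__ == "__main__":
    duties = cascade_boost_duty_cycles(48.0, 380.0)
    print("Per-stage duty-cycles:", ", ".join(f"{d:.3f}" for d in duties))
```

With an equal split at 48 V input, each stage runs at a duty-cycle of roughly 0.64 instead of the single-stage value close to 0.88.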
 
 The final paper will provide detailed calculations for this topology. Also, data from [8] will be included and compared 
to the theoretical temperature and loss estimations. 
 
SEPIC converter 
 
 The converter proposed for this application is an isolated version of the SEPIC converter. Like the other topologies the 
SEPIC converter is able to draw continuous current at the input, which in this case is a clear advantage since the 
requirements for input filtering are reduced. Furthermore, the magnetic components of the SEPIC converter can be 
integrated onto a single magnetic core, which enables a high degree of leakage inductance control that can be used to 
guide away ripple from the input of the converter. 
 
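[Figure 6 schematic: isolated SEPIC converter between Vin and VOut with L1, Q1, C1, the coupled inductor L2 with turns ratio 1:N, D1 and C2.]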
 
Figure 6 : Isolated SEPIC converter
 
 As can be seen from Figure 6 the galvanic isolation is not maintained since this is not a requirement. The coupled 
inductor design makes it possible to design the converter to operate in the vicinity of 50% duty-cycle for the entire input 
voltage range.  Since the performance of the SEPIC converter is at a maximum when operating with a duty-cycle of 50% 
[5] this is a very attractive design option. 
 The selection of the individual components for the SEPIC topology is based on a component stress of approximately 
70%, which is a compromise between losses/temperature and component cost. 
 
Design specifications: Input voltage: 48VDC – 75VDC 
 Output Voltage: 380 VDC 
 Output Power: 300W 
 Turns ratio: N = 6.33 (This ensures 50% duty-cycle at VIN = 60 VDC) 
 
The detailed reliability and loss calculations will be included in the final paper along with a short description of the 
SEPIC pros and cons. 
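 
The relation between the turns ratio and the duty-cycle in the design specifications can be checked with the ideal isolated SEPIC gain, assumed here to be VOut / VIn = N · D / (1 - D) in continuous conduction with no losses. The function name is hypothetical.

```python
def isolated_sepic_duty_cycle(v_in, v_out, turns_ratio):
    """Ideal isolated SEPIC (assumed gain N * D / (1 - D)) solved for D."""
    return v_out / (v_out + turns_ratio * v_in)


if __name__ == "__main__":
    n = 6.33  # turns ratio from the design specifications
    for v_in in (48.0, 60.0, 75.0):
        d = isolated_sepic_duty_cycle(v_in, 380.0, n)
        print(f"V_in = {v_in:4.0f} V -> D = {d:.3f}")
```

Under this assumption the duty-cycle is 0.50 at 60 V input and stays within roughly 0.44 to 0.56 over the 48 V to 75 V range, in line with the statement that the converter operates in the vicinity of 50% duty-cycle.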
 
Comparison 
 
 Based on the above description this section compares the 3 topologies by means of reliability calculations. Each 
converter topology has been evaluated and the associated losses and system temperature have been established. To 
summarize - it was found that due to the large step-up requirements in the application at hand the boost converter would 
be a poor choice. The cascaded boost converter compensates for some of the drawbacks in the boost converter by for 
example reducing the duty-cycle, which in turn reduces the stress on each individual component although the 
accumulated component stress remains unchanged [6]. Considering the buck-boost topological family it is well known 
that this class of converters has the ability to perform high step-up ratios without extreme duty-cycles. Furthermore, in 
cases of very high step-up/step-down ratios the buck-boost topology actually imposes less semiconductor stress than the 
buck and boost topologies. A more detailed investigation of the latter topic can be found in [5]. 
 Another way to obtain the large step-up ratio without extreme duty-cycles is to utilize isolated converters. The 
transformer turns ratio can then be matched to the step-up/step-down needs. Isolating the buck or boost topologies 
increases the semiconductor stress significantly whereas the semiconductor stress in isolated buck-boost topologies 
remains the same [5]. Therefore, when comparing a single-switch forward converter (buck type) with a single-switch fly-
back converter (buck-boost type) one will find that the component stress is very similar [7]. In fact, the fly-back converter 
seems to have an advantage over the forward converter with regards to cost. Furthermore, isolated buck and boost 
converters are very sensitive towards voltage variations at the input whereas the isolated buck-boost converters are 
relatively immune. 
 The probability of a well-functioning system (survivability) as a function of time for the 3 topologies considered in this 
paper can be seen in Figure 7. 
[Figure 7 plot: probability of survival (axis 0.6 to 1.0) versus years of operation (axis 2 to 10). Curves: boost converter, cascaded boost converter and isolated SEPIC converter.]
 
Figure 7 : Converter survivability as a function of years of operation
 
 From Figure 7 it can be seen that the SEPIC converter outperforms both the boost and the cascaded boost converters. 
Considering the individual component failure rates it is quite clear that semiconductors by far are the largest contributor 
to the overall system probability of failure. Therefore it is of great importance to minimize the use of highly stressed 
semiconductors. Comparing the three topologies it is obvious that the cascade boost topology has a fairly high failure rate 
due to doubling of the semiconductors used as compared to the boost and SEPIC topologies. Considering the stress of 
critical components in both the boost and SEPIC topologies reveals that the lowest failure rate is obtained in the SEPIC 
topology, since the stress on the semiconductors is significantly reduced. Also, the temperature difference between the 
boost and SEPIC topologies indicates that the boost converter is operated close to its maximum.  
 Although a certain relationship between efficiency and reliability exists in most systems, being in the form of power 
dissipation and operating temperature, some components actually perform better efficiency-wise at high temperatures 
whereas the corresponding reliability decreases as operating temperatures increase. One such example is the diode used 
in most converters. Operating a diode near the maximum safe recommended operating temperature decreases the forward 
voltage drop and the associated diode losses decrease. However, from a reliability point of view this operating 
temperature decreases the overall system reliability quite dramatically. 
 
Efficiency  
 
 This section shows the preliminary measurements of the converter efficiency. Table 1 shows 3 measurements for 
three different input voltage levels. 
 
Measurement no.   Voltage (V)   Input power (W)   Efficiency (%) 
1                 48            292               91.9 
2                 60            289               92.5 
3                 75            293               92.3 
Table 1: Measurements 
 
As can be seen from Table 1 the efficiency for the converter is approximately 92% over the entire input voltage range. 
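 
The output power and the losses implied by these measurements follow from P_out = η · P_in. The sketch below assumes that the voltage column is in volts and the input power column in watts, which the table does not state explicitly.

```python
from statistics import mean

# (input voltage in V, input power in W, efficiency in %), from Table 1.
MEASUREMENTS = [(48, 292, 91.9), (60, 289, 92.5), (75, 293, 92.3)]


def output_power_and_loss(p_in, efficiency_percent):
    """Return (output power, dissipated power) for a given input power."""
    p_out = p_in * efficiency_percent / 100.0
    return p_out, p_in - p_out


if __name__ == "__main__":
    for v_in, p_in, eff in MEASUREMENTS:
        p_out, loss = output_power_and_loss(p_in, eff)
        print(f"V_in = {v_in} V: P_out = {p_out:.1f} W, loss = {loss:.1f} W")
    print(f"Mean efficiency: {mean(e for _, _, e in MEASUREMENTS):.1f} %")
```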
 
Conclusion 
 
 It has been shown that reliability considerations can be a useful tool in the selection of converter topologies suitable 
for the application at hand. Describing each converter topology in terms of power losses, temperatures and reliability data 
enables the system designer to evaluate all aspects of each converter's performance. 
 
 Due to the page limit - this digest only contains the introductory description of the problem at hand and the most 
essential results. The curves shown in Figure 7 are the theoretical evidence that all calculations have been performed and 
that the isolated SEPIC converter indeed outperforms the boost and cascaded boost converters. The experimental 
measurements of the implemented SEPIC converter show that this particular topology efficiency-wise performs better 
than either of the other two topologies. 
 
 
References 
 
[1] Qun Zhao, Fengfeng Tao and Fred C. Lee: “A Front-end DC/DC Converter for Network Server Applications”. Power Electronics Specialists 
Conference 2001, Vancouver, Canada. 
[2] John Åkerlund: “-48 V DC Computer Equipment Topology – an Emerging Technology”. INTELEC 1997. 
[3] Lars Petersen: “Input-Current-Shaper Based on a Modified SEPIC Converter with Low Voltage Stress”. Power Electronics Specialists Conference 
2001, Vancouver, Canada. 
[4] Military Handbook (MIL-HDBK-217F): Reliability Prediction of Electronic Equipment. 
[5] L. Petersen, M. Andersen: “Two-Stage Power Factor Corrected Power Supplies: The Low Component-Stress Approach”. Applied Power 
Electronics Conference and Exposition 2002, Dallas, USA. 
[6] Bruce Carsten: “Converter component load factors; A performance limitation of various topologies”. PCI 1988, Munich, Germany. 
[7] Bruce Carsten: “On the fundamental performance similarities of flyback and forward converters at high frequencies”. PCI 1987, Long Beach, CA, 
USA. 
[8] Laszlo Huber, Milan M. Jovanovic: “A Design Approach for Server Power Supplies for Networking Applications”. Applied Power Electronics 
Conference and Exposition 2000, New Orleans, USA. 
 
 
 
 
 
 
Report on Power System Reliability 
 
 
 
 
Author 
 
Carsten Nesgaard 
 
October 2003 
 
 
Department of Electrical Engineering and Computer Sciences 
 
UC Berkeley 
 
 
 
Advisors 
 
Program Manager Wei-Bin Zhang 
Professor Seth Sanders 
 
 
 
Introduction: 
 
The power system is one of the most critical parts of almost any system. Due to the severe 
consequences of a guidance system failure in the application at hand, it is of vital importance to 
ensure a stable and continuous output voltage to the subsequent systems. This report identifies the 
critical parts of the power system and classifies techniques for increasing the overall power system 
reliability. 
 
To comply with the general guidelines of the UpTime Institute the input power has to come from at 
least two different and independent sources. This is a rather stringent requirement that limits its 
implementation to systems comprised of multiple power sources. Systems with only a single 
power source cannot meet these guidelines and therefore cannot be classified as fault tolerant. 
However, a system that cannot be classified as fault tolerant from a system point of view can still 
be fault tolerant at a subsystem level. Furthermore, even though true fault tolerance might not be 
obtainable, improvements to the overall system reliability are still possible by means of different 
techniques.  
In this particular application the failure rate of the battery is very low, meaning that the challenge 
in terms of reliability is the voltage conversions to the interconnected subsystems. 
 
This report examines the power system with the intent of optimizing reliability as well as overall 
system performance. The foundation of the analysis is a Functional Failure Modes Effects and 
Criticality Analysis (FFMECA) followed by a description of hardware redundancy 
implementations. The FFMECA is performed at a block level and indicates a prioritized criticality 
evaluation of each block. 
 
Power system configuration: 
 
In order to establish the criticality of the power system the following configuration has been 
established: 
 
[Figure: block diagram. Supply side: Battery 1 and Battery 2, each followed by S/C protection and a front switch. Load side: the blocks listed in the table below, each marked with its voltage, power rating and criticality (Very critical, Critical, Significant, Minor, None).]
  
Figure 1 : Power system criticality analysis
 
The individual blocks are rated according to the following criticality list: 
 
1  Very critical (This block is essential for human safety) 
2  Critical (Loss of this block causes system malfunction) 
3  Significant (Loss of this block causes important system degradation) 
4  Minor (Loss of this block causes only minor system degradation) 
5  None (Loss of this block has no effect on overall system performance but might cause a 
surveillance circuit to lose power) 
 
Based on the above list the following Functional Failure Modes Effects and Criticality Analysis is 
established: 
 
Block                     Functional effect                Voltage level  Criticality  Power rating
Control computer          System malfunction               24V            2            25W
Differential GPS system   Loss of exact location           9-36V          4            2.5W
Driver vehicle interface  Loss of control                  12V            1            6W
Lidar                     Deteriorated avoidance system    12V            3            20W
Magnetometers             Loss of guidance                 9-36V          2            20W
Radar                     Deteriorated avoidance system    12V            3            20W
Safety monitor computer   Loss of control                  24V            1            12W
Service brake controller  System malfunction               -              2            -
Steering actuator         Loss of steering                 12V            2            100W avg. (500W peak)
Vehicle dynamics sensor   Loss of motion detection         9-30V          4            3W
V-R communication         Loss of roadside communication   12V            5            5W
V-V communication         Loss of avoidance communication  12V            3            2W
 
 
Preliminary topology evaluation: 
 
Due to the severity of certain system malfunctions a high degree of overall system reliability is 
required. To further increase the overall reliability, fault resilience is built into the electrical design, 
resulting in a system free of single points of failure. The proposed converter design that complies 
with the latter requirement is shown in Figure 2. 
 
[Figure: block diagram. Power path: front switch or fuse, input filter, buck converter, resonant converter, output filter. Control and protection blocks: PWM 1, PWM 2, OVP and OVP latches.]
 
Figure 2 : Individual converter realization
 
The system is comprised of a buck converter followed by a resonant converter operated at a fixed 
50%-50% duty-cycle, thus serving as a DC-DC transformer. To ensure that no single failure can short 
out the input power bus a front switch is inserted in series with each individual converter. This 
front switch also serves as a current limitation during system startup and/or converter replacement. 
Controlled converter shut-down in case of fault occurrence is ensured by the built-in latch, which 
also prevents the system from operating in a state where one or more converters are trying to restart 
after being shut down (hiccup mode). 
 
 
Reliability considerations: 
 
Considering the individual component failure rates it becomes apparent that the reliability of the 
entire system is determined by the component/components most likely to fail. For this reason this 
section provides the reliability data for the three most unreliable components – the controller IC, 
the MOSFET transistor and the filter capacitors. 
 
                                 
[Images: controller IC, MOSFET transistor, filter capacitor]
 
The probability of component survival for a specified period of time (survivability) of the three 
components is calculated based on the following assumptions: 
 
• The control IC considered is a standard controller for DC/DC converter applications and is 
comprised of approx. 3,500 transistors. 
• The switching device under consideration is a standard low voltage MOSFET transistor 
(64V) for use in switching applications with power ratings up to 100W. 
• Due to the requirements of fairly low output voltage ripple a large amount of capacitance is 
needed at the system output. For this reason the capacitors considered are electrolytic 
capacitors with a voltage rating complying with the standard derating requirements of 75%. 
 
Using the point failure rate estimates found in MIL-HDBK-217 the following three equations can 
be deduced: 
 
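\[ \lambda_{\mathrm{MOSFET}} = 120 \cdot e^{-1925 \cdot \left( \frac{1}{T+273} - \frac{1}{298} \right)} \]

\[ \lambda_{\mathrm{Control\ IC}} = 1000 \cdot \left( 0.02 \cdot e^{-\frac{0.65}{8.617 \cdot 10^{-5}} \cdot \left( \frac{1}{T+273} - \frac{1}{298} \right)} + 0.0013 \right) \]

\[ \lambda_{\mathrm{Capacitor}} = 0.0254 \cdot \left( \left( \frac{s}{0.5} \right)^{3} + 1 \right) \cdot e^{5.09 \cdot \left( \frac{T+273}{378} \right)^{5}} \]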
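
As a minimal sketch, the three failure rate expressions can be evaluated numerically. The code assumes the expressions read λ_MOSFET = 120·exp(-1925·(1/(T+273) - 1/298)), λ_Control IC = 1000·(0.02·exp(-(0.65/8.617·10^-5)·(1/(T+273) - 1/298)) + 0.0013) and λ_Capacitor = 0.0254·((s/0.5)^3 + 1)·exp(5.09·((T+273)/378)^5), with T in °C and results in FITs as in the figures. The stress ratio s = 0.75 is an assumption taken from the 75% derating figure, and the function names are hypothetical.

```python
import math


def fit_mosfet(t_c):
    """MOSFET failure rate versus junction temperature t_c in deg C."""
    return 120.0 * math.exp(-1925.0 * (1.0 / (t_c + 273.0) - 1.0 / 298.0))


def fit_control_ic(t_c):
    """Control IC failure rate versus junction temperature t_c in deg C."""
    ea_over_k = 0.65 / 8.617e-5  # activation energy over Boltzmann constant
    arrhenius = math.exp(-ea_over_k * (1.0 / (t_c + 273.0) - 1.0 / 298.0))
    return 1000.0 * (0.02 * arrhenius + 0.0013)


def fit_capacitor(t_c, s=0.75):
    """Capacitor failure rate; s is the assumed voltage stress ratio."""
    stress = (s / 0.5) ** 3 + 1.0
    return 0.0254 * stress * math.exp(5.09 * ((t_c + 273.0) / 378.0) ** 5)


for t in range(20, 141, 20):
    print(f"{t:3d} C  IC {fit_control_ic(t):9.1f}  "
          f"MOSFET {fit_mosfet(t):6.1f}  capacitor {fit_capacitor(t):6.1f}")
```

With these assumptions the control IC dominates the other two components over most of the temperature range, and the capacitor expression is only one of the three temperature-interval equations mentioned in the text.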
 
A graphical illustration of the temperature dependency of component reliability is shown below:
 
[Plot: failure rate (FITs, axis up to 20000) versus temperature (°C, axis 20 to 140)]
Figure 3 : Control IC failure rate as a function of junction temperature 
 
 
[Plot: failure rate (FITs, axis up to 800) versus temperature (°C, axis 20 to 140)]
Figure 4 : MOSFET transistor failure rate as a function of junction temperature 
 
 
[Plot: failure rate (FITs, axis up to 300) versus temperature (°C, axis 20 to 140)]
Figure 5 : Filter capacitor failure rate as a function of internal temperature
 
It should be noted that the failure rate of the capacitors is calculated based on 3 sets of equations 
applicable in 3 different temperature intervals. The detailed derivation of these equations is 
omitted in this report but can be provided in the form of a ‘Mathematica’ document. 

From Figure 3, Figure 4 and Figure 5 it can be seen that the control IC is by far the most unreliable 
component. Therefore, proper circuit layout is of vital importance for the overall system reliability. 
As a trade-off between thermal considerations and electrical requirements the printed circuit board 
layout shown in Figure 6 (top) is chosen for each converter. Based on circuit simulations the 
resulting operating temperature distribution, shown in Figure 6 (bottom), can be established.  
 
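[Figure: top, PCB layout with inductor, heatsink, transistor, ICs and miscellaneous components; bottom, temperature versus distance along the PCB, with the levels TAmbient, TInductor, TSurface, TIC and TEnd of PCB marked.]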
 
Figure 6 : Converter temperature distribution 
 
The obtainable mean time between failures for a system comprised of commercially available parts 
should be expected within the MTBF range 2·10^5 to 5·10^5 hours. These numbers apply to a single 
converter unit, thus implying that better system performance is achievable by means of reliability 
enhancement techniques.  
An example of a rather expensive reliability improvement technique is the use of screened 
components for the power system implementation. A graphical illustration of the accumulated 
failure rates of a single DC/DC converter can be seen on the next page. The resulting MTBF at the 
desired operating temperature is within 3.3·10^6 to 4.3·10^6 hours, which is considerably better than 
the MTBF values mentioned in the previous paragraph. 
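
The relation between accumulated failure rate and MTBF used here can be sketched as below. One FIT is one failure per 10^9 device-hours, so for constant failure rates the MTBF in hours is 10^9 divided by the summed FITs. The component values in the example list are hypothetical and only illustrate the arithmetic.

```python
def series_fit(component_fits):
    """Accumulated failure rate of parts that must all work (rates add)."""
    return sum(component_fits)


def mtbf_hours(total_fit):
    """MTBF in hours for a constant failure rate given in FITs."""
    return 1e9 / total_fit


def fit_from_mtbf(mtbf):
    """Failure rate in FITs corresponding to an MTBF in hours."""
    return 1e9 / mtbf


# MTBF figures quoted in the text, converted back to accumulated FITs.
for mtbf in (2e5, 5e5, 3.3e6, 4.3e6):
    print(f"MTBF {mtbf:9.0f} h  <->  {fit_from_mtbf(mtbf):7.1f} FIT")

# Hypothetical converter: IC, MOSFET and capacitor bank failure rates in FITs.
example = [1000.0, 600.0, 400.0]
print(mtbf_hours(series_fit(example)))
```

Read this way, the commercial-parts range corresponds to an accumulated failure rate of 2000 to 5000 FITs per converter, and the screened-parts range to a few hundred FITs.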
 
Further reliability enhancement techniques will be described in a subsequent section. 
 
 
 
 
 
 
 
 
Redundancy concept: 
 
In order to improve the reliability of the system, one or more spare converters can be added to form 
a redundant configuration. On the assumption that the proposed converter configurations can be 
implemented within acceptable reliability limits (MTBF of approx. 1·10^6 hours) several techniques 
in terms of redundancy are applicable. If the spare converters are kept on stand-by until a functioning 
converter fails, the system is said to be cold redundant, whereas if all converters are operated at the 
same time the system is said to be hot redundant. 
In case one converter fails, the hot redundant system automatically resumes operation without 
noticeable system effect, while the cold redundant system requires a series of system procedures to 
be performed prior to returning to normal system operation. The latter case causes system down-time, 
but has the advantage of adding an unused converter to the system, meaning that aging and 
reliability issues in most cases can be disregarded for the cold spare. 
If two or more ordinary power converters are connected to the same load, small differences in the 
output voltages will cause unequal current sharing among the converters. Typically, one or more of 
the converters will operate in current limitation and some may not supply any current at all. This 
system does not perform dynamically as expected, since the current loop gain decreases. The result 
is deteriorated load step response. Furthermore, the converters operating in current limitation do 
not conform to the component derating requirements unless each converter is over-designed. 
 
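[Figure: three parallel-connected converters, Converter 1 (T1), Converter 2 (T2) and Converter 3 (T3), with input current Iin, converter output currents I1, I2 and I3, and common output current IOUT.]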
 
Figure 7 : Redundant power system 
 
The lack of system performance in parallel-configurations calls for active load sharing. The load 
sharing among the proposed converters is based on current mode control. The output current is 
measured and compared to an internal reference in the current controller. The output from the 
current controller alters the duty-cycle generated by the PWM controller so that each converter’s 
current contribution can be modified. This current modifying capability of each converter is 
combined with a current sharing bus through which the converters communicate. In turn, this 
results in equal current sharing among the parallel-connected converters.  
Traditionally the current sharing bus is implemented by means of a single wire to which each 
converter is connected. This configuration is simple and very effective as long as the wire stays 
intact. If for some reason the current sharing bus is damaged, some converters might still share part 
of the output current while others might not supply any current at all. From a dynamic point of 
view this causes the system to become slow reacting and possibly unstable due to the lack of load 
sharing feedback. To eliminate the failure mode of single wire open failure each converter is diode 
connected to a ring configuration of the current sharing bus. 
 
To protect against output voltage loss in case of a single point failure (shorted output) each 
converter has to be connected to the common output voltage bus via an OR’ing device or through a 
fuse. The latter case is a single-fault protection that impedes future reconnecting attempts. A more 
detailed description of the pros and cons of fuse-connecting the individual converters to the output 
bus is provided in a subsequent section. 
 
When using parallel-connected converters for power system reliability enhancement the 
redundancy configuration must be chosen. The most common approach is an N+1 redundant 
configuration where one extra converter is added to the system. This approach enables the system 
to tolerate one fault while still providing the required output power. The most straightforward 
implementation of such an N+1 redundant system is the design of two identical converters each 
capable of supplying the maximum load current. However, this approach results in a 100% power 
‘overshoot’, meaning that the available system power is twice that required by the specifications. 
Increasing the number of converter units reduces this power ‘overshoot’. For an N+1 redundant 
power system Figure 8 shows the percent-wise decrease in power ‘overshoot’ as the number of 
converter units increases. The other curve is an index that takes into account the decrease in 
converter unit cost price, the increase in circuit complexity and the increase in load sharing 
circuitry costs, all as a function of the number of converter units. The index is based on component 
cost (per 1000 pieces) and standard load sharing implementation circuitry. It should be noted that 
the index curve in many situations will change as a function of the number of units when large 
scale manufacturing is employed and/or different load sharing techniques are used. 
[Plot: power ‘overshoot’ reduction in % (axis 0 to 60) versus number of units in N+1 system (axis 0 to 12)]
Figure 8 : Percent-wise decrease in power ‘overshoot’ 
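
The power ‘overshoot’ figures can be reproduced with a short sketch. It assumes that x counts all converter units including the spare, so that each unit must be rated for the full load divided by (x - 1), and that the plotted reduction is the relative decrease in the required per-unit rating when going from x to x + 1 units. Both readings, and the function names, are assumptions.

```python
def unit_rating(x, p_load=1.0):
    """Rating each unit needs when x units form an N+1 system (N = x - 1)."""
    if x < 2:
        raise ValueError("an N+1 system needs at least 2 units")
    return p_load / (x - 1)


def overshoot_percent(x):
    """Installed power in excess of the load, in percent of the load."""
    installed = x * unit_rating(x)
    return 100.0 * (installed - 1.0)


def unit_rating_reduction_percent(x):
    """Relative drop in per-unit rating when going from x to x + 1 units."""
    return 100.0 * (unit_rating(x) - unit_rating(x + 1)) / unit_rating(x)


for x in range(2, 9):
    print(f"{x} units: overshoot {overshoot_percent(x):6.1f} %, "
          f"rating reduction {unit_rating_reduction_percent(x):5.1f} %")
```

Two units give the 100% overshoot stated in the text, and the benefit of each additional unit shrinks as 1/x, which is why the cost index eventually outweighs it.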
 
[Figure 8 legend: the index curve is formed from the complexity index, the load sharing (LS) circuitry index and the price index, together with the unit counts x and (x-1) and a constant 0.75; the power ‘overshoot’ reduction curve is given by the expression below, where x is the number of converter units.]

\[ \frac{P_{\mathrm{Max\ per\ unit}}(x) - P_{\mathrm{Max\ per\ unit}}(x+1)}{P_{\mathrm{Max\ per\ unit}}(x)} \cdot 100 \]

From Figure 8 it can be seen that the two curves intersect somewhere between 3 and 4 converter 
units. This point is the optimum in the configuration at hand. However, as indicated above this 
optimum point is most likely to shift to either side along the horizontal axis when other power 
system implementations are considered.  
From a reliability point of view the number of converter units should be kept to a minimum. As an 
example, an N+1 redundant system comprised of 4 converter units is 40% more likely to fail at any 
given time than an N+1 redundant power system comprised of 3 converter units. The same tendency 
holds when transitioning from a 3 converter system to a 2 converter system. However, due to the 
percent-wise larger increase in component count in the latter case the probability of system failure 
is 65% higher in an N+1 redundant 3 converter system than in an N+1 redundant 2 converter 
system. From these calculations it can be seen that as the number of converter units increases, a 
smaller and smaller gain in reliability is achieved when substituting an X unit system with an X-1 
unit system. 
 
In order to calculate the reliability improvement obtainable with different numbers of converters in a 
given parallel-configuration the following set of equations is established: 
 
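\[ P_{2-1} = e^{-\lambda_1 T} + e^{-\lambda_2 T} - 1 \cdot e^{-(\lambda_1 + \lambda_2) T} \]
\[ P_{3-2} = e^{-(\lambda_1 + \lambda_2) T} + e^{-(\lambda_1 + \lambda_3) T} + e^{-(\lambda_2 + \lambda_3) T} - 2 \cdot e^{-(\lambda_1 + \lambda_2 + \lambda_3) T} \]
\[ P_{4-3} = e^{-(\lambda_1 + \lambda_2 + \lambda_3) T} + e^{-(\lambda_1 + \lambda_2 + \lambda_4) T} + e^{-(\lambda_1 + \lambda_3 + \lambda_4) T} + e^{-(\lambda_2 + \lambda_3 + \lambda_4) T} - 3 \cdot e^{-(\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4) T} \]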
 
The equations are derived using the exponential distribution with a constant hazard rate for all 
components combined with the binomial coefficients for successful system operation. It should be 
noted that the equations allow for reliability calculations of converters with different accumulated 
failure rates. In the special case of equal failure rates the 3 equations can be further simplified: 
 
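\[ P_{2-1} = 2 \cdot e^{-\lambda T} - 1 \cdot e^{-2 \lambda T} \]
\[ P_{3-2} = 3 \cdot e^{-2 \lambda T} - 2 \cdot e^{-3 \lambda T} \]
\[ P_{4-3} = 4 \cdot e^{-3 \lambda T} - 3 \cdot e^{-4 \lambda T} \]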
 
Plotting the equations as a function of time provides a visual assessment of the different 
configurations: 
 
[Plot: probability of system survival (axis 0.2 to 1.0) versus time (hours, axis 50000 to 150000); curves: 2 converters - 1 working, 3 converters - 2 working, 4 converters - 3 working]
 
Figure 9 : Probability of system survival as a function of time 
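
A minimal numerical sketch of the survival probabilities follows. It assumes exponentially distributed lifetimes with constant failure rates and a system that survives as long as at most one converter has failed. The failure rate of 1·10^-6 per hour is an assumption matching an MTBF of about 1·10^6 hours, and the function names are hypothetical.

```python
import math


def survival_one_spare(rates, t):
    """Probability that at least n-1 of n converters survive to time t."""
    r = [math.exp(-lam * t) for lam in rates]
    all_working = math.prod(r)
    exactly_one_failed = sum(
        (1.0 - r[i]) * math.prod(r[:i] + r[i + 1:]) for i in range(len(r))
    )
    return all_working + exactly_one_failed


def survival_equal_rates(n, lam, t):
    """Closed form for n identical converters: n*R^(n-1) - (n-1)*R^n."""
    r = math.exp(-lam * t)
    return n * r ** (n - 1) - (n - 1) * r ** n


LAM = 1e-6  # assumed failure rate per hour
for hours in (2000, 8000, 50000, 150000):
    row = [survival_one_spare([LAM] * n, hours) for n in (2, 3, 4)]
    print(hours, [f"{p:.5f}" for p in row])
```

For equal failure rates the enumeration and the closed form agree, and the 2 converter system gives the highest survival probability at every time point, in line with the ordering described for Figure 9.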
 
As expected Figure 9 shows that the system comprised of 2 converters provides the best overall 
reliability whereas the system comprised of 4 converters performs worst reliability-wise. In Figure 
9 a red circle indicates the normal range of system life. A more detailed view of this section can be 
seen in Figure 10. 
 
[Plot: probability of system survival (axis 0.994 to 0.999) versus time (hours, axis 2000 to 8000); curves: 2 converters - 1 working, 3 converters - 2 working, 4 converters - 3 working]
 
Figure 10 : Enhanced view of circled time interval shown in Figure 9 
 
The curves shown above the probability functions for the 3 different configurations are the system 
probability of survival for the special case where all converters have the same failure rate. It can be 
seen that equalizing the failure rates results in improved system reliability. Ensuring equal failure 
rate can be accomplished by means of thermally distributing each converter’s current contribution. 
Due to IEEE regulations a detailed description of this latter technique cannot be provided in this 
report since papers have been submitted to both the Applied Power Electronics Conference and 
Exposition 2004 and the Power Electronics Specialists Conference 2004. However, the real-world 
implementation of the proposed power system will incorporate this new load sharing technique. 
 
Fuse protection: 
 
If a redundant system supplies a common load, it is important to ensure that none of the converters 
fail in a manner that shorts the output power bus, since this will disable the entire power system. In 
other words it is important that each converter is single failure tolerant towards short circuiting the 
power supply outputs. One way of ensuring this is by inserting fuses in series with each converter 
output. Unfortunately, a fuse can sustain several times its nominal current rating for prolonged 
periods of time. Therefore a large current is needed to blow the fuse in a timely manner. The 
current needed to blow a traditional fuse within 1 ms is on the order of 4 times the nominal 
current. This sets a lower limit on the number of parallel-connected converters since the remaining 
converters have to supply the large current needed to blow the fuse of the faulty converter. At the 
same time the remaining converters must maintain the proper current level at the load. Further 
details of pre-arcing time vs. multiples of nominal current can be found in fuse 
manufacturers' datasheets (for instance SCHURTER). 
However, from a reliability point of view the use of fuses has an overall system impact that results 
in lower converter failure rate. Whether to use fuse protection or some means of actively limiting 
the current-flow to one direction should be based on system assessments for each particular 
application. In this application a mix of fuses and active semiconductors will be used. The fuses 
will be used as buffers at each converter’s input while each converter’s output is actively OR’ed to 
the common output voltage bus. 
 
Parts level redundancy: 
 
Having described the redundancy concept at a system level it is worth noting that similar 
approaches apply at the parts level. The common approach in assessing system reliability is based 
on a probability of component failure. This approach assumes two failure modes – a working part 
and a failed part. However, most components have several failure modes each with their own 
probability of occurrence.  
 
As an example the free-wheeling diode of the buck converter is considered. By means of traditional 
reliability evaluation of electronic parts the following two states can be determined: 
 
• Part working 
• Part failed 
 
The probability of the buck converter free-wheeling diode failing within a year is 0.08%. 
Considering the multiple failure modes of the free-wheeling diode the following states can be 
determined: 
 
• Part working 
• Part failed - short circuit 
• Part failed - open circuit 
 
In order to take corrective actions towards the part failures it is necessary to know the distribution 
of short circuit failures and open circuit failures. These data can be found in numerous component 
standards. The data used in this report is as follows: 
 
• Part failed - short circuit 35% 
• Part failed - open circuit 65% 

The percentages indicate that 35% of all diode failures are short circuit failures, while the 
remaining 65% are open circuit failures.  
Continuous buck converter operation requires a path for the inductor current during the off-period, 
meaning that an open circuit is not a valid state. At the same time this current path should be 
blocked during the on-period, meaning that a short circuit is not a valid state either. Optimization of 
this current path using the above information can be accomplished by parallel-connecting two 
diodes as shown in Figure 11. 
[Figure: two diodes, A and B, connected in parallel]
 
Figure 11 : Parallel-connection of two diodes increases the reliability
 
For the probability assessment it should be noted that the two failure modes are mutually exclusive, 
meaning that once a diode has failed open circuit it cannot fail short circuit. This information is 
important when deducing the minimal cut set used to assess the probability of diode survival in the 
parallel-configuration. The following equation forms the minimal cut set: 
 
\[ R_{\mathrm{Parallel}} = P_{\mathrm{Normal\ operation}}^{2} + 2 \cdot P_{\mathrm{Normal\ operation}} \cdot P_{\mathrm{Open\ circuit}} \]
 
Inserting values for the two cases (single diode and parallel-connected diodes) results in the 
following probabilities of component survival within 1 year: 
 
 PSingle = 0.999219 
 PParallel = 0.999453 
 
The value for the parallel-connected diodes might not seem that different from the result of the 
single diode, but in terms of unavailability (1-Probability) it amounts to almost a 30% reduction in 
overall probability of component failure. Similar approaches can be taken in parallel-connecting 
other parts throughout the system. However, it should be noted that this technique only applies to 
multiple failure mode components having different probabilities of failing in one state or the other. 
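
The diode figures can be checked with a few lines of arithmetic. The sketch takes the single-diode survival probability and the 35%/65% failure mode split from the text and assumes that the parallel pair works when both diodes work, or when one works and the other has failed open, while a short circuit in either diode fails the pair. The variable names are hypothetical.

```python
P_SINGLE = 0.999219      # one-year survival probability of a single diode
SHORT_FRACTION = 0.35    # share of failures that are short circuits
OPEN_FRACTION = 0.65     # share of failures that are open circuits

p_fail = 1.0 - P_SINGLE
p_short = SHORT_FRACTION * p_fail
p_open = OPEN_FRACTION * p_fail

# Both diodes working, or one working while the other has failed open.
p_parallel = P_SINGLE ** 2 + 2.0 * P_SINGLE * p_open

# Relative reduction in the probability of failure.
reduction = 1.0 - (1.0 - p_parallel) / p_fail

print(f"P_parallel = {p_parallel:.6f}")
print(f"failure probability reduced by {100.0 * reduction:.1f} %")
```

The parallel pair only helps because open circuit failures are the more likely mode here; if short circuits dominated, the second diode would add more failure paths than it removes.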
 
 
Actual system: 
 
An image of the real-world implementation of the power system for the control computers, 
magnetometers, Doppler radar and lidar can be seen in Figure 12. 
 
 
Figure 12 : Real-world power system 
 
The realization of the power system and the design of components are described in the document 
‘Precision Docking Project Power System’ presented at the last project meeting. Therefore this 
section only provides the results and the simulations for the overvoltage protection circuitry that 
was designed to prevent unacceptable failure modes from propagating through the system. A 
schematic of the thermal droop load sharing circuitry added to each converter is illustrated in 
Figure 13. 
 
Overvoltage protection 
 
To prevent overvoltage faults in the power system the following overvoltage protection circuit has 
been designed: 
 
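[Schematic: thermal droop load sharing network with thermistor RT (5.0kΩ at 25 °C, β = 3950) and resistors RS, RF1 and RF2 connected to VOutput; over voltage protection with transistors Q1 (2N3904) and Q2 (2N3906), resistors R1 to R4, capacitors C1 (1nF), C2 (100pF) and C3 (10pF) and a 74F125 buffer; test circuit that shorts out individual resistors at time = t1. Resistor values as drawn: 2.2kΩ, 2.2kΩ, 5.6kΩ, 13kΩ, 3.9kΩ, 4.3kΩ and 2.2kΩ. Labelled nodes: output, feedback, buffer, trig, switch and on_off.]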
 
Figure 13 : Droop load sharing with overvoltage protection circuit 
 
A basic FMECA for the thermal droop load sharing network components is performed. The result 
can be seen in the table below: 
 
Part  Failure mode   Failure effect          Criticality
RT    Short circuit  Over voltage situation  2
      Open circuit   None                    4
RS    Short circuit  None                    4
      Open circuit   None                    4
RF1   Short circuit  Over voltage situation  2
      Open circuit   None                    4
RF2   Short circuit  None                    4
      Open circuit   Over voltage situation  2
 
The 3 failure modes causing an over voltage situation are examined by means of the test circuit 
shown in Figure 13. At a predetermined moment the test circuit closes the switch, thus causing a 
short circuit of its terminals. To make sure the over voltage circuit does not trigger prematurely the 
test circuit incorporates a time delay of 20µs. 
 
Simulation results 
 
Figure 14 shows the normal operating mode of the converter. The curves show the node voltages at 
50% of full load, which is equivalent to 7.5A. The buffer (buffer) and trigger (trig) voltages are 
zero, although the curves show some noise in the nanovolt and picovolt range. The output voltage 
(output) is 5V with a sinusoidal ripple voltage of ±100mV that represents both the natural converter 
voltage ripple as well as random noise. The fact that the feedback voltage is a scaled replica of the 
output voltage is used to trigger the over voltage protection in case the feedback voltage exceeds 
2.7V. It should be noted that the non-linear characteristics of the thermistor must be taken into 
account in order for the feedback voltage to be a true scaled replica of the output voltage. 
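
The non-linearity of the thermistor can be illustrated with the common β model for NTC thermistors, R(T) = R25·exp(β·(1/T - 1/298.15)) with T in kelvin. The parameter values are those given for RT in Figure 13 (5.0kΩ at 25 °C, β = 3950); the use of the β model itself is an assumption, since the report does not state which thermistor model was used.

```python
import math

R25_OHM = 5000.0   # thermistor resistance at 25 deg C (from Figure 13)
BETA_K = 3950.0    # beta constant in kelvin (from Figure 13)


def thermistor_resistance(t_c):
    """NTC resistance in ohms at temperature t_c (deg C), beta model."""
    t_k = t_c + 273.15
    return R25_OHM * math.exp(BETA_K * (1.0 / t_k - 1.0 / 298.15))


for t in (0, 25, 50, 75, 100):
    print(f"{t:3d} C: {thermistor_resistance(t):8.0f} ohm")
```

Because the resistance falls roughly exponentially with temperature, a fixed divider ratio cannot be assumed when relating the feedback voltage to the output voltage.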
 
[Waveform plot; traces: feedback voltage at 50% load, output voltage with noise, ON/OFF voltage for converter shut-down, trigger voltage]
 
Figure 14 : Waveforms during normal operation  
 
Close examination of the 3 failure modes leading to over voltages reveals that the exact same 
timing behavior occurs in all situations. For this reason, only one set of waveforms is provided. 
The 3 failure modes examined are: 
 
• RT short circuit 
• RF1 short circuit 
• RF2 open circuit 
 
Figure 15 shows the node voltages from which the protection circuit response can be observed. 
 
[Waveform plot; traces and annotations: output voltage with noise, switch activation voltage, buffer voltage, feedback over voltage, feedback returns to normal, retriggering of overvoltage]
 
Figure 15 : Waveforms during abnormal operation 
 
At the instant the test circuit closes the switch and causes an over voltage situation the buffer 
voltage (buffer) generates the trigger signal (trig) that activates the ON/OFF latch (on_off). Figure 
16 shows a close up view of the on_off voltage during abnormal system operation. 
 
[Waveform plot; annotations: triggering of over voltage protection latch, voltage spike, immune to retriggering attempts]
 
Figure 16 : Enhanced view of the on_off voltage during abnormal operation 
 
The reaction time from over-voltage detection to converter shut-down is 663ns. This reaction time 
can be shortened at the cost of a larger voltage spike. According to the manufacturer’s datasheet 
the voltage at the ON/OFF pin should be limited to 3V. Currently a 533mV voltage spike results 
from the circuit configuration; it can be reduced by adding more capacitance to C1 and C2. 
Larger capacitors result in longer charge times, which in turn prolong the reaction time of the 
overvoltage protection. Very fast-reacting protection circuits and low voltage spikes at the 
converter TRIM input during circuit triggering are contradictory requirements and a trade-off must 
be made. In this case a relatively fast-reacting protection circuit is essential for system survival; 
therefore the resulting voltage spike has to be accepted. 
 
Once the over voltage protection has detected an over-voltage from the converter it would be 
desirable if the converter never attempted to restart and possibly cause another over-voltage 
situation. The overvoltage protection latch ensures that retriggering attempts are ignored and the 
converter stays off-line. From the switch voltage (switch) shown in Figure 15 it can be seen that a 
retriggering is attempted 300µs after the first over voltage situation. Furthermore, although the 
feedback voltage (feedback) returns to normal 50µs after triggering the overvoltage protection, the 
ON/OFF latch remains in the low state, thus keeping the converter in the off state. 
 
 
Summary: 
 
The power system can be implemented using a wide variety of techniques, each resulting in 
optimization of reliability, mass, cost or circuit complexity. Since there are no requirements on 
mass and circuit complexity, these parameters are secondary concerns, and it is 
recommended that the approach taken be based on a trade-off between the needed system 
reliability and power system cost.  
Given the severe consequences of, for instance, control computer malfunction, it is recommended that 
the power system be implemented using the redundancy technique and associated thermal load 
sharing. Since large scale implementation is desirable, power system realization by means of 
off-the-shelf converters has been examined. The conclusion was that the best approach for the 
application at hand is a power system comprised of 3 parallel-connected converters sharing the 
common load by utilizing the new thermal droop load sharing technique proposed in the document 
‘Precision Docking Project Power System’. 

   
 
 
 
ISBN 87-91184-29-0 
 
 
