# IMAGE COMPRESSION AND ENERGY HARVESTING FOR ENERGY CONSTRAINED SENSORS

A Dissertation in Electrical and Computer Engineering and Telecommunications and Computer Networking

Presented to the Faculty of the University of Missouri–Kansas City in partial fulfillment of the requirements for the degree

DOCTOR OF PHILOSOPHY

by HSUAN-TSUNG WANG

B.S., Vanung University, Taiwan, 2002

M.S., University of Missouri–Kansas City, USA, 2007

Kansas City, Missouri, 2013

© 2013 HSUAN-TSUNG WANG. ALL RIGHTS RESERVED.

Hsuan-Tsung Wang, Candidate for the Doctor of Philosophy Degree, University of Missouri–Kansas City, 2013

#### **ABSTRACT**

Advances in complementary metal-oxide-semiconductor (CMOS) technology have led to the integration of all components of an electronic system into a single integrated circuit. Ultra-low-power circuit techniques have reduced the power consumption of circuits. Moreover, solar cells with improved efficiency can be integrated on chip to harvest energy from sunlight. As a result, a new class of miniaturized electronic systems known as self-powered systems on a chip has emerged. There is increasing research interest in self-powered devices, which provide cost-effective solutions, especially when they are used in areas where charging or replacing batteries is too costly. Therefore, image compression and energy harvesting are studied in this dissertation. The integration of energy harvesting, image compression, and an image sensor on the same chip provides an energy source to charge a battery, reduces the data rate, and improves the performance of wireless image sensors. Integrated circuits for image compression, solar energy harvesting, and image sensing are studied, designed, and analyzed in this work. In this dissertation, a hybrid image sensor that can perform the tasks of sensing and energy harvesting is presented.
The photodiodes of the hybrid image sensor can be programmed as image sensors or as energy-harvesting cells. The hybrid image sensor can harvest energy between frames, in sleep mode, and even while it is taking images. When images must be sensed and energy harvested at the same time, some pixels work as sensing pixels and the others work as solar cells. Since some pixels are devoted to harvesting energy, the resolution of the image is reduced. To preserve the resolution, or to maintain a reasonable resolution when a large amount of energy must be collected, image reconstruction algorithms and compressive sensing theory provide ways to achieve good image quality. On the other hand, when the battery has enough charge, image compression comes into the picture. Multiresolution decomposition image compression provides a way to compress image data in order to reduce the energy needed for data transmission. The solution provided in this dissertation not only harvests energy but also saves energy, resulting in long-lasting wireless sensors.

The problem was first studied at the system level to identify the best system-level configuration, which was then implemented on silicon. As a proof of concept, a $32 \times 32$ array of hybrid image sensors, a $32 \times 32$ image sensor array with multiresolution decomposition compression, and a compressive sensing converter have been designed and fabricated in a standard $0.5~\mu m$ CMOS process. Printed circuit boards have also been designed to test and verify the fabricated chips. VHDL and Matlab code was written to generate the control signals and to read out data from the chips. Image processing and recovery were carried out in Matlab. DC-DC converters were designed to boost the inherently low output voltage of the photodiodes. The DC-DC converter has also been improved to increase the efficiency of the power conversion.
#### APPROVAL PAGE

The faculty listed below, appointed by the Dean of the School of Graduate Studies, have examined a dissertation titled "Image Compression and Energy Harvesting for Energy Constrained Sensors," presented by Hsuan-Tsung Wang, candidate for the Doctor of Philosophy degree, and hereby certify that in their opinion it is worthy of acceptance.

#### **Supervisory Committee**

Walter D. Leon-Salas, Ph.D., Committee Chair, Department of Computer Science & Electrical Engineering

Ghulam Chaudhry, Ph.D., Department of Computer Science & Electrical Engineering

Deep Medhi, Ph.D., Department of Computer Science & Electrical Engineering

Cory Beard, Ph.D., Department of Computer Science & Electrical Engineering

Reza Derakhshani, Ph.D., Department of Computer Science & Electrical Engineering

## CONTENTS

- ABSTRACT
- ILLUSTRATIONS
- TABLES
- ACKNOWLEDGEMENTS

Chapter

- 1 INTRODUCTION
  - 1.1 A Hybrid CMOS Imager with Sensing and Energy Harvesting Capabilities
  - 1.2 Image Compression
  - 1.3 Summary of Contributions
- 2 HYBRID IMAGER SYSTEM AND CIRCUIT DESIGN
  - 2.1 Hybrid Pixel
  - 2.2 Hybrid Pixel Design
  - 2.3 DC-DC Conversion
  - 2.4 Image Reconstruction with Sub-sampling and Compressive Sensing
- 3 HYBRID IMAGER ENERGY HARVESTING AND IMAGE ACQUISITION RESULTS AND DISCUSSION
  - 3.1 Energy Harvesting
  - 3.2 Test results for charge pump
  - 3.3 Test results for DC-DC boost converter
  - 3.4 Image Acquisition of Hybrid Imager
- 4 DETAILED DESCRIPTION AND A MATHEMATICAL ANALYSIS FOR A CIRCUIT OF ENERGY HARVESTING USING ON-CHIP SOLAR CELLS
  - 4.1 On-Chip Solar Cells
  - 4.2 Energy Harvesting Circuits
  - 4.3 Circuit Analysis
  - 4.4 Model-Based Design Optimization
  - 4.5 Hardware Measurements
- 5 MULTIRESOLUTION DECOMPOSITION FOR LOSSLESS AND NEAR-LOSSLESS COMPRESSION
  - 5.1 One Level of Decomposition
  - 5.2 Mixed-Signal Implementation Analysis
  - 5.3 Three Levels of Decomposition
  - 5.4 Multiresolution Decomposition Imager Architecture
  - 5.5 Multiresolution Decomposition Numerical Simulation Results
  - 5.6 Chip Measurement Results
- 6 AN INCREMENTAL $\Sigma\Delta$ CONVERTER FOR COMPRESSIVE SENSING
  - 6.1 Incremental $\Sigma\Delta$ Modulation Architecture for Compressive Sensing
  - 6.2 Quantization Noise and Waveform Reconstruction
  - 6.3 The Incremental $\Sigma\Delta$ Converter Circuit Implementation
  - 6.4 Results and Measurements
  - 6.5 Comparison with Nyquist Rate Converters
  - 6.6 Related Work on AICs
- 7 DETAILED DESCRIPTION OF A SIGMA-DELTA RANDOM DEMODULATOR CONVERTER ARCHITECTURE FOR COMPRESSIVE SENSING APPLICATIONS
  - 7.1 Theory
  - 7.2 Random Demodulator Converter Architecture
  - 7.3 Circuit Design
  - 7.4 Measurement Results
- 8 CONCLUSION

Appendix

- A Chip Pin-out
  - A.1 The Hybrid Imager Chip
  - A.2 The Multiresolution Decomposition Imager Chip
  - A.3 The $\Sigma\Delta$ AIC Chip
- B Schematics
- C Pictures of Custom PCB

- REFERENCE LIST
- VITA

### **ILLUSTRATIONS**

| Figure | Caption |
|--------|---------|
| 1 | Architecture of the hybrid imager with sensing and energy harvesting capabilities |
| 2 | Reconfigurable pixel concept: (a) photodiode working as a photo sensor (anode tied to ground); (b) photodiode working as a solar cell (anode tied to the load) |
| 3 | Cross-section diagram of photodiodes available in standard CMOS. Diode $D_1$ can be employed to harvest energy. Diode $D_{par}$ is a parasitic diode |
| 4 | I-V curve of photodiode $D_1$ when its N-well is left floating and when it is tied to ground |
| 5 | Schematic diagram of a single hybrid pixel |
| 6 | Energy extraction circuit with flying capacitor to maximize the amount of energy harvested from the in-pixel photodiodes |
| 7 | Charge pump with flying capacitor to double the output voltage of the photodiodes |
| 8 | DC-DC boost converter with flying inductor to boost up the output voltage of the photodiodes |
| 9 | Improved inductor-based energy harvesting circuit |
| 10 | Pixel configuration patterns for image reconstruction as proposed in [45] |
| 11 | $32 \times 32$ array of masks for different percentages of 1s: (a) 10%; (b) 50%; (c) 90%. Black areas represent energy-harvesting pixels, and white areas represent photo sensing pixels |
| 12 | The neighborhoods of missing pixels: (a) $R = 1$; (b) $R = 2$ |
| 13 | Illustrations of reconstruction of two missing pixels a and b. Gray squares represent available pixels, which are photo sensing pixels in the imager, and white squares represent missing pixels, which are energy harvesting pixels in the imager |
| 14 | Layout of the hybrid pixel imager |
| 15 | Diagram of the test setup. The light source intensity is controlled with a power supply. The I-V curve of the photodiode is measured with a source meter. The output voltage is measured with an oscilloscope |
| 16 | The I-V curve of the hybrid imager when all of the hybrid pixels were programmed for energy harvesting |
| 17 | I-V curves of the hybrid imager with direct incident light using different percentages of 1s of the random patterns at an illuminance of $E_v = 40$ klux |
| 18 | Measurement results of the charge pump with flying input capacitor: (a) charge pump start up; (b) ripple voltage of the charge pump |
| 19 | A circuit setup to charge a lithium-ion battery by a DC-DC boost converter with hybrid pixels |
| 20 | Battery voltage as the battery is charged using the DC-DC boost converter and the hybrid pixels |
| 21 | Acquired image from hybrid-pixel imager: (a) image of letter U; (b) image of letter M; (c) image of letter K; (d) image of letter C |
| 22 | Hybrid imager image reconstructions with sub-sampling of fixed radius approach, $R = 4$. Percentages of 1s of random masks are, from left, 10% to 90% in increments of 10% |
| 23 | Hybrid imager image reconstructions with sub-sampling of weighted average and adaptive $R$. Percentages of 1s of random masks are, from left, 10% to 90% in increments of 10% |
| 24 | Hybrid imager image reconstructions with compressive sensing. Percentages of 1s of random masks are, from left, 10% to 90% in increments of 10% |
| 25 | Cross-section diagram of photodiodes available in standard CMOS and their equivalent circuit diagram |
| 26 | Typical connection between an input voltage source, a DC-DC converter and the load. The source and the load have a common ground: (a) ideal voltage source; (b) photodiode $D_1$ as the voltage source |
| 27 | Cross-section diagram of on-chip photodiodes and their connection to a DC-DC converter |
| 28 | Measured I-V curves of a P-diff/N-well photodiode ($D_1$) under direct sun illumination conditions when the N-well is left floating and when it is tied to ground |
| 29 | Proposed capacitor-based energy harvesting circuit |
| 30 | Proposed inductor-based energy harvesting circuit |
| 31 | Improved two-transistor inductor-based energy harvesting circuit using on-chip photodiodes |
| 32 | Photodiode equivalent circuit employed to model the photodiode's electrical behavior |
| 33 | Modeled and measured photodiode I-V curves |
| 34 | Photodiode output power as a function of the photodiode voltage $V_{D1}$ for different values of the photo-generated current. The estimated location of the MPP is marked with dots |
| 35 | Equivalent energy harvesting circuit when $\phi_1 = 1$ |
| 36 | Equivalent energy harvesting circuit when $\phi_1 = 0$ |
| 37 | Inductor current waveform in CCM |
| 38 | Inductor current waveform in DCM |
| 39 | Inductor current waveform |
| 40 | Inductor current $I_i(t)$ waveforms calculated using the derived expression in (4.27) and using the Spectre circuit simulator for three different values of $R$ |
| 41 | Conversion efficiency of the proposed energy-harvesting circuit operating in DCM as a function of duty cycle ($d$) and clock period ($T$) |
| 42 | Conversion efficiency of the energy-harvesting circuit operating in DCM as a function of duty cycle ($d$) for different photogenerated current levels $I_{ph}$ |
| 43 | Output voltage of the energy-harvesting circuit operating in DCM as a function of duty cycle ($d$) for different photogenerated current levels $I_{ph}$ |
| 44 | Average photodiode voltage ($V_{dm}$) as a function of duty cycle ($d$) for different photogenerated current levels $I_{ph}$ |
| 45 | Output voltage of the energy-harvesting circuit operating in DCM as a function of duty cycle ($d$) for different load resistances $R_L$ |
| 46 | Photograph of fabricated CMOS integrated circuit. The integrated circuit contains a P-diff/N-well photodiode, a capacitor bank and NMOS transistor switches. The switches are covered by a metal layer to protect them from light |
| 47 | Diagram of the test setup. The light source intensity is controlled with a power supply. The I-V curve of the photodiode is measured with a source meter. The output voltage is measured with an oscilloscope |
| 48 | Output voltage of the energy-harvesting circuit as predicted by theoretical model (dotted lines) and from hardware measurements (circles) |
| 49 | Voltage output of energy-harvesting circuit captured by an oscilloscope ($R_r = 410~\text{k}\Omega$) |
| 50 | Image compression approach based on multiresolution decomposition |
| 51 | Simplified model of a single multiresolution decomposition stage |
| 52 | Model incorporating noise sources for (a) the mixed-signal implementation; (b) fully-digital implementation |
| 53 | Image compression approach based on multiresolution decomposition |
| 54 | Architecture diagram of the imager |
| 55 | Schematic diagram of the CDS unit |
| 56 | Schematic diagram of the switched-capacitor circuit for multiresolution decomposition |
| 57 | Timing diagram of the multiresolution decomposition circuit |
| 58 | Lena image and its three levels of decomposition |
| 59 | Histograms of the: (a) original image; (b) first level residual; (c) second level residual; (d) third level residual |
| 60 | Layout of the multiresolution decomposition imager |
| 61 | A custom PCB for testing of multiresolution decomposition imager |
| 62 | Acquired images: (a) full-resolution *eye* image ($H_0$); (b) reduced-resolution *eye* image ($H_1$); (c) residual *eye* image ($R_1$). Histograms: (d) full-resolution *eye* image ($H_0$); (e) residual *eye* image ($R_1$) |
| 63 | Images acquired from the fabricated imager chip: (a) circle after CDS; (b) circle after integrator; (c) first-level residual for circle; (d) histogram of first-level residual of circle; (e) *A* after CDS; (f) *A* after integrator; (g) first-level residual for *A*; (h) histogram of first-level residual of *A* |
| 64 | Schematic diagram of the proposed voltage-mode ADC |
| 65 | Normalized ideal CS measurements and the output of the compressive sensing incremental $\Sigma\Delta$ converter |
| 66 | Quantization error of the compressive sensing incremental $\Sigma\Delta$ converter for $N_s = 1$, 2 and 8 |
| 67 | Switched-capacitor circuit implementation of the compressive sensing ADC |
| 68 | Diagram of the test setup for compressive sensing on PCB |
| 69 | Test results of AM signal: (a) reconstructed from compressive sensing measurements; (b) sampled using normal ADC mode |
| 70 | Test results of FM signal: (a) reconstructed from compressive sensing measurements; (b) sampled using normal ADC mode |
| 71 | Diagram of the custom test setup |
| 72 | Normalized spectrum of recovered signal. The input signal is a single sinusoid of frequency 20 kHz |
| 73 | Normalized spectrum of recovered signal. The input signal consists of two AM signals with carriers at 100 kHz and 250 kHz and a single sinusoid at 400 kHz |
| 74 | Graphical depiction of signal acquisition under compressive sensing |
| 75 | Conventional and compressive sensing signal acquisition approaches. A random demodulator circuit is employed to project the input signal $s(t)$ onto the random vectors $\varphi_k(t)$ |
| 76 | Encoding and decoding process under the compressive sensing framework. The input $s(t)$ is encoded using the measurement matrix $\boldsymbol{\Phi}$ and a quantizer. The quantized measurements $\hat{\mathbf{y}}$ are employed to recover the input signal. The discrete-time recovered signal is denoted by $\tilde{\mathbf{s}}$ |
| 77 | Block diagram of the proposed $\Sigma\Delta$ random demodulator converter |
| 78 | Average and standard deviation of the signal-to-noise ratio of recovered input signals from quantized measurements: (a) $N = 256$; (b) $N = 512$; (c) $N = 1024$ |
| 79 | Schematic diagram of the switched-capacitor implementation |
| 80 | Average and standard deviation of signal-to-noise ratio of recovered signals from noisy and quantized measurements for $N = 1024$ |
| 81 | Average and standard deviation of signal-to-noise ratio of signals recovered from offset-corrected measurements for $N = 1024$ |
| 82 | Typical waveform of the integrator output $V_x$ and metastability region |
| 83 | Diagram of the custom test setup |
| 84 | Measured SNDR of recovered signals as the ratio $M/N$ is varied |
| 85 | Diagram of setup employed to acquire a radio signal |
| 86 | Spectrum of acquired signals: (a) using a data acquisition unit; (b) recovered from compressive sensing measurements |
| 87 | Block diagram of the hybrid imager chip |
| 88 | Pin names of the hybrid imager chip |
| 89 | Bonding diagram of the hybrid imager chip. Chip number: V19k-AJ |
| 90 | Top view of the hybrid imager chip package |
| 91 | Block diagram of the multiresolution decomposition imager chip |
| 92 | Pin names of the multiresolution decomposition imager chip |
| 93 | Bonding diagram of the multiresolution decomposition imager chip. Chip number: V0BL-AE |
| 94 | Top view of the multiresolution decomposition imager chip package |
| 95 | Block diagram of the $\Sigma\Delta$ AIC chip |
| 96 | Pin names of the $\Sigma\Delta$ AIC chip |
| 97 | Bonding diagram of the $\Sigma\Delta$ AIC chip. Chip number: V19K-AK |
| 98 | Top view of the $\Sigma\Delta$ AIC chip package |
| 99 | Schematic of the hybrid pixel with the sizes of the transistors |
| 100 | Schematic of the op-amp with the sizes of the transistors |
| 101 | Schematic of the $\Sigma\Delta$ AIC, version 1 |
| 102 | Schematic of the $\Sigma\Delta$ AIC, version 2 |
| 103 | Picture of custom PCB for the multiresolution decomposition imager with description |
| 104 | Picture of custom PCB for the hybrid imager with description |
| 105 | Picture of custom PCB for the $\Sigma\Delta$ AIC with description |

## **TABLES**

| Table | Title |
|-------|-------|
| 1 | Average PSNR (dB) of Image Reconstruction Results by Running 100 Random Patterns and 12 Images with Different Radii for Fixed Radius Approach and Different Percentages of 1s |
| 2 | Average SSIM of Image Reconstruction Results by Running 100 Random Patterns and 12 Images with Different Radii for Fixed Radius Approach and Different Percentages of 1s |
| 3 | Average PSNR (dB) and SSIM of Image Reconstruction Results by Running 100 Random Patterns and 12 Images with Different Reconstruction Schemes and Different Balances |
| 4 | Features of the Test Chip |
| 5 | Power Efficiency of the Charge Pump with Different $R_L$ |
| 6 | Power Efficiency of the DC-DC Boost Converter with Different $R_L$ and External Inductor $L$ at $C_D = 100$ nF, $C_L = 100$ nF, $I_{ph} = 20~\mu$A, and Illuminance $E_v = 40$ klux |
| 7 | Compression Performance of a Three-Level Decomposition |
| 8 | Comparison with JPEG in Lossy Scenario |
| 9 | Compression Performance |
| 10 | Features of the Test Chip |
| 11 | MSE of Reconstructed Signal from Measurements Computed by the ADC |
| 12 | MSE of Reconstructed Signal for an Unbalanced Sensing Matrix |
| 13 | Average Power Consumption of $\Sigma\Delta$ AIC |
| 14 | Performance Summary of $\Sigma\Delta$ AIC |
| 15 | Performance Comparison of AIC with Nyquist-Rate Converters |
| 16 | Pin Configuration of the Hybrid Imager Chip |
| 17 | Pin Configuration of the Multiresolution Decomposition Imager Chip |
| 18 | Pin Configuration of the $\Sigma\Delta$ AIC Chip |

#### **ACKNOWLEDGEMENTS**

I would never have been able to finish my dissertation without the guidance of my committee members, help from friends, and support from my family. I would like to express my deepest gratitude to my adviser, Professor Walter D. Leon-Salas, for his guidance and support, and for providing me with an excellent atmosphere for doing research. He inspired and motivated me greatly to work on analog and mixed-signal IC design. He patiently corrected my writing and financially supported my research at UMKC.
I would also like to thank Professor Ghulam Chaudhry, Professor Deep Medhi, Professor Cory Beard, and Professor Reza Derakhshani, who served on my dissertation committee, for their continued support and valuable suggestions. I would like to thank Mohammad Benyhesan, who, as a good friend, was always willing to share everything with me. It would have been a lonely lab without him. Many thanks to Fahad Moiz, Yu-Fang Chen, Mei-Chun Chen, Yu-Syuan Lin, Irene Chu, Bobbie Lee, and Johnny Liu for accompanying me through hard times. I would also like to thank my parents, two elder brothers, elder sister, my nieces, and my nephews. They always supported and encouraged me with their best wishes. Special thanks to my brother-in-law, Peter Gillespie, who reviewed my dissertation and made it more readable.

#### CHAPTER 1

#### **INTRODUCTION**

Advanced CMOS integrated circuit technology creates the opportunity for a new class of miniaturized electronic systems known as self-powered systems on a chip. Wireless sensors have special needs that are different from those of consumer electronics. First, consumer electronic devices can be plugged into a power source all the time, or their batteries can be charged whenever needed. Wireless sensing devices, on the other hand, are usually deployed in hard-to-reach areas such as deserts and remote outposts. They are therefore far away from electrical outlets and cannot be attached to a power source all the time, or even when needed. Although bigger batteries can be used to operate these devices for extended periods of time, even a very large battery will eventually run out of charge. At that point, the battery must be charged or replaced. Since wireless sensors can be deployed in difficult-to-reach locations, charging or replacing batteries by human labor is very costly. How can this problem be solved?
There are various energy sources available for harvesting, such as ambient light, vibration, radio energy, and thermal energy. Solar energy harvesting is especially suitable for sensors operating outdoors, as the energy density can reach $100~\mathrm{mW/cm^2}$ [69]. The photodiodes available in a standard CMOS fabrication process can be employed as solar cells to harvest solar energy. Integrating solar cells and a radio module with a CMOS image sensor reduces the sensor size and fabrication costs, and provides an energy source to the system.

Second, transmitting signals wirelessly draws a lot of power from energy-constrained devices. One way to reduce power consumption is to reduce the data rate that an image sensor has to generate. Multiresolution decomposition image compression provides a way to compress images, and thus reduce their data rate, by combining the benefits of multiresolution decomposition with the simplicity of predictive coding. The multiresolution decomposition circuitry only needs to perform additions and subtractions, so it can easily be integrated with a wireless image sensor on a chip.

#### 1.1 A Hybrid CMOS Imager with Sensing and Energy Harvesting Capabilities

The integration of solar cells on an integrated circuit along with the sensor front end and radio module has the benefit of reducing the sensor size and the fabrication costs. On-chip solar energy harvesting can be accomplished using the photodiodes available in CMOS fabrication processes. Several groups have reported the use of photodiodes for on-chip solar energy harvesting [50], [45], [33], [30], [67]. In [50], solar cells fabricated in a silicon-on-sapphire process are reported. However, silicon-on-sapphire fabrication technology is still more costly than standard CMOS. In [33], photodiodes fabricated in standard CMOS are integrated with vertical-interconnect-based capacitors to store harvested energy.
Measurement results show that up to 8.2 $\mu$W of power could be harvested with this approach. The two major drawbacks of this approach are that the vertical capacitors block light that does not arrive at the correct angle and that the capacitance density available for energy storage is very low. A different approach is taken in [30] and [67], where an active pixel sensor (APS) imager is equipped with additional photodiodes at the pixel level to harvest energy. The addition of an extra photodiode in each pixel results in bigger pixels and reduced fill factors. An improvement on this idea is presented in [45], where the in-pixel photodiode is used either to sense light or to harvest energy. Additional circuitry is needed to configure the photodiode as a sensor or as a harvester. This approach reuses the in-pixel photodiode for both sensing and harvesting. However, it has reduced efficiency because the N-well is tied to ground during energy harvesting, and the output voltage generated is too low to power conventional electronic circuits. Moreover, pixels can only be configured in groups and not individually.

To address these problems, a CMOS imager incorporating hybrid pixels that can be individually configured to work as photo-sensors or as tiny solar cells is presented. The imager can harvest energy between frames, in sleep mode, and even while it is acquiring images. However, the voltage produced by the photodiodes is rather low, around 0.3 V to 0.5 V depending on the illumination conditions. Higher voltages are needed to charge a battery or to power up the circuitry of the sensor. Voltage boosters with flying energy storage components are also proposed; they avoid connecting the N-well of the photodiodes to ground and avoid activating the parasitic photodiodes. The hybrid pixels can be fabricated in a standard CMOS process.
Since individual pixels of the hybrid imager can be configured as sensors or energy harvesters, the ratio of photo-sensing to harvesting pixels can be varied according to environmental conditions. Recently, a reconfigurable image sensor was presented in [45] in which the pixels' photodiodes can be configured to work as sensors or as tiny solar cells to harvest energy. Different configuration patterns, in which half of the pixels work as sensors and half as energy harvesters, are also explored in [45]. The authors concluded that a checkerboard pattern yields the least distortion on average in the reconstructed images. This scheme offers a trade-off between image resolution and harvested energy. In some situations, however, the harvested power is not enough to power up the circuit even when half of the pixels are configured to harvest energy. Conversely, if there is enough stored energy, the half-resolution configuration needlessly decreases the quality of the acquired images. Hence, the proposed hybrid pixels can be configured individually so that a chosen ratio of photo-sensing to harvesting pixels can be set.

In order to have a flexible ratio of photo-sensing to harvesting pixels, a random pattern is programmed on the pixel array, and the balance of 0s and 1s is controlled with a chaos-based pseudorandom bit generator. In this arrangement, a 0 means that a pixel is programmed as a solar cell and a 1 means that a pixel is configured as a photo-sensor. The resulting images lack imaging data for the pixels that are configured as solar cells. The missing pixels are estimated using averaging and compressive sensing techniques. The reconstructed images have good or acceptable quality, with the added benefit of a flexible way to control the ratio of photo-sensing to harvesting pixels. A methodology based on compressive sensing for programmable hybrid pixels is also proposed.
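The random-mask scheme described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names `random_mask` and `reconstruct_by_averaging` are hypothetical, NumPy's generator stands in for the chaos-based pseudorandom bit generator, and a plain fixed-radius mean of the sensed neighbors stands in for the reconstruction algorithms developed later in the dissertation.

```python
import numpy as np

def random_mask(shape, frac_ones, seed=0):
    """Binary mask: 1 = photo-sensing pixel, 0 = energy-harvesting pixel.

    NumPy's generator is used here only as a stand-in for the
    chaos-based pseudorandom bit generator mentioned in the text.
    """
    rng = np.random.default_rng(seed)
    return (rng.random(shape) < frac_ones).astype(np.uint8)

def reconstruct_by_averaging(image, mask, radius=1):
    """Estimate each missing pixel (mask == 0) as the mean of the sensed
    pixels inside a (2*radius + 1) x (2*radius + 1) window around it."""
    rows, cols = image.shape
    out = image.astype(float).copy()
    for r, c in zip(*np.nonzero(mask == 0)):
        r0, r1 = max(r - radius, 0), min(r + radius + 1, rows)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, cols)
        window = image[r0:r1, c0:c1].astype(float)
        known = mask[r0:r1, c0:c1] == 1
        if known.any():
            out[r, c] = window[known].mean()
        else:
            # No sensed neighbor in the window: fall back to the mean of
            # all sensed pixels (assumes the mask has at least one 1).
            out[r, c] = image[mask == 1].mean()
    return out
```

For a $32 \times 32$ array, `random_mask((32, 32), 0.1)` would devote roughly 90% of the pixels to harvesting; a larger `radius` makes it more likely that each missing pixel has at least one sensed neighbor, at the cost of more blurring.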
The essence of compressive sensing is that if a signal is sparse in some domain, then it can be acquired at a much lower rate than the Nyquist-Shannon sampling theorem would otherwise suggest. In compressive sensing, a signal is acquired by projecting it onto a random basis. To simplify the circuit design, a binary random basis with entries in $\{0,1\}$ is used instead of a basis of arbitrary random numbers. When a 0 in the random basis is applied to the corresponding position of a signal, that sample does not contribute to the final compressive sensing measurement. Applying this idea to the concept of hybrid pixels, the pixels that correspond to the 0s in the random basis can be configured as solar cells, while the pixels that correspond to the 1s are programmed as photo-sensors. The balance of 0s and 1s can be changed: with more 0s, more pixels are dedicated to energy harvesting; with fewer 0s, fewer pixels are dedicated to energy harvesting. In this way, the balance can be changed in response to changes in environmental conditions. Simulation results show that up to 90% of the entries in a random basis can be 0s and an image can still be reconstructed with acceptable quality. That means that up to 90% of the hybrid pixels of an imager can be programmed to harvest energy. The images are recovered by compressive sensing reconstruction algorithms.

#### 1.2 Image Compression

The other main concern for a wireless sensor is power consumption. Radio communication is a power-hungry operation. One way to reduce power consumption dramatically is to compress data before transmission. Different image compression algorithms have been integrated with CMOS imagers with relative success. Focal-plane image compression faces a number of challenges, such as the limited silicon area available at the focal plane, which limits the complexity of the processing circuitry.
Moreover, popular image compression algorithms require extensive matrix multiplications and storage of the resulting coefficients. The resulting coefficients have to be quantized and encoded to achieve actual compression, adding to the complexity of the compression circuitry. To address these challenges, circuit designers have come up with different solutions such as implementing the matrix multiplications with simple analog circuits [38], [8]. However, due to limitations of analog devices, the reconstructed images exhibit relatively high noise. Other approaches to focal-plane image compression have departed from the traditional transform coding approach. For instance, in [2] a conditional replenishment algorithm is implemented using very simple circuits integrated at the pixel level. The disadvantage of conditional replenishment is that it does not work well with still images and has poor compression performance compared to other compression approaches. Another approach to image compression relies on predictive coding. In predictive coding the value of a pixel is predicted using models that take into account the pixel's neighbors and/or its past values. The prediction error, which is the difference between the pixel value and its prediction, is encoded using an entropy encoder. If a good predictor is used, on average the prediction error is zero or close to zero, and high compression ratios can be achieved. The drawback of prediction-based compression is that small errors can be accumulated, especially if the prediction operations are carried out in the analog domain [46]. Modern image compression algorithms are based on wavelet decomposition in which an image is progressively decomposed into sub-bands of lower resolution. The coefficients in the sub-bands are encoded using hierarchical group testing techniques. 
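The decomposition of an image into sub-bands of lower resolution can be made concrete with a minimal sketch of one level of a 2-D Haar decomposition, the simplest wavelet. This is an illustration only: the function name `haar_level`, the unnormalized scaling by 1/4, and the sub-band names are assumptions, and the sketch is not the circuit implementation used in any of the cited works.

```python
def haar_level(image):
    """One level of an unnormalized 2-D Haar decomposition on 2x2 blocks.

    Returns four half-resolution sub-bands: the block average and the
    horizontal, vertical and diagonal differences.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    half_r, half_c = rows // 2, cols // 2
    avg = [[0.0] * half_c for _ in range(half_r)]
    horiz = [[0.0] * half_c for _ in range(half_r)]
    vert = [[0.0] * half_c for _ in range(half_r)]
    diag = [[0.0] * half_c for _ in range(half_r)]
    for i in range(half_r):
        for j in range(half_c):
            a, b = image[2 * i][2 * j], image[2 * i][2 * j + 1]
            c, d = image[2 * i + 1][2 * j], image[2 * i + 1][2 * j + 1]
            avg[i][j] = (a + b + c + d) / 4.0     # low-resolution image
            horiz[i][j] = (a - b + c - d) / 4.0   # left minus right
            vert[i][j] = (a + b - c - d) / 4.0    # top minus bottom
            diag[i][j] = (a - b - c + d) / 4.0    # diagonal difference
    return avg, horiz, vert, diag

if __name__ == "__main__":
    avg, horiz, vert, diag = haar_level([[1, 2], [3, 4]])
    # The top-left pixel is recovered as the sum of the four coefficients.
    print(avg[0][0] + horiz[0][0] + vert[0][0] + diag[0][0])
```

Applying the function again to the returned average sub-band yields the next, coarser level. Only additions, subtractions and a fixed scaling are involved.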
The implementation of wavelet-based compression techniques on the focal plane is not straightforward due to the complex calculations and storage space required. The success of wavelet decomposition in image compression has motivated circuit designers to integrate simplified versions of wavelet-based compression on the focal plane. For instance, a 1-D Haar wavelet transform is implemented on the focal plane using a $\Sigma\Delta$ ADC that performs weighted spatial averaging [54]. Successive ADC outputs are accumulated off-chip to obtain a 2-D Haar transform. In [48] a focal-plane circuit based on capacitors and switches is employed to perform the average, sum and difference operations needed in the Haar wavelet transform. In this dissertation, a CMOS imager architecture that combines the benefits of multiresolution decomposition with the simplicity of predictive coding is proposed. In this approach, an image is progressively decomposed into images of lower resolution; up to three levels of decomposition are employed. The low-resolution images are used as predictors for the higher-resolution images. The resulting prediction residuals are quantized, entropy encoded and compressed. This compression approach can provide lossless or lossy compression and the resulting bitstream is a fully embedded code. The required operations are additions and subtractions, which can be implemented with simple analog circuits. Unlike standard predictive coding, errors due to non-ideal behavior of circuit components do not propagate throughout the image. Another way to compress data is to use compressive sensing. Compressive sensing is a recent theoretical development applicable not only to images but to any type of signal that is sparse in some domain.
One of the central theorems in digital signal processing is the Nyquist theorem, which states that in order to acquire a band-limited signal without distortion the signal has to be uniformly sampled at a rate of at least twice its highest frequency component. There are some applications, such as radar, ultra-wideband communication and video, in which the direct application of the Nyquist theorem results in sampling rates that push the physical limits of analog-to-digital converters (ADCs) [42]. Often, more information about the signal than just its highest frequency component is available. This additional information can be exploited to relax the sampling rate requirements. In recent years, a new theoretical framework, known as compressive sensing, has been developed that allows sparse or compressible signals to be sampled at rates below the Nyquist rate. Compressive sensing states that signals that are sparse in some transform domain can be sampled at a much lower rate than the Nyquist rate and can still be recovered without introducing distortion. Hence, compressive sensing has the potential to revolutionize the way analog signals are acquired. To fulfill this potential, new circuit architectures need to be developed to produce ADCs that follow the compressive sensing principles and are able to work at lower sampling rates. Converters that directly acquire compressive sensing measurements are sometimes called analog-to-information converters (AICs). Some circuit architectures addressing the implementation of AICs have been proposed in the literature. In [51], a compressive sensing ADC dubbed X-ADC was built using off-the-shelf electronic components that were operated beyond their stated specifications to convert signals that are sparse in the frequency domain and have a 2 GHz Nyquist rate using a sampling rate of only 280 MHz. In [81], a parallel mixed-signal converter architecture for wideband cognitive radios is presented.
A prototype converter was built using low-speed off-the-shelf components. It was shown that using parallel overlapping windowed integrators reduced spurious frequencies. The transistor-level hardware architecture of a compressive analog-to-digital converter (CADC) is presented in [42], [59]. The converter uses a double-balanced Gilbert cell multiplier and an RC-active differential integrator to compute the compressive sensing measurements in the analog domain, which are later quantized by a back-end ADC. The ADC was tested with AM signals sampled at 1/8 of the Nyquist rate. Another approach to designing an AIC employs a sigma-delta modulator [11]. This approach dynamically changes the feedback coefficients of a sigma-delta modulator in accordance with a random dictionary. The disadvantage is that it requires very high-order modulators that can easily become unstable. To avoid unstable behavior the feedback coefficients must meet additional constraints. In this dissertation, a compressive sensing ADC that is based on the incremental sigma-delta architecture is proposed. The proposed converter requires only a first-order sigma-delta modulator, making it unconditionally stable. Moreover, it uses a switched-capacitor hardware implementation that requires only 0.047 mm<sup>2</sup> of silicon area in a $0.5~\mu m$ CMOS fabrication process. Thus, several of the proposed converters can be employed in parallel to achieve even higher conversion rates.

#### 1.3 Summary of Contributions

This dissertation addresses the problem of energy-constrained wireless image sensors and provides vital solutions, including hybrid pixels for the dual function of energy harvesting and photo sensing, a DC-DC converter, image compression, and the application of compressive sensing. The main contributions of this work are:

- the design and testing of a hybrid pixel imager that can perform the tasks of photo sensing and energy harvesting. The hybrid pixels allow the imager to harvest energy between frames or while it is in sleep mode.
- the design and testing of a DC-DC converter to boost the inherently low voltage of the photodiodes when they work as solar cells. The DC-DC converter employs flying energy storage components to maximize the amount of energy that can be harvested.
- the development of a random reconfiguration method for the hybrid pixel imager and of a missing-pixel reconstruction algorithm to recover the values of pixels that are configured as solar cells. The recovered images have good to acceptable image quality depending on the balance of 0s and 1s.
- the implementation and verification of compressive sensing with the hybrid pixel imager.
- the design and testing of an analog-to-digital converter suitable for compressive sensing applications. The converter is based on an incremental sigma-delta converter architecture and is able to directly acquire and convert analog compressive sensing measurements to digital format. A switched-capacitor circuit is proposed for the converter's hardware implementation.
- the design and testing of an image compression algorithm suitable for focal-plane integration and its hardware implementation. In this approach an image is progressively decomposed into images of lower resolution. The low-resolution images are then used as the predictors of the higher-resolution images. The prediction residuals are entropy encoded and compressed. This compression approach can provide lossless or lossy compression and the resulting bitstream is a fully embedded code. A switched-capacitor circuit is proposed to implement the required operations.

This dissertation is organized as follows: Chapter 2 provides the background of the hybrid pixel concept and DC-DC conversion, describes the architecture and the chip design, and contains the analysis of the proposed circuits. Chapter 3 presents the test setup and test results for the hybrid imager. Chapter 4 gives a detailed description and a mathematical analysis of energy harvesting with on-chip solar cells.
The background, design, and test results for multiresolution decomposition image compression are presented in Chapter 5. Chapter 6 shows the background, design, and results for an incremental sigma-delta converter for compressive sensing. Chapter 7 presents an in-depth look at a compressive sensing sigma-delta converter. Finally, Chapter 8 provides the conclusions of this dissertation.

# CHAPTER 2 HYBRID IMAGER SYSTEM AND CIRCUIT DESIGN

Figure 1: Architecture of the hybrid imager with sensing and energy harvesting capabilities.

To demonstrate the hybrid pixel concept, an imager with hybrid pixels was designed and fabricated. Figure 1 shows the architecture of the hybrid imager with sensing and energy harvesting capabilities. The imager consists of a $32 \times 32$ array of hybrid pixels, which are discussed in detail in section 2.1. Column and row decoders employed for addressing individual pixels are placed at the periphery of the array. The energy harvesting buses from each column are combined into a global bus that feeds a DC-DC converter. The sensing buses from each column are multiplexed and applied to an off-chip analog-to-digital converter (ADC).

#### 2.1 Hybrid Pixel

Photodiodes occupy a significant area of a typical CMOS imager. When an imager is not acquiring images, e.g. during sleep mode or between frames, the photodiodes are not used but they still take up area inside the chip. A better use of the in-pixel photodiodes would be to harvest solar energy when images are not being acquired. To enable this dual functionality a pixel is modified so that it can be configured as a photosensor or as a solar cell. A pixel that can be configured in this manner is referred to as a hybrid pixel. Moreover, if individual pixels in an array can be configured as sensors or energy harvesters, then the ratio of photo-sensing to harvesting pixels can be varied according to environmental conditions.
This scheme offers a trade-off between image resolution and harvestable energy.

#### 2.2 Hybrid Pixel Design

To work as a photo-sensor, the anode of the photodiode is connected to ground and the reverse-bias photo-generated current $I_{ph}$ is measured, typically by integrating it over a period of time [10]. To work as a solar cell, the photodiode's anode needs to be connected to the load (or to a DC-DC converter) so that power is generated by the photodiode. Fig. 2 shows the photodiode's configuration for the two modes: (a) photo sensing and (b) power generation.

Figure 2: Reconfigurable pixel concept: (a) photodiode working as a photo sensor (anode tied to ground); (b) photodiode working as a solar cell (anode tied to the load).

The photodiodes that are available in a standard P-type substrate CMOS fabrication process are: N-well/P-sub, P-diff/N-well and N-diff/P-sub. Of these diodes only the P-diff/N-well diode can be used in a hybrid pixel because the P-substrate has to be tied to ground to prevent pn junctions across the chip from becoming forward-biased. A cross-sectional view of a P-diff/N-well diode $(D_1)$ is shown in Fig. 3. Notably, a second diode $(D_{par})$ is formed by the N-well and the P-substrate. Diode $D_{par}$ is undesired from the energy harvesting point of view. If diode $D_{par}$ is allowed to conduct current, a portion of the holes generated inside the N-well will flow through $D_{par}$, decreasing the current through $D_1$ and the amount of power that $D_1$ can provide to the load.

Figure 3: Cross-section diagram of photodiodes available in standard CMOS. Diode $D_1$ can be employed to harvest energy. Diode $D_{par}$ is a parasitic diode.

Figure 4 illustrates the effect of the parasitic diode $D_{par}$ on the output current of $D_1$. The figure shows the current vs. voltage (I-V) curve for diode $D_1$ when the N-well is tied to ground and when it is left floating.
Clearly, tying the N-well to ground significantly reduces the current through $D_1$ as charges are allowed to flow through $D_{par}$. The presence of $D_{par}$ creates a problem because practical loads are referred to ground, which requires the cathode (N-well) of $D_1$ to be connected to ground (see Fig. 2(b)). This problem is solved by employing a flying energy storage element (capacitor or inductor) to extract energy from $D_1$ and deliver it to the load. A charge pump based on this concept is presented in section 2.3.

Figure 4: I-V curve of photodiode $D_1$ when its N-well is left floating and when it is tied to ground.

Based on the observations stated above, the hybrid pixel shown in Fig. 5 is proposed. The hybrid pixel consists of a P-diff/N-well photodiode $D_1$, switches $S_1$ to $S_4$, a 1-bit memory, row and column addressing switches and a PMOS transistor $P_1$. The 1-bit memory can be written via the column-wide "programming bus". Writing a '1' in the 1-bit memory configures the pixel in the sensing mode. In the sensing mode switches $S_2$ and $S_4$ are closed and the pixel works as a regular active pixel sensor (APS). The APS works as follows: the RST switch resets the photodiode's capacitance to $V_{DD}$. Then the RST switch is opened and the photo-generated current discharges the capacitance, producing a voltage proportional to the photo-generated current. This voltage is read through $P_1$, which forms part of a source-follower amplifier.

Figure 5: Schematic diagram of a single hybrid pixel.

The pixel is configured in the harvesting mode by writing a '0' into the 1-bit in-pixel memory. In the harvesting mode, switches $S_1$ and $S_3$ are closed while switches $S_2$ and $S_4$ are open. $S_1$ and $S_3$ connect the photodiode to the energy harvesting (EH) bus. The EH bus is a column-wide shared bus that adds the photo-currents of all the pixels in that particular column that are configured in the harvesting mode.
The EH buses from all columns are merged at the bottom of the array to produce a current that is the sum of the photo-currents of all pixels configured in the harvesting mode. The total harvested photo-current is applied to a DC-DC converter that boosts the inherently low voltage output of the photodiodes. The pixel layout occupies an area of $43~\mu m \times 43~\mu m$ in a $0.5~\mu m$ CMOS fabrication process. The RST switch is implemented with a PMOS transistor while switch $S_4$ is implemented with a transmission gate. NMOS transistors are sufficient to implement switches $S_1$, $S_2$ and $S_3$. The 1-bit memory is implemented with two inverters. The transition voltage of the inverters is designed to be $V_{DD}/2$, where $V_{DD}=3.3$ V, so that the state of the 1-bit memory can be changed easily; in practice, however, $V_{DD}$ has to be increased to 3.9 V for the 1-bit memory to work.

### 2.3 DC-DC Conversion

As explained in section 2.2, connecting the cathode (N-well) of the in-pixel photodiodes to ground decreases the output current in harvesting mode due to the presence of the parasitic N-well/P-sub photodiode. This problem is solved by employing energy storage components (a capacitor or an inductor, depending on the type of DC-DC converter used) to extract energy from the in-pixel photodiodes and transfer it to the load. The flying capacitor or inductor stores and transfers energy from the photodiode to the load while providing isolation. Thus, connecting the N-well of the parasitic diode to ground is avoided. Two types of DC-DC converters, the charge pump and the DC-DC boost converter, are discussed in this section. Charge pumps use capacitors as energy storage elements, while DC-DC boost converters use inductors.
Although the charge pump stores and transfers energy on capacitors, which can be easily integrated within a single chip, it requires more switches to transfer energy than a DC-DC boost converter and it produces a lower output voltage. More charge pump stages could be employed to pump up to a higher voltage, but the efficiency of the entire voltage boost system would decrease dramatically, and the additional stages would require more switches and occupy more silicon area. DC-DC boost converters are not ideal either. Inductors are not as easy to fabricate on chip as capacitors. A solution to this is to use external inductors for DC-DC boost converters. Both types of converters have pros and cons, which are discussed in detail below.

# 2.3.1 Charge pump

Figure 6: Energy extraction circuit with flying capacitor to maximize the amount of energy harvested from the in-pixel photodiodes.

Figure 6 depicts the flying-capacitor concept. The energy-extraction circuit requires a two-phase non-overlapping clock with phases $\phi_1$ and $\phi_2$. The parallel-connected diodes $D_1$ to $D_N$ represent the photodiodes of the pixels that have been configured to work in harvesting mode. The flying capacitor $C_1$ is charged by the photodiodes during phase $\phi_1$ by closing switches $S_1$ and $S_2$. During phase $\phi_2$ switches $S_1$ and $S_2$ are open and switches $S_3$ and $S_4$ are closed, transferring the charge to the load. The typical open-circuit voltage of on-chip photodiodes is between 0.4 V and 0.5 V (depending on illumination conditions). This voltage is too low to power most loads of interest in a sensor node. Hence, a boosting DC-DC converter must be connected between the diodes and the load.

Figure 7: Charge pump with flying capacitor to double the output voltage of the photodiodes.

In this work we have employed a charge pump to boost the inherently low voltage of the photodiodes.
The charge pump is based on the series-parallel charge pump topology. As a proof of concept, a charge pump with a 2× voltage conversion gain was designed and implemented. Fig. 7 shows the schematic diagram of the implemented pump. The concept of using a flying capacitor and a series-parallel pump can be readily extended to obtain higher voltage conversion gains by adding more stages. The charge pump works with a two-phase non-overlapping clock with phases $\phi_1$ and $\phi_2$. During phase $\phi_1$ switches $S_1$, $S_2$, $S_3$ and $S_4$ are closed and all the other switches are open, connecting $C_1$ and $C_2$ in parallel with each other and with the photodiodes. Thus, in $\phi_1$, $C_1$ and $C_2$ are charged by the photodiodes to a voltage between 0.4 V and 0.5 V (depending on the illumination conditions). During $\phi_2$ switches $S_5$, $S_6$ and $S_7$ are closed, connecting $C_1$ and $C_2$ in series and to the load. In the figure the load is modeled by the current sink $I_L$. Assuming $C_1 = C_2 = C_L = C$, an analysis of the circuit in Fig. 7 shows that, at the beginning of $\phi_2$, $V_{DC} = 2V_{ph} - \frac{2I_L}{C}\left(\frac{2T_2}{3} + T_1\right)$, where $T_1$ and $T_2$ are the on-times of phases $\phi_1$ and $\phi_2$, respectively. The output voltage decreases with the load current. The decrease can be compensated by increasing the capacitances or increasing the switching frequency. The efficiency of the charge pump is affected by parasitic bottom-plate capacitances and the non-zero on-resistance of the switches.

### 2.3.2 DC-DC Boost Converter

The DC-DC boost converter with flying inductor is shown in Figure 8. It uses an inductor to store energy and transfer it to the load. The diode $D_s$ works as a switch to prevent energy transfer from the load side to the photodiode side. The energy-extraction circuit requires a two-phase non-overlapping clock with phases $\phi_1$ and $\phi_2$.
The flying inductor $L_1$ is charged by the photodiodes during phase $\phi_1$ by closing switches $S_1$ and $S_2$. During phase $\phi_2$ switches $S_1$ and $S_2$ are open and switch $S_3$ is closed, transferring the stored energy to the load.

Figure 8: DC-DC boost converter with flying inductor to boost the output voltage of the photodiodes.

An advantage of the DC-DC boost converter over the charge pump is that the inductor-based voltage booster can boost the photodiode's inherently low voltage and directly generate the voltage level required by the load. The drawback of charge pumps is that they require several capacitors and switches, increasing area requirements and power losses due to the on-resistance of the switches. The voltage at node B, the cathode of the photodiodes $D_1$ to $D_N$, in Figure 8 is typically between -0.4 V and -0.5 V, which is also the open-circuit voltage of the parasitic photodiodes $D_{par,1}$ to $D_{par,N}$. This open-circuit voltage develops in response to photons of longer wavelength that are able to penetrate deeper into the silicon crystal and reach the parasitic diodes. As a consequence, the voltage at node A is very close to 0 V. Moreover, the loop formed by the inductor $L_1$ and the load side is open when the diode $D_s$ is reverse-biased. Thus, switch $S_3$ can be removed, as shown in Figure 9. During phase $\phi_1$, closing switches $S_1$ and $S_2$ connects node A to the load ground. Since the voltage at node A is already very close to ground, the circuit operates as intended. Removing switch $S_3$ has the advantage of eliminating the losses due to its non-zero on-resistance. All the switches are implemented with NMOS transistors. Moreover, instead of a two-phase non-overlapping clock, only the single phase $\phi_1$ is required. After removing $S_3$, the output voltage, $V_{out}$, increases by 0.2 V to 0.3 V.

Figure 9: Improved inductor-based energy harvesting circuit.
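For the 2× charge pump of section 2.3.1, the dependence of the output voltage on the load current can be checked numerically. The sketch below evaluates the expression for $V_{DC}$ given there, read as $2V_{ph} - (2I_L/C)(2T_2/3 + T_1)$, which is the dimensionally consistent reading. The function name and all component values in the example are hypothetical and chosen only for illustration; they are not measured values from the fabricated chip.

```python
def charge_pump_output(v_ph, i_load, cap, t1, t2):
    """Output voltage of the 2x series-parallel charge pump.

    v_ph   : photodiode voltage in volts
    i_load : load current in amperes
    cap    : C1 = C2 = CL in farads
    t1, t2 : on-times of clock phases phi1 and phi2 in seconds
    """
    droop = (2.0 * i_load / cap) * (2.0 * t2 / 3.0 + t1)
    return 2.0 * v_ph - droop

if __name__ == "__main__":
    # Hypothetical operating point: 0.45 V photodiode voltage, 1 uA load,
    # 100 pF capacitors, 0.5 us on-time per phase.
    v_dc = charge_pump_output(0.45, 1e-6, 100e-12, 0.5e-6, 0.5e-6)
    print(f"V_DC = {v_dc:.4f} V")
```

With these assumed values the droop term is about 17 mV. Doubling the capacitance or halving both on-times halves the droop, which matches the compensation strategies described in section 2.3.1.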
### 2.4 Image Reconstruction with Sub-sampling and Compressive Sensing

The hybrid pixel imager has the dual function of energy harvesting and photo sensing. There are conditions under which photo sensing and energy harvesting are both needed at the same time. In that case the hybrid pixel imager sub-samples an image, and the missing pixels, which are programmed to harvest energy, can be recovered from their available neighbor pixels. As discussed in Chapter 1, some pixel configuration patterns were proposed in [45] and are shown in Figure 10. The gray boxes represent pixels configured as photo sensors and the white boxes represent pixels whose values are missing. With each of these four patterns, half of the pixels are programmed as solar cells and the others are configured as photo sensors. The authors concluded that a checkerboard pattern gives the best quality of reconstructed images. However, all four of these patterns lack the flexibility that individually programmable hybrid pixels can provide. Two image reconstruction approaches, sub-sampling and compressive sensing, were developed for the hybrid imager.

Figure 10: Pixel configuration patterns for image reconstruction as proposed in [45].

## 2.4.1 Image Reconstruction with Sub-Sampling

A random pattern mask is proposed to configure the modality of the pixels. The random pattern extends the flexibility of the hybrid imager. The array size of the mask is the array size of the imager. Figure 11 shows some masks of size 32 × 32 with different percentages of 1s: 10%, 50%, and 90%. One of the advantages of implementing random pattern masks in the hybrid imager is that the masks can be easily generated in hardware and software without requiring memory to store different masks. The ratio of 0s and 1s is controlled by setting the threshold of the pseudorandom generator. The pseudorandom generator generates a random sequence of 0s and 1s, and a mask matrix is formed from this long sequence.
In order to match the masks in the hardware and in the image reconstruction software, the same seed and threshold are used. In this way, no memory is required to store the masks. Since the ratio of 0s and 1s can be controlled in the random pattern masks, the ratio of photo-sensing to energy-harvesting pixels can be controlled. In the image reconstruction process, a matching mask is generated at the reconstruction side using the same seed. The imager only converts the analog values of the photo-sensing pixels to digital values and sends these values to a computer for image reconstruction. No information is sent from the imager to identify the type of each pixel. Thus, energy is saved by not transmitting this information. The missing pixels are recovered from their available neighbor pixels, i.e. the pixels configured as photo-sensing pixels in the imager. Since random masks are used, the locations of the available neighbor pixels are also random.

Figure 11: $32 \times 32$ masks for different percentages of 1s: (a) 10% (b) 50% (c) 90%. Black areas represent energy-harvesting pixels, and white areas represent photo-sensing pixels.

Two approaches were developed. The process of the first approach is as follows. First, a random mask is applied to the hybrid imager. Then the values of the available pixels are converted, transmitted for image reconstruction processing, and placed in their corresponding locations according to the matching mask. An image with missing pixels is thus formed. The positions of the missing pixels are known at the reconstruction side from the 0s in the matching mask. At this point the missing pixels are ready to be estimated, and a neighborhood can be formed for each missing pixel.
The neighborhood of radius $R$ for a missing pixel is defined as follows:

$$\text{Neighborhood of a missing pixel} = (i \pm h,\ j \pm h) \tag{2.1}$$

where $(i,j)$ is the coordinate of the missing pixel, and $h \le R$. Figure 12 shows the neighborhood of missing pixels for $R=1$ and $R=2$. In Figure 12(a), the pixels of a neighborhood of radius $R = 1$ for a missing pixel $(i,j)$ are $(i-1,j-1)$, $(i-1,j)$, $(i-1,j+1)$, $(i,j-1)$, $(i,j+1)$, $(i+1,j-1)$, $(i+1,j)$, and $(i+1,j+1)$.

Figure 12: The neighborhoods of missing pixels: (a) $R = 1$ (b) $R = 2$.

To estimate a missing pixel, at least one available neighbor pixel within the given radius is required. If there are no available pixels, the radius has to be increased manually until there is at least one available pixel. The radius for the first approach is fixed throughout the entire image reconstruction process. Reconstructed pixels are not treated as available pixels for the next missing pixel because they already contain reconstruction error. If reconstructed pixels were treated as available pixels, the errors would propagate throughout the entire image, and the reconstructed image would have very poor quality. A disadvantage of the first approach is that the radius is fixed and has to be known before the reconstruction process. The problem with a fixed radius is that there are situations in which many pixels are available, and in these cases a smaller radius would be preferable. To solve this problem, a reconstruction algorithm with an adaptive radius and weighted pixels was developed to improve the performance of the reconstruction process. For the most part, the second approach is the same as the first approach. The two exceptions are the weighting of the pixels and the checking of the radius. The weights are proportional to the inverse of the distance to the center of the neighborhood. The center corresponds to the missing pixel that is being estimated.
The distance $d$ of an available pixel from the center is

$$d = \sqrt{r^2 + c^2} \tag{2.2}$$

where $\{r,c\}$ is the relative coordinate of the available pixel with respect to the missing pixel that is being estimated; the weight of the available pixel is proportional to $1/d$. The radius is checked for every missing pixel until at least one available pixel is found. For example, as shown in Figure 13, pixel "a" is ready for reconstruction and the initial radius is 1. Within the eight-pixel neighborhood of pixel "a", there are 3 available pixels, which are shown as gray squares in the figure. The value of missing pixel "a" is the weighted average of these three available pixels. If there are no available pixels within a radius of 1, as for pixel "b" in Figure 13, the radius is increased by 1. After increasing the radius, there are 4 available pixels, and the value of missing pixel "b" is the weighted average of these available pixels. The same process is applied until all missing pixels are estimated. The simulation results are shown later, where they are compared with compressive sensing.

Figure 13: Illustration of the reconstruction of two missing pixels a and b. Gray squares represent available pixels, which are photo-sensing pixels in the imager, and white squares represent missing pixels, which are energy-harvesting pixels in the imager.

# 2.4.2 Image Reconstruction with Compressive Sensing

Compressive sensing was explored for image reconstruction when not all pixels are available. In compressive sensing, measurements are acquired by projecting an image into a random basis. The random basis vectors can be arranged in matrix form by stacking them as the rows of a matrix. The resulting matrix is called the sensing matrix. If an image has size $N \times N$, the size of the sensing matrix is $M \times N^2$, where $M$ is the number of measurements. A measurement is acquired by calculating the inner product between the input image vector and a row of the sensing matrix. $M$ measurements are needed to reconstruct an image.
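The measurement step just described, one inner product per row of the sensing matrix, can be sketched as follows. The function names, the use of Python's standard pseudorandom generator, and the row-major ordering used to turn the image into a vector are assumptions made for illustration; the text does not state the vectorization order.

```python
import random

def binary_sensing_matrix(m, n_pixels, fraction_ones, seed=1):
    """Return an m x n_pixels matrix with random entries in {0, 1}."""
    rng = random.Random(seed)
    return [[1 if rng.random() < fraction_ones else 0 for _ in range(n_pixels)]
            for _ in range(m)]

def measure(phi, image):
    """Compute one measurement per row of phi as an inner product."""
    x = [p for row in image for p in row]        # row-major image vector (assumed)
    if any(len(row) != len(x) for row in phi):
        raise ValueError("each sensing row must have one entry per pixel")
    return [sum(w * v for w, v in zip(row, x)) for row in phi]

if __name__ == "__main__":
    image = [[1, 2], [3, 4]]                     # N = 2, so N^2 = 4 pixels
    phi = [[1, 0, 0, 1],                         # M = 2 measurements
           [0, 1, 1, 0]]
    print(measure(phi, image))                   # pixels under a 0 do not contribute
```

A pixel multiplied by 0 never contributes to a measurement, which is why such a pixel can be used for energy harvesting instead of sensing.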
A drawback of this approach is that energy is needed to program $M$ different masks and acquire $M$ measurements. To solve this problem, a fixed random pattern is programmed into the in-pixel memory. The sensing matrix is generated based on this fixed random pattern. A sensing matrix based on this method is shown as follows:

$$\Phi = \begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & \dots & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & \dots & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & \dots & 0 \\ & & & & \vdots & & & & & \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & \dots & 0 \end{bmatrix} \tag{2.3}$$

If a value in the first row is 0, all elements of that column are 0s. For example, the entries of the first row in columns 2, 3, 6, 8, and the last column are 0s. Thus, every entry in these five columns is 0. When a value in the first row is 1, the elements of that column in the other rows are randomly set to 0 or 1. A random sensing matrix is formed by applying this process. Instead of acquiring measurements on the focal plane, the analog values of the available pixels are converted to digital and transmitted for the compressive sensing process. A sensing matrix and an image with missing pixels are then available, and the image is reconstructed using compressive sensing reconstruction algorithms.

## 2.4.3 Hybrid Image Reconstruction Simulation Results

The balance of 0s and 1s in the sensing matrix can also be varied. In the simulation, up to 90% of the entries in the sensing matrix are programmed as 0s, and images with acceptable quality are still reconstructed. The radius $R$ is fixed for the first sub-sampling reconstruction approach. Table 1 and Table 2 show the average PSNR and SSIM of the images reconstructed from 12 images and 100 random patterns with varying radii $R$ and percentages of 1s.
PSNR stands for peak signal-to-noise ratio, and SSIM stands for structural similarity, a method for measuring the similarity between two images. SSIM is more consistent with human visual perception, while PSNR is the traditional measure for comparing two images. Both are shown for a better comparison. The average PSNR and SSIM are low when the radius is small and the percentage of 1s is low because there are not enough available pixels for reconstruction. On the other hand, when the radius R is 7 and the percentage of 1s of the random mask is 90%, the average PSNR and SSIM are lower than for R = 1. That is because averaging over the many available pixels inside a large radius blurs the image. This is the reason the adaptive radius and weighted average reconstruction algorithm was developed.

Table 1: Average PSNR (dB) of Image Reconstruction Results by Running 100 Random Patterns and 12 Images with Different Radii for Fixed Radius Approach and Different Percentages of 1s

| Percentage of 1s | R=1 | R=2 | R=3 | R=4 | R=5 | R=6 | R=7 |
|------------------|-------|-------|-------|-------|-------|-------|-------|
| 10 | 10.37 | 15.58 | 19.81 | 20.89 | 20.77 | 20.53 | 20.27 |
| 20 | 14.29 | 21.69 | 22.53 | 22.03 | 21.60 | 21.25 | 20.93 |
| 30 | 18.37 | 24.11 | 23.42 | 22.76 | 22.27 | 21.89 | 21.56 |
| 40 | 22.36 | 25.21 | 24.20 | 23.48 | 22.99 | 22.59 | 22.26 |
| 50 | 25.83 | 26.18 | 25.07 | 24.32 | 23.80 | 23.41 | 23.06 |
| 60 | 28.42 | 27.27 | 26.10 | 25.32 | 24.78 | 24.39 | 24.05 |
| 70 | 30.41 | 28.59 | 27.39 | 26.60 | 26.05 | 25.65 | 25.29 |
| 80 | 32.53 | 30.43 | 29.15 | 28.36 | 27.83 | 27.44 | 27.07 |
| 90 | 35.76 | 33.52 | 32.22 | 31.41 | 30.88 | 30.48 | 30.09 |

Table 3 summarizes and compares the image reconstruction performance of sub-sampling and compressive sensing for different percentages of 1s. Fixed radius and adaptive radius are both shown for comparison. It is clear that the adaptive algorithm and compressive sensing have better PSNR and SSIM.
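The adaptive-radius, weighted-average reconstruction of the sub-sampling approach can be sketched as follows, together with a PSNR helper. This is a minimal illustration, not the Matlab code used for the simulations. It assumes a square $(2R+1) \times (2R+1)$ search window, in line with the eight-pixel neighborhood for $R = 1$, and it assumes that each available pixel is weighted by the inverse of its distance $d$ from equation (2.2); the text does not state how $d$ enters the average. The function names and the small test image are hypothetical.

```python
import math

def reconstruct(image, mask):
    """Estimate pixels where mask == 0 from available pixels (mask == 1).

    The radius starts at 1 and grows by 1 until the window around the
    missing pixel contains at least one available pixel.
    ASSUMPTION: weights are 1/d with d = sqrt(r^2 + c^2).
    """
    rows, cols = len(image), len(image[0])
    if not any(any(row) for row in mask):
        raise ValueError("mask contains no available pixels")
    out = [list(row) for row in image]
    for i in range(rows):
        for j in range(cols):
            if mask[i][j]:
                continue                      # available pixel, keep as is
            radius = 1
            while True:
                num = den = 0.0
                for r in range(-radius, radius + 1):
                    for c in range(-radius, radius + 1):
                        y, x = i + r, j + c
                        if 0 <= y < rows and 0 <= x < cols and mask[y][x]:
                            w = 1.0 / math.sqrt(r * r + c * c)
                            num += w * image[y][x]
                            den += w
                if den > 0.0:
                    out[i][j] = num / den     # weighted average
                    break
                radius += 1                   # nothing found, widen the search
    return out

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    n = len(reference) * len(reference[0])
    mse = sum((a - b) ** 2
              for ra, rb in zip(reference, test)
              for a, b in zip(ra, rb)) / n
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
mask = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]      # only the corners are sensing pixels
filled = reconstruct(image, mask)
```

Values of `image` at masked-out positions are never read, so the same routine can be applied directly to a frame in which the energy-harvesting pixels hold arbitrary data.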
The adaptive radius and weighted average process is better than the fixed radius approach because the radius does not need to be adjusted manually; it adapts to the available pixels during the reconstruction process. The adaptive radius approach and the compressive sensing approach are both implemented in the hardware testing since they show very similar performance in the simulation results.

Table 2: Average SSIM of Image Reconstruction Results by Running 100 Random Patterns and 12 Images with Different Radii for Fixed Radius Approach and Different Percentages of 1s

| Percentage of 1s | R=1 | R=2 | R=3 | R=4 | R=5 | R=6 | R=7 |
|------------------|--------|--------|--------|--------|--------|--------|--------|
| 10 | 0.1287 | 0.3610 | 0.4944 | 0.4761 | 0.4456 | 0.4256 | 0.4087 |
| 20 | 0.3015 | 0.6413 | 0.6098 | 0.5617 | 0.5279 | 0.5046 | 0.4864 |
| 30 | 0.5383 | 0.7337 | 0.6761 | 0.6298 | 0.5965 | 0.5737 | 0.5550 |
| 40 | 0.7418 | 0.7868 | 0.7325 | 0.6899 | 0.6594 | 0.6367 | 0.6182 |
| 50 | 0.8558 | 0.8308 | 0.7842 | 0.7465 | 0.7181 | 0.6969 | 0.6790 |
| 60 | 0.9084 | 0.8704 | 0.8322 | 0.8000 | 0.7743 | 0.7549 | 0.7380 |
| 70 | 0.9388 | 0.9064 | 0.8768 | 0.8509 | 0.8293 | 0.8124 | 0.7970 |
| 80 | 0.9620 | 0.9399 | 0.9192 | 0.9011 | 0.8850 | 0.8712 | 0.8586 |
| 90 | 0.9820 | 0.9709 | 0.9604 | 0.9507 | 0.9415 | 0.9330 | 0.9249 |

Table 3: Average PSNR (dB) and SSIM of Image Reconstruction Results by Running 100 Random Patterns and 12 Images with Different Reconstruction Schemes and Different Balances

| Percentage of 1s | Fixed radius R=4, PSNR | Fixed radius R=4, SSIM | Weighted average and adaptive R, PSNR | Weighted average and adaptive R, SSIM | Compressive sensing, PSNR | Compressive sensing, SSIM |
|------------------|-------|--------|-------|--------|-------|--------|
| 10 | 20.89 | 0.4761 | 21.58 | 0.5765 | 21.52 | 0.5368 |
| 20 | 22.03 | 0.5617 | 23.37 | 0.6981 | 23.55 | 0.6860 |
| 30 | 22.76 | 0.6298 | 24.86 | 0.7804 | 25.18 | 0.7751 |
| 40 | 23.48 | 0.6899 | 26.29 | 0.8390 | 26.64 | 0.8347 |
| 50 | 24.32 | 0.7465 | 27.71 | 0.8829 | 28.10 | 0.8805 |
| 60 | 25.32 | 0.8000 | 29.18 | 0.9159 | 29.68 | 0.9143 |
| 70 | 26.60 | 0.8509 | 30.77 | 0.9419 | 31.24 | 0.9377 |
| 80 | 28.36 | 0.9011 | 32.77 | 0.9638 | 33.14 | 0.9576 |
| 90 | 31.41 | 0.9507 | 36.04 | 0.9831 | 35.71 | 0.9736 |

### CHAPTER 3

# HYBRID IMAGER ENERGY HARVESTING AND IMAGE ACQUISITION RESULTS AND DISCUSSION

Figure 14: Layout of the hybrid pixel imager.

As a proof of concept, a test chip containing a $32 \times 32$ array of hybrid pixels, a charge pump, an address decoder, and a multiplexer was designed and fabricated in a $0.5~\mu m$ CMOS process. A printed circuit board (PCB) was designed and fabricated to host the test chip and the supporting circuitry. The chip was outfitted with a lens and mounted on the PCB. The PCB includes an ADC to convert the pixel outputs to digital format, a CPLD to generate the control and timing signals and to transmit acquired images to a PC, and a digital-to-analog converter (DAC) to generate bias voltages. Table 4 summarizes the main characteristics of the fabricated hybrid imager.

Table 4: Features of the Test Chip

| Feature | Value |
|-------------------|----------------------------------------------|
| Technology | 3-metal, 2-poly 0.5-μm CMOS |
| Number of pixels | 32 × 32 |
| Pixel size | $54\mu\mathrm{m} \times 48.75\mu\mathrm{m}$ |
| Photodiode size | $31 \ \mu \text{m} \times 31 \ \mu \text{m}$ |
| Fill factor | 36.5% |
| Power consumption | $140~\mu\mathrm{W}$ |
| Power supply | 3.3 V |

# 3.1 Energy Harvesting

Figure 15: Diagram of the test setup. The light source intensity is controlled with a power supply. The I-V curve of the photodiode is measured with a source meter.
The output voltage is measured with an oscilloscope.

Figure 15 shows a diagram of the test setup for the energy harvesting test. All of the hybrid pixels of the imager were programmed for energy harvesting during testing unless otherwise stated. A xenon light bulb was employed as the light source since it produces light with a color similar to daylight. The light source is connected to a power supply whose current output can be regulated to control the intensity of the light beam. The PCB is mounted in front of the light beam, and the output voltage of the energy-harvesting circuit is monitored with an oscilloscope. To measure $I_{ph}$, the photodiode is disconnected from the energy-harvesting circuit and connected to a source meter (Keithley 2400). The photodiodes of the hybrid imager are connected in parallel internally. The I-V curve of the photodiodes is then measured with the source meter. $I_{ph}$ is taken as the photodiode current at 0 V bias.

Figure 16: The I-V curve of the hybrid imager when all of the hybrid pixels were programmed for energy harvesting.

An I-V curve of the photodiodes of the hybrid pixels, measured at an illuminance of 40 kLux, is shown in Figure 16. The output voltage of the photodiodes of the hybrid imager is between 0.4 V and 0.5 V depending on the illumination. A charge pump or a boost converter can be implemented separately to boost the low voltage of the photodiodes. A single-stage charge pump raises the output voltage of the photodiodes to 0.8 V, and a DC-DC boost converter can boost the voltage up to 3.8 V with the hybrid imager. More measurement results for the DC-DC converters are shown in the following sections. Since an external capacitor and inductor are required, part of the energy harvesting circuit is built on a breadboard, but the switches integrated with the hybrid imager are used.
Figure 17: I-V curves of the hybrid imager with direct incident light using different percentages of 1s in the random patterns at an illuminance of $E_v = 40$ kLux.

When sub-sampling or compressive sensing is implemented for the hybrid pixel imager, pixels are configured as photo-sensors when they are 1s in the random mask. If a random mask with 50% 1s is used, the other half of the pixels of the imager are programmed for energy harvesting. As the images are being acquired, the imager is also harvesting energy. Figure 17 shows the I-V curves of the hybrid imager when random masks with different percentages of 1s are used to configure the hybrid pixels. The illuminance is 40 kLux for this test.

# 3.2 Test results for charge pump

Figure 18: Measurement results of the charge pump with flying input capacitor: (a) charge pump start-up; (b) ripple voltage of the charge pump.

The charge pump with flying capacitors was also tested. All of the hybrid pixels were programmed for energy harvesting for this test. Figures 18(a) and 18(b) show the charge pump test results for a resistive load of 1 M$\Omega$ and a load capacitance of 1000 pF. Capacitors $C_1$ and $C_2$ each have a capacitance of 1000 pF. The capacitors are off chip. Figure 18(a) shows the transient behavior of the charge pump output voltage during start-up. Figure 18(b) is a close-up view of the output waveform to show the ripple voltage. The ripple voltage increases as the load resistance decreases. The switching frequency of the pump is 31 kHz. Table 5 lists the charge pump output voltage and efficiency ($\eta$) for different load values.

Table 5: Power Efficiency of the Charge Pump with Different $R_L$

| $R_L(\Omega)$ | $V_{DC}(V)$ | $\eta$ (%) |
|---------------|-------------|------------|
| 1 M | 0.812 | 11.57 |
| 500 K | 0.776 | 21.13 |
| 330 K | 0.744 | 29.42 |
| 220 K | 0.690 | 37.97 |
| 100 K | 0.550 | 53.07 |
| 56 K | 0.374 | 43.82 |
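As a quick cross-check of Table 5, the power delivered to a resistive load follows from $P_{out} = V_{DC}^2 / R_L$. The sketch below computes it for each row and, assuming the efficiency is the ratio of load power to the maximum power available from the photodiodes, back-computes the input power implied by each row. The variable and function names are hypothetical, and the implied input power is a derived quantity, not a measured one.

```python
# (R_L in ohms, V_DC in volts, efficiency in percent), copied from Table 5
table5 = [
    (1e6, 0.812, 11.57),
    (500e3, 0.776, 21.13),
    (330e3, 0.744, 29.42),
    (220e3, 0.690, 37.97),
    (100e3, 0.550, 53.07),
    (56e3, 0.374, 43.82),
]

def load_power(v_dc, r_load):
    """Power dissipated in a resistive load, in watts."""
    return v_dc ** 2 / r_load

rows = []
for r_load, v_dc, eta in table5:
    p_out = load_power(v_dc, r_load)
    # ASSUMPTION: eta = P_out / P_in,max, so P_in,max = P_out / eta
    p_in_implied = p_out / (eta / 100.0)
    rows.append((r_load, p_out, p_in_implied))
    print(f"R_L = {r_load:9.0f} ohm   P_out = {p_out * 1e6:5.2f} uW   "
          f"implied P_in = {p_in_implied * 1e6:5.2f} uW")

best = max(rows, key=lambda row: row[1])   # row with the largest load power
```

Under this assumption every row implies nearly the same available input power, about 5.7 $\mu$W, which suggests the tabulated efficiencies are mutually consistent.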
The highest efficiency is achieved at a load $R_L$ of 100 k$\Omega$, and the power delivered to the load is 3 $\mu$W. Higher voltages can be achieved by adding more stages to the charge pump, or alternatively by employing a switched-inductor DC-DC boost converter. Test results for the boost converter are shown in the following section.

#### 3.3 Test results for DC-DC boost converter

The test results for the DC-DC boost converter configuration are shown in this section. The DC-DC boost converter uses a flying inductor instead of a capacitor to store energy. An inductor-based voltage booster has the ability to boost the photodiode's inherently low voltage and directly generate the voltage level required by the load. All of the hybrid pixels were programmed for energy harvesting for this test. The DC-DC boost converter can boost the output voltage of the photodiode, which is between 0.4 V and 0.5 V, up to as high as 3.8 V with a flying inductor L = 1 mH. The switching frequency is 15 kHz with $C_D$ = 100 nF, $C_L$ = 1 $\mu$F, and no load resistor. This is a significant result because, with a converted output voltage this high, a 3.7 V lithium-ion battery can be charged directly by the DC-DC boost converter. Figure 19 shows a circuit for charging a 3.7 V, 8 mAh lithium-ion battery.

Figure 19: A circuit setup to charge a lithium-ion battery by a DC-DC boost converter with hybrid pixels.

Figure 20: Battery voltage as the battery is charged using the DC-DC boost converter and the hybrid pixels.

Figure 20 shows the battery voltage as the battery is charged using the DC-DC boost converter and the hybrid pixels. For an illuminance of 60 kLux, the battery voltage increases by 200 mV after the battery has been charged for 6 hours. The test was conducted with indoor illumination. A xenon light bulb was employed as the light source. The component values are L = 100 mH, $C_L$ = 100 nF, and $C_D$ = 100 nF. The clock frequency for $S_1$ and $S_2$ is 12 kHz.
Note that the boost converter requires only one clock phase, while the charge pump needs two clock phases. Table 6 lists the boost converter output voltage and efficiency $(\eta)$ for different load values and different inductances. As can be seen in the table, a higher inductance yields a higher output voltage because a larger inductor can store more energy. This is especially true when the hybrid imager has a relatively low photodiode output current, which requires the boost converter to use a higher value of inductance. When the inductance increases, the switching frequency decreases. The switching frequency is 533 kHz when an inductor L of 1 mH is used. The switching frequency is 12 kHz when L = 100 mH is used. The highest efficiency is achieved at a load $R_L$ of 100 k$\Omega$ and an inductance of 100 mH, with an illuminance of 40 kLux. The power delivered to the load is 6.5 $\mu$W.

Table 6: Power Efficiency of the DC-DC Boost Converter with Different $R_L$ and External Inductor L at the Condition of $C_D = 100$ nF, $C_L = 100$ nF, $I_{ph} = 20~\mu$A, and Illuminance $E_v = 40$ kLux

| $R_L(\Omega)$ | $V_{DC}(V)$, L = 1 mH | $\eta$ (%), L = 1 mH | $V_{DC}(V)$, L = 10 mH | $\eta$ (%), L = 10 mH | $V_{DC}(V)$, L = 100 mH | $\eta$ (%), L = 100 mH |
|---------------|-------|-------|-------|-------|-------|-------|
| 1 M | 0.887 | 11.21 | 1.54 | 33.30 | 2.23 | 70.29 |
| 500 K | 0.835 | 21.13 | 1.25 | 46.98 | 1.68 | 84.45 |
| 330 K | 0.80 | 27.68 | 1.115 | 53.07 | 1.44 | 88.89 |
| 220 K | 0.75 | 36.80 | 0.972 | 60.68 | 1.19 | 90.64 |
| 100 K | 0.62 | 55.63 | 0.68 | 66.53 | 0.799 | 90.78 |
| 56 K | 0.51 | 66.38 | 0.50 | 64.06 | 0.582 | 85.42 |
| 10 K | 0.21 | 62.72 | 0.19 | 49.58 | 0.191 | 51.72 |

## 3.4 Image Acquisition of Hybrid Imager

Figure 21: Acquired images from the hybrid-pixel imager: (a) image of letter U; (b) image of letter M; (c) image of letter K; (d) image of letter C.

There are a few different ways to acquire images.
The simplest way is to use all of the hybrid pixels as photo sensors, in which case the imager works like a normal camera. Figure 21 shows images acquired with the hybrid imager when all the pixels are configured to work in the sensing mode. Notably, images of acceptable quality can be acquired with the imager even though it only has a $32 \times 32$ pixel array. Another way is sub-sampling, in which a random mask is applied to program the modality of the hybrid pixels. There are two approaches to recovering the missing pixels in sub-sampling. The first approach uses a fixed radius. The second approach uses a weighted average and an adaptive radius. The last method to acquire images is compressive sensing, which uses a compressive sensing reconstruction algorithm to recover images.

Figures 22 to 24 show the images reconstructed with the three different reconstruction approaches. In the figures, each approach uses percentages of 1s from 10% to 90% in increments of 10%. Visually, compressive sensing and sub-sampling with weighted average produce better reconstructed images at every percentage of 1s in the random masks. For 10% and 20% of 1s, sub-sampling with weighted average gives better reconstruction results than compressive sensing, but for the other percentages compressive sensing gives notably better reconstruction results than the other approaches. Although the compressive sensing reconstruction algorithm is more complicated, it has the best performance. Moreover, compressive sensing for the hybrid pixel imager does not require any extra hardware. The hardware architecture for compressive sensing is the same as for sub-sampling. The computational complexity of reconstruction is handled at the data processing center, so it does not add to the workload of the chip. Therefore, compressive sensing is the best solution for image reconstruction with the hybrid pixel imager.

Figure 22: Hybrid imager image reconstructions with sub-sampling of fixed radius approach. R = 4.
Percentages of 1s of random masks are, from left, 10% to 90% in increments of 10%.

Figure 23: Hybrid imager image reconstructions with sub-sampling of weighted average and adaptive R. Percentages of 1s of random masks are, from left, 10% to 90% in increments of 10%.

Figure 24: Hybrid imager image reconstructions with compressive sensing. Percentages of 1s of random masks are, from left, 10% to 90% in increments of 10%.

#### CHAPTER 4

# DETAILED DESCRIPTION AND A MATHEMATICAL ANALYSIS FOR A CIRCUIT OF ENERGY HARVESTING USING ON-CHIP SOLAR CELLS

This chapter gives a detailed description and a mathematical analysis of on-chip solar energy harvesting and of the proposed circuit that can be employed to generate high voltages from integrated photodiodes. The proposed circuit uses a switched-inductor approach to avoid stacking photodiodes to generate high voltages. The effect of parasitic photodiodes present in integrated circuits is addressed, and a solution to minimize their impact is presented. The proposed circuit employs two switch transistors and two off-chip components: an inductor and a capacitor. A theoretical analysis of a switched-inductor DC-DC converter is carried out, and a mathematical model of the energy harvester is developed. Measurements taken from a fabricated integrated circuit are presented and shown to be in good agreement with the mathematical model. Measurement results show that voltages of up to 2.81 V (depending on illumination and loading conditions) can be generated from a single integrated photodiode. The energy harvester circuit achieves a maximum conversion efficiency of 59%.

Solar energy harvesting is a promising solution in remote sensing applications. In this type of application, battery replacement is costly and sometimes prohibitive. In scenarios such as environment monitoring and infrastructure surveillance, sensors are deployed in large quantities and in remote locations, making it costly to replace their batteries.
In such cases, solar energy can be harvested to recharge the sensor's battery or to power the sensor directly, extending its operational life and, in some cases, providing near-perpetual sensor operation. The integration of solar cells on the same chip with the sensor circuitry reduces system integration costs, parasitic effects, and overall size. Moreover, on-chip solar cell integration enables entirely new solutions such as self-powered retinal prostheses [7] and femto-satellites [9].

The fabrication of solar cells in a standard CMOS process is appealing because it reduces fabrication costs, as no post-processing or additional fabrication steps are required. The photodiodes available in standard CMOS could, in principle, be employed as solar cells to harvest solar energy. However, these photodiodes have a relatively low open-circuit voltage, around 0.40 V to 0.55 V depending on the illumination conditions. Higher voltages are needed to charge a battery or to power the sensor's circuitry. The conventional approach to generating higher voltages is to stack several photodiodes in series. In standard CMOS, it is difficult to stack photodiodes because they share the same substrate. Another challenge in standard CMOS is the presence of current losses due to parasitic photodiodes. If the parasitic current losses are not properly compensated, the amount of harvested power can decrease significantly. If a silicon-on-insulator (SOI) fabrication process is employed, photodiodes can be isolated and easily stacked. Moreover, parasitic photodiodes are eliminated in an SOI process. However, SOI is a much more costly technology than standard CMOS.

Several efforts have been undertaken to address the challenges of on-chip solar cell integration in standard CMOS processes. In [6] a voltage between 0.6 V and 0.83 V is generated by connecting a P-diff/N-well photodiode and a P-sub/N-diff photodiode in series.
However, this approach is limited to only two serially connected photodiodes. Moreover, the substrate has to be left floating at a potential that depends on the light intensity. Hence, the threshold voltage of transistors integrated on the same substrate with the photodiodes is not well defined. In [44] current losses due to parasitic diodes are compensated by employing additional photodiodes. Although this approach enables the stacking of photodiodes and generates voltages between 0.84 V and 1.3 V, its efficiency is quite low (0.3% to 0.06%). Moreover, additional silicon area is needed for parasitic current loss compensation. In [33], photodiodes fabricated in a standard CMOS process are employed as solar cells to harvest energy. An open-circuit output voltage of 1.09 V was obtained by stacking photodiodes. However, the stacked photodiodes have to reside in different integrated circuits to allow the microchip substrate to be tied to a non-ground potential. In [65] photodiodes fabricated in a triple-well fabrication process are employed for energy harvesting. A triple-well process enables the stacking of photodiodes fabricated on the same microchip. However, they still suffer from losses due to parasitic photodiodes.

This chapter addresses the problem of on-chip solar energy harvesting and presents a circuit that can be employed to generate high voltages from integrated photodiodes. Photodiode stacking is avoided by using DC-DC converters. Switched-capacitor and switched-inductor DC-DC converters are evaluated. The effect of parasitic photodiodes is addressed, and a solution to minimize their impact is presented. A theoretical analysis of a switched-inductor DC-DC converter is carried out, and a mathematical model of the energy harvester is developed. Measurements taken from a fabricated integrated circuit are presented. Measurement results show that voltages of up to 2.81 V (depending on illumination and loading conditions) can be generated from a single photodiode.
The energy harvester achieves a maximum efficiency of 59%.

The rest of this chapter is organized as follows: Section 4.1 presents a brief description of the photodiodes available in standard CMOS and the challenges of employing these photodiodes as solar cells to harvest energy. Section 4.2 introduces the proposed energy-harvesting circuit. Section 4.3 presents an analysis of the energy-harvesting circuit, yielding a mathematical model of the solar energy harvester. Section 4.4 uses this mathematical model to find the circuit parameters that optimize the energy harvester's performance. Section 4.5 presents measurements from a fabricated microchip.

## 4.1 On-Chip Solar Cells

Figure 25: Cross-section diagram of photodiodes available in standard CMOS and their equivalent circuit diagram.

Figure 25 shows a cross-sectional view of some of the photodiodes that are available in a standard CMOS process. The P-diff/N-well p-n junction constitutes one of the photodiodes $(D_1)$, while the P-sub/N-well p-n junction constitutes another photodiode $(D_2)$. The substrate (P-sub) must be connected to the lowest potential in the circuit (ground in this case) to avoid forward biasing any p-n junction across the chip. Figure 25 also shows the equivalent circuit diagram of diodes $D_1$ and $D_2$. A third photodiode is available in standard CMOS: N-diff/P-sub (not shown in the figure). Given that the substrate has to be tied to ground, the N-diff/P-sub and the P-sub/N-well $(D_2)$ photodiodes cannot be employed as solar cells since their anodes are connected to ground. The N-diff/P-sub and the P-sub/N-well photodiodes, however, can be used as light sensors in image sensing applications [10]. The anode of photodiode $D_1$ (P-diff) does not have to be tied to ground. Thus, $D_1$ can work as a solar cell to harvest solar energy. For energy harvesting purposes, $D_2$ is a parasitic diode that introduces current losses.
The incident light generates electron-hole pairs in the P-diff, N-well and P-sub regions of the chip. The holes generated in the N-well diffuse to the depletion regions of $D_1$ and $D_2$ where the respective p-n junction built-in potentials collect them into the P-diff and P-sub regions. Similarly, the electrons generated in the P-diff and P-sub regions diffuse to the depletion regions and get collected into the N-well by the junctions' built-in potentials. As more charges get separated by the p-n junctions, the built-in potentials increase until equilibrium is achieved. Under open circuit conditions, equilibrium is reached when the net current through the diodes is zero. The open-circuit voltage between terminals A and B is around 0.4 V to 0.55 V depending on the illumination and doping levels. Figure 26: Typical connection between an input voltage source, a DC-DC converter and the load. The source and the load have a common ground (a) ideal voltage source; (b) photodiode $D_1$ as the voltage source. The voltage produced by a photodiode is too low to power up the electronic circuitry in a typical sensor node. Hence, a step-up DC-DC converter that generates a sufficiently large voltage is needed. A common feature of most DC-DC converter architectures is that the input voltage source of the converter and the load need to have a common ground as illustrated in Fig. 26. As a result, when the voltage source is replaced with on-chip photodiode $D_1$ , the photodiode's cathode, or equivalently the N-well, has to be tied to ground. Figure 27: Cross-section diagram of on-chip photodiodes and their connection to a DC-DC converter. The connection between $D_1$ and the DC-DC converter is illustrated in more detail in Fig. 27. The figure also shows the presence of the parasitic diode $D_2$ . As before, the cathode of $D_1$ (N-well) is tied to ground to establish a common reference between $D_1$ , the DC-DC converter and the load. 
Connecting the N-well to ground lowers the N-well/P-sub ($D_2$) junction potential to zero volts, allowing holes generated in the N-well to easily diffuse into the substrate; as a result, current $I_{D2}$ increases. As holes are diverted away from $D_1$ and into $D_2$, the magnitude of $I_{D1}$ decreases. A lower $I_{D1}$ translates into less energy that can be harvested.

Figure 28: Measured I-V curves of a P-diff/N-well photodiode $(D_1)$ under direct sun illumination conditions when the N-well is left floating and when it is tied to ground.

The effect of tying the N-well to ground on the magnitude of $I_{D1}$ can be observed experimentally by measuring the current vs. voltage (I-V) curve of $D_1$ when the N-well is tied to ground and when it is left floating. Fig. 28 shows the measured I-V curves of a P-diff/N-well photodiode of size 982 $\mu$m $\times$ 1047 $\mu$m under direct sunlight illumination conditions. The I-V curves were measured with a source meter (Keithley 2400). As observed, the current through $D_1$ is significantly higher when the N-well is left floating than when it is tied to ground. Hence, more energy can be harvested if the N-well is left floating rather than tied to ground. To take advantage of this observation, a DC-DC converter has to be modified accordingly to maximize the energy that can be harvested from the photodiode.

# 4.2 Energy Harvesting Circuits

Figure 29: Proposed capacitor-based energy harvesting circuit.

The losses due to the parasitic diode $D_2$ can be avoided by using a circuit that isolates the load circuit from the photodiode. The load isolation can be accomplished by employing a flying capacitor or a flying inductor. The flying capacitor or inductor is employed to store and transfer energy from the photodiode to the load while providing isolation. In this manner, connecting the N-well of the parasitic diode $D_2$ to ground is avoided. The capacitor-based energy harvesting circuit is shown in Fig.
29 while the inductor-based energy harvesting circuit is shown in Fig. 30. In both cases a two-phase non-overlapping clock with phases $\phi_1$ and $\phi_2$ and extra switches $(M_1$ to $M_4)$ are needed. Switch $M_4$ is replaced by diode $D_s$ in the switched-inductor circuit to avoid energy transfer from the load side to the photodiode side. During phase $\phi_1$ the energy storage element (capacitor or inductor) is charged by connecting it in parallel with the photodiode $D_1$. During phase $\phi_2$ the energy stored in the capacitor or in the inductor is transferred to the load.

Figure 30: Proposed inductor-based energy harvesting circuit.

An advantage of the inductor-based over the capacitor-based harvesting circuit is that the former has the ability to boost the photodiode's inherently low voltage and directly generate the voltage level required by the load. The capacitor-based circuit, on the other hand, has to be modified to generate high voltages. A series-parallel charge pump, for instance, can be employed to boost the photodiode's voltage. The drawback of charge pumps is that they require several capacitors and switches, which increases the area requirements and the power losses due to the on-resistance of the switches. At first glance, the all-capacitor nature of charge pumps might seem advantageous for fully integrated energy harvesting solutions. However, on-chip capacitors suffer from bottom-plate parasitic capacitances, which dissipate additional energy as the parasitic capacitors need to be charged and discharged [58]. Moreover, accurate charge pumps require substantially large capacitors, on the order of microfarads, which are difficult to integrate on chip [62]. As state-of-the-art inductors can be as small as $1 \times 2 \times 1$ mm<sup>3</sup>, co-packaging an off-chip inductor with the sensor IC is a viable solution for producing miniaturized energy-harvesting sensor nodes [40].
In this work we focus on inductor-based solar energy harvesting circuits for integrated photodiodes.

Improved Energy Harvesting Circuit: The voltage at the cathode of photodiode $D_1$ (node B in Fig. 30) is typically between -0.4 V and -0.5 V, which is the open-circuit voltage of photodiode $D_2$. This open-circuit voltage develops in response to photons with longer wavelengths that are able to penetrate deeper into the silicon crystal and reach $D_2$. As a result, the voltage at node A is very close to 0 V due to the positive voltage ($\sim 0.5$ V) generated across $D_1$. This particular situation, only present in integrated photodiodes, allows us to remove switch $M_3$ (see Fig. 31). With $M_3$ removed, during phase $\phi_1$ (when $M_1$ is closed), node A gets connected to the load ground. However, since the potential at node A is already very close to ground, the circuit operates as originally intended. The advantage of removing $M_3$ is that the losses due to the non-zero ON resistance of $M_3$ are eliminated. In addition, less silicon area is needed to implement the energy-harvesting circuit. Moreover, there is no need to generate a two-phase clock as only phase $\phi_1$ is required. Circuit measurements show that removing $M_3$ increases the output voltage $V_{out}$ by 0.2 V to 0.3 V. For these reasons, the rest of this chapter focuses on the analysis and test of the energy-harvesting circuit shown in Fig. 31.

Figure 31: Improved two-transistor inductor-based energy harvesting circuit using on-chip photodiodes.

# 4.3 Circuit Analysis

An analysis of the energy harvesting circuit is necessary to optimize it and to provide insight into its operation. The inductor-based harvesting circuit belongs to a class of DC-DC converters known as boost converters.
However, the unique non-linear characteristic of the photodiode introduces effects that are not present in the design of standard boost converter circuits where an ideal voltage source is often assumed as the energy source. To simplify the analysis of a boost converter with a photodiode, the photodiode is sometimes modeled as a constant current source [78], a piece-wise linear circuit [77] or it is assumed to operate in close proximity of the maximum power point (MPP) [26]. In this work we carry out an analysis that does not make these assumptions, resulting in a more generic model that allows us to estimate more precisely the performance of the proposed energy harvesting circuit. Our analysis is then employed to find the conditions that optimize the operation of the energy harvester.

## 4.3.1 Photodiode Model

Figure 32: Photodiode equivalent circuit employed to model the photodiode's electrical behavior.

Our analysis requires an accurate model of the photodiode. A photodiode working in the photo-voltaic mode is typically modeled using the equivalent circuit shown in Fig. 32 [79]. We will use the model in Fig. 32 in our analysis and simulations to assess the performance of the proposed energy harvesting circuit. The photodiode's model includes a current source, $I_{ph}$, that represents the photogenerated current. $I_{ph}$ is a function of the illumination intensity and when $R_s$ is close to zero, it can be approximated by the photodiode's short-circuit current. The equivalent circuit also includes a diode to model the typical knee in the I-V curve. The current through this diode is given by $I_d = I_s(e^{V_d/nV_T} - 1)$, where $I_s$ is the reverse saturation current, $n$ is the diode ideality factor and $V_T$ is the thermal voltage. The thermal voltage is defined as $V_T = k_b T_C/q$ where $k_b$ is Boltzmann's constant, $T_C$ is the photodiode's temperature and $q$ is the electron charge. $V_d$ is the voltage across the diode.
The capacitor represents the photodiode's junction capacitance $C_j$ . The resistance $R_{sh}$ is a shunt resistance that models the load presented to the current harvested near the edges of the device and $R_s$ is the diode's series resistance due to the device contacts and connections [62]. The model parameters can be extracted from measured I-V curves using curve fitting [31] or iterative techniques [37]. To extract the photodiode model parameters, the I-V curve shown in Fig. 28 (with the N-well floating) is used. Given that an I-V curve is a DC measurement, the capacitor $C_j$ can be assumed to be an open circuit when performing the I-V curve based parameter estimation. Considering $C_j$ as an open circuit and using the iterative technique presented in [37] yields the following parameter values: $I_{ph}$ =52.3 $\mu$ A, $R_{sh}$ =1 M $\Omega$ , n=1.15, $I_s$ =1 pA and $R_s$ =0 $\Omega$ . The junction capacitance $C_j$ is a function of the diode voltage $V_d$ . To simplify our analysis, the value of $C_j$ is approximated with its zero-bias value $C_{j0} \times A$ , where $C_{j0}$ is the zero-bias capacitance per unit area of the P-diff/N-well junction and A is the area of the photodiode. Circuit simulations show that the error introduced by the zero-bias $C_j$ approximation is very small. The fabricated photodiode has an area of $982~\mu\mathrm{m}\times1047~\mu\mathrm{m}$ , which yields an estimated junction capacitance value of $C_j=735~\mathrm{pF}$ . Fig. 33 shows the measured and modeled I-V curves. Figure 33: Modeled and measured photodiode I-V curves. 
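The DC part of the model in Fig. 32 can be sketched numerically as follows. This is a minimal illustration, not the code used in this work: it uses the extracted parameter values quoted above with $R_s = 0$ and $C_j$ treated as an open circuit, and it assumes a photodiode temperature of 300 K, which the text does not state. The function names are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # electron charge, C


def thermal_voltage(temp_kelvin=300.0):
    """V_T = k_b * T_C / q. The 300 K default is an assumption."""
    return K_B * temp_kelvin / Q_E


def photodiode_current(v_d, i_ph=52.3e-6, i_s=1e-12, n=1.15, r_sh=1e6,
                       temp_kelvin=300.0):
    """DC terminal current of the Fig. 32 model with R_s = 0 and C_j open."""
    n_vt = n * thermal_voltage(temp_kelvin)
    return i_ph - i_s * math.expm1(v_d / n_vt) - v_d / r_sh


def open_circuit_voltage(lo=0.0, hi=1.0, steps=60, **params):
    """Voltage at which the terminal current is zero, found by bisection.

    The current decreases monotonically with voltage, so the sign change
    between lo and hi brackets a single root.
    """
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if photodiode_current(mid, **params) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("I(0 V) =", photodiode_current(0.0), "A")
    print("V_oc   =", open_circuit_voltage(), "V")
```

With these values the current at 0 V equals $I_{ph}$, and the open-circuit voltage comes out slightly above 0.5 V, in line with the $\sim 0.5$ V photodiode voltage mentioned earlier in this chapter.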
# 4.3.2 Maximum Power Point (MPP) Calculation

The efficiency of the energy harvester circuit will be evaluated using the following expression [68]:

$$\eta = \frac{P_{out}}{P_{in,max}} \times 100\% = \frac{V_{out} \times I_L}{V_{mpp} \times I_{mpp}} \times 100\% \tag{4.1}$$

where $V_{mpp}$ and $I_{mpp}$ are the photodiode's voltage and current at the MPP, $P_{in,max} = V_{mpp} \times I_{mpp}$ is the maximum power that the photodiode can provide, and $P_{out} = V_{out} \times I_L$ is the power delivered to the load. An analytical expression that calculates an approximate value for the MPP has been derived in [63]. The approximate MPP value is within a neighborhood $\varepsilon$ of the true MPP, where $\varepsilon=nV_T$. In this section, a simple analytical expression that predicts the MPP with slightly better accuracy than the expression provided in [63] is derived. The derived MPP expression allows for more accurate estimations of $P_{in,max}$.

From Fig. 32, and taking into account that $R_s=0$, the following expression for the photodiode power can be written:

$$P_{in} = V_{D1} \times I_{D1} = V_{D1} \left[ I_{ph} - I_s (e^{V_{D1}/nV_T} - 1) - \frac{V_{D1}}{R_{sh}} \right] \tag{4.2}$$

Setting $\partial P_{in}/\partial V_{D1}=0$ and considering $R_{sh}{\approx}\infty$ results in:

$$I_s e^{V_{D1}/nV_T} \left( 1 + \frac{V_{D1}}{nV_T} \right) = I_{ph} + I_s \tag{4.3}$$

Equation (4.3) can be solved using the Lambert-W function [22] to yield:

$$V_{mpp} = nV_T \times W(a) - nV_T \tag{4.4}$$

and

$$I_{mpp} = I_{ph} + I_s - \frac{I_s \times a}{e^1 \times W(a)} \tag{4.5}$$

where $W(\cdot)$ is the Lambert-W function and $a=e^1\times\left(\frac{I_{ph}+I_s}{I_s}\right)$. Equation (4.5) follows from evaluating the diode current at $V_{mpp}$, since (4.3) gives $I_s e^{V_{mpp}/nV_T} = (I_{ph}+I_s)/W(a)$.

Fig. 34 shows the photodiode output power as given in (4.2) for different photocurrent, $I_{ph}$, levels. The MPPs estimated by equation (4.4) are marked in the figure with dots. A comparison with the MPP value estimated by the analytical expression provided in [63] is also shown.
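As a numerical illustration of (4.3) and (4.4), the sketch below evaluates the Lambert-W function with a small Newton iteration written for this example (a library routine could be used instead) and then obtains the MPP current directly from the diode equation in (4.2) with $R_{sh}\to\infty$. The 300 K temperature and the function names are assumptions made for illustration only.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # electron charge, C


def lambert_w(a, tol=1e-12, max_iter=100):
    """Principal branch W(a) for a > 0, via Newton's method on w*exp(w) = a.

    The start value log(1 + a) is never below the root for a > 0, and
    w*exp(w) is convex there, so the iterates decrease towards the root.
    """
    w = math.log1p(a)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - a) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol * (1.0 + abs(w)):
            break
    return w


def mpp(i_ph, i_s=1e-12, n=1.15, temp_kelvin=300.0):
    """Return (V_mpp, I_mpp) for the R_s = 0, R_sh -> infinity model."""
    n_vt = n * K_B * temp_kelvin / Q_E
    a = math.e * (i_ph + i_s) / i_s
    v_mpp = n_vt * (lambert_w(a) - 1.0)               # equation (4.4)
    i_mpp = i_ph - i_s * math.expm1(v_mpp / n_vt)     # diode equation at V_mpp
    return v_mpp, i_mpp


if __name__ == "__main__":
    v, i = mpp(52.3e-6)
    print("V_mpp =", v, "V, I_mpp =", i, "A, P_in,max =", v * i, "W")
```

For the extracted parameters, the resulting $V_{mpp}$ lies several tens of millivolts below the open-circuit voltage, which is the expected behavior of a diode-limited source.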
Notably, the MPP expression derived above provides a slightly better estimation of the MPP.

Figure 34: Photodiode output power as a function of the photodiode voltage $V_{D1}$ for different values of the photo-generated current. The estimated location of the MPP is marked with dots.

Assuming $R_{sh}=\infty$ for simplicity, the open-circuit voltage of the photodiode can be found to be:

$$V_{oc} = nV_T \ln \left( \frac{I_{ph} + I_s}{I_s} \right) \tag{4.6}$$

# 4.3.3 Energy Harvester Equivalent Circuit

Figures 35 and 36 show the equivalent circuit of the energy harvester when $\phi_1=1$ and when $\phi_1=0$, respectively. The equivalent circuits are employed to model the operation of the inductor-based energy harvester and to guide its design.

Figure 35: Equivalent energy harvesting circuit when $\phi_1=1$.

Figure 36: Equivalent energy harvesting circuit when $\phi_1 = 0$.

In the equivalent circuits, the photodiode has been replaced by the model presented in Fig. 32. The transistors are modeled by their ON resistance $R_{on}$, which is assumed to be equal for both transistors since the transistors' source voltages are either ground or close to ground. The diode $D_s$ is assumed to have zero ON resistance and a voltage drop of $V_{don}$ volts when conducting. Capacitance $C_d$ is the parallel equivalent capacitance of $C_D$ and $C_j$.

Two distinct modes of operation can be distinguished for the energy harvesting circuit, namely continuous conduction mode (CCM) and discontinuous conduction mode (DCM). The circuit operates in CCM if the current through the inductor does not fall to zero throughout a switching period. On the other hand, if the inductor current falls to zero before the end of a switching period, the harvesting circuit is said to operate in DCM. DCM typically occurs when the harvesting circuit operates at light load conditions generating large inductor current variations [29]. The difference between these two modes is shown graphically in Figs. 37 and 38, where the current through the inductor, $I_i(t)$, is depicted.

## 4.3.4 CCM Analysis

Fig. 37 shows a typical steady-state inductor current waveform in CCM. $T_1$ denotes the time for which $\phi_1 = 1$ while $T_2$ denotes the time for which $\phi_1 = 0$.

Figure 37: Inductor current waveform in CCM.

From the circuit in Fig. 36 we can write the following expression for $I_i$ when $\phi_1=0$:

$$I_i(t) = I_i(T_1) - \frac{1}{L} \int_{T_1}^t V_i \, dt \tag{4.7}$$

Replacing $V_i = V_{out} + V_{don}$ in (4.7) and assuming a capacitor $C_L$ large enough to keep $V_{out}$ constant (small-ripple approximation), equation (4.7) can be solved to yield:

$$I_i(t) = I_i(T_1) - \frac{(t - T_1)}{L} \left( V_{out} + V_{don} \right) \tag{4.8}$$

The condition for CCM operation is $I_{i0} > 0$. Given that $I_i(T_1 + T_2) = I_{i0}$, the following expression for $I_{i0}$ can be found:

$$I_{i0} = I_i(T_1) - \frac{T_2}{L} \left( V_{out} + V_{don} \right) \tag{4.9}$$

Equation (4.8) can also be used in DCM analysis provided that $I_{i0} = 0$ before the end of a switching period. In steady state, the load current $I_L$ is equal to the average current through the diode $D_s$. Therefore, we can write:

$$I_L = \frac{1}{T} \int_{T_1}^{T_1 + T_2} I_i(t) \, dt \tag{4.10}$$

where $T = T_1 + T_2$. Replacing (4.8) in (4.10) and solving for $V_{out}$ results in:

$$V_{out} = \frac{2L}{(1-d)T}I_i(T_1) - \frac{2LI_L}{(1-d)^2T} - V_{don} \tag{4.11}$$

where $d = T_1/T$ is the clock's duty cycle. According to (4.11), the output voltage can be increased by increasing the value of the inductance $L$. However, due to the small-package constraints of the target application, $L$ is kept at or below 47 $\mu$H. In principle, $V_{out}$ could also be increased by increasing the duty cycle or the switching frequency. However, changing $d$ and $T$ also affects the value of $I_i(T_1)$.
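The step from (4.10) to (4.11) can be spelled out as follows. During $\phi_1 = 0$ the inductor current is a linear ramp that starts at $I_i(T_1)$ and falls with slope $(V_{out}+V_{don})/L$, so the integral in (4.10) is the area of a trapezoid. Using $T_2 = (1-d)T$:

$$I_L = \frac{1}{T}\left[ I_i(T_1)\,T_2 - \frac{T_2^2}{2L}\left(V_{out}+V_{don}\right) \right] = (1-d)\,I_i(T_1) - \frac{(1-d)^2\,T}{2L}\left(V_{out}+V_{don}\right)$$

Solving this relation for $V_{out}+V_{don}$ and dividing through by $(1-d)^2 T/2L$ gives the two terms on the right-hand side of (4.11).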
In Section 4.3.6, an expression for $I_i(T_1)$ is derived that allows us to consider the overall effects of $d$ and $T$ on $I_i(T_1)$.

# 4.3.5 DCM Analysis

From Fig. 38 we note that $I_i(T_1 + \delta T) = 0$. Replacing this result in (4.8) and solving for $\delta$ yields:

$$\delta = \frac{LI_i(T_1)}{\left(V_{out} + V_{don}\right)T} \tag{4.12}$$

Figure 38: Inductor current waveform in DCM.

The condition for DCM operation is $\delta < (1-d)$. As before, in steady state the load current $I_L$ is equal to the average current through the diode $D_s$. Therefore, we can write:

$$I_L = \frac{1}{T} \int_{T_1}^{T_1 + \delta T} I_i(t) \, dt \tag{4.13}$$

Replacing (4.8) and (4.12) in (4.13) and solving for $V_{out}$ yields:

$$V_{out} = \frac{LI_i^2(T_1)}{2TI_L} - V_{don} \tag{4.14}$$

Notably, in DCM the output voltage has a quadratic dependence on $I_i(T_1)$ and is inversely proportional to the load current $I_L$. An expression for $I_i(T_1)$ will be derived next to facilitate the calculation of $V_{out}$. Once $V_{out}$ is known, both the output power $P_{out} = V_{out} \times I_L$ and the efficiency can be calculated.

# 4.3.6 Inductor Current

From Fig. 35 we can write the following relationships:

$$I_{i} = I_{ph} - I_{d} - C_{d} \frac{dV_{d}}{dt} - \frac{V_{d}}{R_{sh}} \tag{4.15}$$

and

$$\frac{dI_i}{dt} = \frac{V_d}{L} - \frac{2R_{on}}{L}I_i \tag{4.16}$$

The non-linear dependence of $I_d$ on $V_d$ precludes an explicit solution of the equation system above. To circumvent this problem we employ a Taylor series to linearize the current $I_d$ around the steady-state average value of the photodiode voltage $V_d$. Fig. 39 shows the approximate behavior of $V_d$ in steady state. In the figure, $V_{dm} = (V_{d0} + V_{d1})/2$ is the average value of $V_d$.

Figure 39: Inductor current waveform.
Applying a Taylor series, the current $I_d$ can be approximated around $V_{dm}$ as follows:

$$I_d = c_1 + c_2(V_d - V_{dm}) \tag{4.17}$$

where $c_1=I_s(e^{V_{dm}/nV_T}-1)$ and $c_2=(I_s/nV_T)e^{V_{dm}/nV_T}$. From (4.17) we can write:

$$\frac{dI_d}{dt} = \frac{dI_d}{dV_d} \frac{dV_d}{dt} = c_2 \frac{dV_d}{dt} \tag{4.18}$$

Taking the derivative of (4.15) with respect to time yields:

$$\frac{dI_i}{dt} = -\frac{dI_d}{dt} - C_d \frac{d^2 V_d}{dt^2} - \frac{1}{R_{sh}} \frac{dV_d}{dt} \tag{4.19}$$

Replacing (4.18) in (4.19) and equating the result with (4.16) results in a second-order differential equation which can be solved for $V_d$ when $\phi_1 = 1$:

$$V_d(t) = -\frac{K_3}{K_2} + c_4 e^{-t/\tau_1} + c_5 e^{-t/\tau_2} \tag{4.20}$$

where

$$\begin{aligned}
K_{1} &= -\frac{c_{2}}{C_{d}} - \frac{1}{C_{d}R_{sh}} - \frac{2R_{on}}{L} \\
K_{2} &= -\frac{1}{LC_{d}} \left( 1 + 2R_{on}c_{2} + \frac{2R_{on}}{R_{sh}} \right) \\
\tau_{1} &= \frac{-1}{\frac{K_{1}}{2} - \frac{1}{2}\sqrt{K_{1}^{2} + 4K_{2}}} \\
\tau_{2} &= \frac{-1}{\frac{K_{1}}{2} + \frac{1}{2}\sqrt{K_{1}^{2} + 4K_{2}}} \\
K_{3} &= -\frac{2R_{on}}{LC_{d}} \left( c_{1} - I_{ph} - c_{2}V_{dm} \right) \\
c_{4} &= \left( \frac{\tau_{1}}{\tau_{1} - \tau_{2}} \right) \left( V_{d0} + \frac{K_{3}}{K_{2}} \right) \\
c_{5} &= \left( \frac{\tau_{2}}{\tau_{2} - \tau_{1}} \right) \left( V_{d0} + \frac{K_{3}}{K_{2}} \right)
\end{aligned} \tag{4.21}$$

Given that $V_{d1} = V_d(T_1)$ and using (4.20) we can write:

$$V_{d1} = -\frac{K_3}{K_2} + c_4 e^{-T_1/\tau_1} + c_5 e^{-T_1/\tau_2} \tag{4.22}$$

To solve for $V_d$ when $\phi_1=0$, we notice that from Fig. 36 we can write:

$$C_d \frac{dV_d}{dt} = I_{ph} - I_d - \frac{V_d}{R_{sh}} \tag{4.23}$$

Replacing (4.17) in (4.23) results in a first-order differential equation that can be solved to yield:

$$V_d(t) = -\frac{K_5}{K_4} \left( 1 - e^{-t/\tau_3} \right) + V_{d1} e^{-t/\tau_3} \tag{4.24}$$

where

$$\begin{aligned}
K_{4} &= -\frac{c_{2}}{C_{d}} - \frac{1}{C_{d}R_{sh}} \\
K_{5} &= \frac{I_{ph} - c_{1} + c_{2}V_{dm}}{C_{d}} \\
\tau_{3} &= -\frac{1}{K_{4}}
\end{aligned} \tag{4.25}$$

In (4.24), $t$ is measured from the start of the $\phi_1=0$ phase. Given that $V_{d0} = V_d(T_2)$ and using (4.24) we can write:

$$V_{d0} = -\frac{K_5}{K_4} \left( 1 - e^{-T_2/\tau_3} \right) + V_{d1} e^{-T_2/\tau_3} \tag{4.26}$$

The steady-state values of $V_{d0}$ and $V_{d1}$ can be found using the iterative procedure outlined in Listing 4.1 below. In the iterative procedure the variable $k$ represents the iteration number.

- 1) $k=0$
- 2) Set $V_{d0}[k]=V_{d1}[k]=V_{oc}$
- 3) Use (4.22) to find $V_{d1}[k+1]$
- 4) Use (4.26) and $V_{d1}[k+1]$ to find $V_{d0}[k+1]$
- 5) Update $V_{dm} = (V_{d0}[k+1] + V_{d1}[k+1])/2$
- 6) $k = k + 1$
- 7) If $k>200$ end
- 8) Go to 3)

Listing 4.1: Iterative procedure to find steady-state values of $V_{dm}$, $V_{d0}$ and $V_{d1}$

Once the steady-state values of $V_{dm}$, $V_{d0}$ and $V_{d1}$, along with the coefficient values in (4.21), are found, an expression for $I_i$ when $\phi_1 = 1$ can be obtained by replacing (4.17), (4.19) and (4.20) in the equation system (4.15)-(4.16) and solving the resulting differential equation.
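The iterative procedure of Listing 4.1 can be sketched in Python as shown below. This is an illustrative reimplementation rather than the Matlab code used in this work. Several choices are assumptions: a temperature of 300 K, $C_d$ taken as 33 nF in parallel with the 735 pF junction capacitance, a 5 $\mu$s period with a 4.5% duty cycle, and the function name itself. $K_3$ is written as $-(2R_{on}/LC_d)(c_1 - I_{ph} - c_2V_{dm})$, the sign obtained when (4.15) is substituted into (4.16), so that $-K_3/K_2$ is positive. Complex arithmetic is used for $\tau_1$ and $\tau_2$ so that the same code also covers parameter sets where $K_1^2 + 4K_2 < 0$.

```python
import cmath
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # electron charge, C


def steady_state_vd(i_ph=52e-6, i_s=1e-12, n=1.15, r_sh=1e6, c_d=33.735e-9,
                    ind=47e-6, r_on=50.0, period=5e-6, duty=0.045,
                    temp_kelvin=300.0, iterations=200):
    """Iterate (4.22) and (4.26); return (V_dm, V_d0, V_d1, V_oc)."""
    n_vt = n * K_B * temp_kelvin / Q_E
    v_oc = n_vt * math.log((i_ph + i_s) / i_s)        # equation (4.6)
    t1 = duty * period
    t2 = period - t1
    v_d0 = v_d1 = v_dm = v_oc                         # step 2 of the listing
    for _ in range(iterations):
        # Linearise the diode current around V_dm, equation (4.17).
        e = math.exp(v_dm / n_vt)
        c1 = i_s * (e - 1.0)
        c2 = (i_s / n_vt) * e
        # Phase phi1 = 1: second-order response, equations (4.20)-(4.22).
        k1 = -c2 / c_d - 1.0 / (c_d * r_sh) - 2.0 * r_on / ind
        k2 = -(1.0 + 2.0 * r_on * c2 + 2.0 * r_on / r_sh) / (ind * c_d)
        k3 = -(2.0 * r_on / (ind * c_d)) * (c1 - i_ph - c2 * v_dm)
        root = cmath.sqrt(k1 * k1 + 4.0 * k2)
        tau1 = -1.0 / (k1 / 2.0 - root / 2.0)
        tau2 = -1.0 / (k1 / 2.0 + root / 2.0)
        x0 = v_d0 + k3 / k2
        c4 = tau1 / (tau1 - tau2) * x0
        c5 = tau2 / (tau2 - tau1) * x0
        v_d1 = (-k3 / k2 + c4 * cmath.exp(-t1 / tau1)
                + c5 * cmath.exp(-t1 / tau2)).real
        # Phase phi1 = 0: first-order recovery, equations (4.24)-(4.26).
        k4 = -c2 / c_d - 1.0 / (c_d * r_sh)
        k5 = (i_ph - c1 + c2 * v_dm) / c_d
        decay = math.exp(t2 * k4)                     # exp(-T2 / tau3)
        v_d0 = -k5 / k4 * (1.0 - decay) + v_d1 * decay
        v_dm = 0.5 * (v_d0 + v_d1)                    # step 5 of the listing
    return v_dm, v_d0, v_d1, v_oc


if __name__ == "__main__":
    v_dm, v_d0, v_d1, v_oc = steady_state_vd()
    print("V_oc =", v_oc, "V_d0 =", v_d0, "V_d1 =", v_d1, "V_dm =", v_dm)
```

Each pass through the loop corresponds to one switching period evaluated with the diode linearized at the most recent $V_{dm}$, so the fixed number of iterations plays the role of letting the circuit settle.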
The obtained expression for the inductor current is:

$$I_{i}(t) = I_{i0}e^{-t/\tau_{4}} + \frac{c_{6}}{2R_{on}}\left(1 - e^{-t/\tau_{4}}\right) + \frac{c_{4}\tau_{1}}{L - 2R_{on}\tau_{1}}\left(e^{-t/\tau_{4}} - e^{-t/\tau_{1}}\right) + \frac{c_{5}\tau_{2}}{L - 2R_{on}\tau_{2}}\left(e^{-t/\tau_{4}} - e^{-t/\tau_{2}}\right) \tag{4.27}$$

where

$$c_6 = -\frac{K_3}{K_2}$$

$$\tau_4 = \frac{L}{2R_{on}} \tag{4.28}$$

To demonstrate the validity of the model derived above, the inductor current waveform, $I_i(t)$, was calculated using (4.27) and using the Spectre circuit simulator. Fig. 40 shows the inductor current waveforms generated by the model and the circuit simulator for the following parameter values: $I_{ph}=52~\mu\text{A}$, $L=47~\mu\text{H}$, $I_{i0}=0~\text{A}$, $T=8~\mu\text{s}$ and $d=0.5$. The figure shows three cases: $R_{on}=250~\Omega$, $R_{on}=100~\Omega$ and $R_{on}=50~\Omega$. Notably, the derived expression for $I_i(t)$ agrees very well with the output of the Spectre circuit simulator. More importantly, the expression for $I_i(t)$ allows us to explore the design space of the DC-DC converter without having to rely on a high-end circuit simulator. According to equations (4.11) and (4.14), $V_{out}$ is maximized when $I_i(T_1)$ is maximum. In the next section we will use the mathematical model derived above to find the circuit parameter values that maximize the output voltage and consequently the efficiency of the energy harvesting circuit.

## 4.4 Model-Based Design Optimization

From Fig. 40 it is clear that low $R_{on}$ values result in higher peak inductor currents, and correspondingly higher output voltages, as the power losses due to the on-resistance of the switch transistors are reduced. Hence, low $R_{on}$ values are preferred to achieve high conversion efficiencies.
Figure 40: Inductor current $I_i(t)$ waveforms calculated using the derived expression in (4.27) and using the Spectre circuit simulator for three different values of $R_{on}$.

The value of $R_{on}$ for an NMOS transistor is given by:

$$R_{on} = \frac{L}{\mu_n C_{ox} W(V_{GS} - V_{TH})} \tag{4.29}$$

where $\mu_n$ is the electron mobility, $C_{ox}$ is the capacitance per unit area of the gate oxide, $W$ is the width of the transistor's gate, $L$ is the length of the transistor's gate (in this equation only; elsewhere in this chapter $L$ denotes the inductance), $V_{GS}$ is the transistor's steady-state gate-to-source voltage and $V_{TH}$ is the threshold voltage of the transistor. From (4.29), $R_{on}$ decreases as $W$ increases. However, as $W$ is increased, the gate capacitance, $C_g = WLC_{ox}$, also increases, resulting in higher dynamic power losses due to the need to charge and discharge $C_g$ as the transistor is turned on and off. Thus, the transistor width $W$ cannot be made arbitrarily large. The analysis presented in [72] shows that the optimal $W$, at which the combined effect of conduction losses due to $R_{on}$ and dynamic losses due to $C_g$ is minimized, results in extremely wide transistors. Hence, the authors of [72] conclude that, in the interest of saving silicon die area, smaller than optimal transistor widths can be employed. In this work we employ transistors with a gate width of 90 $\mu$m, which results in an on-resistance of approximately 50 $\Omega$.

From equations (4.11) and (4.14), it is clear that a large inductance value maximizes the output voltage of the energy harvesting circuit. However, a large inductor will have a large physical size and might not be suitable for integration with a small sensor unit. Hence, the inductance value is limited to 47 $\mu$H, as this is an inductor that is commercially available in packages as small as $1\times2\times1$ mm<sup>3</sup>.
# 4.4.1 CCM and DCM Operation

Replacing the CCM condition $I_{i0} > 0$ in equation (4.9) results in the following output voltage upper bound:

$$\frac{L}{T_2}I_i(T_1) - V_{don} > V_{out} \tag{4.30}$$

Similarly, replacing the DCM condition $\delta < (1-d)$ in equation (4.12) results in the following output voltage lower bound:

$$V_{out} > \frac{L}{T_2} I_i(T_1) - V_{don} \tag{4.31}$$

Comparing (4.30) and (4.31), it is clear that the energy harvesting circuit provides a higher output voltage if it operates in DCM. Hence, we focus on DCM for the rest of this chapter.

For convenience, actual measurements of the harvester performance are taken using a resistive load instead of a constant current load. Thus, we can write $I_L = V_{out}/R_L$, where $R_L$ is the load resistance and $I_L$ is the load current. Replacing $I_L = V_{out}/R_L$ in (4.14) results in the following quadratic equation:

$$V_{out}^2 + V_{don}V_{out} - \frac{LR_L I_i^2(T_1)}{2T} = 0 \tag{4.32}$$

which can be solved to obtain:

$$V_{out} = \sqrt{\frac{V_{don}^2}{4} + \frac{LR_L I_i^2(T_1)}{2T}} - \frac{V_{don}}{2} \tag{4.33}$$

The expression in (4.33) will be used in the performance evaluation of the proposed energy-harvesting circuit. The power delivered to the load then becomes:

$$P_{out} = \frac{V_{out}^2}{R_L} \tag{4.34}$$

Fig. 41 shows the conversion efficiency of the energy-harvesting circuit as the switching clock period ($T$) and the clock's duty cycle ($d$) are varied. The efficiency was calculated using (4.1) and (4.34). The following parameters were used to generate the plot: $I_{ph} = 50~\mu\text{A}$ (outdoor light conditions), $R_L = 500~\text{k}\Omega$, $V_{don} = 0.3~\text{V}$, $L = 47~\mu\text{H}$, $R_{on} = 50~\Omega$ and $C_d = 33~\text{nF}$. From the figure, efficiencies greater than 60% can be achieved for switching frequencies in the range of 200 kHz to 4 MHz $(0.25~\mu\text{s} \le T \le 5.0~\mu\text{s})$ and duty cycles above 4%.
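Equations (4.33) and (4.34) are straightforward to evaluate once $I_i(T_1)$ is known. The short sketch below does so and also checks the DCM condition $\delta < (1-d)$ from Section 4.3.5 using (4.12). The peak current of 1.7 mA is a hypothetical value chosen only for illustration, and the function names are not from this work; the remaining defaults are the component values quoted in this chapter.

```python
import math


def dcm_output_voltage(i_peak, r_load, ind=47e-6, period=5e-6, v_don=0.3):
    """Positive root of the quadratic (4.32), i.e. equation (4.33)."""
    half = v_don / 2.0
    return math.sqrt(half * half
                     + ind * r_load * i_peak ** 2 / (2.0 * period)) - half


def dcm_output_power(v_out, r_load):
    """Power delivered to a resistive load, equation (4.34)."""
    return v_out ** 2 / r_load


def is_dcm(i_peak, v_out, duty, ind=47e-6, period=5e-6, v_don=0.3):
    """True when delta from (4.12) satisfies delta < (1 - d)."""
    delta = ind * i_peak / ((v_out + v_don) * period)
    return delta < (1.0 - duty)


if __name__ == "__main__":
    I_PEAK = 1.7e-3   # hypothetical I_i(T1), for illustration only
    R_LOAD = 410e3    # effective load resistance in ohms
    v = dcm_output_voltage(I_PEAK, R_LOAD)
    print("V_out =", v, "V")
    print("P_out =", dcm_output_power(v, R_LOAD), "W")
    print("DCM   =", is_dcm(I_PEAK, v, duty=0.045))
```

Because $V_{out}$ grows roughly linearly with $I_i(T_1)$ once the second term under the square root dominates, the duty cycle and period that maximize $I_i(T_1)$ also maximize the output power for a given load.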
Figure 41: Conversion efficiency of the proposed energy-harvesting circuit operating in DCM as a function of duty cycle (d) and clock period (T).

Figures 42 and 43 show the conversion efficiency and output voltage of the energy-harvesting circuit as the duty cycle of the clock signal is varied, for different values of the photogenerated current $I_{ph}$. The load resistance is kept fixed at $R_L=500~{\rm k}\Omega$ and the switching frequency is set to 200 kHz. Notably, as the illumination level changes, the duty cycle needs to be adjusted correspondingly to achieve maximum conversion efficiency. Moreover, from Fig. 43, as the illumination level decreases, the maximum output voltage that can be obtained also decreases.

Figure 42: Conversion efficiency of the energy-harvesting circuit operating in DCM as a function of duty cycle (d) for different photogenerated current levels $I_{ph}$.

Figure 43: Output voltage of the energy-harvesting circuit operating in DCM as a function of duty cycle (d) for different photogenerated current levels $I_{ph}$.

Fig. 44 shows a plot of $V_{dm}$, the average photodiode voltage $(V_d)$ in steady state, as the duty cycle is varied and for the same conditions as Figs. 42 and 43. Fig. 44 also shows (with colored dots) the location of the maximum power point voltage $(V_{mpp})$. Comparing Fig. 44 with Figs. 42 and 43, it can be seen that, as expected, the maximum conversion efficiency is achieved at the duty cycle for which the average voltage across the photodiode equals $V_{mpp}$.

Figure 44: Average photodiode voltage $(V_{dm})$ as a function of duty cycle (d) for different photogenerated current levels $I_{ph}$.

Fig. 45 shows the output voltage of the energy-harvesting circuit for different values of the load resistance $R_L$ as the duty cycle is varied and for a fixed value of $I_{ph}$. As the load resistance decreases, the output voltage decreases. However, the maximum output voltage is achieved for the same duty cycle value regardless of the load resistance value. Hence, to achieve maximum conversion efficiency, the duty cycle needs to be adjusted only in response to changes in the illumination levels.

Figure 45: Output voltage of the energy-harvesting circuit operating in DCM as a function of duty cycle (d) for different load resistances $R_L$.

#### 4.5 Hardware Measurements

To test the concept of energy harvesting using on-chip solar cells, an integrated circuit (IC) was designed and fabricated using a standard 0.5 $\mu$m CMOS fabrication process. The IC includes a rectangular P-diff/N-well photodiode, a capacitor bank and NMOS transistor switches. A photograph of the integrated circuit is shown in Fig. 46. The area of the photodiode is 982 $\mu$m $\times$ 1047 $\mu$m. The fabricated IC is an experimental microchip and includes photodiodes of different geometries for experimental purposes. The switches were covered with a metal layer to prevent light from affecting their behavior. The integrated capacitors are combined in parallel to create a 580 pF capacitor that was employed as the load capacitor $C_L$. The bottom-plate parasitic capacitance, which is normally a source of losses in integrated capacitors, is not of concern here because the bottom plate of the load capacitor must be tied to ground, thus shorting the bottom-plate parasitic capacitance.

Figure 46: Photograph of fabricated CMOS integrated circuit. The integrated circuit contains a P-diff/N-well photodiode, a capacitor bank and NMOS transistor switches. The switches are covered by a metal layer to protect them from light.

## 4.5.1 Test Setup

A printed circuit board (PCB) was designed and fabricated to house the fabricated microchip. The PCB includes a CPLD that is employed to generate a clock signal with variable duty cycle. A test setup was built to collect measurements. Fig. 47 shows a diagram of the test setup.
A white LED was employed as the light source. The light source is connected to a power supply whose current output can be regulated to control the intensity of the light beam. The PCB is mounted in front of the light beam and the output voltage of the energy-harvesting circuit is monitored with an oscilloscope. To measure $I_{ph}$ , the photodiode is disconnected from the energy-harvesting circuit and connected to a source meter (Keithley 2400). The I-V curve of the photodiode is then measured with the source meter. $I_{ph}$ is taken as the photodiode current at 0 V bias. Figure 47: Diagram of the test setup. The light source intensity is controlled with a power supply. The I-V curve of the photodiode is measured with a source meter. The output voltage is measured with an oscilloscope. #### 4.5.2 Measurements The power supply current of the light source was adjusted until $I_{ph}$ reached a value of $50~\mu\mathrm{A}$ to simulate outdoor illumination levels (approx. $60~\mathrm{klux}$ ). The average output voltage of the energy-harvesting circuit was then measured with the oscilloscope. Figure 48: Output voltage of the energy-harvesting circuit as predicted by theoretical model (dotted lines) and from hardware measurements (circles). Fig. 48 shows the output voltage as predicted by (4.33) (shown with dotted lines) and actual voltage measurements (shown with circles) as the duty cycle is varied. An inductor value of $L=47~\mu\mathrm{H}$ and a capacitance $C_D$ of 33 nF were employed. Three different values of resistor were used for the load: 198 k $\Omega$ , 330 k $\Omega$ and 990 k $\Omega$ . Given that the oscilloscope probe has a finite impedance (approximately 700 k $\Omega$ ) the effective load seen by the energy-harvesting circuit is lower than the resistors employed. Taking into account the probe impedance, the effective load resistance values ( $R_L$ ) are 154 k $\Omega$ , 224 k $\Omega$ and 410 k $\Omega$ . 
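The effective load values follow from placing each load resistor in parallel with the probe impedance. A minimal check of that arithmetic, using the approximate 700 k$\Omega$ probe value quoted above (the helper name is only for illustration):

```python
def parallel(r_a, r_b):
    """Equivalent resistance of two resistors connected in parallel."""
    return r_a * r_b / (r_a + r_b)


PROBE_OHMS = 700e3  # approximate oscilloscope probe impedance from the text

if __name__ == "__main__":
    for r_nominal in (198e3, 330e3, 990e3):
        r_eff = parallel(r_nominal, PROBE_OHMS)
        print(f"{r_nominal / 1e3:.0f} kOhm load -> {r_eff / 1e3:.0f} kOhm effective")
```

Rounded to the nearest kilohm, the three results reproduce the 154 k$\Omega$, 224 k$\Omega$ and 410 k$\Omega$ values listed above.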
The switching frequency was set to 200 kHz ($T=5~\mu\mathrm{s}$). Notably, the output voltage predicted by the theoretical model agrees very well with the measurements. The maximum output voltage obtained for $R_L=410~\mathrm{k}\Omega$ is 2.24 V for a duty cycle of 4.5%, which corresponds to a power delivered to the load of 12.24 $\mu$W and a conversion efficiency of 59%. The maximum output voltage obtained with only the oscilloscope probe loading the circuit is 2.81 V. Fig. 49 shows an oscilloscope capture of the corresponding output voltage. The output voltage ripple is due to the switching activity and can be reduced by employing a larger load capacitance $C_L$.

Figure 49: Voltage output of energy-harvesting circuit captured by an oscilloscope ($R_L = 410 \text{ k}\Omega$).

Higher efficiencies can be achieved if the switching frequency is increased, as predicted by Fig. 41. However, in our experiments, for switching frequencies beyond 250 kHz, the conversion efficiency started to decrease. The lower efficiencies at higher switching frequencies are due to the combined effects of parasitic capacitive and inductive elements of the PCB and the IC package and the limited self-resonance frequency (SRF) of the inductor employed. The efficiency, however, can be improved by co-packaging the IC and the inductor to minimize parasitic effects.

The proposed energy-harvesting circuit requires two off-chip components ($C_D$ and $L$). Due to their relatively large values (by integrated circuit standards), these components cannot be integrated on chip. However, they can be co-packaged with the final IC, as state-of-the-art off-chip capacitors and inductors of the required values are on the order of $1 \times 2 \times 1$ mm<sup>3</sup>. An external Schottky diode was employed in our experiments because the IC fabrication process that was used did not have a Schottky diode option. However, many CMOS fabrication processes support Schottky diodes.
Alternatively, at the expense of more silicon area, a self-synchronized MOS switch [40] can be employed instead of the Schottky diode.

#### CHAPTER 5

# MULTIRESOLUTION DECOMPOSITION FOR LOSSLESS AND NEAR-LOSSLESS COMPRESSION

Radio communication is a power-hungry operation. Data compression can be implemented to reduce the power consumption. Two approaches are presented in this dissertation: multiresolution decomposition image compression and compressive sensing conversion. The multiresolution decomposition lossless image compression approach is suitable for focal plane integration, while near-lossless compression can achieve higher compression ratios. Compressive sensing is discussed in Chapter 6.

A feature that precludes the direct focal plane implementation of modern lossless image compression algorithms is that they require significant memory storage. For instance, CALIC [76] requires storing two previous rows of the image as the encoding proceeds. Moreover, adaptive context modeling is used to improve the prediction value, but this operation requires a significant hardware overhead. In order to keep the implementation simple yet still be able to provide compression, we implement two basic image compression components: decorrelation via prediction and entropy encoding of prediction residuals. Furthermore, we employ multiresolution decomposition to complement the prediction process. By employing multiresolution decomposition we avoid the need to store previous rows of the image.

Multiresolution decomposition is an image processing technique in which the resolution of an image is progressively increased or decreased. In [64] multiresolution representation using the S+P transform is employed to form a hierarchical pyramid for lossless compression. However, building the hierarchical pyramid requires significant memory storage.
We employ multiresolution decomposition to generate an image of lower resolution that serves as the prediction of pixel values in the higher resolution image. The basic concept of this approach is depicted in Fig. 50. The algorithm can work with one or more levels of decomposition. One level and three levels of decomposition are described in the following sections.

Figure 50: Image compression approach based on multiresolution decomposition.

## 5.1 One Level of Decomposition

Fig. 50 shows one level of decomposition. It takes the original image $H^0$ of size $N \times N$ pixels (with $N$ even) and generates image $H^1$ of size $\frac{N}{2} \times \frac{N}{2}$. The pixel value at location $(r,c)$ in $H^1$ is computed as follows:

$$H_{r,c}^{1} = \frac{1}{4} \left( H_{2r,2c}^{0} + H_{2r+1,2c}^{0} + H_{2r,2c+1}^{0} + H_{2r+1,2c+1}^{0} \right) \tag{5.1}$$

where $r=0,1,2,\ldots,N/2-1$ and $c=0,1,2,\ldots,N/2-1$. In other words, each pixel in $H^1$ is the average of the four pixels in the spatially equivalent $2\times 2$ block in $H^0$. The image $H^1$ has one quarter as many pixels as $H^0$. If $H^0$ is read out using a Morton-Z scan, the $2\times 2$ block average can be computed as the image is read out and there is no need to store the previous image row. The low resolution image $H^1$ is employed as a predictor of the higher-resolution image $H^0$. A prediction residual image $R^1$ of size $N\times N$ is computed in the following manner:

$$R_{r,c}^1 = H_{r,c}^0 - H_{u,v}^1 \tag{5.2}$$

where $u=\lfloor r/2\rfloor$, $v=\lfloor c/2\rfloor$ and $\lfloor \cdot \rfloor$ is the greatest integer function. For smooth areas in an image, a pixel value will be very close to the average of itself and its closest neighbors. Thus, the prediction residual in those regions will be zero or very close to zero. For regions in the image with edges or texture, the absolute value of the prediction residual will increase.
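A software sketch of (5.1) and (5.2) may help make the indexing concrete. It operates on a plain list-of-lists image and is only an illustration of the arithmetic, not of the mixed-signal hardware described in this dissertation; the function names and the sample image are hypothetical.

```python
def decompose(h0):
    """Return (h1, r1): 2x2 block averages (5.1) and residuals (5.2)."""
    n = len(h0)
    assert n % 2 == 0 and all(len(row) == n for row in h0)
    half = n // 2
    h1 = [[(h0[2 * r][2 * c] + h0[2 * r + 1][2 * c]
            + h0[2 * r][2 * c + 1] + h0[2 * r + 1][2 * c + 1]) / 4.0
           for c in range(half)]
          for r in range(half)]
    r1 = [[h0[r][c] - h1[r // 2][c // 2] for c in range(n)]
          for r in range(n)]
    return h1, r1


def reconstruct(h1, r1):
    """Invert (5.2): add each residual to the average of its 2x2 block."""
    n = len(r1)
    return [[h1[r // 2][c // 2] + r1[r][c] for c in range(n)]
            for r in range(n)]


if __name__ == "__main__":
    image = [[10, 10, 50, 52],
             [10, 10, 54, 56],
             [7, 9, 0, 0],
             [11, 13, 0, 4]]
    low_res, residual = decompose(image)
    print(low_res)
    print(residual)
    print(reconstruct(low_res, residual) == image)
```

In the flat top-left block of the sample image all four residuals are zero, while the textured blocks produce small signed residuals. The four residuals of any block sum to zero because the block average is subtracted from each of its members.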
For most natural images, the distribution of the prediction residuals peaks at zero and falls exponentially as the residual absolute value increases. Prediction residuals of natural images often follow a Laplacian distribution. We take advantage of this type of distribution by using an entropy encoder to encode the residual images and achieve compression. An entropy encoder assigns short codewords to frequent residual values and longer codewords to less-frequent residual values. The image $H^1$ is compressed using a simple differential encoding scheme where the difference between a pixel and its neighbor to the left is encoded. Following this simple differential scheme, there is no need to store an entire image row to encode $H^1$. The compressed representation of the original image $H^0$ is given by the differentially encoded $H^1$, followed by the entropy-encoded residual image $R^1$. Note that we only need to transmit 3 out of every 4 pixels in the residual image $R^1$ because the 4-pixel averages are already available to the decoder from the previous level of decomposition.

# **5.2** Mixed-Signal Implementation Analysis

The compression approach described above is implemented in hardware employing a mixed-signal circuit to save silicon area. The multiresolution decomposition step is performed in the analog domain to take advantage of simple analog solutions, and the residuals are then converted to digital by an ADC for further encoding. The encoding step is performed in the digital domain due to the inherently digital nature of entropy encoders. Analog circuits will introduce noise, which might limit the compression performance. In this section we analyze the noise performance of the mixed-signal approach and compare it with a fully-digital approach. In the fully-digital approach, the analog pixel values in $H^0$ are converted to digital by an ADC, and all the subsequent operations are then performed in the digital domain.
The fully-digital approach is the approach taken by lossless compression standards.

Figure 51 shows a model of the decomposition and prediction operations. The model assumes a one-dimensional input, which is the result of reading the image using a Morton-Z scan. The transfer function $P(z)=\frac{1}{4}\sum_{k=0}^3 z^{-k}$ computes the average of four consecutive pixels (a $2\times 2$ block). The $z^{-4}$ block delays the original pixel sequence by four clock periods so that the correct prediction residuals can be computed. Hence, we can write

$$H^{1}[l] = \frac{1}{4} \sum_{k=0}^{3} H^{0}[4l+k] \tag{5.3}$$

for the lower-resolution image and

$$R^{1}[n] = H^{0}[n] - H^{1}[l] \tag{5.4}$$

for the ideal or noiseless prediction residuals, where $l=\lfloor n/4 \rfloor$ and $n=0,1,\dots,N^2-1$. The sign convention in (5.4) matches (5.2): the residual is the original pixel minus its prediction.

Figure 51: Simplified model of a single multiresolution decomposition stage.

Figure 52 expands the previous model by explicitly adding the main noise sources for both the mixed-signal and the fully-digital implementations. Our analog implementation employs switched-capacitor circuits. Hence, the dominant source of analog noise is due to charge injection and clock feedthrough. The analog noise due to the prediction block $P$ is modeled with the random variable $n_p$, while the random variable $n_d$ models the analog noise introduced by the delay circuit. The quantization noise due to the ADCs $Q_1$, $Q_2$ and $Q_3$ is modeled with the random variables $n_{qh}$, $n_{qr}$ and $n_q$, respectively. In the fully-digital implementation, the prediction values are rounded to the closest integer to keep the bit resolution of $H^1$ equal to the bit resolution of $H^0$. This rounding operation is modeled with a quantizer with quantization noise represented by random variable $n_{qp}$.

Figure 52: Model incorporating noise sources for (a) the mixed-signal implementation; (b) fully-digital implementation.
The original image is recovered in the following manner:

$$\widetilde{H}^0 = \widetilde{H}^1 + \widetilde{R}^1 \quad \text{(mixed-signal)} \tag{5.5}$$

$$\overline{H}^0 = \overline{H}^1 + \overline{R}^1 \quad \text{(fully-digital)} \tag{5.6}$$

Using Fig. 52 we can write the following expressions for the mixed-signal implementation:

$$\widetilde{H}^1 = P(H^0) + n_p + n_{qh} \tag{5.7}$$

$$\widetilde{R}^1 = z^{-4}H^0 - P(H^0) + n_d - n_p + n_{qr} - n_{qh} \tag{5.8}$$

where $P(H^0) = H^1[l] = \frac{1}{4} \sum_{k=0}^3 H^0[4l+k]$ is the ideal or noiseless prediction value. For the fully-digital implementation we have:

$$\overline{H}^1 = P(H^0) + P(n_q) + n_{qp} \tag{5.9}$$

$$\overline{R}^1 = z^{-4}H^0 - P(H^0) + z^{-4}n_q - P(n_q) - n_{qp} \tag{5.10}$$

Substituting (5.7) and (5.8) into (5.5) yields:

$$\widetilde{H}^0 = z^{-4}H^0 + n_d + n_{qr} \tag{5.11}$$

Thus, the error introduced by the mixed-signal approach is

$$\widetilde{e} = n_d + n_{qr} \tag{5.12}$$

Similarly, substituting (5.9) and (5.10) into (5.6) yields:

$$\overline{H}^0 = z^{-4}H^0 + z^{-4}n_q \tag{5.13}$$

Assuming stationarity for the quantization noise, the error introduced by the fully-digital approach is $\overline{e} = n_q$. For lossless compression we must have $|\widetilde{e}| \leq |\overline{e}|$ or, equivalently, $|n_d + n_{qr}| \leq |n_q|$. Therefore, the noise of the delay circuit and the quantization noise of $Q_2$ have to be minimized. Fortunately, for most images, the variance of the prediction residuals is much smaller than the variance of the original image. Thus, the input range of $Q_2$ can be matched to the smaller range of the residuals, decreasing the quantization noise $n_{qr}$. The noise $n_d$ can be reduced with proper circuit design techniques. If $|\widetilde{e}| > |\overline{e}|$ and $|\widetilde{e} - \overline{e}| < \delta$, then the system is equivalent to near-lossless compression with maximum reconstruction error $\pm \delta$.
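The cancellation of the prediction noise and of the $Q_1$ quantization noise can be checked numerically. The sketch below is only a sanity check, not part of the original analysis: it plugs random numbers into (5.7) to (5.10), treats each noise term (including the filtered term $P(n_q)$) as an independent scalar, and uses hypothetical function names.

```python
import random

def mixed_signal_error(h0_delayed, pred, n_p, n_d, n_qh, n_qr):
    h1 = pred + n_p + n_qh                               # (5.7)
    r1 = h0_delayed - pred + n_d - n_p + n_qr - n_qh     # (5.8)
    return (h1 + r1) - h0_delayed                        # (5.5) minus the ideal pixel

def fully_digital_error(h0_delayed, pred, n_q_delayed, p_n_q, n_qp):
    h1 = pred + p_n_q + n_qp                             # (5.9)
    r1 = h0_delayed - pred + n_q_delayed - p_n_q - n_qp  # (5.10)
    return (h1 + r1) - h0_delayed                        # (5.6) minus the ideal pixel

rng = random.Random(0)   # fixed seed so the check is repeatable
worst_mismatch = 0.0
for _ in range(1000):
    h0, pred = rng.uniform(0, 255), rng.uniform(0, 255)
    n_p, n_d, n_qh, n_qr, n_q, p_n_q, n_qp = (rng.uniform(-0.5, 0.5) for _ in range(7))
    e_mixed = mixed_signal_error(h0, pred, n_p, n_d, n_qh, n_qr)
    e_digital = fully_digital_error(h0, pred, n_q, p_n_q, n_qp)
    worst_mismatch = max(worst_mismatch,
                         abs(e_mixed - (n_d + n_qr)),   # expected mixed-signal error
                         abs(e_digital - n_q))          # expected fully-digital error
print(worst_mismatch)   # floating-point round-off only
```

Whatever values $n_p$, $n_{qh}$ and $n_{qp}$ take, they do not appear in the reconstruction error, which is why only the delay circuit and $Q_2$ need careful design.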
If the application allows larger values of $\delta$, then higher compression ratios can be obtained by reducing the bit resolution of quantizer $Q_2$. Note that $Q_1$ and $Q_2$ can be a single ADC with multiplexed inputs.

Figure 53: Image compression approach based on multiresolution decomposition.

## 5.3 Three Levels of Decomposition

Fig. 53 shows three levels of decomposition. The first level starts with the original image $H_0$ of size $N \times N$ and generates image $H_1$ of size $\frac{N}{2} \times \frac{N}{2}$. The second level of decomposition takes $H_1$ as its input and generates image $H_2$ of size $\frac{N}{4} \times \frac{N}{4}$. Finally, level three takes image $H_2$ and produces image $H_3$ of size $\frac{N}{8} \times \frac{N}{8}$. The image $H_i$ at the $i^{th}$ level of decomposition is generated in the following manner:

$$H_i(r_i, c_i) = \frac{1}{4} \left( H_{i-1}(2r_i, 2c_i) + H_{i-1}(2r_i + 1, 2c_i) + H_{i-1}(2r_i, 2c_i + 1) + H_{i-1}(2r_i + 1, 2c_i + 1) \right) \tag{5.14}$$

where $i=1,2,3$, $0 \le r_i < N/2^i$ and $0 \le c_i < N/2^i$. In other words, each pixel in $H_i$ is the average of the pixels in the spatially equivalent $2 \times 2$ block in $H_{i-1}$. The low-resolution images are used as predictions of the higher-resolution images. At each level, a prediction residual image $R_i$ of size $N/2^{i-1} \times N/2^{i-1}$ is computed as follows:

$$R_i(r_i', c_i') = H_{i-1}(r_i', c_i') - H_i\left(\left\lfloor \frac{r_i'}{2} \right\rfloor, \left\lfloor \frac{c_i'}{2} \right\rfloor\right) \tag{5.15}$$

where $0 \le r_i' < N/2^{i-1}$, $0 \le c_i' < N/2^{i-1}$ and $\lfloor \cdot \rfloor$ is the floor (greatest integer) function. For smooth areas in an image, a pixel value will be very close to the average of itself and its three immediate neighbors. Thus, the prediction residual in those regions will be zero or close to zero.
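Equations (5.14) and (5.15) translate directly into code. The following is a minimal software sketch, not the circuit implementation: the function names and the synthetic $8 \times 8$ image are hypothetical, and averages are kept as exact fractions of the pixel values (no rounding), so the reconstruction is exact.

```python
def decompose(h_prev):
    """One level: 2x2 block averages (5.14) and prediction residuals (5.15)."""
    size = len(h_prev)
    half = size // 2
    h = [[(h_prev[2 * r][2 * c] + h_prev[2 * r + 1][2 * c]
           + h_prev[2 * r][2 * c + 1] + h_prev[2 * r + 1][2 * c + 1]) / 4
          for c in range(half)] for r in range(half)]
    res = [[h_prev[r][c] - h[r // 2][c // 2] for c in range(size)]
           for r in range(size)]
    return h, res


def reconstruct(h, res):
    """Invert one level: add each residual to the average of its 2x2 block."""
    size = len(res)
    return [[h[r // 2][c // 2] + res[r][c] for c in range(size)]
            for r in range(size)]


# Hypothetical 8 x 8 image (N = 8), so that H_3 is a single pixel.
h0 = [[(7 * r + 3 * c) % 32 + 100 for c in range(8)] for r in range(8)]

pyramid, residuals = h0, []
for _ in range(3):
    pyramid, res = decompose(pyramid)
    residuals.append(res)            # R_1, R_2, R_3 in that order
h3 = pyramid

recovered = h3
for res in reversed(residuals):      # decoder order: R_3, then R_2, then R_1
    recovered = reconstruct(recovered, res)
print(recovered == h0)
```

The four residuals of every $2 \times 2$ block sum to zero by construction, so any one of them can be recomputed from the other three.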
For most natural images the distribution of the prediction residuals peaks at zero and falls off exponentially as the absolute value of the residual increases. We take advantage of such distributions by using an entropy encoder to encode the residual images and achieve compression. The image at the third level of decomposition, $H_3$, is not compressed. Instead, it is encoded using a straight binary code with 8 bits per pixel. Since the size of $H_3$ is 1/64 the size of $H_0$, $H_3$ contributes an effective rate of only 1/8 bpp relative to the original image. The compressed representation of the original image $H_0$ is given by $H_3$ followed by the entropy-encoded residual images $R_3$, $R_2$ and $R_1$ (in that order). Note that we only need to transmit 3 out of every 4 pixels in the residual images because the $2\times2$ block averages are already available to the decoder from the previous level of decomposition.

# 5.4 Multiresolution Decomposition Imager Architecture

Figure 54: Architecture diagram of the imager.

Figure 54 shows the architecture of an imager with the proposed focal-plane compression approach. The imager consists of an array of active pixels (APS), addressing circuitry, a multiplexer, a correlated double sampling (CDS) unit to minimize fixed pattern noise (FPN), and a multiresolution decomposition circuit. The pixels are read out following a Morton-Z scan, which allows pixels in the same $2 \times 2$ block to be read one after the other and simplifies the design of the averaging circuit. An imager with one level of decomposition has been designed and fabricated in a 0.5 $\mu$m CMOS process. The array size of the imager is $32 \times 32$ pixels.

# 5.4.1 CDS Unit

Figure 55: Schematic diagram of the CDS unit.

Figure 55 shows the schematic diagram of the CDS unit. It is based on a switched-capacitor architecture. The inverting amplifier $A_1$ is a high-gain cascode common-source amplifier. The CDS unit works with a non-overlapping clock with phases $\phi_1$ and $\phi_2$.
During phase $\phi_1$, the voltage of the corresponding pixel bus is sampled on capacitor $C_1$ and capacitor $C_2$ is reset. At the end of phase $\phi_1$, capacitor $C_1$ is charged to $V_{ph} - V_{off}$ and the voltage across $C_2$ becomes $V_{ref} - V_{off}$, where $V_{ph}$ is the output voltage of the pixel after the integration period and $V_{off}$ is the offset voltage of amplifier $A_1$. During phase $\phi_2$, the pixel is reset and its reset voltage, $V_{rst}$, appears at the corresponding column bus. Thus, at the end of $\phi_2$ the charge stored in capacitor $C_1$ becomes $C_1(V_{rst} - V_{off})$. After charge recombination, the output of the CDS unit becomes:

$$V_{CDS} = V_{off} + \frac{Q_{C_2}}{C_2} = V_{ref} - V_{pix} \tag{5.16}$$

where $V_{ref}$ is a reference voltage, and $V_{pix} = V_{rst} - V_{ph}$ is the pixel value without FPN. Note that the amplifier's offset voltage is also canceled out during this process.

# 5.4.2 Multiresolution Decomposition Circuit

Figure 56: Schematic diagram of the switched-capacitor circuit for multiresolution decomposition.

A single level of multiresolution decomposition is performed by the switched-capacitor circuit shown in Fig. 56. The decomposition circuit is composed of an integrator and a memory circuit. The integrator is formed by amplifier $A_1$, capacitors $C_S$ and $C_C$, and the corresponding switches; the memory circuit is formed by amplifier $A_2$, capacitors $C_1$ to $C_4$, and their corresponding switches. The sizes of the capacitors are $C_S = C_C/4 = C = 0.5 \, \mathrm{pF}$ and $C_1 = C_2 = C_3 = C_4 = C_f = C = 0.5 \, \mathrm{pF}$. The integrator computes the average of four consecutive pixels. In the meantime, the memory circuit stores the same four pixel values until their average has been computed. The residual values are obtained by reading the voltage difference $V_{oi} - V_{om}$. The decomposition circuit works in three phases: reset phase, integration phase and read-out phase.
Figure 57 shows the timing diagram during these three phases.

Figure 57: Timing diagram of the multiresolution decomposition circuit.

In the reset phase, the integrator and memory circuits are reset by setting high the clock signals $\phi_R$, $\phi_R'$, $\phi_1$ and $\phi_1'$. During reset, the voltage across capacitor $C_S$ becomes

$$V_{C_S} = V_{ref} + V_{off1} - V_{CDS}[k] \tag{5.17}$$

where $V_{off1}$ is the offset of amplifier $A_1$ and $V_{CDS}[k] = V_{ref} - V_{pix}[k]$ is the output voltage of the CDS unit corresponding to the $k^{th}$ pixel in a given $2\times 2$ block ($k=1,2,3,4$). In the reset phase the voltage across capacitor $C_f$ of the memory circuit becomes

$$V_{C_f} = V_{off2} \tag{5.18}$$

where $V_{off2}$ is the offset voltage of amplifier $A_2$.

In the integration phase and during clock phase $\phi_2$, the voltage across capacitor $C_S$ is forced to $V_{off1}$, charging capacitor $C_C$ to

$$Q_{C_C} = 4C \cdot V_{off1} + C(V_{ref} - V_{CDS}[k]) = 4C \cdot V_{off1} + C \cdot V_{pix}[k] \tag{5.19}$$

This process is repeated for three more clock cycles and pixels. At the end of the integration phase the accumulated charge in capacitor $C_C$ is

$$Q_{C_C} = 4C \cdot V_{off1} + C \cdot \sum_{k=1}^{4} V_{pix}[k] \tag{5.20}$$

which yields a voltage at the integrator's output given by

$$V_{oi} = V_{ref} - \frac{1}{4} \sum_{k=1}^{4} V_{pix}[k] \tag{5.21}$$

Thus, the integrator circuit computes the average while eliminating the offset voltage of $A_1$. Simultaneously, the memory circuit stores the pixel voltage values in capacitors $C_1$ to $C_4$. The storage is accomplished using the two non-overlapping clock phases $\phi_1'$ and $\phi_2'$ and switches $S_1$ to $S_4$. During clock phase $\phi_1'$, capacitor $C_f$ is reset to voltage $-V_{off2}$ and capacitors $C_1$ to $C_4$ are successively charged to voltage $V_{CDS}[k] - V_{off2}$.
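The charge bookkeeping of the integrator can be followed with a small behavioral model. This is an idealized sketch and not a circuit simulation: the reference voltage, the offset value and the pixel voltages are made-up numbers, the function names are hypothetical, the memory output is simply taken to be the stored $V_{CDS}[k]$, and the way the offset is removed at the integrator output is modeled directly from (5.20) and (5.21) rather than from the schematic.

```python
V_REF = 2.5      # assumed reference voltage in volts (not specified in the text)
C = 0.5e-12      # unit capacitance C = 0.5 pF, as stated for the circuit

def cds_output(v_rst, v_ph):
    """Ideal CDS output (5.16): V_ref - V_pix, with V_pix = V_rst - V_ph."""
    return V_REF - (v_rst - v_ph)

def integrator_output(v_cds_block, v_off1):
    """Accumulate charge on C_C = 4C as in (5.19)-(5.20); return V_oi of (5.21)."""
    c_c = 4 * C
    q = c_c * v_off1                    # offset charge present after reset
    for v_cds in v_cds_block:
        q += C * (V_REF - v_cds)        # each pixel adds C * V_pix[k]
    return V_REF - (q / c_c - v_off1)   # the offset contribution subtracts out

def residuals(v_cds_block, v_off1=0.01):
    """Differential read-out V_oi - V_om[k], with V_om[k] equal to the stored V_CDS[k]."""
    v_oi = integrator_output(v_cds_block, v_off1)
    return [v_oi - v_om for v_om in v_cds_block]

# Four pixels of one 2 x 2 block: V_pix = 1.0, 0.8, 0.6 and 0.4 V (average 0.7 V).
block = [cds_output(3.0, v_ph) for v_ph in (2.0, 2.2, 2.4, 2.6)]
print(residuals(block))   # each entry is V_pix[k] minus the block average
```

Choosing $C_C = 4C$ is what turns the accumulated sum into an average: four unit charges are divided by a capacitance four times larger.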
In the read-out phase, the integrator keeps its output voltage steady while the memory circuit output cycles through the pixel values stored in capacitors $C_1$ to $C_4$. To accomplish this, switches $S_1'$ to $S_4'$ and the two non-overlapping clock phases $\phi_1'$ and $\phi_2'$ are employed. During successive $\phi_1'$ phases, capacitor $C_f$ is reset to voltage $-V_{off2}$ and capacitors $C_1$ to $C_4$ are forced to discharge one at a time, injecting a charge of $C \cdot V_{CDS}[k]$ into $C_f$. The charge in $C_f$ at the end of each $\phi_2'$ phase becomes

$$Q_{C_f} = -C \cdot V_{off2} + C \cdot V_{CDS}[k] \tag{5.22}$$

Thus, during phase $\phi'_2$ the output voltage of the memory circuit, $V_{om}$, is given by

$$V_{om} = V_{off2} + \frac{Q_{C_f}}{C_f} = V_{CDS}[k] = V_{ref} - V_{pix}[k] \tag{5.23}$$

Note that the offset voltage $V_{off2}$ is canceled out in the process. The voltage difference $V_{oi} - V_{om}$ is equal to the residual voltage value

$$V_R[k] = V_{pix}[k] - \frac{1}{4} \sum_{i=1}^{4} V_{pix}[i] \tag{5.24}$$

Hence, the proposed circuit can be used to compute the residual image for a given level of decomposition.

### **5.5** Multiresolution Decomposition Numerical Simulation Results

This section presents numerical simulation results of the proposed compression algorithm. Figure 58 shows the intermediate images of a three-level decomposition of the *lena* image. From the figure, it can be seen that the residual images, $R_i$, capture the edges or high-frequency components of the image while the $H_i$ images capture the low-pass components.

Figure 58: Lena image and its three levels of decomposition.

Figure 59 shows the histograms of the original *lena* image and the three prediction residual images. The histograms show a very narrow distribution for the residual images, which can be exploited by an entropy encoder to achieve compression.
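How much an entropy encoder can gain from a narrow histogram is commonly estimated with the first-order entropy of the values. The sketch below computes that entropy and then adds up a coded size in the way the scheme is described: $H_3$ stored raw at 8 bits per pixel, plus three out of every four residuals at each level at the entropy rate. The function names are hypothetical, and the image size $N = 256$ is an assumption that the text does not state; it is used here because it reproduces the bit totals listed in Table 7.

```python
import math
from collections import Counter

def first_order_entropy(values):
    """First-order entropy in bits per symbol of a sequence of values."""
    counts = Counter(values)
    total = len(values)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def lossless_bits(n, residual_entropies, top_bits=8):
    """Bits for the raw top-level image plus 3 of every 4 residuals per level."""
    levels = len(residual_entropies)
    bits = top_bits * (n // 2 ** levels) ** 2
    for i, entropy in enumerate(residual_entropies, start=1):
        side = n // 2 ** (i - 1)               # R_i has N / 2^(i-1) pixels per side
        bits += 0.75 * side * side * entropy   # one residual per block is redundant
    return bits

print(first_order_entropy([0, 0, 0, 0, 1, -1, 0, 0]))   # peaked: about 1.06 bits
print(first_order_entropy(list(range(8))))              # flat: 3 bits

# Entropies reported for the lena image (1st, 2nd, 3rd level), assuming N = 256.
total = lossless_bits(256, [4.5958, 5.0339, 5.6209])
print(round(total), round(8 * 256 * 256 / total, 3))
```

Under that size assumption the last line gives 313209 bits and a compression ratio of 1.674, matching the *lena* row of Table 7.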
The histograms become slightly broader as the level of decomposition increases, suggesting that higher compression can be achieved at the first levels.

Figure 59: Histograms of the: (a) original image; (b) first-level residual; (c) second-level residual; (d) third-level residual.

Table 7 shows the compression performance of the proposed approach in a lossless scenario for different test images. In the lossless scenario, the original image $H_0$ is recovered exactly from $H_3$, $R_3$, $R_2$ and $R_1$. We have used the first-order entropy as a measure of the average number of bits that an entropy encoder will need to encode each pixel in the residual images. The compression ratios (CR) are within the expected range for lossless compression standards [46]. Higher compression ratios can be achieved in a lossy scenario. For lossy compression the residuals are quantized and entropy encoded.

Table 7: Compression Performance of a Three-Level Decomposition

| Test image | Entropy of residual, 1st level | Entropy of residual, 2nd level | Entropy of residual, 3rd level | Total # of bits | CR |
|----------|-----------|-----------|-----------|-----------|-------|
| mri | 4.7696 | 5.1478 | 5.4049 | 322487 | 1.625 |
| man | 5.3471 | 5.6523 | 5.9541 | 358759 | 1.461 |
| camera | 4.9044 | 4.9542 | 5.0433 | 325623 | 1.610 |
| einstein | 4.8275 | 5.0408 | 5.2650 | 323586 | 1.620 |
| lena | 4.5958 | 5.0339 | 5.6209 | 313209 | 1.674 |

Table 8 shows the compression performance in a lossy scenario. The lossy compression of the proposed approach is realized by quantizing the residuals $R_3$, $R_2$ and $R_1$ in the following manner:

$$Q_i = (2\delta_i + 1) \left\lfloor \frac{R_i + \delta_i}{2\delta_i + 1} \right\rfloor \tag{5.25}$$

where $i=1,2,3$, and $2\delta_i+1$ is the quantization step. Competitive results are achieved by using higher values for $\delta_1$ and lower values for $\delta_3$.
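The quantizer of (5.25) is a uniform quantizer with step $2\delta_i + 1$ whose reconstruction error never exceeds $\delta_i$ for integer residuals. A minimal sketch, assuming integer-valued residuals and using a hypothetical function name:

```python
def quantize_residual(r, delta):
    """Quantize an integer residual as in (5.25); the step is 2*delta + 1."""
    step = 2 * delta + 1
    return step * ((r + delta) // step)   # // is floor division, also for negatives

sample = [-7, -3, -1, 0, 0, 1, 2, 6]
print([quantize_residual(r, 2) for r in sample])
# -> [-5, -5, 0, 0, 0, 0, 0, 5]

# The error bound |R - Q| <= delta holds over the whole 8-bit residual range.
worst = max(abs(r - quantize_residual(r, d))
            for d in (1, 2, 4, 8, 16) for r in range(-255, 256))
print(worst)   # -> 16, the largest delta tried
```

Quantization collapses many residual values onto a few multiples of the step, which narrows the histogram further and lowers the entropy that the encoder has to spend.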
Although the JPEG results are slightly better than those of the proposed approach, the circuit architecture of the proposed approach is much simpler.

Table 8: Comparison with JPEG in Lossy Scenario. Column groups are labeled by $(\delta_1, \delta_2, \delta_3)$; PSNR values are in dB.

| Test image | (2, 4, 8) bpp | (2, 4, 8) PSNR | (8, 4, 2) bpp | (8, 4, 2) PSNR | (4, 8, 16) bpp | (4, 8, 16) PSNR | (16, 8, 4) bpp | (16, 8, 4) PSNR | (8, 8, 8) bpp | (8, 8, 8) PSNR | JPEG bpp | JPEG PSNR |
|-----------|------|-------|------|-------|------|-------|------|-------|------|-------|------|-------|
| mri | 2.70 | 34.55 | 1.84 | 35.00 | 1.98 | 29.51 | 1.46 | 31.02 | 1.62 | 31.60 | 1.27 | 36.87 |
| man | 3.30 | 33.52 | 2.13 | 33.97 | 2.43 | 28.43 | 1.58 | 29.44 | 1.87 | 30.38 | 1.83 | 32.91 |
| cameraman | 2.90 | 35.18 | 2.08 | 35.30 | 2.28 | 30.83 | 1.58 | 31.04 | 1.87 | 32.17 | 1.47 | 33.69 |
| einstein | 2.72 | 33.85 | 1.76 | 34.41 | 1.96 | 29.44 | 1.39 | 30.77 | 1.55 | 30.97 | 1.30 | 35.65 |
| lena | 2.58 | 34.06 | 1.77 | 35.14 | 1.89 | 29.20 | 1.42 | 31.15 | 1.55 | 31.55 | 1.26 | 36.38 |
| mandril | 3.27 | 33.14 | 2.16 | 33.47 | 2.47 | 28.45 | 1.58 | 28.52 | 1.92 | 29.99 | 1.92 | 32.25 |
| house | 2.66 | 34.12 | 1.75 | 34.60 | 1.97 | 29.78 | 1.38 | 30.56 | 1.55 | 31.19 | 1.38 | 35.75 |

Table 9 shows the compression performance in bits per pixel (bpp) for the lossless and near-lossless cases for several 8-bit grayscale test images. The near-lossless case is achieved by coarsely quantizing the residual values such that the maximum reconstruction error that is introduced is $|\delta|$. As expected, near-lossless compression allows us to achieve higher compression rates.
The coarse quantization can be performed by throwing away one (for $|\delta| = 1$) or two (for $|\delta| = 2$) least significant bits from the output of quantizer $Q_2$ before they go into the entropy encoder, a task that can be easily implemented in practice.

Table 9: Compression Performance

| Test image (grayscale) | Lossless (bpp), $\delta = 0$ | Near-lossless (bpp), $\lvert\delta\rvert = 1$ | Near-lossless (bpp), $\lvert\delta\rvert = 2$ | Near-lossless (bpp), $\lvert\delta\rvert = 3$ |
|-------------|-------|-------|-------|-------|
| mri | 5.129 | 3.614 | 2.939 | 2.522 |
| man | 5.925 | 4.372 | 3.625 | 3.120 |
| cameraman | 5.183 | 3.709 | 3.076 | 2.707 |
| einstein | 5.264 | 3.731 | 3.004 | 2.505 |
| lena | 5.178 | 3.615 | 2.907 | 2.468 |
| mandril | 5.841 | 4.275 | 3.545 | 3.057 |
| house | 5.105 | 3.579 | 2.894 | 2.459 |

# **5.6** Chip Measurement Results

Figure 60: Layout of the multiresolution decomposition imager.

This section presents measurement results from a fabricated chip implementing one level of decomposition. A test chip was designed and fabricated in a 0.5 $\mu$m CMOS process, as shown in Figure 60. The test chip included a $32 \times 32$ active pixel array with Morton-Z read-out, a CDS circuit and a switched-capacitor implementation of the multiresolution decomposition and prediction unit. The average power consumption of the multiresolution decomposition and prediction unit is $327~\mu W$. The integrator circuit occupies an area of $260~\mu m \times 65~\mu m$ and the delay circuit occupies an area of $350~\mu m \times 100~\mu m$. Capacitance $C$ was set to 0.5 pF to reduce the noise due to charge injection ($n_d$), which was measured to be less than 12 mV.

Figure 61: A custom PCB for testing of the multiresolution decomposition imager.

As shown in Figure 61, the test chip was outfitted with a lens to acquire actual images.
A printed circuit board (PCB) was designed to house the fabricated chip and supporting circuitry such as voltage regulators and biasing sources. Power sources for the analog and digital parts were separated to minimize noise. A CPLD was employed to generate the clock signals, to implement an adaptive Golomb-Rice entropy encoder, and to transmit the compressed data via a serial port to a computer for later image processing. Table 10 summarizes the main features of the test chip.

Table 10: Features of the Test Chip

| Feature | Value |
|----------------------------|------------------------------------------|
| Technology | 3-metal, 2-poly 0.5-μm CMOS |
| Pixel size | $35\,\mu\mathrm{m} \times 38\,\mu\mathrm{m}$ |
| Fill factor | 44.5% |
| Power (computational unit) | $327\,\mu\mathrm{W}$ |
| Power supply | 5 V |
| Converter resolution | 8 bits |
| Compression type | Lossless/Near-lossless |
| Entropy coder | Golomb-Rice |

Figure 62 shows the acquired images. It can be seen that the reduced-resolution image $H_1$ retains the low-pass features of the original image and thus is a good predictor, especially for smooth areas of the image. At the edges, the prediction error increases, as can be seen in Fig. 62(c).

Figure 62: Acquired images: (a) full-resolution *eye* image $(H_0)$; (b) reduced-resolution *eye* image $(H_1)$; (c) residual *eye* image $(R_1)$. Histograms: (d) full-resolution *eye* image $(H_0)$; (e) residual *eye* image $(R_1)$.

Figures 62(d) and 62(e) show the histograms of the acquired $H_0$ and $R_1$ images, respectively. Notably, the prediction operation results in a peaky residual distribution with lower variance, making it suitable for entropy encoding.
A total of 6773 bits were needed to fully encode the acquired image, resulting in 6.61 bpp. This result is higher than the one expected from Table 9 and is likely due to the low resolution of our array and to the artifacts created during readout in the first and last rows of the image.

Figure 63 shows more images acquired from the imager, together with the histograms of the corresponding residual images. The edges of these two pictures can be clearly seen, since they are simple shapes. The entropy of the residual images is 4.33 bits and 4.67 bits for the *circle* and the *A* images, resulting in a CR of 1.52 and 1.45, respectively, in the lossless case. In the lossy case, the CR is 3.19 with PSNR = 31.43 dB for the *circle* image and 2.75 with PSNR = 29.72 dB for the *A* image.

Figure 63: Images acquired from the fabricated imager chip: (a) *circle* after CDS; (b) *circle* after integrator; (c) first-level residual for *circle*; (d) histogram of first-level residual of *circle*; (e) *A* after CDS; (f) *A* after integrator; (g) first-level residual for *A*; (h) histogram of first-level residual of *A*.

#### CHAPTER 6

### AN INCREMENTAL $\Sigma\Delta$ CONVERTER FOR COMPRESSIVE SENSING

Compressive sensing is a relatively new development that enables a sparse or compressible signal to be acquired with far fewer measurements than dictated by the Nyquist/Shannon sampling theory; images, for example, are sparse in the DCT and wavelet domains. Using compressive sensing, it is not necessary to acquire all the samples of the original signal, but only a few measurements. These measurements are obtained by projecting the input signal onto a set of non-adaptive measurement vectors, that is, through non-adaptive linear projections that preserve the structure of the signal. The signal is reconstructed from the projections using an optimization process [27]. When compressive sensing is applied to the imager, the data rate is reduced, and so is the power consumption.
Mathematically, let $X = [x_1, x_2, \ldots, x_N]$ be a discrete-time signal, where $x_i$ are individual samples taken from an underlying analog sensor and $N$ is the number of samples (two-dimensional signals such as images are vectorized into a long one-dimensional vector to fit this notation). The compressive sensing measurements $y_k$ are inner products between $X$ and a measurement vector $\varphi_k$ that is incoherent (uncorrelated) with a sparsifying transform matrix $\Psi$. Formally,

$$y_k = \langle X, \varphi_k \rangle \qquad k = 1, 2, \ldots, M \tag{6.1}$$

where $M$ is the number of measurements. For $K$-sparse signals, $M = O(K\log(N/K))$. The original signal can be recovered by solving the following convex optimization program:

$$\min_{\tilde{\mathbf{X}} \in \mathbb{R}^N} \|\tilde{\mathbf{X}}\|_{\ell_1} \qquad \text{subject to } \mathbf{y} = \phi \tilde{\mathbf{X}} \tag{6.2}$$

where $\|\tilde{\mathbf{X}}\|_{\ell_1} = \sum_i |\tilde{X}_i|$ and the rows of the matrix $\phi$ (also called the sensing matrix) are the vectors $\varphi_k$. The size of matrix $\phi$ is $M \times N$. For compressive sensing to work, signals must be sparse. Fortunately, most natural signals are sparse or nearly sparse under the discrete cosine transform (DCT) or wavelet bases. The minimization in equation 6.2 can be conveniently reduced to a linear program known as basis pursuit, for which there are efficient solutions [27]. A sufficient condition for a stable solution is that the sensing matrix meets the restricted isometry property (RIP), sometimes also called the uniform uncertainty principle (UUP). A related condition is the incoherence condition, which requires the sensing and sparsifying matrices to be incoherent, or uncorrelated, with each other. It turns out that random matrices satisfy both the RIP and the incoherence conditions with high probability.
Ways to construct these matrices include sampling i.i.d. entries from a normal or a Bernoulli distribution, or randomly selecting and permuting the rows of a 0/1 Walsh or a noiselet matrix. Pseudo-random binary matrices generated by a linear feedback shift register (LFSR) have also been employed [42].

### 6.1 Incremental $\Sigma\Delta$ Modulation Architecture for Compressive Sensing

The architecture of the proposed compressive sensing converter is shown in Figure 64. This ADC is based on the incremental $\Sigma\Delta$ converter architecture [7], adapted here to simultaneously compute the compressive sensing measurements and quantize them. Combining these two operations in a single circuit ultimately saves area and power.

Figure 64: Schematic diagram of the proposed voltage-mode ADC.

The voltage input $V_{in}$ is multiplied by a pseudo-random binary sequence $\varphi_j(t)$ generated by an LFSR. The integrator, which is normally present in a $\Sigma\Delta$ converter, is used to perform the addition required in the inner product calculation of the compressive sensing measurements. The 1/2 factor is needed to keep the output of the integrator below the power supply voltage level and avoid clipping of the integrator's output. The counter counts up whenever the output of the comparator is high. If we let $w(t)$ be the output voltage of the integrator, where $t$ is a discrete-time index, then we can write:

$$w(t) = w(t-1) + \frac{V_{in}(t)\varphi_{j}(t) - b(t)V_{ref}}{2} \tag{6.3}$$

where $b(t) \in \{0, 1\}$ is the output of the comparator and $b(t)V_{ref}$ is the output of the 1-bit DAC.
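The behavior of the recursion (6.3) can be explored with a short simulation. The sketch below is an idealized model under several assumptions that are not taken from the schematic: the comparator decides from the previous integrator output with a threshold at 0 V, the input stays between 0 and $V_{ref}$, and the 8-bit LFSR uses the feedback polynomial $1 + x^2 + x^3 + x^4 + x^8$ quoted in Section 6.3 with an assumed tap mapping, shift direction and seed. All function names are hypothetical.

```python
import math

def lfsr_bits(count, seed=0x01):
    """8-bit Fibonacci LFSR for x^8 + x^4 + x^3 + x^2 + 1 (tap mapping assumed)."""
    state = seed & 0xFF
    out = []
    for _ in range(count):
        out.append(state & 1)
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (feedback << 7)
    return out

def cs_sigma_delta(v_in, phi, v_ref=1.0):
    """Iterate (6.3) once per sample; return the counter value and the final w."""
    w, count = 0.0, 0
    for v, p in zip(v_in, phi):
        b = 1 if w > 0.0 else 0          # assumed comparator rule on the previous w
        w += (v * p - b * v_ref) / 2     # integrator update of (6.3)
        count += b                       # the counter increments when b is high
    return count, w

N = 32
v_in = [0.5 + 0.4 * math.sin(2 * math.pi * t / N) for t in range(N)]   # test input
phi = lfsr_bits(N)
count, w_final = cs_sigma_delta(v_in, phi)
ideal = sum(v * p for v, p in zip(v_in, phi))   # inner product, with V_ref = 1
print(count, ideal, w_final)
```

Under these assumptions the integrator output stays within $\pm V_{ref}/2$, so the counter value tracks the scaled inner product to within one count; the leftover is the final integrator voltage.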
Expanding this recursive equation and using the initial conditions $w(0) = 0$ and $b(0) = 0$, at time $t = N$ one obtains the following expression:

$$\sum_{i=1}^{N} b(i) = \frac{\sum_{i=1}^{N} V_{in}(i)\varphi_{j}(i)}{V_{ref}} - \frac{w(N)}{V_{ref}/2} \tag{6.4}$$

The left-hand side of equation 6.4 is an integer and is equal to the counter's output. The first term on the right-hand side of 6.4 is the inner product between $V_{in}(t)$ and the random sequence $\varphi_j(t)$ scaled by $V_{ref}$, while the second term is the quantization error. This equation shows that this ADC architecture can compute the compressive sensing measurements in the analog domain while converting the result to a digital representation. The bit resolution of this converter is $\log_2 N$, where $N$ is equal to the length of the pseudo-random sequence. Thus, to increase the resolution, the length of the random sequence has to be increased accordingly.

In the traditional incremental $\Sigma\Delta$ converter, the input $V_{in}$ gets amplified $N$ times since it circulates around the integrator loop $N$ times. This amplification improves the SNR of the converter. In the proposed converter we do not have this SNR improvement. However, it is possible to improve the SNR in the proposed compressive sensing ADC by iterating the $\Sigma\Delta$ loop $N_s$ times for each bit in $\varphi_j$. If we do that, equation 6.4 becomes:

$$\sum_{i=1}^{N \times N_s} b(i) = \frac{N_s \sum_{i=1}^{N} V_{in}(i) \varphi_j(i)}{V_{ref}} - \frac{w(N \times N_s)}{V_{ref}/2} \tag{6.5}$$

The number of bits of resolution would then be $\log_2(N \times N_s)$.

Figure 65 presents Matlab simulation results of the converter in Figure 64.
Figure 65 shows the ideal compressive sensing (CS) measurements computed using equation 6.1 and the output of the ADC when the input is a sinusoidal waveform and the binary random sequence of length $N=32$ is drawn from a uniform distribution ($N_s$ has been set equal to 1). Both the ideal measurements and the ADC's outputs have been normalized by dividing them by $N$.

Figure 65: Normalized ideal CS measurements and the output of the compressive sensing incremental $\Sigma\Delta$ converter.

As can be seen from Figure 65, the quantization error is significant. The quantization error can be reduced by increasing $N$. However, in situations where $N$ cannot be further increased, the quantization error can still be lowered by increasing $N_s$. Figure 66 shows the reduction in quantization error for larger values of $N_s$. The price that we pay for this reduction in quantization noise is an extended conversion time.

Figure 66: Quantization error of the compressive sensing incremental $\Sigma\Delta$ converter for $N_s = 1$, 2 and 8.

# 6.2 Quantization Noise and Waveform Reconstruction

We carried out a number of tests to assess the impact of the quantization noise introduced by the converter on the acquired waveforms after compressive sensing reconstruction. We employed a two-tone sinusoidal waveform as the input signal of the converter. The input signal had a length of $N=128$ samples, and a total of $M=60$ compressive measurements were collected. The test signal was reconstructed from the CS measurements using the optimization technique in equation 6.2. The mean square error (MSE) between the original and the reconstructed signals was measured.
Two scenarios were explored: (a) the CS measurements are computed using the ideal equation 6.1 (no quantization noise), and (b) the CS measurements are computed by the proposed ADC. In the first scenario, an MSE of 0.00526 was obtained. Table 11 presents the MSE of the reconstructed signal when the CS measurements were computed by the proposed ADC for different values of $N_s$. Table 11 also reports the dynamic range (DR) of the measurements, which serves as a metric of the number of bits that would be needed to encode each measurement. In particular, we employed the quantity $\log_2 DR$ as an estimate of the bits per measurement. The last column shows the total number of bits that would be needed to encode the $M$ measurements.

Table 11: MSE of Reconstructed Signal from Measurements Computed by the ADC

| $N_s$ | MSE | Dynamic range (DR) | $\log_2 DR$ | $M\log_2 DR$ |
|-------|---------|------------|-----------|------------|
| 1 | 0.0276 | 12 | 3.585 | 215.10 |
| 2 | 0.0183 | 24 | 4.585 | 275.10 |
| 3 | 0.0133 | 36 | 5.169 | 310.14 |
| 4 | 0.0111 | 48 | 5.585 | 335.10 |
| 5 | 0.00512 | 60 | 5.907 | 354.42 |
| 6 | 0.00966 | 72 | 6.169 | 370.14 |
| 7 | 0.00553 | 86 | 6.426 | 385.6 |
| 8 | 0.00419 | 98 | 6.615 | 396.9 |

As a reference, we quantized the original two-tone input signal. For a four-bit quantization we obtained an MSE of 0.00452. Using four bits/sample, a total of $4 \times 128 = 512$ bits would be needed to encode the input signal.
Hence, at $N_s = 8$ the proposed converter achieves a distortion comparable to that of a simple 4-bit quantizer but at a lower bit rate. To match the bit rate achieved by the ADC we would have to use a 3-bit quantizer, but that would result in a larger MSE.

The reconstruction performance can be improved if we modify the balance or density of the random sensing matrices. So far, balanced binary random sensing matrices have been employed, in which the percentage of 1s is approximately equal to the percentage of 0s. If we allow the percentage of 0s to be larger, we can obtain better MSE values. For instance, for a sensing matrix with 95% of 0s and when the CS measurements are computed ideally using equation 6.1, we obtain an MSE of 0.00190. Table 12 presents the MSE results when the CS measurements are computed by the ADC using an unbalanced sensing matrix with 95% of zeros. At $N_s = 8$, we obtain an MSE comparable to that of a 4-bit quantizer, but again at a lower bit rate.

Table 12: MSE of Reconstructed Signal for an Unbalanced Sensing Matrix

| $N_s$ | MSE | Dynamic range (DR) | $\log_2 DR$ | $M\log_2 DR$ |
|-------|---------|------------|-----------|------------|
| 1 | 0.203 | 4 | 2 | 120 |
| 2 | 0.0466 | 10 | 3.32 | 199.2 |
| 3 | 0.0291 | 14 | 3.81 | 228.6 |
| 4 | 0.0154 | 20 | 4.32 | 259.2 |
| 5 | 0.0124 | 24 | 4.59 | 275.4 |
| 6 | 0.0101 | 30 | 4.91 | 294.6 |
| 7 | 0.0067 | 34 | 5.09 | 305.4 |
| 8 | 0.00494 | 40 | 5.32 | 319.2 |

An advantage of using unbalanced sensing matrices is that the dynamic range of the measurements decreases, so fewer bits are needed to encode the measurements. An initial explanation could be that a sensing matrix with a large number of zeros has low mutual coherence with the sparsity basis. A smaller coherence means that fewer measurements are needed (see Theorem 1.1 in [15]). Since the number of measurements in our experiment stays the same, the MSE decreases.
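The bit-rate comparison in this section is simple arithmetic and can be reproduced as follows. The sketch uses the values reported in the text ($M = 60$ measurements, $N = 128$ samples, and the dynamic ranges at $N_s = 8$ from Tables 11 and 12); the function names are hypothetical.

```python
import math

def measurement_bits(dynamic_range, num_measurements=60):
    """Estimated bits to encode M measurements: M * log2(DR)."""
    return num_measurements * math.log2(dynamic_range)

def quantizer_bits(bits_per_sample, num_samples=128):
    """Bits to encode every input sample directly with a fixed-width quantizer."""
    return bits_per_sample * num_samples

balanced = measurement_bits(98)     # N_s = 8, balanced sensing matrix (Table 11)
unbalanced = measurement_bits(40)   # N_s = 8, 95% zeros (Table 12)
print(round(balanced, 1), round(unbalanced, 1))
print(quantizer_bits(4), quantizer_bits(3))
```

The two estimates come out near 397 and 319 bits, against 512 bits for 4-bit sampling and 384 bits for 3-bit sampling. Small differences from the last column of Table 12 arise because the table multiplies a rounded $\log_2 DR$.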
# **6.3** The Incremental $\Sigma\Delta$ Converter Circuit Implementation

The proposed ADC was implemented using the switched-capacitor circuit shown in Figure 67. The integrator is implemented with amplifier $A_1$ and capacitor $C_2$. The 1-bit DAC is implemented with the flip-flop $FF_1$, the two AND gates and capacitor $C_3$. The multiplication of $V_{in}$ with the pseudo-random sequence is performed by the two-input multiplexer $MUX_1$, which is controlled by the output of the LFSR generator. When the LFSR output is 1, $V_{in}$ is fed into the integrator; otherwise 0 V is fed in. The LFSR generator employs the feedback polynomial $1 + x^2 + x^3 + x^4 + x^8$ to produce a maximal length sequence. The counter is an 8-bit ripple counter. The capacitor sizes are $2C_1 = C_3 = C_2 = 1$ pF.

Figure 67: Switched-capacitor circuit implementation of the compressive sensing ADC.

The operation of the circuit is as follows. The integrator is initially reset at the beginning of the conversion by pulling high the signals $\Phi_1$ and $\Phi_2$, causing $C_2$ to discharge. The counter, the LFSR and the flip-flop are also reset at this stage. The conversion then proceeds using a non-overlapping two-phase clock with phases $\Phi_1$ and $\Phi_2$. During the $\Phi_1$ phase, $C_1$ is charged to $V_{in}(t)\varphi(t)-V_{off}$, where $V_{off}$ is the offset of the amplifier $A_1$. During phase $\Phi_1$, capacitor $C_3$ is pre-charged to $-V_{off}$ if the output of the previous comparison, $b(t-1)$, is 1. If $b(t-1)$ is 0, $C_3$ maintains its previous charge. During phase $\Phi_2$, capacitor $C_1$ is charged to $-V_{off}$ and the charge difference, $CV_{in}(t)\varphi(t)$ with $C = C_1$, is forced to move into $C_2$. During $\Phi_2$, $C_3$ is charged to $\frac{1}{2}V_{ref}-V_{off}$ if the previous output of the comparator was 1. Thus, the charge difference $-CV_{ref}$ is forced into $C_2$.
If the output of the previous comparison was 0, the charge in $C_3$ does not change and $C_3$ is not forced to inject any charge into $C_2$. Hence, at the end of phase $\Phi_2$, the charge in $C_2$ is given by $C[V_{in}(t)\varphi(t)-V_{ref}b(t-1)]$. Since the capacitance of $C_2$ is $2C$, the voltage across $C_2$ is $\tfrac{1}{2}[V_{in}(t)\varphi(t)-V_{ref}b(t-1)]$. Thus, the proposed circuit implements equation 6.3. Moreover, the offset voltage of amplifier $A_1$, $V_{off}$, is canceled during the integration process. The $\Phi_1$ and $\Phi_2$ phases are repeated until the conversion finishes. At the end of the conversion cycle, the output of the counter is the digital representation of a compressive sensing measurement.

A drawback of this design is that the output voltage of the integrator, $w(t)$, switches between 0 V during $\Phi_1$ and $\tfrac{1}{2}[V_{in}(t)\varphi(t) - V_{ref}b(t-1)]$ during $\Phi_2$. This could significantly reduce the speed of the converter. An improved design and its analysis are presented in Chapter 7.

### **6.4** Results and Measurements

The compressive sensing incremental $\Sigma\Delta$ converter has been evaluated through transistor-level simulations using the Cadence Spectre circuit simulator. The target fabrication process for the converter is a 0.5 $\mu$m CMOS technology. For the circuit simulations, a two-stage n-type input opamp was employed to implement the amplifier and the comparator. A prototype of the compressive sensing incremental $\Sigma\Delta$ converter was first built and validated on a printed circuit board (PCB) with low-speed off-the-shelf components. Then an integrated circuit was designed and fabricated in a 0.5 $\mu$m CMOS process to validate the proposed AIC in hardware.

# 6.4.1 $\Sigma\Delta$ Converter on PCB

Figure 68 shows the test setup used to validate the incremental $\Sigma\Delta$ converter system in both the compressive sensing mode and the standard Nyquist conversion mode.
The system was built with off-the-shelf components on a PCB, and it was able to run at a sampling frequency of 125 kHz. Supporting circuitry, such as voltage regulators and biasing sources, was also included on the PCB. Separate power supplies for the analog and digital parts, together with decoupling capacitors, were used to minimize noise. A CPLD was employed to generate the clock signals and to transmit the measurements via a serial port to a computer for later reconstruction. The input signals were provided by a function generator, which can generate sine waves, AM, and FM signals.

Figure 68: Diagram of the test setup for compressive sensing on PCB.

Figures 69 and 70 show the reconstructed signals and the signals acquired in the normal ADC mode of the $\Sigma\Delta$ converter. Both the compressive sensing mode and the standard Nyquist conversion mode were run at the same clock frequency and with the same input signals. For both the AM and the FM signals, the signals reconstructed from compressive sensing measurements are of better quality than those acquired when the converter runs as a normal ADC. These results show that the compressive sensing $\Sigma\Delta$ AIC can be implemented in hardware. An integrated circuit was then designed, and the results of the compressive sensing AIC system on a chip are presented in the following section.

Figure 69: Test results of AM signal: (a) Reconstructed from Compressive Sensing Measurements; (b) Sampled using Normal ADC Mode.

# 6.4.2 $\Sigma\Delta$ Converter on Chip

An integrated circuit was designed and fabricated in a 0.5 $\mu$m CMOS process to validate the proposed AIC in hardware. For the hardware implementation we chose a resolution of 10 bits for the converter, which allows $N$ to take values up to 1024, providing finer spectrum resolution. The fabricated integrated circuit was tested using a custom test setup. The test setup is shown in Fig. 71.
It consists of a printed circuit board (PCB) designed to house the fabricated chip and supporting circuitry such as voltage regulators and biasing sources. Separate power supplies for the analog and digital parts were used, as well as decoupling capacitors, to minimize noise. A CPLD was employed to generate the clock signals and to transmit the measurements via a serial port to a computer for later reconstruction. A function generator was employed to generate input signals with a sparse spectrum. The converter's clock frequency was set to 1 MHz, which in turn sets the input bandwidth of the converter to $BW=500\,\mathrm{kHz}$. This bandwidth is sufficient to cover the long-wave AM spectrum. The offset, estimated using the method outlined in the previous section, was 17.4 mV.

Figure 70: Test results of FM signal: (a) Reconstructed from Compressive Sensing Measurements; (b) Sampled using Normal ADC Mode.

Figure 71: Diagram of the custom test setup.

As an initial test, the function generator was set to generate a single sinusoidal signal at 20 kHz, which was acquired by the AIC. The acquired measurements were corrected for offset before signal recovery. Figure 72 shows the normalized power spectrum of the recovered signal for $N=1024$ and $M=512$. The measured signal-to-noise-plus-distortion ratio (SNDR) is 52.6 dB, which is equivalent to an effective number of bits (ENOB) of 8.4 bits.

Figure 72: Normalized spectrum of recovered signal. The input signal is a single sinusoidal of frequency 20 kHz.

A second test was conducted with a frequency-sparse signal consisting of two amplitude modulated (AM) signals with carriers at 100 kHz and 250 kHz and a sinusoidal at 400 kHz. Fig. 73 shows the normalized power spectrum of the recovered signal. The two AM signals and the single sinusoidal are clearly visible in the spectrum. Due to noise from the test setup, circuit non-idealities and the inherent quantization noise from the converter, the spectrum of the recovered signal contains spurious components. However, the spurious components are well below the signal power level. An AIC achieves its maximum reconstruction quality when the input signal is at its sparsest, i.e., for a single tone. The reconstruction quality decreases as the number of non-zero components grows while the number of measurements is kept constant, which explains the lower noise floor observed in Fig. 72.

Figure 73: Normalized spectrum of recovered signal. The input signal consists of two AM signals with carriers at 100 kHz and 250 kHz and a single sinusoidal at 400 kHz.

The average power consumption of the AIC is summarized in Table 13 for a supply voltage of 3.0 V and a clock frequency of 1 MHz. The accumulator and the comparator are implemented using two-stage opamps. The counter is a 10-bit ripple counter and the pseudo-noise (PN) sequence generator is a 32-bit linear feedback shift register (LFSR). The total power consumption is 160.4 $\mu$W. The power consumption can be reduced by employing a regenerative comparator such as the one presented in [57]. Circuit simulations show that a regenerative comparator can consume less than 15 $\mu$W while working at 1 MHz. The AIC occupies an area of 0.127 mm<sup>2</sup>. Table 14 summarizes the performance of the proposed $\Sigma\Delta$ AIC.

Table 13: Average Power Consumption of $\Sigma\Delta$ AIC

| Block | Average power |
|-------------------|----------------------|
| Accumulator | $81.8~\mu\mathrm{W}$ |
| Comparator | $40.4~\mu\mathrm{W}$ |
| Counter | $5.3~\mu\mathrm{W}$ |
| 1-bit DAC | $0.2~\mu\mathrm{W}$ |
| PN seq. generator | $32.7~\mu\mathrm{W}$ |
| Total | $160.4~\mu\mathrm{W}$ |

Table 14: Performance Summary of $\Sigma\Delta$ AIC

| Parameter | Value |
|--------------------|----------------------|
| Supply voltage | 3.0 V |
| Sampling frequency | 1 MHz |
| Resolution | 10 bits |
| ENOB | 8.4 bits |
| Power | $160.4~\mu\mathrm{W}$ |
| Technology | $0.5~\mu\mathrm{m}$ |
| Area | $0.127~\mathrm{mm}^2$ |
| Architecture | $\Sigma\Delta$ with compressive sensing |

# 6.5 Comparison with Nyquist-Rate Converters

If the converter is operated as a standard incremental $\Sigma\Delta$ converter, the input bandwidth is limited to $f_{clk}/2N$. However, using the AIC concept and with virtually the same circuit, input signals with bandwidth of up to $f_{clk}/2$ can be acquired provided that they have a sparse spectrum. Table 15 presents a comparison with other Nyquist-rate converter architectures fabricated in similar CMOS processes and in more advanced submicron processes.

Table 15: Performance Comparison of AIC with Nyquist-Rate Converters

| Design | [71] | [24] | [56] | [1] | [66] | [47] | [34] | [61] | [25] | This work |
|--------|------|------|------|-----|------|------|------|------|------|-----------|
| Conversion rate (MHz) | 200 | 1 | 50 | 20.48 | 0.1 | 150 | 1000 | 200 | 3500 | 1 |
| ENOB (bits) | 5.0 | 10.5 | 10.3 | 9.0 | 7.0 | 7.3 | 8.9 | 10.4 | 4.9 | 8.4 |
| Power (mW) | 110 | 6 | 850 | 19.5 | 0.0031 | 71 | 250 | 348 | 98 | 0.16 |
| Process | $0.5~\mu\mathrm{m}$ | $0.6~\mu\mathrm{m}$ | $0.6~\mu\mathrm{m}$ | $0.35~\mu\mathrm{m}$ | $0.25~\mu\mathrm{m}$ | $0.18~\mu\mathrm{m}$ | $0.13~\mu\mathrm{m}$ | 90 nm | 90 nm | $0.5~\mu\mathrm{m}$ |
| Architecture | Flash | SAR | Pipeline | Pipeline | SAR | Interleaved | Interleaved | Pipeline | Flash | $\Sigma\Delta$ |
| Area (mm<sup>2</sup>) | 1.6 | 0.3 | 16 | 1.3 | 0.053 | 1.8 | 3.5 | 1.36 | 0.1485 | 0.127 |
| FOM (pJ/conv. step) | 16.04 | 4.14 | 12.81 | 1.85 | 0.242 | 3.0 | 0.52 | 1.41 | 0.94 | 0.474 |

A figure of merit (FOM) is calculated for each converter as follows:

$$FOM = \frac{Power}{2^{ENOB} f_s} \tag{6.6}$$

The AIC performs fairly well when compared with Nyquist converters fabricated in a similar CMOS process. It also compares favorably (in terms of FOM and area) with some converters fabricated in more advanced CMOS processes. Fabrication of the AIC in a deep-submicron process is expected to improve its performance. The power consumption of the digital part is expected to decrease significantly in deep-submicron technologies [23]. The analog part of the converter, which computes the projections, is expected to improve its performance in the $0.25~\mu\mathrm{m}$ and $0.35~\mu\mathrm{m}$ processes [4, 13]. At ultra-deep-submicron processes, critical analog circuits will have to operate at higher voltages through a combination of thin- and thick-oxide transistors to benefit from technology scaling [5]. The circuit analysis presented in [21] shows that in a target 90 nm process analog signal projections can be efficiently computed using Gm-C circuits while running at 1 GHz.

An AIC acquires a signal in a fundamentally different manner than a Nyquist-rate converter does. An AIC acquires projections of the input signal rather than direct samples. The projections are acquired as shown in Fig. 74. The input signal $s(t)$ and the random-like signal $\varphi(t)$ are divided into blocks of $N$ samples. Each block is then used to compute one projection or measurement. The number of projections, $M$, that need to be acquired depends on the sparsity level of the signal and the target reconstruction SNR. To acquire an $M \times N$ long signal, the AIC needs to acquire $M$ projections instead of the $M \times N$ samples required by a Nyquist-rate converter. The value of $N$ is set by the underlying signal model expressed in equations (7.1) and (7.10).
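
The figure of merit in equation (6.6) is simple to evaluate. The following minimal Python sketch applies it to the "This work" column of Table 15 (0.16 mW, 8.4 bits, 1 MHz); the function name and the conversion to picojoules are illustrative choices, not part of the original design flow.

```python
def figure_of_merit_pj(power_w, enob_bits, fs_hz):
    """Equation (6.6): FOM = Power / (2**ENOB * f_s), in pJ per conversion step."""
    joules_per_step = power_w / (2.0 ** enob_bits * fs_hz)
    return joules_per_step * 1e12  # convert J to pJ


# Values taken from the "This work" column of Table 15.
fom_this_work = figure_of_merit_pj(power_w=0.16e-3, enob_bits=8.4, fs_hz=1e6)
print(f"FOM = {fom_this_work:.3f} pJ/conv. step")
```

Because the ENOB enters as an exponent, one additional effective bit halves the FOM at the same power and sampling rate.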
Although the projections have a larger dynamic range (10 bits in our case), the AIC still provides effective data compression. Assuming that a Nyquist-rate ADC encodes each sample using 8 bits, the compression ratio achieved by the AIC can be estimated as (8 bits/sample $\times$ $M \times N$)/(10 bits/projection $\times$ $M$) = $0.8N$. Data compression is inherent to an AIC. However, data compression comes at the price of a longer acquisition time. It takes $M \times N \times T_s$ seconds for the AIC presented here to acquire a signal. A Nyquist-rate converter needs the same amount of time to acquire an $M \times N$ long signal. The acquisition time can be decreased by employing multiple random demodulators working in parallel, but at the cost of increased power consumption. Parallel architectures have been explored in [21], [60] and [39].

Figure 74: Graphical depiction of signal acquisition under compressive sensing.

### 6.6 Related Work on AICs

There have been other efforts to demonstrate in practice the potential of AICs. In [60] and [43] a prototype hardware implementation of a compressive sensing converter was presented. The prototype employs an analog multiplier, a Gm-C integrator and a low-rate standard backend ADC to implement the AIC. A commercially available DSP board was employed to process the output of the converter and to recover the original input signal. The converter was tested with sinusoidal and amplitude-modulated (AM) signals. Successful signal recovery was demonstrated at a conversion rate equivalent to 1/8th of the Nyquist rate.

In [80] the architecture of a parallel-path AIC for spectrum sensing was presented. A hardware prototype of the analog front-end of the converter was implemented using off-the-shelf components. Each path consists of an OTA that converts the input voltage into current and a network of switches and capacitors that implements a switched-capacitor integrator.
An oscilloscope was employed to convert the output of the integrator to digital. The circuit was tested with a frequency-hopping signal spanning 200 kHz. Successful signal reconstruction was demonstrated at a conversion rate equivalent to 32% of the Nyquist rate.

In [12] the authors outline a $\Sigma\Delta$ approach for computing compressive sensing measurements. It employs a classical $\Sigma\Delta$ loop with feedback coefficients that change dynamically according to a random dictionary. This approach requires a feedback network of very large order, which increases the potential for unstable behavior. The feedback network coefficients can be either computed at run-time or generated off-line and stored for later use. The authors provide algorithms to generate coefficients that result in stable behavior. No hardware implementation was presented.

The authors of [39] present initial simulation results of an AIC based on a time encoding machine (TEM). A standard TEM was modified to introduce a level of randomness in the switching time points of the output pulses. The lengths of the output pulses are converted to voltage levels, which are then measured by a backend ADC. The authors show that the resulting measurement process is similar to a randomly sampled Fourier ensemble. The drawback of this approach is that high frequencies are attenuated. No hardware implementation was presented.

In [21] the design of a receiver exploiting compressive sensing is presented. The receiver has a parallel architecture dubbed parallel segmented compressive sensing. Based on a theoretical analysis and circuit simulations, it is predicted that a hardware implementation in a 90-nm CMOS process will be able to acquire signals with bandwidths of 1.5 GHz while consuming 120.8 mW. The analysis includes the effects of device noise and clock jitter on the quality of signal reconstruction. An integrated circuit was designed but no hardware measurements were reported.
In [52] the authors present a hardware implementation of the modulated wideband converter (MWC). The MWC is based on a circuit similar to the random demodulator but, unlike the AIC, it does not require the projection basis to be random. The only requirement is for the projection basis to be periodic. An implementation of the modulated wideband converter with off-the-shelf components was employed to demonstrate signal acquisition at 2 GHz. The MWC consumes 20 W of power.

#### CHAPTER 7

# DETAILED DESCRIPTION OF A SIGMA-DELTA RANDOM DEMODULATOR CONVERTER ARCHITECTURE FOR COMPRESSIVE SENSING APPLICATIONS

This chapter presents an in-depth look at a compressive sensing sigma-delta converter. A Sigma-Delta architecture for a random demodulator converter is presented. The random demodulator converter exploits the results of compressive sensing theory to acquire sparse signals by projecting the signal of interest onto a random basis. The signal can be recovered from a relatively small number of random projections. Hence, the random demodulator converter is able to provide a compressed representation of the input signal at its output. The proposed architecture is based on a first-order incremental Sigma-Delta converter that can be easily configured to work in the compressive sensing mode or in standard Nyquist conversion mode. The integrator and the counter that are normally present in an incremental Sigma-Delta converter are re-used to simultaneously compute compressive sensing measurements and convert them to digital. A circuit analysis that includes circuit non-idealities is presented. The analysis reveals that offset has a significant impact. Thus, a method for offset compensation is provided. A prototype of the random demodulator converter was fabricated and tested with frequency-sparse signals, demonstrating the validity of the design.

Compressive sensing is an emerging signal acquisition paradigm in which sparse or near-sparse signals can be efficiently acquired from a relatively small set of measurements. Compressive sensing theory applies to signals that are sparse in the time or frequency domains. Hence, compressive sensing can be applied to wireless communications in scenarios where the radio spectrum is not fully occupied, leading to signals with sparse frequency content. Compressive sensing also states that signals that are sparse under a suitable sparsifying transform, such as the wavelet transform or the discrete cosine transform, can be efficiently acquired. Thus, compressive sensing can be employed to efficiently acquire images and biomedical signals [28,49].

Figure 75: Conventional and compressive sensing signal acquisition approaches. A random demodulator circuit is employed to project the input signal $s(t)$ onto the random vectors $\varphi_k(t)$.

Rather than directly sampling a signal, compressive sensing acquires measurements or projections of the signal onto a random basis. A practical approach to compute the random projections is to employ the random demodulator circuit shown in Figure 75 [70]. This figure also shows the conventional analog-to-digital (A/D) signal acquisition approach. The random demodulator employs a multiplier and an integrator to project the input signal $s(t)$ onto a random-like signal $\varphi(t)$. The output of the integrator is sampled and converted to digital by the backend analog-to-digital converter (ADC) every $NT_s$ seconds, where $N$ is a large positive integer. If the projection basis is incoherent with the basis under which the signal is sparse, then only a small number of projections contain most of the information needed to recover the original signal. Therefore, compressive sensing provides a compressed representation of the input signal.
Signal recovery is typically formulated as an optimization problem that finds the sparsest signal that matches the acquired projections [16].

In this work, we present an architecture for the random demodulator that is based on a first-order incremental $\Sigma\Delta$ converter. The integrator that is already present in the $\Sigma\Delta$ converter is merged with the integrator required in the random demodulator circuit. Additionally, the digital counter that is part of the incremental converter is used in the compressive sensing converter to generate a digital output. Thus, the proposed architecture merges the random demodulator with the backend ADC. The resulting circuit, which we call a random demodulator converter, provides a digital representation of the analog input signal. Hardware merging enables the proposed converter to *simultaneously* compute the compressive sensing projections and to convert them to digital format, resulting in savings in circuit complexity. A switched-capacitor hardware implementation of the compressive sensing converter is presented and an analysis of circuit non-idealities is carried out. As a proof of concept, a prototype was implemented and tested. Measurement results are presented that demonstrate the validity of the approach.

## 7.1 Theory

Let us consider a continuous-time model for the input signal as follows:

$$s(t) = \sum_{n=1}^{N} a_n \psi_n(t) \tag{7.1}$$

where $a_n \in \mathbb{R}$ and $\{\psi_n\}$ is a basis of continuous-time orthonormal waveforms with bandwidth less than $BW$ Hz [70]. We are interested in signals that are sparse, that is, only $K$ of the $N$ coefficients $a_n$ are non-zero at any time, with $K \ll N$. The proposed random demodulator architecture uses discrete-time switched-capacitor circuits (see Section 7.3). Hence, in our discussion we will use the following discrete-time notation: $s[i] = s(iT_s)$, where $i$ is the sample index and $T_s$ is the sampling period.
Similarly, $\varphi[i] = \varphi(iT_s) = \epsilon_i$, where $\epsilon_i \in \{0,1\}$. The signal $s(t)$ can be recovered from a small number of measurements or projections, $y[k]$, $k = 1, 2, \dots, M$. Each measurement is the inner product between $s$ and the random-like signal $\varphi$ [17]. Mathematically, the measurements are computed as follows:

$$y[k] = \sum_{j=1}^{N} s_k[j] \cdot \varphi_k[j] \qquad k = 1, 2, \dots, M \tag{7.2}$$

where $s_k[j] = s[(k-1)N + j]$ and $\varphi_k[j] = \varphi[(k-1)N + j]$. The random demodulator generates $M$ measurements every $M \times N \times T_s$ seconds. Thus, it has a conversion rate of $1/(NT_s) = f_s/N$. The value of $N$ is set by the underlying signal model expressed in equation (7.1). From (7.1) we find that the spectrum resolution of the random demodulator converter is $BW/N$ Hz.

The operation of the compressive sensing converter can be summarized in matrix notation as follows:

$$\mathbf{y} = \mathbf{\Phi} \mathbf{s} \tag{7.3}$$

where $\mathbf{y} = \big[y[1], \cdots, y[M]\big]$, $\mathbf{s} = \big[s[1], \cdots, s[NM]\big]$ and $\mathbf{\Phi}$ is an $M \times NM$ matrix with the following structure [70]:

$$\mathbf{\Phi} = \begin{bmatrix} \vec{\varphi}_1 & & & \\ & \vec{\varphi}_2 & & \\ & & \ddots & \\ & & & \vec{\varphi}_M \end{bmatrix} \tag{7.4}$$

where $\vec{\varphi}_k = [\varphi_k[1], \cdots, \varphi_k[N]]$, so that each row of $\mathbf{\Phi}$ holds one block of $N$ entries, consistent with (7.2). The matrix $\mathbf{\Phi}$ is called the measurement matrix. Letting $\psi_{i,n}=\psi_n(iT_s),\ i=1, \dots, NM$, we can write $\mathbf{s}=\Psi\mathbf{a}$, where the components of the $NM\times N$ matrix $\Psi$ are the values $\psi_{i,n}$ and the components of vector $\mathbf{a}$ are the coefficients $a_n$. Thus, $\mathbf{y}=\Phi\Psi\mathbf{a}=\Theta\mathbf{a}$, where $\Theta=\Phi\Psi$. A sufficient condition for a stable recovery is that $\Phi$ meets the *restricted isometry* property (RIP), sometimes also called the *uniform uncertainty principle* (UUP) [18,35]. It turns out that random matrices satisfy the RIP condition with high probability.
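
The equivalence between the block-wise sums in (7.2) and the matrix form in (7.3) and (7.4) can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the sizes, the seed, the test signal and all function names are hypothetical, and each row block is taken to hold $N$ entries so that $\mathbf{\Phi}$ has size $M \times NM$.

```python
import random


def random_binary_rows(m, n, seed=0):
    """Draw M random-like rows of N binary entries (0 or 1, equally likely)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]


def measurements_blockwise(s, phi_rows):
    """Equation (7.2): y[k] is the inner product of block k of s with row k."""
    n = len(phi_rows[0])
    return [sum(s[k * n + j] * row[j] for j in range(n))
            for k, row in enumerate(phi_rows)]


def measurement_matrix(phi_rows):
    """Equation (7.4): M x NM matrix with row k placed at columns kN .. (k+1)N-1."""
    m, n = len(phi_rows), len(phi_rows[0])
    big = [[0] * (n * m) for _ in range(m)]
    for k, row in enumerate(phi_rows):
        big[k][k * n:(k + 1) * n] = row
    return big


def mat_vec(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


M_MEAS, N_BLOCK = 4, 8  # illustrative sizes only
phi_rows = random_binary_rows(M_MEAS, N_BLOCK)
signal = [0.1 * i for i in range(M_MEAS * N_BLOCK)]  # arbitrary test signal

y_block = measurements_blockwise(signal, phi_rows)
phi_matrix = measurement_matrix(phi_rows)
y_matrix = mat_vec(phi_matrix, signal)
print(y_block)
print(y_matrix)
```

Both computations give the same measurements; the block-wise form is what the hardware evaluates, one block of $N$ samples per measurement, while the matrix form is the one used during recovery.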
In the random demodulator converter the measurements are quantized, introducing quantization distortion or noise. Hence, the input signal has to be recovered from quantized or noisy measurements $\hat{y}[k]$. When the measurements are corrupted by quantization noise, the optimization problem can be formulated as follows:

$$\min_{\tilde{\mathbf{a}}} \|\tilde{\mathbf{a}}\|_{\ell_1} \qquad \text{subject to } \|\Theta\tilde{\mathbf{a}} - \hat{\mathbf{y}}\|_2 \le \varepsilon \tag{7.5}$$

where $\varepsilon$ is an upper bound on the noise magnitude. This optimization problem is called *Basis Pursuit DeNoise* (BPDN) and can be solved by second-order cone programming techniques [17, 20]. Once the optimization in (7.5) is solved, the input signal is recovered from $\tilde{\mathbf{s}} = \Psi \tilde{\mathbf{a}}$. Figure 76 summarizes the encoding and decoding process employed in the compressive sensing framework.

Figure 76: Encoding and decoding process under the compressive sensing framework. The input $s(t)$ is encoded using the measurement matrix $\Phi$ and a quantizer. The quantized measurements $\hat{\mathbf{y}}$ are employed to recover the input signal. The discrete-time recovered signal is denoted by $\tilde{\mathbf{s}}$.

#### 7.2 Random Demodulator Converter Architecture

The circuit architecture of the proposed random demodulator is based on a discrete-time first-order incremental $\Sigma\Delta$ converter and is shown in Figure 77. The integration operation required by the random demodulator is performed by the integrator that is already present in the $\Sigma\Delta$ converter. In a discrete-time implementation, the integrator is replaced by the accumulator shown in Figure 77. After $N$ clock cycles, the counter's output contains the digital representation of one compressive sensing measurement.
The proposed converter architecture is able to *simultaneously* compute compressive sensing measurements, quantize them and convert them to digital format, saving silicon area. The proposed architecture was first introduced by the authors in a conference paper [73]. Here, an expanded analysis that includes the effect of quantization and circuit non-idealities is presented. A prototype hardware implementation and measurement results are also reported.

Incremental $\Sigma\Delta$ converters have been used in the past to perform weighted average quantization for focal-plane image compression and FIR filtering [55]. More recently, incremental $\Sigma\Delta$ converters have been applied to compressive sensing imaging [53].

The proposed random demodulator converter is composed of a multiplier, a first-order $\Sigma\Delta$ modulator and a digital counter. In a standard incremental $\Sigma\Delta$ ADC, the input $s(t)$ is held constant by a sample-and-hold (S/H) circuit during conversion. However, in the random demodulator converter the S/H circuit is removed, allowing the input signal to change during conversion. Binary random matrices are employed in this work. Thus, the multiplier is implemented with a simple switch that is closed if $\varphi_k[n]=1$ and open if $\varphi_k[n]=0$. It should be noted that in the discrete-time implementation, aliasing is introduced due to sampling. Thus, an anti-alias filter is needed in front of the converter to minimize aliasing effects.

A conversion cycle starts by resetting the accumulator and the digital counter. After reset, the $\Sigma\Delta$ loop is iterated $N$ times. The digital counter counts the number of clock cycles for which the comparator's output remains high. From Figure 77, the output of the accumulator can be written as follows:

$$V_x[n] = V_x[n-1] + \alpha \left( s_k[n] \cdot \varphi_k[n] - b[n-1]V_{ref} \right) \tag{7.6}$$

where $k = 1, 2, \dots, M$ and $n = 1, 2, \dots, N$.

Figure 77: Block diagram of the proposed $\Sigma\Delta$ random demodulator converter.

Assuming zero initial conditions for $V_x$ and $b$, the output of the accumulator after $N$ iterations is given by:

$$V_x[N] = \alpha \sum_{n=1}^{N} s_k[n] \cdot \varphi_k[n] - \alpha \cdot V_{ref} \sum_{n=1}^{N-1} b[n] \tag{7.7}$$

Equation (7.7) can be rearranged to yield:

$$V_{ref} \sum_{n=1}^{N-1} b[n] = \sum_{n=1}^{N} s_k[n] \cdot \varphi_k[n] - \frac{1}{\alpha} \cdot V_x[N] \tag{7.8}$$

Note that $\sum_{n=1}^{N-1} b[n]$ is an integer and is equal to the output of the digital counter. The right-hand side of equation (7.8) has two parts: the compressive sensing measurement value $y[k] = \sum_{n=1}^{N} s_k[n] \cdot \varphi_k[n]$ and the term $-\frac{V_x[N]}{\alpha}$, which corresponds to the quantization noise. Thus, the output of the counter is the digital representation of the corresponding compressive sensing measurement.

To provide an insight into the effects of quantization and circuit non-idealities on the performance of the random demodulator converter, several system-level simulations were conducted. In the simulations, we varied the sparsity of the signal ($K$), the number of measurements ($M$), and $N$. For each parameter combination, 500 randomly generated $\Phi$ matrices were employed to encode the input signal and the average SNR of the reconstructed signal was calculated. A total of $M$ measurements were employed in each signal recovery attempt. The value of $\alpha$ was set to 1. The SNR in each reconstruction trial is computed using:

$$SNR = 10 \log_{10} \frac{\|\mathbf{s}\|_{2}^{2}}{\|\mathbf{s} - \tilde{\mathbf{s}}\|_{2}^{2}} \tag{7.9}$$

where $\tilde{\mathbf{s}}$ denotes the recovered signal. The signal model in (7.1) was used to generate sparse input signals. The locations of the $K$ non-zero $a_n$ coefficients in the sparse vector $\mathbf{a}$ were randomly selected using a uniform distribution in the range $[0, N)$.
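
The loop in (7.6) and the counter relation in (7.8) can be checked with a few lines of Python. This is a behavioural sketch only, not the simulation code used in this work: it assumes an ideal comparator that outputs 1 when $V_x[n] \ge V_{ref}$, a unipolar input with $0 \le s < V_{ref}$ (so that the accumulator stays bounded), and illustrative names, seed and block length.

```python
import random


def sigma_delta_measurement(s_block, phi_block, v_ref=1.0, alpha=1.0):
    """Iterate the first-order loop of equation (7.6) over one block of N samples.

    Returns (count, v_x), where count = sum_{n=1}^{N-1} b[n] is the counter
    output and v_x = V_x[N] is the residue left in the accumulator.
    """
    v_x = 0.0     # accumulator output after reset, V_x[0]
    b_prev = 0    # comparator output after reset, b[0]
    count = 0     # digital counter
    for s_n, phi_n in zip(s_block, phi_block):
        count += b_prev                       # accumulates b[1] .. b[N-1]
        v_x += alpha * (s_n * phi_n - b_prev * v_ref)
        b_prev = 1 if v_x >= v_ref else 0     # ideal comparator (assumption)
    return count, v_x


N_BLOCK = 256                 # illustrative block length
V_REF = 1.0                   # illustrative reference voltage
rng = random.Random(1)
s_block = [0.99 * rng.random() for _ in range(N_BLOCK)]   # 0 <= s < V_ref
phi_block = [rng.randint(0, 1) for _ in range(N_BLOCK)]   # binary random sequence

count, v_x_final = sigma_delta_measurement(s_block, phi_block, v_ref=V_REF)
y_exact = sum(s * p for s, p in zip(s_block, phi_block))
print(count, y_exact, v_x_final)
```

With $\alpha = 1$, the counter output times $V_{ref}$ differs from the exact measurement only by the residue $V_x[N]$, which under the assumptions above stays below $2V_{ref}$. Setting every entry of `phi_block` to 1 and holding the input constant gives the behaviour of a regular incremental converter.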
The orthonormal basis $\{\psi_n\}$ was based on the discrete cosine transform (DCT) as follows:

$$\psi_n(t) = \begin{cases} \sqrt{\frac{1}{MN}} & \text{if } n = 1\\ \sqrt{\frac{2}{MN}} \cos\left(2\pi(n-1)t\right) & \text{if } n = 2, \dots, N \end{cases} \tag{7.10}$$

Once the sparse vector $\mathbf{a}$ and the matrix $\mathbf{\Psi}$ were built, the sparse input signal was generated using $\mathbf{s} = \mathbf{\Psi} \mathbf{a}$. The random-like measurement vectors $\vec{\varphi}_k$ were derived from a Bernoulli distribution in which the probabilities of a 1 and a 0 are equal. The compressive sensing measurements were computed as $\mathbf{y} = \mathbf{\Phi} \mathbf{s}$. The BPDN program in (7.5) was employed to recover the input signal. The BPDN program was solved using the $\ell_1$-MAGIC suite of convex programming tools [14]. The noise upper bound $\varepsilon$ in (7.5) was derived in [36]. For the reader's convenience, the noise upper bound is reproduced in (7.11):

$$\varepsilon = \frac{\Delta}{2(p+1)^{1/p}} \left( M + \kappa(p+1)\sqrt{M} \right)^{1/p} \tag{7.11}$$

where $p=2$ because the $\ell_2$ norm is employed in the constraint in (7.5), and $\kappa=2$. The average value for the quantization step width, $\Delta=V_{ref}/2$, was employed in the simulations.

Figure 78 shows the average and the standard deviation of the SNR of signals recovered from the quantized measurements generated by the $\Sigma\Delta$ compressive sensing converter as the number of measurements is increased, for $N=256$, $N=512$ and $N=1024$. The average SNR was computed over 500 reconstruction trials. In each trial a new random matrix $\Phi$ was generated. The standard deviation is shown using error bars.

Figure 78: Average and standard deviation of the signal-to-noise ratio of recovered input signals from quantized measurements: (a) $N=256$ (b) $N=512$ (c) $N=1024$.

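
The noise bound in (7.11) is straightforward to evaluate. The Python sketch below uses the values stated in the text ($p = 2$, $\kappa = 2$, $\Delta = V_{ref}/2$); the reference voltage of 1 V and the values of $M$ are illustrative assumptions, and the function name is hypothetical.

```python
import math


def bpdn_noise_bound(m, delta, p=2, kappa=2.0):
    """Equation (7.11): upper bound on the noise norm for M quantized measurements."""
    scale = delta / (2.0 * (p + 1) ** (1.0 / p))
    return scale * (m + kappa * (p + 1) * math.sqrt(m)) ** (1.0 / p)


V_REF = 1.0            # illustrative reference voltage (assumption)
DELTA = V_REF / 2.0    # average quantization step width, as stated in the text

for m in (64, 128, 256, 512):
    print(f"M = {m:4d}  epsilon = {bpdn_noise_bound(m, DELTA):.4f}")
```

For $p = 2$ the bound reduces to $\frac{\Delta}{2\sqrt{3}}\sqrt{M + 6\sqrt{M}}$, so it grows roughly with the square root of the number of measurements.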
The distortion introduced during signal reconstruction is a function of the quantization noise and other parameters such as signal sparsity, number of measurements and the length $N$ of the sparse vector $\mathbf{a}$. The following reconstruction distortion upper bound was derived and reported in [19, 32]:

$$\frac{1}{K} \|\mathbf{s} - \tilde{\mathbf{s}}\|_2^2 \le c_1 \frac{K}{M} (\log N) \sigma^2 \tag{7.12}$$

where $c_1$ is a constant and $\sigma^2$ is the variance of the measurement noise including quantization noise. Multiplying both sides of (7.12) by $K$ and letting $r = M/N$, so that $M = rN$, the distortion upper bound can be written as:

$$\|\mathbf{s} - \tilde{\mathbf{s}}\|_{2}^{2} \le c_{1} K^{2} (\log N) \frac{\sigma^{2}}{rN} \tag{7.13}$$

The upper bound shows the dependence of the recovery performance on the different parameters of the converter. From (7.13) we see that recovery performance can be improved by increasing $N$ or by increasing the number of measurements (increasing $r$). The simulation results presented in Figure 78 agree qualitatively with the theoretical distortion bound in (7.13). Figure 78 also shows that the SNR saturates as $M$ is increased. The saturation is due to the usage of the $\ell_2$ norm in the BPDN program in (7.5) [36]. For an in-depth treatment of the impact of quantization noise on compressive sensing recovery performance, the reader is referred to previous work on the subject [36, 41, 70].

Notice that if we make $\varphi_k[n] = 1$ for all $n$ and add a sample-and-hold circuit at the input of the compressive sensing converter, the resulting circuit is a regular incremental $\Sigma\Delta$ ADC. Thus, the proposed converter can be easily configured to work in Nyquist mode (incremental $\Sigma\Delta$) or in compressive sensing mode. The compressive sensing mode can be used when compression is needed, and the Nyquist mode can be used when low distortion rather than compression is the priority.

## 7.3 Circuit Design

A prototype was designed, fabricated and tested as a proof of concept.
The switched-capacitor circuit shown in Figure 79 was employed to implement the first-order $\Sigma\Delta$ modulator. The random demodulator was fabricated in a 0.5 $\mu$ m CMOS fabrication process. The switched-capacitor circuit uses a two-phase non-overlapping clock with phases $\phi_1$ and $\phi_2$ . The discrete-time integrator is implemented with capacitors $C_1$ and $C_2$ , amplifier $A_1$ and switches $S_1$ to $S_5$ . The 1-bit DAC is implemented using capacitor $C_3$ and switches $S_6$ and $S_7$ . The switches are implemented with CMOS transmission gates. Figure 79: Schematic diagram of the switched-capacitor implementation. Assuming an ideal amplifier and ideal switches are used, capacitor $C_1$ is charged to $C_1(s[n]\varphi_k[n])$ during phase $\phi_1$ and fully discharged during $\phi_2$ . Thus, a charge of $C_1(s[n]\varphi_k[n])$ is transferred into capacitor $C_2$ every clock cycle. During phase $\phi_1$ , capacitor $C_3$ is pre-charged to $C_3V_{ref}$ if the previous comparator output, b[n-1], is high. Otherwise, if b[n-1] is low, $C_3$ is not pre-charged and its charge remains zero. During phase $\phi_2$ the charge in $C_3$ is subtracted from $C_2$ . Considering that capacitor $C_2$ is discharged at the beginning of the conversion cycle (by closing switch $S_5$ ), the integrator's output voltage, at the end of the $\phi_2$ phase of the $N^{th}$ clock cycle, is given by: $$V_x[N] = \frac{1}{C_2} \left[ C_1 \sum_{n=1}^{N} s_k[n] \varphi_k[n] - C_3 V_{ref} \sum_{n=1}^{N-1} b[n] \right]$$ (7.14) where, $$b[n] = \begin{cases} \mathbf{1} & \text{if } V_x[n] \ge V_{ref} \\ \mathbf{0} & \text{if } V_x[n] < V_{ref} \end{cases}$$ (7.15) Reordering equation (7.14) yields: $$V_{ref} \sum_{n=1}^{N-1} b[n] = \frac{C_1}{C_3} \sum_{n=1}^{N} s_k[n] \varphi_k[n] - \frac{C_2}{C_3} V_x[N]$$ (7.16) If the capacitors' sizes are chosen such that $C_1 = C_3 = \alpha C_2$ , then the switched-capacitor circuit in Figure 79 implements equation (7.8). 
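The charge-balance relation in (7.16) can be checked with a short behavioral model. The sketch below assumes an ideal amplifier and ideal switches, as in the derivation above, and uses $C_1 = C_2 = C_3 = 2$ pF and $V_{ref} = 2.5$ V (values quoted elsewhere in this chapter). The function name `incremental_sigma_delta`, the uniformly distributed test input and the 0/1 random sequence are assumptions made for illustration only.

```python
import random


def incremental_sigma_delta(s, phi, v_ref=2.5, c1=2e-12, c2=2e-12, c3=2e-12):
    """Ideal model of the switched-capacitor loop; returns (counter, V_x[N])."""
    v_x = 0.0    # C2 is discharged at the start of the conversion (switch S5)
    b_prev = 0   # b[0] = 0: no DAC charge is subtracted in the first cycle
    counter = 0  # accumulates b[1] + ... + b[N-1], as in (7.14)
    for s_n, phi_n in zip(s, phi):
        counter += b_prev
        # Charge C1*s*phi is added to C2; C3*V_ref is removed if b[n-1] was high.
        v_x += (c1 * s_n * phi_n - c3 * v_ref * b_prev) / c2
        b_prev = 1 if v_x >= v_ref else 0
    return counter, v_x


rng = random.Random(0)
N, V_REF = 1024, 2.5
s = [rng.uniform(0.0, V_REF) for _ in range(N)]  # assumed test input, in volts
phi = [rng.randint(0, 1) for _ in range(N)]      # random 0/1 sequence

counter, v_x_final = incremental_sigma_delta(s, phi, v_ref=V_REF)
measurement = sum(s_n * phi_n for s_n, phi_n in zip(s, phi))

# With C1 = C2 = C3, (7.16) reads: V_ref * counter = sum(s*phi) - V_x[N].
residual = V_REF * counter - (measurement - v_x_final)
print(counter, v_x_final, residual)
```

In this model the residual is zero up to floating-point rounding, and the final integrator voltage $V_x[N]$ plays the role of the quantization error of the measurement.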
In practice, circuit non-idealities such as offset voltage, charge injection and capacitor mismatch introduce noise into the compressive sensing measurements, affecting the performance of the converter.

## 7.3.1 Circuit Non-Idealities

In this section we study the impact of circuit non-idealities on the performance of the compressive sensing converter. The circuit non-idealities that are considered are the amplifier's and comparator's offsets, charge injection, capacitor mismatch and comparator metastability. Analytical expressions of the errors introduced by the circuit non-idealities are derived and their impact on the signal recovery performance is assessed through numerical simulations.

#### 7.3.1.1 Offset Voltage

We consider an input-referred offset voltage of $V_{off1}$ for amplifier $A_1$ and of $V_{off2}$ for the comparator $A_2$. Due to $A_1$'s offset voltage, the charge transferred from $C_1$ to $C_2$ during $\phi_2$ becomes $C_1(s_k[n]\varphi_k[n]) + C_1V_{off1}$, yielding at the end of conversion: $$V_{ref} \sum_{n=1}^{N-1} b[n] = \frac{C_1}{C_3} \sum_{n=1}^{N} s_k[n] \varphi_k[n] - \frac{C_2}{C_3} V_x[N] + \underbrace{\frac{C_2}{C_3} V_{off1} + N \frac{C_1}{C_3} V_{off1} + V_{off1} \sum_{n=1}^{N-1} b[n]}_{E_{off1}}. \tag{7.17}$$ Thus, the error in the converter's output due to $V_{off1}$ is $E_{off1}$. If $V_{off1}$ is known and constant, the term $V_{off1} \sum_{n=1}^{N-1} b[n]$ can be readily compensated for since the counter's output $\sum_{n=1}^{N-1} b[n]$ is known. The other terms in $E_{off1}$ are affected by capacitor mismatch, whose impact is considered below. The comparator's output, b[n], including the comparator's offset voltage $V_{off2}$, is given by: $$b[n] = \begin{cases} 1 & \text{if } V_x[n] \ge V_{ref} + V_{off2} \\ 0 & \text{if } V_x[n] < V_{ref} + V_{off2} \end{cases} \tag{7.18}$$ The effect of the comparator's offset voltage is a change in the maximum level of $V_x[N]$ from $V_{ref}$ to $V_{ref} + V_{off2}$.
This change increases the quantization noise by an amount of $\frac{C_2}{C_3}V_{off2}$. Typically, $V_{off2} \ll V_{ref}$. Hence, the impact of the comparator's offset on the converter's output is minimal.

## 7.3.1.2 Charge Injection

Charge injection from switches modifies the charge accumulated in $C_2$. We consider the channel charge released when switches are turned off as the main source of charge injection-induced error. The charge injected by $S_1$ is signal dependent, making it difficult to compensate. However, if $S_3$ is turned off slightly before $S_1$, the injection of signal-dependent charge can be avoided. At the end of phase $\phi_1$, $S_3$ injects charge into $C_1$ while $S_7$ injects charge into $C_3$. At the end of $\phi_2$, $S_4$ and $S_6$ inject charge into $C_2$. The output of the converter, including the error introduced by charge injection and the amplifier's offset, is given by: $$\begin{aligned} V_{ref} \sum_{n=1}^{N-1} b[n] ={}& \frac{C_1}{C_3} \sum_{n=1}^{N} s_k[n] \varphi_k[n] - \frac{C_2}{C_3} V_x[N] + \frac{C_2}{C_3} V_{off1} + N \frac{C_1}{C_3} V_{off1} + V_{off1} \sum_{n=1}^{N-1} b[n] \\ &+ \underbrace{\frac{N}{2C_3} (Q_{CH3} + Q_{CH4}) + \frac{Q_{CH6} + Q_{CH7}}{2C_3} \sum_{n=1}^{N-1} b[n]}_{E_{inj}} \end{aligned} \tag{7.19}$$ where $Q_{CH3}$, $Q_{CH4}$, $Q_{CH6}$ and $Q_{CH7}$ are the channel charges from switches $S_3$, $S_4$, $S_6$ and $S_7$, respectively. $E_{inj}$ is the error introduced due to charge injection. Since the charge injection error is signal independent, it can be corrected after the conversion has finished. Circuit design techniques such as switch bootstrapping can also be employed to make charge injection signal independent.

## 7.3.1.3 Capacitor Mismatch

The capacitor mismatch is modeled in the following manner: $$\frac{C_1}{C_3} = 1 + e_1, \qquad \frac{C_2}{C_3} = \frac{1}{\alpha} + e_2 \tag{7.20}$$ where $e_1$ and $e_2$ represent the capacitor mismatch errors.
Replacing (7.20) in (7.19), reordering terms and considering $\alpha = 1$ yields: $$\begin{aligned} V_{ref} \sum_{n=1}^{N-1} b[n] ={}& (1+e_1) \sum_{n=1}^{N} s_k[n] \varphi_k[n] - (1+e_2) V_x[N] \\ &+ (e_2 + Ne_1) V_{off1} + (N+1) V_{off1} + V_{off1} \sum_{n=1}^{N-1} b[n] \\ &+ \frac{N}{2C_3} (Q_{CH3} + Q_{CH4}) + \frac{Q_{CH6} + Q_{CH7}}{2C_3} \sum_{n=1}^{N-1} b[n]. \end{aligned} \tag{7.21}$$ Notably, capacitor mismatch results in a scaling of the compressive sensing measurements by a factor of $(1 + e_1)$. The term $(N + 1)V_{off1} + V_{off1} \sum_{n=1}^{N-1} b[n]$ is constant and can be readily compensated if $V_{off1}$ is known. The term $(e_2 + Ne_1)V_{off1}$, although constant, is not easy to correct due to the uncertainties in $e_1$ and $e_2$. For large-area integrated capacitors designed with proper layout techniques, the matching accuracy can be as good as 0.1%.

The model in (7.21) was employed in numerical simulations to assess the impact of circuit non-idealities on the performance of signal recovery. In the simulations, $e_1$ and $e_2$ are considered to be random variables with a mean value of 0.5% and a standard deviation of 0.25%. The offset voltage is modeled with a normal random variable with a mean value of 20 mV and a standard deviation of 10 mV. Charge injection is modeled as: $$Q_{CH} = WLC_{ox}(V_{GS} - V_{TH}) \tag{7.22}$$ where $W=1.5~\mu \mathrm{m}$ and $L=0.6~\mu \mathrm{m}$ are the width and length of the switch transistors, $C_{ox}=2.5 \times 10^{-3}~\mathrm{F/m^2}$, $V_{GS}$ is the gate-to-source voltage, $V_{TH}=0.7~\mathrm{V}$ is the transistor threshold voltage and $C_2=C_1=C_3=2~\mathrm{pF}$.

Figure 80 shows the average SNR of the recovered signal from quantized measurements that are also affected by circuit non-idealities. Notably, circuit non-idealities have a significant impact on the quality of the recovered signal. Among the circuit non-idealities considered, offset voltage has the most impact on SNR.
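To get a feel for the relative size of the additive error terms in (7.21), they can be evaluated at the nominal parameter values quoted above. The sketch below is an illustration under stated assumptions and is not the simulation behind Figure 80: the gate drive $V_{GS} = 3.3$ V, the counter output $\sum b[n] = N/2$, and the use of the same channel charge for all four switches are assumptions, and the function names are hypothetical.

```python
# Nominal values quoted in the text.
WIDTH, LENGTH = 1.5e-6, 0.6e-6  # switch transistor dimensions, metres
C_OX = 2.5e-3                   # gate oxide capacitance, F/m^2
V_TH = 0.7                      # threshold voltage, volts
C3 = 2e-12                      # DAC capacitor, farads
N = 1024                        # clock cycles per conversion
V_OFF1 = 20e-3                  # mean amplifier offset, volts
E1 = E2 = 0.005                 # mean capacitor mismatch (0.5%)

# Assumptions that are NOT given in the text.
V_GS = 3.3                      # assumed gate-to-source voltage, volts
ONES = N // 2                   # assumed counter output, sum of b[n]


def channel_charge(width, length, c_ox, v_gs, v_th):
    """Channel charge of one switch, following (7.22)."""
    return width * length * c_ox * (v_gs - v_th)


def error_terms(n, ones, v_off1, e1, e2, q_ch, c3):
    """Additive error terms of (7.21), assuming equal charge for all switches."""
    known_offset = (n + 1) * v_off1 + v_off1 * ones
    mismatch_offset = (e2 + n * e1) * v_off1
    injection = n / (2 * c3) * (2 * q_ch) + (2 * q_ch) / (2 * c3) * ones
    return known_offset, mismatch_offset, injection


q_ch = channel_charge(WIDTH, LENGTH, C_OX, V_GS, V_TH)
known_offset, mismatch_offset, injection = error_terms(
    N, ONES, V_OFF1, E1, E2, q_ch, C3)
print(q_ch, known_offset, mismatch_offset, injection)
```

Under these assumptions the known offset term is the largest of the three, followed by the charge injection term and then by the mismatch-dependent offset term, which is consistent with the observation that offset voltage has the most impact on SNR.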
Figure 81 shows the average SNR of the recovered signal from measurements that have been corrected for $V_{off1}$ but that still suffer from charge injection and capacitor mismatch. As observed in Figures 80 and 81, offset correction can significantly improve the performance of the random demodulator converter.

Figure 80: Average and standard deviation of signal-to-noise ratio of recovered signals from noisy and quantized measurements for N=1024.

Figure 81: Average and standard deviation of signal-to-noise ratio of signals recovered from offset-corrected measurements for N=1024.

In a regular $\Sigma\Delta$ converter, offset error does not cause non-linear distortion and can be readily compensated. However, in the random demodulator converter, offset error has a very different effect. Consider, for instance, the case in which a fixed but unknown offset, $y_{off}$, is added to each compressive sensing measurement. The resulting measurement vector can be expressed as $\mathbf{z} = \mathbf{y} + y_{off} \mathbf{1}$, where $\mathbf{1}$ is a column vector with all its elements equal to 1. Recalling that $\mathbf{y} = \Phi \mathbf{s}$, we can write $\mathbf{z} = \Phi \mathbf{s} + y_{off} \mathbf{1}$. Letting $\mathbf{n}$ be equal to the diagonal components of $y_{off}\Phi^{-1}$, where $\Phi^{-1}$ is the pseudo-inverse of $\Phi$, results in $\mathbf{z} = \Phi(\mathbf{s} + \mathbf{n})$. Given that the matrix $\Phi^{-1}$ has a random-like structure, the effect of adding an offset to compressive sensing measurements is equivalent to adding the noise component, $\mathbf{n}$, to the input signal, resulting in additional distortion in the recovery process.

Measuring the offset voltage $V_{off1}$ directly might not be feasible for most applications. Fortunately, $V_{off1}$ can be estimated using the random demodulator converter itself by connecting the input of the converter to 0 V.
Ignoring charge injection and capacitor mismatch, making $s_k[n] = 0$ and solving for $V_{off1}$ in (7.21) yields: $$V_{off1} = \frac{V_{ref} \sum_{n=1}^{N-1} b[n]}{1 + N + \sum_{n=1}^{N-1} b[n]}$$ (7.23) where, $\sum_{n=1}^{N-1} b[n]$ is the output of the counter after N clock cycles and we have assumed that $V_x[N] = 0$ . Offset correction is achieved by subtracting the value $(1+N)V_{off1} + V_{off1} \sum_{n=1}^{N-1} b[n]$ from each measurement. ## 7.3.1.4 Comparator Metastability Metastability occurs when the comparator's differential input is not large enough for the comparator to produce a valid logic output. In the proposed converter, metastability occurs when $|V_x - V_{ref}| < \Delta V$ , where $\Delta V$ is the minimum differential input needed to generate a valid logic output within one clock cycle. Fig. 82 depicts the integrator output waveform, $V_x$ , and the metastability region. Figure 82: Typical waveform of the integrator output $V_x$ and metastability region. Assuming a simple single-pole model for the comparator, it can be shown that the transient behavior of comparator's output voltage is given by [3]: $$V_{out}(t) = A_0 V_{in} (1 - e^{-2\pi f_c t})$$ (7.24) where, $A_0$ is the DC gain, $V_{in}$ is the differential input voltage and $f_c$ is the -3 dB frequency. From (7.24) $\Delta V$ can be calculated to be: $$\Delta V = \frac{V_{logic}}{A_0 (1 - e^{-2\pi f_c T_{clk}/2})}$$ (7.25) where, $V_{logic}$ is the valid logic level voltage and $T_{clk}$ is the clock period. 
Assuming a uniform distribution for $V_x$ the probability of $V_x$ being in the metastability region can be approximated by: $$P_E = \frac{2\Delta V}{V_{ref}} \tag{7.26}$$ Combining (7.25) and (7.26) yields the following expression for metastability probability error: $$P_E = \frac{2V_{logic}}{V_{ref}A_0(1 - e^{-2\pi f_c T_{clk}/2})}$$ (7.27) Replacing circuit parameters: $V_{ref}=2.5$ V, $A_0=70$ dB, $f_c=415.8$ kHz, $V_{logic}=V_{DD}/2=1.5$ V and $T_{clk}=1$ $\mu s$ in (7.27) we find that the probability of a metastability-induced comparator error is $2.6\times 10^{-4}$ . When N=1024, the probability of having a single comparator error in one conversion cycle is $N\times P_E=0.266$ . Due to the integrating nature of the converter, a single comparator error produces an error of 1 LSB in the digital output. Thus, the error due to metastability is less than 1 LSB and this error does not generally translate into significant distortion in the recovered signal. #### **7.4** Measurements Results A custom test setup was built to test the fabricated random demodulator converter prototype. The test setup is shown in Figure 83. It consists of a printed circuit board (PCB) designed to house the fabricated chip and supporting circuitry such as voltage regulators and biasing sources. Separate power supplies for analog and digital parts were used as well as decoupling capacitors to minimize noise. A CPLD was employed to implement the digital portion of the random demodulator converter, generate timing signals, program the bias sources and to transmit the compressive sensing measurements via a serial port to a computer for later reconstruction. The converter's clock frequency was set to 1 MHz which in turn sets the input bandwidth of the converter to $BW = 500 \, \mathrm{kHz}$ . Figure 83: Diagram of the custom test setup. An initial test of the fabricated random demodulator converter was performed using sparse signals of different levels of sparsity. 
The sparse signals were generated using the same procedure employed in the simulations and described in Section 7.3. A function generator was programmed to play back the generated sparse signals so that they could be applied to the fabricated converter. The measurements were collected by a computer using a serial port and custom data acquisition software. The amplifier's offset was estimated to be 17.4 mV using the method outlined in the previous section. The collected measurements were corrected for offset before signal recovery. The measured power consumption of the $\Sigma\Delta$ modulator is 124 $\mu$W. The total power consumption of the CPLD is 7.5 mW.

Figure 84: Measured SNDR of recovered signals as the ratio M/N is varied.

A 10-bit digital counter was employed, allowing N to take values of up to 1024. The random sequences were generated using a 32-bit linear feedback shift register (LFSR). Figure 84 shows the measured signal-to-noise-plus-distortion ratio (SNDR) of the recovered signals for N=1024 as the ratio M/N is varied. A maximum SNDR value of 50 dB is achieved for a single tone (K=1) at M/N=0.7. Although the measurements were corrected for offset, the converter's performance is lower than the performance predicted by the plot in Figure 81, suggesting that the converter is affected by other circuit non-idealities such as charge injection, capacitor mismatch and noise from the test setup.

The bandwidth of the random demodulator converter is enough to directly acquire radio signals in the Low Frequency (LF) and Medium Frequency (MF) bands. Hence, a second test was conducted to acquire signals in the LF and MF bands. Figure 85 shows the setup that was employed. The setup consists of a long-wave active loop antenna with built-in pre-amp (Kaito KA35), a low-noise variable gain amplifier (VGA) with programmable filtering stage (Analog Devices ADRF6510), a low-pass anti-alias filter and the fabricated random demodulator converter.
The ADRF6510 was set to provide a total gain of 57 dB. The radio signal was also acquired with a data acquisition unit to provide a reference.

Figure 85: Diagram of setup employed to acquire a radio signal.

Figure 86 shows the spectrum of the signals acquired with the data acquisition unit and with the random demodulator converter. The measurements generated by the random demodulator converter were corrected for offset and processed by the optimization program in (7.5) to recover the input signal. A total of M=1000 measurements were used in the recovery process. Notably, the random demodulator converter is able to identify the main frequency components of the signal. The SNR of the recovered signal is 12.5 dB. The SNR is affected by the additional noise introduced by the radio front end and by the input signal not being exactly sparse. The corresponding compression ratio is 4.

Figure 86: Spectrum of acquired signals: (a) using a data acquisition unit, (b) recovered from compressive sensing measurements.

#### CHAPTER 8

#### **CONCLUSION**

This dissertation addressed the integration of energy harvesting, image compression, and an image sensor on the same chip, which provides the energy source to charge a battery, reduces the data rate, and improves the performance of wireless image sensors. Integrated circuits for image compression, solar energy harvesting, and image sensing were studied, designed, and analyzed in this dissertation. Each individual topic is summarized below:

- The design and test of a CMOS hybrid imager capable of sensing and energy harvesting has been presented. The hybrid imager contains pixels whose photodiodes can be configured to work as photodiodes or tiny solar cells. As a proof of concept, a $32 \times 32$ array of hybrid pixels has been designed and fabricated in a standard 0.5 $\mu$m CMOS process.
- DC-DC converters with either a flying capacitor or an inductor, employed to maximize the amount of harvestable energy, were also presented. Test measurements show that the imager can perform the functions of image acquisition and energy harvesting. Measurement results show that up to 6.5 $\mu$W of power can be harvested from the array under indoor illumination conditions.
- Sub-sampling and compressive sensing are implemented with the hybrid pixel imager. Different balances of random patterns have been implemented to maximize the flexibility of the hybrid pixel imager. The algorithms to recover missing pixels have been presented. Up to 90% of 0s can be implemented in random patterns and the reconstructed images still have acceptable image quality.
- A multiresolution decomposition image compression algorithm suitable for focal plane integration and its hardware implementation have been presented. This compression approach can provide lossless or near-lossless compression and reduce the data rate. A prototype has been implemented in a 0.5 $\mu$m CMOS process [75].
- An analog-to-digital converter suitable for compressive sensing applications has been presented. The converter is based on an incremental sigma-delta converter architecture and is able to directly acquire compressive sensing measurements and convert them to digital format. The converter occupies a small area of 0.047 mm<sup>2</sup> in the target 0.5 $\mu$m CMOS process. Thus, several of them can be implemented in parallel to achieve high conversion rates [74].

Future development of this CMOS hybrid imager has a number of possibilities. First, radio transmission is one of the most fundamental functions of wireless sensors. An energy-efficient radio transmitter would be a good addition to the hybrid imager. Second, the 1-bit memory of the hybrid pixel can be improved. The problem of the 1-bit memory of the hybrid pixel was discussed in Chapter 2. In practice, a $V_{DD}$ of 3.9 V is used in order for the 1-bit memory to work. The 1-bit memory is implemented with two inverters in this dissertation.
A 6-transistor CMOS SRAM cell should be a good option to replace the 1-bit memory in the hybrid pixel to avoid using a higher $V_{DD}$. Finally, a SOC wireless image sensor with a low power radio transmitter, hybrid imager, and power management circuit can be designed to complete the solution for energy-constrained sensors.

## APPENDIX A

## **CHIP PIN-OUT**

## A.1 The Hybrid Imager Chip

Figure 87: Block diagram of the hybrid imager chip. Schematics for the $\Sigma\Delta$ AIC v.2 and the hybrid pixel design I are shown in Appendix B.

Figure 88: Pin names of the hybrid imager chip.

Figure 89: Bonding diagram of the hybrid imager chip. Chip number: V19k-AJ

Figure 90: Top view of the hybrid imager chip package.

Table 16: Pin Configuration of the Hybrid Imager Chip

| Pin name | Analog/Digital | Input/Output | Description | Pin number |
|----------|----------------|--------------|-------------|------------|
| Swp2 S | Analog | Inout | PMOS 2 | L11 |
| Swp2 D | Analog | Inout | PMOS 2 | L12 |
| Swp2 G | Digital | In | PMOS 2 | K11 |
| Swp3 S | Analog | Inout | PMOS 3 | K12 |
| Swp3 D | Analog | Inout | PMOS 3 | J10 |
| Swp3 G | Digital | In | PMOS 3 | J11 |
| Swn1 S | Analog | Inout | NMOS 1 | J12 |
| Swn1 D | Analog | Inout | NMOS 1 | H10 |
| Swn1 G | Digital | In | NMOS 1 | H11 |
| Swn2 S | Analog | Inout | NMOS 2 | H12 |
| Swn2 D | Analog | Inout | NMOS 2 | G10 |
| Swn2 G | Digital | In | NMOS 2 | G11 |
| Swn3 S | Analog | Inout | NMOS 3 | G12 |
| Swn3 D | Analog | Inout | NMOS 3 | F10 |
| Swn3 G | Digital | In | NMOS 3 | F11 |
| Rst0 | Digital | In | Reset bit 0 | F12 |
| Rst1 | Digital | In | Reset bit 1 | E10 |
| Rst2 | Digital | In | Reset bit 2 | E11 |
| Rst3 | Digital | In | Reset bit 3 | E12 |
| Rst4 | Digital | In | Reset bit 4 | D10 |
| Col4 | Digital | In | Col bit 4 | D11 |
| Col3 | Digital | In | Col bit 3 | D12 |
| Col2 | Digital | In | Col bit 2 | C10 |
| Col1 | Digital | In | Col bit 1 | C11 |
| Col0 | Digital | In | Col bit 0 | C12 |
| Vdd | Analog | In | 3.3 V | B11 |
| Vbsf | Analog | In | 2.5 V | A11 |
| Mode0 | Digital | In | Mode bit 0 | B10 |
| Mode1 | Digital | In | Mode bit 1 | A10 |
| Mode2 | Digital | In | Mode bit 2 | C9 |
| Mode3 | Digital | In | Mode bit 3 | B9 |
| Mode4 | Digital | In | Mode bit 4 | A9 |
| Row4 | Digital | In | Row bit 4 | C8 |
| Row3 | Digital | In | Row bit 3 | B8 |
| Row2 | Digital | In | Row bit 2 | A8 |
| Row1 | Digital | In | Row bit 1 | C7 |
| Row0 | Digital | In | Row bit 0 | B7 |
| EH bus I1 - | Analog | Out | EH bus for Design I | A7 |
| EH bus I1 + | Analog | Out | EH bus for Design I | C6 |
| EH bus I2 + | Analog | Out | EH bus for Design I | B6 |
| EH bus I2 - | Analog | Out | EH bus for Design I | A6 |
| EH bus I3 - | Analog | Out | EH bus for Design I | C5 |
| EH bus I3 + | Analog | Out | EH bus for Design I | B5 |
| pixel out I | Analog | Out | Sensing bus for Design I | A5 |
| Vbias | Analog | In | 0.9 V | C4 |
| pixel out II | Analog | Out | Sensing bus for Design II | B4 |
| I s0 | Digital | In | Mux sel bit 0 for Design I | A4 |
| I s1 | Digital | In | Mux sel bit 1 for Design I | C3 |
| I s2 | Digital | In | Mux sel bit 2 for Design I | B3 |
| I s3 | Digital | In | Mux sel bit 3 for Design I | A3 |
| II s0 | Digital | In | Mux sel bit 0 for Design II | B2 |
| II s1 | Digital | In | Mux sel bit 1 for Design II | B1 |
| II s2 | Digital | In | Mux sel bit 2 for Design II | C2 |
| II s3 | Digital | In | Mux sel bit 3 for Design II | C1 |
| Sdadc in | Analog | In | $\Sigma\Delta$ AIC V.2 | D3 |
| sw1 | Digital | In | $\Sigma\Delta$ AIC V.2 | D2 |
| sw2 | Digital | In | $\Sigma\Delta$ AIC V.2 | D1 |
| sw6 | Digital | In | $\Sigma\Delta$ AIC V.2 | E3 |
| sw5 | Digital | In | $\Sigma\Delta$ AIC V.2 | E2 |
| sw4 | Digital | In | $\Sigma\Delta$ AIC V.2 | E1 |
| sw3 | Digital | In | $\Sigma\Delta$ AIC V.2 | F3 |
| sw7 | Digital | In | $\Sigma\Delta$ AIC V.2 | F2 |
| Vb | Analog | In | 1 V | F1 |
| sw8 | Digital | In | $\Sigma\Delta$ AIC V.2 | G3 |
| sw9 | Digital | In | $\Sigma\Delta$ AIC V.2 | G2 |
| sw10 | Digital | In | $\Sigma\Delta$ AIC V.2 | G1 |
| sw11 | Digital | In | $\Sigma\Delta$ AIC V.2 | H3 |
| Vs2 | Analog | In | 1 V | H2 |
| sw12 | Digital | In | $\Sigma\Delta$ AIC V.2 | H1 |
| Vs1 | Analog | In | 2 V | J3 |
| sw13 | Digital | In | $\Sigma\Delta$ AIC V.2 | J2 |
| Sdadc Vbias | Analog | In | 0.9 V | J1 |
| Sdadc i | Analog | Out | Sigma Delta Modulator | K3 |
| Vth1 | Digital | In | 2 V | K2 |
| Vdd | Analog | In | $\Sigma\Delta$ AIC V.2 | K1 |
| GND | Analog | In | $\Sigma\Delta$ AIC V.2 | L2 |
| Sdadc b+ | Digital | Out | $\Sigma\Delta$ AIC V.2 | M2 |
| Vth2 | Analog | In | 0.5 V | L3 |
| Sdadc b- | Digital | Out | $\Sigma\Delta$ AIC V.2 | M3 |
| EH bus II3 + | Analog | Out | EH bus for Design II | K4 |
| EH bus II3 - | Analog | Out | EH bus for Design II | L4 |
| EH bus II2 - | Analog | Out | EH bus for Design II | M4 |
| EH bus II2 + | Analog | Out | EH bus for Design II | K5 |
| EH bus II1 + | Analog | Out | EH bus for Design II | L5 |
| EH bus II1 - | Analog | Out | EH bus for Design II | M5 |
| SwTG1 S | Analog | Inout | TG1 | K6 |
| SwTG1 D | Analog | Inout | TG1 | L6 |
| SwTG1 G | Digital | In | TG1 | M6 |
| SwTG2 S | Analog | Inout | TG2 | K7 |
| SwTG2 D | Analog | Inout | TG2 | L7 |
| SwTG2 G | Digital | In | TG2 | M7 |
| SwTG3 S | Analog | Inout | TG3 | K8 |
| SwTG3 D | Analog | Inout | TG3 | L8 |
| SwTG3 G | Digital | In | TG3 | M8 |
| SwTG4 S | Analog | Inout | TG4 | K9 |
| SwTG4 D | Analog | Inout | TG4 | L9 |
| SwTG4 G | Digital | In | TG4 | M9 |
| Swp1 S | Analog | Inout | PMOS 1 | K10 |
| Swp1 D | Analog | Inout | PMOS 1 | L10 |
| Swp1 G | Digital | In | PMOS 1 | M10 |

## A.2 The Multiresolution Decomposition Imager Chip

Figure 91: Block diagram of the multiresolution decomposition imager chip. The schematic for the operational amplifier is shown in Appendix B.
Figure 92: Pin names of the multiresolution decomposition imager chip.

Figure 93: Bonding diagram of the multiresolution decomposition imager chip. Chip number: V0BL-AE

Figure 94: Top view of the multiresolution decomposition imager chip package.

Table 17: Pin Configuration of the Multiresolution Decomposition Imager Chip

| Pin name | Analog/Digital | Input/Output | Description | Pin number |
|----------|----------------|--------------|-------------|------------|
| Col0 | Digital | In | Col bit 0 | C2 |
| Col1 | Digital | In | Col bit 1 | B1 |
| Col2 | Digital | In | Col bit 2 | C1 |
| Col3 | Digital | In | Col bit 3 | D2 |
| Col4 | Digital | In | Col bit 4 | D1 |
| Recol4 | Digital | In | Reset Col bit 4 | F2 |
| Recol3 | Digital | In | Reset Col bit 3 | G1 |
| Recol2 | Digital | In | Reset Col bit 2 | G2 |
| Recol1 | Digital | In | Reset Col bit 1 | H1 |
| Recol0 | Digital | In | Reset Col bit 0 | H2 |
| Row0 | Digital | In | Row bit 0 | J1 |
| Row1 | Digital | In | Row bit 1 | J2 |
| Row2 | Digital | In | Row bit 2 | K1 |
| Row3 | Digital | In | Row bit 3 | J3 |
| Row4 | Digital | In | Row bit 4 | K2 |
| Rerow4 | Digital | In | Reset row bit 4 | K3 |
| Rerow3 | Digital | In | Reset row bit 3 | J4 |
| Rerow2 | Digital | In | Reset row bit 2 | K4 |
| Rerow1 | Digital | In | Reset row bit 1 | K5 |
| Vdd | Analog | In | 5 V | J5 |
| Rerow0 | Digital | In | Reset row bit 0 | K6 |
| Vbsf | Analog | In | 0.9 V | J6 |
| CDS cdsout | Analog | Out | CDS | K7 |
| CDS Vref | Analog | In | 4 V | J7 |
| CDS Vbn | Analog | In | 1.32 V | K8 |
| CDS Vbp1 | Analog | In | 3.49 V | J8 |
| CDS Vbp2 | Analog | In | 3.95 V | K9 |
| CDS f3 | Digital | In | CDS Phase1 + Phase2 | J9 |
| CDS f1 | Digital | In | CDS Phase1 | K10 |
| CDS f2 | Digital | In | CDS Phase2 | H9 |
| pixbus | Analog | Out | CDS input from pixbus | J10 |
| f2 | Digital | In | CDS Phase2 | G10 |
| f1 | Digital | In | CDS Phase1 | F10 |
| f3 | Digital | In | CDS Phase1 + Phase2 | F9 |
| Vbp2 | Analog | In | 3.95 V | E10 |
| Vbp1 | Analog | In | 3.49 V | E9 |
| Vbn | Analog | In | 1.32 V | D10 |
| Vref | Analog | In | 4 V | D9 |
| CDSout | Analog | Out | CDS | C10 |
| Vopb | Analog | In | 0.9 V | C9 |
| Vrefi | Analog | In | 4 V | B10 |
| In | Analog | In | Integrator | B9 |
| fi1 | Digital | In | Integrator Phase 1 | A10 |
| fi2 | Digital | In | Integrator Phase 2 | B8 |
| fr | Digital | In | Integrator Reset | A9 |
| Vout | Analog | Out | Integrator | A8 |
| Vref1 | Analog | In | 2.5 V | B7 |
| CDS in | Analog | In | Memory | A7 |
| memp | Digital | In | Memory program Enable | A6 |
| s0 | Digital | In | Memory bit sel | B6 |
| GND | Analog | In | GND | A5 |
| s1 | Digital | In | Memory bit sel | B5 |
| fm2 | Digital | In | Memory phase2 | A4 |
| fm1 | Digital | In | Memory phase1 | B4 |
| Voutm | Analog | Out | Memory | A3 |
| Opamp Vbias | Analog | In | 0.9 V | B3 |
| Opamp in- | Analog | In | Opamp | A2 |
| Opamp in+ | Analog | In | Opamp | B2 |
| Opamp out | Analog | Out | Opamp | A1 |

## A.3 The $\Sigma\Delta$ AIC Chip

Figure 95: Block diagram of the $\Sigma\Delta$ AIC chip. The schematic for the $\Sigma\Delta$ AIC v.1 is shown in Appendix B.

Figure 96: Pin names of the $\Sigma\Delta$ AIC chip.

Figure 97: Bonding diagram of the $\Sigma\Delta$ AIC chip. Chip number: V19K-AK

Figure 98: Top view of the $\Sigma\Delta$ AIC chip package.
Table 18: Pin Configuration of the $\Sigma\Delta$ AIC Chip

| Pin name | Analog/Digital | Input/Output | Description | Pin number |
|----------|----------------|--------------|-------------|------------|
| Vdd | Analog | In | 3.3 V | A10 |
| GND | Analog | In | 0 V | M3 |
| II i | Analog | Out | Module 2 | H2 |
| II phi | Digital | In | Module 2 | H3 |
| II in | Analog | In | Module 2 | G1 |
| II fs2 | Digital | In | Module 2 | G2 |
| II fs1 | Digital | In | Module 2 | G3 |
| II b | Digital | Out | Module 2 | F1 |
| I b | Digital | Out | Module 1 | F2 |
| Vbias | Analog | In | 0.9 V | F3 |
| frst | Digital | In | Reset | E1 |
| Vc | Analog | In | 2 V | E2 |
| I fs1 | Digital | In | Module 1 | E3 |
| I fs2 | Digital | In | Module 1 | D1 |
| Vb | Analog | In | 1 V | D2 |
| f2 | Digital | In | Phase 1 | D3 |
| f1 | Digital | In | Phase 2 | C1 |
| I in | Analog | In | Module 1 | C2 |
| I phi | Digital | In | Module 1 | B1 |
| I i | Digital | Out | Module 1 | B2 |

## APPENDIX B

## **SCHEMATICS**

Figure 99: Schematic of the hybrid pixel with the sizes of the transistors.

Figure 100: Schematic of the op-amp with the sizes of the transistors.

Figure 101: Schematic of the $\Sigma\Delta$ AIC, version 1.

Figure 102: Schematic of the $\Sigma\Delta$ AIC, version 2.

## APPENDIX C

## PICTURES OF CUSTOM PCB

Figure 103: Picture of custom PCB for the multiresolution decomposition imager with description.

Figure 104: Picture of custom PCB for the hybrid imager with description.

Figure 105: Picture of custom PCB for the $\Sigma\Delta$ AIC with description.

#### REFERENCE LIST

- [1] Adeniran, O., and Demosthenous, A. An ultra-energy-efficient wide-bandwidth video pipeline ADC using optimized architectural partitioning. *Circuits and Systems I: Regular Papers, IEEE Transactions on 53*, 12 (2006), 2485–2497.
- [2] Aizawa, K., Egi, Y., Hamamoto, T., Hatori, M., Abe, M., Maruyama, H., and Otake, H. Computational image sensor for on sensor compression.
*Electron Devices, IEEE Transactions on 44*, 10 (1997), 1724–1730.
- [3] Allen, P. E., and Holberg, D. R. CMOS Analog Circuit Design; 2nd ed. Oxford Univ. Press, Oxford, 2002.
- [4] Annema, A. Analog circuit performance and process scaling. Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on 46, 6 (1999), 711–725.
- [5] Annema, A., Nauta, B., van Langevelde, R., and Tuinhout, H. Analog circuits in ultra-deep-submicron CMOS. Solid-State Circuits, IEEE Journal of 40, 1 (2005), 132–143.
- [6] Arima, Y., and Ehara, M. On-chip solar battery structure for CMOS LSI. IEICE Electronics Express 3, 13 (2006), 287–291.
- [7] Ay, S. A 1.32pW/frame pixel 1.2V CMOS energy-harvesting and imaging (EHI) APS imager. In Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International (Feb. 2011), pp. 116–118.
- [8] Bandyopadhyay, A., Lee, J., Robucci, R., and Hasler, P. MATIA: a programmable 80 $\mu$W/frame CMOS block matrix transform imager architecture. Solid-State Circuits, IEEE Journal of 41, 3 (March 2006), 663–672.
- [9] Barnhart, D. J., Vladimirova, T., and Sweeting, M. N. Satellite-on-a-Chip development for future distributed space missions. ASME.
- [10] Bigas, M., Cabruja, E., Forest, J., and Salvi, J. Review of CMOS image sensors. Microelectronics journal 37, 5 (2006), 433–451.
- [11] Boufounos, P., and Baraniuk, R. Sigma delta quantization for compressive sensing. Proc. SPIE Wavelets XII. Proc. of SPIE 6701 (2007), 670104.
- [12] Boufounos, P., and Baraniuk, R. Sigma delta quantization for compressive sensing. Proc. SPIE Wavelets XII. Proc. of SPIE 6701 (2007), 670104.
- [13] Bult, K. The effect of technology scaling on power dissipation in analog circuits. Analog Circuit Design (2006), 251–294.
- [14] Candes, E., and Romberg, J. l1-magic: Recovery of sparse signals via convex programming. URL: www.acm.caltech.edu/l1magic/downloads/l1magic.pdf 4 (2005).
- [15] Candes, E., and Romberg, J.
Sparsity and incoherence in compressive sampling. Inverse Problems 23, 3 (2007), 969.
- [16] Candes, E., Romberg, J., and Tao, T. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. Information Theory, IEEE Transactions on 52, 2 (Feb. 2006), 489–509.
- [17] Candes, E., Romberg, J., and Tao, T. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics 59, 8 (2006), 1207–1223.
- [18] Candes, E., and Tao, T. Decoding by linear programming. Information Theory, IEEE Transactions on 51, 12 (Dec. 2005), 4203–4215.
- [19] Candes, E., and Tao, T. The Dantzig selector: statistical estimation when p is much larger than n. The Annals of Statistics (2007), 2313–2351.
- [20] Chen, S., Donoho, D., and Saunders, M. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing 20, 1 (1998), 33–61.
- [21] Chen, X., Yu, Z., Hoyos, S., Sadler, B., and Silva-Martinez, J. A Sub-Nyquist Rate Sampling Receiver Exploiting Compressive Sensing. Circuits and Systems I: Regular Papers, IEEE Transactions on 58, 3 (March 2011), 507–520.
- [22] Corless, R. M., Gonnet, G. H., Hare, D. E., Jeffrey, D. J., and Knuth, D. E. On the Lambert W function. Advances in Computational Mathematics 5, 1 (1996), 329–359.
- [23] Davari, B., Dennard, R., and Shahidi, G. CMOS scaling for high performance and low power-the next ten years. Proceedings of the IEEE 83, 4 (1995), 595–606.
- [24] Davidovic, M., Zach, G., and Zimmermann, H. An 11-bit successive approximation analog-to-digital converter based on a combined capacitor-resistor network. e & i Elektrotechnik und Informationstechnik 127, 4 (2010), 98–102.
- [25] Deguchi, K., Suwa, N., Ito, M., Kumamoto, T., and Miki, T. A 6-bit 3.5-GS/s 0.9-V 98-mW flash ADC in 90-nm CMOS. Solid-State Circuits, IEEE Journal of 43, 10 (2008), 2303–2310.
- [26] Dondi, D., Bertacchini, A., Brunelli, D., Larcher, L., and Benini, L.
Modeling and optimization of a solar energy harvester system for self-powered wireless sensor networks. Industrial Electronics, IEEE Transactions on 55, 7 (2008), 2759–2766.
- [27] Donoho, D. Compressed sensing. Information Theory, IEEE Transactions on 52, 4 (April 2006), 1289–1306.
- [28] Duarte, M., Davenport, M., Takhar, D., Laska, J., Sun, T., Kelly, K., and Baraniuk, R. Single-pixel imaging via compressive sampling. Signal Processing Magazine, IEEE 25, 2 (2008), 83–91.
- [29] Erickson, R. W., and Maksimovic, D. Fundamentals of power electronics. Kluwer Academic Pub, 2001.
- [30] Fish, A., Hamami, S., and Yadid-Pecht, O. CMOS Image Sensors With Self-Powered Generation Capability. Circuits and Systems II: Express Briefs, IEEE Transactions on 53, 11 (Nov. 2006), 1210–1214.
- [31] Gow, J., and Manning, C. Development of a photovoltaic array model for use in power-electronics simulation studies. In Electric Power Applications, IEE Proceedings- (1999), vol. 146, IET, pp. 193–200.
- [32] Goyal, V., Fletcher, A., and Rangan, S. Compressive sampling and lossy compression. Signal Processing Magazine, IEEE 25, 2 (2008), 48–56.
- [33] Guilar, N., Kleeburg, T., Chen, A., Yankelevich, D., and Amirtharajah, R. Integrated solar energy harvesting and storage. Very Large Scale Integration (VLSI) Systems, IEEE Transactions on 17, 5 (2009), 627–637.
- [34] Gupta, S., Inerfield, M., and Wang, J. A 1-GS/s 11-bit ADC with 55-dB SNDR, 250-mW power realized by a high bandwidth scalable time-interleaved architecture. Solid-State Circuits, IEEE Journal of 41, 12 (2006), 2650–2657.
- [35] Haupt, J., Bajwa, W., Rabbat, M., and Nowak, R. Compressed sensing for networked data. Signal Processing Magazine, IEEE 25, 2 (2008), 92–101.
- [36] Jacques, L., Hammond, D., and Fadili, J. Dequantizing compressed sensing: When oversampling and non-gaussian constraints combine. Information Theory, IEEE Transactions on 57, 1 (2011), 559–571.
- [37] Kadri, R., Andrei, H., Gaubert, J.-P., Ivanovici, T., Champenois, G., and Andrei, P. Modeling of the photovoltaic cell circuit parameters for optimum connection model and real-time emulator with partial shadow conditions. Energy 42, 1 (2012), 57–67.
- [38] Kawahito, S., Yoshida, M., Sasaki, M., Umehara, K., Miyazaki, D., Tadokoro, Y., Murata, K., Doushou, S., and Matsuzawa, A. A CMOS image sensor with analog two-dimensional DCT-based compression circuits for one-chip cameras. Solid-State Circuits, IEEE Journal of 32, 12 (Dec. 1997), 2030–2041.
- [39] Kong, X., Petre, P., Matic, R., Gilbert, A., and Strauss, M. An analog-to-information converter for wideband signals using a time encoding machine. In Digital Signal Processing Workshop and IEEE Signal Processing Education Workshop (DSP/SPE), 2011 IEEE (2011), IEEE, pp. 414–419.
- [40] Kwon, D., Rincón-Mora, G. A., and Torres, E. O. Harvesting ambient kinetic energy with switched-inductor converters. Circuits and Systems I: Regular Papers, IEEE Transactions on 58, 7 (2011), 1551–1560.
- [41] Laska, J., Boufounos, P., Davenport, M., and Baraniuk, R. Democracy in action: Quantization, saturation, and compressive sensing. Applied and Computational Harmonic Analysis 31, 3 (2011), 429–443.
- [42] Laska, J., Kirolos, S., Duarte, M., Ragheb, T., Baraniuk, R., and Massoud, Y. Theory and Implementation of an Analog-to-Information Converter using Random Demodulation. In Circuits and Systems, 2007. ISCAS 2007. IEEE International Symposium on (May 2007), pp. 1959–1962.
- [43] Laska, J., Kirolos, S., Duarte, M., Ragheb, T., Baraniuk, R., and Massoud, Y. Theory and implementation of an analog-to-information converter using random demodulation. In Circuits and Systems, 2007. ISCAS 2007. IEEE International Symposium on (2007), IEEE, pp. 1959–1962.
- [44] Law, M., and Bermak, A. High-Voltage Generation With Stacked Photodiodes in Standard CMOS Process. Electron Device Letters, IEEE 31, 12 (Dec. 2010), 1425–1427.
- [45] Law, M., Bermak, A., and Shi, C. A Low-Power Energy-Harvesting Logarithmic CMOS Image Sensor With Reconfigurable Resolution Using Two-Level Quantization Scheme. Circuits and Systems II: Express Briefs, IEEE Transactions on 58, 2 (Feb. 2011), 80–84.
- [46] Leon-Salas, W., Balkir, S., Sayood, K., Schemm, N., and Hoffman, M. A CMOS Imager With Focal Plane Compression Using Predictive Coding. Solid-State Circuits, IEEE Journal of 42, 11 (Nov. 2007), 2555–2572.
- [47] Limotyrakis, S., Kulchycki, S., Su, D., and Wooley, B. A 150-MS/s 8-b 71-mW CMOS time-interleaved ADC. Solid-State Circuits, IEEE Journal of 40, 5 (2005), 1057–1067.
- [48] Luo, Q., and Harris, J. A novel integration of on-sensor wavelet compression for a CMOS imager. In Circuits and Systems, 2002. ISCAS 2002. IEEE International Symposium on (2002), vol. 3, pp. III-325–III-328.
- [49] Mamaghanian, H., Khaled, N., Atienza, D., and Vandergheynst, P. Compressed Sensing for Real-Time Energy-Efficient ECG Compression on Wireless Body Sensor Nodes. Biomedical Engineering, IEEE Transactions on 58, 9 (2011), 2456–2466.
- [50] Marwick, M., and Andreou, A. Photo-battery fabricated in silicon on sapphire CMOS. Electronics Letters 44, 12 (2008), 766–767.
- [51] Mishali, M., and Eldar, Y. Xampling: Analog Data Compression. In Data Compression Conference (DCC), 2010 (March 2010), pp. 366–375.
- [52] Mishali, M., Eldar, Y., Dounaevsky, O., and Shoshan, E. Xampling: Analog to digital at sub-Nyquist rates. Circuits, Devices & Systems, IET 5, 1 (2011), 8–20.
- [53] Oike, Y., and El Gamal, A. CMOS Image Sensor With Per-Column ΣΔ ADC and Programmable Compressed Sensing. Solid-State Circuits, IEEE Journal of 48, 1 (2013), 318–328.
- [54] Olyaei, A., and Genov, R. Focal-Plane Spatially Oversampling CMOS Image Compression Sensor. Circuits and Systems I: Regular Papers, IEEE Transactions on 54, 1 (Jan. 2007), 26–34.
- [55] Olyaei, A., and Genov, R. Focal-plane spatially oversampling CMOS image compression sensor.
Circuits and Systems I: Regular Papers, IEEE Transactions on 54, 1 (2007), 26–34.
- [56] Pan, H., Segami, M., Choi, M., Cao, L., and Abidi, A. A 3.3-V 12-b 50-MS/s A/D converter in 0.6-μm CMOS with over 80-dB SFDR. Solid-State Circuits, IEEE Journal of 35, 12 (2000), 1769–1780.
- [57] Peluso, V., Vancorenland, P., Marques, A., Steyaert, M., and Sansen, W. A 900-mV low-power ΔΣ A/D converter with 77-dB dynamic range. Solid-State Circuits, IEEE Journal of 33, 12 (1998), 1887–1897.
- [58] Prabha, R. D., Rincón-Mora, G. A., and Kim, S. Harvesting circuits for miniaturized photovoltaic cells. In Circuits and Systems (ISCAS), 2011 IEEE International Symposium on (2011), IEEE, pp. 309–312.
- [59] Ragheb, T., Laska, J., Nejati, H., Kirolos, S., Baraniuk, R., and Massoud, Y. A prototype hardware for random demodulation based compressive analog-to-digital conversion. In Circuits and Systems, 2008. MWSCAS 2008. 51st Midwest Symposium on (Aug. 2008), pp. 37–40.
- [60] Ragheb, T., Laska, J., Nejati, H., Kirolos, S., Baraniuk, R., and Massoud, Y. A prototype hardware for random demodulation based compressive analog-to-digital conversion. In Circuits and Systems, 2008. MWSCAS 2008. 51st Midwest Symposium on (2008), IEEE, pp. 37–40.
- [61] Razavi, B., and Sahoo, B. A 12-Bit 200-MHz CMOS ADC. Solid-State Circuits, IEEE Journal of 44, 9 (2009), 2366–2380.
- [62] Rincón-Mora, G. A. Harvesting microelectronic circuits. In Energy Harvesting Technologies. Springer, 2009, pp. 287–321.
- [63] Rodriguez, C., and Amaratunga, G. A. Analytic solution to the photovoltaic maximum power point problem. Circuits and Systems I: Regular Papers, IEEE Transactions on 54, 9 (2007), 2054–2060.
- [64] Said, A., and Pearlman, W. An image multiresolution representation for lossless and lossy compression. Image Processing, IEEE Transactions on 5, 9 (1996), 1303–1310.
- [65] Sarioglu, B., Aktan, O., Oncu, A., Mutlu, S., Dundar, G., and Yalcinkaya, A.
An Optically Powered CMOS Receiver System for Intravascular Magnetic Resonance Applications. Emerging and Selected Topics in Circuits and Systems, IEEE Journal on 2, 4 (2012), 683–691.
- [66] Scott, M., Boser, B., and Pister, K. An ultralow-energy ADC for smart dust. Solid-State Circuits, IEEE Journal of 38, 7 (2003), 1123–1129.
- [67] Shi, C., Law, M. K., and Bermak, A. A Novel Asynchronous Pixel for an Energy Harvesting CMOS Image Sensor. Very Large Scale Integration (VLSI) Systems, IEEE Transactions on 19, 1 (Jan. 2011), 118–129.
- [68] Shur, M. Physics of Semiconductor Devices: Manual. Ed. Techniques Ingénieur, 1990.
- [69] Sudevalayam, S., and Kulkarni, P. Energy Harvesting Sensor Nodes: Survey and Implications. Communications Surveys & Tutorials, IEEE 13, 3 (2011), 443–461.
- [70] Tropp, J., Laska, J., Duarte, M., Romberg, J., and Baraniuk, R. Beyond Nyquist: Efficient Sampling of Sparse Bandlimited Signals. Information Theory, IEEE Transactions on 56, 1 (Jan. 2010), 520–544.
- [71] Tsukamoto, S., Dedic, I., Endo, T., Kikuta, K., Goto, K., and Kobayashi, O. A CMOS 6-b, 200 MSample/s, 3 V-supply A/D converter for a PRML read channel LSI. Solid-State Circuits, IEEE Journal of 31, 11 (1996), 1831–1836.
- [72] Wang, C.-C., and Wu, J.-C. Efficiency improvement in charge pump circuits. Solid-State Circuits, IEEE Journal of 32, 6 (1997), 852–860.
- [73] Wang, H., and Leon-Salas, W. An incremental sigma delta converter for compressive sensing applications. In Circuits and Systems (ISCAS), 2011 IEEE International Symposium on (2011), IEEE, pp. 522–525.
- [74] Wang, H., and Leon-Salas, W. An incremental sigma delta converter for compressive sensing applications. In Circuits and Systems (ISCAS), 2011 IEEE International Symposium on (May 2011), pp. 522–525.
- [75] Wang, H.-T., and Leon-Salas, W. A multiresolution algorithm for focal-plane compression. In Circuits and Systems (ISCAS), 2012 IEEE International Symposium on (May 2012), pp. 926–929.
- [76] Wu, X., and Memon, N. Context-based, adaptive, lossless image coding. Communications, IEEE Transactions on 45, 4 (1997), 437–444.
- [77] Xiao, W., Dunford, W. G., Palmer, P. R., and Capel, A. Regulation of photovoltaic voltage. Industrial Electronics, IEEE Transactions on 54, 3 (2007), 1365–1374.
- [78] Xiao, W., Ozog, N., and Dunford, W. G. Topology study of photovoltaic interface for maximum power point tracking. Industrial Electronics, IEEE Transactions on 54, 3 (2007), 1696–1704.
- [79] Yadid-Pecht, O., and Etienne-Cummings, R. CMOS imagers: from phototransduction to image processing. Springer, 2004.
- [80] Yu, Z., Chen, X., Hoyos, S., Sadler, B., Gong, J., and Qian, C. Mixed-signal parallel compressive spectrum sensing for cognitive radios. International Journal of Digital Multimedia Broadcasting 2010 (2010).
- [81] Yu, Z., Hoyos, S., Sadler, B., Gong, J., and Qian, C. Mixed-signal parallel compressive spectrum sensing for cognitive radios. International Journal of Digital Multimedia Broadcasting (2010).

#### **VITA**

Hsuan-Tsung Wang was born in Taiwan. He attended Ming-Dao High School in Taichung City, Taiwan. In 1996, he entered National United University in Taiwan. Two years later, he served two years of military service in Taichung City. In 2000, he was employed as an engineer in the research and development department at Aryxx Optoelectronics Company. While working at Aryxx, he attended Vanung University in Jhongli City, Taiwan. He received the degree of Bachelor of Science from Vanung University in June, 2002. In the same year, he moved back to his hometown, Taichung City, Taiwan, and worked at Asia Optical Company as an R&D engineer. In January, 2005, he entered the School of Graduate Studies at the University of Missouri at Kansas City. He earned a master's degree in electrical engineering from the University of Missouri at Kansas City in December, 2007. He continued studying and received a Ph.D.
degree in electrical and computer engineering with a co-discipline of telecommunications and computer networking from the University of Missouri at Kansas City in May, 2013.