We present a 2x4x4mm³ imaging system complete with optics, wireless communication, battery, power management, solar harvesting, processor, and memory. The system features a 160x160 resolution CMOS image sensor with a 304nW continuous in-pixel motion detection mode. System components are fabricated in five different IC layers and die-stacked for minimal form factor. Photovoltaic (PV) cells face the opposite direction from the imager for optimal illumination and generate 456nW at 10klux, enabling energy-autonomous system operation.
Introduction
Visual imaging is a highly desired feature for wireless sensing applications such as biological monitoring and surveillance. Recent advances in low power circuit techniques and integration have resulted in self-sustaining wireless sensor nodes as small as 1mm³ [1][2][3]. However, achieving a complete visual sensing system in a similar volume is challenging due to physical difficulties in integrating optics and the high power consumption of key components such as the image sensor and RF transmitter. This work introduces a millimeter-scale wireless sensor node with visual imaging and ultra-low power motion detection. The overall system sleep power is 304nW with motion detection enabled, allowing energy-autonomous operation with 10klux of background lighting, which is a typical daytime condition on a window or light source surface. The system also features on-demand duty-cycling, in which the system wakes up when triggered by motion to capture an image and can restrict the readout to the portion of the image where motion was detected.
Proposed System
Fig. 1 shows pictures and a cross-sectional diagram of the proposed system. The glass package with vias provides a connection between the top-facing chips and bottom-facing PV cells. This configuration creates optimal lighting conditions for both imaging and harvesting. Transparent protective epoxy covers the PV cells and bond wires below the glass package. A gradient-index (GRIN) rod lens is used for optics, measuring 1mm in diameter and 2.4mm in height. The lens is mounted on the surface of the image sensor with UV-curable optical epoxy. Since the ultra-low power circuits in the system exhibit high leakage when exposed to strong light, the cavity inside the glass package is filled with dark epoxy to shield non-imaging chips from light and complete the encapsulation.
Fig. 2 outlines the major system components. A new unidirectional, fully synthesizable inter-layer communication bus (Mbus) is introduced, designed specifically for ultra-low power and robust operation. Mbus master and slave layers are arranged in a daisy chain topology. The processor (PRC) layer generates the Mbus clock as the master layer and also contains an ARM M0 core and 3kB SRAM. The Mbus clock wires toggle only when there is bus activity, minimizing the active power. An Mbus layer is either in forwarding (default) or issuing mode. In the default mode, CIN/DIN are directly forwarded to COUT/DOUT, which 1) allows accurate synchronization of layers since delay through the daisy chain is small and deterministic, and 2) removes the need for a local clock on each layer, minimizing the power overhead of adding layers to the loop.
One key feature of Mbus is its interrupt generation during system sleep mode. The power management unit (PMU) in the PRC layer provides 1.2V and 0.6V power domains, down-converted from the 3.8V battery voltage. The PMU can sustain a 10-100nA load on both supplies during sleep mode and up to 50µA during active mode. The system can wake up from sleep mode either via a built-in wake-up timer in the PRC layer or by interrupts generated by slave layers that operate in sleep mode, e.g., motion detection. To issue this interrupt, the slave must initiate an Mbus message to the PRC while staying within the extremely low power budget of standby mode. To this end, Mbus data wires use a high default state and a message request is performed by pulling down on this wire, which is forwarded to the master and consumes near-zero energy. The PMU then switches to active mode and all Mbus layers wake up, allowing the remainder of the message to be sent. An added benefit of this approach is that only the sleep and wire controllers remain powered on during sleep mode (Fig. 3), resulting in Mbus sleep power of 6pW per layer.
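The wake-up request described above can be illustrated with a minimal behavioral sketch. It is not the Mbus implementation: the `Layer` class, the `pulling_low` flag, and the function names are hypothetical, and the model captures only the idea that each layer in forwarding mode passes the data wire through unchanged unless it pulls the wire low, so a request from any slave reaches the master around the daisy chain.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One layer in the daisy chain (hypothetical model, not the real RTL)."""
    name: str
    pulling_low: bool = False  # True while this layer requests the bus

    def forward(self, din: int) -> int:
        # Forwarding mode: DIN goes straight to DOUT unless this layer
        # pulls the data wire low to request a message.
        return 0 if self.pulling_low else din


def master_sees(layers, idle_level=1):
    """Propagate the data wire once around the chain, back to the master."""
    level = idle_level  # data wire idles high
    for layer in layers:
        level = layer.forward(level)
    return level


def wake_requested(layers):
    """During sleep, a low data wire at the master means a wake-up request."""
    return master_sees(layers) == 0
```

For example, with three slave layers all idle, `wake_requested` returns `False`; setting `pulling_low = True` on any one of them makes it return `True`, regardless of that layer's position in the chain.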
To allow flexible message length while using only two wires, end-of-message (EOM) is encoded by holding clock high while the data toggles three times (detection circuit shown in Fig. 4). The EOM detector is also used to allow message interrupt. When a layer issues an interrupt while the Mbus is already active, it blocks the incoming clock (COUT held high), which is then recognized by the Mbus master. The master then sends the EOM sequence, allowing the interrupting layer to gain control.
978-1-4799-3328-0/14/$31.00 ©2014 IEEE
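The EOM encoding can be sketched in software as a small detector over sampled wire levels. This is an illustrative model only, assuming the wires are sampled often enough to see every data transition; the function name `detect_eom` and its parameters are hypothetical, and the actual detection circuit is the one shown in Fig. 4.

```python
def detect_eom(samples, toggles_needed=3):
    """Return the index of the sample at which EOM is detected, or None.

    samples: iterable of (clk, data) wire levels (0 or 1), in time order.
    EOM is recognized when data changes `toggles_needed` times while the
    clock stays high.
    """
    count = 0
    prev_data = None
    for i, (clk, data) in enumerate(samples):
        if clk == 0:
            count = 0  # clock went low: normal bus activity, start over
        elif prev_data is not None and data != prev_data:
            count += 1  # data changed while the clock is high
            if count >= toggles_needed:
                return i
        prev_data = data
    return None
```

In this model, a stream in which data changes only while the clock is low never triggers detection, whereas a stream such as `[(0, 0), (1, 0), (1, 1), (1, 0), (1, 1)]` is detected at index 4, the third data toggle with the clock held high.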
The image sensor in the IMG layer produces 8/9-bit 160x160 resolution images with cropping and row/column skipping features to reduce data bandwidth and storage. Motion detection pixels of 10x10 resolution are built into the array (Fig. 6), with temporal averaging to improve detection sensitivity to slow motions [4]. For motion detection, pixel values of the previous frame are stored on CHOLD, which is distributed in the motion detection cluster of 16x16 pixels, achieving a pixel size of 5.4x5.4µm² and 37% fill factor. At the end of an integration cycle, both the current and previous pixel values are read out and compared (Fig. 7). A 14-level configurable detection threshold is used to determine if the inter-frame pixel difference is large enough to be flagged as motion. A new bootstrapping circuit is designed to eliminate reset hysteresis for boosted row control signals (Fig. 8). In the conventional cross-coupled design, node n1 does not fully charge up to VDD in steady state, resulting in a different initial boosted voltage in a pulsed signal sequence. The proposed single-ended design uses an NMOS with its body tied to VDD to lower Vth, raising n1 to full VDD when input A is 0.
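The comparison step can be expressed as a short numerical sketch. It is a software analogy of the in-pixel operation, not a description of the circuit: the stored reference stands in for the value held on CHOLD, and the blending factor `alpha` is an assumed parameter used here to illustrate temporal averaging, since the source does not specify how the averaging is weighted.

```python
def detect_motion(current, reference, threshold):
    """Flag each motion-detection pixel whose change exceeds the threshold."""
    return [[abs(c - r) > threshold for c, r in zip(crow, rrow)]
            for crow, rrow in zip(current, reference)]


def update_reference(current, reference, alpha=0.5):
    """Blend the new frame into the stored reference (temporal averaging).

    alpha is an assumed weighting; alpha=1.0 reduces to plain
    frame-to-frame differencing.
    """
    return [[(1 - alpha) * r + alpha * c for c, r in zip(crow, rrow)]
            for crow, rrow in zip(current, reference)]
```

With a reference of `[[10, 10], [10, 10]]`, a current frame of `[[10, 30], [12, 10]]`, and a threshold of 5, only the pixel that moved from 10 to 30 is flagged. A reference that lags the current frame lets a slow drift accumulate a larger difference than it would between two adjacent frames, which is the intuition behind averaging for slow motion.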
In the ultra-low power motion detection mode, pixels and ADC for full-frame imaging are hierarchically power-gated (Fig. 9), and motion detection is performed at 5x5 resolution. Since image readout consumes up to 1mA instantaneous current, 100pF decoupling capacitance is added for each power-gated domain. To prevent power rail collapse when initially enabling the large capacitors, a smaller assistive power gating device is implemented that turns on before the main power gating device. To reduce leakage power of 0.6V digital components, N-wells are biased at 1.2V, reducing PMOS leakage by 60x in simulation.
Frame-wise thresholding of the motion trigger count is implemented to guard against false triggers and dead pixels. The location of the motion-detecting pixel is also reported, which can be used to determine which portions of the image to store in SRAM. Fig. 5 shows state changes of each layer and Mbus activity for one of the tested usage scenarios of motion-triggered system wakeup and RF transmission.
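One way to picture how the trigger count and reported location could drive a cropped readout is the sketch below. It is an assumption-laden illustration: the function `motion_roi`, the `min_count` value, and the bounding-box policy are hypothetical, and the only figure taken from the text is that a 10x10 grid of motion pixels on a 160x160 array corresponds to 16 full-resolution pixels per motion pixel along each axis.

```python
def motion_roi(flags, min_count=2, cluster=16):
    """Return (row0, col0, row1, col1) in full-resolution pixels, or None.

    flags: 2-D list of booleans, one per motion-detection pixel.
    min_count: frame-wise trigger threshold (assumed value).
    cluster: full-resolution pixels per motion pixel along each axis.
    The end coordinates row1 and col1 are exclusive.
    """
    hits = [(r, c) for r, row in enumerate(flags)
            for c, flagged in enumerate(row) if flagged]
    if len(hits) < min_count:
        return None  # too few triggers: treat as a false trigger
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows) * cluster, min(cols) * cluster,
            (max(rows) + 1) * cluster, (max(cols) + 1) * cluster)
```

For instance, if only motion pixels (2, 3) and (3, 3) are flagged, the sketch returns `(32, 48, 64, 64)`, a 32x16 window of the full image, while a single isolated flag returns `None` and would not wake the system under this policy.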
Two-way wireless communication is achieved with an optical wakeup receiver and an RF transmitter. The optical receiver [5] consumes 695pW standby power and is used to initially load the program and to request data. The RF transmitter uses on-off keying (OOK), with an LC oscillator tuned to the 915MHz ISM band. The oscillator is turned on for 100ns to transmit 1 bit and is otherwise power-gated.
Measurement Results
Five units of the proposed imaging system were assembled with 80% yield and demonstrated full functionality. The glass package with bottom-facing PV cells is first assembled and placed on a glass slide. The battery and IC layers are then stacked and wirebonded, and finally the lens is mounted. With the 5.7µAh thin-film battery, the system can survive up to 60 days in a 4nA sleep mode or perform continuous motion detection for 3.4 days consuming 80nA from the 3.8V battery. With 10klux illumination, battery charging by the PMU at 120nA was observed, which is sufficient to sustain continuous motion detection operation. Fig. 10 shows measured battery current as the system wakes up upon motion, transmits data, and returns to sleep. Image sensor SNR in its 9-bit configuration is 47dB (Fig. 11). The RF transmitter in the stack sends data at 4nJ/bit to an inductive receiver with -70dBm of sensitivity positioned 15mm away. Fig. 12 shows die photos of the three key IC layers.
2014 Symposium on VLSI Circuits Digest of Technical Papers
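The relationship between the reported currents and powers can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted in the text; the function names are hypothetical, the lifetime formula is an idealized capacity-over-current estimate that ignores harvesting and any battery non-idealities, and the transmit energy assumes a raw, uncompressed 8-bit full frame with no protocol overhead.

```python
BATTERY_V = 3.8        # battery voltage, volts
CAPACITY_AH = 5.7e-6   # thin-film battery capacity, amp-hours
TX_J_PER_BIT = 4e-9    # RF transmit energy, joules per bit


def power_w(current_a, voltage_v=BATTERY_V):
    """Power drawn from (or delivered to) the battery at a given current."""
    return current_a * voltage_v


def lifetime_days(current_a, capacity_ah=CAPACITY_AH):
    """Idealized lifetime at a constant current draw, with no harvesting."""
    return capacity_ah / current_a / 24.0


motion_power_w = power_w(80e-9)     # 80nA motion detection -> about 304nW
harvest_power_w = power_w(120e-9)   # 120nA charging at 10klux -> about 456nW
sleep_days = lifetime_days(4e-9)    # 4nA sleep mode -> roughly 59 days
surplus_a = 120e-9 - 80e-9          # net charging current while detecting

# Energy to send one raw 160x160 8-bit frame (assumption: no overhead)
frame_tx_energy_j = 160 * 160 * 8 * TX_J_PER_BIT
```

The 80nA and 120nA currents at 3.8V reproduce the 304nW and 456nW figures, and the positive surplus is what makes continuous motion detection sustainable under 10klux illumination.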
