Scalable Energy-Recovery Architectures.
Energy efficiency is a critical challenge for today's integrated circuits, especially for high-end digital signal processing and communications that require both high throughput and low energy dissipation for extended battery life. Charge-recovery logic recovers and reuses charge using inductive elements and has the potential to achieve order-of-magnitude improvement in energy efficiency while maintaining high performance. However, the lack of large-scale high-speed silicon demonstrations and inductor area overheads are two major concerns.
This dissertation focuses on scalable charge-recovery designs. We present a semi-automated design flow to enable the design of large-scale charge-recovery chips. We also present a new architecture that uses in-package inductors, eliminating the area overheads caused by the use of integrated inductors in high-performance charge-recovery chips.
To demonstrate our semi-automated flow, which uses custom-designed standard-cell-like dynamic cells, we have designed a 576-bit charge-recovery low-density parity-check (LDPC) decoder chip. Functioning correctly at clock speeds above 1 GHz, this prototype is the first-ever demonstration of a GHz-speed charge-recovery chip of significant complexity. In terms of energy consumption, this chip improves over recent state-of-the-art LDPCs by at least 1.3 times with comparable or better area efficiency.
To demonstrate our architecture for eliminating inductor overheads, we have designed a charge-recovery LDPC decoder chip with in-package inductors. This test chip has been fabricated in a 65nm CMOS flip-chip process. A custom 6-layer FC-BGA package substrate has been designed with 16 inductors embedded in the fifth layer of the package substrate, yielding higher Q and significantly improving area efficiency and energy efficiency compared to their on-chip counterparts. From measurements, this chip achieves at least 2.3 times lower energy consumption with better area efficiency over state-of-the-art published designs.
PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/116653/1/terryou_1.pd
Analysis of the effect of an augmented n-MOS transistor configuration on the parameters of a 4x1 multiplexer
The full text is available by subscription on the publisher's site: http://radio.kpi.ua/article/view/S0021347018030044
The paper analyzes the power consumption and delay of a 4x1 multiplexer based on the augmented n-MOS transistor configuration AT-NMOS (Augmented Transistor NMOS). The effect of different levels of total transistor channel width on the leakage power and delay characteristics is examined for a 45 nm technology. The performance parameter is found to improve in the proposed design, which is based on an augmented shorted gate-source p-MOS transistor configuration with an n-MOS structure, ASG-S PMOS-NMOS (Augmented Shorted Gate-Source PMOS with NMOS), compared with a 4x1 multiplexer based on the static-threshold augmented n-MOS transistor configuration ST-ATNMOS (Static Threshold AT-NMOS). This combination yields the desired performance parameters for the designed circuit. Two types of models of the 4x1 multiplexer are considered. Leakage power is shown to be reduced substantially. The delay characteristic also improves by up to 5% at a 1 V supply when multiple levels of transistor channel width are considered in evaluating 4x1 multiplexer models based on different AT-NMOS augmented n-MOS transistor configurations. Simulations were performed with the Cadence Analog Virtuoso and Spectre Simulator tools for a 45 nm CMOS technology.
Power and delay optimization of a nanoscale (4×1) multiplexer using a CMOS voltage doubler circuit
The full text is available by subscription on the publisher's site: http://radio.kpi.ua/article/view/S0021347016110017
This work was supported by ITM University (Gwalior) and Cadence System Design (Bangalore).
The paper presents a high-performance (4×1) multiplexer with low leakage and reduced delay, equipped with a MOS voltage doubler circuit that is combined with an augmented MOS sleep-transistor configuration in a nanoscale structure. The original voltage doubler design is implemented as an additional circuit at the output of the proposed design to step up the voltage. This makes it possible to double the output peak voltage through the transients of the positive and negative cycles. The boosted voltage can serve as a regulated power supply for certain purposes. The voltage doubler circuit alone is not sufficient to improve the overall performance of the proposed (4×1) multiplexer design. To optimize dissipated power (leakage power) and delay simultaneously, the voltage doubler circuit is used together with the augmented MOS sleep-transistor configuration. To minimize the power dissipation caused by leakage, a MOS voltage doubler circuit combined with the augmented sleep-transistor configuration is introduced. This reduces the excess power dissipation of the circuit that is due to leakage. This additional part of the circuit provides the required output voltage level of the proposed (4×1) multiplexer with improved parameters. The device was simulated in a 45 nm technology. As a result, the power dissipation due to leakage is reduced to a level of approximately 55%, and the delay characteristic is improved to the required level by using the MOS voltage doubler circuit together with the improved MOS sleep-transistor configuration. The paper presents various combinations of the MOS voltage doubler circuit implemented at the output of the (4×1) multiplexer.
High Speed Reconfigurable NRZ/PAM4 Transceiver Design Techniques
While the majority of wireline standards use simple binary non-return-to-zero (NRZ) signaling, four-level pulse-amplitude modulation (PAM4) standards are emerging to increase bandwidth density. This dissertation proposes efficient implementations for high-speed NRZ/PAM4 transceivers. The first prototype includes a dual-mode NRZ/PAM4 serial I/O transmitter that supports both modulations with minimum power and hardware overhead. A source-series-terminated (SST) transmitter achieves 1.2Vpp output swing and employs lookup table (LUT) control of a 31-segment output digital-to-analog converter (DAC) to implement 4/2-tap feed-forward equalization (FFE) in NRZ/PAM4 modes, respectively. Transmitter power is improved with low-overhead analog impedance control in the DAC cells and a quarter-rate serializer based on a tri-state inverter-based mux with dynamic pre-driver gates. The transmitter is designed to work with a receiver that implements an NRZ/PAM4 decision feedback equalizer (DFE) that employs 1 finite impulse response (FIR) and 2 infinite impulse response (IIR) taps for first post-cursor and long-tail inter-symbol interference (ISI) cancellation, respectively. Fabricated in GP 65-nm CMOS, the transmitter occupies 0.060mm² area and achieves 16Gb/s NRZ and 32Gb/s PAM4 operation at 10.4 and 4.9 mW/Gb/s while operating over channels with 27.6 and 13.5dB loss at Nyquist, respectively. The second prototype is a 56Gb/s PAM4 quarter-rate wireline receiver implemented in a 65nm CMOS process. The front end utilizes a single-stage continuous-time linear equalizer (CTLE) to boost the main cursor and relax the pre-cursor cancellation requirement, requiring only a 2-tap pre-cursor FFE on the transmitter side. A 2-tap DFE with one FIR tap and one IIR tap is employed to cancel first post-cursor and long-tail ISI.
The FIR tap direct feedback is implemented inside the CML slicers to relax the critical timing of the DFE and maximize the achievable data rate. In addition to the three main per-slice data samplers, an error sampler is utilized for background threshold control, and an edge-based sampler both performs PLL-based CDR phase detection and generates information for background DFE tap adaptation. The receiver consumes 4.63mW/Gb/s and compensates for up to 20.8dB loss when operated with a 2-tap FFE transmitter. The experimental results and comparison with the state of the art show superior power efficiency of the presented prototypes for similar data rate and channel loss. The proposed design techniques are not limited to these specific prototypes and can be applied to any wireline transceiver with a different modulation, data rate, or CMOS technology.
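As a rough illustration of why a dual-mode transmitter can share hardware between the two modulations, the sketch below maps the same bit stream to NRZ and PAM4 symbols and compares symbol rates for the quoted 16Gb/s NRZ and 32Gb/s PAM4 figures. The Gray-coded level mapping and all function names are assumptions made for illustration; the abstract does not specify the bit-to-level mapping.

```python
# Gray-coded PAM4 mapping (an assumed, commonly used convention; the
# abstract does not state which mapping the prototype uses).
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): +1, (1, 0): +3}


def nrz_symbols(bits):
    """NRZ: one bit per symbol, two levels."""
    return [+1 if b else -1 for b in bits]


def pam4_symbols(bits):
    """PAM4: two bits per symbol, four levels."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def symbol_rate_gbaud(bit_rate_gbps, bits_per_symbol):
    """Symbol rate in GBaud for a given bit rate in Gb/s."""
    return bit_rate_gbps / bits_per_symbol


bits = [0, 0, 0, 1, 1, 1, 1, 0]
print(nrz_symbols(bits))         # eight two-level symbols
print(pam4_symbols(bits))        # four four-level symbols
print(symbol_rate_gbaud(16, 1))  # NRZ mode
print(symbol_rate_gbaud(32, 2))  # PAM4 mode
```

Under these assumptions both modes run at the same symbol rate, which is consistent with one serializer and output DAC serving both modulations.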
Belle II Technical Design Report
The Belle detector at the KEKB electron-positron collider has collected
almost 1 billion Y(4S) events in its decade of operation. Super-KEKB, an
upgrade of KEKB, is under construction to increase the luminosity by two orders
of magnitude during a three-year shutdown, with an ultimate goal of 8E35 /cm^2
/s luminosity. To exploit the increased luminosity, an upgrade of the Belle
detector has been proposed. A new international collaboration, Belle-II, is
being formed. The Technical Design Report presents the physics motivation, basic
methods of the accelerator upgrade, and key improvements of the
detector.
Comment: Edited by: Z. Doležal and S. Un
Energy-efficient analog-to-digital conversion for ultra-wideband radio
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.
Includes bibliographical references (p. 207-222).
In energy-constrained signal processing and communication systems, a focus on the analog or digital circuits in isolation cannot achieve the minimum power consumption. Furthermore, in advanced technologies with significant variation, yield is traditionally achieved only through conservative design and a sacrifice of energy efficiency. In this thesis, these limitations are addressed with both a comprehensive mixed-signal design methodology and new circuits and architectures, as presented in the context of an analog-to-digital converter (ADC) for ultra-wideband (UWB) radio. UWB is an emerging technology capable of high-data-rate wireless communication and precise locationing, and it requires high-speed (>500MS/s), low-resolution ADCs. The successive approximation register (SAR) topology exhibits significantly reduced complexity compared to the traditional flash architecture. Three time-interleaved SAR ADCs have been implemented. At the mixed-signal optimum energy point, parallelism and reduced voltage supplies provide more than 3x energy savings. Custom control logic, a new capacitive DAC, and a hierarchical sampling network enable the high-speed operation. Finally, only a small amount of redundancy, with negligible power penalty, dramatically improves the yield of the highly parallel ADC in deep sub-micron CMOS.
by Brian P. Ginsburg. Ph.D
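The SAR topology mentioned in this abstract resolves one bit per comparison by binary search, so an n-bit conversion needs n comparator decisions instead of the 2^n - 1 comparators of a flash converter. The sketch below is a textbook, idealized model and not the thesis's circuit. The function names, the 5-bit resolution, and the four-channel round-robin interleaving are assumptions chosen for illustration.

```python
def sar_convert(vin: float, vref: float, n_bits: int) -> int:
    """Return the n-bit code for vin in [0, vref) by successive approximation."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):      # MSB first
        trial = code | (1 << bit)              # tentatively set this bit
        dac = vref * trial / (1 << n_bits)     # ideal DAC output for the trial code
        if vin >= dac:                         # comparator decision
            code = trial                       # keep the bit
    return code


def time_interleaved(samples, vref, n_bits, n_channels):
    """Round-robin the samples over n_channels identical SAR converters.

    Returns (channel_index, code) pairs; each channel only has to run at
    1/n_channels of the aggregate sample rate.
    """
    return [(i % n_channels, sar_convert(v, vref, n_bits))
            for i, v in enumerate(samples)]


if __name__ == "__main__":
    print(sar_convert(0.3, 1.0, 5))
    print(time_interleaved([0.1, 0.3, 0.5, 0.7, 0.9], 1.0, 5, 4))
```

Interleaving several slow converters is one way to reach a high aggregate sample rate while each channel runs at a lower rate, which is where the parallelism-versus-supply-voltage trade-off described in the abstract comes from.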
Low-Power, Low-Voltage SRAM Circuits Design For Nanometric CMOS Technologies
Embedded SRAM memory is a vital component in modern Systems-on-Chip (SoCs). More than 80% of the SoC die area is often occupied by SRAM arrays. As such, system reliability and yield are largely governed by the SRAM's performance and robustness. The aggressive scaling trend in CMOS device minimum feature size, coupled with the growing demand for high-capacity memory integration, has imposed the use of minimal-size devices to realize a memory bitcell. The smallest 6T SRAM bitcell to date occupies 0.1 µm² of silicon area. SRAM bitcells continue to benefit from an aggressive scaling trend in CMOS technologies. Unfortunately, other system components, such as interconnects, experience a slower scaling trend. This has resulted in a dramatic deterioration in a cell's ability to drive heavily loaded interconnects. Moreover, the growing fluctuation in device properties due to Process, Voltage, and Temperature (PVT) variations has added more uncertainty to SRAM operation. Thus, ensuring the ability of a miniaturized cell to drive heavily loaded bitlines and to generate adequate voltage swing is becoming challenging. A large percentage of state-of-the-art SoC system failures are attributed to the inability of SRAM cells to generate the targeted bitline voltage swing within a given access time.
Read-assist mechanisms and current-mode sense amplifiers are the two key strategies used to surmount bitline loading effects. On the other hand, new bitcell topologies and cell supply voltage management are used to overcome fluctuations in device properties. In this research we tackled the limited drivability of the conventional 6T SRAM bitcell by introducing new integrated voltage sensing schemes and current-mode sense amplifiers. The proposed schemes feature a read-assist mechanism. Their functionality and superiority over existing schemes are verified using transient and statistical SPICE simulations. Post-layout extracted views of the devices are used for realistic simulation results.
Reliability and yield enhancement of low-voltage SRAM is investigated, and a wordline boost technique is proposed as a means to manage the cell's wordline (WL) operating voltage. The proposed wordline driver design shows a significant improvement in reliability and yield in a 400-mV 6T SRAM cell. Because the design exploits the cell's Dynamic Noise Margin (DNM), programmability of the boost peak level and boost decay rate is added. SPICE transient and statistical simulations are used to verify the proposed design's functionality.
Finally, at the bitcell level, we proposed a new five-transistor (5T) SRAM bitcell that shows competitive performance and reliability figures of merit compared to the conventional 6T bitcell. The functionality of the proposed cell is verified by post-layout SPICE simulations. The proposed bitcell topology is designed, implemented, and fabricated in a standard ST CMOS 65nm technology process. A 1.2 × 1.2 mm² multi-design project test chip consisting of four 32-Kbit (256-row x 128-column) SRAM macros with the required peripheral and timing control units is fabricated. Two of the designed SRAM macros are dedicated to this work, namely a 32-Kbit 5T macro and a 32-Kbit 6T macro that is used as a comparison reference. The other macros belong to other projects and are not discussed in this document.
Embedding Logic and Non-volatile Devices in CMOS Digital Circuits for Improving Energy Efficiency
Static CMOS logic has remained the dominant design style of digital systems for
more than four decades due to its robustness and near zero standby current. Static
CMOS logic circuits consist of a network of combinational logic cells and clocked sequential
elements, such as latches and flip-flops that are used for sequencing computations
over time. The majority of the digital design techniques to reduce power, area, and
leakage over the past four decades have focused almost entirely on optimizing the
combinational logic. This work explores alternate architectures for the flip-flops for
improving the overall circuit performance, power and area. It consists of three main
sections.
The first is the design of a multi-input configurable flip-flop structure with embedded
logic. A conventional D-type flip-flop may be viewed as realizing an identity function,
in which the output is simply the value of the input sampled at the clock edge. In
contrast, the proposed multi-input flip-flop, named PNAND, can be configured to
realize one of a family of Boolean functions called threshold functions. In essence,
the PNAND is a circuit implementation of the well-known binary perceptron. Unlike
other reconfigurable circuits, a PNAND can be configured by simply changing the
assignment of signals to its inputs. Using a standard cell library of such gates, a technology
mapping algorithm can be applied to transform a given netlist into one with
an optimal mixture of conventional logic gates and threshold gates. This approach
was used to fabricate a 32-bit Wallace Tree multiplier and a 32-bit Booth multiplier
in 65nm LP technology. Simulation and chip measurements show more than 30%
improvement in dynamic power and more than 20% reduction in core area.
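To make the notion of a threshold function concrete, the sketch below evaluates a generic weighted-sum threshold gate and shows how different weight and threshold choices yield different Boolean functions. It is only a functional model. The weights, thresholds, and function names are assumptions for illustration, and modeling a weight of 2 as one signal driving two unit-weight inputs is likewise an assumption about how input assignment could configure such a gate, not a description of the PNAND circuit.

```python
from itertools import product


def threshold_gate(weights, threshold, inputs):
    """Return 1 if the weighted sum of the binary inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def majority(a, b, c):
    # Unit weights, threshold 2: output is 1 when at least two inputs are 1.
    return threshold_gate((1, 1, 1), 2, (a, b, c))


def a_or_b_and_c(a, b, c):
    # Weight 2 on 'a' (assumed to model 'a' driving two unit-weight inputs):
    # realizes a OR (b AND c) with the same threshold of 2.
    return threshold_gate((2, 1, 1), 2, (a, b, c))


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, majority(a, b, c), a_or_b_and_c(a, b, c))
```

A conventional D flip-flop corresponds to the degenerate single-input case, where the output simply follows the sampled input.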
The functional yield of the PNAND decreases with geometry and voltage scaling.
The second part of this research investigates the use of two mechanisms to improve
the robustness of the PNAND circuit architecture. One is the use of forward and reverse body biases to change the device threshold and the other is the use of RRAM
devices for low voltage operation.
The third part of this research focuses on the design of flip-flops with non-volatile
storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated
with both conventional D flip-flop and PNAND circuits to implement non-volatile
logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of
the system locally when a power interruption occurs. However, manufacturing variations
in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading
to an overly pessimistic design and consequently, higher energy consumption. A
detailed analysis of the design trade-offs in the driver circuitry for performing backup
and restore, and a novel method to design the energy optimal driver for a given yield is
presented. Efficient designs of two nonvolatile flip-flop (NVFF) circuits are presented,
in which the backup time is determined on a per-chip basis, resulting in minimizing
the energy wastage and satisfying the yield constraint. To achieve a yield of 98%,
the conventional approach would have to expend nearly 5X more energy than the
minimum required, whereas the proposed tunable approach expends only 26% more
energy than the minimum. A non-volatile threshold gate architecture, NV-TLFF, is
designed with the same backup and restore circuitry in 65nm technology. The embedded
logic in NV-TLFF compensates for the performance overhead of NVL. This leads to the
possibility of zero-overhead non-volatile datapath circuits. An 8-bit
multiply-and-accumulate (MAC) unit is designed to demonstrate the performance benefits of the
proposed architecture. Based on the results of HSPICE simulations, the MAC circuit
with the proposed NV-TLFF cells is shown to consume at least 20% less power and
area as compared to the circuit designed with conventional DFFs, without sacrificing
any performance.
Dissertation/Thesis, Doctoral Dissertation, Electrical Engineering, 201
Circuit techniques for low-voltage and high-speed A/D converters
The increasing digitalization in all spheres of electronics applications, from telecommunications systems to consumer electronics appliances, requires analog-to-digital converters (ADCs) with a higher sampling rate, higher resolution, and lower power consumption. The evolution of integrated circuit technologies partially helps in meeting these requirements by providing faster devices and allowing for the realization of more complex functions in a given silicon area, but simultaneously it brings new challenges, the most important of which is the decreasing supply voltage.
Based on the switched capacitor (SC) technique, the pipelined architecture has most successfully exploited the features of CMOS technology in realizing high-speed high-resolution ADCs. An analysis of the effects of the supply voltage and technology scaling on SC circuits is carried out, and it shows that benefits can be expected at least for the next few technology generations. The operational amplifier is a central building block in SC circuits, and thus a comparison of the topologies and their low voltage capabilities is presented.
It is well-known that the SC technique in its standard form is not suitable for very low supply voltages, mainly because of insufficient switch control voltage. Two low-voltage modifications are investigated: switch bootstrapping and the switched opamp (SO) technique. Improved circuit structures are proposed for both. Two ADC prototypes using the SO technique are presented, while bootstrapped switches are utilized in three other prototypes.
An integral part of an ADC is the front-end sample-and-hold (S/H) circuit. At high signal frequencies its linearity is predominantly determined by the switches utilized. A review of S/H architectures is presented, and switch linearization by means of bootstrapping is studied and applied to two of the prototypes. Another important parameter is sampling clock jitter, which is analyzed and then minimized with carefully-designed clock generation and buffering.
The throughput of ADCs can be increased by using parallelism. This is demonstrated on the circuit level with the double-sampling technique, which is applied to S/H circuits and a pipelined ADC. An analysis of nonidealities in double-sampling is presented. At the system level parallelism is utilized in a time-interleaved ADC. The mismatch of parallel signal paths produces errors, for the elimination of which a timing skew insensitive sampling circuit and a digital offset calibration are developed.
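To illustrate the kind of path mismatch the paragraph above refers to, the sketch below removes per-channel DC offsets from a time-interleaved sample stream by subtracting each channel's mean deviation from the overall mean. This is a generic averaging approach, not necessarily the digital offset calibration developed in the thesis. The function name and the assumption that sample i comes from channel i modulo the channel count are hypothetical, and the method presumes the input signal has the same long-term mean in every channel.

```python
def calibrate_offsets(samples, n_channels):
    """Remove per-channel DC offset from a time-interleaved sample stream.

    Sample i is assumed to come from channel i % n_channels.
    """
    sums = [0.0] * n_channels
    counts = [0] * n_channels
    for i, s in enumerate(samples):
        sums[i % n_channels] += s
        counts[i % n_channels] += 1

    overall_mean = sum(samples) / len(samples)
    # Offset estimate: how far each channel's mean sits from the overall mean.
    offsets = [
        sums[c] / counts[c] - overall_mean if counts[c] else 0.0
        for c in range(n_channels)
    ]
    return [s - offsets[i % n_channels] for i, s in enumerate(samples)]


if __name__ == "__main__":
    # A zero input seen through two channels with offsets +0.1 and -0.1.
    raw = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
    print(calibrate_offsets(raw, 2))
```

Uncorrected, such offsets appear as a repeating pattern with a period equal to the number of channels; timing skew between channels is a separate error and is not addressed by this sketch.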
A total of seven prototypes are presented: two double-sampled S/H circuits, a time-interleaved ADC, an IF-sampling self-calibrated pipelined ADC, a current steering DAC with a deglitcher, and two pipelined ADCs employing the SO technique.
Simulation of the upgraded Phase-1 Trigger Readout Electronics of the Liquid-Argon Calorimeter of the ATLAS Detector at the LHC
In the context of an intensive upgrade plan for the Large Hadron Collider (LHC) to provide proton beams of increased luminosity, a revision of the data readout electronics of the Liquid-Argon Calorimeter of the ATLAS detector is scheduled. This is required to retain the efficiency of the trigger at increased event rates despite its fixed bandwidth. The focus lies on the early digitization and finer segmentation of the data provided to the trigger. Furthermore, there is the possibility to implement new energy reconstruction algorithms which are adapted to the specific requirements of the trigger. To support crucial design decisions, such as the digitization scale or the choice of digital signal processing algorithms, comprehensive simulations are required. High trigger efficiencies are decisive for the successful continuation of the measurements of rare Standard Model processes as well as for a high sensitivity to new physics beyond the established theories. It can be shown that a significantly improved resolution of the missing transverse energy calculated by the trigger is achievable due to the revised segmentation of the data. Various energy reconstruction algorithms are investigated in detail. It can be concluded that these will facilitate reliable trigger decisions for all expected working conditions and for the whole possible energy range.