Read-only memory (ROM) is widely implemented as a phase-to-amplitude mapping block in direct digital frequency synthesizers (DDFS). This paper derives an equivalent model for the ROM in a DDFS to analyze and reduce the access time that is critical to the performance of the DDFS. Moreover, the signal skew observed in the simulation waveform is illustrated. The proposed 64×3-bit ROM is integrated as a part of an 8-bit DDFS, which operates functionally at 6 GHz. Measurement results demonstrate the improvement in the spur free dynamic range.
Direct digital frequency synthesizers (DDFSs) are widely used in communication systems, chirp radar systems, and phased array antennas. To exploit DDFSs in broadband communication systems, DDFS designs operating at GHz-range clock frequencies are required. A direct digital synthesizer can be implemented from a phase accumulator, a phase-to-amplitude mapping block, and a digital-to-analog converter. The phase-to-amplitude mapping block is the key to a high performance DDFS. Many architectures and designs for the phase-to-amplitude mapping block in a DDFS have been reported in the literature. Phase-to-amplitude mapping methods are mainly based on ROM-based designs [1], computational mapping designs [2, 3], or both [4-6]. The increasing demand for higher speed DDFS circuits and the frequency limitations in CMOS technologies have necessitated the development of DDFSs implemented using heterojunction bipolar transistor (HBT) technology. Although indium phosphide (InP) HBT based circuits tend to work at a high frequency, high cost and low yield have limited the development of InP HBT based large scale integrated (LSI) circuits. Gallium arsenide (GaAs) HBT technology, which combines high frequency and high yield with a moderate price, is well suited to mixed-signal integrated circuits with a high level of complexity, such as an ultrahigh speed DDFS with high spur free dynamic range (SFDR).
Compared with a DDFS with computational mapping [3], the one with both ROM and computational mapping [7] was found to have a higher SFDR. However, the ROM is often the limiting factor for the speed of a DDFS, because it has to support clock rates on the order of two and a half times the synthesized frequency [8]. Many technologies have been adopted to realize a large, high speed ROM. The fastest CMOS ROM reported operates at frequencies up to 1.1 GHz [9]. A 64-bit, 5-GHz read-write look-up table (LUT) has been implemented in GaAs HBT [10], while an InP HBT 36-GHz, 16×6-bit ROM test circuit has also been fabricated [11], with the output voltage amplitude falling from 330 mV at 20 GHz to 160 mV at 36 GHz.
Based on theoretical analysis, simulation and experimental results, we introduce a 64×3-bit, 6-GHz ROM fabricated using a 1 μm GaAs HBT process for DDFS application.
Design and analysis
The 64×3-bit ROM was integrated as part of the phase-to-amplitude mapping block in a DDFS [7]. The performance of the ROM is critical for a high speed DDFS. The overall speed of the DDFS could be increased by reducing the access time of the ROM. An array-structured memory organization is adopted for the architecture. An equivalent model is derived for analysis of the access time of the ROM.
ROM architecture
The 64×3-bit ROM is organized as an 8-row by 24-column array, with each block of 8 columns holding 3-bit words. Three X-inputs together with three Y-inputs select the addressed word. The architecture of the 64×3-bit ROM is illustrated in Figure 1.
The ROM address decoding is divided into two parts, with the 3 most significant bits (MSBs) for row decoders and the other 3 bits for column decoders. The differential outputs from the accumulator select the bit values stored in the ROM memory cell array. The sense amplifier converts the bit values into ECL voltage levels and drives the digital-to-analog converter (DAC) to obtain the analog output.
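The address split described above can be sketched in a few lines of Python. This is only an illustrative model, not the authors' circuit: the function names are hypothetical, and the assumption that each 8-column block stores one bit of every 3-bit word is one possible reading of the 8 × 24 organization that the text does not confirm.

```python
# Hypothetical model of the 8-row x 24-column organization.
# Assumption (not confirmed by the source): block b of 8 columns holds bit b of each word.
ROWS, COLS_PER_BLOCK, WORD_BITS = 8, 8, 3


def split_address(addr):
    """Split a 6-bit address: the 3 MSBs select the row, the 3 LSBs the column."""
    if not 0 <= addr < ROWS * COLS_PER_BLOCK:
        raise ValueError("address must fit in 6 bits")
    return addr >> 3, addr & 0b111


def build_array(words):
    """Place 64 three-bit words into an 8 x 24 array of single bits."""
    array = [[0] * (COLS_PER_BLOCK * WORD_BITS) for _ in range(ROWS)]
    for addr, word in enumerate(words):
        row, col = split_address(addr)
        for b in range(WORD_BITS):
            array[row][b * COLS_PER_BLOCK + col] = (word >> b) & 1
    return array


def read_word(array, addr):
    """Read back the 3-bit word selected by one row line and one column line."""
    row, col = split_address(addr)
    return sum(array[row][b * COLS_PER_BLOCK + col] << b for b in range(WORD_BITS))


# Arbitrary placeholder contents, not the sine table used in the paper.
words = [(5 * a + 3) % 8 for a in range(64)]
array = build_array(words)
```

Reading every address back with `read_word` returns the stored word, which is a quick check that the row and column split covers all 64 locations exactly once.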
Memory cell
The ROM memory cell can be designed using a diode to assign a "1" or "0". However, when the diode is selected by the word line, the current in the diode flows from the row decoder, which imposes a challenge on the fan-out of the row decoder. Memory cells consisting of one transistor per cell greatly reduce the drive required from the row decoder. The 64×1-bit memory cell array with a pull-up transistor, pull-down current source and sense amplifier is illustrated in Figure 2.
The base of the memory cell transistor is connected to the word line and its collector and emitter are connected to ground and the bit line, respectively. To assign the bit value, a differential bit line is adopted for the following two reasons. First, a differential signal provides large noise margins and low noise sensitivity, because common-mode noise signals, i.e., signal disturbances common to both differential bit lines, are rejected to a large degree. Second, the small logic swings it permits reduce the access time of the ROM.
The bit value stored in the memory cell transistor is determined by its emitter, which is connected to the high or low line. If the emitter is connected to the high line, the bit value is "1". When the memory cell is selected, a current of $I_{\text{on}}$ flows through the memory cell. The current in an unselected memory cell is $I_{\text{off}}$. The equivalent transient model for a ROM with differential bit lines is illustrated in Figure 3.
Because the equivalent resistance of the bit line is negligible, it is reasonable to model the bit line as a single capacitance. Moreover, all the capacitances connected to the bit line can be lumped into one capacitance $C_{\text{bit}}$. As the current in the memory cell is controlled by the word line, the current in the bit line can be modeled as a pulse current source. The propagation delay $t_p$ caused by the RC network shown in Figure 3(b) is expressed as
where ∆V is the amplitude of the voltage swing on the bit line.
Assume there are n transistors connected to the high line, and thus (8 − n) transistors connected to the low line, as shown in Figure 2. Analysis shows that the current difference between the high and low lines in one column can be expressed as
It should be noted that $\Delta I$ for n = 0 is the same as that for n = 8. Further analysis shows that when the selected memory cell is changed from "1" to "0" or "0" to "1", the current to charge $C_{\text{bit}}$ can be expressed as
$$I_{\text{bit}} = I_{\text{on}} - I_{\text{off}} \tag{3}$$
Substituting eq. (3) into eq. (1) yields
Figure 4. Schematic diagram of the row decoder circuit.
From the equations above, we know that minimizing the off-state current $I_{\text{off}}$ and increasing the on-state current $I_{\text{on}}$ maximizes the current difference between the high and low lines and reduces the access time of the memory cell. Reducing the capacitance connected to the bit line is also very important. The on- and off-state currents of the memory cell are determined by the ROM address decoder.
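The dependence of the bit-line delay on the cell currents and the bit-line capacitance can be illustrated with a small numerical sketch. It is not the paper's derivation: it assumes that exactly one cell per column is selected, that the bit line charges at constant current (so the delay is simply capacitance times swing divided by current), and all component values are hypothetical placeholders. The original expressions may contain additional factors.

```python
def line_currents(n_high, selected_on_high, i_on, i_off, cells=8):
    """Return (I_high, I_low) for one column with exactly one selected cell.

    n_high cells have their emitters on the high line; the rest are on the low line.
    """
    n_low = cells - n_high
    if selected_on_high:
        if n_high == 0:
            raise ValueError("no cell on the high line to select")
        return i_on + (n_high - 1) * i_off, n_low * i_off
    if n_low == 0:
        raise ValueError("no cell on the low line to select")
    return n_high * i_off, i_on + (n_low - 1) * i_off


def charging_current(i_on, i_off):
    """Current step on a bit line when the selected cell moves to the other line."""
    return i_on - i_off


def bit_line_delay(c_bit, delta_v, i_bit):
    """Constant-current estimate (an assumption): t = C_bit * delta_V / I_bit."""
    return c_bit * delta_v / i_bit


# Hypothetical values, chosen only for illustration.
I_ON, I_OFF = 1e-3, 1e-5        # amperes
C_BIT, DELTA_V = 100e-15, 0.2   # farads, volts

baseline = bit_line_delay(C_BIT, DELTA_V, charging_current(I_ON, I_OFF))
leaky = bit_line_delay(C_BIT, DELTA_V, charging_current(I_ON, 10 * I_OFF))
```

In this model a tenfold increase in the off-state current lengthens the estimated delay, and halving `C_BIT` halves it, which matches the qualitative conclusions drawn in the text.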
Row and column decoders
Bipolar decoders are typically built either from ECL NOR gates [12] with complemented inputs or from AND gates [8] using diode logic. For the diode AND gate, a static current is supplied through the input diodes; thus, the output of the diode decoder begins to change as soon as its inputs change, rather than once the inputs cross a threshold voltage. Therefore, the output swing is determined by the input swing, since there is no level-restoring gate. A NOR gate decoder, with its level-restoring gate structure, allows smaller swings on the heavily loaded address lines than on the gate output. These advantages are balanced by a potentially increased delay due to the level-restoring gain stage.
The ECL NOR gate based row decoder circuit is illustrated in Figure 4 . The three Y-address inputs are decoded in the same way through similar circuitry.
$C_{\text{load}}$ is the capacitance connected to the word line. The circuit contains a three-input NOR gate and an emitter follower. Eight row decoders, as shown in Figure 1, together fully decode the 3-bit input address. The decoder output is high only when all the inputs are low; any high input pulls the output down to a low voltage level. The output high and low voltage levels are critical to minimizing the delay caused by the memory cell. According to eq. (4), the high and low voltage levels should be carefully chosen to maximize $I_{\text{bit}}$. It should also be noted that increasing the swing of the decoder output will increase the delay of the decoder.
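The logical behavior of the eight NOR-based row decoders can be checked with a short sketch. It models only the logic function, not the ECL voltage levels or the emitter follower, and it assumes that each gate receives every address bit in either true or complemented form (available from the differential accumulator outputs). The function names are hypothetical.

```python
def nor(*inputs):
    """Output is high only when every input is low."""
    return int(not any(inputs))


def row_decoder(address, bits=3):
    """Return the word-line levels (one-hot) for a 3-bit row address.

    Gate k receives the true address bit where k has a 0 and the complemented
    bit where k has a 1, so all of its inputs are low only when address == k.
    """
    a = [(address >> i) & 1 for i in range(bits)]
    lines = []
    for k in range(1 << bits):
        gate_inputs = [a[i] if ((k >> i) & 1) == 0 else 1 - a[i] for i in range(bits)]
        lines.append(nor(*gate_inputs))
    return lines


word_lines = row_decoder(0b101)
```

For address 5 only word line 5 is high; every other gate sees at least one high input and is pulled low.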
Sense amplifier
The logic swing on the bit lines is kept small to reduce the delay and speed up the memory cell. However, the output of the memory should have enough drive capability to drive the next stage logic circuit.
The sense amplifier connected to the data line pair converts the current difference into voltage output. The differential output of the sense amplifier then goes through the emitter follower to a final ECL output buffer, which restores the read data to normal ECL voltage levels and drives the DAC in the next stage of the DDFS.
Simulation results and discussion
A simulation was carried out to estimate the performance of the ROM. The simulated access time of the ROM was about 130 ps. The bit values stored in the ROM are random. When the stored bit values are read out, the output bit pattern may be a single logic low followed by a long string of logic highs, or vice versa. The simulation results show that signal skew and glitches may arise and degrade the performance of the ROM. In a DDFS, as the bit values stored in the ROM represent the amplitude of the sine wave, bit errors caused by the ROM degrade the SFDR of the DDFS.
The bit error of the ROM arises from the skew in the ROM address decoders and parasitic capacitance of the bit lines in the memory cell array.
Signal skew is observed because the rows and columns do not all have the same decoding time. The worst case delay occurs when all three input bits to the decoder are switched simultaneously [11]. Columns 0 and 4 have the longest delay because all the address bits in the column inputs must be switched simultaneously when going from column 7 to column 0, and from column 3 to column 4, respectively. In a DDFS, the ROM output should be aligned with other phase-to-amplitude mapping blocks under the clock frequency. When skew occurs, the setup and hold time margins of the data flip-flop are reduced, and consequently, the possibility of bit errors in the ROM is increased.
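The claim about columns 0 and 4 can be verified by counting how many address bits toggle between consecutive column addresses. The sketch below assumes a plain binary count that wraps from 7 to 0; the helper name is hypothetical.

```python
def bits_switched(prev, nxt, width=3):
    """Number of address bits that toggle when moving from prev to nxt."""
    mask = (1 << width) - 1
    return bin((prev ^ nxt) & mask).count("1")


# Toggle count for every step of a wrapping 3-bit column count.
transitions = {(c, (c + 1) % 8): bits_switched(c, (c + 1) % 8) for c in range(8)}

# Destination columns reached by switching all three bits at once.
worst = sorted(dest for (src, dest), n in transitions.items() if n == 3)
```

Only the steps 3 to 4 (011 to 100) and 7 to 0 (111 to 000) toggle all three bits, so `worst` contains columns 0 and 4; every other step toggles one or two bits.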
There is another reason for bit errors in the ROM. As discussed above, the memory cell capacitance is charged through the current flowing in the bit lines. At high frequency, the current is insufficient to charge the capacitance, and the resulting slow transient response appears as skew. The skew in row 2 and column 7 of the bit pattern depicted in Table 1 is shown in Figure 5.
The bit pattern in row 2 is "00110001", which contains more "0"s than "1"s, so the low line of this block has more bit transistors connected to it than the high line, and thus it has a greater parasitic capacitance than the high line, due to the additional base-emitter capacitance from the low bit transistors. The leakage current of unselected memory cells reduces the effective current available to charge the capacitance. Moreover, the bit pattern in row 3 is "00111000". When the address increases from row 2 to row 3, the bit pattern becomes "…1000100…", that is, three "0" bits followed by only one "1" bit. As the response time is proportional to the capacitance and inversely proportional to the current, at a high clock frequency the parasitic capacitance does not have enough time to respond, causing signal skew. This phenomenon has also been reported previously [11], although there it was observed with a bit pattern of "1111111011111110" from a single column. The leakage current of unselected memory cells reduces the current amplitude of the bit lines, and thus the parasitic capacitances on the bit lines require more time to charge and discharge. When working at a high frequency, and with the ROM programmed with a bit pattern of a single logic low followed by a long string of logic highs, or vice versa, skew may arise and degrade the performance of the ROM. Thus, the word and bit lines should be kept as short as possible to reduce the layout parasitics.
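The patterns discussed above can be inspected with a small sketch that locates isolated bits, meaning a single bit that differs from both of its neighbors in the read-out stream. It assumes the columns of a row are read in order before moving to the next row, and the helper name is hypothetical.

```python
def isolated_bits(stream):
    """Indices of bits that differ from both their left and right neighbors."""
    return [
        i
        for i in range(1, len(stream) - 1)
        if stream[i] != stream[i - 1] and stream[i] != stream[i + 1]
    ]


row2, row3 = "00110001", "00111000"
stream = row2 + row3              # read-out order when the address steps from row 2 to row 3

zeros, ones = row2.count("0"), row2.count("1")
lonely = isolated_bits(stream)    # the single "1" in column 7 of row 2
reference = isolated_bits("1111111011111110")  # pattern quoted from [11]
```

In both streams the isolated bit is the one that has the least time to charge or discharge the bit line, which is where the text reports the skew.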
Experimental results
The ROM was fabricated in 1 μm GaAs HBT, with $f_t$ and $f_{\max}$ both around 60 GHz. The 64×3-bit ROM was integrated as part of an 8-bit DDFS. The six least significant bits of the 8-bit accumulator outputs in the DDFS were set as ROM addresses. The ROM converts the phase to a 3-bit sine wave amplitude and drives the DAC in the next stage to obtain an analog output. Using 700 GaAs HBTs, the total area of the ROM is 1.2 × 0.6 mm². The ROM draws a current of 130 mA from a 4.6 V power supply. A microphotograph of the 8-bit DDFS with the 64×3-bit ROM is shown in Figure 6. The DDFS output waveform with the DDFS clocked at 5 GHz and the FCW set to 1 is shown in Figure 7. With the FCW set to 1, the output frequency of the DDFS was $1 \times 5000 / 2^{8} = 19.531$ MHz with an SFDR of 39.5 dBc. With this FCW, the phase increased every clock cycle, so the ROM output changed every clock cycle. Figure 8 shows a measured SFDR of 40.58 dBc with the output frequency 1.816 GHz and FCW = 0x5D under a 5-GHz clock. As shown in Figure 9, the DDFS had an output frequency of 2.367 GHz with an SFDR of 33.96 dBc under a 6-GHz clock. The measured SFDR is higher than that of a DDFS [3] without ROM. The effectiveness of the proposed ROM for DDFS application has been verified by the measurement results. Table 2 compares our results with some recently published high speed ROM performances. The delay in the proposed ROM is greater than that in others, due to the lower power consumption and the output load capacitance.
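The output frequencies quoted above follow from the usual DDFS relation, output frequency = FCW × clock frequency / 2^N with N = 8 accumulator bits. The sketch below reproduces the two cases for which the text states the FCW; the function names are hypothetical, and the address mask simply reflects the statement that the six least significant accumulator bits address the ROM.

```python
def ddfs_output_mhz(fcw, f_clk_mhz, acc_bits=8):
    """Output frequency in MHz for a given frequency control word (FCW)."""
    return fcw * f_clk_mhz / 2 ** acc_bits


def rom_address(acc_value):
    """Six least significant bits of the 8-bit accumulator output."""
    return acc_value & 0x3F


f_fcw1 = ddfs_output_mhz(1, 5000)       # FCW = 1 under a 5-GHz clock
f_fcw5d = ddfs_output_mhz(0x5D, 5000)   # FCW = 0x5D under a 5-GHz clock
```

These evaluate to 19.53125 MHz and 1816.40625 MHz, consistent with the reported 19.531 MHz and 1.816 GHz.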
Conclusion
We have proposed a 64×3-bit ROM for DDFS application. The operating frequency is as high as 6 GHz. This suggests that a ROM at microwave frequencies can be implemented using a GaAs HBT process. Moreover, power dissipation is not excessively high. The proposed ROM could be implemented as part of a DDFS for a better SFDR. 
