Abstract: Retro-directive antennas (RDAs) are a promising technique for communication systems because they provide self-tracking of moving terminals. However, many classic implementations can only re-transmit a received message and/or support only small frequency gaps between received and transmitted signals. To overcome these limitations, a system architecture based on phase-locked loops (PLLs) has been proposed which allows for arbitrary frequency gaps and different array geometries, and which makes use of the array gain on reception. In this work, we present a digital implementation of this retrodirective receiver intended for realization on a field-programmable gate array (FPGA). We demonstrate the phase detection and downmix performance by digital hardware simulation with noisy input signals. The dynamic system behavior during signal acquisition and re-acquisition is shown in the time domain. Suitability for communication systems is analyzed by subjecting the system to binary phase shift keying (BPSK) modulated signals. We provide a simulation-based proof of concept for the proposed RDA system and its general ability to simultaneously process a communication signal and estimate its direction-of-arrival (DOA).
I. INTRODUCTION
Retro-directive antenna (RDA) systems have been investigated for a couple of decades. Their main advantage is the re-transmission of an incoming signal in the direction of the sender without sophisticated digital signal processing (DSP) [1]. Such a simple re-transmission can be employed, e.g., for radar applications and has been considered as an alternative to corner reflectors [2]. In communications, however, the transmitted signal must differ from the received signal. Moreover, regulations often demand that the transmit and receive frequencies be considerably different. Classical retro-directive antenna concepts such as Van Atta [3] and Pon arrays [4] suffer from array squint, i.e., a beam pointing error due to the frequency difference. They also cannot make use of the array gain on receive because every element processes the incoming signal individually.
To meet the demands of communication applications, active systems have to be employed. In this paper we present a retrodirective receiver architecture capable of beamforming on the receive side and array squint correction on the transmit side. To demonstrate its performance, the system was implemented in digital hardware using the VHSIC hardware description language (VHDL). Its behavior is analyzed in simulations using the commercial software tool ModelSim by Mentor Graphics. The architecture comprises one receiver unit per antenna element in the array.
II. PLL-BASED DOWNMIX AND PHASE DETECTION
It has been demonstrated in [5] that simultaneous downmix and geometric phase detection of a radio frequency (RF) signal is possible with a front-end architecture based on nested phase-locked loops (PLLs). One channel of the system is illustrated in Fig. 1.
Let us assume an incoming RF signal from a certain direction θ which impinges on a uniform linear array (ULA) of N elements with spacing d. The received signal at the i-th array element can be written as
with A_i being the signal amplitude, ω_RF the frequency, and φ_RF(t) the signal phase.
δ_i(θ) denotes the geometric phase resulting from the RF wavelength λ and the position of the i-th element within the ULA, i ∈ {1, ..., N}. The received signal x_RF,i is mixed down using the local oscillator (LO) signal y_LO,i from the synthesizer PLL. A subsequent lowpass filter rejects the higher mixing products. The downmixed signal is then given by
where A_IF,i is the signal amplitude after mixing and filtering, and ω_LO and φ_LO are the instantaneous frequency and phase at the LO synthesizer PLL output.
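As a compact sketch of the signal model described above, the three relations can be written as follows. The cosine carrier, the choice of the first element as phase reference, the sign of the geometric phase, and low-side LO injection are assumptions made here for illustration; other conventions are equally valid.

```latex
\begin{align}
x_{\mathrm{RF},i}(t) &= A_i \cos\bigl(\omega_{\mathrm{RF}}\, t + \varphi_{\mathrm{RF}}(t) + \delta_i(\theta)\bigr), \\
\delta_i(\theta) &= \frac{2\pi d}{\lambda}\,(i-1)\,\sin\theta, \qquad i \in \{1,\dots,N\}, \\
y_{\mathrm{IF},i}(t) &= A_{\mathrm{IF},i} \cos\bigl((\omega_{\mathrm{RF}}-\omega_{\mathrm{LO}})\,t + \varphi_{\mathrm{RF}}(t) + \delta_i(\theta) - \varphi_{\mathrm{LO}}\bigr).
\end{align}
```

Under this convention, neighboring elements differ in geometric phase by $2\pi d \sin\theta/\lambda$, independent of $i$.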
After the lowpass, the intermediate frequency (IF) signal is fed to a phase-frequency detector (PFD) which compares it to the reference signal x_ref. This reference is common to all array channels. The error signal from the comparison is processed by a loop filter and fed to a voltage-controlled oscillator (VCO) which drives the synthesizer PLL. When this PLL is in lock, its output phase is just the input signal phase scaled by the factor M. We write the signal from the VCO as
where ω_V,i and φ_V,i are the frequency and phase produced by the VCO. The output of the PLL after locking becomes
As can be seen from Fig. 1, the architecture is a feedback system. If it is stable, it will converge to a steady state, i.e., a constant error signal of the PFD. Combining (3) and (6) and assuming the reference signal
we obtain the VCO frequency and phase as
with n_i ∈ ℤ. That means the VCO output signal phase consists of a fixed frequency component and the RF signal phase, which is of course 2π-periodic. Assuming narrow-band signals and far-field conditions for the array, φ_RF(t) is the same at all antenna elements. Thus, it can be eliminated by comparing the VCO signal phases across the array channels, which leaves the geometric information δ_i(θ). Through pairwise phase comparison of neighboring channels we arrive at
where (2) has been used and ñ_i = n_{i+1} − n_i. Since M, d, and λ are fixed system parameters, Δ_i depends only on the direction-of-arrival (DOA) θ of the RF signal. The scaling by 1/M and the phase ambiguity in (10) present a challenge. The DOA estimate is obtained from
where ñ_i is chosen such that the argument of the arcsin lies in the valid range [−1, 1]. Because of the discontinuities of the argument, DOA estimation is expected to become difficult near ±90°. However, for most communication applications a reduced field of view is acceptable.
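The chain from the lock condition to the DOA can be sketched as follows. This is a reconstruction under stated assumptions rather than a verbatim restatement: low-side LO injection, an unmodulated reference $x_{\mathrm{ref}}(t)=\cos(\omega_{\mathrm{ref}} t)$, an ideal synthesizer with multiplication factor $M$, and a geometric phase step of $2\pi d \sin\theta/\lambda$ between neighboring elements.

```latex
\begin{align}
\text{lock:}\quad & (\omega_{\mathrm{RF}} - M\omega_{V,i})\,t + \varphi_{\mathrm{RF}}(t) + \delta_i(\theta) - M\varphi_{V,i}
   = \omega_{\mathrm{ref}}\,t - 2\pi n_i \\
\Rightarrow\quad & \omega_{V,i} = \frac{\omega_{\mathrm{RF}}-\omega_{\mathrm{ref}}}{M}, \qquad
   \varphi_{V,i} = \frac{\varphi_{\mathrm{RF}}(t) + \delta_i(\theta) + 2\pi n_i}{M}, \\
& \Delta_i = \varphi_{V,i+1}-\varphi_{V,i}
   = \frac{1}{M}\Bigl(\frac{2\pi d}{\lambda}\sin\theta + 2\pi \tilde n_i\Bigr),
   \qquad \tilde n_i = n_{i+1}-n_i, \\
& \hat\theta_i = \arcsin\!\Bigl(\frac{\lambda}{2\pi d}\bigl(M\Delta_i - 2\pi \tilde n_i\bigr)\Bigr).
\end{align}
```

For $d \le \lambda/2$ the term $M\Delta_i - 2\pi\tilde n_i$ lies in $[-\pi,\pi]$, so $\tilde n_i$ can be resolved by wrapping $M\Delta_i$ to that interval.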
The advantage of the proposed technique is that, in contrast to classic RDA systems, we explicitly obtain θ and do not employ a direct phase conjugation on the received signals.
Thus we are free to use an arbitrary frequency and array geometry for the return signal of our retro-directive transceiver.
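Because sin θ is obtained explicitly, transmit phases can be computed for any return frequency and element spacing. The following minimal sketch assumes a transmit ULA whose elements sit at multiples of `d_tx` and a conjugate-phase steering rule; the function name and the example frequency are hypothetical and not taken from the described system.

```python
import math

C0 = 299_792_458.0  # speed of light in m/s


def tx_steering_phases(sin_theta, n_elements, d_tx, f_tx):
    """Per-element transmit phases (rad) steering the beam toward the
    estimated direction at transmit frequency f_tx.

    Assumption: element k (k = 0..n-1) is located at k * d_tx, and a wave
    arriving from theta has phase +k_tx * x * sin(theta) at position x,
    so retransmission applies the conjugate (negative) phase.
    """
    wavelength_tx = C0 / f_tx
    k_tx = 2.0 * math.pi / wavelength_tx
    return [-k_tx * k * d_tx * sin_theta for k in range(n_elements)]


# Hypothetical example: DOA of 30 degrees, return link at 12 GHz,
# transmit elements spaced half a transmit wavelength apart.
f_tx = 12e9
d_tx = 0.5 * C0 / f_tx
phases = tx_steering_phases(math.sin(math.radians(30.0)), 4, d_tx, f_tx)
```

Since the phases are derived from the transmit wavelength rather than copied from the receive side, no beam pointing error arises from the frequency gap in this idealized model.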
Moreover, due to the common reference x_ref, the downmixed signals y_IF,i are phase-aligned. This means that their individual geometric phase information does not appear at the IF output. All IF signals can be combined to obtain a stronger output signal which still contains the phase modulation. Essentially, the system uses the tracking ability of the PLLs to perform phased-array-like beam steering. The beamforming coefficients are thereby contained in the LO signal phases.
III. DIGITAL IMPLEMENTATION
In general, the described system architecture is devised to be built with analog components. However, such an implementation is rather inflexible regarding, e.g., filter design and choice of components. Moreover, all analog parts are subject to component tolerances. Since our goal is a proof of concept, we present a digital implementation of the described receiver system in this work. This allows fast optimization of parameters such as filter coefficients, gain settings, and frequency plans. Furthermore, we are not limited to components available on the market. In the following, details about the implementation of the individual system parts are given.
A. Single Receiver Channel
The digital representation of the system is depicted in Fig. 2. The RF input signal has a word width of 10 bit and is fed directly to a 10 × 10 bit multiplier for downmixing. A subsequent 2nd-order Butterworth filter removes the higher mixing products. The downmixed signal is truncated in the filter and connected to the IF output.
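The downmix stage can be illustrated with a short behavioral model. This is a sketch under assumptions, not the VHDL design: the sample rate, tone frequencies, and cutoff are hypothetical, the filter coefficients come from a standard bilinear-transform Butterworth design, and the filter arithmetic is floating point rather than truncated fixed point.

```python
import math


def butterworth2_lowpass(fc, fs):
    """Biquad coefficients (b, a) of a 2nd-order Butterworth low-pass
    via the bilinear transform (cutoff fc, sample rate fs)."""
    k = math.tan(math.pi * fc / fs)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    return (b0, 2.0 * b0, b0), (1.0, a1, a2)


def downmix(rf, lo, b, a):
    """Multiply two integer sample streams and low-pass the product
    (direct form I biquad)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for rf_s, lo_s in zip(rf, lo):
        x0 = rf_s * lo_s  # 10 x 10 bit product
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x0
        y2, y1 = y1, y0
        out.append(y0)
    return out


def tone(freq, fs, n, bits=10):
    """Signed integer cosine with the given word width."""
    amp = 2 ** (bits - 1) - 1
    return [round(amp * math.cos(2.0 * math.pi * freq * k / fs)) for k in range(n)]


FS = 100e6                      # hypothetical sample rate
rf = tone(12e6, FS, 2000)       # hypothetical RF tone
lo = tone(11e6, FS, 2000)       # hypothetical LO tone
b, a = butterworth2_lowpass(2e6, FS)
if_signal = downmix(rf, lo, b, a)   # dominated by the 1 MHz difference tone
```

After the filter has settled, the output amplitude is close to half the product of the input amplitudes, while the sum-frequency product at 23 MHz is strongly attenuated.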
The sign bit of the IF signal passes through a Schmitt trigger in order to obtain clean sign transitions. It is then fed to a PFD which compares it to the reference input. The output pulse train of the PFD passes through the remaining PLL forward path consisting of loop filter, amplifier, and numerically controlled oscillator (NCO). The PFD was built using edge-triggered flip-flops, similar to the design shown in [6, p. 21]. The loop filters are 1st-order infinite impulse response (IIR) filters, corresponding to the lead-lag filters which are often used in analog PLLs. Fixed-point 18 × 12 bit multipliers were used, with the coefficients having a word width of 12 bit. The design of digital IIR filters is unproblematic in this case because the PFD output is limited to {−1, 0, 1}. Thus, the filter input is always bounded, and the necessary truncation of the signal word width does not impair precision. The amplifier is necessary to adjust the gain of the NCO. It was realized as an arithmetic bit shifter, thus allowing only amplification by powers of two. The NCO is an intellectual property (IP) core from Altera Corporation. It has a 32 bit frequency modulation (FM) input for steering and generates a 10 bit wide output signal. The forward path building blocks of the LO synthesizer PLL are identical. The additional frequency divider was realized as an edge counter. Table I summarizes the chosen implementation variants of the individual system parts.
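The behavior of a flip-flop based PFD with an output in {−1, 0, 1} can be sketched as a sample-based model. This is a generic three-state PFD, not the specific design of [6]; the sign convention (+1 while the signal input leads) and the helper names are assumptions.

```python
def three_state_pfd(sig, ref):
    """Generic three-state PFD on binary sample streams.

    A rising edge on `sig` sets UP, a rising edge on `ref` sets DOWN,
    and both flip-flops are cleared as soon as both are set.
    The output per sample is UP - DOWN, i.e. a value in {-1, 0, 1}.
    """
    up = down = 0
    prev_sig = prev_ref = 0
    out = []
    for s, r in zip(sig, ref):
        if s and not prev_sig:
            up = 1
        if r and not prev_ref:
            down = 1
        if up and down:          # reset path
            up = down = 0
        out.append(up - down)
        prev_sig, prev_ref = s, r
    return out


def square_wave(period, delay, n):
    """50 % duty cycle square wave, delayed by `delay` samples."""
    return [1 if ((k - delay) % period) < period // 2 else 0 for k in range(n)]


# Signal leads the reference by a quarter period (10 of 40 samples).
lead = three_state_pfd(square_wave(40, 0, 400), square_wave(40, 10, 400))
# Signal lags the reference by a quarter period.
lag = three_state_pfd(square_wave(40, 10, 400), square_wave(40, 0, 400))
```

The mean of the output equals the phase difference as a fraction of one period (here +0.25 and −0.25), which is the quantity the subsequent loop filter smooths.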
B. Array Output Signal Processing
The previously described receiver channel has to be employed in an array to work as an RDA system. We therefore describe how the IF and NCO output signals of the individual channels are combined to obtain an amplified downmixed signal and DOA estimates. Fig. 3 shows the processing within a four-channel receiver. All IF output signals are summed. Since they are phase-aligned to the common reference signal, constructive superposition occurs. Thus, the system makes use of the array gain, and the summed output signal amplitude is ideally N times that of a single channel in an N-element array.
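The benefit of phase alignment before summation can be checked with unit-amplitude phasors. The sketch below assumes ideal, noise-free channels; the example geometry (half-wavelength spacing, θ = 30°, hence a phase step of π/2 between neighbors) is chosen for illustration only.

```python
import cmath
import math


def sum_amplitude(phases):
    """Amplitude of the sum of unit-amplitude phasors with the given phases."""
    return abs(sum(cmath.exp(1j * p) for p in phases))


n = 4
# Geometric phases for d = lambda/2 and theta = 30 deg: steps of pi/2.
geometric = [k * math.pi / 2 for k in range(n)]

unaligned = sum_amplitude(geometric)   # direct sum of the element signals
aligned = sum_amplitude([0.0] * n)     # after alignment to a common reference
```

In this example the unaligned signals cancel completely, whereas the aligned signals add to N = 4 times the single-channel amplitude, i.e. 20·log10(4) ≈ 12 dB.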
To get the DOA of the received signal, we have to obtain (10). PFDs are used to compare the NCO signals pairwise. In order to smooth out small fluctuations of the NCO phase, we average the phase difference over several cycles. This is done with a duty cycle counter, which counts only when the PFD output signal is non-zero. The counter value is compared to a continuously running reference counter. Thus, we obtain an averaged measure of the PFD pulse widths, which are proportional to the phase difference of the input signals. Averaging is necessary because the VCO outputs always vary slightly in frequency due to the continuous corrections of the PLLs. Therefore, we use a counter width w_DC which is larger than the phase detection word width w_ϕ.
Regarding the attainable phase resolution, it should be noted that phase information is obtained from time differences. Because of discrete sampling, this information is quantized. Given a sample rate ω_det, the smallest detectable phase difference in one VCO period is given by
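A duty cycle counter of this kind can be modeled in a few lines. The sketch assumes that the window length equals 2^w samples of the reference counter and that the sign of the phase is taken from the sign of the PFD pulses; both choices, and the function name, are assumptions for illustration.

```python
import math


def duty_cycle_phase(pfd_out, width_bits):
    """Count non-zero PFD samples over a window of 2**width_bits samples
    and convert the duty cycle to a phase in radians.

    Returns (count, phase). The sign follows the sign of the PFD pulses.
    """
    window = 1 << width_bits
    if len(pfd_out) < window:
        raise ValueError("need at least one full counter window")
    active = 0
    signed_sum = 0
    for v in pfd_out[:window]:
        if v != 0:
            active += 1
            signed_sum += v
    sign = 1 if signed_sum >= 0 else -1
    return active, sign * 2.0 * math.pi * active / window


# PFD pulse train for a quarter-period lead: 10 of every 40 samples active.
pulse_train = ([1] * 10 + [0] * 30) * 26
count, phase = duty_cycle_phase(pulse_train, 10)
```

Because the 1024-sample window does not contain an integer number of 40-sample periods, the estimate is about 91.4° instead of exactly 90°; longer windows reduce this truncation effect.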
where (8) was used. This means that, besides averaging, an increase of ω det yields a higher phase detection precision. Thereby, only the VCO, PFD and duty cycle counter need to run at the higher sampling rate, not the whole system.
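The quantization step follows from one detector sample period expressed as VCO phase, i.e. 2π·f_V/f_det. The following sketch evaluates this relation; the two frequencies are hypothetical values chosen for illustration and are not taken from Table II.

```python
def phase_resolution_deg(f_vco, f_det):
    """Smallest detectable phase step in degrees: one detector sample
    period (1 / f_det) expressed as a fraction of the VCO period."""
    return 360.0 * f_vco / f_det


# Hypothetical operating point: 2.25 MHz VCO, 100 MHz detector clock.
res = phase_resolution_deg(2.25e6, 100e6)
# Doubling the detector clock halves the step.
res_fast = phase_resolution_deg(2.25e6, 200e6)
```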
C. DOA estimation
We assume that the duty cycle outputs D_DC,i are unsigned and have a word width of w_ϕ. The averaged phase difference between channels i and i + 1 is then given by

Δ_i = 2π · D_DC,i / 2^(w_ϕ).

To obtain the DOA, (10) has to be solved for θ, which yields N − 1 solutions. To arrive at a single DOA estimate and to reduce the influence of small PLL fluctuations, we average across the N − 1 estimates.
Since the necessary calculations for (11) involve the arcsin(·) operation and divisions, they are performed externally. For the target system, explicit calculation of θ is not required, because sin(θ) is used for re-transmit beamforming. This value is available from (10). If θ is of importance in the real-time system, either a hardware lookup table or a softcore processor which allows floating point operations can be used.
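The external post-processing can be sketched as follows. The mapping from counter value to phase (2π·D/2^w), the integer multiplication factor `m`, element spacing d ≤ λ/2 (so that the ambiguity is resolved by wrapping), and all names and example numbers are assumptions for illustration.

```python
import math


def wrap_to_pi(x):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi


def doa_from_counts(counts, width_bits, m, d_over_lambda):
    """Estimate sin(theta) and theta from duty cycle counter outputs.

    counts        : counter values of the N - 1 neighboring channel pairs
    width_bits    : word width of the counter outputs
    m             : synthesizer multiplication factor (assumed integer)
    d_over_lambda : element spacing in wavelengths (assumed <= 0.5)
    Returns (mean sin(theta), mean theta in degrees).
    """
    sines = []
    for c in counts:
        delta = 2.0 * math.pi * c / (1 << width_bits)   # VCO phase difference
        geometric = wrap_to_pi(m * delta)               # removes the 2*pi ambiguity
        s = geometric / (2.0 * math.pi * d_over_lambda)
        sines.append(max(-1.0, min(1.0, s)))            # keep arcsin argument valid
    thetas = [math.degrees(math.asin(s)) for s in sines]
    return sum(sines) / len(sines), sum(thetas) / len(thetas)


# Hypothetical counts for theta = 30 deg, m = 4, d = lambda/2, 10 bit counters.
# The middle pair carries an ambiguity of one full cycle.
sin_theta, theta_deg = doa_from_counts([64, 320, 64], 10, 4, 0.5)
```

The first return value is the quantity needed for re-transmit beamforming; the arcsin is only required when θ itself is of interest.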
IV. SIMULATION SETUP AND RESULTS
To demonstrate the performance of the downmix and phase detection module, a four-element ULA was simulated in ModelSim, using the setup shown in Fig. 3. A common reference signal is generated by an NCO and distributed to the ref_in inputs of all channels. Individual NCOs create the received RF signals. Via the phase modulation (PM) inputs we can simulate the geometrical phase difference between the antenna elements and perform carrier modulation. Additive white Gaussian noise signals N1, ..., N4 are added to the RF signals before they are fed to the receiver channels. The noise signals are generated in MATLAB and read in during simulation run time.
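Generating noise for a prescribed SNR amounts to scaling the noise standard deviation to the signal power. The sketch below is a Python stand-in for this step, not the MATLAB script used for the simulations; the carrier frequency, sequence length, and seed are hypothetical.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise such that the signal-to-noise power
    ratio equals snr_db. Returns (noisy signal, noise sequence)."""
    p_sig = sum(v * v for v in signal) / len(signal)
    sigma = math.sqrt(p_sig / 10.0 ** (snr_db / 10.0))
    noise = [rng.gauss(0.0, sigma) for _ in signal]
    return [s + n for s, n in zip(signal, noise)], noise


rng = random.Random(1)  # fixed seed for reproducibility
carrier = [math.cos(2.0 * math.pi * 0.05 * k) for k in range(20000)]
noisy, noise = add_awgn(carrier, 10.0, rng)

# Verify the realized SNR from the generated sequences.
p_sig = sum(v * v for v in carrier) / len(carrier)
p_noise = sum(v * v for v in noise) / len(noise)
measured_snr_db = 10.0 * math.log10(p_sig / p_noise)
```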
The NCO signal outputs of the channels are processed as described in the previous section. The outputs of the duty cycle counters give us estimates for the Δ_i. The IF signals are added to give the combined output. Table II shows the system parameters used throughout the simulations. According to (12), we achieve a phase resolution of 8.1° for the output PFDs. The duration of one duty cycle counter run is 10.24 μs.
A. Initial Signal Acquisition and Re-Acquisition
In our first scenario we analyze the acquisition performance of the receiver. We assume that a carrier signal from θ = 30° impinges on our RDA. After 250 μs the incident angle changes abruptly to 60° and, after the same time span, to −45°, such that re-acquisition is necessary twice. The signal-to-noise ratio (SNR) for this simulation was 10 dB. First, we take a look at the initial acquisition. The IF sum signal is shown in Fig. 4 along with the IF signal of channel 1. The initially irregular shape of the red curve indicates strong phase corrections by the PLLs in the channel. These subside after a short time and the sum signal begins to rise in amplitude, indicating that the four IF signals start to superimpose constructively. After 30 μs, the amplitude of the sum signal is about four times that of the single channel, which means that all channels are locked to the reference. The sum signal amplitude continues to exhibit variations which are mainly caused by the noise and, to a lesser degree, by small corrections applied by the PLLs. We can state that acquisition is completed after approx. 30 IF cycles.
Let us now consider the IF output envelope over the whole simulation time, depicted in Fig. 5. Besides the acquisition process described before, we see amplitude drops caused by the changes of the incident angle. This is to be expected since the phase differences among the RF signals change abruptly and the PLLs have to re-acquire lock. The time for re-acquisition is about 20 μs. It is thus shorter than the initial locking process. For this simulation, the DOA estimates are of great interest. Fig. 6 shows the estimates θ̂ calculated from the duty cycle counter outputs compared to the true values of θ. As can be expected, we see high estimation errors during the initial acquisition and the re-acquisition processes. However, while the geometric phase does not change, the error is reasonably small. Apart from the deviations caused by the jumps of θ, the peak estimation errors are less than 9°. On average, the error is much lower than expected from (12). This accuracy is obtained by the time averaging of the duty cycle counters and the across-channel averaging applied in post-processing.
B. Phase Modulated Signals
For use in communication systems, the receiver must be able to work with modulated signals. To test this, we assume a scenario with a fixed incident direction of θ = 30°. For acquisition, only the carrier signal is present during the initial 75 μs. After that, the four RF signals are simultaneously modulated by a binary phase shift keying (BPSK) signal. We take the worst case of 0 and 1 being sent alternately, such that phase changes occur as often as possible. No pulse shaping is used, i.e., the modulation signal is a rectangular wave.

978-1-5090-1447-7/16/$31.00 ©2016 IEEE
Again, the SNRs of the RF signals are 10 dB each. We use a bit rate of 50 kbit/s, corresponding to a symbol duration of 20 μs. From the previous experiment we know that this is enough time for the system to reacquire lock if necessary. Thus we expect that if one or more PLLs go out of lock due to the modulation, they will still be able to capture the next phase change.
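The worst-case modulation signal described above can be written as a simple function of time. The sketch assumes that bit 0 maps to phase 0, bit 1 maps to phase π, and that the alternating sequence starts with a 0; these mappings and the function name are illustrative assumptions.

```python
import math


def bpsk_phase(t, t_start=75e-6, bit_rate=50e3):
    """Phase (rad) applied to the carrier at time t (seconds).

    Carrier only before t_start, afterwards an alternating 0/1 BPSK
    sequence with rectangular pulses (no pulse shaping).
    """
    if t < t_start:
        return 0.0
    symbol_index = int((t - t_start) * bit_rate)
    return math.pi if symbol_index % 2 else 0.0


symbol_duration = 1.0 / 50e3  # 20 microseconds at 50 kbit/s
samples = [bpsk_phase(t) for t in (10e-6, 80e-6, 100e-6, 120e-6)]
```

With these settings a phase jump of π occurs every 20 μs, which is the stress case for the loops since each jump may force a re-acquisition.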
Let us take a look at the DOA estimates for this scenario in Fig. 7. It can be seen that, after acquisition, the incident angle is correctly estimated over the whole simulation time. The estimation errors are smaller than 3°. The initial acquisition behavior of the system is the same as in the first simulation.
In Fig. 8 we examine the IF sum signal around 515 μs. We can clearly see regular phase jumps every 20 μs, i.e., the signal modulation. The figure shows a representative segment of the IF sum signal. Its shape is the same over the whole simulation time, such that the signal envelope stays constant. Thus, it can be expected that the received symbol sequence can be decoded without problems.
The re-acquisition time of the system plays a crucial role for the maximum processable data rate. If the system is not able to re-acquire within one symbol duration, loss of lock will occur in some or all channels and the symbols will no longer be detectable. This is a limitation of the system architecture which could be mitigated by the following measures. First, the system could be tuned for faster re-acquisition, which directly allows higher symbol rates. Second, the modulation could be removed from the IF signals prior to the reference PFD, e.g., by a Costas loop [7]. Third, one of the RF signals could be downmixed directly to IF and used as the reference, so that the reference follows the modulation and no re-acquisition is necessary.
V. CONCLUSION
In this paper we have introduced a downmix and phase detection circuit for digital implementation in VHDL. We have given details about the architecture and the key components used, along with information on how they were realized.
The DOA estimation ability of the proposed system was validated using a four-channel digital hardware simulation. We have demonstrated initial signal acquisition within 30 IF cycles at an SNR of 10 dB. For the case of an abruptly changing incident direction, we achieved successful re-acquisition after 20 IF cycles. The DOA estimation error was shown to be less than 9° after (re-)acquisition. The capability of the system to process phase modulated signals was shown in a simulation setup with fixed DOA, a data rate of 50 kbit/s, and 10 dB SNR. A constant IF sum signal amplitude was observed and the applied BPSK was clearly recognizable. Successful decoding of the transmitted symbols can therefore be expected. The DOA was estimated with an error below 3° during the whole signal transmission. The presented results show that the proposed system architecture is a suitable candidate for retro-directive antenna systems to be employed in communications.
