SSC22-P5-14
S-Band QPSK Transmitter for Picosatellites
Jay Deorukhkar, Peter W. Pachowicz
George Mason University
Department of Electrical and Computer Engineering, MS 1G5
4400 University Dr., Fairfax, VA 22031
jdeoruk2@gmu.edu, ppach@gmu.edu
ABSTRACT
The small satellite field has become popular among academia, the amateur satellite (AMSAT) community, and
commercial businesses due to the miniaturization of components and smaller form factors. Specifically, the
picosatellite structure has gained traction for its size and affordable launch fees. However, the size constraint
makes it difficult to generate power and limits the transmit power for downlink. Therefore, efficient data modulation
is key to providing high data downlink rates. Also, the typical VHF and UHF frequency spectrum used for satellites
is getting congested. Hence, higher frequency bands such as S-band and X-band are gaining traction and offer
higher data bandwidth. To address both issues, an architecture to implement QPSK modulation for S-band operation
is proposed. The design is focused on low-power picosatellites and the implementation is targeted at academia and
the AMSAT community.
1. INTRODUCTION

The recent growth in small satellites has enabled
academia, amateurs, and the commercial sector to
design and launch payloads for specific applications.
This can include complex payloads for earth
observation, ship tracking, scientific experiments,
interplanetary missions, and various others. The
technology has also evolved to allow for smaller form
factors ranging from multi-unit CubeSats to
picosatellites and even smaller.

The picosatellite form factor is particularly challenging
to work with, primarily due to its size. This smaller
satellite bus results in a smaller surface area for solar
cells, consequently generating less power. This puts a
limit on the payload and subsystems since they will
need to be designed with tighter power constraints. For
the transmitters, the transmit power will also be limited
for data downlink. This combination of small size and
limited power drives the need for efficient data
modulation structures, providing high data downlink
rates for complex payloads.

Along with the challenges of the transmitter, the
frequency spectrum used for satellites is getting
congested and noisy. Typically, satellites use the VHF
and UHF bands for uplink and downlink with ground
stations. However, with the growth of small satellites
and the need for higher data rates, more missions are
shifting towards the use of higher frequency bands.
Particularly, the S-band and X-band frequencies are
much less crowded and offer higher bandwidth
compared to VHF and UHF bands.

To address both issues, this paper provides an
architecture to implement QPSK modulation for S-band
operation on picosatellites. The following sections will
discuss the functionality, design, implementation,
testing, and analysis of the proposed architecture. This
is primarily directed for use by academia and the
AMSAT community.

2. FUNCTIONALITY
The entire QPSK modulation chain is shown in figure
1. It starts with the binary input data being split into
even and odd bits for the IQ channels. Then, the bits are
mapped to symbols based on the desired modulation
scheme. Next, the symbols are passed into the pulse
shaping filter to generate the baseband signals. Lastly,
both signals are modulated using an IQ modulator to
generate the RF signal at a specific carrier frequency
[1].

Deorukhkar

Figure 1: Full structure of a QPSK Modulator
3. DESIGN
3.1 Proposed Architecture
The proposed architecture for QPSK modulation is
shown in figure 2. The input data is received by the

1

36th Annual Small Satellite Conference

MachXO2 Field Programmable Gate Array (FPGA)
over Serial Peripheral Interface (SPI). The FPGA splits
the incoming data into two channels, maps it to the
appropriate symbols, and applies pulse shape filtering.
The resulting baseband signals in each channel are sent
out through a modified SPI interface where the two
outputs correspond to a differential pair output. The
first output is the positive signal (same as the baseband
signal) and the second is the negative signal (an
inverted version of the baseband signal).

Figure 2: Proposed QPSK architecture
Both sets of SPI outputs are fed into the two PMOD-DA2
2-channel Digital-to-Analog Converters (DACs).
These convert the differential digital baseband signals
into differential analog baseband signals. These are
connected to the differential inputs of the AD8346 IQ
modulator. Additionally, the FPGA controls the
ADF4351 Voltage Controlled Oscillator (VCO) over
SPI, which sets the oscillator frequency to the desired
carrier frequency. The VCO outputs the oscillator
signal as a differential pair, which is passed into the IQ
modulator. The result of the IQ modulator is the RF
signal at the upconverted carrier frequency.

3.2 Testing Architecture
For the testing architecture, the entire structure is the
same as the proposed architecture except for the FPGA.
In this case, the input data is received by the Artix-7
FPGA over SPI as shown in figure 3. The rest of the
QPSK modulation chain is exactly the same, resulting in
the same RF signal as before.

Figure 3: Testing QPSK architecture
3.3 Alternate Architecture
An alternate architecture was considered for a simpler
approach that did not use FPGAs but instead used a
Microcontroller (MCU) with integrated DACs. In
figure 4, the input data is received by the
MSP430FR2355 MCU over SPI. The incoming data is split into
two channels, mapped to symbols, and pulse shape
filtered by the MCU.

Figure 4: Alternate QPSK architecture
In this case, the MCU has internal 4-channel DACs
which can convert the filtered baseband signal to the
equivalent analog signal. The differential baseband
signals are fed into the differential inputs of the IQ
modulator. Along with this, the MCU has internal SPI
peripherals for controlling the VCO's carrier
frequency. The VCO output signal is connected to the
IQ modulator in the same way as before. In this architecture,
the ADL5375 IQ modulator is a different chip but has
similar functionality and pinouts to the previous IQ
modulator. The resulting RF signal is also the same as
before.

4. IMPLEMENTATION
4.1 Hardware
In the following sections, the hardware used in the three
architectures is shown. The annotations highlight the
corresponding components from the architecture
diagrams along with the main inputs and outputs.


4.1.1 Proposed Architecture
In this architecture, we start with the input data entering
the TinyFPGA over SPI as shown in figure 5. Next, the
data is sent to the two PMOD-DA2 DAC boards over a
modified SPI interface, where two MOSI lines are used
for the two independent output channels. In this case,
the input SPI side powers the TinyFPGA and DAC
boards. Then, the four analog baseband signals (in
differential pairs) are passed into the AD8346 IQ
modulator along with the local oscillator (LO) signal
from the ADF4351 VCO. The IQ modulator evaluation
board is set up as a single-ended input for the LO
signal, whereas typically it would be a differential pair
as well. Lastly, the RF signal is generated as a
single-ended 50 ohm output.

Figure 5: Hardware for proposed architecture
The board on the bottom left is an evaluation board for
the ADF4351 VCO as well as the ADL5375 IQ
modulator (which is covered in a later section). The
boxed portion of the board is the VCO and the positive
LO output is connected to the AD8346 IQ modulator.
Note that the architecture shows the ADF4351 being
connected to the FPGA over SPI; however, it is
controlled over USB with a specific program in this
setup. This allows the carrier frequency and other
functions to be set from the computer with ease.

4.1.2 Testing Architecture
The testing architecture in figure 6 is very similar to the
previous one, except the TinyFPGA is replaced by the
Basys 3 board. It is also powered and controlled over
USB, hence the DACs are also powered by the FPGA
board. The interfaces and rest of the chain are identical
to the earlier architecture.

Figure 6: Hardware for testing architecture
4.1.3 Alternate Architecture
For the alternate architecture, the setup shown in figure
7 is fairly different from the previous two architectures.
Firstly, the input data passes through SPI into the
MSP430 MCU (which is powered and programmed
over USB). Since this particular MSP430 has a
4-channel integrated DAC, the output is the baseband
analog signal, which goes into the ADL5375 IQ
modulator to generate the output RF signal. The chip
itself can be seen in the top portion of the evaluation
board along with the ADF4351 VCO at the bottom. The
differential LO outputs from the VCO are internally
connected to the IQ modulator through a series of loop
filters for reducing the oscillator harmonics.

Figure 7: Hardware for alternate architecture

4.2 Software
The software that is running on the FPGA and MCU
boards within each architecture is discussed in the
sections below. The goal of the software is to
implement the functionality for generating the baseband
signals as seen in the functionality section. One thing to
note for the FPGAs is that the Very High Speed
Integrated Circuit Hardware Description Language (VHDL) code
was written as generically as possible so that the inferred
logic elements and memory are transferable
between the Basys 3 board and the TinyFPGA board.
With modifications only to the pinouts and clock speed
settings, using the same VHDL code helped save time
when testing both architectures.

filter coefficients are the same 21-element RRC
coefficients seen previously and are stored as signed
10-bit integers in a ROM memory block. The input and
output buffers are 32-element circular buffers where
there is a single input buffer with two output buffers for
the positive and negative differential outputs. The
lengths of the FIR filter and the input/output buffers are
declared as generic values and can be adjusted. The
input data values are NRZ coded and upsampled by
8 before being stored in the input buffer as 2-bit signed
integers. The upsampling rate is also a generic value
that can be adjusted depending on the FIR filter
requirements. To minimize resource usage, the FIR
filtering operation is done on a single-cycle basis, where
each multiply and addition operation takes one clock
cycle. Since the filter length is small, the performance
impact is minor compared with the notable
increase in resource utilization of a parallel filter
structure. Lastly, the 12-bit signed integer results are
stored in differential output buffers as 16-bit values.
The leading 4 bits are used by the DACs for specific
functions and are set to zeros. These differential outputs
in each channel are sent bit by bit to the output SPI
block.
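
The upsampling and one-tap-per-cycle filtering described above can be sketched in Python as a behavioral model. This is only an illustration, not the VHDL used in the paper: the tap values are placeholder symmetric integers rather than the actual RRC coefficients, and the names `upsample` and `fir_serial` are hypothetical.

```python
UPSAMPLE = 8     # upsampling rate stated in the text
NUM_TAPS = 21    # filter length stated in the text

# Placeholder symmetric taps for illustration only; these are NOT the
# paper's RRC coefficients. They fit easily in a signed 10-bit range.
TAPS = [min(k, NUM_TAPS - 1 - k) + 1 for k in range(NUM_TAPS)]


def upsample(symbols, factor=UPSAMPLE):
    """Insert factor - 1 zeros after every NRZ symbol (+1 or -1)."""
    samples = []
    for symbol in symbols:
        samples.append(symbol)
        samples.extend([0] * (factor - 1))
    return samples


def fir_serial(samples, taps=TAPS):
    """Filter with one multiply-accumulate per inner-loop step."""
    history = [0] * len(taps)          # stands in for the input buffer
    outputs = []
    for sample in samples:
        history = [sample] + history[:-1]
        accumulator = 0
        for k in range(len(taps)):     # one tap per modeled clock cycle
            accumulator += taps[k] * history[k]
        outputs.append(accumulator)
    return outputs


# Four alternating NRZ symbols produce 32 filtered samples.
baseband = fir_serial(upsample([1, -1, 1, -1]))
```

Feeding a single 1 followed by zeros returns the taps themselves, which is a quick way to check that the serial loop matches a direct convolution.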

4.2.1 Testing Architecture
Starting with the testing architecture, the Basys 3 board
had three iterations of software where each one is more
optimized with lower resource utilization:
1st Iteration
With the first iteration, the idea was to build logic
blocks that performed each function independently.
This is shown in figure 8 where the input SPI block
takes the input data and passes it to FIR filter blocks in
each channel. Then, the filtered data is sent to the
output SPI blocks for the DACs to generate the
baseband signal. The FIR filter and SPI output blocks in
each channel are identical as they were instantiated
using the same VHDL code entities in the top-level file.
This results in duplicate logic hardware such as
input/output buffers which significantly increases
resource utilization.

In both channels, the SPI output block passes the
filtered data to the DACs by generating a clock and
chip-select signal. There are two clocks created by this
block, a data clock and an SPI clock. Both are generic
values that can be changed; however, they are initially
set to 8 kHz and 500 kHz respectively. The data clock
can be viewed as the output sample rate, where a new
value is sent to the DAC for updating the output
waveform, whereas the SPI clock is used to send each
bit of the new value. Therefore, the differential outputs
per channel are sent to the DAC through this SPI block
to generate the baseband output signal.
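
As a rough check of the two clock settings, the sketch below derives integer clock dividers from the 100 MHz Basys 3 clock mentioned in the simulation discussion and confirms that one 16-bit frame fits within a data clock period. The helper names `divider` and `dac_frame` are hypothetical, and using simple integer dividers is an assumption about how the clocks might be generated.

```python
MAIN_CLOCK_HZ = 100_000_000   # Basys 3 main clock stated in the text
DATA_CLOCK_HZ = 8_000         # output sample rate stated in the text
SPI_CLOCK_HZ = 500_000        # SPI bit clock stated in the text
FRAME_BITS = 16               # 4 leading zero bits + 12 data bits


def divider(main_hz, target_hz):
    """Main-clock cycles per period of the derived clock (assumed scheme)."""
    return main_hz // target_hz


def dac_frame(code12):
    """Pack a 12-bit code into a 16-bit frame with the top 4 bits zero."""
    return code12 & 0x0FFF


data_div = divider(MAIN_CLOCK_HZ, DATA_CLOCK_HZ)   # cycles per new sample
spi_div = divider(MAIN_CLOCK_HZ, SPI_CLOCK_HZ)     # cycles per SPI bit
frame_time_s = FRAME_BITS / SPI_CLOCK_HZ           # time to shift one frame
sample_period_s = 1 / DATA_CLOCK_HZ                # time between samples

print(data_div, spi_div, frame_time_s < sample_period_s)
```

With these values a frame takes 32 microseconds to shift out, well inside the 125 microsecond sample period, so the SPI clock leaves margin for the chip-select handling.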

Figure 8: Block diagram for first iteration of
software for testing architecture
For the input SPI block, the SCK and MOSI inputs are
double-flopped to prevent metastability issues.
Anytime signals from an external device
are sampled by an FPGA, the inputs should pass
through two flip-flops to synchronize the signals with
the FPGA clock. The MISO output uses the full
flags from the FIR filter blocks and passes one of them
to the input device. In this case, the input device
operates in master SPI mode with the MISO signal
functioning as an on/off switch, indicating when to
start/stop sending data. Another function of this block is
to split the incoming data stream into two channels.
This can be done with a Mux; however, the Mux alone
will induce an offset of one bit between the two
channels as it flips between them. Therefore, a delay
block is used on one of the channels to synchronize the
data writes in both channels.
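
The double-flop arrangement can be illustrated with a small behavioral model. This is a sketch only: it shows the pipeline delay through two registers and does not model metastability itself, and the class name `TwoFlopSynchronizer` is hypothetical.

```python
class TwoFlopSynchronizer:
    """Behavioral model of two back-to-back flip-flops on one input."""

    def __init__(self):
        self.stage1 = 0   # first flip-flop, samples the external signal
        self.stage2 = 0   # second flip-flop, feeds the internal logic

    def clock(self, external_input):
        """Advance one FPGA clock edge and return the synchronized value."""
        self.stage2 = self.stage1      # second stage takes the old first stage
        self.stage1 = external_input   # first stage samples the pin
        return self.stage2


sync = TwoFlopSynchronizer()
synchronized = [sync.clock(level) for level in [1, 1, 0, 0]]
print(synchronized)
```

The internal logic only ever sees `stage2`, so any unstable value captured by the first register has a full clock period to settle before it is used.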

The simulation waveform in figure 9 shows the
input and output SPI signals as well as the expected
baseband signals from the DACs. For this simulation,
the main clock frequency is set to 100 MHz, which is
the intended operating frequency of the Basys 3 board.
First, the testbench file transmits two data bytes (both
0x33) as seen at the top. The output SPI signals in
the middle show the filtered values being sent back to
the testbench file. Lastly, the baseband signals at the
bottom correspond to the output data received from the
SPI output. The two markers in the middle highlight the
maximum and minimum values of the wave, which are
shown on the left sidebar. These values are verified
with the RRC filter simulation to be within the bounds
of the IQ modulator input. The markers also show the
time between the two symbols at the bottom. In this
case, the data rate was set to 8 kHz with an upsampling
rate of 8, hence the time interval is 1 ms, corresponding
to a 1 kHz symbol rate.

The FIR filter blocks are the most resource intensive
blocks due to the filtering logic and data buffers. The

Table 2: Timing report for first iteration of
software for testing architecture

2nd Iteration
This iteration is designed to reduce the resources used
in the first iteration. To achieve this, the duplicate FIR
filter and SPI output blocks in both channels are
merged into single blocks as shown in figure 10.
Therefore, most of the logic elements that were
duplicated are now reduced; however, certain elements
are still the same (such as the buffers). The SPI input
block is the same as the previous iteration and the
overall functionality is also unchanged.

Figure 9: Output waveform for first iteration of
software for testing architecture
The resource utilization for this software iteration is
expected to be the highest due to the FIR filter and SPI
output blocks being duplicated for both channels. Table
1 shows the post-implementation resource utilization
for each block file. The FIR filter blocks take the most
resources (as expected) while the SPI input and output
blocks take much less. The memory implemented in
this case is distributed RAM using Look-Up Tables
(LUTs) due to the small buffer sizes and faster speed
compared to block RAM. One key point is that all the
registers are clock-edge driven to ensure the internal
signals are properly synchronized to clock edges. This
generally results in no latches being used in the
implementation, which is highly desirable due to their
asynchronous nature causing signals to potentially miss
clock edges.
Table 1: Resource utilization for first iteration of
software for testing architecture

Figure 10: Block diagram for second iteration of
software for testing architecture
With the same SPI input block, the FIR filter block is
re-designed to contain the logic for both channels. The
input/output signals and data buffers are the same;
however, the control logic is now shared between the
buffers. The filtering operation is also the same and
each filter cycle does the multiply and addition for both
channels. The result is the differential output for each
channel, which will be passed to the SPI output block.

Likewise, the SPI output blocks are combined to form a
single block with the same input/output signals. Also,
both channels share the same 8 kHz data clock and 500
kHz SPI clock. The SCK and SYNC outputs
are independent signals that go to each DAC separately.
However, an alternative design could have only one set of
SCK and SYNC pins that are externally connected
to both DACs.

The timing report in table 2 shows that
there are no failing endpoints and the timing constraints
are met. The positive value for the Worst Negative
Slack (WNS) means that the delay of the
longest path in the implemented logic still fits within a
clock period. Initially when designing the FIR filter
block, the multiply and add logic was entirely parallel.
This resulted in a large negative value for the WNS,
indicating that the filtering logic could not fit within the
clock period. Hence, the single-cycle filtering approach
offers a more desirable WNS.
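
The relationship between slack and the clock period can be written as a one-line calculation. The sketch below is a simplification that ignores setup time, clock skew, and uncertainty, and the two path delays are hypothetical values chosen only to show a passing and a failing case; they are not taken from the timing reports.

```python
def setup_slack_ns(clock_hz, path_delay_ns):
    """Simplified setup slack: clock period minus longest path delay."""
    period_ns = 1e9 / clock_hz
    return period_ns - path_delay_ns


CLOCK_HZ = 100_000_000       # Basys 3 main clock stated in the text

# Hypothetical path delays, for illustration only.
serial_path_ns = 8.5         # one multiply-add per cycle
parallel_path_ns = 14.0      # all multiply-adds chained in one cycle

print(setup_slack_ns(CLOCK_HZ, serial_path_ns))     # positive: timing met
print(setup_slack_ns(CLOCK_HZ, parallel_path_ns))   # negative: timing fails
```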


In this case, the simulation waveform is identical to
figure 9. The baseband output signal expected from the
DAC is within the IQ modulator bounds and the symbol
rate is 1 kHz.
After the optimizations, the benefits can be seen in table
3. The resource utilization shows a significant reduction
in the LUTs and registers used in this implementation.
The number of LUTs is reduced by 78, registers by 73,
and the overall slices by 31. Since this optimization was
focused on the logic elements of the FIR filter and SPI
output blocks, the data buffers remain the same, hence
the number of LUT memory elements is the same as
before.

Table 5: Resource utilization for third iteration of
software for testing architecture

Table 3: Resource utilization for second iteration
of software for testing architecture
In this iteration, the timing report is the same as table 4
from the second iteration. This means that the
removal of the two output buffers does not affect the
critical path timing.

Similar to the previous implementation, the timing
report in table 4 shows no failing endpoints with the
timing constraints being met. Although the WNS is
reduced by a very small amount, the Worst Hold Slack
(WHS) has a noticeable increase. This gives a slightly
larger margin when modifying the design and working
to optimize it further.

Optimizations and Structure
Other optimizations were considered but not
implemented because the cost outweighed the
performance gain. Typically with FIR filters, the
coefficients are symmetric, so the first half mirrors the
second half. Therefore, the storage can be reduced so
only the first half of the coefficients is stored. In this
case with 21 coefficients, the benefit would only be 11 LUT
memory elements at the cost of additional logic
elements such as an up-down counter for the index.
Therefore, this optimization is not worth implementing
for small filter sizes but may offer some resource
savings for much larger FIR filters. Another potential
optimization was using block RAMs instead of LUT
memory elements for the buffers. As mentioned
previously, the LUT memory elements are preferred
due to the small buffer sizes and faster speeds.
Another reason is compatibility between
different FPGAs, as implementing block RAM usually
involves writing FPGA-specific code. Therefore,
changing the main logic to incorporate block RAM may
result in more logic elements being used.
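
The coefficient-folding idea that was considered can be sketched as follows. The tap values are placeholder symmetric integers, not the paper's RRC coefficients, and the index mirroring in `tap` stands in for the up-down counter mentioned above; both names are hypothetical.

```python
NUM_TAPS = 21    # filter length stated in the text

# Placeholder symmetric taps for illustration only.
FULL_TAPS = [min(k, NUM_TAPS - 1 - k) + 1 for k in range(NUM_TAPS)]

# Store only the first half plus the center tap.
HALF_TAPS = FULL_TAPS[: NUM_TAPS // 2 + 1]


def tap(k):
    """Return coefficient k while reading only the stored half."""
    center = NUM_TAPS // 2
    mirrored = k if k <= center else NUM_TAPS - 1 - k
    return HALF_TAPS[mirrored]


rebuilt = [tap(k) for k in range(NUM_TAPS)]
print(len(HALF_TAPS), rebuilt == FULL_TAPS)
```

The extra comparison and subtraction in `tap` correspond to the additional index logic that made this trade unattractive for a filter this short.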

Table 4: Timing report for second iteration of
software for testing architecture

3rd Iteration
Lastly, this implementation optimizes a step further by
reducing the output buffers in the FIR filter block. The
block diagram is the same as the previous iteration as
shown in figure 10. Apart from the FIR filter block, the
SPI input and SPI output blocks are unaltered.
In the previous iterations, the FIR filter block needed one
input buffer and two output buffers per channel. The
two output buffers are for storing the differential output
of the baseband signals. However, since the output of
the FIR filter is a signed value between the positive and
negative bounds, the negative part of the differential
signal can be generated by inverting the positive part.
Then, both positive and negative parts can be shifted to
the desired bias voltage to conform to the input
requirements of the IQ modulator.
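
A minimal sketch of generating both halves of the differential pair from a single signed filter output is given below. It assumes that "inverting" means arithmetic negation about the bias and uses a hypothetical mid-scale bias code; the actual bias used for the IQ modulator is not stated here, and the function name `differential_pair` is illustrative.

```python
DAC_MAX_CODE = 4095   # 12-bit code range, matching the 12-bit results
BIAS_CODE = 2048      # hypothetical mid-scale bias, not from the paper


def differential_pair(sample, bias=BIAS_CODE):
    """Return (positive, negative) DAC codes for one signed filter output."""
    positive = bias + sample    # baseband signal shifted to the bias
    negative = bias - sample    # inverted signal shifted to the bias
    in_range = 0 <= positive <= DAC_MAX_CODE and 0 <= negative <= DAC_MAX_CODE
    if not in_range:
        raise ValueError("sample exceeds the DAC range around the bias")
    return positive, negative


print(differential_pair(100))
print(differential_pair(-500))
```

Because the negative code is computed on the fly, only one output buffer per channel has to be stored, which is the saving this iteration targets.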

Regarding the structure of the VHDL code, a typical
state machine structure with controller and datapath
was considered. However, the code needed to be highly
parallel as the state machine would become very
complex. Another consideration was to use the
Advanced eXtensible Interface (AXI) for connecting
blocks. This was not implemented due to the extra logic
overhead per block, which would substantially impact
the overall resource utilization.

The resulting simulation is analogous to the first
iteration as seen in figure 9. Since the main
functionality is still the same, the differential baseband
signals have a 1 kHz symbol rate as they are passed into
the IQ modulator from the DACs.

4.2.2 Proposed Architecture
For the proposed architecture, the software for the
TinyFPGA adopts the same generic VHDL code from
the third iteration of the testing architecture. The first
two iterations were not implemented as the designs
exceeded the available resources on the TinyFPGA.
Likewise, the block diagram is the same as figure 10.

From the resource utilization shown in table 5, there are
a few changes compared to the previous iterations.
First, the LUTs have decreased by 14 and registers by
10, whereas the overall slices increased by 1. Although
an additional 12 Muxes were added due to the changes in
the FIR filter block, the LUT memory elements
decreased by 16. Overall, this optimization results in a
net benefit of reduced resource utilization.


The base-level code with the SPI input, FIR filter, and SPI
output blocks is kept mostly the same, although the
pinouts are completely different. However, the top-level
file was modified to include the clock initialization code,
which generates the main clock signal using the
internal PLL oscillator. In order for the timing
constraints to be met, the maximum clock rate that can
be set is 53.2 MHz to ensure a positive WNS.
Additionally, the SPI output block was slightly
modified so the data clock and SPI clock use the 53.2
MHz main clock rate. This was done by declaring it as
a generic value that can be changed. Apart from this,
the overall functionality is identical to the previous
architecture.
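
The effect of the 53.2 MHz main clock on the symbol timing can be checked with a short calculation. The paper attributes the small deviation from 1 ms to the fractional part of the main clock period being truncated; the sketch below assumes a 1 ps time resolution for that truncation, which is an assumption and not stated in the text.

```python
MAIN_CLOCK_HZ = 53_200_000   # TinyFPGA main clock stated in the text
DATA_CLOCK_HZ = 8_000        # output sample rate stated in the text
UPSAMPLE = 8                 # samples per symbol stated in the text

cycles_per_sample = MAIN_CLOCK_HZ // DATA_CLOCK_HZ   # divides evenly


def symbol_period_s(clock_period_ps):
    """Symbol period for a given main clock period in picoseconds."""
    return cycles_per_sample * UPSAMPLE * clock_period_ps * 1e-12


exact_period_ps = 1e12 / MAIN_CLOCK_HZ        # has a fractional part
truncated_period_ps = int(exact_period_ps)    # assumed 1 ps resolution

print(cycles_per_sample)
print(symbol_period_s(exact_period_ps))
print(symbol_period_s(truncated_period_ps))
```

Under this assumption the divider itself is exact, and the truncated clock period shortens the symbol period by a small fraction of a microsecond.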

Looking at the simulation waveform in figure 11, all the
SPI signals are identical to the previous architecture.
The baseband signal amplitudes are also within the
expected bounds of the IQ modulator. With the main
clock frequency set to 53.2 MHz, the expected symbol
rate is still 1 kHz. However, the symbol period is not
exactly 1 ms due to the fractional portion of the main
clock period being truncated.

Figure 11: Output waveform of software for
proposed architecture

The key difference from the previous architecture is the
post-implementation resource utilization as shown in
table 6. FPGA chips usually differ in the number of
LUTs, registers, and other hardware as well as the
grouping of this hardware (known as SLICEs).
Although less than half of the available registers,
SLICEs, and LUTs are being used, an additional 11
registers, 133 SLICEs, and 188 LUTs are used in this
implementation as compared to the previous
architecture. Additionally, there are 24 more LUTs
being used as distributed memory alongside 3 block
RAMs. The TinyFPGA inferred these block RAMs
despite the code being unchanged, albeit with no
noticeable difference in functionality.

Table 6: Resource utilization of software for
proposed architecture

The timing report shows zero unrouted paths, hence all
the paths in the design meet the timing constraints.
Table 7 shows both the worst slack (which is equivalent
to the WNS) and the worst hold slack as being positive
values. This indicates that the implementation has small
available margins for additional functionality without
decreasing the main clock frequency.

Table 7: Timing report of software for proposed
architecture

4.2.3 Alternate Architecture

In the case of the alternate architecture, the software is
written in C and is similar in certain aspects to the
VHDL code. Although the hardware is entirely
different, the same overall functionality is implemented
where the hardware blocks are now software tasks that
run on the MSP430. The flow chart of the software is
shown in figure 12.


this, the input buffers are checked again for full
capacity and the MISO signal is changed accordingly.
For the DAC ISR, the trigger is based on a timer that
raises a flag every 1 ms. This corresponds to the DAC
ISR being called at a rate of 1 kHz and the output
updating at that rate. The values from the output buffers
are first inverted to generate the negative part of the
differential signal. Both are then scaled to the intended
bias value corresponding to the bias voltage of the IQ
modulator input.
The memory layout and usage are shown in figure 13,
where less than half of the total available resources are
being used. As mentioned in the design section, placing
as much of the code as possible in FRAM makes the software more
resistant to radiation-related errors. Therefore, all the
buffers and variables are placed in FRAM while only
the stack is kept in the RAM. The stack stays in RAM because of the
extensive number of read/write operations taking place
in the stack, which would reduce the lifetime of the
FRAM. Also, errors during execution can be fixed by
power cycling the MCU, which retains the FRAM
data while resetting the stack in the RAM.
Figure 12: Software flow chart for alternate
architecture
Once the MSP430 boots up in the main function, it first
initializes the pins, configures the SPI and DAC
peripherals, and enables interrupts. This allows the
Interrupt Service Routines (ISRs) for the SPI and DAC
peripherals to be invoked depending on their trigger
conditions. Continuing with the infinite while-loop, the
first if-statement checks if the input buffers have data to
be filtered and if the output buffers have space.
Assuming these conditions are met, the FIR filtering
operation takes place in a for-loop using the FIR
coefficients and input data for both channels. The final
results are written to the output buffers. If the conditions
are not met, the filtering operation is skipped
and execution continues. The last if-statement
checks the input buffers and raises the MISO signal if
they are full or lowers it otherwise. This indicates to the
input device whether to send more SPI data packets or
to wait.
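
The two if-statements of the while-loop body can be modeled in a few lines. This is a simplified Python model rather than the C code running on the MSP430: it processes one sample per pass instead of a full filter window, and the buffer depth and the name `main_loop_step` are hypothetical.

```python
from collections import deque

CAPACITY = 4   # hypothetical buffer depth, chosen only for the example


def main_loop_step(inputs, outputs, filter_fn, capacity=CAPACITY):
    """Run one pass of the while-loop body and return the MISO level."""
    # First if-statement: data is waiting and the output has room.
    if inputs and len(outputs) < capacity:
        outputs.append(filter_fn(inputs.popleft()))
    # Last if-statement: MISO high tells the sender to wait.
    return 1 if len(inputs) >= capacity else 0


inputs = deque([1, -1, 1, -1])      # input buffer is full
outputs = deque([0, 0, 0, 0])       # output buffer is full as well

blocked = main_loop_step(inputs, outputs, lambda x: x)   # nothing can move
outputs.popleft()                   # stands in for the DAC ISR taking a value
resumed = main_loop_step(inputs, outputs, lambda x: x)   # one sample filtered
print(blocked, resumed)
```

While both buffers are full the loop does no filtering and keeps MISO raised; once the output side drains, filtering resumes and MISO drops so the sender can continue.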

Figure 13: Memory usage of software for alternate
architecture
Code Structure
Similar to the final FPGA code, there are two
input/output buffers corresponding to the IQ channels
and one FIR coefficients buffer. All the buffers are
declared as integer arrays of specific bit length. The
FIR filter coefficients are the same RRC filter values
from the previous architectures. Initially, the length of
the arrays was chosen to be bit-aligned to the 8-bit
length of the read/write index variables. This allowed
the read/write variables to roll over at the 256 value
and back to zero, essentially implementing a circular
buffer without the additional if-statement index checks.
However, this caused issues where random values were
populated in the buffers and resulted in
non-deterministic behavior of the software. Instead, explicit
checks of the index using if-statements were employed,
resulting in reliable and stable functionality at the
output.
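
The two indexing schemes can be compared with a small model. This sketch only contrasts the arithmetic of an 8-bit rollover with an explicit bounds check; it does not attempt to reproduce the issue observed on the MCU, and the buffer length and function names are hypothetical.

```python
BUFFER_LEN = 32   # hypothetical length, not taken from the paper


def advance_checked(index, length=BUFFER_LEN):
    """Advance a circular index with an explicit wrap check."""
    index += 1
    if index >= length:
        index = 0
    return index


def advance_rollover(index):
    """Advance an 8-bit index that wraps from 255 to 0 on its own."""
    return (index + 1) & 0xFF


# The rollover scheme is only valid when the buffer has exactly 256 entries.
same_for_256 = all(
    advance_checked(i, 256) == advance_rollover(i) for i in range(256)
)
print(same_for_256, advance_checked(31), advance_rollover(31))
```

The explicit check costs one comparison per access but works for any buffer length, which keeps the index valid if the array size is changed later.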

As the while-loop covers the filtering task, the other
two subset diagrams on the right side show the ISR for
the SPI input task and DAC output task:
The SPI ISR task is triggered when a complete byte is
received in the SPI peripheral's shift register. Then, the
values from the shift register are written to the input
buffers. Before being written, the binary values are converted to
NRZ coded values of positive and negative one,
split into odd and even bits, and upsampled by 8. After

From the perspective of functionality, the filtering
operation results in the same output as the
aforementioned architectures while being implemented
differently. Since the MSP430-FR2355 does not
contain any hardware DSP units for filtering, the first
approach was to use multiplication for calculating the
FIR filtering results. However, the multiply/divide
operators are resource intensive for large-scale
computations even though the operands are integers.

figures in the hardware implementation section. For the
baseband signal plots, the output was connected to an
oscilloscope for visualizing the time and frequency
domain. The RF signals were connected to a
Vector Signal Analyzer (VSA). This particular
instrument can show the frequency spectrum of a signal
up to 6.5 GHz with 40 MHz bandwidth. It also has a
symbol decoder feature, where the incoming signal is
demodulated and the symbols are shown in different
formats.

Therefore, the second approach was to use the hardware
multiplier unit present on the MSP430-FR2355.
Although this gave a significant computation boost, it
exposed another issue with memory read/write
operations. With the lack of Direct Memory Access
(DMA) hardware on this specific MSP430, any
read/write operations would involve processor cycles
whereas a DMA unit would manage the memory while
the processor can work on filter calculations.

5.1.1 Proposed Architecture
Starting with the ADF4351 VCO, the output carrier
signal is shown in figure 14. The sharp peak is at
roughly 2.4 GHz with an amplitude of -24.62 dBm. The
frequency is not exactly at 2.4 GHz due to the clock
calibration and temperature changes within the chip.
This carrier signal is connected to the LO positive input
of the AD8346 IQ modulator.

These problems did not affect the overall functionality but
instead slowed down the DAC output rate significantly.
Hence, other potential solutions were attempted where
the MCU clock speed was increased to the maximum rate.
The MSP430-FR2355 runs at 1 MHz by default but can
be increased up to 24 MHz. This can be done using the
built-in Digitally Controlled Oscillator (DCO) and
Frequency Locked Loop (FLL) to generate the 24 MHz
clock; however, the final implementation should use a
more reliable and temperature-stable 24 MHz crystal.
This modification yielded a significant
increase in the DAC output rate; however, it was still
under the desired 1 kHz output rate.
These cascading issues were solved by the final
approach which replaces the multiplication operation
with if-statements. Essentially, the multiplication
operation only multiplies the FIR filter coefficient with
the NRZ coded input data. Since the data can only be a
positive one, negative one, or zero, the multiplication
result can be explicitly coded using if-statements.
Depending on the value of the input data, the
corresponding result can be substituted into the rest of
the calculation per iteration of the filter.

Figure 14: Frequency spectrum of the carrier signal
for proposed architecture
Next, the time domain of the baseband signal from the
FPGA is shown in figure 15. The top two traces
correspond to the in-phase channel positive and
negative outputs while the bottom two are the
quadrature channel positive and negative outputs. The
input data is the same as the simulation (0x33) and the
symbol interval is also 1 ms (1 kHz symbol rate). The
509.97 Hz shown at the bottom right is the waveform's
frequency. The slight offset from the expected 500 Hz
is due to the inaccuracy of the internal FPGA clock,
which can be improved by using an external crystal.
Notice the highlighted markers are at the maximum and
minimum amplitude points. These correspond to 1.52 V
and 800 mV, which are both within the bounds of the
IQ modulator.

Overall, this method works very well in this case, where
the input data is NRZ coded and only three states need
to be checked. However, if the symbol mapping were
not NRZ coded, the if-statements would need to be
expanded to cover all possible cases. This method was
also considered for the FPGA implementation; however,
the if-statements would use more resources than the
direct multiplication approach.
5. TESTING
5.1 Output Plots
The ensuing sections show the output at various stages
for each architecture. The setup is the same as the


Figure 15: Time domain of the baseband signal for
proposed architecture

Figure 17: Input symbol decoding of the RF signal
for proposed architecture

In figure 16, the frequency domain of the in-phase
positive signal is shown. The markers indicate the first
two peaks, which are at 500 Hz and 1500 Hz. Although
the symbol interval is 1 ms, the waveform frequency is
500 Hz because the alternating symbols take two symbol
intervals to complete one cycle. The pulse shape filter
passes the 500 Hz signal at -21.5 dB while attenuating
the 1.5 kHz signal to -39.7 dB.
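
The relationship between the 0x33 test pattern and the 500 Hz waveform can be checked with a short sketch. The bit order (MSB first), the assignment of the first bit of each pair to the I channel, and the 0 to -1 / 1 to +1 NRZ mapping are assumptions made for illustration; for 0x33 the conclusion does not depend on them.

```python
def split_iq(byte, msb_first=True):
    """Split one byte into NRZ I and Q symbol streams.

    Assumption: bits are taken in pairs, the first bit of each pair goes
    to I and the second to Q, with 0 -> -1 and 1 -> +1.
    """
    order = range(7, -1, -1) if msb_first else range(8)
    bits = [(byte >> i) & 1 for i in order]
    nrz = [2 * b - 1 for b in bits]
    return nrz[0::2], nrz[1::2]


def fundamental_hz(symbol_rate_hz, stream):
    """Fundamental frequency of a stream that toggles on every symbol.

    One full cycle spans two symbols, so the result is half the symbol rate.
    """
    toggles = all(a != b for a, b in zip(stream, stream[1:]))
    if not toggles:
        raise ValueError("stream does not alternate on every symbol")
    return symbol_rate_hz / 2


i_stream, q_stream = split_iq(0x33)
freq = fundamental_hz(1000, i_stream)
```

Both channels alternate on every symbol, so the waveform repeats every 2 ms. An alternating, square-like waveform of this kind has its energy at odd harmonics, which is consistent with the second peak appearing at 1500 Hz.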

Lastly, figure 18 shows the output when the input data
contains random bits. The constellation diagram shows
four clusters of symbols while the decoded values are
all random bits. The main plot to observe is the time
domain plot of the baseband signal. This is known as an
eye diagram due to the overlaid signals forming an eye
shape. The middle of the eye represents the optimal
sampling point (highest SNR) for the receiver. Whether
the eye is open or closed is determined by the timing
errors due to clock jitter and noise-like interference
added to the RF signal.

Figure 16: Frequency domain of the baseband signal
for proposed architecture
Moving on to the VSA, the symbol decoding of the RF
signal is shown in figure 17. The top left plot shows the
symbol constellation where the received signal goes
between two symbol points. This matches the plot on
the right which shows the decoded binary values
(0x33). The yellow highlighted bits indicate the
burst-search feature of the VSA, which finds the
intended bit sequence. The plot in the bottom left shows
the frequency domain of the RF signal while the bottom
right plot shows the time domain of the decoded
baseband signal. Notice the right sidebar shows the
measurement interval of 50 bits, the points per symbol
set to 16, and the symbol rate set to 1.02 kHz. Ideally,
these would be set to 8 points per symbol and 500 Hz;
however, this is not possible because the VSA has a
minimum symbol rate of 1 kHz. Also, the rate
matches the slight offset from the time domain plot,
where the 1.02 kHz symbol rate is two times the 509.97
Hz frequency.


Figure 18: Random symbol decoding of RF signal
for proposed architecture
5.1.2 Testing Architecture
Identical to the earlier implementation, the same carrier
signal shown in figure 14 is passed into the LO positive
input of the AD8346 IQ modulator.
Looking at the baseband signal, the time domain plot
from figure 19 shows the same symbol interval of 1 ms
and similar voltage level bounds of 1.52 V and 720 mV.
However, the frequency of the waveform is at 500.00
Hz as opposed to the slight offset from the previous
architecture. This is due to the Basys 3 board having an
external crystal that provides a much more stable clock.


Figure 19: Time domain of the baseband signal for
testing architecture

Figure 22: Random symbol decoding of RF signal
for testing architecture

The frequency domain plot in figure 20 also matches
the earlier architecture, with peaks at the same 500 Hz
and 1500 Hz frequencies. In this case, the 500 Hz
passband signal is at -18.4 dB while the 1500 Hz signal
is attenuated to -33.7 dB.

5.1.3 Alternate Architecture
In this architecture, the same carrier signal is used as
shown in figure 14. This signal is fed to the onboard
ADL5375 IQ modulator, where the LO positive and
negative inputs are already connected through loop
filters. All of the subsequent plots for the baseband time
domain, baseband frequency domain, VSA input
symbol decoding, and VSA random symbol decoding
are identical to figures 19, 20, 21, and 22, respectively,
from the previous architecture. Although a completely
different IQ modulator was used, there were no
noticeable differences in the output of each plot. This
would imply that the IQ modulators function
independently of the baseband and carrier signals, as
expected.

Figure 20: Frequency domain of the baseband signal
for testing architecture

6. ANALYSIS

The VSA symbol decoding plot in figure 21 is nearly
identical to the earlier implementation. The only
difference is the symbol rate of 1 kHz due to the 500 Hz
waveform frequency.

6.1 End-to-End System
The following section covers an end-to-end setup for
the entire QPSK modulation chain. It starts from a
transmitter sending known data through the proposed
architecture to the receiver. The RF signal is then
received using a LimeSDR-USB Software-Defined
Radio (SDR) and demodulated with GNU Radio to
recover the baseband signal. This would mimic a
satellite-to-ground communication system where the
payload on the satellite would send data to its
transmitter, which transmits the RF signal to the ground
station receiver for recovering the payload data.
The payload is a Raspberry Pi 3B+ (RPI) which sends
fixed data values (0x33) over SPI. It is configured to
function as the bus master and runs the SPI clock at 500
kHz while the data rate is roughly 1 kHz. With the
MISO signal behaving as an on/off switch, the RPI
starts sending data when the signal is low and stops
when the signal is high. This allows for flow control at
the receiver end as the input data buffers reach full
capacity.
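
The MISO-based flow control described above can be modeled with a small simulation. This is only a behavioral sketch: the buffer depth, the consumption interval, and all class and function names are hypothetical, and no real SPI hardware or driver is involved.

```python
from collections import deque


class FlowControlledReceiver:
    """Toy model of the SPI slave; the buffer depth is a hypothetical value."""

    def __init__(self, depth=4):
        self.buffer = deque()
        self.depth = depth

    @property
    def miso_high(self):
        # High means "stop sending": the input buffer is full.
        return len(self.buffer) >= self.depth

    def accept(self, byte):
        self.buffer.append(byte)

    def consume(self):
        # Frees one buffer slot, as when a byte is turned into symbols.
        return self.buffer.popleft() if self.buffer else None


def run_master(receiver, data, consume_every=3):
    """Send bytes only while MISO is low.

    Returns the bytes the receiver processed and the number of ticks on
    which the master had data but was held off.
    """
    pending = deque(data)
    received = []
    stalled = 0
    tick = 0
    while pending or receiver.buffer:
        if pending and not receiver.miso_high:
            receiver.accept(pending.popleft())
        elif pending:
            stalled += 1
        if tick % consume_every == 0:
            out = receiver.consume()
            if out is not None:
                received.append(out)
        tick += 1
    return received, stalled


received, stalled = run_master(FlowControlledReceiver(depth=4), [0x33] * 10)
```

Because the master offers data faster than the receiver drains it, the buffer fills and the master is stalled on some ticks, yet every byte still arrives in order.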

Figure 21: Input symbol decoding of the RF signal
for testing architecture
With random input bits, the same constellation is shown
in figure 22. The eye diagram shows a similar eye
shape where the middle of the eye is open, indicating
the ideal sampling point for the receiver.


For the proposed architecture, the SPI receiver is
the TinyFPGA AX2, which functions as a slave device
on the SPI input bus. The same code discussed earlier is
executed on the FPGA and the output is the filtered
baseband differential signals from the IQ channels.
These signals are passed into the PMOD-DA2 DACs
through a modified SPI bus at the same 500 kHz SPI
clock rate and 1 kHz data rate.

with the symbol sync block is identical to the receiver
side of the QPSK baseband structure from the
implementation section. However, the flowgraph differs
as the output is passed into a Costas loop block.
The QPSK baseband simulation from the
implementation section assumed that the baseband
signal is perfectly centered at zero frequency. However,
this is not the case with a real hardware implementation
where the transmitter oscillator may have a slight offset
when upconverting to the 2.4 GHz carrier frequency.
Similarly at the receiver, the oscillator in the
LimeSDR-USB can have a different offset from the carrier
frequency, resulting in the baseband signal being
frequency shifted. The Costas loop is used to remove
this shift and bring the signal back to zero frequency.
Lastly, the signal is viewed in the time domain and
frequency domain using the sink blocks [2].
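
To illustrate how a Costas loop removes a residual carrier offset, the following is a minimal sketch of a textbook second-order, decision-directed QPSK loop operating on one sample per symbol. It is not the source of the GNU Radio block; the loop bandwidth, the offset values, and the test-signal generator are assumptions chosen for illustration, and no noise is added.

```python
import cmath
import math
import random


def costas_qpsk(samples, loop_bw=0.0628, damping=math.sqrt(2) / 2):
    """Track the carrier phase and frequency of QPSK samples.

    Returns the de-rotated samples and the final frequency estimate in
    radians per sample.
    """
    denom = 1.0 + 2.0 * damping * loop_bw + loop_bw * loop_bw
    alpha = 4.0 * damping * loop_bw / denom   # proportional gain
    beta = 4.0 * loop_bw * loop_bw / denom    # integral gain
    phase = 0.0
    freq = 0.0
    out = []
    for s in samples:
        y = s * cmath.exp(-1j * phase)        # remove current phase estimate
        # Decision-directed QPSK phase error (zero when y sits on a symbol).
        i_sign = 1.0 if y.real > 0 else -1.0
        q_sign = 1.0 if y.imag > 0 else -1.0
        err = i_sign * y.imag - q_sign * y.real
        err = max(-1.0, min(1.0, err))
        freq += beta * err
        phase += freq + alpha * err
        phase = math.remainder(phase, 2.0 * math.pi)  # keep phase bounded
        out.append(y)
    return out, freq


def make_test_signal(n, freq_offset, phase_offset, seed=0):
    """Hypothetical input: random QPSK symbols with a residual carrier offset."""
    rng = random.Random(seed)
    points = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
    return [
        rng.choice(points) * cmath.exp(1j * (freq_offset * t + phase_offset))
        for t in range(n)
    ]


rx = make_test_signal(2000, freq_offset=0.01, phase_offset=0.3)
corrected, freq_est = costas_qpsk(rx)
```

After the loop settles, the frequency estimate approaches the injected offset and the corrected samples sit on the four constellation points. The loop can lock with a rotation of any multiple of 90 degrees, which is the phase ambiguity discussed in the setup considerations.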

Next, the DACs receive the digital baseband signals
and convert them to analog signals. The outputs are
updated as soon as a new SPI data packet is received,
hence the output rate is also 1 kHz. Both sets of
differential signals are connected to the appropriate
positive and negative inputs of the AD8346 IQ
modulator.
For the ADF4351 VCO, the board is controlled over
USB and powered using a power supply. The carrier
frequency is set to 2.4 GHz and the LO positive output is
connected to the IQ modulator LO input. Although the
VCO provides differential outputs, the IQ modulator
board is configured in single-ended mode and only the
positive signal is needed.
Lastly, the IQ modulator is also powered by the same
power supply and generates the resulting RF signal.
This is connected to the RX input of the LimeSDR-USB, which is set to use the high-band specific channel
for inputs greater than 1.5 GHz. Once the signal is
sampled and demodulated by the GNU Radio software,
the baseband signal is recovered at the receiver end of
the system. The entire setup is shown in figure 23.

Figure 24: Flow graph for QPSK demodulator
From figure 25, the top plot shows the time domain of
the demodulated signal while the bottom plot shows the
frequency domain. The time domain plot shows the I and
Q traces overlapping each other. Once the IQ
channels are re-interleaved as odd and even bits, the
result is the same as the transmitted data. In the
frequency domain plot, the distinct spike corresponds to
1 kHz as expected.
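
The re-interleaving step can be sketched as below. The ordering (I bit first, then Q bit, most significant bit first) is an assumption for illustration; since the I and Q streams are identical for the 0x33 pattern, the recovered value is the same under either ordering.

```python
def interleave_iq(i_bits, q_bits):
    """Rebuild an integer from I and Q bit streams.

    Assumption: each symbol contributes its I bit followed by its Q bit,
    most significant bit first.
    """
    if len(i_bits) != len(q_bits):
        raise ValueError("I and Q streams must have the same length")
    value = 0
    for i_bit, q_bit in zip(i_bits, q_bits):
        value = (value << 1) | i_bit
        value = (value << 1) | q_bit
    return value


# Overlapping I and Q traces: both channels carry 0, 1, 0, 1.
recovered = interleave_iq([0, 1, 0, 1], [0, 1, 0, 1])
```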

Figure 23: End-to-end setup of QPSK modulation
chain
Looking at the GNU Radio flowgraph in figure 24, it
largely resembles the receiver side of the QPSK
baseband structure from the implementation section.
Starting with the left side, the osmocom source block is
used to interface with different SDRs. In this case, it is
configured to work with the LimeSDR-USB with the
frequency set to 2.4 GHz, bandwidth being
automatically controlled, and sample rate of 200 kHz.
The block generates IQ samples at the specified rate,
which are passed into an AGC block. This block along


Figure 25: Time domain and frequency domain plots
for QPSK demodulator
6.2 Setup Considerations
For the SPI input device, the current setup has the
RPI as the bus master while the FPGA is the slave
device. The FPGA uses the MISO signal to
enable/disable data transmission from the RPI.
However, a typical SPI implementation would allow the
RPI to transmit continuously while the FPGA would
send ACK/NACK bytes over the MISO signal. An
ACK would indicate that the byte being transmitted was
successfully received by the FPGA. A NACK would be
sent to indicate that the FPGA buffers are full and that
the RPI should keep re-transmitting the last byte until
it is acknowledged by the FPGA. The alternative to this
implementation would be for the FPGA to be the bus
master and the RPI to be the slave device. In this case,
the FPGA would send a byte to request the data from
the RPI. When the buffers are full, the FPGA can
simply stop the SPI transmission to prevent new
data from being requested. Note that having the FPGA
as the bus master would require it to generate the SCK
clock signal, resulting in more resources being used.

Although the testing and analysis portions show that this
architecture is a viable option for use on
picosatellites, additional testing needs to be conducted.
This would include building the final board and
performing functional and environmental testing. While
the output plots show the expected symbols being
received while transmitting at 2.4 GHz, these results
were obtained without the channel impairments that
will inevitably cause symbol errors in practice.
Additionally, these tests would also reveal the
performance bounds of the architecture. With the
end-to-end link tested only at a 1 kHz data rate, the
upper and lower bounds may vary from this initial
baseline.
To build on this, other variations can be considered in
place of the QPSK structure. This includes offset-QPSK, 8-PSK, or even higher-order modulation
schemes. These can be achieved in the FPGA case by
building a generic symbol mapping block where the
mapping between the input data and output symbols can
be modified. Similarly in the MCU case, it will involve
changing the current mapping function to be the desired
symbol mapping. In both cases, the underlying
hardware may need to be changed as well if the MCU
cannot handle the extra processing or if the FPGA does
not have enough hardware resources.
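
A generic symbol mapping block of the kind described above can be sketched in software as a lookup table that is rebuilt for each modulation order. The Gray-coded ordering, the phase offsets, and the function names here are assumptions for illustration and are not taken from the implemented design. Note that offset-QPSK would additionally need a half-symbol delay on one channel, which this table alone does not provide.

```python
import cmath
import math


def psk_table(bits_per_symbol, phase_offset=0.0):
    """Build a Gray-coded M-PSK lookup table: bit pattern -> complex symbol."""
    m = 1 << bits_per_symbol
    table = {}
    for k in range(m):
        gray = k ^ (k >> 1)   # neighboring phases differ by one bit
        table[gray] = cmath.exp(1j * (phase_offset + 2 * math.pi * k / m))
    return table


def map_bits(bits, table, bits_per_symbol):
    """Group a list of bits into symbols using the supplied table."""
    if len(bits) % bits_per_symbol:
        raise ValueError("bit count must be a multiple of bits_per_symbol")
    symbols = []
    for start in range(0, len(bits), bits_per_symbol):
        value = 0
        for b in bits[start:start + bits_per_symbol]:
            value = (value << 1) | b
        symbols.append(table[value])
    return symbols


QPSK = psk_table(2, phase_offset=math.pi / 4)
PSK8 = psk_table(3)

# The 0x33 test pattern alternates between two opposite QPSK points.
symbols = map_bits([0, 0, 1, 1, 0, 0, 1, 1], QPSK, 2)
```

Changing the modulation then amounts to swapping the table and the number of bits per symbol, while the rest of the chain is left unchanged.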

When performing carrier synchronization, the Costas
loop method is known to have a phase ambiguity when
decoding the data symbols. This can be solved by
validating the output symbols with a known
synchronization header. If the symbols are inconsistent
with the header, then the flow graph can be re-run until
the correct header sequence is detected. Otherwise, the
received data can be corrected manually by shifting the
symbols and adjusting the data bits. Another approach
to solve this issue is using differential encoding. This is
where the symbols are encoded based on changes in the
data bits, as opposed to the symbols corresponding to
the data bits themselves [2].
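
The differential encoding approach can be sketched as follows. Symbols are represented by their index around the constellation (0 to 3 for QPSK, in phase order), so that a 90 degree rotation adds the same constant to every received index; this indexing and the function names are assumptions made for illustration.

```python
def diff_encode(symbols, modulus=4, start=0):
    """Encode symbol indices as cumulative phase steps (modulo M)."""
    out = []
    prev = start
    for s in symbols:
        prev = (prev + s) % modulus
        out.append(prev)
    return out


def diff_decode(received, modulus=4, start=0):
    """Recover symbol indices from differences between consecutive symbols."""
    out = []
    prev = start
    for r in received:
        out.append((r - prev) % modulus)
        prev = r
    return out


data = [0, 3, 0, 3, 1, 2, 2, 0]
tx = diff_encode(data)

# A Costas loop may lock with an unknown rotation of k * 90 degrees,
# which adds the same constant k to every received symbol index.
rotated = [(s + 1) % 4 for s in tx]
decoded = diff_decode(rotated)
```

Only the first decoded symbol, which depends on the unknown starting reference, is affected by the rotation. All later symbols are recovered correctly because the constant offset cancels in the difference.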

In future implementations, the idea of software-defined
modulation could be incorporated, where the
modulation can adapt to a changing communication
channel to optimize the data rate. Implementing this
feature would require additional hardware and
accompanying software that far exceed the scope of this
architecture. However, with enough hardware resources
and processing capability, the modulation scheme could
be reconfigured in software to maximize performance.
This would result in better link reliability and efficient
use of the channel.

7. CONCLUSION
With the growing number of small satellites being
launched, the picosatellite form factor has gained
attention for its reduced launch fees and relatively
inexpensive building cost, with the trade-off of limited
power and size. With more complex missions taking
place, the need for higher data rates using better
modulation schemes becomes apparent. This growth has
also increased the congestion in the typical VHF/UHF
bands and motivates the use of higher frequency bands
for better signal strength and higher bandwidth.

To address these issues, this paper proposes a QPSK
transmitter architecture for S-band operation on
picosatellites. This architecture, alongside the testing
and alternate architectures, was presented in detail,
starting with the overall functionality of the QPSK
structure. Next, the design of each architecture was
shown with the reasoning for selecting the specific
components. Then, the implementation stage covered the
simulation, hardware, and software aspects of the
architectures. This was followed by the testing of each
architecture, which verified the intended behavior of
the transmitter. Lastly, the analysis portion brought the
proposed architecture into an end-to-end test where data
was sent through the entire chain from the transmitter
side to the receiver end.

References

1. M. Viswanathan, “QPSK modulation demodulation (Matlab and Python),” GaussianWaves, 2021. [Online]. Available: https://www.gaussianwaves.com/2010/10/qpsk-modulation-and-demodulation-2/

2. “Guided tutorial PSK demodulation,” GNU Radio, 2021. [Online]. Available: https://wiki.gnuradio.org/index.php/Guided_Tutorial_PSK_Demodulation/

