Abstract-Magnetic resonance imaging (MRI) pulse sequence consoles typically employ closed proprietary hardware, software, and interfaces, making difficult any adaptation for innovative experimental technology. Yet MRI systems research is trending to higher channel count receivers, transmitters, gradient/shims, and unique interfaces for interventional applications. Customized console designs are now feasible for researchers with modern electronic components, but high data rates, synchronization, scalability, and cost present important challenges. Implementing large multichannel MR systems with efficiency and flexibility requires a scalable modular architecture. With Medusa, we propose an open system architecture using the universal serial bus (USB) for scalability, combined with distributed processing and buffering to address the high data rates and strict synchronization required by multichannel MRI. Medusa uses a modular design concept based on digital synthesizer, receiver, and gradient blocks, in conjunction with fast programmable logic for sampling and synchronization. Medusa is a form of synthetic instrument, being reconfigurable for a variety of medical/scientific instrumentation needs. The Medusa distributed architecture, scalability, and data bandwidth limits are presented, and its flexibility is demonstrated in a variety of novel MRI applications.
major medical and research markets, are typically closed with proprietary hardware, software, and interfaces, making difficult any expansion or adaptation to new techniques and experiments. FDA regulations on the design process, vendor marketing decisions, and reimbursement issues create political disincentives to rapid development. A confluence of these factors hinders innovation in interventional MRI and MR-guided therapies, and inhibits MR engineering education.
However, with advances in computer infrastructure, and digital and analog RF integrated circuits, investigators can now devise custom console designs targeted to their research needs [1]-[5]. Even so, MRI research systems are trending towards high channel-count receivers [6], [7], transmitter arrays [8], field cameras [9], [10], and even gradient/shim arrays [11], [12]. Meeting modern expectations for high channel counts, customized interfaces, and real-time imaging performance poses a considerable design challenge. As one example, direct RF sampling/synthesis with modern wide-band analog-to-digital (ADC) and digital-to-analog (DAC) converters permits typically analog RF operations such as mixing, channel filtering, and modulation to be performed in the digital domain, a concept generally known as software defined radio (SDR) [13], [14]. While this has made the RF portion of MRI easier to realize, SDR approaches place additional pressure on the systems which must handle the new large flows of digital data. Implementing such systems with efficiency and flexibility requires a scalable modular system architecture.
Standard multicore PCs can now handle user interface and image reconstruction duties in real-time [21], yet efficient data handling and real-time control remain a problem. Tightly integrated console hardware, such as PCI cards, provides high-throughput low-latency communication, but sacrifices scalability, cost effectiveness, and design time. Moreover, the PC then becomes directly involved with low-level MRI system operation, an untenable compromise given typical operating system stability and security. The handling of hard real-time tasks with PC (or microcontroller) software may provide high flexibility, but imposes harsh restrictions on software design and makes inefficient use of available processing power and hardware resources. For example, one may expend considerable coding effort and even CPU time to ensure that a controller is simply ready to perform real-time operations. Bus technologies like USB or Ethernet are intrinsically scalable, but place the PC at arm's length from system hardware, forcing real-time sampling and pulse sequence tasks to be handled by the local console hardware. Data must be buffered locally, then efficiently communicated to the PC with low latency.
At the 2005 ISMRM [15], we proposed a USB approach to MRI system design that could address these requirements (Fig. 1). This has now evolved into a full synthetic instrument prototype termed Medusa. Synthetic instrumentation refers to the use of reconfigurable building blocks to construct advanced instrumentation, while the name Medusa stems from the system's ever-expanding cables resembling a jellyfish or the head of the character from Greek mythology. In our design concept, we have investigated a highly scalable MR system based on digital synthesizer, digital receiver, and gradient blocks operating as a network of universal serial bus (USB) peripherals. The efficacy of this approach will depend upon efficient architecture and design: distributed processing, efficient use of hardware, streamlined data flow, and leveraging of commodity communication buses and components. Here, we present the architectural details of Medusa, its bandwidth and scalability limits using distributed USB data transport, and its performance metrics for software-defined all-digital RF transmission and reception for parallel scalable MRI systems. Finally, various configurations are demonstrated, and its suitability for other novel instrumentation applications is discussed.
II. METHODS

A. System Design Philosophy
The Medusa console aims to deliver the complete set of basic MRI console functions including multichannel RF excitation and reception, gradient waveform generation, RF coil and amplifier gating, and an open software platform for console control, pulse sequencing, and experimentation. Furthermore, we target a modular architecture for scalability, and a performance level sufficient to enable modern fast imaging with RF and gradient bandwidths of at least 250 K samples per second per channel, support for 100% duty cycle operation, and at least 16 bits of RF receiver dynamic range. To satisfy these requirements, hardware development was guided by three dominant priorities: use "digital" RF components to simplify and streamline the RF subsystems, employ a modular architecture with distributed processing for flexible system configuration and scalability to high channel counts, and leverage commodity PC peripheral interfaces for scalability, fast data transfer, interoperability, cost-efficiency, and a modicum of protection against hardware obsolescence. We chose USB as our data transport for its support of up to 127 devices per bus, guaranteed data delivery, simple and inexpensive hardware interface options, and the ability to aggregate multiple slower USB devices through a higher speed hub.
1) Real-Time Tasks:
To best explain our methodology, we define two timing domains: hard real-time and soft real-time. Hard real-time tasks require microsecond or nanosecond-level accuracy and encompass duties such as sample-by-sample RF and gradient pulse playback and echo acquisition, real-time waveform buffer management, and the timing of important pulse sequence parameters, e.g., echo time (TE) and repetition time (TR). The high-speed, repetitive, and parallel nature of low-level timing, sampling, and data flow is not well suited to a sequential processor, but is better served by dedicated logic, preferably in programmable form as in a field programmable gate array (FPGA) or complex programmable logic device (CPLD). By comparison, soft real-time tasks have more relaxed requirements on a millisecond timescale, yet still must be completed on time. These include starting and stopping of the scanner, updating the running pulse sequence each TR as needed, and retrieving the received data from acquisition buffers. The design of Medusa reflects the relative requirements of hard and soft real-time tasks.
2) Concept Prototype: The large scope of the proposed development followed from a proof-of-concept console (Fig. 1) first built to demonstrate basic hardware approaches. This prototype networked DDS and digital receiver evaluation boards, with analog gradient waveform generators using FTDI USB 1.1 modules, and a Belkin USB 2.0 hub employing Cypress USB hub ICs. A critical feature of the Cypress hub IC was the presence of four transaction translators (one per port) allowing full data rate translation to USB 2.0 high-speed. Most other hubs shared a single transaction translator amongst all ports, which would have created a data-rate bottleneck. The successes and challenges of this first generation were particularly influential in the second generation design discussed here and shown in Fig. 2.
Fig. 3. Basic Medusa system is built from one or more RF and gradient modules that are driven by a system controller, in turn connected to a host PC. As application demands grow, the system can be expanded (dashed lines) up to 16 modules per controller. For large channel counts or high-bandwidth applications, additional system controllers and hosts may be used. The limit of scalability depends only on accurate clock and sync signal distribution.
B. Medusa Bus Architecture
Medusa's bus architecture, illustrated in Fig. 3, uses a tiered tree structure for configuration flexibility and scalability. At the core, a Medusa system controller communicates with up to 16 modules over a 16-bit parallel data link called the Backbone Bus. Each module is responsible for a system task such as RF or gradients. The system controller coordinates the data flow amongst its modules and forwards it via USB to a host PC. When the channel count or throughput limit of a single system controller is reached, additional controllers (each with up to 16 modules) can be connected via USB. The USB standard supports up to 127 devices per bus, yet in practice, the data throughput limit of USB or the host PC itself is reached first. When a single host PC is no longer able to accommodate the data or image reconstruction load, the bus design supports multiple host PCs connected through a high speed network (e.g., Ethernet). This feature is supported in Medusa drivers but has not been tested since these limits have not yet been reached.
Data throughput concerns aside, there is no design limit to the number of Medusa modules that can be combined into a single system. Modules and controllers require only two signals to be shared system-wide in order to maintain synchronous operation: a reference clock (REFCLK) and a master sync (MSYNC). Medusa scales to the extent that these two signals can be distributed accurately.
C. Hardware
1) Logic Core: A common logic core subsystem (Fig. 4) is incorporated into each Medusa module and is a key element in providing high performance, scalability, and flexibility with relatively few parts. Using an Altera MAX-II EPM1270 CPLD, a Cypress CY7C1061 2 Mbyte high-speed SRAM, and an addressable backbone bus interface, the core performs in programmable hardware all the hard real-time module tasks such as sampling, real-time waveform buffer handling, and TR synchronization. The CPLD contains a configuration and identification register block, one or two direct memory access (DMA) engines, a multiport memory controller, a TR controller, and the backbone bus interface. In addition to these universal features, the logic core is customized with interfaces to serve the specific needs of each module type.
The intellectual-property (IP) cores in the CPLD were developed in Verilog HDL specifically for Medusa, with a focus on design efficiency and timing accuracy. The memory controller and DMA engines are partially-pipelined to increase data throughput per clock cycle, while the module-specific interface logic resynchronizes data transfers for precisely deterministic sample timing as needed in MRI. Many of the components are interconnected using a Wishbone [16] on-chip bus network, but with access and arbitration times tightly controlled for guaranteed real-time execution.
The logic core simultaneously performs several functions crucial to pulse sequence execution, such as timing, sampling, and data flow management. In parallel with pulse sequence execution, the logic core exposes a set of control and data registers through which the module hardware and logic core can be configured and waveform memory can be uploaded and downloaded. The multiport SRAM memory controller enables concurrent access to the memory from the DMA engines and backbone bus, managing them by using time-division arbitration with access priorities and wait-states. Thus, reconfiguration and pulse sequence updates can be performed in real-time, even during sequence execution.
2) System Controller: The Medusa system controller (Fig. 5) coordinates up to 16 individual modules through the backbone bus, providing configuration, control, and clocking, as well as the USB data link back to the host PC. An NXP/Philips LPC2214 60 MHz 32-bit ARM7 processor is the heart of the controller, accepting high level commands from the host PC and coordinating data transfers to and from the modules. As in the modules themselves, the system controller includes a logic core alongside the ARM processor for timing, data flow, and synchronization. USB high-speed (480 Mb/s) support is implemented using a Cypress Semiconductor CY7C68013A FX2 USB peripheral interface. The internal architecture of the Cypress FX2 is particularly well suited for high throughput: an embedded 8051 8-bit processor performs USB enumeration and handshaking, while all USB user data flows through a dedicated hardware-driven 16-bit first-in first-out (FIFO) buffer interface. The FIFO interface configures easily for glue-less communication with programmable logic or microcontrollers. The logic core, USB interface, and ARM processor communicate using a 16-bit memory bus capable of 60 MBytes/sec, a considerable improvement over the speed-limited I/O on the first-generation Medusa. The system controller also uses an FTDI FT2232L USB full-speed (12 Mb/s) dual-interface IC with one interface dedicated to a serial debug terminal, and the other to support the legacy first-generation USB protocol and software.
For maximum flexibility, a programmable Cypress CY22150 PLL clock generator is used to synthesize the main system clock (SYSCLK) which drives local logic as well as the connected modules. The PLL can use an onboard 10 MHz crystal, or an external reference clock (REFCLK) to synchronize with outside equipment or other system controllers. Finally, each system controller provides some local I/O, all of which can be controlled and interrogated via the host USB connection to facilitate interfacing to auxiliary systems and custom experimental hardware. An addressable multidrop two-wire I2C bus interface is available for control of intelligent RF multiplexers and programmable-gain front-end amplifiers (PGAs). Also included are eight lines of combined logic I/O with 10-bit A/D for general-purpose control and sensing, and a 10-pin JTAG port for in-system updating of the logic core CPLD firmware on any module.
3) RF Module: The Medusa RF module (Fig. 6) combines a logic core with digital RF exciter and receiver subsystems. The exciter is based on the Analog Devices AD9854 quadrature direct digital synthesizer (DDS) and is capable of producing RF output modulated in frequency, phase, and amplitude. The AD9854 DDS includes a complex numerically-controlled oscillator (NCO) with 48-bit tuning word, 14-bit phase control, 12-bit amplitude control, and inverse-sinc interpolation to drive dual 12-bit I/Q output DACs. The DDS operates internally at 200-266 MHz, 4× the system clock (SYSCLK), and is thus capable of directly synthesizing RF from dc to 100 MHz. To limit harmonics from the zero-order hold nature of the DACs, the DDS outputs are followed by a seventh-order elliptical lowpass filter with rolloff at 100 MHz.
Fig. 6. The Medusa RF module employs direct-conversion methods with high-speed DACs and ADCs to synthesize and acquire RF with relatively few components. Once in the digital domain, the logic core CPLD and RAM perform timing and buffering of the waveforms until they can be serviced by the Medusa controller. Gating outputs are provided to control RF amplifiers and coil bias.
Similar to the exciter, the RF module receiver digitizes RF directly using an Analog Devices AD6644 14-bit ADC operating at the system clock rate (SYSCLK, 50-66 MHz). The sample-and-hold bandwidth of the receiver ADC is 250 MHz capturing a frequency range of several Nyquist bands. For carrier frequencies above SYSCLK/2, the sampling operation effects a down-conversion via aliasing. As a result, an input lowpass or bandpass filter narrower than SYSCLK/2 must be used to prevent aliasing of unwanted signals and noise into the band of interest.
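The down-conversion by aliasing described above can be illustrated with a short calculation. The following Python sketch is not part of the Medusa software; the function name `alias_frequency` and the 60 MHz SYSCLK used in the usage note are assumptions chosen for illustration (the text gives only the 50-66 MHz range).

```python
def alias_frequency(f_carrier_hz: float, f_sample_hz: float):
    """Locate a carrier after direct sampling at f_sample_hz.

    Returns (apparent_hz, nyquist_zone, inverted) where apparent_hz lies in
    0..f_sample_hz/2, nyquist_zone counts from 1 (zone 1 is dc..fs/2), and
    inverted is True when the sampled spectrum is frequency-reversed.
    """
    if f_carrier_hz < 0 or f_sample_hz <= 0:
        raise ValueError("carrier must be >= 0 and sample rate must be > 0")
    half = f_sample_hz / 2.0
    zone = int(f_carrier_hz // half) + 1
    folded = f_carrier_hz % f_sample_hz      # position within one sample-rate span
    inverted = folded > half                 # upper half of the span folds back
    apparent = f_sample_hz - folded if inverted else folded
    return apparent, zone, inverted
```

For a 63.9 MHz carrier (1.5T) sampled at an assumed 60 MHz, this gives an apparent frequency of 3.9 MHz in the third Nyquist zone with no spectral inversion, which is why the input filter must confine the signal to a single zone.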
The received RF signal exits the ADC as a continuous data stream at 100-132 MB/s which is impractical for direct storage and remains spectrally sparse given the narrowband nature of MRI signals ( 1 MHz typical). To reduce the digital RF to MRI baseband, an Analog Devices AD6620 digital receiver (DR) is used to perform further downmixing and channel filtering numerically. The AD6620 receiver includes a complex NCO and multiplier with 32-bit frequency tuning word and 16-bit phase offset for downmixing. This is followed by two stages of cascaded integrator-comb (CIC) filtering and finally a user-defined 128-tap FIR filter/decimator. Baseband signal output is produced as 16-bit interleaved I/Q data.
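The data rates in this paragraph follow from simple arithmetic, sketched below in Python. The sketch assumes that each 14-bit ADC sample occupies two bytes and that the receiver decimates by an integer factor; the function names and the 250 Ks/s example rate (taken from the design target stated earlier) are illustrative and not part of the Medusa software.

```python
def adc_stream_rate(sysclk_hz: int, bytes_per_sample: int = 2) -> int:
    """Raw ADC output in bytes/s, assuming one stored word per SYSCLK cycle."""
    return sysclk_hz * bytes_per_sample


def baseband_rate(baseband_sps: int, bytes_per_component: int = 2) -> int:
    """Receiver output in bytes/s for interleaved 16-bit I and Q samples."""
    return baseband_sps * 2 * bytes_per_component


def decimation_factor(sysclk_hz: int, baseband_sps: int) -> int:
    """Total integer decimation needed to reach the baseband sample rate."""
    factor, remainder = divmod(sysclk_hz, baseband_sps)
    if remainder:
        raise ValueError("baseband rate must divide SYSCLK exactly")
    return factor
```

With SYSCLK at 50 MHz and 66 MHz, `adc_stream_rate` returns 100 MB/s and 132 MB/s, matching the range quoted above. Under these assumptions, a 250 Ks/s baseband output needs a total decimation of 200 at 50 MHz and produces 1 MB/s per channel.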
For each sample clock within a pulse sequence transmit window, the logic core retrieves pulse data from the local waveform memory using DMA and writes the DDS exciter amplitude, phase, and frequency words. The new DDS settings take effect synchronously on the next sample clock. Likewise, during a pulse sequence receive window, receiver I/Q baseband data are captured temporarily by high-speed latches and then stored via DMA to the local waveform memory. The logic core also provides external gating outputs to signal when transmit or receive operations are taking place. These outputs may be used to sequence RF power amplifiers, T/R switches, and Q-spoiling bias networks for coils.
Together, the exciter and receiver can operate on carriers anywhere in the range from dc to 100 MHz (0-2.4T proton magnetic resonance) with only a change of input filter. To maintain carrier phase synchronization between exciter and receiver as required by MRI, the two subsystems are both clocked from the Medusa system clock (SYSCLK) and the Tx/Rx digital NCO tuning words can be digitally matched. The Tx and Rx NCOs have different word lengths, so step sizes must be chosen to ensure that the same frequency is synthesized. The Tx/Rx phases can also be resynchronized on command. In high-speed ADC and DDS applications, clock phase jitter will influence the noise floor as timing uncertainty maps directly to NCO phase noise and ADC sampling error. An On Semiconductor MC100LVEL16 PECL driver and differential distribution network enables robust ADC and DDS clocks targeting sub-picosecond jitter levels.
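One way to choose matched step sizes is to pick the 32-bit receiver word first and derive the 48-bit exciter word from it, so that both NCOs land on exactly the same frequency grid. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the DDS clock is exactly 4× SYSCLK, that the receiver NCO is clocked at SYSCLK and tuned to the signed aliased offset of the carrier from the nearest SYSCLK multiple, and that a negative offset is written as a two's-complement word. All function names are hypothetical.

```python
from fractions import Fraction

TX_BITS = 48         # DDS tuning word length (from the text)
RX_BITS = 32         # digital receiver tuning word length (from the text)
DDS_CLOCK_MULT = 4   # DDS runs at 4x SYSCLK (from the text)


def matched_tuning_words(f_carrier_hz: int, sysclk_hz: int):
    """Return (tx_word, rx_word) synthesizing the same carrier modulo SYSCLK."""
    k = round(Fraction(f_carrier_hz, sysclk_hz))        # nearest SYSCLK multiple
    offset = Fraction(f_carrier_hz) - k * sysclk_hz     # signed aliased frequency
    rx_signed = round(offset / sysclk_hz * 2**RX_BITS)  # quantize on the coarse grid
    # One Rx step equals 2**(48-32)/4 = 2**14 Tx steps; SYSCLK equals 2**46 Tx steps.
    tx_per_rx_step = 2**(TX_BITS - RX_BITS) // DDS_CLOCK_MULT
    tx_per_sysclk = 2**TX_BITS // DDS_CLOCK_MULT
    tx_word = k * tx_per_sysclk + rx_signed * tx_per_rx_step
    if not 0 <= tx_word < 2**(TX_BITS - 1):
        raise ValueError("carrier is outside the DDS Nyquist range")
    return tx_word, rx_signed % 2**RX_BITS


def tx_frequency(tx_word: int, sysclk_hz: int) -> Fraction:
    """Exact exciter output frequency for a given tuning word."""
    return Fraction(tx_word * DDS_CLOCK_MULT * sysclk_hz, 2**TX_BITS)


def rx_frequency(rx_word: int, sysclk_hz: int) -> Fraction:
    """Exact signed receiver NCO frequency for a given tuning word."""
    signed = rx_word - 2**RX_BITS if rx_word >= 2**(RX_BITS - 1) else rx_word
    return Fraction(signed * sysclk_hz, 2**RX_BITS)
```

Because the exciter word is an exact multiple of the receiver step, the two frequencies differ by a whole number of SYSCLK cycles and no slow phase drift accumulates between transmit and receive. The price is that the carrier is quantized to the coarser 32-bit grid, about 0.014 Hz at an assumed 60 MHz SYSCLK.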
4) Gradient Module:
The Medusa gradient module (Fig. 7) implements the same logic core hardware as the RF module but is instead programmed to handle the playback of gradient waveforms and gating signals. The logic core CPLD renders four channels of digital gradient waveform data as four synchronous high-speed serial links, nominally operating at 10 Mb/s although signaling speed and data format are reconfigurable. Each of the four serial links consists of four signals (typically clock, data, load, and clear) which are converted to RS-422 100-Ω differential pairs by STMicroelectronics ST26C31 differential drivers. The data format, signaling speed, and purpose of each signal may be reconfigured as needed by the gradient system interface requirements. Differential links were selected to permit robust transmission to remote D/A converters or digitally-controlled gradient amplifiers as far as 50 m away, as well as for immunity to cabinet-to-cabinet ground offsets. Output connectors are RJ-45 with signal pairing that allows the use of standard twisted-pair Ethernet cables. In addition to the gradient waveform outputs, two 5 V and ten 3.3 V logic-level gating outputs are provided for the coordination of other experimental hardware. The gating outputs present as a fifth channel and are arbitrarily programmable to a new value at each active sample point in the gradient sequence.
To convert the digital waveform streams for use with analog-input gradient amplifiers, we implemented Medusa DAC boards (Fig. 8) based on the high-precision Linear Technology LTC1592 16-bit DAC. Avago HCPL-0738 opto-isolators convert the differential serial signals from the gradient module to single-ended, while also providing 2 kV galvanic isolation from the console. The LTC1592 DAC offers a software programmable full scale output range of 2.5 V, 5 V, or 10 V, helping to make the best use of the available 16-bit precision while matching gradient amplifier input requirements. The DAC is governed by a Linear LT1021-5 buried-zener 5 V reference with low noise and drift (1 ppm 0.1-10 Hz, and 5 ppm/°C). Two DAC module analog outputs are provided: a buffered single-ended output for monitoring, and a differential output to drive gradient amplifier inputs. Designed with experimental systems in mind, the DAC module also incorporates an optional output-integrating fault detector which aims to protect the gradient amplifiers and coils from thermal damage in case of gross programming error. The fault detector automatically latches the DAC output to zero when the integral of the output waveform exceeds a preset threshold. The circuit uses a lossy R-C integrator to avoid triggering on acceptable persistent low-level shim values.
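The behavior of a lossy integrator of this kind can be explored with a discrete-time model. The Python sketch below is an illustration only: the real detector is an analog circuit whose component values are not given here, so the time constant, threshold, sample interval, and the choice to compare the magnitude of a signed integrator state are all assumptions, as is the function name `first_fault_index`.

```python
def first_fault_index(samples_v, dt_s, tau_s, threshold_v):
    """Return the index of the first sample at which the detector would trip.

    Models a first-order R-C (leaky) integrator with time constant tau_s using
    a forward-Euler step, which is reasonable only when dt_s << tau_s. The
    state settles to the input level for a constant input, so a small
    persistent offset never reaches the threshold. Returns None if no trip.
    """
    if not 0 < dt_s < tau_s:
        raise ValueError("require 0 < dt_s < tau_s")
    alpha = dt_s / tau_s
    state = 0.0
    for i, v in enumerate(samples_v):
        state += alpha * (v - state)      # leak toward the present input
        if abs(state) > threshold_v:
            return i                      # output would latch to zero here
    return None
```

With assumed values of a 4 µs sample interval, a 50 ms time constant, and a 2 V threshold, a constant 0.5 V shim level never trips the detector, while a sustained 8 V output trips it after roughly 14 ms. A lossless integrator would eventually trip on the shim level as well, which is the case the lossy design avoids.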
Additional reference information for components is provided at http://mrsrl.stanford.edu/~medusa.
D. Software Architecture
The host PC software employs a layered plug-in design to complement the modularity and flexibility of the Medusa hardware. A Medusa server, with architecture shown in Fig. 9 , simplifies the management of multiple system controllers and their attached modules by providing a unified software interface for all connected console hardware. The server layer incorporates low-level drivers for both USB high-speed (Cypress FX2) and full-speed (FTDI) communication, and is specifically designed for easy extension to other interfaces in the future. Moreover, the Medusa system uses a standard command and data packet protocol for exchanges between the host PC and console hardware, which can be transported by virtually any communication method. For example, the Medusa protocol can be easily transported over Ethernet or RS-232 serial, in addition to USB.
Once the Medusa server has identified and established connections to available console hardware, it exposes the functionality to high-level software through either an IP network layer, or a shared dynamically-linked library (DLL). User interfaces, pulse sequencers, and reconstruction engines may then use these interfaces to operate the console and recover acquired data. While the DLL interface provides the fastest access, the IP network offers exceptional versatility permitting control and reconstruction software to reside on separate, even remote machines from that running the Medusa server. By virtue of being purely a communication layer using standard network sockets and leveraging cross-platform USB drivers, namely "libusb" and FTD2XX, the Medusa server can be compiled and run easily on any host machine under Microsoft Windows, Apple Mac OS X, or Linux operating systems.
While the Medusa server provides the communication layer, it depends on higher-level software for pulse sequence design, control, execution, image reconstruction, display, and storage. Writing such a suite of software is not a trivial matter. In an effort to harness existing work and provide a familiar powerful development environment, Medusa specifically supports control of the console directly from Mathworks Matlab, illustrated in Fig. 10. A Matlab extension (MEX) driver layer is implemented by the Medusa server which allows complete access to all hardware features. A substantial library of Matlab functions and tools was written to simplify control of Medusa and facilitate development of pulse sequences. Executing a conventional noninteractive pulse sequence is as simple as supplying the pulse sequence data and parameters in a standardized Matlab structure and passing this to a provided execution script. At the conclusion of the scan, the structure is returned with acquired MRI data filled in. Even though Matlab is not a real-time environment, it can run Medusa successfully, even interactively, because the real-time burden is off-loaded to Medusa hardware.
E. Sequence Timing And Coordination
Pulse sequence execution typically begins with identification and configuration. The host PC software opens a connection to the Medusa server and requests a list of all available hardware, for example, the number of RF and gradient modules and channels. The pulse sequence (Fig. 11) is then defined by a set of data blocks, and their associated timing parameters within a TR interval. The data blocks and parameters are transmitted to the local RAM of each module. At the module level, pulse sequence execution is handled by each logic core independently. A master synchronization signal, MSYNC, triggers the beginning of TR, which is then executed to completion on each module. At the rising edge of each MSYNC signal, the internal TR controller registers the beginning of a TR and begins counting sample clocks (SMPCLK). The sample counter is continuously monitored and compared against the start time of the next pulse sequence element, e.g., gradient waveform, RF pulse, or acquisition window. When the TR controller determines that a pulse sequence element is active or "in window," it triggers the DMA engine on each SMPCLK to move data between the waveform SRAM and module devices. For playback, RF or gradient waveform data is retrieved from waveform memory and stored into the DDS or gradient shift registers. For RF acquisition, data is moved from the receiver I/Q data latches and stored back into waveform memory. When the last pulse sequence element in the TR is executed, the TR controller returns to idle and waits for the next TR start signal (MSYNC).
Medusa can execute a sequence in two operating modes, streaming and batch, depending on the pulse sequence requirements and host PC limitations. Streaming mode is preferred because it maintains continuous execution. Here, the host machine must stream data blocks into each module in advance of MSYNC, while received data and status messages are read out at the completion of each TR. Real-time modification of the scan is possible.
In practice, the host must keep a minimum of two TRs loaded (stay two TRs ahead of scan progress). Should the host PC fall behind, the scan will gracefully stop at the last TR that was loaded on-time. Streaming, even at high rate, can be successfully executed with a relatively modest host machine, e.g., Intel Core2 Duo at 1.8 GHz.
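The two-TR lead requirement can be expressed as a simple deadline check. The Python sketch below is one possible reading of the rule rather than the Medusa firmware logic: it assumes that TR n starts at n times the TR period, that its data must finish loading at least `lead` TR periods before that start, and that the scan stops at the first TR that misses its deadline. The function name and the timing values in the usage note are hypothetical.

```python
def last_tr_executed(load_done_s, tr_s, lead=2):
    """Return the index of the last TR that runs, or None if none can run.

    load_done_s[n] is the time (seconds, relative to scan start at t = 0) at
    which the host finished loading TR n; negative values mean the block was
    preloaded before the scan began.
    """
    last = None
    for n, done in enumerate(load_done_s):
        deadline = (n - lead) * tr_s      # must be loaded `lead` TRs ahead
        if done > deadline:
            break                         # host fell behind: stop gracefully
        last = n
    return last
```

For example, with a 20 ms TR and load completion times of -0.1, -0.1, -0.01, 0.0, 0.05, and 0.05 s, the first four TRs meet their deadlines and the scan stops after TR 3.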
In batch mode, the complete pulse sequence is pre-loaded to Medusa in advance of the scan. The waveforms and parameters of most 2D scans can fit completely in the RF and gradient module memories. Once started, the full scan is executed without requiring communication from the host PC at all. Acquired MR data are downloaded to the host after the scan completes. Batch mode does not support real-time modification of the scan, but is useful for ultra-fast TR, very high data rates, or a slow host PC. Regardless of operating mode, any scan can be paused or halted at any time if user intervention is required.
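The per-module TR execution described at the start of this section (count SMPCLK from MSYNC, compare against the next element's start time, and trigger one DMA transfer per sample while in window) can be sketched as a small software model. This is a Python illustration of the control flow only, not the CPLD logic; the `Element` record, its field names, and `run_tr` are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str      # e.g. an RF pulse, gradient waveform, or acquisition window
    start: int     # SMPCLK count after MSYNC at which the element begins
    length: int    # number of consecutive samples in the window


def run_tr(elements):
    """Return [(smpclk_count, element_name), ...], one entry per DMA transfer."""
    pending = sorted(elements, key=lambda e: e.start)
    for prev, nxt in zip(pending, pending[1:]):
        if prev.start + prev.length > nxt.start:
            raise ValueError("elements overlap within the TR")
    transfers = []
    count = 0                                  # sample counter, reset by MSYNC
    for el in pending:
        while count < el.start:                # idle: waiting for the start time
            count += 1
        while count < el.start + el.length:    # in window: one transfer per SMPCLK
            transfers.append((count, el.name))
            count += 1
    return transfers                           # then idle until the next MSYNC
```

In this model every module runs the same loop against its own element list, so the only shared inputs are the sample clock and MSYNC, consistent with the synchronization scheme described in the bus architecture section.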
III. RESULTS
Initial Medusa imaging tests were performed on the Stanford pre-polarized MRI (PMRI) scanner [17]-[20]. PMRI employs an inhomogeneous pulsed resistive electromagnet for polarization, followed by a lower-field uniform electromagnet during signal readout. The Stanford-built scanner served as an excellent testbed with ready access to RF, gradient, and coil control signals. A Matlab-based scan execution environment, as previously illustrated in Fig. 10, was developed to drive the scanner along with gradient echo, spin-echo, and fast spin-echo sequences. The PMRI magnet and an early Medusa imaging result are shown in Fig. 12.
Medusa was subsequently configured to operate a GE Signa Excite 1.5T scanner, adapting easily to 1.5T (64 MHz) with only a change of RF input filters and frequency settings. However, the proprietary nature of some of the signal interfaces on the commercial scanner meant that an in-house RF chain and coils had to be built. Success with simple Matlab-controlled 2DFT scans eventually led to more advanced work such as spiral-based real-time cardiac imaging with Medusa driven by the RT-Hawk real-time scanning software suite [21], with results shown in Fig. 13. This spawned a Medusa configuration with 8-channel RF transmit/receive and dual system controllers designed specifically to satisfy the high performance demands of real-time parallel imaging (Fig. 14).
Fig. 13. Left: MR image of 12 cm GE resolution phantom, acquired using Medusa controlling a GE Signa Excite 1.5T scanner. Right: A frame capture of real-time cardiac imaging at 50 frames/s performed using a Medusa console and RT-Hawk software suite on the 1.5 T GE magnet. Acquisition is a 3072-pt spiral with 42 interleave at 20 ms TR.
Fig. 14. An eight-channel real-time Medusa console demonstrates synchronized operation between two controllers. The controllers work in tandem by sharing a 10 MHz reference clock and an MSYNC pulse which signals the start of each TR. Each controller drives a bank of four RF channels, and one controller also handles the gradients. Inside the enclosure, the USB datalinks from the controllers are combined using a USB hub. This synchronization method enables scaling to very high channel counts, limited only by the practical distribution of the reference clock and MSYNC signal.
Being essentially a generic multichannel RF instrument, the use of Medusa is not limited to MR imaging. RF network and spectrum analysis functions are easily implemented in the same Matlab programming environment used for scanning. As an example, a two-channel Medusa console combined with a directional coupler has been used to monitor impedance in a chest surface coil showing variability in sync with cardiac and respiratory function (Fig. 15).
Fig. 15. Left: Two-channel Medusa console monitors transmit coil impedance to detect changes in loading due to motion or even physiological activity. Right: With a chest surface coil we detect both cardiac and respiratory rhythms as impedance changes in the coil.
TABLE I. USB THEORETICAL AND ACHIEVED THROUGHPUT
TABLE II. MEDUSA PERFORMANCE RESULTS
Tables I and II list performance figures for Medusa hardware. Experimentally achieved throughput using Medusa's full-speed and high-speed USB interfaces did not reach theoretically possible rates (Table I). The transaction, protocol, and bit-stuffing overhead of USB consumed as much as 33% of total bus bandwidth. Furthermore, we observed that usable throughput on USB can vary significantly with host operating system, USB host controller, USB device interface, packet size, and transfer method. Using bulk transfer mode with large packets, and a device interface with hardware acceleration (DMA) generally yielded the best rates. Failure to achieve theoretically-possible rates is common, even with a modern host PC and high-performance commercial USB products.
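To put the USB overhead in terms of channel count, the following Python sketch estimates how many receive channels a single bus could carry. It is a rough upper bound under stated assumptions, not a measured result: it applies the 33% overhead figure quoted above to the nominal signaling rate, counts only receive data at four bytes per complex sample (16-bit I and Q), and ignores transmit waveforms, status traffic, and host-side limits. The function names are hypothetical.

```python
def usable_bytes_per_s(signaling_bps: int, overhead_percent: int) -> int:
    """Payload rate left after a given percentage of bus bandwidth is lost."""
    return signaling_bps // 8 * (100 - overhead_percent) // 100


def max_receive_channels(usable_bps: int, baseband_sps: int,
                         bytes_per_complex_sample: int = 4) -> int:
    """Whole channels that fit in the usable payload rate."""
    return usable_bps // (baseband_sps * bytes_per_complex_sample)
```

Under these assumptions, USB high-speed (480 Mb/s) leaves about 40.2 MB/s, or at most 40 continuous receive channels at 250 Ks/s, while USB full-speed (12 Mb/s) leaves about 1 MB/s, enough for a single such channel. Real pulse sequences acquire for only part of each TR, so the average load is lower than this continuous-acquisition figure.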
The maximum Tx/Rx baseband sample rate in the RF module is limited largely by data transfer speeds between local waveform memory and the DDS and digital receiver. Operating independently, the exciter and receiver can achieve sample rates of 1.8 and 3.2 million complex samples per second, respectively. Yet when transmitting and receiving simultaneously (e.g., loopback or calibration), the maximum combined sample rate is 1.1 Ms/s, which is still in excess of typical MRI baseband requirements of 8-500 Ks/s.
In RF performance, the Medusa RF receiver achieves a signal over noise floor of 89 dB (15.8 bits) at 250 Ks/s baseband rate, a figure made possible in part by process gain from decimation and filtering of the high-rate ADC. The receiver's input-referenced third-order intercept point (IIP3) of 19.7 dBm was determined using two-tone tests at 1 kHz and 10 kHz. The RF transmitter produces a spur-free dynamic range (SFDR) of 78 dBc. All measurements were made at a center frequency of 63.9 MHz (1.5 T). The first version of the RF module was carefully designed on a two-layer PCB for simplicity and economy, yet a recent update yielded a modest 0.5 dB improvement in RF dynamic range by moving to a four-layer design with full internal power and ground planes. A four-layer PCB has the added advantage of better thermal conductivity for heatsinking of the high-dissipation DDS and ADC chips.
The maximum achievable gradient sample rate when driving Medusa DAC modules is 375 K samples per second per channel, the point at which the bandwidth of the 10 Mb/s serial links is saturated by 24-bit DAC words (8 bits control + 16 bits data). The Medusa gradient module is intrinsically capable of 1.2 M samples per second when the connected hardware permits higher serial bit rates or a more compact data format. Typical MRI gradient amplifiers are limited to a bandwidth measured in tens of kilohertz, with sample rates of 250 Ks/s commonly used merely to offer additional temporal resolution.
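The relationship between serial bit rate, DAC word size, and gradient sample rate can be sketched with simple arithmetic. The following is an illustrative calculation only: it assumes one serial link per gradient channel and one DAC word per sample, and the function names are hypothetical. Under these assumptions the ideal ceiling of a 10 Mb/s link carrying 24-bit words is slightly above the reported 375 Ks/s, so the reported figure corresponds to roughly 90% link utilization; the cause of the remaining margin (for example framing gaps) is not stated in the text.

```python
def max_word_rate(link_bps: float, bits_per_word: int) -> float:
    """Ideal upper bound on words (samples) per second for a serial link."""
    return link_bps / bits_per_word


def required_link_rate(samples_per_s: float, bits_per_word: int) -> float:
    """Bit rate needed to carry a given sample rate at a given word size."""
    return samples_per_s * bits_per_word


LINK_BPS = 10e6   # nominal serial link rate quoted in the text
WORD_BITS = 24    # 8 control bits + 16 data bits

# Ideal ceiling for the standard DAC word format (about 416.7 Ks/s).
ideal_rate = max_word_rate(LINK_BPS, WORD_BITS)

# Fraction of the nominal link rate used at the reported 375 Ks/s.
utilization = required_link_rate(375e3, WORD_BITS) / LINK_BPS

# Bit rate needed to reach the module's intrinsic 1.2 Ms/s,
# with the standard 24-bit word and with a hypothetical 16-bit data-only word.
needed_24bit = required_link_rate(1.2e6, 24)
needed_16bit = required_link_rate(1.2e6, 16)

if __name__ == "__main__":
    print(f"ideal ceiling:       {ideal_rate / 1e3:.1f} Ks/s")
    print(f"utilization at 375k: {utilization:.0%}")
    print(f"1.2 Ms/s, 24-bit:    {needed_24bit / 1e6:.1f} Mb/s")
    print(f"1.2 Ms/s, 16-bit:    {needed_16bit / 1e6:.1f} Mb/s")
```

The last two figures illustrate why reaching 1.2 Ms/s requires either a faster serial link or a more compact data format, as noted above.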
IV. DISCUSSION
While Medusa has clearly been successful as a flexible, scalable platform for MR imaging, it is also a study in trade-offs. The goal of delivering high performance while making efficient use of resources at every layer in the architecture has been addressed partly through initial design, and partly through iteration.
The proof-of-concept prototype highlighted several design pitfalls including insufficient RF and gradient sample rates due to underpowered microcontrollers, undersized memory buffers, and the low throughput of USB full speed (12 Mb/s). The loose hardware integration of development kits also made multichannel scale-up difficult. The second-generation Medusa system addressed many of these problems.
By implementing a common logic core to handle hard real-time tasks common to all Medusa modules, we simultaneously relieved the Medusa controller and host PC of precision timing duties while also simplifying system design, lowering cost, and promoting system modularity and expandability. Hard real-time tasks are contained and remain local to the modules. New or upgraded modules added to the Medusa system can build upon the same logic core (or implement a new compliant one) along with the circuits to support new functionality. Already, the Medusa Gradient module hardware has been easily reconfigured in firmware to drive several different gradient systems including GE and Varian digitally-driven amplifiers, as well as in-house built RF vector multiplier boards in place of the standard Medusa DAC boards.
Likewise, the use of digital RF components yields some key design advantages and simplifications. Performing downmixing and channel filtering numerically ensures that multichannel systems have perfectly matched response without drift due to temperature or component aging. Digitizing RF early also means fewer analog parts in the receive path that must be scrutinized for linearity, dynamic range, and IP3 performance. Direct digital conversion of RF eliminates the need to distribute a local oscillator (LO) common in analog implementations. However, digital systems do not escape the need for quality phase and frequency references. The low phase noise requirement on analog LO signals has an exact corollary in the clock jitter specification levied on high-speed digital sampling clocks. In both cases, a poor quality reference will lead to unwanted spectral broadening and reduced SNR in the baseband. For example, clock jitter of just 300 fs is equivalent to the loss of the least-significant bit on Medusa's 14-bit ADC when acquiring a 64 MHz signal.
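The jitter figure quoted above can be checked with the standard aperture-jitter relation, SNR = -20 log10(2π f σ), where f is the input frequency and σ the RMS clock jitter. The sketch below is an illustration under stated assumptions, not the authors' calculation: it models the 14-bit converter as an ideal quantizer (6.02 b + 1.76 dB), treats jitter and quantization noise as uncorrelated, and uses hypothetical function names. A real ADC has a lower starting SNR, so the exact bit loss will differ.

```python
import math


def jitter_limited_snr_db(f_in_hz: float, jitter_rms_s: float) -> float:
    """SNR ceiling imposed by sampling-clock jitter on a full-scale sinusoid."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)


def ideal_quantization_snr_db(bits: int) -> float:
    """SNR of an ideal quantizer driven by a full-scale sinusoid."""
    return 6.02 * bits + 1.76


def combined_snr_db(*snrs_db: float) -> float:
    """Combine uncorrelated noise sources by summing their noise powers."""
    noise_power = sum(10.0 ** (-s / 10.0) for s in snrs_db)
    return -10.0 * math.log10(noise_power)


def effective_bits(snr_db: float) -> float:
    """Invert the ideal-quantizer formula to express an SNR in bits."""
    return (snr_db - 1.76) / 6.02


F_IN = 64e6        # input frequency from the text
JITTER = 300e-15   # RMS clock jitter from the text
ADC_BITS = 14      # ADC resolution from the text

snr_jitter = jitter_limited_snr_db(F_IN, JITTER)
snr_quant = ideal_quantization_snr_db(ADC_BITS)
snr_total = combined_snr_db(snr_jitter, snr_quant)
bits_lost = ADC_BITS - effective_bits(snr_total)

if __name__ == "__main__":
    print(f"jitter-limited SNR: {snr_jitter:.1f} dB")
    print(f"ideal 14-bit SNR:   {snr_quant:.1f} dB")
    print(f"combined SNR:       {snr_total:.1f} dB")
    print(f"bits lost:          {bits_lost:.2f}")
```

Under these idealized assumptions the loss comes out on the order of one bit, consistent in magnitude with the statement above.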
We see from results (Table II) that the Medusa RF receiver closely tracks predicted process gain from oversampling. For sinusoids with random noise, we can expect oversampling by a factor $N$ to yield $\tfrac{1}{2}\log_2 N$ additional bits of dynamic range. At lower imaging bandwidths, the receiver's noise-free dynamic range already exceeds the range provided by the 16-bit output data path. A future upgrade of the receiver ADC in speed or bit-depth would need to be accompanied by a 20 or 24-bit digital path to capture the wide dynamic range.
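The textbook rule for process gain is that, for white noise, oversampling by a factor N followed by filtering and decimation improves SNR by 10 log10(N) dB, which is half a bit per doubling of N. A minimal sketch of that rule follows; the function names and the example ratios are illustrative assumptions, not values taken from the Medusa hardware.

```python
import math


def process_gain_db(oversampling_ratio: float) -> float:
    """SNR improvement from oversampling by the given ratio (white noise)."""
    return 10.0 * math.log10(oversampling_ratio)


def extra_bits(oversampling_ratio: float) -> float:
    """Additional bits of dynamic range: half a bit per doubling of the ratio."""
    return 0.5 * math.log2(oversampling_ratio)


if __name__ == "__main__":
    # Example ratios chosen for illustration only.
    for ratio in (4, 16, 64, 256):
        print(f"N = {ratio:4d}: {process_gain_db(ratio):5.1f} dB, "
              f"{extra_bits(ratio):.1f} extra bits")
```

Because the gain grows with the decimation ratio, narrow imaging bandwidths benefit most, which is why a fixed 16-bit output path can become the limiting factor at low bandwidths.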
The performance of digital RF components such as DDS and high-speed ADCs is advancing rapidly. Already, the devices used in the Medusa RF module are old, superseded by newer ICs with a broader frequency range, faster sampling and update rates, and higher dynamic range. Far from making Medusa obsolete, this merely reinforces the importance of the modular system architecture which permits the seamless upgrade of components as technology improves and needs grow.
Building the Medusa system controller separately from the RF and gradient modules succeeded in minimizing hardware and allowing easy expansion and upgrade of the modules, but this came at the price of introducing a potential data bottleneck in the Medusa backbone bus. Future Medusa designs may need to balance the increasing data demands of high channel counts against ultimate modularity. For example, a single RF module hosting 4 or 8 channels could produce enough data that it alone would fill a USB 2.0 data link. For such a design, the RF module should integrate system controller functionality with a dedicated data link.
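The bottleneck argument can be made concrete with a rough data-rate estimate. The sketch below assumes each complex baseband sample is carried as a 16-bit I and a 16-bit Q value (consistent with the 16-bit output data path mentioned earlier), ignores any packet headers, and takes the usable share of USB 2.0 bandwidth as 60% of the 480 Mb/s raw rate. The function names and the 60% constant are illustrative assumptions.

```python
USB2_RAW_BPS = 480e6     # USB 2.0 high-speed raw bit rate
USABLE_FRACTION = 0.60   # assumed usable share of the raw rate


def rf_module_data_rate_bps(channels: int,
                            complex_samples_per_s: float,
                            bits_per_component: int = 16) -> float:
    """Payload bit rate for a module streaming complex (I/Q) samples."""
    return channels * complex_samples_per_s * 2 * bits_per_component


def fits_on_usb2(channels: int, complex_samples_per_s: float) -> bool:
    """True if the payload fits within the assumed usable USB 2.0 bandwidth."""
    rate = rf_module_data_rate_bps(channels, complex_samples_per_s)
    return rate <= USB2_RAW_BPS * USABLE_FRACTION


if __name__ == "__main__":
    # 250 Ks/s: a typical imaging bandwidth; 3.2 Ms/s: the maximum
    # receive-only rate reported for the current RF module.
    for channels in (4, 8):
        for rate in (250e3, 3.2e6):
            mbps = rf_module_data_rate_bps(channels, rate) / 1e6
            print(f"{channels} ch @ {rate / 1e3:6.0f} Ks/s: "
                  f"{mbps:6.1f} Mb/s, fits: {fits_on_usb2(channels, rate)}")
```

With these assumptions, a 4- or 8-channel module running at the maximum receiver rate exceeds the usable bandwidth of a single USB 2.0 link on its own, while the same module at typical imaging bandwidths fits comfortably.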
Moreover, there remains a question of whether USB is the best bus architecture for Medusa. USB is ubiquitous, fast, cost-efficient, and intrinsically supports multiple devices per bus. By virtue of being entirely host managed, USB does not suffer from bus collisions and can provide guaranteed data delivery. While Medusa is now able to saturate a single USB 2.0 480 Mb/s link under real-time multichannel operation, the introduction of USB 3.0 at 5 Gb/s may provide a compelling upgrade path with backwards compatibility. Multiple Medusa systems working in tandem have already been demonstrated as a method for increasing channel counts. A USB 3.0 hub can aggregate the data from at least ten Medusa systems each with 4-8 RF channels, permitting an order of magnitude increase in channel count without any hardware redesign.
Nonetheless, USB is not without flaws. Protocol overhead is 10%-15%, and host-managed bus scheduling often reduces usable bandwidth an additional 10%-25%. Commonly achievable user data rates are only 60% of the raw bus rate, with relatively long round-trip latencies of 300 µs to 3 ms. The lack of support for broadcast, multicast, and peer-to-peer device communication presents an efficiency and logistics problem when distributing global information amongst a distributed system like the Medusa architecture. Consider for example the need to distribute a time-critical pulse sequence change due to cardiac trigger. On USB, the message must be sent independently to each Medusa controller, introducing delay and time skew.
The IEEE 1394a/b bus standard (Firewire 400/800 Mb/s) was considered early on as an alternative bus transport to USB. Even when running at a bitrate slower than USB 2.0, Firewire achieves higher data throughput with up to 95% bus utilization and guaranteed bandwidth. Yet Firewire began as a closed standard and despite clear technical advantages still has limited market share compared to USB. As a result, there are far fewer components, tools, and resources for Firewire device development, making implementation challenging.
Gigabit Ethernet, although more complex to implement than USB 2.0, presents some advantages for future Medusa hardware generations. Like USB, multiple Ethernet links can be aggregated into a high-bandwidth backbone using network switches. Ethernet network cards and switches are commonly engineered and specified for high network load and are readily available commercially at low cost. Most products can operate to over 90% network utilization. Tests of modern consumer-level PC hardware with built-in gigabit Ethernet achieved 850 Mb/s user data throughput when transferring to RAM, and round-trip latencies of less than 50 µs, far better than USB 2.0 in both respects. Ethernet also supports broadcast and peer-to-peer communication for more flexible information flow, and significantly longer cable lengths as compared to USB or Firewire.
Medusa may be used as a standalone console for novel imaging systems, yet it also integrates easily with most commercial scanners to augment functionality and the number of RF and gradient channels. Only a 10 MHz clock reference (REFCLK) and start-of-TR logic signal (MSYNC) are required to keep Medusa synchronized for tandem operation with a host scanner. This same MSYNC trigger mechanism can also be used to precisely discipline Medusa's pulse sequence execution from an external device such as a cardiac or respiratory monitoring unit. In both cases, Medusa's inputs and outputs are synchronized to the host to within 20 ns, and can then be used for generic channel expansion, or to monitor or control auxiliary hardware such as for guidewire current monitoring or RF ablation.
When Medusa interacts with live subjects, exceptional care must be taken to ensure that SAR and other physiological safety limits are obeyed. As with commercial scanners, combining pulse sequence safety modeling, monitoring, and hardware interlocks is advisable. One approach to enhancing MR safety in an experimental setting is to use amplifiers which are physically incapable of exceeding safety limits, or are failsafe protected against it at the hardware level. For example, we may use 1 kW RF amplifiers yet with time-average RF power limited to 100 W by the dc power supply.
Thus far, 11 second-generation Medusa systems have been constructed. Demonstrating flexibility and versatility in application, Medusa has played a key role in a variety of imaging experiments, including prepolarized MRI, magnetic particle imaging [22], and Overhauser-enhanced MRI [31]. Medusa has also enabled parallel transmit research in predistortion methods [24] and Cartesian feedback for RF power amplifier linearization [27], [28], and impedance characterization of RF amplifiers [23] and coils [25]. Finally, Medusa is used for interventional research addressing therapeutic RF ablation [30] as well as MR guidewire safety through reverse polarization imaging [26] and active current cancellation [29].
V. CONCLUSION
MRI systems research has trended to higher channel counts in receiver, transmitter and gradient/shim subsystems. The rapid prototyping and implementation of novel MRI hardware concepts demands a flexible, scalable, open architecture. We have demonstrated Medusa, a highly scalable MR console architecture based on digital synthesizer, digital receiver and gradient building blocks networked by a USB 2.0 transport layer. The Medusa architecture is extensively reconfigurable to synthesize a variety of medical/scientific instruments. To date, 11 Medusa consoles have been built in various configurations to serve diverse research applications including prepolarized and real-time MRI, parallel transmit system development, and magnetic particle imaging.
