and office environment in the near future, the use of VLC systems will grow exponentially over the coming decades [1]. As a result, there has been significant research and commercial interest in VLC systems over the last ten years (see [2] for a detailed review). This is largely due to several key advantages that VLC offers over existing radio frequency (RF) technology, including license-free operation, high available bandwidth, high spatial diversity, innate security, and controlled beam shaping.
Phosphor-based white LEDs have a low communication bandwidth (a few MHz) due to the long photoluminescence lifetime of the phosphor [3]. The bandwidth of the blue LED itself is limited to 20-30 MHz. Using pre- and post-equalisation with on-off keying (OOK) modulation, a data rate of 550 Mbps was demonstrated using a phosphorescent white LED [4]. Further enhancement in data rate was achieved by adopting advanced modulation schemes including carrierless amplitude and phase modulation (CAP) and orthogonal frequency division multiplexing (OFDM) [5], [6].
To increase the data rates further, VLC systems must use wavelength division multiplexing (WDM) and/or spatial multiplexing (also known as multiple input multiple output (MIMO)) [2]. Using WDM, VLC systems with data rates up to 10 Gbps were demonstrated [7], [8]. However, there is limited practical work using spatial MIMO, and most of it consists of low data rate proof-of-concept demonstrations [9]-[11].
Research progress has been substantial in the past few years [2], and whilst a 'killer app' may provide a kick-start for VLC systems [12], it is also essential to explore technologies that make a high-speed bi-directional VLC system suitable for energy-efficient integration with existing CMOS-based consumer electronics (e.g. mobile devices). Except for the use of CMOS-based image sensors [13], [14] and other low-speed demonstrations [15], most practical VLC demonstrations used off-the-shelf components that are not suitable for mass production and are difficult to integrate into mobile devices. In a real-time application, device and system constraints that are not encountered in the laboratory environment, including limited memory, real-time processing power, and clock rates, also need to be addressed.
0733-8724 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
In this paper, we report high-speed imaging MIMO systems which address the issues of integrating transmitter and receiver components. We have demonstrated four- and nine-channel MIMO systems with a maximum data rate of ∼890 Mbps per channel and an aggregate data rate of 7 Gbps for nine channels. Higher data rates have been demonstrated with similar transmitter devices, but using significantly larger area photodiodes (PDs) and a bigger optical aperture at the receiver. However, these demonstrations have a very limited field of view (FOV), as collimating and focusing optics are used to concentrate most of the emitted power onto the detector [28]. Hence, the contributions of the paper are as follows:
a) Demonstration of fully integrated MIMO-VLC
CMOS technology is widely used to realize integrated circuits, especially for mass production, due to its low power and high-speed performance. Previously reported CMOS-based LED drivers have low driving currents (less than 70 mA) and limited data rate support (up to 155 Mb/s) (see [17], Table x). The CMOS-based LED driver circuit reported here can deliver up to 500 MS/s at a maximum full-scale current of 255 mA and has a power efficiency of 67%. The driver supports different modulation schemes (pulse amplitude modulation (PAM), CAP, OFDM) and can drive multiple LEDs in MIMO or single-input-single-output (SISO) modes. A detailed description of the driver is reported in [18].
The fully integrated system uses an array of blue µLEDs driven by a CMOS driver, together with an array of avalanche photodiodes (APDs) and processing circuitry manufactured in CMOS technology. Preliminary results using a first-generation APD receiver with a bandwidth of ∼22 MHz (without the CMOS driver) were reported in [19], [20]. Further results with the second generation of APDs were reported in [21]. To the best of the authors' knowledge, this is the first fully integrated MIMO demonstration.
b) Demonstration of spatially dense high-speed MIMO-VLC
MIMO-VLC has so far been demonstrated only for limited data rates (often far below the maximum rate demonstrated using a single channel) [9]-[11]. In this paper, we have demonstrated data rates up to 7 Gbps using nine channels with a transmitter separation of 750 µm and a receiver separation of 250 µm. This demonstration improves upon existing systems in terms of:
i) higher data rate: the previously reported maximum data rate for a MIMO system (in an ideal condition, i.e. without optical cross-talk) was 1.3 Gbps [10], [22]. In this paper, we have improved the data rate to 7 Gbps. This data rate was achieved using significantly smaller photodiodes (200 × 200 µm²) instead of large area PDs such as 7. [9];
ii) spatial data density: in comparison to the above-mentioned work, we demonstrated a system with a high spatial density. Previous MIMO work was demonstrated with transmitter spacing in the cm range (e.g. 5 cm in [11], 15 cm in [10] and 25 cm in [9]). In comparison, the current system has a transmitter spacing of 750 µm, increasing the spatial density by > 4000 channels/cm². Though this increase in spatial density may not translate directly into a similar increase in data rate, this demonstration provides a platform for using the multiple LED chips within each luminaire of the commercial chip-on-board (COB) LED architecture of illumination devices for parallel data transmission;
iii) higher MIMO order: the previously reported high-speed MIMO-VLC systems are limited to four parallel channels. In this work, we have increased the number of channels to nine, limited by the available transceivers; and
iv) scalability: since the transceiver is manufactured in CMOS technology, the system is readily scalable. We have demonstrated that the same system can be scaled from four channels to nine channels. A higher order MIMO system is feasible with a larger transceiver array.
To the best of the authors' knowledge, this is the first large-scale MIMO-VLC demonstration, as we practically demonstrated a 9 × 9 MIMO system. We have reported a complete system with full device and communication link characterization, including the error probability within the coverage distance. With the demonstration of a high-speed, high-density, large-scale MIMO-VLC system using a CMOS-based LED driver and APD receiver, we have made the system not only compact but also more practical for large-scale integration with existing systems.
The rest of the paper is organized as follows: Section II gives an overview of the MIMO VLC system. The sub-systems of the demonstrator are described in detail in this section. Section III details the experimental set-up. The experimental results are presented in Section IV. Finally, conclusions are given in Section V.
II. INTEGRATED SYSTEM DESCRIPTION
The device parameters for optimum performance of a VLC system depend on the device constraints of both the transmitter and the receiver. At the transmitter, the µLED area, optical power, and bandwidth are inter-related. Similarly, the photodiode bandwidth depends on its active area. By incorporating the power penalty for multilevel PAM together with the µLED constraints (area-bandwidth-output power relationship) and the APD constraints (area-bandwidth relationship) in a link budget analysis, a realistic target of 1 Gbps at a distance of 1 m was established for this integrated MIMO-VLC system. This conservative goal was chosen based on preliminary modelling of the µLED devices likely to be available and of CMOS-based receiver sensitivities. Though the link distance is shorter than the average ceiling-to-user distance, the system is scalable to longer distances. Refer to [19] for detailed derivations of these parameters and numerical values.
A. Transmitter Subsystem
The transmitter subsystem consists of the µLED driver chips, the µLED array and the associated printed circuit boards (PCBs). The µLED driver chip is designed and implemented in an Austria Micro Systems 0.18 µm CMOS process. Each chip consists of four independent current-steering digital-to-analogue converter (DAC) based driver channels, each with 8-bit resolution. The chip can drive four individual LEDs in a ganged mode (i.e. all channels driving the same data stream) or in MIMO mode (each channel driving an independent data stream). In MIMO mode, each driver operates at a four times lower clock rate. Each driver channel can deliver a full-scale current of up to 255 mA (i.e. ∼1 A per chip), and the power efficiency is 67% [18]. The DC bias current of each DAC can be varied independently from 16 mA to 255 mA with 4-bit resolution. The DAC is designed to operate at a maximum sampling frequency of 500 MS/s. A custom PCB platform provides an interface between the driver chip and a field programmable gate array (FPGA) card (Opal Kelly XEM6310) (see Fig. 1). The FPGA provides the data, sampling clock, and other control signals to the driver chip. The full electrical characterisation of the driver chip was reported in [18].
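As a rough illustration of how an 8-bit driver channel of this kind could turn PAM symbols into drive currents, the sketch below assumes a linear code-to-current transfer (1 mA per code at the 255 mA full scale). The helper `pam_to_dac_codes`, the bias and the swing values are hypothetical and are not taken from [18].

```python
def pam_to_dac_codes(symbols, levels, bias_ma, swing_ma,
                     full_scale_ma=255.0, bits=8):
    """Map PAM symbols (0 .. levels-1) to DAC codes centred on a bias current.

    Assumption: the DAC is linear, so code = current / (full_scale / max_code).
    """
    max_code = (1 << bits) - 1            # 255 for an 8-bit DAC
    ma_per_code = full_scale_ma / max_code
    codes = []
    for s in symbols:
        if not 0 <= s < levels:
            raise ValueError(f"symbol {s} outside 0..{levels - 1}")
        # Levels are evenly spaced between bias - swing/2 and bias + swing/2.
        current_ma = bias_ma - swing_ma / 2 + swing_ma * s / (levels - 1)
        code = round(current_ma / ma_per_code)
        if not 0 <= code <= max_code:
            raise ValueError("requested current exceeds the DAC range")
        codes.append(code)
    return codes


# Hypothetical 4-PAM example: 45 mA average current, 30 mA peak-to-peak swing.
codes = pam_to_dac_codes([0, 1, 2, 3], levels=4, bias_ma=45, swing_ma=30)
chip_full_scale_ma = 4 * 255              # four channels per chip, about 1 A
```

With four channels per chip, the combined full-scale current is 4 × 255 mA = 1020 mA, which matches the ∼1 A per chip quoted above.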
B. µLEDs
The MIMO device consists of a 6 × 6 array of individually addressable µLEDs with a diameter of 39 µm, fabricated from commercially available GaN LED wafer material grown on a 2" c-plane sapphire substrate. The µLEDs share a common p-contact and each has an individually addressable n-contact, which makes them compatible with an NMOS-based CMOS driver. The MIMO µLED array is arranged in nine 2 × 2 clusters with a pitch of 69 µm between two adjacent µLED elements (Fig. 2(a)). The separation between adjacent clusters is 750 µm, giving an end-to-end separation of 1500 µm. The nominal peak emission wavelength of each µLED is 450 nm, with a full width at half maximum linewidth of 20 nm.
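The array layout described above can be reproduced with a few lines of code. This is only a sketch: it assumes that the 750 µm cluster separation is measured centre to centre and places the origin at the centre of the top-left cluster; the function name `uled_positions` is illustrative.

```python
def uled_positions(clusters_per_side=3, cluster_sep_um=750.0,
                   leds_per_side=2, pitch_um=69.0):
    """Return {(cluster_row, cluster_col, row, col): (x_um, y_um)}.

    Assumption: cluster separation is centre to centre, and each cluster is a
    square grid of µLEDs centred on its cluster centre.
    """
    half_span = (leds_per_side - 1) * pitch_um / 2
    positions = {}
    for cr in range(clusters_per_side):
        for cc in range(clusters_per_side):
            cx, cy = cc * cluster_sep_um, cr * cluster_sep_um
            for r in range(leds_per_side):
                for c in range(leds_per_side):
                    x = cx - half_span + c * pitch_um
                    y = cy - half_span + r * pitch_um
                    positions[(cr, cc, r, c)] = (x, y)
    return positions


pos = uled_positions()
n_leds = len(pos)                                         # 36 µLEDs in total
end_to_end_um = pos[(2, 2, 0, 0)][0] - pos[(0, 0, 0, 0)][0]
```

Under these assumptions the sketch yields 36 elements (the 6 × 6 array) and a 1500 µm span between corresponding µLEDs in the outermost clusters, in line with the end-to-end separation given in the text.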
The µLED array is wire bonded to a 132-pin ceramic package which is attached to a daughter card (see Fig. 2(b)). The daughter card is connected to a motherboard through four high-speed connectors, and the motherboard is connected to the LED driving subsystem. This hierarchical approach allows the LED array to be easily exchanged for another array.
C. Receiver

D. Optical Design
An imaging MIMO system is preferred over a non-imaging MIMO system, as the channel H-matrix of an imaging system can be well-conditioned [14] . To achieve a full rank matrix in the imaging MIMO system, the image of more than one source should not fall entirely into the same receiver i.e.
where s is the source spacing, p is the PD width, f is the focal length of the receiver optics system and d is the link length. The ratio (p/f) governs the receiver FOV of the imaging system. The transmitter and the receiver optical systems are designed using a combination of commercially available aspheric singlet lenses (see Fig. 4). The transmitter and receiver optics have apertures of 45 mm and 23 mm, respectively, and the front edges of the lenses are 37 mm and 21.5 mm from the µLED and the PD, respectively. The optics system was optimized for a 4 × 4 imaging MIMO system using ray tracing software. However, there is flexibility to adjust the optics for other MIMO configurations. The transmitter optics is designed to offer a full divergence angle of 7.5 degrees. The divergence angle and FOV are calculated based on a link budget analysis to offer a data rate of at least 1 Gbps at a 1 m link distance. The receiver optics not only creates an image of the LED array but also provides an optical gain. The receiver has a full FOV of 3.5 degrees. The FOV of the receiver is limited by the number of available APDs in the array. In order to increase the FOV, a significantly larger array of APDs is required [23].
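The imaging condition can be motivated with a thin-lens sketch. This is an illustrative derivation under stated assumptions and may not match the exact form of (1): the receiver optics is treated as a single thin lens of focal length f, the transmitter optics and the finite spot size are ignored, and d ≫ f.

```latex
% Thin-lens sketch. Assumptions: single thin receiver lens, d >> f,
% transmitter optics and spot size ignored.
\begin{align}
  \Delta x &= \frac{f}{d - f}\, s \;\approx\; \frac{f\,s}{d}
    && \text{(separation of two source images on the PD plane)} \\
  \frac{f\,s}{d} &\ge p
    \quad\Longleftrightarrow\quad d \le \frac{f\,s}{p}
    && \text{(images of adjacent sources do not share one PD)}
\end{align}
```

In this sketch the maximum usable link length scales linearly with the source spacing s, so halving the source spacing halves the distance over which the sources can still be resolved.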
III. EXPERIMENTAL SET-UP
In this paper, we report a practical demonstration of the integrated four-channel MIMO-VLC system using state-of-the-art CMOS technology. We have also demonstrated four- and nine-channel MIMO-VLC systems driven by arbitrary waveform generators (AWGs), which show the high-speed capacity of the system. A simplified block diagram of the MIMO-VLC experimental set-up using the CMOS driver is shown in Fig. 5(a), and a photograph of the set-up is given in Fig. 5(b). To achieve an aggregate data rate of 1 Gbps or higher, two DAC driver chips were required, as the maximum sampling rate supported by the Opal Kelly FPGA was limited to 375 MS/s. Each DAC chip drives two µLEDs in MIMO mode. The two driver chips were synchronized by configuring them in a master-slave configuration in which a 10 MHz clock from the master DAC chip was used as the reference clock for the slave chip.
In the AWG-based system, the DAC drivers are replaced with AWGs (Agilent 81150A). One µLED in each cluster of four was used for the nine-channel system, whereas only µLEDs from the corner clusters were used for the four-channel system (see Fig. 2(a)). The electrical outputs from the APDs were acquired using oscilloscopes for further offline processing, which includes low pass filtering, equalisation and signal decoding. The four-channel MIMO systems were evaluated at a link distance of 1 m and the nine-channel system was evaluated at 0.5 m. In a bandlimited MIMO system, both spatial (among MIMO channels) and temporal (intersymbol) interference occur. To mitigate this interference, a joint spatial and temporal decision feedback equalization (DFE), as outlined in [24], is adopted in this work.
IV. SYSTEM PERFORMANCE
The system under test is fully characterized and optimum operating conditions for both transmitter and receiver are determined. For data rate evaluation, a bit error threshold of 3.8 × 10⁻³ is adopted as recommended by ITU [25].
A. System Characterization
The measured intrinsic bandwidth and optical power of the µLED device are shown in Fig. 6(a). The electrical-to-electrical (E-E) bandwidth and optical power of the µLED increase with the driving current. The measured optical power at a current of 100 mA is ∼3.3 mW. The measured frequency response of the APD arrays is given in Fig. 6(b). Except for two APDs which show a resonant dip at 63 MHz, the APDs have a bandwidth >90 MHz, which is limited by parasitic capacitance. The resonant dips in these two APDs are possibly caused by their proximity to the pads. This issue is being investigated further and will be rectified in future iterations.
In order to determine whether the DAC driver has any adverse effect on the system performance, the bandwidth of the system was evaluated by driving the µLEDs with a DAC and with an AWG under similar bias currents, using sinusoidal signals of different frequencies. The sampling rate of the DAC is fixed at 250 MS/s, whereas the AWG has a sampling rate of 2 GS/s. Fig. 7 shows the measured bandwidth of the complete system (µLED + APD). As expected, the bandwidth of the µLEDs increases with the average current (see Fig. 6(a)). It can be observed that the DAC driver has a lower bandwidth than the AWG. The DAC output waveforms exhibit longer fall times than rise times, as the CMOS driver has an active rise and a passive fall due to the current-steering approach, which reduces the bandwidth. Nonetheless, the DAC does not significantly degrade the frequency response. For example, a −4 dB level is obtained at 52 MHz and 50 MHz for the AWG and the DAC, respectively (∼4% difference). Note that the frequency response beyond 50 MHz is not reliable for the DAC due to its low sampling rate. Based on this result, the average current per LED was set to ∼45 mA for the results presented in the following sections.
B. System FOV
The overall FOV of the system (transmitter with the transmitter optics and receiver with the receiver optics) was evaluated by driving a single µLED and measuring the output signals from all APDs. The transmitter position was fixed while the receiver was displaced horizontally (perpendicular to the direction of propagation of light) in steps of 5 mm. The total received power was calculated by summing the APD outputs. The normalized total received power (i.e. DC channel gain) versus the horizontal displacement is shown in Fig. 8. The coverage diameter at a distance of 1 m is ∼60 mm, which corresponds to an overall system FOV of 3.45 degrees, limited by the number of available APDs. With a larger APD array, the FOV can be increased; e.g. with an array of 5 × 5 APDs, the FOV is estimated to be ∼7 degrees. The maximum achievable data rates using the maximal-ratio combining (MRC) and equal gain combining (EGC) methods are also shown in the figure for a 4-PAM scheme with DFE. As expected, MRC offers a higher data rate than EGC. The maximum data rates achieved using EGC and MRC are 440 Mbps and ∼500 Mbps, respectively.
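Two of the statements above can be checked with a short sketch: the geometric relation between coverage diameter and full FOV, and the expectation that MRC does at least as well as EGC. The combining formulas are the textbook expressions for branches with equal, independent noise, which is an assumption here; the branch SNR values are made up for illustration and are not measurements from this system.

```python
import math


def full_fov_deg(coverage_diameter_mm, distance_mm):
    """Full angle subtended by the coverage diameter at the link distance."""
    return 2 * math.degrees(math.atan(coverage_diameter_mm / 2 / distance_mm))


def mrc_snr(branch_snrs):
    """Maximal-ratio combining: output SNR is the sum of the branch SNRs."""
    return sum(branch_snrs)


def egc_snr(branch_snrs):
    """Equal gain combining: amplitudes add, noise power adds over N branches."""
    n = len(branch_snrs)
    return sum(math.sqrt(g) for g in branch_snrs) ** 2 / n


fov = full_fov_deg(60.0, 1000.0)      # about 3.44 degrees for 60 mm at 1 m
snrs = [9.0, 1.0, 0.25, 0.0]          # hypothetical linear SNR per APD
mrc, egc = mrc_snr(snrs), egc_snr(snrs)
```

For 60 mm at 1 m the geometry gives roughly 3.44 degrees, close to the quoted 3.45 degrees. With unequal branch SNRs, EGC falls below MRC because weakly illuminated APDs contribute noise at full weight; the two coincide only when all branches are equal.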
C. 4-channel MIMO Performance
1) DAC Drivers: To evaluate the performance of the fully integrated MIMO system, four µLEDs were driven by two DAC drivers in MIMO mode with independent data streams. The signals from all the APDs were captured and further processing (filtering, downsampling and equalization) was done offline. The aggregate data rate against the displacement of the MIMO system with the 4-PAM modulation scheme is shown in Fig. 9. As it was not feasible to obtain an aggregate data rate beyond 750 Mbps for 2-PAM due to FPGA limitations, 2-PAM results are not presented here. The minimum and maximum data rates achieved were ∼1.23 and ∼1.3 Gbps, which correspond to net rates of ∼1.14 and ∼1.21 Gbps, respectively, after removing the 7% forward error correction (FEC) overhead. The data rates with the DAC drivers were limited by the available sampling rate and lower bandwidth.
Fig. 9. The aggregate data rate vs the displacement of the integrated MIMO system with the DAC driver for 4-PAM.
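The net rates quoted above follow from scaling the gross rate by (1 − 0.07). The sketch below reproduces that arithmetic and also estimates the FPGA-imposed ceiling for 4-PAM. The per-channel symbol rate of 187.5 MBd is an inference from the 750 Mbps 2-PAM limit over four channels, not a figure stated in the text, and the function names are illustrative.

```python
import math


def net_rate(gross_bps, fec_overhead=0.07):
    """Net data rate after removing the FEC overhead fraction."""
    return gross_bps * (1 - fec_overhead)


def aggregate_ceiling(channels, symbol_rate_baud, pam_levels):
    """Upper bound on the aggregate rate: channels x symbol rate x bits/symbol."""
    return channels * symbol_rate_baud * math.log2(pam_levels)


net_min = net_rate(1.23e9)            # about 1.14 Gbps
net_max = net_rate(1.30e9)            # about 1.21 Gbps

# Assumed per-channel symbol rate, inferred from 750 Mbps / (4 channels x 1 bit).
symbol_rate = 750e6 / 4
ceiling_2pam = aggregate_ceiling(4, symbol_rate, 2)   # 750 Mbps
ceiling_4pam = aggregate_ceiling(4, symbol_rate, 4)   # 1.5 Gbps
```

Under this assumption the 4-PAM ceiling is 1.5 Gbps, so the measured maximum of ∼1.3 Gbps sits below the sampling-rate limit, with the remaining gap attributable to bandwidth and SNR.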
2) Waveform Generators: As outlined in [26], the optimum PAM level with a DFE depends on the ratio of the bandwidth to the data rate, the available signal-to-noise ratio (SNR) and the dynamic range. Hence, it is necessary to establish the optimum PAM level on a case-by-case basis, as a generalized conclusion is difficult to draw. In order to determine the optimum PAM level, the BER performance of different PAM levels was evaluated under identical operating conditions. Fig. 10 illustrates the BER against data rate for 2-, 4- and 8-PAM with DFE, evaluated for a single channel (without any cross-talk) at a 1 m link distance. This clearly demonstrates that 4-PAM offers the best performance, closely followed by 2-PAM. 8-PAM showed significantly worse BER performance.
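The trade-off behind the choice of PAM level can be sketched numerically: more levels lower the symbol rate (easing the bandwidth limit) but shrink the spacing between levels (raising the required SNR). The sketch uses the standard approximation that, for a fixed peak-to-peak swing, the M-PAM level spacing is 1/(M − 1) of the 2-PAM spacing, i.e. an electrical SNR penalty of 20·log10(M − 1) dB. It ignores equalizer effects, so it illustrates the trend only and is not the analysis of [26].

```python
import math


def pam_tradeoff(bit_rate_bps, levels):
    """Return (symbol rate in baud, SNR penalty in dB relative to 2-PAM).

    Assumption: fixed peak-to-peak swing, so level spacing scales as 1/(M-1).
    """
    bits_per_symbol = math.log2(levels)
    symbol_rate = bit_rate_bps / bits_per_symbol
    penalty_db = 20 * math.log10(levels - 1)
    return symbol_rate, penalty_db


# Example: a 500 Mbps channel carried with 2-, 4- and 8-PAM.
table = {m: pam_tradeoff(500e6, m) for m in (2, 4, 8)}
```

For 500 Mbps this gives 500, 250 and about 167 MBd with penalties of 0, about 9.5 and about 16.9 dB, which shows why the best level depends on how much SNR margin the link has.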
The aggregate data rates against the displacement of the four-channel MIMO-VLC system with the 2-PAM and 4-PAM modulation schemes for the experiments with AWGs are shown in Fig. 11(a). The minimum and maximum data rates achieved within the coverage area using 4-PAM are 1.35 and 1.96 Gbps (i.e. 1.25 and 1.8 Gbps, respectively, after removing the 7% FEC overhead). This corresponds to a maximum data rate of ∼500 Mbps per channel. Note that higher data rates (2 Gbps using pre-equalised 4-PAM [27] and 3 Gbps using OFDM [28]) were achieved using µLEDs. However, those demonstrations used significantly larger area PDs and a bigger optical aperture, and hence had a significantly higher SNR with a limited FOV.
As shown in Fig. 11, there is a variation in the aggregate data rates within the FOV. The variation in data rates is due to: a) variation in inter-channel interference (ICI) and b) the fill factor of the APD array. It is feasible to design a MIMO optical system with negligible cross-talk for a point-to-point link with fixed transceiver positions. However, when the transceiver positions are not fixed, the ICI (i.e. the condition number of the H-matrix and the system capacity) depends on the relative position and orientation of the transceiver. Secondly, the APD array has a fill factor of ∼80%. As the orientation of the transceiver changes, a significant proportion of the optical intensity may not be detected, which reduces the overall received power. For example, the overall channel gain (i.e. the sum of all the elements in the H-matrix) at a displacement of 15 mm is 0.16, which is ∼11% less than at 5 mm, where the channel gain is ∼0.18. However, the H-matrix condition numbers are 2.7 and 2.2 (∼18% variation), which significantly reduces the achievable data rate at 15 mm. On the other hand, the channel gain at 35 mm is 0.12 and a significantly lower data rate is expected. The optical cross-talk at different displacements is illustrated in Fig. 11(b) and (c). Fig. 11(b) shows eye diagrams of the received signal and of the signal after applying a zero-forcing (ZF) equalizer at 5 mm, where the cross-talk is negligible. Hence, a clear eye opening can be observed. On the other hand, Fig. 11(c) demonstrates that signal recovery is not possible without an equalizer, as the eye is completely closed due to ICI.
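To make the role of the H-matrix concrete, the sketch below computes the condition number and applies a zero-forcing equalizer for a 2 × 2 channel. It is a deliberately reduced illustration: the actual system is 4 × 4 or 9 × 9 and uses a joint spatial and temporal DFE, and the matrix entries here are invented rather than measured.

```python
import math


def cond_2x2(h):
    """Condition number (ratio of singular values) of a 2x2 real matrix."""
    (a, b), (c, d) = h
    s = a * a + b * b + c * c + d * d          # sum of squared singular values
    det = a * d - b * c                        # product of singular values
    root = math.sqrt(max(s * s - 4 * det * det, 0.0))
    s_max = math.sqrt((s + root) / 2)
    s_min = math.sqrt(max((s - root) / 2, 0.0))
    return math.inf if s_min == 0 else s_max / s_min


def zero_forcing_2x2(h, y):
    """Recover x from y = H x by applying the inverse of H (noise ignored)."""
    (a, b), (c, d) = h
    det = a * d - b * c
    if det == 0:
        raise ValueError("H is singular: the channels cannot be separated")
    return ((d * y[0] - b * y[1]) / det, (-c * y[0] + a * y[1]) / det)


h = ((1.0, 0.5), (0.5, 1.0))     # hypothetical channel with 50% cross-talk
x = (1.0, -1.0)                  # transmitted symbols
y = (h[0][0] * x[0] + h[0][1] * x[1], h[1][0] * x[0] + h[1][1] * x[1])
x_hat = zero_forcing_2x2(h, y)
kappa = cond_2x2(h)
```

ZF inverts the channel exactly in the noise-free case, but the inversion amplifies noise more strongly as the condition number grows, which is why a higher condition number lowers the achievable data rate even when the total channel gain changes only slightly.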
D. Nine Channel MIMO With AWGs
In order to demonstrate the feasibility of a higher order, high-speed, spatially dense MIMO system, a nine-channel MIMO-VLC system was studied. Nine is the maximum number of channels that the current set-up accommodates, due to the limited number of available APDs. However, it is feasible to scale the system when a receiver with a higher number of APDs becomes available. To evaluate the performance of the 9 × 9 MIMO system, one µLED from each of the nine clusters (see Fig. 2(a)) was driven by AWGs. The link distance was reduced to 0.5 m so that the imaging MIMO condition given by (1) was satisfied.
As in the case of the 4-channel system, 2-, 4- and 8-PAM with DFE were evaluated to determine the best performance. 8-PAM offered the best performance, followed by 4-PAM, because halving the link distance (to 0.5 m) increases the SNR by at least 6 dB (see [26] for a theoretical comparison of PAM under different ISI and SNR conditions).
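The size of the SNR improvement can be estimated with a simple inverse-square sketch. It assumes that the received optical power scales as 1/d² (a receiver aperture smaller than the beam footprint) and that the receiver noise does not depend on the signal; both are simplifications, and the actual gain depends on the optics and on APD noise.

```python
import math


def distance_gain_db(d_old_m, d_new_m):
    """Return (optical power gain, electrical SNR gain) in dB.

    Assumptions: received optical power ~ 1/d^2, signal-independent noise,
    so the electrical SNR scales with the square of the optical power.
    """
    optical_db = 10 * math.log10((d_old_m / d_new_m) ** 2)
    electrical_db = 2 * optical_db
    return optical_db, electrical_db


optical_db, electrical_db = distance_gain_db(1.0, 0.5)
```

Halving the distance gives about 6 dB more optical power and, under the stated noise assumption, about 12 dB more electrical SNR, consistent with the "at least 6 dB" statement above.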
The BER against the data rate for the nine-channel MIMO system with 8-PAM is given in Fig. 12. The BER for a channel was measured when all µLEDs corresponding to its neighboring channels were active, so that the channel experiences the maximum possible inter-channel interference. For example, to estimate the data rate for the µLED in the top-left corner (Fig. 2(a)), all µLEDs except the one in the bottom-right corner were active. The signals from the APDs were then captured and a joint MIMO decoding algorithm was applied. Each channel can support data rates up to 890 Mbps. The aggregate data rate for the nine channels is 7.48 Gbps, which is ∼6.95 Gbps after removing the 7% FEC overhead. Note that the FOV of the nine-channel MIMO system is significantly smaller (theoretically less than 8 mrad) due to the limited number of available APDs.
V. CONCLUSIONS AND FUTURE WORK
In this work, we have developed and demonstrated a complete integrated MIMO-VLC system that can support complex modulation schemes such as PAM and. Using two-dimensional arrays of µLEDs and APDs, combined with an imaging optical system specifically designed for this purpose, an integrated 4-channel MIMO system was demonstrated with aggregate data rates beyond 1 Gbps. Furthermore, we also demonstrated a four-channel system with a minimum aggregate data rate of 1.35 Gbps and an FOV of 3.45 degrees at 1 m. This paper also shows the feasibility of a high-speed, spatially dense nine-channel MIMO system that achieved data rates beyond 7 Gbps at a link distance of 0.5 m.
The FOV and range of the current system are limited by the number of available APDs, as well as by the source array size. Work to use WDM, improved receiver optics, more efficient sources, and novel receiver designs is underway. This will focus on achieving practical fields of view and coverage areas, whilst maintaining high data rates.
