Abstract-In this paper, we present low-power reconfigurable adaptive equalizers derived via dynamic algorithm transforms (DAT's). The principle behind DAT is that conventional signal processing systems are designed for the worst case and are therefore not energy-optimum on average. Significant energy savings can be achieved by optimally reconfiguring the hardware whenever the input is more benign than the worst case. Practical reconfiguration strategies for adaptive filters are presented. These strategies are derived as the solution to an optimization problem that has energy as the objective function and a constraint on the algorithm performance (specifically, the SNR). The DAT-based adaptive filter is employed as an equalizer for a 51.84 Mb/s very high-speed digital subscriber loop (VDSL) over 24-pair BKMA cable. The channel nonstationarities are due to variations in the cable length and in the number of far-end crosstalk (FEXT) interferers. For this application, the traditional design is based on a 1 kft cable length and 11 FEXT interferers. It was found that up to 81% energy savings can be achieved when the cable length varies from 1 to 0.1 kft and the number of FEXT interferers varies from 11 to 4. On average, 53% energy savings are achieved as compared with the conventional worst-case design.
I. INTRODUCTION
THE POWER dissipation of CMOS circuits [1], [2] is a grave concern in the VLSI industry. This concern is mainly driven by the limited battery life in mobile applications and by reliability and packaging costs in both mobile and tethered applications. Several low-power techniques [1] have been proposed for general VLSI as well as DSP-specific systems. General low-power techniques include logic minimization [3], [4] and precomputation [5] (at the logic level), reduced voltage swing [6] and adiabatic logic [7] (at the circuit level), and CMOS scaling [8] (at the technological level). The low-power techniques specific to DSP systems include strength reduction [9]-[11] and DECOR [12] (at the algorithmic level) and pipelining [13], [14] and parallel processing [14] (at the architecture level). Moreover, algorithm transformation techniques [15] such as look-ahead [14], relaxed look-ahead [16], algebraic transformations [17], and retiming [18] have been employed in high-speed and, more recently, low-power DSP system design. All of the above-mentioned techniques are applied during the VLSI design phase, and their implementation is time invariant. Therefore, we refer to this class of low-power methods as static techniques.
Recently, dynamic techniques at both the circuit level and the algorithmic level have been proposed. Dynamic techniques can be applied after static techniques to obtain even greater energy savings. These techniques are based on the principle that the input is usually nonstationary, and hence, it is better (from an energy perspective) to adapt the algorithm and architecture to the input. Such systems are referred to as reconfigurable signal processing systems [19]-[29]. In [19], reconfigurability is employed to map a wide class of signal processing algorithms to an appropriate architectural template. Field programmable gate array (FPGA)-based devices and their reconfiguration schemes are discussed in [20]-[22]. Hybrid architectures based on FPGA's and general-purpose DSP's are the topic of research in [23] and [24]. Other dynamic techniques include approximate signal processing [25], [26], where just the right amount of computational resources needed at a specific instant/period to meet the algorithm performance requirements is allocated. In addition to dynamic techniques at the circuit and architecture levels, techniques at the algorithmic level [27]-[29] are also being developed. The key goal of these techniques is to improve the algorithm performance (such as convergence rate [27], data rate [28], and image distortion [29]) by exploiting variabilities in the data and channel. Thus, there are several emerging dynamic techniques at the circuit, architecture, and algorithmic levels.
Manuscript received May 21, 1998; revised January 5, 1999. This work was supported by DARPA under Contract DABT63-97-C-0025 and National Science Foundation CAREER Award MIP-9623737. The associate editor coordinating the review of this paper and approving it for publication was Dr. Konstantinos Konstantinides.
The authors are with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA.
Publisher Item Identifier S 1053-587X(99)07661-8.
Our approach to the design of reconfigurable DSP systems is to add just the right degree of flexibility (as demanded by the application) to ASIC's, resulting in application-specific reconfigurable IC's (ASRIC's). The ASRIC approach is suitable for mobile multimedia systems of the future as it maintains the energy and throughput efficiency of ASIC's. Recently, we have proposed the dynamic algorithm transform (DAT) [30], [31] (see Fig. 1) for the design of low-power ASRIC's. From an implementation perspective, a DAT-based reconfigurable DSP system has the signal processing algorithm (SPA) implemented in reconfigurable hardware (such as FPGA's, certain DSP's, or ASIC's), whereas the input state and state transitions are monitored by a signal monitoring algorithm (SMA) block (or a controller). In [30], DAT techniques are studied in the context of a system-identification scenario with application to a near-end crosstalk (NEXT) canceller for 155.52 Mb/s ATM-LAN. A general framework for DAT is presented in [31], where the variabilities in the input are modeled as transitions in an input state-space, and the reconfigurations are modeled as transitions in the configuration space of the reconfigurable hardware fabric. In [30] and [31], DAT was employed to derive energy-optimal reconfiguration strategies for adaptive equalizers. These strategies were employed in the context of a 155.52 Mb/s ATM-LAN transceiver, and significant energy savings were demonstrated. In this paper, we provide proofs of energy optimality of the reconfiguration strategies in [31] and study their application in a 51.84 Mb/s very high-speed digital subscriber loop (VDSL) [32], [34].
Preliminary results of this work have appeared in [35], where we extended the strategy in [30] to include the equalization scenario along with precise energy models. Related works include a nonuniformly spaced equalizer [36], where a technique for choosing the best taps (in terms of MSE) out of the total number of taps is presented. This technique, however, is very complex and is not suitable for real-time implementation. In contrast, our reconfiguration strategy is much simpler and is employed for real-time reconfiguration. In [37], a 128-tap adaptive equalizer was proposed in which the tap length and precision are varied to maintain a certain SNR. Our approach is systematic, as we employ the Lagrange multiplier method [38] to find the optimum set of powered-up taps, which are not necessarily the end taps. Further, we consider a fractionally spaced linear equalizer (FSLE) and a complex-valued strength-reduced [38] feedback equalizer.
The rest of the paper is organized as follows. In Section II, we present some preliminaries related to the adaptive filter and multiplier energy models. The dynamic algorithm transforms are discussed in Section III, and an energy-optimum reconfiguration strategy for adaptive filters is presented in Section IV. Finally, in Section V, we employ the DAT-based equalizer in a 51.84 Mb/s VDSL transceiver and present simulation results.
II. PRELIMINARIES
In this section, we present some preliminaries regarding the reconfigurable architecture of the adaptive filter and energy models for the multipliers. Later sections will employ these energy models to determine the energy-optimum configuration for the adaptive filter architecture.
A. Reconfigurable Adaptive Filter
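Let $x(n)$ be the input signal to an $N$-tap adaptive filter, and let $w_k(n)$, $k = 0, 1, \ldots, N-1$, be the real-valued filter coefficients. The least mean square (LMS) algorithm [39] is then given by

$$e(n) = d(n) - \sum_{k=0}^{N-1} w_k(n-1)\, x(n-k) \qquad (2.1)$$

$$w_k(n) = w_k(n-1) + \mu\, e(n)\, x(n-k) \qquad (2.2)$$

where $e(n)$ and $d(n)$ are the output error and the desired output, respectively, and $\mu$ is the step size. If the correlation sequence $r(i)$ of the input signal is known, then the mean squared error (MSE) is computed as [40]

$$J = \sigma_d^2 - \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} w_{i,\mathrm{opt}}\, w_{j,\mathrm{opt}}\, r(i-j) \qquad (2.3)$$

where $\sigma_d^2$ is the desired signal power, and $w_{k,\mathrm{opt}}$ are the optimum filter coefficients.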
A direct implementation of the LMS algorithm is shown in Fig. 2(a), where each tap consists of two multipliers and two adders: the filter (F) block implements (2.1), and the weight-update (WUD) block implements (2.2). The architecture in Fig. 2(a) can be modified to obtain the reconfigurable architecture in Fig. 2(b), where we have introduced additional control signals $\alpha_k$ (F-block) and $\beta_k$ (WUD-block), which are employed to power up/down the $k$th tap. For example, setting $\alpha_k = 0$ forces a zero at the input to the F-block multiplier of the $k$th tap and bypasses the F-block adder, thereby powering down the $k$th tap in the F-block. Similarly, $\beta_k = 0$ powers down the $k$th tap in the WUD-block.
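The tap-level power-down described above can be illustrated in software. The following is a minimal behavioral sketch, not the hardware architecture of Fig. 2(b): the function name `masked_lms` and the mask arguments `f_enable` and `wud_enable` are hypothetical stand-ins for the F-block and WUD-block control signals, and the three-tap channel in the demo is made up for illustration.

```python
import random


def masked_lms(x, d, num_taps, mu, f_enable=None, wud_enable=None):
    """LMS adaptive filter with per-tap power-down masks (behavioral model).

    f_enable[k] = 0 removes tap k from the filter (F) output;
    wud_enable[k] = 0 freezes the weight update (WUD) of tap k.
    Both mask names are assumptions standing in for the control signals.
    """
    f_enable = [1] * num_taps if f_enable is None else list(f_enable)
    wud_enable = [1] * num_taps if wud_enable is None else list(wud_enable)
    w = [0.0] * num_taps
    delay_line = [0.0] * num_taps  # delay_line[k] holds x(n - k)
    errors = []
    for x_n, d_n in zip(x, d):
        delay_line = [x_n] + delay_line[:-1]
        # F block: a powered-down tap contributes nothing to the output.
        y = sum(a * wk * xk for a, wk, xk in zip(f_enable, w, delay_line))
        e = d_n - y
        # WUD block: a powered-down tap keeps its previous coefficient.
        w = [wk + mu * e * b * xk
             for wk, b, xk in zip(w, wud_enable, delay_line)]
        errors.append(e)
    return w, errors


if __name__ == "__main__":
    rng = random.Random(0)
    h = [0.5, -0.3, 0.1]  # made-up channel, for illustration only
    x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    d = [sum(h[k] * x[n - k] for k in range(3) if n >= k)
         for n in range(len(x))]
    w, _ = masked_lms(x, d, num_taps=3, mu=0.05)
    print([round(wk, 4) for wk in w])
```

In this model, a tap whose two masks are both zero neither contributes to the output nor changes its coefficient, which mirrors the power-down behavior at the functional level only; switching activity and energy are not modeled.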
Additional energy savings can be achieved by adapting the precisions of the input signal ($B_x$ bits) and the coefficients ($B_c$ bits). The input precision $B_x$ is chosen to achieve a specified signal-to-quantization noise ratio (SQNR). It can be shown that the SQNR at the input is given by

$$\mathrm{SQNR}_{\mathrm{dB}} = 6.02\, B_x + 4.77 - \mathrm{PAR}_{\mathrm{dB}} \qquad (2.4)$$

where PAR is the peak-to-average ratio at the input and is computed by dividing the maximum value of the input signal by its root-mean-square (RMS) value. It can be seen from (2.4) that a 6-dB reduction in PAR results in a 1-bit reduction in $B_x$. It can be shown [41] that, to achieve a specified SQNR at the output, the coefficient precision $B_c$ is given by (2.5), which depends on the filter length $N$ and on the maximum coefficient magnitude $\max_k |w_k|$ (for $k = 0, 1, \ldots, N-1$). Thus, (2.5) indicates that a 1-bit reduction in the precision is achieved for each four-fold reduction in filter length. In Section IV, we will present reconfiguration strategies to choose the tap control signals and the precisions in an energy-optimum manner. In the next subsection, we present a multiplier energy model that will be employed to derive the reconfiguration strategies presented in Section IV.
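The link between PAR and input precision can be checked with a few lines of arithmetic. The sketch below assumes the standard uniform-quantizer relation (about 6.02 dB per bit, plus a 4.77 dB full-scale term, minus the PAR in decibels); the function names and the 30-dB target are illustrative assumptions, not values from this work.

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)      # about 6.02 dB per bit
FULL_SCALE_DB = 10.0 * math.log10(3.0)   # about 4.77 dB


def sqnr_db(num_bits, par_db):
    """SQNR of a uniform quantizer with num_bits bits (assumed model)."""
    return DB_PER_BIT * num_bits + FULL_SCALE_DB - par_db


def input_bits_needed(target_sqnr_db, par_db):
    """Smallest integer word length that meets the target SQNR."""
    return math.ceil((target_sqnr_db + par_db - FULL_SCALE_DB) / DB_PER_BIT)


if __name__ == "__main__":
    # Illustrative target only: 30 dB SQNR at two different PAR values.
    for par in (12.0, 6.0):
        print(par, input_bits_needed(30.0, par))
```

Under this model, lowering the PAR from 12 to 6 dB reduces the required word length by one bit, which is the behavior the text attributes to (2.4).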
B. Multiplier Energy Model
We focus on energy models for the multipliers as these consume a large percentage of the total energy. There is a significant on-going effort [42] , [43] in the computer-aided design (CAD) community to find accurate estimates of the energy dissipation. Our interest here is to determine accurate and simple relative power dissipation models that can be employed by the SMA block to determine an energy-optimum configuration in real time. It is well known that energy dissipation is a function of the input statistics in CMOS circuits.
For a direct-form FIR filter, the input into the $k$th-tap multiplier is a delayed copy of $x(n)$. Thus, the statistics of the data input are the same for all taps. Therefore, we present an energy dissipation model of a multiplier that is a function of the coefficient input only. We will assume that a $B_x$-bit signal is being multiplied by a $B_c$-bit constant coefficient $w$. The constant coefficient assumption is valid for adaptive filters if we assume that the WUD block is powered down after convergence. Assuming a two's-complement representation, we have

$$w = -b_0 + \sum_{i=1}^{B_c-1} b_i\, 2^{-i} \qquad (2.6)$$

where $b_i$ is the value of the $i$th bit of coefficient $w$. In [35], we define a first measure as the number of nonzero bits in the coefficient given in (2.6). Similarly, a second measure is defined as the difference of $B_c$ and the number of zeros at the least significant bit (LSB) positions and is given as (2.7).
It was found via real-delay gate-level simulations [44] that the linear energy models based on the first and the second measure underestimate and overestimate the energy consumption of the multiplier, respectively. Therefore, a better energy model can be obtained by taking a weighted sum of the two measures, as in (2.8).
The weighting constant was chosen to equal 0.9 in order to minimize the error between the model and the real-delay energy values obtained via the gate-level simulation tool MED [44]. Standard cells based on a 0.18 μm, 2.5 V technology are assumed for the real-delay simulations. It was found that the model in (2.8) is accurate to within 9% estimation error, as compared with a real-delay gate-level simulation. Note that models based on closed-form expressions such as (2.8) are useful in computing the energy-optimum configurations.
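The two coefficient measures and their weighted sum can be sketched as follows. The exact form of (2.8) is not reproduced here, so this is an assumption-laden illustration: the helper names are hypothetical, the result is in relative units, and applying the weight 0.9 to the nonzero-bit count (rather than to the other measure) is an assumption.

```python
def to_twos_complement_bits(value, num_bits):
    """Quantize value in [-1, 1) to num_bits; return bits, sign bit first."""
    half = 1 << (num_bits - 1)
    scaled = max(-half, min(half - 1, int(round(value * half))))
    code = scaled & ((1 << num_bits) - 1)
    return [(code >> (num_bits - 1 - i)) & 1 for i in range(num_bits)]


def nonzero_bits(bits):
    """First measure: number of nonzero bits in the coefficient."""
    return sum(bits)


def effective_length(bits):
    """Second measure: word length minus the zeros at the LSB positions."""
    trailing = 0
    for b in reversed(bits):
        if b:
            break
        trailing += 1
    return len(bits) - trailing


def relative_multiplier_energy(value, num_bits, gamma=0.9):
    """Weighted sum of the two measures (assumed form, relative units)."""
    bits = to_twos_complement_bits(value, num_bits)
    return gamma * nonzero_bits(bits) + (1.0 - gamma) * effective_length(bits)


if __name__ == "__main__":
    for c in (0.5, -0.5, 0.3, 0.0):
        print(c, to_twos_complement_bits(c, 8), relative_multiplier_energy(c, 8))
```

A closed-form proxy of this kind is cheap enough to evaluate per tap at run time, which is the property the text asks of the model used by the SMA block.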
III. DYNAMIC ALGORITHM TRANSFORMS (DAT)
In this section, we present the general framework for DAT's. The motivation for DAT is that the conventional signal processing system designed for the worst case is usually not optimum (from energy perspective) for the best and the nominal cases. Hence, significant energy efficiencies can be gained by having a signal monitoring algorithm or the SMA block (see Fig. 1 ) that monitors the input state and then reconfigures the SPA block to match the input. This naturally leads to the definition of input state and configuration, which are presented in Sections III-A and III-B, respectively. In Section III-C, we show how energy savings can be calculated.
A. Input State Space
We employ the input state space to distinguish between different scenarios at the input. In general, the input state space depends on the input nonstationarity and the hardware granularity. For example, the input state-space needs to have more states for a hardware platform with fine granularity of reconfiguration as compared with the one with coarse granularity. The input state is formally defined as follows.
Definition 1: The input state $s(n) \in \mathcal{S}$ (where $\mathcal{S}$ is the input state-space) at time instant $n$ is a vector of input-dependent parameters, where $s(n) = s_i$ with a probability $p_i$.
For example, assume that a 51.84 Mb/s VDSL network has 100 connections, out of which 80% are approximately 0.6 kft from the transmitter, and the remaining connections are distributed equally between 0.1 and 1 kft. Assume further that the equalizer complexities for these three cable lengths are sufficiently different to warrant reconfiguration. In that case, we can define the input state space $\mathcal{S} = \{s_1, s_2, s_3\}$ with probabilities $p_1 = 0.1$, $p_2 = 0.8$, and $p_3 = 0.1$, where $s_1$, $s_2$, and $s_3$ are the input states corresponding to cable lengths of 0.1, 0.6, and 1 kft, respectively. For this example, $s_3$ corresponds to the worst case, whereas $s_1$ and $s_2$ represent the best and the nominal cases. Proceeding further, we can define each state as $s_i = [E_x, \mathrm{PAR}, \mathrm{SNR}_{\mathrm{in}}]$, where $E_x$, PAR, and $\mathrm{SNR}_{\mathrm{in}}$ are the input signal energy, input peak-to-average ratio (PAR), and input signal-to-noise ratio, respectively. Thus, an input state-space with three states, where each state is a three-element vector, will suffice.
B. Configuration Space
A reconfigurable hardware fabric is characterized by its configuration vector as defined below.
Definition 2: The configuration vector (where is the configuration-space) at time instant is defined as a vector of reconfiguration control signals. Each configuration vector corresponds to a particular value of the control signals.
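For example, for the $N$-tap reconfigurable adaptive filter in Fig. 2(b), the configuration vector consists of the tap control signals of the F-block and the WUD-block together with the input and coefficient precisions. For a given input state $s$, the optimum configuration $c_{\mathrm{opt}}(s)$ is defined as the solution to

$$c_{\mathrm{opt}}(s) = \arg\min_{c \in \mathcal{C}} E(c) \quad \text{s.t.} \quad J(s, c) \le J_o \qquad (3.1)$$

where
$E(c)$: energy dissipated by the SPA block in configuration $c$;
$J_o$: specified MSE;
$J(s, c)$: MSE achieved by the SPA block when the input is in state $s$ and the SPA block is in configuration $c$.
The optimum SPA configuration is illustrated in Fig. 3. For each state $s_i$, we need to compute the optimum configuration such that the energy dissipation is minimized while satisfying the constraint on the algorithm performance.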
C. Energy Savings
The average energy savings of a DAT-based system as compared with the worst-case design is given as

$$E_{\mathrm{sav}} = \frac{E_{\mathrm{wc}} - E_{\mathrm{DAT}}}{E_{\mathrm{wc}}} \times 100\% \qquad (3.2)$$

where $E_{\mathrm{DAT}}$ and $E_{\mathrm{wc}}$ are the average energy dissipation of the DAT-based system and the worst-case design, respectively. Large energy savings can be expected for situations where $E_{\mathrm{DAT}} \ll E_{\mathrm{wc}}$. Such situations arise if the energy dissipation requirements for the worst and the nominal cases are considerably different and the probability of occurrence of the worst-case input is sufficiently small. Note that $E_{\mathrm{DAT}}$ includes the energy of the SPA datapath and that of the SMA controller. The SPA energy consumption can be computed by averaging over all the states. Most of the SMA block is activated after $L$ samples only if there is a transition in the state. Therefore, the energy dissipation of the SMA block will be negligible for $L$ sufficiently large. However, we do include a fixed constant value for the SMA energy consumption to reflect the fact that the state monitor in the SMA is always active. This was found to be 2% of the worst-case energy consumption for the VDSL application. In the next section, we derive energy-optimum reconfiguration strategies for adaptive filters as a solution to (3.1).
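The averaging described above can be sketched as a short calculation. The sketch assumes that the savings are the relative reduction with respect to the worst-case energy and that the DAT energy is the probability-weighted SPA energy plus a fixed SMA term; the function names and all numbers in the demo are illustrative assumptions, not results from this work.

```python
def average_dat_energy(state_probs, state_energies, sma_energy):
    """Probability-weighted SPA energy plus a fixed SMA overhead."""
    spa = sum(p * e for p, e in zip(state_probs, state_energies))
    return spa + sma_energy


def energy_savings_percent(dat_energy, worst_case_energy):
    """Relative savings of the DAT-based design over the worst-case design."""
    return 100.0 * (worst_case_energy - dat_energy) / worst_case_energy


if __name__ == "__main__":
    # Made-up three-state example: best, nominal, and worst case.
    probs = [0.1, 0.8, 0.1]
    energies = [1.0, 2.0, 4.0]      # per-state SPA energy, arbitrary units
    worst_case = 4.0                # fixed worst-case design
    sma = 0.02 * worst_case         # 2% monitoring overhead, as in the text
    e_dat = average_dat_energy(probs, energies, sma)
    print(e_dat, energy_savings_percent(e_dat, worst_case))
```

The example makes the qualitative point of the text concrete: savings are large when the worst-case state is both expensive and improbable.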
IV. RECONFIGURATION STRATEGY FOR ADAPTIVE FILTERS
In this section, we employ the Lagrange multiplier method to derive an energy-optimum reconfiguration strategy for the adaptive filter architecture in Fig. 2(b). The reconfigurable parameters in this architecture are the control signals $\alpha_k$ (F-block) and $\beta_k$ (WUD-block), which indicate the powered-up taps, and the input and coefficient precisions.
The choice of the precisions was presented in Section II-A. Once the optimum values of the $\alpha_k$'s are obtained, the coefficient precision can be computed via (2.5), and the input precision can be computed from (2.4). In this section, we present strategies for determining energy-optimal values of the $\alpha_k$'s and $\beta_k$'s. In Section IV-A, we formulate the energy optimization problem, and in Section IV-B, we derive the reconfiguration strategy.
A. Lagrange Formulation
The optimization problem in (3.1) can be rewritten as

$$\min_{c} E(c) \quad \text{s.t.} \quad J(c) \le J_o \qquad (4.1)$$

where we have dropped the state to simplify notation, with the understanding that we will now compute the optimum configuration for a given state. We refer to the optimization problem in (4.1) as the primal problem. The Lagrange multiplier method [38] associates with (4.1) the Lagrangian

$$L(c, \lambda) = E(c) + \lambda\, [J(c) - J_o] \qquad (4.2)$$

where $\lambda \ge 0$ is the Lagrange multiplier. A point $(c^*, \lambda^*)$ is a saddle point of $L$ if

$$L(c^*, \lambda) \le L(c^*, \lambda^*) \le L(c, \lambda^*) \qquad (4.3)$$

for all $c$ and $\lambda$ sufficiently close to $(c^*, \lambda^*)$. The above definition implies that a saddle point is a local minimum of $L$ in the $c$-space and a local maximum in the $\lambda$-space. Thus, a natural way of finding the saddle point is to descend in the $c$-space and ascend in the $\lambda$-space. The Lagrange multipliers can also be viewed as the penalties associated with the constraints. Therefore, an ascent in the $\lambda$-space corresponds to increasing the penalties for the unsatisfied constraints. Similarly, a descent in the $c$-space corresponds to the minimization of the objective function while satisfying the constraints. Based on this understanding, we can redefine the optimization problem in (4.1) as

$$\max_{\lambda \ge 0}\, \min_{c}\, L(c, \lambda) \qquad (4.4)$$

It can be shown [38] that the saddle point of (4.2) obtained as the solution to (4.4) is also the optimum solution for the primal problem in (4.1). This is called the saddle point theorem; refer to [38] for a detailed proof.
B. Energy-Optimum Reconfiguration Strategy for Adaptive Filters
We solve (4.4) for adaptive filters under the following assumptions:
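1) The WUD block in Fig. 2(b) is switched off (i.e., $\beta_k = 0$ for all $k$) after the filter has converged.
2) The input is uncorrelated. In other words, we will assume that the correlation sequence $r(i)$ of the input signal is nonzero [and equal to $r(0)$] only if $i = 0$.
Under these two assumptions, the energy dissipation of the adaptive filter in Fig. 2(b) is given by

$$E(\boldsymbol{\alpha}) = \sum_{k=0}^{N-1} \alpha_k\, E_m(w_{k,\mathrm{opt}}) \qquad (4.5)$$

where $E_m(w_{k,\mathrm{opt}})$ is the energy dissipated by a multiplier with coefficient $w_{k,\mathrm{opt}}$, and $\boldsymbol{\alpha}$ is a vector representation of the $\alpha_k$'s. Note that we have ignored the energy dissipation of the adder in each tap. This is a reasonable assumption since multipliers are the power-hungry blocks in digital filters. We have also employed assumption 1 so that the energy consumption of the WUD block can be ignored. Substituting $\alpha_k w_{k,\mathrm{opt}}$ for $w_{k,\mathrm{opt}}$ in (2.3) and employing assumption 2 above, we obtain

$$J(\boldsymbol{\alpha}) = \sigma_d^2 - r(0) \sum_{k=0}^{N-1} \alpha_k\, w_{k,\mathrm{opt}}^2 \qquad (4.6)$$

Solving (4.4) with (4.5) and (4.6) then yields

$$\alpha_{k,\mathrm{opt}} = \begin{cases} 1, & \dfrac{w_{k,\mathrm{opt}}^2}{E_m(w_{k,\mathrm{opt}})} \ge K \\ 0, & \text{otherwise} \end{cases} \qquad (4.9)$$

where $K = 1/[\lambda_{\mathrm{opt}}\, r(0)]$ is a constant, and $\lambda_{\mathrm{opt}}$ is the optimum value of $\lambda$. In practice, we do not need to compute the constant $K$ if we employ the reconfiguration strategy of powering down the taps, starting with the smallest value of $w_{k,\mathrm{opt}}^2 / E_m(w_{k,\mathrm{opt}})$, until the MSE constraint [see (4.1)] is violated.
The reconfiguration strategy derived from (4.9) indicates that it is better to power down taps with small values of $w_{k,\mathrm{opt}}^2 / E_m(w_{k,\mathrm{opt}})$. Intuitively, this makes sense, as small values of this ratio imply that the $k$th tap contributes less to the performance measure (as $w_{k,\mathrm{opt}}$ is small) but consumes more energy [$E_m(w_{k,\mathrm{opt}})$ is large]. Note also that different multiplier models can easily be accommodated by redefining $E_m(\cdot)$. Furthermore, if $E_m(\cdot)$ is assumed to be independent of the coefficient, then the energy-optimum reconfiguration strategy would be to switch off the taps with the smallest coefficients. This is the strategy employed in [25] and [27].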
The optimum value of the WUD-block control signal $\beta_k$ is chosen as 0 if either the F-block control signal $\alpha_k = 0$ or the filter has converged. The justification for this is that we do not need to update the $k$th tap if it is not being employed in the F-block computation. In addition, if the filter has converged, then the WUD portion of all the taps can be powered down. Thus, we have presented a practical reconfiguration strategy that determines the tap control signals for an adaptive filter. In the next section, we employ this reconfiguration strategy for a 51.84 Mb/s very high-speed digital subscriber loop (VDSL).
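The tap-selection procedure of this section can be sketched as a greedy loop. The sketch assumes an uncorrelated input, so that powering down a tap raises the MSE by the input power times the squared coefficient, and it ranks taps by the ratio of squared coefficient to multiplier energy. The function name `select_taps`, its arguments, and the numbers in the demo are hypothetical.

```python
def select_taps(w_opt, tap_energy, sigma_d2, r0, mse_max):
    """Power down taps in ascending order of w^2 / energy.

    Stops just before the MSE bound mse_max would be exceeded.
    Assumes an uncorrelated input with power r0, so that
    MSE = sigma_d2 - r0 * sum(w_k^2 over the powered-up taps).
    Returns the 0/1 tap mask and the resulting MSE.
    """
    n = len(w_opt)
    enabled = [1] * n
    mse = sigma_d2 - r0 * sum(w * w for w in w_opt)

    def merit(k):
        # A tap whose multiplier costs no energy is never worth removing.
        if tap_energy[k] == 0:
            return float("inf")
        return w_opt[k] ** 2 / tap_energy[k]

    for k in sorted(range(n), key=merit):
        new_mse = mse + r0 * w_opt[k] ** 2
        if new_mse > mse_max:
            break
        enabled[k] = 0
        mse = new_mse
    return enabled, mse


if __name__ == "__main__":
    # Made-up coefficients and per-tap multiplier energies.
    mask, mse = select_taps(
        w_opt=[0.9, 0.1, 0.4, 0.05],
        tap_energy=[4.0, 1.0, 2.0, 3.0],
        sigma_d2=1.0, r0=1.0, mse_max=0.05)
    print(mask, mse)
```

If every entry of `tap_energy` is the same, the ordering reduces to smallest coefficient first, which is the special case noted in the text.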
V. APPLICATION TO 51.84 Mb/s VDSL
In this section, we employ a DAT-based adaptive equalizer for 51.84 Mb/s VDSL. First, we present an overview of the VDSL environment and the VDSL transceiver.
A. The VDSL Environment
The VDSL application assumes a fiber-to-the-curb (FTTC) [34] network architecture. In this architecture, shown in Fig. 4, the optical fiber goes to a curbside pedestal that serves a small number of homes. At the pedestal, the optical signal is converted into an electrical signal and then demultiplexed for delivery to individual homes on copper wiring. These functions are performed in an optical network unit (ONU). The ONU also performs the multiplexing and signal conversion functions required in the opposite direction, i.e., from the homes to the network. In the VDSL system considered here, the downstream channel (from the ONU to the home) operates at a data rate of 51.84 Mb/s. A receiver for this data rate is conventionally designed for a 1 kft cable length and 11 far-end crosstalk (FEXT) interferers. However, in practice, the cable length and the number of interferers may vary. A DAT-based receiver can exploit these variations to achieve energy savings.
Next, we briefly discuss channel and FEXT characteristics of a BKMA cable, which is employed for twisted pair distribution cable in Fig. 4 . The propagation loss of a BKMA cable is similar to that of a category-5 cable specified in the TIA/EIA-568A Standard [45] and is given by (5.1) where the propagation loss is expressed in decibels, the frequency is expressed in megahertz, and is the length of the cable in kilofeet. As far as FEXT is concerned, a quantity of interest is the ratio , where and are the received signal and FEXT signal, respectively. This ratio [which is also called the equal-level FEXT (EL-FEXT) loss or the input signal-to-noise ratio SNR in a FEXT dominated environment] can be written as:
where the EL-FEXT is expressed in decibels, the frequency is expressed in megahertz, is the length of the cable in kilofeet, is the maximum number of crosstalk interferers in the cable, and is the number of active crosstalk interferers. We assume 24 for this work. The FEXT impairment can be modeled as a Gaussian source because the FEXT sources are independent of each other. 
B. 51.84 Mb/s DAT-based VDSL Transceiver
In this subsection, we describe the transmitter and the receiver for 51.84 Mb/s VDSL. We will assume that the carrierless amplitude phase (CAP) [46] modulation scheme is being employed. The block diagram of a digital CAP transmitter is shown in Fig. 5(a). The bit stream to be transmitted is first passed through a scrambler. The scrambled bits are then fed into an encoder, which maps blocks of 4 bits onto one of 16 different complex symbols [see Fig. 5(b)] corresponding to the 16-CAP (4 bits/symbol) line code. The in-phase and quadrature components of each symbol are processed by digital shaping filters. The shaping filter impulse response is specified by a square-root raised cosine pulse with a center frequency of 12.96 MHz and an excess bandwidth of 38%. This requires that the shaping filters be operated at a sampling frequency that is at least twice the maximum frequency component of the transmit spectrum. We choose a sampling frequency of 51.84 MHz here. The outputs of the filters are subtracted, and the result is passed through a digital-to-analog (D/A) converter operating at 51.84 MHz.
At the receiver (see Fig. 6), the analog signal is first amplified by a programmable gain amplifier (PGA). The gain of the PGA is controlled by a digital PGA control block. The output of the PGA is passed to an A/D converter operating at 51.84 MHz, which converts the analog signal to a digital signal. The sampling instant of the A/D converter is controlled by a timing recovery block. The digital output of the A/D converter is processed by a decision feedback equalizer (DFE). The output of the DFE is passed through a 16-CAP slicer to obtain the output symbols. The DFE consists of two filters: a feedforward filter and a feedback filter. The feedforward filter is a fractionally spaced linear equalizer (FSLE), which is a pair of 48-tap adaptive filters. The feedback filter is a complex 10-tap adaptive filter operating at the symbol rate. A low-power strength-reduced architecture proposed in [11] is employed for implementing the complex adaptive filter. As the precision requirements and the number of taps in the feedback filter are much smaller than those in the feedforward filter, we apply DAT only to the feedforward filter.
The complexity of the DFE is reduced by simplifying the adaptation algorithm. The equalizer is blindly adapted by employing the reduced constellation algorithm (RCA) [47]. In this algorithm, the adaptive filter first converges to a coarse solution based on a 4-CAP constellation (in place of the 16-point constellation). After convergence with the 4-CAP constellation, the filter is adapted with the 16-CAP constellation. We also employ powers-of-two approximations [48] of the output error for the update of the coefficients. This simplification, along with powers-of-two step sizes, allows the replacement of the multipliers in the weight-update block with shifters.
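The reason a powers-of-two error lets shifters replace multipliers can be seen in a small sketch. The exact approximation rule of [48] is not specified here, so the sketch assumes one simple variant: keep the sign of the error and round its magnitude down to the nearest power of two. All function names and the `min_exp` floor are hypothetical.

```python
import math


def power_of_two_quantize(e, min_exp=-15):
    """Return (sign, exponent) such that sign * 2**exponent approximates e.

    The magnitude is rounded down to a power of two (assumed rule);
    errors smaller than 2**min_exp are treated as zero.
    """
    if e == 0.0:
        return 0, 0
    _, exp = math.frexp(abs(e))   # abs(e) = m * 2**exp with 0.5 <= m < 1
    exp -= 1                      # so 2**exp <= abs(e) < 2**(exp + 1)
    if exp < min_exp:
        return 0, 0
    return (1 if e > 0 else -1), exp


def shift_update(w, x, e, step_exp):
    """Coefficient update with step size 2**step_exp and a quantized error.

    The product mu * e * x becomes a signed shift of x by (exp + step_exp),
    modeled here with math.ldexp.
    """
    sign, exp = power_of_two_quantize(e)
    if sign == 0:
        return w
    return w + sign * math.ldexp(x, exp + step_exp)


if __name__ == "__main__":
    print(power_of_two_quantize(0.3), shift_update(0.0, 0.5, 0.3, -3))
```

Because both the step size and the quantized error are powers of two, the update term is the data sample scaled by a single power of two, which in fixed-point hardware is a shift followed by an add or subtract.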
The equalizer output is passed through the slicer, decoder, and descrambler to retrieve the 51.84 Mb/s data. The algorithmic performance measure in this case is the SNR at the slicer, which is equal to the ratio of the signal constellation power (which equals 10 for 16-CAP) to the MSE across the slicer. For 16-CAP, a slicer SNR of 21.5 dB is sufficient to meet the target probability of error.
To reduce undesirable glitching, we employ a window of 2 dB above the slicer SNR of 21.5 dB. This implies that if the slicer SNR lies between 21.5 and 23.5 dB, then no reconfiguration takes place. From [33], we obtain the parameters of the worst-case design as a 1-kft cable length and 11 FEXT interferers. In order to achieve a slicer SNR of 21.5 dB, the requirements for the feedforward filter are 48 filter taps, a coefficient precision of 10 bits, and a data precision of 8 bits. Similarly, the requirements for the feedback filter are 10 filter taps, a coefficient precision of 8 bits, and a data precision of 3 bits. We assume that the cable length can vary from 1 to 0.1 kft in steps of 0.1 kft. Similarly, the number of FEXT interferers can be 11, 7, or 4. These variations define the state space (5.3) with 30 states.
The exact probability distribution of the states requires a survey of the installation of the VDSL network. Since this information is not known at the present moment, we assume that states in (5.3) have a Gaussian distribution with 0.6 kft and 4-FEXT as the mean values (nominal case) and standard deviation of 0.2 kft and 3 FEXT.
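The 30-state space and an assumed probability assignment can be enumerated directly. The sketch below takes ten cable lengths (0.1 to 1 kft) and three interferer counts, and weights each state with a separable Gaussian centered at 0.6 kft and 4 FEXT interferers with standard deviations of 0.2 kft and 3 FEXT. Treating the distribution as separable, sampling it on the grid, and renormalizing over the 30 states are assumptions of this sketch; the function name is hypothetical.

```python
import math


def build_state_probabilities(lengths_kft, fext_counts,
                              mean=(0.6, 4.0), std=(0.2, 3.0)):
    """Discretized, renormalized Gaussian weights over (length, FEXT) states."""
    weights = {}
    for length in lengths_kft:
        for fext in fext_counts:
            z = ((length - mean[0]) / std[0]) ** 2 \
                + ((fext - mean[1]) / std[1]) ** 2
            weights[(length, fext)] = math.exp(-0.5 * z)
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


if __name__ == "__main__":
    lengths = [round(0.1 * i, 1) for i in range(1, 11)]   # 0.1 ... 1.0 kft
    probs = build_state_probabilities(lengths, [11, 7, 4])
    print(len(probs), max(probs, key=probs.get))
```

With these weights, the nominal state (0.6 kft, 4 interferers) is the most probable and the worst-case state (1 kft, 11 interferers) is comparatively rare, which is the situation in which a reconfigurable design is expected to pay off.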
The SMA block detects transitions in the input states by monitoring the slicer SNR, which varies with the cable length and the number of FEXT interferers. The optimum SPA configuration is then computed via the reconfiguration strategy presented in Section IV-B. The variation in SNR can be detected by observing accumulated error measures for the in-phase and quadrature-phase slicers. As shown in Fig. 7, these measures are computed by summing the respective slicer error terms over $L$ symbols, with $L$ chosen to be 4096 for this experiment. All the blocks up to and including the threshold comparator are always powered up. The subblocks after the threshold comparator in Fig. 7 compute the energy-optimum values of the tap control signals and the coefficient precision only when a state transition occurs.
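The monitoring step can be sketched as a block-wise SNR estimate followed by a window test. The sketch assumes that the accumulated quantities are squared in-phase and quadrature slicer errors, uses the constellation power of 10 stated for 16-CAP, and applies the 21.5 to 23.5 dB window; the function names and the returned action labels are hypothetical.

```python
import math


def slicer_snr_db(err_i, err_q, constellation_power=10.0):
    """Estimate the slicer SNR from one block of slicer errors.

    Assumes the MSE is the average of the squared in-phase plus
    squared quadrature errors over the block.
    """
    total = sum(e * e for e in err_i) + sum(e * e for e in err_q)
    mse = total / len(err_i)
    return 10.0 * math.log10(constellation_power / mse)


def reconfiguration_action(snr_db, snr_min=21.5, window=2.0):
    """Threshold comparator with a no-action window to avoid glitching."""
    if snr_db < snr_min:
        return "power_up"      # constraint violated: enable more taps
    if snr_db > snr_min + window:
        return "power_down"    # excess margin: taps can be switched off
    return "hold"


if __name__ == "__main__":
    block = 4096
    for amplitude in (0.2, 0.15, 0.1):   # made-up constant error levels
        snr = slicer_snr_db([amplitude] * block, [amplitude] * block)
        print(round(snr, 2), reconfiguration_action(snr))
```

Only the accumulation and the comparison need to run continuously; the more expensive search for a new configuration is triggered only when the comparator leaves the "hold" region.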
C. Simulation Results
In this subsection, we present simulation results for 51.84 Mb/s VDSL in terms of converged configurations, slicer SNR, and energy consumption. The SNR is computed as the ratio of the signal constellation power (which is 10 for 16-CAP) to the MSE across the 16-CAP slicer. Multiplier energy consumption is obtained via the real-delay gate-level simulator MED [44], assuming a 0.18 μm, 2.5 V CMOS technology. The energy consumption of the equalizer is then obtained by summing the energy values of the powered-up multipliers in the F block. We assume that all multipliers are powered up in the worst-case design. The energy consumption of the SMA block is due to the blocks up to the threshold comparator in Fig. 7.
In Fig. 8, we show SNR convergence curves for a DAT-based VDSL equalizer for two input states. Fig. 8(a) shows the convergence curve for the worst-case input state, corresponding to a 1-kft cable length and 11 FEXT interferers. Recall that the equalizer is adapted blindly via the reduced constellation algorithm (RCA). In the 4-CAP constellation mode, the equalizer converges to an SNR of 16 dB after 32 768 symbols, after which it is switched to the 16-CAP constellation mode. Finally, the equalizer converges to an SNR of 22.5 dB, which is within the SNR constraint window of [21.5 dB, 23.5 dB]. Now, consider the situation where the cable length is 1 kft and the number of FEXT interferers is 4. The SNR convergence curve is plotted in Fig. 8(b). The adaptive filter converges to an SNR of 24.5 dB in this case, thereby falling outside the desired SNR window [21.5 dB, 23.5 dB]. The equalizer is therefore reconfigured, as shown in Fig. 8(b). This is done by powering down taps one at a time, starting with the one with the smallest value of the metric in (4.9), until the final SNR lies between 21.5 and 23.5 dB. In the final configuration vector of the in-phase adaptive filter, a "1" indicates a tap that is powered up, and a "0" indicates a tap that is powered down. It can be seen that only 16 (out of 48) taps are powered up.
Unlike [25] and [37], taps other than those at the end are also powered down. Similarly, the final configuration vector for the quadrature-phase filter corresponds to 17 powered-up taps. In this configuration, an SNR of 22.2 dB is achieved. Overall, the filter dissipates 63% less energy as compared with the worst-case design. Table I shows the converged configurations for the in-phase and quadrature-phase filters. It can be seen that the number of powered-up taps decreases as the cable length decreases. Similarly, the number of powered-up taps reduces with a reduction in the number of FEXT interferers for a fixed cable length. From Table I and consistent with (2.5), we observe that the coefficient precision reduces by 1 bit for shorter cable lengths. For example, when the cable length ranges from 100 to 1000 ft and the number of FEXT interferers is fixed at 11, the number of powered-up taps in the in-phase filter ranges from 9 to 48, and the coefficient precision ranges from 9 to 10 bits. The number of powered-up taps in the in-phase filter ranges from 16 to 48 when the number of FEXT interferers varies from 4 to 11 and the cable length is fixed at 1 kft. Similar observations can be made for the number of powered-up taps and the precision of the quadrature-phase filter. Table II shows the converged SNR for equalizers based on both the worst-case design and the DAT-based design. It can be seen that the SNR for the worst-case design increases from 21.3 to 28.8 dB as the cable length decreases from 1 to 0.1 kft with the number of FEXT interferers fixed at 11. The SNR ranges from 21.3 to 23.1 dB as the number of FEXT interferers reduces from 11 to 4 with the cable length fixed at 1 kft. Overall, the DAT-based receiver maintains the SNR within the range of 21.5 to 23.5 dB for all the states.
Table III compares the energy consumption of the worst-case and DAT-based designs. The energy consumption of the worst-case design varies from 2.8 to 3.5 mW/MHz even though the configuration is fixed. This variation reflects the dependence of the energy consumption on the input data. The average energy consumption of the worst-case design was found to be 3.2 mW/MHz. The energy consumption of the DAT-based design varies from 0.6 to 3.6 mW/MHz. Of this, the energy of the SPA block varies from 0.5 to 3.5 mW/MHz, whereas that of the SMA block is 0.06 mW/MHz. Employing (3.2), we find that the energy savings range from 2% to 81%, with an average of 53%, assuming a Gaussian distribution for the input states. Thus, it can be seen that the DAT-based approach is quite attractive from the viewpoint of energy savings for the VDSL application.
VI. CONCLUSIONS AND FUTURE WORK
In this paper, we have presented dynamic algorithm transforms (DAT's) as a systematic method for designing low-power application-specific reconfigurable DSP systems. In particular, we employed DAT in the equalization scenario for a 51.84 Mb/s VDSL transceiver. Substantial energy savings are observed due to variations in the cable length and the number of FEXT interferers. DAT techniques jointly optimize system performance and energy dissipation, thus representing a growing trend to synergize across the design hierarchy. In addition, DAT provides a convenient framework within which ongoing research in the areas of data-adaptive DSP algorithms and reconfigurable circuits and architectures can be synergistically combined to enable the design of energy-efficient reconfigurable DSP systems.
where is as defined in (A6). From (A7), it is clear that the terms for different indices are decoupled from each other. Therefore, for a particular index , the optimum can be obtained by solving the following optimization problem:
where we want to find the optimum value of , and is a term independent of
The optimum value is obtained as follows. which is identical to (A5).
Next, we prove that the optimization problem (A3) has a well-defined maximum point which is identical to (4.9).
