A fused floating-point dot product unit. The fused dot product unit includes an improved alignment scheme that generates smaller significand pairs compared to the traditional alignment due to the reduced shift amount and sticky logic. Furthermore, the fused dot product unit implements early normalization and a fast rounding scheme. By normalizing the significands prior to the significand addition, the length of the adder can be reduced and the round logic can be performed in parallel. Additionally, the fused dot product unit implements a four-input leading zero anticipation unit thereby reducing the overhead of the reduction tree by encoding the four inputs at once. The fused floating-point dot product unit may also employ a dual-path (a far path and a close path) algorithm to improve performance. Pipelining may also be applied to the dual-path fused dot product unit to increase the throughput.Board of Regents, University of Texas Syste