Multicast communication has applications in a number of fundamental operations in parallel computing. An effective multicast routing algorithm must be free from both livelock and deadlock while minimizing communication latency. We describe two classes of multicast wormhole routing algorithms that employ the multi-destination wormhole hardware mechanism proposed by Lin et al. [12] and Panda et al. [17]. Specific examples of each class are presented, and experimental results suggest that such algorithms enjoy low communication latencies across a range of network loads.