Recent experimental results presented in Burridge and Taylor (2001a, 2001b, 2003) show that, as usually implemented, the Hylleberg et al. (1990) seasonal unit root tests can be rather liberal, with the true level often substantially higher than the nominal level. This effect arises when any of three features is present: data-based lag selection in the implementation of the tests, periodic heteroscedasticity in the driving shocks, or serial correlation in the driving shocks. Burridge and Taylor (2003) demonstrate that, under experimental conditions, a carefully implemented bootstrap substantially corrects test level without loss of power. The present study applies their technique to a large number of publicly available series and demonstrates conclusively that the bootstrap produces less liberal and, given the experimental results cited above, more reliable inference. We report results for Sweden, the UK and the US, which are typical of the fifteen countries in our panel. Other results, the GAUSS code, and the raw data are all available at: www.staff.city.ac.uk/p.burridge