Remote operating system fingerprinting relies on implementation differences between OSs to identify the specific variant executing on a remote host. Because these differences can be subtle and difficult to find, most fingerprinting tools require expert manual effort to construct discriminative fingerprints and classification models. In prior work, Caballero et al. proposed a promising technique to eliminate manual intervention: the automatic generation of fingerprints using an approach similar to fuzz testing. Their work evaluated the technique in a small-scale, carefully controlled test environment. In this paper, we re-examine automatic OS fingerprinting in a more challenging large-scale scenario to better understand the viability of the technique. In contrast to the prior work, we find that automatic fingerprint generation suffers from several limitations and technical hurdles that can limit its effectiveness, particularly in more demanding, realistic environments. We use machine learning algorithms from the well-known Weka data mining toolkit to automatically generate fingerprints over 329 different machine instances, and we compare the accuracy of our automatically generated fingerprints to that of Nmap. Our results suggest that four factors significantly limit the usefulness of this technique in practice: overfitting to non-OS-specific behavioral differences, the indistinguishability of different OS variants, the bias of an automatic tool toward the makeup of its training data, and the inability of an automatic tool to exploit protocol and software semantics. Automatic techniques can help identify candidate signatures, but our results suggest that manual expertise will remain an integral part of fingerprint generation.