Phylogenetic methods typically rely on an appropriate model of how data
evolved in order to infer an accurate phylogenetic tree. For molecular data,
standard statistical methods have provided an effective strategy for extracting
phylogenetic information from aligned sequence data when each site (character)
is subject to a common process. However, for other types of data (e.g.,
morphological data), characters can be too ambiguous, homoplastic or saturated
to develop models that are effective at capturing the underlying process of
change. To address this, we examine the properties of a classic but neglected
method for inferring splits in an underlying tree, namely, maximum
compatibility. By adopting a simple and extreme model in which each character
either fits perfectly on some tree or is entirely random (with the class of
each character unknown), we are able to derive exact and explicit
formulae regarding the performance of maximum compatibility. We show that this
method is able to identify a set of non-trivial homoplasy-free characters, when
the number n of taxa is large, even when the number of random characters is
large. By contrast, we show that a method that makes more uniform use of all
the data --- maximum parsimony --- can provably estimate trees in which {\em
none} of the original homoplasy-free characters support splits.

Comment: 37 pages, 2 figures
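
As a rough illustration of what maximum compatibility computes, the sketch below restricts attention to binary characters, which is an assumption made here for simplicity and not a restriction stated above. For binary characters, two characters are compatible exactly when at most three of the four state pairs (0,0), (0,1), (1,0), (1,1) occur across the taxa, and a pairwise compatible collection fits jointly on a single tree. The function names `compatible` and `max_compatible_subset` are hypothetical, and the exhaustive search is only practical for a handful of characters.

```python
from itertools import combinations


def compatible(c1, c2):
    """Return True if two binary characters are compatible.

    Each character is a sequence of 0/1 states, one per taxon.
    Two binary characters are compatible when at least one of the
    four possible state pairs never occurs across the taxa.
    """
    return len(set(zip(c1, c2))) < 4


def max_compatible_subset(characters):
    """Return indices of a largest pairwise compatible subset.

    Illustrative brute force: tries subsets from largest to smallest,
    so the running time is exponential in the number of characters.
    """
    indices = range(len(characters))
    for size in range(len(characters), 0, -1):
        for subset in combinations(indices, size):
            if all(compatible(characters[i], characters[j])
                   for i, j in combinations(subset, 2)):
                return list(subset)
    return []


# Toy data (invented for illustration): five taxa, three characters.
# The first two correspond to splits that fit on one tree; the third
# behaves like a random character and conflicts with both.
characters = [
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [1, 0, 1, 0, 1],
]
print(max_compatible_subset(characters))
```

On this toy input the search keeps the two mutually compatible characters and discards the conflicting one, which mirrors the idea of recovering homoplasy-free characters from a mixture that also contains random ones.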