Plant traits are a key to understanding and predicting the adaptation of
ecosystems to environmental changes, which motivates the TRY project aiming at
constructing a global database for plant traits and becoming a standard
resource for the ecological community. Despite its unprecedented coverage, a
large percentage of missing data substantially constrains joint trait analysis.
Meanwhile, the trait data is characterized by the hierarchical phylogenetic
structure of the plant kingdom. While factorization based matrix completion
techniques have been widely used to address the missing data problem,
traditional matrix factorization methods are unable to leverage the
phylogenetic structure. We propose hierarchical probabilistic matrix
factorization (HPMF), which effectively uses hierarchical phylogenetic
information for trait prediction. We demonstrate HPMF's high accuracy,
effectiveness of incorporating hierarchical structure and ability to capture
trait correlation through experiments.Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012