Sparse methods for supervised learning aim at finding good linear predictors
from as few variables as possible, i.e., with small cardinality of their
supports. This combinatorial selection problem is often turned into a convex
optimization problem by replacing the cardinality function by its convex
envelope (tightest convex lower bound), in this case the L1-norm. In this
paper, we investigate more general set-functions than the cardinality, that may
incorporate prior knowledge or structural constraints which are common in many
applications: namely, we show that for nondecreasing submodular set-functions,
the corresponding convex envelope can be obtained from its \lova extension, a
common tool in submodular analysis. This defines a family of polyhedral norms,
for which we provide generic algorithmic tools (subgradients and proximal
operators) and theoretical results (conditions for support recovery or
high-dimensional inference). By selecting specific submodular functions, we can
give a new interpretation to known norms, such as those based on
rank-statistics or grouped norms with potentially overlapping groups; we also
define new norms, in particular ones that can be used as non-factorial priors
for supervised learning