We give the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of PAC learning the intersection of two halfspaces. We give two very different proofs of this result. The first develops a theory of polynomial approximation for log-concave measures and constructs a low-degree ℓ1 polynomial approximator for sufficiently smooth functions. The second uses techniques related to the classical moment problem to obtain sandwiching polynomials. Both approaches deviate significantly from known Fourier-based methods, where essentially all previous work required the underlying distribution to have some product structure. Additionally, we show that in the smoothed-analysis setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.