This paper deals with sparse approximations by means of convex combinations of elements from a predetermined ``basis'' subset $S$ of a function space. Specifically, the focus is on the rate at which the lowest achievable error can be reduced as larger subsets of $S$ are allowed when constructing an approximant. The new results extend those given for Hilbert spaces by Jones and Barron, including in particular a computationally attractive incremental approximation scheme. Bounds are derived for broad classes of Banach spaces; in particular, for $L_p$ spaces with $1 < p < \infty$, the $O(n^{-1/2})$ bounds of Barron and Jones are recovered when $p = 2$. One motivation for the questions studied here arises from the area of ``artificial neural networks,'' where the problem can be stated in terms of the growth in the number of ``neurons'' (the elements of $S$) needed in order to achieve a desired error rate. The focus on non-Hilbert spaces is due to the desire to understand approximation in the more ``robust'' (resistant to exemplar noise) $L_p$, $1 \le p < 2$, norms. The techniques used borrow from results regarding moduli of smoothness in functional analysis, as well as from the theory of stochastic processes on function spaces.
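For concreteness, here is a minimal sketch of the incremental scheme in the Hilbert-space setting of Jones and Barron; the notation ($f$, $f_n$, $g_n$, $\alpha_n$) is illustrative and not taken from the paper. Given a target $f$ in the closed convex hull of a bounded set $S$, one starts from some $f_1 \in S$ and at each step mixes in a single new element of $S$:
\[
f_n = (1 - \alpha_n) f_{n-1} + \alpha_n g_n,
\qquad
(\alpha_n, g_n) \in \operatorname*{arg\,min}_{\alpha \in [0,1],\; g \in S}
\bigl\| f - \bigl( (1 - \alpha) f_{n-1} + \alpha g \bigr) \bigr\|,
\]
so that $f_n$ is a convex combination of at most $n$ elements of $S$. In a Hilbert space this greedy construction achieves $\| f - f_n \| = O(n^{-1/2})$, the rate recovered here at $p = 2$; the computational appeal is that each step optimizes over only one element of $S$ and one scalar, rather than over all $n$-term convex combinations at once.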