Many machine learning algorithms require large amounts of labeled data to
deliver state-of-the-art results. In applications such as medical diagnosis and
fraud detection, unlabeled data are abundant, but labeling them through
experts, experiments, or simulations is costly. Active learning
algorithms aim to reduce the number of required labeled data points while
preserving performance. For many convex optimization problems such as linear
regression and p-norm regression, there are theoretical bounds on the number
of required labels to achieve a certain accuracy. We call this the query
complexity of active learning. However, existing active learning algorithms
require the underlying function being learned to have an orthogonal basis. For
example, when applying active learning to linear regression, the requirement is
that the target function is a linear combination of a set of orthogonal linear
functions, and active learning finds the coefficients of these linear
functions. We present a theoretical result showing that active learning does
not need an orthogonal basis but rather only requires a nearly orthogonal
basis. We provide the corresponding theoretical proofs for function families
with a nearly orthogonal basis, together with their applications in an
algorithmically efficient active learning framework.
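
To make the contrast concrete, one way to formalize the two settings is sketched below; the abstract does not state the exact near-orthogonality condition, so the parameter \epsilon is only a placeholder:

\[
  f(x) = \sum_{i=1}^{d} \alpha_i \, \phi_i(x),
  \qquad
  \langle \phi_i, \phi_j \rangle = 0 \ \ (i \neq j)
  \ \ \text{(orthogonal basis)},
  \qquad
  \lvert \langle \phi_i, \phi_j \rangle \rvert \le \epsilon \ \ (i \neq j)
  \ \ \text{(nearly orthogonal basis)},
\]

where the target function f is a linear combination of the basis functions \phi_1, \dots, \phi_d and active learning queries labels in order to recover the coefficients \alpha_1, \dots, \alpha_d.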
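
For intuition about how label queries can be chosen in the linear-regression case, the following is a minimal sketch of a standard leverage-score sampling baseline for active linear regression. It is not the framework described above; the names leverage_scores, active_linear_regression, and label_oracle are hypothetical.

    # Minimal sketch of active linear regression via leverage-score sampling.
    # This is a standard baseline, not the paper's algorithm.
    import numpy as np

    def leverage_scores(X):
        """Leverage score of row i: the i-th diagonal entry of X (X^T X)^+ X^T."""
        U, _, _ = np.linalg.svd(X, full_matrices=False)  # thin SVD for stability
        return np.sum(U ** 2, axis=1)

    def active_linear_regression(X, label_oracle, num_queries, rng):
        """Query num_queries labels sampled proportionally to leverage scores,
        then solve a reweighted least-squares problem on the queried rows."""
        n = X.shape[0]
        probs = leverage_scores(X)
        probs = probs / probs.sum()
        idx = rng.choice(n, size=num_queries, replace=True, p=probs)
        # Importance weights correct for the non-uniform sampling.
        w = 1.0 / np.sqrt(num_queries * probs[idx])
        y = np.array([label_oracle(i) for i in idx])  # only these labels are queried
        coef, *_ = np.linalg.lstsq(w[:, None] * X[idx], w * y, rcond=None)
        return coef

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, d = 5000, 10
        X = rng.standard_normal((n, d))
        true_coef = rng.standard_normal(d)
        y_full = X @ true_coef + 0.1 * rng.standard_normal(n)
        # Only 40 labels are revealed, far fewer than the n = 5000 unlabeled points.
        est = active_linear_regression(X, lambda i: y_full[i], num_queries=40, rng=rng)
        print(np.linalg.norm(est - true_coef))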