We consider the problem of scheduling a queueing system in which many
statistically identical servers cater to several classes of impatient
customers. Service times and impatience clocks are exponential while arrival
processes are renewal. Our cost is an expected cumulative discounted function,
linear or nonlinear, of appropriately normalized performance measures. As a
special case, the cost per unit time can be a function of the number of
customers waiting to be served in each class, the number actually being served,
the abandonment rate, the delay experienced by customers, the number of idling
servers, as well as certain combinations thereof. We study the system in an
asymptotic heavy-traffic regime where the number of servers n and the offered
load r are simultaneously scaled up and carefully balanced: n \approx r + \beta
\sqrt{r} for some scalar \beta. This yields an operation that enjoys the benefits
of both heavy traffic (high server utilization) and light traffic (high service
levels).
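
To make the balance condition concrete, the following minimal sketch computes a staffing level from the rule n \approx r + \beta \sqrt{r}. The function names, the choice to round up to an integer, and the nominal utilization r/n (which ignores abandonment) are illustrative assumptions, not part of the model described above.

```python
import math


def square_root_staffing(offered_load: float, beta: float) -> int:
    """Smallest integer n with n >= r + beta * sqrt(r).

    Rounding up (and flooring at one server) is an assumption made for
    illustration; the regime only requires n to be approximately
    r + beta * sqrt(r).
    """
    if offered_load <= 0:
        raise ValueError("offered_load must be positive")
    n = math.ceil(offered_load + beta * math.sqrt(offered_load))
    return max(n, 1)


def nominal_utilization(offered_load: float, n: int) -> float:
    """Offered load per server, capped at 1.

    This ignores abandonment, so it is only a rough proxy for the
    actual server utilization in a system with impatient customers.
    """
    return min(1.0, offered_load / n)


if __name__ == "__main__":
    beta = 1.0
    for r in (100.0, 400.0, 10_000.0):
        n = square_root_staffing(r, beta)
        print(r, n, nominal_utilization(r, n))
```

For example, r = 100 and \beta = 1 give n = 110. With \beta held fixed, the safety margin \beta \sqrt{r} grows more slowly than r, so the nominal utilization r/n approaches 1 as the system grows.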