In an optimal nonbipartite match, a single population is divided into matched
pairs to minimize a total distance within matched pairs. Nonbipartite matching
has been used to strengthen instrumental variables in observational studies of
treatment effects, essentially by forming pairs that are similar in terms of
covariates but very different in the strength of encouragement to accept the
treatment. Optimal nonbipartite matching is typically done using network
optimization techniques that can be quick, running in polynomial time, but
these techniques limit the tools available for matching. Instead, we use
integer programming techniques, thereby obtaining a wealth of new tools not
previously available for nonbipartite matching, including fine and near-fine
balance for several nominal variables, forced near balance on means and optimal
subsetting. We illustrate the methods in our ongoing study of outcomes of
late-preterm births in California, that is, births of 34 to 36 weeks of
gestation. Would lengthening the time in the hospital for such births reduce
the frequency of rapid readmissions? A straightforward comparison of babies who
stay for a shorter or longer time would be severely biased, because the
principal reason for a long stay is some serious health problem. We need an
instrument, something inconsequential and haphazard that encourages a shorter
or a longer stay in the hospital. It turns out that babies born at certain
times of day tend to stay overnight once with a shorter length of stay, whereas
babies born at other times of day tend to stay overnight twice with a longer
length of stay, and there is nothing particularly special about a baby who is
born at 11:00 pm.

Comment: Published at http://dx.doi.org/10.1214/12-AOAS582 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
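
To make the objective in the opening sentence concrete, the following is a minimal sketch of optimal nonbipartite matching on a toy problem. It is not the integer programming method of the paper: it uses exhaustive search with memoization, which is practical only for a handful of units, and it includes none of the balance constraints or subsetting. The function name `optimal_nonbipartite_match`, the covariate values in `x`, and the absolute-difference distance are all hypothetical choices made for illustration.

```python
from functools import lru_cache


def optimal_nonbipartite_match(dist):
    """Pair up all units so the summed within-pair distance is minimal.

    dist is a symmetric n x n matrix (list of lists), n even.
    Returns (total_distance, list_of_pairs). Exhaustive search over
    subsets: an illustration of the objective, not a scalable solver.
    """
    n = len(dist)
    if n % 2 != 0:
        raise ValueError("need an even number of units")

    @lru_cache(maxsize=None)
    def best(unmatched):
        # unmatched is a bitmask of the units that still need a partner
        if unmatched == 0:
            return 0.0, ()
        # Always pair the lowest-numbered unmatched unit next,
        # so each set of pairs is enumerated only once.
        i = (unmatched & -unmatched).bit_length() - 1
        rest = unmatched & ~(1 << i)
        best_cost, best_pairs = float("inf"), ()
        for j in range(i + 1, n):
            if rest & (1 << j):
                cost, pairs = best(rest & ~(1 << j))
                cost += dist[i][j]
                if cost < best_cost:
                    best_cost, best_pairs = cost, ((i, j),) + pairs
        return best_cost, best_pairs

    total, pairs = best((1 << n) - 1)
    return total, list(pairs)


# Hypothetical one-dimensional covariate for four units from one population.
x = [0, 2, 3, 10]
dist = [[abs(a - b) for b in x] for a in x]
total, pairs = optimal_nonbipartite_match(dist)
print(total, pairs)
```

Unlike bipartite matching, there are no predefined treated and control groups here: any unit may be paired with any other, and the search chooses the pairing of the single population that minimizes the total distance.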