A common problem in astrophysics is determining how bright a source could be
and still not be detected. Despite the simplicity with which the problem can be
stated, the solution involves complex statistical issues that require careful
analysis. Unlike the confidence bound, this notion of an upper limit has never
been formally analyzed, leading to a great variety of often ad hoc solutions. Here
we formulate and describe the problem in a self-consistent manner. Detection
significance is usually defined by the acceptable proportion of false positives
(the Type I error), and we invoke the complementary concept of false negatives
(the Type II error), based on the statistical power of a test, to compute an
upper limit to the detectable source intensity. To determine the minimum
intensity that a source must have for it to be detected, we first define a
detection threshold, and then compute the probabilities of detecting sources of
various intensities at the given threshold. The intensity that corresponds to
the specified Type II error probability defines that minimum intensity, and is
identified as the upper limit. Thus, an upper limit is a characteristic of the
detection procedure rather than of the strength of any particular source and
should not be confused with confidence intervals or other estimates of source
intensity. This is particularly important given the large number of catalogs
that are being generated from increasingly sensitive surveys. We discuss the
differences between these upper limits and confidence bounds. Both measures are
useful quantities that should be reported in order to extract the most science
from catalogs, though they answer different statistical questions: an upper
bound describes an inference range on the source intensity, while an upper
limit calibrates the detection process. We provide a recipe for computing upper
limits that applies to all detection algorithms.

Comment: 30 pages, 12 figures, accepted in Ap
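
The recipe summarized above (fix a detection threshold from the Type I error, compute the detection probability as a function of source intensity, then find the intensity at which the specified Type II error is reached) can be sketched for the simplest case. The example below assumes Poisson-distributed counts with a known background and a detection declared when the observed counts exceed a threshold. That model, the function names, and the parameters `alpha` (maximum false-positive probability) and `power` (one minus the Type II error probability) are illustrative assumptions, not the implementation used in the paper.

```python
import math


def poisson_sf(k, mu):
    """Return P(N > k) for N ~ Poisson(mu), by direct summation of the pmf."""
    term = math.exp(-mu)          # P(N = 0)
    cdf = term
    for i in range(1, k + 1):
        term *= mu / i            # P(N = i) from P(N = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def detection_threshold(background, alpha):
    """Smallest count n such that P(N > n | background only) <= alpha."""
    n = 0
    while poisson_sf(n, background) > alpha:
        n += 1
    return n


def detection_probability(source, background, threshold):
    """Probability that a source of the given intensity exceeds the threshold."""
    return poisson_sf(threshold, source + background)


def upper_limit(background, alpha, power, tol=1e-6):
    """Smallest source intensity detected with probability >= power.

    Assumes power > alpha, so that a zero-intensity source is below the
    required detection probability and the bisection bracket is valid.
    """
    threshold = detection_threshold(background, alpha)
    lo, hi = 0.0, 1.0
    # Expand the bracket until the detection probability reaches the target.
    while detection_probability(hi, background, threshold) < power:
        hi *= 2.0
    # Bisect: detection probability increases monotonically with intensity.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if detection_probability(mid, background, threshold) < power:
            lo = mid
        else:
            hi = mid
    return threshold, hi


# Hypothetical inputs: 3 expected background counts, 5% false positives,
# and a 50% chance of detecting a source at the upper limit.
threshold, limit = upper_limit(background=3.0, alpha=0.05, power=0.5)
print(f"threshold = {threshold} counts, upper limit = {limit:.3f} counts")
```

Note that the result depends only on the background, the threshold, and the chosen error probabilities, not on the counts observed for any particular source, which is what distinguishes this upper limit from a confidence bound.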