Generalizing the German Tank Problem

Abstract

The German Tank Problem dates back to World War II, when the Allies used a statistical approach to estimate the number of enemy tanks produced or in the field from serial numbers observed after battles. Assuming that the tanks are labeled consecutively starting from 1, if we observe $k$ tanks from a total of $N$ tanks and the maximum observed serial number is $m$, then the best estimate for $N$ is $m(1 + 1/k) - 1$. We explored many generalizations. We looked at the discrete and continuous one-dimensional cases. We explored different estimators, such as the $L$\textsuperscript{th} largest tank, and, drawing on motivation from portfolio theory, studied a weighted average; however, the original formula was the best. We generalized the problem to two dimensions, with pairs instead of points, studying the discrete and continuous square and circle variants. There were complications from curvature issues and from the fact that not every number is representable as a sum of two squares. We often concentrated on the large $N$ limit. For the discrete and continuous square, we tested various statistics, finding that the largest observed component did best; the scaling factor for both cases is $(2k+1)/2k$. The discrete circle case was especially involved because we had to use approximation formulas for the number of lattice points inside the circle. Interestingly, the scaling factors were different for the cases. Lastly, we generalized the problem to $L$-dimensional squares and circles. The discrete and continuous square proved similar to the two-dimensional square problem. However, for the $L$-dimensional circle, we had to use formulas for the volume of the $L$-ball and had to approximate the number of lattice points inside it. The formulas for the discrete circle were particularly interesting, as there was no $L$ dependence in the formula.

Comment: Version 1.0, 47 pages
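As a rough illustration of the two scaling factors quoted in the abstract, the following Python sketch computes the one-dimensional estimate $m(1 + 1/k) - 1$ and the square estimate obtained by scaling the largest observed component by $(2k+1)/2k$. The function names, the use of sampling without replacement, and the simulation parameters are assumptions made for this example only; they are not taken from the paper.

```python
import random


def estimate_total(serials):
    """One-dimensional estimate m(1 + 1/k) - 1 from k observed serial numbers."""
    k = len(serials)
    m = max(serials)
    return m * (1 + 1 / k) - 1


def estimate_square_side(pairs):
    """Square estimate: largest observed component scaled by (2k+1)/(2k)."""
    k = len(pairs)
    m = max(max(x, y) for x, y in pairs)
    return m * (2 * k + 1) / (2 * k)


def mean_estimate(n_total, k, trials, seed=0):
    """Average the 1D estimate over repeated samples (hypothetical helper).

    Assumes k distinct serials drawn uniformly from 1..n_total.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(range(1, n_total + 1), k)
        total += estimate_total(sample)
    return total / trials


if __name__ == "__main__":
    print(estimate_total([19, 40, 42, 60]))
    print(estimate_square_side([(1.0, 3.0), (2.0, 4.0)]))
    print(mean_estimate(100, 5, 20000))
```

Under these assumptions, the simulated average of the one-dimensional estimate should land close to the true total, which is one way to sanity-check the formula numerically.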
