In the simplest formulation of the Neyman-Pearson hypothesis testing regime, we have a set of samples drawn from a particular distribution with some unknown parameter $\theta$. We construct two hypotheses $H_0$ and $H_1$, the null and alternative hypotheses, respectively, and attempt to reject the null hypothesis, for example by way of the likelihood-ratio test.
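For reference, here is the usual textbook form of the likelihood-ratio test, written out under the assumption that both hypotheses are simple (each pins the unknown parameter to a single value). The symbols $\theta_0$, $\theta_1$, $k$ and $\alpha$ are notation chosen for this sketch, with $\alpha$ the significance level.

```latex
% Simple null and alternative hypotheses about the unknown parameter
H_0 : \theta = \theta_0
\qquad
H_1 : \theta = \theta_1

% Likelihood ratio for observed samples x = (x_1, \dots, x_n)
\Lambda(x) = \frac{L(\theta_0 \mid x)}{L(\theta_1 \mid x)}

% Reject H_0 when the ratio is small; the threshold k is fixed by the
% desired significance level \alpha
\text{reject } H_0 \iff \Lambda(x) \le k,
\qquad
\Pr\bigl(\Lambda(X) \le k \mid H_0\bigr) = \alpha
```

The Neyman-Pearson lemma says that, among tests of level $\alpha$ for this pair of simple hypotheses, this one has the greatest power.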
This line of work apparently grew out of a practical need for quality control in manufacturing, or at least that seems to be a common textbook example, and some work, such as that of Wald and Wolfowitz, was certainly in this vein. The assumption is that a manufacturing process produces a batch of products with defects according to some distribution, such as a binomial distribution, but with an unknown parameter. Hypothesis testing allows one to gain some “confidence” that a particular batch has a small number of defects.
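To make the quality-control example concrete, here is a minimal Python sketch of a one-sided exact binomial test. It assumes each inspected item is defective independently with probability p, and it tests the null hypothesis "p is at least p0" against the alternative "p is smaller than p0". The function names, the threshold `p0` and the default `alpha` are illustrative choices, not taken from any particular textbook.

```python
from math import comb


def binomial_lower_tail(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def batch_passes(defects_seen: int, sample_size: int, p0: float,
                 alpha: float = 0.05) -> bool:
    """Reject H0: p >= p0 in favour of H1: p < p0 at level alpha.

    The p-value is the chance of seeing this few defects (or fewer)
    if the true defect rate were exactly p0, the boundary of H0.
    """
    p_value = binomial_lower_tail(defects_seen, sample_size, p0)
    return p_value < alpha


if __name__ == "__main__":
    # Hypothetical numbers: 100 items inspected, acceptable defect rate 5%.
    print(batch_passes(0, 100, 0.05))  # no defects seen
    print(batch_passes(3, 100, 0.05))  # three defects seen
```

With these hypothetical numbers, seeing no defects in 100 items is enough to reject the null hypothesis at the 5% level, while seeing three defects is not.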
The problem that I’d like to consider turns this around in some sense. Assume that instead of samples drawn from a known distribution with unknown parameter, you have an adversary who gets to pick the distribution. To make this concrete, let be integers chosen adversarially to satisfy some condition, say given some constraints on the maximum magnitude of each . That is, for some integers . At this point, we can form our two hypotheses, and . Now, we want to sample some of the and attempt to reject the null hypothesis with some “confidence”. More precisely, we want a test with some (small) significance level $\alpha$, say .
There are two issues. First, given some of the , say , can we form the likelihood ratio (assuming we wish to perform a likelihood-ratio test of some sort) given that the adversary knows everything except the indices ? Second, how should we pick those indices to maximize the power of the test? These two questions are intertwined, since how we pick the indices determines the likelihoods. (This is clearly demonstrated by considering picking the first indices: the adversary can then (assuming is not too large) modify only the last values.)
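The parenthetical remark can be illustrated with a toy model. The sketch below is a deliberate simplification and not the setup described above: it assumes the adversary has modified exactly `t` of `n` positions, that inspecting a modified position always reveals it, and that we inspect `m` positions. All three names, and both functions, are hypothetical. It compares inspecting a fixed prefix, which the adversary can anticipate, with inspecting a uniformly random subset, which it cannot.

```python
from math import comb
import random


def detection_probability(n: int, t: int, m: int) -> float:
    """Chance that a uniform random m-subset of n indices contains at
    least one of t modified positions (a hypergeometric calculation).

    math.comb returns 0 when m > n - t, so the result is then 1.0.
    """
    return 1.0 - comb(n - t, m) / comb(n, m)


def simulate(n: int, t: int, m: int, trials: int = 10_000, seed: int = 0):
    """Compare prefix inspection with random inspection.

    The adversary is assumed to know that a prefix strategy inspects the
    first m indices, so it modifies the last t positions instead.
    Returns (prefix_detects, empirical_random_detection_rate).
    """
    rng = random.Random(seed)
    modified = set(range(n - t, n))
    prefix_detects = bool(modified & set(range(m)))
    hits = sum(
        1 for _ in range(trials)
        if modified & set(rng.sample(range(n), m))
    )
    return prefix_detects, hits / trials


if __name__ == "__main__":
    n, t, m = 100, 10, 20  # hypothetical sizes
    print(simulate(n, t, m))
    print(detection_probability(n, t, m))
```

In this toy model the prefix strategy never detects anything as long as `m` is at most `n - t`, whereas random inspection detects a modification with the hypergeometric probability computed by `detection_probability`. The randomness that drives the test lives entirely in the choice of indices.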
I said this turns the standard Neyman-Pearson regime around in some sense: rather than the data being drawn from some distribution, it is the sample indices that are drawn from some distribution. Not being a statistician, I have no idea whether this has been studied or whether it is trivial to treat this as standard hypothesis testing, but I’d love to know.