In the simplest formulation of Neyman–Pearson hypothesis testing, we have a set of samples drawn from a particular distribution with some unknown parameter $\theta$. We construct two hypotheses $H_0\colon \theta=\theta_0$ and $H_1\colon\theta=\theta_1$, the null and alternative hypotheses, respectively, and attempt to reject the null hypothesis, for example by way of the likelihood-ratio test.
The problem that I’d like to consider turns this around in some sense. Assume that instead of samples drawn from a known distribution with an unknown parameter, you have an adversary who gets to pick the distribution. To make this concrete, let $x_1,x_2,\ldots,x_n$ be integers chosen adversarially to satisfy some condition, say $\sum_{i=1}^n x_i\geq M$, given some constraints on the maximum magnitude of each $x_i$; that is, $|x_i| \leq m_i$ for some integers $m_1,m_2,\ldots,m_n$. At this point, we can form our two hypotheses, $H_0 \colon \sum x_i \geq M$ and $H_1 \colon \sum x_i < M$. Now, we want to sample some of the $x_i$ and attempt to reject the null hypothesis with some “confidence”. More precisely, we want a test with some (small) significance level $\alpha$, say $\alpha=0.05$.
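Before worrying about likelihoods at all, there is one crude, fully deterministic test worth noting: if even the most favorable completion of the *unseen* coordinates cannot push the total up to $M$, then $H_0$ is refuted outright. The sketch below illustrates this; the numbers ($n$, the $m_i$, $M$, $k$) and the function names are my own illustrative choices, not part of the problem statement.

```python
import random

# Illustrative instance: n coordinates, each bounded by m_i = 10,
# and the adversary's claimed condition sum(x) >= M.
n = 100
m = [10] * n          # |x_i| <= m_i
M = 600

def max_possible_total(sampled_values, m, sampled_indices):
    """Largest total consistent with the observations: the observed sum
    plus the most each unsampled coordinate could contribute (its m_i)."""
    unseen = [m[i] for i in range(len(m)) if i not in sampled_indices]
    return sum(sampled_values) + sum(unseen)

def certain_rejection(x, m, M, k, rng=random):
    """Sample k indices uniformly at random; return True iff the
    observations alone already rule out H_0: sum(x) >= M."""
    indices = set(rng.sample(range(len(x)), k))
    sampled = [x[i] for i in indices]
    return max_possible_total(sampled, m, indices) < M
```

For example, if the adversary sets every $x_i = 0$, sampling $k = 50$ of the $100$ coordinates shows an observed sum of $0$, and the unseen half can contribute at most $500 < 600$, so $H_0$ is certainly false. This test needs no probability model, but its power is poor in general, which is what motivates the likelihood-ratio question below.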
There are two issues. First, given some of the $x_i$, say $x_{i_1},x_{i_2},\ldots,x_{i_k}$, can we form the likelihood ratio $L(\theta_0 \mid x_{i_1},\ldots,x_{i_k})/L(\theta_1 \mid x_{i_1},\ldots,x_{i_k})$ – assuming we wish to perform a likelihood-ratio test of some sort – given that the adversary knows everything except the indices $i_1,i_2,\ldots,i_k$? Second, how should we pick those indices to maximize the power of the test? These two questions are intertwined, since how we pick the indices determines the likelihoods. (This is clearly demonstrated by picking the first $k$ indices: the adversary knows our choice and can, assuming $M$ is not too large, satisfy the condition using only the last $n-k$ values.)
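The parenthetical can be made concrete with a small sketch (the numbers are illustrative assumptions). If the adversary knows the first $k$ indices will be inspected, it can make the observed coordinates identical whether or not $\sum x_i \geq M$ holds, placing the entire sum on the hidden coordinates when it wants $H_0$ to be true. The observation then carries no information, and the test has no power.

```python
n, k = 100, 50
m = [10] * n
M = 400   # reachable using only the hidden coords: (n - k) * 10 = 500 >= M

# Two adversarial vectors whose first k coordinates coincide:
# under one, H_0 (sum >= M) holds; under the other, H_1 (sum < M) holds.
x_h0 = [0] * k + [10] * (n - k)   # hidden coords carry the whole sum: 500
x_h1 = [0] * n                    # sum = 0 < M

# The fixed-index observer sees the same thing in both worlds,
# so no test based on x[:k] can distinguish the hypotheses.
assert x_h0[:k] == x_h1[:k]
assert sum(x_h0) >= M and sum(x_h1) < M
```

Randomizing the sampled indices breaks this particular strategy, since the adversary no longer knows which coordinates will stay hidden; quantifying how much power that buys is exactly the second question above.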