## Uniform inequality of the form $\text{Proba}(\sup_{v \in [-M,M]^k}|p^Tv-\hat{p}_n^Tv| \le \epsilon_n) \ge 1 – \delta$

Let $$M > 0$$, $$k$$ be a positive integer, and $$\mathcal V:=[-M,M]^k$$. Finally, let $$p \in \Delta_k$$, (where $$\Delta_k$$ is the $$(k-1)$$-dimensional probability simplex) and let $$\hat{p}_n$$ be an empirical version of $$p$$ based on an iid sample of size $$n$$. Given $$\delta \in (0, 1)$$, my objective to obtain a uniform-bound of the form

[Objective] $$\text{Proba}(\sup_{v \in \mathcal V}|p^Tv-\hat{p}_n^Tv| \le \epsilon_n) \ge 1 – \delta$$, for some $$\epsilon_n >0$$ (the smaller the better).

# Idea using covering argument

Presumably, for each $$v \in \mathcal V$$, I can use Bernstein’s inequality to control $$|p^Tv-\hat{p}_n^Tv|$$. For example,

$$\text{Proba}\left(|p^Tv-\hat{p}_n^Tv| \le \left(\operatorname{Var}_p(v)\frac{\log(2/\delta)}{n}\right)^{1/2} + \frac{2M\log(2/\delta)}{3n}\right) \ge 1 -\delta.$$

On the other hand,

The mapping $$G:v \mapsto |p^Tv-\hat{p}_n^Tv|$$ is $$2$$-Lipschitz w.r.t the $$\ell_\infty$$-norm on $$\mathbb R^k$$.

Indeed, for all $$v’,v \in \mathcal V$$, one has $$\begin{split} |G(v’)-G(v)| &:= ||p^Tv’-\hat{p}_n^Tv’|-|p^Tv-\hat{p}_n^Tv|| \le |p^Tv’-\hat{p}_n^Tv’-(p^Tv-\hat{p}_n^Tv)|\ &= |p^T(v’-v)-\hat{p}_n^T(v’-v)| \le |p^T(v’-v)|+|\hat{p}_n^T(v’-v)| \ &\le (\|p\|_1+\|\hat{p}_n\|_1)\|v’-v\|_\infty = 2 \|v’-v\|_\infty, \end{split}$$ where the first and second inequalities are triangle inequalities, the third inequality is a Cauchy-Schwarz inequality, and the last inequality is because $$p,\hat{p}_n \in \Delta_k$$ are probability distributions.

Also, the sup-norm covering number of $$\mathcal V$$ is $$\mathcal N_\infty(\mathcal V;\varepsilon)\le(2M/\varepsilon)^k$$.

By using the fact that $$\|v\|_\infty \le M$$ for all $$v \in \mathcal V$$, I can replace the variance term in the above Bernstein bound (i.e we’d use a Hoeffding inequality instead) to get $$\operatorname{Var}_p(v) \le M^2$$ for all $$v \in \mathcal V$$, and then use covering arguments (e.g see https://mathoverflow.net/a/322161/78539) to get an inequality of the sough-for form [Objective] above. However, such an inequality is presumably “blurred”.

# Question

How can these ramblings be pieced together to obtain a strong uniform inequality of the form [objective] ? Of course, I’m more than happy to learn other tricks for obtain such a results, which might not use any of the ideas I’ve discussed above.