I am reading Stein’s 2013 monograph ‘Interpolation of Spatial Data’ where he characterizes the best linear predictor (BLP) in terms of a Hilbert space.

Let $Q$ be the set on which a random process $Z$ is observed. All linear predictors of $h\in H_R$, with $R\subset\mathbb{R}^d$, are of the form $c+g$, where $c\in\mathbb{R}$ is a scalar and $g\in H_Q$. Let $g(h)$ be the unique element of $H_Q$ that satisfies $\mathrm{cov}[h-g(h),g^\prime]=0$ for all $g^\prime\in H_Q$, and set $c(h)=\mathbb{E}[h]-\mathbb{E}[g(h)]$. Then $c(h)+g(h)$ is the BLP of $h$, which follows from $\mathbb{E}[(h-c(h)-g(h))(c^\prime+g^\prime)]=0$ for all $c^\prime\in\mathbb{R}$ and $g^\prime\in H_Q$.
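To make the setup concrete, here is a minimal finite-dimensional sketch. It assumes (this is not from the monograph) that $Z$ is observed at only $n$ points, so that $H_Q$ is spanned by $Z_1,\dots,Z_n$ and every $g\in H_Q$ is a weighted sum `a @ Z`. The means and covariances are made-up numbers, and the names `w`, `c`, and `cross_moment` are hypothetical. It only checks the two stated conditions numerically from first and second moments; it does not explain why they hold.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # number of observation points (assumption: finite Q)

# Made-up joint moments of (h, Z_1, ..., Z_n); the covariance is positive definite.
A = rng.normal(size=(n + 1, n + 1))
joint_cov = A @ A.T + 0.1 * np.eye(n + 1)
joint_mean = rng.normal(size=n + 1)

mu_h, m = joint_mean[0], joint_mean[1:]   # E[h], E[Z]
k = joint_cov[0, 1:]                      # cov(h, Z_j)
K = joint_cov[1:, 1:]                     # cov(Z_i, Z_j)

# g(h) = w @ Z with cov[h - g(h), Z_j] = 0 for every j, i.e. K w = k.
w = np.linalg.solve(K, k)
# c(h) = E[h] - E[g(h)]
c = mu_h - w @ m

def cross_moment(c_prime, a):
    """E[(h - c - w@Z)(c' + a@Z)] computed from raw second moments."""
    E_hZ = k + mu_h * m            # E[h Z_j]
    E_ZZ = K + np.outer(m, m)      # E[Z_i Z_j]
    return (c_prime * (mu_h - c - w @ m)
            + a @ E_hZ - c * (a @ m) - a @ E_ZZ @ w)

# Try a few arbitrary competitors c' + g' with g' = a @ Z.
checks = [cross_moment(rng.normal(), rng.normal(size=n)) for _ in range(5)]
max_abs = max(abs(v) for v in checks)
print(max_abs)  # zero up to floating-point rounding
```

In this finite case the covariance condition is just the linear system `K w = k`, which has exactly one solution when `K` is invertible.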

I have three questions:

- How do I see that $h$ is of the form $c+g$?
  - Does the scalar take me from one Hilbert space to another?
- Why is $g(h)$ unique for that covariance restriction? How can I interpret this covariance?
- $\mathbb{E}[(h-c(h)-g(h))(c^\prime+g^\prime)]=0$ should be obvious from these definitions, but I don’t see it.
  - The first parenthesis could be 0 since $\mathbb{E}[h]-\mathbb{E}[g(h)]$ is 0, but I’m really not sure.

Thanks for the help.