# PAC-learning definition

I am new to this field. In the book Foundations of Machine Learning by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar, the authors define the generalization error of a hypothesis as

$$R(h) = \Pr_{x \sim D}[h(x) \neq c(x)] = E_{x \sim D}[1_{h(x) \neq c(x)}]$$

which is a number. However, in Definition 2.3 a concept class $$C$$ is said to be PAC-learnable if

$$\Pr_{S\sim D^{m}}[R(h_{S}) \leq \varepsilon ] \geq 1 - \delta$$

for any $$m \geq \mathrm{poly}(1/\varepsilon,1/\delta,n, \mathrm{size}(c))$$. Since by definition $$R(h) = \Pr_{x \sim D}[h(x) \neq c(x)]$$ is a number, what do $$\Pr_{S\sim D^{m}}[R(h_{S}) \leq \varepsilon ]$$ and $$h_{S}$$ mean in formal terms? The authors explain that $$h_{S}$$ depends on the sample $$S$$, but how exactly?

Another reason I ask is that every $$h$$ is defined as a function $$h: \mathcal{X} \to \mathcal{Y}$$, and the distribution $$D$$ is defined on $$\mathcal{X}$$, not on the hypothesis set $$\mathcal{H}$$ (where each $$h \in \mathcal{H}$$ is a map $$h: \mathcal{X} \to \mathcal{Y}$$).
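To make my confusion concrete, here is a toy simulation of how I currently read the definition (this example is not from the book; the uniform distribution on ten points, the threshold concept class, and the ERM-style learner are all my own assumptions). The idea I want to check: a learning algorithm is a map $$S \mapsto h_S$$, so $$R(h_S)$$ becomes a random variable through the random draw of $$S \sim D^m$$, and $$\Pr_{S\sim D^m}[R(h_S) \leq \varepsilon]$$ is the fraction of samples for which the learned hypothesis happens to be $$\varepsilon$$-good.

```python
import random

# Toy setup: X = {0,...,9}, D uniform on X, target concept c(x) = (x >= 5).
# Hypothesis set: threshold functions h_t(x) = (x >= t) for t in {0,...,10}.
X = list(range(10))

def c(x):
    return x >= 5

def risk(t):
    # True error R(h_t) = Pr_{x ~ D}[h_t(x) != c(x)] under the uniform D.
    return sum(((x >= t) != c(x)) for x in X) / len(X)

def learn(S):
    # A learning algorithm: a map from the sample S to a hypothesis h_S
    # (here: the threshold with the fewest mistakes on S, ties broken low).
    return min(range(11), key=lambda t: sum(((x >= t) != y) for x, y in S))

def prob_good(m, eps, trials=2000, seed=0):
    # Estimate Pr_{S ~ D^m}[R(h_S) <= eps]: the probability is over the
    # random draw of S, because h_S is a (deterministic) function of S.
    rng = random.Random(seed)
    good = 0
    for _ in range(trials):
        S = [(x, c(x)) for x in (rng.choice(X) for _ in range(m))]
        if risk(learn(S)) <= eps:
            good += 1
    return good / trials

print(prob_good(m=5, eps=0.1))   # small sample: often fails to be eps-good
print(prob_good(m=50, eps=0.1))  # larger sample: eps-good with high probability
```

If this reading is right, PAC-learnability just says the second probability can be pushed above $$1-\delta$$ by taking $$m$$ polynomially large, which is what the simulation suggests as $$m$$ grows.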