## Conditional Expectation: Integrating indicator function multiplied by the joint density

I am currently reading “Measure, Integral and Probability” by Capinski, Marek (see p179). It includes some motivation for the definition of the conditional expectation. For example, given two random variables $$X,Y$$ with joint density $$f_{(X,Y)}$$ (and hence the marginal and conditional densities), we want to show that for any set $$A \subset \Omega$$ of the form $$A=X^{-1}(B)$$, $$B$$ Borel, we have $$\int_A\mathbb{E}(Y|X)dP= \int_A YdP.$$ This is one of the defining conditions of a conditional expectation. The book shows the following calculation:

\begin{align} \int_A\mathbb{E}(Y|X)dP &= \int_\Omega 1_B(X)\mathbb{E}(Y|X)dP\\ &= \int_\Omega 1_B(X(\omega))\left(\int_\mathbb{R}yf_{Y|X}(y|X(\omega))dy\right)dP(\omega)\\ &=\int_\mathbb{R}\int_\mathbb{R}1_B(x)yf_{Y|X}(y|x)dy f_X(x)dx\\ &=\int_\mathbb{R}\int_\mathbb{R}1_B(x)yf_{(X,Y)}(x,y)dxdy\\ &= \int_\Omega 1_A(X)YdP\\ &= \int_A YdP. \end{align}

What I don’t understand is the second to last equality immediately above, i.e. $$\int_\mathbb{R} y \int_\mathbb{R}1_B(x)f_{(X,Y)}(x,y)dxdy = \int_\Omega 1_A(X)YdP .$$ I think it is a typo, since $$X$$ takes values in $$\mathbb{R}$$ while $$A \subset \Omega$$. However, I can’t figure out the correction either!

## Expectation of number of hubs in a random graph

Suppose $$\Gamma(V, E)$$ is a finite simple graph. Let’s call a vertex $$v \in V$$ a hub if $$\deg(v)^2 > \sum_{w \in O(v)} \deg(w)$$. Here $$\deg$$ stands for the vertex degree, and $$O(v)$$ for the set of all vertices adjacent to $$v$$. Let’s define $$H(\Gamma)$$ as the number of hubs in $$\Gamma$$.

Now, suppose $$G(n, p)$$ is an Erdős-Rényi random graph with $$n$$ vertices and edge probability $$p$$. Is there an explicit formula for $$E(H(G(n, p)))$$ as a function of $$n$$ and $$p$$?
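Short of a closed formula, the expectation can at least be estimated numerically. The following is a minimal Monte Carlo sketch, not an answer to the question; the function names `count_hubs`, `sample_gnp` and `estimate_expected_hubs`, the trial count and the seed are all illustrative choices.

```python
import random

def count_hubs(adj):
    """Count vertices v with deg(v)^2 > sum of deg(w) over neighbours w of v."""
    deg = [len(nbrs) for nbrs in adj]
    return sum(1 for v, nbrs in enumerate(adj)
               if deg[v] ** 2 > sum(deg[w] for w in nbrs))

def sample_gnp(n, p, rng):
    """Sample G(n, p) as a list of neighbour sets (each edge present with probability p)."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def estimate_expected_hubs(n, p, trials=2000, seed=0):
    """Monte Carlo estimate of E(H(G(n, p))); trials and seed are arbitrary defaults."""
    rng = random.Random(seed)
    return sum(count_hubs(sample_gnp(n, p, rng)) for _ in range(trials)) / trials
```

Calling `estimate_expected_hubs(20, 0.5)` for a range of $$n$$ and $$p$$ gives a rough picture of how the expectation behaves, which may help in guessing or checking a formula.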

How did this question arise:

I have recently heard of the so-called «friendship paradox», which states that the number of your friends usually does not exceed the average number of friends your friends have. When I first heard about it, I wondered whether that is just a peculiarity of human society or whether there is a mathematical explanation behind it. First I tried to translate the statement of the «friendship paradox» into mathematical language as

Any graph with $$n$$ vertices has $$o(n)$$ hubs,

but then I quickly found that it is blatantly false in this form:

Suppose $$n > 2$$. Take the complete graph $$K_n$$ on $$n$$ vertices and remove one edge from it. The resulting graph has $$n - 2$$ hubs, which is clearly not $$o(n)$$.

So the «friendship paradox» cannot be translated into deterministic graph theory using the notion of «hubs». I thought that maybe something similar should work with random graphs.

Also:

Later, I found out that there actually is a consistent mathematical interpretation of the «friendship paradox» in terms of deterministic graph theory: Friendship paradox demonstration

However, it does not solve my problem, which is finding the expectation of the number of hubs in a random graph. So please do not mark my question as a duplicate of the aforementioned question.

## Is my interpretation of this expectation expression correct?

Consider the following expression

$$E_{x\sim p_{data}(x)}[ f(x) ]$$

I understand it as follows:

1) If $$X$$ is a collection of discrete random variables $$(X_1, X_2, \ldots, X_n)$$, and $$x$$ is a particular assignment of values to all random variables in the tuple,

then

$$E_{x\sim p_{data}(x)}[ f(x) ] = \sum\limits_{x} f(x) p_{data}(x)$$

2) If $$X$$ is a collection of continuous random variables $$(X_1, X_2, \ldots, X_n)$$, and $$x$$ is a particular assignment of values to all random variables in the tuple,

then

$$E_{x\sim p_{data}(x)}[ f(x) ] = \int\limits_{x_1}\int\limits_{x_2} \cdots \int\limits_{x_n} f(x) p_{data}(x) dx_n \cdots dx_1$$

Is my interpretation correct? If not, where am I going wrong?
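The discrete reading in 1) can be written out directly as a sum over all joint assignments. A minimal sketch follows; the two-bit support, the uniform `p_data` and the function `f` are hypothetical examples chosen only to make the sum concrete.

```python
from itertools import product

def expectation(f, p_data, support):
    """Sum f(x) * p_data(x) over every joint assignment x in the support."""
    return sum(f(x) * p_data(x) for x in support)

# Hypothetical example: two independent fair bits, x = (x_1, x_2).
support = list(product([0, 1], repeat=2))
p_data = lambda x: 0.25          # uniform mass on the four assignments
f = lambda x: x[0] + x[1]        # number of ones in the assignment

print(expectation(f, p_data, support))
```

The continuous case in 2) has the same shape, with the sum over assignments replaced by an integral against the joint density.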

## Expected number of partitions needed to separate two elements in a set

I came across a problem that can be formulated in terms of set partitions.

Given a set $$S=\{s_1,s_2,\ldots,s_n\}$$ with $$n$$ elements, I want to separate two elements, say $$s_1,s_2$$, in $$S$$ by repeatedly applying set partition operations. Each set partition operation randomly partitions a set, say $$A$$, into two non-empty subsets $$B$$ and $$C$$ such that $$A=B\cup C$$ and $$B\cap C=\emptyset$$. I want to calculate or approximate the expected number of partition operations, $$E(n)$$, needed to separate $$s_1$$ and $$s_2$$.

Let us look at two simple cases:

(1) $$n=2$$:

In this case, $$S=\{s_1,s_2\}$$. The only feasible partition will separate $$S=\{s_1,s_2\}$$ as $$\{s_1\}$$ and $$\{s_2\}$$. So, $$E(2) = 1$$.

(2) $$n=3$$:

In this case, $$S=\{s_1,s_2,s_3\}$$. There are two situations:

(a) If the first partition is $$\{s_1\},\{s_2,s_3\}$$ or $$\{s_2\},\{s_1,s_3\}$$, then one partition suffices.

(b) If the first partition is $$\{s_1,s_2\},\{s_3\}$$, then I need a second partition splitting $$\{s_1,s_2\}$$ into $$\{s_1\},\{s_2\}$$. So the number of partitions is 2.

The probability of situation (a) is 2/3 and that of situation (b) is 1/3, so $$E(3)=1\cdot(2/3)+2\cdot(1/3)=4/3$$.

I tried using a recursive formula, but it does not seem to have a closed form. I also wonder whether $$E(n)$$ can be approximated by some continuous function.
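The recursion can at least be evaluated exactly for small $$n$$. The sketch below assumes that each operation picks uniformly among all unordered splits into two non-empty blocks (which matches the $$n=3$$ computation above), and that the next operation is applied to the block still containing both $$s_1$$ and $$s_2$$. The function name `expected_partitions` is an illustrative choice.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def expected_partitions(n):
    """E(n) under uniformly random splits into two non-empty blocks (assumed model)."""
    if n < 2:
        raise ValueError("need at least two elements")
    total = 2 ** (n - 1) - 1   # unordered splits of an n-set into two non-empty blocks
    result = Fraction(1)       # the first split is always performed
    for k in range(2, n):      # size of the block that still holds both s_1 and s_2
        stay_together = comb(n - 2, k - 2)   # ways to pick the other k-2 members
        result += Fraction(stay_together, total) * expected_partitions(k)
    return result

for n in range(2, 11):
    print(n, expected_partitions(n), float(expected_partitions(n)))
```

Tabulating the values this way makes it easier to compare $$E(n)$$ against candidate continuous approximations.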

## Bound for Expectation of Singular Value

In my case, $$X_{\boldsymbol{\delta}}\in\mathbb{R}^{d\times M}$$ is a function of Rademacher variables $$\boldsymbol{\delta}\in\{1,-1\}^M$$ with $$\delta_i$$ independent uniform random variables taking values in $$\{-1, +1\}$$. $$X_{\boldsymbol{\delta}}=[\sum_{i=1}^{I_1}\delta_{i}\mathbf{x}_{i},\sum_{i=I_1+1}^{I_2}\delta_{i}\mathbf{x}_{i},\ldots,\sum_{i=I_{M-1}+1}^{I_M}\delta_i\mathbf{x}_{i}]$$ is a group-wise sum with known $$I_1,I_2,\ldots,I_M$$ and non-singular $$X=(\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_N)\in\mathbb{R}^{d\times N}$$ where $$N>M\gg d$$.

Given that $$\sigma_i(X_{\boldsymbol{\delta}})$$ denotes $$i$$-th smallest singular value, how can I find the lower bound of the expectation $$\underset{\boldsymbol{\delta}}E\left[\sum_{i=1}^{k} \sigma_{i}^{2}\left(X_{\boldsymbol{\delta}}\right)\right]$$ assuming $$k?

Note: I can find an upper bound using Jensen’s inequality and the concavity of the sum of the $$k$$ smallest eigenvalues, but I am curious whether it is possible to get a lower bound.

I have also posted the question here.

## Expectation – product of expectations is expectation of product?

I have a random variable $$Z(t)$$ (which represents the number of cells at time $$t$$). I know that $$Z(t+\Delta t) = \sum_{k=1}^{\Delta t} Z^{k}(t)$$ with $$Z^{k}(t)$$ all independent.

The problem arises when I use this in a calculation involving expectations.

So I have:

\begin{align} f(x, t + \Delta t) &= \mathbb{E} [x^{Z(t+ \Delta t)}] \\ &= \mathbb{E} [x^{\sum_{k=1}^{\Delta t} Z^{k}(t)}] \\ &= \mathbb{E}[x^{Z^1(t)} \times x^{Z^2(t)} \times \cdots \times x^{Z^{\Delta t}(t)}] \\ &=^{?} \mathbb{E} \prod_{k=1}^{\Delta t} \mathbb{E}[x^{Z^k(t)}] \end{align}

My confusion is with the last line. I thought the expectation of a product is the product of expectations, but this is not what has happened.

Any ideas?
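For independent factors, the rule that the expectation of a product equals the product of expectations can be checked exactly on a small case. The sketch below uses two hypothetical probability mass functions `pmf1` and `pmf2` standing in for two independent copies $$Z^1(t), Z^2(t)$$; these distributions and the value of `x` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

def pgf(pmf, x):
    """E[x^Z] for a random variable Z with probability mass function pmf."""
    return sum(prob * x ** z for z, prob in pmf.items())

# Hypothetical distributions for two independent copies Z^1(t), Z^2(t).
pmf1 = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
pmf2 = {0: Fraction(1, 3), 2: Fraction(2, 3)}
x = Fraction(1, 2)

# Left side: E[x^(Z^1 + Z^2)], summed over the joint distribution,
# which factorises into p1 * p2 because the copies are independent.
lhs = sum(p1 * p2 * x ** (z1 + z2)
          for (z1, p1), (z2, p2) in product(pmf1.items(), pmf2.items()))

# Right side: E[x^(Z^1)] * E[x^(Z^2)], with no outer expectation.
rhs = pgf(pmf1, x) * pgf(pmf2, x)

print(lhs, rhs, lhs == rhs)
```

In this check the product of the individual expectations is already a number, so an additional outer expectation in front of it would leave it unchanged.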

## Expectation value of m parallel games

Easy example to start: You throw an $$n$$-sided die until your lucky number shows. This is a Bernoulli process with $$p=1/n$$, and your expected number $$E$$ of throws is $$n$$.
Now imagine you play this game $$m$$ times in parallel. Stop as soon as any die shows your lucky number, i.e. the first win wins the whole game. (Feel free to also discuss the last-win case 🙂) It’s still a Bernoulli process with $$p'=1-(1-p)^m, E'=1/p'$$.
Now generalize. The game $$G$$ is defined by fixing some winning probability $$p(i)$$ for each move $$i$$; we only assume that $$E$$ is finite. Play $$m$$ copies of $$G$$ in parallel; again, the first win in any copy ends the game. What can one say about $$E'$$?
1. $$E'\le E$$. (trivial)
2. $$E'\ge E/m$$ ? (tempting 🙂)
3. Let $$p(i)$$ stem from some well-known probability distribution, say, instead of the geometric one from above, a Poisson or whatnot. Surely $$E'$$ has already been computed for many of these distributions?
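To experiment with conjectures 1 and 2, $$E'$$ can be approximated numerically via the tail-sum formula $$E[T]=\sum_{t\ge 0}P(T>t)$$. The sketch below assumes that $$p(i)$$ is the probability of winning on move $$i$$ given no earlier win in that copy, and that the $$m$$ copies are independent. The name `expected_first_win` and the truncation `horizon` are illustrative choices.

```python
def expected_first_win(p, m=1, horizon=10_000):
    """Approximate expected move of the first win across m independent copies.

    p(i) is the win probability at move i (i = 1, 2, ...), conditional on no
    earlier win in that copy. Uses E[T] = sum over t >= 0 of P(T > t),
    truncated after `horizon` terms.
    """
    survive = 1.0   # P(no win in a single copy during the moves made so far)
    total = 0.0
    for i in range(1, horizon + 1):
        total += survive ** m    # all m copies still without a win after i-1 moves
        survive *= 1.0 - p(i)
    return total

# Die example from above with n = 6 sides.
single = expected_first_win(lambda i: 1 / 6)
triple = expected_first_win(lambda i: 1 / 6, m=3)
print(single, triple, single / 3 <= triple <= single)
```

Swapping in a different `p` (for example one that varies with `i`) gives a quick way to look for counterexamples to conjecture 2 before attempting a proof.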

## How to use Wald’s equation to determine expectation in gambling model?

In each game played one is equally likely to either win or lose 1. Let $$S$$ be your cumulative winnings if you use the strategy that quits playing if you win the first game, and plays two more games and then quits if you lose the first game. (a) Use Wald’s equation to determine $$E[S]$$.

Let $$X_i$$ be the amount won in game $$i$$, and let $$N$$ be the number of games played. Then $$E[S] =E\bigg[\sum_{i=1}^N X_i\bigg]$$ and, applying Wald’s equation, $$E[S]=E[N]E[X]$$. Since each game is fair, we know that $$E[X]=0$$. So $$E[S]=0$$.

Is this correct?
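The result can be cross-checked by enumerating every outcome of the strategy directly, without Wald’s equation. This is only a sanity check of the value, written as a small sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
expected_S = Fraction(0)
expected_N = Fraction(0)

# Win the first game: stop with N = 1 and S = +1.
expected_S += half * 1
expected_N += half * 1

# Lose the first game: play exactly two more games, so N = 3.
for second, third in product((+1, -1), repeat=2):
    prob = half * half * half            # loss, then two specific outcomes
    expected_S += prob * (-1 + second + third)
    expected_N += prob * 3

print(expected_N, expected_S)
```

The enumeration also gives $$E[N]$$, so both sides of $$E[S]=E[N]E[X]$$ can be compared.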

## The expectation of the total number of pairs of keys in a hash table that collide using universal hashing

I am reading the part of CLRS on perfect hashing. It computes $$\mathbb{E}[\sum_{j=0}^{m-1}{n_j\choose{2}}]$$, where $$m$$ is the number of slots in the hash table and $$n_j$$ is the number of keys in position $$j$$. I don’t understand why we can directly conclude that

$$\mathbb{E}[\sum_{j=0}^{m-1}{n_j\choose{2}}]\leq{n\choose{2}}\frac{1}{m}$$

I understand that since $$h$$ is randomly chosen from a universal hash function family, $$\Pr{(h(x_i)=h(x_j))}\leq{\frac{1}{m}},\forall{i\neq{j}}$$. I don’t understand why we can use the total number of pairs (the combination part) directly because if $$h(x_i)=h(x_j)$$ and $$h(x_j)=h(x_k)$$, then we have $$h(x_i)=h(x_k)$$ immediately instead of a probability of $$\frac{1}{m}$$.

Can someone help me out? Thanks!
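One way to see what is going on is to note that $$\sum_j {n_j\choose 2}$$ is exactly the number of colliding pairs, so its expectation is a sum of per-pair collision probabilities, and linearity of expectation does not require the pair events to be independent. The sketch below checks both facts exhaustively on a small case. It uses the family $$h_{a,b}(k)=((ak+b) \bmod p) \bmod m$$ as an illustrative universal family; the keys, $$p=17$$ and $$m=5$$ are arbitrary choices.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def check_collision_bound(keys, p, m):
    """Average number of colliding pairs over h_{a,b}(k) = ((a*k + b) mod p) mod m."""
    total_pairs = 0
    family_size = 0
    for a in range(1, p):
        for b in range(p):
            h = lambda k: ((a * k + b) % p) % m
            slots = [0] * m
            for k in keys:
                slots[h(k)] += 1
            by_slots = sum(comb(n_j, 2) for n_j in slots)
            by_pairs = sum(1 for x, y in combinations(keys, 2) if h(x) == h(y))
            assert by_slots == by_pairs   # same quantity, counted two ways
            total_pairs += by_pairs
            family_size += 1
    return Fraction(total_pairs, family_size)

keys = [1, 4, 7, 10, 13, 16]            # distinct keys below the prime p
average = check_collision_bound(keys, p=17, m=5)
bound = Fraction(comb(len(keys), 2), 5)
print(average, bound, average <= bound)
```

The dependence described above (two collisions forcing a third) affects the joint distribution of the pair events, but the average computed here only adds up their individual probabilities.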

## Expectation of a linear operator

Could anyone help me with this? I didn’t get any answer on the math and stats StackExchange sites, so I am asking here.

We define $$T: C[0,1]\to C[0,1]$$ by $$T(f(x))= \sum\limits_{k=1}^{m} p_k (f\circ f_k)(x):=\mathbb E( f(X_{n+1})|X_n=x)$$ for a system $$X_{n+1}=f_{\omega_n}(X_n)$$, $$n=0,1,2,\dots$$, where the $$\omega_n$$ are i.i.d. discrete random variables over $$\{1,2,\dots,m\}$$, $$p_k=\text{Prob} (\omega_i=k)$$, and the $$f_k$$ are bounded Lipschitz.

Could anyone explain to me how come $$| T^n(f(x))-T^n(f(y))|=| \mathbb E(f(X_n(x))-f(X_n(y)))|$$?

Thanks for the help.