I am currently reading “*Measure, Integral and Probability*” by Capinski, Marek (see p179). It includes some motivation for the definition of the conditional expectation. For example, given two random variables $ X,Y$ with joint density $ f_{X,Y}$ (and hence the marginal and conditional densities), we want to show that for any set $ A \subset \Omega$ of the form $ A=X^{-1}(B)$ with $ B$ Borel, $ $ \int_A\mathbb{E}(Y|X)dP= \int_A Y dP.$ $ This is one of the defining conditions of a conditional expectation.

The book shows the following calculation:

\begin{align}
\int_A\mathbb{E}(Y|X)dP &= \int_\Omega 1_B(X)\mathbb{E}(Y|X)dP\\
&= \int_\Omega 1_B(X(\omega))\left(\int_\mathbb{R}yf_{Y|X}(y|X(\omega))dy\right)dP(\omega)\\
&=\int_\mathbb{R}\int_\mathbb{R}1_B(x)yf_{Y|X}(y|x)dy f_X(x)dx\\
&=\int_\mathbb{R}\int_\mathbb{R}1_B(x)yf_{X,Y}(x,y)dxdy\\
&= \int_\Omega 1_A(X)YdP\\
&= \int_A YdP.
\end{align}

What I don’t understand is the second-to-last equality immediately above, i.e. $ $ \int_\mathbb{R} y \int_\mathbb{R}1_B(x)f_{X,Y}(x,y)dxdy = \int_\Omega 1_A(X)YdP .$ $ I think it is a typo, since $ X$ takes values in $ \mathbb{R}$ while $ A \subset \Omega$; however, I can’t figure out the correction either!

# Tag: Expectation

## Expectation of number of hubs in a random graph

Suppose $ \Gamma(V, E)$ is a finite simple graph. Let’s call a vertex $ v \in V$ a *hub* if $ \deg(v)^2 > \sum_{w \in O(v)} \deg(w)$. Here $ \deg$ stands for the vertex degree, and $ O(v)$ for the set of all vertices adjacent to $ v$. Let’s define $ H(\Gamma)$ as the number of hubs in $ \Gamma$.

Now, suppose $ G(n, p)$ is an Erdős–Rényi random graph with $ n$ vertices and edge probability $ p$. Does there exist some sort of explicit formula for $ E(H(G(n, p)))$ (as a function of $ n$ and $ p$)?
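In the absence of a formula, the expectation can at least be estimated numerically. The following is a minimal Monte Carlo sketch, not an answer to the question; the function names `sample_gnp`, `count_hubs` and `estimate_expected_hubs` and the default trial count are assumptions made for illustration.

```python
import random

def sample_gnp(n, p, rng):
    """Return adjacency lists of one Erdős–Rényi G(n, p) sample."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            # Each of the n(n-1)/2 possible edges appears independently with probability p.
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj

def count_hubs(adj):
    """Count vertices v with deg(v)^2 > sum of deg(w) over the neighbours w of v."""
    deg = [len(nbrs) for nbrs in adj]
    return sum(
        1
        for v, nbrs in enumerate(adj)
        if deg[v] ** 2 > sum(deg[w] for w in nbrs)
    )

def estimate_expected_hubs(n, p, trials=2000, seed=0):
    """Monte Carlo estimate of E(H(G(n, p))): average hub count over `trials` samples."""
    rng = random.Random(seed)
    return sum(count_hubs(sample_gnp(n, p, rng)) for _ in range(trials)) / trials
```

For example, `estimate_expected_hubs(50, 0.1)` gives an empirical value that a candidate formula could be compared against.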

How did this question arise:

I recently heard of the so-called «friendship paradox», which states that the number of your friends usually does not exceed the average number of friends your friends have. When I first heard about it, I wondered whether it is just a peculiarity of human society or whether there is a mathematical explanation behind it. First I tried to translate the statement of the «friendship paradox» into mathematical language as

Any graph with $ n$ vertices has $ o(n)$ hubs,

but then I quickly found that it is blatantly false in this form:

Suppose $ n > 2$. Take the complete graph $ K_n$ on $ n$ vertices and remove one edge from it; the resulting graph has $ n - 2$ hubs, which is clearly not $ o(n)$.
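The hub count in this counterexample can be checked directly. As a sketch, call the endpoints of the removed edge $ a$ and $ b$ (names introduced here only for the check) and let $ v$ be any other vertex:

$ $ \begin{aligned} \deg(v)^2 &= (n-1)^2 = n^2-2n+1, & \sum_{w \in O(v)} \deg(w) &= 2(n-2)+(n-3)(n-1) = n^2-2n-1, \\ \deg(a)^2 &= (n-2)^2, & \sum_{w \in O(a)} \deg(w) &= (n-2)(n-1). \end{aligned} $ $

So each of the $ n-2$ vertices other than $ a$ and $ b$ is a hub, while $ a$ and $ b$ are not, since $ (n-2)^2 < (n-2)(n-1)$ for $ n > 2$.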

So the «friendship paradox» cannot be translated into deterministic graph theory using the notion of «hubs». I therefore thought that maybe something similar should work for random graphs.

Also:

Later I found out that there actually is a consistent mathematical interpretation of the «friendship paradox» in terms of deterministic graph theory: Friendship paradox demonstration

However, it does not solve my problem, which is finding the expectation of the number of hubs in a random graph. So please do not mark my question as a duplicate of the aforementioned question.

## Is my interpretation of the expectation expression a correct one?

Consider the following expression

$ $ E_{x\sim p_{data}(x)}[ f(x) ] $ $

I understand it as follows.

1) If $ X$ is a collection of discrete random variables $ (X_1, X_2, \ldots, X_n)$, and $ x$ is generated from $ X$ by a particular assignment of all random variables in the tuple

Then

$ $ E_{x\sim p_{data}(x)}[ f(x) ] = \sum\limits_{x} f(x) p_{data}(x) $ $

2) If $ X$ is a collection of continuous random variables $ (X_1, X_2, \ldots, X_n)$, and $ x$ is generated from $ X$ by a particular assignment of all random variables in the tuple

Then

$ $ E_{x\sim p_{data}(x)}[ f(x) ] = \int\limits_{x_1}\int\limits_{x_2} \cdots \int\limits_{x_n} f(x) p_{data}(x) dx_n \cdots dx_1$ $

Is my interpretation correct? If not, where am I going wrong?
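To make the discrete reading concrete, here is a small sketch of the sum over joint assignments. The joint pmf `p_data` and the function `f` below are made-up toy choices, not taken from any particular source.

```python
# Hypothetical joint pmf p_data over x = (x1, x2), each coordinate in {0, 1}.
p_data = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def f(x):
    # Hypothetical test function: the number of ones in the tuple x.
    return sum(x)

def expectation(f, pmf):
    """E_{x ~ pmf}[f(x)] as a sum of f(x) * pmf(x) over every joint assignment x."""
    return sum(f(x) * prob for x, prob in pmf.items())

value = expectation(f, p_data)
```

Here the sum runs over whole tuples $ x = (x_1, x_2)$, which is the discrete case written out for $ n = 2$.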

## The expected number of partitions needed to separate two elements in a set

I came across a problem that can be formulated in terms of set partitions.

Given a set $ S=\{s_1,s_2,\ldots,s_n\}$ with $ n$ elements, I want to separate two elements, say $ s_1,s_2$, in $ S$ by repeatedly applying set partition operations. Each set partition operation *randomly* partitions a set, say $ A$, into two *non-empty* subsets $ B$ and $ C$ such that $ A=B\cup C$ and $ B\cap C=\emptyset$. I want to calculate or approximate the expected number of partition operations, $ E(n)$, needed to separate $ s_1$ and $ s_2$.

Let’s look at two simple cases:

(1) $ n=2$:

In this case, $ S=\{s_1,s_2\}$. The only feasible partition separates $ S=\{s_1,s_2\}$ into $ \{s_1\}$ and $ \{s_2\}$. So $ E(2) = 1$.

(2) $ n=3$:

In this case, $ S=\{s_1,s_2,s_3\}$. There are two situations:

(a) If the first partition is $ \{s_1\},\{s_2,s_3\}$ or $ \{s_2\},\{s_1,s_3\}$, then one partition is enough.

(b) If the first partition is $ \{s_1,s_2\},\{s_3\}$, then I need a second partition splitting $ \{s_1,s_2\}$ into $ \{s_1\},\{s_2\}$. So the number of partitions is 2.

The probability of situation (a) is 2/3 and that of situation (b) is 1/3. So $ E(3)=1\cdot(2/3)+2\cdot(1/3)=4/3$.

I tried using a recursive formula, but it does not seem to have a closed form. I also wonder whether $ E(n)$ can be approximated by some continuous function.
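For reference, one possible recursion can be evaluated exactly. The sketch below assumes that each of the $ 2^{n-1}-1$ unordered splits into two non-empty blocks is equally likely (which matches the $ n=3$ case above) and that only the block still containing both $ s_1$ and $ s_2$ is split again; the function name `expected_partitions` is made up for illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def expected_partitions(n):
    """Exact E(n) under the uniform-split assumption stated in the lead-in."""
    if n < 2:
        raise ValueError("need at least two elements")
    if n == 2:
        return Fraction(1)
    total_splits = 2 ** (n - 1) - 1
    # After one split, s_1 and s_2 stay together in a block of size k
    # in comb(n - 2, k - 2) of the splits, for k = 2, ..., n - 1.
    stay = sum(comb(n - 2, k - 2) * expected_partitions(k) for k in range(2, n))
    return 1 + stay / total_splits
```

Evaluating `float(expected_partitions(n))` for increasing `n` gives data that a proposed approximation could be tested against.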

## Bound for Expectation of Singular Value

In my case, $ X_{\boldsymbol{\delta}}\in\mathbb{R}^{d\times M}$ is a function of Rademacher variables $ \boldsymbol{\delta}\in\{1,-1\}^M$, with the $ \delta_i$ independent uniform random variables taking values in $ \{-1, +1\}$. $ X_{\boldsymbol{\delta}}=[\sum_{i=1}^{I_1}\delta_{i}\mathbf{x}_{i},\sum_{i=I_1+1}^{I_2}\delta_{i}\mathbf{x}_{i},\ldots,\sum_{i=I_{M-1}+1}^{I_M}\delta_i\mathbf{x}_{i}]$ is a group-wise sum with known $ I_1,I_2,\ldots,I_M$ and non-singular $ X=(\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_N)\in\mathbb{R}^{d\times N}$, where $ N>M\gg d$.

Given that $ \sigma_i(X_{\boldsymbol{\delta}})$ denotes the $ i$-th smallest singular value, how can I find a lower bound on the expectation $ \underset{\boldsymbol{\delta}}E\left[\sum_{i=1}^{k} \sigma_{i}^{2}\left(X_{\boldsymbol{\delta}}\right)\right]$, assuming $ k<d$?

Note: I can find an upper bound using Jensen’s inequality and the concavity of the sum of the $ k$ smallest eigenvalues, but I am curious whether it is possible to get a lower bound.

I have also posted the question here.

## Expectation – product of expectations is expectation of product?

I have a random variable $ Z(t)$ (which represents the number of cells at time $ t$ ). I know that $ Z(t+\Delta t) = \sum_{k=1}^{\Delta t} Z^{k}(t)$ with $ Z^{k}(t)$ all independent.

Now it’s using this in a calculation involving expectations that is a problem…

So I have:

$ $ \begin{aligned} f(x, t + \Delta t) &= \mathbb{E} [x^{Z(t+ \Delta t)}] \\ &= \mathbb{E} [x^{\sum_{k=1}^{\Delta t} Z^{k}(t)}] \\ &= \mathbb{E}[x^{Z^1(t)} \times x^{Z^2(t)} \times \cdots \times x^{Z^{\Delta t}(t)}] \\ &=^{?} \mathbb{E} \prod_{k=1}^{\Delta t} \mathbb{E}[x^{Z^k(t)}] \end{aligned} $ $

My confusion is with the last line. I thought the expectation of a product is the product of expectations, but this is not what has happened.

Any ideas?
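For comparison, here is a sketch of what independence gives in two cases. The symbol $ N$ for the number of factors is introduced here and is not in the original post. If $ N$ is a fixed number and the $ Z^k(t)$ are independent, the product rule gives no outer expectation; if $ N$ is itself random and independent of the $ Z^k(t)$ (an assumption), conditioning on $ N$ leaves one:

$ $ \begin{aligned} N \text{ fixed:} \quad & \mathbb{E}\Big[\prod_{k=1}^{N} x^{Z^k(t)}\Big] = \prod_{k=1}^{N} \mathbb{E}\big[x^{Z^k(t)}\big], \\ N \text{ random:} \quad & \mathbb{E}\Big[\prod_{k=1}^{N} x^{Z^k(t)}\Big] = \mathbb{E}\Big[\mathbb{E}\Big[\prod_{k=1}^{N} x^{Z^k(t)} \,\Big|\, N\Big]\Big] = \mathbb{E}\Big[\prod_{k=1}^{N} \mathbb{E}\big[x^{Z^k(t)}\big]\Big]. \end{aligned} $ $

The second line has the same shape as the last line of the calculation, so a random upper limit would be one possible explanation for the outer expectation.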

## Expectation value of m parallel games

Easy example to start: You throw an $ n$-sided die until your lucky number shows. This is a Bernoulli process with $ p=1/n$, and your expected number $ E$ of throws is $ n$.

Now imagine you play this game $ m$ times in parallel. Stop if *any* die shows your lucky number, i.e. the first win wins the whole game. (Feel free to also discuss the last-win case 🙂) It’s still a Bernoulli process with $ p'=1-(1-p)^m$, $ E'=1/p'$.

Now generalize. The game $ G$ is defined by fixing some winning probability $ p(i)$ for each move $ i$; we only assume that $ E$ is finite. Play $ m$ copies of $ G$ in parallel; again, the first win in any copy ends the game. What can one say about $ E'$?

1. $ E'\le E$. (trivial)

2. $ E'\ge E/m$? (tempting 🙂)

3. Let $ p(i)$ stem from some well-known probability distribution, say a Poisson or whatnot instead of the geometric one from above. Surely $ E'$ has already been computed for many of these distributions?
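For experimenting with these questions, $ E'$ can be computed numerically from the tail-sum formula $ E' = \sum_{t \ge 0} P(\text{no win in the first } t \text{ moves})^m$. The sketch below assumes the $ m$ copies are independent and that one round of simultaneous moves counts as a single move; the name `expected_moves` and the truncation parameter `max_moves` are illustrative choices.

```python
def expected_moves(p, m=1, max_moves=10_000):
    """Approximate E' for m independent parallel copies of the game.

    p is a callable giving the winning probability p(i) of move i = 1, 2, ...
    The infinite tail sum is truncated after max_moves terms.
    """
    survival = 1.0  # P(a single copy has not won during moves 1..t), starting at t = 0
    total = 0.0
    for i in range(1, max_moves + 1):
        # The minimum of m independent copies survives t moves with probability survival**m.
        total += survival ** m
        survival *= 1.0 - p(i)
    return total
```

With the die example, `expected_moves(lambda i: 1 / 6, m=2)` should agree with $ 1/p' = 1/(1-(5/6)^2) = 36/11$, and the ratio `expected_moves(p, 1) / expected_moves(p, m)` can be compared with $ m$ for other choices of `p`.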

## How to use Wald’s equation to determine expectation in gambling model?

In each game played one is equally likely to either win or lose 1. Let $ S$ be your cumulative winnings if you use the strategy that quits playing if you win the first game, and plays two more games and then quits if you lose the first game.

(a) Use Wald’s equation to determine $ E[S]$.

Let $ X_i$ be the amount won in game $ i$, and let $ N$ be the number of games played. Then $ $ E[S] =E\bigg[\sum_{i=1}^N X_i\bigg].$ $ Applying Wald’s equation, $ $ E[S]=E[N]E[X].$ $ Since this is a fair game where the player starts with 0 dollars, we know that $ E[X]=0$. So $ E[S]=0$.

**Is this correct?**
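The result can be cross-checked by enumerating the strategy’s outcomes directly. This is a small sketch; the generator name `strategy_outcomes` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

def strategy_outcomes():
    """Yield (probability, N, S) for every outcome of the stopping strategy."""
    half = Fraction(1, 2)
    # Win the first game: stop after N = 1 game with S = +1.
    yield half, 1, 1
    # Lose the first game: play exactly two more games, so N = 3.
    for x2, x3 in product((1, -1), repeat=2):
        yield half ** 3, 3, -1 + x2 + x3

expected_N = sum(prob * n for prob, n, _ in strategy_outcomes())
expected_S = sum(prob * s for prob, _, s in strategy_outcomes())
```

The enumeration gives $ E[N]=2$ and $ E[S]=0$, which agrees with $ E[N]E[X]=2\cdot 0$.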

## The expectation of the total number of pairs of keys in a hash table that collide using universal hashing

I am reading the part of CLRS on perfect hashing. It computes $ $ \mathbb{E}\left[\sum_{j=0}^{m-1}{n_j\choose{2}}\right] $ $

where $ m$ is the number of slots in the hash table and $ n_j$ is the number of keys in slot $ j$. I don’t understand why we can directly conclude that

$ $ \mathbb{E}[\sum_{j=0}^{m-1}{n_j\choose{2}}]\leq{n\choose{2}}\frac{1}{m} $ $

I understand that since $ h$ is randomly chosen from a universal hash function family, $ \Pr{(h(x_i)=h(x_j))}\leq{\frac{1}{m}},\forall{i\neq{j}}$ . I don’t understand why we can use the total number of pairs (the combination part) directly because if $ h(x_i)=h(x_j)$ and $ h(x_j)=h(x_k)$ , then we have $ h(x_i)=h(x_k)$ immediately instead of a probability of $ \frac{1}{m}$ .

Can someone help me out? Thanks!
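One way to see why the pair count can be used directly is linearity of expectation, which does not require the collision events to be independent. As a sketch, write $ I_{ik}$ for the indicator of the event $ h(x_i)=h(x_k)$ (a name chosen here, not CLRS notation):

$ $ \begin{aligned} \sum_{j=0}^{m-1}\binom{n_j}{2} &= \sum_{i<k} I_{ik}, \\ \mathbb{E}\Big[\sum_{i<k} I_{ik}\Big] &= \sum_{i<k}\Pr\big(h(x_i)=h(x_k)\big) \le \binom{n}{2}\frac{1}{m}. \end{aligned} $ $

The events are indeed dependent, as in the three-key example above, but that dependence only affects their joint distribution, not the sum of the individual expectations.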

## Expectation of a linear operator

Could anyone help me with this? I didn’t get any answer on the math and stats StackExchange sites, so I am asking here.

We define $ T: C[0,1]\to C[0,1]$ by $ T(f(x))= \sum\limits_{k=1}^{m} p_k (f\circ f_k)(x):=\mathbb E( f(X_{n+1}) \mid X_n=x)$ for a system $ X_{n+1}=f_{\omega_n}(X_n)$, $ n=0,1,2,\dots$, where the $ \omega_n$ are i.i.d. discrete random variables over $ \{1,2,\dots,m\}$, $ p_k=\text{ Prob} (\omega_i=k)$, and the $ f_k$ are bounded Lipschitz.

Could anyone explain to me how come $ | T^n(f(x))-T^n(f(y))|=| \mathbb E(f(X_n(x))-f(X_n(y)))|$ ?

Thanks for the help.