Eigenvalues of cyclic stochastic matrices

Let’s consider the following $ n \times n$ cyclic stochastic matrix

$ $ M= \begin{pmatrix} 0 & a_2 & & & & b_n \\ b_1 & 0 & a_3 & & & \\ & b_2 & 0 & \ddots & & \\ & & \ddots & \ddots & a_{n-1} & \\ & & & & 0 & a_n \\ a_1 & & & & b_{n-1} & 0 \end{pmatrix} $ $

such that, for all $ i$ , $ a_i,\,b_i$ are positive real numbers with $ a_i+b_i = 1$ , and all other components of the matrix are zero. This is a cyclic matrix in the sense that the associated graph is cyclic.

From the Perron-Frobenius theorem, the eigenvalues $ \lambda$ of such a matrix all belong to the closed unit disk: $ $ (\Re \lambda )^2 + (\Im \lambda )^2 \leq 1 $ $

From numerical explorations, I believe that all eigenvalues of $ M$ belong to the ellipse $ $ (\Re \lambda )^2 + \frac{(\Im \lambda )^2}{(\tanh p)^2} \leq 1 $ $

where $ p = \frac{1}{2}\ln \frac{\sqrt[n]{\prod_i a_i}}{\sqrt[n]{\prod_i b_i}}$ is assumed to be positive (otherwise swap the roles of $ a_i$ and $ b_i$ ).

One of the extremal cases is the symmetric case $ a_i=b_i$ , where $ p=0$ and all eigenvalues are real. Equality is reached in the uniform case where all $ a_i$ are equal to some value and all $ b_i$ are equal to another value, the matrix then being a circulant matrix.
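For anyone who wants to reproduce the numerical exploration, here is a minimal sketch, assuming NumPy and $ n \geq 3$ . The function names `cyclic_matrix` and `ellipse_values` are mine, not from any library. It builds $ M$ from the $ a_i$ (with $ b_i = 1-a_i$ ), computes $ p$ , and evaluates the left-hand side of the conjectured ellipse inequality for each eigenvalue.

```python
import numpy as np

def cyclic_matrix(a):
    """Build M from a_1..a_n (b_i = 1 - a_i): column i holds a_i in row i-1
    and b_i in row i+1, indices taken modulo n (assumes n >= 3)."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    b = 1.0 - a
    M = np.zeros((n, n))
    for i in range(n):
        M[(i - 1) % n, i] += a[i]
        M[(i + 1) % n, i] += b[i]
    return M

def ellipse_values(a):
    """Return (Re l)^2 + (Im l)^2 / tanh(p)^2 for every eigenvalue l, and p.
    Requires p != 0, i.e. prod(a_i) != prod(b_i)."""
    a = np.asarray(a, dtype=float)
    b = 1.0 - a
    p = 0.5 * (np.mean(np.log(a)) - np.mean(np.log(b)))
    lam = np.linalg.eigvals(cyclic_matrix(a))
    return lam.real**2 + (lam.imag / np.tanh(p))**2, p

# Random instance: the conjecture says every printed value should be <= 1.
rng = np.random.default_rng(0)
values, p = ellipse_values(rng.uniform(0.55, 0.95, size=7))
print(p, values.max())
```

In the uniform case (all $ a_i$ equal) the values should all equal 1 up to rounding, which matches the circulant computation $ \lambda_k = \cos\theta_k + i\,(a-b)\sin\theta_k$ with $ \tanh p = a-b$ .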

I can already prove that the imaginary part of each eigenvalue is bounded by $ \tanh p$ (see below), but I am unable to extend the proof to include the real part. I also tried to use Brauer’s theorem on ovals of Cassini as presented in [Horn & Johnson, Matrix Analysis], but it did not get me anywhere.

Do you have any hints or suggestions to prove the inclusion of the eigenvalue into the ellipse?

Proof for the imaginary part:

Denote by $ z$ the left eigenvector associated with the eigenvalue $ \lambda$ . From the eigenvalue equation $ \lambda z = z M $ we have $ $ \forall i,\quad \lambda = a_i \frac{z_{i-1}}{z_i} + b_i\frac{z_{i+1}}{z_i} = \frac{a_i}{a_i+b_i} \frac{z_{i-1}}{z_i} + \frac{b_i}{a_i+b_i} \frac{z_{i+1}}{z_i} $ $ where $ i+1$ and $ i-1$ are evaluated modulo $ n$ , and the second equality follows from $ a_i+b_i=1$ .

By taking the product of the imaginary parts of all the previous equations and denoting $ p_i= \ln \sqrt{\frac{a_i}{b_i}}$ , we get $ $ \Im \lambda = \sqrt{\prod_i \,a_i b_i \Im \frac{z_{i-1}}{z_i} \Im \frac{z_{i+1}}{z_i} }\prod_i \frac{\sinh (p_i+\frac{1}{2}\ln \Im\frac{ z_{i+1} }{z_{i}} \Im\frac{z_{i} }{z_{i-1}} )}{\cosh p_i} \leq \prod_i \frac{\sinh (p_i+\frac{1}{2}\ln \Im\frac{ z_{i+1} }{z_{i}} \Im\frac{z_{i} }{z_{i-1}} )}{\cosh p_i}$ $ The inequality uses that $ \prod_i \Im \frac{z_{i-1}}{z_i}\leq 1$ . The concavity of $ \ln \sinh$ and the convexity of $ \ln \cosh$ give the result $ $ \Im \lambda \leq \tanh p.$ $

Reference Request: book on stochastic calculus (not finance)

I am looking at fractional Gaussian/Brownian noise from a signal-theoretic and engineering point of view. In particular, I am looking at the math behind what defines these noise processes and what consequences this has on the physics, either generating them or consuming these noise signals.

As an engineer by training I am familiar with both (real/multivariate/complex) calculus and basic probability theory and also stochastic signals. But most of what I am doing now is where fractional calculus and stochastic calculus meet (Hic sunt dracones… literally). I think I can get my way around most of the fractional calculus part, but for the stochastic calculus I am in need of better understanding of how it works.

What I am looking for is a book (or lecture notes) that not only gives me an understanding and intuition of how stochastic calculus works (i.e. how to apply it), but also includes the proofs, so that I can tell what I am allowed to do with the theorems and what not. Measure theory shouldn’t be much of a problem, as I have two mathematicians at hand who can explain things if I get stuck.

Stochastic Dominance for Ito Integral

Consider the stochastic integral \begin{equation} \int_0^t f(s) \mathrm{d}W_s \end{equation} for a not necessarily deterministic function $ f$ . Can I bound this random variable in second order stochastic dominance $ \leq_\text{(2)}$ by an appropriately scaled normally distributed random variable?

Here random variables $ X,Y$ satisfy \begin{equation} X \leq_\text{(2)} Y \end{equation} if and only if for all $ q \in \mathbb{R}$ \begin{equation} \int_{-\infty}^q F_X(x) \mathrm{d}x \geq \int_{-\infty}^q F_Y(y) \mathrm{d}y \end{equation} holds. For simplicity one might assume that $ f$ is bounded.
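To make the ordering concrete, here is a small sketch for the special case of a deterministic $ f$ , where the integral is Gaussian with mean $ 0$ and variance $ \int_0^t f(s)^2\,\mathrm{d}s$ . It uses the identity $ \int_{-\infty}^q F_X(x)\,\mathrm{d}x = \mathbb{E}[(q-X)^+]$ , which for $ X \sim N(0,\sigma^2)$ equals $ q\,\Phi(q/\sigma) + \sigma\,\varphi(q/\sigma)$ . The helper names are hypothetical, and the check runs only over a finite grid of $ q$ values, so it illustrates the definition and proves nothing about the random-integrand case.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrated_cdf(q, sigma):
    """int_{-inf}^q F_X(x) dx = E[(q - X)^+] for X ~ N(0, sigma^2)."""
    c = q / sigma
    return q * normal_cdf(c) + sigma * normal_pdf(c)

def dominated_on_grid(sigma_x, sigma_y, grid):
    """Check X <=_(2) Y on a finite grid, with X ~ N(0, sigma_x^2) and
    Y ~ N(0, sigma_y^2), using the definition from the question."""
    return all(integrated_cdf(q, sigma_x) >= integrated_cdf(q, sigma_y)
               for q in grid)

grid = [k / 10.0 for k in range(-50, 51)]
print(dominated_on_grid(2.0, 1.0, grid))  # wider normal vs. narrower normal
```

Since the derivative of $ q\,\Phi(q/\sigma) + \sigma\,\varphi(q/\sigma)$ with respect to $ \sigma$ is $ \varphi(q/\sigma) > 0$ , a centered normal with larger variance is smaller in this ordering.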

The mean value of phase noise as a stochastic process

  1. What is the mean value of phase noise as a stochastic process?
  2. Where can I get a theoretical analysis of this topic?

PS: A PLL produces cos(2πf_c t + φ(t)). The phase noise refers to φ(t). By the mean value of the phase noise I mean the mathematical expectation of the stochastic process φ(t), namely E{ φ(t) }.

Bound and approximation of function of stochastic processes

I have two claims that seem very easy at first sight, but I have trouble proving them. I have two pairs of stochastic processes $ \{X_{n,j}(t_j) : t_j \geq 0 \}$ and $ \{Y_{n,j}(t_j): t_j \geq 0\}$ for $ j=1,2,$ and can suppose that for both $ j$ they satisfy

$ \vert X_{n,j}(t_j) - Y_{n,j}(t_j) \vert \leq C_j t_j^{1/2 - \beta_j}$ for some $ \beta_j > 0$

and (under some more regularity conditions)

$ \sup \limits_{t_j \in [0,1]} \vert X_{n,j}(t_j) - Y_{n,j}(t_j) \vert = o(1)$ as $ n \to \infty$ .

Now I want to verify whether also $ \vert \sum_{j=1}^2 X_{n,j}^2(t_j) - \sum_{j=1}^2 Y_{n,j}^2(t_j) \vert \leq \sum_{j=1}^2 C_j t_j^{1/2 - \beta_j}$

and

$ \sup \limits_{t_1,t_2 \in [0,1]} \vert \sum_{j=1}^2 X_{n,j}^2(t_j) - \sum_{j=1}^2 Y_{n,j}^2(t_j) \vert = o(1)$ as $ n \to \infty$

hold. This seems very simple at first, since I only use the continuous function $ (x,y) \mapsto x^2+y^2$ here, but the continuous mapping theorem doesn’t seem to be the correct way to prove this. Can anyone point me in the right direction?

Thank you!

Expected Solution of a Stochastic Differential Equation Expressed as Conditional Expectation

To all you geniuses out there: this is a tough one.

Preliminaries and Rigorous Technical Framework

  • Let $ T \in (0, \infty)$ be fixed.

  • Let $ d \in \mathbb{N}_{\geq 1}$ be fixed.

  • Let $ $ (\Omega, \mathcal{G}, (\mathcal{G}_t)_{t \in [0,T]}, \mathbb{P})$ $ be a complete probability space with a complete, right-continuous filtration $ (\mathcal{G}_t)_{t \in [0,T]}$ .

  • Let $ $ B : [0,T] \times \Omega \rightarrow \mathbb{R}^d, \quad (t,\omega) \mapsto B_t(\omega)$ $ be a standard $ d$ -dimensional $ (\mathcal{G}_t)_{t \in [0,T]}$ -adapted Brownian motion on $ \mathbb{R}^d$ such that, for every pair $ (t,s) \in \mathbb{R}^2$ with $ 0 \leq t < s$ , the random variable $ B_s-B_t$ is independent of $ \mathcal{G}_t$ .

  • Let \begin{align} &\sigma: \mathbb{R}^d \rightarrow \mathbb{R}^{d \times d}, \\ &\mu: \mathbb{R}^d \rightarrow \mathbb{R}^{d}, \end{align} be affine linear transformations, i.e. let there be matrices $ (A^{(\sigma)}_1,\ldots,A^{(\sigma)}_d, \bar{A}^{(\sigma)}):= \theta_{\sigma} \in (\mathbb{R}^{d \times d})^{d+1}$ such that, for all $ x \in \mathbb{R}^d$ , \begin{equation} \sigma(x) = ( A^{(\sigma)}_1 x \mid \cdots \mid A^{(\sigma)}_d x) + \bar{A}^{(\sigma)}, \end{equation} where $ A^{(\sigma)}_i x$ describes the $ i$ -th column of the matrix $ \sigma(x) \in \mathbb{R}^{d \times d}$ , and let there be a matrix-vector pair $ (A^{(\mu)}, \bar{a}^{(\mu)}) := \theta_{\mu} \in \mathbb{R}^{d \times d} \times \mathbb{R}^d$ such that, for all $ x \in \mathbb{R}^d$ , \begin{equation} \mu (x) = A^{(\mu)}x + \bar{a}^{(\mu)}. \end{equation}

  • Let \begin{equation} \varphi : \mathbb{R}^d \rightarrow \mathbb{R} \end{equation} be a fixed, continuous and at most polynomially growing function, i.e. let $ \varphi$ be continuous and let there be a constant $ C \in [1, \infty)$ such that, for all $ x \in \mathbb{R}^d$ it holds that \begin{equation} \lVert \varphi(x) \rVert \leq C (1+\lVert x \rVert )^C. \end{equation}

  • Let $ x_0 \in \mathbb{R}^d$ be fixed.


Consider the following stochastic differential equation, given as an equivalent stochastic integral equation, where the multidimensional integrals are to be read componentwise:

\begin{equation} S_t = x_0 + \int_{0}^{t} \mu(S_s) ds + \int_{0}^{t} \sigma (S_s) dB_s. \end{equation}

Under our assumptions, it is the case that an (up to indistinguishability) unique solution process

$ $ S^{(x_0, \theta_{\sigma}, \theta_{\mu})} :[0,T] \times \Omega \rightarrow \mathbb{R}^d, \quad (t, \omega) \mapsto S_t(\omega),$ $

for this equation exists (to see this, consider for example Theorem 8.3. in Brownian Motion, Martingales and Stochastic Calculus from Le Gall).

I am interested in the expectation of $ S^{(x_0, \theta_{\sigma}, \theta_{\mu})}$ at time $ T$ when passed through the function $ \varphi$ : $ $ \mathbb{E}[\varphi(S^{(x_0, \theta_{\sigma}, \theta_{\mu})}_T)].$ $ More specifically, I want to express $ \mathbb{E}[\varphi(S^{(x_0, \theta_{\sigma}, \theta_{\mu})}_T)]$ in the following way as a conditional expectation: $ $ \mathbb{E}[\varphi(S^{(x_0, \theta_{\sigma}, \theta_{\mu})}_T)] = \mathbb{E}[\varphi(S^{(X_0, \Theta_{\sigma}, \Theta_{\mu})}_T) \mid (X_0, \Theta_{\sigma}, \Theta_{\mu}) = (x_0, \theta_{\sigma}, \theta_{\mu})]. $ $

Here $ $ X_0 : \Omega \rightarrow \mathbb{R}^d, $ $ $ $ \Theta_{\mu} : \Omega \rightarrow \mathbb{R}^{d \times d} \times \mathbb{R}^d,$ $ $ $ \Theta_{\sigma} : \Omega \rightarrow (\mathbb{R}^{d \times d})^{d+1},$ $ are $ \mathcal{G}_0$ -measurable random variables, which define the initial value $ x_0$ of the process at $ t=0$ as well as the entries of the affine-linear coefficient functions $ \mu$ and $ \sigma$ . Moreover, $ \Sigma$ is a random function.

The random variable

$ $ S^{(X_0, \Theta_{\sigma}, \Theta_{\mu})}_T : \Omega \rightarrow \mathbb{R}^d$ $

is implicitly defined by the procedure of first “drawing” the random variables $ (X_0, \Theta_{\sigma}, \Theta_{\mu})$ at time $ t = 0$ in order to obtain fixed values $ $ (X_0, \Theta_{\sigma}, \Theta_{\mu}) = (\tilde{x}_0, \tilde{\theta}_{\sigma}, \tilde{\theta}_{\mu}) $ $ and then “afterwards” setting $ $ S^{(X_0, \Theta_{\sigma}, \Theta_{\mu})}_T := S^{(\tilde{x}_0, \tilde{\theta}_{\sigma}, \tilde{\theta}_{\mu})}_T, $ $ where
$ $ S^{(\tilde{x}_0, \tilde{\theta}_{\sigma}, \tilde{\theta}_{\mu})} :[0,T] \times \Omega \rightarrow \mathbb{R}^d, \quad (t, \omega) \mapsto S^{(\tilde{x}_0, \tilde{\theta}_{\sigma}, \tilde{\theta}_{\mu})}_t(\omega) $ $ is the (up to indistinguishability) unique solution process of the stochastic differential equation

\begin{equation} S_t = \tilde{x}_0 + \int_{0}^{t} \tilde{\mu}(S_s) ds + \int_{0}^{t} \tilde{\sigma} (S_s) dB_s. \end{equation}

Here, $ \tilde{\sigma}$ and $ \tilde{\mu}$ are the affine-linear maps associated with the parameter values $ \tilde{\theta}_{\sigma}$ and $ \tilde{\theta}_{\mu}$ as described above.
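To illustrate the quantity $ \mathbb{E}[\varphi(S_T)]$ for one fixed draw of the parameters, here is a rough Monte Carlo sketch based on the Euler-Maruyama scheme. This is a discretization I am swapping in purely for illustration; it is not the exact solution process, and it does not address the measurability questions. The function name `euler_expectation` and the array layout (`A_sigma[i]` holding $ A^{(\sigma)}_i$ ) are my own conventions.

```python
import numpy as np

def euler_expectation(phi, x0, A_mu, a_mu, A_sigma, Abar_sigma,
                      T=1.0, n_steps=200, n_paths=10_000, seed=0):
    """Monte Carlo estimate of E[phi(S_T)] for the affine SDE, using
    Euler-Maruyama steps. Shapes: A_mu (d, d), a_mu (d,),
    A_sigma (d, d, d) with A_sigma[i] = A_i, Abar_sigma (d, d).
    phi maps an array of shape (n_paths, d) to shape (n_paths,)."""
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    A_mu = np.asarray(A_mu, dtype=float)
    a_mu = np.asarray(a_mu, dtype=float)
    A_sigma = np.asarray(A_sigma, dtype=float)
    Abar_sigma = np.asarray(Abar_sigma, dtype=float)
    d = x0.shape[0]
    dt = T / n_steps
    S = np.tile(x0, (n_paths, 1))
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, d))
        drift = S @ A_mu.T + a_mu
        # sigma[m, j, i] = (A_i S_m)_j + Abar[j, i]: column i of sigma(S_m)
        sigma = np.einsum("ijk,mk->mji", A_sigma, S) + Abar_sigma
        S = S + drift * dt + np.einsum("mji,mi->mj", sigma, dB)
    return float(np.mean(phi(S)))

# One fixed parameter draw in d = 1: dS = 0.1 S dt + 0.2 S dB, phi(x) = x.
estimate = euler_expectation(lambda S: S[:, 0], [1.0], [[0.1]], [0.0],
                             [[[0.2]]], [[0.0]], n_steps=100, n_paths=20_000)
print(estimate)
```

Repeating this for different parameter values gives a numerical picture of the map $ (x_0, \theta_{\sigma}, \theta_{\mu}) \mapsto \mathbb{E}[\varphi(S^{(x_0, \theta_{\sigma}, \theta_{\mu})}_T)]$ , which is the function one would hope to identify with the conditional expectation.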

Now, my questions:

  1. I know that there are technical problems with the way I “defined” the random variable $ S^{(X_0, \Theta_{\sigma}, \Theta_{\mu})}$ , although I hope the idea is clear. How can I make the definition of $ S^{(X_0, \Theta_{\sigma}, \Theta_{\mu})}$ rigorous in the above framework?
  2. After having obtained a rigorous definition of $ S^{(X_0, \Theta_{\sigma}, \Theta_{\mu})}$ , how can I then show that $ $ \mathbb{E}[\varphi(S^{(x_0, \theta_{\sigma}, \theta_{\mu})}_T)] = \mathbb{E}[\varphi(S^{(X_0, \Theta_{\sigma}, \Theta_{\mu})}_T) \mid (X_0, \Theta_{\sigma}, \Theta_{\mu}) = (x_0, \theta_{\sigma}, \theta_{\mu})] ?$ $

If further regularity assumptions (for example on the random variables $ X_0, \Theta_{\sigma}, \Theta_{\mu}$ ) are necessary in order to answer the above questions in a satisfactory way, then these can be made without second thoughts.

These questions are at the core of my current research. I am stuck and I would be extremely grateful for any advice!

On Riemann integration of stochastic processes of order $p$

Let $ x:[a,b]\times\Omega\rightarrow\mathbb{R}$ be a stochastic process, where $ \Omega$ is the sample space from an underlying probability space. Let $ L^p$ be the Lebesgue space of random variables on $ \Omega$ with finite absolute moment of order $ p$ , with norm $ \|\cdot\|_p$ .

Consider the following definition of Riemann integrability in the sense of $ L^p$ : we say that $ x$ is $ L^p$ -Riemann integrable on $ [a,b]$ if there is a random variable $ I$ and a sequence of partitions $ \{P_n\}_{n=1}^\infty$ with mesh tending to $ 0$ , $ P_n=\{a=t_0^n<t_1^n<\ldots<t_{r_n}^n=b\}$ , such that, for any choice of interior points $ s_i^n\in [t_{i-1}^n,t_i^n]$ , we have $ \lim_{n\rightarrow\infty} \sum_{i=1}^{r_n} x(s_i^n)(t_i^n-t_{i-1}^n)=I$ in $ L^p$ . In this case, $ I$ is denoted as $ \int_a^b x(t)\,dt$ . This approach is defined in (T.T. Soong, Random Differential Equations in Science and Engineering, Academic Press, New York, 1973) and (T.L. Saaty, Modern Nonlinear Equations, Dover Publications Inc., New York, 1981), for instance.
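As a concrete instance of the definition with $ p=2$ , take standard Brownian motion $ W$ on $ [0,1]$ (my choice of example, not taken from the cited books). Because $ \operatorname{Cov}(W_s,W_t)=\min(s,t)$ , the squared $ L^2$ distance between a tagged Riemann sum and $ I=\int_0^1 W_t\,dt$ can be computed exactly, without simulation, using $ \mathbb{E}[W_s I] = s - s^2/2$ and $ \mathbb{E}[I^2]=1/3$ . The function name below is hypothetical.

```python
import numpy as np

def l2_error_squared(partition, tags):
    """E|S(P, W) - I|^2 for Brownian motion on [0, 1], where S(P, W) is the
    Riemann sum with the given tags and I is the integral of W over [0, 1]."""
    t = np.asarray(partition, dtype=float)
    s = np.asarray(tags, dtype=float)
    w = np.diff(t)                      # interval lengths t_i - t_{i-1}
    K = np.minimum.outer(s, s)          # Cov(W_s, W_s') = min(s, s')
    var_sum = w @ K @ w                 # E[S^2]
    cross = w @ (s - 0.5 * s**2)        # E[S I], using E[W_s I] = s - s^2/2
    return var_sum - 2.0 * cross + 1.0 / 3.0

n = 10
t = np.linspace(0.0, 1.0, n + 1)
print(l2_error_squared(t, t[:-1]))                  # left endpoints as tags
print(l2_error_squared(t, 0.5 * (t[:-1] + t[1:])))  # midpoints as tags
```

For uniform partitions a direct calculation gives $ 1/(3n^2)$ for left-endpoint tags and $ 1/(12n^2)$ for midpoint tags, so in this example the limit does not depend on the choice of interior points.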

I have not been able to read a full exposition on $ L^p$ -Riemann integration anywhere. I have several questions regarding this definition:

  1. Once we know that $ x$ is $ L^p$ -Riemann integrable and that such a special sequence of partitions $ \{P_n\}_{n=1}^\infty$ exists, can we take any other sequence of partitions $ \{P_n'\}_{n=1}^\infty$ ? I mean: for any $ \{P_n'\}_{n=1}^\infty$ with mesh tending to $ 0$ and any choice of interior points, do the corresponding Riemann sums tend to $ I$ in $ L^p$ ?

  2. Can this definition be related to upper and lower sums, as one does in real integration with the Darboux integral?

  3. Equivalence with this statement: there is a random variable $ I$ such that: for every $ \epsilon>0$ , there is a partition $ P_\epsilon$ such that, for every partition $ P$ finer than $ P_\epsilon$ and for any choice of interior points, the corresponding Riemann sum $ S(P,x)$ satisfies $ \| S(P,x)-I\|_p<\epsilon$ .

  4. Equivalence with this statement: there is a random variable $ I$ such that: for every $ \epsilon>0$ , there is a $ \delta>0$ such that for any partition $ P$ with $ \|P\|<\delta$ and for any choice of interior points, the corresponding Riemann sum $ S(P,x)$ satisfies $ \| S(P,x)-I\|_p<\epsilon$ .

And now we move to several variables. Let $ x:[a,b]\times[c,d]\times\Omega\rightarrow\mathbb{R}$ be a stochastic process. Briefly, we say that $ x$ is $ L^p$ -Riemann integrable on $ [a,b]\times[c,d]$ if there is a random variable $ I$ and a sequence of partitions $ \{P_n\}_{n=1}^\infty$ with mesh tending to $ 0$ such that, for any choice of interior points, the corresponding Riemann sums tend to $ I$ in $ L^p$ . In such a case, $ I$ is denoted $ \iint_{[a,b]\times[c,d]} x(t,s)\,dt\,ds$ . Do the above questions hold for this double integral as well? And another question in this particular setting: consider two sequences of partitions $ \{P_n'\}_{n=1}^\infty$ and $ \{P_m''\}_{m=1}^\infty$ of $ [a,b]$ and $ [c,d]$ , respectively, with mesh tending to $ 0$ , and let $ P_{n,m}=P_n'\times P_m''$ . Do the Riemann sums corresponding to $ P_{n,m}$ converge to $ I$ in $ L^p$ as $ n,m\rightarrow\infty$ (in the sense of double sequences)?

Space-time stochastic processes and their generalizations

In Section 1.2.6 of the book Weak and measure-valued solutions to evolutionary partial differential equations – Malek, Necas, Rokyta, Ruzicka, 1996, something like this is written:

Let’s say we have some deterministic real valued function $ u(t,x):[0,T] \times A \rightarrow \mathbb{R}$ where $ A\subset \mathbb{R}^d$ . We could think of $ u$ in a different way. For $ u(t,x)$ the map $ $ u(t):x \mapsto u(t,x) $ $

is an element of some function space (Sobolev, Lebesgue,…). Then the function $ $ t \mapsto u(t) $ $

maps the interval [0,T] into that function space…

The idea is to regard a function of time and space as a collection of functions of space that is parametrized by time.

So what happens when we add some randomness to the function $ u(t,x)$ ? There are a lot of generalizations of stochastic processes, but I am interested in the two space-time generalizations given below.

a) Could we say that the space-time stochastic process $ u(t,x,\omega)$ maps $ [0,T] \times A \times \Omega \rightarrow \mathbb{R}$ , where $ A\subset \mathbb{R}^d$ ? I work on problems in stochastic evolutionary equations, and I see this type of generalization almost nowhere. The one I usually see is

b) Instead of using $ u(t,x,\omega)$ given above, it is customary to use $ u(t,\omega)$ that maps $ [0,T] \times \Omega$ into some function space. Definitions like this can be found for example in Chapter 3 of the book Stochastic Equations in Infinite Dimensions – Da Prato, Zabczyk, 1992.

So instead of using real-valued $ u(t,x,\omega) $ we use Banach space-valued $ u(t,\omega) $ . For example, spaces like $ u(t,\omega) \in L^p(\Omega,L^q(0,T;L^{\infty}))$ or $ u(t,\omega) \in L^p(\Omega,C([0,T];W^{k,r}))$ or something similar.

My questions are:

  1. What are the advantages of using processes b) instead of processes a)? I.e., why would somebody use b) instead of a) and vice versa? So it is a question of using real-valued or Banach space-valued processes.

  2. For the processes in a), what would be their trajectories, i.e. do we now fix $ x$ and $ \omega$ instead of just $ \omega$ as for typical stochastic processes? For the processes given in b), trajectories usually belong to spaces like $ C([0,T];L^p(\mathbb{R}^d))$ or similar.

Thanks for the help. And if anyone needs some clarification of this question or more examples, please write it in the comments below.

Sucker Bet – Coin Flipping Stochastic Process

I am having a lot of trouble working out this exercise. I have tried constructing the 8×8 transition matrix over all possible combinations of three flips of the coin {HHH, HHT, HTH, … , TTT}, then calculating an exit distribution and trying to find P(going to player 2’s strategy < going to player 1’s strategy), but I keep getting the all-ones vector when solving (using the method outlined in Durrett, (I-r)^-1 * v = h). Any advice would be greatly appreciated.
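For reference, the computation described above can be sketched as follows. The exercise's actual patterns are not stated here, so the two length-3 patterns passed in are placeholders, and `win_probability` is a hypothetical helper name. The states are the last three flips, the two patterns are made absorbing, r is the transition matrix restricted to the remaining six states, and v holds the one-step probabilities of hitting the target pattern.

```python
from itertools import product
import numpy as np

def win_probability(target, other):
    """P(`target` appears before `other`) for a fair coin and two distinct
    length-3 patterns, via the exit distribution h = (I - r)^{-1} v."""
    states = ["".join(s) for s in product("HT", repeat=3)]
    transient = [s for s in states if s not in (target, other)]
    idx = {s: k for k, s in enumerate(transient)}
    r = np.zeros((len(transient), len(transient)))
    v = np.zeros(len(transient))
    for s in transient:
        for flip in "HT":
            nxt = s[1:] + flip          # drop oldest flip, append new one
            if nxt == target:
                v[idx[s]] += 0.5
            elif nxt in idx:
                r[idx[s], idx[nxt]] += 0.5
            # a step into `other` contributes nothing to h
    h = np.linalg.solve(np.eye(len(transient)) - r, v)
    # The first three flips are uniform over the 8 states.
    start = {target: 1.0, other: 0.0}
    start.update({s: h[idx[s]] for s in transient})
    return sum(start.values()) / 8.0

print(win_probability("THH", "HHT"))
```

Note that h can legitimately be the all-ones vector: for example, with target THH against HHH, none of the six transient states can reach HHH before THH, so the whole advantage comes from averaging over the first three flips, where HHH itself has probability 1/8.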

Stochastic Fixpoint Approximations of Contractions


Consider a contraction $ f\colon\mathbb{R}^S\to\mathbb{R}^S$ with $ f(X^*)=X^*$ , where function evaluation at a certain position is only possible with some stochastic error. Let $ Y(X)$ denote this imprecise evaluation, with $ $ \mathbb{E}[Y(X)]=f(X)$ $

Then our stochastic approximation attempt is for $ x\in S$ : $ $ X_{n+1}(x)=(1-\alpha_n(x))X_n(x)+\alpha_n(x)Y_n(X_n)(x)$ $

Which can be rewritten as: $ $ \Delta_{n+1}(x):=X_{n+1}(x)-X^*(x)=(1-\alpha_n(x))\Delta_n(x)+\alpha_n(x)\underbrace{[Y_n(X_n)(x)-f(X^*)(x)]}_{=:F_n(x)} $ $

Then $ F_n$ has the following properties: $ $ \|\mathbb{E}[F_n\mid X_n]\|=\|f(X_n)-X^*\|\le\gamma\|X_n-X^*\|=\gamma\|\Delta_n\| $ $ and $ $ \begin{align} \|Var[F_n(x)\mid X_n]\| &\le \mathbb{E}[F_n^2(x)\mid X_n]\\ &\le \mathbb{E}[f(X_n)(x)-X^*(x)+Y_n(X_n)(x)-f(X_n)(x)]^2\\ &\le2([f(X_n)(x)-X^*(x)]^2 +\mathbb{E}[Y_n(X_n)(x)-f(X_n)]^2)\\ &\le2([\gamma\|\Delta_n\|^2+Var(Y_n(X_n))]) \end{align} $ $ Also assume that the variance of $ Y_n$ can be bounded by $ C(1+\|\Delta_n\|)$ .
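To fix ideas, here is a toy simulation of the iteration. Everything specific in it is an assumption made for illustration: the contraction is $ f(X)=\gamma X + c$ (so $ X^* = c/(1-\gamma)$ ), the noise is i.i.d. Gaussian with constant variance, and the same step size $ \alpha_n = 1/(n+1)$ is used for every coordinate. It is a sanity check of the scheme, not of the theorem's hypotheses.

```python
import numpy as np

def stochastic_fixpoint(c, gamma=0.5, noise_sd=0.1, n_iter=20_000, seed=0):
    """Run X_{n+1} = (1 - alpha_n) X_n + alpha_n Y_n(X_n) with
    Y_n(X) = f(X) + noise and f(X) = gamma * X + c (toy contraction).
    Returns the final iterate and the true fixed point c / (1 - gamma)."""
    rng = np.random.default_rng(seed)
    c = np.asarray(c, dtype=float)
    X = np.zeros_like(c)
    for n in range(n_iter):
        alpha = 1.0 / (n + 1)
        Y = gamma * X + c + rng.normal(0.0, noise_sd, size=c.shape)
        X = (1.0 - alpha) * X + alpha * Y
    return X, c / (1.0 - gamma)

X_final, X_star = stochastic_fixpoint([1.0, 2.0, 3.0])
print(np.max(np.abs(X_final - X_star)))
```

With these step sizes the deterministic part of the error decays like $ n^{-(1-\gamma)}$ , so convergence is slow when $ \gamma$ is close to 1.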

This is the problem that On the Convergence of Stochastic Iterative Dynamic Programming Algorithms (Jaakkola et al. 1994) addresses with the following theorem:

[Image: statement of the theorem]

And they show that this theorem can be applied to a range of reinforcement learning algorithms.

The main difficulty in proving this theorem is generalizing existing Stochastic Approximation results (e.g. Dvoretzky 1956) to higher dimensions and to error variances which depend on the approximation sequence.

If the sequence converged to 0, then the variance would of course be bounded, which would lead to convergence of the sequence. So they bootstrap convergence by arguing that if $ X_{n+1}(x)=G(X_n,Y_n,x)$ with $ $ G(\beta X_n,Y_n,x)=\beta G(X_n,Y_n,x)$ $ then scaling $ X_n(\omega)$ to keep it within a bound $ C$ is equivalent to initializing with $ \beta X_0(\omega)$ . And if $ X_n(\omega)$ converges given the assumption that we scale only if it wanders above the threshold $ C$ , then there is some $ N\in\mathbb{N}$ such that it stays within an epsilon neighbourhood of 0 afterwards and no more scaling is necessary. This means that we only scale a finite number of times. But scaling by a finite amount does not impact the convergence, so we wouldn’t have had to do that. (This is their Lemma 2.)

While the recursion from the theorem is not of this form, they manage to reduce the problem to the following lemma, which is where I am stuck.

Problem Explanation

[Image: statement of the lemma]

I am unable to show that this sequence converges, and I do not seem to have missed something obvious, since I got no responses on Math SE. But from my attempts I have built an intuition for why this statement is probably true even though I cannot prove it, which is what I am going to present now.

My Attempts

What I have so far (I’ll leave out w.p.1 everywhere to avoid clutter, since it is pretty much just analysis arguments for the cases where an appropriate $ \omega$ is selected):

  1. Since $ X_n$ is bounded we know that there are convergent subsequences.
  2. $ |X_{n+1}-X_{n}|\le (\alpha_n+\gamma \beta_n)C_1$ which implies that the distance between neighbours converges to zero, but it isn’t enough for Cauchy (not even with convergent subsequences).
  3. $ \lim\inf_n X_n\ge 0$ since $ $ \begin{align} X_{n+1}(x)&=(1-\alpha_n(x))X_n(x) + \gamma\beta_n(x)\|X_n\| \\ &\ge (1-\alpha_n(x))X_n(x)\\ &\ge \dots \ge \prod^n_{k=0}(1-\alpha_k(x))X_0 \to 0 \end{align}$ $ since $ \sum \alpha_n=\infty$ (cf. Infinite Product).
  4. Because of $ \alpha_n(x) \to 0$ we know that $ (1-\alpha_n(x))\ge 0$ for almost all $ n$ , and if $ X_n(x)\ge 0$ then $ $ X_{n+1}(x) = \underbrace{(1-\alpha_n(x))}_{\ge 0} \underbrace{X_n(x)}_{\ge 0} +\underbrace{\gamma\beta_n(x)\|X_n\|}_{\ge 0} $ $ thus if one element of the sequence is positive all following elements will be positive too. The sequences which stay negative converge to zero ($ \lim\inf X_n\ge 0$ ). The other sequences will be positive for almost all n.
  5. For $ \|X_n\|$ not to converge, we need $ \|X_n\|=\max_x X_n(x)$ for infinitely many $ n$ . If it were equal to the maximum of the negative sequences for almost all $ n$ , it would converge: $ $ \|X_n\|=\max_x -X_n(x) \le \max_x - \prod_{k=0}^n (1-\alpha_k) X_0 \to 0 $ $

If we set $ \beta_n=0$ we have $ $ X_m=\prod_{k=n}^{m-1} (1-\alpha_k)X_n \to 0$ $ So my intuition is: since $ \beta_n$ is smaller than $ \alpha_n$ (on average), replacing $ \beta_n$ with $ \alpha_n$ should probably be fine, since it introduces a larger difference from zero. So I think of going in the direction $ $ X_{n+1}\sim (1-\alpha_n)X_n +\gamma \alpha_n X_n = (1-(1-\gamma)\alpha_n)X_n$ $ which is fine since $ \sum(1-\gamma)\alpha_n =\infty$ for $ \gamma\in(0,1)$ .
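The heuristic can be checked numerically in the simplest deterministic setting. The sketch below assumes $ \beta_n=\alpha_n=1/(n+1)$ for every coordinate and takes $ \|\cdot\|$ to be the maximum norm. Both are my simplifications; the lemma itself allows random, coordinate-dependent step sizes.

```python
import numpy as np

def iterate_norm_recursion(x0, gamma=0.5, n_iter=10_000):
    """Iterate X_{n+1}(x) = (1 - alpha_n) X_n(x) + gamma * alpha_n * ||X_n||
    with alpha_n = 1 / (n + 1) and the max norm; return all norms ||X_n||."""
    X = np.asarray(x0, dtype=float)
    norms = [np.max(np.abs(X))]
    for n in range(n_iter):
        alpha = 1.0 / (n + 1)
        X = (1.0 - alpha) * X + gamma * alpha * np.max(np.abs(X))
        norms.append(np.max(np.abs(X)))
    return np.array(norms)

norms = iterate_norm_recursion([1.0, -2.0, 0.5])
print(norms[0], norms[-1])
```

In this special case one has $ \|X_{n+1}\|\le(1-(1-\gamma)\alpha_n)\|X_n\|$ , so the norms decrease monotonically, roughly like $ n^{-(1-\gamma)}$ .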

Case $ |S|=1$ : If $ X_n\le 0$ for all $ n$ , then by 3. $ X_n\to 0$ .

Otherwise by 4. there exists an $ N\in\mathbb{N}$ such that $ $ X_n\ge0 \quad \forall n\ge N$ $

Therefore $ $ \begin{align} X_{n+1}&=(1-\alpha_n)X_n+\gamma \beta_n X_n\\ &=(1-\mathbb{E}[\alpha_n\mid P_n] +\gamma \mathbb{E}[\beta_n\mid P_n])X_n + (\gamma\beta_n-\gamma\mathbb{E}[\beta_n\mid P_n] -\alpha_n +\mathbb{E}[\alpha_n\mid P_n])X_n\\ &\le (1-(1-\gamma)\mathbb{E}[\alpha_n\mid P_n])X_n + r_n X_n \end{align}$ $

with $ r_n=(\gamma\beta_n-\gamma\mathbb{E}[\beta_n\mid P_n] -\alpha_n +\mathbb{E}[\alpha_n\mid P_n])$ . Then $ $ \mathbb{E}[r_n]=0$ $ And given that $ \sum \beta_n^2 < \infty$ and $ \sum \alpha_n^2 < \infty$ and since $ X_n$ is bounded and therefore the variance of $ X_n$ is bounded, I think that $ \sum\mathbb{E}[r_nX_n]^2<\infty$ but I couldn’t show that yet.

But I still don’t really know how to generalize this to the higher dimensional case even if I could prove this.

Since this attempt would also show that $ X_n\to 0$ directly I don’t think that this was the intended approach. But I can not figure out what they intended me to do either.

I hope that this question is fine for MathOverflow. I have been stuck for weeks now with barely any new ideas, and on Math SE no one seems to be able to help either. I would appreciate any suggestions.