I am very new to Mathematica. I am doing some symbolic manipulations, simplifying expressions with Mathematica's `FullSimplify` function. It gives me unpleasant numerical approximations of Pi, whereas on my friend's laptop it gives exact results like . How can I set my Mathematica not to approximate Pi/2 as 1.5708, and to treat numbers of order 1.e-16 as zero? In short, how can I configure my Mathematica to get the same behaviour?
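A likely cause (an assumption, since neither notebook is shown): this is usually not a configuration setting at all, but a difference in the inputs. If any number in the input is a machine-precision decimal such as `1.`, Mathematica propagates approximate arithmetic through the whole computation, while exact integers and rationals keep results symbolic. A minimal sketch:

```mathematica
(* Exact input keeps the result symbolic *)
FullSimplify[ArcTan[1]]    (* Pi/4 *)

(* A machine-precision number anywhere makes the result approximate *)
FullSimplify[ArcTan[1.]]   (* 0.785398... *)

(* Chop replaces tiny numerical residues (below 10^-10 by default) with 0 *)
Chop[1.*^-16]              (* 0 *)
```

So the usual fix is to write exact inputs (`1`, `1/2`, `Pi`) instead of decimals (`1.`, `0.5`, `3.14159`); the two laptops most likely evaluated slightly different expressions.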

# Tag: Approximations

## Solve boundary value problem with NDSolve. How to print out approximations to a solution?

I am solving a particular boundary value problem for an ODE with NDSolve's "Shooting" method. The solution is attained very slowly, which seems to mean that the boundary value problem, although supposedly well defined, is in fact ill-conditioned somewhere. So I am trying to figure out where. The first step is to inspect the concrete values of the successive approximations to the solution that NDSolve produces. What language constructs should I use for that?

A simple example. Suppose we consider particle motion in a vertical plane. We throw a particle from the initial point with coordinates {x0, y0} = {0, 0} at an initial trajectory angle of 45 degrees, and try to reach the point with coordinates {x1, y1} = {1, 0} by varying the particle's initial velocity. Two things are unknown here: the initial velocity and the duration of the motion. Here is how that toy problem can be stated and solved in *Mathematica*:

```mathematica
gravity = 10;

bvpsol = NDSolve[
    {{
      (* ODE system (5th order) *)
      {x''[u]/c[u]^2 == 0, y''[u]/c[u]^2 == -gravity, c'[u] == 0},
      (* boundary conditions (5 items) *)
      {x[0] == y[0] == 0, x[1] == 1, y[1] == 0, x'[0] == y'[0]}
    }},
    {x[u], y[u], c[u]}, u,
    Method -> {"Shooting",
      "StartingInitialConditions" ->
        {x[0] == y[0] == 0, x'[0] == y'[0] == 1, c[0] == 1}}] // Flatten;

{dxdu, dydu} = D[{x[u], y[u]} /. bvpsol, u];
{vx0, vy0} = ({dxdu, dydu}/c[u] /. bvpsol) /. {u -> 0};
duration = c[u] /. bvpsol /. {u -> RandomReal[]};

ivpsol = NDSolve[
   {
    (* ODE system *)
    {x''[t] == 0, y''[t] == -gravity},
    (* initial values *)
    {x[0] == y[0] == 0, x'[0] == vx0, y'[0] == vy0}
   },
   {x[t], y[t]}, {t, 0, duration}];

ParametricPlot[{x[t], y[t]} /. ivpsol, {t, 0, duration},
 GridLines -> Automatic, AspectRatio -> 1/2]
```

**Question:** What options or language constructs should I use to see the approximations that NDSolve produces while solving the boundary value problem?
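One construct worth trying (a sketch on a simplified toy problem, not necessarily the only way): NDSolve accepts the `StepMonitor` and `EvaluationMonitor` options, and the intermediate values they observe can be collected with `Reap`/`Sow`:

```mathematica
(* Sketch: collect the intermediate values NDSolve visits.
   StepMonitor fires after each accepted integration step;
   EvaluationMonitor would fire at every right-hand-side evaluation. *)
{sol, steps} = Reap[
   NDSolve[{y''[t] == -10, y[0] == 0, y[1] == 0}, y, {t, 0, 1},
     Method -> {"Shooting",
       "StartingInitialConditions" -> {y[0] == 0, y'[0] == 1}},
     StepMonitor :> Sow[{t, y[t]}]]];
(* steps[[1]] now holds the {t, y[t]} pairs from every accepted step,
   across all shooting iterations *)
```

For the 5th-order system above, the same pattern applies with `Sow[{u, x[u], y[u], c[u]}]` in the monitor.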

## Why Does Adding the nth Derivative Increase a Function Approximation’s Accuracy?

I am currently taking Calculus 3 (sequences and series), and we have just started learning about Maclaurin and Taylor series. I understand the mechanics behind them: how these polynomials are built from the derivatives of a function.

However, I do not understand physically *why*, when we have a function approximation $ g(x) \approx f(x)$ , adding more and more derivatives of $ f(x)$ increases the accuracy of $ g(x)$ more and more.

If someone could point me to a resource and explain it in simple terms it would be much appreciated. Thank you!
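One standard way to make the question precise is Taylor's theorem with the Lagrange form of the remainder: the error left after including the degree-$n$ term is controlled entirely by the next derivative.

```latex
% Taylor's theorem with Lagrange remainder: if f is (n+1)-times
% differentiable near a, then for some xi between a and x,
f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x-a)^k
       \;+\; \underbrace{\frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x-a)^{n+1}}_{\text{error}}
% Each derivative added to the approximation pushes the error to a
% higher power of (x-a), so for x near a it shrinks like |x-a|^{n+1}.
```

Intuitively, matching $f(a)$ pins down the height, matching $f'(a)$ pins down the slope, matching $f''(a)$ the curvature, and so on; each matched derivative removes one more way the polynomial can drift away from $f$ near $a$.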

## Zeros of polynomial approximations of the Riemann $\zeta$ function

I know next to nothing about analytic number theory, or the theory of the Riemann $ \zeta$ function in particular, so the following might be too elementary to deserve more than derision; even so it seems it wouldn’t hurt to ask where the following question has been considered and what the outcome was.

$ \zeta : \mathbb{C} \rightarrow \mathbb{C}$ is a meromorphic function, and therefore, locally has a Laurent series expansion, and in any disc $ D(c,r)$ centered at $ c$ and of radius $ r$ in $ \mathbb{C}$ that avoids the poles, it has a Taylor series expansion, which can be truncated at $ n$ -th order to obtain a polynomial approximation, $ p_{\zeta, D, n}$ of degree $ n$ .

Let’s take such a disc in the critical strip. The question is: what are the zeros of $ p_{\zeta, D, n}$ and how do they behave as $ n \rightarrow \infty$ ? What happens to the asymptotic behavior as we move the disc around? Implicit in the question, of course, is curiosity about any light the zeros of $ p_{\zeta, D, n}$ might shed on the zeros of $ \zeta$ .

## Stochastic Fixpoint Approximations of Contractions

**Context/Introduction**

Consider a contraction $f\colon\mathbb{R}^S\to\mathbb{R}^S$ with fixed point $f(X^*)=X^*$, where evaluating the function at a given point is only possible with some stochastic error. Let $Y(X)$ denote this imprecise evaluation, with $$\mathbb{E}[Y(X)]=f(X).$$

Then our stochastic approximation attempt is, for $x\in S$: $$X_{n+1}(x)=(1-\alpha_n(x))X_n(x)+\alpha_n(x)Y_n(X_n)(x)$$

This can be rewritten as: $$\Delta_{n+1}(x):=X_{n+1}(x)-X^*(x)=(1-\alpha_n(x))\Delta_n(x)+\alpha_n(x)\underbrace{[Y_n(X_n)(x)-f(X^*)(x)]}_{=:F_n(x)}$$
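The rewrite is just bookkeeping: subtract $X^*(x)$ from the update, split it as $(1-\alpha_n(x))X^*(x)+\alpha_n(x)X^*(x)$, and use $f(X^*)=X^*$.

```latex
\begin{aligned}
\Delta_{n+1}(x) &= (1-\alpha_n(x))X_n(x)+\alpha_n(x)Y_n(X_n)(x) - X^*(x)\\
&= (1-\alpha_n(x))\bigl(X_n(x)-X^*(x)\bigr)
   + \alpha_n(x)\bigl(Y_n(X_n)(x)-X^*(x)\bigr)\\
&= (1-\alpha_n(x))\Delta_n(x)
   + \alpha_n(x)\bigl(Y_n(X_n)(x)-f(X^*)(x)\bigr)
\end{aligned}
% using X^*(x) = f(X^*)(x) in the last step
```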

Then $F_n$ has the following properties: $$\|\mathbb{E}[F_n\mid X_n]\|=\|f(X_n)-X^*\|\le\gamma\|X_n-X^*\|=\gamma\|\Delta_n\|$$ and $$\begin{align} \operatorname{Var}[F_n(x)\mid X_n] &\le \mathbb{E}[F_n^2(x)\mid X_n]\\ &\le \mathbb{E}\bigl[f(X_n)(x)-X^*(x)+Y_n(X_n)(x)-f(X_n)(x)\bigr]^2\\ &\le 2\bigl([f(X_n)(x)-X^*(x)]^2 +\mathbb{E}[Y_n(X_n)(x)-f(X_n)(x)]^2\bigr)\\ &\le 2\bigl(\gamma^2\|\Delta_n\|^2+\operatorname{Var}(Y_n(X_n))\bigr) \end{align}$$ And assume that the variance of $Y_n$ can be bounded by $C(1+\|\Delta_n\|)$.

This is the problem that *On the Convergence of Stochastic Iterative Dynamic Programming Algorithms* (Jaakkola et al., 1994) addresses with the following theorem:

And they show that this theorem can be applied to a range of reinforcement learning algorithms.

The main difficulty in proving this theorem is generalizing existing Stochastic Approximation results (e.g. Dvoretzky 1956) to higher dimensions and to error variances which depend on the approximation sequence.

If the sequence converged to 0, then the variance would of course be bounded, which in turn would yield convergence of the sequence. So they bootstrap convergence by arguing that if $X_{n+1}(x)=G(X_n,Y_n,x)$ with $$G(\beta X_n,Y_n,x)=\beta G(X_n,Y_n,x),$$ then scaling $X_n(\omega)$ to keep it within the bound $C$ is equivalent to initializing with $\beta X_0(\omega)$. And if $X_n(\omega)$ converges under the assumption that we rescale only when it wanders above the threshold $C$, then there is some $N\in\mathbb{N}$ such that it stays within an epsilon neighbourhood of 0 afterwards, and no more scaling is necessary. This means we only scale a finite number of times. But scaling by a finite amount does not affect convergence, so we need not have scaled at all. (This is their Lemma 2.)

While the recursion from the theorem is not of this form, they manage to reduce the problem to the following lemma, which is where I am stuck.

**Problem Explanation**

I am unable to show that this sequence converges, and I do not seem to have missed anything obvious, since I got no responses on Math SE. But from my attempts I have built an intuition for why the statement is probably true, even though I cannot prove it, and that is what I present now.

**My Attempts**

What I have so far (I'll leave out "w.p. 1" everywhere to avoid clutter, since it is pretty much just analysis arguments for the cases where an appropriate $\omega$ is selected):

- Since $ X_n$ is bounded we know that there are convergent subsequences.
- $ |X_{n+1}-X_{n}|\le (\alpha_n+\gamma \beta_n)C_1$ which implies that the distance between neighbours converges to zero, but it isn’t enough for Cauchy (not even with convergent subsequences).
- $\liminf_n X_n\ge 0$ since $$\begin{align} X_{n+1}(x)&=(1-\alpha_n(x))X_n(x) + \gamma\beta_n(x)\|X_n\|\\ &\ge (1-\alpha_n(x))X_n(x)\\ &\ge\prod^n_{k=0}(1-\alpha_k(x))X_0 \to 0 \end{align}$$ because $\sum \alpha_n=\infty$ (cf. infinite products).
- Because of $\alpha_n(x) \to 0$ we know that $(1-\alpha_n(x))\ge 0$ for almost all $n$, and if $X_n(x)\ge 0$ then $$X_{n+1}(x) = \underbrace{(1-\alpha_n(x))}_{\ge 0} \underbrace{X_n(x)}_{\ge 0} +\underbrace{\gamma\beta_n(x)\|X_n\|}_{\ge 0},$$ thus if one element of the sequence is positive, all following elements are positive too. The sequences which stay negative converge to zero ($\liminf X_n\ge 0$). The other sequences are positive for almost all $n$.
- For $\|X_n\|$ not to converge, $\|X_n\|=\max_x X_n(x)$ must hold for infinitely many $n$. If it were instead equal to the maximum over the negative sequences for almost all $n$, it would converge: $$\|X_n\|=\max_x -X_n(x) \le \max_x -\prod_{k=0}^n (1-\alpha_k) X_0 \to 0$$

If we set $\beta_n=0$ we have $$X_m=\prod_{k=n}^{m-1} (1-\alpha_k)X_n \to 0.$$ So my intuition is: since $\beta_n$ is smaller than $\alpha_n$ (on average), replacing $\beta_n$ with $\alpha_n$ should probably be fine, since it only introduces a larger deviation from zero. So I think the direction $$X_{n+1}\sim (1-\alpha_n)X_n +\gamma \alpha_n X_n = (1-(1-\gamma)\alpha_n)X_n$$ should work, since $\sum(1-\gamma)\alpha_n =\infty$ for $\gamma\in(0,1)$.
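The fact used repeatedly here, that $\prod_k(1-\alpha_k)\to 0$ whenever $\sum_k\alpha_k=\infty$, follows from the elementary bound $1-a\le e^{-a}$:

```latex
% For \alpha_k \in [0,1), using 1-a \le e^{-a}:
0 \;\le\; \prod_{k=0}^{n}(1-\alpha_k)
  \;\le\; \prod_{k=0}^{n} e^{-\alpha_k}
  \;=\; \exp\!\Big(-\sum_{k=0}^{n}\alpha_k\Big)
  \;\xrightarrow[n\to\infty]{}\; 0
\quad\text{whenever}\quad \sum_{k}\alpha_k=\infty.
```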

**Case $|S|=1$:** If $X_n\le 0$ for all $n$, then by 3., $X_n\to 0$.

Otherwise, by 4., there exists an $N\in\mathbb{N}$ such that $$X_n\ge 0 \quad \forall n\ge N.$$

Therefore $$\begin{align} X_{n+1}&=(1-\alpha_n)X_n+\gamma \beta_n X_n\\ &=(1-\mathbb{E}[\alpha_n\mid P_n] +\gamma \mathbb{E}[\beta_n\mid P_n])X_n + (\gamma\beta_n-\gamma\mathbb{E}[\beta_n\mid P_n] -\alpha_n +\mathbb{E}[\alpha_n\mid P_n])X_n\\ &\le (1-(1-\gamma)\mathbb{E}[\alpha_n\mid P_n])X_n + r_n X_n \end{align}$$

with $r_n=\gamma\beta_n-\gamma\mathbb{E}[\beta_n\mid P_n] -\alpha_n +\mathbb{E}[\alpha_n\mid P_n]$. Then $\mathbb{E}[r_n]=0$. And given that $\sum \beta_n^2 < \infty$ and $\sum \alpha_n^2 < \infty$, and since $X_n$ is bounded and therefore the variance of $X_n$ is bounded, I think that $\sum\mathbb{E}[r_nX_n]^2<\infty$, but I couldn't show that yet.

But I still don’t really know how to generalize this to the higher-dimensional case, even if I could prove it.

Since this attempt would also show $X_n\to 0$ directly, I don’t think it was the intended approach. But I cannot figure out what they intended me to do either.

I hope this question is fine for MathOverflow: I have been stuck for weeks now, barely having any new ideas, and on Math SE no one seems able to help either. I would appreciate any suggestions.