Sample complexity of mean estimation using empirical estimator and median-of-means estimator?

Given a random variable $X$ with unknown mean $\mu$ and variance $\sigma^2$, we want to produce an estimate $\hat{\mu}$ based on $n$ i.i.d. samples from $X$ such that $|\hat{\mu} - \mu| \leq \epsilon\sigma$ with probability at least $1-\delta$.

Empirical estimator: why are $O(\epsilon^{-2}\cdot\delta^{-1})$ samples sufficient? Why are $\Omega(\epsilon^{-2}\cdot\delta^{-1})$ samples necessary?

Median-of-means estimator: why are $O(\epsilon^{-2}\cdot\log\frac{1}{\delta})$ samples sufficient?
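
For reference, a minimal sketch of the two estimators in question, assuming numpy; the number of groups k for median-of-means (typically of order $\log\frac{1}{\delta}$) is left to the caller.

import numpy as np

def empirical_mean(samples):
    # Plain empirical estimator: the average of all samples.
    return np.mean(samples)

def median_of_means(samples, k):
    # Split the samples into k groups, average each group, and return the
    # median of the group means; k is typically chosen of order log(1/delta).
    groups = np.array_split(samples, k)
    return np.median([np.mean(g) for g in groups])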

Big O notation – estimation of run time [migrated]

I am running some very computationally intensive tasks and wish to adjust the parameters based on how long they take.

The program I am running is PLINK – for those who don’t know, it is used for genotype data.

Its run time is said to be O(n*m^2).

I have the run times for two runs with different values of m and a constant n: 3 hours and 648 hours.

From this I wish to estimate the run time for other values of m in a way that respects the O(n*m^2) relationship.

Can anybody provide some insight into methods for estimating the run time with n held constant, and also into how to run tests with different parameters in order to find a good trade-off between run time and accuracy of results?
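
A minimal sketch of the kind of extrapolation I have in mind, assuming the run time follows T(m) ≈ c·m² for fixed n (the constant n is absorbed into c); the m values below are hypothetical placeholders, not my actual PLINK parameters.

# Minimal sketch, assuming T(m) ~ c * m**2 for fixed n (n is absorbed into c).
# The m values are hypothetical placeholders, not the actual PLINK parameters.
m_known, hours_known = 1000, 3.0      # one measured run (m is hypothetical)
c = hours_known / m_known**2          # solve for the constant

def predicted_hours(m):
    # Extrapolate the run time for a new m under the quadratic model.
    return c * m**2

# Sanity check: the second measured run (e.g. the 648-hour one) should land
# close to predicted_hours(its m) if the O(n*m^2) model really holds.
print(predicted_hours(2000))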

[GET][NULLED] – WP Cost Estimation & Payment Forms Builder v9.681


Estimation of the number of solutions by Counting

This is a question from a quantum computation textbook.

Consider a classical algorithm for counting the number of solutions to a problem. The algorithm samples uniformly and independently $k$ times from the search space of size $N$, using an oracle that outputs 1 or 0, and lets $X_1, X_2, X_3, \ldots, X_k$ be the results of the oracle calls, so $X_j = 1$ if the $j$th oracle call found a solution and $X_j = 0$ otherwise. The algorithm's estimate $S$ of the number of solutions is:

$$S = N \sum_{j}\frac{X_j}{k}$$

Assume the number of solutions is $M$, which is not known in advance. The standard deviation of $S$ is stated and found to be:

$$\Delta S = \sqrt{\frac{M(N-M)}{k}}$$

The question is:
Prove that to obtain a probability of at least $\frac{3}{4}$ of estimating $M$ correctly to within an accuracy of $\sqrt{M}$ for all values of $M$, we must have $k = \Omega(N)$.

I know how to get the second equation from the first: move $N$ and $k$ to the left, so that $kS/N$ can be treated as a binomial distribution $B(k, \frac{M}{N})$; finding the variance of that binomial distribution and some algebraic manipulation then lead to the second equation. I’m clueless about proving $k = \Omega(N)$. The only thing I have tried writing is:

$$P\Big(\sqrt{\frac{M(N-M)}{k}} \leq \sqrt{M}\Big) \geq \frac{3}{4}$$
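
For reference, the binomial-variance step I described above works out as

$$\operatorname{Var}\!\left(\frac{kS}{N}\right) = k\,\frac{M}{N}\left(1-\frac{M}{N}\right) \;\Longrightarrow\; \operatorname{Var}(S) = \frac{N^2}{k^2}\cdot k\,\frac{M}{N}\cdot\frac{N-M}{N} = \frac{M(N-M)}{k},$$

which recovers $\Delta S = \sqrt{M(N-M)/k}$.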

Can someone help me with this?

[GET][NULLED] – WP Cost Estimation & Payment Forms Builder v9.677


Using O-notation for asymptotic estimation of the number of additions in a recursive function

The number of additions that are executed during the calculation of mystery(n) is a(n). How can I find an asymptotic estimate of a(n) with the help of O-notation and the master theorem? Note: the question does not ask for the value of mystery(n), but rather for the number of additions!

def mystery(n):
    if n == 0:
        return n * n
    return 2 * mystery(n/3) + 4 * n
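
For concreteness, under my assumptions that only the explicit + in the recursive branch counts as an addition and that n/3 behaves as integer division, the recurrence for the number of additions would be

$$a(n) = a\!\left(\lfloor n/3 \rfloor\right) + 1 \quad (n > 0), \qquad a(0) = 0,$$

which already has the master-theorem shape: one recursive call, input shrunk by a factor of 3, and a constant amount of extra work per call.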

Monte Carlo error estimation routine

I would value your opinion on the following piece of code. I am rather new to both Python and Monte Carlo analysis, so I was wondering whether the routine makes sense to more experienced and knowledgeable users.

import numpy as np
from scipy import optimize

def MC_analysis_a():
    # spin_lock_durations, a_norm1..a_norm8 and e_signal_a are defined elsewhere
    x = spin_lock_durations
    y_signal_a = (a_norm1, a_norm2, a_norm3, a_norm4, a_norm5, a_norm6, a_norm7, a_norm8)
    x = np.array(x, dtype=float)
    y_signal_a = np.array(y_signal_a, dtype=float)

    # Mono-exponential decay model to be fitted
    def func(x, a, b):
        return a * np.exp(-b * x)

    initial_guess = [1.0, 1.0]
    fitting_parameters, covariance_matrix = optimize.curve_fit(func, x, y_signal_a, initial_guess)
    print(round(fitting_parameters[1], 2))

    # ---> PRODUCING PARAMETERS ESTIMATES

    total_iterations = 5000
    MC_pars = np.array([])

    for iTrial in range(total_iterations):
        xTrial = x
        yTrial = y_signal_a + np.random.normal(loc=y_signal_a, scale=e_signal_a, size=np.size(y_signal_a))
        try:
            iteration_identifiers, covariance_matrix = optimize.curve_fit(func, xTrial, yTrial, initial_guess)
        except:
            dumdum = 1
            continue

        # ---> STACKING RESULTS

        if np.size(MC_pars) < 1:
            MC_pars = np.copy(iteration_identifiers)
        else:
            MC_pars = np.vstack((MC_pars, iteration_identifiers))

    # ---> SLICING THE ARRAY

    print(np.shape(MC_pars))
    # print(np.median(aFitpyars[:,1]))
    print(np.std(MC_pars[:,1]))

The output I get is apparently satisfactory and plausible.

Many thanks in advance to any contributor!