I asked this question on math.se, but didn’t receive an answer

Suppose we are given a vector $ v$ and vectors $ \mu_i$ :

$ v = \mu_1+\mu_2+…+\mu_m$ , where $ \mu_i \in R^n$ , all $ \mu_i$ are of unit length.

Oracle will give me $ k$ vectors $ \mu_{j_1}, \mu_{j_2},…\mu_{j_k}$ from the original set such that when I project $ v$ onto subspace spanned by these vectors the length of the projection is highest possible. In other words, from the set of all combinations of $ k$ vectors from $ [\mu_1,…\mu_n]$ the $ [\mu_{j_1}, \mu_{j_2},…\mu_{j_k}]$ give highest length of projection. Lets denote by $ v_{\text{proj}}$ projection of $ v$ onto $ [\mu_{j_1}, \mu_{j_2},…\mu_{j_k}]$

I want to estimate quality of projection before oracle gives me this $ k$ vectors. **I want to give upper bound on** $ ||v – v_{\text{proj}}|| $

As far as I understood it is very difficult to obtain these $ k$ vectors by myself. However, I know that for any two vectors $ \mu_i, \mu_j$ , $ ||\mu_i-\mu_j|| \leq \alpha$ , where $ \alpha$ is a given positive number.

Small values of $ \alpha$ will tell me that all $ \mu_i$ are close to each other and heading towards same direction. I would suspect then that projection will be good, and its length will be close to the length of original vector. How can I use this to give an upper bound $ ||v – v_{\text{proj}}|| $ ?

**My attempts**:

Without loss of generality lets assume that $ k$ optimal vectors are first $ k$ vectors in the list, i.e $ \mu_1,\mu_2,…\mu_k$ . Lets denote by $ P$ projection operator on the space spanned by $ \mu_1,\mu_2,…\mu_k$ .

$ \|v – v_{\text{proj}}\| = \|v – P(v)\| = \|v – P(\mu_1+\mu_2+…+\mu_m)\| = $

$ \|v – P(\mu_1) – P(\mu_2) – … – P(\mu_m)\| = $

$ \| v – \mu_1 – \mu_2 – … – \mu_k – P(\mu_{k+1}) – P(\mu_{k+2}) – … – P(\mu_m)\| = $

$ \|\mu_{k+1} – P(\mu_{k+1}) + \mu_{k+2} – P(\mu_{k+2}) + … + \mu_{m} – P(\mu_{m})\|$

$ \|v – v_{\text{proj}}\| \leq \|\mu_{k+1} – P(\mu_{k+1})\| + \|\mu_{k+2} – P(\mu_{k+2}) + … + \|\mu_{m} – P(\mu_{m})\|$

$ \|v – v_{\text{proj}}\| \leq (m-k)\alpha$

So in order to make $ \|v – v_{\text{proj}}\| \leq \epsilon$ , we need $ k \geq \frac{m\alpha – \epsilon}{\alpha}$

**I am not satisfied with this result because $ k$ grows linearly with $ m$ **. I want it to grow much slower, something like $ \log(m)$ . My goal is to show that under some constraints on $ \mu_i$ , we need only approximately $ \log(m)$ vectors to approximate $ v$ .

I think the bound can be improved substantially. First Cauchy inequality isn’t very tight and second, I used $ |\mu_{k+1} – P(\mu_{k+1})\| \leq \alpha$ which is also very loose.

**I am open for additional constraints on $ \mu_1,…\mu_m$ to achieve logarithmic growth**

Alex Ravsky on math.se has noted, that we also need a constraint on $ \alpha$ in order to achieve logarithmic growth. Assume that $ k$ $ \leq n$ , $ \mu_i$ is th $ i$ -th standard ort of the space $ \mathbb{R}^n$ , and $ \alpha = \sqrt{2}$ . Then $ \|v – v_{\text{proj}}\| = \sqrt{m-k}$