I learnt that in a order-statistic tree (augmented Red-Black Tree, in which each node $ x$ contains an extra field denoting the number of nodes in the sub-tree rooted at $ x$ ) finding the $ i$ th order statistics can be done in $ O(lg(n))$ time in the worst case. Now in case of an array representing the dynamic set of elements finding the $ i$ th order statistic can be achieved in the $ O(n)$ time in the worst case.[ where $ n$ is the number of elements].

Now I felt like finding a tight upper bound for forming an $ n$ element Red-Black Tree so that I could comment about which alternative is better : "maintain the set elements in an array and perform query in $ O(n)$ time" or "maintaining the elements in a Red-Black Tree (formation of which takes $ O(f(n))$ time say) and then perform query in $ O(lg(n))$ time".

So a very rough analysis is as follows, inserting an element into an $ n$ element Red-Black Tree takes $ O(lg(n))$ time and there are $ n$ elements to insert , so it takes $ O(nlg(n))$ time. Now this analysis is quite loose as when there are only few elements in the Red-Black tree the height is quite less and so is the time to insert in the tree.

I tried to attempt a detailed analysis as follows (but failed however):

Let while trying to insert the $ j=i+1$ th element the height of the tree is atmost $ 2.lg(i+1)+1$ . For an appropriate $ c$ , the total running time,

$ $ T(n)\leq \sum_{j=1}^{n}c.(2.lg(i+1)+1)$ $

$ $ =c.\sum_{i=0}^{n-1}(2.lg(i+1)+1)$ $

$ $ =c.\left[\sum_{i=0}^{n-1}2.lg(i+1)+\sum_{i=0}^{n-1}1\right]$ $

$ $ =2c\sum_{i=0}^{n-1}lg(i+1)+cn\tag1$ $

Now

$ $ \sum_{i=0}^{n-1}lg(i+1)=lg(1)+lg(2)+lg(3)+…+lg(n)=lg(1.2.3….n)\tag2$ $

Now $ $ \prod_{k=1}^{n}k\leq n^n, \text{which is a very loose upper bound}\tag 3$ $

Using $ (3)$ in $ (2)$ and substituting the result in $ (1)$ we have $ T(n)=O(nlg(n))$ which is the same as the rough analysis…

Can I do anything better than $ (3)$ ?

All the nodes referred to are the internal nodes in the Red-Black Tree.