## Average Case Analysis of Insertion Sort as dealt in Kenneth Rosen’s “Discrete Mathemathematics and its Application”

I was going through “Discrete Mathematics and its Application” by Kenneth Rosen where I came across the following algorithm of the Insertion Sort and also its analysis. The algorithm is quite different from the one dealt with in the CLRS so I have shared the entire algorithm below. Note that they have considered a machine where only comparisons are considered are significant and hence have proceeded according. The problem which I face is in the analysis portion given here in bold. Moreover the specific doubts which I have , have been pointed out by me at the very end of this question.

ALGORITHM The Insertion Sort.

procedure insertion sort($$a_1,a_2,…,a_n$$: real numbers with $$n \geqslant 2$$ )

for j:= 2 to n begin     i:=1     while aj > ai         i:=i+1     m := aj     for k:= 0 to j-i-1         aj-k := aj-k-1      ai:=m end {a1,a2,...,an is sorted}  

THE INSERTION SORT: The insertion sort is a simple sorting algorithm, but it is usually not the most efficient. To sort a list with $$n$$ elements, the insertion sort begins with the second element. The insertion sort compares this second element with the first element and inserts it before the first element if it does not exceed the first element and after the first element if it exceeds the first element. At this point, the first two elements are in the correct order. The third element is then compared with the first element, and if it is larger than the first element, it is compared with the second element; it is inserted into the correct position among the first three elements.

In general, in the $$y$$ th step of the insertion sort, the $$y$$ th element of the list is inserted into the correct position in the list of the previously sorted $$j — 1$$ elements. To insert the $$y$$ th element in the list, a linear search technique is used; the $$y$$ th element is successively compared with the already sorted $$j — 1$$ elements at the start of the list until the first element that is not less than this element is found or until it has been compared with all $$j — 1$$ elements; the $$y$$ th element is inserted in the correct position so that the first $$j$$ elements are sorted. The algorithm continues until the last element is placed in the correct position relative to the already sorted list of the first $$n — 1$$ elements. The insertion sort is described in pseudocode in Algorithm above.

Average-Case Complexity of the Insertion Sort: What is the average number of comparisons used by the insertion sort to sort $$n$$ distinct elements?

Solution: We first suppose that $$X$$ is the random variable equal to the number of comparisons used by the insertion sort to sort a list $$a_1 ,a_2 ,…,a_n$$ of $$n$$ distinct elements. Then $$E(X)$$ is the average number of comparisons used. (Recall that at step $$i$$ for $$i = 2,…,n$$ , the insertion sort inserts the $$i$$ th element in the original list into the correct position in the sorted list of the first $$i − 1$$ elements of the original list.)

We let $$X_i$$ be the random variable equal to the number of comparisons used to insert $$a_i$$ into the proper position after the first $$i − 1$$ elements $$a_1 ,a_2,…,a_{i−1}$$ have been sorted. Because

$$X=X_2+X_3+···+X_n$$,

we can use the linearity of expectations to conclude that

$$E(X) = E(X_2 + X_3 +···+X_n) = E(X_2) + E(X_3) +···+E(X_n).$$

To find $$E(X_i )$$ for $$i = 2, 3,…,n$$ , let $$p_j (k)$$ denote the probability that the largest of the first $$j$$ elements in the list occurs at the $$k$$ th position, that is, that $$max(a_1 ,a_2 ,…,a_j ) = a_k$$ , where $$1 ≤ k ≤ j$$ . Because the elements of the list are randomly distributed, it is equally likely for the largest element among the first $$j$$ elements to occur at any position. Consequently, $$p_j (k) = \frac{1}{j}$$ .If $$X_i (k)$$ equals the number of comparisons used by the insertion sort if $$a_i$$ is inserted into the $$k$$ th position in the list once $$a_1,a_2 ,…,a_{i−1}$$ have been sorted, it follows that $$X_i (k) = k$$ . Because it is possible that $$a_i$$ is inserted in any of the first $$i$$ positions, we find that

$$E(X_i)$$ = $$\sum_{k=1}^{i} p_i(k).X_i(k) = \sum_{k=1}^{i} \frac{1}{i}.k = \frac {1}{i}\sum_{k=1}^{i} k = \frac{1}{i}.\frac{i(i+1)}{2} = \frac{i+1}{2}$$

It follows that

$$E(X)$$ = $$\sum_{i=2}^{n} E(X_i) = \sum_{i=2}^{n} \frac{i+1}{2} =\frac{n^{2} + 3n -4}{4}$$

My doubt

Now here while we are considering the calculation of $$E(X_i)$$ we are first considering the probability of the maximum element between $$a_1,a_2,…,a_i$$ being at position $$k$$. Then they are saying that the number of comparisons when $$a_i$$ is placed into the $$k$$ th position in the list $$a_1,a_2,…,a_{i-1}$$ (which is already sorted) is $$k$$. Why are they considering the insertion of $$a_i$$ into the position of the maximum of the elements $$a_1,a_2,…,a_i$$. $$a_i$$ as per the algorithm should be placed at the first position (while scanning the array from left) when we find an element which is $$\geqslant a_i$$ and not the maximum element of the sublist $$a_1,a_2,…,a_i$$.

Moveover they say that the max element of the sublist $$a_1,a_2,…,a_i$$ is any arbitrary position $$k$$ th and the probability of it being $$\frac{1}{i}$$. But if we see that $$a_1,a_2,…,a_{i-1}$$ is sorted then the max of $$a_1,a_2,…,a_i$$ is either $$a_{i-1}$$ or $$a_i$$.