Proof for time complexity of Insertion (k-proximate) Sort equals O(nk)

The following is the definition for Proximate Sorting given in my paper:

An array of distinct integers is k-proximate if every integer of the array is at most k places away from its place in the array after being sorted, i.e., if the ith integer of the unsorted input array is the jth largest integer contained in the array, then |i − j| ≤ k. In this problem, we will show how to sort a k-proximate array faster than Θ(n log n).

The following is the proof for time complexity of k-proximate insertion sort:

To prove O(nk), we show that each of the n insertion sort rounds swap an item left by at most O(k).

In the original ordering, entries that are ≥ 2k slots apart must already be ordered correctly: indeed, if A[s] > A[t] but t − s ≥ 2k, there is no way to reverse the order of these two items while moving each at most k slots. This means that for each entry A[i] in the original order, fewer than 2k of the items A[0], . . . , A[i−1] are less than A[i]. Thus, on round i of insertion sort when A[i] is swapped into place, fewer than 2k swaps are required, so round i requires O(k) time.

My problem with the proof is this line : "This means that for each entry A[i] in the original order, fewer than 2k of the items A[0], . . . , A[i−1] are less than A[i]". The preceding statements just proved that t-s < 2k, otherwise the elements are sorted (as elements with distance greater than 2k cannot be swapped.) Isn’t the correct statement : "This means that for each entry A[i] in the original order, fewer than 2k of the items A[0], . . . , A[i−1] are greater than A[i]"?

Average Case Analysis of Insertion Sort as dealt in Kenneth Rosen’s “Discrete Mathemathematics and its Application”

I was going through “Discrete Mathematics and its Application” by Kenneth Rosen where I came across the following algorithm of the Insertion Sort and also its analysis. The algorithm is quite different from the one dealt with in the CLRS so I have shared the entire algorithm below. Note that they have considered a machine where only comparisons are considered are significant and hence have proceeded according. The problem which I face is in the analysis portion given here in bold. Moreover the specific doubts which I have , have been pointed out by me at the very end of this question.

ALGORITHM The Insertion Sort.


procedure insertion sort($ a_1,a_2,…,a_n$ : real numbers with $ n \geqslant 2 $ )

for j:= 2 to n begin     i:=1     while aj > ai         i:=i+1     m := aj     for k:= 0 to j-i-1         aj-k := aj-k-1      ai:=m end {a1,a2,...,an is sorted}  

THE INSERTION SORT: The insertion sort is a simple sorting algorithm, but it is usually not the most efficient. To sort a list with $ n$ elements, the insertion sort begins with the second element. The insertion sort compares this second element with the first element and inserts it before the first element if it does not exceed the first element and after the first element if it exceeds the first element. At this point, the first two elements are in the correct order. The third element is then compared with the first element, and if it is larger than the first element, it is compared with the second element; it is inserted into the correct position among the first three elements.

In general, in the $ y$ th step of the insertion sort, the $ y$ th element of the list is inserted into the correct position in the list of the previously sorted $ j — 1$ elements. To insert the $ y$ th element in the list, a linear search technique is used; the $ y$ th element is successively compared with the already sorted $ j — 1$ elements at the start of the list until the first element that is not less than this element is found or until it has been compared with all $ j — 1$ elements; the $ y$ th element is inserted in the correct position so that the first $ j$ elements are sorted. The algorithm continues until the last element is placed in the correct position relative to the already sorted list of the first $ n — 1$ elements. The insertion sort is described in pseudocode in Algorithm above.

Average-Case Complexity of the Insertion Sort: What is the average number of comparisons used by the insertion sort to sort $ n$ distinct elements?

Solution: We first suppose that $ X$ is the random variable equal to the number of comparisons used by the insertion sort to sort a list $ a_1 ,a_2 ,…,a_n$ of $ n$ distinct elements. Then $ E(X)$ is the average number of comparisons used. (Recall that at step $ i$ for $ i = 2,…,n$ , the insertion sort inserts the $ i$ th element in the original list into the correct position in the sorted list of the first $ i − 1$ elements of the original list.)

We let $ X_i$ be the random variable equal to the number of comparisons used to insert $ a_i$ into the proper position after the first $ i − 1$ elements $ a_1 ,a_2,…,a_{i−1}$ have been sorted. Because

$ X=X_2+X_3+···+X_n$ ,

we can use the linearity of expectations to conclude that

$ E(X) = E(X_2 + X_3 +···+X_n) = E(X_2) + E(X_3) +···+E(X_n).$

To find $ E(X_i )$ for $ i = 2, 3,…,n$ , let $ p_j (k)$ denote the probability that the largest of the first $ j$ elements in the list occurs at the $ k$ th position, that is, that $ max(a_1 ,a_2 ,…,a_j ) = a_k$ , where $ 1 ≤ k ≤ j$ . Because the elements of the list are randomly distributed, it is equally likely for the largest element among the first $ j$ elements to occur at any position. Consequently, $ p_j (k) = \frac{1}{j}$ .If $ X_i (k)$ equals the number of comparisons used by the insertion sort if $ a_i$ is inserted into the $ k$ th position in the list once $ a_1,a_2 ,…,a_{i−1}$ have been sorted, it follows that $ X_i (k) = k$ . Because it is possible that $ a_i$ is inserted in any of the first $ i$ positions, we find that

$ E(X_i)$ = $ $ \sum_{k=1}^{i} p_i(k).X_i(k) = \sum_{k=1}^{i} \frac{1}{i}.k = \frac {1}{i}\sum_{k=1}^{i} k = \frac{1}{i}.\frac{i(i+1)}{2} = \frac{i+1}{2}$ $

It follows that

$ E(X)$ = $ $ \sum_{i=2}^{n} E(X_i) = \sum_{i=2}^{n} \frac{i+1}{2} =\frac{n^{2} + 3n -4}{4}$ $

My doubt


Now here while we are considering the calculation of $ E(X_i)$ we are first considering the probability of the maximum element between $ a_1,a_2,…,a_i$ being at position $ k$ . Then they are saying that the number of comparisons when $ a_i$ is placed into the $ k$ th position in the list $ a_1,a_2,…,a_{i-1}$ (which is already sorted) is $ k$ . Why are they considering the insertion of $ a_i$ into the position of the maximum of the elements $ a_1,a_2,…,a_i$ . $ a_i$ as per the algorithm should be placed at the first position (while scanning the array from left) when we find an element which is $ \geqslant a_i$ and not the maximum element of the sublist $ a_1,a_2,…,a_i$ .

Moveover they say that the max element of the sublist $ a_1,a_2,…,a_i$ is any arbitrary position $ k$ th and the probability of it being $ \frac{1}{i}$ . But if we see that $ a_1,a_2,…,a_{i-1}$ is sorted then the max of $ a_1,a_2,…,a_i$ is either $ a_{i-1}$ or $ a_i$ .

How to implement GIF insertion in Unity? “XYZ App doesn’t support image insertion here”

I’m making an Android App with Unity, which has a chat service in it.

And I would like to handle sending images or GIFs via Gboard.

Currently when the user selects a TextField, his keyboard comes up, which is let’s say GBoard.

And now when I click on a GIF or a Sticker, it says

XYZ App doesn’t support image insertion here

So how could I implement it?

Or more precisely, how can I define what method will be be called in my application when the user tries to send a GIF?

Because my first thought to implement it, is like it’s done in all other apps:

  • every GIF would be in a separate message, and when the user clicks on a GIF it instantly sends a special GIF message, which will be rendered for example as an animated texture.

But how could I subscribe to this event in the first place?

Thanks in advance!

How to speed up an insertion from a huge table with postgres?

I have 5 tables in my database with respectively a size of 70Gb, 500Mb, 400 Mb, 110Mb and 20 Mb.

I want to create a new table that contains all columns of all tables, so I tried 2 queries, the first one is :

select into new_table as select .. from t1 join t2 join t3 join t4 ... 

and the second one is :

insert into new_table select .. from t1 join t2 join t3 join t4  ... 

Before executing these two queries on my big data tables, I tried both on a total 1G database, the first on took only 7s and the second one approximately 10 mn.

Now, executing the first one on my huge database, made my disk full even though I had 250Gb free space before running the query, and without finishing the query so I got the follow error :

ERROR:  could not write to temporary file: No space left on device 

The second one, is taking a lot of time and consuming my free disk space slowly and, as the first one, not returning the result.

What are the difference between these two queries ? Is there a way to make the insert into non transactional so as I can follow my insert steps. And I guess Postgres uses logs (journalization) so is there a way to deactivate that in order to speed up the insertion ? or I should follow another method in order to get a desired result without filling up all disk

Combining merge sort with insertion sort – Time complexity

I am learning algorithms from the CLRS book on my own, without any help. It has an exercise which combines merge sort {O(n log n)} with insertion sort {O($ n^{2} $ )}. It says that when the sub-arrays in the merge-sorting reach a certain size “k”, then it is better to use insertion sort for those sub-arrays instead of merge sort. The reason given is that the constant factors in insertion sort make it fast for small n. Can someone please explain this ?

It asks us to show that (n/k) sublists, each of length k, can be sorted by insertion sort in O(nk) worst-case time. I found from somewhere that the solution for this is O($ nk^{2}/n $ ) = O(nk). How do we get this part O($ nk^{2}/n $ ) ?

Thanks !

Time complexity of insertion in linked list

Apologies if this question feels like a solution verification, but this question was asked in my graduate admission test and there’s a lot riding on this:

What is the worst case time complexity of inserting $ n$ elements into an empty linked list, if the linked list needs to be maintained in sorted order?

In my opinion, the answer should be $ O(n^2)$ because in every insertion, we will have to insert the element in the right place and it is possible that every element has to be inserted at the last place, giving me a time complexity of $ 1 + 2 + … (n-1) + n = O(n^2)$

However, the solution that I have says that we can first sort the elements in $ O(n \log n)$ and then, we can insert them one by one in $ O(n)$ , giving us an overall complexity of $ O(n \log n)$ .

From the given wording of the question, which solution is more apt? In my opinion, since the question mentions “linked list needs to be maintained in sorted order”, I am inclined to say that we cannot sort the elements beforehand and then insert them in the sorted order.

Independence of order of insertion hashtable with open addressing

I’m taking a data-structure class, and the lecturer made the following assertion:

the number of attempts needed to insert n keys in a hash table with linear probing is independent of their order.

No proof was given, so I tried to get one myself. However, I’m stuck.

My approach at the moment: I try to show that if I swap two adjacent keys the number of attempts doesn’t change. I get the idea behind it, and I think it’s going in the right direction, but I can’t manage to make it into a rigorous proof.

Aside, does this fact also hold for other probing techniques such as quadratic or double hashing?

Why multiple rotations might be needed after deletion in an AVL tree if after insertion there can be at most one needed?

I understand that after deletion you have to retrace to update ancestors and after insertion you do the similar however at most one rotation will be performed.

The question is why is there the difference in number of potential rotations for these two operations.

To me it seems like these operations should be the reverse of each other but this difference in their properties suggests they’re not. Why?