Trivial clarification with the analysis of Dijkstra's algorithm as dealt with in Kenneth Rosen's "Discrete Mathematics and Its Applications"

I was going through the text "Discrete Mathematics and Its Applications" by Kenneth Rosen, where I came across the analysis of Dijkstra's algorithm and felt that some of the values in the analysis are not quite appropriate. My main motive is not the analysis of Dijkstra's algorithm in general (a better and clearer version exists in the CLRS text); rather, it is to analyze the algorithm accurately as far as the mathematics is concerned, treating the algorithm below simply as an unknown algorithm whose analysis needs to be done. I just want to check my progress by confirming whether the thing I pointed out as being weird actually is weird or not.

Let's move on to the question. Below is the algorithm as given in the text.

ALGORITHM: Dijkstra’s Algorithm.

    procedure Dijkstra(G: weighted connected simple graph, with all weights positive)
    {G has vertices a = v[1], ..., v[n] = z and weights w(v[i], v[j]),
     where w(v[i], v[j]) = ∞ if {v[i], v[j]} is not an edge in G}
    for i := 1 to n
        L(v[i]) := ∞
    L(a) := 0
    S := ∅
    {the labels are now initialized so that the label of a is 0 and all
     other labels are ∞, and S is the empty set}
    while z ∉ S
        u := a vertex not in S with L(u) minimal
        S := S ∪ {u}
        for all vertices v not in S
            if L(u) + w(u, v) < L(v) then
                L(v) := L(u) + w(u, v)
        {this adds a vertex to S with minimal label and updates the
         labels of vertices not in S}
    return L(z)  {L(z) = length of a shortest path from a to z}

The following is the analysis which they used:

We can now estimate the computational complexity of Dijkstra's algorithm (in terms of additions and comparisons). The algorithm uses no more than $n-1$ iterations, where $n$ is the number of vertices in the graph, because one vertex is added to the distinguished set at each iteration. We are done if we can estimate the number of operations used for each iteration. We can identify the vertex not in $S$ with the smallest label in the $k$th iteration using no more than $n-1$ comparisons. Then we use an addition and a comparison to update the label of each vertex not in $S$ in the $k$th iteration. It follows that no more than $2(n-1)$ operations are used at each iteration, because there are no more than $n-1$ labels to update at each iteration.

"The algorithm uses no more than $n-1$ iterations, where $n$ is the number of vertices in the graph, because one vertex is added to the distinguished set at each iteration." What I feel is that it should be $n$ iterations and not $n-1$, since in the very first iteration the vertex $a$ is included in the set $S$, and the process continues until $z$ is inserted into the set $S$; $z$ may be the last vertex in the ordering, i.e. $v_n$.

The rest of the statements are fine, I hope.
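To see why $n$ iterations can actually occur, here is a minimal Python sketch of the pseudocode above (the function name and the example graph are my own illustration, not from the text). On a path graph $a = v_1 - v_2 - \cdots - v_n = z$, the while loop executes once per vertex, i.e. $n$ times, because $z$ is the last vertex added to $S$:

```python
import math

def dijkstra_iterations(weights, a, z):
    """Run the textbook pseudocode literally and count how many
    times the while loop executes."""
    vertices = {u for edge in weights for u in edge}
    L = {v: math.inf for v in vertices}
    L[a] = 0
    S = set()
    iterations = 0
    while z not in S:
        iterations += 1
        u = min((v for v in vertices if v not in S), key=lambda v: L[v])
        S.add(u)
        for v in vertices - S:
            if L[u] + weights.get((u, v), math.inf) < L[v]:
                L[v] = L[u] + weights.get((u, v), math.inf)
    return L[z], iterations

# Path graph v1 - v2 - v3 - v4 with unit weights: z = v4 is the
# last vertex added to S, so the loop runs n = 4 times, not n - 1.
weights = {}
for i in range(1, 4):
    weights[(f"v{i}", f"v{i+1}")] = 1
    weights[(f"v{i+1}", f"v{i}")] = 1
dist, iterations = dijkstra_iterations(weights, "v1", "v4")
print(dist, iterations)  # 3 4
```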

Analysis of kd-tree: how is the number of regions the vertical line L intersects equivalent to sqrt(N)?

I'm trying to understand how the number of regions intersected by a vertical line in a kd-tree is equivalent to sqrt(N).

If you draw a balanced KD-tree with 7 nodes.

And then draw a vertical line l.


The number of regions this line intersects should be equivalent to sqrt(N), where N is the number of nodes (here 7).

When I count the regions the vertical line l intersects, I get 5. But sqrt(7) ≈ 2.6, which is not even close.

Both sources get to the recurrence:

[figure: the recurrence $Q(n) = 2 + 2\,Q(n/4)$]

Which solves to O(sqrt(N)).

What am I doing wrong?


Source 1

Source 2
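For intuition on why counting 5 regions at N = 7 does not contradict the bound: O(sqrt(N)) is asymptotic, with a hidden constant factor. A small sketch (assuming the recurrence is $Q(n) = 2 + 2\,Q(n/4)$, the standard one for kd-tree stabbing queries; I am assuming this is what the sources use) shows that $Q(n)/\sqrt{n}$ stays bounded by a constant, rather than $Q(n) \le \sqrt{n}$ holding exactly:

```python
import math

def Q(n):
    """Regions a vertical line can intersect, per the standard
    kd-tree recurrence Q(n) = 2 + 2*Q(n/4), with Q(n) = 1 for n <= 1."""
    if n <= 1:
        return 1
    return 2 + 2 * Q(n / 4)

# Q(n) grows like sqrt(n) times a constant (roughly 3-5 here),
# which is why small cases like n = 7 exceed sqrt(n) itself.
for n in [7, 100, 10_000, 1_000_000]:
    print(n, Q(n), Q(n) / math.sqrt(n))
```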

Choosing potential function in amortized analysis of the dynamic array

Suppose we have a dynamic array with some initial capacity $ c_{initial}$ (i.e., the dynamic array with zero elements will have this capacity). The $ add$ operation is modified the following way:

New array capacity $c' = c \cdot scale$, where $scale$ is some constant, if the number of elements in the array is $s = scale^{k}$ (i.e., the array is full, and the capacity must be increased $scale$ times).

I am trying to show that the $add$ operation has constant amortized complexity using the following potential function: $\Phi = scale \cdot s - c + c_{initial}$, so that $\Phi(0) = -c_{initial} + c_{initial} = 0$,

where $s$ is the array size (number of elements) and $c$ is the array capacity (a 'generalized' version of the popular case when the array capacity doubles, $\Phi = 2s - c$, with initial capacity $c_{initial} = 0$).

However, I am stuck with the case when the scale factor is bigger than 2. For example, in the $scale = 4$ case (and initial size $= 0$), the amortized cost of $add$ when the array is full is $c'_{i} = c_i + \Phi_i - \Phi_{i-1} = (s+1) + (4(s+1) - 4c) - (4s - c) = 5 + s - 3c = 5 - 2c$ (using $s = c$ when the array is full), which is not a constant.

Is there some way besides the guess-and-check, ‘brute-force’ way to come up with an appropriate potential function?

Another possible modification of the $add$ operation is to increase the capacity by some constant, and I don't see any pattern for the potential function in this case.
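One systematic route I have tried (my own derivation, not from any text): posit a linear potential $\Phi = a \cdot s - b \cdot c + const$ and solve for coefficients that make both the plain-insert case and the resize case constant. For growth factor $scale = \alpha$ this yields $\Phi = \frac{\alpha s - c + c_{initial}}{\alpha - 1}$, which reduces to $2s - c$ when $\alpha = 2$ and $c_{initial} = 0$. A quick simulation under the cost model above (resize costs $s+1$ for copying plus inserting, plain insert costs 1) suggests the amortized cost is the constant $1 + \frac{\alpha}{\alpha - 1}$:

```python
def amortized_costs(scale, c_init, n_ops):
    """Simulate a dynamic array with growth factor `scale` and
    candidate potential Phi = (scale*s - c + c_init)/(scale - 1).
    Returns the list of amortized costs c_i + Phi_i - Phi_{i-1}."""
    def phi(s, c):
        return (scale * s - c + c_init) / (scale - 1)

    s, c = 0, c_init
    out = []
    for _ in range(n_ops):
        if s == c:                      # full: grow, then insert
            actual = s + 1              # copy s elements + 1 insert
            new_c = scale * c
        else:
            actual = 1                  # plain insert
            new_c = c
        out.append(actual + phi(s + 1, new_c) - phi(s, c))
        s, c = s + 1, new_c
    return out

# scale = 4: every operation has amortized cost 1 + 4/3 = 7/3.
costs = amortized_costs(scale=4, c_init=1, n_ops=1000)
print(min(costs), max(costs))
```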

Average Case Analysis of Insertion Sort as dealt with in Kenneth Rosen's "Discrete Mathematics and Its Applications"

I was going through "Discrete Mathematics and Its Applications" by Kenneth Rosen, where I came across the following algorithm for insertion sort along with its analysis. The algorithm is quite different from the one dealt with in CLRS, so I have shared the entire algorithm below. Note that they consider a machine where only comparisons are significant, and they proceed accordingly. The problem I face is in the analysis portion quoted below. Moreover, the specific doubts I have are pointed out at the very end of this question.

ALGORITHM The Insertion Sort.

    procedure insertion sort(a_1, a_2, ..., a_n: real numbers with n ≥ 2)
    for j := 2 to n
    begin
        i := 1
        while a_j > a_i
            i := i + 1
        m := a_j
        for k := 0 to j - i - 1
            a_{j-k} := a_{j-k-1}
        a_i := m
    end
    {a_1, a_2, ..., a_n is sorted}

THE INSERTION SORT: The insertion sort is a simple sorting algorithm, but it is usually not the most efficient. To sort a list with $ n$ elements, the insertion sort begins with the second element. The insertion sort compares this second element with the first element and inserts it before the first element if it does not exceed the first element and after the first element if it exceeds the first element. At this point, the first two elements are in the correct order. The third element is then compared with the first element, and if it is larger than the first element, it is compared with the second element; it is inserted into the correct position among the first three elements.

In general, in the $j$th step of the insertion sort, the $j$th element of the list is inserted into the correct position in the list of the previously sorted $j-1$ elements. To insert the $j$th element in the list, a linear search technique is used; the $j$th element is successively compared with the already sorted $j-1$ elements at the start of the list until the first element that is not less than this element is found or until it has been compared with all $j-1$ elements; the $j$th element is inserted in the correct position so that the first $j$ elements are sorted. The algorithm continues until the last element is placed in the correct position relative to the already sorted list of the first $n-1$ elements. The insertion sort is described in pseudocode in the algorithm above.
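As a sanity check, here is a direct Python transcription of the pseudocode (0-indexed; my own translation, not from the text) that also counts comparisons, charging one comparison per evaluation of the while condition:

```python
def insertion_sort(a):
    """Rosen-style insertion sort using a linear search from the left.
    Returns (sorted list, number of while-condition comparisons)."""
    a = list(a)
    comparisons = 0
    for j in range(1, len(a)):
        i = 0
        while a[j] > a[i]:          # linear search for the insert position
            i += 1
        comparisons += i + 1        # stopping at 1-indexed position k costs k tests
        m = a[j]
        for k in range(j - i):      # shift a[i..j-1] one slot to the right
            a[j - k] = a[j - k - 1]
        a[i] = m
    return a, comparisons

print(insertion_sort([3, 1, 2]))    # ([1, 2, 3], 3)
```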

Average-Case Complexity of the Insertion Sort: What is the average number of comparisons used by the insertion sort to sort $ n$ distinct elements?

Solution: We first suppose that $ X$ is the random variable equal to the number of comparisons used by the insertion sort to sort a list $ a_1 ,a_2 ,…,a_n$ of $ n$ distinct elements. Then $ E(X)$ is the average number of comparisons used. (Recall that at step $ i$ for $ i = 2,…,n$ , the insertion sort inserts the $ i$ th element in the original list into the correct position in the sorted list of the first $ i − 1$ elements of the original list.)

We let $ X_i$ be the random variable equal to the number of comparisons used to insert $ a_i$ into the proper position after the first $ i − 1$ elements $ a_1 ,a_2,…,a_{i−1}$ have been sorted. Because

$ X=X_2+X_3+···+X_n$ ,

we can use the linearity of expectations to conclude that

$ E(X) = E(X_2 + X_3 +···+X_n) = E(X_2) + E(X_3) +···+E(X_n).$

To find $E(X_i)$ for $i = 2, 3, \ldots, n$, let $p_j(k)$ denote the probability that the largest of the first $j$ elements in the list occurs at the $k$th position, that is, that $\max(a_1, a_2, \ldots, a_j) = a_k$, where $1 \le k \le j$. Because the elements of the list are randomly distributed, it is equally likely for the largest element among the first $j$ elements to occur at any position. Consequently, $p_j(k) = \frac{1}{j}$. If $X_i(k)$ equals the number of comparisons used by the insertion sort if $a_i$ is inserted into the $k$th position in the list once $a_1, a_2, \ldots, a_{i-1}$ have been sorted, it follows that $X_i(k) = k$. Because it is possible that $a_i$ is inserted in any of the first $i$ positions, we find that

$$E(X_i) = \sum_{k=1}^{i} p_i(k) \cdot X_i(k) = \sum_{k=1}^{i} \frac{1}{i} \cdot k = \frac{1}{i}\sum_{k=1}^{i} k = \frac{1}{i} \cdot \frac{i(i+1)}{2} = \frac{i+1}{2}$$

It follows that

$$E(X) = \sum_{i=2}^{n} E(X_i) = \sum_{i=2}^{n} \frac{i+1}{2} = \frac{n^{2} + 3n - 4}{4}$$
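The closed form can be checked numerically: averaging Rosen's comparison count $X_i(k) = k$ with $p_i(k) = 1/i$ and summing over $i$ reproduces $\frac{n^2+3n-4}{4}$. A short sketch of the arithmetic, using exact rationals:

```python
from fractions import Fraction

def expected_comparisons(n):
    """E(X) = sum_{i=2}^{n} E(X_i), where E(X_i) = (1/i) * sum_{k=1}^{i} k."""
    total = Fraction(0)
    for i in range(2, n + 1):
        total += Fraction(1, i) * sum(Fraction(k) for k in range(1, i + 1))
    return total

# Matches the closed form (n^2 + 3n - 4)/4 for each n tried.
for n in [2, 5, 10]:
    print(n, expected_comparisons(n), Fraction(n * n + 3 * n - 4, 4))
```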

My doubt

Now, while considering the calculation of $E(X_i)$, we first consider the probability that the maximum element among $a_1, a_2, \ldots, a_i$ is at position $k$. Then they say that the number of comparisons when $a_i$ is placed into the $k$th position in the (already sorted) list $a_1, a_2, \ldots, a_{i-1}$ is $k$. Why are they considering the insertion of $a_i$ into the position of the maximum of the elements $a_1, a_2, \ldots, a_i$? As per the algorithm, $a_i$ should be placed at the first position (scanning the array from the left) where we find an element $\geqslant a_i$, not at the position of the maximum element of the sublist $a_1, a_2, \ldots, a_i$.

Moreover, they say that the max element of the sublist $a_1, a_2, \ldots, a_i$ can be at any arbitrary position $k$, with probability $\frac{1}{i}$. But if $a_1, a_2, \ldots, a_{i-1}$ is sorted, then the max of $a_1, a_2, \ldots, a_i$ is either $a_{i-1}$ or $a_i$.

Amortized Analysis of Binary Counter and Dynamic Table together

Let m ≥ 2 be a fixed positive integer. Consider the following ADT. A state of the ADT is an m-tuple of integers (v_1, v_2, ..., v_m). The ADT supports two operations:

• Increment(i), which increases the value of v_i by 1.

• Max, which returns an index of the maximum value, that is, it returns an i such that v_i ≥ v_j for all j.

I am designing a data structure for this ADT as a collection of dynamically sized tables T_i, where each entry of T_i stores a single bit of v_i. This allows each component to grow arbitrarily large, because we can resize the table to get longer and longer as the value grows.

Now, suppose the algorithm for Max is:

    Max
        res ← 1
        for i ← 2..m
            if (number represented in T_i) > (number represented in T_res) then
                res ← i
            end if
        end for
        return res
    end Max

Now, given a sequence of s operations and initial state (0, 0, ..., 0), I want to do an amortized analysis of this case and find the total time and the amortized time per operation.
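A sketch of one component of this design (my own illustration, not a reference implementation): each table T_i behaves like a binary counter, so Increment flips bits exactly as in the classic amortized analysis, and s increments cost fewer than 2s bit flips in total, i.e. amortized O(1) flips per Increment; Max as written then costs O(m · table length) per call.

```python
def increment(table):
    """Binary-counter increment on a little-endian bit table.
    Doubles the table on overflow (the dynamic-table part).
    Returns the number of bit flips performed."""
    flips = 0
    i = 0
    while i < len(table) and table[i] == 1:
        table[i] = 0                              # carry: 1 -> 0
        flips += 1
        i += 1
    if i == len(table):
        table.extend([0] * max(1, len(table)))    # grow the table
    table[i] = 1                                  # final 0 -> 1 flip
    return flips + 1

def value(table):
    return sum(bit << i for i, bit in enumerate(table))

# Aggregate bound: total flips over s increments is < 2s.
t = [0]
s = 1000
total_flips = sum(increment(t) for _ in range(s))
print(value(t), total_flips)
```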

New to malware analysis, looking for a list of string types

As the title suggests, I'm new to malware analysis, and as I was looking over strings in a sample, I came across one that I haven't seen before, using + and / (6AAAAABZg+kFjYm2BgAAUegBAAAAw1WL7IPk+etc…).

I am wondering if there is a list of some kind out there that has examples of certain types of strings, e.g. SGVsbG8= is Base64, 48 65 6c 6c 6f is hexadecimal, and so on. Thanks!
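I don't know of a canonical list, but the alphabets themselves are a useful guide: Base64 uses A–Z, a–z, 0–9, +, / with optional = padding, while hex dumps use only 0–9, a–f (often space-separated). As an illustration, a small Python sketch with heuristics of my own (not a standard tool, and deliberately rough, since hex strings are also valid Base64):

```python
import base64
import re

def guess_encoding(s):
    """Very rough heuristic classification of a string's encoding."""
    compact = s.replace(" ", "")
    if re.fullmatch(r"[0-9a-fA-F]+", compact) and len(compact) % 2 == 0:
        return "hex"
    if re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", s) and len(s) % 4 == 0:
        return "base64"
    return "unknown"

print(guess_encoding("SGVsbG8="))           # base64
print(guess_encoding("48 65 6c 6c 6f"))     # hex
print(base64.b64decode("SGVsbG8=").decode())  # Hello
```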

Sedgewick Quick Union Analysis

I am currently studying Algorithms, Fourth Edition by Sedgewick et al. On page 226, there is an analysis of the worst case of the quick-union algorithm's find() method. This is the algorithm:

    private int find(int p)
    {
        while (p != id[p]) p = id[p];
        return p;
    }

Now I count N comparisons in total (inside the while) plus N-1 accesses (the last comparison is false) to rewrite the value of p. This would give me 2N - 1 accesses to the array. The textbook, however, says there are 2N + 1 accesses. What am I doing wrong? Thanks for your time.
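One way to pin the count down is to instrument the loop. A small Python sketch (my own translation of the Java method, with an access-counting array; the worst-case tree 0 ← 1 ← 2 ← … ← N−1 is my choice of example) counts one access per condition test plus one per body assignment, which indeed gives 2N − 1 under this counting convention; whether the book's 2N + 1 uses a different convention is the open question:

```python
class CountingArray:
    """List wrapper that counts element reads."""
    def __init__(self, data):
        self.data = list(data)
        self.reads = 0
    def __getitem__(self, i):
        self.reads += 1
        return self.data[i]

def find(id, p):
    while p != id[p]:      # one array access per condition test
        p = id[p]          # one array access per body assignment
    return p

N = 10
# Worst case: a tall tree where id[i] = i - 1 and id[0] = 0.
id = CountingArray([0] + list(range(N - 1)))
root = find(id, N - 1)
print(root, id.reads)      # 0, 2*N - 1 accesses
```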

Analysis of Dijkstra algorithm’s (Lazy) running time

I'm trying to figure out the running time of Dijkstra's algorithm. All the sources I have read say that the running time is O(E log E) for a lazy implementation.

But when we do the math we get O(E · (log E + E · log E)).

Since E isn't a constant, I don't see how someone could reduce this to O(E log E).

Are we analyzing it wrong, or is it possible to reduce this?

    while (!minPQ.isEmpty()) {                 // <== O(E) iterations
        Node min = minPQ.poll();               // <== O(log E)
        for (Edge edge : graph.adj(min)) {     // <== O(E)
            if (min.getId() == target.getId()) {
                // Source and Target = same edge
                if (edgeTo.size() == 0) edgeTo.put(target, edge);
                return;
            }
            relax(edge, min, vehicle);         // <== O(log E) (because of add method on PQ)
        }
    }
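As I understand it, the resolution is that the two loops should not be multiplied: across the whole run, poll() executes at most O(E) times in total (each relaxation pushes at most one entry), and the inner for loop scans each edge at most twice in total, so the aggregate is O(E) heap operations of O(log E) each. A minimal lazy implementation in Python (heapq; names and the example graph are my own, not from the snippet above):

```python
import heapq

def lazy_dijkstra(adj, source):
    """adj: dict vertex -> list of (neighbor, weight). Returns dist dict.
    Each relaxation may push one heap entry, so the heap holds O(E)
    entries in total; aggregate work is O(E log E), not O(E^2 log E)."""
    dist = {source: 0}
    done = set()
    pq = [(0, source)]
    while pq:                          # pops O(E) entries over the whole run
        d, u = heapq.heappop(pq)       # O(log E)
        if u in done:                  # lazy deletion: skip stale entries
            continue
        done.add(u)
        for v, w in adj.get(u, []):    # each edge scanned at most twice overall
            if v not in done and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))   # O(log E)
    return dist

adj = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
}
print(lazy_dijkstra(adj, "a"))  # {'a': 0, 'b': 1, 'c': 3, 'd': 6}
```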

Time complexity analysis of 2 arbitrary algorithms – prove or disprove

We are given two algorithms, A and B, such that for each input size, algorithm A performs half the number of steps that algorithm B performs on the same input size.

We denote the worst-case time complexity of each one by $g_A(n), g_B(n)$.

Also, we know there’s a positive function $ f(n)$ such that $ g_A(n)\in\Omega(f(n))$

Is it possible that $ g_B(n)\in\Omega(f(n))$ ? Is it necessary?

It seems naive to think that it's necessary, but I can't figure out how to contradict it.
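A sketch of why it is in fact necessary (my own reasoning, under the literal reading that $g_A(n) = \tfrac{1}{2}\,g_B(n)$ for every $n$):

```latex
g_A(n) \in \Omega(f(n))
  \;\Longrightarrow\; \exists\, c > 0,\ n_0 :\ g_A(n) \ge c\, f(n)
  \ \text{for all}\ n \ge n_0 .
\text{Since } g_B(n) = 2\, g_A(n) \ge g_A(n) \ge c\, f(n)
  \ \text{for all}\ n \ge n_0 ,
\text{it follows that } g_B(n) \in \Omega(f(n))
  \ \text{with the same witnesses } c,\ n_0 .
```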