Mathematics of the upper bound for the encoded input of an instance of Kruskal’s MWST problem

I was going through the classic text “Introduction to Automata Theory, Languages, and Computation” by Hopcroft, Motwani and Ullman, where I came across the encoding of an instance of Kruskal’s Minimum Weight Spanning Tree ($MWST$) problem so that it could be given as input to a Turing machine. The encoding is as shown below.

Kruskal’s problem can be couched as: “Given this graph $G$ and limit $W$, does $G$ have a spanning tree of weight $W$ or less?”


Let us consider a possible code for the graphs and weight limits that could be the input to the $ MWST$ problem. The code has five symbols, $ 0$ , $ 1$ , the left and right parentheses, and the comma.

  1. Assign integers $ 1$ through $ m$ to the nodes.

  2. Begin the code with the value of $ m$ in binary and the weight limit $ W$ in binary, separated by a comma.

  3. If there is an edge between nodes $ i$ and $ j$ with weight $ w$ , place $ (i, j, w)$ in the code. The integers $ i$ , $ j$ , and $ w$ are coded in binary. The order of $ i$ and $ j$ within an edge, and the order of the edges within the code are immaterial. Thus, one of the possible codes for the graph of Fig. with limit $ W = 40$ is

$ 100, 101000(1, 10, 1111)(1, 11, 1010)(10, 11, 1100)(10, 100, 10100)(11, 100, 10010)$
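The three steps above are mechanical enough to sketch in a few lines (Python; the helper name `encode_instance` is my own). It reproduces the book’s example string, modulo spacing:

```python
def encode_instance(m, W, edges):
    """Encode an MWST instance over the five symbols 0, 1, (, ), and comma."""
    b = lambda x: format(x, 'b')  # integer -> binary string
    code = b(m) + ',' + b(W)
    for i, j, w in edges:
        code += '(' + b(i) + ',' + b(j) + ',' + b(w) + ')'
    return code

# the graph from the example: 4 nodes, weight limit W = 40
edges = [(1, 2, 15), (1, 3, 10), (2, 3, 12), (2, 4, 20), (3, 4, 18)]
print(encode_instance(4, 40, edges))
# -> 100,101000(1,10,1111)(1,11,1010)(10,11,1100)(10,100,10100)(11,100,10010)
```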

If we represent inputs to the $MWST$ problem as above, then an input of length $n$ can represent at most $O(n/\log n)$ edges. It is possible that $m$, the number of nodes, could be exponential in $n$, if there are very few edges. However, unless the number of edges, $e$, is at least $m-1$, the graph cannot be connected and therefore will have no $MWST$, regardless of its edges. Consequently, if the number of nodes is not at least some fraction of $n/\log n$, there is no need to run Kruskal’s algorithm at all; we simply say “no, there is no spanning tree of that weight.”

I do not understand the mathematics behind the lines given in bold. How are they bounding the number of edges by dividing the total input size $n$ by $\log n$? And if the number of edges is small, how can the number of nodes $m$ be exponential in $n$?

Upper Bound on Local Search MST

Consider a connected weighted graph $G=(V,E)$. The local search algorithm is used to find the minimum spanning tree, swapping only one edge for another at a time. At each step of the local search algorithm, select the best improving swap of edges $\{e, f\}$, i.e. the one for which $w_f - w_e$ is maximized.

Here is how the algorithm works:

Define a spanning tree $T$ in $G$. Let $e$ be an edge in $G \setminus T$, and define $C$ as the cycle created when $e$ is added to $T$. If $w_e < w_f$, where $f$ is the edge in $C \setminus \{e\}$ of largest weight, then $\{e, f\}$ is defined as an improving swap and we set $T := (T \setminus \{f\}) \cup \{e\}$. We repeat this until there are no more improving swaps.
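The procedure above can be sketched directly (Python; the function names and the edge representation as sorted pairs with a weight dictionary are my own choices):

```python
def tree_path(tree, u, v):
    """Path from u to v inside the spanning tree; adding edge (u, v)
    closes this path into the cycle C."""
    adj = {}
    for a, b in tree:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    stack, seen = [(u, [u])], {u}
    while stack:
        node, path = stack.pop()
        if node == v:
            return path
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))

def best_swap_local_search(weights, tree):
    """weights: {(u, v): w} with u < v; tree: set of those edges.
    Applies the best improving swap until none exists; returns (tree, steps)."""
    tree, steps = set(tree), 0
    while True:
        best = None  # (gain, e, f)
        for e, w_e in weights.items():
            if e in tree:
                continue
            path = tree_path(tree, e[0], e[1])
            cycle_edges = [tuple(sorted(p)) for p in zip(path, path[1:])]
            f = max(cycle_edges, key=weights.get)  # heaviest edge in C \ {e}
            gain = weights[f] - w_e
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, e, f)
        if best is None:
            return tree, steps
        _, e, f = best
        tree.remove(f)
        tree.add(e)
        steps += 1

# tiny example: a triangle with weights 1, 2, 3, starting from the heavier tree
w = {(1, 2): 1, (1, 3): 2, (2, 3): 3}
final, steps = best_swap_local_search(w, {(1, 3), (2, 3)})
# one swap replaces (2, 3) by (1, 2), reaching the MST {(1, 2), (1, 3)}
```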

A graph $G$ has only finitely many spanning trees. So, when using the best-improving-swap local search algorithm, the algorithm must terminate in a finite number of steps, since the weight of the spanning tree strictly decreases at each step.

What I need is an upper bound on the maximum number of steps the best-improving-swap local search algorithm will take. Assume every edge weight in $G$ is distinct.

I’m really stuck and I’m not sure how to even start this problem, any help would be appreciated.

The total length of the input to a pushdown automaton which accepts by empty stack is an upper bound on the number of states and stack symbols

I was going through the classic text “Introduction to Automata Theory, Languages, and Computation” (3rd edition) by John Hopcroft, Rajeev Motwani, and Jeffrey Ullman, where I came across a few statements about a pushdown automaton (PDA) which accepts by empty stack:

1. $ n$ , the total length of the input, is surely an upper bound on the number of states and stack symbols.

2. One rule could place almost $n$ symbols on the stack.

These statements were made while the authors were about to make some notes about the decision properties of CFLs (context-free languages).

Now here are some points by which I can possibly contradict the claim rather than prove it correct.

  1. Suppose $n$ is the total length of the input. As per the design of the PDA, it might so happen that not all of the states of the PDA are involved in accepting the input string, so we cannot say that $n$ is an upper bound on the number of states the PDA has.

  2. Though the PDA accepts by empty stack, it might so happen that a transition function adds more than $n$ elements to the top of the stack; at the end, on consuming the $n$ input symbols, we can stay in a particular state and use epsilon transitions to remain in the same state and pop elements from the stack till it becomes empty. So how can we say that $n$ is an upper bound on the number of elements on the stack? We arrive at a contradiction…

I do not understand where I am making the mistake, because the same statements appear in the 3rd edition of the book without any changes from the second edition, which makes it probable that the statements are correct.

I have attached the corresponding portion of the text below.

Find the upper bound of the recurrence T(n) = T(n - 4) + n where n is odd

I am trying to solve this recurrence, assuming $n$ is odd: $T(n) = T(n - 4) + \Theta(n)$

What I did so far was:

First, $T(n - 4) = T(n - 8) + (n - 4)$, thus we get $T(n) = T(n - 8) + (n - 4) + n$

Next, $T(n - 8) = T(n - 12) + (n - 8)$, thus we get $T(n) = T(n - 12) + (n - 8) + (n - 4) + n$

Next, $T(n - 12) = T(n - 16) + (n - 12)$, thus we get $T(n) = T(n - 16) + (n - 12) + (n - 8) + (n - 4) + n$

So now I see where this is going, and I wrote the general form:

$T(n) = T(n - 4k) + [n - 4(k - 1)] + [n - 4(k - 2)] + \dots + (n - 4) + n$

As mentioned, $n$ is odd, so I write it as $n = 2k + 1$, which gives $k = \frac{n - 1}{2}$. I will plug that in to get:

$T(n) = T(-n + 2) + (-n + 6) + (-n + 10) + \dots + (n - 4) + n$

This is where I am stuck. I suspect that the $(n - 4) + n$ part must have gone wrong, but I am not too sure, so I decided to ask you guys.

Thank you so much!

Upper bound on the average number of overlaps for an interval within a set of intervals

Let $\mathcal{I}$ be a set of intervals with cardinality $L$, where each interval $I_i \in \mathcal{I}$ is of the form $[a_i, b_i]$ and $a_i, b_i$ are pairwise distinct non-negative integers bounded by a constant $C$, i.e. $0 \leq a_i < b_i \leq C$. We say a pair of intervals $I_i, I_j \in \mathcal{I}$ overlap if the length of their overlap is $> 0$.

Define a function $F(i)$ which computes the number of intervals in $\mathcal{I} \setminus \{I_i\}$ that interval $I_i$ overlaps with: \begin{equation} F(i) = \sum_{j=1, j \neq i}^{L} Overlap(I_i, I_j) \end{equation} where $Overlap(I_i, I_j)$ is an indicator function which returns 1 if $I_i, I_j$ overlap, and 0 otherwise.

The average number of overlaps for the intervals in $ \mathcal{I}$ , denoted by $ Avg(\mathcal{I})$ is given by $ Avg(\mathcal{I}) = \dfrac{\sum_{i=1}^{L}F(i)}{L}$ .
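The definitions above translate directly into a brute-force check (Python; `avg_overlaps` is my own helper name), which is handy for testing candidate bounds on small instances:

```python
def avg_overlaps(intervals):
    """Avg(I): average number of overlap partners per interval."""
    def overlap(I, J):
        # intersection of [a1, b1] and [a2, b2] has positive length
        return min(I[1], J[1]) - max(I[0], J[0]) > 0
    L = len(intervals)
    total = sum(overlap(intervals[i], intervals[j])
                for i in range(L) for j in range(L) if i != j)
    return total / L

# [0,2] and [1,3] overlap each other; [4,5] overlaps neither,
# so Avg = (1 + 1 + 0) / 3
print(avg_overlaps([(0, 2), (1, 3), (4, 5)]))  # -> 2/3
```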

The question is, suppose we are allowed to choose the intervals in the set $ \mathcal{I}$ with the following additional conditions:

  1. For any $ t \in [0, C]$ , we have at most $ M$ (and $ M < L$ ) intervals in $ \mathcal{I}$ such that $ t$ is contained within those $ M$ intervals. Stated differently, at most $ M$ intervals overlap at any point in time.
  2. Any interval in $ \mathcal{I}$ overlaps with at most $ K$ other intervals, and $ M < K < L$ .

then, what is an upper bound on $ Avg(\mathcal{I})$ for any choice of the intervals in $ \mathcal{I}$ satisfying 1, 2?

In case you are wondering, this problem is of interest to me because I want to study the run-time of a scheduling algorithm.

I am unable to come up with a non-trivial upper bound for $ Avg(\mathcal{I})$ and would be interested to know if the problem I stated has been studied. I am also open to ideas on how one may be able to obtain a tight upper bound for $ Avg(\mathcal{I})$ .

Upper (or lower) envelope of some linear functions

1) Given some single-variable linear functions, does there exist a lowest-complexity algorithm to find their upper (or lower) envelope? That is, we should find all breakpoints. I searched and found that there exists an algorithm with $O(n \log n)$ complexity; however, I need some references.

2) Can the complexity be reduced if the slopes of all the lines are positive?
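For reference, the standard $O(n \log n)$ approach sorts the lines by slope and then discards dominated lines with a stack pass, much like the convex hull trick. A sketch in Python (the function names are my own):

```python
def upper_envelope(lines):
    """lines: list of (slope, intercept). Returns the lines on the upper
    envelope, ordered left to right, plus the breakpoint x-values."""
    lines = sorted(lines)                      # O(n log n): sort by slope
    hull = []

    def middle_hidden(l1, l2, l3):
        # l2 never appears on the envelope if l1 and l3 already
        # intersect at or above l2
        (a1, b1), (a2, b2), (a3, b3) = l1, l2, l3
        return (b3 - b1) * (a2 - a1) >= (b2 - b1) * (a3 - a1)

    for ln in lines:
        if hull and hull[-1][0] == ln[0]:      # equal slopes: keep higher line
            if hull[-1][1] >= ln[1]:
                continue
            hull.pop()
        while len(hull) >= 2 and middle_hidden(hull[-2], hull[-1], ln):
            hull.pop()
        hull.append(ln)

    breaks = [(b2 - b1) / (a1 - a2) for (a1, b1), (a2, b2) in zip(hull, hull[1:])]
    return hull, breaks

# max(-x, 1, x): the constant line wins exactly on [-1, 1]
hull, breaks = upper_envelope([(0, 1), (1, 0), (-1, 0)])
# hull -> [(-1, 0), (0, 1), (1, 0)], breaks -> [-1.0, 1.0]
```

Note that the stack pass itself is linear, so the sorting step dominates the running time.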

In Big-O notation, what does it mean for T(n) to be upper bounded by something

I do not have much experience in mathematics, but I would really like to grasp Big-O notation on its mathematical level. I have already read “What does the ‘big O complexity’ of a function mean?” from the references, but I still do not understand (even graphically) what it means when we say:

$T(n) = O(f(n))$ if and only if there are constants $c$ and $g$ such that $T(n) \le c \cdot f(n)$ whenever $n \ge g$

Specifically, we say that $T(n)$ is upper bounded by $c \cdot f(n)$. What does that actually mean, and why does it matter? Does it have to do with eliminating constant factors and lower-order terms?
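One concrete instance (the numbers are my own): take $T(n) = 3n + 10$ and $f(n) = n$. Choosing $c = 4$ and $g = 10$ makes the inequality hold for every $n \ge g$, which is exactly what absorbs the constant factor $3$ and the low-order term $+10$:

```python
# T(n) = 3n + 10 is O(n): with c = 4 and g = 10 we have
# 3n + 10 <= 4n  whenever  n >= 10
T = lambda n: 3 * n + 10
f = lambda n: n
c, g = 4, 10
print(all(T(n) <= c * f(n) for n in range(g, 100000)))  # -> True
```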

Sorry if question is kind of confusing, and thanks for the help!

Upper bound for runtime complexity of LOOP programs

Recently I learned about LOOP programs, which always terminate and have the same computational power as primitive recursive functions. Furthermore, primitive recursive functions can (as far as I understood) compute anything that does not grow faster than $Ack(n)$.

Does this imply that the upper bound on the runtime complexity of LOOP programs is $O(Ack(n))$? And are there functions similar to Ackermann’s function which cannot be computed by primitive recursive functions but grow more slowly than $Ack(n)$?
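For context, here is the usual two-argument Ackermann–Péter function (a Python sketch); its diagonal $Ack(n) = ack(n, n)$ is the textbook example of a total computable function that no primitive recursive function (and hence no LOOP program) can compute:

```python
def ack(m, n):
    """Ackermann-Peter function: total and computable, not primitive recursive."""
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1)
    return ack(m - 1, ack(m, n - 1))

# the diagonal explodes almost immediately:
print(ack(1, 1), ack(2, 2), ack(3, 3))  # -> 3 7 61
```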

(sorry for spelling and grammar)

For what family of graphs does Dijkstra’s algorithm achieve the run-time upper bound

The question is in the title of the post. I am hoping to get some validation regarding my solution.

After some trial and error, my idea is as follows.

  1. The worst-case complexity of $\mathcal{O}(E \lg V)$ is achieved when nodes relax all of their neighbors when dequeued.
  2. For the above to occur, longer paths to every neighbor of $v$ must relax their respective edges into those neighbors before $v$ is dequeued.

The two points above give rise to the following formalism.

For all paths $ p$ , $ q$ between any two nodes $ s$ and $ t$ , $ l(p) > l(q) \implies l(p \setminus \{t\}) < l(q \setminus \{t\})$ where $ l(x)$ is the length of path $ x$ .

[Figure: an example graph on nodes $A$, $B$, $C$ with edges $A \rightarrow B$, $A \rightarrow C$, and $B \rightarrow C$]

For example, in the graph above, the edge $A \rightarrow C$ of the longer path is relaxed before the edge $B \rightarrow C$ of the shorter path $A \rightarrow B \rightarrow C$. This property holds for every node in the graph.
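To make point 1 concrete, here is a small sketch (Python, standard binary-heap Dijkstra with lazy deletion; the specific weights are my own, chosen to reproduce the relaxation order described above) that counts successful relaxations:

```python
import heapq

def dijkstra_count_relaxations(graph, source):
    """graph: {node: [(neighbor, weight), ...]}.
    Returns shortest distances and the number of successful relaxations."""
    dist = {source: 0}
    relaxations = 0
    pq = [(0, source)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:          # lazy deletion of stale heap entries
            continue
        done.add(u)
        for v, w in graph[u]:
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                relaxations += 1
                heapq.heappush(pq, (d + w, v))
    return dist, relaxations

# hypothetical weights: A->C (weight 3) is relaxed first, then improved
# via the shorter path A->B->C (weights 1 + 1)
g = {'A': [('B', 1), ('C', 3)], 'B': [('C', 1)], 'C': []}
dist, r = dijkstra_count_relaxations(g, 'A')
# every one of the 3 edges causes a successful relaxation: r == 3
```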