Benefits of not having a clustered index on tables (Heaps)

What are the benefits of not having a clustered index on a table in SQL server. Will a

SELECT * INTO TABLE_A FROM TABLE_B 

Be faster if TABLE_A is a heap? Which operation will benefit if the table is a heap? I am quite sure UPDATE‘s and DELETE‘s will benefit from a clustered index. What about INSERTS? My understanding is that INSERT "might" benefit from the table being a heap, both in term of speed but also other resources and hardware (I/O, CPU, memory and storage…).

What is the most scarce resource in terms of hardware? In terms of storage is a heap going to occupy less space? Is disk storage not the least expensive resource? If so is it rational to keep table as heap in order to save disk space? How will a heap affect CPU and I/O with SELECT, INSERT, UPDATE and DELETE? What cost goes up when table is a heap and we SELECT, UPDATE and DELETE from it?

Thansk

Essence of the cost benifit obtained by using “markings” in Fibonacci Heaps (by using a mathematical approach)

The following excerpts are from the section Fibonacci Heap from the text Introduction to Algorithms by Cormen et. al


The authors deal with a notion of marking the nodes of Fibonacci Heaps with the background that they are used to bound the amortized running time of the $ \text{Decrease-Key}$ or $ \text{Delete}$ algorithm, but not much intuition is given behind their use of it.

What things shall go bad if we do not use markings ? (or) use $ \text{Cacading-Cut}$ when the number of children lost from a node is not just $ 2$ but possibly more ?

The excerpt corresponding to this is as follows:

We use the mark fields to obtain the desired time bounds. They record a little piece of the history of each node. Suppose that the following events have happened to node $ x$ :

  1. at some time, $ x$ was a root,
  2. then $ x$ was linked to another node,
  3. then two children of $ x$ were removed by cuts.

As soon as the second child has been lost, we cut $ x$ from its parent, making it a new root. The field $ mark[x]$ is true if steps $ 1$ and $ 2$ have occurred and one child of $ x$ has been cut. The Cut procedure, therefore, clears $ mark[x]$ in line $ 4$ , since it performs step $ 1$ . (We can now see why line $ 3$ of $ \text{Fib-Heap-Link}$ clears $ mark[y]$ : node $ у$ is being linked to another node, and so step $ 2$ is being performed. The next time a child of $ у$ is cut, $ mark[y]$ will be set to $ \text{TRUE}$ .)


[The intuition of why to use the marking in the way stated in italics portion of the block above was made clear to me by the lucid answer here, but I still do not get the cost benefit which we get using markings what shall possibly go wrong if we do not use markings, the answer here talks about the benefit but no mathematics is used for the counter example given]


The entire corresponding portion of the text can be found here for quick reference.

Union-Find using Fibonacci Heaps

I am trying to do a MST algorithm implementation (Finding minimum spanning tree by extracting minimum edges and including every vertex) using Fibonacci Heaps. I want to minimize the time complexity by augmenting data structure if needed.

Approximate Algorithm :

MST(G) 2 T ← {} // set that will store the edges of the MST 3 for i ← 1..n 4 Vi ← {i} 5 Ei ← {(i, j) : j is a vertex and (i, j) is an edge of G} // set of all edges incident with vertex i 6 end for 7 while there is more than one set Vi 8 choose any Vi 9 extract minimum weight edge (u, v) from Ei 10 one of the endpoints u of this edge is in Vi. let Vj be the set that contains the other endpoint v 11 if i != j then 12 T ← T ∪ {(u, v)} 13 combine Vi and Vj into Vi (destroying Vj ) 14 combine Ei and Ej into Ei (destroying Ej ) 15 end if 16 end while 17 return T 18 end MST  

Time complexity Analysis Approach:

The for loop of line 3 − 5 requires O(V) MAKE-HEAP operations and a total of O(E) INSERT operations in line 5. Note that the size of each Ei set can be at most O(V) by definition. The total time is thus O(V + E) ~ O(E) (Because Fibonacci Heaps can support constant insertions for both MakeHeap and Insertions.

Now for Line 7-15 – We can at most extract O(E) edges in line 9 taking a total of (E lg V) time if after every insert we also consolidate. (Debatable) Can we use consolidate a bit more efficiently?

Also, I feel that we are doing a couple of Union operations further. How to optimize it in a way that I try to save maximum time by using some auxiliary data structures if possible.

Data structure implementation of MST (Minimum spanning tree) through Fibonacci heaps

How can a fibonacci heap store the information needed by the algorithm? In order to achieve good efficiency, when would you run the Consolidate routine?

Algorithm :

MST(G) 2 T ← {} // set that will store the edges of the MST 3 for i ← 1..n 4 Vi ← {i} 5 Ei ← {(i, j) : j is a vertex and (i, j) is an edge of G} // set of all edges incident with vertex i 6 end for 7 while there is more than one set Vi 8 choose any Vi 9 extract minimum weight edge (u, v) from Ei 10 one of the endpoints u of this edge is in Vi ; let Vj be the set that contains the other endpoint v 11 if i 6= j then 12 T ← T ∪ {(u, v)} 13 combine Vi and Vj into Vi (destroying Vj ) 14 combine Ei and Ej into Ei (destroying Ej ) 15 end if 16 end while 17 return T 18 end MST 

Would you have to add any additional fields to nodes in the heaps or use any additional data structures?

Does the heap property in the definition of binary heaps apply recursively?

The definition of binary heaps says that it should be a complete binary tree and it should follow the heap property where according to the heap property, the key stored in each node is either greater than or equal to or less than or equal to the keys in the node’s children.

Binary tree

In the above tree, the node with value 70 is greater than its parent 10 breaking the heap property. However, 70 is also greater than 40 and lies in the subtree of 40. Will we say that the heap property is also breaking at 40, even though 40 is greater than its two children 10 and 2?

relationship between binary numbers and binomial heaps

I understand that a binomial heap can be represented as binary numbers according to the degree of each tree but what exactly is the relationship between inserting a new node into the binomial heap and incrementing the binomial number.

In addition to that, I assume that there is a relationship between performing union on two binomial heaps and adding the two binary numbers of the binomial heaps but what exactly is the relationship.

I’m looking for something like a formal / textbook definition or of the relationship (if possible)

How to compute the general term formula for the number of full binary tree heaps that can be formed with distinct elements?

The number of possible heaps that are full binary trees of height $ h$ and can be formed with ($ n = 2^h – 1$ ) distinct elements can be computed by recursion: $ $ a_h = {2^h – 2 \choose 2^{h – 1} – 1} a_{h – 1}^2 $ $ How to compute the general term formula with this recursion formula?

Distinct Binary Heaps

I have $ n$ elements out of $ n-1$ are distinct. The repeated element is either minimum or maximum element. I need to figure out how many distinct max heaps can be made from it.

My analysis : I started with $ n$ distinct elements. Since root is fixed( maximum element) we can choose $ l$ (found using deducting total elements from elements in penultimate level) from remaining $ n-1$ elements and recursively choose for Left Sub-tree and Right Sub-tree.

Recurrence Relation :

$ T(n)={n-1 \choose l} * T(l) * T(r)$

Now for $ n-1$ distinct elements(given), for root we have $ 2$ options i.e. maximum elements and we can recurse as above for left and right sub-tree. But since the repeated element is also there I am not able to figure out exact way to do so.

Eg: $ A=[2,6,6] =>$ There are 2 distinct max heaps $ => [6,2,6] , [6,6,2]$

I am unable to think of the way to find out the number of max heaps in this case. Can someone think of algorithm/recurrence relation to find so ?

Number of possible heaps on $\{1,…,2^h-1\}$


Let $ C_h$ be the number of possible heaps for the set of keys $ \{1,…,2^h-1\}$ . Determine a recurrence relation for $ C_h$ via the substitution method and prove it.

Definition

A binary tree is ordered if the key at every node is smaller than the keys at its children.

An ordered binary tree of height $ h$ is called a heap if the tree induced by all levels except the last is a complete binary tree and all leaves are “as much to the left as possible”.

So what I’ve come up with is this expression:

$ $ C_h=\displaystyle\prod_{n=1}^{h} (2^{i-1})!$ $

The intuition behind it is that with every increment by $ 1$ of $ h$ the tree gains $ 2$ times more leaves than before and the possibilities of ordering these leaves can be determined via the factorial operation. Does that make sense?