How can I reduce the cost of the Bitmap Heap Scan, the Filter, and the Heap Blocks?

It looks like the query spends most of its time in the second-to-last step, where it maps index entries to actual rows. Can I optimize this query in any way? I'm not sure whether something is wrong with the index. Also, what does "Heap Blocks" mean — is it related to work_mem?

Postgres configuration: 16 GB RAM, 4 vCPUs

Table

CREATE TABLE main.user_resources (
    id uuid NOT NULL DEFAULT (md5(((random())::text || (clock_timestamp())::text)))::uuid,
    user_tenant_id uuid NOT NULL,
    resource_name character varying(50) COLLATE pg_catalog."default" NOT NULL,
    resource_value character varying(50) COLLATE pg_catalog."default" NOT NULL,
    allowed boolean NOT NULL,
    created_on timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified_on timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
    created_by uuid,
    modified_by uuid,
    deleted boolean NOT NULL DEFAULT false,
    deleted_on timestamp with time zone,
    deleted_by uuid,
    CONSTRAINT pk_user_resources_id PRIMARY KEY (id)
)

Query:

SELECT "resource_name","resource_value","allowed" FROM "main"."user_resources" WHERE (("user_tenant_id"=abc') AND ((("resource_value" IN ('efg')) OR ("resource_name" IN ('BO','UB')))) AND ("deleted"=false)) ORDER BY "id";  

Index

CREATE INDEX idx_user_resource_user_tenant_id_and_not_deleted
    ON main.user_resources USING btree
    (user_tenant_id ASC NULLS LAST)
    TABLESPACE pg_default
    WHERE deleted = false;

-- Index: idx_user_resources_resource_name
-- DROP INDEX main.idx_user_resources_resource_name;

CREATE INDEX idx_user_resources_resource_name
    ON main.user_resources USING btree
    (resource_name COLLATE pg_catalog."default" ASC NULLS LAST)
    TABLESPACE pg_default;

-- Index: idx_user_resources_resource_value
-- DROP INDEX main.idx_user_resources_resource_value;

CREATE INDEX idx_user_resources_resource_value
    ON main.user_resources USING btree
    (resource_value COLLATE pg_catalog."default" ASC NULLS LAST)
    TABLESPACE pg_default;

-- Index: idx_user_resources_user_tenant_id
-- DROP INDEX main.idx_user_resources_user_tenant_id;

CREATE INDEX idx_user_resources_user_tenant_id
    ON main.user_resources USING btree
    (user_tenant_id ASC NULLS LAST)
    TABLESPACE pg_default;

EXPLAIN ANALYZE

Sort  (cost=3547.20..3556.41 rows=3686 width=64) (actual time=7.080..7.438 rows=4562 loops=1)
  Sort Key: id
  Sort Method: quicksort  Memory: 834kB
  ->  Bitmap Heap Scan on user_resources  (cost=281.12..3328.84 rows=3686 width=64) (actual time=1.070..5.325 rows=4562 loops=1)
        Recheck Cond: (((user_tenant_id = 'abc'::uuid) AND ((resource_value)::text = 'efg'::text)) OR ((resource_name)::text = ANY ('{BO,UB}'::text[])))
        Filter: ((NOT deleted) AND (user_tenant_id = 'abc'::uuid))
        Rows Removed by Filter: 6912
        Heap Blocks: exact=2177
        ->  BitmapOr  (cost=281.12..281.12 rows=11470 width=0) (actual time=0.834..0.835 rows=0 loops=1)
              ->  Bitmap Index Scan on idx_user_resource_mlt  (cost=0.00..4.43 rows=1 width=0) (actual time=0.021..0.021 rows=3 loops=1)
                    Index Cond: ((user_tenant_id = 'abc'::uuid) AND ((resource_value)::text = 'efg'::text))
              ->  Bitmap Index Scan on idx_user_resources_resource_name  (cost=0.00..274.84 rows=11469 width=0) (actual time=0.812..0.812 rows=11735 loops=1)
                    Index Cond: ((resource_name)::text = ANY ('{BO,UB}'::text[]))
Planning Time: 0.214 ms
Execution Time: 7.932 ms
(15 rows)

A graph database suitable for analyzing a heap snapshot?

It looks like recommendation questions aren’t explicitly OT, so here goes:

I’m working on some tooling for analyzing a dump of the heap of a running program. The dump is just a list of nodes with associated metadata and references to other nodes (possibly-cyclical).

I don’t have any experience with graph databases, and I’m wondering whether I would save myself a lot of time by building the tooling around a graph DB. So I’m looking for recommendations, pointers to resources, and advice.

Some specific questions:

  • are there any graph databases that have built-in functionality for computing a dominator tree? (googling this didn’t seem to turn up any results)
  • are there any DBs that have tooling for visualizing a huge graph?

Why does the formula floor((i-1)/2) find the parent node in a binary heap?

I learned that when you have a binary heap represented as a vector / list / array with indices [0, 1, 2, 3, 4, 5, 6, 7, 8, …], the index of the parent of the element at index i can be found with parent index = floor((i-1)/2).

I have tried to explain why it works. Can anyone help me verify this?
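Here is the reasoning I want verified; it is only a sketch, assuming 0-based indexing where the children of the node at index $i$ sit at indices $2i+1$ and $2i+2$, and writing $c$ for the child's index:

$$c = 2i+1 \;\Rightarrow\; \frac{c-1}{2} = i \qquad\text{and}\qquad c = 2i+2 \;\Rightarrow\; \frac{c-1}{2} = i + \tfrac{1}{2},$$

so for a left child the division is exact, and for a right child the floor drops the extra $\tfrac{1}{2}$; in both cases floor((c-1)/2) = i, the parent's index.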

I took the reference from Why does the formula 2n + 1 find the child node in a binary heap?, thanks to @Giulio.

How to draw the heap for an array in Java?

I have an assignment to draw the heap after an ArrayList and a LinkedList are created.

public static void main(String[] args) {
    List list = new ArrayList();
    list.add(0);
    list.add(1);
    list.add(1);
    list.add(2);
}

and

public static void main(String[] args) {
    List list = new LinkedList();
    list.add(0);
    list.add(1);
    list.add(1);
    list.add(2);
}

So far I have drawn a tree for [0, 1, 1, 2] that goes down level by level: 0 at the root, then 1 and 1, then 2. But I don’t know whether that’s correct, or whether the drawing would be different for the ArrayList and the LinkedList.

Why does my query use a Bitmap Heap Scan?

My simple query is not really fast, and I don’t know why it uses a Bitmap Heap Scan.

explain analyze verbose SELECT connections, epoch_time FROM connections WHERE host_name = 'xyz.z' ORDER BY epoch_time;

Output:

                                                           QUERY PLAN
Sort  (cost=3257.93..3316.47 rows=23415 width=13) (actual time=8.607..9.624 rows=23259 loops=1)
  Output: connections, epoch_time
  Sort Key: connections.epoch_time
  Sort Method: quicksort  Memory: 1859kB
  ->  Bitmap Heap Scan on public.connections  (cost=545.88..1558.57 rows=23415 width=13) (actual time=1.341..5.840 rows=23259 loops=1)
        Output: connections, epoch_time
        Recheck Cond: ((connections.host_name)::text = 'xyz.z'::text)
        Heap Blocks: exact=651
        ->  Bitmap Index Scan on idx_host_name  (cost=0.00..540.03 rows=23415 width=0) (actual time=1.266..1.266 rows=23259 loops=1)
              Index Cond: ((connections.host_name)::text = 'xyz.z'::text)
Planning Time: 0.171 ms
Execution Time: 10.405 ms
(12 rows)

The table has 97806 rows and 4 columns (id, host_name, connections, epoch_time), and I have created an index on the field epoch_time.

Thanks for your help!

Best regards

How could a key be inserted in a heap without increasing the size of the array?

MAX-HEAP-INSERT(A, key)
    A.heap-size = A.heap-size + 1
    A[A.heap-size] = -infinity
    HEAP-INCREASE-KEY(A, A.heap-size, key)

How could a key be inserted in a heap without increasing the size of the array? With this code from Introduction to Algorithms, you can’t just increase the heap size at will. Did I miss something? None of the online lectures I have seen talk about this issue, and neither does the book. Or is it that the lowest key in the array would be dropped automatically?
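To make the question concrete, here is a minimal Java sketch of what I think a real array-backed max-heap has to do (my own illustration, not from the book; the class and method names are made up): the backing array’s capacity is kept separate from heap-size, and the array is grown explicitly before heap-size is incremented.

import java.util.Arrays;

public class ArrayMaxHeap {
    private int[] a = new int[8]; // backing array: capacity may exceed heapSize
    private int heapSize = 0;     // number of elements actually in the heap

    public void insert(int key) {
        if (heapSize == a.length) {
            // The pseudocode assumes A is big enough; with a real array we must
            // grow the storage ourselves before increasing heap-size.
            a = Arrays.copyOf(a, 2 * a.length);
        }
        heapSize++;
        a[heapSize - 1] = Integer.MIN_VALUE; // A[A.heap-size] = -infinity
        increaseKey(heapSize - 1, key);      // HEAP-INCREASE-KEY(A, A.heap-size, key)
    }

    private void increaseKey(int i, int key) {
        a[i] = key;
        // Sift the key up while it is larger than its parent.
        while (i > 0 && a[(i - 1) / 2] < a[i]) {
            int tmp = a[i];
            a[i] = a[(i - 1) / 2];
            a[(i - 1) / 2] = tmp;
            i = (i - 1) / 2;
        }
    }
}

So, as far as I can tell, nothing is dropped: either the array is allocated with spare capacity up front, or the implementation reallocates (as above) before incrementing heap-size, while the pseudocode simply assumes A is large enough.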

Intuition behind the entire concept of Fibonacci Heap operations

The following excerpts are from the section on Fibonacci heaps in the text Introduction to Algorithms by Cormen et al.


The potential function for a Fibonacci heap $H$ is defined as follows:

$$\Phi(H)=t(H)+2m(H)$$

where $t(H)$ is the number of trees in the root list of the heap $H$ and $m(H)$ is the number of marked nodes in the heap.

Before diving into the Fibonacci heap operations, the authors try to convince us of the essence of Fibonacci heaps as follows:

The key idea in the mergeable-heap operations on Fibonacci heaps is to delay work as long as possible. There is a performance trade-off among implementations of the various operations. ($\color{green}{\text{I do not get why}}$) If the number of trees in a Fibonacci heap is small, then during an $\text{Extract-Min}$ operation we can quickly determine which of the remaining nodes becomes the new minimum node ($\color{blue}{\text{why?}}$). However, as we saw with binomial heaps, we pay a price for ensuring that the number of trees is small: it can take up to $\Omega(\lg n)$ time to insert a node into a binomial heap or to unite two binomial heaps. As we shall see, we do not attempt to consolidate trees in a Fibonacci heap when we insert a new node or unite two heaps. We save the consolidation for the $\text{Extract-Min}$ operation, which is when we really need to find the new minimum node.


Now the problem I am facing with the text is that the authors dive into proving the amortized costs mathematically using the potential method, without giving a vivid intuition of how or when the "credits" are stored as potential in the heap data structure and when they are actually used up. Moreover, in most places the analysis is "asymptotic" rather than an actual mathematical calculation, so it is not possible to tell whether the constant in the $O(1)$ amortized cost ($\widehat{c_i}$) is greater or less than the constant in the $O(1)$ actual cost ($c_i$) of an operation.


$$
\begin{array}{|c|c|c|c|c|c|c|}
\hline
\text{Sl no.} & \text{Operation} & \widehat{c_i} & c_i & \text{Method of cal. of } \widehat{c_i} & \text{Cal. steps} & \text{Intuition}\\
\hline
1 & \text{Make-Fib-Heap} & O(1) & O(1) & \text{Asymptotic} & \Delta\Phi=0;\ \widehat{c_i}=c_i=O(1) & \text{None}\\
\hline
2 & \text{Fib-Heap-Insert} & O(1) & O(1) & \text{Asymptotic} & \Delta\Phi=1;\ \widehat{c_i}=c_i+\Delta\Phi=O(1)+1=O(1) & \text{None}\\
\hline
3 & \text{Fib-Heap-Min} & O(1) & O(1) & \text{Asymptotic} & \Delta\Phi=0;\ \widehat{c_i}=c_i=O(1) & \text{None}\\
\hline
4 & \text{Fib-Heap-Union} & O(1) & O(1) & \text{Asymptotic} & \Delta\Phi=0;\ \widehat{c_i}=c_i=O(1) & \text{None}\\
\hline
5 & \text{Fib-Extract-Min} & O(D(n)) & O(D(n)+t(n)) & \text{Asymptotic} & \Delta\Phi=D(n)-t(n)+1 & \dagger\\
\hline
6 & \text{Fib-Heap-Decrease-Key} & O(1) & O(c) & \text{Asymptotic} & \Delta\Phi=4-c & \ddagger\\
\hline
\end{array}
$$


$\dagger$ – The cost of performing each link is paid for by the reduction in potential due to the link’s reducing the number of roots by one.

$\ddagger$ – This is why the potential function was defined to include a term that is twice the number of marked nodes: when a marked node $y$ is cut by a cascading cut, its mark bit is cleared, so the potential is reduced by $2$. One unit of potential pays for the cut and the clearing of the mark bit, and the other unit compensates for the unit increase in potential due to node $y$ becoming a root.
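To make my complaint concrete, this is the only explicit calculation I can reconstruct for $\text{Fib-Extract-Min}$ (with $t(H)$ and $m(H)$ measured just before the operation, and at most $D(n)+1$ roots left after consolidation):

$$\widehat{c_i} = c_i + \Delta\Phi \le O\big(D(n)+t(H)\big) + \Big[\big(D(n)+1\big)+2m(H)\Big] - \Big[t(H)+2m(H)\Big] = O\big(D(n)\big),$$

and even here the last step works only because the unit of potential can be scaled up to dominate the constant hidden in $O(t(H))$; that hidden scaling is exactly what the asymptotic notation obscures.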


Moreover, the authors introduce the notion of marking the nodes of Fibonacci heaps, explaining that the marks are used to bound the amortized running time of the $\text{Decrease-Key}$ and $\text{Delete}$ operations, but they give little intuition for why marking is used. What would go wrong if we did not use markings at all, or if we performed $\text{Cascading-Cut}$ only when the number of children lost from a node is not just $2$ but possibly more? The excerpt corresponding to this is as follows:

We use the mark fields to obtain the desired time bounds. They record a little piece of the history of each node. Suppose that the following events have happened to node $x$:

  1. at some time, $x$ was a root,
  2. then $x$ was linked to another node,
  3. then two children of $x$ were removed by cuts.

As soon as the second child has been lost, we cut $x$ from its parent, making it a new root. The field $mark[x]$ is true if steps $1$ and $2$ have occurred and one child of $x$ has been cut. The Cut procedure, therefore, clears $mark[x]$ in line $4$, since it performs step $1$. (We can now see why line $3$ of $\text{Fib-Heap-Link}$ clears $mark[y]$: node $y$ is being linked to another node, and so step $2$ is being performed. The next time a child of $y$ is cut, $mark[y]$ will be set to $\text{TRUE}$.)


Strictly speaking, I do not get the intuition behind the $mark$ field in the above block of text, especially the logic of the part in bold italics.

[EDIT: The intuition for why the marking is used in the stated way was made clear to me by the lucid answer here, but I still do not see the cost benefit that we get from using markings.]
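For instance, the closest I can get to a cost-benefit argument for the marks is the $\text{Decrease-Key}$ calculation itself: if a call triggers $c$ cuts in total, every cascading cut except possibly the last clears a mark while each cut adds one root, so

$$\Delta\Phi \le \big[(t(H)+c) + 2(m(H)-c+2)\big] - \big[t(H)+2m(H)\big] = 4-c, \qquad \widehat{c_i} \le O(c) + 4 - c = O(1),$$

i.e. the $2$ units of potential released per cleared mark are what cancel the $O(c)$ actual cost. But beyond this algebra I still do not see why paying for the marks in the first place is worth it.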


Note: This is quite a difficult question in the sense that it describes the problems I am facing in understanding the intuition behind the concept of Fibonacci heap operations, which corresponds to an entire chapter of the CLRS text. If it demands too much in a single question, then please tell me and I shall split it into parts accordingly. I have made my utmost attempt to make the question clear. If the meaning is unclear in places, please tell me and I shall rectify it. The entire corresponding portion of the text can be found here. (Even the authors say that it is a difficult data structure, having only theoretical importance.)

Method to change the value of a key in a min heap

How would you write a method bool changeKey(int oldKey, int newKey) to change the value of a key in a min heap? The keys are unique; no duplicate keys are permitted. If there is a key in the heap with value oldKey and no existing entry with value newKey, the method changes it to newKey, keeping the heap properties, and returns true. If there is no key in the heap with value oldKey, or an existing entry already has value newKey, then it takes no action and returns false. An auxiliary map is also used to store the locations of the keys in the array. I know how to write methods for insertion, but I haven’t seen any implementations of this.
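A minimal sketch of what I have in mind (my own class and method names, and a HashMap as the auxiliary location map, so treat these details as assumptions rather than a known reference implementation):

import java.util.*;

public class MinHeap {
    private final List<Integer> heap = new ArrayList<>();      // array-backed min heap
    private final Map<Integer, Integer> pos = new HashMap<>(); // key -> index in heap

    public boolean changeKey(int oldKey, int newKey) {
        // Fail if oldKey is absent or newKey already exists (keys must stay unique).
        if (!pos.containsKey(oldKey) || pos.containsKey(newKey)) {
            return false;
        }
        int i = pos.remove(oldKey);
        heap.set(i, newKey);
        pos.put(newKey, i);
        // Restore the heap property: sift up if the key shrank, down if it grew.
        if (newKey < oldKey) {
            siftUp(i);
        } else {
            siftDown(i);
        }
        return true;
    }

    private void siftUp(int i) {
        while (i > 0 && heap.get((i - 1) / 2) > heap.get(i)) {
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }

    private void siftDown(int i) {
        int n = heap.size();
        while (true) {
            int smallest = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < n && heap.get(l) < heap.get(smallest)) smallest = l;
            if (r < n && heap.get(r) < heap.get(smallest)) smallest = r;
            if (smallest == i) break;
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int i, int j) {
        int vi = heap.get(i), vj = heap.get(j);
        heap.set(i, vj);
        heap.set(j, vi);
        pos.put(vj, i); // keep the location map in sync with the array
        pos.put(vi, j);
    }
}

The idea behind the map is that looking up oldKey becomes O(1), and the subsequent sift-up or sift-down restores the heap order in O(log n).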