PostgreSQL: difference between index on materialized view and index on tables used in a non-materialized view

I have tables a(id TEXT, url TEXT) and b(id TEXT, url TEXT) containing the same or a similar set of columns.

Would the query plans created for a simple SELECT * FROM m WHERE id = ? be identical (or at least within the same order of magnitude) for the following two views:

  1. Materialized view combining columns from both tables with an index on the shared id field

CREATE MATERIALIZED VIEW m AS (
  SELECT id, url FROM a
  UNION
  SELECT id, url FROM b
);
CREATE INDEX idx_m_id ON m(id);

  2. View combining the tables, with each table having its own index on the column

CREATE INDEX idx_a_id ON a(id);
CREATE INDEX idx_b_id ON b(id);
CREATE VIEW m AS (
  SELECT id, url FROM a
  UNION
  SELECT id, url FROM b
);

Expected number of loop iterations when searching an array by random index

Let's say we have an array A of size n. It has 1 as its first index and n as its last index. It contains a value x, with x occurring k times in A, where 1<=k<=n.

If we have a search algorithm like so:

while true:
  i := random(1, n)
  if A[i] == x
    break

random(a,b) picks a number uniformly from a to b, inclusive.

From this we know that the chance of finding x and terminating the program is k/n on each iteration. What I would like to know is the expected value of the number of iterations, or more specifically the number of times the array is accessed by this program, given the array A as described above.
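Since each iteration succeeds independently with probability k/n, the iteration count follows a geometric distribution, whose mean is 1/(k/n) = n/k. The sketch below compares that closed form with a simulation of the loop. The function names, the trial count, the seed, and the placement of x in the first k slots are all assumptions made for illustration; the positions of x do not affect the result.

```python
import random
from fractions import Fraction


def expected_iterations(n: int, k: int) -> Fraction:
    """Mean of a geometric distribution with success probability k/n."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return Fraction(n, k)


def simulate(n: int, k: int, trials: int = 100_000, seed: int = 0) -> float:
    """Average number of array accesses over many runs of the random search."""
    rng = random.Random(seed)
    x = 1
    # Hypothetical layout: x fills the first k slots, 0 fills the rest.
    a = [x] * k + [0] * (n - k)
    total = 0
    for _ in range(trials):
        accesses = 0
        while True:
            i = rng.randint(1, n)   # uniform over 1..n inclusive, like random(1, n)
            accesses += 1
            if a[i - 1] == x:       # Python lists are 0-based, the question's array is 1-based
                break
        total += accesses
    return total / trials


if __name__ == "__main__":
    n, k = 10, 2
    print("closed form:", expected_iterations(n, k))
    print("simulated  :", simulate(n, k))
```

Each iteration performs exactly one array access, so the expected number of accesses equals the expected number of iterations.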

How to correctly interpret index usages from EXPLAIN of mysql?

I’ve exported the results of an EXPLAIN run on a query. What I find confusing is that the key column lists one of the indexes from possible_keys (not shown in the picture), yet only the top row explicitly mentions Using index in the Extra column.

  1. What does this mean for the 2nd row? Is it not using the index listed in the key column?
  2. How should I interpret the contents of the key column, and how is that index used?
    a. Should I interpret this as the index being used in the WHERE stage of this query?

How can I index and search a set of mathematical formulas? (using MathWebSearch or similar systems) [closed]

I have a bunch of documents and I need to search for mathematical formulas in those documents. The formulas must be ranked in a way similar to how Google ranks web pages.

Through Google I have found the open source system MathWebSearch, but there is no good documentation of that system.

Is there anybody here who has already worked with MathWebSearch and can help me use it?

Index Inject API Error

I’m not sure if this is a GSA bug or an Index Inject bug. I have one project running in one instance of GSA, and it sent all my verified links over to Index Inject like I told it to. That’s a good thing.

The bad thing is that it did so over and over again, 4800 links each time, until it used up the entire 150K link quota I have over there.

Postgres UPDATE with data from another table – index only scan used for correlated subquery but not join

Context

I’m tuning a bulk UPDATE which selects from another (large) table. My intention is to provide a covering index to support an index only scan of the source table. I realise the source table must be vacuumed to update its visibility map.

My investigations so far suggest the optimiser elects to index only scan the source table when the UPDATE uses a correlated subquery, but appears to use a standard index scan when a join is used (UPDATE...FROM). I’m asking this question to understand why.

I provide a simplified example here to illustrate the differences.

I’m using Postgres 9.6.8, but get very similar plans for 10.11 and 11.6. I have reproduced the plans on a vanilla 9.6 Postgres installation in Docker using the official image, and also on db<>fiddle here.

Setup

CREATE TABLE lookup (
    surrogate_key   BIGINT PRIMARY KEY,
    natural_key     TEXT NOT NULL UNIQUE,
    data            TEXT NOT NULL);

INSERT INTO lookup
SELECT id, 'nk'||id, random()::text
FROM generate_series(1,400000) id;

CREATE UNIQUE INDEX lookup_ix ON lookup(natural_key, surrogate_key);

VACUUM ANALYSE lookup;

CREATE TABLE target (
    target_id               BIGINT PRIMARY KEY,
    lookup_natural_key      TEXT NOT NULL,
    lookup_surrogate_key    BIGINT,
    data                    TEXT NOT NULL
);

INSERT INTO target (target_id, lookup_natural_key, data)
SELECT id+1000, 'nk'||id, random()::text
FROM generate_series(1,1000) id;

ANALYSE target;

UPDATE using join

EXPLAIN (ANALYSE, VERBOSE, BUFFERS)
UPDATE target
SET lookup_surrogate_key = surrogate_key
FROM lookup
WHERE lookup_natural_key = natural_key;

Standard index scan on lookup_ix – so heap blocks are read from lookup table:

Update on public.target  (cost=0.42..7109.00 rows=1000 width=54) (actual time=76.688..76.688 rows=0 loops=1)
  Buffers: shared hit=8514 read=550 dirtied=16
  ->  Nested Loop  (cost=0.42..7109.00 rows=1000 width=54) (actual time=0.050..62.493 rows=1000 loops=1)
        Output: target.target_id, target.lookup_natural_key, lookup.surrogate_key, target.data, target.ctid, lookup.ctid
        Buffers: shared hit=3479 read=535
        ->  Seq Scan on public.target  (cost=0.00..19.00 rows=1000 width=40) (actual time=0.013..7.691 rows=1000 loops=1)
              Output: target.target_id, target.lookup_natural_key, target.data, target.ctid
              Buffers: shared hit=9
        ->  Index Scan using lookup_ix on public.lookup  (cost=0.42..7.08 rows=1 width=22) (actual time=0.020..0.027 rows=1 loops=1000)
              Output: lookup.surrogate_key, lookup.ctid, lookup.natural_key
              Index Cond: (lookup.natural_key = target.lookup_natural_key)
              Buffers: shared hit=3470 read=535
Planning time: 0.431 ms
Execution time: 76.826 ms

UPDATE using correlated subquery

EXPLAIN (ANALYSE, VERBOSE, BUFFERS)
UPDATE target
SET lookup_surrogate_key = (
    SELECT surrogate_key
    FROM lookup
    WHERE lookup_natural_key = natural_key);

Index only scan on lookup_ix as intended:

Update on public.target  (cost=0.00..4459.00 rows=1000 width=47) (actual time=52.947..52.947 rows=0 loops=1)
  Buffers: shared hit=8050 read=15 dirtied=16
  ->  Seq Scan on public.target  (cost=0.00..4459.00 rows=1000 width=47) (actual time=0.052..40.306 rows=1000 loops=1)
        Output: target.target_id, target.lookup_natural_key, (SubPlan 1), target.data, target.ctid
        Buffers: shared hit=3015
        SubPlan 1
          ->  Index Only Scan using lookup_ix on public.lookup  (cost=0.42..4.44 rows=1 width=8) (actual time=0.013..0.019 rows=1 loops=1000)
                Output: lookup.surrogate_key
                Index Cond: (lookup.natural_key = target.lookup_natural_key)
                Heap Fetches: 0
                Buffers: shared hit=3006
Planning time: 0.130 ms
Execution time: 52.987 ms

db<>fiddle here

I understand that the queries are not logically identical (they behave differently when there are no rows or multiple rows in lookup for a given natural_key), but I’m surprised by the different usage of lookup_ix.

Can anyone explain why the join version could not use an index only scan, please?

How can I check index fragmentation in the quickest way possible?

Checking index fragmentation in my database seems unreasonably slow. Regardless of whether I use the DMV sys.dm_db_index_physical_stats (for a specific database, table, or even index) or the SSMS Index Properties window to look at Fragmentation on a specific index, it takes a really long time.

For example, the Index Properties window takes upwards of 5 minutes to open for a single index on my largest (~20 billion rows) table.

I do want to push to implement partitioning but until then I have to support an existing index maintenance job and I’m not sure how we can even check index fragmentation when one index on our heaviest table takes about 5 minutes to analyze. (Each of our tables has at least a few indexes.)

Here’s a case where it took so long in the Index Properties window that I think it timed out and returned nothing in the window.

What happens if I remove a page from the Google index after the new Search Console update?

According to the new update in Search Console, a URL can be temporarily removed for 6 months or less, such as 3 months. It can also be removed permanently from the site.

I have pages that are already indexed and ranked. The pages also get a number of visitors. But I no longer want those pages on the site and want to delete them.

After I delete them, will they also be removed from Google? The pages were indexed approximately 2 years ago…

Finding the parent index in an interval heap (stored in an array) given a child index

An interval heap is a binary tree stored in an array where the size of each node is 2.

I would like to be able to find the index of a parent, and the indices of one of the children, given the index of a node.

An example of an interval heap's indices would be:

            [0,1]
      [2,3]       [4,5]
[6,7][8,9][10,11][12,13]

or

             [1,2]
      [3,4]        [5,6]
[7,8][9,10][11,12][13,14]
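One way to approach this is to convert an element index to a node number first (two elements per node), apply the usual binary-heap parent and child formulas to the node number, and then convert back. The sketch below does this for both layouts shown above. The function names and the `base` parameter (0 for the first layout, 1 for the second) are assumptions introduced for illustration.

```python
from typing import Optional, Tuple

Pair = Tuple[int, int]


def parent_indices(i: int, base: int = 0) -> Optional[Pair]:
    """Array indices (low, high) of the parent of the node that holds index i.

    Returns None when index i belongs to the root node.
    """
    node = (i - base) // 2          # node number, root is node 0
    if node == 0:
        return None
    parent = (node - 1) // 2        # ordinary binary-heap parent formula
    return (2 * parent + base, 2 * parent + 1 + base)


def child_indices(i: int, base: int = 0) -> Tuple[Pair, Pair]:
    """Array indices of the left and right child nodes of the node holding index i.

    The caller still has to check these against the current heap size.
    """
    node = (i - base) // 2
    left, right = 2 * node + 1, 2 * node + 2
    return (
        (2 * left + base, 2 * left + 1 + base),
        (2 * right + base, 2 * right + 1 + base),
    )


if __name__ == "__main__":
    print(parent_indices(9))            # node [8,9] in the 0-based layout
    print(parent_indices(14, base=1))   # node [13,14] in the 1-based layout
    print(child_indices(2))             # children of node [2,3]
```

Working in node numbers keeps the formulas identical to a plain binary heap; only the conversion to and from element indices depends on the two-element node size and on the index base.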

How to calculate the cache index and the tag field?

“For a memory organization with 2 GB of main memory, a 32K cache, and a direct-mapped cache organized in 8-byte blocks:

Suppose you have an access to memory address 0x0005B432. What is the cache index? What is the tag field?”

Please help me; I don’t know how I am supposed to solve this exercise. I also need to find the cache index and the tag field of the following memory addresses: 0x0005B436, 0x0005B438, 0x0003B437.
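A possible way to work through it, assuming byte-addressable memory, 2 GB = 2^31 bytes (31-bit addresses), and "32K" meaning 32 KiB of cache data: 8-byte blocks give 3 offset bits, 32 KiB / 8 B = 4096 lines give 12 index bits, and the remaining 31 - 12 - 3 = 16 bits form the tag. The sketch below applies that split; the function name and parameter names are hypothetical.

```python
def split_address(addr: int,
                  memory_bytes: int = 2**31,
                  cache_bytes: int = 32 * 1024,
                  block_bytes: int = 8):
    """Split a byte address into (tag, index, offset) for a direct-mapped cache.

    All sizes are assumed to be powers of two.
    """
    offset_bits = block_bytes.bit_length() - 1                    # log2(8) = 3
    index_bits = (cache_bytes // block_bytes).bit_length() - 1    # log2(4096) = 12
    address_bits = memory_bytes.bit_length() - 1                  # log2(2**31) = 31
    tag_bits = address_bits - index_bits - offset_bits            # 31 - 12 - 3 = 16

    offset = addr & (block_bytes - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = (addr >> (offset_bits + index_bits)) & ((1 << tag_bits) - 1)
    return tag, index, offset


if __name__ == "__main__":
    for a in (0x0005B432, 0x0005B436, 0x0005B438, 0x0003B437):
        tag, index, offset = split_address(a)
        print(f"{a:#010x}: tag={tag:#x} index={index:#x} offset={offset}")
```

Under these assumptions, 0x0005B432 splits into tag 0xB, index 0x686, and byte offset 2. If the exercise intends a different address width or a different meaning of "32K", only the default arguments need to change.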