Understanding Postgres query planner behaviour with a GIN index

I need your expert opinion on index usage and query planner behaviour.

\d orders

                  Partitioned table "public.orders"
     Column      |           Type           | Collation | Nullable | Default
-----------------+--------------------------+-----------+----------+---------
 oid             | character varying        |           | not null |
 user_id         | character varying        |           | not null |
 tags            | text[]                   |           | not null |
 category        | character varying        |           |          |
 description     | character varying        |           |          |
 order_timestamp | timestamp with time zone |           | not null |
 ...
Partition key: RANGE (order_timestamp)
Indexes:
    "orders_uid_country_ot_idx" btree (user_id, country, order_timestamp)
    "orders_uid_country_cat_ot_idx" btree (user_id, country, category, order_timestamp DESC)
    "orders_uid_country_tag_gin_idx" gin (user_id, country, tags) WITH (fastupdate=off)
    "orders_uid_oid_ot_key" UNIQUE CONSTRAINT, btree (user_id, oid, order_timestamp)

I have observed the following behaviour, depending on the query parameters, when I run this query:

select *
from orders
where user_id = 'u1'
  and country = 'c1'
  and tags && '{t1}'
  and order_timestamp >= '2021-01-01 00:00:00+00'
  and order_timestamp < '2021-03-25 05:45:47+00'
order by order_timestamp desc
limit 10 offset 0

Case 1: when filtering on tag t1, which occurs in 99% of the records for user u1, the first index, orders_uid_country_ot_idx, is picked:

Limit  (cost=0.70..88.97 rows=21 width=712) (actual time=1.967..12.608 rows=21 loops=1)
  ->  Index Scan Backward using orders_y2021_jan_to_uid_country_ot_idx on orders_y2021_jan_to_jun orders  (cost=0.70..1232.35 rows=293 width=712) (actual time=1.966..12.604 rows=21 loops=1)
        Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
        Filter: (tags && '{t1}'::text[])
Planning Time: 0.194 ms
Execution Time: 12.628 ms

Case 2: but when I query for tag t2 (tags && '{t2}'), which is present in fewer than 3% of a user's records, the GIN index is picked:

Limit  (cost=108.36..108.38 rows=7 width=712) (actual time=37.822..37.824 rows=0 loops=1)
  ->  Sort  (cost=108.36..108.38 rows=7 width=712) (actual time=37.820..37.821 rows=0 loops=1)
        Sort Key: orders.order_timestamp DESC
        Sort Method: quicksort  Memory: 25kB
        ->  Bitmap Heap Scan on orders_y2021_jan_to_jun orders  (cost=76.10..108.26 rows=7 width=712) (actual time=37.815..37.816 rows=0 loops=1)
              Recheck Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'ID'::text) AND (tags && '{t2}'::text[]))
              Filter: ((order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
              ->  Bitmap Index Scan on orders_y2021_jan_to_uid_country_tag_gin_idx  (cost=0.00..76.10 rows=8 width=0) (actual time=37.812..37.812 rows=0 loops=1)
                    Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (tags && '{t2}'::text[]))
Planning Time: 0.190 ms
Execution Time: 37.935 ms
  1. Is this because the query planner identifies that 99% of the records match in case 1, so it skips the GIN index and uses the first index directly? If so, does Postgres identify this from the statistics?

  2. Before the GIN index was created, the first index was picked for case 2 as well, and performance was very bad, since the index access range was large: the number of records satisfying the user_id, country and timestamp conditions is very high. The GIN index improved it, but I'm curious to understand how Postgres chooses it selectively.

  3. orders_uid_country_cat_ot_idx was added to support filtering by category, because when the GIN index was used for filters on category alone, or on both category and tags, performance was bad compared to when the btree index on (user_id, country, category, order_timestamp) is picked. I expected the GIN index to work well for every combination of category and tags filters. What could be the reason? The table contains millions of rows.
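
Regarding question 1, I assume the relevant statistics are the per-element ones ANALYZE collects for array columns; a query like this (a sketch, using the standard pg_stats view columns that feed the && selectivity estimate) should show what the planner knows about tags:

```sql
-- Per-element statistics for the array column; most_common_elems and
-- most_common_elem_freqs are what the planner consults for tags && '{...}'.
SELECT most_common_elems, most_common_elem_freqs
FROM pg_stats
WHERE tablename = 'orders'
  AND attname = 'tags';
```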

Database connection lost whenever SQL query string is too long

I recently switched from running my Rails app on a single VM to running the database (MariaDB 10.3) on a separate (Debian Buster) VM. Now that the database is on a separate server, Rails immediately throws Mysql2::Error::ConnectionError: MySQL server has gone away whenever it tries to make a query where the SQL itself is very long. (The query itself isn't necessarily one that would put significant load on the system.)

An example query that causes the problem looks like this:

SELECT `articles`.`id` FROM `articles`
WHERE `articles`.`namespace` = 0
  AND `articles`.`wiki_id` = 1
  AND `articles`.`title` IN ('Abortion', 'American_Civil_Liberties_Union', 'Auschwitz_concentration_camp', 'Agent_Orange', 'Ahimsa')

… except the array of titles is about 5000 items long, and the full query string is ~158kB.

On the database side, this corresponds to warnings like this:

2021-03-25 15:47:13 10 [Warning] Aborted connection 10 to db: 'dashboard' user: 'outreachdashboard' host: 'programs-and-events-dashboard.globaleducation.eqiad1.wikimed' (Got an error reading communication packets)

The problem seems to be with the network layer, but I can't get to the bottom of it. I've tried adjusting many MariaDB config variables (max_allowed_packet, innodb_log_buffer_size, innodb_log_file_size, innodb_buffer_pool_size), but none of them made a difference. As far as I can tell, the connection is aborted while the long SQL query string is being transmitted from the app server to the database server. (There's no corresponding problem with receiving large query results from the database.)

I’ve tried adjusting several timeout-related settings as well, although that seems unlikely to be the problem because I can replicate the connection error without any significant wait, just by issuing one of the long-SQL-string queries from a Rails console.

I’ve tried using tcpdump to see what’s coming in, but didn’t pick up any additional clues from that.
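
For context, the obvious mitigation would be to batch the IN() list so each statement stays small; a minimal Ruby sketch (not from my app; the run_query stub stands in for the real per-batch call, e.g. Article.where(namespace: 0, wiki_id: 1, title: batch).pluck(:id)):

```ruby
# Stub: deterministic stand-in for the database round trip, so the
# slicing logic is self-contained and runnable.
def run_query(batch)
  batch.map(&:length)
end

def ids_for_titles(titles, batch_size: 500)
  # Each slice produces one short SQL string instead of one ~158 kB statement.
  titles.each_slice(batch_size).flat_map { |batch| run_query(batch) }
end
```

That said, I'd rather understand why the connection drops than work around it.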

Prevent Duplicate Posts Being Counted by the Query

<?php
$do_not_duplicate = array(); // must be initialized before the loop
while ( have_posts() ) : the_post();
    $get_series_id = get_post_meta( get_the_ID(), 'series_seri', true );
    if ( in_array( $get_series_id, $do_not_duplicate ) ) {
        continue; // We've already seen this series ID, so skip the rest of the loop
    }
    $do_not_duplicate[] = $get_series_id;
?>
<?php the_title(); ?>
<?php endwhile; wp_reset_postdata(); ?>

How can I make duplicate posts not count toward the limit? I want to show 20 posts, but the front page only shows 5, because duplicate posts are counted too.
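
One direction I can think of (a sketch, untested against my setup): over-fetch and stop after 20 unique series, since posts_per_page counts posts before deduplication. The 100 below is an assumption about how sparse the duplicates are:

```php
<?php
// Hypothetical sketch: fetch more posts than needed, then stop once 20
// unique series have been shown.
$q    = new WP_Query( array( 'posts_per_page' => 100 ) );
$seen = array();
while ( $q->have_posts() && count( $seen ) < 20 ) {
    $q->the_post();
    $series = get_post_meta( get_the_ID(), 'series_seri', true );
    if ( in_array( $series, $seen, true ) ) {
        continue; // duplicate series, skip it
    }
    $seen[] = $series;
    the_title();
}
wp_reset_postdata();
?>
```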

Repairing buggy query server-side – without changes in the app

Several external apps access our ERP software's database. After we added some columns to our schema, some queries from an external app became buggy: they don't qualify column names with the target table, so they fail with an "Ambiguous column name …" message.

The app vendor is unreachable. We cannot fix the query on the app side, nor change our schema (it's shared across multiple customers).

Can we do something at the database server level?
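
In case it helps frame answers: the pattern I'm wondering about is hiding the new columns behind a view with the table's old name, so the app's unqualified column names stay unambiguous. A sketch with made-up names (assuming SQL Server, going by the "Ambiguous column name" wording):

```sql
-- Hypothetical names throughout: the app keeps querying "Items", but now
-- hits a view that exposes only the pre-change columns.
EXEC sp_rename 'dbo.Items', 'Items_base';
GO
CREATE VIEW dbo.Items AS
SELECT ItemId, Name, Price   -- original columns only; the new columns stay hidden
FROM dbo.Items_base;
```

I don't know whether this survives the app's INSERTs/UPDATEs, so other server-side options are welcome.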

Taxonomy list inside the query

Is it possible to get a list of the terms of a taxonomy (or just the taxonomy types actually used) within an archive, without looping over all pages of the archive query?

What I mean is this:

Custom Taxonomy Types and their terms:

colors (red, green, orange, yellow, pink)
sizes (xs, sm, md, lg, xl)

An archive of a post_type:

posts found: 100
per_page: 10

In this case, the posts in the archive may be matched to any term of any taxonomy type, or they may not contain any.

When we want to use these terms to filter, it is necessary to remove the terms that do not exist in the current archive query (100 posts).

To be more precise, considering that there are 1000s of posts and 1000s of terms, it will take up unnecessary space on the screen (negative user experience) and in fact no posts will appear when filtered.

The goal is to get, along with the query itself, a list of the terms (or just the taxonomy types) present in the query's results.

Is there a configurable setting for this? Or should a new SQL query be written? Please help me.
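
If a new SQL query is the way to go, I imagine something along these lines against the standard WordPress term tables (wp_ table prefix assumed; the IN list is whatever post IDs the archive query matched):

```sql
-- Distinct terms actually attached to the matched posts, per taxonomy.
SELECT DISTINCT t.term_id, t.name, tt.taxonomy
FROM wp_term_relationships tr
JOIN wp_term_taxonomy tt ON tt.term_taxonomy_id = tr.term_taxonomy_id
JOIN wp_terms t          ON t.term_id = tt.term_id
WHERE tr.object_id IN (/* the 100 post IDs matched by the archive query */)
  AND tt.taxonomy IN ('colors', 'sizes');
```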

Spatial Query very slow with radius

I posted this before at Query very slow when using spatial with radius, which has a lot of details about my problem, but I think I didn't include enough data, so I am trying again here with the database and the query I am having trouble tuning.

Download the database and attach it [SQL Server 2019] (I promise no viruses, it's just a .bak file in a zip; I've also scrubbed it of info I don't want out there 🙂): https://1drv.ms/u/s!AuxopE3ug8yWm-QTvSxyiHGIrAlXow?e=R7m20G

I shrank the database so it's smaller to download, so you must run the following script to rebuild all indexes:

EXECUTE sp_msForEachTable 'SET QUOTED_IDENTIFIER ON; ALTER INDEX ALL ON ? REBUILD;' 

Run query (Non-Indexed View)

DECLARE @p3 sys.geography
SET @p3 = convert(sys.geography, 0xE6100000010C010000209DE44540FFFFFF3F77CA53C0)

SELECT l.*
FROM GridListings l
WHERE l.Location.STDistance(@p3) < 15000
ORDER BY createddate
OFFSET 0 ROWS FETCH NEXT 21 ROWS ONLY

Or Indexed View

DECLARE @p3 sys.geography
SET @p3 = convert(sys.geography, 0xE6100000010C010000209DE44540FFFFFF3F77CA53C0)

SELECT l.*
FROM GridListingsIndexed l
WHERE l.Location.STDistance(@p3) < 15000
ORDER BY createddate
OFFSET 0 ROWS FETCH NEXT 21 ROWS ONLY

What I am looking for (sorry if it is too much, but I am really desperate for help, as a lot of my queries are timing out in my app; some take between 6 and 50 seconds on a server with 32 GB RAM and 6 vCores (Hyper-V). The server also does other things, but I think there is enough horsepower):

  1. I use the view above, which already filters out expired listings, and then filter listings down further against it; but right now it's slow with the expirydate set in the view and the radius applied against the view.

  2. Look through my indexes and propose better ones, plus overall improvement suggestions.

  3. If all fails, I might have to resort to separating my expired and non-expired listings into separate tables, but that becomes a maintenance nightmare.
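
One assumption worth checking in my setup: as I understand it, a geography predicate can only use a spatial index defined on the underlying table's column, so the starting point would be something like this (the base table name Listings is a placeholder; adjust to the actual schema in the .bak):

```sql
-- Hypothetical: "Listings" stands in for the base table behind the
-- GridListings view.
CREATE SPATIAL INDEX SIX_Listings_Location
ON dbo.Listings (Location)
USING GEOGRAPHY_AUTO_GRID;
```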

Access Subdatasheet (plus sign) not showing in datasheet view form but shows in table or query

I want to show, inside a form in datasheet view, the plus signs for viewing the subdatasheet. My tables are properly linked, and I have the subdatasheet set up in the table/query; it displays correctly, with all information well linked.

I have a form based on that table/query, but when I open the form in datasheet view so that I can see the subdatasheet, it won't show up. The plus signs are not there.

I just want a form for looking up data, not for editing; the subdatasheet view serves what I want to achieve perfectly, but I don't want to open tables or queries directly, only through a form.

Error while running a query in PostgreSQL that works in SQL Server 2016

The query below runs well in SQL Server 2016:

select ResellerId, vCompanyName, x.*
from wlt_tblReseller AS main
outer apply (
    select (count(ipkReportTypeId)) AS "count", vReportTypeName
    from wlt_tblReseller _all
    inner join wlt_tblClient clients on clients.ifkParentResellerId = _all.ResellerId
    inner join wlt_tblReport_CompanyMaster reporslogs on reporslogs.ifkCompanyId = clients.ClientId
    inner join wlt_tblReports_TypeMaster rpt_types on rpt_types.ipkReportTypeId = reporslogs.ifkReportTypeId
    where RootResellerId = main.ResellerId
      and rpt_types.bStatus = 1
      and bIsStatic = 1
      and vReportTypeName is not null
    group by ipkReportTypeId, vReportTypeName
) AS x
WHERE IsMiniReseller = 0
  and ResellerId <> 1
  and vReportTypeName is not null
order by vCompanyName desc

But when I take it to PostgreSQL and change outer apply to LEFT JOIN LATERAL, it does not run and produces the following error:

ERROR: syntax error at or near "WHERE"
LINE 9: ) AS x WHERE IsMiniReseller = 0 and ResellerId <> 1 and v…
SQL state: 42601
Character: 649

What could I be missing? Any help will be much appreciated.

Regards Chris.

How to get the max value of a filtered post query

I have a post query that filters the posts in my CPT, driven on the front end by a form filter.

The problem is that with different filters, I get a different max value.

For example I have the following custom field meta value:

  • filter #1 – Ford, Nissan, Toyota
  • filter #2 – Blue, Red, White
  • and so on with other filters #3, #4, etc.

Choosing different filters (for example 'Ford and Red' vs 'Nissan and White'), I may get a different max price among the cars in the same post query that share those characteristics.

How can I get the max value for the current filter's post query, and use this value on each line of that same query to show how each price differs from the max?
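
To illustrate what I'm trying to achieve, a rough sketch (assuming the price is in a 'price' meta field and $the_query is the filtered WP_Query; note that $the_query->posts only holds the current page, so a global max would need a second, unpaginated query):

```php
<?php
// Hypothetical sketch: 'price' is an assumed meta key; $the_query is the
// filtered WP_Query. The max is taken over the posts on the current page.
$prices = array_map( function ( $post ) {
    return (float) get_post_meta( $post->ID, 'price', true );
}, $the_query->posts );
$max_price = empty( $prices ) ? 0 : max( $prices );

foreach ( $the_query->posts as $post ) {
    $price = (float) get_post_meta( $post->ID, 'price', true );
    echo $max_price - $price; // difference from the max, shown on each line
}
?>
```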

Can someone help me? Thanks in advance.