Understanding PostgreSQL query planner behaviour on a GIN index

I need your expert opinion on index usage and query planner behaviour.

    \d orders
                                             Partitioned table "public.orders"
               Column            |           Type           | Collation | Nullable |                   Default
    -----------------------------+--------------------------+-----------+----------+----------------------------------------------
     oid                         | character varying        |           | not null |
     user_id                     | character varying        |           | not null |
     tags                        | text[]                   |           | not null |
     category                    | character varying        |           |          |
     description                 | character varying        |           |          |
     order_timestamp             | timestamp with time zone |           | not null |
     .....
    Partition key: RANGE (order_timestamp)
    Indexes:
        "orders_uid_country_ot_idx" btree (user_id, country, order_timestamp)
        "orders_uid_country_cat_ot_idx" btree (user_id, country, category, order_timestamp desc)
        "orders_uid_country_tag_gin_idx" gin (user_id, country, tags) WITH (fastupdate=off)
        "orders_uid_oid_ot_key" UNIQUE CONSTRAINT, btree (user_id, oid, order_timestamp)

I have observed the following behaviour, which depends on the query parameters, when I run this query:

    select * from orders
    where user_id = 'u1'
      and country = 'c1'
      and tags && '{t1}'
      and order_timestamp >= '2021-01-01 00:00:00+00'
      and order_timestamp < '2021-03-25 05:45:47+00'
    order by order_timestamp desc
    limit 10 offset 0

Case 1: when the tag t1 is present in 99% of the records for user u1, the first index, orders_uid_country_ot_idx, is picked.

    Limit  (cost=0.70..88.97 rows=21 width=712) (actual time=1.967..12.608 rows=21 loops=1)
      ->  Index Scan Backward using orders_y2021_jan_to_uid_country_ot_idx on orders_y2021_jan_to_jun orders  (cost=0.70..1232.35 rows=293 width=712) (actual time=1.966..12.604 rows=21 loops=1)
            Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
            Filter: (tags && '{t1}'::text[])
    Planning Time: 0.194 ms
    Execution Time: 12.628 ms

Case 2: when I query for a tag value t2, with something like tags && '{t2}', that is present in 0% to less than 3% of a user’s records, the GIN index is picked.

    Limit  (cost=108.36..108.38 rows=7 width=712) (actual time=37.822..37.824 rows=0 loops=1)
      ->  Sort  (cost=108.36..108.38 rows=7 width=712) (actual time=37.820..37.821 rows=0 loops=1)
            Sort Key: orders.order_timestamp DESC
            Sort Method: quicksort  Memory: 25kB
            ->  Bitmap Heap Scan on orders_y2021_jan_to_jun orders  (cost=76.10..108.26 rows=7 width=712) (actual time=37.815..37.816 rows=0 loops=1)
                  Recheck Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'ID'::text) AND (tags && '{t2}'::text[]))
                  Filter: ((order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
                  ->  Bitmap Index Scan on orders_y2021_jan_to_uid_country_tag_gin_idx  (cost=0.00..76.10 rows=8 width=0) (actual time=37.812..37.812 rows=0 loops=1)
                        Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (tags && '{t2}'::text[]))
    Planning Time: 0.190 ms
    Execution Time: 37.935 ms

  1. Is this because the query planner recognises that 99% of the records match in case 1, so it skips the GIN index and goes straight to the first index? If so, does Postgres determine this from the table statistics?

  2. Before the GIN index was created, the first index was picked for case 2, and performance was very bad because the index range scanned was large, i.e. the number of records that satisfy the conditions on user id, country and the time column is very high. The GIN index improved this, but I’m curious to understand how Postgres chooses it selectively.

  3. orders_uid_country_cat_ot_idx was added to support filtering by category, because when the GIN index was used for filtering by category alone or by both category and tags, performance was bad compared to when the btree index on (user_id, country, category, order_timestamp) is picked. I expected the GIN index to work well for every combination of category and tags filters. What could be the reason? The table contains millions of rows.
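
One rough way to reason about question 1 is to estimate how many index entries an ordered btree scan must read before the LIMIT is satisfied. The sketch below is not PostgreSQL's actual cost model. It assumes matching rows are spread uniformly through the scanned range, and the function name and the 100,000-row range size are made up for illustration.

```python
import math

def rows_read_for_limit(rows_in_range: int, tag_fraction: float, limit: int) -> int:
    """Estimate index entries an ordered scan reads before `limit` rows pass the tag filter.

    Assumes matches are uniformly spread through the range (a simplification).
    """
    if tag_fraction <= 0:
        # No row ever passes the filter, so the scan walks the whole range.
        return rows_in_range
    return min(rows_in_range, math.ceil(limit / tag_fraction))

# Hypothetical range size; the fractions mirror case 1 (99%) and case 2 (0% to 3%).
for fraction in (0.99, 0.03, 0.0):
    print(fraction, rows_read_for_limit(100_000, fraction, limit=10))
```

Under this simplified model the ordered scan stops almost immediately when the tag is common, while a rare or absent tag pushes it through much or all of the range. That is the situation in which a bitmap scan on the GIN index followed by a sort can look cheaper to a planner working from column statistics.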

Understanding how Google AdWords headlines / descriptions work

It says I need five headlines. If I try to add fewer, I get a "Too few elements in the collection" error. Does Google rotate between the headlines? What if I’d rather it just use one headline for all ads? I tried to copy / paste the same headline 5x and got an error about duplicates existing.

I looked at another existing campaign on my website, and that campaign appears to have 3x duplicate headlines (not 5x). Is that campaign just kind of being grandfathered in?

And what if I wanted certain headlines to go with certain descriptions? From the UI there doesn’t appear to be a way to link the two.

Finally, does the order of the headlines / descriptions matter?

Any ideas?

Understanding Connection To Proxy Ratio

I am watching the looplines video on safely scraping Google in 2020, and I fundamentally misunderstand the terminology.

It says that there should be one connection for every 5 proxies, or that the connection ratio can vary.

How can more than 1 IP address make a single connection? When a page loads does it not load from a single IP?

How would it be possible for 50 IP addresses to load one connection?

What does “connection” mean in this case?

D&D 3.5e understanding attacks of opportunity

I am currently helping to DM a game, mainly by keeping track of the rules. In our last session the DM and I both ran into an attack of opportunity (AoO) for the first time. PC(1) was engaged in combat with an enemy, and PC(2), who was standing next to him, moved away from that fight. Does that provoke an attack of opportunity from the enemy? Also, I understand an AoO to be a free attack, so does that mean it doesn’t have to roll to hit? One last question about the same situation: if an enemy is engaged with PC(1) and PC(2) attacks the enemy, does the enemy shift its target to PC(2), and does PC(2) take any penalties for attacking the enemy in melee, like a ranged attack would?

Thank you for any feedback.

Understanding the mod_expires Apache module

Working with htaccess is a new chapter for me. I have already read about mod_expires here https://httpd.apache.org/docs/current/mod/mod_expires.htm, but I’m still confused.

I have the following code from Stack Overflow:

    ExpiresActive On
    ExpiresByType image/jpg "access 1 year"
    ExpiresByType image/jpeg "access 1 year"
    ExpiresByType image/gif "access 1 year"
    ExpiresByType image/png "access 1 year"
    ExpiresByType text/css "access 1 month"
    ExpiresByType text/html "access 1 month"
    ExpiresByType application/pdf "access 1 month"
    ExpiresByType text/x-javascript "access 1 month"
    ExpiresByType image/x-icon "access 1 year"
    ExpiresDefault "access plus 2 months"

My questions are:

  1. I suppose the line ExpiresByType image/jpg "access 1 year" tells the client (browser) to download the file and keep it for one year. When the client visits the same page on my website again, the jpg image won’t be downloaded from my website; it will be read from the client’s computer (where the browser saved it the first time). After one year, the browser automatically deletes the file and, the next time the client accesses my website, downloads it again for another year. If the client clears his browser history after 2 months, the browser will download the file again even though a year hasn’t passed. Am I missing something?

  2. What cache lifetime should I set for each file type? Does it depend on the file? What should I consider when setting the expiration time?

  3. What’s the difference between "access 1 year" and "access plus 1 year"?

  4. Does the ExpiresDefault rule apply to PHP files too? I mean, if I have a contact.php file with some content and I modify it, will the user keep seeing the old content because of the ExpiresDefault rule?

  5. Do search engine spiders respect these cache rules, or do they download the files each time they crawl my website?

  6. Is it true that ExpiresDefault applies to the files I don’t cover with ExpiresByType? If the answer is yes, what are the other types? What types of files are included here?

  7. Do I understand correctly that Header set Cache-Control "max-age=290304000, public" sets the maximum time a file is allowed to be cached? If the answer is yes, does that mean "access 999 years" will have no effect because 290304000 is the limit? True or false?
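
For question 7, it can help to put both values in the same unit before comparing them. The sketch below is plain arithmetic, not Apache code. It assumes a year is counted as 365 days, and the helper name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def seconds_to_days(seconds: int) -> float:
    """Convert a max-age value in seconds to days."""
    return seconds / SECONDS_PER_DAY

max_age = 290_304_000  # value from the Cache-Control header in question 7
days = seconds_to_days(max_age)
print(days)        # 3360.0
print(days / 365)  # a little over 9 years, assuming 365-day years

# "access 1 year" expressed in seconds under the same 365-day assumption
one_year = 365 * SECONDS_PER_DAY
print(one_year)    # 31536000
```

So the max-age in the question corresponds to 3360 days, which makes it easier to compare against the "1 year" and "1 month" values used in the ExpiresByType lines.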

Understanding the mechanics of a satyr’s mirthful leap

I apologize if this question has been asked in this form or another, but I am still having an issue calculating how to utilize a satyr’s mirthful leap, and how it would affect the character moving forward, focusing on the math and stats. For this question, I’ll use the DndBeyond stat block’s 12 Strength as a fair basis and perfect rolls for maximum distance.

The satyr’s mirthful leap is stated as follows:

Whenever you make a long or high jump, you can roll a d8 and add the number rolled to the number of feet you cover, even when making a standing jump. This extra distance costs movement as normal.

while the calculation for long jump states:

When you make a long jump, you cover a number of feet up to your Strength score if you move at least 10 feet on foot immediately before the jump. When you make a standing long jump, you can leap only half that distance. Either way, each foot you clear on the jump costs a foot of movement.

So, adding everything together as I understand it, a satyr can either clear a 20ft chasm after a 10ft running start (12ft for Strength plus 8 for the ability), leaving 5ft of movement to spare in case something goes wrong, OR clear a 14ft chasm ([12/2]+8) without a running start. This means the satyr can use nearly all of his movement in one long jump, correct?

If that is the case, then:

  1. If your satyr had to cross terrain in 1 round by leaping from platform to platform (say, to cross a river, or to avoid the many pit traps in a sealed hallway), how would the mirthful leap be applied? Without a running start, would each leap be up to 14ft? Would that only apply to the first jump? Or would the mirthful leap of the prior jump count as a running start for the next, allowing them to jump to the next platform up to 20ft away?
  2. Would this not make modifiers like Boots of Striding (tripling jumping distance up to character speed) useless?
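
The jump arithmetic above can be sketched as a small calculation. This only encodes the two rules as quoted in the question; the function names are made up for illustration, and the 35ft speed is an assumption inferred from the "5 ft extra" remark.

```python
def long_jump_feet(strength: int, d8_roll: int, running_start: bool) -> int:
    """Long jump distance in feet with Mirthful Leap, per the rules quoted above."""
    if not 1 <= d8_roll <= 8:
        raise ValueError("d8 roll must be between 1 and 8")
    # Running jump covers up to the Strength score; standing jump covers half.
    base = strength if running_start else strength // 2
    return base + d8_roll

def movement_used(strength: int, d8_roll: int, running_start: bool) -> int:
    """Feet of movement spent: the 10ft run-up (if any) plus every foot jumped."""
    run_up = 10 if running_start else 0
    return run_up + long_jump_feet(strength, d8_roll, running_start)

ASSUMED_SPEED = 35  # assumption: walking speed implied by the question

running = long_jump_feet(12, 8, running_start=True)
standing = long_jump_feet(12, 8, running_start=False)
print(running, standing)  # 20 14
print(ASSUMED_SPEED - movement_used(12, 8, running_start=True))  # 5
```

With Strength 12 and a maximum roll, the running jump uses 30ft of movement in total, while a standing jump uses only the 14ft it covers.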

Understanding the importance of Gunicorn and Nginx for Django web development

I’m entirely uninitiated to the world of web development, and only have a tentative grasp on Django and web development through the test server it works through.

From the guide I’m reading, the author turns to using Nginx once he starts working on site deployment, because Django is "not designed for real-life workloads." What does that mean, and why doesn’t it? In terms of justification for using Gunicorn, the author remarks:

Do you know why the Django mascot is a pony? The story is that Django comes with so many things you want: an ORM, all sorts of middleware, the admin site… "What else do you want, a pony?" Well, Gunicorn stands for "Green Unicorn", which I guess is what you’d want next if you already had a pony…

Well and good, but I don’t really know what the two are doing for the server. I know that for web developers this is like asking a maths professor what multiplication is, so please excuse the naivety. In your answer, please keep in mind that I have almost no knowledge of web development other than what I’ve learned so far from this guide, and that I’m doing my best to understand as much as I can as someone previously entirely uninitiated (I’m from a computational programming background).
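
As a rough picture of the division of labour: Django exposes the site as a WSGI application (a Python callable), an application server such as Gunicorn runs that callable in worker processes, and Nginx typically sits in front handling client connections and static files. The sketch below is a hand-written WSGI callable, not Django's own, and the names `application` and `fake_start_response` are illustrative.

```python
def application(environ, start_response):
    """A minimal WSGI app: the interface an application server calls for each request."""
    body = f"Hello from {environ.get('PATH_INFO', '/')}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

# Simulate what a server does: build an environ, call the app, collect the body.
captured = {}

def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers

response = b"".join(application({"PATH_INFO": "/contact"}, fake_start_response))
print(captured["status"], response)
```

If this were saved as a module named `hello.py` (a hypothetical file name), running `gunicorn hello:application` would serve it. A Django project exposes an equivalent callable in its `wsgi.py`.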

Understanding Logging Priority & Options in Oracle DB

From my understanding, once we switch on database logging, the entire DB goes into logging mode and generates redo logs.

I want to categorically exclude/include some tables/tablespaces from this logging, as they are not required for recovery in case of failure.

Is there any priority among the logging options, and is there a way to exclude certain tables/tablespaces from logging so as to reduce some of the traffic going to the redo logs?

Understanding JOIN Syntax

Given:

    postgres=# \d foo
                    Table "public.foo"
     Column |  Type   | Collation | Nullable | Default
    --------+---------+-----------+----------+---------
     a      | integer |           | not null |
     b      | text    |           | not null |
    Indexes:
        "foo_pkey" PRIMARY KEY, btree (a)

    postgres=# \d bar
                    Table "public.bar"
     Column |  Type   | Collation | Nullable | Default
    --------+---------+-----------+----------+---------
     a      | integer |           | not null |
     b      | text    |           | not null |
    Indexes:
        "bar_pkey" PRIMARY KEY, btree (a)

    postgres=# select * from foo;
     a |  b
    ---+-----
     1 | one
    (1 row)

    postgres=# select * from bar;
     a |  b
    ---+-----
     2 | two
    (1 row)

I then joined them using the following JOIN syntax:

    postgres=# select * from foo, bar;
     a |  b  | a |  b
    ---+-----+---+-----
     1 | one | 2 | two
    (1 row)

Then I compared it to a full outer join:

    postgres=# select * from foo full outer join bar using (a);
     a |  b  |  b
    ---+-----+-----
     1 | one |
     2 |     | two
    (2 rows)

and to a cross join:

    postgres=# select * from foo cross join bar;
     a |  b  | a |  b
    ---+-----+---+-----
     1 | one | 2 | two
    (1 row)

Is it always true that FROM a, b, c produces a cross join?
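
With only one row per table it is hard to tell the join types apart, so a comparison with more rows may help. The sketch below uses Python's built-in sqlite3 module rather than PostgreSQL, which is an assumption of convenience, and the extra rows (3, 'three') and (4, 'four') are added purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (a INTEGER PRIMARY KEY, b TEXT NOT NULL);
    CREATE TABLE bar (a INTEGER PRIMARY KEY, b TEXT NOT NULL);
    INSERT INTO foo VALUES (1, 'one'), (3, 'three');
    INSERT INTO bar VALUES (2, 'two'), (4, 'four');
""")

# Comma-separated FROM list with no WHERE clause.
comma_rows = conn.execute(
    "SELECT * FROM foo, bar ORDER BY foo.a, bar.a"
).fetchall()

# Explicit CROSS JOIN over the same tables.
cross_rows = conn.execute(
    "SELECT * FROM foo CROSS JOIN bar ORDER BY foo.a, bar.a"
).fetchall()

print(comma_rows == cross_rows)  # True
print(len(comma_rows))           # 4, i.e. 2 rows x 2 rows
conn.close()
```

Here both forms return every pairing of a foo row with a bar row. Adding a WHERE condition that relates the two tables would filter that product down, which is how the comma syntax is normally used to express an inner join.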

Understanding CSP: report shows a blocked resource that shouldn’t have been blocked

I’m having trouble making sense of some reported CSP violations that don’t seem to actually be violations according to the CSP standard. I have not managed to reproduce the violations in my own browser, and based on my own testing I believe that the block is the result of a non-compliant browser. That seems like a bold assertion, but based on all the documentation I’ve read and my tests it’s the only thing that makes sense.

Here is (more or less) what the CSP is:

frame-ancestors [list-of-urls]; default-src https: data: blob: 'unsafe-inline' 'unsafe-eval' [list-of-more-urls]; report-uri [my-reporting-endpoint] 

The problem is that I’m getting some violations sent to my reporting endpoint. Here is an example violation report:

    {"csp-report":{
        "document-uri":"[REDACTED]",
        "referrer":"[REDACTED]",
        "violated-directive":"script-src-elem",
        "effective-directive":"script-src-elem",
        "original-policy":"[SEE ABOVE]",
        "disposition":"enforce",
        "blocked-uri":"https://example.com/example.js",
        "status-code":0,
        "script-sample":""
    }}

The context would be that the page in question had a <script src="https://example.com/example.js"></script> on it somewhere.

To be clear, https://example.com is not in the list of allowed URLs under default-src. However, that shouldn’t really matter. Here are all the relevant facts that lead me to believe this is being caused by a non-compliant browser that someone is using:

  1. There is no script-src-elem defined so it should fall back on the default-src for the list of allowed URLs.
  2. default-src includes the https: scheme, which means that all URLs with an https scheme will be allowed. The blocked URL definitely uses HTTPS.
  3. This source agrees that the scheme source (https) will automatically allow any https resources. Therefore this should be allowed even though example.com is not in the list of allowed URLs.
  4. The official CSP docs also agree, showing that scheme matching happens first and can allow a URL even before the list of allowed URLs is checked.
  5. Therefore, if you include the https: scheme in your default-src, your CSP will match <script src="https://anything.com"> even if not specifically in the list of allowed URLs
  6. In my own testing I found the above to be true.
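
The reasoning in the list above can be sketched as a deliberately simplified matcher. This is not the matching algorithm from the CSP specification: it ignores wildcards, ports, paths, redirects, nonces, hashes and keyword sources, and the function name and the `https://allowed.example` entry are made up for illustration.

```python
from urllib.parse import urlsplit

def matches_source_list(url: str, sources: list) -> bool:
    """Simplified check: scheme-sources such as 'https:' match on scheme alone;
    any other entry is treated as an exact origin."""
    parts = urlsplit(url)
    origin = f"{parts.scheme}://{parts.netloc}".lower()
    for source in sources:
        if source.endswith(":") and "/" not in source:
            # Scheme-source, e.g. "https:" or "data:".
            if parts.scheme == source[:-1].lower():
                return True
        elif source.rstrip("/").lower() == origin:
            # Host-source, compared here as a plain origin string.
            return True
    return False

# Hypothetical stand-in for the default-src list in the question.
with_scheme = ["https:", "data:", "blob:", "'unsafe-inline'", "https://allowed.example"]
without_scheme = ["data:", "blob:", "'unsafe-inline'", "https://allowed.example"]

blocked = "https://example.com/example.js"
print(matches_source_list(blocked, with_scheme))     # True
print(matches_source_list(blocked, without_scheme))  # False
```

Under this simplified model, a client that applied only the host entries and skipped the scheme entry would behave like the second call, which is the behaviour the violation reports seem to describe.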

Despite all of this, I get sporadic reports of CSP violations even though there shouldn’t be any. Note that I’m unable to replicate this exactly because the pages in question have changed, and I don’t have easy control over them. The only thing I can think of is that some of my users have a browser that isn’t properly adhering to the CSP standard and is rejecting the URL because it is not on the list of allowed URLs, rather than allowing it based on its scheme.

Is this the best explanation, or am I missing something about my CSP? (and yes, I know that this CSP is not a very strict one).