How to get the count of an object through 3 different tables in Postgres, with IDs stored in each table

I’m currently using Postgres 9.6.16.

I am currently using 3 different tables to store a hypothetical user’s details.

The first table, called contact, contains:

ID, Preferred_Contact_Method 

The second table, called orders, contains:

ID, UserID, Contact_ID (the ID of the row in the contact table that relates to this order)

The third table, called order_details, contains:

ID, Orders_ID (the ID in the orders table that relates to this order detail)

The tables contain other data as well, but for a minimal reproduction these are the only columns relevant to this question.

I am trying to return some data so that I can generate a graph. In this hypothetical store, there are only three ways we can contact a user: email, SMS, or physical mail.

The graph is supposed to show 3 numbers: how many mails, emails, and SMS messages we’ve sent to the user. In this hypothetical store, whenever you purchase something you get notified of the successful shipment, so these notifications are 1:1 with order_details: if there are 10 order_details rows for the same user, then we sent 10 tracking numbers. Since an order can contain multiple order_details (each item has its own row in order_details), we can get the count by counting the total order_details rows belonging to a single user/contact, then attributing each to whatever contact method that user preferred at the time of making that order.

To illustrate: a new user places an order for 1 apple, 1 banana, and 1 orange. For the apple, the user set the preferred tracking-number delivery to SMS; for the banana, they set it to EMAIL; for the orange, they thought it would be funny to set it to MAIL. Now I want to generate a graph of this user’s preferred delivery methods, so I’d like to query all those rows and obtain:

SMS,   1
EMAIL, 1
MAIL,  1

Here’s a SQL Fiddle link with the schema and test data: http://sqlfiddle.com/#!17/eb8c0

The response with the above dataset should look like this:

method | count
SMS    |     4
EMAIL  |     4
MAIL   |     4
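For reference, one way to express that count (a minimal sketch against the column names described above; not tested against the fiddle):

-- Count order_details rows per preferred contact method;
-- add a filter on o.userid to chart a single user
select c.preferred_contact_method as method,
       count(od.id)               as count
  from contact c
  join orders o         on o.contact_id = c.id
  join order_details od on od.orders_id = o.id
 group by c.preferred_contact_method;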

Understanding Postgres query planner behaviour on a GIN index

I need your expert opinion on index usage and query planner behaviour.

\d orders
                              Partitioned table "public.orders"
           Column            |           Type           | Collation | Nullable | Default
-----------------------------+--------------------------+-----------+----------+---------
 oid                         | character varying        |           | not null |
 user_id                     | character varying        |           | not null |
 tags                        | text[]                   |           | not null |
 category                    | character varying        |           |          |
 description                 | character varying        |           |          |
 order_timestamp             | timestamp with time zone |           | not null |
 .....
Partition key: RANGE (order_timestamp)
Indexes:
    "orders_uid_country_ot_idx" btree (user_id, country, order_timestamp)
    "orders_uid_country_cat_ot_idx" btree (user_id, country, category, order_timestamp desc)
    "orders_uid_country_tag_gin_idx" gin (user_id, country, tags) WITH (fastupdate=off)
    "orders_uid_oid_ot_key" UNIQUE CONSTRAINT, btree (user_id, oid, order_timestamp)

I have observed the following behaviour, depending on the query parameters, when I run this query:

select *
  from orders
 where user_id = 'u1'
   and country = 'c1'
   and tags && '{t1}'
   and order_timestamp >= '2021-01-01 00:00:00+00'
   and order_timestamp < '2021-03-25 05:45:47+00'
 order by order_timestamp desc
 limit 10 offset 0

Case 1: for tag t1, which occupies 99% of the records for user u1, the first index, orders_uid_country_ot_idx, is picked:

Limit  (cost=0.70..88.97 rows=21 width=712) (actual time=1.967..12.608 rows=21 loops=1)
  ->  Index Scan Backward using orders_y2021_jan_to_uid_country_ot_idx on orders_y2021_jan_to_jun orders  (cost=0.70..1232.35 rows=293 width=712) (actual time=1.966..12.604 rows=21 loops=1)
        Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
        Filter: (tags && '{t1}'::text[])
Planning Time: 0.194 ms
Execution Time: 12.628 ms

Case 2: but when I query for tag value t2 with something like tags && '{t2}', where t2 is present in 0 to <3% of a user’s records, the GIN index is picked:

Limit  (cost=108.36..108.38 rows=7 width=712) (actual time=37.822..37.824 rows=0 loops=1)
  ->  Sort  (cost=108.36..108.38 rows=7 width=712) (actual time=37.820..37.821 rows=0 loops=1)
        Sort Key: orders.order_timestamp DESC
        Sort Method: quicksort  Memory: 25kB
        ->  Bitmap Heap Scan on orders_y2021_jan_to_jun orders  (cost=76.10..108.26 rows=7 width=712) (actual time=37.815..37.816 rows=0 loops=1)
              Recheck Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'ID'::text) AND (tags && '{t2}'::text[]))
              Filter: ((order_timestamp >= '2021-01-01 00:00:00+00'::timestamp with time zone) AND (order_timestamp < '2021-03-25 05:45:47+00'::timestamp with time zone))
              ->  Bitmap Index Scan on orders_y2021_jan_to_uid_country_tag_gin_idx  (cost=0.00..76.10 rows=8 width=0) (actual time=37.812..37.812 rows=0 loops=1)
                    Index Cond: (((user_id)::text = 'u1'::text) AND ((country)::text = 'c1'::text) AND (tags && '{t2}'::text[]))
Planning Time: 0.190 ms
Execution Time: 37.935 ms
  1. Is this because the query planner identifies that 99% of the records match in case 1, so it skips the GIN index and directly uses the first index? If so, does Postgres identify this based on the statistics? (See the sketch after this list.)

  2. Before the GIN index was created, the first index was also picked for case 2, and performance was very bad because the index access range is large, i.e. the number of records satisfying the conditions on user_id, country, and the time column is very high. The GIN index improved this, but I’m curious to understand how Postgres chooses it so selectively.

  3. orders_uid_country_cat_ot_idx was added to support filtering by category, because when the GIN index was used for filters on just category, or on both category and tags, performance was bad compared to when the btree index on (user_id, country, category, order_timestamp) was picked. I expected the GIN index to work well for every combination of the category and tags filters. What could be the reason? The table contains millions of rows.
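Regarding question 1: the selectivity of tags && '{t1}' is indeed estimated from per-column statistics gathered by ANALYZE. They can be inspected with something like the following sketch (the tablename value should match whichever partition is being scanned):

-- Most common array elements of tags and their estimated frequencies;
-- the planner uses these to estimate how selective tags && '{t1}' is
select most_common_elems, most_common_elem_freqs
  from pg_stats
 where tablename = 'orders_y2021_jan_to_jun'
   and attname = 'tags';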

XAMPP MySQL service crash after reboot

Operating System: Windows 10 64-bit
Video: https://youtu.be/HodTJxphn94

When I run MySQL, it immediately exits. This seems to happen every 3 months. Is there any way to solve this?

12:10:51  [mysql]   Attempting to start MySQL app...
12:10:52  [mysql]   Status change detected: running
12:10:54  [mysql]   Status change detected: stopped
12:10:54  [mysql]   Error: MySQL shutdown unexpectedly.
12:10:54  [mysql]   This may be due to a blocked port, missing dependencies,
12:10:54  [mysql]   improper privileges, a crash, or a shutdown by another method.
12:10:54  [mysql]   Press the Logs button to view error logs and check
12:10:54  [mysql]   the Windows Event Viewer for more clues
12:10:54  [mysql]   If you need more help, copy and post this
12:10:54  [mysql]   entire log window on the forums

Log

InnoDB: using atomic writes.
[Note] InnoDB: Mutexes and rw_locks use Windows interlocked functions
[Note] InnoDB: Uses event mutexes
[Note] InnoDB: Compressed tables use zlib 1.2.11
[Note] InnoDB: Number of pools: 1
[Note] InnoDB: Using SSE2 crc32 instructions
[Note] InnoDB: Initializing buffer pool, total size = 16M, instances = 1, chunk size = 16M
[Note] InnoDB: Completed initialization of buffer pool
[Note] InnoDB: 128 out of 128 rollback segments are active.
[Note] InnoDB: Creating shared tablespace for temporary tables
[Note] InnoDB: Setting file 'C:\xampp\mysql\data\ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
[Note] InnoDB: File 'C:\xampp\mysql\data\ibtmp1' size is now 12 MB.
[Note] InnoDB: Waiting for purge to start
[Note] InnoDB: 10.4.17 started; log sequence number 44926144; transaction id 195314
[Note] InnoDB: Loading buffer pool(s) from C:\xampp\mysql\data\ib_buffer_pool
[Note] Plugin 'FEEDBACK' is disabled.
[Note] Server socket created on IP: '::'.

Table scan instead of index seeks when the WHERE clause filters across multiple tables in a join using OR

We have an application-generated query using a view that joins two tables with a LEFT OUTER JOIN. When filtering by fields from just one table (either table), an index seek happens and it’s reasonably fast. When the WHERE clause includes conditions on fields from both tables combined with an OR, the query plan switches to a table scan and doesn’t utilize any of the indexes.

All four fields that are being filtered on are indexed on their respective tables.

Fast query plan where I filter on 3 fields from one table: https://www.brentozar.com/pastetheplan/?id=Hym_4PRSO

Slow query plan where I filter on four fields, three from one table and one from another: https://www.brentozar.com/pastetheplan/?id=r1dVNDRHO

Ideally I would like to understand why this is happening and how to nudge the query engine to utilize all the indexes.
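For context, a common workaround for this OR-across-tables pattern (not necessarily the right fix here) is to split the predicate into two branches that can each seek their own index and combine them with UNION. A rough sketch with placeholder names, since the real view and columns aren’t shown:

-- Hypothetical rewrite: MyView, ColFromTableA, and ColFromTableB are
-- placeholders; each branch can seek the index on its own table.
SELECT *
FROM dbo.MyView
WHERE ColFromTableA = @value1
UNION   -- de-duplicates rows that match both branches
SELECT *
FROM dbo.MyView
WHERE ColFromTableB = @value2;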

SQL30080N error points to two documents, SC31-6160 and SC31-6156

If I face the problem SQL1476N with sqlerror "-30080", then I look up this other error:

db2 ? sql30080

SQL30080N  A communication error "<reason-code>" occurred sending or
      receiving data from the remote database.
...
Refer to the document IBM Communications Manager 1.0 APPC Programming Guide
and Reference (SC31-6160) for explanation of the APPC primary and secondary
return codes. For details of APPC sense data, refer to the IBM Communications
Manager 1.0 Problem Determination Guide (SC31-6156).

Where can I get information about these documents?

  • IBM Communications Manager 1.0 APPC Programming Guide and Reference (SC31-6160).
  • IBM Communications Manager 1.0 Problem Determination Guide (SC31-6156)

Reference:

https://www.ibm.com/docs/en/db2/11.5?topic=SSEPGG_11.5.0/com.ibm.db2.luw.messages.sql.doc/sql28000-sql33999.html#sql30080n

PostgreSQL: sort by value position in array column, then by secondary order

I’m not quite sure what the best way to phrase this is…

So in my DB there is a pillars text array, which is basically an enum where providers ordered which values meant the most to their business, from most important in providing that value for their clients to least important.

I’m using PostGIS to query providers in a specific area, and want to return providers ordered first by the position of the pillar a client selected, then by closest location.

So if the pillars arrays all have the values ['a', 'b', 'c', 'd'], in some order depending on what each provider selected, and the client selected pillar c,

the query would preferably return any/all providers that have pillar c at array index 0 first, ordered by distance to the client’s geopoint; then providers that have pillar c at array index 1, again ordered by distance; then index 2, then index 3.

I’m really only looking for the top 3 results in all cases, and providers with pillar c at index 1 would only be needed if there were fewer than 3 results at index 0.

Is this possible to pull off in a single query? Or should I just run it with a WHERE clause and check the result length until I have 3 results?

The pillars column is indexed with a GIN index, by the way.
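One possible shape for a single query (a sketch, assuming a providers table with a pillars text[] column and a PostGIS geometry column geom; the pillar value and the client point below are made-up parameters):

-- Providers containing pillar 'c', ranked by its position in the array,
-- then by distance to the client's point
select p.*
  from providers p
 where p.pillars @> array['c']
 order by array_position(p.pillars, 'c'),
          p.geom <-> ST_SetSRID(ST_MakePoint(-73.99, 40.73), 4326)
 limit 3;

Because everything is ordered in one pass, the "only use index 1 if there are fewer than 3 results at index 0" behaviour falls out of the LIMIT.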

Select maximum of a count in a grouped clause

I have the following tables:

Vehicles(v͟i͟n͟, model, category)
Sales(s͟a͟l͟e͟I͟D͟, staffID, customerID, date)
vehicleSold(saleID, v͟i͟n͟, salePrice)

When I join these tables using:

select YEAR(Sales.saleDate)
     , Vehicles.model
     , count(Vehicles.model) 'Sold'
     , Vehicles.category
  from Vehicles
  JOIN vehicleSold
    on Vehicles.vin = vehicleSold.vin
  JOIN Sales
    on Sales.saleID = vehicleSold.saleID
 group
    by YEAR(Sales.saleDate)
     , Vehicles.model
     , Vehicles.category;

Result is:

+----------------------+-------------+------+----------------+
| YEAR(Sales.saleDate) | model       | Sold | category       |
+----------------------+-------------+------+----------------+
|                 2020 | Altima      |    1 | car            |
|                 2020 | Flying Spur |    2 | car            |
|                 2020 | Lifan E3    |    3 | Electric Moped |
|                 2020 | Ridgeline   |    2 | truck          |
|                 2020 | Shiver      |    4 | motorbike      |
+----------------------+-------------+------+----------------+

Out of this table I want to get the model that was most sold in each category. So, in this case I only want to return 2020, Flying Spur, car as the only row in category car, because it was the most sold in 2020 in its category. I tried using MAX(COUNT(*)) in a subquery, but I guess that is not supported in MySQL. If anyone could point out my mistake and has any idea how to do this, that would be a big help!
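For reference, one common shape for this (a sketch assuming MySQL 8.0+, since it needs window functions):

-- Rank each model's sales count within its year/category group,
-- then keep only the top-ranked model per group
SELECT saleYear, model, Sold, category
FROM (
    SELECT YEAR(Sales.saleDate) AS saleYear,
           Vehicles.model,
           COUNT(Vehicles.model) AS Sold,
           Vehicles.category,
           RANK() OVER (PARTITION BY YEAR(Sales.saleDate), Vehicles.category
                        ORDER BY COUNT(Vehicles.model) DESC) AS rnk
    FROM Vehicles
    JOIN vehicleSold ON Vehicles.vin = vehicleSold.vin
    JOIN Sales ON Sales.saleID = vehicleSold.saleID
    GROUP BY YEAR(Sales.saleDate), Vehicles.model, Vehicles.category
) AS ranked
WHERE rnk = 1;

Note that RANK() keeps ties, so two models with the same top count in a category would both come back.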

Best practice for organizing DDL SQL files

I am developing a Postgres database with approximately the following number of entities:

  • 60 tables spread across 7 schemas
  • 20 views
  • 20 functions

What’s the best practice for organizing all the DDL SQL?

I currently have a single SQL file for the table definitions, another for the views, and yet another for the functions. But two of these files have grown to over 1,000 lines each and have become unwieldy. That said, there are relationships between tables in different schemas, and one file makes these easy to manage.

Would it be better to organize the DDL by schema? Or finer grain still, at the entity level?
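For concreteness, the by-schema option might look something like this (a hypothetical layout; the schema and file names are placeholders):

ddl/
  billing/
    tables.sql
    views.sql
    functions.sql
  inventory/
    tables.sql
    ...
  cross_schema_constraints.sql   -- foreign keys between schemas kept in one place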

I am using JetBrains DataGrip and would appreciate a solution that still enables IntelliSense and error checking. The SQL is stored in Git.

How to migrate a SQL Server Erwin Mart database to Aurora (Amazon RDS)

I want to migrate a SQL Server Erwin data Mart database to Aurora and am trying to figure out the easiest/quickest way to do that.

The options, as I see them, are:

  1. Save the models to the file system, repoint the application to the new mart database, then load them from the file system into the new database.
     https://support.erwin.com/hc/en-us/articles/360003443452-Java-scripts-that-automatically-save-a-mart-s-models-offline-to-a-drive
     https://support.erwin.com/hc/en-us/articles/115002674131-ERWIN-DATA-MODELER-MART-API-RESOURCE-PAGE

Has anyone got any experience using these APIs?

  2. Export/import, using the MySQL Workbench migration tool (https://www.mysql.com/products/workbench/migrate/) or the Amazon migration tool. Does anyone know if the schema is the same? Can I simply export/import the data?