Non-deterministic performance of a SELECT query: from 1 s to 60 s on a table with 1 billion rows

I’m trying to investigate why the performance of this query is so non-deterministic: it can take anywhere from 1 second to 60 seconds and above. The query selects a "time window" and returns all rows that fall within it.

Here’s the query in question, running on a table of approximately 1 billion rows:

SELECT CAST(extract(EPOCH FROM ts) * 1000000 AS bigint) AS ts
     , ticks
     , quantity
     , side
FROM order_book
WHERE ts >= TO_TIMESTAMP(1618882633073383 / 1000000.0)
  AND ts < TO_TIMESTAMP(1618969033073383 / 1000000.0)
  AND zx_prod_id = 0
ORDER BY ts ASC, del DESC

The values within TO_TIMESTAMP will keep sliding forward as I walk the whole table. Here is the EXPLAIN ANALYZE output for the same query on two different time windows:

Slow Performance

Gather Merge  (cost=105996.20..177498.48 rows=586308 width=18) (actual time=45196.559..45280.769 rows=539265 loops=1)
  Workers Planned: 6
  Workers Launched: 6
  Buffers: shared hit=116386 read=42298
  ->  Sort  (cost=104996.11..105240.40 rows=97718 width=18) (actual time=45169.717..45176.775 rows=77038 loops=7)
        Sort Key: (((date_part('epoch'::text, _hyper_16_214_chunk.ts) * '1000000'::double precision))::bigint), _hyper_16_214_chunk.del DESC
        Sort Method: quicksort  Memory: 9327kB
        Worker 0:  Sort Method: quicksort  Memory: 8967kB
        Worker 1:  Sort Method: quicksort  Memory: 9121kB
        Worker 2:  Sort Method: quicksort  Memory: 9098kB
        Worker 3:  Sort Method: quicksort  Memory: 9075kB
        Worker 4:  Sort Method: quicksort  Memory: 9019kB
        Worker 5:  Sort Method: quicksort  Memory: 9031kB
        Buffers: shared hit=116386 read=42298
        ->  Result  (cost=0.57..96897.07 rows=97718 width=18) (actual time=7.475..45131.932 rows=77038 loops=7)
              Buffers: shared hit=116296 read=42298
              ->  Parallel Index Scan using _hyper_16_214_chunk_order_book_ts_idx on _hyper_16_214_chunk  (cost=0.57..95187.01 rows=97718 width=18) (actual time=7.455..45101.670 rows=77038 loops=7)
                    Index Cond: ((ts >= '2021-04-22 01:34:31.357179+00'::timestamp with time zone) AND (ts < '2021-04-22 02:34:31.357179+00'::timestamp with time zone))
                    Filter: (zx_prod_id = 0)
                    Rows Removed by Filter: 465513
                    Buffers: shared hit=116296 read=42298
Planning Time: 1.107 ms
JIT:
  Functions: 49
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 9.273 ms, Inlining 0.000 ms, Optimization 2.008 ms, Emission 36.235 ms, Total 47.517 ms
Execution Time: 45335.178 ms

Fast Performance

Gather Merge  (cost=105095.94..170457.62 rows=535956 width=18) (actual time=172.723..240.628 rows=546367 loops=1)
  Workers Planned: 6
  Workers Launched: 6
  Buffers: shared hit=158212
  ->  Sort  (cost=104095.84..104319.16 rows=89326 width=18) (actual time=146.702..152.849 rows=78052 loops=7)
        Sort Key: (((date_part('epoch'::text, _hyper_16_214_chunk.ts) * '1000000'::double precision))::bigint), _hyper_16_214_chunk.del DESC
        Sort Method: quicksort  Memory: 11366kB
        Worker 0:  Sort Method: quicksort  Memory: 8664kB
        Worker 1:  Sort Method: quicksort  Memory: 8986kB
        Worker 2:  Sort Method: quicksort  Memory: 9116kB
        Worker 3:  Sort Method: quicksort  Memory: 8858kB
        Worker 4:  Sort Method: quicksort  Memory: 9057kB
        Worker 5:  Sort Method: quicksort  Memory: 6611kB
        Buffers: shared hit=158212
        ->  Result  (cost=0.57..96750.21 rows=89326 width=18) (actual time=6.145..127.591 rows=78052 loops=7)
              Buffers: shared hit=158122
              ->  Parallel Index Scan using _hyper_16_214_chunk_order_book_ts_idx on _hyper_16_214_chunk  (cost=0.57..95187.01 rows=89326 width=18) (actual time=6.124..114.023 rows=78052 loops=7)
                    Index Cond: ((ts >= '2021-04-22 01:34:31.357179+00'::timestamp with time zone) AND (ts < '2021-04-22 02:34:31.357179+00'::timestamp with time zone))
                    Filter: (zx_prod_id = 4)
                    Rows Removed by Filter: 464498
                    Buffers: shared hit=158122
Planning Time: 0.419 ms
JIT:
  Functions: 49
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 10.405 ms, Inlining 0.000 ms, Optimization 2.185 ms, Emission 39.188 ms, Total 51.778 ms
Execution Time: 274.413 ms

I interpreted this output as placing most of the blame on the parallel index scan.

At first, I tried raising work_mem to 1 GB and shared_buffers to 24 GB, thinking that the working set might not fit in RAM, but that didn’t seem to help.
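For reference, the changes were made along these lines (a sketch; work_mem is picked up by a configuration reload, while shared_buffers only takes effect after a server restart):

ALTER SYSTEM SET work_mem = '1GB';
ALTER SYSTEM SET shared_buffers = '24GB';
SELECT pg_reload_conf();  -- applies work_mem; a restart is needed for shared_buffers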

Next, I tried creating an index on (zx_prod_id, ts), thinking that the filter step of the parallel index scan might be what takes so long, but that didn’t seem to do anything either.
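The index was created roughly like this (the index name is illustrative; order_book is the table from the query above):

CREATE INDEX order_book_zx_prod_id_ts_idx
    ON order_book (zx_prod_id, ts);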

I’m no database expert, and so I’ve kind of exhausted the limits of my knowledge.

Thanks in advance for any suggestions!

Implicit conversion: when does it hurt performance a lot?

I know that performance is fine when data types match, and that implicit conversion between mismatched types can hurt it.

But sometimes the impact seems so small that fixing it is basically a waste of time, while other times it has a huge impact and performance gets a lot better.

Of course there are many scenarios, e.g. an INSERT … SELECT where the data type of a selected column is compatible with, but not the same as, the target insert column. Or the conversion could be on the join predicates.

Where does implicit conversion hurt the most?

Also, it seems that one gets an implicit conversion even when only the lengths don’t match, e.g. joining a varchar(10) with a varchar(20). How big a deal are these scenarios where only the lengths of the columns differ, and not their types?
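To make the two scenarios concrete, here is a minimal T-SQL sketch (all table and column names are hypothetical):

-- Type-family mismatch: comparing an indexed VARCHAR column to an
-- NVARCHAR parameter forces CONVERT_IMPLICIT on the column side, which
-- can turn an index seek into a scan (depending on collation):
DECLARE @ref NVARCHAR(20) = N'AB-1234';
SELECT id FROM dbo.orders WHERE order_ref = @ref;  -- order_ref is VARCHAR(10)

-- Length-only mismatch: varchar(10) joined to varchar(20) stays within
-- the same type family; the shorter side is widened, and the predicate
-- generally remains sargable:
SELECT a.id
FROM dbo.t1 AS a
JOIN dbo.t2 AS b
  ON a.code10 = b.code20;  -- varchar(10) = varchar(20)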

Basically, I would like to know when I should worry about the Compute Scalar operator in the execution plan. How can I tell whether removing it will have a significant impact?

Help with improving the query performance of a MySQL table

My team maintains a tool that automatically detects and categorizes images. We have a MySQL database (InnoDB engine) used by our tool in production, where we store information about each image processed.

The table was poorly designed by someone long before I joined the team. It all worked well while there was very little data in the database, but we recently launched the tool to a wider audience and the database now holds a huge amount of data. SELECT queries are very slow and take days to return results.

I am not an expert in databases, so please suggest ways to improve the performance. I am thinking of options like creating indexes, but we cannot have downtime for the system. The database is write-heavy (around 200 insertions per second), and I am afraid of locking the table if I try to create an index (see the sketch after the schema below).

Table schema: 6 columns, all VARCHAR, no primary key, no indexes.
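For what it’s worth, InnoDB’s online DDL can usually build a secondary index without blocking concurrent writes. A minimal sketch, assuming a hypothetical table images with a column category that the slow SELECTs filter on:

-- MySQL 5.6+ InnoDB online DDL: the statement fails with an error
-- (rather than silently taking a blocking lock) if the requested
-- ALGORITHM/LOCK combination cannot be honored.
ALTER TABLE images
    ADD INDEX idx_images_category (category),
    ALGORITHM = INPLACE,
    LOCK = NONE;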

Performance of select from a 3d list – Mathematica slower than Python

I am creating a random 3D data set in Mathematica 12.1, then selecting all points that lie within a certain range along one axis.

I do the same in Python (same computer, Python 3.8.5, NumPy 1.19.2).

RESULT: It seems that Python selects much faster (1.7 s) than Mathematica (5.2 s). What is the reason for that? For the selection in Mathematica I used the fastest solution I know of, by Carl Woll (see here, at the bottom).

MATHEMATICA CODE:

SeedRandom[1];
coordinates = RandomReal[10, {100000000, 3}];

selectedCoordinates =
   Pick[coordinates,
     Unitize@Clip[coordinates[[All, 1]], {6, 7}, {0, 0}],
     1]; // AbsoluteTiming

{5.16326, Null}

Dimensions[coordinates]
{100000000, 3}

Dimensions[selectedCoordinates]
{10003201, 3}

PYTHON CODE:

import time
import numpy as np

np.random.seed(1)
coordinates = np.random.random_sample((100000000, 3)) * 10

start = time.time()
selectedCoordinates = coordinates[(coordinates[:, 0] > 6) & (coordinates[:, 0] < 7)]
end = time.time()

print(end - start)
print(coordinates.shape)
print(selectedCoordinates.shape)

1.6979997158050537
(100000000, 3)
(9997954, 3)

How to improve Oracle Standard Edition’s performance for testing?

There’s a great post on StackOverflow about improving Postgres performance for testing.

https://stackoverflow.com/questions/9407442/optimise-postgresql-for-fast-testing/9407940#9407940

However, there aren’t any comparable resources for OracleDB. I don’t have a license for Enterprise Edition, which has features such as the ‘In-Memory’ column store that would almost certainly improve performance.

https://docs.oracle.com/en/database/oracle/oracle-database/19/inmem/intro-to-in-memory-column-store.html

I’m really limited in what I can try in Standard Edition, which is running in a Docker container in a CI pipeline. I’ve tried putting the tablespace on a RAM disk, but that doesn’t improve performance at all. I’ve also tried fiddling with FILESYSTEMIO_OPTIONS, but saw no performance change.
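The FILESYSTEMIO_OPTIONS change was along these lines (a sketch; the parameter is not dynamic, so it requires SCOPE=SPFILE and an instance restart):

ALTER SYSTEM SET filesystemio_options = 'SETALL' SCOPE = SPFILE;
-- restart the instance for the change to take effect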

Does anyone know of more obvious things I can do in OracleDB in a CI environment?

Grouping/sorting performance choice between bigint and nvarchar

I want to store a hash-code for a variable-length text field (max 1000 chars) in a database table. The hash-code will be computed and assigned once on insert, and new rows will be inserted very often.

The hash-code will be used mainly for filtering (WHERE), grouping (GROUP BY), and sorting (ORDER BY) in a couple of queries. The table will hold a few million rows over time; around 30% of rows will share a hash-code with another row (identical text), and the rest will be unique.

I have the choice of making the hash-code’s data type NVARCHAR (the SHA-1 of the text) or BIGINT (converted bytes of the SHA-1). I think BIGINT will be better in terms of storage space (fewer pages).
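For illustration, a minimal T-SQL sketch of the two candidate representations of the same SHA-1 hash (variable names are hypothetical):

DECLARE @text NVARCHAR(1000) = N'some variable-length text';
DECLARE @sha1 VARBINARY(20) = HASHBYTES('SHA1', @text);

-- Option 1: 40-character hex string, stored as NVARCHAR(40)
SELECT CONVERT(NVARCHAR(40), @sha1, 2) AS hash_nvarchar;

-- Option 2: the first 8 bytes of the hash folded into a BIGINT
SELECT CAST(SUBSTRING(@sha1, 1, 8) AS BIGINT) AS hash_bigint;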

Generally speaking, which of these two data types will be better in terms of performance, considering the operations mentioned above?

Can objects created by Performance of Creation be used as expensive material components?

Tasha’s Cauldron of Everything introduces the College of Creation, which gets the class feature Performance of Creation:

As an action, you can channel the magic of the Song of Creation to create one nonmagical item of your choice in an unoccupied space within 10 feet of you. The item must appear on a surface or in a liquid that can support it. The gp value of the item can’t be more than 20 times your bard level, and the item must be Medium or smaller.

…this isn’t a huge deal by itself, since the gp limit keeps the value of what you can create fairly small, but later on the College grants:

Creative Crescendo

[…]

You are no longer limited by gp value when creating items with Performance of Creation.

Could you create expensive nonmagical objects, such as a diamond for resurrection magic or the 500 gp statue for imprisonment? Would such an object work as the material component when casting these spells?