PostgreSQL 13 – Improve huge table data aggregation

I have a huge database (current size is ~900GB and new data still comes) partitioned by Year_month and subpartition by currency. The problem is when I try to fetch aggregation from the whole partition it goes slow. This is a report so it will be queried very often. The current size of partition which I want to aggregate: 7.829.230 rows. Each subpartition will be similar. Table schema (anonymized):

-- auto-generated definition CREATE TABLE aggregates_dates (     currency              CHAR(3)                                    NOT NULL,     id                    uuid            DEFAULT uuid_generate_v4() NOT NULL,     date                  TIMESTAMP(0)                               NOT NULL,     currency              CHAR(3)                                    NOT NULL,     field01               INTEGER                                    NOT NULL,     field02               INTEGER                                    NOT NULL,     field03               INTEGER                                    NOT NULL,     field04               INTEGER                                    NOT NULL,     field05               INTEGER                                    NOT NULL,     field06               CHAR(2)                                    NOT NULL,     field07               INTEGER         DEFAULT 0                  NOT NULL,     field08               INTEGER         DEFAULT 0                  NOT NULL,     field09               INTEGER         DEFAULT 0                  NOT NULL,     field10               INTEGER         DEFAULT 0                  NOT NULL,     field11               INTEGER         DEFAULT 0                  NOT NULL,     value01               INTEGER         DEFAULT 0                  NOT NULL,     value02               INTEGER         DEFAULT 0                  NOT NULL,     value03               INTEGER         DEFAULT 0                  NOT NULL,     value04               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value05               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value06               INTEGER         DEFAULT 0                  NOT NULL,     value07               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value08               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value09               INTEGER         DEFAULT 0                  NOT NULL,     value10               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value11               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value12               INTEGER         DEFAULT 0                  NOT NULL,     value13               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value14               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value15               INTEGER         DEFAULT 0                  NOT NULL,     value16               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value17               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value18               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,     value19               INTEGER         DEFAULT 0,     value20               INTEGER         DEFAULT 0,     CONSTRAINT aggregates_dates_pkey         PRIMARY KEY (id, date, currency) )     PARTITION BY RANGE (date); CREATE TABLE aggregates_dates_2020_01     PARTITION OF aggregates_dates         (             CONSTRAINT aggregates_dates_2020_01_pkey                 PRIMARY KEY (id, date, currency)             )         FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2020-01-31 23:59:59')     PARTITION BY LIST (currency); CREATE TABLE aggregates_dates_2020_01_eur     PARTITION OF aggregates_dates_2020_01         (             CONSTRAINT aggregates_dates_2020_01_eur_pkey                 PRIMARY KEY (id, date, currency)             )         FOR VALUES IN ('EUR'); CREATE INDEX aggregates_dates_2020_01_eur_date_idx ON aggregates_dates_2020_01_eur (date); CREATE INDEX aggregates_dates_2020_01_eur_field01_idx ON aggregates_dates_2020_01_eur (field01); CREATE INDEX aggregates_dates_2020_01_eur_field02_idx ON aggregates_dates_2020_01_eur (field02); CREATE INDEX aggregates_dates_2020_01_eur_field03_idx ON aggregates_dates_2020_01_eur (field03); CREATE INDEX aggregates_dates_2020_01_eur_field04_idx ON aggregates_dates_2020_01_eur (field04); CREATE INDEX aggregates_dates_2020_01_eur_field06_idx ON aggregates_dates_2020_01_eur (field06); CREATE INDEX aggregates_dates_2020_01_eur_currency_idx ON aggregates_dates_2020_01_eur (currency); CREATE INDEX aggregates_dates_2020_01_eur_field09_idx ON aggregates_dates_2020_01_eur (field09); CREATE INDEX aggregates_dates_2020_01_eur_field10_idx ON aggregates_dates_2020_01_eur (field10); CREATE INDEX aggregates_dates_2020_01_eur_field11_idx ON aggregates_dates_2020_01_eur (field11); CREATE INDEX aggregates_dates_2020_01_eur_field05_idx ON aggregates_dates_2020_01_eur (field05); CREATE INDEX aggregates_dates_2020_01_eur_field07_idx ON aggregates_dates_2020_01_eur (field07); CREATE INDEX aggregates_dates_2020_01_eur_field08_idx ON aggregates_dates_2020_01_eur (field08); 

Example Query (not all fields used) which aggregate whole partition (This query might have many more WHERE conditions but this one is the worst case)

EXPLAIN (ANALYSE, BUFFERS, VERBOSE) SELECT        COALESCE(SUM(mainTable.value01), 0)            AS                                    "value01",        COALESCE(SUM(mainTable.value02), 0)       AS                                    "value02",        COALESCE(SUM(mainTable.value03), 0)       AS                                    "value03",        COALESCE(SUM(mainTable.value06), 0)       AS                                    "value06",        COALESCE(SUM(mainTable.value09), 0)    AS                                    "value09",        COALESCE(SUM(mainTable.value12), 0)      AS                                    "value12",        COALESCE(SUM(mainTable.value15), 0) AS                                    "value15",        COALESCE(SUM(mainTable.value03 + mainTable.value06 + mainTable.value09 + mainTable.value12 +                     mainTable.value15), 0) AS                                    "kpi01",        COALESCE(SUM(mainTable.value05) * 1, 0)                                         "value05",        COALESCE(SUM(mainTable.value08) * 1, 0)                                         "value08",        COALESCE(SUM(mainTable.value11) * 1, 0)                                      "value11",        COALESCE(SUM(mainTable.value14) * 1, 0)                                        "value14",        COALESCE(SUM(mainTable.value17) * 1, 0)                                   "value17",        COALESCE(SUM(mainTable.value05 + mainTable.value08 + mainTable.value11 + mainTable.value14 +                     mainTable.value17) * 1, 0)                                   "kpi02",        CASE            WHEN SUM(mainTable.value02) > 0 THEN (1.0 * SUM(                        mainTable.value05 + mainTable.value08 + mainTable.value11 +                        mainTable.value14 + mainTable.value17) / SUM(mainTable.value02) * 1000 * 1)            ELSE 0 END                                                                      "kpiEpm",        CASE            WHEN SUM(mainTable.value01) > 0 THEN (1.0 * SUM(                        mainTable.value05 + mainTable.value08 + mainTable.value11 +                        mainTable.value14) / SUM(mainTable.value01) * 1)            ELSE 0 END FROM performance mainTable WHERE (mainTable.date BETWEEN '2020-01-01 00:00:00' AND '2020-02-01 00:00:00')   AND (mainTable.currency = 'EUR') GROUP BY mainTable.field02; 

EXPLAIN:

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |QUERY PLAN                                                                                                                                                                          | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |HashAggregate  (cost=3748444.51..3748502.07 rows=794 width=324) (actual time=10339.771..10340.497 rows=438 loops=1)                                                                 | |  Group Key: maintable.field02                                                                                                                                                      | |  Batches: 1  Memory Usage: 1065kB                                                                                                                                                  | |  Buffers: shared hit=2445343                                                                                                                                                       | |  ->  Append  (cost=0.00..2706608.65 rows=11575954 width=47) (actual time=212.934..4549.921 rows=7829230 loops=1)                                                                   | |        Buffers: shared hit=2445343                                                                                                                                                 | |        ->  Seq Scan on performance_2020_01 maintable_1  (cost=0.00..2646928.38 rows=11570479 width=47) (actual time=212.933..4055.104 rows=7823923 loops=1)                        | |              Filter: ((date >= '2020-01-01 00:00:00'::timestamp without time zone) AND (date <= '2020-02-01 00:00:00'::timestamp without time zone) AND (currency = 'EUR'::bpchar))| |              Buffers: shared hit=2444445                                                                                                                                           | |        ->  Index Scan using performance_2020_02_date_idx on performance_2020_02 maintable_2  (cost=0.56..1800.50 rows=5475 width=47) (actual time=0.036..6.476 rows=5307 loops=1)  | |              Index Cond: ((date >= '2020-01-01 00:00:00'::timestamp without time zone) AND (date <= '2020-02-01 00:00:00'::timestamp without time zone))                           | |              Filter: (currency = 'EUR'::bpchar)                                                                                                                                    | |              Rows Removed by Filter: 31842                                                                                                                                         | |              Buffers: shared hit=898                                                                                                                                               | |Planning Time: 0.740 ms                                                                                                                                                             | |JIT:                                                                                                                                                                                | |  Functions: 15                                                                                                                                                                     | |  Options: Inlining true, Optimization true, Expressions true, Deforming true                                                                                                       | |  Timing: Generation 4.954 ms, Inlining 14.249 ms, Optimization 121.115 ms, Emission 77.181 ms, Total 217.498 ms                                                                    | |Execution Time: 10345.662 ms                                                                                                                                                        | +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 

Server spec:

  • AMD 64 Threads
  • 315GB Ram
  • 6xSSD RAID 10 Postgres Config:
postgresql_autovacuum_vacuum_scale_factor: 0.4 postgresql_checkpoint_completion_target: 0.9 postgresql_checkpoint_timeout: 10min postgresql_effective_cache_size: 240GB postgresql_maintenance_work_mem: 2GB postgresql_random_page_cost: 1.0 postgresql_shared_buffers: 80GB postgresql_synchronous_commit: local postgresql_work_mem: 1GB 

Content Aggregation / Headline Plugin?

I am trying to find a solution to do the following:

  1. be on an article / webpage
  2. click a browser (ideally chrome) extension
  3. input a new headline for the article
  4. plugin publishes the headline with link in a widget / page of your wordpress site

Imagine this as a way to streamline a content curation site similar to the Drudge Report or popurls.com

Does anyone know a solution that would do this?

Mongoose – Aggregation – Setar campo como valor

Não estou conseguindo fazer a query abaixo me retornar alguns campos como valores de outros campos

Ex.:

{  _id:'Javascript',  valorWd: valorPs_Label // wd = "based on" e ps_Label = "ecmascript"  }  Linguagem.aggregate(         [          //{$  match: {"isIdentifier": "false"}},         {"$  match": {             "linguagemLabel": "JavaScript"}         },         {"$  group": {              _id: "$  linguagemLabel",              wd: {$  addToSet: "$  wdLabel"},              ps: {$  addToSet: "$  ps_Label"},              wdpq: {$  addToSet: "$  wdpqLabel"},              pq: {$  addToSet: "$  pq_Label"},             }         },             {"$  project": {                 wd: 1,                 ps: 1,                 wdpq: 1,                 pq: 1,                 prop: 1               }         },         { $  sort : { _id : 1} }     ]).exec(function(err, ret){         if(err){console.log(err)}         console.log(ret)         return res.json(ret);     }); 

Mongo aggregation on array values

I have a mongo document like this:

{  "_id":"5cfe7767f4046327ad57ec33",  "area":"Actor",   "profile":{    "about":true,     "gallery":true,    "characteristics":[      "Gender": true,      "Height": false,      "Weight": false    ]  } } 

And another one like this:

{  "_id":"5cfe7767f4046327ad57ec33",  "characteristic":"Gender",   "items":["Female","Male"] } 

I want to aggregate the characteristics from the first document that are true with the values of the other document. Something like this:

{  "_id":"5cfe7767f4046327ad57ec33",  "area":"Actor",   "profile":{    "about":true,     "gallery":true,    "characteristics:[      "Gender": ["Female","Male"],      "Height": false,      "Weight": false    ]  } } 

Is it possible?

Views Aggregation problem in Drupal 8

I have created a view of node content and I need author name, role, and image to show with content. I have also made a relationship with the user in the advance section for the above-mentioned fields. it’s working correctly but when I on Aggregation it gives this SQL query error.

SQLSTATE[42S22]: Column not found: 1054 Unknown column ‘users_field_data_node_field_revision__user__user_picture.user_picture_’ in ‘field list’: SELECT node_field_data.title AS node_field_data_title, users_field_data_node_field_revision.name AS users_field_data_node_field_revision_name, users_field_data_node_field_revision__user__user_picture.user_picture_ AS users_field_data_node_field_revision__user__user_picture_use, nodeviewcount.id AS id, nodeviewcount.nid AS nodeviewcount_nid, node_field_data.created AS node_field_data_created, MIN(node_field_data.nid) AS nid, MIN(users_field_data_node_field_revision.uid) AS users_field_data_node_field_revision_uid FROM {node_field_data} node_field_data INNER JOIN {node_field_revision} node_field_revision ON node_field_data.vid = node_field_revision.vid LEFT JOIN {users_field_data} users_field_data_node_field_revision ON node_field_revision.uid = users_field_data_node_field_revision.uid LEFT JOIN {user__user_picture} users_field_data_node_field_revision__user__user_picture ON users_field_data_node_field_revision.uid = users_field_data_node_field_revision__user__user_picture.entity_id AND (users_field_data_node_field_revision__user__user_picture.deleted = :views_join_condition_0 AND users_field_data_node_field_revision__user__user_picture.langcode = users_field_data_node_field_revision.langcode) LEFT JOIN {nodeviewcount} nodeviewcount ON node_field_data.nid = nodeviewcount.nid WHERE (node_field_data.status = :db_condition_placeholder_2) AND (node_field_data.type IN (:db_condition_placeholder_3)) GROUP BY node_field_data_title, users_field_data_node_field_revision_name, users_field_data_node_field_revision__user__user_picture_use, id, nodeviewcount_nid, node_field_data_created ORDER BY node_field_data_created DESC LIMIT 11 OFFSET 0; Array ( [:db_condition_placeholder_2] => 1 [:db_condition_placeholder_3] => ask_question [:views_join_condition_0] => 0 )

I don’t know what is the issue and how to fix this problem. Aggregation is must for me in this view because I have to count Node ID through Aggregation.

Taxonomy terms not localized in view with enabled aggregation

On a view on content, I added a field “All taxonomy terms” (Display all taxonomy terms associated with a node from specified vocabularies) and enabled aggregation, see screenshot of view attached. This view is listing exclusively terms that are actually used in content. While the view is working for the standard language of the site, the terms are not translated into the other languages despite the vocabulary is translatable. Once I disable the aggregation, the translation of terms is working again.

So how can I make the translation of terms work?

enter image description here

Kubernetes aggregation certificates – apiserver client authentication allowed names

Definitions I’m using in this question:

  • Main apiserver: the core kube-apiserver
  • Extension apiserver: an addon like metrics-server

I am reading through the configure aggregation layer guide and I don’t understand the main apiserver’s use of --requestheader-allowed-names. In section Kubernetes Apiserver Client Authentication it says:

The connection must be made using a client certificate whose CN is one of those listed in –requestheader-allowed-names. Note: You can set this option to blank as –requestheader-allowed-names=””. This will indicate to an extension apiserver that any CN is acceptable.

It makes it sound like the main apiserver is responsible for setting this. Surely the extension apiserver would be in control of this and determine what is acceptable? Why configure this on the main apiserver at all? I.e. The client certificate common names are what they are and it’s up to the extension apiserver to accept/reject these?

Or is that doc section mixing options that are passed to both the main and extension apiservers?

Views Table with Aggregation, total values counted per option

I’m setting up a view of event registrations. Each registration will be allocated to 1 of 3 workshops at two different times in the day.

I’m looking to output the total number of allocations per workshop so we can try to balance the delegate numbers.

I’m pretty close, but not quite there.

I’m using the Display Format of Table with aggregation options with “Group and Compress” on the Event ID.

I then apply the group function on two data fields: “Allocated Workshop AM” and “Allocated Workshop PM”.

I can get closest to my desired result using the “Tally members” option, however, the results are wrong.

enter image description here

They are showing much lower numbers than are actually allocated.

enter image description here

Any suggestions on how I could achieve this?

How to disable core JS aggregation and AdvAgg just for admin paths

I help look after a site which uses $ settings['file_public_base_url'] to serve files from a different subdomain on the frontend, but on the backend (admin paths like node/add and node/edit) this causes our JS to be loaded with absolute rather than relative URLs which seems to cause an issue with JS loaded via AJAX not executing correctly. There are no JS errors in my devtools console, but I suspect the cause is that the JS is being loaded from differing subdomains, as turning AdvAgg and core’s JS aggregation off makes the JS files load via relative paths and function as intended.

Without any errors showing in my console or server logs I’m not sure I’m going to be able to get to the bottom of the AJAX issue, so for now a compromise seems to be to turn off aggregation for admin and content editing routes.

What would be the best way to do that? I know I can set $ config['system.performance']['js']['preprocess'] = FALSE; and presumably a similar config key for AdvAgg’s own setting (we have AdvAgg set to optimise/compress JS, so even with system.performance.js.preprocess turned off AdvAgg will still manage the files and use absolute paths to include them. Can config settings be altered on a per route/path basis?