Starting with SQL Server 2019, does compatibility level no longer influence cardinality estimation?

In SQL Server 2017 and earlier, if you wanted cardinality estimates that matched a prior version of SQL Server, you could set the database’s compatibility level to that earlier version.

For example, on SQL Server 2017, if you wanted execution plans whose estimates matched SQL Server 2012, you could set the compatibility level to 110 (SQL Server 2012) and get exactly that.

This is reinforced by the documentation, which states:

Changes to the Cardinality Estimator released on SQL Server and Azure SQL Database are enabled only in the default compatibility level of a new Database Engine version, but not on previous compatibility levels.

For example, when SQL Server 2016 (13.x) was released, changes to the cardinality estimation process were available only for databases using SQL Server 2016 (13.x) default compatibility level (130). Previous compatibility levels retained the cardinality estimation behavior that was available before SQL Server 2016 (13.x).

Later, when SQL Server 2017 (14.x) was released, newer changes to the cardinality estimation process were available only for databases using SQL Server 2017 (14.x) default compatibility level (140). Database Compatibility Level 130 retained the SQL Server 2016 (13.x) cardinality estimation behavior.

However, in SQL Server 2019 that doesn’t seem to be the case. If I take the Stack Overflow 2010 database and run this script:

CREATE INDEX IX_LastAccessDate_Id ON dbo.Users(LastAccessDate, Id);
GO
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140;
GO
SELECT LastAccessDate, Id, DisplayName, Age
FROM dbo.Users
WHERE LastAccessDate > '2018-09-02 04:00'
ORDER BY LastAccessDate;

I get an execution plan with 1,552 rows estimated coming out of the index seek operator:

[screenshot: SQL 2017, compat 2017]

But if I take the same database, same query on SQL Server 2019, it estimates a different number of rows coming out of the index seek – it says “SQL 2019” in the comment at right, but note that it’s compat level 140:

[screenshot: SQL 2019, compat 2017]

And if I set the compatibility level to 150 (SQL 2019), I get that same estimate of 1,566 rows:

[screenshot: SQL 2019, compat 2019]

So in summary, starting with SQL Server 2019, does compatibility level no longer influence cardinality estimation the way it did in SQL Server 2014-2017? Or is this a bug?
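
In case it’s useful, this is how I’ve been double-checking which CE model a plan actually used, rather than eyeballing the estimates (a rough diagnostic, assuming the test query is still in the plan cache; the LIKE filter is just my ad-hoc way of finding it):

-- Every plan's root node carries a CardinalityEstimationModelVersion attribute
-- (70 = legacy CE, 120/130/140/150 = the newer CE models).
SELECT
    st.text AS query_text,
    qp.query_plan.value('(//@CardinalityEstimationModelVersion)[1]', 'int') AS ce_model_version
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%LastAccessDate >%'
  AND st.text NOT LIKE '%dm_exec_query_stats%';  -- don't match this diagnostic query itself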

Checking to see if my relation is in 3NF based on the functional dependencies

I have a relation called Score (it stores the scores of football games) with the following functional dependencies over its attributes G, H, T, S, W, D, and O (representing GameID, HomeOrAway, TeamID, Season, Week, Date, and Outcome):

GH → TSWDOP

SD → W

TSW → GH

D → SW

I would like to know whether Score is in 3NF and, if it is not, to find a decomposition that achieves 3NF. Can you help me figure out how to go about this? I know that for each functional dependency I need to check that the left side contains a key for Score (or that the attributes on the right side are prime), but I’m not sure how to actually do that. Any help would be greatly appreciated!
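
To show where I’ve gotten on my own, here is the attribute-closure step as I understand it (just a sketch of my own working, so please correct me if I’m misapplying it):

(GH)⁺ : GH → TSWDOP fires immediately, so (GH)⁺ covers every attribute of Score and GH is a candidate key.
(TSW)⁺ : TSW → GH gives G and H, and then GH → TSWDOP gives everything else, so TSW is a candidate key too.
(D)⁺ : only D → SW fires (SD → W adds nothing new), so (D)⁺ = {D, S, W} and D alone is not a key.

Is that the right way to find the candidate keys before checking each functional dependency against the 3NF condition?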

Why is VACUUM FULL locking ALL databases on the cluster?

From everything I have read about VACUUM FULL, I would expect it to lock the database I’m running it on, but it renders every database in the cluster inaccessible. Is there perhaps something we might have wrong in our configuration? I do have autovacuum on, but we have a few unwieldy databases I’d like to clean up thoroughly. This is PostgreSQL v10. Thanks for any insights.
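
For what it’s worth, the next time it happens I plan to check, from a session that still responds, what the stuck sessions are actually waiting on. A rough sketch (PostgreSQL 10; nothing here is specific to our setup):

-- Sessions currently waiting, e.g. on the ACCESS EXCLUSIVE locks that
-- VACUUM FULL takes on each table it rewrites.
SELECT datname,
       pid,
       state,
       wait_event_type,
       wait_event,
       query
FROM pg_stat_activity
WHERE wait_event_type IS NOT NULL
ORDER BY datname, pid;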

Entries in suspect_pages but checkdb shows no error

I am using Microsoft SQL Server 2016 (SP2-GDR) (KB4505220) – 13.0.5101.9 (X64) Jun 15 2019 23:15:58 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 (Build 9600: )

Yesterday I got two entries in suspect_pages for the same database: one of event type 1 and one of event type 2.

1 = An 823 error that causes a suspect page (such as a disk error) or an 824 error other than a bad checksum or a torn page (such as a bad page ID).

2 = Bad checksum.

database_id  file_id  page_id  eventtype  error_count  last_update_date
8            1        1482057  1          1            2019-11-14 14:40
8            1        1482057  2          1            2019-11-14 14:40

I identified the related object, and both entries point to the same table in the database:

DBCC TRACEON (3604);
DBCC PAGE (8, 1, 14823057, 0);
DBCC TRACEOFF (3604);

I had a valid backup from before the corruption and couldn’t afford any downtime, so I took a backup of the corrupted database and restored my valid backup under a new name. I then dropped the corrupted table and recreated it from the valid backup.

Today I restored the backup of the corrupted database (the one I took yesterday) on a test server, and when I ran a full CHECKDB it detected no corruption:

DBCC CHECKDB WITH NO_INFOMSGS, ALL_ERRORMSGS;

How is it possible that the backup I took from a corrupted database (according to suspect_pages) doesn’t have any problems? Can those entries in suspect_pages be a false positive?

The database compatibility level is 130 (SQL Server 2016). Our SQL Server is running on Windows Server 2012.
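
In case it helps, this is how I’m re-checking things on the test server (the table name below is a placeholder for the table the suspect pages pointed at, not its real name):

-- What suspect_pages currently holds on the original server
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages;

-- Targeted re-check of just the affected table on the restored copy
DBCC CHECKTABLE (N'dbo.MyAffectedTable') WITH NO_INFOMSGS, ALL_ERRORMSGS;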

What should my database design look like if I want to implement an invoice preset?

I just finished creating the invoice part of my database, and now I need to implement an invoice preset feature: it takes different products and services (both main and sub items) and puts them into an invoice with one click of a button in the client’s app.

[schema diagram of the current design]

I do have a couple of questions with this design:

  1. Is the current design enough?
  2. I had some trouble figuring out how to recreate an invoice exactly as it was made (with the items listed in their original order), so a DBA from another site suggested I put a timestamp on the invoice items to keep track of the order. Was that correct?
  3. I’ve created some mockup designs of the invoice preset (yellow and blue) but couldn’t decide which one is correct or if both of them are wrong.
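
For reference, this is roughly the shape I’ve been experimenting with for the preset side, in SQL Server flavoured syntax (all table and column names here are my own guesses, not what’s in the diagram above):

CREATE TABLE invoice_preset (
    preset_id    int IDENTITY(1,1) PRIMARY KEY,
    preset_name  nvarchar(100) NOT NULL
);

CREATE TABLE invoice_preset_item (
    preset_id    int NOT NULL REFERENCES invoice_preset (preset_id),
    line_no      int NOT NULL,        -- explicit ordering column, see question 2 above
    product_id   int NULL,            -- exactly one of product_id / service_id would be set
    service_id   int NULL,
    quantity     decimal(10, 2) NOT NULL DEFAULT 1,
    PRIMARY KEY (preset_id, line_no)
);

The line_no column is my attempt at question 2: an explicit position (here on the preset items, but the same idea would apply to the invoice items) instead of relying on a timestamp to reconstruct the order.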

Multiple Result Tabs in Workbench

I have several queries in the editor, each terminating with a semicolon (I am teaching myself SQL, so the editor contains a mix of queries and notes, like below):

SELECT * FROM orders WHERE order_date >= '2019-01-01';

-- AND operator

SELECT * FROM customers WHERE birth_date > '1990-01-01' AND points > 1000;

-- OR operator

SELECT * FROM customers WHERE birth_date > '1990-01-01' OR points > 1000 AND state = 'VA';

-- NOT operator

-- Exercise: from the order_items table get the items
--   for order #6
--   where the total price is greater than 30

SELECT * FROM order_items WHERE order_id = 6 AND quantity * unit_price >= 30;


When I first began using MySQL Workbench, I would run a query and only the last SELECT statement would execute, with its results shown in the result tab. The Action Output would show a timestamp that clearly indicated only the last SELECT statement in the editor had run.

Now all of the queries ending with a semicolon run at once, and each one gets its own result tab and its own line in the Action Output section at the bottom (e.g., "orders43", "customers44", "customers45", and so on).

The only way I can get just the most recent result is by highlighting the specific SELECT statement and running it on its own. What I want to know is why the results aren’t simply overwriting each other, with the most recent SELECT statement’s results displayed in the same result tab.

At this rate I’ll have hundreds of result tabs, one for each query ending with a semicolon, which is clearly not what I want, nor is it how Workbench was behaving for me up until recently.

I am not sure why this is occurring, or how to stop it.

Any help would be greatly appreciated.

Thanks!

Using Temp Tables in Azure Data Studio Notebooks

tl;dr I want to use temp tables across multiple cells in a Jupyter Notebook to save CPU time on our SQL Server instances.

I’m trying to modernize a bunch of the monitoring queries that I run daily as a DBA. We use a real monitoring tool for almost all of our server-level stuff, but we’re a small shop, so monitoring the actual application logs falls on the DBA team as well (we’re trying to fix that). Currently we just have a pile of mostly undocumented stored procedures we run every morning, but I want something a little less arcane, so I am looking into Jupyter Notebooks in Azure Data Studio.

One of our standard practices is to take all of the logs from the past day and drop them into a temp table, filtering out all of the noise. After that we run a dozen or so aggregate queries on the filtered temp table to produce meaningful results. I want to do something like this:

Cell 1

Markdown description of the loading process, with details on available variables 

Cell 2

T-SQL statements to populate temp table(s)

Cell 3

Markdown description of next aggregate 

Cell 4

T-SQL to produce the aggregate
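
Concretely, cells 2 and 4 would contain something like this (the log table and column names are placeholders, not our real schema):

-- Cell 2: filter the last day of logs into a temp table
IF OBJECT_ID('tempdb..#filtered_logs') IS NOT NULL
    DROP TABLE #filtered_logs;

SELECT log_id, log_time, severity, message
INTO #filtered_logs
FROM dbo.application_logs
WHERE log_time >= DATEADD(DAY, -1, SYSDATETIME())
  AND severity >= 3;   -- drop the noise

-- Cell 4: one of the dozen-or-so aggregates over the filtered set
SELECT severity, COUNT(*) AS events
FROM #filtered_logs
GROUP BY severity
ORDER BY severity;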

The problem is that, it seems, each cell is run in an independent session, so the temp tables from cell 2 are all gone by the time I run any later cells (even if I use the “Run cells” button to run everything in order).

I could simply create staging tables in the user database and write my filtered logs there, but eventually I’d like to be able to pass off the notebooks to the dev teams and have them run the monitoring queries themselves. We don’t give write access on any prod reporting replicas, and it would not be feasible to create a separate schema which devs can write to (for several reasons, not the least of which being that I am nowhere near qualified to recreate tempdb in a user database).

Can a foreign key satisfy all primary key constraints? (If so, why is it not a primary key?)

I’m super new to SQL and relational databases, and it’s not always easy for me to distinguish between primary keys and foreign keys.

I understand that primary keys must uniquely identify records in a table and must not have missing values. However, some foreign keys seem to satisfy these constraints as well.

For instance, the people table has information on all baseball players, and the hof_inducted table has information on those inducted into the Hall of Fame. In both tables, playerid is unique and has no missing values. So why is it a primary key in the former but a foreign key in the latter?

Conceptually, it kind of makes sense because people is where all the player information originates from. However, I don’t know how I can reach this conclusion just by examining primary/foreign key constraints.
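
Here’s how I currently picture the two declarations differing, in case my mental model is the problem (column names and types are my guesses, not the actual schema):

CREATE TABLE people (
    playerid   varchar(20) PRIMARY KEY,      -- uniquely identifies every player
    namefirst  varchar(50),
    namelast   varchar(50)
);

CREATE TABLE hof_inducted (
    playerid   varchar(20) PRIMARY KEY       -- unique and non-null here too...
               REFERENCES people (playerid), -- ...but every value must also exist in people
    yearid     int,
    category   varchar(20)
);

If that’s right, then the same column can be both a primary key and a foreign key at once, which is part of what confuses me.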


P.S. I’d appreciate any brief-ish readings on this topic! A ton of thanks in advance!