tl;dr I want to use temp tables across multiple cells in a Jupyter Notebook to save CPU time on our SQL Server instances.
I’m trying to modernize a bunch of the monitoring queries that I run daily as a DBA. We use a real monitoring tool for almost all of our server level stuff, but we’re a small shop, so monitoring the actual application logs falls on the DBA team as well (we’re trying to fix that). Currently we just have a pile of mostly undocumented stored procedures we run every morning, but I want something a little less arcane, so I am looking into Jupyter Notebooks in Azure SQL Data Studio.
One of our standard practices is take all of the logs from the past day and drop them into a temp table, filtering out all of the noise. After that we run a dozen or so aggregate queries on the filtered temp table to produce meaningful results. I want to do something like this:
Markdown description of the loading process, with details on available variables
T SQL statements to populate temp table(s)
Markdown description of next aggregate
T SQL to produce aggregate
The problem is that, it seems, each cell is run in an independent session, so the temp tables from cell 2 are all gone by the time I run any later cells (even if I use the “Run cells” button to run everything in order).
I could simply create staging tables in the user database and write my filtered logs there, but eventually I’d like to be able to pass off the notebooks to the dev teams and have them run the monitoring queries themselves. We don’t give write access on any prod reporting replicas, and it would not be feasible to create a separate schema which devs can write to (for several reasons, not the least of which being that I am nowhere near qualified to recreate tempdb in a user database).