AWS RDS showing very high wait/synch/mutex/sql/ waits and EXPLAIN statements in Performance Insights


I’m running a CRON script that checks the database for work and executes anything that needs to be done. It does this across ~500 customers per minute. We’re on AWS RDS with a 16 vCPU instance which, until recently, has been more than enough to keep up (normally running under 20% CPU).
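To give a sense of the workload, each run boils down to a polling query per customer along these lines (a simplified sketch with made-up table and column names, not our actual schema):

```sql
-- Simplified illustration of the per-customer "check for work" query
-- the CRON job issues roughly every minute (hypothetical schema).
SELECT id, task_type, payload
FROM   pending_jobs
WHERE  customer_id = 42        -- repeated for each of the ~500 customers
  AND  status = 'pending'
ORDER  BY created_at
LIMIT  10;
```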

This weekend we updated customers to the latest version of the code and implemented some tooling, and since then we’ve started seeing these huge waits:

[screenshot: Performance Insights wait graph showing the wait/synch/mutex/sql/ spike]

Further, I’m seeing that about half of our busiest queries are EXPLAIN statements, as illustrated here:

[screenshot: Performance Insights top queries, with EXPLAIN statements among the busiest]

Nowhere in our code base do we run an "EXPLAIN" (though we are using AWS RDS Performance Insights, ProxySQL, and New Relic for monitoring). I also noticed that over the past week our number of DB connections, previously baselined around 10, has climbed to closer to 90.
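In case it helps narrow things down, this is roughly what I’ve been running to attribute the EXPLAINs and the extra connections to a user/host (assuming performance_schema is enabled; the events_statements_history_long consumer may need to be enabled first):

```sql
-- Which users/hosts have recently issued EXPLAIN statements?
SELECT t.PROCESSLIST_USER,
       t.PROCESSLIST_HOST,
       COUNT(*) AS explain_count
FROM   performance_schema.events_statements_history_long AS h
JOIN   performance_schema.threads AS t
       ON t.THREAD_ID = h.THREAD_ID
WHERE  h.SQL_TEXT LIKE 'EXPLAIN%'
GROUP  BY t.PROCESSLIST_USER, t.PROCESSLIST_HOST
ORDER  BY explain_count DESC;

-- Where are the ~90 open connections coming from?
SELECT USER,
       SUBSTRING_INDEX(HOST, ':', 1) AS client_host,
       COUNT(*) AS connections
FROM   information_schema.PROCESSLIST
GROUP  BY USER, client_host
ORDER  BY connections DESC;
```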

Any ideas on where I should be digging to find the cause of these waits and EXPLAIN statements, and whether they could also explain the large number of open connections?
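Since ProxySQL sits in front of the database, the other thing I plan to check is whether the EXPLAINs are actually flowing through the proxy, via its admin interface (assuming the query digest stats are enabled):

```sql
-- Run against the ProxySQL admin interface (default port 6032),
-- assuming query digest stats are being collected.
SELECT hostgroup, username, digest_text, count_star
FROM   stats_mysql_query_digest
WHERE  digest_text LIKE 'EXPLAIN%'
ORDER  BY count_star DESC
LIMIT  20;
```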