SQL DB corruption recovery procedure

I'd like to have a discussion on the topic of database corruption. We run DBCC CHECKDB weekly, and most of the time the SQL Agent job goes through without errors. When it does report errors, though, that usually means a big headache for any DBA.

The recommended approach is to check whether the corruption can be repaired without data loss, and otherwise restore from the last good backup.
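As I understand it, the no-data-loss repair route looks something like this (a sketch only, not something I've run in production; YourDb is a placeholder):

    -- Repair requires single-user mode; REPAIR_REBUILD fixes only errors
    -- that can be repaired without losing data
    ALTER DATABASE YourDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DBCC CHECKDB (N'YourDb', REPAIR_REBUILD);
    -- Last resort only, since it may delete data to restore consistency:
    -- DBCC CHECKDB (N'YourDb', REPAIR_ALLOW_DATA_LOSS);
    ALTER DATABASE YourDb SET MULTI_USER;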

My question is about the procedure for restoring from the last good copy.

Our backup strategy is a weekly full backup on Sunday, a differential backup daily, and transaction log backups hourly.
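To make it concrete, this is how I'd list the backup chain from msdb (a sketch; YourDb is a placeholder):

    -- type: D = full, I = differential, L = transaction log
    SELECT bs.backup_finish_date, bs.type, bmf.physical_device_name
    FROM msdb.dbo.backupset AS bs
    JOIN msdb.dbo.backupmediafamily AS bmf
        ON bs.media_set_id = bmf.media_set_id
    WHERE bs.database_name = N'YourDb'
    ORDER BY bs.backup_finish_date DESC;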

Say the weekly integrity check indicates errors:

How do I determine the last good backup? And once the last good backup is determined (say the corruption is found on Wednesday), should I use last week's full backup, plus Tuesday's differential backup, plus all transaction log backups taken after Tuesday's differential up to the current time to restore the database? Also, should I use the REPLACE option?
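In other words, is the sequence below what I should be running (a sketch; the file names are placeholders)?

    -- Sunday's full backup; REPLACE overwrites the existing (corrupt) database
    RESTORE DATABASE YourDb FROM DISK = N'D:\backup\YourDb_full_sun.bak'
        WITH NORECOVERY, REPLACE;
    -- Tuesday's differential
    RESTORE DATABASE YourDb FROM DISK = N'D:\backup\YourDb_diff_tue.bak'
        WITH NORECOVERY;
    -- Every hourly log backup taken after Tuesday's differential, in order
    RESTORE LOG YourDb FROM DISK = N'D:\backup\YourDb_log_001.trn'
        WITH NORECOVERY;
    -- ...repeat for each subsequent log backup; the last one recovers:
    RESTORE LOG YourDb FROM DISK = N'D:\backup\YourDb_log_nnn.trn'
        WITH RECOVERY;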

Damage Output of Lightning Recovery (Tome of Battle Maneuver)

Lightning Recovery is a Tome of Battle maneuver that allows a reroll on an attack d20 if it misses, along with an additional +2 on the second attempt. How can I calculate the value of the damage added per round? Here are some figures we can use:

Rapier Attack: +10/+5
Base Damage: 1d6+2
Extra feat damage: 2d6
Critical: 18/x2
Battle Ardor (warblade ability): +2 to confirm critical threat
Target AC: 20

How much damage per round does the use of the Lightning Recovery maneuver add? I'm guessing the result is a spread of values rather than a single number, given the maneuver's resemblance to 5e's 'advantage' mechanic. Could that even be averaged?
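Here is my own rough expected-value attempt, assuming the maneuver rerolls a single missed first attack each round (ignoring the +5 iterative attack), that the extra 2d6 does not multiply on a critical, and that the reroll can still threaten a critical:

\[
\begin{aligned}
P(\text{miss at } +10 \text{ vs AC } 20) &= 9/20 = 0.45\\
P(\text{reroll at } +12 \text{ hits}) &= 13/20 = 0.65\\
E[\text{damage on a hit}] &= (3.5 + 2) + 7 = 12.5\\
E[\text{crit bonus per reroll}] &= \tfrac{3}{20} \cdot \tfrac{15}{20} \cdot 5.5 \approx 0.62 \quad \text{(threat on 18 to 20, confirm at } +14\text{)}\\
E[\text{added damage per round}] &\approx 0.45 \times (0.65 \times 12.5 + 0.62) \approx 3.9
\end{aligned}
\]

Is that the right way to frame it, or is the full spread the more useful answer?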

Recovery with PostgreSQL logical replication

I'm using PostgreSQL 13 on Debian 10.6 and learning about logical replication.

I've set up logical replication with one publisher and one subscriber for a single table. I'm wondering what my options are for recovering data (or rolling back) when, for example, someone accidentally does something on the publisher side like updating all the rows in the table with the wrong value, or even deleting everything from the table. With logical replication these unintentional changes will of course be applied to the subscriber as well.
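To make the setup concrete, this is roughly what I have (mypub, mysub, mytable, and the connection string are placeholders):

    -- On the publisher:
    CREATE PUBLICATION mypub FOR TABLE mytable;

    -- On the subscriber:
    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=pubhost dbname=mydb user=repuser'
        PUBLICATION mypub;

    -- An accident on the publisher, e.g.:
    UPDATE mytable SET val = 'wrong';  -- replicated to the subscriber on commit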

I've searched online relentlessly but have had no luck finding out what my options are. I read about PITR, but I'm thinking that's suited more to physical replication, whereas I want to test rolling back changes on one specific database on a server.

Database system is in recovery mode: Segmentation fault

PostgreSQL version: 12.4

Server: RHEL 7.9

The Postgres server went into recovery mode for a minute and then came back to normal.

Looking into the logs, I found this error just before it went into recovery mode:

    db=,user= LOG:  server process (PID 4321) was terminated by signal 11: Segmentation fault
    db=,user= DETAIL:  Failed process was running:
        select distinct some_col.some_state_id,
            case when some_col.some_state_id = 99 then 'CENTRAL' else state.state_name_english end as stateNm,
            case when some_col.some_state_id = 99 then 'AAA' else state.state_name_english end
        from xema.table_name_definition_mast defn_mast
        left join othe.get_state_list_fn() state on some_col.some_state_id = state.state_code
        where defn_mast.third_srvc_launch = 'Y' and some_col.some_state_id < 100
        order by 3

I'm not sure whether this issue will come up again. Is this specific to the query, or a hardware problem? I'm stuck.

SQL Server log file in “Simple” recovery model

Can someone please shed some light on log growth in the Simple recovery model? If I understand correctly, the transaction log can grow even in Simple recovery. So if I have open transactions (updates, inserts, deletes, index rebuilds, etc.), they start to reuse inactive VLFs, and if none are available the log grows according to its autogrowth configuration until it reaches the configured maximum size or runs out of disk space? And can active VLFs be marked as inactive by a checkpoint and/or a full backup?
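For reference, this is how I've been checking what is holding the log up (a sketch; YourDb is a placeholder):

    -- Why can't the log be truncated right now?
    -- (e.g. ACTIVE_TRANSACTION, ACTIVE_BACKUP_OR_RESTORE, NOTHING)
    SELECT name, log_reuse_wait_desc
    FROM sys.databases
    WHERE name = N'YourDb';

    -- Per-VLF status (SQL Server 2016 SP2 or later);
    -- vlf_active = 1 means the VLF cannot be reused yet
    SELECT vlf_active, vlf_status, vlf_size_mb
    FROM sys.dm_db_log_info(DB_ID(N'YourDb'));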

In short, I was asked to change all databases to the Simple recovery model and to issue DBCC SHRINKFILE (N'LogName', 0, TRUNCATEONLY) every hour. I received error 9002, and the full database backup and the shrink failed. I guess that eventually the transactions completed and the checkpoint, full backup, and DBCC SHRINKFILE were able to complete.

pgpool-II and Postgres Docker image: automated failover and online recovery via RSA key

I've been following this pgpool-II documentation: https://www.pgpool.net/docs/latest/en/html/example-cluster.html

I'm having a hard time setting up the RSA keys on my Postgres streaming cluster, which is built from the official Docker image: https://hub.docker.com/_/postgres

I was able to get streaming replication working; now I'm at the part where I set up failover.

Part of the documentation says:

To use the automated failover and online recovery of Pgpool-II, the settings that allow passwordless SSH to all backend servers between Pgpool-II execution user (default root user) and postgres user and between postgres user and postgres user are necessary. Execute the following command on all servers to set up passwordless SSH. The generated key file name is id_rsa_pgpool.  
    [all servers]# cd ~/.ssh
    [all servers]# ssh-keygen -t rsa -f id_rsa_pgpool
    [all servers]# ssh-copy-id -i id_rsa_pgpool.pub postgres@server1
    [all servers]# ssh-copy-id -i id_rsa_pgpool.pub postgres@server2
    [all servers]# ssh-copy-id -i id_rsa_pgpool.pub postgres@server3

    [all servers]# su - postgres
    [all servers]$ cd ~/.ssh
    [all servers]$ ssh-keygen -t rsa -f id_rsa_pgpool
    [all servers]$ ssh-copy-id -i id_rsa_pgpool.pub postgres@server1
    [all servers]$ ssh-copy-id -i id_rsa_pgpool.pub postgres@server2
    [all servers]$ ssh-copy-id -i id_rsa_pgpool.pub postgres@server3

Is it possible to set this up inside a container based on Postgres's official image? I'd like to get an idea of how to do it from some samples or an existing solution.

Moreover, since I can't do the RSA setup at the moment, I decided to create a script on my pgpool server that runs a psql command against the new master:

    #!/bin/bash
    # This script is run by failover_command.

    set -e

    # Special values:
    #   %d = failed node id
    #   %h = failed node hostname
    #   %p = failed node port number
    #   %D = failed node database cluster path
    #   %m = new master node id
    #   %H = new master node hostname
    #   %M = old master node id
    #   %P = old primary node id
    #   %r = new master port number
    #   %R = new master database cluster path
    #   %N = old primary node hostname
    #   %S = old primary node port number
    #   %% = '%' character

    FAILED_NODE_ID="$1"
    FAILED_NODE_HOST="$2"
    FAILED_NODE_PORT="$3"
    FAILED_NODE_PGDATA="$4"
    NEW_MASTER_NODE_ID="$5"
    NEW_MASTER_NODE_HOST="$6"
    OLD_MASTER_NODE_ID="$7"
    OLD_PRIMARY_NODE_ID="$8"
    NEW_MASTER_NODE_PORT="$9"
    NEW_MASTER_NODE_PGDATA="${10}"
    OLD_PRIMARY_NODE_HOST="${11}"
    OLD_PRIMARY_NODE_PORT="${12}"

    #set -o xtrace
    #exec > >(logger -i -p local1.info) 2>&1

    new_master_host=$NEW_MASTER_NODE_HOST

    ## If there's no master node anymore, skip failover.
    if [ "$NEW_MASTER_NODE_ID" -lt 0 ]; then
        echo "All nodes are down. Skipping failover."
        exit 0
    fi

    ## Promote the standby node.
    echo "Primary node is down, promoting standby node ${NEW_MASTER_NODE_HOST}."

    PGPASSWORD=postgres psql -h "${NEW_MASTER_NODE_HOST}" -p 5432 -U postgres <<-EOSQL
        select pg_promote();
    EOSQL

    #logger -i -p local1.info failover.sh: end: new_master_node_id=$NEW_MASTER_NODE_ID started as the primary node
    #exit 0

The above script works if I simulate that my primary is down.

However, this is the log on my new primary:

    2020-10-07 20:25:31.924 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:25:31.924 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:25:32.939 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:25:32.939 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    2020-10-07 20:25:32.939 UTC [1165] WARNING:  archiving write-ahead log file "00000002.history" failed too many times, will try again later
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:26:33.003 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:26:33.003 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:26:34.012 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:26:34.012 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:26:35.026 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:26:35.026 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    2020-10-07 20:26:35.026 UTC [1165] WARNING:  archiving write-ahead log file "00000002.history" failed too many times, will try again later
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:27:35.096 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:27:35.096 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:27:36.110 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:27:36.110 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:27:37.123 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:27:37.123 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    2020-10-07 20:27:37.123 UTC [1165] WARNING:  archiving write-ahead log file "00000002.history" failed too many times, will try again later
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:28:37.177 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:28:37.177 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:28:38.221 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:28:38.221 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    cp: cannot create regular file '/archives/00000002.history': No such file or directory
    2020-10-07 20:28:39.230 UTC [1165] LOG:  archive command failed with exit code 1
    2020-10-07 20:28:39.230 UTC [1165] DETAIL:  The failed archive command was: cp pg_wal/00000002.history /archives/00000002.history
    2020-10-07 20:28:39.230 UTC [1165] WARNING:  archiving write-ahead log file "00000002.history" failed too many times, will try again later

It is still trying to archive the WAL history file.
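My guess is that the archiver failures above happen simply because /archives does not exist on the promoted node. Creating that directory at the OS level should fix it; alternatively, archive_command can be pointed at a path that does exist without a restart (a sketch, run as superuser; /some/existing/dir is a placeholder):

    ALTER SYSTEM SET archive_command = 'cp %p /some/existing/dir/%f';
    SELECT pg_reload_conf();  -- archive_command only needs a reload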

Moreover, my other standby is still looking for the old master:

    2020-10-07 20:29:07.818 UTC [1365] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:12.827 UTC [1367] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:17.832 UTC [1369] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:22.835 UTC [1371] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:27.826 UTC [1373] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:32.836 UTC [1375] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:37.836 UTC [1377] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:42.850 UTC [1379] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:47.857 UTC [1381] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known
    2020-10-07 20:29:52.855 UTC [1383] FATAL:  could not connect to the primary server: could not translate host name "pg-1" to address: Name or service not known

Dealing with this, I think, is more complicated than setting up the RSA keys so that I could use the existing failover_command script that pgpool provides.

Thanks for the response.

SQL Database stuck In Recovery state after restart

SQL Server was restarted by mistake. When it came back online, the database came up in an "In Recovery" state.

Checking the error log, it says: "Recovery of database 'DB1' (5) is 8% complete (approximately 27146 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required."

That means it will take roughly 8 hours to bring this 2 TB database online.
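While waiting, I'm polling the error log for the latest progress message (a sketch using the undocumented but widely used xp_readerrorlog; the parameters are log number, log type 1 = SQL Server error log, and two search strings):

    EXEC master.dbo.xp_readerrorlog 0, 1, N'Recovery of database', N'DB1';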

Is there any quick way to fix this? We didn't have any open transactions in the log files, so even if they are ignored there should be no impact.

We want to bring this database ONLINE quickly.

What’s the security risk in password recovery attempts

Over the last few days I've received multiple password recovery attempts for a WordPress user. The user didn't initiate these attempts.

I'm blocking the IPs on the server, but I don't see what the attacker's goal is. I checked the emails the user receives, and they contain a valid password reset link (so it's not a phishing attempt).

So I don't really understand what the attacker is trying to achieve with these password recovery requests. Or are they just probing for vulnerabilities on that page?