We have a 4-server setup with 1 master and 3 daisy-chained slaves, arranged as follows:
A (master) -> B (slave) -> C (slave) -> D (slave)
(Servers B, C, and D are all running with log-slave-updates enabled.)
In normal operation everything works as expected: if we add new data to A, it shows up quickly on B, C, and D.
Now we want to create a failure scenario where we shut down A and make B the new master:
B (master) -> C (slave) -> D (slave)
It seems like what we want to do is fairly simple: switch B from slave to master.
We are trying to follow the documentation "Switching Sources During Failover" https://dev.mysql.com/doc/refman/8.0/en/replication-solutions-switch.html
The doc says: "On the replica Replica 1 being promoted to become the source, issue STOP REPLICA | SLAVE and RESET MASTER."
So if we're reading this correctly, to switch B from slave to master all we have to do is run these two statements on B:
STOP SLAVE;
RESET MASTER;
Running "STOP SLAVE" causes no issues, but running "RESET MASTER" breaks replication to the downstream slaves C and D. This is the error on C:
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from position > file size'
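If we understand the error correctly, C is still requesting B's old binary log coordinates after B's binary logs have been reset. A minimal sketch of how the mismatch can be checked, assuming the MySQL 8.0 statement names (newer releases rename these to the SOURCE/REPLICA forms) and the mysql command-line client for the \G terminator:

```sql
-- On B (the promoted server): show the current binary log file and position.
-- RESET MASTER deletes the existing binary log files and starts a new one,
-- so the position reported here restarts near the beginning of that file.
SHOW MASTER STATUS;

-- On C: show the coordinates the replication I/O thread is still requesting.
-- Compare Master_Log_File and Read_Master_Log_Pos with the output from B.
SHOW SLAVE STATUS\G
```

If Read_Master_Log_Pos on C is larger than the position B reports for the same file name, that would match the "position > file size" message.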
So what is the point of "RESET MASTER", and why does it break the chain? Is there any harm in omitting it? And how does one properly do a failover in MySQL chain replication?