With MySQL 8.0.24 set up for master-slave replication, if the master goes down, how can the slave be promoted to master? If promotion is possible, what are the exact steps?
If promotion is not possible, what mode can be used so that when the primary database goes down, another database that is synchronised with it in real time keeps operating normally and accepts both reads and writes?
I currently have MySQL master-master replication set up, with read_only enabled on Master2. There were a lot of sync issues, so I stopped replication from Master2 to Master1 by stopping the slave on Master1. Master1 is currently replicating to Master2 with no issues. Is this enough, or is there a better way to revert to master-slave replication? Should I run RESET SLAVE on Master1 to completely stop replication from Master2 to Master1?
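For reference, here is a rough sketch of the statements in question, not a verified procedure for this setup. The helper name `revert_to_one_way_statements` and its parameter are made up for illustration; it only builds the SQL text and does not connect to a server. As far as I know, plain `RESET SLAVE` keeps the connection parameters in memory, while `RESET SLAVE ALL` clears them. From MySQL 8.0.22 the `REPLICA` spelling is accepted as well.

```python
def revert_to_one_way_statements(use_replica_syntax: bool = False) -> list[str]:
    """Build the statements one might run on Master1 (hypothetical helper).

    Assumption: Master1 should no longer act as a replica of Master2 at all.
    """
    # MySQL 8.0.22+ accepts REPLICA in place of SLAVE in these statements.
    word = "REPLICA" if use_replica_syntax else "SLAVE"
    return [
        f"STOP {word};",         # stop the replication threads on Master1
        f"RESET {word} ALL;",    # drop relay logs and the stored connection settings
        f"SHOW {word} STATUS;",  # should return no rows once the configuration is gone
    ]


statements = revert_to_one_way_statements()
for stmt in statements:
    print(stmt)
```

Running the printed statements on Master1 would leave the Master1 to Master2 direction untouched, since that is configured on Master2.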
I’m learning master-slave replication with MySQL. I got it working and also used the Percona backup tool to restore a slave. I learnt from this project: https://github.com/vbabak/docker-mysql-master-slave
Now I wonder whether master-slave replication requires two separate MySQL instances, or whether it is possible to configure replication between two databases inside a single instance. I think it’s not possible.
The reason I ask is that I want to automate a failover-and-restore scenario. In my environment, infrastructure automation always starts a new MySQL instance on the default port, so today I cannot run two MySQL instances on the same host. It therefore looks like I need to create two VMs, one master and one slave, just to run the test, which is overkill for a test scenario and would be slow.
In the context of a distributed database, I’m trying to understand why 2PC (as described in e.g. https://www.cs.princeton.edu/courses/archive/fall16/cos418/docs/L6-2pc.pdf) is better than the following hypothetical protocol between a client, master, and slave:
- client tells master to commit
- master commits it
- master tells client the commit succeeded
- master tells slave to replicate the commit. If the slave fails, master keeps on trying until it succeeds and gets the slave caught up on all edits.
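To make the hypothetical protocol concrete, here is a minimal in-memory sketch of the four steps above. All names (`Master`, `Slave`, `failures_left`, `replicate_pending`) are invented for illustration, and the "network" is a direct method call. Durable logging and master failure are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    log: list = field(default_factory=list)
    failures_left: int = 0  # hypothetical knob: attempts that fail before the slave recovers

    def replicate(self, entry) -> bool:
        if self.failures_left > 0:
            self.failures_left -= 1
            return False  # temporary failure, i.e. "try again"
        self.log.append(entry)
        return True


@dataclass
class Master:
    slave: Slave
    log: list = field(default_factory=list)
    sent: int = 0  # number of log entries the slave has acknowledged

    def commit(self, entry) -> str:
        self.log.append(entry)  # master commits locally
        return "committed"      # client is told "success" before any replication

    def replicate_pending(self) -> int:
        """Retry until the slave has every entry; returns the number of attempts."""
        attempts = 0
        while self.sent < len(self.log):
            attempts += 1
            if self.slave.replicate(self.log[self.sent]):
                self.sent += 1
        return attempts


slave = Slave(failures_left=2)
master = Master(slave)
ack = master.commit("x=1")
lag_before = len(master.log) - len(slave.log)  # slave is behind at this point
attempts = master.replicate_pending()
```

In this sketch the client already holds its acknowledgement while `lag_before` is still 1, which is the window of temporary inconsistency the question refers to.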
This seems to me to satisfy the same properties as 2PC:
- Safety: If the master commits, the slave will also eventually commit. If the slave commits, the master must have committed first. I suppose an advantage of 2PC is that if a participant fails before starting the commit, the transaction is aborted everywhere instead of being committed only on the transaction coordinator (TC). However, in the proposed protocol, the commit on the master still eventually reaches the slave.
- Liveness: This protocol does not hang.
- Both rely on the master / TC durably recording the decision to commit. Both assume failed slaves / participants eventually wake up and catch up with the master / TC.
- Both fail if the master / TC goes down.
- In both protocols, it’s possible to have temporary inconsistencies where the master / TC has finalized a decision, but the slaves / participants haven’t yet committed.
It seems to me that the key theoretical difference in 2PC is that the participant (slave) can vote “no” to the commit, as opposed to merely temporarily failing. That would break the conclusion above where the slave eventually catches up. However, I don’t see why the slave would need to vote “no” in the first place. Given the assumption that the slave / participant does not permanently fail, it seems it should either vote “yes” or fail to respond. (Unlike the bank account example, I expect the slave to blindly replicate the master.)
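For comparison, here is a similarly simplified sketch of the 2PC decision path, showing where a "no" vote enters. The names (`Participant`, `can_commit`, `two_phase_commit`) are hypothetical, and the sketch leaves out durable logging, timeouts, and coordinator failure, so it is only an illustration of the voting step rather than a full implementation.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    def __init__(self, can_commit: bool = True):
        # Hypothetical flag standing in for a local check (e.g. a constraint).
        self.can_commit = can_commit
        self.prepared = None
        self.committed = []

    def prepare(self, txn) -> Vote:
        if not self.can_commit:
            return Vote.NO
        self.prepared = txn  # promise to commit if asked
        return Vote.YES

    def commit(self) -> None:
        self.committed.append(self.prepared)
        self.prepared = None

    def abort(self) -> None:
        self.prepared = None


def two_phase_commit(txn, participants) -> str:
    # Phase 1: collect votes from every participant.
    votes = [p.prepare(txn) for p in participants]
    # Phase 2: commit only if all voted yes, otherwise abort everywhere.
    if all(v is Vote.YES for v in votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"


a, b = Participant(), Participant()
outcome_ok = two_phase_commit("t1", [a, b])

c, d = Participant(), Participant(can_commit=False)
outcome_no = two_phase_commit("t2", [c, d])
```

Here the outcome reported to the caller is decided only after every participant has voted, whereas in the naive protocol the client is acknowledged before the slave has seen the commit at all.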
Distilling all this down, it seems to me that 2PC’s assumption that participants don’t permanently fail makes it unnecessary to give participants a chance to vote “no” in the “prepare” phase.
What am I missing here? Presumably there’s some advantage to 2PC over the above that I’m not understanding, since 2PC is actually used to build distributed databases.
- Am I incorrect in concluding that a slave shouldn’t need to explicitly vote “no”, as opposed to temporarily failing? (I’m only talking about the data replication use case, rather than the bank account example.)
- Given the same assumptions as 2PC, and assuming slaves only say “success” or “try again”, is there some guarantee 2PC offers that the naive replication above doesn’t?
For the purpose of the question, I’d like to ignore practicalities, unless they’re critical to the answer. In particular, I’d like to ignore things that could be interpreted as being disallowed by the no-permanent-failure assumption, such as disk full, slave mis-configured, slave corrupt, operator error, buggy software, etc.