Most databases today have a write-ahead log (WAL, with undo/redo records), but why do databases need one?
Take a simple RocksDB deployment as an example. I would imagine a storage/database cluster has some redundancy scheme (erasure coding or replication) for fault tolerance, so I don't see why the storage-engine level needs its own fault tolerance as well. For example, when one server fails temporarily (or permanently), I would expect recovery to pull data from the other servers, since new data will have been written while the server was down and the local log alone cannot bring it fully up to date. Do I have a misunderstanding somewhere?
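To make my mental model of the single-node side concrete: here is a minimal toy sketch (not RocksDB's actual implementation; class and file names are made up) of what a WAL buys one node on restart, namely recovering writes that were only in memory when the process died:

```python
import json
import os
import tempfile


class TinyKV:
    """Toy key-value store: writes land in an in-memory dict,
    but each write is first appended to a write-ahead log (WAL)."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.data = {}
        # Recovery: replay the WAL to rebuild in-memory state after a crash.
        if os.path.exists(wal_path):
            with open(wal_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Durability: append to the WAL *before* applying in memory,
        # so a crash after this point loses nothing.
        with open(self.wal_path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value


# Simulated crash: the process state is discarded, but the WAL on
# disk survives, and a fresh instance recovers from it on startup.
wal = os.path.join(tempfile.mkdtemp(), "wal.log")
db = TinyKV(wal)
db.put("x", 1)
del db                      # "crash": all in-memory state is gone
recovered = TinyKV(wal)     # restart: replay the WAL
print(recovered.data["x"])  # → 1
```

My question is essentially whether this local replay still matters when the same write also exists on other replicas in the cluster.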
Not sure if this is the right place for this question; if not, I'm happy to ask elsewhere. Thank you!