What motivates the RAM model?

Most of today’s algorithm analysis seems to be done in models with constant-time random access, such as the word RAM. I don’t know much about physics, but popular sources say there are limits both on how much information can be stored per unit volume and on how fast information can travel. If that’s right, a RAM doesn’t seem physically realizable: only about r^3 bits can fit within distance r of the processor, so in an N-bit memory some bits must sit at distance on the order of N^(1/3), and a single access can’t take constant time once N is large. Modern computers, with their several levels of cache, also don’t behave much like a RAM: the cost of a random access depends heavily on how large a working set you touch (the sketch below makes this concrete). So why should theoretical algorithms be set in the RAM model?
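To illustrate the cache point, here is a minimal pointer-chasing sketch in C (my own illustration, not taken from anywhere): it builds a random single-cycle permutation so that each load depends on the previous one, then times the chain for growing working sets. On a typical machine the nanoseconds per access step up as the array outgrows each cache level, which is exactly the non-RAM-like behavior I mean.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    srand(1); /* fixed seed so runs are comparable */
    for (size_t n = 1u << 10; n <= 1u << 26; n <<= 2) {
        size_t *next = malloc(n * sizeof *next);
        if (!next) break;
        /* Sattolo's algorithm: a random permutation that is a single
           n-cycle, so the chase visits every slot and each load depends
           on the previous one (this defeats hardware prefetching). */
        for (size_t i = 0; i < n; i++) next[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i; /* j < i keeps it one cycle */
            size_t t = next[i]; next[i] = next[j]; next[j] = t;
        }
        size_t steps = 1u << 24, p = 0;
        clock_t start = clock();
        for (size_t s = 0; s < steps; s++) p = next[p]; /* timed chase */
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
        /* printing p keeps the compiler from deleting the loop */
        printf("%9zu KiB: %6.2f ns/access (p=%zu)\n",
               n * sizeof *next / 1024, secs / (double)steps * 1e9, p);
        free(next);
    }
    return 0;
}
```

Compiled with something like `cc -O2 chase.c`, I would expect (exact numbers vary a lot by CPU) around 1 ns per access while the array fits in L1, rising by roughly an order of magnitude by the time it spills out of the last cache level into main memory.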