The company I work for is rewriting a legacy web application, and we are evaluating the best architecture for the job.
The business domain is quite simple: there are few entities, few relationships and simple business rules.
We expect a low number of concurrent writes, but a huge number of reads: the expected read workload is much greater than the expected write workload.
The web application lets a small number of editors create entities called blogs, which are composed of posts. There are different types of posts: textual posts, photos, links to YouTube videos, links to Twitter posts and so on.
The expected workflow is that an editor who is following a live sports event is in charge of editing a dedicated blog, which is basically a stream of posts for that event: the editor creates one post for each interesting fact of the event. The main concern is making each post available to the mobile and web clients as soon as possible, so that people all around the world can follow the event.
We have evaluated two possible architectures:
- CQRS architecture with two different databases, one for the command stack and one for the query stack
- simpler architecture with one domain model supporting both writes and reads and one database
The main advantage of the CQRS approach is the ability to distribute content to the clients in an optimized way, using dedicated read models so that reads from the read model database are simple and fast. This way the read and write sides can be scaled independently, exploiting the difference between the write and read workloads highlighted above.
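To make the CQRS option concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: `WriteStore`, `ReadStore`, `CommandStack` and `project_pending` are illustrative names, plain dictionaries stand in for the two databases, and an in-process queue stands in for the messaging system.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WriteStore:
    """Stand-in for the command-side database."""
    posts: dict = field(default_factory=dict)


@dataclass
class ReadStore:
    """Stand-in for the query-side database: one ready-to-serve feed per blog."""
    feeds: dict = field(default_factory=dict)


class CommandStack:
    def __init__(self, store, outbox):
        self.store = store
        self.outbox = outbox  # events waiting to be projected onto the read model

    def add_post(self, blog_id, post_id, body):
        # Persist on the write side, then publish an event for the read side.
        self.store.posts[post_id] = {"blog_id": blog_id, "body": body}
        self.outbox.append(
            {"type": "PostAdded", "blog_id": blog_id, "post_id": post_id, "body": body}
        )


def project_pending(outbox, read_store):
    """Apply queued events to the read model (asynchronous in a real system)."""
    while outbox:
        event = outbox.popleft()
        if event["type"] == "PostAdded":
            feed = read_store.feeds.setdefault(event["blog_id"], [])
            feed.insert(0, {"id": event["post_id"], "body": event["body"]})  # newest first


def get_feed(read_store, blog_id):
    """Query side: a single lookup, no joins or business rules."""
    return read_store.feeds.get(blog_id, [])
```

Until `project_pending` runs, `get_feed` still returns the old feed. That gap is the eventual consistency the editors' UI would have to deal with.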
The main advantage of the single domain model approach is a much simpler architecture. In this scenario the commands from the editors are processed synchronously, and when a command is successfully processed the editor knows for sure that their work is saved in the database and available to clients. There is no eventual consistency between a write model and a read model, no need to handle the asynchronous update of the read model from the perspective of the backoffice UI users (the editors), and no messaging system involved in mission-critical tasks.
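For comparison, this is roughly how I picture the single-model option. It is a sketch only: `Blog` and `BlogService` are hypothetical names, and a dictionary stands in for the single database.

```python
from dataclasses import dataclass, field


@dataclass
class Blog:
    blog_id: str
    posts: list = field(default_factory=list)


class BlogService:
    """One model and one store serve both writes and reads."""

    def __init__(self):
        self._blogs = {}  # stand-in for the single database

    def add_post(self, blog_id, body):
        # Business rule and write happen synchronously in the same call.
        if not body.strip():
            raise ValueError("post body must not be empty")
        blog = self._blogs.setdefault(blog_id, Blog(blog_id))
        blog.posts.append(body)
        return len(blog.posts)  # once this returns, the post is stored

    def get_posts(self, blog_id):
        blog = self._blogs.get(blog_id)
        return list(blog.posts) if blog else []
```

Here a read issued right after `add_post` returns already sees the new post, because both go through the same store.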
In my opinion, considering our requirements, the best way to go is a single domain model architecture, scaling the reads with an aggressive caching strategy. The idea is to use Redis as a cache in order to limit database access, and to update the cache layer each time we write something to the database using a streaming approach (we will probably use MongoDB, and our first idea is to exploit its change streams feature).
Do you think that a properly sized database and a wise caching strategy with Redis could be enough to handle our read needs? Or, conversely, in a scenario where the read load is much greater than the write load, is a CQRS architecture the best way to go (even at the cost of greater overall complexity)?
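The caching idea could look roughly like the sketch below. It makes several assumptions: `FakeRedis` is an in-memory stand-in that only mimics the `get`/`set`/`delete` calls of a Redis client, `apply_change` and `read_post` are hypothetical helpers, and post ids are plain strings so the documents can be serialized as JSON. In a real deployment the events would come from a MongoDB change stream (for example PyMongo's `collection.watch(full_document="updateLookup")`); their `operationType`, `documentKey` and `fullDocument` fields are the ones used here.

```python
import json


class FakeRedis:
    """In-memory stand-in exposing only the get/set/delete subset used below."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def cache_key(post_id):
    return f"post:{post_id}"


def apply_change(cache, change):
    """Mirror one change-stream event into the cache."""
    post_id = change["documentKey"]["_id"]
    if change["operationType"] == "delete":
        cache.delete(cache_key(post_id))
    elif change.get("fullDocument") is not None:
        cache.set(cache_key(post_id), json.dumps(change["fullDocument"]))


def read_post(cache, load_from_db, post_id):
    """Serve from the cache and fall back to the database on a miss."""
    cached = cache.get(cache_key(post_id))
    if cached is not None:
        return json.loads(cached)
    doc = load_from_db(post_id)  # hypothetical callable that queries the database
    if doc is not None:
        cache.set(cache_key(post_id), json.dumps(doc))
    return doc
```

With this shape the cache is refreshed by the stream consumer rather than by the request that performed the write, so there is still a short delay between the database write and the cache update.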