There are 60 million shipments per day. Each shipment has about 50 metrics to be calculated. Each metric is calculated from a particular type of event (let's say event_1 has the information required to calculate metric_1, and so on). All the events are independent of each other apart from one dependency: a single event (let's say event_1) carries vital information required to process each of the other events.
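For a sense of scale, here is a rough back-of-the-envelope calculation. It assumes one event per metric and a perfectly even arrival rate over the day; neither is guaranteed, so treat the result as an average, not a peak.

```python
# Rough average load, assuming one event per metric and uniform arrival.
SHIPMENTS_PER_DAY = 60_000_000
METRICS_PER_SHIPMENT = 50          # "about 50", so this is approximate
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = SHIPMENTS_PER_DAY * METRICS_PER_SHIPMENT
events_per_second = events_per_day / SECONDS_PER_DAY

print(f"{events_per_day:,} events/day, "
      f"about {events_per_second:,.0f} events/second on average")
# prints: 3,000,000,000 events/day, about 34,722 events/second on average
```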
The current design:
(In Order) Scenario 1:
event_1 arrives first. We calculate metric_1 and store the vital information required to process the other events in DynamoDB. The other events (event_2, …) arrive and are processed using the information read from DynamoDB.
(Out of Order) Scenario 2:
event_3 arrives first. The system checks DynamoDB for the required information and does not find it, so it places the event in the dead-letter queue to be retried after a period of time. Once event_1 arrives and is processed, the other events go through.
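To make the two scenarios concrete, here is a minimal in-memory sketch of the flow. It is not the real implementation: the dict stands in for the DynamoDB table, the deque for the dead-letter queue, and the list for Redshift. The names Event, Pipeline, MissingBaseInfo, handle and redrive are all made up for illustration, and the metric calculation itself is reduced to recording a metric name.

```python
from collections import deque
from dataclasses import dataclass, field

BASE_EVENT_TYPE = "event_1"  # the event every other event depends on


@dataclass
class Event:
    shipment_id: str
    event_type: str
    payload: dict


class MissingBaseInfo(Exception):
    """The base event for this shipment has not been processed yet."""


@dataclass
class Pipeline:
    base_info: dict = field(default_factory=dict)       # stand-in for DynamoDB
    retry_queue: deque = field(default_factory=deque)   # stand-in for the DLQ
    metrics: list = field(default_factory=list)         # stand-in for Redshift

    def process(self, event: Event) -> None:
        if event.event_type == BASE_EVENT_TYPE:
            # Store the vital information first, then record metric_1.
            self.base_info[event.shipment_id] = event.payload
            self.metrics.append((event.shipment_id, "metric_1"))
            return
        if event.shipment_id not in self.base_info:
            raise MissingBaseInfo(event.shipment_id)
        # Placeholder for the real calculation: event_N yields metric_N.
        metric = event.event_type.replace("event_", "metric_")
        self.metrics.append((event.shipment_id, metric))

    def handle(self, event: Event) -> bool:
        """Process an event; park it for retry if the base info is missing."""
        try:
            self.process(event)
            return True
        except MissingBaseInfo:
            self.retry_queue.append(event)
            return False

    def redrive(self) -> int:
        """Retry each parked event once; return how many succeeded."""
        succeeded = 0
        for _ in range(len(self.retry_queue)):
            if self.handle(self.retry_queue.popleft()):
                succeeded += 1
        return succeeded


# Scenario 2: event_3 arrives before event_1.
pipeline = Pipeline()
first_try = pipeline.handle(Event("shipment-1", "event_3", {}))
parked = len(pipeline.retry_queue)
pipeline.handle(Event("shipment-1", BASE_EVENT_TYPE, {"origin": "X"}))
retried = pipeline.redrive()
```

In this sketch the retry only succeeds if redrive happens to run after event_1 has been processed, which is the same timing dependency the real retry delay has.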
Is using a data store and a retry mechanism the right approach to resolving the dependency on the base event?
Are there better approaches or patterns for solving the event dependency problem?
Additional context: Although I believe this information is irrelevant, I am including it in case it helps. Source of events: SNS topics. Event processing: SNS -> SQS -> Lambda. Data store: DynamoDB. Metrics are stored in Redshift.