
In general, most load distribution strategies for your applications fall into one of these two models:
1- Push Method:
In the push model, you push the load to a server or node through an external service such as a network load balancer (Amazon ELB) or a DNS load balancer (Amazon Route 53).
The user sends requests to the load balancer or DNS, and the load balancer distributes the traffic to the next available node based on a distribution algorithm (round robin, load, performance, latency, etc.).
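The round-robin strategy mentioned above can be sketched in a few lines of Python. This is a minimal in-process illustration, not how a real load balancer is built; the node addresses are made up for the example:

```python
from itertools import cycle

# Hypothetical backend nodes; a real load balancer would also track health.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
rotation = cycle(nodes)  # round robin: rotate through the nodes in order

def route_request(request_id):
    """Pick the next node in round-robin order for this request."""
    return next(rotation)

# Six requests cycle through the three nodes twice.
assignments = [route_request(i) for i in range(6)]
print(assignments)
```

Each incoming request simply gets the next node in the rotation; load- or latency-based algorithms would instead pick the node with the lowest current load or response time.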
When using a DNS load balancer, be mindful of client-side DNS caching, which is usually out of your control. It can direct users to a server/node that is unavailable at the time of the request, causing a service timeout; those users will not receive an updated DNS record (reflecting the currently available nodes) until their local cache expires.
This is why DNS load balancing on its own is not very accurate unless it is combined with live server health checks (as Amazon Route 53 provides).
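The idea behind health-checked DNS can be sketched as filtering records by a health table before answering a query. The health data here is hard-coded for illustration; a service like Route 53 maintains it through continuous health checks:

```python
# Hypothetical health table, normally refreshed by live health checks.
health = {"10.0.0.1": True, "10.0.0.2": False, "10.0.0.3": True}

def resolve(health_table):
    """Return only the records for nodes that currently pass their health check."""
    healthy = [node for node, ok in health_table.items() if ok]
    if not healthy:
        raise RuntimeError("no healthy nodes to return")
    return healthy

print(resolve(health))  # the failing node 10.0.0.2 is never handed out
```

Note that even with this filtering, a client holding a cached answer from before a node failed can still hit the dead node until its cache TTL expires.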
2- Pull Method:
In the pull model, there is no need for any external load-balancing service. This type of distribution is better suited to back-end data-processing services working through a particular set of tasks.
When you need to process an asynchronous, event-driven workload, you put each task as a message in a queue (Amazon SQS), and multiple processing nodes can pull these messages successively and process them according to their own load. Your workload can also be a stream of data, rather than discrete messages, coming from a data streaming service (Amazon Kinesis).
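The pull model can be sketched with an in-process queue standing in for a distributed one such as Amazon SQS. Nothing assigns work to the workers; each one pulls the next message whenever it has spare capacity (the names and task payloads are invented for the example):

```python
import queue
import threading

# In-process stand-in for a distributed message queue such as Amazon SQS.
tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def worker(name):
    """A processing node: it pulls messages at its own pace, nothing pushes to it."""
    while True:
        try:
            msg = tasks.get(timeout=0.5)
        except queue.Empty:
            return  # queue drained, worker exits
        with results_lock:
            results.append((name, msg))  # "process" the message
        tasks.task_done()

# Enqueue the workload as messages.
for i in range(10):
    tasks.put(f"task-{i}")

# Three nodes pull from the same queue concurrently.
threads = [threading.Thread(target=worker, args=(f"node-{n}",)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"processed {len(results)} messages")
```

Because each node only takes a message when it is ready, load distribution falls out naturally: a slow or busy node simply pulls fewer messages, with no balancer involved.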