How Elastic Load Balancing Works
A load balancer accepts incoming traffic from clients and routes requests to its registered EC2 instances in one or more Availability Zones.
- You configure your load balancer to accept incoming traffic by specifying one or more listeners.
- A listener is a process that checks for connection requests.
- Each listener is configured with two protocol and port pairs: one for connections from clients to the load balancer, and one for connections from the load balancer to the instances.
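As an illustration of the two protocol and port pairs, the following is a minimal sketch of a listener as a plain data structure. The `Listener` class, its field names, and the `validate` helper are hypothetical and are not part of any AWS SDK. The set of accepted protocols is also an assumption made for this sketch.

```python
from dataclasses import dataclass

# Assumed protocol set for this sketch; adjust it to match your load balancer.
VALID_PROTOCOLS = {"HTTP", "HTTPS", "TCP", "SSL"}


@dataclass(frozen=True)
class Listener:
    """Hypothetical model of one listener (not an AWS SDK type)."""
    protocol: str             # protocol for client -> load balancer
    load_balancer_port: int   # port for client -> load balancer
    instance_protocol: str    # protocol for load balancer -> instance
    instance_port: int        # port for load balancer -> instance


def validate(listener: Listener) -> None:
    """Raise ValueError if a protocol or port is outside the assumed ranges."""
    for proto in (listener.protocol, listener.instance_protocol):
        if proto not in VALID_PROTOCOLS:
            raise ValueError(f"unsupported protocol: {proto}")
    for port in (listener.load_balancer_port, listener.instance_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")


# Clients connect over HTTPS on 443; the load balancer forwards HTTP to 8080.
https_listener = Listener("HTTPS", 443, "HTTP", 8080)
validate(https_listener)
```

The two pairs are independent, so the protocol and port that clients use can differ from the ones the instances listen on.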
The load balancer also monitors the health of its registered instances and ensures that it routes traffic only to healthy instances.
- When the load balancer detects an unhealthy instance, it stops routing traffic to that instance. It resumes routing traffic to the instance once it detects that the instance is healthy again.
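The stop-and-resume behavior can be sketched as a small state tracker. This is an illustrative model only, not the service's implementation: the `InstanceHealth` class, the `routable` helper, and the idea of requiring several consecutive check results before changing state (with the threshold values 3 and 2) are assumptions introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass
class InstanceHealth:
    """Hypothetical per-instance health state."""
    healthy: bool = True
    streak: int = 0  # consecutive check results that contradict the current state

    def record(self, check_passed: bool,
               healthy_threshold: int = 3,
               unhealthy_threshold: int = 2) -> bool:
        """Record one health check result and return the resulting state."""
        if check_passed == self.healthy:
            # The result agrees with the current state, so reset the streak.
            self.streak = 0
            return self.healthy
        self.streak += 1
        # Assumed thresholds: recovering takes more checks than failing.
        limit = unhealthy_threshold if self.healthy else healthy_threshold
        if self.streak >= limit:
            self.healthy = check_passed
            self.streak = 0
        return self.healthy


def routable(instances: dict) -> list:
    """Return the names of the instances that should receive traffic."""
    return [name for name, state in instances.items() if state.healthy]
```

For example, with the assumed thresholds, two consecutive failed checks remove an instance from `routable`, and three consecutive passed checks add it back.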
When you attach an Availability Zone to your load balancer, Elastic Load Balancing creates a load balancer node in that Availability Zone. The node forwards traffic to the healthy registered instances in its Availability Zone.
- We recommend that you configure your load balancer across multiple Availability Zones. If one Availability Zone becomes unavailable or has no healthy instances, the load balancer can route traffic to the healthy registered instances in another Availability Zone.
Request Routing
Before a client sends a request to your load balancer, it resolves the load balancer's domain name using a Domain Name System (DNS) server.
- The DNS entry is controlled by Amazon, because your load balancers are in the amazonaws.com domain.
- The Amazon DNS servers return one or more IP addresses to the client. These are the IP addresses of the load balancer nodes for your load balancer.
- As traffic to your application changes over time, Elastic Load Balancing scales your load balancer and updates the DNS entry.
- Note that the DNS entry also specifies the time-to-live (TTL) as 60 seconds, which ensures that the IP addresses can be remapped quickly in response to changing traffic.
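The DNS lookup described above can be sketched from the client side with the Python standard library. The `resolve_node_ips` function and its `resolver` parameter are hypothetical names introduced for this sketch, and the host name in the usage comment is a placeholder. Note that `socket.getaddrinfo` does not expose the TTL, so this sketch assumes the client simply resolves the name again for new connections instead of caching the addresses indefinitely.

```python
import socket


def resolve_node_ips(hostname, port=80, resolver=socket.getaddrinfo):
    """Return the distinct IP addresses that DNS currently lists for hostname.

    `resolver` is injectable so the function can be exercised without
    network access; it defaults to the standard library resolver.
    """
    results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr),
    # and sockaddr[0] is the IP address.
    return sorted({sockaddr[0] for *_, sockaddr in results})


# Usage (placeholder host name, requires network access):
#   ips = resolve_node_ips("my-load-balancer.example.elb.amazonaws.com")
```

Because Elastic Load Balancing can update the DNS entry as it scales, the list returned for the same name may change between calls.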
The client uses DNS round robin to determine which IP address to use to send the request to the load balancer. The load balancer node that receives the request then uses a routing algorithm to select a healthy instance: round robin for TCP listeners, and least outstanding requests (which favors the instances with the fewest outstanding requests) for HTTP and HTTPS listeners.
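The two routing algorithms can be sketched in a few lines. This is a simplified illustration, not the load balancer's actual implementation: the function names are hypothetical, the instance IDs are made up, and the sketch assumes that unhealthy instances have already been filtered out.

```python
from itertools import cycle


def round_robin(instances):
    """Return an iterator that yields the instances in a fixed rotation."""
    return cycle(instances)


def least_outstanding(outstanding):
    """Return the instance with the fewest outstanding requests.

    `outstanding` maps an instance ID to its current count of in-flight
    requests. Ties go to the instance that appears first in the mapping.
    """
    return min(outstanding, key=outstanding.get)


# Round robin: each new connection goes to the next instance in turn.
rotation = round_robin(["i-a", "i-b", "i-c"])
first_four = [next(rotation) for _ in range(4)]  # i-a, i-b, i-c, i-a

# Least outstanding requests: i-b has the fewest in-flight requests.
choice = least_outstanding({"i-a": 4, "i-b": 1, "i-c": 2})  # i-b
```

Round robin ignores how busy each instance is, while least outstanding requests adapts when some requests take longer than others.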
The cross-zone load balancing setting also determines how the load balancer selects an instance. If cross-zone load balancing is disabled, the load balancer node selects the instance from the same Availability Zone that it is in. If cross-zone load balancing is enabled, the load balancer node selects the instance regardless of Availability Zone. The load balancer node routes the client request to the selected instance.
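The effect of the cross-zone setting on which instances a node may choose from can be sketched as follows. The `candidate_instances` function and the shape of its arguments are hypothetical, and the mapping is assumed to contain healthy instances only.

```python
def candidate_instances(node_zone, healthy_by_zone, cross_zone):
    """Return the instances a load balancer node may select from.

    node_zone:       Availability Zone of the node that received the request.
    healthy_by_zone: mapping of Availability Zone -> list of healthy instances.
    cross_zone:      True if cross-zone load balancing is enabled.
    """
    if cross_zone:
        # Any healthy instance qualifies, regardless of Availability Zone.
        return [instance
                for zone in sorted(healthy_by_zone)
                for instance in healthy_by_zone[zone]]
    # Only instances in the node's own Availability Zone qualify.
    return list(healthy_by_zone.get(node_zone, []))


healthy = {"us-west-2a": ["i-a1", "i-a2"], "us-west-2b": ["i-b1"]}
local_only = candidate_instances("us-west-2b", healthy, cross_zone=False)
all_zones = candidate_instances("us-west-2b", healthy, cross_zone=True)
```

Here `local_only` contains only `i-b1`, while `all_zones` contains all three instances. The routing algorithm for the listener then picks one instance from the returned list.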
Availability Zones and Instances
To ensure that your back-end instances are able to handle the request load in each Availability Zone, it is important to keep approximately the same number of instances in each Availability Zone registered with the load balancer.
For example, if you have ten instances in Availability Zone us-west-2a and two instances in us-west-2b, the requests are still distributed evenly between the two Availability Zones. As a result, the two instances in us-west-2b serve the same amount of traffic as the ten instances in us-west-2a, so each instance in us-west-2b handles five times the load of an instance in us-west-2a. Instead, you should have six instances in each Availability Zone.
To distribute traffic evenly across all back-end instances, regardless of the Availability Zone, enable cross-zone load balancing on your load balancer. However, we still recommend that you maintain approximately equivalent numbers of instances in each Availability Zone for better fault tolerance.
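The arithmetic behind the ten-and-two example can be worked through with a short calculation. This is a simplified model: the `per_instance_share` function is hypothetical, and it assumes that traffic is split exactly evenly between Availability Zones when cross-zone load balancing is disabled, and exactly evenly between instances otherwise.

```python
from fractions import Fraction


def per_instance_share(instances_per_zone, cross_zone):
    """Return the fraction of total traffic each instance in a zone receives.

    instances_per_zone: mapping of Availability Zone -> instance count.
    cross_zone:         True if cross-zone load balancing is enabled.
    """
    zones = {zone: n for zone, n in instances_per_zone.items() if n > 0}
    total = sum(zones.values())
    if cross_zone:
        # Traffic is spread over all instances, regardless of zone.
        return {zone: Fraction(1, total) for zone in zones}
    # Each zone receives an equal share, split among that zone's instances.
    return {zone: Fraction(1, len(zones) * n) for zone, n in zones.items()}


uneven = {"us-west-2a": 10, "us-west-2b": 2}
disabled = per_instance_share(uneven, cross_zone=False)
enabled = per_instance_share(uneven, cross_zone=True)
```

Under these assumptions, with cross-zone load balancing disabled, each instance in us-west-2a receives 1/20 of the traffic and each instance in us-west-2b receives 1/4. With it enabled, every instance receives 1/12, which is the same share that six instances in each Availability Zone would receive in either mode.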