Google's Load Balancer
Maglev Google( https://research.google.com/pubs/pub44824.html ) Used in Google Cloud since 2008 Scalable load balancer Consistent hashing Connection Tracking Scale-out model backed by router’s ECMP Bypass kernel space for performance. Support connection persistence Network Architecture DNS - Routers - Maglevs - Service EndPoints. One service is served by one or more VIPs DNS returns VIP considering geolocation and load of location One VIP is served by multiple Maglevs Router use ECMP to select one Maglev One VIP is mapped to multiple Service EndPoints Maglev select Service EndPoint by seletion algorithm and connection tracking table Maglev use GRE to send incoming packet to Service EndPoint or another Maglev Send to IP fragment to another special Maglev servers Use only 3-tuple for IP fragment Each Service EndPoint use Direct Server Return(DSR) Maglev Controller Responsible for VIP announcement with BGP Check health status of forwarder If forwarder is not headthy, withdraw all VIP announcements Forwarder Each VIP has one or multiple backend pools(BP) BP contain physical IP address of the Service EndPoint Each BP has specific health checking methods - depends on the service requirement(just reachability or more) Config Manager parse and update configuration of forwarder’s behavior based on the Config Objects Sharding Sharding of Maglev enables service isolation - new service or QoS Backend Selection Consistent Hashing distribute loads Record selection in LOCAL connection tracking table Connection tracking table is not shared with another Maglev Does not guarantee consistency on Maglev or Service EndPoint Changes(add/delete) For different traffic type TCP SYN : select Backend and record it in connection tracking table TCP non-SYN : lookup connection tracking table 5-tuple : (maybe) lookup connection tracking table and select backend if not found Consistent Hashing If Maglev is added or removed, router select different Maglev for the exsiting session - ECMP is changed If one Maglev’s local connection tracking table is overflowed, it will lose previous selection To resolve this issues, Synchronize local connection tracking table between Maglevs -> overhead, overhead, overhead Consistent hashing for minimize disruption in member changes Maglev hashing - load balancing and minimal disruption on member changes reference Maglev: A Fast and Reliable Software Network Load Balancer Consistent Hashing The Simple Magic of Consistent Hashing