A couple of years ago, Ross Smith IV gave a presentation on how Outlook clients connect to Exchange 2010 Client Access Servers (CAS). Part of the presentation covered load balancing of the CAS servers, and for the first time I heard Microsoft recommend using a hardware load balancer over Windows Network Load Balancing (NLB).
Here is a quick summary of his recommendations:
Load Balancing Recommendations:
- Hardware Load Balancers
- Integrated “is alive” monitoring recommended
- Fixing of MAPI and directory endpoint ports
- Create CAS Array and load-balance selected, or all CAS in a site
- Client IP affinity or cookie-based authentication where appropriate
- Not recommended:
  - DNS Round Robin
  - Windows Network Load Balancing
- Do not load-balance cross-site (create two arrays instead and load-balance separately)
You might find this strange, since Microsoft owns Windows NLB and includes it in Windows Server free of charge. There are, however, several reasons why they recommend a hardware load balancer over NLB for most deployments. Let’s talk about why you shouldn’t use NLB.
Why not Windows NLB?
- Issues with WNLB
- Switch/Port flooding
- NAT/Source IP pool/affinity
- Scalability over 8 nodes
- Service awareness
- Not supported with Windows Failover Clustering
- Add/remove single node causes all clients to reconnect
Service Awareness: NLB is “server aware”, but not service aware. This means a CAS server could still be up and running, but the particular service you are connecting to could be down/stopped. NLB is not aware of these types of failures and will continue to send clients to the server.
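To make the difference between "server aware" and "service aware" concrete, here is a minimal Python sketch of the kind of "is alive" probe a hardware load balancer performs. It is an illustration only, not how any particular load balancer or Exchange itself implements monitoring. The function names and the health URLs you would pass in are hypothetical, and it assumes the service in question answers plain HTTP requests.

```python
import http.client
import socket
import urllib.error
import urllib.request

# Probe the node directly, bypassing any system-wide proxy settings.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def server_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Server-level check: does anything accept a TCP connection on host:port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def service_is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Service-level check: does the application answer with an HTTP 2xx status?"""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses.
        return False


def pick_healthy_nodes(nodes: dict[str, str]) -> list[str]:
    """Return the names of nodes whose (hypothetical) health URL passes the check."""
    return [name for name, url in nodes.items() if service_is_healthy(url)]
```

A node can pass `server_is_reachable` and still fail `service_is_healthy`, for example when the web server accepts connections but the application behind it returns an error. That is the failure case described above: a balancer that only checks the server keeps sending clients to it.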
Switch/Port Flooding: Network Load Balancing induces switch flooding by design, so that packets sent to the cluster’s virtual IP address go to all the cluster hosts. Switch flooding is part of the Network Load Balancing strategy for obtaining the best throughput for any specific load of client requests. This can cause issues in virtual machine environments in particular, because virtual machines do not have physical MAC addresses. You will need to make ARP entries in your switch for each VM. If those VMs are moved to another host, those entries will no longer be valid and will require updating. Here is an article from Cisco on how to implement NLB (http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml).
Persistence/Affinity: NLB only provides IP-based persistence. Session persistence (a.k.a. session affinity or stickiness) is the ability of the load balancer to make sure a given client always reaches the same real server, even across multiple connections. Persistence can ensure that all requests from a client are sent to the same server in a Server Load Balancer (SLB) array or server farm (a CAS array, in this case). Depending on the service in Exchange 2010
Add/remove single node causes all clients to reconnect: Any time you add or remove a server from an NLB cluster, all Outlook clients are forced to reconnect. While this should rarely happen, it is another example of how NLB does not provide fault tolerance to your Exchange environment.
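The IP-based persistence mentioned earlier, and the way a membership change can disturb it, can be sketched with a toy hash in Python. This is not the algorithm WNLB actually uses. It is a simplified modulo hash, with hypothetical function and node names, meant only to show that the client's source IP alone decides which node it lands on.

```python
import hashlib


def pick_node(client_ip: str, nodes: list[str]) -> str:
    """Map a client IP to one node; the same IP and node list always give the same node."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def remapped_clients(clients: list[str], before: list[str], after: list[str]) -> list[str]:
    """Return the clients whose assigned node changes when the node list changes."""
    return [ip for ip in clients if pick_node(ip, before) != pick_node(ip, after)]


if __name__ == "__main__":
    # Hypothetical CAS node names and client addresses, for illustration only.
    nodes = ["cas1", "cas2", "cas3"]
    clients = [f"10.0.0.{i}" for i in range(1, 21)]
    moved = remapped_clients(clients, nodes, ["cas1", "cas2"])
    print(f"{len(moved)} of {len(clients)} clients change node after removing cas3")
```

Because only the source IP is hashed, every client behind a single NAT address ends up on the same node, which is the NAT/source IP issue from the list above. With a naive scheme like this one, removing a node can also move clients that were on surviving nodes, not just the clients of the removed node.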