Equal Cost Multi Path (ECMP) has made it to the top of the networking food chain with the emergence of spine and leaf architectures. We have tuned and tweaked it over the years, and while it has its purpose in network forwarding, it falls far short of what we need, far short of how anyone but networking folks would solve optimal demand-based distribution across a set of interconnected destinations.

One of the most recent additions to ECMP (in terms of ASIC support) is the ability to weight the ECMP choices. While the paths are all of equal cost (and therefore equally qualified to be used), the amount of traffic (flows) placed on each does not have to be equal. Traditional hash calculations created a fair distribution across all ECMP choices; weighted ECMP allows you to favor specific choices. The hard part? Determining how to set the weights, and how dynamic those weights should be. There has been plenty of research and many proposals on how to create and adjust ECMP weights to best utilize the links towards the ECMP next hops: Abrahamsson and Bjorkman wrote a paper on it, and Zhang, Xi, and Chao wrote one that maps source routing into ECMP ratios. But other than manual adjustment, no real dynamic weight mechanism exists in today's datacenter networks. And even with weights, it is still a hop-by-hop calculation and decision.
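
To make the weighting mechanics concrete, here is a minimal sketch in Python of how a weighted ECMP group could be modeled. The bucket-replication approach and all names (the function, the router labels, the flow key format) are illustrative assumptions, not any vendor's actual ASIC programming:

```python
import hashlib

def pick_next_hop(flow_key: str, weighted_next_hops: dict) -> str:
    """Pick a next hop for a flow from a weighted ECMP group.

    weighted_next_hops maps next-hop name -> integer weight; a weight of 3
    should attract roughly three times the flows of a weight of 1.
    """
    # Expand the weights into a bucket table, much like an ASIC programs
    # multiple member entries for a favored path in its ECMP group.
    buckets = [nh for nh, w in weighted_next_hops.items() for _ in range(w)]
    # Hash a stable flow identifier (e.g. the 5-tuple) so every packet of
    # a flow picks the same bucket and therefore the same path.
    digest = int(hashlib.sha256(flow_key.encode()).hexdigest(), 16)
    return buckets[digest % len(buckets)]

# Equal weights reproduce classic ECMP; skewed weights favor one next hop.
print(pick_next_hop("10.0.0.1:49152->10.0.1.1:443/tcp",
                    {"router2": 1, "router3": 3}))
```

Note that this also shows why weighting alone does not solve the problem: the hash decides per flow at a single hop, with no knowledge of what happens downstream.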

ECMP next hops are calculated using Shortest Path First (SPF) based algorithms. The way we have guaranteed loop-free forwarding in hop-by-hop IP networks is by using routing protocols with these algorithms as their foundation. The protocols create a graph of all routers and their connections within a forwarding domain, and each router then calculates the shortest path from itself to all other routers and the networks attached to them. As proven by mathematicians, the algorithms (the most used and well-known of which was put forward in 1956 by Edsger Dijkstra) calculate shortest loop-free paths, really a collection of spanning trees, across the graph that represents the routers and their connections. They create a consistent view across all routers in a network, and all routers will make the same choices because those choices are based on that same view of the network.
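
For reference, here is a compact sketch of the calculation every link-state router performs. This is plain Dijkstra over an illustrative three-router graph, not any particular protocol's implementation:

```python
import heapq

def spf(graph, source):
    """Dijkstra's algorithm: the shortest-path tree (as a predecessor map)
    that every router computes from its copy of the same graph."""
    dist = {source: 0}
    prev = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v], prev[v] = d + cost, u
                heapq.heappush(heap, (d + cost, v))
    return prev

# Illustrative triangle with unit link costs. Every router runs the same
# calculation on the same view, so forwarding stays consistent and loop free.
triangle = {"r1": {"r2": 1, "r3": 1},
            "r2": {"r1": 1, "r3": 1},
            "r3": {"r1": 1, "r2": 1}}
print(spf(triangle, "r1"))  # {'r1': None, 'r2': 'r1', 'r3': 'r1'}
```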

In the most basic network of 3 routers, a triangle, if router 1 is filling up its link to router 3, but the links to and from router 2 are underutilized, why would router 1 not send a portion of its traffic through router 2? This is not trivial because you must figure out which traffic to offload, and do so in a coordinated manner. Without coordination, router 2 could decide to do the same for some of its traffic in the reverse direction, and we would have a forwarding loop between routers 1 and 2.
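
A toy trace makes the failure mode visible. The next-hop tables below are made up: they represent both routers independently "offloading" traffic for router 3's prefix via each other:

```python
# Hypothetical tables after r1 and r2 each offload toward the other,
# with no coordination between them.
next_hop = {"r1": "r2", "r2": "r1"}

def trace(start, dest, max_hops=8):
    """Follow the hop-by-hop decisions; revisiting a router means a loop."""
    path, node = [start], start
    while node != dest and len(path) <= max_hops:
        node = next_hop[node]
        if node in path:
            return path + [node, "<loop!>"]
        path.append(node)
    return path

print(trace("r1", "r3"))  # ['r1', 'r2', 'r1', '<loop!>']
```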

It's the need for coordination that is key, and that is exactly what controller-based networks give you. Controllers have a view of the entire network: how switches and routers are connected, what traffic patterns exist in the network, and they provide a mechanism that allows applications to specifically request a certain treatment of their traffic. As a piece of software running on state-of-the-art servers, a controller can calculate the most effective way to distribute traffic across a specific network. Better yet, if the network is capable, it can change how the network is constructed if it finds a better topology for the traffic that needs to be exchanged. It sounds easy, but it most certainly is not. All traffic is interrelated once placed on a network; changing the way you send certain traffic may, and often will, affect how traffic behaves further down the network. But it comes with a significant freedom: to create networks that are not uniform, networks and forwarding topologies that are the best match for the needs of the applications. Which means a controller can instruct routers and switches to do Non Equal Cost Multi Path (I made that up) and send traffic on paths that are not all equal, while continuing to ensure loop-free forwarding. At L2 and L3.
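
As a sketch of what such a controller calculation might look like, here is one simple, well-known loop-free criterion (similar in spirit to the downstream-path conditions used in unequal-cost routing): allow any neighbor that is strictly closer to the destination, even if the path through it is not the shortest. The topology and function names are illustrative only:

```python
import heapq

def dist_to(graph, dest):
    """Dijkstra run from the destination: each node's distance to dest."""
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

def necmp_next_hops(graph, node, dest):
    """Any neighbor strictly closer to the destination is a usable next
    hop, even on a non-shortest path. Because the distance decreases at
    every hop, a forwarding loop is impossible."""
    dist = dist_to(graph, dest)
    return [v for v in graph[node] if dist[v] < dist[node]]

# Triangle where r1's direct link to r3 (cost 3) and the detour via r2
# (cost 1 + 1 = 2) are both loop-free choices despite their unequal cost.
graph = {"r1": {"r2": 1, "r3": 3},
         "r2": {"r1": 1, "r3": 1},
         "r3": {"r1": 3, "r2": 1}}
print(necmp_next_hops(graph, "r1", "r3"))  # ['r2', 'r3']
```

A controller can go further than this per-hop rule, of course, since it can place traffic on explicit end-to-end paths, but the example shows that "not equal cost" and "loop free" are not at odds.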

SPF algorithms have been in use in networks for probably 40 years, and, as with ECMP, there is most certainly value in their ability to determine paths through a network. But also like ECMP, we have made them the centerpiece of connectivity, rather than one of many tools that can be used. With network controllers becoming a more accepted mechanism to calculate optimal connectivity, we are creating the freedom to significantly improve how we send traffic between endpoints and how we calculate backup topologies and paths. With some seriously advanced graph theory, we can move traffic across a network in a non-SPF way and still be loop free. For unicast and multicast. Independently. And that is tremendously powerful.
