A while ago I wrote a few articles describing the various tunnel protocols used for network virtualization between vSwitches on servers, and between vSwitches and physical network gateways. These are the mechanisms that construct overlay networks on top of a physical network. VMware uses STT as the tunneling mechanism between vSwitches on servers and VXLAN to communicate with gateways to the non-virtualized world. NVGRE is used mostly by Microsoft, and is an extension to GRE, a tunneling mechanism that has been around for a while.
Each one of these mechanisms has its pros and cons. They are all pretty much standard, or at least published by a standards organization, and multiple implementations exist for most of them. Outside of my complaint about the stream-like nature of STT, the biggest problem with all of them is that they are fixed in their definition. The header definition is fixed, the fields are fixed, the sizes of the headers are fixed.
The 24 bits used for a Virtual Network Identifier, providing for 16 million different virtual networks, may seem like an amount we can live with for a long time to come. But creative minds will find ways to use those 24 bits to signal all sorts of meta information between the tunnel endpoints, or to intermediate switches and other services forwarding these tunneled packets. Additional summarized information about the content of a packet is extremely useful for any device or service that makes intelligent decisions about this packet.
And once you start that thought process, those 24 bits will disappear really quickly. And when 24 bits are not enough, the protocols and their implementations need to be updated, and those updates are extremely painful, especially if portions of those protocols are handled in hardware.
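To put some numbers on how quickly those bits disappear, here is a minimal sketch. It assumes a hypothetical scheme in which the low bits of the 24-bit identifier are borrowed for metadata; the function names and the split are purely illustrative and not part of any of the protocols discussed.

```python
VNI_BITS = 24  # width of the Virtual Network Identifier field


def networks_available(metadata_bits: int) -> int:
    """Distinct virtual networks left once some bits are borrowed for metadata."""
    if not 0 <= metadata_bits <= VNI_BITS:
        raise ValueError("metadata_bits must be between 0 and 24")
    return 2 ** (VNI_BITS - metadata_bits)


def pack_vni(network_id: int, metadata: int, metadata_bits: int) -> int:
    """Hypothetical layout: network ID in the high bits, metadata in the low bits."""
    network_bits = VNI_BITS - metadata_bits
    if not 0 <= network_id < (1 << network_bits):
        raise ValueError("network_id does not fit in its share of the field")
    if not 0 <= metadata < (1 << metadata_bits):
        raise ValueError("metadata does not fit in its share of the field")
    return (network_id << metadata_bits) | metadata


if __name__ == "__main__":
    for bits in (0, 4, 8, 12):
        print(f"{bits:2d} metadata bits -> {networks_available(bits):>10,d} networks")
```

Borrowing just one byte for metadata takes the 16 million networks down to 65,536, and a second borrowed field would shrink that again.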
Enter GENEVE, an Internet Draft published on Valentine’s Day this year, which looks to take a more holistic view of tunneling. GENEVE takes its cue from many other protocols that have shown themselves to have a long life. Protocols like BGP, LLDP, IS-IS and many others have been around for multiple decades and are still as popular as they have ever been. And the reason is simple: they are extensible. They evolve over time with new capabilities, not by revising the base protocols, but by adding new optional capabilities.
All of these protocols have a set of fixed headers, parameters and values, but then leave room for non-defined optional fields. New fields can be added to the protocol by simply defining and publishing them. The protocol is created in such a way that implementations know there may be optional fields that they may or may not understand.
Think of my favorite, BGP. When its 4th version was created in the mid-90s, it had no ability to carry IPv6 (which wasn’t even called IPv6 at the time). It could not carry multicast routes, Communities or ORFs, and a router could not act as a Route Reflector. BGP has had many optional attributes added over the years and as a result is the most powerful routing protocol in existence. It is the result of a solid protocol design practice: we don’t know everything we may want to use this for at the moment we create it, so we design in the ability to simply add new capabilities.
Like VXLAN, GENEVE runs on top of UDP. It adds its own header, of which only 8 bytes are fixed, containing that same 24-bit Virtual Network Identifier and a 16-bit Protocol Type as the main fields. After that, the initial definition is remarkably empty. Everything else is left open as options that follow a specific format if and when they get defined.
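For those who like to see the bits, here is a minimal sketch of packing and unpacking that 8-byte fixed header. The article only names the Virtual Network Identifier and the Protocol Type; the remaining fields (a 2-bit version, a 6-bit option length counted in 4-byte words, the O and C flag bits and the reserved bits) are taken from the published Geneve specification as I understand it, and the function names are my own.

```python
import struct

GENEVE_FIXED_HEADER_LEN = 8
ETH_BRIDGING = 0x6558  # EtherType for Transparent Ethernet Bridging


def pack_geneve_header(vni, protocol_type=ETH_BRIDGING, opt_len_words=0,
                       o_flag=False, c_flag=False, version=0):
    """Build the 8-byte fixed header (layout assumed from the spec)."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= opt_len_words < (1 << 6):
        raise ValueError("option length must fit in 6 bits")
    if not 0 <= version < (1 << 2):
        raise ValueError("version must fit in 2 bits")
    first_byte = (version << 6) | opt_len_words
    flags_byte = (int(o_flag) << 7) | (int(c_flag) << 6)  # low 6 bits reserved
    # The VNI sits in the top 24 bits of the last word; the low 8 bits are reserved.
    return struct.pack("!BBHI", first_byte, flags_byte, protocol_type, vni << 8)


def unpack_geneve_header(data):
    """Parse the fixed header and return its fields as a dict."""
    if len(data) < GENEVE_FIXED_HEADER_LEN:
        raise ValueError("need at least 8 bytes")
    first_byte, flags_byte, protocol_type, vni_word = struct.unpack(
        "!BBHI", data[:GENEVE_FIXED_HEADER_LEN])
    return {
        "version": first_byte >> 6,
        "opt_len_bytes": (first_byte & 0x3F) * 4,
        "o_flag": bool(flags_byte & 0x80),
        "c_flag": bool(flags_byte & 0x40),
        "protocol_type": protocol_type,
        "vni": vni_word >> 8,
    }


if __name__ == "__main__":
    header = pack_geneve_header(vni=0xABCDEF)
    print(header.hex())
    print(unpack_geneve_header(header))
```

The option length in the first byte is what makes the rest of the design work: a receiver always knows where the options end and the payload begins, whether or not it understands any of the options.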
And that is the beauty of an extensible protocol: only those implementations that care about some specific option will implement it and act on it. Everyone else has to quietly accept and ignore these options. Backward compatibility by design.
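The "accept and ignore" behavior can be sketched in a few lines. This assumes the option format from the published Geneve specification as I understand it (a 16-bit option class, an 8-bit type, and a length in the low 5 bits of the fourth byte, counted in 4-byte words of option data). The class and type values in the example are made up, and the sketch deliberately leaves out special handling for options flagged as critical.

```python
import struct

OPTION_HEADER_LEN = 4


def walk_options(options: bytes, handlers: dict):
    """Run handlers for known (class, type) pairs and step over everything else."""
    handled, skipped = [], []
    offset = 0
    while offset < len(options):
        if len(options) - offset < OPTION_HEADER_LEN:
            raise ValueError("truncated option header")
        opt_class, opt_type, len_byte = struct.unpack_from("!HBB", options, offset)
        data_len = (len_byte & 0x1F) * 4  # length field counts 4-byte words
        end = offset + OPTION_HEADER_LEN + data_len
        if end > len(options):
            raise ValueError("truncated option data")
        key = (opt_class, opt_type)
        handler = handlers.get(key)
        if handler is not None:
            handled.append(handler(options[offset + OPTION_HEADER_LEN:end]))
        else:
            # Unknown option: the length alone tells us how far to skip.
            skipped.append(key)
        offset = end
    return handled, skipped


if __name__ == "__main__":
    # Hypothetical options: one this endpoint understands, one it does not.
    known = struct.pack("!HBB", 0x0101, 0x01, 1) + (42).to_bytes(4, "big")
    unknown = struct.pack("!HBB", 0xFFFF, 0x02, 2) + bytes(8)
    handlers = {(0x0101, 0x01): lambda data: int.from_bytes(data, "big")}
    print(walk_options(unknown + known, handlers))
```

The parser never needs to know what an unknown option means, only how long it is, which is exactly why new options can be introduced without touching existing implementations.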
Along the way, a few more recommended practices are articulated, including one I complained about in my description of STT. While the language could be a bit stronger, it is recommended that each GENEVE encapsulated packet include the entire header. That means I can actually reconstruct fragmented packets if I need to during debugging: each packet on the network has the entire original packet and the added GENEVE encapsulation.
Whether you like overlays, tunnels and everything that comes along with them or not, it is good to see folks from VMware and Microsoft try to come together to create a single tunnel mechanism that is sensible and extensible. It seems to have all the right bits and pieces to have a long life and certainly provides mechanisms that allow for much better orchestration and cooperation between the overlay and physical network.
Of course it will take a while before it becomes a complete definition, with real implementations and hardware support. But when it does, we will have a better toolkit as a result.
[Today’s fun fact: The city of Geneva (which in French of course is spelled Genève) has the shortest commute time of any major city in the world. How appropriate.]