Vendors, especially networking vendors, keep using the term “integrated” to imply that they work with other parts of the IT infrastructure ecosystem, generally the cloud management platform (e.g., VMware, OpenStack). However, every vendor seems to have a different idea of what “integrated” actually means and what it implies for the overall data center. So, let’s take a look at what we at Plexxi consider a proper integration to be and why the nuances matter.
WHAT’S THE PROBLEM?
Before diving into the varying degrees of “integrated-ness” a data center network platform can sport, let’s think about this from the customer perspective.
Customers trying to build a private (on-premises) cloud infrastructure generally want to give their users the same management and usability experience they would get from a Public Cloud service such as Amazon AWS, Microsoft Azure, or Google Cloud Platform. These Public Cloud platforms focus on the interface that allows users to create, deploy, manage, and troubleshoot the resources necessary to deliver applications, rather than on the management and operational tasks of each individual resource domain. This means a user can create, modify, delete, or expand virtual machines or other “software-defined” compute constructs that come with the storage and networking capabilities needed to build an application. This contrasts with a typical Enterprise IT infrastructure approach, where the server/compute, storage, and networking environments are each managed separately, with little relation to the overall application requirements.
Public Cloud providers have been able to provide this level of user experience mainly by creating a highly integrated infrastructure environment where the “software-defined” aspects of compute, storage, and network infrastructure all talk to each other via APIs, exchanging data about their own system state, health, and capabilities. For customers to successfully build these types of environments on-premises, they need to ensure that their building blocks for compute, storage, and networking have this ability to integrate. Sounds easy enough to duplicate, right? Not really. Many of the larger Public Cloud companies build their own custom software to create this level of integration across the three software-defined infrastructure (SDI) stacks, an option that is not feasible for all companies.
For the rest of the world, the network plays a special role since it is responsible for gluing together the compute and the storage resources in a way that matters to the applications. This puts the network at the center of a good Private Cloud strategy and gives it an especially important role in integration. With this in mind, let’s look at what the network can provide and how it can help drive a better overall Private Cloud experience.
WHAT COMPRISES THE DEFINITION OF “INTEGRATED”?
Networking vendors make a variety of claims with respect to their level of “integrated-ness” in the IT infrastructure ecosystem. But a truly integrated system has a number of facets. There are at least three levels of integration that should be understood: Bootstrap, Lifecycle, and Converged Management.
- Bootstrap
Bootstrap integration keeps the “day 0” problem from being a complete “chicken and the egg” situation. Since a modern network relies on software for the control plane, and software relies on compute/storage infrastructure to run, and compute/storage infrastructure (at least the SD-type) relies on a network to build itself into a working cluster, we have a complete circle of dependencies. Integrated compute/storage offerings can improve installation and bootstrapping in various ways, and those improvements are drastically better when the network is bootstrapped simultaneously. This means that whatever software is needed to operate the network is installed with initial working parameters at the same time as the compute/storage installation.
There are very few examples of integrated bootstrapping in the industry outside of pre-engineered converged systems. However, this is changing. As customers expect more ease of use, they are forcing vendors to provide easier installation for their components and to work together with other vendors to provide single installation and bootstrapping mechanisms that resolve the common chicken and the egg issues.
- Lifecycle
“Lifecycle” refers to the various phases a particular resource might go through. For example, a virtual machine needs to be created, modified or moved, and eventually destroyed. For storage, a given data store might go through similar events and may also need to be “rebuilt” or “re-silvered” if the underlying drive system is changed. Increasingly, higher-speed “NVMe” based memory systems are being used as the underlying storage mechanism, which can have explicit end-to-end lossless requirements for packet transport. The network needs to understand ALL of these events and situations to provide a fully integrated environment.
For many events, the network simply needs to provide the right connectivity to a newly created entity or change the connectivity based on a move. A common example of this is how a new VM can automatically get its port and VLAN configuration when it is created or have it updated when it is moved. But there are other events that require the network to respond in more complex ways. For example, a data store that is built from an SDI-based distributed cluster might need its intra-cluster metadata traffic to be fully isolated and secured for the cluster to operate well. Or, imagine the user needs to evacuate an entire rack of VMs to upgrade the physical hardware. Here the network will be required to add temporary bandwidth so that the evacuation can finish as quickly as possible.
Ultimately, we have lifecycle needs for compute and storage entities, and for each of those we have two broad types of integration needs: auto-configuration of the network for connectivity purposes, and dynamic network response to events that need network resources such as bandwidth, low latency, or dedicated network paths.
Most networking vendors, if they do any lifecycle integration, focus only on compute events, and typically only provide VLAN auto-configuration for new VMs or dynamic updates as VMs move. As discussed above, this is only a small part of the overall lifecycle that should be considered.
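To make the two kinds of lifecycle integration concrete, here is a minimal Python sketch of a network controller reacting to lifecycle events. Everything in it is hypothetical and for illustration only: the `Event` names, the `Network` class, the port naming, and the bandwidth figure are assumptions, not any vendor's API. VM creation, moves, and destruction show auto-configuration; the rack evacuation shows a dynamic response.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Event(Enum):
    """Hypothetical lifecycle events published by the cloud management platform."""
    VM_CREATED = auto()
    VM_MOVED = auto()
    VM_DESTROYED = auto()
    RACK_EVACUATION = auto()


@dataclass
class Network:
    """Toy in-memory model of network state (an assumption, not a real controller)."""
    port_vlans: dict = field(default_factory=dict)      # port -> set of VLAN ids
    vm_location: dict = field(default_factory=dict)     # VM name -> (port, vlan)
    temp_bandwidth: dict = field(default_factory=dict)  # rack -> extra Gbit/s

    def _attach(self, vm, port, vlan):
        self.port_vlans.setdefault(port, set()).add(vlan)
        self.vm_location[vm] = (port, vlan)

    def _detach(self, vm):
        port, vlan = self.vm_location.pop(vm)
        # Remove the VLAN from the port only if no other VM there still uses it.
        if (port, vlan) not in self.vm_location.values():
            self.port_vlans[port].discard(vlan)
        return vlan

    def handle(self, event, **info):
        # Auto-configuration: connectivity follows the VM through its lifecycle.
        if event is Event.VM_CREATED:
            self._attach(info["vm"], info["port"], info["vlan"])
        elif event is Event.VM_MOVED:
            vlan = self._detach(info["vm"])
            self._attach(info["vm"], info["new_port"], vlan)
        elif event is Event.VM_DESTROYED:
            self._detach(info["vm"])
        # Dynamic response: the event asks for network resources, not just connectivity.
        elif event is Event.RACK_EVACUATION:
            self.temp_bandwidth[info["rack"]] = info["extra_gbps"]


if __name__ == "__main__":
    net = Network()
    net.handle(Event.VM_CREATED, vm="web-1", port="leaf1:eth3", vlan=110)
    net.handle(Event.VM_MOVED, vm="web-1", new_port="leaf2:eth7")
    net.handle(Event.RACK_EVACUATION, rack="rack-4", extra_gbps=40)
    print(net)
```

A storage-aware version would add events such as a data store rebuild and map them to isolation or lossless-transport settings in the same way; those are left out to keep the sketch short.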
- Converged Management
Ultimately, customers building Private Clouds should need as few management consoles as possible, an approach also known as Single-Pane-of-Glass Management. Many of the leading cloud management platforms support third-party plug-ins that allow other systems to be managed from that platform. A fully integrated network should provide as much management integration into that platform as possible. This should not stop at “read-only” viewing of network information, but extend into full operational control of the system.
Since the cloud administrator or operator is ultimately responsible for the entire infrastructure, the focus of converged management should be their chosen management platform. This puts the onus on the network solution to “export” its visibility and configuration capabilities easily into third-party platforms.
Many network vendors that claim integration provide only network CLI-based visibility. This “integration” takes VM-level information (such as name and MAC address) and correlates it with network information (the location of that MAC address, such as a physical port). This type of integration then allows the network administrator to view where a given VM is located, typically by typing a CLI command into the network console. While this type of integration provides some contextual value for a network administrator, it does not do anything for a cloud administrator trying to manage a complete infrastructure.
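The CLI-style correlation described above amounts to a simple join on MAC address. The following Python sketch shows roughly what it does, assuming two hypothetical inputs: a VM inventory (name to MAC) taken from the cloud management platform and a MAC table (MAC to physical port) taken from the switches. The function names and sample data are illustrative only.

```python
def normalize_mac(mac):
    """Lower-case a MAC address and use ':' separators so both sources compare equal."""
    return mac.lower().replace("-", ":")


def locate_vms(vm_inventory, mac_table):
    """Return {VM name: switch port}, with None for VMs whose MAC was not learned."""
    learned = {normalize_mac(mac): port for mac, port in mac_table.items()}
    return {name: learned.get(normalize_mac(mac)) for name, mac in vm_inventory.items()}


if __name__ == "__main__":
    # Hypothetical sample data; real deployments would pull this from platform APIs.
    vms = {"web-1": "00:50:56:AA:BB:01", "db-1": "00-50-56-aa-bb-02"}
    macs = {"00:50:56:aa:bb:01": "leaf1:eth3", "00:50:56:AA:BB:02": "leaf2:eth7"}
    for name, port in locate_vms(vms, macs).items():
        print(name, port)
```

Note that the result is read-only and lives on the network side, which is exactly the limitation the paragraph points out: nothing here lets the cloud administrator see or change the network from their own platform.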
SO, IS IT REALLY “INTEGRATED”?
To achieve a true on-premises cloud, integration needs to evolve. It needs to be a fully bi-directional information exchange, and it needs to cover all three layers of integrated networking requirements:
- Bootstrap:
  - Integrated Day 0 Installation and Process
- Lifecycle:
  - Compute Events → Network Auto-Configuration
  - Compute Events → Network Dynamic Response
  - Storage Events → Network Auto-Configuration
  - Storage Events → Network Dynamic Response
- Converged Management:
  - Integrated Visibility (in Cloud Management Platform)
  - Integrated Configuration (in Cloud Management Platform)
Most vendors really only focus on network auto-configuration for compute events. Ultimately, as the vendors realize that enterprises need better than that, they will start to work together to provide a more complete solution across the various compute, storage, and networking software-defined stacks.
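One way to put the checklist above to work when comparing vendors is to record which items each one actually supports and list what is missing. The Python sketch below does that. Grouping the items under Bootstrap and Lifecycle follows the three levels described earlier and is an assumption about the intended structure; the `coverage_gaps` helper and the sample vendor are hypothetical.

```python
# Checklist items grouped by integration level (grouping is an assumption).
REQUIREMENTS = {
    "Bootstrap": [
        "Integrated Day 0 Installation and Process",
    ],
    "Lifecycle": [
        "Compute Events -> Network Auto-Configuration",
        "Compute Events -> Network Dynamic Response",
        "Storage Events -> Network Auto-Configuration",
        "Storage Events -> Network Dynamic Response",
    ],
    "Converged Management": [
        "Integrated Visibility",
        "Integrated Configuration",
    ],
}


def coverage_gaps(supported):
    """Return {level: [missing items]} for every level that is not fully covered."""
    gaps = {}
    for level, items in REQUIREMENTS.items():
        missing = [item for item in items if item not in supported]
        if missing:
            gaps[level] = missing
    return gaps


if __name__ == "__main__":
    # A hypothetical vendor that only auto-configures the network for compute events.
    typical_vendor = {"Compute Events -> Network Auto-Configuration"}
    for level, missing in coverage_gaps(typical_vendor).items():
        print(level)
        for item in missing:
            print("  missing:", item)
```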
When evaluating vendors for your own enterprise data center, be sure to look for network vendors that are leading that discussion with their compute and storage partners. Ask the tough questions and dig deep into their definition of “integrated”. You may discover it does not necessarily mean what they think it means, or what you need.