“Tag everything and charge back based on those tags.” That is the litany. We have been told, lectured, cautioned and warned that tagging is the only way to track resources and associated spend in the cloud.
In fact, in our Tagging Governance: Managing the Metadata of Your Cloud blog post, we wrote about the attributes of a sound tagging governance program. That post assumed you’d already made the decision to use tagging. What it didn’t cover was the cost, or the dark side, of tagging.
As you read this post, I encourage you to think about what tagging enables you to achieve. And to ask yourself... would I give up tagging if there were a more cost-effective approach?
Most of us would initially answer “Yes” but would likely second-guess a decision to eliminate tagging. After all, tagging is a foundational requirement for anyone who wants to achieve Run-stage maturity in FinOps. Or is it?
In this post, I’ll briefly revisit the rationale behind tagging followed by a look at the costs of tagging and a discussion of five alternative ways to approach tagging...or not.
The reason we tag cloud resources reminds me of a book I read early in my career. It posed the question: Why do people buy a drill bit?
When I was a new homeowner, I admit I was excited by tool purchases. But the reality was, if I was buying a drill bit, it was because I needed a hole of a certain size.
In other words, the drill bit is a means to an end. The same goes for tagging.
We tag for what it enables us to do. Tagging provides the means, in the form of metadata, to identify virtual resources. More specifically, it allows us to categorize, group and classify cloud resources by numerous dimensions that are meaningful to our organization so that we can analyze data.
One of the most common reasons for tagging is for cost allocation. This process can be a simple grouping of resources by a cost center, or it can be more complex if you’re driving towards total cost of ownership (TCO) with value streams or unit economics.
A single-product SaaS company, for example, may want to simply see its costs by dev and production. On the other hand, a large multi-product insurance company will likely want to group resources by cost centers, dev teams and applications. It will also want to calculate unit economics for its product lines and for value streams such as the cost to process a claim.
Tagging is an effective method for assigning metadata to cloud resources that allows you to slice and aggregate cloud spend data to deliver these costs.
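To make "slice and aggregate" concrete, here is a minimal Python sketch that sums spend by one tag dimension at a time. The line items, tag keys and the `spend_by_tag` helper are hypothetical and for illustration only; real billing exports carry many more fields.

```python
from collections import defaultdict

# Hypothetical billing line items: a cost plus the tags on the resource.
line_items = [
    {"resource_id": "vm-1", "cost": 120.0, "tags": {"cost_center": "claims", "env": "prod"}},
    {"resource_id": "vm-2", "cost": 30.0, "tags": {"cost_center": "claims", "env": "dev"}},
    {"resource_id": "db-1", "cost": 200.0, "tags": {"cost_center": "underwriting", "env": "prod"}},
    {"resource_id": "vm-3", "cost": 50.0, "tags": {}},  # untagged resource
]


def spend_by_tag(items, tag_key, missing="(untagged)"):
    """Sum cost per value of one tag key; untagged spend gets its own bucket."""
    totals = defaultdict(float)
    for item in items:
        value = item["tags"].get(tag_key, missing)
        totals[value] += item["cost"]
    return dict(totals)


by_cost_center = spend_by_tag(line_items, "cost_center")
by_env = spend_by_tag(line_items, "env")
print(by_cost_center)
print(by_env)
```

The same data answers both the "dev versus production" question and the "by cost center" question. Keeping an explicit untagged bucket shows how much spend the tags fail to explain.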
A complex mix of stakeholders participates in tagging governance and they often have competing priorities.
Tagging initiatives are typically spearheaded by the FinOps team or the Cloud Center of Excellence (CCoE). The latter establishes a policy and procedures for the engineering community to follow.
A crucial component of any tagging program is ongoing governance, including consistent reporting and auditing. To ensure tagging compliance, KPIs with specific targets must be effectively managed.
Tagging management often breaks down in the Engineering group. Typically, Engineering is tasked with creating and maintaining tags for their own resources. Unfortunately, they are often disconnected from the benefits of tagging. Seeing it as a burden, they may hastily enter any tag that comes to mind in order to quickly return to coding. In turn, this creates rework for the FinOps team.
Also, some companies enforce tagging policies with a process that automatically terminates an untagged cloud resource within 24 hours of creation. Others implement policies that block the creation of a resource unless it’s tagged. Such policies ensure resources are tagged but do little to ensure tagging accuracy.
Effective tagging requires a process to continuously audit and clean up untagged resources as well as improperly tagged resources.
When you factor in the additional time engineers spend on tagging cleanup to meet the FinOps team’s remediation requests, the costs escalate even further. It's no surprise that tagging is often a source of frustration.
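The kind of audit described above can be sketched in a few lines of Python. The required keys, the allowed values and the sample resources are assumptions made up for this example. The point is only to separate untagged resources from improperly tagged ones and to produce a compliance figure that could feed a KPI.

```python
# Hypothetical tagging policy: required keys and, where applicable, allowed values.
REQUIRED_TAGS = {
    "cost_center": None,              # any non-empty value is accepted
    "env": {"dev", "test", "prod"},   # value must be one of these
}


def audit_resource(tags):
    """Return a list of policy violations for one resource's tags."""
    violations = []
    for key, allowed in REQUIRED_TAGS.items():
        value = tags.get(key, "").strip()
        if not value:
            violations.append(f"missing:{key}")
        elif allowed is not None and value.lower() not in allowed:
            violations.append(f"invalid:{key}={value}")
    return violations


def audit(resources):
    """Map resource id -> violations, keeping only non-compliant resources."""
    report = {}
    for rid, tags in resources.items():
        violations = audit_resource(tags)
        if violations:
            report[rid] = violations
    return report


resources = {
    "vm-1": {"cost_center": "claims", "env": "prod"},
    "vm-2": {"cost_center": "claims", "env": "asdf"},  # hastily entered value
    "vm-3": {},                                        # untagged
}
report = audit(resources)
compliance_rate = 1 - len(report) / len(resources)
print(report, round(compliance_rate, 2))
```

A check like this can only validate values against a list. It cannot tell whether "claims" is the right cost center for `vm-1`, which is where the manual rework tends to come from.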
Shared resources and services (such as containers, VWANs, Transit Gateways, support, taxes and marketplace spend; the list seems endless) often cannot be tagged at a meaningful level of granularity. Sometimes they may not support tagging at all.
These "non-taggable" resources require alternative methods for cost grouping. The shared nature of these resources creates obvious allocation challenges that add administrative costs to an already imperfect system.
The following options can enhance your ability to classify cloud resources. The first three options provide ways to improve traditional tagging. The last two offer alternatives for replacing tagging altogether.
For those utilizing traditional tagging, consider two key automation options. First, enforce tagging compliance by implementing policies that prevent the creation of resources unless they are tagged. Second, establish a policy that automatically deletes untagged resources after a specified period, such as 24 hours. While these measures ensure tagging compliance, they still rely on a human to pick the relevant tags and do little to ensure tagging accuracy.
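As a rough illustration of the two automation options, the Python sketch below models a creation gate and a grace-period check. The required keys and function names are hypothetical. In practice these rules would be expressed in your cloud provider's policy tooling rather than in application code.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_KEYS = {"cost_center", "env"}   # hypothetical policy
GRACE_PERIOD = timedelta(hours=24)       # the 24-hour window mentioned above


def missing_keys(tags):
    """Required keys that are absent or empty."""
    return {key for key in REQUIRED_KEYS if not tags.get(key)}


def allow_creation(tags):
    """Option 1: block creation unless every required tag is present."""
    return not missing_keys(tags)


def should_terminate(tags, created_at, now):
    """Option 2: flag for deletion if still untagged after the grace period."""
    return bool(missing_keys(tags)) and now - created_at >= GRACE_PERIOD


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)  # fixed clock for the example
print(allow_creation({"cost_center": "CC-100", "env": "prod"}))
print(should_terminate({}, now - timedelta(hours=25), now))
```

Note that `allow_creation({"cost_center": "asdf", "env": "whatever"})` also passes. The gate confirms that tags are present, not that they are correct.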
In many companies, the Configuration Management Database (CMDB) serves as the authoritative source for information about cloud resources and should be considered as the foundation for tagging. If the CMDB is well-managed, leveraging it for tagging is an easy lift.
However, the CMDB is often managed by a team outside of the FinOps group. If further curation is needed, organizational politics can complicate the process.
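If the CMDB does hold trustworthy records, deriving tags from it amounts to a lookup. The sketch below assumes a hypothetical CMDB export keyed by resource ID with `application`, `owner` and `cost_center` fields. Your CMDB's schema and export format will differ.

```python
# Hypothetical CMDB export keyed by resource id.
cmdb = {
    "vm-1": {"application": "claims-api", "owner": "team-claims", "cost_center": "CC-100"},
    "db-1": {"application": "policy-db", "owner": "team-underwriting", "cost_center": "CC-200"},
}

TAG_FIELDS = ("application", "owner", "cost_center")  # CMDB fields to project as tags


def tags_from_cmdb(resource_ids, cmdb_records):
    """Return (derived tags per resource, ids the CMDB does not know about)."""
    derived, unknown = {}, []
    for rid in resource_ids:
        record = cmdb_records.get(rid)
        if record is None:
            unknown.append(rid)
            continue
        derived[rid] = {field: record[field] for field in TAG_FIELDS if record.get(field)}
    return derived, unknown


derived, unknown = tags_from_cmdb(["vm-1", "db-1", "vm-9"], cmdb)
print(derived)
print(unknown)
```

The `unknown` list is a quick measure of how well managed the CMDB really is. The longer it gets, the more curation is needed before the CMDB can drive tagging.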
If you are using Infrastructure as Code (IaC), tags can be integrated directly into the configuration. As the code is deployed, tags are automatically generated as part of the process. Implementing a simple policy to ensure tags are included in every pull request streamlines the process.
However, the challenge with IaC lies in its coverage. Most organizations are not fully utilizing IaC and it cannot support all resources. Additionally, as with other tagging methods, managing shared resources remains a common issue.
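A pull-request check for tags can be as simple as the Python sketch below. It runs against a simplified, hypothetical list of declared resources, not the actual output format of any IaC tool, and the required keys are assumptions for this example.

```python
REQUIRED_KEYS = {"cost_center", "env"}  # hypothetical policy

# Simplified, hypothetical representation of resources declared in a pull request.
declared_resources = [
    {"name": "web_server", "type": "vm", "tags": {"cost_center": "CC-100", "env": "prod"}},
    {"name": "scratch_bucket", "type": "bucket", "tags": {"env": "dev"}},
    {"name": "queue", "type": "queue"},  # no tags block at all
]


def check_pull_request(resources):
    """Return human-readable failures; an empty list means the check passes."""
    failures = []
    for res in resources:
        missing = sorted(REQUIRED_KEYS - set(res.get("tags", {})))
        if missing:
            failures.append(f"{res['type']}.{res['name']}: missing {', '.join(missing)}")
    return failures


failures = check_pull_request(declared_resources)
for failure in failures:
    print(failure)
```

A CI job could fail the build whenever `failures` is non-empty. Resources created outside IaC never pass through this check, which is the coverage gap noted above.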
One of the most common alternatives to tagging is using what FOCUS calls subaccounts (traditional accounts, subscriptions or billing accounts) to group spending and allocate costs. Typically, a department is assigned three accounts (dev, test and prod) for tracking cloud expenditures. These account structures can be utilized as-is or to drive rules-based tagging to aid in reporting.
Some companies prefer account-based structures over tagging. Using this method, cost allocation can be managed in a few hours a month using Excel, or with almost no manual effort using a flexible cloud cost management (CCM) tool.
A few notable challenges exist with this approach. Shared resource and cost issues still persist. The granularity required for unit economics or TCO may not align with the account structures.
Perhaps the most significant challenge arises for companies that undergo frequent M&A activity or reorganizations. Realigning resources within accounts to match the new organizational structure can be a substantial and complex effort.
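To show why account-based allocation can be so lightweight, here is a minimal Python sketch that rolls a bill up by department and environment using only a subaccount mapping. The account IDs, the mapping and the costs are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping of subaccount id -> (department, environment).
ACCOUNT_MAP = {
    "acct-101": ("claims", "dev"),
    "acct-102": ("claims", "test"),
    "acct-103": ("claims", "prod"),
    "acct-201": ("underwriting", "prod"),
}

# Hypothetical monthly bill: (subaccount id, cost in whole dollars).
bill = [
    ("acct-101", 400),
    ("acct-102", 250),
    ("acct-103", 3100),
    ("acct-201", 1800),
    ("acct-999", 75),  # account missing from the mapping
]


def allocate_by_account(rows, account_map):
    """Roll cost up to department and to (department, environment)."""
    by_dept = defaultdict(int)
    by_dept_env = defaultdict(int)
    for account, cost in rows:
        dept, env = account_map.get(account, ("unmapped", "unmapped"))
        by_dept[dept] += cost
        by_dept_env[(dept, env)] += cost
    return dict(by_dept), dict(by_dept_env)


by_dept, by_dept_env = allocate_by_account(bill, ACCOUNT_MAP)
print(by_dept)
```

When a reorganization moves whole accounts between departments, only the mapping changes. The heavy lifting comes when resources inside one account have to be split across several owners.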
I started this blog post by challenging you to look at tagging as a means to an end. Here is a way to look at cost allocation with a completely different lens: Imagine using network traffic to aggregate the cost to serve a user and then grouping those costs at the customer level.
Attribute (attrb.io) does exactly that, offering an innovative way to view costs. For SaaS businesses, this method provides an effective solution for cost allocation while simultaneously achieving unit economics. Large enterprises with complex reporting hierarchies may still need traditional approaches; but for shared costs, unit economics and TCO, this approach could be highly beneficial.
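The general idea of traffic-based allocation can be illustrated with a simple proportional split. This is a generic sketch, not a description of how Attribute works internally. The customer names, traffic volumes and shared cost are invented for the example.

```python
def allocate_by_traffic(shared_cost, bytes_by_customer):
    """Split shared_cost across customers in proportion to their traffic."""
    total = sum(bytes_by_customer.values())
    if total == 0:
        # No traffic observed: nothing to apportion.
        return {customer: 0.0 for customer in bytes_by_customer}
    return {
        customer: shared_cost * volume / total
        for customer, volume in bytes_by_customer.items()
    }


# Hypothetical traffic per customer (in GB) and a shared monthly cost in dollars.
traffic = {"customer-a": 600, "customer-b": 300, "customer-c": 100}
allocation = allocate_by_traffic(1000.0, traffic)
print(allocation)
```

Because the result is already grouped by customer, it doubles as a unit-economics figure: the cost to serve each customer comes out of the same calculation that allocates the shared spend.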
My aim here is not to knock tagging but to highlight the seldom discussed costs associated with tagging and put knowledge and choices in your hands.
Tagging is undoubtedly an effective method for adding dimensions to your FinOps data. However, given the expenses associated with traditional tagging, it can literally pay to examine alternate, equally effective approaches. If they meet your requirements, they can save you money.
The answer to whether or not to tag cloud resources depends on your specific requirements. We’ve worked with companies that have eight-figure cloud spends, use account-based structures and would never consider tagging.
If you decide to tag, the more you can automate, the better. While untagged resource termination policies may seem harsh, they are effective.
Take a hard look at your CMDB; can you use that as the gold master? Are you using IaC? Can you use it across the board?
Ultimately, you need to think about how you want to slice your data for analysis and find a (means) method that gives you the dimensions needed to achieve that (end) goal.
Whether you choose to tag using a traditional approach, integrate with a data source such as a CMDB, or skip tagging altogether and use account-based structures, Envisor Cloud Analytics can meet your needs. Purpose-built for Run-level maturity and built on an open data lake architecture, it easily accommodates integration with enterprise data sources. Get in touch with us to learn more. We’d love to hear from you.