The Inform stage is all about empowering the different DataFinOps personas (Michael and Nan, Raj and Sammi, Jon) with 360° visibility into cloud data costs, precise allocation, real-time budgeting information, and accurate forecasting.
MegaBankCorp has only hazy visibility into its cloud data spend. One report found that gaining visibility into cloud usage is the single biggest challenge to controlling costs; another found that only 3 out of 10 organizations know exactly where their spend is going, with the majority either guesstimating or having no idea. And the larger the company, the more significant the cost visibility problem.
Usually, a lack of insight into cloud costs is a problem of too much information, not too little. The cloud data platforms and providers themselves generate literally millions of cost and billing data points. They know exactly what costs were incurred (what resources were “rented,” how much they cost, how long they were in use). This is everything that ultimately adds up to the bill you see. Then there’s all the performance engineering data (about who’s doing what, when, where, why, and how) that you get from Spark, Kafka, Airflow, dbt, etc. This is the data you analyze to understand why the incurred costs are what they are.
To make data-driven decisions, you first have to have the data. And that starts with stitching together the financial data and the performance data to show you exactly where the money is going.
The dozen or more systems you have running generate or capture gigabytes of financial and performance details about cloud data costs: cloud billing data, performance engineering metrics, variable pricing decisions, cluster and data storage usage/costs. There’s a ton of data available out there, but it’s all over the place—hidden in plain sight.
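As a rough illustration of what "stitching together" financial and performance data can look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the record layouts, the field names (`cluster_id`, `cost_usd`, `runtime_hours`), the sample values, and the `allocate_costs` helper are hypothetical, and splitting a cluster's bill across jobs in proportion to runtime is only one of several possible allocation rules.

```python
from collections import defaultdict

# Hypothetical billing line items (financial data): what each cluster cost.
billing = [
    {"cluster_id": "c-1", "cost_usd": 120.0},
    {"cluster_id": "c-2", "cost_usd": 80.0},
]

# Hypothetical job runs (performance data): which job ran where, for how long.
job_runs = [
    {"job": "fraud_scoring", "cluster_id": "c-1", "runtime_hours": 3.0},
    {"job": "daily_ledger", "cluster_id": "c-1", "runtime_hours": 1.0},
    {"job": "risk_report", "cluster_id": "c-2", "runtime_hours": 2.0},
]


def allocate_costs(billing, job_runs):
    """Join billing to job runs on cluster_id and split each cluster's
    cost across its jobs in proportion to runtime (an assumed rule)."""
    cluster_cost = defaultdict(float)
    for item in billing:
        cluster_cost[item["cluster_id"]] += item["cost_usd"]

    cluster_hours = defaultdict(float)
    for run in job_runs:
        cluster_hours[run["cluster_id"]] += run["runtime_hours"]

    allocated = []
    for run in job_runs:
        total_hours = cluster_hours[run["cluster_id"]]
        share = run["runtime_hours"] / total_hours if total_hours else 0.0
        cost = cluster_cost[run["cluster_id"]] * share
        allocated.append({**run, "cost_usd": cost})
    return allocated


costed_runs = allocate_costs(billing, job_runs)
for run in costed_runs:
    print(run["job"], run["cluster_id"], round(run["cost_usd"], 2))
```

The point of the join is that neither source answers the question alone: the bill says what a cluster cost, the telemetry says which jobs used it, and only the combination says what each job cost.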
Before the DataFinOps team at MegaBankCorp can do anything, all this financial and performance data has to be collected and correlated so that Finance, Engineering, and the lines of business (LOBs) can slice and dice the data to look at costs from various business viewpoints:
- By individual user
- By individual job
- By team, department, line of business (LOB)
- By project, initiative, or “data product”
- By application/pipeline
- By cluster
- By environment (Dev vs. Prod)
- By budget
And this collection and correlation requires a high degree of automation.
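Once cost records have been correlated and tagged, the viewpoints listed above amount to grouping the same records by different keys. The following is a minimal sketch under stated assumptions: the records, the tag names (`user`, `job`, `team`, `cluster`, `env`), and the `cost_by` helper are hypothetical and stand in for whatever the real collection pipeline produces.

```python
from collections import defaultdict

# Hypothetical cost records, already correlated and tagged by dimension.
records = [
    {"user": "user_a", "job": "fraud_scoring", "team": "risk",
     "cluster": "c-1", "env": "prod", "cost_usd": 90.0},
    {"user": "user_b", "job": "daily_ledger", "team": "finance",
     "cluster": "c-1", "env": "prod", "cost_usd": 30.0},
    {"user": "user_a", "job": "risk_report", "team": "risk",
     "cluster": "c-2", "env": "dev", "cost_usd": 80.0},
]


def cost_by(records, dimension):
    """Total cost per value of one dimension (user, team, env, ...).
    Records missing the tag are grouped under 'unallocated'."""
    totals = defaultdict(float)
    for record in records:
        key = record.get(dimension, "unallocated")
        totals[key] += record["cost_usd"]
    return dict(totals)


# The same records, viewed from several business viewpoints.
for dimension in ("user", "team", "cluster", "env"):
    print(dimension, cost_by(records, dimension))
```

Grouping untagged records under an explicit `unallocated` bucket, rather than dropping them, keeps every view summing to the same total and makes gaps in tagging visible.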