Cloud cost optimization is a primary goal of most FinOps practices. While FinOps teams often start with manual steps to shut down unused or underutilized resources, they quickly discover that waste keeps coming back as developers and IT continue to deploy new cloud resources. It can feel like a never-ending game of whack-a-mole to uncover and eliminate inefficient costs.
Automation is an important tool to prevent and eliminate inefficient use of cloud resources and keep cloud costs from getting out of control. Fortunately, you don’t need to wait until you’ve reached a magical level of FinOps maturity to implement automation. The FinOps Framework highlights automation opportunities at the Crawl, Walk, and Run stages of maturity.
Here are a few detailed recommendations for automating at each stage of your FinOps journey.
Crawl stage
Eliminate unused compute and storage
If you are early in your FinOps journey, you likely have significant spend on compute and storage resources that are no longer used. Often there are compute instances that were spun up for development, testing, or demo purposes, but were never shut down after the development or testing was completed. Another common area for waste is storage buckets that have been provisioned but aren’t being used.
You will want to collaborate with the relevant stakeholders to agree on rules that work for your business. For example, you might agree on a policy with your development teams that instances in development accounts that are unused for the previous 48 hours will be automatically shut down. Alternatively, you could notify the instance owner after 24 hours and then shut it down automatically after 48 hours.
How to automate:
- Search for unused compute instances in your development, test, or demo environments that show little or no CPU or memory utilization for a specified period of time. You can then alert the owners or optionally shut down the instances.
- Search for unused storage volumes where no read or write operations have been performed for a period of time or where the storage volume is not attached to any compute resources. You can then alert the owners or optionally take a final snapshot and delete the storage.
- Search for snapshots of storage volumes that are older than a specified number of days. You can then alert the owners or optionally delete the snapshots.
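The notify-then-shut-down policy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Instance` record, the environment names, and the 24-hour and 48-hour thresholds are hypothetical stand-ins for whatever your cloud provider's inventory and monitoring data actually expose, and no real cloud API is called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical thresholds, taken from the example policy in the text.
NOTIFY_AFTER = timedelta(hours=24)
SHUTDOWN_AFTER = timedelta(hours=48)
NON_PRODUCTION = {"development", "test", "demo"}


@dataclass
class Instance:
    instance_id: str
    environment: str       # assumed to come from an account name or tag
    last_active: datetime  # last time utilization exceeded your idle threshold


def decide_action(instance: Instance, now: datetime) -> str:
    """Return "none", "notify", or "shutdown" for a single instance."""
    if instance.environment not in NON_PRODUCTION:
        return "none"  # the policy only covers non-production environments
    idle_for = now - instance.last_active
    if idle_for >= SHUTDOWN_AFTER:
        return "shutdown"
    if idle_for >= NOTIFY_AFTER:
        return "notify"
    return "none"


now = datetime(2024, 1, 10, 12, 0)
fleet = [
    Instance("dev-1", "development", now - timedelta(hours=5)),
    Instance("dev-2", "development", now - timedelta(hours=30)),
    Instance("demo-1", "demo", now - timedelta(hours=72)),
    Instance("prod-1", "production", now - timedelta(hours=72)),
]
actions = {i.instance_id: decide_action(i, now) for i in fleet}
print(actions)
```

In practice the "notify" and "shutdown" results would feed an alerting or orchestration tool. The same pattern applies to unattached volumes and aged snapshots, with a different idleness signal.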
Watch for underutilized or expiring commitments
If you’ve begun using Reserved Instances or Savings Plans, make sure you maximize your utilization of these commitments. Low utilization of commitments means that you are wasting money you’ve already paid to the cloud providers. Allowing commitments to expire for even a few days or weeks without replacing them can also cause unexpected spikes in your cloud bill.
How to automate:
- Monitor utilization of commitments and send alerts if the utilization falls below a certain threshold for a specified period of time.
- Notify stakeholders in advance of expiring commitments so you can make any necessary adjustments.
- Use tools that can help automatically optimize and manage commitments for you.
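As a rough sketch of the first two checks, the Python below flags a commitment whose utilization is under a threshold or whose expiry date is approaching. The `Commitment` fields, the 80% utilization threshold, and the 30-day warning window are assumptions chosen for illustration; real utilization figures would come from your provider's billing or cost-management data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Commitment:
    name: str
    used_hours: float       # hours of the commitment actually consumed
    purchased_hours: float  # hours paid for in the same period
    expires_on: date


def commitment_alerts(c: Commitment, today: date,
                      min_utilization: float = 0.80,
                      expiry_warning_days: int = 30) -> list[str]:
    """Return human-readable alerts for one commitment (may be empty)."""
    alerts = []
    utilization = c.used_hours / c.purchased_hours if c.purchased_hours else 0.0
    if utilization < min_utilization:
        alerts.append(
            f"{c.name}: utilization {utilization:.0%} is below {min_utilization:.0%}"
        )
    days_left = (c.expires_on - today).days
    if 0 <= days_left <= expiry_warning_days:
        alerts.append(f"{c.name}: expires in {days_left} days")
    return alerts


example = Commitment("ri-a", used_hours=500, purchased_hours=720,
                     expires_on=date(2024, 3, 1))
alerts = commitment_alerts(example, today=date(2024, 2, 15))
for line in alerts:
    print(line)
```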
Get alerts when cloud spend will exceed budgets
Even in the early stages of your FinOps journey, you can use budget alerts to notify you if you are forecasted to exceed your budgeted cloud spend in any given month. By getting alerted, you can identify any unexpected costs and take action to bring your spend back in line with your budgets.
How to automate:
- Use automated tools to track forecasted monthly spend vs. budget and send alerts to stakeholders. You may choose to set up these alerts for any cloud vendor, department, cloud account, or cloud service.
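A forecast-versus-budget check can be as simple as the sketch below. It assumes a naive linear forecast (average daily spend so far, extended to the end of the month); commercial tools typically use more sophisticated models, and the function names here are hypothetical.

```python
import calendar
from datetime import date


def forecast_month_end(spend_to_date: float, today: date) -> float:
    """Extrapolate month-end spend from the average daily spend so far."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month


def is_over_budget(spend_to_date: float, budget: float, today: date) -> bool:
    """True when the linear forecast exceeds the monthly budget."""
    return forecast_month_end(spend_to_date, today) > budget


today = date(2024, 6, 10)
forecast = forecast_month_end(6000.0, today)
print(f"Forecast: {forecast:.2f}, over budget: {is_over_budget(6000.0, 15000.0, today)}")
```

Running the same check per cloud vendor, department, account, or service only requires grouping the spend data before calling it.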
Walk stage
Apply schedules for intermittently used compute instances
Many organizations have applications that don’t need to be running 24×7. The most common example of this is development instances that are not needed when developers aren’t working. However, you may have other applications that are only needed during business hours and can be shut down after hours or on weekends. Keep in mind that running an application 12×5 (12 hours a day for 5 days a week) means paying for 60 of the 168 hours in a week, which can save roughly 64% versus the cost of running 24×7.
How to automate:
- Automatically shut down development instances after hours or on weekends. Restart the instances before your developers come back to work each day. Use tags to indicate which instances can be shut down and the schedule to be followed.
- Apply the same approach to other instances that are only needed during business hours. Shut them down automatically after hours or on weekends and restart them before your work hours each day, again using tags to indicate which instances can be shut down and the schedule to be followed.
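The sketch below works through the arithmetic behind the 12×5 figure and shows one way a tag-driven schedule could be evaluated. It assumes purely hourly billing for the savings calculation, and the `schedule` tag with its `"07-19"` format (start and end hour on weekdays) is a hypothetical convention, not a standard one.

```python
from datetime import datetime

HOURS_PER_WEEK = 24 * 7  # 168


def schedule_savings(hours_per_day: int, days_per_week: int) -> float:
    """Fraction of 24x7 cost avoided, assuming purely hourly billing."""
    return 1 - (hours_per_day * days_per_week) / HOURS_PER_WEEK


def should_be_running(tags: dict, now: datetime) -> bool:
    """Evaluate a hypothetical tag like {"schedule": "07-19"} (weekdays only)."""
    schedule = tags.get("schedule")
    if schedule is None:
        return True  # untagged instances are left alone
    start_hour, end_hour = (int(part) for part in schedule.split("-"))
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= now.hour < end_hour


savings = schedule_savings(12, 5)
print(f"12x5 saves about {savings:.1%} versus 24x7")

dev_tags = {"schedule": "07-19"}
print(should_be_running(dev_tags, datetime(2024, 1, 10, 9, 0)))   # a Wednesday morning
print(should_be_running(dev_tags, datetime(2024, 1, 13, 10, 0)))  # a Saturday morning
```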
Use Bring-Your-Own-License (BYOL) to reduce software costs
Cloud providers offer the option to include commercial software like Windows Server, Red Hat Enterprise Linux, SQL Server, and Oracle Database as part of your cloud services. The cost of these software packages can be much higher than the cost of the underlying cloud infrastructure and can represent a significant portion of your cloud bill.
By default, you will be billed directly by the cloud provider for these software licenses on an hourly basis using “pay-as-you-go” (PAYG) prices. Unfortunately, for instances that are running 24×7, these PAYG prices can be 6 times more than the discounted prices you have negotiated with the software vendors.
Alternatively, you can change to a Bring-Your-Own-License (BYOL) option and use licenses that you have already purchased from the software vendor, which can represent as much as 85% savings. However, developers and IT staff often don’t know when to choose BYOL and instead leave the PAYG default in place, resulting in higher costs.
What to automate:
- Continuously look for instances using PAYG where money could be saved using BYOL. Alert your software asset management team and optionally apply the BYOL flag to the instance.
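One way to picture this check: compare the instances billed as PAYG against a pool of licenses you already own and list those that could switch. The sketch below assumes a simple one-license-per-instance model, which real vendor licensing terms (per core, per socket, mobility rights) rarely match exactly; the data structures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LicensedInstance:
    instance_id: str
    software: str       # e.g. "SQL Server"
    license_model: str  # "PAYG" or "BYOL"


def byol_candidates(instances: list[LicensedInstance],
                    available_licenses: dict[str, int]) -> list[str]:
    """Return IDs of PAYG instances that an unused owned license could cover."""
    remaining = dict(available_licenses)  # copy so the caller's pool is untouched
    candidates = []
    for inst in instances:
        if inst.license_model != "PAYG":
            continue
        if remaining.get(inst.software, 0) > 0:
            remaining[inst.software] -= 1  # reserve one license for this instance
            candidates.append(inst.instance_id)
    return candidates


instances = [
    LicensedInstance("vm-1", "SQL Server", "PAYG"),
    LicensedInstance("vm-2", "SQL Server", "PAYG"),
    LicensedInstance("vm-3", "Windows Server", "BYOL"),
    LicensedInstance("vm-4", "Oracle Database", "PAYG"),
]
owned = {"SQL Server": 1, "Windows Server": 5}
candidates = byol_candidates(instances, owned)
print(candidates)
```

The resulting list is what you would send to your software asset management team for review before any instance is changed.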
Run stage
Rightsize cloud resources
As more and more workloads are deployed in the cloud, it is very common to find that you are paying for larger instance sizes or higher resource tiers than are needed. This results in wasted spend that could be recaptured by rightsizing.
You will want to gather information about overprovisioned resources and collaborate closely with your DevOps teams to determine where rightsizing is feasible.
What to automate:
- Use automated checks to identify compute or database instances that are under utilization thresholds over a specified period of time. Send alerts or create tickets to evaluate for rightsizing.
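A minimal sketch of such a check is shown below. It flags an instance only when both its average and its peak CPU utilization stay under thresholds across the sampled period, so that a mostly idle instance with occasional spikes is not downsized by mistake. The 20% and 50% thresholds and the input format are assumptions; a real check would usually also consider memory, I/O, and network.

```python
from statistics import mean


def rightsizing_candidates(cpu_samples_by_instance: dict[str, list[float]],
                           avg_threshold: float = 20.0,
                           peak_threshold: float = 50.0) -> list[str]:
    """Return instance IDs whose average and peak CPU % are both under threshold."""
    flagged = []
    for instance_id, samples in cpu_samples_by_instance.items():
        if not samples:
            continue  # no data: do not guess
        if mean(samples) < avg_threshold and max(samples) < peak_threshold:
            flagged.append(instance_id)
    return flagged


samples = {
    "db-1": [5, 8, 12, 10],   # consistently low
    "web-1": [5, 5, 60, 5],   # low average but a significant peak
    "batch-1": [60, 70, 65],  # busy
    "new-1": [],              # no metrics collected yet
}
flagged = rightsizing_candidates(samples)
print(flagged)
```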
Automate commitment management
As you progress in your FinOps maturity, you will make extensive use of Savings Plans and other commitments like Reserved Instances or Committed Use Discounts. Managing and optimizing these commitments and ensuring they are fully utilized becomes more and more complex for large environments.
Automated tools can take on much of this work, provided you define the limits within which they are allowed to act.
What to automate:
- Use automated tools that continuously analyze your environment, make recommendations, and even take automated actions to buy, convert, and sell commitments within guardrails that you specify.
Optimize container environments
With the rapid adoption of Kubernetes, optimizing these environments can result in significant savings and better performance. FinOps and DevOps teams must work together to evaluate how automated tools can help them optimize Kubernetes clusters and choose the best purchase options, including on-demand instances, spot instances, and commitments such as Savings Plans.
What to automate:
- Automated tools can help you continuously analyze workloads to rightsize Kubernetes clusters, avoiding both under- and over-provisioning.
- Automated tools can identify the best instance purchase options for Kubernetes and maximize the use of commitments.
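To make the rightsizing idea concrete, the sketch below compares a workload's CPU request with its observed usage and classifies it as over-provisioned, under-provisioned, or acceptable. The 20% headroom, the millicore samples, and the function names are illustrative assumptions; this is not how any particular Kubernetes tool computes its recommendations.

```python
def suggest_cpu_request(usage_millicores: list[int], headroom_percent: int = 20) -> int:
    """Suggest a request of peak usage plus headroom, rounded up (integer math)."""
    peak = max(usage_millicores)
    return -(-peak * (100 + headroom_percent) // 100)  # ceiling division


def classify(request_millicores: int, usage_millicores: list[int]) -> str:
    """Compare the current request with observed usage."""
    if request_millicores < max(usage_millicores):
        return "under-provisioned"
    if request_millicores > suggest_cpu_request(usage_millicores):
        return "over-provisioned"
    return "ok"


workloads = {
    "api": (1000, [200, 250, 230]),
    "worker": (200, [300, 350]),
    "cache": (300, [240, 250]),
}
report = {name: classify(request, usage) for name, (request, usage) in workloads.items()}
print(report)
```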
How Flexera helps
Flexera offers a comprehensive set of FinOps solutions that can help you at all stages of FinOps maturity. Identified as a leader in both the Gartner® Magic Quadrant™ for Cloud Financial Management and the Forrester Cloud Cost Management and Optimization Wave™ Report, Flexera provides solutions that help you optimize your cloud spend.