The RightScale engineering team is moving the entire RightScale cloud management platform, comprising 52 services and 1,028 cloud instances, to Docker. This article is the sixth in a series chronicling “Project Sherpa” and is written by Ryan Williamson, senior software engineer, and Tim Miller, vice president of engineering.
We made it. After seven weeks, we finally reached the summit on our journey to complete our Docker migration. Our original plan was for our Dev team to spend four weeks and our Ops team to spend five weeks on the project. In the end, we were off by a week or so in our estimate. Ninety percent of the developers working on Sherpa spent five weeks, while ten percent continued on into weeks six and seven. Our Sherpa Ops team was fully committed for six weeks and 50 percent committed during week seven.
Out of 52 services, we completed the containerization of 48 services. For three of the remaining four, we determined that there wasn’t any ROI in migrating them to Docker. We did not finish porting the only service that we had classified as Super Mega Crazy Hard when we did our initial assessment. Several intrepid engineers raised their hands to take it on after they finished their assigned services, and we expect to finish migrating that service to containers in the next month.
During Project Sherpa, we got a good start on updating our automated release tools, container-based development tools, and integrations with our Travis CI system, but more work will continue in these areas as we optimize our processes around containers. Along the way, we built a new RightScale Container Manager feature for the RightScale Platform to enable our internal Ops teams as well as our customers to use RightScale to view and manage all of our running containers.
The Bottom Line
We had two goals for Project Sherpa: accelerate development and reduce costs. We achieved both.
Out of the 1,028 cloud instances we were using when we started, 670 were running dynamic apps (our software), and all of these were migrated to Docker containers. The remaining 358 were running “static” apps that don’t change much and for which we didn’t see much ROI in moving to containers. These static apps include SQL databases, Cassandra rings, MongoDB clusters, Redis, Memcached, RabbitMQ systems, syslog servers, HTTP load balancers, collectd aggregators, proxy servers, and VPN endpoints. The instances for the static apps support the dynamic apps that we did migrate, so we are operating in a hybrid environment of containerized and non-containerized components. We believe this will be a common model for many companies using Docker, because some components (such as storage systems) may not always benefit from containerization and may even incur a performance or maintenance penalty if containerized. That said, we will continue to evaluate the ROI over time and can migrate the static apps later if we decide there is value.
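In practice, a hybrid setup like this often comes down to containerized app services reaching their non-containerized backends through ordinary host/port configuration. The sketch below illustrates the pattern; the service names, hostnames, and environment variables are hypothetical, not RightScale’s actual configuration:

```yaml
# docker-compose.yml (illustrative sketch; names and addresses are invented)
version: "2"
services:
  api:
    image: example/api:1.4.2        # a containerized "dynamic" app
    ports:
      - "8080:8080"
    environment:
      # The "static" backends stay on plain cloud instances; the container
      # reaches them over the network like any other client would.
      DATABASE_URL: "mysql://api:secret@db01.internal.example.com:3306/api"
      MEMCACHED_HOSTS: "cache01.internal.example.com:11211"
      RABBITMQ_URL: "amqp://api:secret@mq01.internal.example.com:5672"
```

Because the backends are addressed by hostname rather than by container links, the same image runs unchanged whether a given backend is containerized or not.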
Here is a breakdown of the savings we realized:
- Reduced the number of instances running dynamic apps by 55 percent
- Reduced the cost of dynamic apps by a weighted average of 53 percent:
  - 44 percent savings in production
  - 63 percent savings in staging
  - 74 percent savings in dev and test integration
The higher savings in our dev and test integration environments came from being able to do more work on laptops and desktops instead of on cloud instances, because Docker makes it easy to assemble the necessary services locally.
We are also already seeing a broad set of agility benefits that are speeding up our development process. As time goes on, we expect to have more data, but here are a few early examples of the benefits we have gained:
- A developer working on a bug ran into issues in the integration environment. Integration is made up of dozens of services, and when something is wrong with one or more of them, it can be incredibly difficult to test a specific change. He needed to run a specific service locally to troubleshoot, which prior to the Docker migration would have eaten days of his life. Thanks to containers, he was able to get it set up locally and resolve his bug within the same day.
- A developer was working on an issue in an application he had no experience with. With Project Sherpa, we now have consistent interface contracts for how applications are configured and deployed, so he was able to jump right in, get the app running locally, and start debugging. When he later needed to hand off the issue to another developer, all he had to do was provide his image name; the second developer got up to speed quickly, with no prior knowledge of the application, and continued where the first developer had left off.
- A developer doing some work over the weekend needed to tie his local Docker setup into just a few specific systems that live in our integration environments. Normally he would use the integration environment designated for his team, but these environments are often terminated over the weekend to save money. Because another team in a different time zone had their integration environment running, he was able to connect the local Docker setup on his laptop to their environment right away and continue his work.
- A product manager is now able to take features that are actively being worked on by developers and load them up on his local box to see a demo without requiring cycles from the dev team. This is a huge win because it reduces handoff work and keeps the product manager up to speed on the current state of development.
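The image-name handoffs in the second and fourth examples work because every app now follows the same configuration contract, so bringing up someone else’s work takes no app-specific knowledge. The commands below are a hypothetical sketch of that workflow; the registry, image name, tag, and environment variables are invented for illustration:

```shell
# Pull the in-progress image the first developer shared
# (registry, repository, and tag are hypothetical)
docker pull registry.example.com/sherpa/billing-api:fix-issue-1234

# Run it locally; because every service accepts configuration the same
# way (environment variables in this sketch), no app-specific setup
# knowledge is needed to bring it up
docker run --rm -p 8080:8080 \
  -e DATABASE_URL="mysql://dev:dev@localhost:3306/billing" \
  -e LOG_LEVEL=debug \
  registry.example.com/sherpa/billing-api:fix-issue-1234
```

The same two commands cover the product-manager demo case: pull the feature image by name and run it, with no help from the dev team.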
There are certainly more improvements that we will make in our use of Docker, but we would definitely consider Project Sherpa a success based on the early results. We’re already realizing cost savings and more efficient development, and we expect to further improve our agility going forward.