13 November, 2023
July 26, 2022
A pioneering company in the online games industry was facing a dilemma. Among the first to offer free-to-play social games on social networks and mobile platforms, the company had been using HashiCorp Terraform to deploy its infrastructure both on-premises and in the cloud. However, to maximize performance, it needed to ensure that the system would not deploy more virtual machines (VMs) than the appropriate ratio to its VMware ESXi hosts allowed. The team’s strategy of manually checking clusters was inefficient and caused excessive delays.
The company uses thousands of bare metal servers that run tens of thousands of VMs. To “cloudify” this experience, the team developed a Terraform provider for bare metal management. TeraSky’s team was involved in developing the Terraform provider, including resolving crucial corner cases. To avoid performance degradation, the gaming company wanted to maintain a configurable ratio between ESXi hosts and VMs within each vSphere cluster. The goal was to update the system so that, whenever new VMs were about to be provisioned through Terraform, it would first confirm that the ratio would not be exceeded and only then deploy the VMs.
TeraSky separated the solution into three parts. First, the team deployed Terraform Enterprise to the client’s on-premises environment so that it could interact with the client’s vSphere components. Next, they created Terraform Enterprise workspaces to provision ESXi hosts and VMs to a specific vSphere cluster. Finally, they wrote a policy in Sentinel, HashiCorp’s policy-as-code framework, that checks the ratio of ESXi hosts to VMs in the vSphere environment after each workspace plan and, if the ratio would be surpassed, fails the deployment with clear feedback to infrastructure engineers.
“The Sentinel policy was an advanced customization and key to the success of the solution,” explained TeraSky’s CTO, Lev Andelman. The policy performs the following steps:
1. Obtain the number of ESXi hosts in the vSphere cluster (fetched from the plan variables)
2. Obtain the current number of VMs in the vSphere cluster
3. Calculate the number of VMs that are going to be deployed in the current run (plan)
4. Check whether the combined count of current and planned VMs would exceed the permitted ratio of VMs to ESXi hosts (configured as a variable), and fail the run if it would.
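The Sentinel policy itself is not reproduced in this case study. As a rough illustration of the four steps above, here is a minimal Python sketch of the same logic. It assumes the plan is available in the JSON form produced by `terraform show -json` and that VMs are declared with the `vsphere_virtual_machine` resource type; the function names, the `max_vms_per_host` parameter, and the sample numbers are hypothetical and not taken from TeraSky’s implementation.

```python
from dataclasses import dataclass


@dataclass
class RatioCheck:
    allowed: bool
    message: str


def count_planned_vms(plan: dict, vm_type: str = "vsphere_virtual_machine") -> int:
    """Step 3: count the VMs this plan would add to the cluster.

    Assumes the `resource_changes` layout of `terraform show -json`.
    Replacements (delete plus create) are skipped because they do not
    change the net number of VMs.
    """
    count = 0
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if rc.get("type") == vm_type and "create" in actions and "delete" not in actions:
            count += 1
    return count


def check_vm_ratio(host_count: int, current_vms: int, planned_vms: int,
                   max_vms_per_host: float) -> RatioCheck:
    """Step 4: fail when current plus planned VMs exceed the permitted ratio."""
    if host_count <= 0:
        return RatioCheck(False, "cluster has no ESXi hosts")
    limit = host_count * max_vms_per_host
    total = current_vms + planned_vms
    if total > limit:
        return RatioCheck(
            False,
            f"{total} VMs would exceed the limit of {limit} "
            f"({host_count} hosts x {max_vms_per_host} VMs per host)",
        )
    return RatioCheck(True, f"{total} of {limit} permitted VMs in use")


# Hypothetical inputs: steps 1 and 2 would come from plan variables and vSphere.
sample_plan = {
    "resource_changes": [
        {"type": "vsphere_virtual_machine", "change": {"actions": ["create"]}},
        {"type": "vsphere_virtual_machine", "change": {"actions": ["create"]}},
        {"type": "vsphere_virtual_machine", "change": {"actions": ["delete", "create"]}},
        {"type": "vsphere_folder", "change": {"actions": ["create"]}},
    ]
}
result = check_vm_ratio(
    host_count=4,
    current_vms=39,
    planned_vms=count_planned_vms(sample_plan),
    max_vms_per_host=10,
)
print(result.allowed, result.message)
```

In this sketch the check returns a message along with the pass or fail result, mirroring the case study’s emphasis on giving engineers clear feedback when a run is rejected.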
The Sentinel policy was applied to both version control system (VCS) integrated workspaces and self-service, API-driven workspaces, providing clear and immediate policy feedback. This allows the client’s engineers to adjust their Terraform code and re-apply the configuration quickly and in line with company best practices.
The Bottom Line
TeraSky’s customized solution achieved the client’s objectives precisely. Further, the combination of Sentinel with a private cloud infrastructure from VMware made this solution the first of its kind. “TeraSky’s implementation both allows us to easily deploy new VMs to all vSphere clusters without the bottlenecks caused by manual checks and improves our system’s overall performance,” explained a company representative. “The difference is significant – to our end users and us.”