When it launched, Forza Horizon 5 saw more than 10 million concurrent players—the biggest first week in Xbox Game Studios history—drive across vibrant, virtual landscapes and face off in thrilling tournaments and challenges. Forza Horizon 5 is also a stunning win behind the scenes, where Azure platform as a service (PaaS) provides a virtual crew, and a Windows-based container architecture adapts to changing demand at warp speed. The game developers rely on autoscaling Azure Kubernetes Service (AKS) clusters to meet the most challenging performance demands, while fully managed Azure services free the team from time-consuming infrastructure management tasks.
What the game developers learned about AKS applies to retail, financial services, streaming, and other workloads that need Grand Prix performance: high concurrency, low latency, and the ability to absorb big spikes in demand.
High Speed and Low Latency Race
Developed by Turn 10 Studios and Playground Games and published by Xbox Game Studios, the Forza Horizon series has grown in popularity with each new title. As the series has exploded in size, so too has the infrastructure needed to support it. At first, the Windows-based development team hosted the game on-premises, buying and borrowing servers as needed.
As is common in the gaming industry, the Forza Horizon code base is built entirely on Windows. According to Turn 10 Program Manager Madden Osei, “We wanted to deliver more agility and speed to scale for millions of players while abstracting the lower layers of VM management. But our preferred solution needed to support Windows-based images.”
Going to 3 Million Concurrent Users
The answer came from a technology that ForzaTech didn’t expect—Kubernetes, the popular container orchestration platform. “We had never used container technology whatsoever. There was never any reason to even look into Kubernetes,” Hennessy recalls. But the team leaned into the curve. “Within a month, basically we had converted our services over to AKS and had everything running.”
AKS also allowed the team to iterate much faster during stress testing, a mission-critical step in preparing for the big launch day. Before moving to AKS, it could take ForzaTech developers half an hour to swap out old images on their VMs and prepare them for load. In AKS, this process takes seconds. The extra speed enables the team to instantly make repairs, overcome roadblocks, and scale as needed. In addition, the team can script cluster updates ahead of time, adding flexibility and agility to the demanding stress tests.
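To illustrate what scripting cluster updates ahead of time might look like, here is a minimal sketch that builds the kubectl command lines for an image swap and a rescale without running them. The deployment name, container name, image reference, and namespace are hypothetical; the source does not describe ForzaTech's actual scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceUpdate:
    """One planned change to a Deployment (all values are hypothetical)."""
    deployment: str  # Kubernetes Deployment name
    container: str   # container name inside the pod spec
    image: str       # new image reference to roll out
    replicas: int    # replica count for the next test stage


def plan_commands(updates, namespace="stress-test"):
    """Return kubectl argv lists for each update, without executing anything."""
    commands = []
    for u in updates:
        target = f"deployment/{u.deployment}"
        # Swap the container image; Kubernetes then rolls the pods over.
        commands.append(["kubectl", "-n", namespace, "set", "image", target,
                         f"{u.container}={u.image}"])
        # Set the replica count for the upcoming stage.
        commands.append(["kubectl", "-n", namespace, "scale", target,
                         f"--replicas={u.replicas}"])
        # Wait for the rollout to finish before the next step.
        commands.append(["kubectl", "-n", namespace, "rollout", "status", target])
    return commands


# Example plan for a hypothetical UGC service image.
for argv in plan_commands(
    [ServiceUpdate("ugc", "ugc", "example.azurecr.io/ugc:2.0.1", 6)]
):
    print(" ".join(argv))
```

Keeping the plan as data means it can be reviewed before a test run and then passed to a runner such as `subprocess.run` when the stage begins.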
Well ahead of the big launch, the stress tests proved that ForzaTech’s services could easily scale from 600,000 to 3 million concurrent users. The team was more than a little happy. “We were thrilled!” Osei says. “Every reasonable estimate for players and concurrent users was broken.”
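As a rough illustration of what a fivefold jump in users means for pod counts, the sketch below applies the ratio rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The source does not say which autoscaler settings ForzaTech used, so the pod counts and per-pod targets here are assumptions, and the sketch ignores the autoscaler's tolerance band and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=1000):
    """Replica count from the HPA ratio rule, clamped to the given bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical sizing: 20 pods handling 600,000 users is 30,000 users per pod.
target_users_per_pod = 600_000 / 20
# At 3 million users, those same 20 pods would each see 150,000 users.
current_users_per_pod = 3_000_000 / 20

print(desired_replicas(20, current_users_per_pod, target_users_per_pod))  # 100
```

Under these assumed numbers, a fivefold increase in load maps to a fivefold increase in replicas, from 20 pods to 100.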
Turbocharging a Gaming Architecture
The Forza Horizon 5 architecture includes 17 core services that run in AKS. Among them, the leaderboard service tells users where they rank, while the user-generated content (UGC) service supports the ability for players to upload photos and other content. The auction house service enables players to sell cars to each other, and the analytics service collects game and player telemetry used in reports.
To simplify the migration to Windows containers, ForzaTech created a playbook based on the experience of moving the UGC service. After about two weeks of testing, the team applied the playbook to the other services.
The front door for all client requests is the aggregator service, a single autoscaling deployment that runs multiple pods in AKS. All the pods share the same AKS configurations and secrets. The aggregator service calls the other microservices, which also run in containers. Because these services have similar architectures, the playbook was easy to follow. Each service is instrumented with Kubernetes liveness and readiness probes, so the team can quickly identify issues and respond, for example by adjusting the scale.
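For readers unfamiliar with liveness and readiness probes, the following minimal sketch shows the service side of an HTTP probe. Kubernetes treats any response code from 200 through 399 as success. The paths `/healthz` and `/readyz`, the port, and the `ready` flag are illustrative assumptions, not details of the Forza Horizon 5 services.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set once startup work (loading config, warming caches) has finished.
ready = threading.Event()


def probe_status(path):
    """Map a probe path to an HTTP status code."""
    if path == "/healthz":   # liveness: the process is up and responding
        return 200
    if path == "/readyz":    # readiness: safe to receive traffic
        return 200 if ready.is_set() else 503
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = probe_status(self.path)
        body = b"ok\n" if status == 200 else b"not ok\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the logs


def serve(port=8080):
    """Start the probe listener; blocks until the process is stopped."""
    ready.set()  # a real service would set this after initialization
    ThreadingHTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

A failing liveness probe causes Kubernetes to restart the container, while a failing readiness probe only removes the pod from service endpoints until it recovers. That is why the two checks are kept separate here.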
For business continuity, two instances of every service run on just four VMs. Before the move to AKS, ForzaTech ran a minimum of 40 VMs at all times. “Most of them were running very cold,” Hennessy admits. “This configuration reduced the cost of our test environment by 90 percent.”
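The VM counts above line up with the quoted savings if cost is roughly proportional to the number of VMs, which is an assumption the source does not state. A quick check:

```python
def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100 * (before - after) / before


# VM counts from the text: a minimum of 40 before AKS, 4 after.
print(percent_reduction(40, 4))  # 90.0
```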
Revving Up for the Big Launch
To ensure that the new container-based architecture could handle the anticipated loads, ForzaTech built a stress test harness on Azure and set a goal to exceed the launch volumes seen when the previous version of the game debuted. However, Forza Horizon 5 outpaced that launch with five times the number of players. Fortunately, the team worked with partners at Azure PlayFab, a service for building and operating games on a live platform with dedicated, global, multiplayer servers.
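The source does not describe how the stress test harness was built. As a loose sketch of the general idea, the code below ramps simulated concurrent users through stages and counts successes and failures. The `fake_request` function stands in for a real client call, and the stage sizes are scaled-down, hypothetical values that keep the same 600-to-3,000 ratio as the 600,000-to-3-million test described earlier.

```python
import asyncio
import time


async def fake_request(user_id):
    """Stand-in for one client call; a real harness would hit the service."""
    await asyncio.sleep(0.001)
    return user_id


async def run_stage(concurrent_users, request=fake_request):
    """Issue one request per simulated user concurrently.

    Returns (succeeded, failed, elapsed_seconds).
    """
    start = time.perf_counter()
    results = await asyncio.gather(
        *(request(i) for i in range(concurrent_users)),
        return_exceptions=True,  # collect failures instead of aborting
    )
    failed = sum(isinstance(r, Exception) for r in results)
    return len(results) - failed, failed, time.perf_counter() - start


async def ramp(stages):
    """Run each stage in order and return (users, succeeded, failed) tuples."""
    report = []
    for users in stages:
        ok, failed, seconds = await run_stage(users)
        report.append((users, ok, failed))
        print(f"{users} users: {ok} ok, {failed} failed in {seconds:.3f}s")
    return report


report = asyncio.run(ramp([600, 1200, 3000]))
```

A production harness would distribute load across many machines and record latency percentiles. This single-process version only shows the staged ramp and the pass/fail accounting.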
After reassessing the game’s needs, the team scaled the infrastructure down, saving more than $100,000 the following month. Hennessy notes that it was as simple as turning a dial one way to scale up for the launch and expected holiday usage and then turning the dial the other way to scale down and reduce costs.