Overview
You can use this document to:
- Learn about the concepts of auto scaling at StackPath
- Add auto scaling to an existing workload
You can also add auto scaling when you first create a container-based workload or a virtual machine-based workload.
- To learn how to create a workload and add auto scaling, see Create and Manage Virtual Machines, Containers, and Workloads.
StackPath allows you to auto scale your workloads granularly across PoPs so that you can serve your users worldwide and ensure that an unexpected spike in localized traffic is served without any degradation of service.
Note: While auto scaling does not incur any unique fees, you are charged at standard rates for the additional instances and resources that auto scaling provisions.
Introduction to auto scaling at StackPath
Auto scaling monitors the CPU usage of your instances every 15 seconds. If usage is above the configured threshold, additional instances are provisioned according to the auto scaling configuration. Instances are scaled down when CPU usage is at least 10% below the configured threshold and at least 5 minutes have passed since the last auto scaling action.
This workflow ensures that instances are not overloaded, so all requests are served promptly. It also keeps your infrastructure cost-efficient, because additional instances are added only when needed and removed when they are not.
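The behavior described above can be summarized in a short sketch. The following Python is illustrative only: the names are ours, it treats "at least 10% lower" as 10 percentage points below the threshold, and it is not StackPath's actual implementation.

```python
# Illustrative sketch of the scaling rules above -- not StackPath's implementation.
from dataclasses import dataclass

POLL_INTERVAL_SECONDS = 15      # CPU usage is sampled every 15 seconds
SCALE_DOWN_MARGIN = 10          # usage must be at least 10 points below the threshold (assumption)
COOLDOWN_SECONDS = 5 * 60       # at least 5 minutes since the last auto scaling action

@dataclass
class PopState:
    instances: int
    last_action_at: float = 0.0  # timestamp of the last scale up/down for this PoP

def decide(cpu_percent: float, threshold: float, state: PopState,
           min_instances: int, max_instances: int, now: float) -> int:
    """Return the new instance count for a single PoP; call once per polling interval."""
    if cpu_percent > threshold and state.instances < max_instances:
        state.instances += 1     # scale up: usage is above the configured threshold
        state.last_action_at = now
    elif (cpu_percent <= threshold - SCALE_DOWN_MARGIN
          and now - state.last_action_at >= COOLDOWN_SECONDS
          and state.instances > min_instances):
        state.instances -= 1     # scale down: usage well below threshold and cooldown elapsed
        state.last_action_at = now
    return state.instances
```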
While auto scaling is configured per deployment target, the auto scaling actually occurs per PoP.
Consider the following scenario:
- You have a North American deployment target with PoP locations in Ashburn, Dallas, and New York.
- You have a CPU usage threshold of 50%.
- During monitoring, Ashburn reaches 30% CPU usage, Dallas reaches 70%, and New York reaches 30%.
- A new instance will be created in Dallas, not in Ashburn or New York, because only Dallas exceeds the 50% threshold.
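A minimal sketch of that per-PoP check, using the scenario's numbers (illustrative names and values only):

```python
# Which PoPs would scale up in the scenario above (illustrative only).
threshold = 50  # configured CPU usage threshold, in percent
cpu_by_pop = {"Ashburn": 30, "Dallas": 70, "New York": 30}

pops_to_scale_up = [pop for pop, cpu in cpu_by_pop.items() if cpu > threshold]
print(pops_to_scale_up)  # ['Dallas'] -- only Dallas exceeds the threshold
```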
If your workload is configured with an Anycast IP address, requests will automatically be routed to the new instances as soon as they are ready, as determined by readiness probes.
- To learn more, see Edge Computing: Liveness and Readiness Probes.
Container-based workloads will auto scale based on the selected container image.
Virtual machine-based workloads will auto scale based on the selected OS image. Because this image requires configuration, ensure that cloud-init user data and readiness probes are configured so that your virtual machines are fully set up before they are added to Anycast IP routing.
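For example, a virtual machine's HTTP readiness probe can point at a small endpoint that only reports ready once cloud-init has finished. The sketch below is one possible approach, not a StackPath requirement; the port and path are arbitrary choices, and it relies on the boot-finished marker file that cloud-init writes when it completes.

```python
# Hypothetical readiness endpoint: report ready only after cloud-init has finished,
# so the VM is not added to Anycast routing before it is fully configured.
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

BOOT_FINISHED = "/var/lib/cloud/instance/boot-finished"  # created by cloud-init on completion

class ReadyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/ready" and os.path.exists(BOOT_FINISHED):
            self.send_response(200)  # ready: cloud-init is done
        else:
            self.send_response(503)  # not ready yet
        self.end_headers()

if __name__ == "__main__":
    # Port 8080 and the /ready path are arbitrary; match them to your workload's readiness probe.
    HTTPServer(("0.0.0.0", 8080), ReadyHandler).serve_forever()
```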
Add auto scaling to an existing workload
You can add auto scaling when you first create a workload or to an existing workload.
- To learn how to create a workload and add auto scaling, see Create and Manage Virtual Machines, Containers, and Workloads.
Note: You can create different auto scaling configurations for different deployment targets.
- In the StackPath Control Portal, in the left-side navigation, click Edge Compute.
- Under Workloads, click the ellipsis for the desired workload, and then click Edit.
- Navigate to the Targets page.
- Mark Enable Auto Scaling.
- For CPU Utilization, enter the CPU usage threshold that will trigger auto scaling.
- When a PoP location reaches this limit, auto scaling takes place for that PoP.
- For Min Instances Per PoP, enter the minimum number of instances that can run per PoP location when auto scaling is activated.
- For Max Instances Per PoP, enter the maximum number of instances that can be created per PoP location when auto scaling is activated. (The sketch after these steps illustrates how these values bound the instance count.)
- Click Go to Summary and review any changes.
- Click Relaunch Workload.
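The minimum and maximum values bound the instance count that auto scaling can choose for any single PoP. A minimal sketch of that relationship (illustrative only, not StackPath's implementation):

```python
# Illustrative only: Min/Max Instances Per PoP bound the count auto scaling can choose.
def clamp_instances(desired: int, min_per_pop: int, max_per_pop: int) -> int:
    return max(min_per_pop, min(desired, max_per_pop))

print(clamp_instances(desired=6, min_per_pop=1, max_per_pop=4))  # 4 -- capped at the maximum
print(clamp_instances(desired=0, min_per_pop=1, max_per_pop=4))  # 1 -- never below the minimum
```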