This feature unlocks many use cases that weren't previously compatible with Cloud Run:
Executing background tasks and other asynchronous processing work after returning responses
Leveraging monitoring agents like OpenTelemetry that may assume access to CPU in background threads
Using Go goroutines, Node.js async, Java threads, or Kotlin coroutines
Moving Spring Boot apps that use built-in scheduling/background functionality
Listening for Firestore changes to keep an in-memory cache up to date
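To make the first use case above concrete, here is a minimal sketch in Python of a handler that queues work and returns right away, leaving a background thread to finish the job. The names `handle_request`, `worker`, `jobs`, and `processed` are hypothetical, and the uppercase conversion is only a stand-in for real work. This pattern relies on the thread getting CPU time after the response has been sent.

```python
import queue
import threading

# Hypothetical in-process job queue and result list, for illustration only.
jobs = queue.Queue()
processed = []


def worker():
    """Drain the queue in a background thread, outside of request handling."""
    while True:
        item = jobs.get()
        try:
            processed.append(item.upper())  # stand-in for real background work
        finally:
            jobs.task_done()


# Daemon thread so it does not block process shutdown.
threading.Thread(target=worker, daemon=True).start()


def handle_request(payload):
    """Queue the work and respond immediately, before the work is done."""
    jobs.put(payload)
    return 202, "accepted"
```

In a real service, `handle_request` would sit behind your web framework of choice. Because autoscaling can still terminate instances, work queued this way should be safe to lose or retry.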
Even if CPU is always allocated, Cloud Run autoscaling is still in effect, and may terminate container instances if they aren't needed to handle incoming traffic. An instance will never stay idle for more than 15 minutes after processing a request (unless it is kept active using min instances).
Combined with Cloud Run minimum instances, this feature lets you keep a set number of container instances up and running with full access to CPU resources. Together, the two features enable new background processing use cases, like using streaming pull with Cloud Pub/Sub or running a serverless Kafka consumer group.
When you opt in to "CPU always allocated", you are billed for the entire lifetime of container instances, from when a container is started to when it is terminated. Cloud Run's pricing is also different when CPU is always allocated:
There are no per-request fees
CPU is priced 25% lower and memory 20% lower
Of course, the Cloud Run free tier still applies, and Committed Use Discounts can give you a discount of up to 17% for a one-year commitment.
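As a rough illustration of how these pricing changes combine, the sketch below estimates the cost of one instance over its lifetime. It assumes billing per vCPU-second and per GiB-second, takes the base rates as inputs rather than real prices, and ignores the free tier and Committed Use Discounts. The function names are hypothetical.

```python
# Discounts stated above for "CPU always allocated".
CPU_DISCOUNT = 0.25
MEMORY_DISCOUNT = 0.20


def discounted_rate(base_rate, discount):
    """Apply a fractional discount to a base per-second rate."""
    return base_rate * (1 - discount)


def instance_cost(lifetime_seconds, vcpus, memory_gib,
                  base_cpu_rate, base_memory_rate):
    """Estimate the cost of one instance billed for its whole lifetime.

    base_cpu_rate and base_memory_rate are placeholder inputs (price per
    vCPU-second and per GiB-second), not actual Cloud Run prices.
    No per-request fee is added, since none applies in this mode.
    """
    cpu = lifetime_seconds * vcpus * discounted_rate(base_cpu_rate, CPU_DISCOUNT)
    memory = lifetime_seconds * memory_gib * discounted_rate(
        base_memory_rate, MEMORY_DISCOUNT)
    return cpu + memory
```

To compare the two modes for your own service, plug in the current rates from the Cloud Run pricing page. Whether always-allocated CPU is cheaper depends on how much of an instance's lifetime is spent handling requests.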
How to allocate always-on CPU
You can change your existing Cloud Run service to always have CPU allocated from the Google Cloud Console: