Cloud Run gets always-on CPU allocation | Google Cloud Blog

This feature unlocks many use cases that weren't previously compatible with Cloud Run:

  • Executing background tasks and other asynchronous processing work after returning responses

  • Leveraging monitoring agents like OpenTelemetry that may assume access to CPU in background threads

  • Using Go goroutines, Node.js async operations, Java threads, or Kotlin coroutines

  • Moving Spring Boot apps that use built-in scheduling/background functionality

  • Listening for Firestore changes to keep an in-memory cache up to date

Even if CPU is always allocated, Cloud Run autoscaling is still in effect, and may terminate container instances if they aren't needed to handle incoming traffic. An instance will never stay idle for more than 15 minutes after processing a request (unless it is kept active using min instances).

Combined with Cloud Run minimum instances, always-on CPU lets you keep a certain number of container instances up and running with full access to CPU resources. Together, these features enable new background processing use cases, such as using streaming pull with Cloud Pub/Sub or running a serverless Kafka consumer group.

When you opt in to "CPU always allocated", you are billed for the entire lifetime of each container instance, from when it is started to when it is terminated. Cloud Run's pricing is also different when CPU is always allocated:

  • There are no per-request fees

  • CPU is priced 25% lower and memory 20% lower 

Of course, the Cloud Run free tier still applies, and Committed Use Discounts can give you a discount of up to 17% for a one-year commitment.
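
To make the billing model concrete, the arithmetic can be sketched in Go as follows. The base rates are parameters rather than real prices, the function name `alwaysOnCost` is hypothetical, and the sketch ignores the free tier and Committed Use Discounts. It only applies the 25% and 20% reductions over the whole instance lifetime, with no per-request fee.

```go
package main

// Reductions stated above for "CPU always allocated", relative to the
// default per-second rates.
const (
	cpuDiscount    = 0.25
	memoryDiscount = 0.20
)

// alwaysOnCost estimates the charge for one container instance that is
// billed for its entire lifetime. baseCPURate is the default price per
// vCPU-second and baseMemRate the default price per GiB-second; both are
// supplied by the caller and are assumptions, not published prices.
func alwaysOnCost(baseCPURate, baseMemRate, vCPUs, memGiB, lifetimeSeconds float64) float64 {
	cpu := baseCPURate * (1 - cpuDiscount) * vCPUs * lifetimeSeconds
	mem := baseMemRate * (1 - memoryDiscount) * memGiB * lifetimeSeconds
	return cpu + mem // no per-request fee is added
}
```

For example, with placeholder base rates of 1.0 per vCPU-second and 1.0 per GiB-second, an instance with 1 vCPU and 1 GiB of memory that lives for 10 seconds would cost 10 * 0.75 + 10 * 0.80 = 15.5 in the same units.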

How to allocate always-on CPU

You can change your existing Cloud Run service to always have CPU allocated from the Google Cloud Console: