(Replying to PARENT post)
As my site matures, I might move some predictable parts to containerized VMs to save on costs. I have a crawler-type service with extremely consistent traffic that probably shouldn't be on Cloud Run, but I put it there anyway because I can iterate so quickly (see revisions here: https://imgur.com/fkFtIRM). Cloud Run also has a free tier, which was enough when I was starting out and prototyping. And since my site gets very little traffic at night, the cost is not too bad at the moment. This is what my billing looks like: https://imgur.com/gIo3IGJ
If I were a big company or my site got super popular, I might do things differently. But for a side project it has made me enjoy programming more than anything else I've used, because I can focus 95% on code.
(Replying to PARENT post)
I had a worker service running on Heroku. Very CPU intensive. The traffic pattern was extremely low throughout the day, but had completely unexpected surges.
On Heroku, my choices were: pay ~$3k/month (basically paying for the peak surge all month long) or accept a lot of slow/failed responses during surges.
Moving to Cloud Run was very easy. It's just a normal dockerized 12-factor app.
Now I pay ~$50/m and it automatically scales up when I need more workers.
If you want a managed app platform, it couldn't be simpler or cheaper.
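For anyone wondering what "a normal dockerized 12-factor app" takes in practice: Cloud Run injects a PORT environment variable and expects the container to serve HTTP on it. A minimal sketch in Go (illustrative only, not my actual service):

    // Minimal sketch of the 12-factor contract Cloud Run expects: read
    // config from the environment (Cloud Run injects PORT) and serve HTTP.
    // The handler here is illustrative.
    package main

    import (
        "fmt"
        "log"
        "net/http"
        "os"
    )

    func main() {
        port := os.Getenv("PORT") // set by Cloud Run; fallback for local runs
        if port == "" {
            port = "8080"
        }
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "ok")
        })
        log.Fatal(http.ListenAndServe(":"+port, nil))
    }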
The scaling model of Cloud Run has been great. Now that they support WebSockets, I will slowly be moving all my apps to it.
My only gripe: I wish they had something for worker processes. I know that the rest of Google Cloud has solutions for it but being able to just spawn worker processes as part of the same deployment would be fantastic.
(Replying to PARENT post)
Around €1,500 will get you a second-hand Dell R720 with 190+ GB RAM and 48 cores.
It would be cheaper to handle this kind of scaling with a VM instead of functions. Any kernel with a bit of tweaking will easily handle that number of connections; you just need RAM and sufficient CPU. 16 GB of RAM and 8 vCPUs on AWS would be enough for 500k connections.
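For a sense of why that sizing works, here's a rough sketch (illustrative, not a benchmarked setup) of the goroutine-per-connection pattern such a VM would run. Each idle connection costs a few KB of goroutine stack plus socket buffers, so 500k connections fit in 16 GB, assuming the kernel's open-file limits (ulimit -n, fs.nr_open) are raised above 500,000:

    package main

    import (
        "log"
        "net"
    )

    func handle(c net.Conn) {
        defer c.Close()
        buf := make([]byte, 1024)
        for {
            n, err := c.Read(buf)
            if err != nil {
                return
            }
            // Echo back; a real service would do protocol work here.
            if _, err := c.Write(buf[:n]); err != nil {
                return
            }
        }
    }

    func main() {
        ln, err := net.Listen("tcp", ":8080")
        if err != nil {
            log.Fatal(err)
        }
        for {
            conn, err := ln.Accept()
            if err != nil {
                log.Println("accept:", err)
                continue
            }
            go handle(conn) // one cheap goroutine per connection
        }
    }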
That said, I wouldn't sleep well if a service built this way became popular and had to run at ever-increasing capacity for a week, because my team would need that long to move back to VMs.
(Replying to PARENT post)
WebSockets are a handy thing to have access to, but the app described is a pretty pathological case for the Cloud Run pricing model. I think people knocking the article on pricing alone are missing the point.
(Replying to PARENT post)
Given that, WebSockets/persistent connections seem like a weird use case.
(Replying to PARENT post)
Why do we now need thousands of instances to achieve the same?
(Replying to PARENT post)
The second problem is that the demo has really just kicked your scaling problem over to Redis. The demo isn't doing anything actually interesting; Redis is doing all the work. The reality is that one of the best parts of Cloud Run over something like serverless functions is that you can have state in your server. You wouldn't need Redis at all if you were doing the same demo in Kubernetes without Cloud Run, since it's pretty easy to use channels in Go to do the same thing (see the sketch below). However, to really make that work, you need some way to at the very least route traffic consistently, so that people watching the same resource get routed to the same server, or else some form of cross-node communication.
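To make the channels point concrete, here's a minimal in-process pub/sub hub sketch (illustrative names, not from the article's demo; single process only, which is exactly why the consistent routing mentioned above is needed):

    package main

    import (
        "fmt"
        "sync"
    )

    type Hub struct {
        mu   sync.RWMutex
        subs map[string][]chan string // topic -> subscriber channels
    }

    func NewHub() *Hub {
        return &Hub{subs: make(map[string][]chan string)}
    }

    // Subscribe returns a channel that receives every message for the topic.
    func (h *Hub) Subscribe(topic string) <-chan string {
        ch := make(chan string, 16)
        h.mu.Lock()
        h.subs[topic] = append(h.subs[topic], ch)
        h.mu.Unlock()
        return ch
    }

    // Publish fans a message out to all subscribers of the topic.
    func (h *Hub) Publish(topic, msg string) {
        h.mu.RLock()
        defer h.mu.RUnlock()
        for _, ch := range h.subs[topic] {
            select {
            case ch <- msg:
            default: // drop rather than block on a slow subscriber
            }
        }
    }

    func main() {
        h := NewHub()
        sub := h.Subscribe("room-1")
        h.Publish("room-1", "hello")
        fmt.Println(<-sub) // hello
    }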
So this is a great start, but there's a little way to go before many people would be able to switch over to something like this.
(Replying to PARENT post)
1) Egress is hand-waved, when in reality the free GBs would basically be exhausted within a month on pings/pongs and reconnection requests alone, i.e. with the service doing nothing at all. That many people sending single emojis as messages for just one day (to play along with the "short marketing event" premise) already matches the stated monthly napkin estimate.
2) Isn't there a good chance Memorystore wouldn't be able to handle this under even fairly charitable load interpretations at that many CCUs? If 1‰ of those 250k CCUs send a message every second, that's 250 publishes per second, and with each message fanned out to every subscribing instance, Redis ends up pushing on the order of 250k messages per second as well.
Again though, I'm aware it's only a hypothetical scenario demonstration, and it really does look cute in how simple it is to deploy. Fun stuff.
(Replying to PARENT post)
> Any Cloud Run service, by default, can scale up to 1,000 instances. (However, by opening a support ticket, you can get this number elevated.) This means we can support 250,000 clients simultaneously without having to worry about infrastructure and scaling!
Aren't these statements at odds with one another?
(Replying to PARENT post)
I remember that in 2012, the Rizon IRC network maxed out at 80,000 concurrent users on an AMD Bulldozer CPU.
(Replying to PARENT post)
API Gateway supports an unlimited number of connections (you do need to ask for an increase beyond the default rate of 500 new connections per second) and costs only around $0.25 per million connection-minutes + $1 per million messages.
So just holding 250,000 connections open around the clock would cost about $2,700 per month (250,000 connections × ~43,200 minutes in a month ≈ 10,800 million connection-minutes, at $0.25 per million), and it scales up and down as you please.
(Replying to PARENT post)
1. 250 connections per container, and
2. 1,000 containers.
However, the 250 concurrency limit does not refer to connections; it refers to requests. 250 concurrent requests can actually represent thousands of clients, depending on their think time.
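To put rough, illustrative numbers on that: by Little's law, concurrency ≈ request rate × request duration. If each client sends one 100 ms request every 5 seconds, a single client occupies 0.1/5 = 0.02 of a concurrency slot, so 250 concurrent requests covers roughly 250 / 0.02 = 12,500 clients per container.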
(Replying to PARENT post)
If you are a GCP or AWS or bare-metal expert who can set this kind of thing up in their sleep, that's great, but the majority of people can really benefit from a PaaS like GCR.
Because Cloud Run uses vanilla Docker containers, once you have validated the idea you can move to GKE or VMs or a server under your desk or whatever. And if it never takes off, that's fine too because you didn't spend a ton of time investing in making it work.
Ahmet, if you are reading this (hi): it would be REALLY cool to see something like using GKE for the base load and dynamically bursting to GCR to fill in the gaps. Not sure if this is possible with GCLB today, but it would be super cool.
(Disclaimer: I used to work on this team at Google.)