Published on Wednesday, 2024-09-04 at 14:00 UTC

Cloud of Disillusion - The Broken Promise of PaaS

Why I’m letting go of dangerous complexity and why I’m embracing the VPS.

This post is a bit of a rant. I wrote it just after my email newsletter tool Keila had been offline for over four hours on a managed Kubernetes setup. If you want to read about how I moved it to a VPS using Kamal in less than a day, stay tuned for part two.

When I first heard about cloud computing in the late 2000s, I was excited: The prospect of server infrastructure being abstracted away, a world in which memory, storage, and computing power were all allocated automatically – it sounded great. No longer would we have to carefully spec a server. We’d just upload our application and let the cloud take care of it.

The Promise of the Cloud

Hundreds of servers for the price of one
Automatically [grow] to support any load level. Easily handle traffic spikes with the power of hundreds of servers powering your site.
Media Temple Grid-Service marketing copy

When Media Temple (RIP) introduced Grid-Service in 2007, I was excited. The above quote with which they advertised their latest hosting plan sounded like the future.

Their marketing copy invoked the idea that you could upload an app to the cloud and their technology would magically distribute it across a vast network of machines without you ever needing to worry about the technical details.

And I’m sure many people still envision something like this when they think of the cloud.

The thing is: This approach – shared webhosting by a fancy name – worked well enough for stateless PHP scripts. But with the emergence of web apps built on frameworks such as Django and Rails, this was no longer a viable approach. These newer kinds of web apps needed to run as a continuous process. They needed to run somewhere.

A Step Backwards

So where were all these Python and Ruby apps supposed to run? Also in the cloud! But a different kind of cloud. Gone was the promise of automatic resource allocation. Instead, you once again had to decide in advance on the amount of storage and RAM and pick the right number of CPU cores.

And unlike in shared PHP hosting, you now had to become a Linux sysadmin as well! Before, all you had to worry about was which FTP client to use for uploading your .php files. Now you needed scripts and orchestration tools to configure an entire Linux distribution. The server had become the smallest unit of an app. Automatic scaling no longer meant dynamically allocating resources. It meant provisioning additional servers.

This seemed like a step backwards.

Fighting Complexity with Complexity

So I was happy when I learned that there was another emerging type of cloud: Platform as a Service (PaaS).

The premise of PaaS is simple: Instead of managing servers, you can once again upload your code (or push a container) and hey presto: everything else gets taken care of.

Except … you still have to pick instance sizes.

And while this certainly makes launching simple stateless apps easier, once you want to use a database or store files, things get complicated again.

Let me give you two examples:

Fly.io doesn’t have managed Postgres, so you have two options: Either deploy your own Postgres server or connect to a managed database from another provider. If you’re trying to avoid sysadmin work by using a PaaS, why would you want to manage your own database server? So that’s not an option. But the alternative is equally absurd: The "easiest" way to connect to an external database is to set up another server (!) and create a WireGuard tunnel to your Fly instance.
Kubernetes allows you to mount block storage volumes for storing files. But a block storage volume can typically be attached to only one node at a time. So if you want pods spread across several nodes to access the same volume, you need to configure and deploy something like an NFS service. In which world is this simple, convenient, or straightforward?

And that’s not even taking into account the complexity tax from the vendor lock-in you get from relying on proprietary tooling for deploying your apps.

Over the years I’ve tried time and time again to love various PaaS platforms. Heroku, Cloud Foundry, Red Hat OpenShift, Fly.io, Managed Kubernetes – you name it, I’ve probably tried it.

But it’s time to admit: They’re not making things easier. Not really, anyways.

Pricing and Reliability

Now, you might tolerate this level of complexity if everything else about PaaS was great. But it’s not.

Let’s take a look at pricing:

At Hetzner, you can get a VPS with 8 GB of RAM, 4 vCPUs, and 20 TB of traffic for 8€/month. A similar machine at Fly.io will set you back $43/month without traffic. At Scaleway, you have to spend 45€/month for a single-node Kubernetes setup with the same specs. At Google App Engine, you have to shell out a whopping $175 for a machine with a measly 3 GB of RAM.
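To make these numbers easier to compare, here is a rough sketch that breaks each offer down to its price per GB of RAM. It uses only the figures quoted above and makes a few assumptions: that "similar" and "same specs" mean 8 GB of RAM for Fly.io and Scaleway, and that the App Engine price is per month. The `Offer` class and `price_per_gb` helper are made up for this illustration, and no currency conversion is attempted, so euro and dollar figures should only be compared loosely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    monthly_price: float  # in the currency named below
    currency: str         # "EUR" or "USD"; deliberately not converted
    ram_gb: int


# Figures as quoted in the post. The 8 GB values for Fly.io and Scaleway
# and the monthly billing period for App Engine are assumptions.
OFFERS = [
    Offer("Hetzner VPS", 8.0, "EUR", 8),
    Offer("Fly.io", 43.0, "USD", 8),
    Offer("Scaleway Kubernetes", 45.0, "EUR", 8),
    Offer("Google App Engine", 175.0, "USD", 3),
]


def price_per_gb(offer: Offer) -> float:
    """Monthly price divided by RAM, in the offer's own currency."""
    return offer.monthly_price / offer.ram_gb


if __name__ == "__main__":
    for offer in OFFERS:
        print(
            f"{offer.provider}: "
            f"{price_per_gb(offer):.2f} {offer.currency} per GB of RAM"
        )
```

Under these assumptions the VPS comes out at 1 € per GB of RAM, while the App Engine machine lands at a bit over $58 per GB. RAM is only one dimension of course, but it gives a feel for the size of the gap.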

Now the big question is:

Does this added cost translate to improved reliability? In my experience, absolutely not. I’ve had too many multi-hour (sometimes even multi-day) outages on every PaaS I’ve tried in the past. And in terms of customer support – if you don’t pay extra for a service level agreement, don’t expect too much.

Not only that, but PaaS providers are also notorious for changing their products, constantly forcing you to adjust your deployment pipelines and put up with service degradations.

Time to Embrace the VPS

So my conclusion is this: PaaS offerings don’t help you reduce complexity. On top of that, they are expensive and unreliable.

I’m aware that deploying to a VPS instead of a PaaS doesn’t mean everything will magically be fine or that occasional downtimes become a thing of the past. But complexity creates friction, and friction causes systems to fail. By moving from complex PaaS to self-managed VPS, you’re cutting out a good chunk of this dangerous complexity. And since VPS are an absolute commodity, there’s little to no vendor lock-in. If your provider goes down, there’s a good chance you can be up and running again in minutes somewhere else.

TL;DR: I wanted to believe in the PaaS promise. I wanted to deploy apps to the mythical cloud.
Alas, that’s not how reality works, so now I’m back running stuff on VPS.
Fortunately, these days that’s really easy! If you’re interested in how I used Kamal, make sure to subscribe to my newsletter or follow me on Mastodon/Twitter for when I publish the constructive follow-up to this rant.