Serverless Questions

Hugely inspired by Paul Swail's thoughts collected in his personal garden, and the result of over two years of ups and downs building serverless applications.

My first dabble with the idea was back in 2016, when I built my first cloud-first CRUD app with Firebase Functions and Firebase. I haven't spent much time on GCP since then, though, and the rest of my experience comes from the AWS cloud - mostly CloudFormation.

From the beginning, a couple of things drove me crazy when working in this "new" paradigm. No one seemed to have the answers (or enough authority) for a number of questions, such as:

  1. What and how to test?
  2. MicroLambdas vs MacroLambdas vs MonoLambdas?
  3. How to deal with the databases? What database to use in the first place?
  4. Which development flow works well when working with serverless?
  5. When to use the approach?

Below are my current answers to the questions I've been pondering, along with observations on other issues that serverless magnifies (even if they are not directly related to it).

  1. What and how to test?
    This one is admittedly still a work in progress - I have yet to form an opinion on whether third-party side effects, such as publishing a message to a queue or event bus, should be subject to testing or not.
    I believe integration tests on real services are way more important than unit testing in ALL cloud-native scenarios. Even better if you can afford proper e2e tests. Many times I've worked with perfectly tested code that wasn't given the proper permissions in the stack it was eventually deployed to. Since serverless allows you to cheaply and easily spawn ephemeral environments - for example, to run your automated test suite on - this would be my go-to method to ensure trust in what I am deploying.
  2. MicroLambdas vs MacroLambdas vs MonoLambdas?
    Start with a MonoLambda and move into MacroLambdas if you need more separation, but only once the domain clarifies. This process should be simple: spawning a new resource should be as easy as adding a new 5-10 line definition to your infra configuration and cutting the router in half.
    Exposing each of the endpoints separately makes it easier to track what's happening in your system - i.e. even if your code handles two separate endpoints, expose each of them separately in your Serverless stack. It's easier to pinpoint the bottlenecks (though X-Ray and similar tools are way better at that).
    Contrary to what is usually advocated when discussing serverless, I don't see that many use cases for the minuscule MicroLambdas - they make sense to me only in a couple of use cases, the most prominent of which are:
    - a totally separate and independent process
    - a process whose side effects have little influence on the whole system and are not time-sensitive in any way
    - linkers - the missing pieces between other services / software you are using
    Otherwise they add too much operational overhead, and the separation can often end up too granular.
  3. How to deal with the databases?
    Use cloud-based solutions, or if you really have to host an RDB, put a proxy in front of it to handle connection pooling. I have yet to find a cloud-agnostic solution that addresses the connection pooling problem, but there are plenty of databases that were created with serverless and "infinite" scalability in mind. This thread pretty much sums up my thoughts on the matter. Default to solutions built without low connection limits in mind - Firebase, DynamoDB, CosmosDB, FaunaDB, Aurora Serverless - just to name a few great options we have. Part of what is so great about them is that you can hook most of them up as event sources for your application! You don't have to default to REST and rely on POST messages flying around - DB streaming is a whole new world of comfort and mystery!
    If you are set on using a relational database, add a load test to your test suite. You will be amazed at how easy it is to arrive at a 99% failure rate without managing the DB connections properly.
  4. Which development flow works well when working with serverless?
    Trunk-based development, with tag deploys and a fix-forward strategy for dealing with bugs. Ideally I would have only one branch (plus ephemeral feature branches where things change, but those are never active for more than a couple of days), a proper CI/CD process, and possibly automated deployments. It's great to use a feature-flag system of sorts - even something simple works here (it can get out of hand quickly and become a task of its own if you try to replicate the feature set of something like LaunchDarkly). Remember that you are mostly after the ability to "disable" a function or "retarget" it to a different version.
    I've been on teams that tried multiple different approaches; they all ended up with the same issues:
    - over-processed and slow deployment flow
    - monster features / merges
    which in turn took away most of the promise that made us choose serverless in the first place - remember the Holy Grail of delivery speed?
  5. When to use the approach?
    It's easier to answer the opposite question - so, when not to use it? (But treat this with a pinch of salt, as some of these no-gos may have been removed already.)
    - long-running processes (because of the short-lived nature of most serverless offerings)
    - latency-sensitive applications that need constant, or at least predictable, latency
    - processes needing access to multiple VPCs - this is still not perfect, but it used to be way worse; read more here.
    For me it's currently the default modus operandi. However, I believe one of the most important factors to consider when betting on it is: do you have team buy-in for it? It's not enough to impose it in the form of a diagram and a 2-hour introduction to the core ideas behind it. It takes time and a lot of relearning to do it right. I'm pointing this out because I've seen teams that had it imposed on them (ignore for a moment that it's hardly ever a good idea to do that). This was a big initial frustration of mine as well: serverless makes bold promises, and they are often misread by business people expecting it to solve all their software delivery problems. Don't fall into that trap! It is not a panacea.
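
To make the ephemeral-environment idea from point 1 concrete, here is a small sketch of how a test suite can target whatever stage was just deployed for it. The URL pattern, the `STAGE` environment variable, and the domain are all made-up conventions for illustration, not a real API:

```python
import os

def api_base_url(stage=None):
    """Build the base URL for the stage this test run should hit.

    CI deploys an ephemeral stack per run (e.g. stage "pr-123"), exports
    its name via STAGE, and the suite derives every endpoint from it.
    """
    stage = stage or os.environ.get("STAGE", "dev")
    # Hypothetical convention: one API Gateway custom subdomain per stage.
    return f"https://{stage}.api.example.com"

# An integration test would then exercise the real, just-deployed stack,
# e.g. requests.get(api_base_url() + "/health"), and CI tears the stack
# down after the suite finishes.
```

The point is that the suite has no hard-coded environment: deploy, run, destroy - the same tests work against any copy of the stack.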
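
The MonoLambda-to-MacroLambda split from point 2 can be sketched as a single entry point dispatching on a routing table (the routes and responses below are hypothetical). "Cutting the router in half" then just means moving part of the table into a new handler file and adding one more function definition to your infra config:

```python
# One MonoLambda behind several API Gateway routes: dispatch on (method, path).
ROUTES = {
    ("GET", "/orders"): lambda event: {"statusCode": 200, "body": "order list"},
    ("POST", "/orders"): lambda event: {"statusCode": 201, "body": "order created"},
}

def handler(event, context=None):
    """Single Lambda entry point in the API Gateway proxy-event style."""
    route = ROUTES.get((event["httpMethod"], event["path"]))
    if route is None:
        return {"statusCode": 404, "body": "not found"}
    return route(event)
```

Because each endpoint is still exposed separately in the stack, per-endpoint metrics keep working even while all traffic lands in one function.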
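
Finally, the "something simple works here" feature-flag idea from point 4 can be as small as a dictionary that lets you disable a feature or retarget it to a different handler version. The flag names and handlers below are hypothetical:

```python
# Minimal flag store: just enough to "disable" or "retarget" a feature.
FLAGS = {
    "checkout": {"enabled": True, "version": "v2"},
}

HANDLERS = {
    ("checkout", "v1"): lambda event: "legacy checkout",
    ("checkout", "v2"): lambda event: "new checkout",
}

def dispatch(feature, event):
    """Route a request through the flag store to the active handler version."""
    flag = FLAGS.get(feature, {"enabled": False})
    if not flag["enabled"]:
        return "feature disabled"
    return HANDLERS[(feature, flag["version"])](event)
```

In practice the `FLAGS` table would live in a parameter store or a DynamoDB item so it can be flipped without a deploy - which is the whole point of a fix-forward strategy.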