The community has been discussing best practices for years, but only a few have gained wide acceptance.

Most serverless practitioners who promote these practices work on large-scale systems. Serverless architectures promise to handle large and bursty workloads, so most best practices focus on scaling issues, with examples like Nordstrom in retail and iRobot in the Internet of Things. If your goals are not at that scale, you probably don’t need to follow these practices.

Remember that best practices are not “the only practice.” They rest on a set of underlying assumptions, and if your scenario doesn’t share those assumptions, they probably aren’t right for you.

My main assumption is that everyone builds applications that run on a large scale (even though they probably never will).

Here are my serverless best practices.

This best practice has to do with function errors and scaling isolation.

In other words, if you use a switch statement inside a function, you’re probably doing it wrong.

Many tutorials and frameworks are built around one large monolithic function, fronted by a single proxy route, with a switch statement dispatching requests inside. I don’t like this pattern because it doesn’t scale well and tends to produce large, complex functions.

The problem is that when you want to scale, you have to scale the entire application rather than the specific parts that need it.

Let’s say one part of your web application needs to handle a million requests while another part handles only a thousand. When you optimize the former, you drag the latter along with it. That’s wasteful, and it makes the latter hard to optimize independently. So it’s best to separate them.
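The contrast can be sketched as follows. This is a minimal illustration, not code from the article: the route names, event shape, and handlers are invented. The point is that each small function can be deployed, scaled, and debugged on its own, whereas the monolithic dispatcher forces everything to scale together.

```python
# Preferred: one small function per task, each scaled independently.
# Handler names and event fields below are illustrative assumptions.

def get_order(event):
    """Hot path: may need to scale to a million requests."""
    return {"status": 200, "body": f"order {event['order_id']}"}

def update_profile(event):
    """Cold path: a few thousand requests; deployed and scaled separately."""
    return {"status": 200, "body": f"profile {event['user_id']} updated"}

# The anti-pattern, for contrast: one big function, one big dispatch.
def monolithic_handler(event):
    route = event["route"]
    if route == "GET /order":
        return get_order(event)
    elif route == "PUT /profile":
        return update_profile(event)
    return {"status": 404, "body": "not found"}
```

With separate functions, the platform can give each one its own concurrency and error metrics; with the monolith, every route shares one scaling unit.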

A function that calls other functions directly is an anti-pattern.

There are rare cases where this model works, but fundamentally: don’t do it. It multiplies your costs, complicates debugging, and negates the value of isolating functions.

A function should push data to a data store or queue, which in turn triggers another function to do the rest.

This seems obvious to me.
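The hand-off pattern can be sketched like this. It is a local simulation under stated assumptions: an in-memory deque stands in for a real queue service (such as SQS), and in production the queue itself would trigger the worker rather than a manual drain.

```python
from collections import deque

# An in-memory deque stands in for a real queue service (e.g. SQS).
queue = deque()

def ingest(event):
    # Do this function's one job, then hand off via the queue.
    # No direct function-to-function invocation happens here.
    queue.append({"user_id": event["user_id"], "action": "welcome_email"})
    return {"status": 202}  # accepted; downstream work is asynchronous

def worker():
    # In a real platform the queue triggers this function; here we
    # drain it manually to show the decoupled second stage.
    results = []
    while queue:
        msg = queue.popleft()
        results.append(f"sent {msg['action']} to {msg['user_id']}")
    return results
```

Because the two functions only share the queue, either one can fail, retry, or scale without the other knowing.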

A function invocation has two modes: a cold start (the first time an execution environment is created for the function) and a warm start (when an already-initialized environment is reused). Cold starts are affected by many factors, such as the size of the deployment package (the ZIP file or uploaded code) and the number of libraries that need to be loaded and instantiated.

The more code, the slower the cold start.

The more libraries that need to be instantiated, the slower the cold start will be.

For example, Java is a high-performance language for warm starts on some platforms. But if you pull in too many libraries, a cold start can take several seconds. Many of those libraries aren’t actually required, and cold-start performance affects not only startup but also scaling.
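One common mitigation, shown here as a sketch rather than a platform-specific recipe, is to initialize expensive dependencies lazily and cache them at module scope: module state survives across warm invocations, so the cost is paid once per cold start at most, and only on code paths that need it. The client object here is a stand-in for a real SDK client.

```python
import time

_heavy_client = None  # module state is reused across warm invocations

def _get_client():
    # Lazy initialization: pay the setup cost only on the first
    # invocation that actually needs the client.
    global _heavy_client
    if _heavy_client is None:
        _heavy_client = {"created_at": time.time()}  # stand-in for a real SDK client
    return _heavy_client

def handler(event):
    client = _get_client()
    # On warm starts the same object comes back, so setup is skipped.
    return {"status": 200, "reused": client is _get_client()}
```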

I firmly believe that developers should only use additional libraries when necessary.

Something like Express was built for servers, and serverless applications don’t need all of its parts. So why pull in all of its code and dependencies? Excess code isn’t just dead weight that never runs; it also widens your security surface.

Of course, if a library has been tested by you, and you know and trust it, then it’s ok to introduce it.

Do not use connection-based services unless you really have to.

This is going to get me in big trouble. Many Web application developers fall into the “we only know RDBMS” trap.

But it’s not about the RDBMS, it’s about the connections.

Serverless works best when it talks to services, not connections.

Services are designed to respond quickly to requests and deal with the complexity of the data layer. This is of great value in the serverless world and explains why databases like DynamoDB are well suited to serverless architectures.

To be honest, serverless practitioners are not against RDBMSs; they are against connections. Connections take time, and when one function scales out to many concurrent instances, each execution environment needs its own connection. That introduces bottlenecks and adds unnecessary I/O waits to function cold starts.

If you must use an RDBMS, you can put a connection-pooling service in the middle, or better yet, some kind of auto-scaling container.
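The function-side half of that advice, reusing one connection across warm invocations instead of opening one per request, can be sketched like this. An in-memory SQLite database stands in for a real RDBMS; the pooling proxy itself sits outside the function and is not shown.

```python
import sqlite3

_conn = None  # opened once per cold start, reused while the environment is warm

def _get_conn():
    global _conn
    if _conn is None:
        # sqlite3 stands in for a real RDBMS; in production this would be
        # a connection to a pooling proxy, not to the database directly.
        _conn = sqlite3.connect(":memory:")
        _conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER, name TEXT)")
    return _conn

def handler(event):
    conn = _get_conn()  # no per-request connection setup on warm starts
    conn.execute("INSERT INTO items VALUES (?, ?)", (event["id"], event["name"]))
    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    return {"status": 200, "count": count}
```

Note that this only reduces per-request overhead; it does not solve the many-environments-many-connections problem, which is exactly why the pooling layer belongs outside the function.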

The point is, you may need to rethink the data layer, and it’s not serverless’s fault. If you’re trying to reuse your current data layer and it doesn’t work, it’s probably due to a lack of understanding of serverless architectures.

Avoid a single-function proxy whenever possible. It doesn’t scale and doesn’t help isolate problems. There are cases where a single proxy works, such as a set of routing functions tied to a table that is relatively independent of the rest of the application, but in most of the applications I’ve worked on, that’s the exception.

While avoiding a single proxy adds administrative complexity, it helps isolate errors as you scale your application.

Then again, you would use some sort of configuration management tool to run all of this, wouldn’t you? You’re already using some kind of CI/CD tooling, right? So, serverless still requires DevOps.

Serverless tends to work best when the application is asynchronous. This may not be obvious for web applications, which tend to respond to requests synchronously and run lots of queries.

As mentioned earlier, it’s best that functions not call each other directly, so how you link functions together is an important question. Queues can act as circuit breakers: if a function fails, you simply drain the backlog that piled up because of the failure, or push the failed messages to a dead-letter queue (DLQ).
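A minimal sketch of the DLQ pattern, with in-memory deques standing in for real queue services and an invented retry limit: a message that keeps failing is parked in the dead-letter queue instead of blocking the main queue forever.

```python
from collections import deque

MAX_ATTEMPTS = 3  # illustrative retry limit, not from the article
main_queue = deque()
dead_letter_queue = deque()

def process(msg):
    # Stand-in for real work; "bad" payloads always fail.
    if msg["payload"] == "bad":
        raise ValueError("cannot process")
    return f"processed {msg['payload']}"

def drain():
    results = []
    while main_queue:
        msg = main_queue.popleft()
        try:
            results.append(process(msg))
        except ValueError:
            msg["attempts"] = msg.get("attempts", 0) + 1
            if msg["attempts"] >= MAX_ATTEMPTS:
                dead_letter_queue.append(msg)   # park it for later inspection
            else:
                main_queue.append(msg)          # retry later
    return results
```

Healthy messages flow through; the poisoned one ends up in the DLQ, where it can be inspected or replayed once the underlying failure is fixed.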

Basically, it’s about understanding how distributed systems work.

For client applications with a serverless back end, the best approach is to use CQRS. The key to this pattern is to separate the concerns of getting the data from those of entering the data.
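The separation can be sketched in a few lines. This is a toy illustration of CQRS under stated assumptions: the event shape, store structures, and synchronous projection are all simplifications (in a real system the projection that updates the read model would typically run asynchronously).

```python
write_store = []   # the command side appends events here
read_model = {}    # the query side serves reads from this denormalized view

def handle_command(cmd):
    # Command side: record the change; never answer queries.
    event = {"type": "name_set", "user_id": cmd["user_id"], "name": cmd["name"]}
    write_store.append(event)
    project(event)  # in a real system this projection runs asynchronously
    return {"status": 202}

def project(event):
    # Projection: fold events into the view the query side serves.
    if event["type"] == "name_set":
        read_model[event["user_id"]] = {"name": event["name"]}

def handle_query(query):
    # Query side: cheap reads from the denormalized view, no write logic.
    return read_model.get(query["user_id"], {})
```

Because reads and writes take different paths, each side can be scaled and evolved on its own, which fits the one-function-one-job advice above.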

In a serverless system, data flows through your system. It may eventually pool into a data lake, but more likely it will stay in some fluid state. So treat data as being in motion, not at rest.

While this is not always possible, it is important to avoid querying data lakes in a serverless environment.

Serverless requires you to rethink the data layer. Newcomers to serverless who think in RDBMS terms are likely to hit a wall, not only because of scaling issues but also because their data structures become too rigid.

You’ll find that the data flow changes as the application changes, and scaling changes everything. It’s easy if all you have to do is redirect a data stream, but damming a database is much harder.

Creating your first serverless application is easy; then you watch it scale. If you don’t know what you’re doing, it’s easy to fall into the same traps as other auto-scaling solutions.

If you don’t understand how applications scale, you can get yourself into trouble. If your functions have slow cold starts (because they rely on many libraries and connect to an RDBMS) and you hit a sudden traffic spike, function concurrency can shoot up, overwhelm the connection count, and slow the whole application down.

So don’t assume that applications will necessarily run under the same load. Understanding how applications behave under different loads is still part of the job.

I could say a lot more here, but these are the things I want to tell people when I talk to them. I haven’t covered how to plan an application or how to reason about its cost, because those are beyond the scope of this article. I’m sure plenty of people will say I’m wrong about RDBMSs. As with containers, I don’t hate RDBMSs; I just like using the right tool for the job. So, know your tools first!

https://medium.com/@pauldjohnston/serverless-best-practices-b3c97d551535


About the author

Paul Johnston, co-founder of a company focused on Serverless, former Serverless evangelist on the AWS team, veteran CTO, and geek.