System Design Look Back - Serverless and Sandbox

Andy, system design

How to design a Serverless platform?

No, you don't need to. (Doge)

The concept of Serverless became popular when Amazon released Lambda, the industry milestone, back in 2014. It is a critical technique in the basic infrastructure of cloud services like AWS, Azure, and GCP. Old VM companies like Linode struggle in this serverless trend: they either find it hard to make a long-term profit or end up being acquired by other companies, since no startup wants to manage a VM and hire a DevOps engineer while trying to quickly verify a new business.

Everyone in big tech companies wants to build a Serverless platform somehow. At Alibaba and ByteDance, several teams worked on using Node.js for process-level isolation to build Serverless. All of those efforts ended up dead, since it is hard to prove the value in this domain.

Nowadays, the intense competition in this domain makes creating a new product much easier than 5 or 10 years ago. It is pretty funny that AWS Lambda and Cloudflare Workers have become the fundamental infrastructure for other new cloud services: Vercel Functions and Netlify Functions both use AWS Lambda and Cloudflare Workers as their underlying infrastructure.

But there is still room to discuss building your own Serverless platform if you need to let users execute their potentially hostile code on your servers and you do not want to be locked in to Big Tech companies.

Process-Level Sandbox in JavaScript

There is no need to talk about VM-level Sandbox here, since Docker and K8s dominate that space.

In the domain of process-level Sandbox, there are several Sandbox libs that can execute code at the process level and keep the host safe. But it is still a greenfield, even though Cloudflare has open-sourced the Sandbox runtime lib they depend on.

There are some crucial components in process-level Sandbox:

and the other components around it:

It is a complex system that involves lots of classic CS topics, and building it from scratch is nearly impossible and unprofitable.

Architecture

This article from Cloudflare illustrates the basic architecture of the whole life cycle in a process-level Sandbox.

  1. an API call triggers the execution of a function, for example through an inbound HTTP proxy
  2. the proxy passes the handler to the HTTP server (the main process)
  3. the main process manages a Sandbox process pool, which pre-allocates a set of V8 instances waiting to execute code
  4. the main process schedules the execution event, together with the code, to one of its Sandbox processes
  5. the Sandbox process uses V8 to compile and execute the code and returns the result to the main process
  6. the main process sends the result to the outbound HTTP proxy, which returns it to the client
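
The life cycle above can be sketched in a few lines of Node.js. This is only an illustration under loud assumptions: `SandboxPool`, `execInSlot`, and `handleRequest` are hypothetical names, and each pool "slot" stands in for a separate Sandbox process. To keep the sketch short it runs the code with the built-in `node:vm` module in the same process, which the Node.js docs explicitly say is not a security mechanism, so a real platform would put a process or V8 isolate boundary here.

```javascript
// Sketch only: a "slot" stands in for one Sandbox process holding a warm V8.
// node:vm is NOT a security boundary; it is used here just to keep this short.
const vm = require('node:vm');

class SandboxPool {
  constructor(size) {
    // Step 3: pre-allocate slots so a request does not pay the start-up cost.
    this.slots = Array.from({ length: size }, (_, id) => ({ id, busy: false, runs: 0 }));
  }

  // Step 4: pick the first idle slot, or null when the pool is exhausted.
  acquire() {
    const slot = this.slots.find((s) => !s.busy);
    if (slot) slot.busy = true;
    return slot || null;
  }

  release(slot) {
    slot.busy = false;
  }
}

// Step 5: compile and run the user's code in a fresh context, return its result.
function execInSlot(slot, code, input) {
  const context = vm.createContext({ input });
  slot.runs += 1;
  return vm.runInContext(code, context);
}

// Steps 1, 2 and 6: the main process receives a request and builds the response.
function handleRequest(pool, request) {
  const slot = pool.acquire();
  if (!slot) return { status: 503, body: 'no idle sandbox' };
  try {
    const result = execInSlot(slot, request.code, request.input);
    return { status: 200, body: result, slot: slot.id };
  } catch (err) {
    // Errors thrown by user code become an error response, not a host crash.
    return { status: 500, body: String(err.message) };
  } finally {
    pool.release(slot);
  }
}
```

A call such as `handleRequest(new SandboxPool(2), { code: 'input.a + input.b', input: { a: 1, b: 2 } })` goes through acquire, execute, and release in that order. Returning a 503 when no slot is idle is one possible policy; queueing the request would be another.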

Cloudflare didn't open-source all the components they depend on. The workerd lib is a tool that compiles and executes JavaScript code and offers isolation; it is just the runtime part of the architecture.

There are other, less complicated options in the community, like isolated-vm, vm2, and v8-sandbox. They serve different purposes in executing JavaScript code on the server side, and each has its own trade-offs.
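
As a baseline for comparing those libraries, here is a minimal sketch that uses only Node's built-in `node:vm` module (the module vm2 is built on). It does not use the API of any of the libraries named above. `runUntrusted` is a hypothetical helper name, and the 50 ms default timeout is an arbitrary assumption. It shows two things a sandbox has to provide, a separate set of globals and a limit on runaway code, but `node:vm` by itself is documented as not being a security mechanism, which is part of why the dedicated libraries exist.

```javascript
// Baseline sketch with Node's built-in vm module (not a security mechanism).
const vm = require('node:vm');

// runUntrusted is a hypothetical helper, not part of any library.
function runUntrusted(code, input, timeoutMs = 50) {
  // Data goes in and out as JSON strings, so the guest code does not
  // receive the host's input object itself.
  const context = vm.createContext({ inputJson: JSON.stringify(input) });
  const wrapped =
    'const input = JSON.parse(inputJson);' +
    'JSON.stringify((function () { ' + code + ' })())';
  try {
    // The timeout interrupts synchronous runaway code such as infinite loops.
    const out = vm.runInContext(wrapped, context, { timeout: timeoutMs });
    return { ok: true, value: out === undefined ? undefined : JSON.parse(out) };
  } catch (err) {
    // Node sets err.code for its own errors; guest errors only carry a name.
    return { ok: false, error: err.code || err.name };
  }
}

// Usage: normal code, a runaway loop, and an attempt to reach host modules.
console.log(runUntrusted('return input.x * 2', { x: 21 }));
console.log(runUntrusted('while (true) {}', {}));
console.log(runUntrusted('return require("fs")', {}));
```

The first call returns the computed value, the second is cut off by the timeout, and the third fails because `require` is not defined inside the new context. Memory limits and true isolate separation are outside what this baseline can offer.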

Ending

To wrap it up: the future of Serverless and Sandbox is bright. Using Cloudflare feels like going back to the old days when the Open Web concept was created. Back then you could host your blog on a physical machine at your house, and everyone could access your content there. Now you hand your blog over to Cloudflare, which serves it from every edge node worldwide through a code snippet. And no government can block your content, since they can't block every node on Earth. A variation of the Open Web!

© FTAndy.