[ $davids.sh ] - david shekunts blog

๐Ÿ“ When to Switch from Node.js to Go ๐Ÿฆซ

# [ $davids.sh ] · message #209

๐Ÿ“ When to Switch from Node.js to Go ๐Ÿฆซ

Straight to the point, no fluff

#go #nodejs #typescript

  • @ [ $davids.sh ] · # 1045

    Regarding libraries, code design, patterns, stability, debugging, the functionality of the std lib, and so on, I've written and will continue to write about them. Now, I want to talk about specific problems that are very difficult for Node.js itself to solve.

    Application footprint

    A Node.js application can easily consume 250-500 MB of RAM or more (and that's just for the application itself, without any additional allocations). It takes minutes to start up and also shuts down slowly.

    The main advantage of a distributed architecture is the ability to scale horizontally, allowing us to talk about hundreds of service instances. However, 100 Node.js instances mean 30-50 GB of RAM.

    Yes, hardware is undoubtedly cheap now, but damn it, a similar application in Go will consume 10-50 MB of RAM, meaning 100 instances will require 5 GB of RAM, and that's just a couple of super cheap machines.

    Deploy time

    When you have 1 monolith or 3-4 services with 2-3 instances each, you won't notice how long it takes to deploy a new version of the application.

    But when you have 20+ services with 5-7 instances of each, the system startup time can stretch out veeery significantly.

    Sometimes you have to wait a whole hour for them to start up. Why? Because Kubernetes needs to download a new gigabyte-sized image and allocate a machine with sufficient RAM, plus the Node.js application startup time is very long.

    (Aside: Yes, we use partial deploys, where only changed services are rolled out. But when a shared lib changes, it starts affecting everyone, and this happens often.)

    And you sit there waiting for this entire wasted hour, only to then check metrics and go back to your tasks.

    Go starts up almost instantly, and you can always keep a couple of empty, lightweight machines ready for new service instances to land on, while the machines freed up by a deploy stay ready for the next one.

    Event Loop

    It's very easy to overload Node.js processes, and not even with complex CPU or memory operations. With 800 RMQ connections, we consistently observe an event loop lag of 250 ms, which is a disaster.

    (Aside: This might be a personal problem with RMQ, amqplib, or something else, but we're using the most standard methods. Each individual queue has 1 RPS, and within the business logic, there are only 10 IO calls.)

    Now imagine you have 5000-10000 devices (and that's just the beginning), and each has its own queue. Now do the math.

    In my experience, Go has handled tens of thousands of connections in a single instance without any issues (and in reality, it can handle hundreds of thousands).

    CI/CD

    Image build times and sizes are measured in tens of minutes and in gigabytes. Running Jest tests can consume up to 6 GB of memory (a Jest problem in its own right) and take almost an hour (that is, if the runner even has enough resources to run all the service tests in parallel).

    Go's built-in tests do the same thing in the blink of an eye.

    Many more points could be listed, but these are the fundamentals. This is where we hit Node.js's own limits, with no subjective preferences involved, and overcoming them in Node.js would take far more effort than, say, writing new services in Go and carefully refactoring old ones (or starting with Go altogether).

    P.S.

    I accidentally stumbled upon a video on this topic, though the comparison there is with Java.

    The first test clearly shows the picture you can observe when comparing Go with Java / C# / Node.js.

    And in the second test, he's surprised that Go suddenly spiked in CPU usage, but he simply made a trivial mistake (apparently, he hadn't written Go before). I spotted it just by looking at the function being executed, and the fix was exactly one line of code.

  • @ Kirill Arutyunov · # 1046

    Write in PHP and you'll never have to ask yourself such questions!

  • @ [ $davids.sh ] · # 1047

    Go write this in the Rust chat

  • @ Kirill Arutyunov · # 1048

    No, I'm good in my own garden.

  • @ [ $davids.sh ] · # 1049

    I like that your reaction speed is like a bot's)))

  • @ Kirill Arutyunov · # 1050

    This is ChatGPT + my TG account, all trained on KVN broadcasts from the 90s

  • @ YURII VLADIMIROVICH · # 1051

    On Node.js:

    • When a project pulls in many dependencies, which in turn pull in other dependencies, plus a lot of unused code that still has to be initialized (or when lazy initialization isn't used), then yes, everything starts up slowly. (I'm more than sure that nobody on Node.js minifies their backend before deployment or strips out unused code.)
    • It shuts down slowly? A shutdown's job is to close connections, finish serving clients, and run cleanup logic - what does Node.js itself have to do with that?
    • Who on your team is creating gigabyte-sized Docker images?
  • @ [ $davids.sh ] · # 1052

    First, watch your language. You just said I have "hands from my ass," and that's ban-worthy.

  • @ [ $davids.sh ] · # 1053

    Rephrase in a less aggressive tone

  • @ YURII VLADIMIROVICH · # 1054

    I said it in general ) There was nothing there pointing at you)

  • @ [ $davids.sh ] · # 1055

    So: I'm describing my specific experience on the current project, you say that people who end up there have their hands up their asses, and that doesn't concern me?

  • @ YURII VLADIMIROVICH · # 1056

    This is about when something like that might happen.

  • @ [ $davids.sh ] · # 1057

    Once again: rephrase in a less aggressive tone, please

  • @ YURII VLADIMIROVICH · # 1059

    No problem) One sec

  • @ [ $davids.sh ] · # 1062

    Regarding the first point, I specifically clarified: "but we use the most standard methods" - perhaps I phrased it awkwardly, but we have minimal dependencies. I'll even attach a photo (photo 1).

  • @ [ $davids.sh ] · # 1064

    Regarding shutdown: yes, undoubtedly we have a graceful shutdown that will finish processing HTTP requests and RMQ subscriptions and only then close.

    But the overhead of closing IO itself is very large according to the metrics.

  • @ YURII VLADIMIROVICH · # 1065

    And what about the weight of the node_modules folder?

  • @ [ $davids.sh ] · # 1066

    I'm not kidding about gigabyte Docker images (you can see the same sizes on the official node image's Docker Hub page).

    Yes, you can take Alpine and install what you need yourself, but in the end, the gain is often incomparable to 5 MB of Golang.

  • @ [ $davids.sh ] · # 1067

    If you're talking about weight - the previous message answers that, with a photo of the stock node container's size.

  • @ Vassiliy ITK Kuzenkov · # 1068

    I'll add a bit too. I'm interested in your experience, and I'll share mine.

    Nest.js with 30+ modules also starts at around 200 MB (it's a very fat application - Nest bloats everything - but we use lazy modules; without them it would be worse, of course). It doesn't start in milliseconds, but 3-5 seconds to start is quite acceptable. We have a modular monolith; some modules can start separately, so scaling is quite simple when needed. A bullseye-slim/bookworm-slim image plus a bunch of dependencies comes to ~100-300 MB, we've covered it with caches, and everything loads quickly on ECS. Canaries roll out in a couple of minutes.

    Jest has performed poorly with Node for a long time (it has bugs of its own - you can have a really bad time)). Vitest runs the same tests in milliseconds (even with Nest) and consumes far less (closures in V8 are very expensive, and Jest also trashes its own environment for every little thing - you can't get enough RAM for it)).

    The event loop, of course, complicates optimization, and here's an example with many connections: yes, with 5k connections held open, everything lags noticeably - 250 ms is not the limit :D Node is quite limited here (although there's likely still room for creativity and optimization, and if such loads exist, the business can find the money for it)). Go can indeed handle that much out of the box (in fact, not just Go, but any runtime lighter than Node's), but here, again, I'm talking about Go's expressive capabilities 😄

  • @ [ $davids.sh ] · # 1069

    I'll gather our images again to check if I lied.

  • @ [ $davids.sh ] · # 1070

    Right now RMQ is the center of our world, but before that it was Kafka, and in both cases subscriptions/polling ate a lot of time during application startup.

    We're also covered in metrics and logs, which take a good chunk of startup in Node.js, but we don't release new features without them (they can be disabled later, but I'm talking about launching with new features).

    "3-5 seconds to start" - as I said, you also need to count the time the orchestrator needs, and because of image bloat and RAM consumption, the application itself might start within that window, but all together it adds up to minutes.

    Yes, we also have several modular monoliths (applications are divided by load or purpose, but share a common codebase).

    bullseye-slim/bookworm-slim - we'll see, thank you)

    Regarding Jest - I understand it can be swapped out (we're actually trying to migrate the test runs to bun, so far without success), but (1) a lot is already written on it, and (2) it's so damn annoying that in something as essential as tests we once again run into the beloved JS pattern of "oh yeah, under load this crap won't work, take this other crap instead" 😢

    We already have 250 ms for 800 queues with 10 IO requests each... Even after thorough profiling and optimization, it's practically an unavoidable fact. There's a solution, but it means migrating to something else, because Node.js simply can't handle 8000 IO RPS (total requests to and from the instance)? Seriously? It's really sad.

    Well, I've already understood your love for Go, that's true))

    Yes, undoubtedly, I must say: migrating to Go in this situation is one of the options.

    There are many other languages that will also provide the necessary performance boost compared to Node.js.

  • @ YURII VLADIMIROVICH · # 1071

    No, I'm asking how much the dependencies folder weighs (how much your package.json pulled in).

  • @ Vassiliy ITK Kuzenkov · # 1072

    My queues are small, and the standard Bull.js handles them perfectly. It's interesting to read about your problems :D We've also touched on SQS a bit (but mostly everything is on Bull).

    I've had a lot of socket connections. We've already scaled up to more powerful instances several times; Node.js isn't bad at this. But again, it requires balanced scaling - scaling hardware up, not just out horizontally.

    There's also heavy processing, which is exactly what we moved to a neighboring service written in Rust.

  • @ [ $davids.sh ] · # 1073

    Okay, you were right about this: we have 600 MB of absolutely useless crap per 1 GB image (and it's not even dependencies).

    I'll go talk to DevOps about what this is and why it's there; I didn't even notice it appearing.

    I'll also try bullseye-slim/bookworm-slim and let you know what we achieve.

    If we can even get down to 200 MB, it'll be pure bliss (not comparable to 10 MB Go, but the current situation is definitely not an option).

    Thanks for the tip)

  • @ YURII VLADIMIROVICH · # 1074

  • @ [ $davids.sh ] · # 1075

    I worked at a company where we had a giant enterprise Node.js system, but there, we had almost no external dependencies (except kafka.js): we wrote absolutely all the libraries ourselves (we even built a document database on top of FoundationDB in Node.js, and it worked).

    But such luxury is rarely affordable in the wild.

    And in general, you get tired of dancing around the application: "5000 RPS and 100 instances? No problem" โ€“ and then you hit a wall because of the stack.

    I consider this a small load, yet we have to do far too much fiddling around it, simply because of the stack. Node.js, RMQ, and, frankly, PostgreSQL are already burning my ass, but there are postponed posts about that.

    This is probably the first time I've found myself in a situation like this, and I realize how lucky I was before to land in very specific situations where Go was immediately available (often thanks to me) or where Node.js felt perfectly fine (I accidentally switched to it from Go a couple of years ago, but there it was a pure luxury to work with, thanks to a very interesting stack and plenty of time for DX).

    And then, a year and a half ago, I found myself in a situation where Node.js, RMQ, and PostgreSQL were a given, and we couldn't escape it.

    Again, everything works, these 5000 RPS take milliseconds, but damn, at what cost, why do I have to optimize so much? I'm not happy 😐

  • @ YURII VLADIMIROVICH · # 1076

    Well, it's okay) In time, they'll invent a neural network that simply reads your requirements and writes the most effective code for the business task, and then you'll be the one saying at what cost those Go services were delivered and why so much optimization was needed) As they say, history is cyclical.

  • @ [ $davids.sh ] · # 1077

    As someone who developed the methodology for Business Systems Analytics and started IT-Kachalka purely with design and architecture, I am absolutely for such a future (I didn't even think a couple of years ago that I would return to full-time coding)).

    If I'm given the chance to become a prompt architect - that would be happiness.

  • @ Artem · # 1100

    "...and in reality, it can be hundreds of thousands)" - the "Ь" is extra) (a typo in the Russian original) On topic: I'm looking at backend development with Go; can you suggest any framework-level solutions for a quick start (I want to get a feel for what this Go lang of yours is like)?

  • @ YURII VLADIMIROVICH · # 1101

    So, do you want to try Go itself, or a framework within Go?

  • @ [ $davids.sh ] · # 1103

    If I were a frontend developer transitioning to Go, here's how I'd do it:

    • Choose a project to build (e.g., Tinder, but with a twist 😉)
    • For HTTP, use Echo / Gin
    • For logging, use Zap / Zerolog
    • For configuration and secrets, use Viper
    • For database migrations, use Goose
    • For SQL, use sqlx + introspection with xo + compiled SQL queries with sqlc
    • Follow the project structure described here
    • For test helpers, use the testify library
    • Use a Makefile for script automation (like npm scripts)
    • Start writing the application
    • Nothing works
    • Look up syntax here, find example projects here, and Google (there are a billion free videos and text tutorials for Go)
    • Things start working
    • Break the code into packages
    • Be surprised by CamelCase exports (but eventually appreciate how clean it is)
    • Learn to return errors explicitly (it's like smoking: unpleasant at first, but later you can't do without it)
    • Hit your first nil pointer
    • Read about passing by pointer vs. by value and how to declare each type correctly
    • Write your first goroutine
    • It doesn't run
    • Read about sync.WaitGroup and context.Context
    • Create your first channel
    • Hit a deadlock
    • Learn about for, select, and the order of opening, communicating over, and closing channels
    • A pile of firewood later - the backend is ready 🔥

  • @ [ $davids.sh ] · # 1104

    And regarding the Go framework โ€“ that's for @bondiano )))

  • @ Vassiliy ITK Kuzenkov · # 1106

    No, I think it isn't possible to build a convenient framework (like Rails) in Go, though I'm not against trying to build one myself.

    I ended up assembling a setup similar to what you described. By the way, sqlc is a great tool, and it's copied from Clojure's https://www.hugsql.org/getting-started :D

  • @ Artur G · # 1107

    Interesting numbers.

    It's great that Google collects images for me. 😁 I thought it was less than 100 MB per service. Too lazy to check. And instance startup is fast.

    Go is good situationally. Node.js is the best! 😁

    I believe if you like Go, you should just admit it to yourself and use it.

  • @ YURII VLADIMIROVICH · # 1108

    It would also be good to have a solid comparison table showing where Node.js/Rust/Go/... are ideal and where they are not (considering not only writing code and raw performance but also the managerial side - development cost, support, etc.).

  • @ Artur G · # 1111

    I think that's unrealistic. There's too much variety possible.

  • @ YURII VLADIMIROVICH · # 1133

    Hi, what did DevOps tell you? What's that junk?

    How much have you reduced the image by?

  • @ [ $davids.sh ] · # 1134

    It was already late then, and now it's the weekend, so I've left the question for next week for now

    But as soon as I have info, I'll definitely write

  • @ [ $davids.sh ] · # 1159

    By the way, the answer was found:

    First, the final stage was being built from the build image itself instead of from a base like bookworm; switching it to bookworm shaved off 600 MB.

    Second, the build happens with lerna, and it pulls in a ton of completely unnecessary dependencies. We haven't had time to get rid of this yet, but if it works out, we'll shave off at least another 400 MB.

    As a result, the 1.4 GB image should be around 400 MB, but I think we can even bring that down to 300 MB.