[ $davids.sh ] β€” david shekunts blog

πŸ‘‘The Ultimate Approach for Realtime Systems: Chats, Games, IoT, and More πŸ‘‘

# [ $davids.sh ] Β· message #187

πŸ‘‘The Ultimate Approach for Realtime Systems: Chats, Games, IoT, and More πŸ‘‘

Throughout my career, I have shipped 3 chats and 6 IoT systems to production, and only now have I arrived at one important concept.

#top #highload #db #async #actor

(continuation in comments)

  • @ [ $davids.sh ] Β· # 640

    Here's how I most often structure similar systems:

    1. Socket – a service that holds WS / TCP / UDP / MQTT connections. It lives separately so that connections don't drop when we redeploy or scale the services with business logic.

    2. Receiver – a service that processes messages with the main business logic.

    3. Commander – optional, but in IoT you most often need a service that maps your commands onto commands the device understands and sends them.

    Messages come into the Socket and go to the Receiver; the Receiver sends reply commands to the Commander, and the Commander sends messages out through the Socket (a very CQRS-like approach, with separate read and write streams).
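    The Socket → Receiver → Commander flow can be sketched with in-process queues (a minimal illustration; in production the two streams would be Kafka topics or similar, and all names here are made up for the example):

```typescript
// Two one-way streams: Socket -> Receiver, Receiver -> Commander.
type InboundEvent = { kind: "MessageReceived"; deviceId: string; payload: string };
type ReplyEvent = { kind: "ReplyRequested"; deviceId: string; command: string };

const inbound: InboundEvent[] = []; // Socket -> Receiver stream
const replies: ReplyEvent[] = [];   // Receiver -> Commander stream

// Socket: minimal logic, just accepts the raw message and enqueues it.
function socketOnMessage(deviceId: string, payload: string): void {
  inbound.push({ kind: "MessageReceived", deviceId, payload });
}

// Receiver: business logic; emits a reply event instead of calling back.
function receiverStep(): void {
  const ev = inbound.shift();
  if (ev) {
    replies.push({ kind: "ReplyRequested", deviceId: ev.deviceId, command: `ack:${ev.payload}` });
  }
}

// Commander: maps the reply onto a device-level command and "sends" it.
function commanderStep(send: (deviceId: string, cmd: string) => void): void {
  const ev = replies.shift();
  if (ev) send(ev.deviceId, ev.command.toUpperCase());
}
```

    Note that nothing here waits on a response: each service only reads one stream and writes another, which is what keeps the read and write paths independent.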

    What useful tricks are important to remember:

    1. Services should communicate asynchronously (fire and forget, don't wait for a response); otherwise, sooner or later, you'll hit a system-wide deadlock.

    2. Accordingly, all messages should be "events" – a structure describing a fact that has occurred, with data about that fact ("UserCreatedMessage", "MessageSuccessfullyProcessed", "MessageSuccessfullySent", etc.).

    3. The Socket should contain minimal logic so that it rarely needs to be redeployed, OR, if speed matters more to you, combine the Socket, Receiver, and Commander into one service.

    4. All messages should be stored persistently (Kafka or a database).

    5. In realtime systems, two factors matter most: speed, and load that grows linearly with the number of messages.
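    The "events as facts" point can be sketched like this (field names are illustrative; the event names mirror the examples above):

```typescript
// An event is an immutable record of a fact, plus the data about that fact.
type UserCreatedMessage = {
  type: "UserCreatedMessage";
  userId: string;
  occurredAt: number; // when the fact happened, not when it was processed
};
type MessageSuccessfullyProcessed = {
  type: "MessageSuccessfullyProcessed";
  messageId: string;
  occurredAt: number;
};
type DomainEvent = UserCreatedMessage | MessageSuccessfullyProcessed;

// A consumer reacts to facts; it never waits on the producer for a reply.
function handle(ev: DomainEvent): string {
  switch (ev.type) {
    case "UserCreatedMessage":
      return `welcome ${ev.userId}`;
    case "MessageSuccessfullyProcessed":
      return `done ${ev.messageId}`;
  }
}
```

    Because events describe the past ("Created", "Processed", "Sent"), any service can replay them from Kafka or a database without coordinating with the producer.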

    I want to draw attention specifically to the last point: most often, the bottleneck of your system will not be the programming language, not the transport, and not the network, but database access.

    Firstly, each access takes time; secondly, database queries are difficult to batch because each message is processed independently; thirdly, if you don't have a master-master database, you'll have a limited number of connections.

    All of this leads to the load growing faster than linearly as the number of messages per second increases.

    To make message processing time constant and load growth linear, the best option is to fetch the necessary data when the service starts (or when the first message arrives), put it into memory (or a cache like Redis), apply changes directly in memory on each message, and periodically synchronize the data back to the main database.
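    A minimal sketch of that "load once, mutate in memory, flush periodically" idea, assuming a simple counter-style state (the DB calls are stand-in callbacks, not a real driver API):

```typescript
type Counters = Map<string, number>;

class InMemoryState {
  private data: Counters;
  private dirty = new Set<string>(); // keys changed since the last flush

  // `initial` is fetched once at startup (or on the first message).
  constructor(initial: Counters) {
    this.data = new Map(initial);
  }

  // The hot path: every message mutates memory only, no DB round-trip.
  increment(key: string): number {
    const next = (this.data.get(key) ?? 0) + 1;
    this.data.set(key, next);
    this.dirty.add(key);
    return next;
  }

  // Called on a timer: write back only the changed keys, as one batch.
  flush(saveToDb: (key: string, value: number) => void): number {
    for (const key of this.dirty) saveToDb(key, this.data.get(key)!);
    const flushed = this.dirty.size;
    this.dirty.clear();
    return flushed;
  }
}
```

    The dirty-set means a key touched a thousand times between flushes still costs one write, which is exactly what decouples DB load from message rate.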

    This is not an easy task, but this is where the "Actors" pattern comes to our aid.

    The Actor model describes very well how to structure this kind of logic as conveniently and clearly as possible.
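    A bare-bones sketch of the Actor idea, so the connection to the in-memory state above is visible: one mailbox, messages processed one at a time, state touched only from inside the actor. This is illustrative only; real actor runtimes add supervision, routing, persistence, and so on.

```typescript
type Msg =
  | { kind: "add"; amount: number }
  | { kind: "get"; reply: (n: number) => void };

class CounterActor {
  private mailbox: Msg[] = [];
  private total = 0; // private state, never shared outside the actor
  private processing = false;

  send(msg: Msg): void {
    this.mailbox.push(msg);
    if (!this.processing) this.drain();
  }

  // Drain the mailbox sequentially: since only one message is handled
  // at a time, `total` needs no locks even under concurrent senders.
  private drain(): void {
    this.processing = true;
    let msg: Msg | undefined;
    while ((msg = this.mailbox.shift())) {
      if (msg.kind === "add") this.total += msg.amount;
      else msg.reply(this.total);
    }
    this.processing = false;
  }
}
```

    Communication stays asynchronous (`send` never returns a value; answers come back as reply messages), which matches the "don't wait for a response" rule from the list above.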

    The message has turned out to be too long as it is, so the continuation will be in the next part, but for now, watch this video about actors that I barely managed to find:

    https://www.youtube.com/watch?v=Fw-CXSG8KZE

  • @ Daniil full-stack deeplay Sherbakov Β· # 641

    And why specifically at service startup?

    Isn't it easier to bind projections to the user's socket, with information for a specific user?

    A full database mirror is expensive in terms of memory.

    *This is if we're talking about RAM; if it's in Redis, then the whole database is possible...

  • @ [ $davids.sh ] Β· # 642

    Added that they can be placed in memory upon receiving the first message)

    Regarding memory: you definitely need to calculate the average/maximum size of the data you'll keep hot and, based on that, decide where it will be stored (+ a cache is more reliable in case of application crashes)

    In future articles, I will discuss a 3-layer architecture: memory, cache, and DB – and strategies for their use depending on the conditions

    And I'll also talk about a surprisingly cool cache specifically for the Actor model

  • @ Sergey Pogranichnyy πŸ“ Β· # 643

    Almost everything that was said is handled by Centrifugo.

  • @ [ $davids.sh ] Β· # 644

    A long time ago, I was working on a Django project with some guys and we used Centrifugo, specifically as a Socket service (it held the WS) that you could subscribe to directly.

    Can it do anything else?

  • @ Sergey Pogranichnyy πŸ“ Β· # 645

    Clients can make RPC calls directly through it. Messages can be stored in Redis or Dragonfly. The API for retrieving message history during a reconnect avalanche after a crash is very useful. It provides metrics. HTTP/3 is on the way.

  • @ [ $davids.sh ] Β· # 646

    Cool, so it's socket + transport + storage, convenient

    Does it have clustering? Master-master? Found it

  • @ [ $davids.sh ] Β· # 647

    And yes, a very important question: what are its drawbacks?

  • @ Sergey Pogranichnyy πŸ“ Β· # 648

  • @ Ivan ITK 🚫 Β· # 659

    Dapr has good documentation on actors, and there's a section on the MSDN website.