Here's how I usually structure systems like this:
1. Socket: a service that holds the WS / TCP / UDP / MQTT connections. It runs separately so that connections don't drop when we redeploy or scale the services that contain business logic.
2. Receiver: a service that processes incoming messages and holds the main business logic.
3. Commander: optional, but in IoT you usually need a service that translates your commands into commands the device understands and sends them.
Messages come into the Socket and go to the Receiver; the Receiver sends reply commands to the Commander, and the Commander sends messages back out through the Socket (a very CQRS-like approach, with separate read and write streams).
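To make the flow above concrete, here is a minimal sketch in Python. It assumes in-process `asyncio` queues stand in for whatever broker sits between the services, and the names (`receiver`, `commander`, `run_pipeline`, the `ack:` reply format) are made up for illustration; in a real system each function would be its own service.

```python
import asyncio


async def receiver(inbound: asyncio.Queue, commands: asyncio.Queue) -> None:
    """Receiver: business logic that turns a device message into a reply command."""
    while True:
        device_id, payload = await inbound.get()
        # Hypothetical logic: acknowledge every message we receive.
        await commands.put((device_id, f"ack:{payload}"))
        inbound.task_done()


async def commander(commands: asyncio.Queue, outbound: asyncio.Queue) -> None:
    """Commander: maps a command to the bytes the device understands."""
    while True:
        device_id, command = await commands.get()
        await outbound.put((device_id, command.encode()))
        commands.task_done()


async def run_pipeline(messages):
    """Push messages in as the Socket would and collect what it would send back."""
    inbound, commands, outbound = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    workers = [
        asyncio.create_task(receiver(inbound, commands)),
        asyncio.create_task(commander(commands, outbound)),
    ]
    for message in messages:
        await inbound.put(message)  # no stage waits for a reply from the next one
    await inbound.join()
    await commands.join()
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    sent = []
    while not outbound.empty():
        sent.append(outbound.get_nowait())
    return sent
```

Each stage only reads from one queue and writes to the next, so the read path and the write path never block each other.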
Useful tricks to keep in mind:
1. Services should communicate asynchronously (without waiting for a response); otherwise, sooner or later, you'll hit a system-wide deadlock.
2. Accordingly, all messages should be "events": a structure describing a fact that has occurred, along with data about that fact ("UserCreatedMessage", "MessageSuccessfullyProcessed", "MessageSuccessfullySent", etc.).
3. The Socket should contain minimal logic so that it rarely needs to be redeployed. Alternatively, if speed matters more to you, combine the Socket, Receiver, and Commander into one service.
4. All messages should be stored persistently (in Kafka or a database).
5. In realtime systems, two factors matter most: speed, and load that grows linearly with the number of messages.
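As a rough illustration of the "events" and persistence points above, an event can be an immutable record of a fact plus its data, appended to a log. This is only a sketch: the field names (`device_id`, `message_id`, `event_id`, `occurred_at`) are my assumptions, and the hypothetical `EventLog` class is an in-memory stand-in for Kafka or a database table.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class MessageSuccessfullyProcessed:
    """A fact that has already happened, plus the data describing it."""
    device_id: str
    message_id: str
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_record(event) -> str:
    """Serialize an event together with its type name so consumers can route it."""
    return json.dumps({"type": type(event).__name__, "data": asdict(event)})


class EventLog:
    """In-memory stand-in for a persistent log (Kafka topic, database table)."""

    def __init__(self) -> None:
        self.records = []

    def publish(self, event) -> None:
        # Fire and forget: the publisher never waits for a consumer's answer.
        self.records.append(to_record(event))
```

The event is frozen because a fact that has already happened should not change after it is published.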
I want to draw attention specifically to the last point: most often, the bottleneck of your system will not be the programming language, not the transport, and not the network, but database access.
First, each access takes time; second, database queries are hard to batch because each message is processed independently; third, unless you have a master-master (multi-master) database, the number of connections is limited.
As a result, the load grows faster than linearly as the number of messages per second increases.
To keep message processing time constant and load growth linear, the best option is to load the necessary data when the service starts (or when the first message arrives), keep it in memory (or in a cache such as Redis), apply changes directly in memory as each message arrives, and periodically synchronize the data with the main database.
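Roughly, that idea looks like the sketch below. It assumes the state is a simple per-device message counter, and the `load` and `save` callables are hypothetical stand-ins for your real database access. The point is that each message touches only memory, and the database sees one batched write per sync.

```python
class DeviceCounters:
    """Keeps per-device state in memory and syncs only the changed keys."""

    def __init__(self, load, save) -> None:
        self._state = dict(load())  # one read at startup instead of one per message
        self._dirty = set()         # keys changed since the last sync
        self._save = save

    def on_message(self, device_id: str) -> None:
        # Pure in-memory update: no database round trip on the hot path.
        self._state[device_id] = self._state.get(device_id, 0) + 1
        self._dirty.add(device_id)

    def flush(self) -> int:
        """Write all changed keys in one batch; call this on a timer."""
        if not self._dirty:
            return 0
        batch = {key: self._state[key] for key in self._dirty}
        self._save(batch)
        self._dirty.clear()
        return len(batch)
```

The trade-off is that a crash loses whatever has not been flushed yet. If every message is also stored persistently, as suggested above, that state can in principle be rebuilt by replaying messages.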
This is not an easy task, but this is where the "Actors" pattern comes to our aid.
Actors describe very well how to structure this kind of logic as conveniently and clearly as possible.
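Until the next part, here is a very small sketch of the idea, under my own assumptions: one actor per device, built on plain `asyncio` rather than any actor framework, with a hypothetical `DeviceActor` class. The actor owns its state, and the only way to change that state is to put a message in its mailbox, so messages are handled one at a time and no locks are needed.

```python
import asyncio


class DeviceActor:
    """Owns the state of one device and processes its mailbox sequentially."""

    def __init__(self, device_id: str) -> None:
        self.device_id = device_id
        self.message_count = 0  # state that only this actor mutates
        self._mailbox = asyncio.Queue()
        self._task = None

    def start(self) -> None:
        self._task = asyncio.create_task(self._run())

    def tell(self, message) -> None:
        """Fire and forget: the sender never waits for a reply."""
        self._mailbox.put_nowait(message)

    async def _run(self) -> None:
        while True:
            message = await self._mailbox.get()
            if message is None:  # sentinel used here to mean "stop"
                break
            self.message_count += 1

    async def stop(self) -> None:
        self.tell(None)
        await self._task


async def demo() -> int:
    actor = DeviceActor("dev-1")
    actor.start()
    for payload in ("a", "b", "c"):
        actor.tell(payload)
    await actor.stop()  # the mailbox is drained in order before the actor exits
    return actor.message_count
```

Calling `asyncio.run(demo())` runs three messages through a single actor and returns how many it processed.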
This message is already too long, so I'll continue in the next part. For now, watch this video about actors, which I barely managed to find:
https://www.youtube.com/watch?v=Fw-CXSG8KZE