Building an indie MMORPG – part 3: backend architecture, custom protocol, and chat

July 12, 2024


How I designed clean backend architecture, built a custom binary protocol, and implemented real-time chat with a task scheduler for timed game events.

Introduction

Hi. Today there will be a lot of code :). We will mainly look at the project's design on the backend side (there will be little frontend today). The first new thing I must mention is the switch from the text format presented in the previous post to a binary format for the messages being sent. As you might guess, this is for performance. By packing messages into raw bytes, I get a much smaller message size than if, for example, I sent them as JSON.

Keep in mind that the presented code was, and still is, continuously developed and has been extensively refactored several times. Since this series presents the progress and the entire process rather than the final result, some parts of the code may change significantly in the future; there were many moments when I realized that, for example, I had named some element foolishly or composed the dependencies between certain classes badly. But don't worry, I will comment on everything in future posts :).

Clean architecture on the backend

I am somewhat a fanatic of good code organization, especially on the backend, so from the beginning, I wanted to maintain an appropriate structure of classes and modules in the project. I realized that the choice of the Spring Webflux framework and WebSocket communication might not be the final decision, and I wanted to organize my code so that in the future, switching to another framework and adding or changing to TCP communication would not require any modifications to the game logic code, which will be the most complex and intricate (it is, after all, the core of the game as it contains all the business rules). In other words, the game logic code had to be entirely independent of interchangeable (external) elements such as the framework, communication protocol, or database, which I wasn't even considering at this point.

The appropriate separation is ensured by clean architecture, as described by Uncle Bob (Robert C. Martin) in his book Clean Architecture. It's worth mentioning that very similar concepts are also presented by patterns such as hexagonal architecture (also known as ports and adapters) or layered architecture. Generally, it's about separating business logic from implementation details. Below, I provide a schematic diagram showing the layout of classes and modules in my backend.

Hexagonal architecture diagram – backend design pattern showcasing domain logic, ports, and adapters for a modular server-side application

The inner circle (CORE) is the game logic. It is entirely independent of the outer circle, which represents the implementation details: in this case, the Spring Webflux framework, WebSockets, and some database that I will choose in the future. Classes belonging to the game logic should contain no imports of (dependencies on) external entities. Ideally, the entire core will be written in pure Java, although lightweight auxiliary libraries (like Lombok) are sometimes used. The rounded rectangles are classes or modules (sets of classes performing a specific task).

In our diagram, arrows represent the directions of communication, and the dotted lines indicate that a rectangle belonging to the inner circle is an interface implemented by a class belonging to the outer layer (achieving the appropriate separation that I keep mentioning).

Flow description

Now we will analyze the information flow in the game. It will be most intuitive to start with the GameWebsocketHandler class, which is responsible for the WebSocket connection with each client and also initiates the entire game logic and passes all incoming messages to it. Some of you will probably notice that GameWebsocketHandler is a new, extended version of the ChatWebsocketHandler class from the previous post.

import lombok.extern.slf4j.Slf4j;
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.web.reactive.socket.WebSocketHandler;
import org.springframework.web.reactive.socket.WebSocketMessage;
import org.springframework.web.reactive.socket.WebSocketSession;
import pl.programowanieibiznes.mmorpgwebfluxserver.logic.Game;
import pl.programowanieibiznes.mmorpgwebfluxserver.websocket.binarydatatranslators.TextMessagesTranslator;
import reactor.core.publisher.Mono;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@Slf4j
public class GameWebsocketHandler implements WebSocketHandler {

    private final Map<String, WebSocketSession> sessionsMap;
    private final TextMessagesTranslator textMessagesTranslator;
    private final Game game;

    public GameWebsocketHandler() {
        this.sessionsMap = new ConcurrentHashMap<>();
        this.textMessagesTranslator = new TextMessagesTranslator();
        this.game = Game.init(
                new WebfluxClientsNotificator(sessionsMap, textMessagesTranslator),
                new WebfluxGameTasksScheduler()
        );
    }

    @Override
    public Mono<Void> handle(WebSocketSession session) {
        final var username = session.getHandshakeInfo().getUri().getQuery().split("username=")[1];
        log.info("New client connected: " + username);

        sessionsMap.put(username, session);
        game.connectPlayer(username);

        return session.receive()
                .map(WebSocketMessage::getPayload)
                .doOnNext(it -> processBinaryMessage(it, username))
                .doOnError(e -> log.error("GameWebsocketHandler error", e))
                .doFinally(signalType -> {
                    game.disconnectPlayer(username);
                    // The map is keyed by username (see sessionsMap.put above), not by session ID.
                    sessionsMap.remove(username);
                    log.info("Client disconnected: " + username);
                })
                .then();
    }

    private void processBinaryMessage(DataBuffer dataBuffer, String executorId) {
        switch (dataBuffer.read()) {
            case ServerCommandTypes.SEND_MESSAGE -> textMessagesTranslator.fromBinary(dataBuffer)
                    .ifPresent(it -> game.sendMessage(executorId, it));
            default -> log.error("Unknown command");
        }
    }
}

First, this is one of the exceptions where I have shown imports in the code. I want to point out that aside from dependencies on pure Java objects like ConcurrentHashMap, most imports are from the Spring Webflux framework. And that makes sense because GameWebsocketHandler is an implementation detail belonging to the outer circle. If I decided on a different framework, this class would be removed, and the game logic would remain.

Analyzing the presented class, we see that the entire game logic is initialized in the constructor (among other things, we create a Game object). In the handle method, which is responsible for handling each connection from start to finish, we register a new client, translate the next binary message into an appropriate Java object (in this case, just a string), and pass it to the Game object, which is the facade (entry point) to the inner circle (game logic). At the very end, we unregister the client and also instruct the engine to remove it. In the event of an error, we simply log it. It is also worth noting two important differences compared to the previous version of this class, ChatWebsocketHandler:

We use the getPayload method to read subsequent incoming messages instead of getPayloadAsText because we want the raw binary payload rather than text.
In this specific class, we do not send anything outwards - this process occurs in the logic layer.
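The handler takes the username by splitting the raw query string on username=, which throws when the parameter is missing and also picks up any parameters that follow it. A slightly more defensive variant could look like the sketch below. HandshakeQuery is a hypothetical helper, not part of the project, and it uses only the JDK.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Optional;

// Hypothetical helper, not part of the original project.
final class HandshakeQuery {

    private HandshakeQuery() {
    }

    // Returns the value of the "username" query parameter, if present and non-blank.
    static Optional<String> username(URI uri) {
        final var query = uri.getRawQuery(); // null when the URI has no query part
        if (query == null) {
            return Optional.empty();
        }
        return Arrays.stream(query.split("&"))
                .map(pair -> pair.split("=", 2))
                .filter(kv -> kv.length == 2 && kv[0].equals("username"))
                .map(kv -> URLDecoder.decode(kv[1], StandardCharsets.UTF_8))
                .filter(name -> !name.isBlank())
                .findFirst();
    }
}
```

In handle, the result of HandshakeQuery.username(session.getHandshakeInfo().getUri()) could decide whether to register the player or to end the connection right away with session.close().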

Next, let's look at the Game class, which is the entry point to the entire game engine:

import lombok.AccessLevel;
import lombok.Builder;
import lombok.RequiredArgsConstructor;
import pl.programowanieibiznes.mmorpgwebfluxserver.logic.clientsnotificator.ClientsNotificator;
import pl.programowanieibiznes.mmorpgwebfluxserver.logic.gametaskscheduler.GameTasksScheduler;
import pl.programowanieibiznes.mmorpgwebfluxserver.logic.managers.MessageManager;
import pl.programowanieibiznes.mmorpgwebfluxserver.logic.players.Player;
import pl.programowanieibiznes.mmorpgwebfluxserver.logic.players.Players;

@RequiredArgsConstructor(access = AccessLevel.PRIVATE)
@Builder
public class Game {

    private final Players players;
    private final MessageManager messageManager;

    public static Game init(ClientsNotificator clientsNotificator, GameTasksScheduler gameTasksScheduler) {
        return Game.builder()
                .players(Players.empty())
                .messageManager(new MessageManager(clientsNotificator, gameTasksScheduler))
                .build();
    }

    public void connectPlayer(String username) {
        players.add(new Player(username));
    }

    public void disconnectPlayer(String username) {
        players.remove(username);
    }

    public void sendMessage(String executorId, String message) {
        messageManager.send(executorId, message, players);
    }
}

Again, I have shown the imports to indicate that this time there are no dependencies on anything external except Lombok, which is a lightweight library. For now, the Game class is short. In the static init factory method, we initialize players as an empty object (after all, no player is connected at the start) and create the object responsible for sending messages.
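The Player and Players classes are not shown in this post. The sketch below is a hypothetical reconstruction whose method names mirror the calls made on them in the post's code (empty, add, remove, get, getAll, username). The ConcurrentHashMap is an assumption, chosen because connections may be handled on different threads.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical reconstruction: the post does not show these classes.
record Player(String username) {
}

final class Players {

    // Keyed by username, which currently doubles as the player ID.
    private final Map<String, Player> byUsername = new ConcurrentHashMap<>();

    private Players() {
    }

    static Players empty() {
        return new Players();
    }

    void add(Player player) {
        byUsername.put(player.username(), player);
    }

    void remove(String username) {
        byUsername.remove(username);
    }

    Optional<Player> get(String username) {
        return Optional.ofNullable(byUsername.get(username));
    }

    // Immutable snapshot, so callers can iterate while players connect or disconnect.
    List<Player> getAll() {
        return List.copyOf(byUsername.values());
    }
}
```

Note that this is pure Java with no framework imports, so it fits inside the inner circle.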

It is important to understand that the Game class, as a facade pattern, acts as an entry point to the module and should not itself contain too much business logic. Its role is rather to pass the flow to deeper, specialized classes—in this case, the MessageManager. If you look at the architecture diagram above, you will see that the flow goes from Game to the Logic Layer, which includes all specialized classes performing specific tasks such as message sending, movement, combat, etc.

The methods connectPlayer and disconnectPlayer are simply saving and removing an active player to and from RAM. More interesting is the sendMessage method, whose responsibility is to send a message to other active players. So let's take a closer look:

@Slf4j
@RequiredArgsConstructor
public class MessageManager {

    private final ClientsNotificator clientsNotificator;
    private final GameTasksScheduler gameTasksScheduler;

    public void send(String executorId, String message, Players players) {
        players.get(executorId)
                .ifPresent(e -> {
                    final var finalMsg = e.username() + ": " + message;
                    final var receivers = players.getAll();

                    clientsNotificator.showMessage(finalMsg, receivers);
                    gameTasksScheduler.schedule(List.of(
                            new GameTasksScheduler.Task(Duration.ofSeconds(1), () -> clientsNotificator.showMessage("After 1s", receivers)),
                            new GameTasksScheduler.Task(Duration.ofSeconds(2), () -> clientsNotificator.showMessage("After 2s", receivers)),
                            new GameTasksScheduler.Task(Duration.ofSeconds(3), () -> clientsNotificator.showMessage("After 3s", receivers))
                    ));
                });
    }
}

We have the send method to analyze. First, we retrieve the message author based on their ID (in this case, the nickname) and if they exist, we perform the following steps:

1. We prepend the author's nickname to the message.
2. We retrieve all active players (creating a general chat).
3. We send the message to everyone.
4. We perform an additional action: sending a test message sequentially after 1, 2, 3 seconds.

Step 3, which involves sending the message to everyone using the ClientsNotificator, and the additional step 4 performed using the GameTasksScheduler, require more explanation.

ClientsNotificator

More precisely, this is an interface. We want to send a message via WebSocket, but according to the principles of clean architecture, we cannot call framework code at this point. Dependency inversion comes to our aid here: the core declares an interface, and the outer layer supplies (injects) its implementation. This is precisely what interfaces in Java are for. So, let's define the contract for message broadcasting:

public interface ClientsNotificator {

    void showMessage(String message, List<Player> receivers);

}

Pure Java code :). In the future, methods informing clients about other events besides sending messages (e.g., a move command) will be added here. Now let's move on to the implementation that belongs to the outer layer:

@Slf4j
@RequiredArgsConstructor
public class WebfluxClientsNotificator implements ClientsNotificator {

    private final Map<String, WebSocketSession> gameClients;
    private final TextMessagesTranslator textMessagesTranslator;

    @Override
    public void showMessage(String message, List<Player> receivers) {
        toSessionsFlux(receivers)
                .flatMap(session -> session.send(
                        Mono.just(session.binaryMessage(dataBufferFactory ->
                                dataBufferFactory.wrap(textMessagesTranslator.toBinary(message)))))
                )
                .doOnError(e -> log.error("WebfluxClientsNotificator error", e))
                .subscribe();
    }

    private Flux<WebSocketSession> toSessionsFlux(List<Player> receivers) {
        return Flux.fromIterable(
                receivers.stream()
                        .map(it -> Optional.ofNullable(gameClients.get(it.username())))
                        .filter(Optional::isPresent)
                        .map(Optional::get)
                        .toList()
        );
    }
}

The principle of operation is relatively simple. We extract active sessions based on the nicknames of all players*, map the text message to binary format, and send it. Of course, we log any potential errors.

* Actually, it would be easier to simply call gameClients.values() in this case instead of passing players to the showMessage method and then extracting sessions by nicknames, since it's a general chat and we need everyone, but maybe I had a deeper concept ;p.
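One practical consequence of hiding WebSocket behind ClientsNotificator is that the game logic can be exercised without any framework at all. The sketch below shows a hypothetical recording implementation that could stand in for WebfluxClientsNotificator in a unit test. Player and ClientsNotificator are repeated here in simplified form only so that the example compiles on its own; RecordingClientsNotificator and Sent are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for the post's core types, repeated so the sketch compiles on its own.
record Player(String username) {
}

interface ClientsNotificator {
    void showMessage(String message, List<Player> receivers);
}

// Hypothetical test double: records what would have been sent instead of using WebSocket.
final class RecordingClientsNotificator implements ClientsNotificator {

    record Sent(String message, List<Player> receivers) {
    }

    private final List<Sent> sent = new ArrayList<>();

    @Override
    public void showMessage(String message, List<Player> receivers) {
        sent.add(new Sent(message, List.copyOf(receivers)));
    }

    // Copy of everything recorded so far, in call order.
    List<Sent> sent() {
        return List.copyOf(sent);
    }
}
```

A test could pass this object wherever a ClientsNotificator is expected and then assert on sent() instead of opening real connections.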

GameTasksScheduler

Time to explain the additional step, which seems pointless here. I simply wanted to test the ability to perform specific actions in the game after a strictly defined time. Why did I need this? For example, when a player starts moving from tile to tile, the movement takes a certain amount of time. Let's assume exactly 1 second. When the server receives the move command, it must immediately check whether the target tile can be occupied and inform the relevant clients that the move has started. After that 1 second, it must check again whether the tile is still free (someone faster might have occupied it in the meantime) and, if so, actually occupy the new tile, execute the appropriate logic, and inform the relevant viewers again.

That's why I wanted to introduce this type of task scheduling object right away and the easiest way to test it was by simply sending messages after a specific time :). Of course, this code will be removed soon.
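The two-phase movement check described above can be sketched as follows. This is only an illustration under stated assumptions, not the game's movement code: Tile, DelayedRunner, and Movement are hypothetical names, the events list stands in for client notifications, and the plain HashMap assumes that all calls happen on a single thread.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; the post has not shown its movement code.
record Tile(int x, int y) {
}

// Simplified stand-in for a scheduler port: run one action after a delay.
interface DelayedRunner {
    void runAfter(Duration delay, Runnable action);
}

final class Movement {

    static final Duration MOVE_TIME = Duration.ofSeconds(1);

    private final Map<Tile, String> occupants = new HashMap<>(); // tile -> username
    private final DelayedRunner runner;
    final List<String> events = new ArrayList<>(); // stands in for client notifications

    Movement(DelayedRunner runner) {
        this.runner = runner;
    }

    void place(String username, Tile tile) {
        occupants.put(tile, username);
    }

    void requestMove(String username, Tile from, Tile to) {
        // First check, done immediately when the command arrives.
        if (occupants.containsKey(to)) {
            events.add(username + " move rejected");
            return;
        }
        events.add(username + " started moving");
        runner.runAfter(MOVE_TIME, () -> {
            // Second check after the move time: someone faster may have taken the tile.
            if (occupants.putIfAbsent(to, username) == null) {
                occupants.remove(from, username);
                events.add(username + " arrived");
            } else {
                events.add(username + " move cancelled");
            }
        });
    }
}
```

Because the delay is hidden behind an interface, a test can collect the scheduled actions in a list and run them by hand, for example new Movement((delay, action) -> pending.add(action)), and thereby simulate two players racing for the same tile without waiting a real second.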

Below is the code for the interface and its implementation. The situation is similar to ClientsNotificator: the implementation needs to use the framework.

@FunctionalInterface
public interface GameTasksScheduler {

    void schedule(List<Task> tasks);

    record Task(Duration delay, Runnable runnable) {
    }
}

@Slf4j
public class WebfluxGameTasksScheduler implements GameTasksScheduler {
    @Override
    public void schedule(List<Task> tasks) {
        Flux.fromIterable(tasks)
                .flatMap(task ->
                        Mono.delay(task.delay())
                                .doOnNext(it -> task.runnable().run()))
                .doOnError(e -> log.error("WebfluxGameTasksScheduler error", e))
                .subscribe();
    }
}
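Since the core depends only on the GameTasksScheduler interface, the Reactor-based class above could in principle be replaced without touching the game logic. As an illustration of that claim, here is a hypothetical alternative adapter built on the JDK's ScheduledExecutorService. ExecutorGameTasksScheduler and its demo method are made-up names, and the interface is repeated only so that the sketch compiles on its own.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Same contract as in the post, repeated so the sketch compiles on its own.
interface GameTasksScheduler {
    void schedule(List<Task> tasks);

    record Task(Duration delay, Runnable runnable) {
    }
}

// Hypothetical alternative adapter built on the JDK instead of Reactor.
final class ExecutorGameTasksScheduler implements GameTasksScheduler, AutoCloseable {

    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

    @Override
    public void schedule(List<Task> tasks) {
        for (Task task : tasks) {
            executor.schedule(() -> {
                try {
                    task.runnable().run();
                } catch (RuntimeException e) {
                    // A failing task is reported and does not affect the other tasks.
                    System.err.println("Scheduled task failed: " + e);
                }
            }, task.delay().toMillis(), TimeUnit.MILLISECONDS);
        }
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }

    // Usage example: schedules three tasks out of order and returns the order they ran in.
    static List<String> demo() {
        final var ran = new CopyOnWriteArrayList<String>();
        final var done = new CountDownLatch(3);
        try (var scheduler = new ExecutorGameTasksScheduler()) {
            scheduler.schedule(List.of(
                    new Task(Duration.ofMillis(300), () -> { ran.add("third"); done.countDown(); }),
                    new Task(Duration.ofMillis(100), () -> { ran.add("first"); done.countDown(); }),
                    new Task(Duration.ofMillis(200), () -> { ran.add("second"); done.countDown(); })
            ));
            done.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.copyOf(ran);
    }
}
```

The single-threaded executor runs every task on one thread in order of its delay, which keeps timed game actions from running concurrently with each other. Whether that is the right threading model for the game is a separate design decision.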

StateRepository

I wouldn't be myself if I didn't mention the StateRepository interface, which appears in the architecture diagram. As I mentioned, I haven't introduced a database yet, but I plan to do so soon. Database access code, i.e., reading and writing, will once again require framework code and other dependencies, so I am hiding it behind the discussed interface.

Protocol

In the comments on the previous post, there were questions about how I came up with the binary WebSocket communication protocol. It's simple: the first byte represents a specific command, and the subsequent bytes contain all the necessary information for that command. This is the same on both the server and client sides. Because the command fits in a single byte, there can be at most 256 different commands sent to the server and 256 sent to the clients.

For example, a request sent to the server to move to the right is just one byte, because no additional data is needed (moving to the left is simply another value in the range from 0 to 255). Informing clients about a specific player occupying a new tile, however, requires more bytes:

1 byte for the command
2 bytes for the player identifier, as I assume there will be more than 256 players
2 bytes for the X coordinate (a map larger than 256 x 256 tiles)
2 bytes for the Y coordinate
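The layout listed above can be sketched with java.nio.ByteBuffer. The class name PlayerMovedMessage and the command value 2 are made up for illustration; only the field sizes come from the list.

```java
import java.nio.ByteBuffer;

// Hypothetical encoder; the command value 2 is made up for illustration.
final class PlayerMovedMessage {

    static final byte PLAYER_MOVED = 2;
    static final int SIZE = 1 + 2 + 2 + 2; // command + player ID + X + Y

    private PlayerMovedMessage() {
    }

    static byte[] toBinary(int playerId, int x, int y) {
        return ByteBuffer.allocate(SIZE)     // ByteBuffer is big-endian by default
                .put(PLAYER_MOVED)
                .putShort((short) playerId)  // keeps the low 16 bits, so 0..65535 fits
                .putShort((short) x)
                .putShort((short) y)
                .array();
    }

    // Reads an unsigned 16-bit value starting at the given offset.
    static int readUnsignedShort(byte[] data, int offset) {
        return Short.toUnsignedInt(ByteBuffer.wrap(data).getShort(offset));
    }
}
```

Both sides have to agree on byte order. ByteBuffer writes big-endian unless told otherwise, which matches the default of DataView.getUint16 in the browser.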

That adds up to 7 bytes, but of course, there might be a requirement to send additional data. In our example, we need to send a text message. Remember that both the client and the server must know how to read each command. In this case, we can send the command as the first byte, the length of the message as the second byte (which limits the message to 255 bytes), and the message itself in the remaining bytes.
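A minimal sketch of this text message layout, operating on a complete frame as a plain byte array. TextMessageCodec and the command value 1 are hypothetical; the post does not show the real TextMessagesTranslator, which may be organized differently (for instance, the handler consumes the command byte before calling it). UTF-8 is also an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical codec; the real TextMessagesTranslator is not shown in the post.
final class TextMessageCodec {

    static final byte SHOW_MESSAGE = 1;  // made-up command value
    static final int MAX_LENGTH = 255;   // largest value one unsigned length byte can hold

    private TextMessageCodec() {
    }

    static byte[] toBinary(String message) {
        final byte[] text = message.getBytes(StandardCharsets.UTF_8);
        if (text.length > MAX_LENGTH) {
            throw new IllegalArgumentException("Message longer than " + MAX_LENGTH + " bytes");
        }
        return ByteBuffer.allocate(2 + text.length)
                .put(SHOW_MESSAGE)
                .put((byte) text.length)
                .put(text)
                .array();
    }

    // Returns empty when the frame is malformed instead of throwing.
    static Optional<String> fromBinary(byte[] frame) {
        if (frame.length < 2 || frame[0] != SHOW_MESSAGE) {
            return Optional.empty();
        }
        final int length = Byte.toUnsignedInt(frame[1]);
        if (frame.length != 2 + length) {
            return Optional.empty();
        }
        return Optional.of(new String(frame, 2, length, StandardCharsets.UTF_8));
    }
}
```

The length is counted in bytes, not characters, so text with non-ASCII letters uses up the 255-byte limit faster. For comparison, the message "hi" takes 4 bytes in this layout, while a hypothetical JSON equivalent such as {"type":"message","text":"hi"} takes 30.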

Frontend cosmetics

The above changes on the backend did not require many modifications on the client side. I just had to teach it the protocol I developed :):

useEffect(() => {
  if (lastMessage !== null) {
    lastMessage.data.arrayBuffer().then((buffer: ArrayBuffer) => {
      const data = new Uint8Array(buffer);
      const commandType = data[0];

      if (commandType === ClientCommandTypes.ShowMessage) {
        setMessages((prev) => [...prev, messageFromBinary(data.slice(1))]);
      }

      // todo read other commands
    });
  }
}, [lastMessage]);

The change involves reading a byte array instead of plain text and mapping it to a comprehensible object (in this case, a string). We check which command has arrived (for now, only one type) and perform the appropriate action (saving the message).

Result

Below is the result of the discussed code fragments. I realize there was a lot of material, and the post may take a few days to fully absorb. I wish you a comprehensive understanding, happy reading, and talk to you soon!