Building an indie MMORPG – part 4: Phaser integration, world rendering & movement basics

July 18, 2024

16 min read

Introducing Phaser to render the world in-browser. Explaining full backend modularity with Maven. Sending the map from server to client, drawing tiles, and handling first movement logic.

Introduction

Unusually, this post starts with a video that shows the new features discussed today:

The first thing you will notice is the game window, powered by Phaser, a library for building browser games. I could have used the plain Canvas API, but I had read only good reviews of the library and simply liked it. To show Phaser working with real elements of the project, we will also display a single map fragment and add the basics of movement. In the video, you can see that not all character animations were ready yet, but that was enough for me to move on with the project. As an extra, the game shows a message when walking is not possible (you can't enter water).

As for the graphics, at that time I used Piskelapp, a free online tool for creating pixel art sprites. With it, I prepared the first grass, water, black tile, and sand tiles, as well as the character spritesheet, which contains a standing sprite and two walking sprites for each direction (as you can see, I have drawn only two directions so far). Drawing is not my strong suit, and I would be happy to collaborate with someone who enjoys it and is good at it :D. The hardest part is vertical objects seen from a 45-degree perspective. At some point, I started generating "flat" graphics using AI models, but more on that in future posts.

And now, let's get to work!

Phaser

Embedding the Phaser window is relatively simple. Let's create a regular React component that configures the game and creates the scene (our game will have only one, but in a platformer, for example, each level could be a separate scene).

const PhaserGame = ({ map }: Props) => {
  const gameRef = useRef<Phaser.Game | null>(null);

  useEffect(() => {
    const config: Phaser.Types.Core.GameConfig = {
      type: Phaser.AUTO,
      parent: "game-container",
      width: GameConfig.VisibleColumns * GameConfig.TileSize,
      height: GameConfig.VisibleRows * GameConfig.TileSize,
      pixelArt: true,
      roundPixels: true,
      scale: {
        mode: Phaser.Scale.FIT,
        parent: "game-container",
      },
    };

    if (gameRef.current === null) {
      gameRef.current = new Phaser.Game(config);
      gameRef.current.scene.add("MainScene", MainScene, true, { map: map });
    }

    return () => {
      if (gameRef.current) {
        gameRef.current.destroy(true);
        gameRef.current = null;
      }
    };
  }, []);

  return (
    <div
      id="game-container"
      className="flex items-center justify-center my-1"
    ></div>
  );
};

The above component is a bridge between React and Phaser. We prepare the configuration, create the game object if it has not been initialized yet, and launch the scene, passing it the input data, which in our case is the map object. The effect has an empty dependency array, so the game is created once on mount and destroyed on unmount. As for the size of the game window, the player sees 13 rows and 17 columns of tiles, and each tile is 32 by 32 pixels, which gives a canvas of 544 by 416 pixels.
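As a quick check of that arithmetic, the canvas size follows directly from the three numbers. The sketch below assumes GameConfig holds exactly the values quoted in the text; the real object may well carry more fields.

```typescript
// Assumed shape of GameConfig, using only the values quoted in the text.
const GameConfig = {
  VisibleRows: 13,
  VisibleColumns: 17,
  TileSize: 32,
};

// Logical canvas size in pixels, as passed to the Phaser config.
const canvasWidth = GameConfig.VisibleColumns * GameConfig.TileSize; // 17 * 32
const canvasHeight = GameConfig.VisibleRows * GameConfig.TileSize; // 13 * 32

console.log(`${canvasWidth}x${canvasHeight}`); // prints "544x416"
```

Because the scale mode is Phaser.Scale.FIT, this is the logical resolution; the canvas shown on screen is scaled to fit the container while keeping the aspect ratio.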

For bidirectional communication between React and Phaser, we will use Phaser's EventEmitter:

import { Events } from "phaser";

export const EventBus = new Events.EventEmitter();

And embedding our "bridge" looks as follows:

<div className="flex flex-col w-screen h-screen justify-between">
  {`Welcome ${props.username}`}
  {map && <PhaserGame map={map} />}
  <Chat username={props.username} messages={messages} />
</div>

It is worth noting here that, for consistency, the chat now also uses the EventBus, so new chat messages no longer have to be passed up through props. Emission:

const handleSendMessage = (event: any) => {
  if (event.key === "Enter" && currentMessage.trim() !== "") {
    EventBus.emit("send-message", currentMessage.trim()); // emit msg
    setCurrentMessage("");
    event.preventDefault();
  }
};

and reception in the main game component (we register the listener in useEffect, which runs on startup):

EventBus.on("send-message", (message: string) =>
  sendMessage(messageToBinary(message))
);
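One detail the snippet leaves implicit is removing the listener when the component unmounts. Since the EventBus is a module-level singleton, a remount would otherwise register a second handler and every message would be sent twice. The post does not show whether the author does this, so the following is only a sketch: TinyEmitter is a hypothetical stand-in that mimics the on/off/emit methods of Phaser's EventEmitter so the example runs without Phaser, and registerChatListener is an illustrative name.

```typescript
type Listener = (...args: unknown[]) => void;

// Hypothetical stand-in for Phaser's Events.EventEmitter (on/off/emit only).
class TinyEmitter {
  private listeners: Record<string, Listener[]> = {};

  on(event: string, fn: Listener): this {
    const list = this.listeners[event] || [];
    list.push(fn);
    this.listeners[event] = list;
    return this;
  }

  off(event: string, fn: Listener): this {
    this.listeners[event] = (this.listeners[event] || []).filter((l) => l !== fn);
    return this;
  }

  emit(event: string, ...args: unknown[]): boolean {
    const list = this.listeners[event] || [];
    // Iterate over a copy so a listener may unsubscribe while we emit.
    list.slice().forEach((fn) => fn(...args));
    return list.length > 0;
  }
}

// Registers the chat listener and returns a cleanup function, which is the
// shape React expects as the return value of a useEffect callback.
function registerChatListener(
  bus: TinyEmitter,
  send: (message: string) => void
): () => void {
  const handler: Listener = (message) => send(String(message));
  bus.on("send-message", handler);
  return () => {
    bus.off("send-message", handler);
  };
}

const bus = new TinyEmitter();
const sent: string[] = [];
const cleanup = registerChatListener(bus, (m) => sent.push(m));

bus.emit("send-message", "hello"); // delivered to the handler
cleanup(); // what React would call on unmount
bus.emit("send-message", "ignored"); // no listener left
console.log(sent.length); // prints 1
```

With the real EventBus, the same idea would be to keep a reference to the handler and call EventBus.off("send-message", handler) in the function returned from useEffect.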

The remaining game messages will mainly travel in two directions:

from the scene (MainScene) to the main game component (Game) - when an action is performed in the game and needs to be passed to the server
from the main game component to the scene - when a command is received from the server and needs to be executed in the game

The remaining cases are most likely just communication between the main game component and the user interface built in React (e.g., inventory or skill windows).

The last important element of Phaser is the scene, which I have already mentioned several times. It is a class that extends the library's Scene class. The four methods that matter most to us are:

init - called first when the scene starts. It receives the input data we passed to the scene.
preload - called after init and before create. We load resources here, which in our case are the sprites.
create - called once, after preload has finished loading. We prepare our scene: draw its initial state, prepare animations, register event listeners.
update - called on every frame while the scene is running. Here we execute all the logic: read player actions, calculate everything, and display the new state on the screen.

Full modularity on the backend

In the previous post, I discussed clean architecture and split the framework classes and the game logic into two Java packages: websocket and logic. A much better solution, which ensures real modularity and explicitly declared dependencies, is to introduce submodules in Maven, a build and project management tool used primarily for Java. The division looks as follows:

logic - our core, the game logic. It only depends on the lightweight Lombok library and the interface module (below).
logic-interfaces - the simplest module containing structures and interfaces ensuring the connection between the logic module and the framework. No dependencies.
websocket - the framework module depending on Spring WebFlux, Lombok, and the other two modules.

In fact, the logic-interfaces module could be part of logic, but for readability, I separated the classes that make up the logic API into a module of their own.

The main gain from this approach is real code separation: the most important module, logic, cannot use any external code that has not been explicitly declared as its dependency.

Preparing and displaying the map

Let's introduce the first graphical element of the gameplay, the map! It consists of tiles measuring 32x32 pixels. Each tile can represent a different type of terrain: grass, water, sand, pavement, swamp, etc. A tile can also hold items, players, monsters, and game effects such as fire or temporary spell effects. The player is always displayed in the center of the game area, which, as I mentioned earlier, consists of 13 rows and 17 columns of tiles:

Game camera viewport – 13x17 tile grid representing the visible map area centered on the player character during gameplay

Map display algorithm

Let's start by discussing how a piece of the map is displayed in the game (for now, just a piece; later, we will dynamically load areas based on the player's position). Just a reminder, at this stage, there is no account management or database saving, so every time you exit the game, you lose your progress, and each login (simply entering a nickname) gives you a new player.

The server starts, loading into memory an object representing the initial map.
The user goes to the game page, enters a nickname, and clicks the "Login" button.
A WebSocket connection to the server is established, and listeners for incoming messages are registered (at this point, the game window is not yet displayed).
The server accepts the new connection and calls the connectPlayer method on the logic layer.
A player object is created and added to the list of active participants.
The server serializes the current state of the map into bytes and sends it to the client.
The client receives the message to display the map, translates it from bytes to a TypeScript object, launches the game window using the Phaser library, passes the map object, and then draws the world.
The player sees a piece of the map along with their character and can make further moves.

Server-side map handling

At this stage of the project, I modeled the map on the backend as a two-dimensional array of Field objects:

public record Field(Tile tile, Character character) {

    public static Field empty(Tile tile) {
        return new Field(tile, null);
    }

    public FieldView toVisibleObject() {
        return new FieldView(
                tile.toVisibleObject(),
                Optional.ofNullable(character)
                        .map(Character::toVisibleObject)
        );
    }

    public boolean canWalkHere() {
        return tile.id() != Tile.WATER.id();
    }
}

This object represents a single tile in the game. At the moment, it holds the type of surface and, optionally, the character standing on it. I have not yet implemented placing a player on a specific tile, which is evident from the empty factory method and the lack of methods for doing so (I didn't need this at the time, but the foundations are laid). In the future, a Field will also be able to hold monsters, items, and effects. Looking at the code, we can see that Field has a method that checks whether the tile can be entered (currently, it cannot if the surface is water) and a method that converts its state into FieldView, the object representing the field view, which belongs to the logic-interfaces module.

It is also worth looking at the Tile type, which represents the type of surface and also holds the speed factor for moving across it (e.g., walking on sand will be slower than on pavement, but for now every factor is hardcoded to 1):

public record Tile(byte id, BigDecimal speedFactor) {

    public static final Tile GRASS = new Tile((byte) 0x00, BigDecimal.ONE);
    public static final Tile WATER = new Tile((byte) 0x01, BigDecimal.ONE);
    public static final Tile PAVEMENT = new Tile((byte) 0x02, BigDecimal.ONE);
    public static final Tile BLACK_PLATE = new Tile((byte) 0x03, BigDecimal.ONE);
    public static final Tile SAND = new Tile((byte) 0x04, BigDecimal.ONE);

    public byte toVisibleObject() {
        return this.id;
    }
}

At the moment, the speed coefficient is not yet used in any way, so the surface representation is just its identifier (this is enough for the frontend to know what to display). It is worth noting that the identifier is modeled as a single byte, so for now, 256 different types of surfaces can be used - I hope that's enough :).

The initial state of the map is loaded into memory when the server starts. In the future, I plan to introduce a map editor and save it in XML or JSON files. For now, however, I create it as a simple two-dimensional array of Field objects.

Transforming the map to bytes, sending to the client, and drawing

Now let's look at the code responsible for sending the map to the client. In the current algorithm, I send a map fragment that is 3 tiles larger in each direction than what the player sees, i.e., 19 rows by 23 columns. Of course, the fragment has to correspond to the player's current position, so I cut out a piece of the world around them:

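public final MapView toVisibleObject(Position position) {
    // Visible area plus a 3-tile margin on every side.
    final var rows = MapConfig.VISIBLE_ROWS + 6;
    final var cols = MapConfig.VISIBLE_COLUMNS + 6;

    final var visibleMap = new FieldView[rows][cols];
    // x is the horizontal axis (columns), y is the vertical axis (rows),
    // consistent with Position.right() and Position.down().
    int rowStart = position.y() - rows / 2;
    int colStart = position.x() - cols / 2;

    // Note: there is no bounds check here, so the player has to stay
    // far enough from the edge of the matrix.
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            visibleMap[i][j] = matrix[rowStart + i][colStart + j].toVisibleObject();
        }
    }

    return new MapView(visibleMap, (short) colStart, (short) rowStart);
}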

It is worth noting that in addition to the map itself, we also pass the coordinates of the first tile. This will be useful in the future when we transform global coordinates to local ones and vice versa on the client side. Let's also look at the representation of a single tile and the entire map:

public record FieldView(byte tileId, Optional<String> characterName) {
}

public record MapView(FieldView[][] map, short x, short y) {
}

For now, we only present the surface identifier and optionally the nickname of the player occupying the tile, but as I mentioned, this is not yet utilized.
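The coordinates of the first tile carried by MapView reduce the global/local conversion to a subtraction. The post does not show the client code for this, so the sketch below is only an illustration: Point, toLocal, and toGlobal are hypothetical names, and it assumes x is the horizontal axis (columns) and y the vertical one (rows), as Position.right() and Position.down() suggest.

```typescript
interface Point {
  x: number; // column (horizontal)
  y: number; // row (vertical)
}

// Global map coordinates -> indices into the received fragment.
function toLocal(global: Point, fragmentOrigin: Point): Point {
  return { x: global.x - fragmentOrigin.x, y: global.y - fragmentOrigin.y };
}

// Indices into the received fragment -> global map coordinates.
function toGlobal(local: Point, fragmentOrigin: Point): Point {
  return { x: local.x + fragmentOrigin.x, y: local.y + fragmentOrigin.y };
}

// Example: a fragment of 19 rows (13 + 6) by 23 columns (17 + 6) cut around
// a player standing at global (100, 50). Integer division gives 23 / 2 = 11
// and 19 / 2 = 9, so the first tile sits at (100 - 11, 50 - 9) = (89, 41).
const firstTile: Point = { x: 89, y: 41 };
const playerLocal = toLocal({ x: 100, y: 50 }, firstTile);

console.log(playerLocal.x, playerLocal.y); // prints "11 9"
```

The player therefore always lands at column 11, row 9 of the fragment, which is its middle tile.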

The transformation of the map object to bytes looks as follows:

public ByteBuffer toBinary(MapView mapView) {
    return BinaryWriterHelper.toBinary(dos -> {
        dos.writeByte(ClientCommandTypes.SHOW_MAP);
        dos.writeShort(mapView.x());
        dos.writeShort(mapView.y());
        final var map = mapView.map();
        dos.writeShort(map.length);
        dos.writeShort(map[0].length);

        for (FieldView[] row : map) {
            for (FieldView field : row) {
                dos.writeByte(field.tileId());
                byte[] nameBytes = field.characterName()
                        .map(it -> it.getBytes(StandardCharsets.UTF_8))
                        .orElseGet(() -> new byte[0]);
                dos.writeByte(nameBytes.length);
                dos.write(nameBytes);
            }
        }
    });
}

The bytes are sent in the following order:

1 byte for the command
2 bytes for the X coordinate of the first tile
2 bytes for the Y coordinate of the first tile
2 bytes for the number of rows
2 bytes for the number of columns
sequence of tiles: 1 byte for the surface type, 1 byte for the player's nickname length, bytes for the player's nickname
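On the receiving end, this layout can be read back with a DataView. The post does not show the client decoder, so the following is only a sketch under assumptions: decodeMap, the two TypeScript interfaces, and the command value 0x01 in the sample payload are hypothetical. It relies on the fact that Java's DataOutputStream writes multi-byte values in big-endian order, which is also DataView's default.

```typescript
interface FieldView {
  tileId: number;
  characterName: string | null;
}

interface MapView {
  x: number;
  y: number;
  map: FieldView[][];
}

function decodeMap(payload: Uint8Array): MapView {
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  const decoder = new TextDecoder("utf-8");
  let offset = 1; // skip the command byte, assumed to be checked by the caller

  // Coordinates are Java shorts (signed), sizes are read as unsigned.
  const x = view.getInt16(offset); offset += 2;
  const y = view.getInt16(offset); offset += 2;
  const rows = view.getUint16(offset); offset += 2;
  const cols = view.getUint16(offset); offset += 2;

  const map: FieldView[][] = [];
  for (let r = 0; r < rows; r++) {
    const row: FieldView[] = [];
    for (let c = 0; c < cols; c++) {
      const tileId = view.getUint8(offset); offset += 1;
      const nameLength = view.getUint8(offset); offset += 1;
      const characterName =
        nameLength > 0
          ? decoder.decode(payload.subarray(offset, offset + nameLength))
          : null;
      offset += nameLength;
      row.push({ tileId, characterName });
    }
    map.push(row);
  }
  return { x, y, map };
}

// Sample payload: command, x = 89, y = 41, 1 row, 2 columns,
// then a grass tile without a player and a sand tile occupied by "Al".
const samplePayload = new Uint8Array([
  0x01, 0x00, 0x59, 0x00, 0x29, 0x00, 0x01, 0x00, 0x02,
  0x00, 0x00,
  0x04, 0x02, 0x41, 0x6c,
]);
const decoded = decodeMap(samplePayload);
console.log(decoded.x, decoded.y, decoded.map[0][1].characterName); // prints "89 41 Al"
```

Note that the nickname length is a single byte, so this format only works for names of up to 255 UTF-8 bytes.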

On the frontend side, we receive the command to show the map, translate the bytes into a TypeScript object, and in the create method of the scene, we draw the map and display the player in the center. The map drawing code is shown below:

private drawMap() {
    const map = this.map.map;
    const centerX = this.scale.width / 2;
    const centerY = this.scale.height / 2;

    const localPlayerPos = this.getLocalPlayerPos();

    const startX = centerX - localPlayerPos.getLocalX() * GameConfig.TileSize;
    const startY = centerY - localPlayerPos.getLocalY() * GameConfig.TileSize;

    for (let row = 0; row < map.length; row++) {
      for (let col = 0; col < map[row].length; col++) {
        const x = startX + col * GameConfig.TileSize;
        const y = startY + row * GameConfig.TileSize;
        const field = map[row][col];

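        let spriteKey: string;
        switch (field.tileId) {
          case 0x00:
            spriteKey = "grass";
            break;
          case 0x01:
            spriteKey = "water";
            break;
          case 0x02:
            // 0x02 is PAVEMENT on the server; it is drawn as grass here.
            spriteKey = "grass";
            break;
          case 0x03:
            spriteKey = "black_plate";
            break;
          default:
            // Any other id, including 0x04 (SAND), falls back to grass.
            spriteKey = "grass";
        }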

        this.mapSprites.add(
          this.add
            .sprite(x, y, spriteKey)
            .setDisplaySize(GameConfig.TileSize, GameConfig.TileSize)
            .setDepth(1)
        );
      }
    }

    this.player = this.add
      .sprite(centerX - 5, centerY - 13, "player")
      .setDisplaySize(GameConfig.TileSize, GameConfig.TileSize)
      .setDepth(10);
  }

We draw the map in such a way that the player is in the center. All objects except the player are added to the mapSprites container, so that when the player moves, the entire map can be easily shifted except for the player object. I move the player sprite 5 pixels horizontally and 13 pixels vertically to display it in the center of the tile (this depends on the player's current graphics).
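To see why this places the player's tile exactly in the middle, it helps to run the offset arithmetic from drawMap on concrete numbers. The sketch below extracts that arithmetic into a pure function; tileScreenPosition is a hypothetical name, and the local player position (11, 9) assumes the 19 by 23 fragment with x as the horizontal axis.

```typescript
const TILE_SIZE = 32;

interface Vec {
  x: number;
  y: number;
}

// Screen position of the tile at (row, col) of the fragment, given the
// canvas center and the player's local position inside the fragment.
function tileScreenPosition(row: number, col: number, center: Vec, playerLocal: Vec): Vec {
  const startX = center.x - playerLocal.x * TILE_SIZE;
  const startY = center.y - playerLocal.y * TILE_SIZE;
  return { x: startX + col * TILE_SIZE, y: startY + row * TILE_SIZE };
}

const canvasCenter: Vec = { x: 544 / 2, y: 416 / 2 }; // (272, 208)
const localPlayer: Vec = { x: 11, y: 9 };

// The player's own tile (row 9, column 11) lands on the canvas center.
const playerTile = tileScreenPosition(9, 11, canvasCenter, localPlayer);
console.log(playerTile.x, playerTile.y); // prints "272 208"

// The first tile of the fragment is drawn off-screen, inside the 3-tile margin.
const firstTilePos = tileScreenPosition(0, 0, canvasCenter, localPlayer);
console.log(firstTilePos.x, firstTilePos.y); // prints "-80 -80"
```

Since Phaser sprites are positioned by their center by default, a tile centered at x = -80 spans from -96 to -64, which is exactly three tiles to the left of the visible area.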

Character movement

Movement algorithm

The last point for today is to discuss the first version of the movement algorithm. To keep it brief, as the post is already extensive, let's start with the concept:

On the client side, we check if the player is currently moving, and if not, we check if any arrow key is pressed.
If so, we send 1 byte of the movement command to the server (move left, move right, move up, or move down - one of the bytes from 0x01 to 0x04).
We receive the command on the server side and call the appropriate method in the game logic. Let's assume the player wants to move to the tile on the right. We then call the walkRight method on the Game class.
Based on the player's current position and direction of movement, we calculate which tile they want to occupy. We check if the tile can be entered, and if so, we notify the client about the start of the walk and schedule a task to finally occupy the tile after a time determined by the player's speed. If the tile cannot be occupied, we send an appropriate message to the client.
On the client side, after receiving the notification about the start of the walk, we start moving the player's sprite accordingly in the update method and trigger its animation.
On the server side, after a time equal to the player's walking time from tile to tile has passed, we check again if the player can occupy the target tile. If so, we notify the client about the end of the walk and update their position. If not, we notify the client about the need to return to the tile from which the walk started.
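The client half of these steps can be sketched as a small state machine. This is not the author's code: MovementController and its method names are hypothetical, the mapping of directions to the bytes 0x01 to 0x04 is assumed (the text only gives the range), and the intermediate "requested" state is my addition so that update() does not resend the command on every frame while waiting for the server.

```typescript
type Direction = "left" | "right" | "up" | "down";
type MoveState = "idle" | "requested" | "walking";

// Assumed mapping; the post only says the commands are bytes 0x01 to 0x04.
const MOVE_COMMANDS: Record<Direction, number> = {
  left: 0x01,
  right: 0x02,
  up: 0x03,
  down: 0x04,
};

class MovementController {
  private state: MoveState = "idle";
  private readonly send: (payload: Uint8Array) => void;

  constructor(send: (payload: Uint8Array) => void) {
    this.send = send;
  }

  // Called from the scene's update(): sends a 1-byte command only when idle.
  tryMove(direction: Direction | null): boolean {
    if (this.state !== "idle" || direction === null) return false;
    this.send(new Uint8Array([MOVE_COMMANDS[direction]]));
    this.state = "requested"; // assumption: wait for the server's answer
    return true;
  }

  // Server confirmed the walk: the sprite movement and animation start here.
  onStartWalk(): void {
    this.state = "walking";
  }

  // Walk finished, player sent back to the old tile, or the move was rejected.
  onWalkEnded(): void {
    this.state = "idle";
  }

  getState(): MoveState {
    return this.state;
  }
}

const sentPayloads: Uint8Array[] = [];
const controller = new MovementController((p) => sentPayloads.push(p));

controller.tryMove("right"); // sends [0x02]
controller.tryMove("left"); // ignored: still waiting for the server
controller.onStartWalk();
controller.onWalkEnded();
controller.tryMove("up"); // sends [0x03]
console.log(sentPayloads.length); // prints 2
```

The server remains the authority in this design: the client only asks to move and then reacts to the start, finish, or rollback notifications.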

Backend code

Finally, here is the code responsible for movement on the backend. Nothing extraordinary happens on the frontend: we check keyboard input, receive events from the server in the same way as for chat, move the mapSprites container (which holds all objects except the player) in the direction opposite to the walk, and trigger the player's animation. All of this is covered by the basics of the Phaser library. The server controls all the logic, so let's look at the code responsible for walking to the right (the other directions are analogous):

    @Override
    public void walkRight(String username) {
        walk(username, Position::right, WalkDirection.RIGHT);
    }

    private void walk(String username, Function<Position, Position> newPositionFun, WalkDirection walkDirection) {
        players.get(username)
                .ifPresent(player -> {
                    final var destinationPos = newPositionFun.apply(player.position());
                    if (gameMap.canWalk(destinationPos)) {
                        gameTasksScheduler.schedule(new GameTasksScheduler.Task(player.walkTime(), () -> {
                            if (gameMap.canWalk(destinationPos)) {
                                player.takePosition(destinationPos);
                                clientsNotificator.showFinishWalk(username, destinationPos.x(), destinationPos.y());
                            } else {
                                final var oldPos = player.position();
                                clientsNotificator.showTakeOldPosition(username, oldPos.x(), oldPos.y());
                            }
                        }));
                        clientsNotificator.showStartWalk(username, (short) (player.walkTime().toMillis()), walkDirection);
                    } else {
                        clientsNotificator.showMessage("Cannot walk", List.of(username));
                    }
                });
    }

For completeness, here is the Position record used above:

public record Position(short x, short y) {

    public Position left() {
        return new Position((short) (x - 1), y);
    }

    public Position up() {
        return new Position(x, (short) (y - 1));
    }

    public Position right() {
        return new Position((short) (x + 1), y);
    }

    public Position down() {
        return new Position(x, (short) (y + 1));
    }
}

The code does exactly what I described above, so there is no need to repeat myself. I will only mention that I later improved the walking algorithm, because one drawback of the version presented here is that when the player holds down an arrow key, the character does not walk smoothly but stops for a split second on each tile. I will describe this in future posts.

The post turned out to be much longer than I intended, but the discussed topics logically came together into a cohesive whole.

Best regards and see you next time :)