Understanding protocols by designing one

This is an article I originally added to our tutor group wiki to help one mature student understand network protocols. This guide does not include any programming.


A protocol is an agreed way of communicating, but it is less like a translation service and more like a language. I think the best way to learn this may be to imagine that you have to invent a protocol.

Imagine you’re making a social game where players can move left or right on a 2D landscape, and also chat to each other.

As we know, there is (usually) a client part and a server part to networking. In computer networking, protocols are defined in terms of packets: their IDs and their payloads. When developing the protocol specification, you decide that the following data need to be transmitted:

  • Move
    • Left
    • Right
  • Chat
  • Login
  • Logout

So you want to define a language that transmits this data efficiently. Notice that all of these basic activities require some parameters:

  • Login: Username? Password? Spawn location?
  • Logout: Quit/error?
  • Move: Left/right?
  • Chat: Message?

Now, when you want to do one of these things, the client need not transmit the name of the action to the server, because it would be overkill. There are only a limited number of options for what it is possible to do.

Transmitting “Move” or “Logout” as text requires n bytes, where n is the string length, and the recipient cannot predict that length (“Move” is 4 letters, “Logout” is 6) unless you “length-prefix” the string (send a byte straight before it to tell the server the length).
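Length-prefixing can be sketched in a few lines of Python. This is only an illustration, assuming a one-byte prefix and ASCII text; the helper names encode_lps and decode_lps are made up for this example.

```python
def encode_lps(text: str) -> bytes:
    """Encode text as one length byte followed by the text's bytes."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("a one-byte prefix can only describe up to 255 bytes")
    return bytes([len(data)]) + data


def decode_lps(buffer: bytes, offset: int = 0) -> tuple[str, int]:
    """Read one length-prefixed string; return it and the offset just past it."""
    length = buffer[offset]
    start = offset + 1
    end = start + length
    if end > len(buffer):
        raise ValueError("buffer ends before the string does")
    return buffer[start:end].decode("ascii"), end


# Two strings back to back: the prefixes tell the reader where each one stops.
wire = encode_lps("Move") + encode_lps("Logout")
first, next_offset = decode_lps(wire)
second, _ = decode_lps(wire, next_offset)
print(wire)           # b'\x04Move\x06Logout'
print(first, second)  # Move Logout
```

Without the prefixes, the recipient would just see "MoveLogout" and have no way of knowing where one word ends and the next begins.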

So, we create a protocol for shorthand sending and receiving of these messages. Let’s look at an example client protocol:

PacketID | Meaning | Parameters
00       | Login   | Username (length-prefixed string); Password (length-prefixed string); Spawn location (byte)
255*     | Logout  | Quitting? (boolean)
01       | Chat    | Message (length-prefixed string)
02       | Move    | Left? (boolean)**

*It’s convention for 255 (hexadecimal ‘ff’) to be logout – this is the highest value a byte can hold.

**This is the smallest way of transmitting this data. Your direction can be one of two values, so why transmit it as anything larger than 0/1? In this example, 1 = left, 0 = right.

Example client data transmission:

(N.b. I’ve put the numbers in binary to give an accurate representation of space savings and wasted length; an empty byte is 00000000 – spaces are added here for readability and would not be transmitted)

00000000 00001000SteGriff00001000password00000001
00000001 00001000Hi guys!
00000010 1
11111111 0

Above: A player logs in, says “Hi guys!” to the other players, moves left one step, and then his game client crashes without him quitting properly via the ingame menu.
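The same session can be built as real bytes. The sketch below assumes the client packet IDs from the table above, ASCII text, and that each boolean takes up a whole byte (a socket sends whole bytes, so a lone bit still costs one). The function names are invented for illustration. Note that “SteGriff” is eight characters long, so its length byte is 8.

```python
LOGIN, CHAT, MOVE, LOGOUT = 0x00, 0x01, 0x02, 0xFF  # IDs from the client table


def lps(text: str) -> bytes:
    """Length-prefixed string: one length byte, then the ASCII text."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("string too long for a one-byte length prefix")
    return bytes([len(data)]) + data


def login(username: str, password: str, spawn: int) -> bytes:
    return bytes([LOGIN]) + lps(username) + lps(password) + bytes([spawn])


def chat(message: str) -> bytes:
    return bytes([CHAT]) + lps(message)


def move(left: bool) -> bytes:
    # 1 = left, 0 = right, sent as a full byte (assumption, see above)
    return bytes([MOVE, 1 if left else 0])


def logout(quitting: bool) -> bytes:
    # 0 means the client went away without quitting properly
    return bytes([LOGOUT, 1 if quitting else 0])


session = [
    login("SteGriff", "password", 1),
    chat("Hi guys!"),
    move(left=True),
    logout(quitting=False),
]
for packet in session:
    print(packet.hex(" "))  # one packet per line, as hexadecimal bytes
```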

The word “protocol” refers to the fact that the server is developed to “understand” all of these signals; they are a pre-agreed format of communication.

How does one side know that a packet has finished?

Because its contents and their length are regulated. This is one of the reasons for a protocol. When the server gets an 01 it says, “Look, it’s a chat packet! That means now I know what to expect: a number, then I read that number of bytes for the guy’s message. Then I’m done with this chat packet, and I can wait for the next packet from my other clients :)”
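Here is a rough sketch of that reasoning on the server side, assuming the client packet IDs from the earlier table and one byte per boolean. An in-memory io.BytesIO stands in for the network connection; with a real socket, read_exact would have to keep calling recv until it had collected enough bytes. All function names are made up for the example.

```python
import io

LOGIN, CHAT, MOVE, LOGOUT = 0x00, 0x01, 0x02, 0xFF  # IDs from the client table


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes or fail; a packet must never be half-read."""
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("stream ended in the middle of a packet")
    return data


def read_lps(stream) -> str:
    """Read a length byte, then that many bytes of ASCII text."""
    length = read_exact(stream, 1)[0]
    return read_exact(stream, length).decode("ascii")


def read_packet(stream):
    """The ID byte tells us what to expect next, so we know where the packet ends."""
    packet_id = read_exact(stream, 1)[0]
    if packet_id == LOGIN:
        username = read_lps(stream)
        password = read_lps(stream)
        spawn = read_exact(stream, 1)[0]
        return ("login", username, password, spawn)
    if packet_id == CHAT:
        return ("chat", read_lps(stream))
    if packet_id == MOVE:
        return ("move", "left" if read_exact(stream, 1)[0] == 1 else "right")
    if packet_id == LOGOUT:
        return ("logout", read_exact(stream, 1)[0] == 1)
    raise ValueError(f"unknown packet ID {packet_id}")


# Three packets arrive glued together, with nothing separating them.
incoming = io.BytesIO(b"\x01\x08Hi guys!\x02\x01\xff\x00")
packets = [read_packet(incoming) for _ in range(3)]
print(packets)  # [('chat', 'Hi guys!'), ('move', 'left'), ('logout', False)]
```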

Let’s look at the server

One problem might be apparent, and that is that if the server just re-transmits all these “move” signals, every client is going to be VERY BUSY trying to keep track of where things are on the screen, and if a move packet is lost, players will appear in different places on different screens and none of them may be accurate.

Therefore, the server has a different transmission protocol, which the clients are programmed to process. Example:

PacketID | Meaning        | Parameters
00       | Login response | Accepted? (boolean)
255      | Kick           | Reason (length-prefixed string)
01       | Chat           | User (length-prefixed string); Message (length-prefixed string)
02       | Location       | User (length-prefixed string); Location (byte)

When a user attempts to log in, the server validates their password, and throws back 1 if they may proceed, or if they may not, a 0 and a severed connection.

The counterpart to quit is kick. The server owner might decide that a player is being inappropriate and send them a packet like:

11111111 00010100No swearing, please.

This is purely informational: the client doesn’t have to accept its fate and close itself down, because the server subsequently drops the connection to that client.
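The login response and the kick could be produced something like this. It is a sketch under assumptions: the server packet IDs come from the table above, booleans take one byte, and the accounts dictionary and check_login helper are hypothetical (a real server would not keep passwords in plain text).

```python
LOGIN_RESPONSE, KICK = 0x00, 0xFF  # IDs from the server table


def lps(text: str) -> bytes:
    """Length-prefixed string: one length byte, then the ASCII text."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("string too long for a one-byte length prefix")
    return bytes([len(data)]) + data


def login_response(accepted: bool) -> bytes:
    return bytes([LOGIN_RESPONSE, 1 if accepted else 0])


def kick(reason: str) -> bytes:
    return bytes([KICK]) + lps(reason)


def check_login(accounts: dict, username: str, password: str):
    """Return the packet to send and whether to keep the connection open."""
    accepted = accounts.get(username) == password
    return login_response(accepted), accepted


accounts = {"SteGriff": "password"}  # hypothetical account store
good_packet, good_keep = check_login(accounts, "SteGriff", "password")
bad_packet, bad_keep = check_login(accounts, "SteGriff", "letmein")
kick_packet = kick("No swearing, please.")

print(good_packet, good_keep)  # b'\x00\x01' True
print(bad_packet, bad_keep)    # b'\x00\x00' False
print(kick_packet[1])          # 20, the length byte for the reason text
```

When keep comes back False, or after sending a kick, the server would close the connection itself rather than trusting the client to leave.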

When the server receives a chat packet, it sends a packet to every player with the speaker’s name* and a message.

00000001 00001000SteGriff00001000Hi guys!

*This is a bad way of doing it. Ideally, when a new user logs in, the server should tell everyone to add a player to their register, with a UserID associated with a PlayerName. You’d reserve a packet ID for this. This means that instead of broadcasting the name “SteGriff” every time, you can just send binary 244 (assuming I’m the 244th player). This cuts each repeated 8-byte name (9 bytes with its length prefix) down to just 1 byte. Then every client says “oh yeah, player 244 is SteGriff; he must be talking.”
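A sketch of the register idea follows. The packet ID 03 for “add player” is an assumption (the tables above don't reserve one), as is the layout of the ID-based chat packet; ClientRegister and the function names are invented for illustration.

```python
ADD_PLAYER, CHAT = 0x03, 0x01  # 0x03 is a hypothetical reserved ID


def lps(text: str) -> bytes:
    """Length-prefixed string: one length byte, then the ASCII text."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("string too long for a one-byte length prefix")
    return bytes([len(data)]) + data


def add_player(user_id: int, name: str) -> bytes:
    """Sent once per login: 'player user_id is called name'."""
    return bytes([ADD_PLAYER, user_id]) + lps(name)


def chat_by_id(user_id: int, message: str) -> bytes:
    """Chat packet that carries a one-byte user ID instead of the name."""
    return bytes([CHAT, user_id]) + lps(message)


def chat_by_name(name: str, message: str) -> bytes:
    """The original chat packet, with the full name, for comparison."""
    return bytes([CHAT]) + lps(name) + lps(message)


class ClientRegister:
    """Each client's lookup table from user ID to player name."""

    def __init__(self):
        self.names = {}

    def handle(self, packet: bytes):
        # Both packets share a layout: ID, user ID, length byte, text.
        packet_id, user_id = packet[0], packet[1]
        text = packet[3:3 + packet[2]].decode("ascii")
        if packet_id == ADD_PLAYER:
            self.names[user_id] = text
            return None
        if packet_id == CHAT:
            return f"{self.names[user_id]}: {text}"
        raise ValueError(f"unknown packet ID {packet_id}")


register = ClientRegister()
register.handle(add_player(244, "SteGriff"))
line = register.handle(chat_by_id(244, "Hi guys!"))
print(line)  # SteGriff: Hi guys!

# Size of one chat broadcast with the name versus with the ID.
sizes = (len(chat_by_name("SteGriff", "Hi guys!")), len(chat_by_id(244, "Hi guys!")))
print(sizes)  # (19, 11)
```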

The solution to our move problem

So what about the player ‘ghosts’ in higgledy-piggledy places?

The server shouldn’t pass on the change in location, but rather the player’s actual, absolute location in the game world. So, if user ‘Tommy’ moved to space 131 in our game world (the last thing he sent was a packet meaning “move right”), the server updates the location it has on record for him, and tells everyone:

00000010 00000101Tommy10000011

…“Tommy is now at location 131”. Every client receives this, and draws him in that position. Because they were updated on his previous positional moves as well, hopefully they just see him making smooth progress. However, if they lost a packet, he will seemingly ‘jump’ across the screen, but at least he’s in the right place!
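To see why absolute locations survive a lost packet, here is a small sketch. It assumes that moving right adds one to the location and moving left subtracts one, and that the world runs from 0 to 255 because a location is a single byte; the Server class and apply_location are illustrative names only.

```python
LOCATION = 0x02  # ID from the server table


def lps(text: str) -> bytes:
    """Length-prefixed string: one length byte, then the ASCII text."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("string too long for a one-byte length prefix")
    return bytes([len(data)]) + data


def location_packet(user: str, location: int) -> bytes:
    return bytes([LOCATION]) + lps(user) + bytes([location])


class Server:
    """Keeps the one true record of where everybody is."""

    def __init__(self):
        self.locations = {}

    def handle_move(self, user: str, left: bool) -> bytes:
        step = -1 if left else 1  # assumption: right means +1
        new_location = self.locations[user] + step
        self.locations[user] = max(0, min(255, new_location))  # stay in one byte
        return location_packet(user, self.locations[user])


def apply_location(view: dict, packet: bytes) -> None:
    """Client side: overwrite whatever we believed with the server's answer."""
    name_length = packet[1]
    name = packet[2:2 + name_length].decode("ascii")
    view[name] = packet[2 + name_length]


server = Server()
server.locations["Tommy"] = 129
first = server.handle_move("Tommy", left=False)   # Tommy is now at 130
second = server.handle_move("Tommy", left=False)  # Tommy is now at 131

view = {"Tommy": 129}         # what one client currently draws
apply_location(view, second)  # pretend the first packet never arrived
print(view)  # {'Tommy': 131}
```

Had the server forwarded “move right” twice instead, this client would have applied only one of them and drawn Tommy at 130 forever.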

Wrapping up

Other protocols for more serious applications work very similarly. HTTP has a series of status codes like 200 (OK), 403 (Forbidden: you’re not authorised to look at this), and 501 (Not Implemented: it’s not you, it’s me). You might like to check out http://httpstatus.es/.

If you’ve started working with VB or C#, check out the stream classes built into the .Net framework; they make this easy. For example, BinaryWriter.Write(), when given a string, automatically prefixes it with its length before writing it to the stream.
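One detail worth knowing: as far as I understand the documented format, the length prefix that BinaryWriter.Write() puts before a string is a “7-bit encoded integer”, so it is one byte for short strings but grows for longer ones, and the string itself is written as UTF-8 by default. The Python below is a rough imitation of that format for illustration, not the .Net code itself; the function names are my own.

```python
def write_7bit_encoded_int(value: int) -> bytes:
    """Seven bits of the number per byte; the top bit means 'more bytes follow'."""
    if value < 0:
        raise ValueError("length cannot be negative")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def binary_writer_string(text: str) -> bytes:
    """Imitate the prefix-then-text layout, assuming the default UTF-8 encoding."""
    data = text.encode("utf-8")
    return write_7bit_encoded_int(len(data)) + data


short = binary_writer_string("Hi guys!")
long_prefix = write_7bit_encoded_int(300)
print(short)        # b'\x08Hi guys!'
print(long_prefix)  # b'\xac\x02'
```

For anything under 128 bytes this comes out the same as the one-byte prefix used throughout this article.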

Conclusion and sources

Hope this was useful and maybe even exciting. If you want to go ahead and write your own networking program now then I’ve done more than enough, but at the very least I hope it answered some of your questions.

I’m currently a professional web and software developer, but most of this information and the way it was presented was based on my experience with the Minecraft Classic protocol, which you may be interested to peruse.

New Programming Frontiers

I have recently decided to stop feeling guilty about programming in Visual Basic .Net. Not only is it the language in which I feel most comfortable, I can assert that the coding practices which I maintain in it are sound and viable. Contrastingly, as much as I try to structure C-style code nicely, I tend to end up making some messy judgements. I also enjoy the ease of working in Visual Studio, and know that porting to C# would be a breeze; while VB is not popular in industry, .Net certainly is, and in that, I find solace.

I decided that I would learn network programming in VB. There is a (reasonably) nice .Net library called System.Net.Sockets, which — I have been told — is a fairly low-level wrapper for the old Winsock API. So far, I have gotten along with it, and wrap most things in my own functions anyway.

What have I made, though?

In short, a bot for the five-day-old Multiplayer Survival mode of Minecraft, of course! With a little help from some patient friends, I have puzzled out how to successfully send and receive packets to and from a Minecraft server, and developed a bot which can, so far:

  • Log in
  • Receive player list
  • Receive chat
  • Send chat

… a result with which I am very pleased, for my first foray into that area of speciality.

Today’s work will involve trying to move the bot around the server, and perhaps even recognising block placements.

It’s all very exciting.