
Network Compression And Congestion Control


maciejs

        Networking code is one of the core systems of every multiplayer game. Unfortunately, it’s also one of the trickiest. A popular title means thousands of players online, playing with different configurations and different network conditions, often interacting with people on other continents. Stable and robust netcode is the Holy Grail of every network programmer.

        The core of Warframe’s networking code is fairly mature: it goes back to the Dark Sector days and we have shipped a bunch of games with it. I did a major overhaul before shipping The Darkness 2 (it had to conform to X360/PS3 technical requirements) and it felt like it was in good shape before we started Warframe. A few years passed, and even though we made a few major changes in the meantime (specifically for Warframe), it became increasingly obvious that what was OK in 2012 is no longer working in 2014. The other problem is that Warframe is simply a much bigger game than The Darkness 2 (D2). Just to give you a simple example: the average level of Warframe contains at least 10 times as many replicated objects as the average D2 multiplayer map. As mentioned, we made a few changes that improved the situation, but it was clear that sooner or later we’d have to try something more radical. I’d like to briefly describe what I’ve been working on for the last few weeks. The majority of my effort was concentrated in two areas: network compression and congestion control.

 

Network compression
        Just to give you a better idea of what we’re up against: a server sends updates over the network at some frequency, let’s assume 30 times per second. Let’s also assume we’ve spawned 30 NPC characters (not an unusual number for some missions in Warframe). Even if we only use 20 bytes per character, that’s 30 * 20 * 30 = 18000 bytes per second for NPCs alone! Add players (they’re usually more expensive), multiply it by 3 (3 clients, so 54000 bytes per second for the NPCs alone) and that’s a 512kbps (=64 KB/s) server connection gone. And 20 bytes per character is nothing to write home about: it’s barely enough for full-precision position and rotation data, so where’s the space for health, animation and damage information? What do games do to get around that problem? For one thing, we cheat. We don’t send information if we think we can get away with it (for example, when an NPC is not visible), we don’t use full precision for everything, we split properties into several groups (critical/important/normal/cosmetic) and so on. Every developer has their own bag of tricks. On top of that, network data is usually compressed. Improving this compression has produced significant results, but first let’s go over some network compression basics and how they relate to Warframe.
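To make the arithmetic above concrete, here is a small Python sketch of the same back-of-the-envelope estimate. It is not game code; the constant and function names are made up for illustration, and the numbers are the ones assumed in the paragraph above.

```python
# Back-of-the-envelope bandwidth estimate with the numbers assumed above.
UPDATES_PER_SECOND = 30   # how often the server sends updates
NPC_COUNT = 30            # replicated NPCs in the mission
BYTES_PER_NPC = 20        # payload per NPC per update
CLIENT_COUNT = 3          # the host sends a copy of the stream to each client


def npc_bytes_per_second(npc_count: int, bytes_per_npc: int, rate: int) -> int:
    """Bytes per second needed to update every NPC for ONE client."""
    return npc_count * bytes_per_npc * rate


per_client = npc_bytes_per_second(NPC_COUNT, BYTES_PER_NPC, UPDATES_PER_SECOND)
all_clients = per_client * CLIENT_COUNT
link_capacity = 512_000 // 8  # a 512 kbps uplink expressed in bytes per second

print(per_client)                    # 18000
print(all_clients)                   # 54000
print(link_capacity)                 # 64000
print(all_clients / link_capacity)   # 0.84375
```

So NPCs alone would already eat about 84% of the uplink before any player data is added, which is why sending less and compressing better both matter.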

        Network compression is an interesting problem in itself. Game packets tend to be fairly small (a few hundred bytes), so algorithms designed to work with megabytes of data don’t do that well on network traffic. We can’t afford to waste more than a few bytes on the header, for example. For a long time Warframe has been using LZF compression for all network traffic. It’s blazingly fast, easily available on all platforms we care about (I’m not only talking about different hardware, but also services like our Relay server) and actually gives a decent compression ratio. Efficiency varies between mission types, but in our most demanding scenarios (Survival) we hover around 1.25:1 (80%). In other words, the compressed buffer is roughly 80% of the size of the uncompressed one. Given all the limitations I mentioned, it seemed like a decent score, but we thought it could be improved.
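A quick way to see why small packets are awkward, and how the "1.25:1 (80%)" notation works, is the Python sketch below. It uses zlib from the standard library purely as a stand-in (LZF is not in the standard library), and the helper names are invented for this example.

```python
import zlib


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Ratio in the 'N:1' sense used above: higher means better compression."""
    return original_size / compressed_size


def percent_of_original(ratio: float) -> float:
    """Convert an 'N:1' ratio into 'compressed size as % of the original'."""
    return 100.0 / ratio


# A 16-byte payload with no repeated bytes: nothing for the compressor to
# exploit, so the format's own header and checksum dominate.
tiny_packet = bytes(range(16))
packed = zlib.compress(tiny_packet)

print(len(tiny_packet), len(packed))        # the "compressed" packet is larger
print(percent_of_original(1.25))            # 80.0
print(round(percent_of_original(1.8), 1))   # 55.6
```

The general-purpose format spends several bytes on framing alone, which is exactly the kind of overhead a packet of a few hundred bytes cannot afford.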

        I briefly experimented with other compression algorithms, but it was hard to beat LZF significantly. Fortunately, I found out that the RAD Game Tools guys had just released their network and data compression library. You might not be familiar with the RAD name, but I’m 100% sure you’ve played a game that included their technology. They’re the people behind the Bink movie codec and the Miles Sound System (and some other notable middleware). The RAD guys are basically a group of code geniuses: they usually have only a few programmers working on a single project, but these programmers are amongst the leading experts in their domain. We asked for an evaluation, did some quick tests and the results were very promising. The way their compression algorithm works, it first needs to learn about your data using real-world samples taken from the game. I’ve collected over 100MB of Warframe network traffic to create a final dictionary. Remember the 1.25:1 (80%) I mentioned? With our new compression system we were able to reach 1.8:1 (55%) in the same scenario. So those long co-op Survival missions you play used to work with data sent at 80% of its original size, and now we’ve compressed that further to 55% of its original size. Gains vary between different mission types, but I tried to set it up so that they’re the biggest in the most demanding situations (after all, compression is less important if we’re not sending too much data anyway). Having said that, the new compression system performs much better than the old one in every scenario I’ve tested. The lowest I’ve seen it go was around 1.4:1.
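The "learn from real samples first" idea can be illustrated with zlib's preset-dictionary support in Python. This is only an analogy and not RAD's algorithm: the packet layout, the sample packets and the helper names below are all invented for the example, and a real dictionary would be trained on far more data than three samples.

```python
import zlib

# Hypothetical captured packets used as "training" data (invented format).
samples = [
    b"npc=0001;anim=run_forward;hp=100;target=player_1;",
    b"npc=0002;anim=idle;hp=100;target=none;",
    b"npc=0003;anim=attack_melee;hp=064;target=player_2;",
]
# Simplest possible "training": concatenate the samples into one dictionary.
dictionary = b"".join(samples)


def compress_with_dictionary(packet: bytes, zdict: bytes) -> bytes:
    """Compress one packet, letting it reference byte runs from the dictionary."""
    compressor = zlib.compressobj(level=9, zdict=zdict)
    return compressor.compress(packet) + compressor.flush()


def decompress_with_dictionary(blob: bytes, zdict: bytes) -> bytes:
    """The receiver must hold the very same dictionary to decode."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


# A new packet that was never seen, but looks like the samples.
packet = b"npc=0042;anim=attack_melee;hp=087;target=player_1;"
plain = zlib.compress(packet, 9)
with_dict = compress_with_dictionary(packet, dictionary)

print(len(packet), len(plain), len(with_dict))
```

Because the dictionary ships with both sender and receiver, even a packet's very first bytes can be encoded as references to previously seen traffic, which is where most of the gain on small packets comes from.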

TL;DR version: Warframe uses a new network compression system, sends significantly less data.

 

Congestion control
        As you can see, we try hard to make sure we’re sending as little data as possible. There’s a very good reason for this: we don’t want to hit your network bandwidth limit. As you might know, there are 2 major network protocols, TCP and UDP. Without going into gory details, TCP is much more sophisticated and does many things behind your back; you can think of it as a reliable, safe family van. In contrast, UDP is bare bones: it lets you send and receive data and has some basic checksum support, but that’s it. Think of it as a Formula 1 car. Most shooter games, including Warframe, communicate using UDP. If used correctly, it’s more efficient, but it requires taking care of reliability, ordering and, last but not least, congestion. If we tell it to send too much data, UDP will try to do it, but sooner or later your connection will start lagging and/or dropping packets. TCP has a built-in congestion control mechanism; with UDP you’re on your own. The basic idea is to ‘back off’ when we detect our connection quality deteriorating, then maybe increase the amount of transmitted data when we think it’s safe. It sounds simple on paper, but it’s a very tricky problem (let’s just say Linux comes with at least 10 different congestion control algorithms, each with its own flaws and benefits).
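The "back off, then carefully increase" idea is easiest to see in the textbook additive-increase/multiplicative-decrease (AIMD) scheme. The Python sketch below is that generic scheme, not Warframe's algorithm; the class name, parameters and default values are assumptions picked for illustration.

```python
class AimdRateController:
    """Textbook AIMD: halve the send rate on trouble, creep up when healthy."""

    def __init__(self, initial_rate: float, min_rate: float, max_rate: float,
                 increase_step: float = 1024.0, backoff_factor: float = 0.5):
        self.rate = initial_rate            # current budget in bytes/second
        self.min_rate = min_rate            # never starve the connection
        self.max_rate = max_rate            # never exceed the configured cap
        self.increase_step = increase_step  # additive increase per good report
        self.backoff_factor = backoff_factor  # multiplicative decrease on loss

    def on_feedback(self, packet_loss_detected: bool) -> float:
        """Update the budget from one feedback report and return the new rate."""
        if packet_loss_detected:
            # Connection quality is deteriorating: back off sharply.
            self.rate = max(self.min_rate, self.rate * self.backoff_factor)
        else:
            # Things look fine: probe for a little more bandwidth.
            self.rate = min(self.max_rate, self.rate + self.increase_step)
        return self.rate


controller = AimdRateController(initial_rate=20000.0, min_rate=2000.0,
                                max_rate=64000.0)
print(controller.on_feedback(packet_loss_detected=True))    # 10000.0
print(controller.on_feedback(packet_loss_detected=False))   # 11024.0
```

A scheme like this probes slowly and reacts late, which is tolerable for a file download but is one reason generic approaches fit bursty game traffic poorly.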

        Warframe uses a slightly modified version of The Darkness 2 algorithm and while it gets the job done, it’s not very smart (hey, I coded it, so I’m allowed to say that). Over the past few weeks I experimented with many different approaches, mostly to find out they’re not doing that great in our situation. The problem, again, boils down to the fact that games are a very different beast from most applications. “Mainstream” algorithms have to deal with a very wide variety of situations; we can do better if we only care about our limited case. Congestion control is all about finding our network limits and trying to send as much data as possible, just not too much. Typical congestion control algorithms do OK when we have to download a file, for example: traffic is fairly steady and if we don’t have enough bandwidth, it’ll simply take a few seconds more. We don’t have this luxury with games. If damage information comes 10 seconds too late, it’ll break the immersion. Game traffic comes in bursts, too. It’s rarely steady, so we can’t take too much time to learn our limits; we have to react immediately. The time spent on experiments wasn’t wasted, though. By learning the flaws and advantages of different approaches I was able to come up with a little Frankenstein of an algorithm that seems to give very decent results. It was still just a first step: I had to modify it further so that it works correctly with more than 1 client (we want clients to divide bandwidth fairly, we don’t want one of them to take 90%, for example), but it went fairly fast from there. Just in case you’re interested, here is a (not-so) pretty graph showing the bandwidth given to 2 clients (it’s a little bit crude, but it was meant for internal purposes only). I have artificially limited host bandwidth to 30KB/s, so in a perfect world, each client should get roughly 15KB/s. As you can see, it actually gets pretty close (horizontal lines).

 

[Image: G0IiOwA.gif, graph of the bandwidth given to each of the 2 clients over time]

Client 2 disconnects at some point (see the red line going down) and Client 1 quickly realizes he can now use the whole available bandwidth. He actually overshoots at first, but then discovers his limits and hovers around 29.5KB/s. It won’t always look this nice, but this is honestly the first run I captured.
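The target the graph is converging on (15KB/s each for two clients, then nearly the full 30KB/s once one leaves) can be written down as a max-min fair split. The Python sketch below shows only that target, not the algorithm that finds it at runtime; the function name and the per-client limit parameter are assumptions made for this example.

```python
import math


def fair_shares(host_budget: float, client_limits: dict) -> dict:
    """Max-min fair split of host_budget (KB/s) between clients.

    client_limits maps a client name to the most that client can take
    (math.inf if unknown). Budget a constrained client cannot use is
    handed to the remaining clients.
    """
    shares = {}
    remaining_budget = host_budget
    # Serve the most constrained clients first.
    pending = sorted(client_limits.items(), key=lambda item: item[1])
    while pending:
        equal_split = remaining_budget / len(pending)
        name, limit = pending.pop(0)
        shares[name] = min(limit, equal_split)
        remaining_budget -= shares[name]
    return shares


# Two healthy clients on a host limited to 30 KB/s.
print(fair_shares(30.0, {"client1": math.inf, "client2": math.inf}))
# Client 2 disconnects: client 1 may use the whole budget.
print(fair_shares(30.0, {"client1": math.inf}))
```

In a real game nobody hands the host these limits up front, so they have to be discovered from observed connection quality, which is why the measured lines overshoot and wobble before settling.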

        I realize what’s on your mind right now is: “OK, nice graph, but why should I care, will it play better?”. I’m actually both excited and nervous about the next update. Yes, the compression and congestion changes are coming to PC this week. Console players will be receiving this down the road. Compression is easy to quantify and I can guarantee it works as expected. Congestion control is much less predictable, so it’s harder to measure the impact. I don’t expect it to be perfect, nothing ever is, but I do expect it to be significantly better than what we have right now. It doesn’t affect lag directly, but better congestion control means lower latency as well. We can’t make the data go faster (unless we break the speed of light barrier, working on that next), but we do try hard to make sure it’s not us slowing it down.
 


This is a fascinating read, agreed! If that compression is as impressive as it sounds... this is really a major advance! Networking is really hard to do right, and you never hear about how everything is fine. You really just hear when it's bad.

 

Here's hoping that this change goes in smooth and calmly. No hype train, just well-wishes on this. This is the kind of stuff that determines the long-term health of Warframe - making sure its underpinnings are fast, efficient, and compatible.

 

So much research and work.

 

Thank you for enlightening us and your constant efforts! It's really appreciated. Can't wait to test the results with the next update.


Interesting information there; thanks for the update.

 

So, what I'm getting is that while connections aren't going to get faster (that depends on the host and clients), they're likely to be much more stable and reliable over a long period of time, including managing information so there are fewer instances of characters suddenly slingshotting ahead or canisters taking a moment longer to explode than they should?

 

If anything, the reduced bandwidth use and optimisation for clients sounds great!

Edited by Wiegraf

This will benefit all of us and it will greatly help the long-term health of the game for sure. However, please take as much time as needed to roll this out as smoothly as possible, don't rush it. Thanks for the heads up!

Edited by LunskEE

Great!

 

It's always nice to get this kind of nerd porn, those peeks under the hood are very satisfying.

 

What about cheating more?

 

For example the inconsequential stuff, like when you open a coffin, break a crate or kill a character. 

 

When the network is congested, such as a slow internet host, the drops teleport large distances before they hit the ground. 

 

If you had something like 256 predetermined trajectories and staggered object spawns: 

 

- cuts out the per-packet position updates for the object until it hits the ground. Only a single message (object spawned at simulation frame 1/30 on trajectory 64) needs to be sent, and every client simulates the drop by itself.

 

- staggered object spawning will avoid overcrowding the 30 messages/second, one "credit drop" spawn per message would be sent. 
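A rough Python sketch of the predetermined-trajectory idea in the two points above. Nothing here reflects how Warframe actually replicates drops: the table size comes from the suggestion, while the seed, the physics constants, the flat floor and all names are assumptions made for illustration.

```python
import math
import random

TRAJECTORY_COUNT = 256        # table size from the suggestion above
TICK_SECONDS = 1.0 / 30.0     # one simulation frame
GRAVITY = 9.8                 # assumed downward acceleration


def build_trajectory_table(seed: int = 1234) -> list:
    """Every machine builds the same table of launch velocities from one seed."""
    rng = random.Random(seed)
    table = []
    for _ in range(TRAJECTORY_COUNT):
        heading = rng.uniform(0.0, 2.0 * math.pi)
        speed = rng.uniform(1.0, 3.0)
        lift = rng.uniform(2.0, 4.0)
        table.append((speed * math.cos(heading), lift, speed * math.sin(heading)))
    return table


def drop_position(table, origin, trajectory_index, spawn_frame, current_frame,
                  floor_y=0.0):
    """Where the drop is at current_frame, computed locally with no network data.

    Assumes a flat floor at floor_y at or below the spawn height.
    """
    vx, vy, vz = table[trajectory_index]
    elapsed = max(0, current_frame - spawn_frame) * TICK_SECONDS
    # Time at which the ballistic arc reaches the floor; the drop rests after it.
    height = origin[1] - floor_y
    landing_time = (vy + math.sqrt(vy * vy + 2.0 * GRAVITY * height)) / GRAVITY
    t = min(elapsed, landing_time)
    x = origin[0] + vx * t
    y = max(floor_y, origin[1] + vy * t - 0.5 * GRAVITY * t * t)
    z = origin[2] + vz * t
    return (x, y, z)


# The only data on the wire: (spawn_frame, trajectory_index, origin).
spawn_message = (120, 64, (10.0, 1.5, -4.0))
spawn_frame, trajectory_index, origin = spawn_message
table = build_trajectory_table()
print(drop_position(table, origin, trajectory_index, spawn_frame, 135))
```

Host and clients only agree as long as they share the table and the frame counter, so this trades bandwidth for a requirement that the simulation stays in step.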

 

Can you enlighten us on what is already there in this regard, and what can/is being planned to be done? 

Edited by BrazilianJoe

This is amazing and updates and explanation etc are a big part of the reason I love warframe. It's great that you not only improve the game but also the background workings and are willing and able to tell us about them. Blown away as always by your work ethic and communication DE. Thanks.


It's always interesting to take a peek at the perspective of a Network/Routing Programmer. It's great to hear that you're having successes with RAD's new technology, and it'll be exciting to see how this will affect players' connectivity and experience now and in the future.

EDIT: If you're in charge of networking, are you in charge of IRC management as well? It would be nice if you could look into adding a silent/invisible deghosting command in the circumstance the IRC connection needs to be reconnected (sudden internet drop out, etc etc). This would fix a lot of the problems players would have with the _1 suffix, and enable a more stable and consistent chat experience (Invitations tend to break when they _1).

Edited by Azure_Kyte

Nice post, can't wait for this to go live on PS4. I play often with someone on a different continent in my clan and we notice a significant difference in play between host and client. For example, the player who is host will often take significantly more damage from enemies and will need to take cover, use health restores and need revives more often than the client. I almost always notice at least some difference between client and host play, but when there are many hops separating two players (which you would expect on separate continents), it is particularly noticeable. Hopefully this will help. 

 

I often try to figure out, based on the behavior of certain objects in the game, what is actually being sent over the network. For example, with Mods and Fusion cores, I have noticed that when they pop from enemies they will sometimes appear in different places for each player. The "golden domes" in the void are notorious for this: sometimes the mod will drop in one place for one player, someplace else for another, and another player will end up with the mod still stuck in the "golden dome." So from this, we can assume that the only thing that gets passed over the network is that a certain type of mod popped from an x, y, z location, then local physics takes over to determine where it falls on the terrain. In the case of canisters, the type of mod might not even be transmitted over the network, as each player gets a random mod. I suppose there is always packet inspection for those really curious.


So much research and work.

 

Thank you for enlightening us with a comprehensive post and your constant efforts for our playing experience! It's really appreciated. Can't wait to test the results with the next update.

Yaer, you're back?

 

I've noticed less activity from you, but I digress...

 

Edit Yaer: I never left, but I found my actions quite meaningless and I don't like striking without a good reason, like many of my kin. I still haven't found how best to help the Tenno; I keep ways long since forgotten or abandoned, so for now their secrets will remain unspoken. If I see a field where my oddities are needed, I shall rise from the depths. And them... with me.

Until then, have you... a noble quest for me, Tenno? Let the Void carry it to me...

 

 

Thanks for this, this is interesting as games seem so simple, when the behind the scenes is insane ^^

Edited by Yaer
added an answer.
