
Network Matchmaking From Networking Expert


RorschachRev

TCP/IP works by assigning unique ports to different services. TCP handles multiple sessions on a single port well because each session carries metadata describing the data in transit; UDP has no such session data, and multiplexing individual UDP sessions on a single port would take a great deal of extra programming. I recommend using a unique set of ports for each computer behind a router. Your router's documentation will explain how to forward the ports, but I found that using uPNP worked better than manual port forwarding. With two computers behind an ancient DSL router, we used unique ports and manually forwarded them for both computers. We could play with each other, but couldn't join sessions. I set my laptop to use a different set of ports (not forwarded by the router) and enabled uPNP on it, while the other computer kept its manual forwards. This works great for group play as long as I host, and neither I nor my LAN partner has lag.
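For reference, the kind of mapping uPNP creates can be requested explicitly. Here's a minimal sketch using the miniupnpc Python binding (the port number is just an example, not Warframe's actual port):

```python
# Sketch: ask the router for a UDP port mapping via UPnP, using the
# miniupnpc Python binding. The port number is just an example.
import miniupnpc

upnp = miniupnpc.UPnP()
upnp.discoverdelay = 200          # ms to wait for gateway discovery replies
upnp.discover()                   # find UPnP-capable gateways on the LAN
upnp.selectigd()                  # pick the Internet Gateway Device

port = 4950                       # example game port
upnp.addportmapping(port, "UDP", upnp.lanaddr, port,
                    "Warframe session", "")
print("external IP:", upnp.externalipaddress())
```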

 

That port forwarding fails so often is a sad state of affairs. One manual port forward plus uPNP working for LAN play is the best we've done, and an interesting hybrid failure as well.

 

My suggestion would be to run some actual game servers instead of just accounting servers. Even if Evolution Engine has a creative solution for lowering server costs, this many failures isn't acceptable. I doubt the Xbox and PS3 versions will be free of networking problems, and console users typically expect things to just work. PC gamers are used to fiddling with settings a bit more, but I can't see the market really accepting the current state of networking.

 

On the servers, I highly recommend co-routines. The old select() methodology is limited by file descriptors, since each socket is treated as one. The highest-performance systems I've seen were Python and gEvent based. Using that combination, I was reading a 1MB shared memory buffer, converting it from bytes to JSON, and pushing it out to 100 clients every second via websockets, using less CPU time than a process list from "top" left running for a few minutes. (No webserver necessary in that 'web' stack.) This scales well to 100k simultaneous users per machine. Not "per second" but "simultaneous", which is a much larger number of open connections. TCP tends to add more overhead than the delivery itself, so using UDP will scale well server side. Player hosting by default, sure, but let the server host a game session for 1 platinum, or for subscribers. Once it scales and works well, I'd suggest offering server hosting as a free fallback.
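For the curious, a minimal sketch of that kind of stack, assuming the gevent and gevent-websocket packages (the shared-memory path, port, and one-second period are illustrative, not the production code I described):

```python
# Sketch: gevent + websocket broadcast server in the spirit of the stack
# described above. Assumes the gevent and gevent-websocket packages; the
# mmap path, port, and broadcast period are made up for illustration.
import json
import mmap

import gevent
from gevent.pywsgi import WSGIServer
from geventwebsocket.handler import WebSocketHandler

clients = set()

def app(environ, start_response):
    ws = environ.get("wsgi.websocket")
    if ws is None:
        start_response("400 Bad Request", [])
        return [b"websocket required"]
    clients.add(ws)
    try:
        while ws.receive() is not None:    # keep this greenlet alive
            pass
    finally:
        clients.discard(ws)
    return []

def broadcast_loop():
    with open("/dev/shm/gamestate", "r+b") as f:       # hypothetical buffer
        buf = mmap.mmap(f.fileno(), 1024 * 1024)
        while True:
            buf.seek(0)
            payload = json.dumps(
                {"state": buf.readline().decode(errors="ignore")})
            for ws in list(clients):       # push to every open socket
                try:
                    ws.send(payload)
                except Exception:
                    clients.discard(ws)
            gevent.sleep(1.0)              # once per second

gevent.spawn(broadcast_loop)
WSGIServer(("0.0.0.0", 8000), app,
           handler_class=WebSocketHandler).serve_forever()
```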

Edited by RorschachRev

I should mention that the web stack using websockets and everything else I described was about 45 lines of code, including memory management, memory maps, timers, a stored history for newly connected clients, auto reconnect, auto resume, and settings read from a configuration file it parsed, and it was production quality code. This server thing is only as hard as you make it, really.


If you have a suggestion, take it to the feedback forums; these aren't bugs.

 

As for actual servers: not going to happen. It would take thousands of servers to host, which would cost a lot of money and probably isn't within DE's reach as a company that wants to stay successful. They already had one fallout; we don't want another.


I could host 1,000 simultaneous game sessions for $80/month, I bet. Pathfinding and hitboxes are the most expensive CPU calculations.

 

It *is* a bug when people can't play. I *did* detail a minor workaround for LAN gaming: one person has ports forwarded and joins as the client, the other uses uPNP and is the host.


I am filing a bunch of bug reports about the game, following the rules by breaking them into separate topics and posting them in the appropriate threads. Here's the first and most important (and easiest to fix) bug report:

https://forums.warframe.com/index.php?/topic/376332-steam-failures-required-warframe-fixes-for-steam/

Here are my reported AI bugs, with some recommendations on fixes:

https://forums.warframe.com/index.php?/topic/376364-ai-issues-reported-from-ai-developer/


  • 2 weeks later...

The devs understate how serious the matchmaking issue is. Out of 12 people online in our clan, we managed one group of 4 and two groups of 3 that could play together; we also had a few pairs. With Void Keys becoming more expensive, our ability to play together while hunting drops is even more limited. Matchmaking shouldn't make it this hard to find people to play with.


What are you up to, RorschachRev? I don't get it... you don't appear to be a "networking expert" from what you've written so far (some of the spelling is wrong, some of the facts as well, and those "references" aren't really useful either).

 

You are right that Warframe's networking is poorly done. Also about select() being "slow"... though that's not really important for just 10 connections^^ More like 100+, which Warframe doesn't even use^^

Edited by ShiroiTora

TL;DR: I want them to fix it.

 

The select() method of handling network sockets in the Linux kernel uses file descriptors. The main advantage is high I/O throughput without blocking: http://man7.org/linux/man-pages/man2/select.2.html select() scales fairly well to around 1000 simultaneous sockets, but not an order of magnitude beyond that. For limited networking it works really well. For server-side networking it cannot solve the C10k problem (http://en.wikipedia.org/wiki/C10k_problem) on a single machine. The number of servers necessary increases drastically because of select()'s limitations.
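By contrast, a coroutine-per-connection design sidesteps the select() descriptor ceiling entirely. A minimal sketch with gevent, with echo logic standing in for real game traffic:

```python
# Sketch: coroutine-per-connection with gevent, the style I'd use instead
# of select(). Each connection costs a lightweight greenlet rather than a
# slot in a fixed-size fd_set, which is how one box can hold 10k+ sockets.
from gevent.server import StreamServer

def handle(sock, address):
    # One greenlet per client; blocking calls yield to the event loop.
    while True:
        data = sock.recv(4096)
        if not data:
            break
        sock.sendall(data)        # echo stands in for real game traffic

StreamServer(("0.0.0.0", 4950), handle).serve_forever()
```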

 

What I'm actually after is a game that works. Most of the time I can't play with my IRL friends, or with anyone else. I'm buying them a new router that will arrive next week. My personal ax to grind probably won't be as sharp if spending another $40 lets us play a F2P game we've put money into.

 

As for my qualifications, I've done stuff. I'd rather not talk about it here, mostly because I wouldn't be believed. My resume is verifiable by human resources staff. I've written a few protocols (TCP based, UDP based, two riding on TCP WebSockets), so I know more about this topic than your average $50k/year programmer.

 

The mental trap DE has fallen into in its design is a really easy one to adopt by mistake: the idea that a server would suddenly have to do all of the hosting work, and that a client needs a directly initiated P2P TCP/IP connection. The pure P2P method is limited, and thousands of people have worked on providing better connections for BitTorrent and eMule. Even when solutions within a protocol have been developed, not all clients embrace them: http://en.wikipedia.org/wiki/Comparison_of_eDonkey_software My wishlist for the patching system would include LANCast and some of the protocol and hashing tricks. Warframe simply fails in most situations with most router hardware in the marketplace today, preventing people from playing together. The workaround in the eMule/eDonkey protocol is server-routed connections: http://www.emule-project.net/home/perl/help.cgi?l=1&rm=show_topic&topic_id=103 I looked for the C++ code that actually implements it, but didn't find it in 5 minutes. The protocol description has byte-offset descriptions and ASCII art of UML models, but doesn't describe the server-routed connections.

 

I was proposing something more aggressive: UDP relay by the server. That would allow incremental adoption of server hosting while still letting anyone with an outbound UDP connection play with anybody else in the same matchmaking pool.
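A minimal sketch of such a relay, assuming the matchmaker simply pairs the first two clients that announce themselves (a real one would pair by session ID):

```python
# Sketch: a UDP relay in the spirit of the proposal above. Two clients
# send to the relay; it learns their public addresses from their first
# packets and forwards each side's traffic to the other. Only *outbound*
# UDP is required of the clients, so strict NAT is fine.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 4960))      # example relay port
a = b = None                      # the two paired endpoints

while True:
    data, addr = sock.recvfrom(2048)
    if a is None:
        a = addr                  # first client announces itself
    elif b is None and addr != a:
        b = addr                  # second client announces itself
    # Forward traffic to whichever side didn't send it.
    if addr == a and b is not None:
        sock.sendto(data, b)
    elif addr == b:
        sock.sendto(data, a)
```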

 

DE claims the Strict NAT issue affects only 7% of the playerbase, but the #1 topic of discussion in our clan is consistently "who can invite/host players A and B?" We've made lists, but they stop working after random changes. What do I want? I want to play Warframe with my friends. DE doesn't want dedicated servers or dedicated client-to-client proxies, but the reasons provided are invalid. Through these forum posts I'd like to figure out the real reasons, or disabuse people of incorrect assumptions. My friends encouraged me to post and propose solutions, with the assurance that the community managers would respond. There was a well-written post by another OpenBSD user (I haven't used it for many years) describing the flaws in various NAT technologies and why the DE approach wouldn't work once Warframe hit prime time. We're in that situation now, and the networking issues may kill a great game, a great company, a great concept, and fun play. I just get exasperated at my inability to play with the people who recruited me to the game.


Here's the source code to the eMule uPNP client, in C with Visual Studio project files; according to the author it also compiles with gmake: http://sourceforge.net/p/emule/code/ci/dev/tree/miniupnpc/

C++ for async proxy connections: http://sourceforge.net/p/emule/code/ci/dev/tree/AsyncProxySocketLayer.cpp

A drop-in replacement for the MFC socket library that supports the SOCKS/proxy routines above, and thereby their LowID routing method on the servers: http://sourceforge.net/p/emule/code/ci/dev/tree/AsyncSocketExLayer.cpp Obviously more complex than I'm going to post here, but also *not bad*, and example source code that compiles.

 

Sounds like eMule's LowID routing is just a SOCKS 4/5 server that everyone connects to. This should be dead simple to configure on the server side, and the client side just needs the open source C++ supplied above. Server side, tsocks and varnish aren't suitable. http://www.inet.no/dante/ is BSD licensed, has commercial support, and claims to be a high-performance SOCKS server.
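Client side, connecting through a SOCKS5 server like Dante is only a few lines. A sketch with the PySocks library (the host names and ports are made up):

```python
# Sketch: routing a client connection through a SOCKS5 server such as
# Dante, using the PySocks library. Host names and ports are examples.
import socks                      # pip install PySocks

s = socks.socksocket()            # drop-in replacement for socket.socket
s.set_proxy(socks.SOCKS5, "relay.example.com", 1080)
s.connect(("host-player.example.com", 4950))   # tunneled via the relay
s.sendall(b"hello through the proxy")
s.close()
```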

 

Dante has a performance document: http://www.inet.no/dante/doc/1.4.x/dante_performance_1.4.0.pdf with some useful data:

"Sustained send/receive rates of around 0.8 Gigabit/s in each direction and combined rates of around 1.5 Gigabit/s (see Figure 1). IP-packets are for most of the two week period received and transmitted at a steady combined rate of around 250,000 packets per second, with around 150,000 of these being TCP segments, and around 100,000 being UDP packets. Roughly the same amount of packets are sent and received." Since the game is UDP based, the number of packets and overall bandwidth that Dante could handle would be higher.

If you spell UPnP wrong all the time, I can't take you seriously.

Also, the problem is not Layer 4, nor is it how they listen on sockets.

 

It's true that people with strict NAT have problems, especially when connecting P2P to each other. But they should be fine reaching a host without that kind of NAT. (Also, people with a shared IPv4 address tend to be behind very strict provider NAT, where each outgoing packet from port X may be mapped to different external ports for different destination IPs.)

And as long as Warframe isn't really P2P and still uses some kind of client->server setup (which I guess it does, otherwise it wouldn't matter "who" hosts), multiple people with strict NAT or shared IPs should be able to connect to the host.

 

 

The problem with Warframe's networking is mainly its general programming. Such as:

- physics is mostly done by the host, at least where item drops, doors and the like are concerned.

- redundant packets: you might get the XP for capturing the tower before you receive the packet saying you captured it, so your game still thinks you're capturing it even though you've got the XP and are finished.

- those host-based physics and redundant packets cause not only higher network load but higher CPU load as well. So the network might be fine for hosting a game with 4+ players, but the CPU might not be, because of all these protocol flaws.

- no improved host migration / best-host detection (watch pings, measure bandwidth, and remember both across games to detect network problems and how many people someone can host; ask for host migration if someone with bad internet is hosting and a better host is available; this could be extended to a more complex network where clients switch to hosting on the fly and more than one host can be connected at once). A sketch of such host scoring follows this list.

- no IPv6 yet, if I'm not mistaken. IPv6 support would help reduce some connectivity issues for people with shared IPv4 (but non-shared IPv6) addresses.
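A minimal sketch of the best-host scoring idea mentioned above (the weights, thresholds, and candidate data are made up for illustration):

```python
# Sketch of the best-host detection idea from the list above. Weights
# and the candidate data are made up for illustration.
from dataclasses import dataclass

@dataclass
class HostStats:
    name: str
    avg_ping_ms: float        # measured across past sessions
    up_kbps: float            # estimated upstream bandwidth
    past_migrations: int      # times this host had to be migrated away

def host_score(h: HostStats) -> float:
    # Lower ping and higher bandwidth are better; hosts that previously
    # had to be migrated away from are penalized.
    return h.up_kbps / 100.0 - h.avg_ping_ms - 50.0 * h.past_migrations

def pick_host(candidates: list[HostStats]) -> HostStats:
    return max(candidates, key=host_score)

lobby = [
    HostStats("A", avg_ping_ms=40, up_kbps=5000, past_migrations=0),
    HostStats("B", avg_ping_ms=25, up_kbps=800, past_migrations=2),
]
print(pick_host(lobby).name)      # -> "A"
```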

 

Sure, I've also recently noticed problems with clients just dropping out of lobbies... so they were connected once and then lose their "connection", but that is probably just a bug as of now.

 

 

Dedicated servers might be nice, but they aren't necessary if the current infrastructure is improved and the protocol redone. But they are using Lua for their game logic, so what do you expect?

Edited by ShiroiTora

Ah, you missed the subtle joke of uPNP. It isn't universal, so a lowercase u. Plug, Not Play. When I worked for Microsoft I took the handle POSOS, because it was a POS OS. (Especially the version I worked with.)

 

There is one set of problems in making a connection, and another in maintaining game state between all the machines. The "P2P" aspect of the game networking is a misnomer, because real P2P networking is actually written well. What vexes me isn't the session, it's the failure to connect ports. I understand the issues that come with using SOCK_DGRAM instead of SOCK_STREAM (i.e., UDP instead of TCP). My understanding of the issue, and the reason that changing the router helped, is the dialog between the intended host and the intended client about "what IP:PORT can I find you at?" This is complicated by the fact that in most situations the port isn't really "open"; UPnP (uPNP) is supposed to solve those problems a little bit. Unfortunately they are relying on a broken, insecure methodology for having clients talk to each other. Realistically, if a port is open, forwarded, and static, that information can be passed from the game client intending to host, to the DE server, to the intended client. For some reason an open port isn't enough for DE's programming. WTF? The port is open. Open. Waiting for connections. Their socket() call fails WHY? The failure is at Layer 3 of the seven-layer networking burrito.
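To illustrate the dialog I mean, here is a minimal sketch of a UDP rendezvous exchange, assuming a hypothetical matchmaking server that pairs two peers and tells each one the other's public IP:PORT:

```python
# Minimal UDP rendezvous / hole-punch sketch. MATCH_ADDR is a stand-in
# for a matchmaking server that pairs two peers and reports each one's
# public IP:PORT as seen from the outside; all values are illustrative.
import socket

MATCH_ADDR = ("match.example.com", 4950)  # hypothetical rendezvous server

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 4955))              # the port we hope is reachable

# Register with the rendezvous server; it records our public IP:PORT.
sock.sendto(b"register session-42", MATCH_ADDR)

# The server replies with the peer's public endpoint, e.g. "1.2.3.4:4956".
data, _ = sock.recvfrom(1024)
host, port = data.decode().split(":")
peer = (host, int(port))

# Both sides now send a few packets to each other's public endpoint.
# Outbound packets open a mapping in each NAT, so the peer's packets
# can come back in; no manual port forward needed on cone NATs.
for _ in range(5):
    sock.sendto(b"punch", peer)

data, addr = sock.recvfrom(1024)          # peer's punch arrives here
print("connected to", addr)
```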

 

For gaming, the whole "TCP vs UDP" debate has been discussed to death. With TCP, a good network card can offload a lot of the overhead. (ASICs are the future of performance gains for another few years, so embrace the low level where possible.) UDP is faster, but you have to handle missing packets, out-of-order packets, and so on yourself. I remember back in the day when cDc used triple-redundant UDP for Back Orifice administration; it does work with higher-level logic on top. NAT tables are annoying to traverse, and it bypassed many stateful packet filters of that day. (I still like firewalker personally.)
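As a sketch of what "handling it yourself" means, here is the kind of sequence-number bookkeeping a UDP game protocol has to reimplement (illustrative only; a real protocol would also wire the acks onto the socket):

```python
# Illustrative sketch: minimal ordering/resend bookkeeping over UDP.
# This is the work TCP does for you that a UDP protocol must redo.
import struct
import time

HEADER = struct.Struct("!I")       # 4-byte sequence number prefix

class ReliableSender:
    def __init__(self, sock, peer):
        self.sock, self.peer = sock, peer
        self.seq = 0
        self.unacked = {}          # seq -> (payload, last send time)

    def send(self, payload: bytes):
        self.sock.sendto(HEADER.pack(self.seq) + payload, self.peer)
        self.unacked[self.seq] = (payload, time.monotonic())
        self.seq += 1

    def on_ack(self, seq: int):
        self.unacked.pop(seq, None)        # peer confirmed receipt

    def resend_stale(self, timeout=0.2):
        now = time.monotonic()
        for seq, (payload, sent) in list(self.unacked.items()):
            if now - sent > timeout:       # assume the packet was lost
                self.sock.sendto(HEADER.pack(seq) + payload, self.peer)
                self.unacked[seq] = (payload, now)

class OrderedReceiver:
    def __init__(self):
        self.expected = 0
        self.pending = {}          # out-of-order packets parked here

    def on_packet(self, data: bytes):
        seq = HEADER.unpack(data[:HEADER.size])[0]
        self.pending[seq] = data[HEADER.size:]
        while self.expected in self.pending:   # deliver in order
            yield self.pending.pop(self.expected)
            self.expected += 1
```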

 

It sounds like your complaint is about shoddy transactional programming that assumes the network layer is always correct. (It isn't.) My complaint is the inability to connect at all, not the lack of, or issues with, their transactional systems.

