One of the things we’ve been struggling with since the beta is the seemingly exponential server requirements for matchmaking. That’s why a beta release would be peachy, and then when it went out to the public, it would fall apart.
Well, yesterday they finally found the issue. Deep, deep in one of the licensed network libraries was something called flow control, which limits the number of packets that get processed based on a number of criteria. There’s good reason for flow control, but this code is a “black box” to us. Until we could meet with the developer in person and go through his code, we had no idea that you could end up with scenarios where a player might only be processing 3 or 4 packets per second.
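To make the idea concrete, here is a minimal sketch of what a per-second packet cap looks like in general. This is not the licensed library’s code (that is a black box); the class name, the one-second tick, and the cap of 4 are all assumptions made up for illustration.

```python
from collections import deque


class FlowControlledReceiver:
    """Hypothetical receiver that handles at most `max_per_second` packets per tick."""

    def __init__(self, max_per_second):
        self.max_per_second = max_per_second
        self.pending = deque()   # packets that arrived but were not handled yet
        self.handled = []        # packets handled so far, in order

    def receive(self, packet):
        # Arrivals are never refused here; they simply wait in the queue.
        self.pending.append(packet)

    def tick(self):
        """Advance one second: handle up to the cap, leave the rest queued."""
        for _ in range(min(self.max_per_second, len(self.pending))):
            self.handled.append(self.pending.popleft())
        return len(self.pending)


receiver = FlowControlledReceiver(max_per_second=4)
for packet_id in range(10):      # 10 packets arrive within the same second
    receiver.receive(packet_id)
still_waiting = receiver.tick()  # only 4 get handled, 6 are left waiting
```

The point of the sketch is only that anything over the cap waits for a later tick, which is harmless when traffic is light and painful when it is not.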
Believe it or not, 3 or 4 packets per second isn’t necessarily a bad thing in a client/server game. You connect to the server and they send back the connection info. Maybe that connection info is 10k. So it takes a second or two to get that info. It’s not that big of a deal if it’s 1 vs. 1. It’s not even that big of a deal if it’s 2 vs. 2.
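The “second or two” figure is easy to sanity check with back-of-the-envelope arithmetic. The sketch below assumes roughly 1,400 bytes of payload per packet, which is a guess at a typical size and not a number from the library; the function name is made up as well.

```python
import math


def transfer_seconds(payload_bytes, packets_per_second, bytes_per_packet=1400):
    """Seconds needed to receive a payload when only a few packets are processed per second.

    bytes_per_packet is an assumed payload size, not a measured one.
    """
    packets_needed = math.ceil(payload_bytes / bytes_per_packet)
    return packets_needed / packets_per_second


at_four = transfer_seconds(10_000, packets_per_second=4)   # 8 packets at 4 per second
at_three = transfer_seconds(10_000, packets_per_second=3)  # 8 packets at 3 per second
print(f"10k at 4 packets/s: {at_four:.1f}s, at 3 packets/s: {at_three:.1f}s")
```

Under that assumption, 10k of connection info is 8 packets, so one peer costs you between 2 and 3 seconds, which is tolerable when there are only one or three other players to hear from.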
But naturally, in Demigod, people want to do 3 vs. 3 or 4 vs. 4. And if you do that, very bad things start to happen because the packets start to back up. You can tell it’s messed up because your connection window will not seem to make sense, and it doesn’t, because the information in it is incredibly outdated.
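Here is a rough sketch of why bigger lobbies back up. It is a toy model, not a measurement: it assumes every other player in the lobby sends you 1 packet per second and that you can only process 4 per second. Both numbers, and the function name, are assumptions picked to illustrate the effect.

```python
def lobby_backlog(players, seconds, per_peer_pps=1, cap_pps=4):
    """Return (queued_packets, seconds_behind) after `seconds` in a lobby.

    per_peer_pps and cap_pps are illustrative assumptions, not measured values.
    """
    arrivals = (players - 1) * per_peer_pps  # every other player sends to us
    backlog = 0
    for _ in range(seconds):
        # Each second: new packets arrive, at most cap_pps of them get processed.
        backlog = max(0, backlog + arrivals - cap_pps)
    # Time needed to drain the queue = how stale the newest processed data is.
    return backlog, backlog / cap_pps


for players in (2, 4, 6, 8):
    queued, behind = lobby_backlog(players, seconds=60)
    print(f"{players} players: {queued} packets queued, about {behind:.0f}s behind")
```

In this model a 1 vs. 1 or 2 vs. 2 lobby never falls behind, a 3 vs. 3 lobby slowly does, and after a minute in a 4 vs. 4 lobby you are looking at data that is about 45 seconds old, which is the kind of thing that makes the connection window stop making sense.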
If you have to get info every few seconds from, say, 8 people in a lobby, you’re going to get backed up quickly. Until we fix this, you can avoid a lot of it by keeping your games at 2 vs. 2, or maybe 3 vs. 3 if the people in the lobby have decent machines. It’s fixed on our end, but we still have to merge the fixed library into our code and, if you’re a developer, you know how carefully that needs to be done, even with version control.
But the problem I just described has been there since the beginning (i.e. February). The only reasons connectivity has gotten better are that we’ve thrown a ridiculous number of servers at it (I think there are like 8 servers now dedicated to just handling requests) and that the work we’ve been doing for the past 2 weeks has been improving things.
BTW, the only reason the day 0 update worked better for some people than the most recent release is that the day 0 update used NAT, which is a lot slower (so a lot fewer messages). The faster the system, the more it aggravates this problem.
I’m going to ask our technology architect to do a write-up on this once it is fixed. I am heading back to run more testing scenarios.
I plan to also write up an article on how games are made so people can get a better idea of how this sort of thing can happen in a major commercial game in the first place.