I only mentioned Borderlands because it was cited as being a candidate for this kind of problem in one of your quoted replies.
Fair enough -- I'm not trying to argue with you. What I'm saying is that while a number of other games can also have this sort of problem under lesser load, the fact that a given player doesn't see it in those games is not necessarily an indication that there are no problems, given the difference in scale with AI War.
Having played Supreme Commander over the net without any trouble (aside from local performance issues - it is something of a monster after all) I'm not really seeing much in the way of evidence to suggest that my connection is now somehow incapable of handling heavy loads. With the router running up to scratch and all other net traffic activities performing without fault I'm not really sure what else there is to try.
From a networking standpoint, SupCom passes a lot less data than AI War; it is a beast, for sure, but it was the game I was playing most immediately prior to creating AI War, and I've seen the difference first hand. However, I'm also not suggesting that your network connection is incapable of handling heavy loads -- far from it.
What I'm saying is, when AI War's networking library dumps a lot of data into your network adapter, not all of that data is making it to the other end, and/or the other end isn't sending acks back. So this may not be your issue at all; it may be that the other end has a router or network card issue that is preventing those acks. Some possibilities:
1. A firewall, QoS service on a router, or some other software sees the huge spike in transfer traffic, and thus smothers it.
QoS can be implemented in a variety of ways, but often it is geared towards smoothing out spikes in network traffic to ensure that VOIP still runs smoothly, for instance (on the Vonage routers). I doubt that any other game has quite the same sorts of spikes that AI War does, because usually you have X number of entities all moving around fairly constantly, rather than X low number of entities moving around and then a player suddenly giving 1000 or 4000 commands to units all at once (as with a big bulk move, or what the AI is doing here). So depending on the QoS algorithms on the various firewalls in use, they may be actively disrupting your AI War session despite the fact that they are not "broken" and do not interfere with other network traffic.
2. Some sort of faulty network driver can't cope with that amount of data being put into its queue without an overflow and losing a bit of it before then passing on the rest.
3. Some sort of filtering at the ISP level is freaking out.
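To make the spike point from possibility 1 a bit more concrete, here's a rough sketch of a token-bucket policer, which is one common way rate limiting gets done. This is purely illustrative: I don't know what algorithm any particular router, firewall, or ISP actually uses, and the function name, the rates, and the packet counts are all made-up assumptions. The idea is just that the same total amount of data can pass cleanly when it's spread out and get heavily dropped when it arrives all at once.

```python
def simulate_token_bucket(arrivals, rate_per_tick, burst_capacity):
    """Count packets dropped by a simple (hypothetical) token-bucket policer.

    arrivals       -- list of packet counts, one entry per time tick
    rate_per_tick  -- tokens added back to the bucket every tick
    burst_capacity -- maximum number of tokens the bucket can hold
    """
    tokens = burst_capacity
    dropped = 0
    for count in arrivals:
        # Refill the bucket, but never beyond its capacity.
        tokens = min(burst_capacity, tokens + rate_per_tick)
        # Each packet that gets through costs one token; the rest are dropped.
        passed = min(count, tokens)
        tokens -= passed
        dropped += count - passed
    return dropped


# Same total (4000 packets) in both cases; only the shape differs.
steady = [100] * 40           # constant trickle of traffic
spiky = [10] * 39 + [3610]    # quiet, then one big bulk order

# Illustrative limits only, not taken from any real device.
print("steady dropped:", simulate_token_bucket(steady, 120, 500))
print("spiky dropped: ", simulate_token_bucket(spiky, 120, 500))
```

With these made-up numbers the steady stream loses nothing while the spiky one loses most of its burst, even though the limiter is working exactly as designed. A real QoS implementation might queue or delay rather than drop, so treat this as a picture of the general effect rather than a diagnosis.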
The underlying network library that we are using is supposed to cope with that sort of thing, but I haven't gotten much response from the library author about any potential issues there. As far as he is concerned, I think, the issue is below his layer, since it works for the vast majority of people and only some specific computers have issues. This creates a lot of challenges for me, as thus far I can't disprove that it is someone's hardware, and I don't see any issues with his or my software (the game itself is super simple in how it handles this, and all transmissions are identical, so it's almost guaranteed to be the network library or lower).
I'll still try and join a game with someone other than V and see how that goes.
Sounds good. Again, there might be no problems on your end at all -- it's possible that one of the other players is the one with the QoS interfering, or bad network driver, or whatever issue. From the sounds of it your stuff is well maintained and up to date, and the age of your router is more of a plus than anything in my mind (since QoS was less standard until recently). Generally speaking, the vast majority of setups don't have any issues with AI War -- and in one of the other two reported instances of this issue, they were able to play just fine over the LAN, and saw less incidence of this over Hamachi compared to direct-over-internet connections. So there again, that points to some sort of filtering or throttling or QoS at the router or ISP level, where something either is failing because of, or actively rebelling against, having a data stream with periodic spikes above normal.
This is a very tough sort of situation for me, to be honest, because while part of my background is as an IT admin, and I've got some experience coding TCP sockets and such, I'm by no means an expert on all things networking. My expertise is in most of the other areas of creating games -- hence why I'm using an external networking library at all. So I have to coordinate with the author of the network library for some of the nitty gritty with this, and his library is being used in a variety of products, including AI War, without any known incidence of failures of this sort. By the same token, there's a ton of different hardware and driver software out there, all of which has its pluses and minuses, and different versions of each (some with bugs, some without), and different sorts of filtering software, ISP policies, firewalls of both the software and hardware variety... this makes it a real challenge to find out what the root problem actually is, and generally with this sort of issue (which shows up in various games for various reasons with various different hardware) it is something outside the game or network library itself. It's the sort of thing where, if the game or network library couldn't handle it, it would fail for everyone or at least a majority of players, rather than an extremely tiny minority, right? So the trick is figuring out what else is in the pipeline between the affected players, and how to recommend what the fix is. There's always the possibility of some sort of odd edge case bug in my game or the network library, but I've been all through that to the extent that I can, and there's not much left for me or the network library to check at the moment with the data we have.
I really hate when these sorts of issues come up, because of course the tendency for a lot of companies is to point the finger of blame at some other company, and I hate doing that even when it is true (or likely true), because it looks bad in general. So I always look internally first, but when all of those solutions seem to be exhausted, there's nowhere else to look but at the huge stack of other software (drivers, firewalls, etc.) that is responsible for the transfer of data. Hope that makes sense, and I hope that the comments above about QoS help you diagnose the issue if it is indeed on your end, or perhaps the issue is on the client end and one of them can simply update their NIC driver or router firmware and get the issue solved that way...