Author Topic: [Solved] Bad transfer performance

Offline NewName

  • Newbie
  • *
  • Posts: 3
[Solved] Bad transfer performance
« on: October 18, 2009, 08:29:55 am »
I'm playing a 4 player coop game with some friends, and we're getting some really bad transfer rates for the initial game state when joining a game.

In our most recent session, the game state was about 1 megabyte and it took about half an hour for me to upload to the three other players. However, once the game loaded we had no lag at all, and the game ran fine until we decided to call it a night.

I have a 100 kb/s upload speed and the other players have 1000+ kb/s download speeds. We play over hamachi, but have also tried direct connections with the same results.

We still have the issue if someone else hosts. And I have confirmed that AI War is the only program using the network.

Is this a known issue? Is there anything we could try to speed up the transfer rate?

« Last Edit: November 02, 2009, 08:51:40 am by Revenantus »

Offline Revenantus

  • Arcen Games Staff
  • Hero Member Mark III
  • *****
  • Posts: 1,063
Re: Bad transfer performance
« Reply #1 on: October 18, 2009, 10:37:40 am »
Hi NewName,

Welcome to the forum.

I apologize for the issue you're having, and can confirm that this is a known issue that affects a subset of players. There isn't currently any advice I can offer you to resolve the problem, but we are looking into this.

Thanks for the report.

Regards,
Revenantus

Offline x4000

  • Chris McElligott Park, Arcen Founder and Lead Dev
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,651
Re: Bad transfer performance
« Reply #2 on: October 18, 2009, 11:06:39 am »
Welcome to the forums!  Sorry to hear that you are having trouble with performance on connection.  This is not a known issue, although I did have one other report of that from last night, oddly.  May I ask what version of the game you are using?  Are you using the 1.999 prereleases, or 1.301, or something earlier?  You can tell by looking at the upper left of your in-game screen when you are on the main menu.

In general, you may find this upload time estimate to be of interest: http://www.filesanywhere.com/FileTransferTimeCalculator/UploadBackupTimeCalculator.asp?FROM=UPDATES

A 100kbps connection is pretty low for a game of that size, that's only 12.5kBps (bytes versus bits), which means that under perfect network conditions it will take 81.92 seconds to transfer 1MB (1024KB) of data.  Given that you are transferring this to three different other players, you're looking at 245.76 seconds to transfer that 3MB.

That, of course, is only around 4 minutes, which is certainly far less than what you are experiencing.  You mentioned that AI War is the only program using the network, but I do have to ask: are you using voice chat software?  For instance, Skype can supposedly use between 3 and 16 kilobytes per second, and so that's a huge portion of your 12.5.  If you're the host in Skype, it's even more.
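The estimate above can be reproduced with a short sketch. It assumes ideal conditions (no protocol overhead, no packet loss), 1MB = 1024KB, and a link speed given in kilobits per second; the function name `transfer_seconds` is made up for illustration.

```python
def transfer_seconds(size_kb: float, link_kbps: float, receivers: int = 1) -> float:
    """Ideal time for one host to upload size_kb kilobytes to each receiver."""
    kilobytes_per_second = link_kbps / 8  # 8 bits per byte: 100 kbps -> 12.5 kBps
    return size_kb * receivers / kilobytes_per_second


one_client = transfer_seconds(1024, 100)
three_clients = transfer_seconds(1024, 100, receivers=3)
print(f"one client: {one_client:.2f} s, three clients: {three_clients:.2f} s")
```

Anything else sharing the uplink (voice chat, for example) would reduce the effective `link_kbps` and stretch these times accordingly.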

But, assuming that there is no voice chat software (or VOIP phones like Vonage, or anything else you might not think of off hand -- Wii updates, PSN, etc), then the most likely culprit here is packet loss.  When a packet is lost, it has to be resent, and it has to be reordered with the other packets received on the other end.  This takes time, and can cause all sorts of slowdowns and extra load on the network.  When you multiply this times three clients, that can really start to add up, though 30 minutes is beyond extreme.

Also, at present, with the UDP networking of the game, Hamachi is definitely going to be a slowing influence rather than a speed-increasing influence (as it sometimes was when we used TCP for the game).  Hamachi adds some reencoding of packets, potentially a roundtrip through their servers if you don't have a direct connection to your peers, and also the overhead of encryption.  So on an Internet connection that is on the lower end of things, you're better off avoiding Hamachi if you can for large transfers of data like this.

This whole thing raises two questions for me:
1. Is there something that the game can do differently with regard to how it sends the sync data?  It's possible that the network library is choosing to use larger packets, which, when lost, cause more data to have to be resent compared to when packets are lost during in-game play.  I will investigate this.

2. Are the clients sharing a transfer queue for this initial sync?  In other words, is packet loss on one client able to affect the packet loss on another client?  You could help me out with testing this (and potentially use this as a workaround for your own network performance) by having each client connect one at a time, so that the huge data transfer only goes to one of them at a time.  That would tell us if packet loss on each connection is causing slowdown on the other connections.


Beyond the above, I should point out that a 1MB save file is on the larger end for the game.  Certainly, with four players games are larger in general, and they grow with time as well, but one big determinant for this (besides the number of players) is the size of your galaxy maps.  If all else fails, playing on smaller galaxy maps should give you better performance.  That said, what the game is transferring is not actually 1MB of a savegame -- that is further compressed on send, and winds up being more like 70% of that size or potentially even a bit smaller.  It's basically using the best GZIP compression available today for this sort of sending.

Generally speaking, most of my 4-player savegames are in the 1.4MB range, and it used to take around 2-4 minutes or so to get everyone connected when I had a 450kbps upload speed.  I now have a 1500kbps upload speed, and it takes around 30 seconds.  At 100kbps, I would not expect it to then start taking 30 minutes unless packet loss was also a factor.

Please let me know what you are able to find out with #2 above, if you have time, and I'll see what I can find out about #1.  Thanks for your support of the game!
Have ideas or bug reports for one of our games?  Mantis for Suggestions and Bug Reports. Thanks for helping to make our games better!

Offline x4000

  • Chris McElligott Park, Arcen Founder and Lead Dev
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,651
Re: Bad transfer performance
« Reply #3 on: October 18, 2009, 11:07:40 am »
can confirm that this is a known issue that affects a subset of players.

Is it?  My understanding with people like Lars was that the time required was scaling up pretty much as I would expect, not to something crazy like 30 minutes.

Offline Revenantus

  • Arcen Games Staff
  • Hero Member Mark III
  • *****
  • Posts: 1,063
Re: Bad transfer performance
« Reply #4 on: October 18, 2009, 11:38:16 am »
can confirm that this is a known issue that affects a subset of players.

Is it?  My understanding with people like Lars was that the time required was scaling up pretty much as I would expect, not to something crazy like 30 minutes.

I was discussing this with Lars on the IRC channel and he's absolutely convinced that there's something amiss.

I find it somewhat odd that multiplayer performance was unaffected after the initial transfer. A 30 minute transfer time for 3MB suggests an average transfer rate of 13.3kbps, and in theory if nothing else changed that rate should persist throughout play. From previous threads regarding lag issues I was under the impression that a 4 player multiplayer campaign required significantly more than a 13.3kbps upload rate to prevent lag.

Perhaps this whole issue can be attributed to a lack of bandwidth then, I'll collect some data on the game's bandwidth usage.

EDIT: To be clear, I mean that AI War is averaging 13.3kbps of bandwidth usage, regardless of anything else that's running.
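As a quick check of that figure, here is a minimal sketch of the average-rate calculation. It assumes 1MB = 1,000,000 bytes and 1 kbps = 1000 bits per second (with 1MB = 1,048,576 bytes the result comes out closer to 14 kbps); `average_kbps` is just an illustrative name.

```python
def average_kbps(total_bytes: int, seconds: float) -> float:
    """Average throughput in kilobits per second for a completed transfer."""
    return total_bytes * 8 / seconds / 1000


# 3MB in total (1MB to each of three clients) over 30 minutes.
rate = average_kbps(3_000_000, 30 * 60)
print(f"{rate:.1f} kbps")
```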
« Last Edit: October 18, 2009, 11:42:57 am by Revenantus »

Offline NewName

  • Newbie
  • *
  • Posts: 3
Re: Bad transfer performance
« Reply #5 on: October 18, 2009, 12:34:42 pm »
Thanks for the quick and informative responses. We're using version 1.999O.

I'd like to apologise for my unclear notation, I meant 100 kilobytes/s rather than 100 kbits/s. My friends and I are all on ADSL2+ connections, so that's about 12 Mbits/s download and 1 Mbit/s upload (I probably should have used the industry standard kbits notation from the start!).

When we were playing, my computer was the only one using the network (the rest were off), and my household doesn't have any other internet enabled devices. My computer was only running AI War and a firefox session, so I'm almost certain it had complete access to my internet connection.

As far as packet loss goes, I have played other games over hamachi without noticeable lag. And I would have thought such extreme packet loss would make both those games and AI War almost unplayable? Obviously I couldn't be certain without some kind of statistic from the game itself though.

In our last session, one of the players joined a few minutes before the others. And I think his transfer was going just as slowly. But I could be mistaken, so I'll give this a proper try on our next session (which could be a few days from now).

I'm also getting the 1 MB figure from the transfer window which comes up when a player joins (to them it says something like 564,025/1,025,556). The actual save file itself is about 1.3 MB.

Another thing that I was reluctant to bring up before (as it's a bit sketchy, but might be of interest) was that I was using a very basic network monitoring program to look at the transfer rates over the hamachi network adapter. It would show transfer rates of about 2 kbytes/s as a baseline, and every 20-40 seconds this would shoot up to 100 kbytes/s for just a fraction of a second.  One of my friends told me he was getting <10 kb/s (not sure if he meant bytes or bits) out of hamachi from his end, and would see it spike every 40 seconds or so.

Also, I'd be quite happy to run any network monitoring programs you think would help find a solution to this problem (and could probably convince my friends to as well).

Offline x4000

  • Chris McElligott Park, Arcen Founder and Lead Dev
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,651
Re: Bad transfer performance
« Reply #6 on: October 18, 2009, 12:53:03 pm »
Very interesting, those figures.  That's not at all what I would expect.  Also very interesting to hear that it was going slow for one player when just he was connecting.

Whatever is actually going on, the trigger seems to be the volume of data that is being sent.  Why it is then going into a low-speed transmission for a few people, but not the majority, I am not sure.  It may be a bug in the Lidgren network library that I'm using, or it may be some config aspect for that library that is just not set up optimally for large data sending.  Fortunately, the network library is open source, so I'll dig around in that code and see what I can find out with it.

Any sort of data you can give me from network monitoring is quite welcome, although I wouldn't spend a whole lot of time on that if I were you, because I don't know how much it will really tell me beyond what you've already said -- but that was hugely valuable, so if you find something else unexpected then by all means please let me know.

My statements about packet loss were basically to do with the size of data being sent at a go here.  So if one packet is lost, and then (say) 40 kbytes of data has to be sent again (and again, and again) until it finally makes it through, that could explain the insanely long load times there.  Versus when there's perhaps 20 kbytes total being sent, or even less, packet loss would have very little effect (as in other games, or during gameplay of AI War after that initial load).
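To put rough numbers on that intuition, here is a toy model rather than a description of how the network library actually behaves. It assumes a message is split into fixed-size packets, each packet is lost independently with the same probability, and losing any one packet forces the whole message to be sent again. The 1400-byte packet size and the 2% loss rate are illustrative guesses, and `expected_sends` is a made-up name.

```python
import math


def expected_sends(message_bytes: int, packet_bytes: int, loss_rate: float) -> float:
    """Expected number of times a whole message is sent before it arrives intact.

    Toy assumption: one lost packet means the entire message is resent.
    """
    packets = math.ceil(message_bytes / packet_bytes)
    chance_intact = (1 - loss_rate) ** packets  # every packet must arrive
    return 1 / chance_intact


large = expected_sends(40 * 1024, 1400, 0.02)  # a 40 kbyte sync message
small = expected_sends(4 * 1024, 1400, 0.02)   # a small in-game message
print(f"large: {large:.2f} sends on average, small: {small:.2f} sends on average")
```

Under these assumptions the overhead grows exponentially with the number of packets per message, which is one way large messages could suffer while small in-game messages barely notice the same loss rate.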

The mechanism that is used to sync is exactly 100% the same mechanism that is used for in-game gameplay data transfers, which is what makes this so curious.  The only thing different is the size of the data being put in during the sync, and so apparently that is triggering whatever is amiss.

Offline x4000

  • Chris McElligott Park, Arcen Founder and Lead Dev
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,651
Re: Bad transfer performance
« Reply #7 on: October 19, 2009, 12:07:33 am »
Okay, this is now fixed in version S!  This was a really screwy issue, and I'm pretty sure it's a bug in the Lidgren Network library, or else I'm misunderstanding how some of it is supposed to work.  Let me try to explain what was going on, since I know a number of you are curious.

First, a bit of background:
-------------------------
1. The game sends all data via UDP, through the Lidgren Network library (LN), which makes sure that things are Reliable (all messages get there) and Ordered (they get there in the right order).  LN is also responsible for things like resending when packets/messages don't arrive, and reconstituting messages on the other side when they are split across multiple packets (as will happen, since messages can be arbitrarily large).

2. On top of this, AI War has its own messaging system, with what I call "Player Messages."  Normally, these messages are reasonably small (under 3-4KB at most), and during the game these are sent to the server, and from the server, every 200ms.  During the game, it has 200ms in which to process the network sending without causing lag, which works beautifully in almost all circumstances at this stage.

3. When AI War does a full sync (such as loading a savegame), it used to create one giant PlayerMessage that has the entire save file in it (so this can range from 200KB to 1.5MB or more).  More recently, I've switched it so that it breaks this one big message up into 100 sub-messages, so that this can feed the progress percentage for the sync (the client checks how many it has received, out of how many the host has told it to expect, and both shows this to the client player and reports it to the host so it can show it to the host player).  All 100 of these messages were being dumped into the internal Lidgren queue immediately, and as you can guess many of these messages could individually be from 20-150KB depending on the size of the save file.
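The 100-part split and the progress percentage from item 3 can be sketched roughly as follows. This is not the game's actual code: `split_into_parts` and `progress_percent` are hypothetical names, and the payload is a zero-filled stand-in sized like the 1,025,556-byte sync reported earlier in the thread.

```python
def split_into_parts(payload: bytes, parts: int = 100) -> list[bytes]:
    """Split payload into at most `parts` roughly equal sub-messages."""
    size = max(1, -(-len(payload) // parts))  # ceiling division, at least 1 byte
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def progress_percent(received: int, expected: int) -> int:
    """Progress shown to the player: parts received out of parts expected."""
    return 100 * received // expected


payload = bytes(1_025_556)  # stand-in for the compressed sync data
sub_messages = split_into_parts(payload)
print(len(sub_messages), "parts of up to", len(sub_messages[0]), "bytes each")
print(progress_percent(55, len(sub_messages)), "% after 55 parts")
```

Handing all of these parts to the network layer at once is the step that turned out to cause trouble.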

The Symptoms:
---------------
The problem, after much experimenting and testing with my dad (who was able to duplicate this with his connection), was indeed resent packets.  Using some new network debugging info that I added (Shift+F3 in the new release), I was able to see this definitively.  His machine was sending data constantly, but it was duplicative and really jumbled in general.

The Problem:
------------
The symptoms aren't necessarily an indication that there is packet loss, even though that is normally the cause of resends of this sort.  Instead, from what I could tell in this case, it was a case of a traffic jam.  Lidgren waits a certain amount of time before doing a resend on each message if it does not get a response, but in this case there were so many large queued messages that this was causing a massive number of resends, which then caused other various network-contention problems.
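A toy model shows how such a traffic jam could produce resends with no packet loss at all. This is only a guess at the mechanism, not Lidgren's actual algorithm: it assumes every message's resend timer starts when the message is queued, messages leave the host strictly in order, and network latency is zero. The 1-second timeout, the 10,000-byte message size and the 125,000 bytes/s (1 Mbit/s) uplink are illustrative values, and `premature_resends` is a made-up name.

```python
def premature_resends(num_messages: int, message_bytes: int,
                      link_bytes_per_s: float, resend_timeout_s: float) -> int:
    """Count queued messages whose first send cannot finish before the timer fires.

    Toy assumptions: all timers start at enqueue time, FIFO sending, zero latency.
    """
    count = 0
    for position in range(1, num_messages + 1):
        # Time at which the message in this queue position finishes its first send.
        finish_time = position * message_bytes / link_bytes_per_s
        if finish_time > resend_timeout_s:
            count += 1
    return count


burst = premature_resends(100, 10_000, 125_000, 1.0)  # everything queued at once
single = premature_resends(1, 4_792, 125_000, 1.0)    # one small message at a time
print(burst, "of 100 queued messages would time out;", single, "when sent singly")
```

In this model every resend also adds to the backlog, so the real situation could be worse than the simple count suggests.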

In addition, this was causing so much flooding of the network card that it was sometimes literally killing my dad's Skype connection, which is just crazy.  My understanding had been that the Lidgren library was going to send each one in turn, properly queuing each one, but evidently there was a lot more overlap than that, and the result was a huge mess.

The Solution:
-------------
Fortunately, AI War already has a message-caching queue of its own, for when the network is temporarily unavailable.  This message-caching queue slowly doles out its messages to each connection to the server, one message per game cycle (so 20 messages per second per connection).  This was ideal for my needs here, because basically I needed a way to handle my own queue since Lidgren was not doing a good job of it (or because I was misinterpreting what Lidgren was supposed to be doing, but either way it amounts to the same thing).

So now these partial messages for the sync are dumped into a queue to be doled out one per connection per 30ms.  This basically means that if I want to avoid contention issues that would lead to lots of resends, I'd need to make each partial message small enough to be sent well within that 30ms interval.  I settled on 4.68KB max per message after some experimentation, and so now the sync data is divided into however many parts it will take to make sub-messages of that size, with a minimum of 200 parts.
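The scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not AI War's implementation: `split_sync_data` and `PacedSyncQueue` are hypothetical names, 4.68KB is taken as 4.68 × 1024 bytes, and the caller is assumed to invoke `tick()` once per send interval.

```python
import math
from collections import deque

MAX_PART_BYTES = int(4.68 * 1024)  # 4792 bytes, assuming 1KB = 1024 bytes
MIN_PARTS = 200


def split_sync_data(payload: bytes) -> list[bytes]:
    """Split sync data into parts no larger than MAX_PART_BYTES, aiming for at least MIN_PARTS."""
    parts = max(MIN_PARTS, math.ceil(len(payload) / MAX_PART_BYTES))
    size = max(1, math.ceil(len(payload) / parts))
    return [payload[i:i + size] for i in range(0, len(payload), size)]


class PacedSyncQueue:
    """Hands out at most one queued part per connection on each tick."""

    def __init__(self) -> None:
        self._pending: dict[str, deque] = {}

    def enqueue(self, connection_id: str, parts: list[bytes]) -> None:
        self._pending.setdefault(connection_id, deque()).extend(parts)

    def tick(self) -> list[tuple[str, bytes]]:
        """Call once per send interval; returns the (connection, part) pairs to send now."""
        return [(conn, queue.popleft())
                for conn, queue in self._pending.items() if queue]


sync_parts = split_sync_data(bytes(1_572_864))  # stand-in for a 1.5MB sync
pacer = PacedSyncQueue()
pacer.enqueue("client-1", sync_parts)
pacer.enqueue("client-2", sync_parts)
first_batch = pacer.tick()
print(len(sync_parts), "parts;", len(first_batch), "sent on the first tick")
```

Because only one small part per connection is in flight per interval, a slow link delays the transfer instead of piling up a backlog of large messages that all time out together. Very small payloads may end up with slightly fewer than `MIN_PARTS` parts in this sketch.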

The Caveats:
-------------
This solution works extremely well, it would seem from testing it so far, though it is not without some drawbacks.  Basically, if you have 8 players connecting all at once, and your server's network connection is not up to par, you might see a recurrence of these contention issues.  Or if your network connection is not able to keep up in general, basically, you might see some resends that are wasteful.  However, the mitigating factor here is that the messages are individually vastly smaller than they were in the past, so even in the worst case it should not be taking 20-30 minutes to connect even 8 players.  But some of this is going to need more testing under various network conditions.  I may need to make some sort of settings option to allow for some tuning of this in edge cases.  But, hopefully not -- and it should still be vastly better even for the worst edge cases, so that's an improvement either way.

The other drawback is more minor, but still worth noting.  Essentially, this solution works as a sort of speed limit for the transfer, in addition to causing it to not overlap.  Given the asynchronous, unreliable nature of UDP, this is pretty much necessary for now, though I may revisit it someday when more data comes in.  So this basically means that if you are connecting over a LAN and used to superfast loads, it's going to be way slower now -- more like an Internet load.  For a 1.5MB file, assuming you have a connection that is able to send at full speed or better, it will take 30-40 seconds on average, which is slower than before (10ish seconds), but still quite fast in general. 

The nice thing is that even Internet connections like my dad's now send to my machine in 30-40 seconds instead of taking more like 3-5 minutes.  So this somewhat normalizes the load times (depending on filesize), assuming reasonable connections.  Even for people on the LAN, this is an improvement as it won't stress out their network card in the same way (thus avoiding potential interference with other network devices or programs).  So overall this should be a win for everyone, though I'm not thrilled about having a hard speed limit for transfers in the game (doesn't seem as future-proof as I'd like).

Next Steps:
-----------
The future-proofing aspects can be explored more, during the expansion work after 2.0, though.  As long as this works reasonably well for everyone, and doesn't cause any major problems for anyone, this is going to be the final sync code for 2.0; there's too much risk of breaking something else the day before release if I try to go crazy with this.

The next step is for people who are affected by this issue to give it a try when possible and see if it works.  Hopefully everyone will have the same success as with my dad's connection.  Please let me know what you find, and please post a screen shot of the Shift+F3 data from the host if you're reporting a problem.  Thanks!

Offline NewName

  • Newbie
  • *
  • Posts: 3
Re: Bad transfer performance
« Reply #8 on: November 02, 2009, 08:50:06 am »
Hi,

Sorry for the very late update (it's been test week here at uni :( ). Version 2.0 solved all our transfer problems wonderfully, it now loads in under 30 seconds!

Thanks again (and the info on the solution was very interesting too)!

Offline Revenantus

  • Arcen Games Staff
  • Hero Member Mark III
  • *****
  • Posts: 1,063
Re: Bad transfer performance
« Reply #9 on: November 02, 2009, 08:51:27 am »
Great!

Enjoy playing AI War.