Since we kinda got into the "groove" already with the ed2k performance tuning topic, and since we have the tools for doing it quickly, with proper statistical results after each change, we continued the topic today with chemical, with very positive results. Overall, we noted a radical performance improvement in downloading, queue and upload management. While your mileage may vary, the list of fixes should say enough:
- Use 2-minute socket timeout when transfer is in progress, since upload/download slots are very valuable in ed2k, and eMule often puts clients in "stalled" mode (in upload-list, but no data transmitted) for longer periods of time. Two minutes ought to be sufficient to survive those stalled situations.
- When a download was in progress and the client sent us QueueRanking, it means we have been put back in its queue. Hence, schedule the next reask as normal (30 minutes). Prior to this patch, we never reasked clients from which we had previously received data, unless they connected to us first.
- In the BitTorrent module, increase socket timeouts to 140 seconds by default and only use the short 10-second timeout when the torrent has more than 50 sources and a transfer is in progress. This improves performance when downloading rare torrents, where peers are rare and valuable.
- When the remote client hasn't reasked us during the last 60 minutes, only drop it from the queue if it's not a source for us. This is fairer behaviour than previously, where the client was dropped from our queue, but we still expected to be in its queue.
- Fixed client's credit score calculation, which was very wrong due to a subtle code error.
- Go back to 10-sec interval for upload slot opening check - 5 seconds interval caused too many upload slots to be opened.
- When a socket times out while a download was in progress, we now correctly reschedule the next reask to T+30min. This wasn't done before, and caused us to never reask clients from which we had previously downloaded, unless they contacted us first.
- When a socket times out and the download request was only half-way done, still schedule the next reask to T+30min.
- When a socket times out and we were uploading to the client, put the client back in the queue instead of dropping its upload request completely. The former wrong behaviour was based on the wrong assumption that if we had already sent the client some data, it was no longer interested in getting anything from us; now such clients are correctly queued again.
- When a socket times out, reset the upload state as well. Not doing this caused us to upload to the same clients over and over again: when a connection is established, the code checks for the upload-state member, and if it is present, takes that as a sign that we should start uploading to the client. The effect of this wrong behaviour was that we ended up uploading hundreds of megabytes to nearly the same set of clients.
- If the remote client doesn't have any parts that we're interested in (NNP client), re-establish the connection with the client in one hour to see if it has something for us now. The former wrong behaviour never reasked the client.
- We no longer attempt to use the UDP protocol with clients which do not announce support for the extended UDP protocol.
- Disabled the early disconnection system, which attempted to do smart checks and cut the connection when both parties had sent/received all requests. This system was originally implemented as an optimization, to speed up initial source queries, where possibly thousands of clients need to be connected, and waiting 30 seconds for a socket to time out seemed a waste of time. However, as it turns out, this system was the source of a number of different bugs, some of which were patched, some of which were worked around, but at the end of the day, it causes more problems than it solves. For example, it caused erroneous disconnections in race conditions during secure ident verification, caused the last packets we sent out to be missed, resulting in eMule considering the Hydranode source as QueueFull, and many others.
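For those who prefer reading code to changelogs, here is a rough sketch of how the timeout and reask rules from the list above fit together. This is not Hydranode code: every name in it (Client, on_socket_timeout and so on) is made up for illustration, and only the numbers come from the list. The 30-second idle timeout for ed2k is an assumption based on the timeout mentioned in the last item, and treating "not reasked during the last 60 minutes" as strictly more than 60 minutes is likewise a guess.

```python
from dataclasses import dataclass
from typing import Optional

# Numbers taken from the list above; all names are hypothetical.
ED2K_TRANSFER_TIMEOUT = 2 * 60   # seconds, while a transfer is in progress
ED2K_IDLE_TIMEOUT = 30           # seconds; assumed value for the idle case
BT_DEFAULT_TIMEOUT = 140         # seconds
BT_SHORT_TIMEOUT = 10            # seconds
BT_MANY_SOURCES = 50             # "more than 50 sources"
REASK_INTERVAL = 30 * 60         # T+30min
QUEUE_STALE_AFTER = 60 * 60      # remote client silent for 60 minutes


def ed2k_socket_timeout(transferring: bool) -> int:
    """Keep valuable slots alive through eMule's "stalled" periods."""
    return ED2K_TRANSFER_TIMEOUT if transferring else ED2K_IDLE_TIMEOUT


def bt_socket_timeout(source_count: int, transferring: bool) -> int:
    """Only be impatient when the torrent has plenty of peers to spare."""
    if source_count > BT_MANY_SOURCES and transferring:
        return BT_SHORT_TIMEOUT
    return BT_DEFAULT_TIMEOUT


@dataclass
class Client:
    is_source: bool = False          # we want (or were getting) data from it
    uploading: bool = False          # we were sending data to it
    queued: bool = False             # it waits in our upload queue
    next_reask: Optional[int] = None # when we contact it again (seconds)
    last_reask_from_peer: int = 0    # when it last reasked us (seconds)


def on_socket_timeout(client: Client, now: int) -> None:
    """Apply the socket-timeout rules to one client."""
    if client.is_source:
        # Download was in progress or half-way requested: reask later.
        client.next_reask = now + REASK_INTERVAL
    if client.uploading:
        # Reset upload state and requeue instead of dropping the request.
        client.uploading = False
        client.queued = True


def should_drop_from_queue(client: Client, now: int) -> bool:
    """Drop silent clients only if they are not a source for us."""
    stale = now - client.last_reask_from_peer > QUEUE_STALE_AFTER
    return stale and not client.is_source
```

With these definitions, a client that we were both downloading from and uploading to ends up with a reask scheduled 30 minutes out, its upload state cleared, and a place back in our queue after a timeout, which is the combined effect of the socket-timeout items above.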
A very special thanks to chemical for assistance in testing and analysing the hundreds of megabytes of logfiles Hydranode produces, and for the perl scripts used for analyzing that data.
In other news, I'm running another ed2k+bt cooperative downloading test. The test torrent includes about 9 files, and hashes for the ed2k network; there are about 30-40 peers on BT for this torrent, and a total of 150 clients on ed2k for these files. It's really interesting to watch how well the ed2k and BT protocols complement each other in the case of rare files: sometimes the BT download stalls for hours, but ed2k kicks in, and sometimes vice versa, resulting in an overall steady download rate of 20-80kb/s, which for this rare stuff is pretty good.
Bottom line: There are new optimized builds available for Windows and Linux, including the BT module, which is now open for public testing. While there are some known issues with it, I think it's good enough to warrant wider testing. Additionally, a Linux debug build is available (19 MB download, over 100 MB when unpacked), and for completeness, a source code tarball. Enjoy :)
Madcat, ZzZz