[dev] New networking code based on "proactor" pattern

faramir118 · February 9, 2011

Actually, I'm not sure how this works at all. Sending packets out of order should break the RC4 steam cipher.

Working on scatter/gather support now.

Ambal · February 9, 2011

faramir118, please, remember that scatter/gather API is supported only on Vista/Windows 7 operating systems - no need to bother for these specific cases at all. It is better to use old fashioned way of dealing with data buffer merging and mining.

P.S. I've checked you last commit with scatter/gather functionality discussed above - no sights of issue with packet drop by client. Very promising

faramir118 · February 10, 2011

Now sending/receiving as much data as is available. I too haven't had any issues. (sorry if I rebased while you were looking at it )

scatter/gather API is supported only on Vista/Windows 7 operating systems

Hadn't even checked which platforms support WINSOCK2. I'll leave it as is.

Ambal · February 10, 2011

faramir118, I'll check your code tomorrow for memory leaks so community can be able to start testing your patch Thank you very much and keep up the awesome work

P.S. No memory leaks detected so far - very nice Everyone are welcome to test this patch - we want to know whether or not ping/CPU usage went down and any issues you possibly might experience.

P.P.S. I'd edited topic to reflect what this patch actually does.

faramir118 · February 10, 2011

.S. No memory leaks detected so far - very nice Everyone are welcome to test this patch - we want to know whether or not ping/CPU usage went down and any issues you possibly might experience.

I pushed a commit to implement double-buffered sending and removed the message queue, as per your suggestion on github.

outgoing packet buffers are reused
(some small overhead when buffers must be enlarged)
for sending packets, cut down the amount of memcpy calls by half.
'enlarged' the critical sections in SendPacket and handle_write_stream
(we are dealing with shared variables now)
removed unnecessary lock in ProcessIncoming

Overall, much less CPU time spent on buffers. Could maybe remove resize overhead by picking a good starting size for the buffers.

Ambal · February 10, 2011

Thank you, faramir118 . Yes, you can set starting buffer size to 64 KB - this is our default value in config.

I'll test your changes tomorrow since currently my brain hurts so bad I can't think at all Once you'll done that, is there any possibility we can make use of "Network.Threads" config value to spawn several worker thread for processing async completion notifications?

P.S. As a small advice: in AsyncSocketHandler::handle_write_stream() you should place check to see if ALL the data from m_busyBuffer was written by last write() call. If not - you should keep sending pending data until finished. Otherwise we might loose data =/

Cheers

faramir118 · February 10, 2011

Multiple threads are possible, definitely something that have been on my list of things to do.

It is accomplished via creating multiple ACE_Proactors.

edit:

For testing, should we ask for system-wide CPU utilization?

There will probably be more CPU use in kernel-space than before, just not sure how much.

Ambal · February 10, 2011

I've found some docs regarding ACE_Proactor and ACE_Reactor pattern use on *nix and Windows systems: http://stevehuston.wordpress.com/tag/ace-proactor-overlapped-asynchronous/:

So, the Proactor situation on UNIX platforms is generally not too good for demanding applications. Again, Proactor on Windows is very good, and recommended for high-performance, highly scalable networked applications. On Linux, stick to ACE_Reactor using the ACE_Dev_Poll_Reactor implementation; on other systems, stick with ACE_Reactor and ACE_Select_Reactor or ACE_TP_Reactor depending on your need for multithreaded dispatching.

Information might be outdated since there were new versions of OSes as well as ACE which might have fixed those issues. If not then there is a solution called TProactor http://www.terabit.com.au/solutions.php with detailes about its internals: http://www.terabit.com.au/docs/Comparison.htm. Also their license allows us to use it w/o any issues if we will carry on their license file

Looks like not all *nix OSes are good for true async IO

P.S. Yes, faramir118, we should ask for kernel CPU utilization params too since async IO will push more resources to kernel level.

P.P.S. We can rename your "Network.Async" config option into "Network.OldEngine", set default to "0". Create a factory method which will create a proper networking engine based on either Reactor or Proactor patterns, so users will be able to fall back to current networking code in case any issues will be spotted. The reason to try this way and not stop developing patch is:

in many cases add-on Proactor improves performance up to 2 times (compared with Reactor);

faramir118 · February 12, 2011

Many changes, restructured all of the socket code. I will work on multiple proactor threads next.

TProactor sounds interesting. If it works well, we can probably do away with the ACE_Reactor-based sockets totally.

However, it was written for a very different version of ACE, and updating it would take time.

I'd still like to hear from *nix uses about performance with default ACE implementations.

Ambal · February 12, 2011

faramir118, your code changes are great and I don't see any potential drawbacks currently. Looking forward to see multi-threaded proactor code.

Yes, TProactor looks very nice but incorporating it into mangos might be a tough job. Also in case if someone will get poor performance or issues with TProactor they will not be able to fall back to old code - mangos will become unusable =/

Anyway, cheers to you for your awesome work

faramir118 · February 12, 2011

I pushed multi-threaded proactor support.

The one change I want to make is to move the acceptor into one of the spawned threads - right now it is in its own thread.

Undergarun · February 14, 2011

P.S. No memory leaks detected so far - very nice Everyone are welcome to test this patch - we want to know whether or not ping/CPU usage went down and any issues you possibly might experience.

Are you sure?

It ate 24 GB of ram in a few seconds (5-10 seconds) after mangos starts completly and started swapping...

If i use Network.OldEngine = 1

(28661 | 140737354053392) ACE_POSIX_AIOCB_Proactor::Max Number of AIOs=1024
(28661 | 140737354053392) ACE_POSIX_AIOCB_Proactor::Max Number of AIOs=1024
(28661 | 140737354053392) ACE_POSIX_AIOCB_Proactor::Max Number of AIOs=1024
(28661 | 140737354053392) ACE_POSIX_AIOCB_Proactor::Max Number of AIOs=1024
(28661 | 140737354053392) ACE_POSIX_AIOCB_Proactor::Max Number of AIOs=1024
Failed to complete asynchronous write to 10.0.0.9

http://pastebin.com/mSAK27pa

http://pastebin.com/dRPdezcn

Ambal · February 14, 2011

Thank you, Undergarun.

It must be something weird in message block resize algorithm if your mangos process ate all your RAM - this cannot be detected as memory leak at all. Another frustrating error is "Max Number of AIOs=1024". Default value should be not lower than 4096 AIOs... But ACE code thinks in other way:

#define ACE_AIO_DEFAULT_SIZE 1024

Undergarun, what is the value of "Network.Threads" config option?

P.S. faramir118, you should test for EAGAIN or EWOULDBLOCK (whateven ACE is using to report operation overflow condition) error in case our send or receive operation wasn't successful to reschedule it. Or even create a per-proactor queue of async requests to workaround issues with max async IO request problem(?). If every ACE_Proactor object allows us on *nix 1024 async io requests, then better load balancing might help us.

Cheers

Undergarun · February 14, 2011

Undergarun, what is the value of "Network.Threads" config option?

Network.Threads = 4 but i usually set it to 6 in the 3k+ ppl

faramir118 · February 14, 2011

I pushed a 'lazier' crunch routine

Also properly handled EAGAIN/EWOULDBLOCK (seems like both are used in different circumstances), although it is just a naive retry which probably won't solve the real problem.

I need to spend some time and get more familiar with *nix aio...

faramir118 · February 14, 2011

By default in ACE, as Ambal pointed out, maximum POSIX aio list size is 1024 elements, which equates to 512 clients (since each client can have a pending read and write at the same time). This limit can be changed, although certain platforms won't cooperate.

Asynchronous read/write calls fail immediately (with EAGAIN) if there aren't enough aio slots available. While it is a recoverable error, this isn't handled currently.

Possible things that can be done:

Write some runtime checks that avoid aio on platforms that have system-wide limits or low per-process limits
Fall back on some synchronous mechanism
Fall back on some queueing mechanism
use session queue to prevent exceeding aio slot limits

Thoughts?

Ambal · February 15, 2011

Hi, faramir118.

I'd stick with "session queue to prevent exceeding aio slot limits". All you need is to:

1) push all async read/write requests into one queue

2) count amount of active requests

3) upon read/write callbacks lower the active request count

4) execute pending tasks until active count limit will reach ACE_AIO_DEFAULT_SIZE (or whatever value you will pick up).

This solution will work across all the platforms. You just need to find out if that ACE_AIO_DEFAULT_SIZE limit is applied for one ACE_Proactor object or per process. In first case implentation will be simple, in the second one it will require some more syncronisation. ACE_Atomic_Op<> will fit just fine to handle active request count.

P.S. You will definitely need to count/remove from queue requests while disconnecting a client.

Cheers

cyberium · February 15, 2011

Is EAGAIN have olso something to do with thread creation?

I mean i experienced same 1024 limitation with 'ThreadManager.spawn()' function.

So if all thread slot will be used for communication we are not being able to spawn some other to our specific task.

Just confirm me iam wrong and it's not related to thread.

Ambal · February 15, 2011

cyberium, ThreadManager.spawn() limitation has nothing to do with AIO stuff.

faramir118 · February 15, 2011

Actually, by session queue I meant when players try to log in, put them in a queue and not let them into the world.

Ambal · February 15, 2011

Actually, by session queue I meant when players try to log in, put them in a queue and not let them into the world.

ah, you are too evil, faramir118 Ok, so "some queueing mechanism" will be the best choice then

faramir118 · February 22, 2011

Pushed the queue mechanism. The general idea is there, but I will make a couple changes soon.

I want to use a priority queue, so that I can prioritize write ops.

Deterministically speaking, write is a much more reliable operation - read ops never timeout, so if the internal ACE aiocb list is full of reads we may deadlock until a client wakes us up.

PS: I hope the code and comments make sense... [asynchronous (de)multiplexer] layered on [asynchronous event demupliplexing] will be a nightmare for whoever wants to read this code sometime in the future.

Ambal · February 22, 2011

Hi, faramir118. I'm glad to hear that you continue working on this patch Priority queue sounds very good to me - indeed, writes should have more priority than reads.

Cheers

faramir118 · March 2, 2011

Decided to try a little test:

750 clients, idle at character screen (my c# client, so not even CMSG_PING)

I think it shows the weakness of select-based servers, and that for large servers there is some performance gain to be found.

I'll work on having the clients send data (ping or some invalid opcode w/ data). This should give a more realistic comparison since we will get to test completion callback vs fdset scanning (and also will ignore the 'overhead' of the mangos core )

Download results here.

PS: found some crashes related to outstanding async operations which dereference already-deleted sockets, so I will have to work on that as well.

PPS: should have probably explained the images...

For the CPU graphs, look at the graph and not the label.
The label is the % at the moment I took the picture, which could have been during a context switch, etc.
There is a CPU spike when the clients start connecting, and you can use the private bytes graph's slope to judge connection rate
The select/debug graph is off a bit, clients were running slow due to attached debugger
Thus, actual connection rate is probably higher
If there are any questions, just ask!

leak · March 2, 2011

Been following your changes for a while now.

I don't know how "known" this is but both mangos (i only verified for realmd on win) and TC suffer from fd/handler limit issues. Once the fd limit is reached the servers become unconnectable and this condition persists even if the number of connections being open drops below the fd limit.

On Unix (where most likely ACE_Dev_Poll_Reactor is in use) ACE triggered epoll code goes into endless loop and never recovers unless the app is killed. Derex wrote a patch overloading the error handling and suspending the reactor to prevent this loop (https://github.com/derex/TrinityCore/commit/5f50d8c20f47bcf73ce6fce09e2e98bc8fe1ccbe).

On Win no solution has been found so far as the ACE_TP_Reactor which is used there doesn't really trigger error handling as Unix/ACE_Dev_Poll_Reactor does.

Long story short, you might want to verify if your proactor implementation suffers from the same problem.

Shooting something like this at the servers might reveal that:

#!/usr/bin/perl
use IO::Socket;
my @sockarr;

for(my $i = 0; $i <= 5000; $i++)
{
   if ($sockarr[$i] = IO::Socket::INET->new(Proto=>"tcp",PeerAddr=>"127.0.0.1",PeerPort=>"3724"))
   {
       print "connection: ${i} successful\\n";
   } else {
       print "connection: ${i} failed\\n";
   }
}

print "end";
sleep 10000;

On Win mangos is using FD_SETSIZE 4096 from what i've seen, so this script might need to be started multiple times as Perl has a handler limit below 4096.

Sign In

[dev] New networking code based on "proactor" pattern

Recommended Posts

Top Posters In This Topic

Popular Days

Top Posters In This Topic

Popular Days

Recently Browsing 0 members

Contact Us

Repositories

Important Information