  #1 (permalink)  
Old 02-27-2003
Ken Carpenter Ken Carpenter is offline
Registered User
 
 
Join Date: Feb 2003
Location: Vancouver, B.C., Canada
Posts: 31
Server Scalability & Asynchronous IO

It is my understanding that using select() with a large FD_SET is not very scalable. As a result, all the network servers I've written have used asynchronous IO and completion ports (on Windows obviously).

What mechanism does ICE use to allow scalability to thousands of simultaneous network connections? I can't imagine it uses a thread-per-connection model, and using grep I can't find any evidence of completion ports.

Thanks,


Ken Carpenter
  #2 (permalink)  
Old 02-27-2003
marc's Avatar
marc marc is offline
ZeroC Staff
 
Name: Marc Laukien
Organization: ZeroC, Inc.
Project: The Internet Communications Engine
 
Join Date: Feb 2003
Location: Florida
Posts: 1,772
Re: Server Scalability & Asynchronous IO

Quote:
Originally posted by Ken Carpenter
It is my understanding that using select() with a large FD_SET is not very scalable. As a result, all the network servers I've written have used asynchronous IO and completion ports (on Windows obviously).
That's right, and that's why we have a special optimization for WIN32. Have a look at ThreadPool.cpp. Search for the following comment:

//
// Optimization for WIN32 specific version of fd_set. Looping with a
// FD_ISSET test like for Unix is very unefficient for WIN32.
//

Quote:
Originally posted by Ken Carpenter
What mechanism does ICE use to allow scalability to thousands of simultaneous network connections? I can't imagine it uses a thread-per-connection model, and using grep I can't find any evidence of completion ports.
We use a thread pool model, using the leader-follower pattern. This means that the number of threads being used doesn't increase with the number of connections. Again, if you are interested in the details, have a look at ThreadPool.cpp.
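
As a rough illustration of the leader/followers idea described above, here is a minimal sketch. It is not the code in ThreadPool.cpp: `EventSource` and `LeaderFollowerPool` are made-up names, a queue stands in for the blocking select() call, and "dispatching" only bumps a counter. The point it tries to show is that a fixed number of threads take turns being the one thread that waits for events.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for "block in select() until a connection is readable".
class EventSource {
public:
    void push(int connectionId) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            events_.push(connectionId);
        }
        ready_.notify_one();
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until an event arrives; returns false once closed and drained.
    bool wait(int& connectionId) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !events_.empty(); });
        if (events_.empty()) {
            return false;
        }
        connectionId = events_.front();
        events_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> events_;
    bool closed_ = false;
};

class LeaderFollowerPool {
public:
    LeaderFollowerPool(EventSource& source, std::size_t threadCount) : source_(source) {
        assert(threadCount > 0);
        for (std::size_t i = 0; i < threadCount; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    ~LeaderFollowerPool() { join(); }

    void join() {
        for (std::thread& worker : workers_) {
            if (worker.joinable()) {
                worker.join();
            }
        }
    }

    int handled() const { return handled_.load(); }

private:
    void run() {
        for (;;) {
            int connectionId = 0;
            bool gotEvent = false;
            {
                // Whoever holds leaderToken_ is the leader; the rest are followers.
                std::lock_guard<std::mutex> leader(leaderToken_);
                gotEvent = source_.wait(connectionId);
            }  // Releasing the token promotes one follower to leader.
            if (!gotEvent) {
                return;
            }
            handled_.fetch_add(1);  // "Dispatch" the request outside the leader role.
        }
    }

    EventSource& source_;
    std::mutex leaderToken_;
    std::atomic<int> handled_{0};
    std::vector<std::thread> workers_;
};
```

To try it, create an `EventSource`, construct a pool with a few threads, push some connection ids, then call `close()` followed by `join()`. In this sketch the source must be closed before the pool is destroyed, otherwise the destructor waits forever.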

Furthermore, Ice uses "Active Connection Management" (ACM): Connections which have been idle for a certain time are automatically closed (gracefully, so that no messages get lost). When the connection is needed again, it is reestablished. (ACM is optional and can be switched on or off using configuration parameters.)
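
The ACM behaviour described above (close after an idle period, re-establish on demand) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Ice's implementation: `ConnectionManager`, `Connection`, and the timeout handling are hypothetical, the graceful shutdown handshake is not modelled, and the current time is passed in explicitly so the behaviour is easy to follow.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical connection record; a real one would own a socket.
struct Connection {
    bool open = false;
    Clock::time_point lastUsed;
    int timesEstablished = 0;
};

class ConnectionManager {
public:
    explicit ConnectionManager(std::chrono::milliseconds idleTimeout)
        : idleTimeout_(idleTimeout) {
        assert(idleTimeout.count() > 0);
    }

    // Returns an open connection, re-establishing it if it was closed earlier.
    Connection& acquire(const std::string& endpoint, Clock::time_point now) {
        Connection& connection = connections_[endpoint];
        if (!connection.open) {
            connection.open = true;
            ++connection.timesEstablished;
        }
        connection.lastUsed = now;
        return connection;
    }

    // Closes every connection that has been idle for at least the timeout.
    int closeIdle(Clock::time_point now) {
        int closed = 0;
        for (auto& entry : connections_) {
            Connection& connection = entry.second;
            if (connection.open && now - connection.lastUsed >= idleTimeout_) {
                connection.open = false;
                ++closed;
            }
        }
        return closed;
    }

    int openCount() const {
        int count = 0;
        for (const auto& entry : connections_) {
            if (entry.second.open) {
                ++count;
            }
        }
        return count;
    }

private:
    std::chrono::milliseconds idleTimeout_;
    std::map<std::string, Connection> connections_;
};
```

A caller would invoke `closeIdle()` periodically and `acquire()` whenever a request needs the connection; the caller never has to know whether the connection was closed in between.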
  #3 (permalink)  
Old 02-27-2003
Ken Carpenter
I just whipped out my copy of POSA2 to review the Leader/Followers pattern.

Do you have any plans to change ICE to use WaitForMultipleObjects() on Windows?

Thanks,


Ken Carpenter
  #4 (permalink)  
Old 02-27-2003
marc
Quote:
Originally posted by Ken Carpenter
I just whipped out my copy of POSA2 to review the Leader/Followers pattern.

Do you have any plans to change ICE to use WaitForMultipleObjects() on Windows?

Thanks,

Ken Carpenter
I don't see any benefit in using WaitForMultipleObjects(). I usually try to avoid non-standard (i.e., non-POSIX) calls unless they provide some sort of significant benefit.

Note that a change to WaitForMultipleObjects() would have far-reaching consequences. For example, all the transport plugins are currently select()-able. I guess we would have to make them compatible with WaitForMultipleObjects() then. This also raises the question of whether third-party libraries, such as OpenSSL, can be used without modifications.
  #5 (permalink)  
Old 02-27-2003
Ken Carpenter
What would you say is the limit for the maximum number of connections per server, which still leaves enough processor time to actually do work?

Obviously this varies with the machine in question and with the nature of the request, but a ballpark figure would be helpful, or if you have statistics for a particular server or request type (e.g., a simple database query/response).

Thanks,


Ken Carpenter
  #6 (permalink)  
Old 02-27-2003
marc
Quote:
Originally posted by Ken Carpenter
What would you say is the limit for the maximum number of connections per server, which still leaves enough processor time to actually do work?

Obviously this varies with the machine in question and with the nature of the request, but a ballpark figure would be helpful, or if you have statistics for a particular server or request type (e.g., a simple database query/response).
That's really difficult to say. The connections alone are probably not the problem. (Although you would have to configure your system so that it allows enough connections. For example, the Windows default for WaitForMultipleObjects() is just 64.)

If all these connections are busy all the time, then it really depends on how much processing needs to be done for the requests arriving over these connections. To give an estimate on the number of connections is impossible in this case, without knowing more about the request processing.

If only a few connections are busy at the same time, 10,000 connections shouldn't be a problem. Of course, to save resources, I would recommend Active Connection Management, so that idle connections are closed and re-established on demand.

In general, I would avoid designs which require huge numbers of simultaneous connections, by using a multi-tier architecture.
  #7 (permalink)  
Old 02-28-2003
matthew's Avatar
matthew matthew is offline
ZeroC Staff
 
Name: Matthew Newhook
Organization: ZeroC, Inc.
Project: Internet Communications Engine
 
Join Date: Feb 2003
Location: NL, Canada
Posts: 1,001
Quote:
For example, the Windows default for WaitForMultipleObjects() is just 64
Unless this recently changed, I don't think 64 handles is a default - I think 64 is a hard limit!

Regards, Matthew
  #8 (permalink)  
Old 02-28-2003
Ken Carpenter
That's right, Matthew. You can wait on at most 64 event sources (e.g., sockets) per thread with WaitForMultipleObjects(). So if you need more than 64, you need multiple threads.

The default FD_SETSIZE on Windows is also 64, but this can be changed with a #define before including the Winsock header.
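
For reference, raising that limit looks roughly like the sketch below. The value 2048 is only an illustrative number, and `fdSetCapacity()` is a hypothetical helper added to show what the translation unit ends up with. The define has to appear before the first inclusion of the Winsock header; on most POSIX systems the header fixes FD_SETSIZE, so the sketch leaves it alone there.

```cpp
// On Windows, FD_SETSIZE must be defined before the Winsock header is included.
#ifdef _WIN32
#  define FD_SETSIZE 2048      // illustrative value, not a recommendation
#  include <winsock2.h>
#else
#  include <sys/select.h>      // POSIX headers generally fix FD_SETSIZE themselves
#endif

// Capacity of one fd_set as seen by this translation unit.
constexpr int fdSetCapacity() { return FD_SETSIZE; }
```

Every source file that touches an fd_set needs to see the same definition, so in practice the define usually goes into a common header or the compiler flags.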

One thing I still don't understand is how Ice avoids the overhead of checking which of, say, 2000 sockets is readable. It looks to me like you loop over the whole array of handles in the ThreadPool.cpp code.

Can you tell me where I'm going wrong in the following scenario:

- server hosts an Ice object and is waiting in select()
- 2000 clients are connected to the server, therefore there are 2000 socket handles (accept() created one for each connection)
- one client calls a method in the Ice object
- server receives data and so select() returns
- ThreadPool.cpp iterates over, on average, 2000/2 handles to determine which one is ready
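
The scenario in the list above can be sketched for a POSIX system as follows. This is only an illustration of a select() call followed by a linear FD_ISSET scan, not the code in ThreadPool.cpp; `waitAndScan` and `ScanResult` are hypothetical names, and the `checks` counter exists purely to make the cost of the scan visible.

```cpp
#include <sys/select.h>
#include <unistd.h>

#include <cassert>
#include <vector>

struct ScanResult {
    int readyFd;  // first readable descriptor found, or -1
    int checks;   // number of FD_ISSET tests performed
};

// Blocks in select() on all descriptors, then scans them one by one.
ScanResult waitAndScan(const std::vector<int>& fds) {
    ScanResult result{-1, 0};
    if (fds.empty()) {
        return result;
    }

    fd_set readSet;
    FD_ZERO(&readSet);
    int maxFd = -1;
    for (int fd : fds) {
        assert(fd >= 0 && fd < FD_SETSIZE);
        FD_SET(fd, &readSet);
        if (fd > maxFd) {
            maxFd = fd;
        }
    }

    // select() rewrites readSet so that only readable descriptors stay set.
    if (select(maxFd + 1, &readSet, nullptr, nullptr, nullptr) <= 0) {
        return result;
    }

    // The portable way to find them is to test every descriptor in turn.
    for (int fd : fds) {
        ++result.checks;
        if (FD_ISSET(fd, &readSet)) {
            result.readyFd = fd;
            break;
        }
    }
    return result;
}
```

With N descriptors and a single ready one, the scan performs between 1 and N tests depending on where the ready descriptor sits in the list.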

With WaitForMultipleObjects(), the index of the handle/socket that satisfied the wait condition is returned (or the one in the array with the lowest index if more than one handle satisfied the wait condition). There is, therefore, no need to iterate over the handles to check for readiness.

I suspect I have missed something somewhere, since you guys seem to know what you're doing. Can you set me straight?


Ken Carpenter
  #9 (permalink)  
Old 02-28-2003
marc
Have a look at the optimization for WIN32 in ThreadPool.cpp. We don't loop over 2000 connections for WIN32, but only over the connections which are marked as readable. So there is no 2000-loop for WIN32.

In this case, WIN32 is faster than our Linux implementation, because we can make use of the known format of the WIN32 struct fd_set. However, even under Linux, if you have 2000 connections open simultaneously, a tight 2000-loop should be the least of your worries.
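
To make the "known format" point concrete: the Winsock fd_set is a count followed by an array of sockets, and after select() returns it holds only the ready ones. The sketch below imitates that layout with a hypothetical `WinStyleFdSet` so that it compiles anywhere; it is not the real Winsock type and not the code in ThreadPool.cpp, and `addSocket` and `forEachReady` are made-up helpers.

```cpp
#include <cassert>
#include <vector>

constexpr unsigned int kCapacity = 64;

// Hypothetical stand-in mirroring the layout of Winsock's fd_set:
// a count followed by an array of socket handles.
struct WinStyleFdSet {
    unsigned int fd_count = 0;
    int fd_array[kCapacity] = {};
};

// Appends a socket, as select() would leave a ready socket in the set.
void addSocket(WinStyleFdSet& set, int socket) {
    assert(set.fd_count < kCapacity);
    set.fd_array[set.fd_count++] = socket;
}

// Visits only the entries left in the set, so the cost depends on the
// number of ready sockets, not on the number of open connections.
template <typename Handler>
unsigned int forEachReady(const WinStyleFdSet& set, Handler handle) {
    unsigned int visited = 0;
    for (unsigned int i = 0; i < set.fd_count; ++i) {
        handle(set.fd_array[i]);
        ++visited;
    }
    return visited;
}
```

If two sockets out of 2000 are readable, the loop body runs twice, whereas a portable FD_ISSET scan would test descriptors until it had found both.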

Any server that needs so many connections should use ACM (Active Connection Management), so that idle connections are closed, to save resources.

If all 2000 connections are busy all the time, so that no idle connections can be closed, then a tight 2000-loop would be even less relevant, because the processing time is what counts. You would need a very fast machine for such a case.

Finally, you can lower the number of handles to loop over in Linux, by simply using multiple thread pools. In Ice, you can give each object adapter a separate thread pool (optional), so if you have 2,000 connections spread evenly over 20 adapters with separate pools, then you only need to loop over 100 handles in each of them - just like with multiple threads calling WaitForMultipleObjects().
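
The arithmetic above can be checked with a small sketch. It assumes connections are spread evenly, which is modelled here with a round-robin assignment; in a real server the split depends on which adapter's endpoint each client connects to. `partitionConnections` and `largestPool` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assigns connection ids to pools round-robin (an assumption for illustration).
std::vector<std::vector<int>> partitionConnections(int connections, std::size_t pools) {
    assert(connections >= 0 && pools > 0);
    std::vector<std::vector<int>> result(pools);
    for (int id = 0; id < connections; ++id) {
        result[static_cast<std::size_t>(id) % pools].push_back(id);
    }
    return result;
}

// Number of handles the busiest pool has to loop over.
std::size_t largestPool(const std::vector<std::vector<int>>& pools) {
    std::size_t largest = 0;
    for (const std::vector<int>& pool : pools) {
        largest = std::max(largest, pool.size());
    }
    return largest;
}
```

For 2,000 connections and 20 pools, `largestPool(partitionConnections(2000, 20))` is 100; an uneven split would leave some pools with more handles to scan than others.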
  #10 (permalink)  
Old 02-28-2003
andreynech andreynech is offline
Registered User
 
Name: Andrey Nechypurenko
Organization: Siemens AG
Project: remotely controlled vehicle
 
Join Date: Feb 2003
Location: Munich, Germany
Posts: 36
Hi Marc,

My attention was caught by the following phrase of yours:

> In Ice, you can give each object adapter a separate thread pool
I could not remember the relevant part of the documentation about this feature. Did I just overlook it, so that I should review the documentation more carefully, or is it an undocumented feature? If the latter, could you please point me to examples (if any) and tell me whether it is possible to set the priority for each thread pool? Maybe it is also possible to create priority lanes?

Thank you,
Andrey.
  #11 (permalink)  
Old 02-28-2003
marc
It is documented, but not very prominently:

Quote:
C.3 Ice Object Adapter Properties

[...]

name.ThreadPool.Size

Synopsis

name.ThreadPool.Size=num

Description

If num is set to a value larger than zero, the object adapter creates its own, private thread pool with num threads for dispatching requests. This is useful to ensure that a minimum number of threads is available for dispatching requests on certain Ice objects, in order to avoid deadlocks because of thread starvation.
We still must write a chapter that covers the object adapter and all its configuration parameters in detail.

Regarding the priority lanes: Yes, I think we could fairly easily add such a feature. This would just be an additional thread pool configuration parameter, I guess.
  #12 (permalink)  
Old 03-02-2003
Ken Carpenter
Quote:
Originally posted by marc
Have a look at the optimization for WIN32 in ThreadPool.cpp. We don't loop over 2000 connections for WIN32, but only over the connections which are marked as readable. So there is no 2000-loop for WIN32.
Ahhh. Now I see why I was confused. I didn't realize the array being iterated there was only a list of ready handles! Doh!

Thanks for the clarification.


Ken Carpenter