ZeroC Forums > Help Center

#1
03-11-2008
pdb1013
Registered User
 
Name: Peter Brandt
Organization: Jump Trading
Project: windows ui for linux app
 
Join Date: Mar 2008
Posts: 6
possible thread starvation issue, proxy hangs

in my application there is an mfc ui and a linux server app. normally, the server just updates the ui via proxy, but occasionally the ui thread needs to call something on the server because the user changed something. all callbacks are protected by mutexes and 99% of the time everything works fine, but once in a while when many things are happening at once, a call on one of the proxies will throw an exception or the proxy function call will execute but the proxy just hangs. the mutex situation on the callbacks has been double and triple checked and looks fine.

i was under the impression that with the default config all calls are basically serialized, but it seems like when many things are happening at once ice has problems. by the way, there are no nested callbacks anywhere. will just upping the Ice.ThreadPool.Server.SizeMax or Ice.ThreadPool.Client.SizeMax have a chance to solve this? what reason could there be for a proxy to hang in the ui after seemingly successfully executing the call on the server side? if too many requests come in too little time, will the proxies ever throw exceptions or will they just wait? what is this thread starvation/deadlock risk? is that even possible without nested callbacks? any help much appreciated.

thanks,
peter
#2
03-11-2008
benoit
ZeroC Staff
 
Name: Benoit Foucher
Organization: ZeroC, Inc.
Project: Ice
 
Join Date: Feb 2003
Location: Rennes, France
Posts: 1,541
Hi,

Which Ice version do you use?

Increasing the number of thread pool threads could be a solution if the problem is caused by thread starvation, but before doing this you should confirm that this is the case.

If the client request still hangs after it was dispatched by the server and the server sent the response, this usually indicates that the client thread pool thread is busy doing something else instead of listening for the outgoing connection and reading the server response. This can occur if you're using bidirectional connections or AMI. Is it the case?

The best way to investigate deadlock or hang issues is to attach to the process with the debugger and check the stack trace of each thread. If you post the traces here, we'll be happy to take a look.

Cheers,
Benoit.
#3
03-11-2008
pdb1013
I use Ice 3.2.0 and unfortunately don't have the debug info right now. I am not using bidirectional connections or AMI. What is happening is that several proxies will try to contact the ui at the same time, possibly coinciding with a call to ice_ping on a separate thread. It seems there is a pattern where if too many calls are made on different proxies at once, the ui side hangs. When the ui is closed, all the backed up calls are executed at once on the server side. For some reason the server seems to handle this gracefully once the ui exits, but the ui hangs. Is there any way too many calls on proxies at the same time can cause ice to hang?

There is a possibly related issue where on startup, many proxies are registered at once by the ui. Each proxy has its own thread in the ui. Rarely, only a few will register and then ice will hang. If thread starvation could be a possible cause, which should I change: the --Ice.ThreadPool.Client or the Server variables?

Thanks,
Peter
#4
03-11-2008
matthew
ZeroC Staff
 
Name: Matthew Newhook
Organization: ZeroC, Inc.
Project: Internet Communications Engine
 
Join Date: Feb 2003
Location: NL, Canada
Posts: 1,060
I'm sorry Peter but I'm afraid I don't understand the above explanation because the terminology is a bit mixed up. Clients call on Ice objects hosted in servers using a proxy.

Quote:
There is a possibly related issue where on startup, many proxies are registered at once by the ui.
What do you mean by that? Do you mean that the UI calls on the server to register its callback objects?

Quote:
Each proxy has its own thread in the ui.
I'm afraid I don't understand what you mean here either. Is this a thread that you allocate? What does this thread do, and why do you want to devote a thread to a proxy?

Quote:
Rarely, only a few will register and then ice will hang. If thread starvation could be a possible cause, which should I change, the --Ice.ThreadPool.Client or Server variables.
It sounds like perhaps what is occurring is that you sometimes get a callback on one of the previously registered objects prior to all callbacks being registered. That is typically the client does:

Code:
foreach callback proxy:
   server->register(callback)
And this typically completes with no callback being made prior to the entire group being registered. However, if a callback is made during the registration process then it causes a hang.

Is the object adapter activated in the UI at the point that you are making these calls? If not, then the calls from the server will hang blocking the calling thread. If the callbacks are made from threads allocated from the server side thread pool, and there is only a single thread in that pool then no further invocations can be handled and all calls on the server will block.

The solution here is to either:
- increase the size of the server side thread pool (set Ice.ThreadPool.Server.Size to some number > 1)
OR
- make the callbacks to the UI using some thread other than a thread from the server side thread pool. You can do this using a work-queue -- see demo/IceUtil/workqueue for an example.
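The second option can be sketched roughly as follows. This is only an illustration in standard C++ (std::thread and std::condition_variable), not the code from demo/IceUtil/workqueue, and the class name CallbackQueue and its post() method are hypothetical. The idea is that the dispatch thread only enqueues the callback and returns, while a separate worker thread makes the possibly blocking call to the UI.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper: runs queued callbacks on its own worker thread so
// that the thread that queued them is never blocked by a slow peer.
class CallbackQueue
{
public:
    CallbackQueue() : _done(false), _worker([this] { run(); }) {}

    ~CallbackQueue()
    {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _done = true;
        }
        _cond.notify_one();
        _worker.join(); // the worker drains remaining tasks, then exits
    }

    CallbackQueue(const CallbackQueue&) = delete;
    CallbackQueue& operator=(const CallbackQueue&) = delete;

    // Called from a dispatch thread: only enqueues, never calls the peer.
    void post(std::function<void()> task)
    {
        assert(task); // an empty std::function would throw when invoked
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _tasks.push_back(std::move(task));
        }
        _cond.notify_one();
    }

private:
    void run()
    {
        for(;;)
        {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(_mutex);
                _cond.wait(lock, [this] { return _done || !_tasks.empty(); });
                if(_tasks.empty())
                {
                    return; // _done is set and nothing is left to run
                }
                task = std::move(_tasks.front());
                _tasks.pop_front();
            }
            task(); // the blocking call happens outside the lock
        }
    }

    // Declaration order matters: _worker must be constructed last.
    std::mutex _mutex;
    std::condition_variable _cond;
    std::deque<std::function<void()>> _tasks;
    bool _done;
    std::thread _worker;
};
```

With something like this, a servant operation would post a small lambda that invokes the client proxy, instead of invoking the proxy directly from the dispatch thread.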

You might also want to review your UI code to ensure that you are not updating the UI directly from callbacks. This is, in general, not safe! I wrote a series of 4 articles on integrating UIs with Ice, starting in issue 12 of Connections - http://www.zeroc.com/newsletter/issue12.pdf. You might also want to look at our bundled MFC demo - demo/Ice/MFC.
#5
03-12-2008
pdb1013
Thanks for the response and here are some answers to your questions

Quote:
What do you mean by that? Do you mean that the UI calls on the server to register its callback objects?
yes, a typical interface looks like this

Code:
// Kicker

interface IKickerInstanceClient {
    void updateParameters(XMLKickerElement parameters, bool playSound);
};

interface IKickerInstance {
    void registerClient(IKickerInstanceClient* client);
    void updateParameters(XMLKickerElement parameters);
    XMLKickerElement getParameters();
};
sequence<Object*> KickerInstanceSeq;

the server side implements IKickerInstance and client implements IKickerInstanceClient.

Quote:
I'm afraid I don't understand what you mean here either. Is this a thread that you allocate? What does this thread do, and why do you want to devote a thread to a proxy?
each dialog is modal and contained in a class. there is a thread that runs the modal dialog and terminates on exit. this way dialogs can be created and destroyed dynamically by non-mfc code. each ui has a corresponding thread and proxy to its server.

Quote:
It sounds like perhaps what is occurring is that you sometimes get a callback on one of the previously registered objects prior to all callbacks being registered. That is typically the client does:
hmm, quite possibly. this still doesn't explain the random times ice hangs after things have been working right for hours.

Quote:
You might also want to review your UI code to ensure that you are not updating the UI directly from callbacks.
i only "invalidate" in mfc terms on callbacks. this returns immediately and just tells windows to repaint the next time it loops around. i think the problem lies somewhere in the fact that there are mutexes for basically every callback and in some other parts of the ui code to prevent the parameters being passed back and forth from being corrupted. i can't see any way of taking the mutexes out of the ui code without the risk that ice and the ui will try to edit the parameters at the same time. once again, the mutex code has been double checked and works properly 99% of the time.

possibly there are just too many threads trying to use ice at once which causes ice to hang on rare occasions? some of the callbacks do a little computation and all of them have mutexes so they aren't lightning fast. either way i have upped the maxsize of the threadpools on both ui and server without any issues...
#6
03-12-2008
pdb1013
Also, another thing I forgot to explicitly mention is that each server instance calls ice_ping once per second. This means every second at basically the same time ~20 ice_pings are attempted (1 per proxy), each by a different thread. The function to call the ping is protected by a mutex which I'm now realizing is probably unnecessary because the proxy is threadsafe. Just to clarify, there is only one mutex per server and one mutex per client instance, but it is used to protect almost every function because the data is shared intra-object. Does this seem like a likely candidate to be causing the rare ice hanging?
#7
03-12-2008
matthew
Quote:
Originally Posted by pdb1013
You might also want to review your UI code to ensure that you are not updating the UI directly from callbacks.

i only "invalidate" in mfc terms on callbacks. this returns immediately and just tells windows to repaint the next time it loops around. i think the problem lies somewhere in the fact that there are mutexes for basically every callback and in some other parts of the ui code to prevent the parameters being passed back and forth from being corrupted. i can't see any way of taking the mutexes out of the ui code without the risk that ice and the ui will try to edit the parameters at the same time. once again, the mutex code has been double checked and works properly 99% of the time.

possibly there are just too many threads trying to use ice at once which causes ice to hang on rare occasions? some of the callbacks do a little computation and all of them have mutexes so they aren't lightning fast. either way i have upped the maxsize of the threadpools on both ui and server without any issues...
Too many threads using Ice will not cause random hangs. Hangs are most typically caused by deadlocks in your code (thread A locking mutex M1 and then trying to acquire M2, while thread B has locked M2 and is trying to acquire M1).
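To make the M1/M2 scenario concrete, here is a small sketch in standard C++ rather than IceUtil; the struct Shared and the function names are made up for illustration. Two functions need the same pair of mutexes but name them in opposite orders. With two plain lock() calls each, that is exactly the deadlock described above; acquiring both through std::lock avoids it because std::lock uses a deadlock-avoidance algorithm.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical shared state guarded by two mutexes (M1 and M2 above).
struct Shared
{
    std::mutex m1;
    std::mutex m2;
    int a;
    int b;
    Shared() : a(0), b(0) {}
};

// Moves n units from a to b. std::lock acquires both mutexes without
// deadlocking, so the order of its arguments does not matter.
void moveAToB(Shared& s, int n)
{
    assert(n >= 0);
    std::lock(s.m1, s.m2);
    std::lock_guard<std::mutex> g1(s.m1, std::adopt_lock);
    std::lock_guard<std::mutex> g2(s.m2, std::adopt_lock);
    s.a -= n;
    s.b += n;
}

// Same pair of mutexes, named in the opposite order. Written as
// s.m2.lock(); s.m1.lock(); this could deadlock against moveAToB.
void moveBToA(Shared& s, int n)
{
    assert(n >= 0);
    std::lock(s.m2, s.m1);
    std::lock_guard<std::mutex> g2(s.m2, std::adopt_lock);
    std::lock_guard<std::mutex> g1(s.m1, std::adopt_lock);
    s.b -= n;
    s.a += n;
}

// Runs both directions concurrently and returns once both threads finish.
void hammer(Shared& s, int rounds)
{
    std::thread t1([&s, rounds] { for(int i = 0; i < rounds; ++i) moveAToB(s, 1); });
    std::thread t2([&s, rounds] { for(int i = 0; i < rounds; ++i) moveBToA(s, 1); });
    t1.join();
    t2.join();
}
```

The other common fix is simply to agree on one global order (always M1 before M2) everywhere in the code.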

Quote:
Also, another thing I forgot to explicitly mention is that each server instance calls ice_ping once per second. This means every second at basically the same time ~20 ice_pings are attempted (1 per proxy), each by a different thread. The function to call the ping is protected by a mutex which I'm now realizing is probably unnecessary because the proxy is threadsafe.
What is this ping for? To detect the client going away by the server? Since you are sending callbacks why do you need to do that? You'll know the client has disappeared when the callback fails. If the ping is there for the client to detect the server going away you should probably ping from the client side.

At any rate, 20 pings a second is certainly excessive. If this ping from the server is really necessary you should probably look to move to a session model, where the session is responsible for pinging. Look at demo/Ice/session for an example.

Quote:
Just to clarify, there is only one mutex per server and one mutex per client instance, but it is used to protect almost every function because the data is shared intra-object. Does this seem like a likely candidate to be causing the rare ice hanging?
No, it doesn't sound likely. The best way to find out the reason for the hang is to break your application in a debugger when the hang occurs. Then you will find out exactly what is occurring.
#8
03-13-2008
pdb1013
Quote:
Too many threads using Ice will not cause random hangs. Hangs are most typically caused by deadlocks in your code (thread A locking mutex M1 and then trying to acquire M2, while thread B has locked M2 and is trying to acquire M1).
What I am going to do to address this is change everything involving the ui mutex to use trylock instead of lock and just return safely if the lock isn't acquired. This way, only the server mutex is blocking and if there is ever deadlock it will have to be on the server. What is the proper way to use the trylock helper object? If I just use IceUtil::Mutex::TryLock lock(uimutex_); how am I to tell whether the lock was acquired? Is there no exception-safe implementation of trylock?


Quote:
What is this ping for? To detect the client going away by the server? Since you are sending callbacks why do you need to do that? You'll know the client has disappeared when the callback fails. If the ping is there for the client to detect the server going away you should probably ping from the client side.

At any rate, 20 pings a second is certainly excessive. If this ping from the server is really necessary you should probably look to move to a session model, where the session is responsible for pinging. Look at demo/Ice/session for an example.
The ping is so the server can detect the client going away. The computers running the server and client are in separate locations and if the connection goes down the server needs to know immediately. I will definitely look into the session model as an alternative. Is there any possibility of a ping and proxy call occurring at the same time causing problems? Thanks again.
#9
03-14-2008
matthew
Quote:
Originally Posted by pdb1013
...
What I am going to do to address this is change everything involving the ui mutex to use trylock instead of lock and just return safely if the lock isn't acquired. This way, only the server mutex is blocking and if there is ever deadlock it will have to be on the server. What is the proper way to use the trylock helper object? If I just use IceUtil::Mutex::TryLock lock(uimutex_); how am I to tell whether the lock was acquired? Is there no exception-safe implementation of trylock?
You should call acquired on the TryLock object to find out whether the lock was obtained.

Code:
void
dosomething()
{
   IceUtil::Mutex::TryLock lock(_mut);
   if(!lock.acquired())
   {
       // lock was not acquired.
       return;
   } 
}
However, this does not sound like a very good solution. Surely you don't want to lose updates from the server? If I were you I would figure out really why you are getting unexpected deadlocks and fix the source of the problem.

Quote:
The ping is so the server can detect the client going away. The computers running the server and client are in separate locations and if the connection goes down the server needs to know immediately. I will definitely look into the session model as an alternative.
Typically you would ping from the client to the server, and use a timeout on the server side to detect the client disappearing. See demo/Ice/session for an example.
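The server-side half of that idea might look roughly like the sketch below. It is not taken from demo/Ice/session; the class SessionTable and its methods are hypothetical, and it uses only standard C++. Each client keep-alive refreshes a timestamp, and a periodic reaper removes every session that has not been refreshed within the timeout.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <mutex>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical session bookkeeping: records when each client last pinged.
class SessionTable
{
public:
    explicit SessionTable(Clock::duration timeout) : _timeout(timeout)
    {
        assert(timeout > Clock::duration::zero());
    }

    // Called whenever a client's keep-alive (ping) arrives.
    void refresh(const std::string& id, Clock::time_point now)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        _lastSeen[id] = now;
    }

    // Called periodically, e.g. by a reaper thread. Removes and returns the
    // ids of all sessions that have been silent for longer than the timeout.
    std::vector<std::string> reap(Clock::time_point now)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        std::vector<std::string> expired;
        for(auto it = _lastSeen.begin(); it != _lastSeen.end();)
        {
            if(now - it->second > _timeout)
            {
                expired.push_back(it->first);
                it = _lastSeen.erase(it);
            }
            else
            {
                ++it;
            }
        }
        return expired;
    }

private:
    const Clock::duration _timeout;
    std::mutex _mutex;
    std::map<std::string, Clock::time_point> _lastSeen;
};
```

The time point is passed in explicitly only to keep the sketch easy to test; a real reaper would normally pass Clock::now(). The server then makes no outgoing ping calls at all and just stops sending callbacks to the sessions that reap() returns.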

Quote:
Is there any possibility of a ping and proxy call occurring at the same time causing problems? Thanks again.
Ice has no problems with concurrent calls.
#10
03-14-2008
pdb1013
I am making the proxy call return boolean to inform the server if the update was successful. If not it can deal with it in a sensible way (reset the flag that the client needs to be updated and try again in a bit). I feel problems like this are inherently very difficult to reproduce and track down especially when things get mixed in with ui/windows code. The proxy hang/deadlock occurs about once a week in code that runs 24/7 and makes several calls on proxies per second. If you can use a design pattern from the start where you know deadlock is impossible isn't that a very good solution? Either way, this has been a big help and keep up the good work.
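A rough sketch of the pattern described here, with both sides collapsed into one process for illustration. std::mutex and std::unique_lock with std::try_to_lock stand in for IceUtil::Mutex and its TryLock helper, and KickerView, PendingUpdate, push and deliveredDuringEdit are hypothetical names. The update call returns false instead of blocking when the UI holds the mutex, and the server keeps the update pending until a later attempt succeeds.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical UI-side object. std::mutex stands in for IceUtil::Mutex.
class KickerView
{
public:
    // Callback path: never blocks. Returns false if the UI holds the mutex.
    bool tryUpdate(const std::string& params)
    {
        std::unique_lock<std::mutex> lock(_mutex, std::try_to_lock);
        if(!lock.owns_lock())
        {
            return false;
        }
        _params = params;
        return true;
    }

    // UI path: blocks until the mutex is free, then edits the parameters.
    void edit(const std::function<void(std::string&)>& fn)
    {
        assert(fn);
        std::lock_guard<std::mutex> lock(_mutex);
        fn(_params);
    }

    std::string params()
    {
        std::lock_guard<std::mutex> lock(_mutex);
        return _params;
    }

private:
    std::mutex _mutex;
    std::string _params;
};

// Hypothetical server-side bookkeeping: the update stays pending until
// the UI has accepted it.
struct PendingUpdate
{
    explicit PendingUpdate(std::string p) : params(std::move(p)), pending(true) {}
    std::string params;
    bool pending;
};

// One delivery attempt. Call it again later while update.pending is true.
void push(KickerView& view, PendingUpdate& update)
{
    if(update.pending && view.tryUpdate(update.params))
    {
        update.pending = false;
    }
}

// Demonstration: a second thread attempts delivery while the UI is editing.
bool deliveredDuringEdit(KickerView& view, PendingUpdate& update)
{
    view.edit([&](std::string&) {
        std::thread server([&] { push(view, update); });
        server.join();
    });
    return !update.pending;
}
```

The refused update is not lost because the pending flag survives the failed attempt. The price is that delivery is delayed until some later retry, which is the trade-off raised earlier in the thread.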