
Thread: IceGridNode daemon stalls Linux boot process

  1. #1
    jharriot is offline Registered User
    Name: John Harriott
    Organization: BAE Systems
    Project: P3 Mid Life Upgrade
    Join Date
    Mar 2007
    Posts
    30

    IceGridNode daemon stalls Linux boot process

    Hi,

    I'm running IceGridNode (Ice 3.3.1) as a daemon on RHEL 5.2. The IceGridRegistry is running on a separate Windows platform.
    If the registry is not running when the icegridnode daemon starts, the Linux boot process stalls for anywhere from 30 seconds to a couple of minutes, depending on the PC. N.B. I have not set any timeouts on connections to the Locator endpoint.

    This situation is quite likely to occur, as the PCs can be started in any order.

    The important thing is for the Linux icegridnode to eventually connect once the registry is alive.

    Are there IceGrid properties that control timeouts and retry intervals for IceGridNode connection with the registry?

    Are there recommended settings for IceGridNode when run across a network of PCs that start up in any order?

    Cheers,
    John

  2. #2
    benoit is offline ZeroC Staff
    Name: Benoit Foucher
    Organization: ZeroC, Inc.
    Project: Ice
    Join Date
    Feb 2003
    Location
    Rennes, France
    Posts
    2,196
    Hi John,

    It should be possible to start the IceGrid node before the registry. Order shouldn't matter. It sounds like in your case the IceGrid node doesn't detect in a timely manner that the registry isn't running.

    Can you try running the node with --nowarn to see if it makes a difference? By default, the node tries to ping the IceGrid locator with a 15s timeout; with the retry, this check can last up to 30s if the locator is unreachable. Passing --nowarn when starting the node will disable this check.

    You should also use timeouts for the endpoints of the registry, node and locator proxy. I recommend checking out the configuration files of the demo/IceGrid/replication demo from your Ice distribution for an example where timeouts are configured on the node and registry endpoints.
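
    As a rough illustration of what such timeouts look like, the Python sketch below prints two property lines with the `-t <milliseconds>` endpoint option appended. The host name `registry-host`, the port 4061, and the `IceGrid/Locator` identity (which assumes the default instance name) are placeholders, not values from this thread; the demo configuration files remain the reference to copy from.

```python
def with_timeout(endpoint: str, timeout_ms: int) -> str:
    """Append the Ice endpoint timeout option (-t, in milliseconds)."""
    return f"{endpoint} -t {timeout_ms}"


def node_config(registry_host: str, registry_port: int, timeout_ms: int) -> str:
    """Build node property lines; host, port and identity are assumed values."""
    locator = with_timeout(f"tcp -h {registry_host} -p {registry_port}", timeout_ms)
    return "\n".join([
        f"Ice.Default.Locator=IceGrid/Locator:{locator}",
        f"IceGrid.Node.Endpoints={with_timeout('tcp', timeout_ms)}",
    ])


# Hypothetical registry address with a 5 second timeout.
print(node_config("registry-host", 4061, 5000))
```

    The registry endpoints can be given the same option in the registry's own configuration file.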

    Cheers,
    Benoit.

  3. #3
    jharriot is offline Registered User
    Name: John Harriott
    Organization: BAE Systems
    Project: P3 Mid Life Upgrade
    Join Date
    Mar 2007
    Posts
    30
    Hi,

    Further investigations (with the registry inactive):
    a. I have run IceGridNode as a console application and as a daemon (service icegridnode start).
    b. I redirected the Ice.Stderr to a file so I could examine Ice trace statements.
    c. I set the Locator endpoint timeout to 5 secs.

    In both instances the trace log showed two initial attempts to connect with the registry spaced about 15 sec apart. Subsequent attempts were 5 sec apart (as I expected with timeout=5000 ms).

    While IceGridNode was attempting to connect, "ps -ae | grep icegrid" showed 4 processes, one of which was defunct. After 70 secs the service responded with the [OK] msg to indicate that it had started. At that point "ps -ae | grep icegrid" showed only one active process.

    Playing around with the endpoint timeout suggests that a delay of (30 + 8*timeout) seconds occurs before the service is reported as started, e.g. changing the timeout to 10 sec increased the delay to 110 sec. If I include the --nowarn option, the initial 30 sec delay is removed.

    FYI, the file /etc/init.d/icegridnode was taken from the examples in the Ice distribution.
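
    The relationship above can be written as a small Python sketch. The factor of 8 and the 30 sec initial check are only the values observed in these tests, not documented constants, and the function name is made up for illustration.

```python
def startup_delay_s(timeout_s: float, nowarn: bool = False) -> float:
    """Estimate seconds until the init script reports the service as started.

    Empirical model from the measurements in this post:
    30 s initial locator check (skipped with --nowarn) plus 8 * endpoint timeout.
    """
    initial_check_s = 0.0 if nowarn else 30.0
    return initial_check_s + 8 * timeout_s


for timeout in (5, 10):
    print(timeout, startup_delay_s(timeout), startup_delay_s(timeout, nowarn=True))
```

    With a 5 sec timeout this gives the 70 sec measured above, and 110 sec for a 10 sec timeout.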

  4. #4
    benoit is offline ZeroC Staff
    Name: Benoit Foucher
    Organization: ZeroC, Inc.
    Project: Ice
    Join Date
    Feb 2003
    Location
    Rennes, France
    Posts
    2,196
    Hi John,

    You're right, the daemonized node gives control back to the caller only after it has tried to connect to the registry several times. We will look into changing this behavior for the next release. Thanks for bringing this to our attention.

    Cheers,
    Benoit.
