Page 1 of 3 1 2 3 LastLast
Results 1 to 15 of 41

Thread: BadMagicException on single server

  1. #1
    andy_84 is offline Registered User
    Name: Andrew Solovyov
    Organization: MAI
    Project: simple grid-like system
    Join Date
    Nov 2010
    Posts
    27

    BadMagicException on single server

    Hi all!

    1. We are using Ice 3.4.2 binaries, cluster of Windows Server 2008 R2 Enterprise machines. We have one server App with 16 worker threads each transferring (synchronously) files to Clients.
    This produces a 100% load over 1Gigabit adapter (as we see in Task Manager). Every time we run our scenario we see this errors in log:

    -! 10/26/11 11:20:42.139 DfsService.server.1: warning: connection exception:
    ConnectionI.cpp:1250: Ice::BadMagicException:
    unknown magic number: 0x05, 0x83, 0x11, 0x4e
    local address = 10.65.60.91:50505
    remote address = 10.65.60.91:49239

    Notice - BadMagicException is on the single server.

    Also regularly we see another warnings from Ice:

    -! 10/26/11 11:23:45.734 DfsService.server.1: warning: connection exception:
    TcpTransceiver.cpp:414: Ice::ConnectionLostException:
    connection lost: The specified network name is no longer available.
    local address = 10.65.60.91:50509
    remote address = 10.65.60.92:59634

    and

    -! 10/26/11 11:38:02.272 DfsService.server.1: warning: connection exception:
    Outgoing.cpp:226: Ice::TimeoutException:
    timeout while sending or receiving data
    local address = 10.65.60.91:51182
    remote address = 10.65.60.92:59634

    (we overrided timeout with 2 minutes for the server App).

    We talked to our admins - they say everything is OK with hardware, though we think it is not.

    Do you have any ideas what can cause this errors? Especially BadMagicException, which seems very, very strange to us.

    2. Another question - we downloaded Ice 3.4.2 binaries for MSVC 2010, while our apps are built with MSVC 2010 SP1 - is this acceptable?

    Thanks.

  2. #2
    mes's Avatar
    mes
    mes is online now ZeroC Staff
    Name: Mark Spruiell
    Organization: ZeroC, Inc.
    Project: Ice Developer
    Join Date
    Feb 2003
    Location
    California
    Posts
    1,445
    Hi Andy,

    I agree that the BadMagicException is the most serious issue. Does this only happen when the system is running at 100% CPU?

    This exception indicates that Ice has received an invalid packet on one of its connections. The most likely reason is a non-Ice program mistakenly attempting to connect and communicate with an Ice endpoint. It could also be caused by a concurrency bug in Ice.

    -! 10/26/11 11:20:42.139 DfsService.server.1: warning: connection exception:
    ConnectionI.cpp:1250: Ice::BadMagicException:
    unknown magic number: 0x05, 0x83, 0x11, 0x4e
    local address = 10.65.60.91:50505
    remote address = 10.65.60.91:49239
    The key piece of information here is the remote address. The first step is to verify whether it is an Ice program or a non-Ice program at the remote end. Assuming it's an Ice program, your next step should be to enable the following properties in both programs:

    Ice.Trace.Network=3
    Ice.Trace.Protocol=1


    Review the logs and see if anything grabs your attention, and feel free to post them here and we'll take a look. Also, if you can create a small test case that reproduces the exception, we'll gladly try that too.

    2. Another question - we downloaded Ice 3.4.2 binaries for MSVC 2010, while our apps are built with MSVC 2010 SP1 - is this acceptable?
    I built our C++ test suite with VS 2010SP1 while linking with the VS 2010 libraries from the 3.4.2 binary distribution. All the tests ran successfully, so I don't expect there will be any compatibility issues.

    Regards,
    Mark

  3. #3
    andy_84 is offline Registered User
    Name: Andrew Solovyov
    Organization: MAI
    Project: simple grid-like system
    Join Date
    Nov 2010
    Posts
    27
    Hi, Mark, thanks for answer!

    Does this only happen when the system is running at 100% CPU?
    Not the CPU, but 100% network load. We run our scenario approx. 30 times - high load was the first symptom we noticed.

    Let me tell more details about our system.
    We are using IceGrid containing 20 nodes. First node contains two servers, let's call them
    FileSource and DfsService. Other nodes contain one DfsAgent per node.
    As I mentioned, DfsService has 16 threads, each does a very simple thing - obtains proxy to a file from FileSource, then reads bytes from this proxy and writes these chunks to some DfsAgent.

    It turned out that BadMagicException appears when DfsService reads through file proxy bytes from FileSource.
    Other exceptions from my first post happen between DfsService and DfsAgents when writing bytes.

    Another thing - we deployed our system on two independent clusters - both discovered this behaviour.

    I try to write a small test case asap, also turn on Ice traces and review the logs.

    Upd.
    I've found similar post here BadMagicException and Crash, Is TCP lost data?, but with no answers

    Regards,
    andy
    Last edited by andy_84; 10-27-2011 at 12:49 AM.

  4. #4
    linkman is offline Registered User
    Name: liao changbin
    Organization: huhan university
    Project: scada
    Join Date
    Jul 2009
    Posts
    23

    It is me

    I posted "BadMagicException and Crash, Is TCP lost data?" 5 months ago.
    I haven used CommView(like WireShark, but can use in single computer) capture some network packets, and found more details, if you want the packets file, i will upload sometimes.

  5. #5
    andy_84 is offline Registered User
    Name: Andrew Solovyov
    Organization: MAI
    Project: simple grid-like system
    Join Date
    Nov 2010
    Posts
    27
    Hi!
    Caught again, here are traces.

    DfsService:

    -- 10/27/11 14:03:29.480 DfsService.server.1: Network: sent 65536 of 65536 bytes via tcp
    local address = 10.65.60.91:57249
    remote address = 10.65.60.71:51148
    -- 10/27/11 14:03:29.480 DfsService.server.1: Network: received 1326 of 1326 bytes via tcp
    local address = 10.65.60.91:57609
    remote address = 10.65.60.91:57130
    -- 10/27/11 14:03:29.480 DfsService.server.1: Network: sent 65536 of 65536 bytes via tcp
    local address = 10.65.60.91:57256
    remote address = 10.65.60.78:65416
    -- 10/27/11 14:03:29.480 DfsService.server.1: Protocol: received reply
    message type = 2 (reply)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 524318
    request id = 199018
    reply status = 0 (ok)
    -- 10/27/11 14:03:29.480 DfsService.server.1: Network: sent 65536 of 65536 bytes via tcp
    local address = 10.65.60.91:57244
    remote address = 10.65.60.87:65036
    -- 10/27/11 14:03:29.481 DfsService.server.1: Network: sent 65536 of 65536 bytes via tcp
    local address = 10.65.60.91:57256
    remote address = 10.65.60.78:65416
    -- 10/27/11 14:03:29.481 DfsService.server.1: Network: received 14 of 14 bytes via tcp
    local address = 10.65.60.91:57609
    remote address = 10.65.60.91:57130
    -! 10/27/11 14:03:29.481 DfsService.server.1: warning: connection exception:
    ConnectionI.cpp:1250: Ice::BadMagicException:
    unknown magic number: 0x10, 0x4f, 0x00, 0x0f
    local address = 10.65.60.91:57609
    remote address = 10.65.60.91:57130

    -- 10/27/11 14:03:29.481 DfsService.server.1: Network: closing tcp connection
    local address = 10.65.60.91:57609
    remote address = 10.65.60.91:57130
    -- 10/27/11 14:03:29.482 DfsService.server.1: Network: sent 65536 of 65536 bytes via tcp
    local address = 10.65.60.91:57249
    remote address = 10.65.60.71:51148
    -- 10/27/11 14:03:29.482 DfsService.server.1: Network: sent 84 of 84 bytes via tcp
    local address = 10.65.60.91:57249
    remote address = 10.65.60.71:51148
    -- 10/27/11 14:03:29.482 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 87
    request id = 23901
    identity = DfsAgent.13
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =

    FileSource:

    -- 10/27/11 14:03:29.479 Coordinator.server.1: Protocol: received request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 199019
    identity = 304d6087-311a-4cd2-9761-4bb97849dd5c
    facet =
    operation = read
    mode = 0 (normal)
    context =
    -- 10/27/11 14:03:29.479 Coordinator.server.1: Network: sent 65536 of 65536 bytes via tcp
    local address = 10.65.60.91:57130
    remote address = 10.65.60.91:57609
    -- 10/27/11 14:03:29.480 Coordinator.server.1: Protocol: sending reply
    message type = 2 (reply)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 524318
    request id = 199010
    reply status = 0 (ok)
    -! 10/27/11 14:03:29.485 Coordinator.server.1: warning: connection exception:
    TcpTransceiver.cpp:341: Ice::ConnectionLostException:
    connection lost: The specified network name is no longer available.
    local address = 10.65.60.91:57130
    remote address = 10.65.60.91:57609

    -- 10/27/11 14:03:29.485 Coordinator.server.1: Network: closing tcp connection
    local address = 10.65.60.91:57130
    remote address = 10.65.60.91:57609
    -- 10/27/11 14:03:29.485 Coordinator.server.1: Network: accepted tcp connection
    local address = 10.65.60.91:57130
    remote address = 10.65.60.91:58377
    -- 10/27/11 14:03:29.485 Coordinator.server.1: Protocol: sending validate connection
    message type = 3 (validate connection)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 14
    -- 10/27/11 14:03:29.485 Coordinator.server.1: Network: sent 14 of 14 bytes via tcp
    local address = 10.65.60.91:57130
    remote address = 10.65.60.91:58377
    -- 10/27/11 14:03:29.486 Coordinator.server.1: Network: received 14 of 14 bytes via tcp
    local address = 10.65.60.91:57130
    remote address = 10.65.60.91:58377
    -- 10/27/11 14:03:29.486 Coordinator.server.1: Network: received 72 of 72 bytes via tcp
    local address = 10.65.60.91:57130
    remote address = 10.65.60.91:58377
    -- 10/27/11 14:03:29.486 Coordinator.server.1: Protocol: received request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1
    identity = 14a071e1-cb2b-45d3-b7f6-bb4b209b3b06
    facet =
    operation = read
    mode = 0 (normal)
    context =

    Still trying to make a small test program reproducing BadMagicException.

    Regards, andy

  6. #6
    mes's Avatar
    mes
    mes is online now ZeroC Staff
    Name: Mark Spruiell
    Organization: ZeroC, Inc.
    Project: Ice Developer
    Join Date
    Feb 2003
    Location
    California
    Posts
    1,445
    Andy,

    Thanks for the log output.

    The ConnectionLostException is definitely expected in this situation. The sender receives this exception because the receiver ungracefully closes its end of the connection after detecting the "bad magic" condition.

    Can you check the Windows event log on this host to see whether any network-related events were logged around the time of the BadMagicException?

    Regards,
    Mark

  7. #7
    mes's Avatar
    mes
    mes is online now ZeroC Staff
    Name: Mark Spruiell
    Organization: ZeroC, Inc.
    Project: Ice Developer
    Join Date
    Feb 2003
    Location
    California
    Posts
    1,445
    Andy,

    One other request: please review the discussion in this thread. Your situation sounds quite similar to the one described there.

    Regards,
    Mark

  8. #8
    andy_84 is offline Registered User
    Name: Andrew Solovyov
    Organization: MAI
    Project: simple grid-like system
    Join Date
    Nov 2010
    Posts
    27
    Quote Originally Posted by mes View Post
    Andy,

    One other request: please review the discussion in this thread. Your situation sounds quite similar to the one described there.

    Regards,
    Mark
    Hi, Mark!
    Funny thing happened - this post was created by my colleague as we started to develop the first version of our system Unfortunately, he left our team, but we still turn off all TCP offload options on new servers we configure - so this is not the case

    Regards, andy

  9. #9
    linkman is offline Registered User
    Name: liao changbin
    Organization: huhan university
    Project: scada
    Join Date
    Jul 2009
    Posts
    23

    you can try to limit the tcp packet size less than 65535

    just to try, split one lage ICE invoke to multi invokes.

  10. #10
    andy_84 is offline Registered User
    Name: Andrew Solovyov
    Organization: MAI
    Project: simple grid-like system
    Join Date
    Nov 2010
    Posts
    27
    Hi, Mark!

    Quote Originally Posted by mes View Post
    Can you check the Windows event log on this host to see whether any network-related events were logged around the time of the BadMagicException?
    No, nothing strange in Event log. System was working the whole night, new error WSAECONNRESET appeared in log:


    -! 10/28/11 04:55:29.600 DfsService.server.1: warning: connection exception:
    ConnectionI.cpp:1250: Ice::BadMagicException:
    unknown magic number: 0x80, 0xfa, 0x00, 0x00
    local address = 10.65.60.91:51352
    remote address = 10.65.60.91:59922
    -! 10/28/11 05:00:53.428 DfsService.server.1: warning: connection exception:
    TcpTransceiver.cpp:414: Ice::ConnectionLostException:
    connection lost: The specified network name is no longer available.
    local address = 10.65.60.91:54077
    remote address = 10.65.60.91:59922
    -! 10/28/11 06:17:34.248 DfsService.server.1: warning: connection exception:
    ConnectionI.cpp:1250: Ice::BadMagicException:
    unknown magic number: 0x85, 0xc0, 0x74, 0x6a
    local address = 10.65.60.91:54165
    remote address = 10.65.60.91:59922
    -! 10/28/11 06:35:00.397 DfsService.server.1: warning: connection exception:
    Outgoing.cpp:226: Ice::TimeoutException:
    timeout while sending or receiving data
    local address = 10.65.60.91:60950
    remote address = 10.65.60.74:53596
    -! 10/28/11 06:35:00.404 DfsService.server.1: warning: connection exception:
    TcpTransceiver.cpp:163: Ice::ConnectionLostException:
    connection lost: WSAECONNRESET
    local address = 10.65.60.91:59972
    remote address = 10.65.81.6:49208
    -! 10/28/11 06:35:00.410 DfsService.server.1: warning: connection exception:
    TcpTransceiver.cpp:163: Ice::ConnectionLostException:
    connection lost: WSAECONNRESET
    0.0.0.0:59972
    -! 10/28/11 07:02:50.869 DfsService.server.1: warning: connection exception:
    ConnectionI.cpp:1250: Ice::BadMagicException:
    unknown magic number: 0xb0, 0x6e, 0x42, 0x48
    local address = 10.65.60.91:55559
    remote address = 10.65.60.91:59922
    -! 10/28/11 07:20:37.572 DfsService.server.1: warning: connection exception:
    Outgoing.cpp:226: Ice::TimeoutException:
    timeout while sending or receiving data
    local address = 10.65.60.91:55553
    remote address = 10.65.60.76:51668

    Maybe it's time for me to start commenting out source code and finally get a small test reproducing this errors.

    Regards, andy

  11. #11
    andy_84 is offline Registered User
    Name: Andrew Solovyov
    Organization: MAI
    Project: simple grid-like system
    Join Date
    Nov 2010
    Posts
    27
    Hi, all!

    Today we turned off all communications between DfsService and DfsAgents and left only DfsService and FileSource (aka Coordinator), they are on the same host. And we've got another error in logs!

    FileSource (aka Coordinator):


    -! 10/28/11 15:32:16.786 Coordinator.server.1: warning: connection exception:
    TcpTransceiver.cpp:169: Ice::SocketException:
    socket exception: WSAEFAULT
    local address = 10.65.60.91:65228
    remote address = 10.65.60.91:50843

    DfsService:


    -! 10/28/11 15:32:16.786 DfsService.server.1: warning: connection exception:
    TcpTransceiver.cpp:414: Ice::ConnectionLostException:
    connection lost: The specified network name is no longer available.
    local address = 10.65.60.91:50843
    remote address = 10.65.60.91:65228

    Can anyone explain what's happening?

    Regards, andy

  12. #12
    bernard's Avatar
    bernard is online now ZeroC Staff
    Name: Bernard Normier
    Organization: ZeroC, Inc.
    Project: Ice
    Join Date
    Feb 2003
    Location
    Palm Beach Gardens, FL
    Posts
    1,294
    Hi Andy,

    It would be great if you could reproduce this problem in a test-case that we can try out.

    The BadMagicException and WSAEFAULT sound like a memory corruption, probably in the FileSource/Coordinator server (i.e. the faulty FileSource sends an incorrect magic to the DfsService, and occasionally fails with WSAEFAULT).

    Did you try to run your servers with a memory checker like Purify?

    Best regards,
    Bernard
    Bernard Normier
    ZeroC, Inc.

  13. #13
    andy_84 is offline Registered User
    Name: Andrew Solovyov
    Organization: MAI
    Project: simple grid-like system
    Join Date
    Nov 2010
    Posts
    27
    Hi, Bernard!

    Thanks for an idea, we'll try to do this on Monday.

    Another thing we noticed about Coordinator - it warns several times a second that thread pool is running low on threads. Adapter thread pool size is set to 32. Can this intensive workload cause any errors? No such warnings could be found in DfsService log.

    Regards, andy

  14. #14
    andy_84 is offline Registered User
    Name: Andrew Solovyov
    Organization: MAI
    Project: simple grid-like system
    Join Date
    Nov 2010
    Posts
    27
    Hi, all!

    Unfortunately, I haven't examined BadMagicException in Coordinator yet

    Rather, I collected ICE traces when all of 16 worker threads in DfsService got stuck in synchronous calls to a single DfsAgent and then failed with a TimeoutException. What surprised me - I couldn't find in DfsAgent's traces that it received this calls! Please, explain, how this can happen!!

    DfsAgent:


    -- 11/08/11 01:55:17.546 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324610
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:17.546 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:17.591 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324611
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:17.591 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:17.675 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324612
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:17.675 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:17.934 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324613
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:17.934 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:18.152 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324614
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:18.152 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:20.733 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324615
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:20.733 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:20.750 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324616
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:20.750 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:20.975 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324617
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:20.975 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:21.113 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324618
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:21.113 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:21.440 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324619
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:21.440 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:21.441 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324620
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:21.441 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:21.897 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324621
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:21.897 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:22.352 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324622
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:22.352 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:22.605 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324623
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:22.605 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:24.115 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 86
    request id = 1324624
    identity = DfsAgent.8
    facet =
    operation = openWrite
    mode = 0 (normal)
    context =
    -- 11/08/11 01:55:24.115 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    -- 11/08/11 01:55:40.932 DfsService.server.1: Protocol: sending request
    message type = 0 (request)
    compression status = 0 (not compressed; do not compress response, if any)
    message size = 53
    request id = 1324625
    identity = DfsAgent.8
    facet =
    operation = getAgentState
    mode = 2 (idempotent)
    context =
    -- 11/08/11 01:55:40.932 DfsService.server.1: Network: sent 53 of 53 bytes via tcp
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    The very moment, when we got TimeoutException:


    -! 11/08/11 01:57:17.437 DfsService.server.1: warning: connection exception:
    Outgoing.cpp:226: Ice::TimeoutException:
    timeout while sending or receiving data
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092
    -- 11/08/11 01:57:17.437 DfsService.server.1: Network: closing tcp connection
    local address = 10.65.60.91:64591
    remote address = 10.65.60.86:61092

    None of the received requests for the bolded request id's were found in ICE traces of DfsAgent.8 server.

    Regards, andy

  15. #15
    mes's Avatar
    mes
    mes is online now ZeroC Staff
    Name: Mark Spruiell
    Organization: ZeroC, Inc.
    Project: Ice Developer
    Join Date
    Feb 2003
    Location
    California
    Posts
    1,445
    Hi Andy,

    Your traces don't show any successful request/reply exchanges with DfsAgent.8. Was it working properly before the timeout occurred?

    In a situation like this where multiple pending requests to the same server result in a timeout, the reason is often that the server has deadlocked. Have you tried attaching to the server in a debugger to examine what all of its threads are doing?

    Regarding the missing trace output in the server: Is the server creating multiple communicators? If so, you may not be enabling the tracing properties in the correct communicator. Is the server managed by IceGrid? If so, have you confirmed that the property changes have been deployed to the server's configuration?

    Regards,
    Mark

Page 1 of 3 1 2 3 LastLast

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. BadMagicException and Crash, Is TCP lost data?
    By linkman in forum Help Center
    Replies: 2
    Last Post: 08-10-2011, 01:51 AM
  2. BadMagicException in mid time intensive runs
    By Andrew S in forum Help Center
    Replies: 7
    Last Post: 01-28-2009, 09:30 AM
  3. Single threaded approach
    By volk in forum Help Center
    Replies: 1
    Last Post: 06-13-2008, 07:57 AM
  4. What are reasons for BadMagicException
    By rhochmuth in forum Help Center
    Replies: 13
    Last Post: 06-09-2007, 02:52 AM
  5. single stream for two-way communication?
    By vukicevic in forum Help Center
    Replies: 2
    Last Post: 05-19-2003, 03:17 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •