Hi, all!
Unfortunately, I haven't examined the BadMagicException in Coordinator yet.
Instead, I collected ICE traces from a case where all 16 worker threads in DfsService got stuck in synchronous calls to a single DfsAgent and then failed with a TimeoutException. What surprised me is that I couldn't find anything in DfsAgent's traces showing that it received these calls. Please explain how this can happen.
Requests sent by DfsService to DfsAgent:
-- 11/08/11 01:55:17.546 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324610
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:17.546 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:17.591 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324611
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:17.591 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:17.675 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324612
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:17.675 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:17.934 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324613
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:17.934 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:18.152 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324614
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:18.152 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:20.733 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324615
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:20.733 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:20.750 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324616
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:20.750 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:20.975 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324617
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:20.975 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:21.113 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324618
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:21.113 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:21.440 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324619
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:21.440 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:21.441 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324620
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:21.441 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:21.897 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324621
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:21.897 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:22.352 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324622
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:22.352 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:22.605 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324623
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:22.605 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:24.115 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 86
request id = 1324624
identity = DfsAgent.8
facet =
operation = openWrite
mode = 0 (normal)
context =
-- 11/08/11 01:55:24.115 DfsService.server.1: Network: sent 86 of 86 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:55:40.932 DfsService.server.1: Protocol: sending request
message type = 0 (request)
compression status = 0 (not compressed; do not compress response, if any)
message size = 53
request id = 1324625
identity = DfsAgent.8
facet =
operation = getAgentState
mode = 2 (idempotent)
context =
-- 11/08/11 01:55:40.932 DfsService.server.1: Network: sent 53 of 53 bytes via tcp
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
The moment we got the TimeoutException:
-! 11/08/11 01:57:17.437 DfsService.server.1: warning: connection exception:
Outgoing.cpp:226: Ice::TimeoutException:
timeout while sending or receiving data
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
-- 11/08/11 01:57:17.437 DfsService.server.1: Network: closing tcp connection
local address = 10.65.60.91:64591
remote address = 10.65.60.86:61092
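For reference, the gap between the first unanswered openWrite (request id 1324610) and the TimeoutException can be computed from the two trace header lines. This is only a minimal sketch: the helper name `parse_stamp` is made up for illustration, and it assumes the timestamps are in mm/dd/yy format (both entries are on the same day, so the result does not depend on that guess).

```python
from datetime import datetime


def parse_stamp(header_line):
    """Parse the timestamp from a trace header line.

    Header lines are assumed to look like
    "-- 11/08/11 01:55:17.546 DfsService.server.1: ...".
    """
    parts = header_line.split()
    # parts[0] is the "--" or "-!" marker, parts[1] the date, parts[2] the time.
    return datetime.strptime(parts[1] + " " + parts[2], "%m/%d/%y %H:%M:%S.%f")


# Header lines copied from the traces above.
first_request = "-- 11/08/11 01:55:17.546 DfsService.server.1: Protocol: sending request"
timeout = "-! 11/08/11 01:57:17.437 DfsService.server.1: warning: connection exception:"

gap_seconds = (parse_stamp(timeout) - parse_stamp(first_request)).total_seconds()
print(f"{gap_seconds:.3f} s")  # prints "119.891 s"
```

So the connection was closed roughly two minutes after the first openWrite went out, which may simply correspond to the timeout configured on the proxy or connection.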
None of the request ids from the trace above were found as received requests in the ICE traces of the DfsAgent.8 server.
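In case someone wants to repeat this check, here is a rough sketch of how the two trace files can be compared. The function names are hypothetical, and the script assumes that the receiving side logs its entries under a "Protocol: received request" header with the same "request id = N" field; the receiver sample below is invented purely to exercise the code and is not taken from the real DfsAgent.8 log.

```python
import re

REQUEST_ID = re.compile(r"request id = (\d+)")


def request_ids(trace_text, marker):
    """Collect request ids from trace entries whose header line contains `marker`."""
    ids = set()
    in_entry = False
    for line in trace_text.splitlines():
        # A new trace entry starts with "-- " (normal) or "-! " (warning).
        if line.startswith("-- ") or line.startswith("-! "):
            in_entry = marker in line
            continue
        match = REQUEST_ID.search(line)
        if in_entry and match:
            ids.add(int(match.group(1)))
    return ids


def missing_on_receiver(sender_trace, receiver_trace):
    """Return the request ids that were sent but never show up as received."""
    sent = request_ids(sender_trace, "Protocol: sending request")
    # Assumption: the receiver marks incoming requests with this header text.
    received = request_ids(receiver_trace, "Protocol: received request")
    return sorted(sent - received)


sender_sample = """\
-- 11/08/11 01:55:17.546 DfsService.server.1: Protocol: sending request
message type = 0 (request)
request id = 1324610
operation = openWrite
-- 11/08/11 01:55:17.591 DfsService.server.1: Protocol: sending request
message type = 0 (request)
request id = 1324611
operation = openWrite
"""

# Invented receiver log, only for demonstrating the comparison.
receiver_sample = """\
-- 11/08/11 01:55:17.600 DfsAgent.8: Protocol: received request
message type = 0 (request)
request id = 1324611
operation = openWrite
"""

print(missing_on_receiver(sender_sample, receiver_sample))
```

With the real logs, `sender_trace` would be the DfsService.server.1 trace and `receiver_trace` the DfsAgent.8 trace. Request ids are per connection, so the comparison only makes sense for entries belonging to the same connection (here 10.65.60.91:64591 to 10.65.60.86:61092).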
Regards, andy