Page 1 of 2 1 2 LastLast
Results 1 to 15 of 17

Thread: Bus Error under Mac OX 10.4 and IcePy 3.3.0

  1. #1
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115

    Bus Error under Mac OX 10.4 and IcePy 3.3.0

    (The following also applies to Ice 3.2.1)

    We have the following setup:
    • Java server running inside of IceGrid, behind Glacier2
    • Trivial python client accessing a Glacier2 session

    The client logs in and can interact with the server normally. When trying to acquire one particular servant (of many) via the session facade, the client segfaults and "Bus error" is printed.

    The gdb output for the client is:
    Code:
    (gdb) run buserror.py
    Starting program: /opt/local/bin/python buserror.py
    Reading symbols for shared libraries . done
    
    Program received signal SIGTRAP, Trace/breakpoint trap.
    0x8fe01010 in __dyld__dyld_start ()
    (gdb) c
    Continuing.
    Reading symbols for shared libraries ... done
    Reading symbols for shared libraries . done
    Reading symbols for shared libraries . done
    Reading symbols for shared libraries . done
    Reading symbols for shared libraries . done
    Reading symbols for shared libraries ...... done
    08/12/08 16:58:37.966 warning: deprecated property: Ice.MonitorConnections
    Now
    
    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
    0x007224e9 in IcePy::SyncTypedInvocation::invoke ()
    (gdb) bt
    #0  0x007224e9 in IcePy::SyncTypedInvocation::invoke ()
    #1  0x0071b6fb in operationInvoke ()
    #2  0x0027fa7e in PyEval_EvalFrame (f=0x6083f0) at Python/ceval.c:3568
    #3  0x00280619 in PyEval_EvalCodeEx (co=0x13a6620, globals=0x5b14b0, locals=0x0, args=0x608cfc, argcount=1, kws=0x608d00, kwcount=0, defs=0x13f263c, defcount=1, closure=0x0) at Python/ceval.c:2741
    #4  0x0027e453 in PyEval_EvalFrame (f=0x608bb0) at Python/ceval.c:3661
    #5  0x00280619 in PyEval_EvalCodeEx (co=0x4fe60, globals=0x20a50, locals=0x20a50, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2741
    #6  0x00280859 in PyEval_EvalCode (co=0x4fe60, globals=0x20a50, locals=0x20a50) at Python/ceval.c:484
    #7  0x002a75b1 in PyRun_FileExFlags (fp=0xa000bda0, filename=0xbfffd8f3 "buserror.py", start=257, globals=0x20a50, locals=0x20a50, closeit=1, flags=0xbfffd608) at Python/pythonrun.c:1287
    #8  0x002a7894 in PyRun_SimpleFileExFlags (fp=0xa000bda0, filename=0xbfffd8f3 "buserror.py", closeit=1, flags=0xbfffd608) at Python/pythonrun.c:871
    #9  0x002b13dc in Py_Main (argc=1, argv=0xbfffd688) at Modules/main.c:493
    #10 0x000018ee in ?? ()
    #11 0x00001815 in ?? ()
    (gdb)
    The client code is:
    Code:
    import omero
    c = omero.client()
    s = c.createSession()
    print "Now"
    s.getScriptService()
    The bus error is triggered at line 263 in IceInternal.ServantManager.java. (connection.sendResponse) and no exception shows up in the server.

    An interesting corollary is that the log file for the server is halted (for lack of a better term) on the first bus error. I.e. our log statement "Creating servant" is printed, and afterwards nothing else even though further interactions are possible with the server.

  2. #2
    benoit's Avatar
    benoit is offline ZeroC Staff
    Name: Benoit Foucher
    Organization: ZeroC, Inc.
    Project: Ice
    Join Date
    Feb 2003
    Location
    Rennes, France
    Posts
    2,196
    Hi,

    Note that we no longer support Mac OS X 10.4 for Ice 3.3.0. Which python version do you use and did you build python yourself? Did you try running the IcePy tests to see if they worked?

    The best would be to try on Mac OS X 10.5 using python installed with the system and see if you can reproduce the problem.

    Cheers,
    Benoit.

  3. #3
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115

    Comparison to Java

    Similar code in Java hangs but does not crash.

    Code:
    public class buserror {
        public static void main(String[] args) throws Exception {
            omero.client c = new omero.client();
            omero.api.ServiceFactoryPrx s = c.createSession();
            System.out.println("first");
            s.getQueryService();
            System.out.println("now");
            s.getScriptService();
        }
    }
    produces:

    Code:
    first
    now
    Sending kill -QUIT to the process returns:

    Code:
    Full thread dump Java HotSpot(TM) Client VM (1.5.0_13-121 mixed mode, sharing):
    
    "DestroyJavaVM" prio=5 tid=0x00501320 nid=0x1803800 waiting on condition [0x00000000..0xb0800000]
    
    "Ice.SelectorThread" prio=5 tid=0x00511b70 nid=0x1969000 runnable [0xb0e8d000..0xb0e8dd10]
            at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
            at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:118)
            at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:69)
            at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
            - locked <0x266200d0> (a sun.nio.ch.Util$1)
            - locked <0x266200e0> (a java.util.Collections$UnmodifiableSet)
            - locked <0x26620088> (a sun.nio.ch.KQueueSelectorImpl)
            at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
            at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
            at IceInternal.Selector.select(Selector.java:139)
            at IceInternal.SelectorThread.run(SelectorThread.java:120)
            at IceInternal.SelectorThread$HelperThread.run(SelectorThread.java:273)
    
    "Ice.ThreadPool.Client-0" prio=5 tid=0x00511860 nid=0x1968c00 runnable [0xb0e0c000..0xb0e0cd10]
            at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
            at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:118)
            at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:69)
            at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
            - locked <0x26620228> (a sun.nio.ch.Util$1)
            - locked <0x26620238> (a java.util.Collections$UnmodifiableSet)
            - locked <0x266201e0> (a sun.nio.ch.KQueueSelectorImpl)
            at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
            at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
            at IceInternal.Selector.select(Selector.java:139)
            at IceInternal.ThreadPool.run(ThreadPool.java:335)
            at IceInternal.ThreadPool.access$100(ThreadPool.java:12)
            at IceInternal.ThreadPool$EventHandlerThread.run(ThreadPool.java:951)
    
    "Ice.EndpointHostResolverThread" prio=5 tid=0x00511230 nid=0x199ba00 in Object.wait() [0xb0d8b000..0xb0d8bd10]
            at java.lang.Object.wait(Native Method)
            - waiting on <0x26620390> (a IceInternal.EndpointHostResolver)
            at java.lang.Object.wait(Object.java:474)
            at IceInternal.EndpointHostResolver.run(EndpointHostResolver.java:77)
            - locked <0x26620390> (a IceInternal.EndpointHostResolver)
            at IceInternal.EndpointHostResolver$HelperThread.run(EndpointHostResolver.java:146)
    
    "Ice.ThreadPool.Server-0" prio=5 tid=0x005108c0 nid=0x199be00 runnable [0xb0d0a000..0xb0d0ad10]
            at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
            at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:118)
            at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:69)
            at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
            - locked <0x26620478> (a sun.nio.ch.Util$1)
            - locked <0x26620488> (a java.util.Collections$UnmodifiableSet)
            - locked <0x26620430> (a sun.nio.ch.KQueueSelectorImpl)
            at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
            at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
            at IceInternal.Selector.select(Selector.java:139)
            at IceInternal.ThreadPool.run(ThreadPool.java:335)
            at IceInternal.ThreadPool.access$100(ThreadPool.java:12)
            at IceInternal.ThreadPool$EventHandlerThread.run(ThreadPool.java:951)
    
    "Ice.Timer" prio=5 tid=0x0050fa70 nid=0x198ec00 in Object.wait() [0xb0c89000..0xb0c89d10]
            at java.lang.Object.wait(Native Method)
            - waiting on <0x26b1fd98> (a IceInternal.Timer)
            at IceInternal.Timer.run(Timer.java:217)
            - locked <0x26b1fd98> (a IceInternal.Timer)
    
    "Low Memory Detector" daemon prio=5 tid=0x00508f60 nid=0x1816800 runnable [0x00000000..0x00000000]
    
    "CompilerThread0" daemon prio=9 tid=0x00508560 nid=0x1816400 waiting on condition [0x00000000..0xb0b06748]
    
    "Signal Dispatcher" daemon prio=9 tid=0x00508090 nid=0x1813200 waiting on condition [0x00000000..0x00000000]
    
    "Finalizer" daemon prio=8 tid=0x00507830 nid=0x1811600 in Object.wait() [0xb0a04000..0xb0a04d10]
            at java.lang.Object.wait(Native Method)
            - waiting on <0x26a949e0> (a java.lang.ref.ReferenceQueue$Lock)
            at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
            - locked <0x26a949e0> (a java.lang.ref.ReferenceQueue$Lock)
            at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
            at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
    
    "Reference Handler" daemon prio=10 tid=0x00507480 nid=0x1810800 in Object.wait() [0xb0983000..0xb0983d10]
            at java.lang.Object.wait(Native Method)
            - waiting on <0x26a94a60> (a java.lang.ref.Reference$Lock)
            at java.lang.Object.wait(Object.java:474)
            at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
            - locked <0x26a94a60> (a java.lang.ref.Reference$Lock)
    
    "VM Thread" prio=9 tid=0x00506c00 nid=0x1810400 runnable 
    
    "VM Periodic Task Thread" prio=9 tid=0x00509b30 nid=0x1816c00 waiting on condition 
    
    "Exception Catcher Thread" prio=10 tid=0x005015a0 nid=0x1803c00 runnable

  4. #4
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115

    Reproduced under Ubuntu/Hardy with Ice 3.2.1

    Using the standard packages under hardy (currently 3.2.1) the Python example above segfaults.

  5. #5
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115
    Sorry, our posts crossed each other, Benoit.

    Quote Originally Posted by benoit View Post
    Note that we no longer support Mac OS X 10.4 for Ice 3.3.0. Which python version do you use and did you build python yourself? Did you try running the IcePy tests to see if they worked?
    This is Python 2.4.5 from MacPorts. I did not try to run the tests, since I'm also using the Ice port which has always worked admirably. But I will do that now.

    The best would be to try on Mac OS X 10.5 using python installed with the system and see if you can reproduce the problem.
    Unfortunately that's not an option in the immediate term. I might should add that this setup was working under 3.2.1 on 10.4 and only started to segfault when I switched the server to AMD (This particular method, however, is not AMD, making it possibly unrelated.) The switch to 3.3.0 made no difference.

  6. #6
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115
    python allTests.py passes using the 3.3.0 ports under 10.4.

  7. #7
    benoit's Avatar
    benoit is offline ZeroC Staff
    Name: Benoit Foucher
    Organization: ZeroC, Inc.
    Project: Ice
    Join Date
    Feb 2003
    Location
    Rennes, France
    Posts
    2,196
    Hi Josh,

    I suspect the Java client hangs because you don't destroy the communicator (and therefore threads created by the communicator aren't terminated and the JVM hangs on exit waiting for these threads to be gone). I'm afraid it's difficult to know what could be the problem with so little information. The fact that you changed how the server is implemented shouldn't have any effects on your Python client.

    Could you perhaps narrow down the crash to a small self contained test case that we could use to try to reproduce it?

    Btw, are you running this code on an Intel platform? Did you try setting the DYLD_BIND_AT_LAUNCH=1 environment variable to see it made a difference (we used to experience random crashes on 10.4 without this variable set in the environment).

    Cheers,
    Benoit.

  8. #8
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115
    Quote Originally Posted by benoit View Post
    I suspect the Java client hangs because you don't destroy the communicator (and therefore threads created by the communicator aren't terminated and the JVM hangs on exit waiting for these threads to be gone).
    Apologies, you're of course right. In my attempt to make a minimal code sample, the shutdown was left out. The Java code does not hang.

    Quote Originally Posted by benoit View Post
    I'm afraid it's difficult to know what could be the problem with so little information. The fact that you changed how the server is implemented shouldn't have any effects on your Python client.
    I definitely agree that it shouldn't make a difference, but it's so odd that other facade methods should work normally. The only other substantial interface difference is that the script service is forward declared (without ["ami","amd"] while the definition is in another file (with ["amd","ami"]). Am I grasping at straws?

    Quote Originally Posted by benoit View Post
    Could you perhaps narrow down the crash to a small self contained test case that we could use to try to reproduce it?
    I'm certainly working on it, but as always, there are a terrible number of moving parts. Finding the one that has quite abruptly started causing a segfault isn't proving easy.

    Quote Originally Posted by benoit View Post
    Btw, are you running this code on an Intel platform? Did you try setting the DYLD_BIND_AT_LAUNCH=1 environment variable to see it made a difference (we used to experience random crashes on 10.4 without this variable set in the environment).
    I had not tried DYLD_BIND_AT_LAUNCH, but it made no difference.

    Thanks,
    ~J.

  9. #9
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115
    By the way, here's the Ubuntu/Hardy backtrace:
    Code:
    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7f30e04986e0 (LWP 22916)]
    0x00007f30dee2281a in IcePy::OperationI::invoke (this=0x8c76c0, proxy=@0x7fffe84b8c20, args=0x7f30e0458050, pyctx=0x702580) at Operation.cpp:570
    570                         Py_INCREF(ret);
    (gdb) bt
    #0  0x00007f30dee2281a in IcePy::OperationI::invoke (this=0x8c76c0, proxy=@0x7fffe84b8c20, args=0x7f30e0458050, pyctx=0x702580) at Operation.cpp:570
    #1  0x00007f30dee1d71c in operationInvoke (self=0x7f30dd7342d0, args=0x7f30e0436a50) at Operation.cpp:206
    #2  0x0000000000477146 in PyEval_EvalFrame ()
    #3  0x00000000004784f5 in PyEval_EvalCodeEx ()
    #4  0x000000000047780d in PyEval_EvalFrame ()
    #5  0x00000000004784f5 in PyEval_EvalCodeEx ()
    #6  0x0000000000478622 in PyEval_EvalCode ()
    #7  0x000000000049ccca in PyRun_FileExFlags ()
    #8  0x000000000049cf19 in PyRun_SimpleFileExFlags ()
    #9  0x00000000004125c9 in Py_Main ()
    #10 0x00007f30df6b71c4 in __libc_start_main () from /lib/libc.so.6
    #11 0x0000000000411ae9 in _start ()
    (gdb)

  10. #10
    benoit's Avatar
    benoit is offline ZeroC Staff
    Name: Benoit Foucher
    Organization: ZeroC, Inc.
    Project: Ice
    Join Date
    Feb 2003
    Location
    Rennes, France
    Posts
    2,196
    Hi,

    It crashes when the getScriptService operation is invoked, is this correct? It looks like the crash occurs when un-marshalling the response from the server. What is the signature of this Slice operation?

    Cheers,
    Benoit.

  11. #11
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115
    Correct.
    Code:
            interface ServiceFactory extends Glacier2::Session
            {
                ...
                IScript*   getScriptService() throws ServerError;
    with the ... all of the same type:
    Code:
                IAdmin*    getAdminService() throws ServerError;
                IConfig*   getConfigService() throws ServerError;
                IPixels*   getPixelsService() throws ServerError;
                IPojos*    getPojosService() throws ServerError;
                IQuery*    getQueryService() throws ServerError;
                // etc.
    all of which function normally. The implementation (just for reference) is:

    Code:
       public IScriptPrx getScriptService(Ice.Current current) throws ServerError {
            Ice.Identity id = getIdentity(current, SCRIPTSERVICE.value);
            Ice.Object servant = current.adapter.find(id);
            if (servant == null) {
                servant = (Ice.Object) this.context.getBean(SCRIPTSERVICE.value);
                if (servant instanceof Ice.TieBase) {
                    Ice.TieBase tie = (Ice.TieBase) servant;
                    Object obj = tie.ice_delegate();
                    if (obj instanceof ServiceFactoryAware) {
                        ((ServiceFactoryAware) obj).setServiceFactory(this);
                    }
                }
                current.adapter.add(servant, id);
            }
            Ice.ObjectPrx prx = current.adapter.createProxy(id);
            return IScriptPrxHelper.checkedCast(prx);
        }

  12. #12
    benoit's Avatar
    benoit is offline ZeroC Staff
    Name: Benoit Foucher
    Organization: ZeroC, Inc.
    Project: Ice
    Join Date
    Feb 2003
    Location
    Rennes, France
    Posts
    2,196
    Hi Josh,

    I believe I was able to reproduce. The problem shows up if the client doesn't load the Slice definition for the proxy type returned by the invocation.

    Do you forward declare the IScript interface in the definition of the ServiceFactory interface and don't load the definition of the IScript interface in the python client? If that's the case, I believe loading the Slice file where the IScript interface is defined should solve the issue.

    Cheers,
    Benoit.

  13. #13
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115
    You're right. I had wondered about this after my last post, but unfortunately hadn't been able to test it during lunch. Thanks for being faster.

    Do you have an idea why the server log would freeze when the client segfaults, and if there is a possible workaround?

  14. #14
    benoit's Avatar
    benoit is offline ZeroC Staff
    Name: Benoit Foucher
    Organization: ZeroC, Inc.
    Project: Ice
    Join Date
    Feb 2003
    Location
    Rennes, France
    Posts
    2,196
    Could you detail a little more how logging works in your Java server? Is this just straight logging on the standard output? Can the server still process incoming requests after this "log freeze" occurs?

    Cheers,
    Benoit.

  15. #15
    joshmoore is offline Registered User
    Name: Josh Moore
    Organization: Glencoe Software, Inc.
    Project: OMERO, http://trac.openmicroscopy.org.uk/omero
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    115
    Hi Benoit,

    The Java server logs via commons-logging/log4j to standard out which is caught by icegridnode and printed to var/log/master.out:

    Code:
    #IceGrid.Node.Output=var/log
    #IceGrid.Node.RedirectErrToOut=1
    
    # Defines that the standard out from this node and all
    # contained servers will be saved in the following files.
    # If Output is set, then output from the servers will be
    # redirected _after_ Ice.Communicator creation.
    Ice.StdOut=var/log/master.out
    Ice.StdErr=var/log/master.err
    The only other possibly related tidbit is that the Java process is started via python's os.execvpe (where icegridnode runs "python ...").

    And yes, the server still behaves completely normally except for stdout. The only way to "restart" the log that I've found is restarting the server.

Page 1 of 2 1 2 LastLast

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. 3.3.1 test suite failures on Mac OS X 10.5
    By blair in forum Bug Reports
    Replies: 4
    Last Post: 04-05-2009, 05:43 PM
  2. Replies: 2
    Last Post: 12-16-2005, 10:10 PM
  3. Ice 3.0.0 / Mac OS X 10.4.3, GCC 4.0.1
    By lindholm in forum Bug Reports
    Replies: 3
    Last Post: 11-21-2005, 04:54 AM
  4. Compiling demos on Mac OS X 10.4.2
    By CAS in forum Help Center
    Replies: 4
    Last Post: 09-30-2005, 07:33 PM
  5. IcePy 2.0.0 fails, hangs on Mac OS X 10.3
    By Michael Halle in forum Bug Reports
    Replies: 1
    Last Post: 11-30-2004, 10:07 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •