SUMMARY: unhelpful error messages

From: Steve Elliott (se@comp.lancs.ac.uk)
Date: Sat May 07 1994 - 17:19:46 CDT


I recently asked:
>Last night one of my systems, a SS10 with SunOS4.1.3,
>crashed with the following errors:
>mb_mapfree: MR == 0!!!
>mb_mapfree: MR == 0!!!
>mb_mapfree: MR == 0!!!
>panic on cpu0: tmp_rename

>Anyone able to decipher this? The system uses tmpfs, my suspicion
>is that /tmp filled up and bumped into the system trying to swap.
 
I got one reply:
        From: Joe Silva <jsilva@com.polaris1.rocknroll>
        Message-Id: <9405041919.AA05384@rocknroll.polaris1.com>
        To: se@uk.ac.lancs.comp
        Subject: Re: unhelpful error messages
        Cc: jsilva@COM.DMC
        Content-Type: X-sun-attachment
        Sender: postmaster@uk.ac.nsfnet-relay
        Status: RO

        Steve,
        It appears you may have a software bug. See attached.

        Joe

         Bug Id: 1029783
         Category: kernel
         Subcategory: driver
         Release summary: 4.1prebeta, 4.0.3
         Synopsis: mb_mapalloc can call back while driver 'protected' by spl.
                 Integrated in releases: 4.1
         Summary:
        There is a potential problem for drivers using mb_mapalloc() directly
        and expecting a callback in the case where resources aren't immediately
        available.

        The callback can occur while the driver is in its interrupt handler.

        This is because the callback routine is called by mb_mapfree, which
        can be called from the interrupt handler from another driver which
        has a higher priority.

        The scenario:

                Driver A calls mb_mapalloc(), DVMA is not available, so driver A's
                callback routine is queued.
                
                Driver A gets an interrupt.

                During the driver A interrupt handler, driver B gets an interrupt,
                this can happen since driver B interrupts are higher priority.

                Driver B calls mb_mapfree() which calls driver A's callback routine.
                
                Driver A didn't expect to be re-entered this way and something bad
                happens, such as a request list being damaged.
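
For anyone who wants to see the scenario above in motion, here is a minimal
toy simulation of it. It is only a sketch: the class and method names
(DvmaMap, DriverA, DriverB, run_scenario) are made up for illustration and
are not the SunOS kernel interfaces, and a "nested interrupt" is modelled
as a plain function call made while driver A's handler is still running.

```python
class DvmaMap:
    """Toy stand-in for the DVMA map (hypothetical, not the SunOS code)."""

    def __init__(self, slots):
        self.free_slots = slots
        self.waiters = []  # callbacks queued by drivers that could not map

    def mapalloc(self, callback):
        # Grant a slot if one is free, otherwise queue the caller's callback.
        if self.free_slots > 0:
            self.free_slots -= 1
            return True
        self.waiters.append(callback)
        return False

    def mapfree(self):
        # As in the bug report: queued callbacks run in the context of
        # whoever frees the resource, whatever that context happens to be.
        self.free_slots += 1
        while self.waiters and self.free_slots > 0:
            callback = self.waiters.pop(0)
            callback()


class DriverA:
    def __init__(self, dvma):
        self.dvma = dvma
        self.in_interrupt = False
        self.reentered = False
        self.mapped = False

    def start_io(self):
        self.mapped = self.dvma.mapalloc(self.callback)

    def callback(self):
        # Record the unexpected case: called back while our own
        # interrupt handler is still active.
        if self.in_interrupt:
            self.reentered = True
        self.mapped = self.dvma.mapalloc(self.callback)

    def interrupt(self, nested=None):
        self.in_interrupt = True
        if nested is not None:
            nested()  # a higher-priority interrupt arrives mid-handler
        self.in_interrupt = False


class DriverB:
    def __init__(self, dvma):
        self.dvma = dvma

    def start_io(self):
        return self.dvma.mapalloc(lambda: None)

    def interrupt(self):
        self.dvma.mapfree()


def run_scenario():
    dvma = DvmaMap(slots=1)
    a, b = DriverA(dvma), DriverB(dvma)
    b.start_io()                      # B takes the only slot
    a.start_io()                      # A finds none free; callback queued
    a.interrupt(nested=b.interrupt)   # B interrupts A's handler and frees
    return a.reentered, a.mapped


if __name__ == "__main__":
    print(run_scenario())
```

The first value returned by run_scenario() reports whether driver A's
callback ran while its interrupt handler was active, which is the
re-entry the bug report warns about.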

The observant will note that the bug above was fixed in SunOS 4.1.
As I stated, we use 4.1.3. A chat with a Sun engineer over the phone
and a quick look at Sun's online database pointed me to patch #100507-05.

Keywords: tmpfs crash fail assertion leaks anonymous tmp_rename panic sparse files
Synopsis: SunOS 4.1.1, 4.1.2, 4.1.3: tmpfs jumbo patch

SunOS release: 4.1.1 4.1.2 4.1.3 4.1.3C

Topic: fixes for several tmpfs bugs

So I installed the patch. The user who was running a large simulation
when the system crashed ran it again last night and everything
was OK.

Steve


