SUMMARY: serious problems after attempted install of Patch ID 106541-05

From: Nico Wieland (nico.wieland@ubs.com)
Date: Tue Jul 20 1999 - 07:18:02 CDT


Dear Admins,

First of all, thanks a lot for all the help. I hope this summary will be
of use: a couple of people have had this problem, and there will
probably be more :-)

(Original problem description at the end; in brief, the system remained
unbootable after unsuccessful installation of patch ID 106541-05)

The cause is still not entirely clear; it looks like 106541-05 doesn't
postpatch correctly under certain circumstances. In my case, the reason
could be that /var filled up too quickly and, as it wasn't on a separate
slice, filled up the / partition with it. (This was the last time I put
/var on / ;-) One person suggested that it could be the fault of
106541-04, as he had gone through all the revisions from 01 through 04
without problems, but I don't think so, as all the other people with
this problem were installing revision 05 directly.

The solution, as some people suggested, was to boot the machine from
CD-ROM (or over the network) in single-user mode, fsck the boot
partition and any other system partitions under it, mount them (e.g. on
/mnt), then apply the patch again with

patchadd -u -R /mnt 106541-05

After this, the system booted like a charm, and I could even
successfully fsck /export/home, which before, when booted from CD-ROM,
had seemed dead beyond repair. (This is strange...) And again, I
recommend that everyone keep /var *not* on / but on a separate slice
(unlike what I had).
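
For reference, the recovery steps above might look roughly like the
sketch below when run from the CD-ROM (or network) single-user shell.
The slice name c0t0d0s0, the patch location /tmp/106541-05 and the
helper names run and recover_root are placeholders for illustration,
not taken from the affected machine. In patchadd, -u turns off file
validation and -R points at the root mounted on /mnt. By default the
script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of the repair, run from the single-user shell reached with
# "boot cdrom -s" (or "boot net -s") at the OpenBoot ok prompt.
# ASSUMPTIONS: slice and patch directory are supplied by the caller;
# the values shown in the comments are hypothetical examples.

DRY_RUN=${DRY_RUN:-1}      # 1 = only print the commands (default)

# Print a command, and execute it unless we are in dry-run mode.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# $1 = root slice, e.g. c0t0d0s0; $2 = directory with the unpacked patch.
recover_root() {
    slice=$1
    patch=$2
    run fsck "/dev/rdsk/$slice" || return 1        # check the raw device
    run mount "/dev/dsk/$slice" /mnt || return 1   # mount the damaged root
    # Separate /usr, /var or /opt slices would be checked and mounted
    # under /mnt here in the same way before patching.
    run patchadd -u -R /mnt "$patch"
}
```

Calling "recover_root c0t0d0s0 /tmp/106541-05" in dry-run mode lists the
three commands so they can be reviewed before anything touches the disk.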

Some questions still remain though:

- What (if anything) is wrong with the patch? (I actually don't think
it's only the patch's fault, as there are obviously *lots* of successful
patch installations.)
- How can this trouble be avoided, and what exactly screws up the
affected systems?
- Why does this happen only on some systems but not on others?
(Hardware-independent, btw: Ultra60, Ultra5, etc. are all affected.)

(I really wonder whether people with a separate /var slice and lots of
free space on / had this problem too....)
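
To see which of these cases a machine falls into before patching, a
check along the following lines may help. It is only a sketch: the
function names are made up, and the amount of free space to require is
a figure you have to supply yourself, since nothing in this thread says
how much room the patch actually needs.

```shell
#!/bin/sh
# Pre-patch check: which filesystem holds /var, and how much room is left?
# ASSUMPTION: function names and any threshold passed in are illustrative.

# Read "df -k <path>" output on stdin and print "<avail_kb> <mount_point>".
# Lines after the header are joined first, because df wraps the entry
# onto a second line when the device name is long.
parse_df() {
    awk 'NR > 1 { line = line " " $0 }
         END { n = split(line, f, " "); print f[n - 2], f[n] }'
}

# $1 = available KB, $2 = mount point holding /var, $3 = required KB.
# Prints a verdict; returns non-zero when space is below the requirement.
judge_var() {
    if [ "$1" -lt "$3" ]; then
        echo "STOP: only $1 KB free on $2, want $3 KB"
        return 1
    fi
    if [ "$2" != "/var" ]; then
        echo "WARNING: /var is not a separate slice (it lives on $2)"
    fi
    echo "OK: $1 KB free on $2"
}

# Check the live system; $1 = required KB (your own estimate).
check_var() {
    need=$1
    set -- `df -k /var | parse_df`
    judge_var "$1" "$2" "$need"
}
```

For example, "check_var 200000" asks for roughly 200 MB free on whatever
filesystem holds /var; the number is an arbitrary example.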

Thanx to:
        Bill (sysadm@its.brooklin.cuny.edu)
        Lauradel Collins (lauradel@cs.uoregon.edu)
        Jason Kau (jason.kau@gtri.gatech.edu)
        Robin Landis (robin.landis@exim.gov)
        Jeffrey Liu (jliu@aptix.com)
        Colin Melville (Colin_Melville@mastercard.com)
        Michael A. Peterson (peterson@chem.ufl.edu)
        Mike Salehi (Mike.Salehi@usa.xerox.com)
        Toomas Soome (tsoome@ut.ee)
        R. Steudle (rsteudle@swiss.sun.com)
        Max Trummer (max@bidland.com)

Nico Wieland nico.wieland@ubs.com
UBS AG / CoC CTS ++41-1-2387335

Original Message:

>Dear Admins,
>
>I tried to install Patch ID 106541-05 (kernel patch) on my Ultra 5
>(Solaris 7) (single user mode). the installation started fine, but in
>the progress of patch application, something seemed to have filled up /
>- the installation program then got stuck in a loop; it just displayed
>progress and error messages very quickly, but always the same ones. as i
>saw that there was no progress anymore, i cancelled the installation
>with ^C. the program told me that the system was not changed (as usual);
>after this i searched for files that could have been filling up the root
>partition, found some things i don't need (nsr stuff i don't use anyway,
>some huge log files etc.) and deleted them, then i rebooted. at startup
>the system gave me some strange error messages, and i thought, well, it
>probably got a bit confused and rebooted once more. then the problems
>started; init dropped me into single user mode but instead of working as
>usual, it now just says:
>
>INIT: cannot create /var/adm/utmp or /var/adm/utmpx
>INIT: failed write of utmpx entry:" "
>INIT: failed write of utmpx entry:" "
>INIT: SINGLE USER MODE
>ENTER RUN LEVEL (0-6, s or S):
>
>i can enter whatever i like, it just says 'will change to state S' (or
>whatever) then goes back to 'ENTER...'. then i booted from cdrom, tried to
>mount the partitions, and all of them were corrupted - /export/home so
>badly that fsck was not able to fix it anymore. i don't care too much
>about that, as i have a backup, but .... i'd hate to reinstall
>everything. now, i can't boot anymore and don't have much of a clue on
>how to bring it back up.
>
>does anyone have an idea what happened, what the problem could be now,
>and what i can do to a) bring the system back and b) avoid that such a
>thing happens again? this is quite urgent, as it is a server machine.
>
>i will summarize.
>
>many thanks in advance,
>
>Nico Wieland


