[SUMMARY] SunOS: A major overhaul

From: Ace Stewart (jstewart@zookeeper.cns.syr.edu)
Date: Fri Jan 03 1992 - 02:26:35 CST


My apologies for the lateness of this summary; a vacation interrupted.
The best approach was a combination of things that several people repeated,
but the answer that provides the most insight was from:

jdschn@nicsn1.monsanto.com (John D Schneider)

...to whom I owe great thanx. It is the first answer in the list.

My thanx to the following for answers:

>From: simon@freeside.Aus.Sun.COM (Simon Woodhead - Technical Consultant)
>From: markets!keith@uunet.UU.NET (Keith Farrar)
>From: mike@inti.lbl.gov (Michael Helm)
>From: rackow@antares.mcs.anl.gov
>From: Brendan Kehoe <brendan@cs.widener.edu>
>From: kalli!kevin@fourx.Aus.Sun.COM (Kevin Sheehan {Consulting Poster Child})
>From: Arie Bikker <aribi@geo.vu.nl>
>From: "Anthony A. Datri" <datri@concave.convex.com>
>From: randy@ncbi.nlm.nih.gov (Rand S. Huntzinger)
>From: parens@cards.dazixco.ingr.com (paul arens x.0108)
>From: zjat02@trc.amoco.com (Jon A. Tankersley)
>From: jdschn@nicsn1.monsanto.com (John D Schneider)
>From: Mike Raffety <miker@sbcoc.com>
>From: Ian Angles <ia@st-andrews.ac.uk>

John Stewart
Senior UNIX/VMS Consultant
Academic Computing Services
Syracuse University
(315) 443-3995

Here was the original question:

=====================================================================
Question about updating:

Our current configuration is a 4/280 running 4.1.1, a 4/490 running
4.1.1, and a Sparc2 running 4.1.1 -- all of these are running as
NFS servers to one another for various reasons not important here.

We are updating to a 690 and two 670's which requires an update to
4.1.2. Our problem is as follows: we can't update to 4.1.2 on the old
machines and then dump and restore to the new machines since the new
machines are different architectures and the update from the tapes
to 4.1.2 won't be configured for the right kernel on the new machines.

We also don't feel like installing a new OS on three machines and
then manually copying who knows how many files from our old systems to
bring the new systems up as if nothing happened. This is definitely
not a good idea for us. Yet, it may be the way.

Has anyone done this? Upgraded hardware and done an OS upgrade at the
same time -- and if so, how did you approach it? We're running multi-user
on at least one of these machines and are trying to keep downtime to
an absolute minimum, but this definitely throws a wrench in the works.

Please, any advice will help. My thanx.

=====================================================================

>From: jdschn@nicsn1.monsanto.com (John D Schneider)

We haven't done exactly what you have done, but we have upgraded several
Sun 3's to Sun 4's, which required a reinstall of SunOS after the hardware,
as well as a restore of the user's previous environment.
We did them one at a time. On each machine to be upgraded we:

1) Take a good backup to tape the night before the upgrade.

2) Back up those files on our system that are installation modified, using
a script I developed over a period of weeks as I upgraded all our Suns to
SunOS 4.1.1. We first NFS mount a disk that has a little room onto /mnt:

mount server:/usr/temp_backup /mnt

then run this script (which will need to be customized for your site):

#! /bin/sh -x
# script to backup to a server all the important
# files that will need to be reinstated after an upgrade to SUN OS 4.1.1.
# built by John D. Schneider 04/16/91
# History:
# 05/04/91 JDS Added name server config files

mkdir /mnt/etc
mkdir /mnt/var
mkdir /mnt/usr
mkdir /mnt/usr/etc
mkdir /mnt/usr/spool
mkdir /mnt/usr/spool/cron
mkdir /mnt/usr/local
mkdir /mnt/sys
mkdir /mnt/sys/conf

cp -rp /etc/perm* /mnt/etc
cp -rp /etc/volnm.dat /mnt/etc
cp -rp /etc/rc* /mnt/etc
cp -rp /etc/remote /mnt/etc
cp -rp /etc/sendmail* /mnt/etc
cp -rp /etc/shells /mnt/etc
cp -rp /etc/netmasks /mnt/etc
cp -rp /etc/ethers /mnt/etc
cp -rp /etc/netgroup /mnt/etc
cp -rp /etc/exports /mnt/etc
cp -rp /etc/maillist* /mnt/etc
cp -rp /etc/aliases /mnt/etc
cp -rp /etc/auto* /mnt/etc
cp -rp /etc/group /mnt/etc
cp -rp /etc/passwd /mnt/etc
cp -rp /etc/services /mnt/etc
cp -rp /etc/admin* /mnt/etc
cp -rp /etc/fstab /mnt/etc
cp -rp /etc/format.dat /mnt/etc
cp -rp /etc/hosts /mnt/etc
cp -rp /etc/hosts.equiv /mnt/etc
cp -rp /etc/termcap /mnt/etc
cp -rp /sysinfo /mnt
cp -rp /usr/spool/cron/.proto /mnt/usr/spool/cron
cp -rp /sys/conf/NODENAME /mnt/sys/conf
cp -rp /.cshrc /mnt
cp -rp /.defaults /mnt
cp -rp /.exrc /mnt
cp -rp /.login /mnt
cp -rp /.permissions /mnt
cp -rp /.profile /mnt
cp -rp /.rhosts /mnt
cp -rp /.rootmenu /mnt
cp -rp /.suntools /mnt
cp -rp /etc/resolv.conf /mnt/etc

cp -rp /usr/etc/savefs /mnt/usr/etc # NSR program files
cp -rp /usr/etc/recover /mnt/usr/etc

find /usr/local -fstype 4.2 -xdev -print | cpio -paduvm /mnt
find /usr/cops -fstype 4.2 -xdev -print | cpio -paduvm /mnt
find /usr/var -fstype 4.2 -xdev -print | cpio -paduvm /mnt

if [ -d /etc/namedb ] # name server files (namedb is a directory)
then
mkdir /mnt/etc/namedb
cd /etc/namedb
find . -fstype 4.2 -xdev -print | cpio -paduvm /mnt/etc/namedb
cp -rp /etc/inetd.conf /mnt/etc
cp -rp /etc/named.boot /mnt/etc
fi
cd
exit

3) After this script is run, we also back up individual users' directories
that happen to be on the same disk as the OS:

mkdir /mnt/usr2/USERNAME
cd /usr2/USERNAME
find . -fstype 4.2 -xdev -print | cpio -paduvm /mnt/usr2/USERNAME

4) Shut down the node and install the new hardware.

5) Install a new version of SunOS using tapes or CDROM.

6) NFS mount your backup disk as you did before:

mount server:/usr/temp_backup /mnt

7) Run the following script which restores the modified files backed up
earlier, as well as any user data (under /usr2):

#! /bin/sh -x
# script to restore important files backed up by 'sunosbackup' to a server
# These files are to be backed up by 'sunosbackup' prior to a SUN OS upgrade
# and then restored using this script.
# built by John D. Schneider 04/16/91

cp -rp /mnt/etc/perm* /etc #restore SUNLINK config files

cp -p /etc/volnm.dat /etc/volnm.dat.org
cp -rp /mnt/etc/volnm.dat /etc

cp -p /etc/rc /etc/rc.org # restore local rc files
cp -p /etc/rc.boot /etc/rc.boot.org
cp -p /etc/rc.local /etc/rc.local.org
cp -p /mnt/etc/rc /etc
cp -p /mnt/etc/rc.boot /etc
cp -p /mnt/etc/rc.local /etc

cp -p /etc/remote /etc/remote.org # restore /etc/remote
cp -p /mnt/etc/remote /etc

if [ -f /mnt/etc/sendmail.cf ]
then
cp -p /etc/sendmail.cf /etc/sendmail.cf.org # restore /etc/sendmail.cf
cp -rp /mnt/etc/sendmail* /etc
fi

cp -p /mnt/etc/shells /etc

cp -p /etc/netmasks /etc/netmasks.org # restore /etc/netmasks
cp -p /mnt/etc/netmasks /etc

cp -p /mnt/etc/ethers /etc # restore /etc/ethers

cp -p /mnt/etc/netgroup /etc # restore /etc/netgroup

cp -p /etc/exports /etc/exports.org # restore /etc/exports
cp -p /mnt/etc/exports /etc

cp -rp /mnt/etc/maillist* /etc # restore any maillists

cp -p /etc/aliases /etc/aliases.org # restore /etc/aliases
cp -p /mnt/etc/aliases /etc

cp -rp /mnt/etc/auto* /etc

cp -p /etc/group /etc/group.org # restore /etc/group file
cp -p /mnt/etc/group /etc

cp -p /etc/passwd /etc/passwd.org # restore /etc/passwd file
cp -p /mnt/etc/passwd /etc

cp -rp /mnt/etc/admin* /etc # restore all /etc/admin* files

cp -p /etc/fstab /etc/fstab.org # restore /etc/fstab
cp -p /mnt/etc/fstab /etc

cp -p /etc/format.dat /etc/format.dat.org # restore /etc/format.dat
cp -p /mnt/etc/format.dat /etc

cp -p /etc/hosts /etc/hosts.org # restore /etc/hosts
cp -p /mnt/etc/hosts /etc

cp -p /etc/termcap /etc/termcap.org # restore /etc/termcap
cp -p /mnt/etc/termcap /etc

cp -p /mnt/sysinfo / # restore /sysinfo

cp -p /mnt/.cshrc / # restore root's customizations
cp -p /mnt/.defaults /
cp -p /mnt/.exrc /
cp -p /mnt/.login /
cp -p /mnt/.permissions /
cp -p /mnt/.profile /
cp -p /mnt/.rhosts /
cp -p /mnt/.rootmenu /
cp -p /mnt/.suntools /
cp -p /mnt/etc/resolv.conf /etc

mkdir /usr/spool # restore /cron/.proto
mkdir /usr/spool/cron
mkdir /usr/var/spool/mail
cp -p /usr/spool/cron/.proto /usr/spool/cron/.proto.org
cp -p /mnt/usr/spool/cron/.proto /usr/spool/cron

mkdir /usr/local # restore /usr/local
cd /mnt/usr/local
find . -print | cpio -paduvm /usr/local

mkdir /usr/cops # restore /usr/cops
cd /mnt/usr/cops
find . -print | cpio -paduvm /usr/cops

mkdir /var # restore /var
cd /mnt/var
find . -print |cpio -paduvm /var

if [ -d /mnt/etc/namedb ] # restore name server files
then
mkdir /etc/namedb
cd /mnt/etc/namedb
find . -print | cpio -paduvm /etc/namedb
cp -p /mnt/etc/inetd.conf /etc
cp -p /mnt/etc/named.boot /etc
fi

# restore /usr/usr2 local directories and files if any exist
if [ -d /mnt/usr/usr2 ]
then
mkdir /usr/usr2
cd /mnt/usr/usr2
find . -print | cpio -paduvm /usr/usr2
fi

cd
exit

8) Install application software (we have Sunlink DNI, TE100, and a few others).

9) Reboot and test.
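
The two scripts above repeat one pattern per file: copy it to the backup
area, then later keep the freshly installed version as FILE.org and put the
saved copy back. A minimal sketch of that pattern driven from a list follows.
The names save_files and restore_files and the sample FILES list are
illustrative assumptions, not part of John's scripts, and it is written for
a POSIX sh rather than the SunOS 4.x /bin/sh. The root-prefix arguments are
there only so it can be tried against a scratch tree first.

```sh
#!/bin/sh
# Sketch only: list-driven variant of the backup/restore idea above.
# save_files, restore_files and FILES are hypothetical, for illustration.

FILES="etc/passwd etc/group etc/hosts etc/fstab etc/exports etc/aliases"

# save_files SRC_ROOT BACKUP_ROOT
# Copy each listed file that exists under SRC_ROOT into BACKUP_ROOT,
# keeping its relative path.
save_files() {
    src=$1
    dst=$2
    for f in $FILES; do
        if [ -f "$src/$f" ]; then
            dir=$(dirname "$f")
            mkdir -p "$dst/$dir"
            cp -p "$src/$f" "$dst/$f"
        fi
    done
}

# restore_files BACKUP_ROOT DEST_ROOT
# For each saved file: keep the newly installed version as FILE.org,
# then copy the saved version back into place.
restore_files() {
    src=$1
    dst=$2
    for f in $FILES; do
        if [ -f "$src/$f" ]; then
            dir=$(dirname "$f")
            mkdir -p "$dst/$dir"
            if [ -f "$dst/$f" ]; then
                cp -p "$dst/$f" "$dst/$f.org"
            fi
            cp -p "$src/$f" "$dst/$f"
        fi
    done
}
```

Something like "save_files / /mnt" before the upgrade and "restore_files
/mnt /" afterwards would mirror steps 2 and 7, with the caveat that the file
list still has to be customized for each site.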

=====================================================================

>From: Mike Raffety <miker@sbcoc.com>

Well, hopefully, you kept all your site-specific files OUT of /usr
(and, to the extent possible, root, too).

Then you CAN dump/restore all the other partitions (e.g., /home or
/usr/local or whatever).

Please be sure to summarize back to the list; thanks.

=====================================================================

>From: Ian Angles <ia@st-andrews.ac.uk>

Hmmm, methinks it's time to bite the bullet, but a few words of advice.
We're in the middle of updating our machines, both hardware (ss1+ to
ss2/IPC) and software (4.1 to 4.1.1) at the same time. What we've done
is go through all our config files (or the important ones at any rate)
and note the differences between old & new. Then a patch file is
generated (by hand or with 'diff -c'), which can be used in the next upgrade.
Since we're doing about 80 odd machines it's been worth it - I can get
a new cluster (1 server & 7 clients) up and running in about 6 hours
from naked discs to on the network. Of course, things still have to be
hand tuned but it's worked OK so far (about 50% of our machines are
done).
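
As a rough sketch of the patch-file approach Ian describes: record the
difference between the stock file and the site-customized file as a context
diff, then apply it to the stock file on the next freshly installed machine.
The function names and file arguments are assumptions for illustration; only
'diff -c' comes from the message above, and the use of patch(1) to apply the
result is assumed.

```sh
#!/bin/sh
# Sketch only: make_patch and apply_patch are hypothetical helper names.

# make_patch STOCK CUSTOM PATCHFILE
# Save the local changes as a context diff. diff exits with status 1 when
# the files differ, which is the expected case, so only a status above 1
# is treated as an error.
make_patch() {
    diff -c "$1" "$2" > "$3"
    status=$?
    [ "$status" -le 1 ]
}

# apply_patch TARGET PATCHFILE
# Re-apply the recorded changes to a freshly installed copy of the file.
apply_patch() {
    patch "$1" < "$2"
}
```

If the vendor file changes between releases, patch may reject some hunks,
which is presumably where the hand tuning mentioned above comes in.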

=====================================================================

>From: simon@freeside.Aus.Sun.COM (Simon Woodhead - Technical Consultant)

Here's an idea. It's somewhat dependent on how "clean" your
current installation is, i.e., how much is customised in / & /usr.

1) Do a generic 4.1.2 install on the new machines.

2) Use the new 4.1.2 sunupgrade utility **with "-dummy" set**, to
produce a list of all the files that 4.1.2 would normally
replace.

3) Examine list of files to enable you to change (or copy across)
whatever is necessary on the new machines.

The user partitions can be dump/restored in the normal way.
The kernel config files should need virtually no change (just
the MACHINE & CPU).

The "sunupgrade" utility is pretty good. Those of us within
Sun who've used it have been impressed. The -dummy flag has
been found to be really useful...

Your 4.1.2 documentation will explain how sunupgrade works,
what it produces as output, and how to interpret that output.

=====================================================================

>From: markets!keith@uunet.UU.NET (Keith Farrar)

4.1.2 restore should be able to read the 4.1.1 dump tapes.
You could do an interactive restore, pruning off the subdirectories
which you don't want to transfer over to the new machine (anything
kvm-specific or otherwise superseded, for example).

The above was my approach with our last concurrent h/w s/w upgrade
(4/280 with 4.0.3 to 4/490 with 4.1_PSR_A).

======================================================================

>From: mike@inti.lbl.gov (Michael Helm)

Answering this depends a lot on the details of your configuration,
but really, most machines have only arch & sys dependent things
in /etc/, no? So if you tar this off some place (& prune out the
links to executables & garbage files) then you have most of the copying
done. You shouldn't need to restore anything if your partitions
are laid out compatibly. If you are having some kind of space allocation
problems (not enuf /usr, swap, or things in the wrong place &c) maybe
you should solve this first? Another possibility is to replace the
simplest machine 1st with the new hardware, whichever one is least
used, fewest disks, users, &c. Then you will have a system back on
the air quicker, with resources that can support partial installations
of the other hardware (like diskless boots).
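
A minimal sketch of the "tar off /etc" idea, assuming a POSIX sh and a tar
that accepts "cf -" and "tf". The function names are made up for
illustration. list_links only prints the symbolic links so they can be
reviewed and pruned by hand; it does not decide which ones are links to
executables.

```sh
#!/bin/sh
# Sketch only: archive_dir, list_archive and list_links are hypothetical.

# archive_dir DIR ARCHIVE
# Write the contents of DIR, as relative paths, to the tar file ARCHIVE.
archive_dir() {
    (cd "$1" && tar cf - .) > "$2"
}

# list_archive ARCHIVE
# Show what was saved.
list_archive() {
    tar tf "$1"
}

# list_links DIR
# Print the symbolic links under DIR, as candidates for pruning.
list_links() {
    find "$1" -type l -print
}
```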

Good luck!

======================================================================

>From: rackow@antares.mcs.anl.gov

We recently did an upgrade/hardware swap that is slightly more radical than
what you are proposing. It really isn't all that bad if you are putting in
new disks as well.

Our old config was 4 Sun3/280 each with about 2 gig of disk acting as file
servers for a bunch of Sun3 diskless desktops and a few sparcs. They also
served files to a variety of other hardware platforms. The new servers are
Sparcstation-2's with 4+ gig of SCSI disk each.

What we did was to bring up the new machines on the new OS and got all the
disks formatted/partitioned/newfs'd. We then brought the new servers up in
parallel to the old servers. Next was to reboot the old servers without the
user file systems mounted. This will keep the files stable while you're
doing the copies... Now do a
"cd newdiskpart; rsh oldserver dump 0df 8888888 - | restore if -"
to copy the disks across the net.

Using parallel copies going on on the 3 nets of the system, we were able to
copy the user files in about 2.5 hours. The servers were up and running and
the users saw very little downtime or change in function.
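
The parallel part of this can be sketched with background jobs and wait.
This is only an illustration of the shell pattern: copy_tree is a stand-in
that uses a local tar pipe, not dump and restore, and both function names
are assumptions. On a real system each background job would be one of the
rsh dump | restore pipelines quoted above.

```sh
#!/bin/sh
# Sketch only: copy_tree and parallel_copy are hypothetical helper names.

# copy_tree SRC DST
# Stand-in for one dump | restore pipeline: copy SRC into DST with tar.
copy_tree() {
    mkdir -p "$2" && (cd "$1" && tar cf - .) | (cd "$2" && tar xf -)
}

# parallel_copy DST_ROOT SRC...
# Start one copy per source in the background, then wait for each one
# and count the failures. Succeeds only if every copy succeeded.
parallel_copy() {
    dst_root=$1
    shift
    pids=""
    for src in "$@"; do
        copy_tree "$src" "$dst_root/$(basename "$src")" &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    [ "$failed" -eq 0 ]
}
```

Running the jobs side by side only pays off when they do not compete for the
same disk or the same network, which matches the "3 nets" setup described
here.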

-_Gene

=====================================================================

>From: Brendan Kehoe <brendan@cs.widener.edu>

I'd suggest not upgrading to 4.1.2. :)

=====================================================================

>From: jack@laguna.CCSF.Caltech.EDU (Jack Stewart)

We have the boards on order to do a board upgrade. I don't think that they
are shipping yet but I can let you know how it goes after we do the upgrade.

The 4.1.2 release only contains support for the MP machines, to the best of
my knowledge. I don't believe that there are any new file systems, dumps,
restores or tars.

What is it that you are actually trying to do on these machines? The files
on the user disks should be fine. The files in the / and /usr partition on
the old system won't be any good to you anyway. There are only about 5
configuration files (printcap, passwd, etc). All of these can be restored
from tape (I usually cheat and copy them to one of the user disks). So I am
not exactly sure of what you are trying to do (or what the area of concern
is)? If you describe it to me in more detail I'd be glad to help.

=====================================================================

>From: kalli!kevin@fourx.Aus.Sun.COM (Kevin Sheehan {Consulting Poster Child})

This may be a silly assumption, but I would presume that the upgrade
kit anticipated that problem. I know the Sun guys here get them up
in a matter of hours, so that doesn't sound like a re-install to me...

=====================================================================

>From: Arie Bikker <aribi@geo.vu.nl>

I did something like this when stepping from 3.5 to 4.0.3. Here's the recipe:
You need a spare partition.
Build 4.1.2 plain vanilla
Dump /xxx.4.1.1 on the spare (possibly from tape)
Dump and restore in one operation /xxx.4.1.2 on spare
Change fstab to switch /xxx.4.1.2 to spare
Reboot (and hope for the best)
If all's well, dump and restore (mounted) spare to (original) /xxx.4.1.2

You can repeat this for any of the installation affected partitions.
This procedure saves the files you had added to plain vanilla.
The altered files are a different story. A simple find with appropriate
mtime and exec will do that job.
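
A minimal sketch of that find, assuming a POSIX find and sh. The function
names and the choice of a day count are illustrative; Arie does not say
which mtime test or exec command he used.

```sh
#!/bin/sh
# Sketch only: changed_since and copy_changed are hypothetical names.

# changed_since DIR DAYS
# Print regular files under DIR modified less than DAYS days ago.
changed_since() {
    find "$1" -type f -mtime -"$2" -print
}

# copy_changed DIR DAYS DEST
# Copy those files into DEST, preserving times and modes. DEST should lie
# outside DIR. The copy is flat, so files with the same name in different
# subdirectories would overwrite each other.
copy_changed() {
    find "$1" -type f -mtime -"$2" -exec cp -p {} "$3" \;
}
```

For example, "changed_since /etc 400" would list what has been touched in
/etc since an install done roughly a year ago, assuming the install date is
known well enough to pick the day count.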

=====================================================================

>From: "Anthony A. Datri" <datri@concave.convex.com>

This is a perfect example of why it's important to isolate local changes/
additions as much as possible. Local software should all live under one
place -- /usr/local, for example -- that is easily isolable. Things like
/etc/hosts, /etc/aliases, /etc/passwd, and so forth should be maintained and
distributed from a central database. Thus, you isolate the boilerplate
(the os) from local state. Of course, certain stupidities of Sun's get
in the way, like their brain-dead treatment of ifconfig in the rc files.

=====================================================================

>From: randy@ncbi.nlm.nih.gov (Rand S. Huntzinger)

I wish I could help you but we haven't done this yet. Like you, it's
on the plate for the future. If you get any particularly bright replies
please let me know - or post a summary back to the net.

I'm afraid however, that we're probably going to be stuck with doing it
the old blood and guts way. There are things which you can do to help minimize
the downtime.

1. A detailed plan is essential. You spend a lot of time figuring out
what you're going to need to do and in what order. With a plan you
can also delegate pieces of the job to others, so you can get things
going quickly.

2. I have been known to dump stuff off of the tape (oops - CD) and
pick out the files I need to edit on the old system before the
upgrade. Things like the kernel configuration file, rc.local, fstab,
printcap, passwd, magic, etc. By editing these files before the
upgrade, you don't have to waste time editing them when the system
is unavailable.

3. Decide ahead of time which things you can defer installing until after
the system is available to users. Inform users using /etc/motd, mail,
etc. what will not be available immediately and then inform them as
things become available. This allows people to get working on some
things while you're completing the job. Again - a plan is essential.

4. If you have to do dumps and restores, and you have multiple tape
drives, you can do them in parallel. This can save lots of time.
But like fsck - don't try to dump or restore two partitions on the
same disk at the same time. That turns a big win into a big loss.

5. If you're going to have to do dumps and restores - do the dumps on
inactive filesystems. That is the system down, or the filesystem
dismounted, or the filesystem read-only. You don't want to mess up
your dumps on something this important.

6. Again, related to planning. Make sure you have everything you're
going to need laid out ready to go. You don't want to have the
system down while you're looking for a tape or a manual.

7. I usually do most of my work with the system in multi-user mode
and logins disabled or a prominent /etc/motd message saying the
system is not officially up and can be shut down at short notice.
This allows me to connect multiple windows from my workstation to
the machine over the net and do more than one thing at once.

8. Keep records of what you did in building the system, and add to the
list records of the changes you made afterwards so it'll be easier
to build your plan for the next upgrade.

9. Once you've built one machine, you can probably build the kernels,
prepare edited files, etc. for the others on it. The 690 and 670's
are, I think, the same architecture.

Hope this helps - it's really pretty general sort of stuff and you've
probably thought of much of it already.

=====================================================================

>From: parens@cards.dazixco.ingr.com (paul arens x.0108)

John,

I have upgraded two systems with the 4.1.2 OS this month. One of the
upgrades was a 4/330 in our production environment, the other was building
an "out-of-the-box" 670 with 4.1.2. While neither of these quite matches
your exact situation I thought I'd share what I learned.

The installation process is (for a full rebuild) almost identical to 4.1.1.
Very easy, and straightforward. As for the operation of 4.1.2 OS we have
seen very few "features" that have caused us problems. The only large one is
that any setUID programs compiled on 4.1.2 are NOT backwards compatible to
4.1.1. Upward compatibility works fine!
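
Given that warning, it may help to know which locally built programs are
setuid before mixing 4.1.1 and 4.1.2 machines. A minimal sketch, assuming a
POSIX find; the function name is made up for illustration, and the check
only lists files with the setuid bit, it does not test compatibility.

```sh
#!/bin/sh
# Sketch only: list_setuid is a hypothetical helper name.

# list_setuid DIR...
# Print regular files under the given directories that have the setuid
# bit (mode 4000) set.
list_setuid() {
    find "$@" -type f -perm -4000 -print
}
```

Running it over trees such as /usr/local gives a list of programs to rebuild
or retest on the older release.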

The 670 that I helped build was a rather flaky box. We had repeated problems
with building the CDC IPI drives (still no explanation on why). The Sun tech
line gave us bad info on installing the PrestoServe boards which forced them
to send us a replacement.

All in all: With the exception of the flaky hardware my experiences with
both products have been quite good. The 4.1.2 seems to be very stable. The
MP unit that we have (now a 630) has stabilized quite well and is more than
pulling its load. So, other than the normal problems associated with
completely rebuilding the /usr partition the upgrade went quite well.

Good Luck!!! Merry Christmas!!! I am interested in hearing how you finally
handle the upgrade.

Paul Arens
Manager Systems Product Support
Dazix, An Intergraph Company
parens@dazixco.ingr.com

=====================================================================

>From: zjat02@trc.amoco.com (Jon A. Tankersley)
To: jstewart@mailbox.syr.edu
Subj: Re: [SunOS] A major overhaul -- has anyone done this?

Yes... It is a real pain. Basically, keep a copy of /etc and /var (for mail)
around.

Most of /usr will get wiped out, so you may want to copy any extra stuff you
put there (local, etc.) someplace, or you can try to preserve /usr.

From the restore standpoint, you don't have to restore into the same
directory. You can restore anywhere and copy the necessary files over.

=====================================================================


