[SUMMARY] Invincible

From: Shawn Tagseth <Shawn.Tagseth_at_crystaldecisions.com>
Date: Thu Nov 22 2001 - 14:24:46 EST
Thanks for the help, everyone.  The machine did end up restarting
after a /usr/sbin/reboot (thanks, Simon Convey).  I mistakenly thought a
reboot == shutdown -i 6.  
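
For anyone hitting the same wall, the escalation order that came out of
this thread can be written down as a small sketch. The function name
restart_escalation is made up for illustration, and it only prints the
commands; it does not run them. The comments reflect the documented
difference that shutdown goes through init and the rc scripts, while
reboot does not run them. The uadmin arguments are taken verbatim from
Simon's reply; as far as I recall from uadmin(2) they mean A_SHUTDOWN with
AD_HALT, so the box may halt rather than reboot. Treat that as an
assumption and check the man page first.

```shell
# Sketch only: print the restart commands from this thread, gentlest first.
# Nothing is executed here; review the output and run the steps by hand as root.
restart_escalation() {
    # 1. Orderly path: init runs the rc kill scripts, which can block
    #    indefinitely if a process never exits.
    echo "shutdown -g 0 -i 6"
    # 2. Skips the rc scripts; syncs the disks and restarts the kernel.
    echo "/usr/sbin/reboot"
    # 3. Last resort quoted in the thread: a direct uadmin call.
    echo "/usr/sbin/uadmin 2 0"
}
```

Calling restart_escalation lists the three steps in order, so each one
can be given a few minutes before moving on to the next.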

As for the problem, Rainer Heilke said he ran into a similar problem a while
ago and increased shmsys:shminfo_shmmax in /etc/system (an ipcs -a before
reboot did show a lot of shared memory, but that is not unusual in our dev

BTW the machine is a Dual 450 220R, running a pitifully old Solaris 8 kernel
(there are reasons why it's still at 108528-03).  There were no messages in
the logs.

Thanks again everyone.  Below are the full responses.


From: Jonathan A. Zdziarski [mailto:jonathan@networkdweebs.com]
Subject: RE: Invincible

Do you have enough CPU cycles to do an lsof and find out what resources ps
might be spinning on?
What if you kill the ps (somehow) and truss it or sotruss it?
Do you still have a /proc?

From: Convey, Simon [mailto:simon.convey@csfb.com]
Subject: RE: Invincible

or if that doesn't work,
/usr/sbin/uadmin 2 0
From: Yura Pismerov [mailto:ypismerov@tucows.com]
Subject: Re: Invincible
	It might be some system limits (number of processes, open files,
are not tuned up properly, so something is eating up all the resources
and the rest of the users/processes are queued up waiting for them. Check both
system wide and user limits once you bring the box back online.
From: Heilke, Rainer [mailto:Rainer.Heilke@atcoitek.com]
Subject: RE: Invincible

We had similar problems on one of our servers. We found one issue, and Sun
suggested a second fix. The first that we found was that our set
shmsys:shminfo_shmmax= setting in /etc/system was too low. Depending upon
what the server is used for, you may want to bump this up to 60% of the
physical RAM (remember to adjust it when you add more RAM). There is also a
bug on Solaris 2.6 that Sun alerted us to. If the system is fairly fully
configured, add a kobj setting into the /etc/system file as well. The
default was 100000, and we bumped it to 200000. The line looks like:

set kobj_map_space_len=0x200000

This is as per SRDB 20267. This was supposedly fixed in Sol7 and 8. You
didn't say which OS you were running, so I'm throwing it in as an "in
case". The value must always be on an even boundary. Start with 200000,
go to 300000, etc.

Rainer Heilke
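
The "60% of physical RAM" rule of thumb above is easy to turn into an
/etc/system line. Below is a minimal sketch, assuming a POSIX shell with
64-bit arithmetic such as bash (the stock /bin/sh on Solaris of that era
has no $(( )) expansion and will not run it). The helper name shmmax_line
and its arguments are hypothetical, not something from the thread.

```shell
# Sketch only: print an /etc/system line that sets shmmax to a share of RAM.
# shmmax_line is a hypothetical helper; 60 is the percentage suggested above.
shmmax_line() {
    ram_mb=$1        # physical RAM in megabytes
    pct=${2:-60}     # percentage of RAM to allow, defaults to 60
    # Convert megabytes to bytes, then take the percentage (integer division).
    bytes=$(( ram_mb * 1024 * 1024 * pct / 100 ))
    echo "set shmsys:shminfo_shmmax=${bytes}"
}
```

For example, shmmax_line 1024 prints
set shmsys:shminfo_shmmax=644245094. On Solaris the RAM figure can be read
from the "Memory size" line of prtconf output. The result still has to be
added to /etc/system by hand and only takes effect after a reboot.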
From: Daniel Zhuang [mailto:daniel.zhuang@amdocs.com]
Subject: RE: Invincible
Did you try exiting the telnet session from which you issued the "shutdown
..." command? In some cases, the reboot then starts.
As far as I know, Oracle internal processes can prevent the system from
rebooting. Kill them if there are any.

From: Gaziz Nugmanov [mailto:sunman@lists.gaziz.ca]
Subject: Re: Invincible
1/ check your PATH
2/ check/verify all the packages installed
3/ have you been hacked? Sure?
4/ reinstall OS

-----Original Message-----
From: Shawn Tagseth [mailto:Shawn.Tagseth@crystaldecisions.com]
Sent: Thursday, November 22, 2001 10:19 AM
To: Sunmanagers (E-mail)
Subject: Invincible

I have a machine that is acting very strangely.  I can log into the machine
and run various processes (e.g. vmstat, uptime, etc.), but as soon as I try to
run commands like ps, pkill or top they hang and do not come back even with
a ^C.  vmstat is showing 100% sys usage.

It's a machine used by our developers and seemed to go into this state while
running Sun's WorkShop debugger.

I call the machine invincible because I thought "I could work all day to
find out what is causing this (without ps etc) or I can restart it".
Deadlines being what they are (and because I can still log in at the
moment), we decided to restart.  A shutdown -g 0 -i 6 is not bringing
the system down.

Shawn K. Tagseth

PS: it's been about 20 minutes since the shutdown command and the machine is
still running :(
Received on Thu Nov 22 19:24:46 2001
