SUMMARY: Monitoring Endless Fork

From: James Kwong (kwong@solar.acast.nova.edu)
Date: Fri Feb 28 1997 - 17:04:32 CST


Thanks to all who took the time to respond.

Karl E. Vogel <vogelke@ss118.region2.wpafb.af.mil>
Rick Schieche <scud@lewis-deh2.army.mil>
Chan Cao <cao@exapps.com>
Casper Dik <casper@holland.Sun.COM>
Toth-Abonyi Mihaly <M.Toth-Abonyi@cc.u-szeged.hu>
Russell A Weeks <rweeks@math.usu.edu>
Kris Briscoe <internet1!svho1nfs_1.supervalu.com!hxktb0@cedar.mr.net>
Colin Melville <iv08480@WPRT13.MDC.COM>
David Lee <T.D.Lee@durham.ac.uk>
Peter Schauss <ps4330@okc01.rb.jccbi.gov>
Rich Kulawiec <rsk@itw.com>
David Montgomery <david@cs.newcastle.edu.au>
Jacques Rall <jacques.rall@za.eds.com>
Sue Gray <itusjg@ntx.city.unisa.edu.au>
Torsten Metzner <tom@diophant.uni-paderborn.de>
Ric Anderson <ric@rtd.com>
Kevin Sheehan <kevin@uniq.com.au>
K. Ravi <RAVKRISH.IN.ORACLE.COM.ofcmail@in.oracle.com>

Hope I didn't miss anyone.

Solutions:
1. Almost all of you suggested editing /etc/system and adding
   set maxuprc = <NUM> (where <NUM> ranged from 50 to 512).
   That's what I want!

2. A few suggested deleting the script.
   Did some testing on an SS5 with maxuprc set to 1000 (the maximum for that
   SS5; couldn't get it any higher). The load average rose to 20-123 right
   after I deleted the script, then slowly returned to normal. Worked!

3. Karl sent me a script called "skill". He suggested renaming the offending
   script and using skill to kill a lot of processes at once. Worked too!

   Also available at the following location, as suggested by Ric:
        ftp://jaguar.cs.utah.edu/pub/skill/skill-3.7.tar.Z
   
4. One suggested doing an su to the user and running "kill -1 -1".
   Haven't tried it.

5. Some suggested turning on accounting and monitoring the log, or using it
   to find the culprit afterward. Haven't tried it.

6. As pointed out by Colin, details on setting maxuprc can be found in
   Cockcroft's Solaris Tuning Guide, in the AnswerBook, on SunSolve Online,
   and at
        http://www.geocities.com/SiliconValley/6706/Sun.html
   Apologies for not RTFMing enough...
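
A minimal sketch for solution 1: checking a file in /etc/system format
before adding the limit. The helper name check_maxuprc and the value 100
are illustrative assumptions, not something any respondent sent. It assumes
a POSIX shell and awk, which on an older Solaris box may mean ksh and nawk
rather than /bin/sh and /usr/bin/awk. Note that a change to /etc/system
takes effect only at the next reboot.

```shell
#!/bin/sh
# Sketch only: check_maxuprc is a hypothetical helper, not a Solaris tool.
# It looks for an active "set maxuprc = N" line in an /etc/system-style
# file and either reports it or prints the line that would be added.
check_maxuprc() {
    file=$1     # file to inspect, e.g. /etc/system
    limit=$2    # illustrative per-user process limit, e.g. 100

    # The pattern is anchored at the start of the line, so comment lines
    # (which begin with '*' in /etc/system) never match. If several lines
    # match, this sketch simply keeps the last one.
    current=$(awk '
        /^[ \t]*set[ \t]+maxuprc[ \t]*=/ {
            sub(/^[ \t]*set[ \t]+maxuprc[ \t]*=[ \t]*/, "")
            sub(/[^0-9].*$/, "")
            value = $0
        }
        END { if (value != "") print value }' "$file" 2>/dev/null)

    if [ -n "$current" ]; then
        echo "maxuprc already set to $current"
    else
        echo "set maxuprc = $limit"
    fi
}

# Example use:
#   check_maxuprc /etc/system 100
```

The function only reads the file; appending the printed line is left as a
manual step so that a typo cannot damage /etc/system.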

-- James.

>Original Question:
>Yesterday, someone was running a shell script called "test". The content of
>the script was as follows:
> test
> test
> test
> ....
> ....
>
>The file permission is 700 and he typed "sh test" by accident (I guess).
>
>So, within 30 seconds, no one could log in to the system and the load
>average became very high, even on an SS1000E. I was able to do a ps on the
>console and found that the user had 1984 processes out of 2055 total. All
>the processes were running with CPU time 0:00. I tried to kill those
>processes with a simple foreach loop but got the message "Vfork failed".
>
>
>Question:
>Is there a way to prevent or handle a user running an endless shell-spawning
>program?
>
>I looked up "man limit" and it offers no protection of this kind.
>
>I looked up the "Proctool" and "Idled" manuals and couldn't find a way to
>set up a monitor for that.
>
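
On the monitoring part of the question, a rough sketch of counting
processes per user and flagging anyone over a threshold. The helper name
flag_heavy_users and the threshold of 50 are illustrative assumptions; it
assumes a POSIX awk (the -v option) and a ps that accepts -e and -o. It
only reports, and would not help once the process table is already full.

```shell
#!/bin/sh
# Sketch only: flag_heavy_users is a hypothetical helper.
# Reads one user name per line on stdin (one line per process) and prints
# "user count" for every user owning more than the given number of
# processes, sorted by user name.
flag_heavy_users() {
    threshold=$1
    awk -v max="$threshold" '
        NF { count[$1]++ }          # skip blank lines, tally by user
        END {
            for (user in count)
                if (count[user] > max)
                    print user, count[user]
        }' | sort
}

# Example use, e.g. run periodically from cron:
#   ps -e -o user= | flag_heavy_users 50
```

Feeding the function from stdin keeps it independent of any one ps
flavor, so the counting logic can be tried on canned input first.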

+---------------------------------+----------------------------------------+
| Unix System Administrator       | James Kwong                            |
| Nova Southeastern University    | kwong@solar.acast.nova.edu             |
| 3301 College Avenue             | 954-262-4906                           |
| Ft. Lauderdale FL 33314         |                                        |
+---------------------------------+----------------------------------------+


