SUMMARY twilight zone cronjob

From: Steve Swaney (swanes@eq.gs.com)
Date: Sat Sep 25 1993 - 16:27:11 CDT


On Sep 21, 9:30am, The problem:

> Symptoms:
>
> Every few nights, between 3:15 and 3:16 AM, the system gets into a
> "confused" state. Any command which pokes at the kernel structures, e.g.
> uptime or ps, dies. The system behaves exactly as if we had built a new
> kernel and moved it to /vmunix, did not reboot, and then used commands
> which read /vmunix. For example, ps reports:
>
> ps: /dev/mem: read error on ktextseg: Bad address
> ps: could not read kernel VM
>
> Other than this problem, the system behaves normally.
>
> Now strangely enough there is a cron job that runs at 3:15, the familiar:
>
> 15 3 * * * find / -name .nfs\* -mtime +7 -exec rm -f {} \; -o -fstype nfs -prune
>
> Yep, the cron job that ships with every Sun!
>
> We've checked the obvious:
>
> No other systems on the network are running cronjobs that could possibly
> affect the problem system.
>
> No other cronjobs or batch processes run near the time the problem
> occurs.
>
> Online DiskSuite appears to be in a normal state, before and after the
> problem occurs.
>
> /vmunix does not appear to be changed in any way.

Several people submitted suggestions. I believe the most correct solution was
from Ron Vasey, who wrote:

On Sep 22, 3:02pm, vasey@issi.com wrote:

> For a couple of weeks we had problems with a server crashing at 3:16
> every night, and suspected it was related to the nightly find/rm cron
> job, but couldn't figure out how. In desperation I ran a find . -ls
> manually and discovered a corrupted directory entry that appeared to
> "execute" itself -- and then the system -- with uncanny predictability.
>
> Although it wouldn't respond to normal methods (which also killed SunOS),
> an unlink(8) and fsck fixed everything. Sure wish I knew what planet
> THAT came from! Hope this helps!
>
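
A minimal sketch of the kind of manual walk Ron describes, assuming a
Bourne-style shell and a find that supports -ls and -xdev. The function name
walk_logged and the default log path are made up for illustration. It records
each top-level directory before descending into it, so that if the machine
wedges partway through, the last "entering" line in the log points at the
suspect subtree:

```sh
# walk_logged ROOT LOG  (hypothetical helper, not part of SunOS)
# Walk each top-level directory under ROOT separately, noting in LOG which
# one is about to be scanned. Directories whose names start with "." are
# skipped by the shell glob.
walk_logged() {
    wl_root=${1:-/}
    wl_log=${2:-/var/tmp/find-walk.log}
    for wl_dir in "${wl_root%/}"/*; do
        [ -d "$wl_dir" ] || continue
        echo "entering $wl_dir" >> "$wl_log"
        sync    # push the log line to disk before touching the subtree
        # -xdev keeps find on one file system; -ls stats every entry.
        find "$wl_dir" -xdev -ls > /dev/null 2>> "$wl_log" ||
            echo "find reported errors in $wl_dir" >> "$wl_log"
    done
}
```

Something like "walk_logged / /var/tmp/find-walk.log" could then be run by
hand from the console at a quiet time, rather than waiting for 3:15.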

We suspect that we have a similar but slightly different file system problem.
The lesson to be learned is that:

        If your system blows up between 3:15 and 3:16, suspect Sun's cron job!
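
One low-risk way to act on that suspicion is a dry run of the same find
expression. The sketch below is an assumption-laden illustration, not Sun's
script: nfs_dry_run is a made-up name, and the only change from the cron
entry is that "-exec rm -f {} \;" becomes -print, so nothing is deleted.
Because -o binds more loosely than the implied "and", the expression reads as
"(name matches .nfs* and older than 7 days: act on it) or (on an NFS file
system: do not descend)".

```sh
# nfs_dry_run [DIR]  (hypothetical helper)
# Same tests as the 3:15 cron entry, but prints the stale .nfs* files
# instead of removing them. DIR defaults to / as in the cron job.
nfs_dry_run() {
    find "${1:-/}" -name '.nfs*' -mtime +7 -print -o -fstype nfs -prune
}
```

Running it against one file system at a time (for example "nfs_dry_run /usr")
narrows down which one upsets the kernel.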

Our thanks to:
markus@octavia.anu.edu.au
vasey@issi.com
dennett@Kodak.COM
strombrg@hydra.acs.uci.edu
atkina@atkina.wan.gs.com

-- 
Steve Swaney
swanes@eq.gs.com
1 (212) 902-4526 (desk)
1 (917) 859-4229 (cellular)


