Summary: IO-Wait of 99% - how to diagnose

From: <extern.Tobias.Kronwitter_at_AUDI.DE>
Date: Thu Dec 23 2004 - 11:08:39 EST
Hello all,

thank you for the overwhelming help:

Kanellopoulos Angelos
Jeremy Ahl
PEter (ITServ GmbH)
Imrick Michael
Beth Dodge
Alvin Gunkel
Clive McAdam
Terry Franklin
Murugesan K
Mossey Fahey
Victor Engle
Rebstock, Roland

The broad consensus is that we most likely have a "display problem".
The I/O is not really high, nor is the server slow, as it would be under a
genuinely high load.

After a reboot, however, the iowait went back to normal values. Up to
now (I held this summary back to be sure) we have had no high iowait any more.
If we experience the problem again, we will install the bug fix:

	--------------------
	Tobias,
	
	Sun introduced a bug in kernel patch 108528-28. I thought the bug was
	fixed in -29 but from your stats it appears not to have been. You may
	try installing 117000-05 which seems to be the latest kernel patch.
	
	Here is a link to the bug description on sunsolve and a link to the new
	kernel patch,

	http://sunsolve.sun.com/search/document.do?assetkey=urn:cds:docid:1-1-4978228-1
	
	http://sunsolve.sun.com/pub-cgi/pdownload.pl?target=117000-05&method=h


	Vic
	------------------

If so, I will post a second summary.
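
For anyone who wants to check many hosts for the kernel revisions Vic mentions, here is a minimal Python sketch. It only parses the patch ID out of a `uname -a` line. The function name `kernel_patch` and the `SUSPECT` set are my own illustration; the set simply lists the two revisions named in this thread and is not an official list of affected patches.

```python
import re

def kernel_patch(uname_line):
    """Return (patch_id, revision) from a SunOS `uname -a` line, or None."""
    m = re.search(r"Generic_(\d{6})-(\d{2})", uname_line)
    if m is None:
        return None
    return m.group(1), int(m.group(2))

# uname string of the affected host, as posted in the original question.
UNAME = ("SunOS iuaw740 5.8 Generic_108528-29 sun4u sparc "
         "SUNW,Sun-Fire-V440")

# Assumption: only the revisions discussed in this thread are listed here.
SUSPECT = {("108528", 28), ("108528", 29)}

patch = kernel_patch(UNAME)
print(patch, "suspect" if patch in SUSPECT else "not listed")
```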

Thank you
Season's Greetings to all of you


Regards Tobias




Hello all,

on a Solaris8 / SUN-Fire V440 (SunOS iuaw740 5.8 Generic_108528-29 sun4u
sparc SUNW,Sun-Fire-V440) we are experiencing a very high IO-Wait problem.
This Server is configured with Veritas vxvm 4.0 / mp1 and has SAN-Disks
connected via an Emulex 9002 FCA.

top reports the following:

load averages:  0.02,  0.01,  0.02                                    11:02:12
82 processes:  81 sleeping, 1 on cpu
CPU states:  0.0% idle,  0.0% user,  0.5% kernel, 99.5% iowait,  0.0% swap
Memory: 8192M real, 6375M free, 469M swap in use, 22G swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
  5818 root       1  58    0 2344K 1440K cpu/0    0:00  0.05% top
   813 root       6  39    0 5240K 4440K sleep    4:56  0.04% picld
 29405 root       6  58    0 4728K 2816K sleep    0:05  0.01% elxdiscoveryd
 28964 root       1  48    0 2544K 2008K sleep    0:00  0.01% bash
  5813 root       1  38    0 6248K 2728K sleep    0:00  0.01% sshd
  1444 root      12  58    0 5368K 5080K sleep    0:31  0.00% mibiisa
 17990 dctm_run   3  58    0   40M   12M sleep    0:07  0.00% documentum
  5816 dctm_run   1  38    0 1392K 1144K sleep    0:00  0.00% sar
  5817 dctm_run   1  48    0 1456K 1128K sleep    0:00  0.00% sadc
  1471 root       1  58    0    0K    0K sleep    0:59  0.00% se.sparcv9.5.8
   980 root       5  58    0 4200K 2440K sleep    0:17  0.00% automountd
 10808 dctm_run   5  58    0   39M   22M sleep    0:09  0.00% documentum
    17 root       1  58    0   12M   10M sleep    0:07  0.00% vxconfigd
  3111 dctm_run   4  58    0 5672K 3728K sleep    0:07  0.00% dmdocbroker
 28215 dctm_run   1   2    0 1896K 1440K sleep    0:06  0.00% ksh
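
A rough cross-check of the top header can be scripted. The Python sketch below parses the two header lines shown above and flags the case where iowait is reported as very high while almost no CPU work and almost no load is visible. The thresholds (`wait_min`, `busy_max`, `load_max`) are arbitrary assumptions, and the check is only a hint, not proof of a reporting bug, since threads blocked on I/O would not necessarily show up in the load average.

```python
import re

TOP_LOAD = "load averages:  0.02,  0.01,  0.02"
TOP_CPU = ("CPU states:  0.0% idle,  0.0% user,  0.5% kernel, "
           "99.5% iowait,  0.0% swap")

def parse_cpu_states(line):
    """Parse a top 'CPU states:' line into {state_name: percent}."""
    return {name: float(pct)
            for pct, name in re.findall(r"([\d.]+)%\s+(\w+)", line)}

def parse_load_averages(line):
    """Return the load averages found on a top 'load averages:' line."""
    return [float(x) for x in re.findall(r"\d+\.\d+", line)]

def iowait_without_work(states, loads,
                        wait_min=90.0, busy_max=5.0, load_max=1.0):
    """True if iowait is high but user+kernel time and load are both low."""
    busy = states.get("user", 0.0) + states.get("kernel", 0.0)
    return (states.get("iowait", 0.0) >= wait_min
            and busy <= busy_max
            and max(loads) <= load_max)

states = parse_cpu_states(TOP_CPU)
loads = parse_load_averages(TOP_LOAD)
print(states, loads, iowait_without_work(states, loads))
```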


iostat doesn't indicate high disk I/O:

bash-2.03# iostat 5 15
   tty        sd0           sd1           sd2           sd3            cpu
 tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy wt id
   8   35   0   0    0  100   4    9  100   4   10    4   1    9    1  1 25 73
 125  451   0   0    0   14   4    7   14   4    6    0   0    0    0  1 99  0
   0   16   0   0    0    0   0    5    0   0    6    0   0    0    0  1 99  0
   0   16   0   0    0    9  18    3    9  18    3    0   1    4    0  1 99  0
   0   16   0   0    0   67  26   23   61  26   26    2   1    6    0  0 100  0
   0   16   0   0    0    0   0    0    0   0    0    0   0    0    0  0 100  0
   0   16   0   0    0    0   0    0    0   0    0    0   0    0    0  0 100  0
   0   16   0   0    0    0   0    0    0   0    0    0   0    0    0  0 100  0
   0   16   0   0    0    0   0    0    0   0    0    0   0    0    0  1 99  0
   0   16   0   0    0    4   8    3    4   8    3    0   0    4    0  1 98  0
  10   37   0   0    0    2   1    5    2   1    4    0   1    5    0  0 100  0
   0   16   0   0    0    4   3   23   21   5   19    0   0    0    0  0 100  0
  35  137   0   0    0    5   2    7    5   2    5    0   0    0    0  0 100  0
 126  421   0   0    0   14   4    5   14   4    6    0   0    0    0  1 99  0
   0   16   0   0    0    0   0    0    0   0    0    0   0    0    0  1 99  0
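
The mismatch between the wt column and the disk columns can be picked out mechanically. This is a minimal Python sketch that splits one data row of the four-disk `iostat` layout shown above and flags rows where wt is high while the summed tps is low. The thresholds `wt_min` and `tps_max` are assumptions chosen for illustration, and the two sample rows are copied from the output above with the wrapped id value rejoined.

```python
def parse_iostat_row(line, ndisks=4):
    """Split one `iostat` data row into summed disk tps and cpu fields."""
    fields = [int(x) for x in line.split()]
    expected = 2 + 3 * ndisks + 4      # tty (2) + kps/tps/serv per disk + cpu (4)
    if len(fields) != expected:
        raise ValueError("expected %d fields, got %d" % (expected, len(fields)))
    disks = [tuple(fields[2 + 3 * i:5 + 3 * i]) for i in range(ndisks)]
    us, sy, wt, idle = fields[-4:]
    return {"tps": sum(d[1] for d in disks), "us": us, "sy": sy,
            "wt": wt, "id": idle}

def high_wait_low_io(row, wt_min=90, tps_max=100):
    """True if the row reports high iowait but little disk activity."""
    return row["wt"] >= wt_min and row["tps"] <= tps_max

ROW_SINCE_BOOT = "8 35 0 0 0 100 4 9 100 4 10 4 1 9 1 1 25 73"
ROW_SAMPLE = "0 16 0 0 0 9 18 3 9 18 3 0 1 4 0 1 99 0"

for text in (ROW_SINCE_BOOT, ROW_SAMPLE):
    row = parse_iostat_row(text)
    print(row, high_wait_low_io(row))
```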


The SAN disks are not under load either:

iostat -xnp
                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    1.5    2.7   50.7   50.5  0.0  0.0    0.0    8.8   0   2 c1t0d0
    0.2    0.0    0.1    0.0  0.0  0.0    0.0    0.1   0   0 c1t0d0s0
    0.0    0.1    0.0    0.4  0.0  0.0    0.0    6.2   0   0 c1t0d0s1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.2   0   0 c1t0d0s2
    1.2    2.7   50.5   50.0  0.0  0.0    0.0    9.7   0   2 c1t0d0s3
    0.2    0.0    0.2    0.0  0.0  0.0    0.0    0.9   0   0 c1t0d0s4
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t0d0s5
    1.5    2.7   51.4   50.0  0.0  0.0    0.0   10.1   0   2 c1t1d0
    0.2    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t1d0s0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t1d0s1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.2   0   0 c1t1d0s2
    1.3    2.7   51.3   50.0  0.0  0.0    0.0   10.6   0   2 c1t1d0s3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    3.2   0   0 c1t1d0s4
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t1d0s5
    0.6    0.5    2.5    1.9  0.0  0.0    0.0    9.1   0   0 c1t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.2   0   0 c1t2d0s2
    0.4    0.5    2.3    1.9  0.0  0.0    0.0   12.1   0   0 c1t2d0s3
    0.1    0.0    0.1    0.0  0.0  0.0    0.0    0.2   0   0 c1t2d0s4
    0.4    0.5    1.5    1.9  0.0  0.0    0.0    9.9   0   0 c1t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t3d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    2.2   0   0 c1t3d0s3
    0.2    0.5    1.4    1.9  0.0  0.0    0.0   12.8   0   0 c1t3d0s4
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.9   0   0 c3t30d0
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.9   0   0 c3t30d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d0s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.7   0   0 c3t30d1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.7   0   0 c3t30d1s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d1s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.6   0   0 c3t30d2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.6   0   0 c3t30d2s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d2s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.5   0   0 c3t30d3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.5   0   0 c3t30d3s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d3s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t30d4
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t30d4s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d4s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t30d5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t30d5s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d5s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.3   0   0 c3t30d6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.3   0   0 c3t30d6s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d6s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.3   0   0 c3t30d7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.3   0   0 c3t30d7s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d7s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t30d8
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t30d8s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t30d8s7
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.7   0   0 c3t70d0
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.7   0   0 c3t70d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d0s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.6   0   0 c3t70d1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.6   0   0 c3t70d1s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d1s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.5   0   0 c3t70d2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.5   0   0 c3t70d2s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d2s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.7   0   0 c3t70d3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.7   0   0 c3t70d3s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d3s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.5   0   0 c3t70d4
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.5   0   0 c3t70d4s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d4s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t70d5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t70d5s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d5s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t70d6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t70d6s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d6s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t70d7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t70d7s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d7s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t70d8
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.4   0   0 c3t70d8s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t70d8s7
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.7   0   0 c4t31d0
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.7   0   0 c4t31d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d0s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d1s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d1s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d2s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d2s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d3s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d3s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d4
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d4s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d4s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d5s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d5s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d6s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d6s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d7s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d7s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d8
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d8s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t31d8s7
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.7   0   0 c4t71d0
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.7   0   0 c4t71d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d0s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d1s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d1s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d2s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d2s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d3s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d3s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d4
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d4s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d4s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d5s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d5s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d6
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d6s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d6s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d7s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d7s7
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d8
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d8s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t71d8s7


	-->	c3t30d0, c3t70d0	are the same LUN seen via two HBAs
			=> this is one plex (mirror) of a volume
		c4t31d0, c4t71d0	are the same LUN seen via two HBAs
			=> this is the other plex of the same volume

It looks like the disks aren't the problem.
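
With this many LUN paths it is easier to filter the `iostat -xnp` listing than to read it. The Python sketch below lists devices whose %b is at or above a threshold, together with their asvc_t. The function name `busy_devices` and the default threshold of 10% are assumptions for illustration; the sample rows are copied from the listing above.

```python
SAMPLE = """
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    1.5    2.7   50.7   50.5  0.0  0.0    0.0    8.8   0   2 c1t0d0
    1.5    2.7   51.4   50.0  0.0  0.0    0.0   10.1   0   2 c1t1d0
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.9   0   0 c3t30d0
    0.0    0.0    2.0    0.0  0.0  0.0    0.0    0.7   0   0 c4t71d0
"""

def busy_devices(text, busy_pct=10):
    """Return (device, %b, asvc_t) for rows of `iostat -xn` with %b >= busy_pct."""
    busy = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 11:
            continue                   # blank line or banner
        try:
            values = [float(p) for p in parts[:10]]
        except ValueError:
            continue                   # column header line
        if values[9] >= busy_pct:      # column 10 is %b
            busy.append((parts[10], values[9], values[7]))
    return busy

print(busy_devices(SAMPLE))               # default threshold
print(busy_devices(SAMPLE, busy_pct=2))   # much lower threshold
```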

The network looks OK as well:

RAWIP
        rawipInDatagrams    =     0     rawipInErrors       =     0
        rawipInCksumErrs    =     0     rawipOutDatagrams   =     0
        rawipOutErrors      =     0

UDP
        udpInDatagrams      = 33591     udpInErrors         =     0
        udpOutDatagrams     = 33596     udpOutErrors        =     0

TCP     tcpRtoAlgorithm     =     4     tcpRtoMin           =   400
        tcpRtoMax           = 60000     tcpMaxConn          =    -1
        tcpActiveOpens      = 13806     tcpPassiveOpens     = 14845
        tcpAttemptFails     =     9     tcpEstabResets      =   749
        tcpCurrEstab        =    15     tcpOutSegs          =13312404
        tcpOutDataSegs      =11237898   tcpOutDataBytes     =63660618
        tcpRetransSegs      =   422     tcpRetransBytes     =336018
        tcpOutAck           =2067205    tcpOutAckDelayed    =1853406
        tcpOutUrg           =     0     tcpOutWinUpdate     =    15
        tcpOutWinProbe      =    13     tcpOutControl       = 58063
        tcpOutRsts          =  1520     tcpOutFastRetrans   =    85
        tcpInSegs           =11835531
        tcpInAckSegs        =10236185   tcpInAckBytes       =63671992
        tcpInDupAck         = 39885     tcpInAckUnsent      =     0
        tcpInInorderSegs    =9700449    tcpInInorderBytes   =1566751826
        tcpInUnorderSegs    =     1     tcpInUnorderBytes   =   551
        tcpInDupSegs        =    64     tcpInDupBytes       =  4171
        tcpInPartDupSegs    =     0     tcpInPartDupBytes   =     0
        tcpInPastWinSegs    =     0     tcpInPastWinBytes   =     0
        tcpInWinProbe       =     0     tcpInWinUpdate      =     3
        tcpInClosed         =   184     tcpRttNoUpdate      =   347
        tcpRttUpdate        =10222115   tcpTimRetrans       =  1649
        tcpTimRetransDrop   =     5     tcpTimKeepalive     =   181
        tcpTimKeepaliveProbe=    16     tcpTimKeepaliveDrop =     1
        tcpListenDrop       =     0     tcpListenDropQ0     =     0
        tcpHalfOpenDrop     =     0     tcpOutSackRetrans   =     0

IPv4    ipForwarding        =     2     ipDefaultTTL        =   255
        ipInReceives        =11593385   ipInHdrErrors       =     0
        ipInAddrErrors      =     0     ipInCksumErrs       =     0
        ipForwDatagrams     =     0     ipForwProhibits     =     0
        ipInUnknownProtos   =     0     ipInDiscards        =     0
        ipInDelivers        =11849241   ipOutRequests       =13132669
        ipOutDiscards       =     0     ipOutNoRoutes       =     3
        ipReasmTimeout      =    60     ipReasmReqds        =     0
        ipReasmOKs          =     0     ipReasmFails        =     0
        ipReasmDuplicates   =     0     ipReasmPartDups     =     0
        ipFragOKs           =     0     ipFragFails         =     0
        ipFragCreates       =     0     ipRoutingDiscards   =     0
        tcpInErrs           =     0     udpNoPorts          =  4188
        udpInCksumErrs      =     0     udpInOverflows      =     0
        rawipInOverflows    =     0     ipsecInSucceeded    =     0
        ipsecInFailed       =     0     ipInIPv6            =     0
        ipOutIPv6           =     0     ipOutSwitchIPv6     =   169
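
To put a number on "looks OK", the counters above can be reduced to a retransmission ratio and a list of non-zero error counters. This is a minimal Python sketch; the helper names and the choice of which counters count as errors are my own assumptions, and the sample lines are copied from the statistics above.

```python
import re

SAMPLE = """
        tcpCurrEstab        =    15     tcpOutSegs          =13312404
        tcpOutDataSegs      =11237898   tcpOutDataBytes     =63660618
        tcpRetransSegs      =   422     tcpRetransBytes     =336018
        ipInReceives        =11593385   ipInHdrErrors       =     0
        ipInAddrErrors      =     0     ipInCksumErrs       =     0
        tcpInErrs           =     0     udpNoPorts          =  4188
        udpInCksumErrs      =     0     udpInOverflows      =     0
"""

# Assumption: these counters are treated as "errors" for this check.
ERROR_COUNTERS = ["ipInHdrErrors", "ipInAddrErrors", "ipInCksumErrs",
                  "tcpInErrs", "udpInCksumErrs", "udpInOverflows"]

def parse_counters(text):
    """Parse 'name = value' pairs from netstat -s style output."""
    return {name: int(value)
            for name, value in re.findall(r"(\w+)\s*=\s*(-?\d+)", text)}

def retrans_ratio(stats):
    """Retransmitted TCP segments as a fraction of data segments sent."""
    return stats["tcpRetransSegs"] / stats["tcpOutDataSegs"]

def nonzero_errors(stats):
    """Names of the selected error counters that are present and not zero."""
    return [n for n in ERROR_COUNTERS if stats.get(n, 0) != 0]

stats = parse_counters(SAMPLE)
print("retransmit ratio: %.6f" % retrans_ratio(stats))
print("non-zero error counters:", nonzero_errors(stats))
```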


What else could be the reason?
==============================

How could we diagnose this problem?
===================================


Thank you for your help.
Tobias
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Thu Dec 23 11:14:05 2004
