SUMMARY: Solaris 2.6 : some syscalls Hanging (more)

From: DAUBIGNE Sebastien - BOR ( SDaubigne_at_bordeaux-bersol.sema.slb.com ) <SDaubigne_at_bordeaux-bersol.sema.slb.com>
Date: Wed Jul 02 2003 - 05:36:33 EDT
This summary is quite late, but for those who care: our poll() syscall
hang in the kernel was due to an incompatibility between a JNI FC card
driver and the kernel patch [the crash dump shows a JNI driver kernel
thread locking some resources].
Although these components had not been modified recently, it seems the
recent high workload triggered the bug.
After upgrading both the JNI driver and the kernel patch to the latest
recommended versions, everything works fine.
---
Sebastien DAUBIGNE
sdaubigne@bordeaux-bersol.sema.slb.com
<mailto:sdaubigne@bordeaux-bersol.sema.slb.com>  - (+33)5.57.26.56.36
SchlumbergerSema - SGS/DWH/Pessac

	-----Original Message-----
	From:	DAUBIGNE Sebastien - BOR (SDaubigne@bordeaux-bersol.sema.slb.com)
	Date:	Friday, 16 May 2003 17:13
	To:	sunmanagers@sunmanagers.org; sunhelp@sunhelp.org
	Subject:	Solaris 2.6 : some syscalls Hanging (more)

	Additional information: this is not related to /dev/kstat
	(thank you anyway, Haywood Steven).
	It is related to the poll() syscall.
	The mpstat, vmstat and iostat commands all issue
	poll(0,0,<INTERVAL>) syscalls to sleep between samples (instead of
	the alarm() + sigsuspend() combination used by the sar and sleep
	commands).
	For instance on a sane host :
	# truss -aeflv all vmstat 5 3
	(....)
	12687/1:        poll(0x00000000, 0, 5000)       (sleeping...)
	(......... 5 seconds sleep .............)
	12687/1:        poll(0x00000000, 0, 5000)                       = 0
	(...)
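
The two sleeping styles mentioned above can be sketched in C as follows. This is only an illustration of the general pattern, not the actual source of vmstat, sar or sleep; the helper names sleep_with_poll and sleep_with_alarm are hypothetical.

```c
#define _XOPEN_SOURCE 600   /* only needed for strict -std=c99/c11 builds */
#include <poll.h>
#include <signal.h>
#include <unistd.h>

/* Set by the handler so the wait loop can tell SIGALRM from other signals. */
static volatile sig_atomic_t alarm_fired;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/* Style 1: poll() with no descriptors is a plain millisecond sleep.
 * Returns what poll() returns: 0 on timeout, -1 on error. */
int sleep_with_poll(int timeout_ms)
{
    return poll(NULL, 0, timeout_ms);
}

/* Style 2: alarm() + sigsuspend().
 * Returns 0 once the alarm has fired, -1 if signal setup failed. */
int sleep_with_alarm(unsigned int seconds)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;

    if (seconds == 0)
        return 0;               /* alarm(0) would never fire */

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    /* Block SIGALRM so it cannot arrive before sigsuspend() is entered. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) < 0) {
        sigaction(SIGALRM, &old_sa, NULL);
        return -1;
    }

    alarm_fired = 0;
    alarm(seconds);

    /* sigsuspend() atomically unblocks SIGALRM and waits for a signal;
     * loop in case an unrelated signal wakes us up first. */
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGALRM);
    while (!alarm_fired)
        sigsuspend(&wait_mask);

    /* Restore the caller's signal mask and SIGALRM disposition. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    return 0;
}
```

The two helpers wait in different ways (a poll() timeout versus signal delivery), which fits the observation that sar kept working while vmstat did not.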

	This didn't work when our host went bad. The
	poll(0x00000000, 0, 5000) call waited indefinitely (instead of
	5 seconds).
	I've reproduced the bug with the following small C program:
	#include <poll.h>
	#include <stdio.h>

	int main(void)
	{
	  /* No descriptors: poll() acts as a plain 5000 ms sleep. */
	  if (poll(NULL, 0, 5000) < 0)
	    perror("poll");
	  return 0;
	}

	On a sane host, it waits 5 seconds. On our bad host, it waits
indefinitely.
	Other applications hang on semop() calls.
	Looks strange.
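
A variant of the reproducer that reports a hang instead of blocking forever might look like the sketch below. It arms an alarm() watchdog around the same poll() call and measures the elapsed time. The function name timed_poll and its parameters are hypothetical, and it is an assumption that SIGALRM would actually have interrupted the stuck poll() on the affected host; that was not tested in this thread.

```c
#define _XOPEN_SOURCE 600   /* only needed for strict -std=c99/c11 builds */
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

/* Empty handler: its only job is to make SIGALRM interrupt poll(). */
static void on_watchdog(int signo)
{
    (void)signo;
}

/*
 * Run poll(NULL, 0, timeout_ms) under an alarm() watchdog of watchdog_s
 * seconds and store the wall-clock duration in *elapsed_ms.
 * Returns  0 if poll() timed out by itself,
 *          1 if the watchdog interrupted it (poll() failed with EINTR),
 *         -1 on any other error.
 */
int timed_poll(int timeout_ms, unsigned int watchdog_s, long *elapsed_ms)
{
    struct sigaction sa, old_sa;
    struct timeval start, end;
    int rc, saved_errno;

    sa.sa_handler = on_watchdog;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    gettimeofday(&start, NULL);
    alarm(watchdog_s);                  /* arm the watchdog */
    rc = poll(NULL, 0, timeout_ms);     /* same call as the reproducer */
    saved_errno = errno;
    alarm(0);                           /* disarm it */
    gettimeofday(&end, NULL);
    sigaction(SIGALRM, &old_sa, NULL);  /* restore previous disposition */

    *elapsed_ms = (end.tv_sec - start.tv_sec) * 1000L
                + (end.tv_usec - start.tv_usec) / 1000L;

    if (rc == 0)
        return 0;
    if (rc < 0 && saved_errno == EINTR)
        return 1;
    return -1;
}
```

Called as timed_poll(5000, 10, &ms), a healthy host should return 0 with ms close to 5000; a return value of 1 would mean poll() was still waiting when the 10-second watchdog fired.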

	Any suggestions?

		-----Original Message-----
		From:	DAUBIGNE Sebastien - BOR (SDaubigne@bordeaux-bersol.sema.slb.com)
		Date:	Friday, 16 May 2003 14:36
		To:	sunmanagers@sunmanagers.org; sunhelp@sunhelp.org
		Subject:	Solaris 2.6 : some syscalls Hanging

		Solaris 2.6, kernel 105181-28.

		We've got some processes "hanging" (i.e. blocked in
		syscalls):

		vmstat, iostat and mpstat are hanging on the poll() syscall:
		they open "/dev/kstat", issue some ioctl() calls on it, and
		wait for data with poll(). But poll() never returns.
		sar is working fine (note that it doesn't use poll()).
		We also have some Oracle background and shadow processes
		hanging on semop().

		Other processes are working fine (those that don't call
		poll() or semop()).

		It seems some kernel syscalls (at least poll() and semop())
		are waiting indefinitely: it looks like a deadlock.
		System activity is low (sar shows 70% CPU free, lots of
		memory free, no page scan). The only thing I can't check is
		mutex contention, as mpstat is hanging.

		Any ideas?

		_______________________________________________
		sunmanagers mailing list
		sunmanagers@sunmanagers.org
		http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Wed Jul 2 05:36:27 2003
