[not a summary] is it ufs problem?

From: Grzegorz Bakalarski <G.Bakalarski_at_icm.edu.pl>
Date: Fri Sep 15 2006 - 07:24:52 EDT

Dear All,

This is not a summary but rather a status report...

I got two answers: one saying I should expect more
from the vendor's support (you are on contract, aren't you?), and
the other (thanks, Scott Lawson) advising me to load the
latest kernel and UFS patches, plus the SAN Foundation Kit
patches.
I followed Scott's advice, but I have to wait before I can
verify it. For now my temporary workaround (mounting with
-o nologging,forcedirectio,noatime) works reasonably well:
I have loaded 6 weekly data sets without errors (slowly, but moving
forward). I still have 10 weeklies to load, so I don't want to
experiment at present. I will check whether the patching is the
solution, possibly within a week or two ...
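
For anyone unfamiliar with the check that fails in the message quoted
below, here is a minimal sketch of an expected-versus-reported offset
comparison. It is only an illustration in Python, where tell() stands in
for the C ftell call; the vendor's real code is not shown anywhere in this
thread, and the names append_records and run.dat are made up for the
example.

```python
import os
import tempfile


def append_records(path, records):
    """Append records and compare a calculated offset with the reported one.

    Returns the first (expected, actual) mismatch, or None if all writes agree.
    Hypothetical helper: this is not the vendor's implementation.
    """
    with open(path, "ab") as run_file:
        # Start counting from the current end of the existing file.
        run_file.seek(0, os.SEEK_END)
        expected = run_file.tell()
        for record in records:
            run_file.write(record)
            expected += len(record)   # offset calculated by the application
            actual = run_file.tell()  # offset reported by the file object
            if actual != expected:
                return (expected, actual)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "run.dat")  # hypothetical file name
        mismatch = append_records(path, [b"week21\n", b"week22\n", b"week23\n"])
        print("mismatch:", mismatch)
        print("size on disk:", os.path.getsize(path))
```

This only shows the shape of such a check; it does not reproduce the
failure, and it says nothing about where below the application a
mismatch would come from.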

Have a nice weekend!

GB

On Tue, Sep 12, 2006 at 03:28:36PM +0200, Grzegorz Bakalarski wrote:
> I have a problem that is driving me to despair ...
> I have an application (from an external vendor): a
> database UI plus admin utilities.
> Each week I run an update using the admin utilities
> (perl scripts) to add data to the existing indexes.
> The application ran excellently for weeks ...
> But it suddenly started to fail during the update process.
> The error message says that the "run" (i.e. update) file
> is not synchronized (i.e. the expected (calculated)
> offset of the pointer in the "run" file differs
> from the real offset of the pointer in the run file returned
> by the ftell C function).
> The very, very, very strange thing is that the error
> message appears at random. Starting from the same
> backup copy (ufsdump/ufsrestore), the update can proceed
> without error or can crash ...
> E.g. one time I can load the data for weeks 21, 22 and 23 (one week
> at a time), and another time I can load weeks 21 and 22 but fail
> to load the data for week 23 ...
> I contacted the vendor's support team and they claim the problem
> may be a hardware disk error or UFS filesystem corruption ...
> While I can agree with them, the point is that I tried to run the updates
> on different devices (SUN StorEdge 3310 SCSI array, NetApp FAS 3050 FC or
> SATA arrays - two different NetApp FC connected arrays), starting
> from a new UFS filesystem ... And starting from the same backup
> copy led to different results: the same input data & the same
> application = different results ...
> 
> Now I have started thinking about a serious UFS bug ...
> The database update process is a very disk-heavy task.
> Maybe not as extreme as in big Oracle installations; however,
> I can see transfer rates of about 120-160 MBytes/s, which is a lot
> for my V440 server ... So with the default mount options there may be
> a lot of cached/dirty data in memory ...
> 
> What is also strange is that no changes have been made to the
> server since May 4th ...
> The only application installed during that time was QLogic SANSurfer
> (we have a QLogic 2342 FC HBA in this server).
> During May & June all was fine.
> The first error appeared in the last week of June (and was
> manually corrected by support), and the second error appeared
> in the last week of July; after that I could not advance the updates
> any further (although I could still load earlier data on backup copies).
> 
> Currently I'm testing the following mount options:
> -o forcedirectio,nologging,noatime and I have successfully
> loaded 3 weekly data sets.
> The vendor uses VxFS on its production servers (I do not have a license).
> Has any of you heard of a similar problem or seen a bug report or
> workaround?
> 
> My system is:
> 
> SUN Fire V440,
> SUNOS 5.9 Generic_118558-26 sun4u sparc SUNW,Sun-Fire-V440
> 8GB memory
> 4x 1.062GHz UltraSPARC IIIi processors
> I've done a full hardware test during start-up.
> I've run extensive memory and processor tests from SunVTS ...
> Not a single sign of problems.
> 
> Please help,
> 
> GB
> _______________________________________________
> sunmanagers mailing list
> sunmanagers@sunmanagers.org
> http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Fri Sep 15 07:25:32 2006