SUMMARY: SS10 - 41 with one cpu

From: nagel@post.inf-wiss.ivp.uni-konstanz.de
Date: Thu Apr 15 1993 - 08:23:49 CDT


Hi Sun-managers,

my original question on Apr 6 was:
...
Last week we received a processor upgrade card, SS10-20 --> SS10-41 (one CPU, 40 MHz). The swap was an easy job, the machine booted properly, and the speed was impressive. After the upgrade, however, we ran into problems with our development software, Objectworks Smalltalk-80 Rel. 4.0. After some time (it can be a whole day) the software crashes, with only a SEGMENTATION FAULT and no other system message. After the first crash, the next crash happens after a shorter period of time, and so on. It happens only with the Smalltalk software.
Then we reboot the machine and the whole thing starts again.

Could it be a problem with the new CPU, or with the software? To me it looks like a programming problem inside the Smalltalk software.
Does anyone have any ideas about this?

I have reported the facts to the local Sun hotline and the local Smalltalk distributor, but they do not know of any problems with the new CPU card.
...
------------------------------------------------------------------------------

Thanks for all the quick replies. The winners are Liesl Andrico from ParcPlace and Edward Eldridge. Liesl, a system manager at ParcPlace, forwarded my message to their technical team and passed on a definitive answer.
ParcPlace's code writes machine instructions into particular addresses through the data cache and relies on them appearing in the instruction cache without any further action. Sun has changed the way the icache is used on the new machines, so that assumption no longer holds. Crash!!
You can read the whole thing below.
All in all this is not so nice for us as software developers, because we have to pay again for someone else's mistakes. At the moment we have no maintenance contract.
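
The mechanism can be illustrated with a toy model. The Python sketch below is not ParcPlace's code and does not model real SPARC hardware; the class SplitCacheCPU, its methods, and run_demo are hypothetical names invented for illustration. It only assumes a CPU whose instruction cache is not updated by stores that go through the data cache, which is the behaviour the ParcPlace reply describes for the new machines.

```python
class SplitCacheCPU:
    """Toy CPU with separate instruction and data caches that are
    NOT kept coherent by hardware (an assumption for illustration)."""

    def __init__(self, memory):
        self.memory = dict(memory)  # backing store: address -> instruction
        self.icache = {}            # filled by instruction fetches
        self.dcache = {}            # filled by stores (write-back)

    def store(self, addr, instr):
        # A store goes through the data cache only; the icache is not told.
        self.dcache[addr] = instr

    def fetch(self, addr):
        # An instruction fetch looks in the icache first, then in memory.
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def flush(self, addr):
        # Explicit synchronization: write the dirty data line back to
        # memory and invalidate the matching icache line.
        if addr in self.dcache:
            self.memory[addr] = self.dcache.pop(addr)
        self.icache.pop(addr, None)


def run_demo(do_flush):
    cpu = SplitCacheCPU({0x100: "old"})
    cpu.fetch(0x100)          # the old instruction is now in the icache
    cpu.store(0x100, "new")   # a code generator overwrites it via the dcache
    if do_flush:
        cpu.flush(0x100)      # software does the synchronization itself
    return cpu.fetch(0x100)   # what the CPU would actually execute


if __name__ == "__main__":
    print(run_demo(do_flush=False))  # prints "old": stale instruction
    print(run_demo(do_flush=True))   # prints "new"
```

Without the explicit flush, the toy CPU keeps executing the stale instruction. In a real system that generates machine code at run time, this would plausibly show up as crashes at seemingly random times. The SPARC architecture defines a FLUSH instruction for this kind of synchronization; whether the fixed ParcPlace releases use it is not stated in the replies.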

Two of you also pointed me to some patches (100726-04, 100744). I have installed the first one, but it changed nothing! I will try the other one later.

Thanks again to
Liesl Andrico andrico@parcplace.com and krasner@parcplace.com
Edward Eldridge eddy@is.morgan.com
Mark Plotnick mp@allegra.att.com
Christian Lawrence cal@soac.bellcore.com
Alain Brossard brossard@siisun.epfl.ch

Here are the original replies:
------------------------------------------------------------------------------
>From portland!portland!andrico@parcplace.parcplace.com Fri Apr 9 05:14:59 1993
Date: Thu, 8 Apr 93 15:18:03 PDT
From: andrico@parcplace.com (Liesl Andrico)
To: nagel@post.inf-wiss.ivp.uni-konstanz.de
Subject: smalltalk crashes
Cc: andrico@parcplace.com
Content-Length: 2728

I have the answer to your problem. I am a system manager for the
company who developed the smalltalk-object works product you are
having trouble with. I sent your message to the technical team.
The problem is that Sun changed the way they use the icache. It
really isn't your fault. But, unless you have a service contract,
we can't help you fix it. I suppose if you put it on a different
machine, it might work again.

This is the message I received. I am not good at editing messages,
so you should know this was not meant to be nasty or awful. He
was just passing on the facts. I wish you the best of luck getting
your program working again. Solutions are to switch to the newer
Solaris 2.1 version, move the program to a machine that doesn't have
the icache problem, or use a support contract to have the problem
fixed.

Here is the message:
>From krasner@parcplace.com Wed Apr 7 21:12:05 1993

We get a slight performance boost if we can rely on processors keeping their
instruction and data caches straight. That is, when we write a machine
instruction into a particular address (through the data cache), we can
run faster if we don't have to do anything to have it appear when we
execute at that address (via the instruction cache), rather than having
the old contents of that address remain in the icache.

When we built 4.0, we didn't know that two years later, Sun would put out
machines that did not allow us to make this assumption, so we did. So
4.0 systems will crash at seemingly random times on the Sparc10, Model 41,
and other new machines.

Our Solaris2 (4.1a/1.0a) release does not make this assumption, and does not
appear to have crashes (at least not this kind).

Had this customer (purchased support and) contacted our support people, they
would have had it answered quickly.

Then someone in support messed up, or they sent their broadcast message before
getting an answer.

You may want to sanitize my message before sending it on. The bottom line is
that the customer cannot (without buying our sources) fix this problem, and we
have no intention of putting out a 4.0 system with it fixed. We even hope to
avoid putting out a 4.1 (SunOS4.x.x) system with this fixed. We strongly prefer
them to move to 4.1a on Solaris2.

This is not what most customers want to hear, so the response needs to be
carefully worded.

I wish I could be more help. Please contact me if anything is unclear
or if you have questions.

********************************************************************
Liesl Andrico System Administrator
921 SW Washington, Ste 312 E-mail andrico@parcplace.com
Portland, OR 97205 phone (503) 220-1423
********************************************************************

-------------------------------------------------------------------------------

>From eddy@is.morgan.com Wed Apr 7 11:57:02 1993
From: eddy@is.morgan.com (Edward Eldridge )
Date: Wed, 7 Apr 1993 05:52:29 -0400
In-Reply-To: nagel@post.inf-wiss.ivp.uni-konstanz.de
        "SS10 - 41 with one cpu" (Apr 6, 7:25am)
X-Organisation: Morgan Stanley International
X-Address: 4th Floor, 25 Cabot Square, Canary Wharf, London E14 4QA
Reply-To: Eddy Eldridge <eddy@is.morgan.com>
X-Phone: +44 71 425 8424, +44 372 450727 (home)
X-Mailer: Z-Mail (2.1.3 10feb93)
To: nagel@post.inf-wiss.ivp.uni-konstanz.de
Subject: Re: SS10 - 41 with one cpu
Content-Length: 1924

yep. model 41 and 4.0 of Objectworks don't like each other. ParcPlace have fixed
the problem in 2.1 of SunOS and might fix it in 4.1.3, but in both cases only
for 4.1 of Objectworks.

Sun are bringing out a model 40 which should work with no changes from
ParcPlace, but that is for ParcPlace to say. In essence the two stage cache on
the 41 seems to have upset Objectworks.

Cheers

Eddy

-------------------------------------------------------------------------------

>From cal@soac.bellcore.com Wed Apr 7 14:29:23 1993
Date: Wed, 7 Apr 1993 08:30:11 -0400
From: Christian Lawrence <cal@soac.bellcore.com>
To: nagel@post.inf-wiss.ivp.uni-konstanz.de
Subject: Re: SS10 - 41 with one cpu
Content-Length: 153

there is a jumbo patch for SS10/41 - 100744 - but don't know if it's related
or not. in any event would be a good idea to load it if you haven't already

-----------------------------------------------------------------------------

>From mp@allegra.att.com Wed Apr 7 17:42:07 1993
Date: Wed, 7 Apr 93 11:40:25 EDT
From: mp@allegra.att.com (Mark Plotnick)
To: nagel@post.inf-wiss.ivp.uni-konstanz.de
Subject: Re: SS10 - 41 with one cpu
Content-Length: 117

have you installed patch 100726-03? It fixes bugs on SS-10's
that cause programs to occasionally get memory faults.

-------------------------------------------------------------------------------

>From brossard@siisun.epfl.ch Thu Apr 8 11:34:55 1993
Date: Thu, 8 Apr 93 11:03:17 +0200
From: brossard@siisun.epfl.ch (Alain Brossard EPFL-SIC/SII)
To: nagel@post.inf-wiss.ivp.uni-konstanz.de
Subject: Re: SS10 - 41 with one cpu
X-Sun-Charset: US-ASCII
Content-Length: 227

   We do have one lab here which reported a problem with Liken
on SS10/41. This problem seems to occur randomly and
never occurred while the machines were SS10/30s. However, I have
no hard facts on this.

                                                Alain

-----------------------------------------------------------------------------

A nice day to all of you. The list is really great.

Hans-J. Nagel
Dept. Information Science
University of Constance
PF 5560
D-7750 Konstanz
Tel. ++49/7531-88 3535
Fax. ++49/7531-88 2601
e-mail: nagel@inf-wiss.ivp.uni-konstanz.de


