Summary: Detection and auto-shutdown upon reaching a certain temperature

From: sunhux G <sunhux_at_gmail.com>
Date: Wed Apr 29 2009 - 12:00:39 EDT
Thanks for the numerous replies. Allan's replies match what I need;
they are followed by a Perl script from Allan at the end.


But first, there's this constant harassment from Hike, this list's
moderator??

you really are a troll--this question deals with easily found information
and is "common knowledge"--nothing that you should be posting about to the
sunmangers mailing list.

next, you'll ask how to raid a quad ethernet card!
LOL!!!

it seems your learning curve is very shallow.


========================Allan's reply ==========================

> Can Solaris detect the estimated temperature of the datacentre and
> then shut itself down (init 5) gracefully?

That depends on your hardware. My old E-450 had ambient temperature
readings.  If your hardware has AMBIENT amongst the prtdiag output,
I'll send you my short Perl script that pages when the room gets hot.
Alternatively, get an APC temperature/humidity probe and monitor that
with NUT (Network UPS Tools).


> Yes, I do have an E450 and in its "prtdiag -v" there's an
> Ambient temperature.  Our E450 gave 1 degree Celsius less than the
> room temperature (reported by our DC aircon).
> In your E450, is the Ambient temp close to your room temperature?

Allan's reply :
My E-450's Celsius temperature was maybe a degree low, but it was pretty
close. Note that the temperature is read where it's sucking air in, so
if it's low in the rack (as mine was) it may be cooler than the
wall-mounted thermostat.
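
A minimal sketch of pulling that ambient reading out of captured
prtdiag text. It assumes the reading is a whole number on the same line
as the word AMBIENT; the exact layout differs between platforms, and
parse_ambient is an illustrative name, not part of any Sun tool.

```perl
use strict;
use warnings;

# Return the first integer found on the first line mentioning AMBIENT,
# or undef when no such line (or no number on it) exists.
# Assumption: the temperature is the first number on that line.
sub parse_ambient {
    my ($text) = @_;
    for my $line (split /\n/, $text) {
        next unless $line =~ /AMBIENT/i;
        return $1 if $line =~ /(\d+)/;
    }
    return;
}

# On a real host the text would come from the command used in the
# script further down, for example:
#   my $temp = parse_ambient(scalar qx{/usr/platform/sun4u/sbin/prtdiag -v});
```

Keeping the parsing separate from the command call means it can be
checked against saved prtdiag output before being trusted on a live box.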


================================================================================

The cut-off temperature of a CPU is 65 deg C. The best bet would be to
shut down a server when it reaches 62 deg C.

If you have many servers in a data center and you are using Tivoli or any
other monitoring software, you can build into it the logic to shut down a
server when it reaches this critical temperature.
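
The threshold logic described above could look roughly like the sketch
below. The function names are hypothetical, the thresholds are passed in
by the caller, and the shutdown command is an argument so that nothing
is halted unless one is supplied.

```perl
use strict;
use warnings;

# Classify a numeric reading against two caller-supplied limits.
# Returns 'ok', 'warn' or 'shutdown'.
sub classify_temp {
    my ($temp, $warn_at, $shutdown_at) = @_;
    return 'shutdown' if $temp >= $shutdown_at;
    return 'warn'     if $temp >= $warn_at;
    return 'ok';
}

# Run the shutdown command (an array reference) only when the reading
# is in the 'shutdown' band and a command was actually given.
sub act_on_temp {
    my ($temp, $warn_at, $shutdown_at, $shutdown_cmd) = @_;
    my $state = classify_temp($temp, $warn_at, $shutdown_at);
    system(@$shutdown_cmd) if $state eq 'shutdown' && $shutdown_cmd;
    return $state;
}

# Example call, using the 62 deg C figure from the reply above and an
# arbitrary warning level of 55:
#   act_on_temp($cpu_temp, 55, 62, ['init', '5']);
```

Note the numeric >= comparisons: string operators such as ge would
rank "9" above "10".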

================================================================================

It depends upon the hardware.  A good rule of thumb is that if it is as new
as Solaris 9 or later, it most likely has temperature checks.  You are
correct: if there are temperature reports in prtdiag, then the sensors
exist in the system.

They do not check the temperature of the data center -- they are checking
the temperature of the component.  However, if the datacenter has gotten too
hot, then the component will get too hot and the system will gracefully shut
down.



====================== Perl script from Allan ========================

#!/usr/local/bin/perl -w

# Ambient temperature monitor and reporting <allan@cookie.org>

# Some Sun hardware reports ambient temperatures, all in Celsius.
# Life is easier if you think Celsius instead of converting to
# Fahrenheit. The important temperatures are:
#  20C cool, a pleasant but efficient data center temperature
#  25C warm, badger your physical plant people NOW
#  30C hot, the server should be shut down so the disks don't melt
# Melted disks are neat, but they contain zero data.

# This is a very old Perl script, meaning that it's not very elegant.
# Feel free to simplify syntax anywhere you like.

# test, or unprivileged user location:
#my($log_file) = "/tmp/ambient.log";
# a rational system-level location
my($log_file) = "/var/log/ambient.log";
# this works under Solaris 8 on Sun E-450 hardware, YMMV
my($ambient_cmd) = "/usr/platform/sun4u/sbin/prtdiag -v | grep AMBIENT";
# the log allows difference statements, as well as eyeballing
my($new_ambient,$entry_num);
# Celsius warning temperature
my($err_temp) = 25;
# enter your list of pager addresses, with commas
my($err_addr) = "ADMIN\@PAGER.SVC";

# slurp the log into an array
# a missing log is fine on the first run; we simply start empty
my(@last_log);
if (open (LAST, "< $log_file")) {
    @last_log = <LAST>;
    close LAST;
}

# how long is the array?
$entry_num = @last_log;

# when the log is in /tmp/ or you're just starting
if( $entry_num == 0 ) {
    $new_ambient = qx/$ambient_cmd/;
    $new_ambient =~ s/\D*(\d+)\D*/$1/;
    my($now_date) = &GetNumericNow();
    $last_log[$entry_num] = "$now_date|$new_ambient";
    &WriteFile($log_file,@last_log,"\n");
}
$entry_num--;

# data to compare against the current readings
my($last_date,$last_ambient) = split /\|/, $last_log[$entry_num];
chomp $last_ambient;

# I should use qx// in place of double quotes more often
$new_ambient = qx/$ambient_cmd/;
$new_ambient =~ s/\D*(\d+)\D*/$1/;

# for script testing
#print "|$new_ambient| vs. |$last_ambient|\n";

# since we're logging, we'll note any change including non-alarm temps
if( $new_ambient != $last_ambient ) {
    my($now_date) = &GetNumericNow();
    &error_check($new_ambient,$last_ambient,$err_temp,$err_addr,$now_date);
    my($last_duration) = &GetDuration($last_date,$now_date);
    $last_log[$entry_num] = "$last_date|$last_ambient|$last_duration\n";
    $entry_num++;
    $last_log[$entry_num] = "$now_date|$new_ambient";
    &WriteFile($log_file,@last_log,"\n");
}

sub GetNumericNow{
    my($date,$sec,$min,$hour,$mday,$mon,$year,$wday);
    ($sec,$min,$hour,$mday,$mon,$year,$wday,undef,undef) = localtime(time);
    # Always use 4-digit years, lest Y3K see your code in use >8^)
    $year += 1900;
    $mon++;
    # I _know_ there are more efficient ways to pad, but I didn't then
    &LeadZero(4,\$year);
    &LeadZero(2,\$mon);
    &LeadZero(2,\$mday);
    &LeadZero(2,\$hour);
    &LeadZero(2,\$min);
    &LeadZero(2,\$sec);
    # Human readable, yet easy to parse when we read the log again
    $date = "$year\/$mon\/$mday,$hour:$min:$sec";
    return($date);
}

# I thought it would be nifty to know how long the temperature
# remained constant, until I had a machine room which changed between
# 15C and 16C every few minutes. The log gets long if you have lots of
# fluctuations.
sub GetDuration {
    my($last_date,$now_date) = @_;
    # for script testing
    #print "compare:\n$last_date\n$now_date\n";
    my($duration,$Lyear,$Lmon,$Lmday,$Lhour,$Lmin,$Lsec);
    my($Nyear,$Nmon,$Nmday,$Nhour,$Nmin,$Nsec);
    my($Dyear,$Dmon,$Dmday,$Dhour,$Dmin,$Dsec);
    $last_date =~ m|(\d\d\d\d)/(\d\d)/(\d\d),(\d\d):(\d\d):(\d\d)|;
    ($Lyear,$Lmon,$Lmday,$Lhour,$Lmin,$Lsec) = ($1,$2,$3,$4,$5,$6);
    $now_date =~ m|(\d\d\d\d)/(\d\d)/(\d\d),(\d\d):(\d\d):(\d\d)|;
    ($Nyear,$Nmon,$Nmday,$Nhour,$Nmin,$Nsec) = ($1,$2,$3,$4,$5,$6);
    # this is why we use 4-digit years
    $Dyear = $Nyear - $Lyear;
    # Julian date would simplify math, but logs would be less readable
    if(($Nmon - $Lmon) < 0) {$Dyear--} # borrowing check
    $Dmon = (($Nmon +12) - $Lmon) % 12;
    # month's day count varies
    if(($Nmday - $Lmday) < 0) {$Dmon--} # borrowing check
    $Dmday = $Nmday - $Lmday;
    if(($Nhour - $Lhour) < 0) {$Dmday--} # borrowing check
    $Dhour = (($Nhour +24) - $Lhour) % 24;
    if(($Nmin - $Lmin) < 0) {$Dhour--} # borrowing check
    $Dmin = (($Nmin +60) - $Lmin) % 60;
    if(($Nsec - $Lsec) < 0) {$Dmin--} # borrowing check
    $Dsec = (($Nsec +60) - $Lsec) % 60;
    if($Dyear) { $duration .= "$Dyear years, "}
    if($Dmon) { $duration .= "$Dmon months, "}
    if($Dmday) { $duration .= "$Dmday days, "}
    &LeadZero(2,\$Dhour);
    &LeadZero(2,\$Dmin);
    &LeadZero(2,\$Dsec);
    $duration .= "$Dhour";
    $duration .= ":$Dmin";
    $duration .= ":$Dsec";
    return ($duration);
}

# yes, please, replace this with some one-line formatting trick
sub LeadZero {
    my($length,$valR) = @_;
    my($temp,$counter);
    my($neg_length) = $length * (-1);
    while ($counter++ < $length) {
        $temp .= "0";
    }
    $temp .= $$valR;
    $$valR = substr $temp, $neg_length, $length;
}

# email for help if server room overheats
sub error_check {
    my ($new_ambient,$last_ambient,$err_temp,$err_addr,$now_date) = @_;
    my ($rising,$falling) = (0,0);
    my ($err_msg);
    # compare numerically: string operators would rank "9" above "10"
    if ($new_ambient > $last_ambient) {
        $rising = 1;
    } elsif ($new_ambient < $last_ambient) {
        $falling = 1;
    }
    # if it is/was too hot & temp is changing, report current
    if ( ( ($new_ambient >= $err_temp) || ($last_ambient >= $err_temp) )
         && ($rising || $falling) ) {
        $err_msg = "Server Ambient Temp: $new_ambient ";
        if ($rising) {
            $err_msg .= "rising!\n$now_date";
        } elsif ($falling) {
            $err_msg .= "falling >8^)\n$now_date";
        }
        &email($err_msg,$err_addr);
    }
}

# All of my Suns run sendmail. If yours doesn't, punt.
sub email{
    my($NOTE,$ADDR) = @_;
    my($MAILCMD,$FILE);
    # because redirecting in a file makes email simpler
    $FILE  = "/tmp/temp.$$";
    &WriteFile($FILE,$NOTE);
    $MAILCMD = "/usr/bin/mailx -s Ambient $ADDR < $FILE";
    `$MAILCMD`;
    # remove the temporary message file once it has been handed to mailx
    unlink $FILE;
}

# If you ever see my code anywhere else, you'll probably see these two
# subroutines. I'm not saying they're great, but they are old friends.

# Allan's standard read to string from file
sub ReadFile {
    my($this_file) = @_;
    my($read_scalar) = "";
    my($file_size) = -s $this_file;
    open (GETIT, "<$this_file") or die "cannot read $this_file: $!\n";
    my($read_size) = read GETIT,$read_scalar,$file_size;
    close (GETIT);
    if ($read_size != $file_size) {
        print STDERR "read error:\n FILE: $file_size READ: $read_size\n";
    }
    return ($read_scalar);
}

# Allan's standard write array to file
sub WriteFile {
    my($this_file,@write_list) = @_;
    open (OUTFILE, "> $this_file") or die "cannot write $this_file: $!\n";
    print OUTFILE @write_list;
    close (OUTFILE);
    select (STDOUT);
}
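
GetDuration borrows field by field, so a day or month difference can
come out negative when a month boundary is crossed. As the script's own
comment hints, working from a single count of seconds avoids that. A
rough sketch using the core Time::Local module; the sub names are made
up for illustration, and the log's "YYYY/MM/DD,HH:MM:SS" format is
assumed.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Convert "YYYY/MM/DD,HH:MM:SS" to epoch seconds; undef if malformed.
# timelocal croaks on impossible values such as month 13.
sub stamp_to_epoch {
    my ($stamp) = @_;
    my ($y, $mo, $d, $h, $mi, $s) =
        $stamp =~ m{^(\d{4})/(\d\d)/(\d\d),(\d\d):(\d\d):(\d\d)} or return;
    return timelocal($s, $mi, $h, $d, $mo - 1, $y);
}

# Render a non-negative span of seconds as "[N days, ]HH:MM:SS".
sub format_duration {
    my ($seconds) = @_;
    my $days = int($seconds / 86400);
    my $rest = $seconds % 86400;
    my $text = $days ? "$days days, " : '';
    return $text . sprintf('%02d:%02d:%02d',
        int($rest / 3600), int(($rest % 3600) / 60), $rest % 60);
}

# Elapsed time between two log stamps; undef if either is malformed.
sub duration_between {
    my ($last, $now) = @_;
    my $from = stamp_to_epoch($last);
    my $to   = stamp_to_epoch($now);
    return unless defined $from && defined $to;
    return format_duration($to - $from);
}
```

This reports elapsed days rather than months and years, which is
usually enough for a temperature log and sidesteps the varying length
of months.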


==============================================================
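
The script asks for a one-line replacement for LeadZero. One candidate
is sprintf, sketched below with hypothetical sub names. Unlike LeadZero
it never truncates a value that is wider than the requested length.

```perl
use strict;
use warnings;

# Zero-pad a non-negative integer to at least $width digits.
sub lead_zero {
    my ($width, $value) = @_;
    return sprintf '%0*d', $width, $value;
}

# Build the log timestamp "YYYY/MM/DD,HH:MM:SS" for a given epoch time,
# in the local time zone, without any separate padding calls.
sub numeric_time {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year) = localtime $epoch;
    return sprintf '%04d/%02d/%02d,%02d:%02d:%02d',
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec;
}

# Example: my $now_date = numeric_time(time);
```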




On Sun, Apr 26, 2009 at 4:20 PM, sunhux G <sunhux@gmail.com> wrote:

> Hi
>
>
> Can Solaris detect the estimated temperature of the datacentre and
> then shut itself down (init 5) gracefully?
>
> Or is there any script to do this?
>
> Which version onwards of Solaris has this feature?  I seem to recall
> seeing something about temperature in prtdiag.
>
>
> Can this temperature give an approximation of the DC's temperature?
>
> I thought of implementing something so that, if a Solaris server detects
> its temperature has hit a certain level, it takes this as an indication of
> the room's temperature and sends a signal (via ssh) to shut down the
> other servers and finally itself.
>
>
>
> Thanks
> U
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Wed Apr 29 12:01:17 2009
