Weasel - Syracuse Univ. Physics Computational Cluster

Weasel - Configuration

Summary:

  • 50 dual 1GHz Pentium III slave nodes
  • 102 GHz total
  • 32 GB RAM, 580 GB storage
  • 120 amps, ~2000 lbs
  • Performance: 202879.53 BogoMips

Weasel has one head node and 50 slave nodes, each with two 1GHz Pentium III processors.

The head node, slave nodes, network switch, monitor/keyboard/mouse, and power strips are all mounted in two midpoint rack units. Power is distributed across six 20-amp circuits. The cluster sits in a conditioned room with two 5-ton air conditioning units (heat, cool, humidity), set to approx. 65 degF and 55% RH. See the installation photo-log for what Weasel looks like.

weasel (w00) - server

dual 1GHz PentiumIII, 512MB RAM
two 80GB IDE (mirrored), floppy, CDRW
(two 35GB NFS partitions)
8mm tape drive (Sony AIT-1)
two 100Mbit ethernet (local & gateway)
4U rackmount case

w01-w26 - client nodes (26)

dual 1GHz PentiumIII, 512MB RAM (two 256MB)
10GB EIDE, floppy
100Mbit ethernet 
1U rackmount case

w39-w50 - client nodes (12)

dual 1GHz PentiumIII, 512MB RAM (single 512MB)
10GB EIDE, floppy
100Mbit ethernet
1U rackmount case

w27-w38 - client nodes (12)

dual 1GHz PentiumIII, 1GB RAM
10GB EIDE, floppy
100Mbit ethernet        
1U rackmount case
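
The totals in the summary can be cross-checked against the per-node listings above. The sketch below is only an illustration, not part of the original configuration notes. It assumes that the head node is counted in the GHz and RAM totals, and that the mirrored pair of 80GB drives is counted once as 80GB of storage. The group labels and the names GROUPS and totals are made up for this example.

```python
# Hardware groups as listed above:
# (label, node count, RAM per node in MB, disk per node in GB)
# Assumption: the mirrored 80GB pair on the head node counts as 80GB.
GROUPS = [
    ("head node", 1, 512, 80),
    ("clients, two 256MB", 26, 512, 10),
    ("clients, single 512MB", 12, 512, 10),
    ("clients, 1GB", 12, 1024, 10),
]
CPUS_PER_NODE = 2    # every node is a dual-processor machine
GHZ_PER_CPU = 1.0    # 1GHz Pentium III


def totals(groups):
    """Return (nodes, total GHz, total RAM in GB, total disk in GB)."""
    nodes = sum(count for _, count, _, _ in groups)
    ghz = nodes * CPUS_PER_NODE * GHZ_PER_CPU
    ram_gb = sum(count * ram_mb for _, count, ram_mb, _ in groups) / 1024
    disk_gb = sum(count * disk for _, count, _, disk in groups)
    return nodes, ghz, ram_gb, disk_gb


nodes, ghz, ram_gb, disk_gb = totals(GROUPS)
print(f"{nodes} nodes, {ghz:.0f} GHz, {ram_gb:.1f} GB RAM, {disk_gb} GB disk")
```

Under these assumptions the script reports 51 nodes, 102 GHz, 31.5 GB RAM (which rounds to the 32 GB in the summary), and 580 GB of disk.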

Interconnect

HP ProCurve 4000M 10/100 Switch 

Additional configuration changes


06Dec2001 - w48: 2nd sdram bad, access to higher mem hangs, removed/replaced & all is fine 
25Feb2002 - w11: Replaced bad hard drive (#1)
04Mar2002 - w15: Replaced hard drive that had: SMART Status Bad (Mar 04 2002, returned 2 drives for replacement)
02Jul2002 - w44: Replaced bad/unrecognizable hard drive
30Jul2002 - w37 w46: recloned hard drives
27Sep2002 - w43: hard drive corrupt, recloned/replaced (Oct 4, 2002, returned 2 drives for replacement)
09Dec2002 - w28: hard drive corrupt, recloned
23Dec2002 - w27: hard drive corrupt, recloned
30Dec2002 - w42: hard drive unreadable, replaced
07Jan2003 - w46: hard drive replaced (Jan 17 2003 returned 2 drives for replacement)
13Feb2003 - w46: hard drive replaced - again? 
02May2003 - w46: maxtor replacement write errors, replaced 
02May2003 - w30: intermittent errors on hard drive, replaced w/ seagate 10gb (returned 3 drives)
10Jun2003 - w32: quantum hard drive unrecognizable, replaced with maxtor 10gb (#10)
28Jul2003 - w38: quantum hard drive unrecognizable, replaced with same 10gb
11Aug2003 - w40: quantum hard drive unrecognizable, replaced with same 10gb (returned 3 drives)
18Sep2003 - w06: quantum hard drive unrecognizable, replaced with maxtor 13.6gb 
07Oct2003 - w18: quantum 10gb hard drive SMART bad, replaced with IBM 15gb
07Oct2003 - w38: quantum 10gb hard drive unrecognizable, replaced with Maxtor 20gb (returned 3 drives)
04Nov2003 - w24: quantum 10gb hard drive unrecognizable, replaced with Maxtor 10gb  
10Nov2003 - w08: quantum 10gb hard drive unreadable, replaced with Maxtor 10gb  
26Jan2004 - w14: quantum 10gb hard drive unreadable, replaced with Maxtor 10gb (return 3 drives) 
05Feb2004 - w48: quantum 10gb hard drive unreadable, replaced with Maxtor 10gb  
14May2004 - w03: quantum 10gb hard drive unreadable, replaced with Maxtor 10gb (#20-return 4 drives) 
25Jun2004 - w35: hard drive unreadable, replaced 
23Aug2004 - w03 w26: hard drive unreadable, replaced (return 4 drives) 
(w22 & w28 intermittent errors, drives may be on their way out)
28Mar2005 - w22: hard drive unreadable, replaced with Maxtor 20gb 
04May2005 - w31 & w47: hard drive unreadable, replaced with used Maxtor 10gb 
26May2005 - w40: hard drive rw errors, replaced with new Maxtor 30gb 
23Jun2005 - w21: hard drive rw errors, replaced with new Maxtor 30gb 
23Jun2005 - w04: kernel panics, swapped memory & hard drive (Maxtor 30gb), one cpu fan bad 
01Jul2005 - w44: hard drive unrecognizable, replaced (Maxtor 30gb) 
03Jan2006 - w11 & 48: Power supplies replaced, w32 kernel panic, rebooted, waiting on further symptoms
            w04: kernel panic, processor fan dead, down until replacement ordered
16Feb2006 - w04, w06, w08 all had cpu cooler fans replaced
            w12 experienced freezing with unknown cause, but appears functioning after reseating memory and connections
            w14 recloned with new hard drive
20Mar2006 - w27: hard drive unrecognizable, replaced 
17Apr2006 - w13: hard drive unrecognizable, replaced 
08May2006 - w47 & 48: Both cpu blower fans replaced
02Nov2006 - w03 w28: one cpu blower fan replaced, w47 heatsinks reseated due to slow/heat (25 minor exit fans currently dead)
07Dec2006 - w05: one cpu blower fan, one side exit fan, 2 middle fans replaced
12Dec2006 - w04: one cpu blower fan replaced (other one new recently)
15Dec2006 - w32 w36: both cpu blower fans replaced 
14Mar2007 - w39: cpu1 blower fan replaced (last week w11 kernel panics twice, appears ok) 
27Apr2007 - w44: both cpu blower fans replaced (w48 rebooted twice due to hangs in last month) 
07Jun2007 - w41: hard drive replaced with used drive 
26Jul2007 - w15: new power supply (w48 reimaged, frequent hangs) 
02Jan2008 - w03: hard drive bad, replaced with maxtor 40gb (w02 hung a couple times) 
22Feb2008 - w19 & w34: hard drives bad, replaced with maxtor 40gb 
            w35: one cpu fan replaced (other was replaced a while ago)
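
The entries above follow a loose "DDMonYYYY - node(s): note" pattern, so they can be tallied per node with a short script. The following is a minimal sketch, not a tool used on Weasel. It assumes English month abbreviations, treats indented lines without a date as continuations of the previous entry, and only recognizes node names written in full as wNN (so the "48" in "w47 & 48" would be missed). The names parse_log, ENTRY, NODE, and SAMPLE are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

ENTRY = re.compile(r"^(\d{2}[A-Za-z]{3}\d{4}) - (.*)$")  # e.g. 07Jan2003 - ...
NODE = re.compile(r"\bw\d{2}\b")                         # e.g. w46


def parse_log(lines):
    """Yield (date, node, text) for every node named on each log line."""
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = ENTRY.match(line)
        if match:
            current = datetime.strptime(match.group(1), "%d%b%Y").date()
            text = match.group(2)
        else:
            text = line  # continuation line: reuse the previous date
        if current is None:
            continue  # text before the first dated entry is skipped
        for node in NODE.findall(text):
            yield current, node, text


# A few lines copied from the log above.
SAMPLE = [
    "07Jan2003 - w46: hard drive replaced (Jan 17 2003 returned 2 drives for replacement)",
    "13Feb2003 - w46: hard drive replaced - again?",
    "16Feb2006 - w04, w06, w08 all had cpu cooler fans replaced",
    "            w14 recloned with new hard drive",
]

events = list(parse_log(SAMPLE))
counts = Counter(node for _, node, _ in events)
print(counts.most_common(1))
```

Feeding the full log to parse_log instead of SAMPLE would give a rough count of how often each node appears, which is one way to spot repeat offenders such as w46.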


Maintained by Dan Kirkpatrick
Last updated 06 August 2010