What I did over my Summer Vacation
or, Bring Me to Life: in which I teach an old dog many new tricks
Call me old-fashioned, but I revel in finding vintage systems and salvaged hardware. One of my favorite YouTube channels is The 8-bit Guy, who rescues old Commodores from attics and junk-heaps, and spends considerable time scrubbing them out, soldering, tinkering, and releasing them back into service. Then there is "Play With Junk", a European fellow whose disembodied voice accompanies thorough teardowns of "big iron" enterprise-class systems and components.
So when Todd offered me the task of salvaging a decommissioned Xeon rackmount server, I eagerly accepted his challenge. Together we lugged this beast into my apartment, where it sat for quite a while. I had so much to do in my second semester at MCC, and I needed to figure out a way to position the machine and connect it to my other systems.
This is a Dell PowerEdge 2650 system, of approximately 2004 vintage. It features twin 32-bit Xeon CPUs at 2.8GHz with 512K L2 cache. This is, I believe, the "Prestonia" Xeon 2.8B. The FSB is 533MT/s (also noted as 533MHz; perhaps these are one and the same). The bus is PCI/PCI-X with 64-bit full-length slots. Installed memory is 12288MB (12GB) of ECC DDR DIMMs. Five hot-swap Ultra3 SCSI hard disk drives are installed in front. A venerable 3.5" 1.44MB floppy drive and a standard CD-ROM optical drive are accessible from the front panel. The I/O ports are all the standard PC-compatible ports of those bygone days. Notably, there are twin 1000BaseT Ethernet ports and a DRAC3 module for out-of-band management, also Ethernet-based.
Twin hot-swap power supply units, complete with fans, plug into standard 110V power. A small matrix LCD display scrolls text status messages, and a sturdy bezel panel can be placed over the front panel, concealing the unsightly workings behind it and providing its own multifunction status lamp for at-a-glance diagnostics. The BIOS revision is A17. This machine is designed for high availability and fault-tolerance. Its many fans and its 500W power draw make it a quite noisy and thirsty guest in my home office.
I did not open the case itself, although the outside of the case does show its long period of service, with quite a bit of accumulated grime. It would benefit from a thorough cleanup and detailing.
On June 8, 2018, the Solemnity of the Sacred Heart of Jesus, I wrestled this behemoth into my office and found a way to connect everything necessary. Soon I decided its hostname would be "corazon", which in Spanish means "heart". With the twin Ethernet interfaces, there is one for Jesus and one for the Immaculate Heart of Mary.
I plugged the left-side power supply into a UPS-protected outlet. I powered on the system, and the front-panel LCD indicated trouble with fan RPM and one of the two PSUs. I then plugged the right-side power supply into a UPS-unprotected outlet, keeping the left side plugged in. When I powered on again, the front-panel LCD still complained: "E0412: RPM FAN 2".
This led to several bouts of swap-and-test with both the power supplies and the HDDs, as there were failures each time. I suspect that I misunderstand the numbering scheme used by the system, or something about the hot-swap facility, because the failures did not occur in any predictable or systematic way that would let swapping implicate a bad connector, port, or device. I made no attempt to remedy these PSU issues by disassembly, because that is a high-risk and counterproductive activity for someone who is untrained and lacks the proper tools.
At first I had assumed that these PSU and FAN warnings were fatal errors that would prevent a system boot, but as it turns out, the system simply takes its sweet time running POST diagnostics and probes. Ultimately, I was able to boot an operating system and begin the installation process.
I perused the BIOS and fiddled with the settings. I enabled
DHCP addressing on the DRAC module.
As I prepared to install the operating system, my next order of business was a memtest. I had enabled "OS Install Mode", which had the effect of limiting RAM to 256MB. That was ostensibly some kind of Windows hack, so I disabled it, but memtest still threw plenty of errors, such as: 0x00effedc80 at 3839.8 MB.
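As a sanity check on that address, the hexadecimal offset converts to megabytes with a little shell arithmetic (a quick sketch; memtest's own rounding accounts for the slight difference):

    $ printf '%d\n' 0x00effedc80                  # hex address as decimal bytes
    4026457216
    $ echo 'scale=1; 4026457216 / 1048576' | bc   # bytes to MB (1MB = 1048576 bytes)
    3839.9

which lands right next to the 3839.8 MB that memtest printed.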
The bezel seems to be missing the link cable for its LED.
For as long as I am performing maintenance on the system, I do not
intend to reinstall the bezel anyway. The connector seems
proprietary and specialized; it may be worth picking one up on
eBay.
The HDDs gave me their share of trouble. They had arrived without being seated, which must have been a common-sense test for me. Once I had them firmly plugged in, I found that one wouldn't respond; SCSI commands failed and the LED flashed amber. As I swapped disks around, I could not quite determine whether it was a disk unit failure, the controller, or something else. Four of the disks are Seagate, including the failed one, and the fifth is IBM/Hitachi, which suggests it had perhaps been replaced while in service. Each has a capacity of 68.3GB.
I ran a surface scan on each disk from the Adaptec SCSI controller's BIOS screen. No bad-block errors were indicated. I enabled RAID 5 and assembled a new container from the four remaining physical disks: a 205GB container. I triggered a SCRUB, and it ran for several hours. Ending a second day of labors, I retired to pray and sleep through the night.
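That container size checks out: RAID 5 spends one disk's worth of capacity on parity, so the usable space is

    (N - 1) × disk size = (4 - 1) × 68.3GB ≈ 205GB

which matches what the controller reported.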
I chose CentOS 6.9 as the preferred Linux distribution here. CentOS is essentially the free-download version of Red Hat Enterprise Linux. I could not take advantage of the latest CentOS 7, because its support for 32-bit hardware like this has already been discontinued. The kernel did indicate a Speculative Store Bypass (Spectre-NG) vulnerability. The world is moving to 64-bit systems; developers and vendors find it quite difficult to support legacy hardware with diminishing returns on their investment. Likewise, we will find that many maintainers of 32-bit software packages will drop support and EOL their applications. This is a concern not only from a standpoint of bug-fixes, but also of security vulnerabilities. Your security risks grow as you accumulate more abandoned and unmaintained software in a world where everyone is connected to the Internet, and new holes are inevitably discovered in old systems.
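Returning to that Spectre-NG indication: on kernels that expose the sysfs vulnerability files (I am assuming the backported CentOS 6 kernel does), the status can be read directly:

    # prints e.g. "Vulnerable" or a "Mitigation: ..." line, depending on the kernel
    cat /sys/devices/system/cpu/vulnerabilities/spec_store_bypass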
CentOS 6.9-minimal installed correctly, with a LUKS-encrypted filesystem and ext4 on top. Unfortunately, CentOS 6 will not install with XFS. My other wish was to experiment with ZFS, but documentation indicated that 32-bit systems were unstable and unreliable, so I decided to simply play it safe. I installed the groups "X Window System" and "KDE Desktop", just temporarily, so that I could have a web browser and terminal windows while I did more work. The peculiar thing about this OS workout is that the machine does not act like it has bad RAM, but I would still have grave reservations, given the errors claimed by memtest. I partitioned the filesystems into /, /home, and /data.
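For reference, bringing one of these encrypted volumes online by hand looks roughly like this (a minimal sketch; the volume group and LV names are hypothetical stand-ins, not the actual ones on corazon):

    # unlock the LUKS container on the logical volume (prompts for a passphrase)
    cryptsetup luksOpen /dev/VolGroup/lv_data data_crypt
    # mount the ext4 filesystem that lives inside it
    mount -t ext4 /dev/mapper/data_crypt /data

At boot, entries in /etc/crypttab and /etc/fstab take care of this automatically.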
Since I am quite inexperienced with encrypted filesystems, I expect to have quite a few learning experiences in my near future. Pavlos was wise to warn me about learning to walk before we can run. When I decided to replace the passphrase with a better one, I was reminded of the importance of meticulous documentation in such matters. I could not read my own writing, and I nearly decided all my data was lost when neither the new nor the old passphrase would work. Then I discovered that all three logical volumes had their own passphrases attached, and I had to run cryptsetup(8) multiple times, first to add the new passphrase and then to delete the old one: a list of more than one passphrase can be applied to each volume.
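The safe order of operations, repeated for each volume, is to add the new passphrase before removing the old one (the device path here is a hypothetical stand-in):

    # each command first prompts for an existing passphrase
    cryptsetup luksAddKey /dev/VolGroup/lv_home      # add the new passphrase to a free key slot
    cryptsetup luksRemoveKey /dev/VolGroup/lv_home   # then remove the old passphrase
    cryptsetup luksDump /dev/VolGroup/lv_home        # verify which key slots remain in use

LUKS provides eight key slots per volume, which is why more than one passphrase can coexist.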
I connected Ethernet to the management port, which is a DRAC III. There is no way to communicate with the DRAC: every modern web browser refuses to connect due to an outdated RC4 cipher, among other HTTP protocol errors. I tried:
Chromium Version 66.0.3359.181 (Official Build), built on Ubuntu, running on Ubuntu 18.04 (64-bit)
Internet Explorer 11.0.9600.19003, update KB4103768
Firefox 60.0.2 (64-bit)
Lynx Version 2.8.9dev.16 (11 Jul 2017) libwww-FM 2.14, SSL-MM 1.4.1, GNUTLS 3.5.17, ncurses 6.1.20180127(wide)
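One way to confirm the cipher problem from a terminal, assuming the DRAC answers HTTPS at its address (the IP below is hypothetical), is to offer it only RC4 via openssl:

    # modern openssl builds may refuse to negotiate RC4 at all, which is exactly the problem
    openssl s_client -connect 192.168.1.50:443 -cipher RC4-SHA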
For all intents and purposes, to access the DRAC in a modern environment, you would need to maintain a legacy OS running an outdated browser. I came to find out that the DRAC3 has been obsolete since Windows Server 2008. The DRAC is an essential component of datacenter operations; without this kind of remote management, you would need to send a human operator onto the machine room floor to plug in a keyboard and monitor and power-cycle or diagnose the system in person. The DRAC allows all of this to happen remotely, from the comfort of your own home, office, or coffeehouse (please use VPNs and firewalls).
I continued to install software, while making notes of the items I would need to scrub before the machine is taken away from my personal home network. I used "yum groupinstall" for a "kitchen sink" approach to loading software that may be useful in such an enterprise server environment. I configured the NFS client to mount from my desktop server, though of course this Xeon would surely be configured as a fileserver in its own right.
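The commands themselves are one-liners (the group names are as yum lists them; the NFS server and export path below are hypothetical):

    yum groupinstall "X Window System" "KDE Desktop"              # the desktop groups mentioned earlier
    mount -t nfs desktop.example.com:/export/home /mnt/desktop    # mount the desktop server's export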
I snapped some photos of the server while it was up and running. You can see the front panel, indicating a bad HDD with amber LED to the left, and the status LCD readout on the right. The top view includes the CentOS 6.9 installation CD, the Seagate HDD in its hot-swap carrier, the detached front bezel, and one of the two hot-swap PSUs. Finally the rear view, mostly unremarkable.
In summary, I am thankful to be appointed steward of this Xeon server, which I have christened "corazon". It has given me experience with a hardware RAID controller and an up-close encounter with the HA/redundancy features that are now common in commercial enterprise systems, and it has augmented the education and experience I gained with my CompTIA A+ certification.
Amen.
Given by my hand,
Mr. Robert Andrew Earl
Student, Red Hat Academy, Cisco Network Academy
Class of 2020