About the www.rimboy.com webserver


On July 12, 2002, www.rimboy.com pulled the plug on the Intel hardware that had served it for the past 2.5 years. In actuality, www.rimboy.com had always been served by Intel hardware, from its humble beginnings as rimboy.ml.org in the closet of the apartment Jen and I shared back in 1998. When ml.org folded, rimboy.com was born (see the FAQ section for more info on that), still being served on the end of a cable modem. Rimboy.com was first served by an old IBM PS/1 SX/20 computer donated by my wife's mother. It handled mail, web, and firewall duties before I started investigating port forwarding for the hardware I was accumulating. www.rimboy.com was then served by a 486/dx66 w/ 8 megs of RAM running behind my firewall for several months before the hard drive puked. By this point (IIRC), I had upgraded the sx20 to a dx/66 that would serve me for the next 3+ years.

When Jen and I purchased our house in December of 1999, I wanted to keep www.rimboy.com online since at the time it was a frequently cited site for getting Linux up and running with the @home cable modem service. Thus, new hardware was purchased with the intent that it would fill in until I could figure out what my long term solution for www.rimboy.com's needs would be. I came across a deal (at that time) from HiTechCafe on a Dell P100 with 32 megs of RAM. The appealing part about this system was the fact that it had a SCSI CD-ROM drive. I knew if it had a SCSI CD-ROM drive then it would have a SCSI bus. Unfortunately, Dell deployed SCSI in those early Pentiums only because ATAPI CD-ROM drives were a dicey support proposition at best. Thus the SCSI controller was more or less brain dead and my plans to build "a real server" (meaning using SCSI hard drives) were more or less stopped at boot. Needing a solution, I went ahead and ran with the Western Digital hard drive that came with the system. I never felt comfortable about WD serving up the site given their track record. Thus, it was with great joy that, despite a power outage every 200+ days or so (no UPS on the thing), the drive served www.rimboy.com right up to its final spin-down without incident. Like I said, this system was retired after 2.5 years of day in, day out serving with only a reboot after a power outage.

Upon opening and inspecting the system, I found that despite 2 fans pulling in and pushing out air, the insides of the system were free from the usual dust bunnies that accumulate in most systems. This is most amazing given that the machine sits in the electronics shop of the building I work in. Sawing is only one of a few things that take place in the room. I yanked the network card out for re-use in the new server (and for several other reasons). Incidentally, the Netgear card was the first 10/100 network card I purchased. I'm not sure what possessed me to purchase the card since 100 mbit networking was still a ways off for the building's network.

Back in May 2002, my friend John, who I know from a period of my working career that I still cannot believe existed, gave me a Motorola StarMax3000 Macintosh clone. After several attempts at installing Linux he threw up his hands and said, "take it, I'm giving up" (more or less). Having traded a few emails with him in the past regarding trying to get Linux running on this Mac, I wondered what it was that stumped him. John's a sharp guy. What in the world's going on with this hardware?

The hardware, as mentioned, was a StarMax3000 with a 180 MHz 603e PowerPC (PPC) processor. John gave me the system with 80 megs of RAM. The motherboard is known as the Tanzania design, a motherboard that Apple spec'd for clone makers to use. Motorola and Umax (amongst others) used it, as did Apple itself (PM 4400). The Tanzania design is often maligned (and rightly so) given its peculiar memory limitations and other oddities people face when using one. That said, I determined that it would be more than adequate to serve as a web server and secondary mail server. It was going to be a major improvement over the Dell P100. One of the features of the StarMax systems was the fact that they could accept PS/2 keyboards and mice in addition to the available ADB products. Furthermore, it had a standard HD-15 SVGA monitor connector which meant I did not have to keep a DB15-HD15 adapter lying about. The other nice selling point about the system was the fact that it used IDE drives. Thus any of the standard IDE drives could be used in the system. It would not be until the G3 that Apple would really support IDE in a Mac, and frankly G3's still belong on a graphic designer's desktop, not serving up some 2 bit website (such as mine). Note, before you email me, I know Apple did IDE in the Performa 6200 and the LC series of 68k Macs. For the 6200 series, I suggest you visit www.lowendmac.net as to why using the 6200 for anything (and I mean anything) is not a good idea. Look under the road apples section and learn why it's quite possibly the worst Mac Apple ever built (and I mean worst).

I came to the conclusion that if I built a new webserver it would be on something that was not Intel or AMD based. I came to this conclusion given the proliferation of Linux. Who does not have a Linux system these days? (I don't really want to know the answer to that one fwiw). I wanted a challenge, and a challenge this system presented. There are some side benefits such as a few security thru obscurity bonuses. But at the end of the day, I wanted to say my web server's running on a Mac. No, it's not OSX and no, it's not running MacOS (well, for about 30 seconds until BootX kicks in). Besides, I reasoned, I could redeploy my P100 hardware into my cluster project.

I decided that I'd run with Yellow Dog Linux (7.2) since a short amount of investigation revealed it would be the distro to use. SuSE costs money (sorry, I'll pass thank-you) and is overkill given the number of discs they provide, Debian drives me nuts, and at the time I forgot about Mandrake. Later attempts to install Mandrake would prove useless so YDL was the way to go. I dropped an 8 gig drive into the machine, which I figure is roomy compared to the 2 gigs that were in that Dell. It became very clear during the install of YDL that several show stopping bugs were present. These mainly manifested themselves in the drive partitioning sequence. After several reboots I finally managed to get the drive sliced and diced the way I wanted it. It was teeth pulling on my part and I certainly cannot fault John for having problems. For one of his first Linux installs he could not have had it worse.

The first thing I noticed was that the network card in the StarMax, despite working fine in MacOS, would puke in Linux. I figured the best thing to do would be to get the latest Linux kernel (2.4.18) and compile it to support any number of NICs that I had lying about the house. I was delighted to find that the Intel EE 100s were supported and thus set about getting the server up and running on that card.

The fun began shortly after reboot. The system was unstable in a bad way. Kernel panics for signals you don't normally see. Suspecting the Intel card, I dropped in a Farallon card that Linux worked with. Unfortunately it too showed many of the same problems. It was weird. I could compile a kernel several times over, but a random http request, ssh login, or installing an rpm would send the machine to splitsville. Dayton Hamvention was just around the corner and so the machine sat for most of the rest of May.

Towards the end of June I decided to see if I could nail down the problems I was facing. Certain kernels ran fine, but my 2.4.18 kernel was unusable. In desperation, I began trimming out anything and everything that looked like fluff via make menuconfig. Still no dice. Compile time ranged from 45 minutes to an hour for a make dep && make clean && make bzImage && make modules && make modules_install. At this point I decided it was time to get my google on and figure out what it was that was causing this otherwise fine piece of hardware to hang up and dial the bit bucket.
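
For anyone who wants to repeat that rebuild cycle without retyping the chain each time, here is a rough sketch of it as a shell function. The target names are taken exactly as written above. The DRY_RUN switch and the function names are my own additions for illustration, not anything from the kernel tree. It assumes you are sitting in an already configured 2.4 source directory.

```shell
# Sketch of the 2.4-era build chain quoted above.
# DRY_RUN and the function names are hypothetical helpers, not kernel tooling.
# DRY_RUN defaults to 1, so the steps are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        # Run the real command; report failure to the caller.
        "$@" || return 1
    fi
}

build_kernel() {
    # Same order as the && chain: stop at the first step that fails.
    for target in dep clean bzImage modules modules_install; do
        run_step make "$target" || return 1
    done
}
```

Calling build_kernel as-is just lists the five make steps. Setting DRY_RUN=0 before calling it from inside the source tree runs them for real, and like the && chain it quits at the first failure.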

Given that the crashes were frequent and waiting for fsck on an 8 gig drive is at the low end of my level of tolerance, I implemented the ext3 journaling filesystem. ext3 has emerged as a popular option for Linux users, especially given the fact that one can easily convert between ext2 and ext3. When I set up my machine it was ext2. Converting to ext3 required compiling a kernel with ext3 support enabled, running tune2fs to add the journal, and rebooting. While I'm normally one to preach SGI's XFS filesystem, XFS requires the install kernel to support XFS if you want your rootfs to use XFS. Given that YDL does not support XFS (nor does it support ext3, which is disappointing), ext3 was about my only option. Mandrake PPC 8.2 does support XFS, but the installer is very broken, especially on clones based on the Tanzania board, such as the Motorola StarMax3000. I stay away from reiserfs like the plague. Because it is.
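
For the curious, the conversion boils down to something like the sketch below. The device name /dev/hda3 is a made-up placeholder, not the actual partition on this box, and fstab_to_ext3 is a hypothetical helper of mine. The journal itself comes from tune2fs -j, and the kernel needs CONFIG_EXT3_FS enabled. The helper only prints a rewritten fstab to stdout so the real file is never touched until you have looked at the result.

```shell
# Step 1 (run as root, on a kernel built with CONFIG_EXT3_FS):
#   tune2fs -j /dev/hda3     # /dev/hda3 is a placeholder device name
#
# Step 2: flip the matching fstab entry from ext2 to ext3.
# fstab_to_ext3 is a hypothetical helper; it prints the new fstab to stdout.
fstab_to_ext3() {
    dev="$1"      # device whose entry should change
    fstab="$2"    # path to the fstab file to read
    # Only the line for this device, and only if it is currently ext2.
    # A changed line is re-joined with single spaces; other lines pass through as-is.
    awk -v dev="$dev" '$1 == dev && $3 == "ext2" { $3 = "ext3" } { print }' "$fstab"
}
```

Something like fstab_to_ext3 /dev/hda3 /etc/fstab > /tmp/fstab.new lets you eyeball the result before copying it into place and rebooting.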

After some research, it was clear that one guy was the go-to guy for Linux on PPC. I dropped him a line, described the problems, the hardware I was using, and what I had already tried. Less than 24 hours later I had an email from him stating that PPC support, especially for the 603e procs, was broken in the 2.4.18 kernel. He suggested several download options and mentioned that the rc2 kernel for 2.4.19 would fix most of the problems. The problem was that the rc2 kernel appeared to only be available via bitkeeper. I decided to give rc1 for 2.4.19 a spin. Unfortunately something broke during the compile. I went back thru his email and googled to find this: http://www.ppckernel.org/. Right there in front of my face was the benh set of kernels and in particular an rc1 for 2.4.19 as patched by benh. After a quick download and compile (that was not broken) I gave the system a reboot.

Everything seemed to run well so I left it overnight. I came in the next morning to find the system still up and serving pages. So, after making some final config changes and verifying that it would serve as my secondary mail server, I redeployed the old Dell P100 on a different IP address and brought up the new www.rimboy.com (known as Zoot since I'm a Muppet fan) on the old Dell's IP address. We were serving! I had to fix a perl script or two that runs the backend of this site. But otherwise it was a painless switch. I put the system on a UPS given that the recent weather has made the building power flicker on occasion. That said, despite those flickers the Dell still managed to turn in 257 days of uptime before I issued the halt command. The record, incidentally, was somewhere in the neighborhood of 360 days before building power was terminated in order to complete some work. While I was robbed of a year of uptime, it's not imperative that my site have that kind of uptime. If anything, uptime like that is a clear sign that upgrades have not taken place. You can deduce what you want about the config on that Dell. Just bear in mind what distros were available 2.5 years previous.

I pulled the old NIC that was in the P100 and put it into the new webserver. Given that it's a Netgear card revision that works under the tulip driver, the system came back up and I had officially made the switch. The next few weeks should be interesting to see in terms of long term stability. I think the system will do fine, but given the various sets of problems I faced in bringing the system online I'm still just a tad leery of what's in store. Not to worry, the old P100 currently sits on top of Zoot on the off chance it needs to be pressed into emergency service. I don't expect to make that switch though. If you're reading this, then the Mac is serving just fine.


Update 2/3/2004
Still chugging along. Recently had 200+ days of uptime before they cut building power to do some work. Otherwise it's stable. Seems to handle the recent site redesign without issue and with no noticeable speed problem. It may not be the fastest, but it's not been a slouch either.