
Pixie Chronicles: Part 1 - Lessons from Mistakes

By Henry Grebler

Adam

I had a friend, Adam, who passed away far too young. Adam seemed to be competent at an amazing number of activities, mostly self-taught. He built himself a house almost single-handedly.

He once shared with me the secret of building. Adam would say that it was not possible to build a house without making mistakes. The secret, he said, was hiding your mistakes. For example, builders can never get the floor and walls to fit together exactly. So they invented skirting boards; for walls and ceiling: mouldings and cornices.

Subsequently, somebody must have decided to extend the concept, and these cover-ups became decorative. Now that's really hiding your mistakes - pretend that they are a feature!

Perhaps because I know (or think I know) more than the average bear about Linux, I get "adventurous", particularly on my projects at home. When I extend my activities beyond my areas of competence I frequently make mistakes. That's not intrinsically a bad thing: you have to break eggs to make an omelette. But it's important to be able to recover from mistakes.

As a result, I've adapted Adam's advice to fields which interest me.

Project

My project for last week was to build a server. Sounds simple. What could possibly go wrong? Watch and learn.

I run (the fairly old) Fedora Core 5 on my desktop, so I thought I'd install the latest Fedora on my server. I say "server" but it's actually a Pentium II, one of a pair a friend was throwing out. The other one replaced my aging (12-year-old Pentium 1 100) firewall.

But my project is more about concept than commercial-strength mail and web-serving, so grunt[1] is not important to me.

Installing from CDs holds no fascination for me: I've done it so many times, the novelty has worn off. I decided I'd do a "Look, Ma! No hands!" install.

The Pentium II was a Compaq which must have been quite leading edge when it was purchased. Of particular interest was the fact that it had on-board hardware PXE capability. I conceived a plan which would see the machine perform a network boot, then perform a network install driven by a kickstart file. Everything would be automatic: partitioning, timezone, system language, configuration of 2 NICs, root password - the lot.

As an exercise in building a single server, this approach is not very efficient. However, as an exercise in building multiple servers, it is excellent. One of the dangers of having a human (me) perform the same task repeatedly is that boredom sets in and the human (me) loses concentration, leading to mistakes. It is worth investing effort into automating the process. Although I currently have neither the hardware nor the need to build a farm of servers, I would like to learn the technique for the future. So for me, this is a learning exercise.
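To make the "multiple servers" argument concrete, here is a minimal sketch of generating one kickstart file per machine from a single template. It is not the kickstart file used in this project: the host names, addresses, install URL and password hash are all hypothetical placeholders, and the list of directives is an assumption that should be checked against the kickstart documentation for the Fedora release being installed.

```python
# Sketch only: render one kickstart file per target machine.
# Every concrete value here (URL, addresses, password hash, host names)
# is a hypothetical placeholder, not taken from the article.

KICKSTART_TEMPLATE = """\
install
url --url={install_url}
lang en_US.UTF-8
keyboard us
timezone {timezone}
rootpw --iscrypted {root_hash}
network --device=eth0 --bootproto=dhcp --hostname={hostname}
network --device=eth1 --bootproto=static --ip={eth1_ip} --netmask=255.255.255.0
zerombr
clearpart --all --initlabel
autopart
bootloader --location=mbr
reboot

%packages
@core
%end
"""


def render_kickstart(hostname, eth1_ip,
                     install_url="http://192.168.0.1/fedora/os",
                     timezone="Australia/Melbourne",
                     root_hash="$1$PLACEHOLDER$PLACEHOLDER"):
    """Fill the template for one machine and return the kickstart text."""
    return KICKSTART_TEMPLATE.format(
        install_url=install_url,
        timezone=timezone,
        root_hash=root_hash,
        hostname=hostname,
        eth1_ip=eth1_ip,
    )


# One kickstart text per (hypothetical) server; only per-host values differ.
configs = {
    name: render_kickstart(name, ip)
    for name, ip in [("pixie1", "10.0.0.11"), ("pixie2", "10.0.0.12")]
}
```

The point of the design is that the human types the per-host values once, in one place, and the repetitive part is left to the script.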

The following sections are written from the perspective of the machine to be built, which I'll call the target machine.

Typical Install

To provide some context, here is an overview of a typical install. Bear in mind that my old Compaq only has a CD drive (not DVD).

  1. Turn on the target machine.
  2. Insert the first install CD.
    • Machine boots from the removable disk, loading the install environment.
  3. Proceed through many screens answering questions about partitioning, timezone, system language, etc.
    • The machine begins to install from the first CD.
    • After a while, the machine asks for the second CD.
  4. Change CDs.
  5. Repeat until all 6 install CDs have been processed.

Note that the numbered steps involve human intervention. The indented sentences describe the target machine's behaviour.

Intended Install

Here is my rough plan for how things were supposed to go.

  1. Turn on the target machine.
    • Machine starts the boot process, asks the network for an IP address, then downloads the install environment.
    • The install environment uses the kickstart file to guide the rest of the install. It obtains the install files from the network. No further intervention is required.

Actual First Attempt

And here's what happened the first time.

  1. I turned on the target machine.
    • The machine booted, asked the network for an IP address, then downloaded the install environment.
    • The install environment used the kickstart file to guide the rest of the install.

However, it was here that I discovered the first of several mistakes.

I had not defined the partitioning information completely, so the machine stopped to ask for clarification.
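The article does not say which partitioning details were missing, so the following is only a rough pre-flight check under stated assumptions. It assumes that an unattended install wants a `clearpart` command, either `autopart` or at least one `part`/`partition` command, and a `bootloader` command in the command section of the kickstart file. The function name is hypothetical, and what the installer really requires depends on the release.

```python
def missing_partition_directives(kickstart_text):
    """Return the partitioning-related commands that appear to be absent.

    Heuristic only: looks at the first word of each line in the command
    section, i.e. everything before the first %section such as %packages.
    """
    commands = set()
    for line in kickstart_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("%"):
            break  # assumed end of the command section
        commands.add(line.split()[0])

    missing = []
    if "clearpart" not in commands:
        missing.append("clearpart")
    if not commands & {"autopart", "part", "partition"}:
        missing.append("autopart or part")
    if "bootloader" not in commands:
        missing.append("bootloader")
    return missing


incomplete = "lang en_US.UTF-8\nclearpart --all --initlabel\n"
problems = missing_partition_directives(incomplete)
```

Running a check like this before the first boot costs seconds; discovering the gap at the console costs an install cycle.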

I guess I could have gone back and fixed the problem and restarted, but it occurred to me that I might have made other mistakes. I decided to answer the partitioning questions and proceed.

Remarkably, that was the only problem. Or so it seemed. The install proceeded. It seemed to take a very long time, but I was not too surprised by this: the target machine is, after all, a Pentium II.

It is at this stage that a kickstart install starts to pay back. The install may take a long time, but I don't have to stay around to change CDs. There are always other tasks to be done, and in this case I left the room for several hours.

On my return, the machine was again asking for clarification about partitioning!

What had happened?!

Things Go Wrong

When things go wrong, there is a temptation to start doing. This is almost invariably bad. It often leads to regret! It takes some discipline and experience to stop and think.

"Doing" quite often leads to activities which modify the "crime scene". One then gets to a point where one wants to know what something looked like before the "doing" began. Too late.

"Thinking" doesn't prevent all mistakes, only some.

What had gone wrong?

How does one even begin to think in such a case? Well, every activity typically has a beginning, a middle, and an end.

The install had started OK. I'd then had the partitioning problem. After that, the install seemed to proceed as expected. So the beginning and the middle seemed to be off the hook. I concluded that I should look at the end.

What should have happened after the install completed?

I remembered that one of the kickstart parameters related to this. I had requested that the machine reboot after install.

In an install from CD, once the install environment has processed the last CD, it asks the user to remove any install media from the drive. By default, when the machine subsequently reboots, it takes its data from the recently installed hard drive.

However, in my case, on reboot, the machine performed its PXE boot - which restarted the install process. Thank heavens for my partitioning mistake. Without it, the install would have proceeded as before, come to the end - and started all over again. It might still be at it now, repeatedly booting and installing!

I guess after a few days I might have grown suspicious.

It's an ill wind that blows no good; and sometimes, as in this case, a mistake can be your friend.

Analysis

My first mistake was unexceptional. To install FC10 I needed to get a number of steps right. As we shall soon see, I've glossed over some of the steps to tell this part of the story simply. It is unsurprising that I did not get everything right the first time. (I would have been very surprised had everything just worked!)

The second mistake was much more interesting. It relates to the case where an action is repeated but the conditions change between the first and a later occurrence of that action. It's the sort of problem that is common in loops.

In this case, the first time I booted the machine, I wanted it to use PXE; the second time, I didn't. I hadn't realised that I had wanted different behaviours.

Even stated like this, it was not clear to me what to do next. At first sight, the simple solution is to request that the install environment not reboot at the end of the install. But, as we shall see in the next part, that solution is somewhat short-sighted.
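The loop described in this analysis can be modelled in a few lines. The sketch below is a toy simulation, not a description of any real PXE server: `simulate_boots` and the two policies are hypothetical names, and the "install only on the first boot" policy is there purely to state the two different intents side by side, not as a proposed fix.

```python
def simulate_boots(wants_install, max_boots=4):
    """Simulate successive boots of the target machine.

    wants_install(boot_number) -> bool says whether the network hands the
    machine the install environment on that boot (a modelling assumption).
    """
    log = []
    disk_has_os = False
    for boot_number in range(1, max_boots + 1):
        if wants_install(boot_number):
            # The kickstart file ends with a reboot, so we go round again.
            log.append(f"boot {boot_number}: PXE -> install -> reboot")
            disk_has_os = True
        elif disk_has_os:
            log.append(f"boot {boot_number}: hard disk -> installed system runs")
            break
        else:
            log.append(f"boot {boot_number}: nothing to boot")
            break
    return log


# What actually happened: the same behaviour on every boot.
always = simulate_boots(lambda boot_number: True)

# What was really wanted: different behaviour the second time around.
once = simulate_boots(lambda boot_number: boot_number == 1)
```

With the first policy the log contains nothing but installs until `max_boots` cuts it off; with the second, the machine installs once and then runs from its hard disk. The code is identical in both cases; only the condition that changes between iterations differs.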

Lessons

When something goes wrong, don't rush to act. Stop, think, and plan before acting.


[1] Reading Henry's articles is giving me a great workout in Aussie colloquialisms - "grunt" in this case, I'm told, means "muscle", i.e. computing power. -- Kat





Henry was born in Germany in 1946, migrating to Australia in 1950. In his childhood, he taught himself to take apart the family radio and put it back together again - with very few parts left over.

After ignominiously flunking out of Medicine (best result: a sup in Biochemistry - which he flunked), he switched to Computation, the name given to the nascent field which would become Computer Science. His early computer experience includes relics such as punch cards, paper tape and mag tape.

He has spent his days working with computers, mostly for computer manufacturers or software developers. It is his darkest secret that he has been paid to do the sorts of things he would have paid money to be allowed to do. Just don't tell any of his employers.

He has used Linux as his personal home desktop since the family got its first PC in 1996. Back then, when the family shared the one PC, it was a dual-boot Windows/Slackware setup. Now that each member has his/her own computer, Henry somehow survives in a purely Linux world.

He lives in a suburb of Melbourne, Australia.


Copyright © 2010, Henry Grebler. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 173 of Linux Gazette, April 2010
