PXE Boot on Ubuntu 16.04 (Part 1)

As I’ve mentioned a few times before, I’m currently building a nice PXE-based platform for testing and imaging machines at work. Building a PXE server is relatively straightforward, but it takes some understanding of how PXE works.

How PXE booting works

PXE (Preboot eXecution Environment) is a standard way to boot a client device over a standard network connection.  PXE works by following a defined method to send a ‘bootstrap’ to the client machine, which then completes the boot process.

The procedure for PXE is pretty simple when you’re booting Linux:

  1. Compatible device boots, and the option ROM in the NIC attempts to get a DHCP lease.  If it is successful, the DHCP server gives the client device the name of a file to boot from.
  2. The client device connects to a TFTP (Trivial FTP) server and attempts to download the file.  If the download succeeds and the file is bootable, the client begins booting from it.
  3. The bootable file either contains or refers to a set of parameters to create a bootstrap, frequently initramfs and vmlinuz files, along with some configuration for the root filesystem.
  4. The initramfs and vmlinuz files are downloaded by the client and control is handed off to the bootstrap as it loads the rootfs over the network (usually NFS or HTTP).

In this guide, I’ll be showing you how to make a PXE boot server on Ubuntu 16.04.  The same methods should work on newer versions without much issue, though.

Hardware

PXE booting isn’t terribly hardware intensive, especially if a little bit of a wait is acceptable.  The only real hardware issue is that, since PXE uses DHCP, being able to isolate network ports can be extremely useful.  A second DHCP server on a shared network can hand out conflicting leases and temporarily knock other machines offline, so be careful about doing anything involving it without understanding what you’re doing!

Software

The software requirements for PXE are pretty light.  We’ll be using:

  • tftpd-hpa
  • isc-dhcp-server
  • NFS (nfs-kernel-server on Ubuntu)

Yeah, really, only 3 pieces of software are required for a basic PXE server.  Once you get that stuff installed, check out part 2!

Running only specific tests in Memtester

Panucci uses memtester to run a pattern RAM test. All well and good, but some of these tests take forever – unfortunately, time is of the essence here. Fortunately, memtester offers a way to run only certain tests. From the documentation:

-add ability to run only specified tests by setting the environment
variable MEMTESTER_TEST_MASK to a bitmask of the indexes of the tests to be
run. Thanks: Ian Alexander.

This means that you can configure which tests run by adding a simple environment variable.

By default, *sudo* strips most of the calling shell’s environment, so a variable exported in your shell never reaches memtester. To get around that, we need to add it to /etc/environment, like this:

MEMTESTER_TEST_MASK=xxxxx

To generate the mask, look at the list of tests memtester runs:

Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
8-bit Writes : ok
16-bit Writes : ok

Simply create a binary number with one digit per test: the most significant digit is the bottom of the list (i.e., 16-bit Writes), a 1 turns a test on, and a 0 turns it off. Be sure to leave out that “top” test, Stuck Address, since it runs no matter what. Take the full number, convert it to decimal, put it into the environment file, and run memtester again. With four tests switched off, the output looks like this:

Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Checkerboard : ok
Bit Flip : ok
Walking Ones : ok
8-bit Writes : ok
16-bit Writes : ok
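
Working the binary number out by hand gets old fast, so here is a minimal Ruby sketch of the same calculation. It assumes the bit order described above (Random Value is the least significant bit, 16-bit Writes the most significant), and the TESTS and SKIP names are mine, not memtester’s. The skipped tests are the four missing from the run above.

```ruby
# Tests in the order memtester lists them, minus Stuck Address (it always runs).
# Assumption: the first entry is bit 0 and the last entry is the top bit.
TESTS = [
  "Random Value", "Compare XOR", "Compare SUB", "Compare MUL",
  "Compare DIV", "Compare OR", "Compare AND", "Sequential Increment",
  "Solid Bits", "Block Sequential", "Checkerboard", "Bit Spread",
  "Bit Flip", "Walking Ones", "Walking Zeroes", "8-bit Writes",
  "16-bit Writes"
].freeze

# Tests to switch off, matching the shortened run above.
SKIP = ["Solid Bits", "Block Sequential", "Bit Spread", "Walking Zeroes"].freeze

# Set bit i for every test at index i that should still run.
mask = TESTS.each_with_index.sum { |name, i| SKIP.include?(name) ? 0 : 1 << i }

puts mask.to_s(2)                   # binary form, 16-bit Writes on the left
puts "MEMTESTER_TEST_MASK=#{mask}"  # line to paste into /etc/environment
```

For this example the script prints 11011010011111111 and MEMTESTER_TEST_MASK=111871.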

/32 doesn’t do me a bit of good.

So I’m rebuilding my server at work to use a standard PXE boot, rather than the clunky boot method that DRBL provides (which, without documentation-diving, requires me to make changes to the server, essentially). I learned something, uh, vital today. I need to check netmasks more closely.

Wonder what happens when you try to set up NAT on an interface (or set of interfaces…) that your idiotic self set as netmask 255.255.255.255? In other words, a /32?

Yeah, it, uh, refuses to connect to anything, even *the same device that gave it the IP address in the first place*.

Go self.
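
For anyone who wants to see why a /32 is so lonely, here is a quick sketch using Ruby’s standard ipaddr library. The addresses are made up for illustration; they are not the ones from my server.

```ruby
require "ipaddr"

# Hypothetical addresses, chosen only for illustration.
gateway = IPAddr.new("192.168.1.1")

host_32 = IPAddr.new("192.168.1.10/32") # netmask 255.255.255.255
host_24 = IPAddr.new("192.168.1.10/24") # netmask 255.255.255.0

# include? asks whether an address falls inside the given subnet.
puts host_32.include?(gateway) # false: a /32 contains only the host itself
puts host_24.include?(gateway) # true: the gateway is on the same /24
```

With a /32, nothing else counts as being on the local network, not even the box that handed out the lease.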

And I said “Why?”

Fun stupidity from today – I’m rebuilding Farnsworth, the imaging server at work, into a fatter, but more useful, form (i.e., I’m building it out of a 24-bay server with 3.5″ bays and more room to work).

I have 24x 1TB SATA drives (specifically, Seagate ES.2). I’m curious, though, why they have a jumper on them that limits SATA speed. With the jumper on, the drive is limited to 1.5 Gbps SATA; with it removed, the drive runs at 3.0 Gbps. Other than having to remove these little half-height jumpers with tweezers, I’m somewhat amused, honestly. Why would this be there?

The importance of solving the right problem

At my job, I am something of a general problem-solver. I build spreadsheets, tiny web apps, up to imaging servers with custom software.

Our guy in data destruction (responsible for wiping the drives to ensure they’re reasonably secured) was having some problems keeping track of inventory when he counted it up, so I worked with him to make a little inventory clicker application that runs on an iPad we stuck magnets to – the joy of e-recycling!

He said that it works great, it looks great, it does exactly what he needed!

There’s just one problem – he never uses it. As it turns out, he just never remembers to use it, because it’s so far from his routine. Alarms might’ve worked, except the iPad is silent, likely due to a busted speaker.

Still, we solved the wrong problem first; the software doesn’t mean a thing if it doesn’t get used.

Using Savon with SellerCloud

At work, we’re moving to a new sales/inventory/fulfillment platform called SellerCloud. Their API, while open, is SOAP-based, which means I had to learn something new to make this all work.

XML is evil, but here’s the final bit to get it really going:

require 'savon'

# Credentials travel in a SOAP AuthHeader, which Savon sends with every request.
client = Savon.client(
  wsdl: "http://--.ws.sellercloud.com/scservice.asmx?WSDL",
  soap_header: {
    'AuthHeader' => {
      :@xmlns => "http://api.sellercloud.com/",
      'UserName' => "user",
      'Password' => "password",
      'ApplicationName' => "Test",
      'ApplicationVersion' => 1
    }
  },
  log: true,
  env_namespace: :soap,
  pretty_print_xml: true
)

Building a testing platform in Ruby/Sinatra

As I mentioned, we’re building a new testing system at my job. We’re both web people first, so for us, it makes sense to make something that can use web technologies to make the whole thing easier to work on.

We’re using Ruby, with the Sinatra gem (think a simplified Rails, and you’re close). The PXE image boots into the i3 window manager, and autostarts the testing script (which is plain Ruby outside the Sinatra bits).

The testing uses a few small utilities, like memtester and seeker (a very small utility I found to test seeking), with Ruby to glue it all together.

Fun things I’ve learned:

1) Trying to parse output from a curses-based UI is not going to end well
1a) It’s OK to make the curses-based UI pop up on the screen as long as it’s for good reason
2) Ruby has really simple script forking, which means that I can run the HDD and memory testing simultaneously
2a) Ruby has really simple script forking, which means, much to my chagrin, that no, splitting memory testing into four parts doesn’t make it go faster.
3) Ruby makes really nice glue for stuff like this
3a) I think I’m stuck, help.
4) Assuming that a device will pass a test, and only failing it if it doesn’t, is a perfectly viable strategy
4a) Don’t forget to actually mark it as failed.
5) Don’t overthink your plans — sometimes it’s best to figure out a basic plan and run with it.
5a) Don’t underthink your plans, either — that’ll come back to bite ya.

Building a modern system testing application

So, part of the project that led to Seymour (my drive geometry script) picked up yesterday — we are building a simplified system test suite that runs over a PXE image.

Right now, we have a Ruby script that uses the Sinatra gem to generate a web page that we can use to monitor the progress. It’s fairly simple so far – it displays the status of a memory testing tool, and we’re implementing hard drive testing soon, but it’s a good aesthetic, and it’s useful – plus it is way, way cheaper than our current solution.

I learned today how to do forking in Ruby. Beyond being really, really easy, it is really useful for something like this – we don’t want the application to be held up because we’re waiting for the memory test to complete.

To do the forking, I wrote the following:
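
memstatus = "Testing in Progress"
# fork runs the block in a child process, so the web app is never held up by the test.
fork do
  # passfail is not core Ruby; it is a helper defined elsewhere in the script.
  # Note: this assignment only exists in the child. The parent's memstatus is untouched.
  memstatus = system("sudo memtester #{memTestAmt} 1").passfail
  exit
end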

Since a forked child can’t update variables in the parent process, I’m using tempfiles to pass the results back; fortunately, Tempfile is a nice clean package in Ruby.
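
Here is a minimal, self-contained sketch of the fork-plus-tempfile idea. It is not our actual script: TEST_COMMAND is a hypothetical stand-in (the shell’s `true`) for the real memtester call, and the parent simply waits where the real application would keep serving pages.

```ruby
require "tempfile"

# Hypothetical stand-in for the real test command, e.g. "sudo memtester 1024 1".
TEST_COMMAND = "true"

# Create the tempfile in the parent so both processes know its path.
result_file = Tempfile.new("memtest")

pid = fork do
  # Child process: run the test and record the outcome on disk,
  # because variables assigned here never reach the parent.
  passed = system(TEST_COMMAND)
  File.write(result_file.path, passed ? "Pass" : "Fail")
  exit!(0) # leave immediately, skipping at_exit handlers inherited from the parent
end

# The parent could poll the file while doing other work; here it just waits.
Process.wait(pid)
memstatus = File.read(result_file.path)

# Clean up the tempfile once the result has been read.
result_file.close
result_file.unlink

puts memstatus
```

Swapping TEST_COMMAND for `false` makes the same run report Fail, which is a cheap way to check both branches without waiting on a real memory test.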

So, we fork for the memory test and again for the hard drive test, then use the result to determine the content of the page. Seems pretty simple, I guess? It’s been interesting so far, even for a simple application that does so many hacky things.

Drive Geometry Specs in Clonezilla

Yesterday, I posted about automating drive geometry in Clonezilla.  I realized that the actual information is somewhat lacking, so here’s an explanation about how this all fits together.

sda-pt.sf is the only file that I’ve found to be strictly necessary to change drive geometry between images. sda-pt.sf contains the partition table for device sda, appropriately enough.

Here’s an example of sda-pt.sf:

label: dos
label-id: 0xf0a1cc41
device: /dev/sda
unit: sectors

/dev/sda1 : start= 2048, size= 204800, type=7, bootable
/dev/sda2 : start= 206848, size= 142272512, type=7
/dev/sda3 : start= 142479360, size= 13821952, type=7

The first four lines aren’t particularly noteworthy. It’s a DOS partition table, it came from /dev/sda (the first disk in a system, essentially), and the numbers used are specified in sectors of the disk (usually 512 bytes per sector).

What’s really interesting here is the last three lines. These are the actual geometry specs; the .sf format specifies a start and a size for each partition.

Before we begin digging into the process to resize, I want to point out one small factor that left me a little confused at first. Note how, for example, sda1 (first partition) has a size of 204800 sectors (exactly 100 MiB, coincidentally) and starts at 2048, while sda2 starts at 206848. While this may seem intuitive to some, it had me a little confused at first, because I forgot that the start sector is included in the size: sda1 occupies sectors 2048 through 206847, so the next partition starts at 2048 + 204800 = 206848.

Moving on, let’s get into the real fun – calculating the new partition tables from this. In this example, we want the second partition to grow, since this image is for a Windows 7 machine. The first partition is the system partition, third partition is recovery, so we don’t need to worry about those growing.
For this example, we’ll make the geometry spec for a 256GB drive, which we figure will have 500,000,000 sectors.

We start from the end and work our way back on this. sda3 is 13,821,952 sectors, so the start point should be (500,000,000 - 13,821,952) = 486,178,048

We can change the line for sda3 to the following: /dev/sda3 : start= 486178048, size= 13821952, type=7

Since sda2 is our primary partition, we now need to calculate its new size. This, fortunately, is a simple matter of subtracting the start of sda2 from the new start of sda3. In this instance, 486178048 - 206848 = 485,971,200.

This means we can change the line for sda2 to the following: /dev/sda2 : start= 206848, size= 485971200, type=7

The first partition doesn’t change at all, so we now have a new geometry specification:

/dev/sda1 : start= 2048, size= 204800, type=7, bootable
/dev/sda2 : start= 206848, size= 485971200, type=7
/dev/sda3 : start= 486178048, size= 13821952, type=7

Please be sure to keep the types the same, though – changing them will break things.
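
The arithmetic above is easy to script. Here’s a minimal Ruby sketch under the same assumptions as the example: three partitions, the middle one grows, and the target drive has the round 500,000,000 sectors we guessed at. The Partition struct and the regrow method are names I made up for this sketch; they are not part of Clonezilla or sfdisk.

```ruby
# One entry per partition line in sda-pt.sf; sizes are in 512-byte sectors.
Partition = Struct.new(:name, :start, :size, :attrs)

original = [
  Partition.new("/dev/sda1", 2048, 204_800, "type=7, bootable"),
  Partition.new("/dev/sda2", 206_848, 142_272_512, "type=7"),
  Partition.new("/dev/sda3", 142_479_360, 13_821_952, "type=7")
]

# Assumed sector count for the target drive (the round figure used in the text).
TOTAL_SECTORS = 500_000_000

# Pin the last partition to the end of the disk, then let the
# second partition fill the gap in front of it.
def regrow(parts, total_sectors)
  first, grow, last = parts.map(&:dup)
  last.start = total_sectors - last.size
  grow.size = last.start - grow.start
  [first, grow, last]
end

resized = regrow(original, TOTAL_SECTORS)
resized.each do |part|
  puts "#{part.name} : start= #{part.start}, size= #{part.size}, #{part.attrs}"
end
```

The three lines it prints match the new geometry specification above, with the types and the bootable flag carried over untouched.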

That’s it for today, though! Hope you got something useful out of this, and feel free to ask questions!