PXEBooting

From WBITT's Cooker!

Revision as of 23:17, 4 February 2010

PXE boot-howto-simplified:

=============

Scenario:

storage.homedomain.com : Install server.
kvm.homedomain.com : Physical host to host kvm virtual machines. Will be used as test server to be installed using PXE from the install server.
xen.homedomain.com : Physical host to host xen virtual machines. Will be used as test server to be installed using PXE from the install server.


The install server (storage):


Copy the contents of the CentOS-5.4-x86_64 DVD into the directory /data/cdimages/CentOS-5.4-x86_64.


Import the gpg key. It would be wise to do it at this moment:

{{{ rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5 }}}


Create a local repository for this server, so it can install any required packages even if it is not connected to the internet.

{{{
[root@storage ~]# vi /etc/yum.repos.d/CentOS-localmedia.repo
[CentOS-5.4-x86_64-local]
name=CentOS $releasever - $basearch - local media
baseurl=file:///data/cdimages/CentOS-5.4-x86_64/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
[root@storage ~]#
}}}

You can verify the new repository using:

2.1 kB     00:00

CentOS-5.4-x86_64-local/primary_db

You must verify it using the following method too:-

tail

yum-protect-packages.noarch   1.1.16-13.el5.centos   CentOS-5.4-x86_64-local
yum-protectbase.noarch        1.1.16-13.el5.centos   CentOS-5.4-x86_64-local
yum-security.noarch           1.1.16-13.el5.centos   CentOS-5.4-x86_64-local
yum-tmprepo.noarch            1.1.16-13.el5.centos   CentOS-5.4-x86_64-local
yum-updateonboot.noarch       1.1.16-13.el5.centos   CentOS-5.4-x86_64-local
yum-utils.noarch              1.1.16-13.el5.centos   CentOS-5.4-x86_64-local
yum-verify.noarch             1.1.16-13.el5.centos   CentOS-5.4-x86_64-local
yum-versionlock.noarch        1.1.16-13.el5.centos   CentOS-5.4-x86_64-local
zenity.x86_64                 2.16.0-2.el5           CentOS-5.4-x86_64-local
zsh-html.x86_64               4.2.6-3.el5            CentOS-5.4-x86_64-local
[root@storage ~]#


This shows that the repository and its packages are being loaded and listed correctly.


You need to have the following installed on the install server:

dhcp (server), syslinux, tftp-server, httpd


{{{ [root@storage ~]# yum -y install httpd tftp-server syslinux dhcp }}}

Now create an httpd alias, which the machines being installed will use to pull packages from the httpd repository.

{{{
vi /etc/httpd/conf.d/CentOS-5.4-x86_64.conf

Alias /CentOS-5.4-x86_64 /data/cdimages/CentOS-5.4-x86_64/
<Location /CentOS-5.4-x86_64>
   Order deny,allow
   Allow from all
   Options +Indexes
</Location>
}}}

The "Options +Indexes" line may seem unnecessary, but it is better to put it here. It will help you later to download individual packages from this local repository in difficult situations.


Restart Apache server:

service httpd restart

Then check in a browser that you can view the web page of your repository.


If you don't have a web browser available to you, try using a simple wget to pull index.html from this link.

{{{
[root@storage ~]# wget http://192.168.1.100/CentOS-5.4-x86_64/
--2010-02-05 11:41:11-- http://192.168.1.100/CentOS-5.4-x86_64/
Connecting to 192.168.1.100:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6623 (6.5K) [text/html]
Saving to: `index.html'

100%[========================================================================>] 6,623 --.-K/s in 0s

2010-02-05 11:41:11 (134 MB/s) - `index.html' saved [6623/6623]

[root@storage ~]#
}}}


Time to set up the DHCP service on this (install) server:

# vi /etc/dhcpd.conf

ddns-update-style interim;
ignore client-updates;

subnet 192.168.1.0 netmask 255.255.255.0 {

       # --- default gateway
       option routers                  192.168.1.1;
       option subnet-mask              255.255.255.0;
       option domain-name              "homedomain.com";
       filename "pxelinux.0";
       range dynamic-bootp 192.168.1.11 192.168.1.20;
       default-lease-time 21600;
       max-lease-time 43200;
       next-server 192.168.1.100;

}


Start the dhcpd service. Make sure that there is no other DHCP service running elsewhere on your network.

service dhcpd restart
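
The subnet above hands out addresses only from a dynamic range. If nodes should later receive fixed addresses, dhcpd accepts host declarations keyed on the MAC address. The following is a minimal shell sketch and not part of the original setup: the function name dhcp_host_entries, the node name, the example address and the "hostname IP MAC" input format are all assumptions made for illustration.

```shell
# Sketch only: prints one dhcpd "host" declaration per input line.
# Assumed input format on stdin: "hostname ip mac", separated by spaces.
dhcp_host_entries() {
    while read -r host ip mac; do
        # skip blank lines and comment lines
        case $host in ''|'#'*) continue ;; esac
        printf 'host %s {\n' "$host"
        printf '       hardware ethernet %s;\n' "$mac"
        printf '       fixed-address %s;\n' "$ip"
        printf '}\n'
    done
}

# Example (hypothetical node and address):
#   echo "node1 192.168.1.31 00:13:72:81:3a:3d" | dhcp_host_entries
```

The generated declarations would be pasted into /etc/dhcpd.conf, followed by a restart of dhcpd. Fixed addresses are usually chosen outside the dynamic range.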


Now it is time to set up the TFTP service:

[root@storage ~]# cat /etc/xinetd.d/tftp
service tftp
{
       socket_type             = dgram
       protocol                = udp
       wait                    = yes
       user                    = root
       server                  = /usr/sbin/in.tftpd
       server_args             = -s /tftpboot
       disable                 = no
       per_source              = 11
       cps                     = 100 2
       flags                   = IPv4
}
[root@storage ~]#


Restart the xinetd service:

service xinetd restart
chkconfig --level 35 xinetd on


Next, we need to copy two special files from the distribution media/repository to the /tftpboot directory. They are in the pxeboot directory under images:

[root@storage ~]# ls /data/cdimages/CentOS-5.4-x86_64/images/
boot.iso  diskboot.img  minstg2.img  pxeboot  README  stage2.img  TRANS.TBL  xen
[root@storage ~]#


We will copy the vmlinuz and initrd.img files from the pxeboot directory to the /tftpboot directory:

[root@storage ~]# cp /data/cdimages/CentOS-5.4-x86_64/images/pxeboot/* /tftpboot/

[root@storage ~]# ls /tftpboot/ -l
total 9140
-rw-r--r-- 1 root root 7397850 Feb 5 12:41 initrd.img
-rw-r--r-- 1 root root     265 Feb 5 12:41 README
-r--r--r-- 1 root root     659 Feb 5 12:41 TRANS.TBL
-rw-r--r-- 1 root root 1932284 Feb 5 12:41 vmlinuz
[root@storage ~]#


The README and TRANS.TBL are not needed, but there is no harm in having them here in /tftpboot/ .

Basically we need only vmlinuz and initrd.img .


Now copy the pxelinux.0 file (from the syslinux package) to the /tftpboot directory.

# cp /usr/lib/syslinux/pxelinux.0 /tftpboot/


Now make sure that all files and directories inside /tftpboot are world readable.

# chmod -R +r /tftpboot/*

[root@storage tftpboot]# ls -l
total 9152
-rw-r--r-- 1 root root 7397850 Feb 5 12:41 initrd.img
-rw-r--r-- 1 root root   13148 Feb 5 12:53 pxelinux.0
drwxr-xr-x 2 root root    4096 Feb 5 12:48 pxelinux.cfg
-rw-r--r-- 1 root root 1932284 Feb 5 12:41 vmlinuz
[root@storage tftpboot]#

(I have removed the README and TRANS.TBL files from /tftpboot to remove clutter).


PXE configuration detail: When a client boots, it first asks the TFTP server for a configuration file named after its MAC address. After trying several other names, it falls back to requesting a file named "default". These files need to be in the /tftpboot/pxelinux.cfg directory of the install server.

# mkdir /tftpboot/pxelinux.cfg
# vi /tftpboot/pxelinux.cfg/default

prompt 1
timeout 5
default linux
label linux
       kernel vmlinuz
       append vga=normal initrd=initrd.img
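
To make the lookup order concrete, here is a small shell sketch that prints the file names pxelinux requests from the pxelinux.cfg directory for a given client: the MAC-based name (prefixed with 01 for Ethernet), then the client IP in uppercase hexadecimal shortened one digit at a time, then "default". The function name pxe_config_candidates is made up for this example, and depending on the syslinux version a UUID-based name may be tried before these; that case is left out.

```shell
# Sketch only: print the config file names a PXE client tries, in order.
# Usage: pxe_config_candidates <mac> <ipv4>
pxe_config_candidates() {
    mac=$1
    ip=$2
    # MAC in lower case, colons replaced by dashes, prefixed with "01-"
    printf '01-%s\n' "$(printf '%s' "$mac" | tr 'A-F:' 'a-f-')"
    # IPv4 address as 8 uppercase hex digits (word splitting is intended)
    hex=$(printf '%02X%02X%02X%02X' $(printf '%s' "$ip" | tr '.' ' '))
    # drop one trailing hex digit per attempt
    while [ -n "$hex" ]; do
        printf '%s\n' "$hex"
        hex=${hex%?}
    done
    printf 'default\n'
}

# Example with the client seen later in this howto:
#   pxe_config_candidates 00:13:72:81:3a:3d 192.168.1.20
```

For the example client, the first name is 01-00-13-72-81-3a-3d, the second is C0A80114, and the last is default.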


Once these settings are done, you can now try rebooting your client computer and see if it is able to get this TFTP image from this install server.

You will need to enable PXE boot from the network card in the BIOS of that machine. (This is the default/intended behavior in an HPCC environment).

Note: If you get TFTP open timeouts on the client machine (the machine which is to be installed through PXE boot), you may not have started your xinetd service yet, or a firewall may be blocking TFTP.


I had my client machine setup to boot from PXE, and here is what I see in /var/log/messages:

Feb 5 12:59:54 storage dhcpd: DHCPDISCOVER from 00:13:72:81:3a:3d via eth0
Feb 5 12:59:55 storage dhcpd: DHCPOFFER on 192.168.1.20 to 00:13:72:81:3a:3d via eth0
Feb 5 12:59:58 storage dhcpd: DHCPREQUEST for 192.168.1.20 (192.168.1.100) from 00:13:72:81:3a:3d via eth0
Feb 5 12:59:58 storage dhcpd: DHCPACK on 192.168.1.20 to 00:13:72:81:3a:3d via eth0
Feb 5 12:59:58 storage in.tftpd[19447]: tftp: client does not accept options


Note: The message "in.tftpd[19447]: tftp: client does not accept options" is not an error message. It is just information, nothing to worry about.
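
The dhcpd log lines are also a convenient place to collect the MAC addresses of clients. A minimal sketch, assuming the syslog layout shown above (the function name discovered_macs is hypothetical):

```shell
# Sketch only: list the distinct MAC addresses seen in DHCPDISCOVER lines.
# Reads syslog-style dhcpd lines on stdin, in the layout shown above:
#   month day time host "dhcpd:" "DHCPDISCOVER" "from" <mac> "via" <iface>
discovered_macs() {
    awk '$5 == "dhcpd:" && $6 == "DHCPDISCOVER" { print $8 }' | sort -u
}

# Example:
#   discovered_macs < /var/log/messages
```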


By doing this, you have managed to start up the interactive installation. Congratulations!

Please note, that this is not what we want. Read the next section.

=================================================================================

We want nodes to install automatically, through kickstart. We also do not want them to get stuck in an install loop forever. That is, a node should get installed only when it is asked to, and should boot from its local disk when the installation is over.

We will achieve our objectives in steps. First we fix the automatic installation problem, using kickstart.

By default, when a system is installed, there is a file in /root named anaconda-ks.cfg. We will use the same file as a template. Here:


cp /root/anaconda-ks.cfg /var/www/html/compute-ks.cfg

[root@storage ~]# vi /var/www/html/compute-ks.cfg

install
text
url --url http://192.168.1.100/CentOS-5.4-x86_64
lang en_US.UTF-8
keyboard us
network --device eth0 --bootproto dhcp
rootpw --iscrypted $1$VQPyk3Ev$JePfY50WaA.aBhKT3xsBq.
firewall --disabled
authconfig --enableshadow --enablemd5
selinux --disabled
timezone Asia/Riyadh
bootloader --location=mbr --driveorder=sda --append=""

# The following is the partition information you requested
# Note that any partitions you deleted are not expressed
# here so unless you clear all partitions first, this is
# not guaranteed to work

zerombr
clearpart --all --initlabel

# part / --fstype ext3 --size=1 --grow
# part swap --size=256

part / --fstype ext3 --size=3000
part swap --size=512
reboot

%packages
@core


Make sure that this file is readable by Apache, or world readable.

[root@storage ~]# ls -l /var/www/html/
total 8
-rw------- 1 root root 766 Feb 5 13:14 compute-ks.cfg
-rw-r--r-- 1 root root  31 Nov 23 07:31 index.html

[root@storage ~]# chmod +r /var/www/html/*.cfg
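
A quick way to re-check this later, for the kickstart file or for anything under /tftpboot, is to test the "others" read bit. This is a minimal sketch; the function name world_readable is made up for the example, and it only looks at the permission string printed by ls.

```shell
# Sketch only: succeeds if the "others" read bit is set on the given path.
# The 8th character of the ls -l permission string is the others-read bit.
world_readable() {
    case $(ls -ld "$1") in
        ???????r*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example: report kickstart files that Apache could not serve
#   for f in /var/www/html/*.cfg; do
#       world_readable "$f" || echo "not world readable: $f"
#   done
```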

As you can see, every compute node, however many there are, would install without a hostname and without a permanent IP assigned to it. Also, as soon as a node gets installed, it will reboot and go through another install cycle, which is not desired. However, before we go on and solve that issue, let's try to install the node as per the above configuration.


Edit the /tftpboot/pxelinux.cfg/default file again and add the extra options:

[root@storage ~]# vi /tftpboot/pxelinux.cfg/default

prompt 1
timeout 5
default linux
label linux
       kernel vmlinuz
       append vga=normal initrd=initrd.img ip=dhcp ksdevice=eth0 ks=http://192.168.1.100/compute-ks.cfg


Reboot your client machine and see the magic. You should be able to see activity in the Apache access_log, with entries such as the following:

[root@storage ~]# tail -f /var/log/httpd/access_log
. . . . . .
192.168.1.20 - - [05/Feb/2010:13:22:41 +0300] "GET /CentOS-5.4-x86_64/CentOS/pcsc-lite-libs-1.4.4-0.1.el5.x86_64.rpm HTTP/1.1" 200 24120 "-" "urlgrabber/3.1.0 yum/3.2.22"
. . . . . .


During this install session, I was able to see the following as the last entries in the Apache access_log:

192.168.1.20 - - [05/Feb/2010:13:24:54 +0300] "GET /CentOS-5.4-x86_64/CentOS/NetworkManager-0.7.0-9.el5.x86_64.rpm HTTP/1.1" 200 1099937 "-" "urlgrabber/3.1.0 yum/3.2.22"
192.168.1.20 - - [05/Feb/2010:13:24:54 +0300] "GET /CentOS-5.4-x86_64/CentOS/NetworkManager-glib-0.7.0-9.el5.i386.rpm HTTP/1.1" 200 83448 "-" "urlgrabber/3.1.0 yum/3.2.22"


As you might guess, these entries may vary as per the installation packages selected. In other words, it is not guaranteed that NetworkManager would be the last package obtained from the repository during an install.

Basically, I am trying to tell you that we need to find or devise a way to know when the installation of a particular node is complete.
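
One possible way, sketched here under assumptions: have the kickstart %post section fetch a small marker file from the install server, then scan the Apache access log for successful requests of that file. The function name finished_nodes and the marker path /install-done.txt are hypothetical; the field positions match the access_log entries shown above.

```shell
# Sketch only: print the client IPs that successfully fetched a marker file.
# Reads Apache access_log lines on stdin; $1 is the marker path (hypothetical).
# Fields used: $1 client IP, $6 "GET (with leading quote), $7 path, $9 status.
finished_nodes() {
    marker=$1
    awk -v m="$marker" '$6 == "\"GET" && $7 == m && $9 == 200 { print $1 }' | sort -u
}

# Example:
#   finished_nodes /install-done.txt < /var/log/httpd/access_log
```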


End of day.

Here are things to do / ideas:

Create a text file of all hostnames, IP and MAC addresses, delimited by a space.
Create a program (perl/bash) which reads this file and generates a proper kickstart file and the related tftpboot file in the /tftpboot/pxelinux.cfg/ directory.
Create a small text file and make it available in the /var/www/html directory. It will be pulled and copied to /tmp on the installed node at the end of the %post.
Some program should constantly monitor the Apache logs and check whether a node accesses this file. When a node accesses it, that node has completed its installation and its tftpboot file can be removed.
This also means that the default tftpboot file should contain only a boot-from-local-disk option. This way, if a node's MAC-based tftp file is removed, its PXE boot will fall through and land on the default, which will make it boot from the local disk.

This should do it.
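
The generator idea above could be sketched in shell roughly as follows. This is an illustration only, not a finished tool: the function name generate_pxe_configs, the "hostname IP MAC" file format, and the per-host kickstart name <hostname>-ks.cfg are assumptions. The default file uses the pxelinux "localboot 0" directive to boot from the local disk.

```shell
# Sketch only: write one MAC-based pxelinux config per host, plus a
# "default" file that boots from the local disk.
# Usage: generate_pxe_configs <hosts_file> <output_dir> <kickstart_base_url>
# Assumed hosts_file format: "hostname ip mac", one host per line.
generate_pxe_configs() {
    hosts_file=$1
    out_dir=$2
    ks_base=$3
    mkdir -p "$out_dir"
    while read -r host ip mac; do
        # skip blank lines and comment lines
        case $host in ''|'#'*) continue ;; esac
        # pxelinux file name: 01-<mac in lower case with dashes>
        name="01-$(printf '%s' "$mac" | tr 'A-F:' 'a-f-')"
        cat > "$out_dir/$name" <<EOF
prompt 1
timeout 5
default linux
label linux
       kernel vmlinuz
       append vga=normal initrd=initrd.img ip=dhcp ksdevice=eth0 ks=${ks_base}/${host}-ks.cfg
EOF
    done < "$hosts_file"
    # fallback for nodes without a MAC-based file: boot from local disk
    cat > "$out_dir/default" <<EOF
default local
label local
       localboot 0
EOF
}

# Example (hypothetical hosts file):
#   generate_pxe_configs /root/nodes.txt /tftpboot/pxelinux.cfg http://192.168.1.100
```

Removing a node's MAC-based file after its installation then leaves only the default file, so the next PXE boot of that node goes to the local disk.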

Will work on this next day.
