Troubleshooting
warewulfd
The Warewulf server (warewulfd) sends logs to the systemd journal.
journalctl -u warewulfd.service
To increase the verbosity of the log, specify either --verbose or --debug in the warewulfd OPTIONS.
echo "OPTIONS=--debug" >>/etc/default/warewulfd
systemctl restart warewulfd.service
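Because the command above appends with >>, running it more than once leaves several OPTIONS lines in the file. As a minimal sketch, the following replaces any existing OPTIONS line instead of appending. The function name set_warewulfd_options is hypothetical and not part of Warewulf, and the sketch assumes the file contains simple KEY=value lines.

```shell
# Hypothetical helper (not part of Warewulf): write a single OPTIONS= line
# to a defaults file, dropping any earlier OPTIONS= lines.
set_warewulfd_options() {
    file=$1
    options=$2
    tmp=$(mktemp) || return 1
    # Keep every existing line except previous OPTIONS= assignments.
    if [ -f "$file" ]; then
        grep -v '^OPTIONS=' "$file" > "$tmp" || true
    fi
    printf 'OPTIONS=%s\n' "$options" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, set_warewulfd_options /etc/default/warewulfd --debug followed by systemctl restart warewulfd.service should have the same effect as the commands above, without accumulating duplicate lines.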
iPXE
If you’re using iPXE to boot (the default), you can get a command prompt by pressing C-b (Ctrl+B) during boot.
From the iPXE command prompt, you can run the same commands from default.ipxe to troubleshoot potential boot problems.
For example, the following commands perform a (relatively) normal Warewulf boot. (Substitute your Warewulf server’s IP address in place of 10.0.0.1, update the port number if you have changed it from the default of 9873, and substitute your cluster node’s MAC address in place of 00:00:00:00:00:00.)
set uri http://10.0.0.1:9873/provision/00:00:00:00:00:00
kernel --name kernel ${uri}?stage=kernel
imgextract --name image ${uri}?stage=image&compress=gz
imgextract --name system ${uri}?stage=system&compress=gz
imgextract --name runtime ${uri}?stage=runtime&compress=gz
boot kernel initrd=image initrd=system initrd=runtime
The uri variable points to warewulfd for future reference. This includes the cluster node’s MAC address so that Warewulf knows what image and overlays to provide.
The kernel command fetches a kernel for later booting.
The imgextract command fetches and decompresses the images that will make up the booted node image. In a typical environment this is used to load a minimal “initial ramdisk” which then boots the rest of the system. Warewulf, by default, loads the entire image as an initial ramdisk, and also loads the system and runtime overlays at this time.
The boot command tells iPXE to boot the system with the given kernel and ramdisks.
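Before working at the iPXE prompt, it can help to confirm from another machine that warewulfd answers on the same URLs. The following is a rough sketch, not an official tool: the function names provision_url and check_stages are hypothetical, curl is assumed to be installed, and whether warewulfd serves a given host depends on your server configuration.

```shell
# Hypothetical helper: print the provisioning URL for one boot stage,
# following the URL pattern used in the iPXE commands above.
provision_url() {
    server=$1 port=$2 mac=$3 stage=$4
    url="http://${server}:${port}/provision/${mac}?stage=${stage}"
    case $stage in
        kernel) ;;                          # the kernel is fetched as-is
        *) url="${url}&compress=gz" ;;      # other stages are requested gzipped
    esac
    printf '%s\n' "$url"
}

# Hypothetical helper: request each stage and print the HTTP status code.
# Each response body is downloaded in full and discarded, so the image
# stage may take a while.
check_stages() {
    for stage in kernel image system runtime; do
        url=$(provision_url "$1" "$2" "$3" "$stage")
        code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || true
        printf '%s %s\n' "$code" "$url"
    done
}
```

For example, check_stages 10.0.0.1 9873 00:00:00:00:00:00 prints one line per stage. A status other than 200 suggests looking at the warewulfd log for that request before debugging the boot loader itself.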
Note
This example does not provide assetkey information to warewulfd. If your nodes have defined asset tags, provide the asset key in the uri variable for the node you are trying to boot.
For example, you may want to try booting to a pre-init shell with debug logging enabled. To do so, replace the boot command above with the following.
boot kernel initrd=image initrd=system initrd=runtime debug rdinit=/bin/sh
Note
You may be more familiar with specifying init= on the kernel command line. rdinit indicates “ramdisk init.” Since Warewulf, by default, boots the node image as an initial ramdisk, we must use rdinit= here.
GRUB
If you’re using GRUB to boot, you can get a command prompt by pressing “c” when prompted during boot.
From the GRUB command prompt, you can enter the same commands that you would otherwise find in grub.cfg.ww.
For example, the following commands perform a (relatively) normal Warewulf boot. (Substitute your Warewulf server’s IP address in place of 10.0.0.1, and update the port number if you have changed it from the default of 9873.)
uri="(http,10.0.0.1:9873)/provision/${net_default_mac}"
linux "${uri}?stage=kernel" wwid=${net_default_mac}
initrd "${uri}?stage=image&compress=gz" "${uri}?stage=system&compress=gz" "${uri}?stage=runtime&compress=gz"
boot
The uri variable points to warewulfd for future reference. ${net_default_mac} provides Warewulf with the MAC address of the booting node, so that Warewulf knows what image and overlays to provide it.
The linux command tells GRUB what kernel to boot, as provided by warewulfd. The wwid kernel argument helps wwclient identify the node during runtime.
The initrd command tells GRUB what images to load into memory for boot. In a typical environment this is used to load a minimal “initial ramdisk” which then boots the rest of the system. Warewulf, by default, loads the entire image as an initial ramdisk, and also loads the system and runtime overlays at this time.
The boot command tells GRUB to boot the system with the previously-defined configuration.
Note
This example does not provide assetkey information to warewulfd. If your nodes have defined asset tags, provide the asset key in the uri variable for the node you are trying to boot.
For example, you may want to try booting to a pre-init shell with debug logging enabled. To do so, replace the linux command above with the following.
linux "${uri}?stage=kernel" wwid=${net_default_mac} debug rdinit=/bin/sh
Note
You may be more familiar with specifying init= on the kernel command line. rdinit indicates “ramdisk init.” Since Warewulf, by default, boots the node image as an initial ramdisk, we must use rdinit= here.
Dracut
By default, dracut simply panics and terminates when it encounters an issue.
Dracut looks at the kernel command line for its configuration. You can configure it for additional logging and to switch to an interactive shell on error:
wwctl profile set default --kernelargs=rd.shell,rd.debug,log_buf_len=1M
For more information on debugging Dracut problems, see the Fedora dracut problems guide.
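After changing the kernel arguments and rebooting, you may want to confirm that the node actually booted with them. The following is a minimal sketch; the function name cmdline_has_args is hypothetical, and it only checks for exact, space-separated arguments.

```shell
# Hypothetical helper: succeed only if every listed argument appears as a
# whole, space-separated word on the given kernel command line.
cmdline_has_args() {
    cmdline=$1
    shift
    for want in "$@"; do
        # Pad with spaces so that "rd.debug" does not match "rd.debugx".
        case " $cmdline " in
            *" $want "*) ;;
            *) printf 'missing: %s\n' "$want" >&2; return 1 ;;
        esac
    done
}
```

On a booted node, cmdline_has_args "$(cat /proc/cmdline)" rd.shell rd.debug log_buf_len=1M returns success when all three arguments are present, and otherwise names the first missing one on standard error.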
Ignition
If partition creation doesn’t work as expected, you have a few options to investigate:
Add systemd.log_level=debug and/or rd.debug to the kernelArgs of the node you’re working on.
After the next boot, you should be able to find verbose information on the node with journalctl -u ignition-ww4-disks.service.
You could also check the content of /warewulf/ignition.json.
You could try to tinker with /warewulf/ignition.json, calling
/usr/lib/dracut/modules.d/30ignition/ignition \
  --platform=metal \
  --stage=disks \
  --config-cache=/warewulf/ignition.json \
  --log-to-stdout
after each iteration on the node directly until you find the settings you need. (Make sure to unmount all partitions if ignition was partially successful.)
Sometimes you need to add should_exist: "true" for the swap partition as well.
Running Containers on Cluster Nodes
Some container runtimes, notably Podman, require file system features that are not available in initrootfs. Cluster nodes using Podman (and some other container runtimes) should be configured with --root=tmpfs.