Kernel and kdump
Kdump is a kernel crash dumping mechanism. It is very reliable because the crash dump is captured from the context of a freshly booted kernel, not from the context of the crashed kernel. Kdump uses kexec to boot into a second kernel whenever the system crashes. This second kernel, often called the capture kernel, boots with very little memory and captures the dump image.
The first kernel reserves a section of memory that the capture kernel uses to boot. Kexec enables booting the capture kernel without going through the BIOS, so the contents of the first kernel's memory are preserved; that preserved memory is essentially the kernel crash dump.
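To see how much memory the first kernel has set aside for the capture kernel, the reservation can be read back from sysfs. The following is a minimal sketch, assuming the kernel exposes the reserved size in bytes at /sys/kernel/kexec_crash_size; the helper name crash_reserved_mib is made up for illustration.

```shell
# Print the crash kernel reservation in MiB.
# Assumption: the file holds the reserved size in bytes, as
# /sys/kernel/kexec_crash_size does on kernels built with kexec support.
# crash_reserved_mib is a hypothetical helper name.
crash_reserved_mib() (
    size_file="${1:-/sys/kernel/kexec_crash_size}"
    [ -r "$size_file" ] || { echo "cannot read $size_file" >&2; exit 1; }
    bytes=$(cat "$size_file")
    echo $(( bytes / 1024 / 1024 ))
)
```

Called with no arguments, it reads the live system. A result of 0 usually means that no crashkernel option is on the kernel command line yet.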
How to Use Kdump
Step 1: Configuring Kdump
- First, install the kexec-tools, crash, and kernel-debuginfo packages using the following command.
dnf install --enablerepo=fedora-debuginfo --enablerepo=updates-debuginfo kexec-tools crash kernel-debuginfo
- NOTE: The kernel-debuginfo packages are only required to examine the resulting kernel dump file. If you are setting up kdump on a machine simply to capture a dump file that will be analyzed by someone else or on a different machine, you can skip those packages.
- Fedora 34 and older: Use kdumpctl estimate to determine the recommended crash kernel size. Then add the crashkernel command line option using the recommended size. For example:
grubby --args="crashkernel=512M" --update-kernel=ALL
- Fedora 35 and newer: Use kdumpctl reset-crashkernel. This determines a range-based crashkernel value and adds the corresponding crashkernel command line option to the currently running kernel's boot entry. See the kdumpctl man page for how to specify a different kernel.
- Optionally, edit the kdump configuration file at /etc/kdump.conf. This allows you to write the dump file over the network or to a location on the local system other than /var/crash. For additional information, consult the mkdumprd man page and the comments in /etc/kdump.conf.
- Next, enable the kdump system service at startup using the following command.
systemctl enable kdump.service
- Finally, reboot your system.
- kdump.service takes care of pre-loading the capture kernel at system boot time.
- For testing purposes, it is recommended to either set up a serial console or switch to run level 3 (init 3, or systemctl isolate multi-user.target). Kdump does not reset the console if you are in X or framebuffer mode, so no messages might be visible on the console after a system crash. You may also see screen corruption in graphics mode during capture.
- Capturing a crash dump can take a long time, especially if the system has a lot of memory. Be patient. The system will reboot after the dump is captured.
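After the reboot, it can be worth confirming that the configuration took effect before relying on it. The sketch below checks the two things the steps above set up: a crashkernel option on the kernel command line and a pre-loaded capture kernel. It assumes the kernel reports the loaded state as 0 or 1 in /sys/kernel/kexec_crash_loaded; the helper names crashkernel_arg and kdump_ready are hypothetical.

```shell
# Extract the crashkernel= value from a kernel command line string.
# Prints nothing and returns 1 if the option is absent.
crashkernel_arg() (
    set -f                                # no globbing while splitting words
    for word in $1; do
        case "$word" in
            crashkernel=*) echo "${word#crashkernel=}"; exit 0 ;;
        esac
    done
    exit 1
)

# Report whether kdump looks ready. Both file paths are parameters so the
# function can be tried against copies; the defaults are the live files.
# Assumption: kexec_crash_loaded contains 1 when a capture kernel is loaded.
kdump_ready() (
    cmdline_file="${1:-/proc/cmdline}"
    loaded_file="${2:-/sys/kernel/kexec_crash_loaded}"
    value=$(crashkernel_arg "$(cat "$cmdline_file")") || {
        echo "no crashkernel= option on the kernel command line"; exit 1; }
    [ "$(cat "$loaded_file")" = "1" ] || {
        echo "crashkernel=$value is set, but no capture kernel is loaded"; exit 1; }
    echo "capture kernel loaded, crashkernel=$value"
)
```

Running kdump_ready with no arguments checks the live system. The service's own view is available from systemctl status kdump.service.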
Step 2: Capturing the Dump
Normally a kernel panic() triggers booting into the capture kernel, but for testing purposes you can simulate the trigger in one of the following ways.
- Enable SysRq, then trigger a panic through the /proc interface:
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger
- Trigger by inserting a module which calls panic().
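Because a test panic without a loaded capture kernel only crashes the machine and produces no dump, it may help to guard the SysRq trigger. This is a minimal sketch, assuming /sys/kernel/kexec_crash_loaded reads 1 when the capture kernel is loaded; guarded_test_panic and the DRY_RUN variable are hypothetical names introduced here.

```shell
# Trigger a test panic only when a capture kernel is loaded.
# WARNING: without DRY_RUN=1 this crashes the machine on purpose.
# guarded_test_panic and DRY_RUN are hypothetical names for this sketch.
guarded_test_panic() (
    loaded_file="${1:-/sys/kernel/kexec_crash_loaded}"
    if [ "$(cat "$loaded_file" 2>/dev/null)" != "1" ]; then
        echo "refusing: no capture kernel loaded" >&2
        exit 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would enable SysRq and write c to /proc/sysrq-trigger"
        exit 0
    fi
    sync                                  # flush dirty data before crashing
    echo 1 > /proc/sys/kernel/sysrq
    echo c > /proc/sysrq-trigger
)
```

Run it first as DRY_RUN=1 guarded_test_panic to see what it would do, then as root without DRY_RUN once you are ready for the system to crash and reboot.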
The system will boot into the capture kernel. A kernel dump will be automatically saved in /var/crash/<dumpdir>, and the system will then boot back into the regular kernel. The name of the dump directory depends on the date and time of the crash, as in the /var/crash/2021-07-17-10:36 directory used in Step 3.
Step 3: Dump Analysis
Once the system has rebooted after capturing the crash, you may wish to analyse the kernel dump file using the crash tool.
- First, locate the recent vmcore dump file:
find /var/crash -type f -mtime -1
- Once you have located a vmcore dump file, call crash:
crash /usr/lib/debug/lib/modules/`uname -r`/vmlinux /var/crash/2021-07-17-10\:36/vmcore
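The command above uses uname -r, which names the currently running kernel. If the kernel was updated between the crash and the analysis, or the dump is analysed on another machine, the vmlinux passed to crash needs to match the kernel that crashed instead. The sketch below builds the command line for an explicit release string and checks that the matching debuginfo is installed. It assumes the same /usr/lib/debug/lib/modules layout as the command above; crash_cmd is a hypothetical helper name.

```shell
# Print the crash command line for a given kernel release and vmcore.
# Assumption: kernel-debuginfo installs vmlinux under
# /usr/lib/debug/lib/modules/<release>/, as in the command above.
# crash_cmd is a hypothetical helper name.
crash_cmd() (
    release="$1"
    vmcore="$2"
    debug_root="${3:-/usr/lib/debug/lib/modules}"
    vmlinux="$debug_root/$release/vmlinux"
    [ -f "$vmlinux" ] || {
        echo "missing $vmlinux: install kernel-debuginfo for $release" >&2
        exit 1
    }
    printf 'crash %s %s\n' "$vmlinux" "$vmcore"
)
```

For example, crash_cmd "$(uname -r)" /var/crash/2021-07-17-10:36/vmcore reproduces the command above when the running kernel is the one that crashed. The output is meant to be read and pasted, so paths containing spaces would need quoting by hand.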
For more information on using the crash tool, see More Documentation.
The crash tool can be very dependent on the version of the running kernel. On Fedora, the package versions occasionally get out of sync, which can lead to only partially working crash dumps. This may manifest as warning messages from crash such as
page excluded: kernel virtual address: ffff.........9d28 type: "..."
If you want to know specifically which kernel versions are supported, you can examine the SRPM for the version of kexec-tools you are running. In particular, makedumpfile.h will contain something like
#define OLDEST_VERSION KERNEL_VERSION(2, 6, 15) /* linux-2.6.15 */
#define LATEST_VERSION KERNEL_VERSION(4, 5, 3) /* linux-4.5.3 */
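Comparing these two defines against your kernel by eye is error-prone, so the check can be scripted. The following is a rough sketch, assuming you have extracted makedumpfile.h from the SRPM, that each define sits on its own line, and that GNU sort with the -V option is available. The helper names header_kver and kver_supported are hypothetical.

```shell
# Print "major.minor.patch" for a KERNEL_VERSION(...) define in a header.
# Usage: header_kver LATEST_VERSION path/to/makedumpfile.h
header_kver() (
    name="$1"
    header="$2"
    sed -n "s/^#define[[:space:]]*$name[[:space:]]*KERNEL_VERSION(\([0-9]*\),[[:space:]]*\([0-9]*\),[[:space:]]*\([0-9]*\)).*/\1.\2.\3/p" "$header" | head -n 1
)

# Succeed if kernel release $1 (e.g. the output of uname -r) lies within
# the OLDEST_VERSION..LATEST_VERSION range declared in header $2.
# Exits with status 2 if the defines cannot be found.
kver_supported() (
    kver="${1%%-*}"                       # drop the distribution suffix after "-"
    oldest=$(header_kver OLDEST_VERSION "$2")
    latest=$(header_kver LATEST_VERSION "$2")
    [ -n "$oldest" ] && [ -n "$latest" ] || exit 2
    # sort -V orders version strings; the range holds if the sorted order
    # is oldest, kver, latest.
    sorted=$(printf '%s\n' "$oldest" "$kver" "$latest" | sort -V | tr '\n' ' ')
    [ "$sorted" = "$oldest $kver $latest " ]
)
```

A typical call would be kver_supported "$(uname -r)" makedumpfile.h, which succeeds only when the running kernel falls inside the declared range.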
If you run makedumpfile against an unsupported kernel version, it will probably still mostly work. It will output an error message to the console, but the message can be easy to miss.
If the dump is behaving unexpectedly, you can modify kdump.conf so that makedumpfile does not filter any pages (except perhaps zero-filled pages, with -d 1) and is only used to compress the dump with -c. This might result in a more useful vmcore. If that fails, you could take makedumpfile out of the picture entirely by changing the core_collector to scp, which will simply copy /proc/vmcore to a permanent location.
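Switching collectors means editing the core_collector line in kdump.conf, which can be scripted if you need to repeat it while experimenting. The sketch below is one possible way to do that, not part of kexec-tools. It assumes the collector is set with a single uncommented core_collector line; set_core_collector is a hypothetical helper, and it is safest to try it on a copy of the file first.

```shell
# Replace the active core_collector line in a kdump.conf style file, or
# append one if none is set. Commented-out lines are left untouched.
# The new line is written at the end of the file.
# set_core_collector is a hypothetical helper; try it on a copy first.
set_core_collector() (
    conf="$1"
    collector="$2"
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; exit 1; }
    tmp=$(mktemp) || exit 1
    # Keep every line except an uncommented core_collector setting.
    grep -v '^[[:space:]]*core_collector[[:space:]]' "$conf" > "$tmp" || true
    printf 'core_collector %s\n' "$collector" >> "$tmp"
    cat "$tmp" > "$conf"                  # keep the original file's owner and mode
    rm -f "$tmp"
)
```

For instance, set_core_collector /etc/kdump.conf "makedumpfile -c -d 1" selects compression with only zero-page filtering, and set_core_collector /etc/kdump.conf scp drops makedumpfile altogether. The comments in /etc/kdump.conf describe which dump targets each collector is meant for. Restart kdump.service afterwards so the change is picked up.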
If you have further issues, you may also try building the latest crash tool from source. If you are at the point of debugging kernel crash dumps, you can probably figure it out :) You might want to try something like:
$ sudo dnf builddep crash # quick way to get the right libraries
$ git clone https://github.com/crash-utility/crash.git
$ cd crash
$ make lzo # don't forget the lzo if you're using compressed dumps
More Documentation
- Kernel Source (Documentation/kdump/kdump.txt, or Documentation/admin-guide/kdump/kdump.rst in newer kernel trees).
- Using crash - http://people.redhat.com/anderson