The Linux kernel supports a watchdog timer, which resets the system once a timeout expires if the software stops responding, for example after a software fault, a system hang, or an application crash.
2. Linux Watchdog Timer
A watchdog timer is a hardware circuit that resets the computer system if the software stops responding.
Linux provides a simple watchdog interface through the /dev/watchdog device file.
Once any process opens the /dev/watchdog device, the kernel starts a countdown (typically 60 seconds; the exact default depends on the driver) after which the system reboots.
The countdown is reset whenever data is written to that file.
Closing the /dev/watchdog device may or may not stop the countdown, depending on how the kernel's watchdog support is configured.
3/25/2016Rajkumar Rampelli2
3. User space daemon role
A user space daemon (the watchdog daemon) notifies the watchdog driver, via the /dev/watchdog device file, that user space is still active.
It sends this notification at regular intervals.
On each notification, the watchdog driver tells the hardware watchdog that everything is in order and that it should wait a little longer before resetting the system.
If the user space daemon fails to send this notification, the hardware watchdog resets (reboots) the system once the timeout expires.
Typical user space daemon code looks like this:
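/* fd is an open file descriptor for /dev/watchdog */
while (1) {
    ioctl(fd, WDIOC_KEEPALIVE, 0);  /* tell the driver user space is alive */
    sleep(10);                      /* must be shorter than the watchdog timeout */
}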
4. CONFIG_WATCHDOG_NOWAYOUT
Normally the watchdog is disabled when its device file is closed, which can also happen because of a bug in the daemon.
The system can then crash without rebooting, because the watchdog is no longer active.
To prevent this, the CONFIG_WATCHDOG_NOWAYOUT configuration option was introduced; it keeps the watchdog running even after the device file is closed.
If this option is set to Y, the watchdog timer cannot be disabled once it has been started.
If the watchdog daemon then crashes, the system reboots once the timeout expires.
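A daemon may want to know whether "no way out" is in effect before it relies on being able to stop the watchdog. The sketch below assumes a kernel built with the watchdog sysfs interface, where the setting is expected to appear as a 0 or 1 in an attribute file such as /sys/class/watchdog/watchdog0/nowayout; that path is an assumption about the target system, and the function name read_nowayout() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read a nowayout attribute file.
 * attr_path is assumed to be something like
 * "/sys/class/watchdog/watchdog0/nowayout" on a kernel with the
 * watchdog sysfs interface enabled.
 * Returns 1 if nowayout is set, 0 if it is not, -1 if it cannot be read. */
int read_nowayout(const char *attr_path)
{
    FILE *f = fopen(attr_path, "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);

    return (value == 0 || value == 1) ? value : -1;
}
```

A return value of -1 only means the attribute could not be read; it does not say whether CONFIG_WATCHDOG_NOWAYOUT is enabled.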
5. Magic close feature
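If a driver supports the “Magic close” feature, the watchdog timer is not disabled on close unless the special character ‘V’ was written to the /dev/watchdog device file just before closing it.
A driver advertises this feature through the WDIOF_MAGICCLOSE flag in its watchdog_info options: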
static const struct watchdog_info tegra_wdt_info = {
    .options  = WDIOF_SETTIMEOUT |
                WDIOF_KEEPALIVEPING |
                WDIOF_MAGICCLOSE,
    .identity = "Tegra WDT",
};
Code reference: source/drivers/watchdog/tegra_wdt.c
A simple ioctl() call with WDIOC_GETSUPPORT on a file descriptor for /dev/watchdog lets a program determine whether this flag is set.
Pseudo code
if (ioctl(fd, WDIOC_GETSUPPORT, &info) == 0 &&
    (info.options & WDIOF_MAGICCLOSE)) {
    /* the driver supports magic close */
}