Sunday, January 29, 2012

More Systemd Fun: The Blame Game And Stopping Services With Prejudice


  Systemd, Lennart Poettering's new init system that is taking the Linux world by storm, is full of little tricks and treats. Today we will play the slow-boot blame game, and learn how to stop services so completely the poor things will never ever run again.

Stomping Services

In the olden days of sysvinit there were several ways to stop a service:
  • Temporarily from the command line, like /etc/init.d/servicename stop, or service servicename stop
  • Changing its startup link in /etc/rcN.d (where N is the runlevel) from Sfoo to Kfoo
  • Removing all of its init scripts, which didn't always do the job, because sometimes there lurked a script that automatically restarted the service and bypassed the init scripts, so you had to hunt it down and change it
In Managing Services on Linux with systemd we learned how systemd simplifies starting and stopping services, both per-session and at boot. systemd has one more way of stopping services, and that is stopping them so completely they will never start again. Or at least not until you change your mind and make them start again. This is how to stop a running service temporarily:
# systemctl stop servicename.service
This stops it from starting at boot, but does not stop a running service:
# systemctl disable servicename.service
That also unhooks it from its other activation triggers, such as plugging in some hardware, or socket or bus activation. But you can still start and stop it manually, and another unit that depends on it can still pull it in. There is one way to really, really stop a service for good, short of uninstalling it, and that is masking it by linking it to /dev/null:


# ln -s /dev/null /etc/systemd/system/servicename.service
# systemctl daemon-reload


When you do this you can't even start the service manually. Nothing can touch it. (Masking does not stop an instance that is already running, so stop it first.) After being vexed by mysterious scripts that sneaked around behind my back and restarted services I wanted killed on sysvinit distros, I like this particular command a lot. Unit files in /etc/systemd/system override files of the same name in /lib/systemd/system. Unless your chosen Linux distro does something weird, /etc/systemd/system is for sysadmins and /lib/systemd/system is for your distro maintainers, who fill it through package management. So masking should survive system updates.
What if you change your mind? Pish tosh, it's easy. Simply delete the symlink and run systemctl enable servicename.service.
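If you are not sure whether a unit is currently masked, one quick check is whether its file under /etc/systemd/system is a symlink to /dev/null. The sketch below tries this in a scratch directory, so it touches nothing real. is_masked is a made-up helper name, not a systemd command, and servicename.service is the same placeholder used above.

```shell
#!/bin/sh
# is_masked is a hypothetical helper: it succeeds when the given unit file
# path is a symlink that points at /dev/null.
is_masked() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "/dev/null" ]
}

# Work in a scratch directory instead of /etc/systemd/system.
demo_dir=$(mktemp -d)
ln -s /dev/null "$demo_dir/servicename.service"

if is_masked "$demo_dir/servicename.service"; then
    echo "masked"
else
    echo "not masked"
fi

rm -r "$demo_dir"
```

Newer systemd releases also ship systemctl mask and systemctl unmask, which create and remove this same symlink for you.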
While we're here, let's talk about two reload commands: daemon-reload and reload. The daemon-reload command reloads the entire systemd manager configuration, including unit files, without disrupting active services. reload tells one specific service to reread its own configuration files without interrupting it, like this:
# systemctl reload servicename.service
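This reloads the service's own configuration file, the one the hardy sysadmin edits, for example /etc/ssh/sshd_config for an SSH server, and not its systemd unit file, sshd.service. It only works for services whose unit files define a reload action. So this is what to use when you change a service's configuration; use daemon-reload when you change a unit file.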
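One way to keep the two commands straight is to ask which file you just edited. The sketch below is only an illustration of that rule of thumb: suggest_reload is a hypothetical helper that prints the matching command without running it, and the two unit directories it checks are the ones named earlier in this article.

```shell
#!/bin/sh
# suggest_reload is a hypothetical helper: given the path of a file you just
# edited and the unit it belongs to, it prints (but does not run) the
# systemctl command that picks up the change.
suggest_reload() {
    changed_file=$1
    unit=$2
    case "$changed_file" in
        /etc/systemd/system/*|/lib/systemd/system/*)
            # A unit file changed: systemd must reread its own configuration.
            echo "systemctl daemon-reload"
            ;;
        *)
            # The service's own config changed: ask the service to reread it.
            echo "systemctl reload $unit"
            ;;
    esac
}

suggest_reload /etc/ssh/sshd_config sshd.service
suggest_reload /etc/systemd/system/sshd.service sshd.service
```

The first call prints systemctl reload sshd.service and the second prints systemctl daemon-reload. After a unit file change, a running service keeps its old settings until it is restarted.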

Pointing the Finger of Slow Boot Blame

Boot times seem to be an even bigger obsession than uptimes for some folks, and a lot of energy is going into trimming boot times. The time it takes for a PC to boot is controlled by two things: how long the BIOS takes to do its part, and then how long the operating system takes.

Figure 1: A simple boot profile graph generated by systemd-analyze.
There isn't much we can do about BIOS times. Some are fast, some are slow, and unless you're using a system with OpenBIOS you're at the mercy of your motherboard vendor. The PC BIOS hasn't advanced much since its inception lo so many decades ago, except for recurring attempts to lock users out and control what we can install on our own computers, which I think is pretty sad. With quad-core systems as common as dust mites it seems a computer should boot as fast as flicking a light switch.
But I digress, because we're supposed to be talking about systemd. systemd writes a boot-time summary to the syslog, like this example from Fedora 16 XFCE in /var/log/messages:
Jan 09 06:30:13 fedora-verne systemd[1]: Startup finished in 2s 817ms 839us (kernel) + 4s 629ms 345us (initrd) + 1min 11s 618ms 643us (userspace) = 1min 19s 65ms 827us
So this shows that the kernel was initialized in 2 seconds and change, initrd took 4 seconds and change, and everything else (userspace) took over a minute and eleven seconds, for a total of just over a minute nineteen. For all we've been hearing how sysvinit is like all slow and systemd is supposed to be faster, this seems like a long time. (Though it is faster than Windows 7 on the same machine, which needs 6 minutes to completely load all the OEM crap- and ad-ware.) So what's taking so long? We can find out with the systemd-analyze blame command, as this snippet shows:
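To double-check the arithmetic in that log line, you can convert each phase to seconds and see how they compare. This is a small sketch, not anything systemd provides: to_seconds is a hypothetical helper, and it assumes durations are written with the min, s, ms, and us suffixes seen in the log line above.

```shell
#!/bin/sh
# to_seconds is a hypothetical helper: it converts a duration such as
# "1min 11s 618ms 643us" into seconds with microsecond precision.
to_seconds() {
    echo "$1" | LC_ALL=C awk '{
        total = 0
        for (i = 1; i <= NF; i++) {
            n = $i; sub(/[a-z]+$/, "", n)   # numeric part, e.g. 618
            u = $i; sub(/^[0-9]+/, "", u)   # unit suffix, e.g. ms
            if      (u == "min") total += n * 60
            else if (u == "s")   total += n
            else if (u == "ms")  total += n / 1000
            else if (u == "us")  total += n / 1000000
        }
        printf "%.6f\n", total
    }'
}

to_seconds "2s 817ms 839us"        # kernel
to_seconds "4s 629ms 345us"        # initrd
to_seconds "1min 11s 618ms 643us"  # userspace
```

This prints 2.817839, 4.629345, and 71.618643, which add up to the 79.065827 seconds (1min 19s 65ms 827us) that systemd reports. Userspace accounts for about 90 percent of the boot.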

$ systemd-analyze blame
  60057ms sendmail.service
  51241ms firstboot-graphical.service
  3574ms sshd-keygen.service
  3439ms NetworkManager.service
  3101ms udev-settle.service
  3025ms netfs.service
  2411ms iptables.service
  2411ms ip6tables.service
  2173ms abrtd.service
  2149ms nfs-idmap.service
  2116ms systemd-logind.service
  2097ms avahi-daemon.service
  1337ms iscsi.service

Sendmail? Firstboot graphical service? Iptables? Avahi? Iscsi?? This is from a fresh Fedora installation, so I have a lot of housecleaning to do. There are some limitations to this command: it doesn't show which services start in parallel or what's holding up the ones that take longer. But it does show me a lot of slow services that don't need to be running at all on my system.
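When the blame list gets long, it can help to filter it down to the worst offenders. The sketch below is one way to do that under a stated assumption: slow_units is a hypothetical helper, and it expects each line to look like the listing above, a time in milliseconds followed by the unit name. Other systemd versions may print times in a different format.

```shell
#!/bin/sh
# slow_units is a hypothetical helper: it reads `systemd-analyze blame` output
# on stdin and prints the units that took at least the given number of
# milliseconds. It assumes the "60057ms name.service" format shown above.
slow_units() {
    awk -v limit="$1" '{
        ms = $1
        sub(/ms$/, "", ms)          # strip the suffix, leaving the number
        if (ms + 0 >= limit + 0) print $2
    }'
}

# Sample lines copied from the listing above. On a live system you would run:
#   systemd-analyze blame | slow_units 3000
slow_units 3000 <<'EOF'
  60057ms sendmail.service
  3574ms sshd-keygen.service
  3025ms netfs.service
  2411ms iptables.service
EOF
```

With a 3000 ms threshold the sample prints sendmail.service, sshd-keygen.service, and netfs.service, and leaves out iptables.service.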
If you like pretty graphs, systemd includes a cool command for automatically generating an SVG image from the same timing data, like this:
$ systemd-analyze plot > graph1.svg
You can view this nice graph in the Eye of GNOME image viewer, or GIMP via the SVG plugin. Figure 1 shows what it looks like.
This covers the same services as the text output of systemd-analyze blame, but it looks pretty, and because it lays them out on a timeline it might give you some clues about where the bottlenecks are.
