
A jumpy road with load

On June 4, I upgraded Debian on one of my servers from "Lenny" to "Squeeze". Before the upgrade, the system load of this server was generally low, mostly below 0.1, except for a daily spike every morning when several backup jobs run. Once Debian Squeeze was running, I quickly noticed that the server load was significantly higher. Watching top, I figured that the new version 1.4 of Munin, which also runs on this server, was the main culprit.

On June 5, I decided that the high load might very well affect my electricity bill and the environment. Since the Munin graphs of this server are rarely accessed, I configured Munin to run via CGI. Afterwards, the load was down to 0.2.

Clearly, that was still higher than before the Debian upgrade. But top didn't show anything. (You know that feeling that the load always decreases once you start top?) I almost set the finding aside as some artifact.

Early June 7, I replaced the UPS supporting this server. The old UPS was monitored by the nut package, the new one was connected to apcupsd. I did not expect this to have any influence on the load. Thus, I didn't even check.

On the morning of June 9, I received a message from the server letting me know that the attached GSM USB modem was apparently out of order. Actually, the message said that there was no log file from the previous day. Generally, this happens when the modem is not attached at all. But it was.

Looking at the Munin load graph, I assumed that for some unknown reason the modem had not been detected when I rebooted the server on June 7. At least the log files said so. Strangely enough, during the two days without the modem the server's load was back down close to 0. After I made sure the modem was working, the load went up again (once more without any busy processes showing up in top).

On June 10, just to prove the point, I rescheduled the job that queries the modem. Until then, a cron job had run gammu backupsms every 5 minutes to ask the modem whether any SMS had arrived. To see how this affects the load, I changed the interval to 10 minutes. And indeed, the load dropped by about 50%.
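A drop of about half is what a back-of-the-envelope estimate would predict: if each run of the job keeps one task busy or blocked for a fixed time, its long-run contribution to the load average is roughly that time divided by the interval. A minimal Python sketch of this arithmetic follows. The function name and the 60-second run time are made up for illustration; they are not measurements from this server.

```python
def load_contribution(busy_seconds, interval_seconds):
    """Approximate long-run load added by a periodic job.

    Assumes each run keeps exactly one task runnable or blocked on I/O
    for busy_seconds, and that runs do not overlap.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return busy_seconds / interval_seconds


# Hypothetical duration of one modem query (an assumption, not measured).
busy = 60

every_5_min = load_contribution(busy, 5 * 60)
every_10_min = load_contribution(busy, 10 * 60)

print(f"every 5 min:  {every_5_min:.2f}")
print(f"every 10 min: {every_10_min:.2f}")
```

Whatever the real run time is, doubling the interval halves the estimate, which matches the observed reduction.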

Late on June 10, I gave gammu-smsd a try, hoping that the high load of a repeated gammu backupsms was caused mainly by the initialization of the modem (which has taken longer since the upgrade to Debian Squeeze), and that the gammu-smsd daemon might cut that down by constantly monitoring the modem.

Apparently, this is the case. Now that Munin runs via Fast-CGI and gammu-smsd has replaced the periodic gammu call, the load is in fact lower than before the Debian upgrade. As it should be.

Discussion

Johann Klasek, 2012-05-25 21:51

Load under Linux is sometimes different from other Unix flavours (e.g. Solaris). Linux also counts processes as "running" when they are doing I/O, even if they are waiting/blocked from the perspective of the scheduler. I have seen this on systems with heavy disk load or I/O-bound tasks (tasks in state "D"), which leads to high load although heavy CPU usage is completely absent … (hence top will not show the offending processes at the top). Maybe this played a role in the above scenario, too. At least it shows that top is not well suited to spotting the tasks that cause a high load …
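One way to look for such tasks is to read the process state directly from /proc instead of sorting by CPU usage. The following is a minimal Python sketch, assuming a Linux system with /proc mounted; the function names are made up for this example. It reports only the state of each process's main thread at one instant, so it may need to be run repeatedly to catch a short-lived blocker.

```python
import os


def parse_state(stat_line):
    """Return (pid, command, state) from one /proc/<pid>/stat line.

    The command is wrapped in parentheses and may itself contain spaces
    or parentheses, so split at the last closing parenthesis.
    """
    head, _, tail = stat_line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, command) of processes currently in state "D"."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        # No such directory, e.g. not running on Linux.
        return []
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue
        path = os.path.join(proc_root, entry, "stat")
        try:
            with open(path, errors="replace") as handle:
                pid, comm, state = parse_state(handle.read())
        except OSError:
            # The process exited between listdir() and open().
            continue
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Run from cron or in a loop while the load is high, this lists candidates that top would rank near the bottom because they use almost no CPU.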

 
blog/110613_a_jumpy_road_with_load.txt · Last modified: 2011-06-15 19:33 by andreas