
SysAdmin'ish Blog

668.5

Of course, you know that 42 is The Answer to the Ultimate Question of Life, The Universe, and Everything.

So, clearly, you know that 21 is only half the truth1).

However, do you also know what 668.5 is?

No? — Well, it's the average ;-)







· 2013-04-15 19:55 · 1 Comment

Deleting mail attachments

For ages2) I have been using a Perl script called demime, written by Nick Simicich, to remove attachments and HTML from messages, e.g. for mailing lists or my personal archive. demime still works astonishingly well, but it has its quirks and is rather radical in what it does. Hence, I was always looking for alternatives.

Recently, I came across remove_mime.pl by Aaron Ciuffo which is based on a script by Mike A. Leonetti. Eventually, these two encouraged me to write my own script: attachment_filter.pl

Resources

A crude summary of what I have found so far on the Perl side:

Scripts

  • demime.pl by Nick Simicich: 15 years old and still rocking!
    Strips messages down to text/plain; can convert HTML by means of lynx; almost no dependencies; not for the faint of heart. Unfortunately, the original link http://scifi.squawk.com/demime.html is no longer available, but feel free to download my copy: demime_1.1e_as1.pl
  • remove_mime.pl by Aaron Ciuffo: Removes attachments based on filename extensions; simple; depends on Email::MIME.
  • remove_attachments.pl by Mike A. Leonetti: The mother script of remove_mime.pl; also only depends on Email::MIME; includes instructions on how to use it with Postfix; sends an autoreply to the sender if attachments have been removed.
  • strip-attachments.pl by Pjotr Prins: Uses Perl's Mail::Box; processes an mbox mailbox; writes attachments to a folder and deletes them from the e-mail. The script, as it is, removes attachments based on their size (if larger than 16K).
  • mimefilter.pl by D.G.M. Salvetti: Comes as a Debian package; some dependencies; filters based on Content-Type/MIME type; a bit awkward to handle; sends autoreplies.
  • stripmime.pl by Alex Wetmore: No dependencies; leaves only text/plain; converts HTML to text by means of an over-simplified HTML parser; not recommended because it breaks quite a bit.
  • attachment_filter.pl: my own attempt; uses Email::MIME; removes attachments based on their MIME Content-Type; works with nested multipart attachments; follows the KISS principle.

Modules

  • Email::MIME: Easy MIME message parsing, by Ricardo Signes, Casey West, Simon Cozens; depends on and extends Email::Simple; much recommended; for examples see the scripts by Mike A. Leonetti, Aaron Ciuffo, and my own.
  • Email::MIME::Attachment::Stripper: Handy for removing and extracting attachments, but I couldn't find code that shows how to re-assemble messages, so I haven't tried it yet and stuck with Email::MIME instead.


How to kill ethernet with USB

Have you ever told a friend or client that s/he could wire up the PC and external devices all alone because each plug fits into only one specific socket and cannot be mixed up? And that if there is more than one socket of the same type, they are color-coded (like PS/2 and audio sockets)? I did, but I probably won't do it again, at least not if I have to assume that a standard USB cable of type A-B is around!

I thought I had seen it all, like one guy who plugged in a 15-pin VGA wire upside-down. In former times, every sysadmin knew to warn people that phone wires with smaller RJ plugs do fit perfectly into RJ45, and in fact can cause great damage. Fortunately, these wires are pretty much extinct nowadays. I really thought it was safe to let users do the wiring.

Today, a friend of mine called me because he couldn't install his new printer. The printer offered LAN and USB. My friend made sure the printer worked by itself, but when it was connected to a PC via USB, the PC did not find any printer.

Well, of course not! I had no idea how damn well a Type B USB plug fits into an RJ45 socket. Besides, I don't want to know how much damage it can cause. It short-circuits at least 6 pins! However, for my friend's sake I'll now go and check it out.

No new devices could be found

We got a computer with rather new hardware and no operating system pre-installed. When I wanted to install Windows 7 it prompted me for drivers which I provided, but no matter what, Windows 7 refused to continue the installation and kept telling me

No new devices could be found

The computer was equipped with an Intel DH61AG board and an SSD drive. In the BIOS we activated AHCI. So, I was not surprised that Windows asked for drivers.

First, I thought I got the wrong drivers. I double-checked and I tried different ones from the CD that came with the computer. No go. Second, I downloaded up-to-date drivers for the "F6 floppy" (apparently some manufacturers still list them in this category), but this did not help either.

At that point I started to [guugl] for the error. Sure enough, there were plenty of postings, but none of the suggested solutions seemed appropriate. I had the right drivers for the right architectures. The hard disk was OK (I could boot Ubuntu 12.04 without any problems), and so on. Anyway, just for the record, one more thing I tried was completely erasing the hard disk, though to no avail.

One poster said that the problem was a supposedly faulty USB stick which was used for the installation. This got me thinking because I was using a (virtual) CD-ROM connected via USB. So, I tried a (real) Windows 7 DVD instead. This, indeed, worked without any problem. In fact, Windows 7 didn't even ask for drivers. It showed me the license agreement, asked for the installation mode and then offered me a list of partitions from the hard disk.

I stopped the installation right away, because I wanted to know what was going on. A few more experiments, and I knew what really was the problem:

I had my installation disk (the Windows 7 DVD) plugged into a USB 3 connector! My guess is that every time Windows tried to load the drivers for AHCI and/or Intel (Rapid) Storage, the drive on USB 3 got disconnected. The error message was misleading, to say the least.

Once I plugged my installation disk into the normal USB socket everything worked as expected. No drivers needed.

Courier IMAP & POP3 stats per user

So far, I have been using courier-analog from the Courier Mail Server to generate IMAP bandwidth reports on a per-user basis. Recently, however, when I copied a user's mailbox of about 600 MB from a remote server via imapsync, I was surprised that courier-analog showed only a few KB of traffic. At first I thought that courier-analog only counts outbound traffic, but looking at the source code it seems that it counts only the body values (see below). So, I quickly wrote my own script to get per-user stats.

A typical IMAP log line on my server running courier-imap version 4.8.0-3 (from Debian) reads

Jul 29 06:36:47 server imapd-ssl: LOGOUT, user=example, ip=[::ffff:127.0.0.1], \
headers=0, body=0, rcvd=48, sent=292, time=344, starttls=1

Courier IMAP distinguishes headers, bodies, received and sent bytes. Clearly, headers and body can be zero while there is still some traffic. But more importantly, body gives only the bytes of messages sent, apparently:

Jul 23 22:12:28 server imapd-ssl: DISCONNECTED, user=example, ip=[::ffff:127.0.0.1], \
headers=0, body=0, rcvd=14084220, sent=8141, time=123, starttls=1

For simplicity I opted to preprocess the logfile with sed and then sum up the received and sent values with awk.
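
To make the two steps concrete, here is a minimal sketch of them applied to the two sample log lines above. It assumes that each entry sits on one physical line in the logfile (the backslash continuation above is only for display). The names extract_fields, sum_traffic and sample_log are hypothetical helpers for illustration, and sed -E is used here in place of -r:

```shell
#!/bin/sh
# Sketch only: extract_fields, sum_traffic and sample_log are hypothetical
# helpers for illustration, not part of the script below.

# Print "user rcvd sent time" for every imapd line of the given date ($1).
extract_fields() {
    sed -nE "s/^$1 (..:..:..) .* imapd.* user=([^,]*),.* rcvd=([0-9]+), sent=([0-9]+).*\$/\\2 \\3 \\4 \\1/p"
}

# Sum received and sent bytes per user, printing "user total".
sum_traffic() {
    awk '{ total[$1] += $2 + $3 } END { for (user in total) print user, total[user] }'
}

# The two sample lines from above, each joined into a single line.
sample_log() {
    cat <<'EOF'
Jul 29 06:36:47 server imapd-ssl: LOGOUT, user=example, ip=[::ffff:127.0.0.1], headers=0, body=0, rcvd=48, sent=292, time=344, starttls=1
Jul 23 22:12:28 server imapd-ssl: DISCONNECTED, user=example, ip=[::ffff:127.0.0.1], headers=0, body=0, rcvd=14084220, sent=8141, time=123, starttls=1
EOF
}

# Only the Jul 23 line passes the date filter.
sample_log | extract_fields 'Jul 23'                # example 14084220 8141 22:12:28
sample_log | extract_fields 'Jul 23' | sum_traffic  # example 14092361
```

The time of day is kept as a fourth field so that lines from different sessions stay distinct, while true duplicates can still be collapsed before summing.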

Script

courier-traffic-imap
#!/bin/bash -e
#
# Calculate traffic of Courier IMAPd per user
 
[[ $1 ]] && DATE="$1" || DATE=$(date +%b\ %e -d yesterday)
[[ $1 == today ]] && DATE=$(date +%b\ %e)
 
LOGFILE=${2:-/var/log/mail.log}
THRESHOLD=2900000 # (almost 3 MB)
UNIT=2  # Bytes (B): 0, KB: 1, MB: 2, GB: 3, ...
 
echo -e "== Courier IMAP traffic stats per user of $DATE ==\n"
 
# We preprocess $LOGFILE with sed, extracting only imapd lines,
# the user name, received and sent bytes.
# Numbers are then summed up by awk, and finally pretty printed
 
# We use uniq to work around a bug in courier imap that fills the logs
# with thousands of identical lines
 
sed -nre "
	s/^$DATE (..:..:..) .* imapd.* user=([^,]*),.* rcvd=([0-9]+), sent=([0-9]+).*$/\\2 \\3 \\4 \\1/p
	" "$LOGFILE" \
| uniq \
| awk -v threshold=$THRESHOLD -v unit=$UNIT '
	# sum up numbers: uc=count, ui=received, uo=sent, u(i/o)g=global totals
	{ uc[$1]++; ui[$1]+=$2; uo[$1]+=$3; ucg++; uig+=$2; uog+=$3; }
	END {
		# get unit string (ustr) and exponent (uexp)
		split("B KB MB GB TB PB",units); ustr=units[unit+1]; uexp=3*unit
		# print results
		for (user in uc) {
			u[user] = ui[user]+uo[user] # per user sum of received and sent
			if (u[user] > threshold)
				printf "%28s %6d %-2s  [#%5d: r %5d s %5d ]\n",
					user,u[user]/10^uexp,ustr,uc[user],ui[user]/10^uexp,uo[user]/10^uexp \
					| "sort -nr -k 2"
		}
		close("sort -nr -k 2")
		printf "\n%28s %6d %-2s  [#%5d: r %5d s %5d ]\n",
			"-- total --",(uig+uog)/10^uexp,ustr,ucg,uig/10^uexp,uog/10^uexp
	} '

Example

$ bin/courier-traffic-imap 'Jul 23' /var/log/mail.log.1
== Courier IMAP traffic stats per user of Jul 23 ==

                     example    624 MB  [#    3: r     0 s   623 ]
                another-user     90 MB  [#   36: r    40 s    50 ]
            and.another-user     89 MB  [#    7: r    85 s     4 ]
                   and.so.on     68 MB  [#   92: r    10 s    58 ]
                and-so-forth     42 MB  [#   20: r    31 s    10 ]
                
                 -- total --   1002 MB  [# 1027: r   207 s   795 ]

POP3 stats

The logfile format is exactly the same. Just replace every imapd in the script with pop3d. Of course, it would be easy to put both into one script. However, POP3 traffic is much lower than IMAP traffic, so I guess I'll ignore it completely.
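
If both protocols ever were to go into one script, the sed pattern could simply accept either daemon name. This is a minimal sketch, assuming (as stated above) that pop3d lines carry the same user=, rcvd= and sent= fields as imapd lines. The names extract_mail_fields and sum_by_protocol are hypothetical and not part of courier-traffic-imap:

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, assuming pop3d log lines carry the
# same user=, rcvd= and sent= fields as the imapd lines.

# Print "daemon user rcvd sent" for imapd and pop3d lines of the given date ($1).
extract_mail_fields() {
    sed -nE "s/^$1 (..:..:..) .* (imapd|pop3d).* user=([^,]*),.* rcvd=([0-9]+), sent=([0-9]+).*\$/\\2 \\3 \\4 \\5/p"
}

# Sum received plus sent bytes per daemon, printing "daemon total".
sum_by_protocol() {
    awk '{ total[$1] += $3 + $4 } END { for (d in total) print d, total[d] }' | sort
}
```

It could be used like this:

    extract_mail_fields 'Jul 23' < /var/log/mail.log.1 | sum_by_protocol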

Vertical, sliding panel with auto-hide & pin

Last year, I was playing around with a vertical, sliding panel that appears and disappears automatically depending on the mouse pointer3). Sebastian Bohnen asked me whether it was possible to add a pin to the panel that would stop the auto-hide when pinned.

I replied with a rough sketch that I thought must have been quite incomprehensible but it didn't stop him from adding this feature. Here's the bare bones:

Of course, we need a pin. A picture of a pin would be nice but to keep it simple let's just use an empty <div> for now:

<div id="rightPanelPin"> </div>

Then we need to register a JavaScript function to be triggered when the "pin" is clicked. So, we add

$('#rightPanelPin').click(rightPanelPin);

to the function passed to $(document).ready().

The actual function rightPanelPin() toggles a variable and the content of the "pin" (should show a pinned pin or something when clicked).

var rightpanelpinned = false;
function rightPanelPin() {
    if (rightpanelpinned) {
        rightpanelpinned = false;
        $('#rightPanelPin').html(' ');
    } else {
        rightpanelpinned = true;
        $('#rightPanelPin').html('P');
    }
}

That's it. Now we can add && !rightpanelpinned to the collapsePanel() function to stop the auto-hide if the panel is pinned.

function collapsePanel() {
    if (rightpanel && !rightpanelpinned) {
(...)

Demo

I have updated the jsfiddle demo to include the pin feature.

Sebastian currently has the panel with a pin live at

  • http://e-ducation.net/technicalscience/ (went off-line in 2014!?)

As of now, to show it: Click on 'Open', and 'toggle layout'. Then the vertical panel is on the left side.


Does Google fight spam?

About a year ago, maybe longer, I observed a particular type of fake lottery win spam messages hitting our mail server. Fortunately, it turned out that this type could easily be filtered. Even though the senders change often and the format of the messages changed over time, some crucial parts remained the same.

All of this spam is sent via Google Mail. So, of course, I reported the messages to Google. Since we received quite a lot of them and since there are apparent patterns, it should be easy for Google to track down the spammers and stop them. Or so I thought.

I have reported these messages many times for at least a year now. But they keep coming. Go figure!

Lotto spam via Google of the last 12 months

2012-05: |========+=========+=========+=========+=========* (est.) [ 150]
2012-04: |========+=========+=========+*                           [  95]
2012-03: |========+=========+=======*                              [  86]
2012-02: |*                                                        [   7]
2012-01: *                                                         [   5]
2011-12:                                                           [   2]
2011-11: |========*                                                [  30]
2011-10: |========+=========*                                      [  61]
2011-09:                                                           [   0]
2011-08:                                                           [   2]
2011-07: |=*                                                       [   9]
2011-06:                                                           [   0]

Each bar character represents 3 spam messages. The actual count for May 2012 is 51 as of 2012-05-10 12:00; the bar shows the estimate for the whole month.
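
The bars follow a simple rule that can be read off the chart: a bar is floor(count / 3) characters long, starts with a |, ends with a *, and has a + at every tenth position. Here is a minimal sketch of that rule. It is inferred from the chart above rather than taken from whatever actually produced it, and spam_bar is a hypothetical name:

```shell
#!/bin/sh
# Sketch only: spam_bar is a hypothetical helper; the drawing rule is
# inferred from the chart, not taken from the original tooling.

# Print a bar for $1 messages, one character per $2 messages (default 3).
# The body runs in a subshell so its variables do not leak.
spam_bar() (
    len=$(( $1 / ${2:-3} ))
    bar=""
    i=1
    while [ "$i" -le "$len" ]; do
        if [ "$i" -eq "$len" ]; then c='*'        # last character
        elif [ "$i" -eq 1 ]; then c='|'           # first character
        elif [ $(( i % 10 )) -eq 0 ]; then c='+'  # every tenth position
        else c='='
        fi
        bar="$bar$c"
        i=$(( i + 1 ))
    done
    printf '%s\n' "$bar"
)

spam_bar 86   # 2012-03: |========+=========+=======*
spam_bar 30   # 2011-11: |========*
```

Counts below 3 produce an empty bar, which matches the blank rows in the chart.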

Update of 2012-05-17

In the course of further analysis I found that a few of the original numbers shown above were slightly off for 2011. I have corrected them now.

Some of the numbers for July and August 2011 were actually too high. In these 2 months the spammer also used Yahoo. One could assume that Yahoo managed to lock the spammers out whereas Google did not. But of course, that's just a wild guess ;-)

Here is a chart similar to the one above showing the percentages of spam messages coming in via Google and the percentages of the particular lotto spam per week:





1) Does anybody know who came up with this phrase?
2) Well, almost ages, but I do remember the times when e-mail had no attachments ;-)
 
blog/index.txt · Last modified: 2013-04-21 18:51 by andreas