IMAP2RSS

17/10/2004

The idea

Now that I had my IMAP server and was comfortable with RSS, I wanted to combine the two in Perl. I wrote IMAP2RSS to generate an RSS feed that mirrors the content of an IMAP mailbox.

For example, I subscribe to a mailing list whose messages arrive at my server and are stored in an IMAP mailbox. If I want to draw more attention to that mailing list, providing an RSS feed could be a good idea. Newcomers can see what the list is about before subscribing (and giving away their address), and all I have to do is run the script and automatically upload the feed to a server.

This script can be used for many other things, such as keeping a backup or simply sharing information you have.

Dependencies

This Perl script didn't take me long to write because it relies on several existing modules. I chose XML::RSS to handle the feed and Net::IMAP::Simple to talk to the mail server. I then added MIME::Words to decode the MIME-encoded words found in e-mails (especially useful if you have messages in French like I do), and the Date::Parse and Date::Format duo to handle the dates.
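
To give an idea of what the MIME decoding step does, here is a minimal sketch. It is not the code from the script: it uses the 'MIME-Header' encoding of the core Encode module instead of MIME::Words, only so that it runs without installing anything. The helper name decode_header and the sample header are made up for the illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode RFC 2047 encoded-words in one header line.
sub decode_header {
    my ($line) = @_;
    return decode('MIME-Header', $line);
}

# A made-up Subject header with an ISO-8859-1 encoded-word.
my $subject = decode_header('Subject: =?ISO-8859-1?Q?R=E9union_fran=E7aise?=');

# Encode to UTF-8 bytes for printing on a UTF-8 terminal.
print encode('UTF-8', $subject), "\n";
```

Without this step, accented subjects and sender names would show up in the feed in their raw =?...?= form.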

The output

The only given is that the input is an IMAP mailbox on a designated server; when it comes to the output, there are choices to be made. I decided to create an RSS 1.0 feed with the content module, because it permits the basic formatting we need, and with Dublin Core information, which could be useful. I stuck to ISO-8859-1 encoding for simplicity's sake.

In the end, we have a nice RSS feed that displays messages in a basic format with author, date and subject. Using the Message-ID as the rdf:resource is questionable, but it's a minor trick to get a reference instead of nothing.
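
The date deserves a word, because Dublin Core dates in RSS 1.0 are conventionally written in W3CDTF form (for instance 2004-10-17T12:30:00+02:00), while mail headers and most formatting routines produce offsets without the colon. The sketch below shows the idea with core modules only; the function names w3cdtf_utc and add_offset_colon are hypothetical, and the script itself uses Date::Parse and Date::Format instead.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format an epoch time as a W3CDTF timestamp in UTC.
sub w3cdtf_utc {
    my ($epoch) = @_;
    return strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($epoch));
}

# Turn a trailing numeric offset such as +0200 into +02:00.
sub add_offset_colon {
    my ($stamp) = @_;
    $stamp =~ s/([+-]\d\d)(\d\d)$/$1:$2/;
    return $stamp;
}

print w3cdtf_utc(0), "\n";
print add_offset_colon('2004-10-17T12:30:00+0200'), "\n";
```

The second function does the same job as the substitution that follows time2str in the script.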

The code

#!perl -w
# IMAP2RSS

use strict;
use Net::IMAP::Simple;
use MIME::Words qw(decode_mimewords);
use XML::RSS;
use Date::Parse;
use Date::Format;

# the feed to be generated
my $feed = XML::RSS->new(
  version => '1.0', 
  encoding => 'ISO-8859-1'
);
$feed->channel(
  title => 'TITLE OF THE FEED', 
  link => 'URL OF THE HOME OF THE FEED', 
  description => 'DESCRIPTION OF THE FEED', 
  language => 'LANGUAGE OF THE MESSAGES', 
  syn => { 
    updatePeriod => 'daily', 
    updateFrequency => '1', 
    updateBase => '1970-01-01T00:00+00:00'
  }
);
$feed->add_module(
  prefix => 'content', 
  uri => 'http://purl.org/rss/1.0/modules/content/'
);

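# the IMAP server to connect to
my $server = Net::IMAP::Simple->new('192.168.0.21') or die "Can't connect";
$server->login('MYLOGIN','MYPASSWORD') or die "Can't login";
# the IMAP mailbox (select returns the number of messages it holds)
my $nbmsg = $server->select('MAILBOX');
die "Can't select mailbox" unless defined $nbmsg;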

# loop through the last 15 messages (or fewer if the mailbox is smaller)
for (my $i = $nbmsg; $i > $nbmsg-15 && $i > 0; $i--) {
  # get the message
  my $lines = $server->get($i);
  my ($d, $subject, $date, $from, $ref);
  my $inbody = 0;
  my $body = "";
  # loop inside the message
  foreach my $l (@$lines) {
    $l =~ s/(?:\012\015|\012|\015)//;
    # decode MIME entities
    $d = decode_mimewords($l);
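    # grab essential headers (only while still in the header block,
    # and anchored so that e.g. Delivery-Date: does not match Date:)
    if ($inbody == 0) {
      if ($d =~ /^Subject: (.*)/i) { $subject = $1; }
      if ($d =~ /^From: (.*)/i) { $from = $1; }
      if ($d =~ /^Message-ID: <(.*)>/i) { $ref = $1; }
      if ($d =~ /^Date: (.*)/i) { $date = str2time($1); }
    }
    # grab the message body (everything after the first blank line)
    $body .= $d if ($inbody == 1);
    $inbody = 1 if (length $d < 2);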
  }
  # convert the date to Dublin Core format
  $date = time2str("%Y-%m-%dT%T%z",$date);
  $date =~ s/(\d\d)$/:$1/;
  $from =~ s/ <.*>//;
  $from =~ s/"//g;
  # add the message as an RSS item
  $feed->add_item(
    title => $subject, 
    link => $ref, 
    description => $body, 
    dc => { 
      date => $date, 
      creator => $from
    }, content => { 
      encoded => $body 
    }
  );
}

$server->quit();
# save the feed with simple formatting
my $file = $feed->as_string;
$file =~ s/(<content:encoded>)/$1<![CDATA[\n<pre>/g;
$file =~ s/(<\/content:encoded>)/<\/pre>\n]]>$1/g;
open(my $fh, '>', 'armada.rss') or die "Can't write armada.rss: $!";
print $fh $file;
close($fh);

Making the feed available is now just a matter of placing it at the right location on the server or uploading it through FTP.

Future improvements

If I ever decided to handle HTML e-mails, that wouldn't be too difficult to add to the code, because the feed already carries marked-up content in content:encoded. Attachment handling could be added, but that wouldn't make much sense in the RSS context. What I'd like is to be able to handle more encodings; right now ISO-8859-1 is enough, though the future seems to belong to UTF-8. In the foreseeable future, I could add URL detection and linking.
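
As a rough idea of what URL detection and linking might look like, here is a small sketch. It is not part of IMAP2RSS; the helper names escape_html and linkify are hypothetical, and the URL pattern is deliberately naive (it would, for example, swallow a full stop that directly follows a URL).

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: escape a plain-text body and wrap http(s) URLs in links.
sub linkify {
    my ($text) = @_;
    my $out = '';
    # The capturing group makes split return the URLs too, at odd indexes.
    my @parts = split m{(https?://[^\s<>"]+)}, $text;
    for my $i (0 .. $#parts) {
        my $safe = escape_html($parts[$i]);
        $out .= $i % 2 ? qq{<a href="$safe">$safe</a>} : $safe;
    }
    return $out;
}

print linkify('See http://example.org/list?a=1&b=2 for <details>'), "\n";
```

In the script, such a function would be applied to the body before it is handed to content:encoded, so that the links end up inside the pre block.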
