problems with b2m, M-x unrmail

From: Jonathan Kamens
Subject: problems with b2m, M-x unrmail
Date: Sun, 9 Jun 2002 16:28:04 -0400

There is a program called "b2m" included with GNU Emacs which reads a
Babyl file on stdin and converts it to an mbox file on stdout.
Unfortunately, it has at least five different problems.

1) It doesn't look for "Return-Path", "From" or "Sender" lines in the
   headers of the messages to find meaningful addresses to put in the
   "From " lines of its output.

2) The fake address it *does* put into the "From " lines is "Babyl to
   mail by b2m", including both the quotation marks and spaces.  Most
   programs which read mbox files will not tolerate "From " lines
   whose addresses contain spaces, so the mbox file this program
   produces isn't valid with many programs.  D'oh!

3) The header it puts for each message in the mbox file is the pruned
   header rather than the original full header.  I think the latter
   makes much more sense -- let whatever program is displaying the
   mbox file do appropriate header filtering rather than forcing it to
   use Emacs's.

4) It puts the current date in the "From " line of every message
   rather than figuring out from the message's "Date:" header what to
   put in its "From " line.

5) It doesn't quote "From " lines in message bodies.

It seems to me that at the very least, (2) and (5) from the list above
need to be fixed.
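
To make (2) and (5) concrete, here is a minimal Perl sketch of what a
reader-friendly converter has to do.  The helper names mbox_from_line and
quote_from_lines are hypothetical and are not taken from b2m or unrmail;
gmtime is used only so that the sample output does not depend on the local
time zone.

```perl
use strict;
use warnings;

# Hypothetical helper: build the "From " separator line for one message.
# The address must not contain whitespace, or mbox readers may reject it.
sub mbox_from_line {
    my ($addr, $epoch) = @_;
    die "address contains whitespace: '$addr'\n" if $addr =~ /\s/;
    return "From $addr " . scalar(gmtime($epoch)) . "\n";
}

# Hypothetical helper: quote body lines that start with "From " so they
# are not mistaken for the start of a new message.
sub quote_from_lines {
    my ($body) = @_;
    $body =~ s/^From />From /mg;
    return $body;
}

print mbox_from_line('user@example.com', 0);
print quote_from_lines("Hello,\nFrom now on this line is quoted.\n");
```

Only lines that begin with "From " are touched; the same words in the
middle of a line are left alone.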

There's also an Emacs command, M-x unrmail, which performs the same
conversion.  It partially fixes (1) (it looks for "From",
"Really-from" or "Sender" but doesn't look for "Return-Path"), fixes
(2), fixes (3) but doesn't strip out the Emacs-specific
"X-Coding-System" header, doesn't fix (4), and fixes (5).  I believe
that at the very least, "return-path" should be put at the beginning
of the list of headers checked for a return address.
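
As a rough illustration of that ordering, the sketch below returns the
value of the first header found when the names are tried in priority
order.  The helper first_header_value is hypothetical, it ignores folded
continuation lines, and it returns the raw header value rather than a
parsed address.

```perl
use strict;
use warnings;

# Hypothetical helper: return the raw value of the first header that is
# present, trying the names in the order given (case-insensitively).
sub first_header_value {
    my ($header_block, @names) = @_;
    foreach my $name (@names) {
        if ($header_block =~ /^\Q$name\E:[ \t]*(.*)$/mi) {
            return $1;
        }
    }
    return;
}

my $headers = 'From: Alice <alice@example.com>' . "\n"
            . 'Return-Path: <bounce@example.com>' . "\n";
my $value = first_header_value($headers,
                               qw(return-path from really-from sender));
print "$value\n";
```

With "return-path" first in the list, the Return-Path value is chosen even
though the From header appears earlier in the message.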

I've written a Perl script, b2m.pl, to provide the same functionality
as b2m.c and M-x unrmail while addressing all of the problems listed
above.  It is attached to this message.  I'd be happy to have you
distribute it as part of GNU Emacs instead of, or in addition to, the
b2m C program.  I will maintain it if you do so.

  Jonathan Kamens

# b2m.pl - Script to convert a Babyl file to an mbox file

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA.

# Maintained by Jonathan Kamens <address@hidden>.

# Requires CPAN modules: MailTools (for Mail::Address), TimeDate (for
# Date::Parse).

use warnings;
use strict;
use File::Basename;
use Getopt::Long;
use Mail::Address;
use Date::Parse;

my($whoami) = basename $0;
my($version) = '$Revision: 1.3 $';
my($usage) = "Usage: $whoami [--help] [--version] [--[no]full-headers] [Babyl-file]
\tBy default, full headers are printed.\n";

my($opt_help, $opt_version);
my($opt_full_headers) = 1;

die $usage if (! GetOptions(
                            'help' => \$opt_help,
                            'version' => \$opt_version,
                            'full-headers!' => \$opt_full_headers,
                            ));

if ($opt_help) {
    print $usage;
    exit;
}
elsif ($opt_version) {
    print "$whoami version: $version\n";
    exit;
}

die $usage if (@ARGV > 1);

$/ = "\n\037";

if (<> !~ /^BABYL OPTIONS:/) {
    die "$whoami: $ARGV is not a Babyl file\n$usage";
}

while (<>) {
    my($msg_num) = $. - 1;
    my($labels, $full_header, $header);
    my($from_addr, $time);

    # This will strip the initial form feed, any whitespace that may
    # be following it, and then a newline
    s/^\s+//;
    # This will strip the ^_ off of the end of the message
    s/\037$//;

    # The first line of each message holds the labels
    if (! s/(.*)\n//) {
        warn "$whoami: message $msg_num in $ARGV is malformatted\n";
        next;
    }
    $labels = $1;

    # Anything before the "*** EOOH ***" marker is the full header; it
    # is absent when the displayed header has not been pruned
    if (! s/(?:((?:.+\n)+)\n+)?\*\*\* EOOH \*\*\*\n+//) {
        warn "$whoami: message $msg_num in $ARGV is malformatted\n";
        next;
    }
    $full_header = $1;

    if (s/((?:.+\n)+)\n+//) {
        $header = $1;
    }
    else {
        # Message has no body
        $header = $_;
        $_ = '';
    }

    if (! $full_header) {
        $full_header = $header;
    }

    # End message with a single newline
    s/\s+$/\n/;

    # Quote "^From "
    s/(^|\n)From /$1>From /g;

    # Strip the integer indicating whether the header is pruned
    $labels =~ s/^\d+[,\s]*//;
    # Strip extra commas and whitespace from the end
    $labels =~ s/[,\s]+$//;
    # Now collapse extra commas and whitespace in the remaining label string
    $labels =~ s/[,\s]+/, /g;

    # Remove headers that only Rmail understands
    foreach my $rmail_header (qw(summary-line x-coding-system)) {
        $full_header =~ s/(^|\n)$rmail_header:.*\n/$1/i;
    }

    # Use the first header, in priority order, that yields an address.
    # The capture covers the header line plus any folded continuation
    # lines, which begin with a space or tab.
    foreach my $addr_header (qw(return-path from really-from sender)) {
        if ($full_header =~
            /(?:^|\n)$addr_header:[ \t]*(\S.*\n(?:[ \t].*\n)*)/i) {
            my($addr) = Mail::Address->parse($1);
            $from_addr = $addr->address if $addr;
            last if $from_addr;
        }
    }

    if (! $from_addr) {
        # Fall back to a placeholder address that contains no whitespace
        $from_addr = 'address@hidden';
    }

    if ($full_header =~ /(?:^|\n)date:\s*(\S.*\S)/i) {
        $time = str2time($1);
    }

    if (! $time) {
        # No Date header or we failed to parse it
        $time = time;
    }

    print("From ", $from_addr, " ", scalar(localtime($time)), "\n",
          ($opt_full_headers ? $full_header : $header),
          ($labels ? "X-Babyl-Labels: $labels\n" : ""), "\n",
          $_) || die "$whoami: error writing to stdout: $!\n";
}

close(STDOUT) || die "$whoami: Error closing stdout: $!\n";
