The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
package Mail::Google::Procmailrc;
use 5.010000;
use strict;
use warnings;
use XML::Fast;
use Data::Dumper;
use Carp;
require Exporter;
our @ISA = qw(Exporter);
our %EXPORT_TAGS = ( 'all' => [ qw(  ) ] );
our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );
our @EXPORT = qw( );
our $VERSION = '0.022';
=head1 NAME

Mail::Google::Procmailrc - Perl module that allows easy conversion from Gmail mail filters to Procmail rules

=head1 SYNOPSIS

  use Mail::Google::Procmailrc;
  my $o = Mail::Google::Procmailrc->new(<path-to-mail-folders>);
  $o->convert(<google-mail-filter-path>, <procmail-rules-output-path>);

or, you can use it with the helper script

  google-to-procmailrc --input ./mailFilters.xml --output test-procmail.rc --mdir-path $HOME/somemail

=head1 DESCRIPTION

You may want at some point, for some reason to export all your gmail mail rules as
procmail filters. 

If you use a mail setup involving OfflineIMAP fetching multiple folders(labels) from
Google, you'll notice that there is a certain overhead involved.

That's because OfflineIMAP needs to tell the IMAP server, which messages it has, in
order to retrieve only the ones that it doesn't, and then for each new message fetched
it also needs to update the SQLite local dbs with the statuses of the new messages.
And it has to do that for every folder(label). That highly depends on which labels you
fetch with OfflineIMAP.

If you want to make the sync faster, you can consider only fetching the "[Gmail]/All Mail"
folder or the "INBOX" folder, but then you still have to solve mail triage.

Procmail is quite good for mail triage, but the mailFilters.xml file that you can
export from Gmail is not suited for use with Procmail AFAIK.

This module aims to solve that problem by converting mailFilters.xml to a set of
procmail rules (effectively a procmailrc file).

Normally, you'd use the script that comes with this module to migrate your Gmail rules
to procmail and then you can just maintain the procmail rules. At least that's how I(plan to) use it.

=cut

our $tmplt1= qq{:%s
* ^From:.*%s.*
%s

};

our $tmplt2 = qq{:%s
* ^To: .*%s.*
%s

};

our $tmplt3 = qq{:%s
* ^Subject: .*%s.*
%s
};

our $tmplt4 = qq{:%sB
* .*%s.*
%s

};

our $tmplt_catchall = qq{:0
*.*
%s

};


=head1 METHODS


=head2 new($folders_path, $archive_dir, $trash_dir)

The constructor receives $folders_path which is the directory where all the
mail folders resulting from the triage will be placed. The other two, archive and trash
will be used with the same role as archiving e-mail, and the B<[Gmail]/Trash> folder.

=cut

sub new {
    #my ($class_name,$folders_path,$archive_dir,$trash_dir) = @_;
    my ($class_name,$c) = @_;
    my $o = bless {},$class_name;
    # TODO check for folders_path
    $o->{folders_path}     = $c->{mdir_path} ;
    $o->{archive_dir}      = $c->{archive_dir} // 'archive';
    $o->{trash_dir}        = $c->{trash_dir}   // 'trash';
    $o->{inbox_dir}        = $c->{inbox_dir}   // 'inbox';
    $o->{debug}            = $c->{debug}       // 0;
    $o->{debug_rule_count} = 0 if $o->{debug};
    $o->{labels_found}     = {};
    return $o;
};

=head2 generate_create_dirs_script

This method will generate a script called B<create.sh> . This script
will create all the folders. Afterwards, there will be a one-to-one mapping between
the labels present in your mailFilters.xml and folders on disk. They will be empty, but
after running procmail they will be filled with the mails that correspond to them.

=cut

sub generate_create_dirs_script {
    my ($self) = @_;
    my $create_script = "create.sh";
    open my $fh,">$create_script";
    print $fh "#!/bin/bash\n";
    my $p = $self->{folders_path};
    my @labels;
    push @labels, (keys %{ $self->{labels_found} } );
    push @labels, $self->{archive_dir};
    push @labels, $self->{trash_dir};
    push @labels, $self->{inbox_dir};
    for my $l ( @labels ) {
        print $fh qq{mkdir -p "$p/$l"\n};
    };
    close $fh;
    `chmod +x $create_script`;
};

=head2 generate_rule_catchall()

Returns a catchall rule for the inbox

=cut

sub generate_rule_catchall {
    my ($self) = @_;
    my $dir = $self->{folders_path}.'/'.$self->{inbox_dir}.'/';
    $dir = qq{"$dir"};
    $self->apply_debug_info(\$dir);
    return sprintf($tmplt_catchall,$dir);
};

=head2 convert($input,$output)

This method takes the input XML and uses L</adapt> to convert it to
the procmail rules file.

=cut

sub convert {
    my ($self,$i,$o) = @_;
    my $google_filters_xml = undef;
    {
        local $/ = undef;
        open my $fh,"<$i";
        $google_filters_xml = <$fh>;
        close $fh;
    };
    my $xml_nested = xml2hash($google_filters_xml);
    my $buf = $self->adapt($xml_nested);
    open my $fh,">$o";
    print $fh $buf  ;
    close $fh;
    $self->generate_create_dirs_script;
};

=head2 adapt($x)

Receives as parameter a nested hash structure generated by XML::Fast from parsing
the mailFilters.xml file, then traverses that structure. This is where the conversion
happens.

=cut

sub adapt {
    my ($self,$x) = @_;
    my $buf;
    for my $o (@{ $x->{feed}->{entry} }) {
        next if $o->{title} ne 'Mail Filter';
        next if !exists $o->{"apps:property"};
        next if ref($o->{"apps:property"}) ne "ARRAY";
        my $adapt_hash = {};
        for my $p (@{ $o->{"apps:property"} }) {
            my $key = $p->{'-name' };
            my $val = $p->{'-value'};
            $adapt_hash->{$key} = $val;
        };
        $buf .= $self->adapt_rule($adapt_hash);
    };

    $buf .= $self->generate_rule_catchall();
    return $buf;
};

=head2 collect_labels

Collects labels found so that B<create.sh> can be generated. See the L</generate_create_dirs_script> method for
details.

=cut

sub collect_label {
    my ($self,$l) = @_;
    $self->{labels_found}->{$l} = 1;
};

=head2 compute_flag()

Generates appropriate flag if the rule chain has not ended yet.

=cut

sub compute_flag {
    my ($self) = @_;
    my $r = $self->{rules_remaining};
    my $m = $self->{multi_rule};
    my $g = $self->{goes_to_inbox};
    return "0 wc" if $g;
    if($m) {
        return "0 wc" if $r  > 1;
        return "0"  if $r == 1;
        croak "[E] r < 0 ?!";
    } else {
        return "0";
    };
};

=head2 rule_from

Generic "From:" rule conversion.

=cut

sub rule_from {
    my ($self, $h) = @_;
    my $buf = "";
    my $rules_remaining = $self->{rules_remaining};
    if(exists $h->{from} && exists $h->{label} && $rules_remaining > 0) {
        my $dir  = $self->{folders_path}.'/'.$h->{label}.'/';
        my $flag = $self->compute_flag;
        $dir = qq{"$dir"};
        $self->apply_debug_info(\$dir);
        $buf     = sprintf($tmplt1, $flag, $h->{from},$dir);
        $self->{rules_remaining}--;
    };
    return $buf;
};


=head2 apply_debug_info()

Adds a B<X-Procmail-Debug> header to the e-mail so you can trace back to the rule
that matched it.

This is triggered through the B<--debug> parameter.

=cut


sub apply_debug_info {
    my ($self,$refdir) = @_;
    if($self->{debug} && ref($refdir) eq "SCALAR") {
        my $rule_no = $self->{debug_rule_count};
        my $olddir = $$refdir;
        $$refdir = qq<
            {
               :0 fb
               | formail -I "X-Procmail-Debug: $rule_no"

               :0 a:
               $olddir
            }

>;
        $self->{debug_rule_count}++;
    };
};


=head2 rule_to

Generic "To:" rule conversion.

=cut

sub rule_to {
    my ($self, $h) = @_;
    my $buf = "";
    my $rules_remaining = $self->{rules_remaining};
    if(exists $h->{to}   && exists $h->{label} && $rules_remaining > 0) {
        my $dir = $self->{folders_path}.'/'.$h->{label}.'/';
        my $flag = $self->compute_flag;
        $dir = qq{"$dir"};
        $self->apply_debug_info(\$dir);
        $buf    = sprintf($tmplt2,$flag,$h->{to},$dir);
        $self->{rules_remaining}--;
    };
    return $buf;
};

sub rule_body {
    my ($self, $h) = @_;
    my $buf = "";
    my $rules_remaining = $self->{rules_remaining};
    if(exists $h->{hasTheWord}  && exists $h->{label} && $rules_remaining > 0) {
        my $dir = $self->{folders_path}.'/'.$h->{label}.'/';
        my $flag = $self->compute_flag;
        $dir = qq{"$dir"};
        $self->apply_debug_info(\$dir);
        $buf    = sprintf($tmplt4,$flag,$h->{hasTheWord},$dir);
        $self->{rules_remaining}--;
    };
    return $buf;
};

sub rule_subject {
    my ($self, $h) = @_;
    my $buf = "";
    my $rules_remaining = $self->{rules_remaining};
    if(exists $h->{subject}   && exists $h->{label} && $rules_remaining > 0) {
        my $dir = $self->{folders_path}.'/'.$h->{label}.'/';
        my $flag = $self->compute_flag;
        $dir = qq{"$dir"};
        $self->apply_debug_info(\$dir);
        $buf    = sprintf($tmplt3,$flag,$h->{subject},$dir);
        $self->{rules_remaining}--;
    };
    return $buf;
};


=head2 rule_archive

Uses the L</rule_to> , L</rule_from>, L</rule_subject>, L</rule_body> methods to archive e-mail.

=cut

sub rule_archive {
    my ($self, $h) = @_;
    my $buf = "";
    if($h->{'shouldArchive'}) {
        $h->{label} = $self->{archive_dir};
        $buf .= $self->rule_from($h);
        $buf .= $self->rule_to($h);
        $buf .= $self->rule_subject($h);
        $buf .= $self->rule_body($h);
    };
    return $buf;
};


=head2 rule_trash

Same as L<rule_archive> but for trash.

=cut

sub rule_trash {
    my ($self, $h) = @_;
    my $buf = "";
    if($h->{'shouldTrash'}) {
        $h->{label} = $self->{trash_dir};
        $buf .= $self->rule_from($h);
        $buf .= $self->rule_to($h);
        $buf .= $self->rule_subject($h);
        $buf .= $self->rule_body($h);
    };
    return $buf;
};

=head2 check_multiple_rules($h)

Checks if there are multiple locations where the mail
should be placed. Stores whether there are multiple such
places and how many of them are. This will later be used in order to

=cut

sub check_multiple_rules {
    my ($self,$h) = @_;
    my $count = 0;
    $count += defined($h->{shouldTrash}  );
    $count += defined($h->{shouldArchive});
    $count += defined($h->{hasTheWord});
    if(defined($h->{label})) {
        $count += defined($h->{from});
        $count += defined($h->{to}  );
    };
    $self->{rules_remaining} = $count;
    $self->{multi_rule}      = $count > 1;

    # if goes_to_inbox is true, the message will slide down all the way to inbox
    $self->{goes_to_inbox}   = !defined($h->{shouldArchive});
};


=head2 escape_fields

Escape the field values used for the procmail filters

=cut

sub escape_fields {
    my ($self,$h) = @_;

    for(qw/from to subject/) {
        next if ! defined($h->{$_});
        $h->{$_} =~ s/\[/\\[/g;
        $h->{$_} =~ s/\]/\\]/g;
        $h->{$_} =~  s/-/\\-/g;
    };
};

=head2 adapt_rule($h)

Converts a Gmail mail filter to procmail rules.
Returns a string with the procmail rule.

=cut

sub adapt_rule {
    my ($self,$h) = @_;
    my $buf = "";

    $self->check_multiple_rules($h);
    $self->collect_label($h->{label}) if($h->{label});
    $self->escape_fields($h);

    $buf .= $self->rule_from($h);
    $buf .= $self->rule_to($h);
    $buf .= $self->rule_body($h);
    $buf .= $self->rule_subject($h);
    $buf .= $self->rule_archive($h);
    $buf .= $self->rule_trash($h);

    if($buf ne "") {
        $buf .= "\n#".("="x70)."\n";
    };

    return $buf;
};


1;
__END__
=head1 NOTES

If you decide to use B<[Gmail]/All Mail> as the folder you sync and then use procmail
to run on it, you'll have to deal with the Spam (maybe spamassassin would help there).

From that point of view it's probably easier to just use B<INBOX>.

Currently this module only has functionality for converting some of the gmail rules.

=head1 SEE ALSO

=over 1

=item * L<Procmail Documentation Project|http://pm-doc.sourceforge.net/doc/>

=item * L<Exporting Gmail mail filters|http://webapps.stackexchange.com/a/3643>

=item * L<Synchronization that OfflineIMAP does|https://github.com/OfflineIMAP/offlineimap/blob/41cb0f577f6921a644d0c4c1ac23dd391270fee7/docs/doc-src/FAQ.rst#115what-is-the-uid-validity-problem-for-folder>

=item * L<spamassassin|https://spamassassin.apache.org/>

=back

=head1 BUGS

Please report bugs using the L<rt.cpan.org queue|https://rt.cpan.org/Public/Dist/Display.html?Name=Mail-Google-Procmailrc>.

=head1 PATCHES

Patches are welcome, either in the form of pull-requests on the L<github repo|https://github.com/wsdookadr/p5-Mail-Google-Procmailrc> or
in the form of patches on L<cpan's request tracker|http://rt.cpan.org>

=head1 AUTHOR

Stefan Petrea, E<lt>stefan@garage-coding.comE<gt>

=head1 COPYRIGHT AND LICENSE

Copyright (C) 2014 by Stefan Petrea

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.10.0 or,
at your option, any later version of Perl 5 you may have available.

=cut