=encoding utf8

=head1 NAME

Mail::Message::Construct::Rebuild - modify a Mail::Message

=head1 SYNOPSIS

 my $cleanup = $msg->rebuild;

=head1 DESCRIPTION

Modifying existing messages is a pain, certainly if this has to be
done in an automated fashion.  The problems are especially had when
multiparts have to be created or removed.  The L<rebuild()|Mail::Message::Construct::Rebuild/"Constructing a message"> method
tries to simplify this task and add some standard features.

=head1 METHODS

=head2 Constructing a message

=over 4

=item $obj-E<gt>B<rebuild>(%options)

Reconstruct an existing message into something new.  Returned is a new
message when there were modifications made, C<undef> if the message has
no body left, or the original message when no modifications had to be
made.

Examples of use: you have a message which only contains html, and you
want to translate it into a multipart which contains the original html
and the textual translation of it.  Or, you have a message with parts
flagged to be deleted, and you want those changes be incorparted in the
memory structure.  Another possibility: clear all the resent groups
(see L<Mail::Message::Head::ResentGroup|Mail::Message::Head::ResentGroup>) from the header, before it is
written to file.

Reconstructing is a hazardous task, where multi level multiparts and
nested messages come into play.  The rebuild method tries to simplify
handing these messages for you.

 -Option         --Default
  extra_rules      []
  keep_message_id  <false>
  rules            <see text>

=over 2

=item extra_rules => ARRAY

The standard set of rules, which is the default for the C<rules> option,
is a moderest setting.  In stead of copying that list into a full set
of rules of your own, you can also specify only some additional rules
which will be prependend to the default rule set.

The order of the rules is respected, which means that you do not always
need to rewrite the whole rule is (see C<rule> option).  For instance,
the extra rule of C<removeDeletedParts> returns an C<undef>, which
means that it cancels the effect of the default rule C<replaceDeletedParts>.

=item keep_message_id => BOOLEAN

The message-id is an unique identification of the message: no two messages
with different content shall exist anywhere.  However in practice, when
a message is changed during transmission, the id is often incorrectly
not changed.  This may lead to complications in application which see
both messages with the same id.

=item rules => ARRAY

The ARRAY is a list of rules, which each describe an action which will
be called on each part which is found in the message.  Most rules
probably won't match, but some will bring changes to the content.
Rules can be specified as method name, or as code reference.  See the
L</DETAILS> chapter in this manual page, and L<recursiveRebuildPart()|Mail::Message::Construct::Rebuild/"Internals">.

By default, only the relatively safe transformations are performed:
C<replaceDeletedParts>, C<descendMultiparts>, C<descendNested>,
C<flattenMultiparts>, C<flattenEmptyMultiparts>.  In the future, more
safe transformations may be added to this list.

=back

example: 

 # remove all deleted parts
 my $cleaned = $msg->rebuild(keep_message_id => 1);
 $folder->addMessage($cleaned) if defined $cleaned;

 # Replace deleted parts by a place-holder
 my $cleaned = $msg->rebuild
   ( keep_message_id => 1
   , extra_rules => [ 'removeEmpty', 'flattenMultiparts' ]
   );

=back

=head2 Internals

=over 4

=item $obj-E<gt>B<recursiveRebuildPart>($part, %options)

 -Option--Default
  rules   <required>

=over 2

=item rules => ARRAY-OF-RULES

Rules are method names which can be called on messages and message parts
objects.  The ARRAY can also list code references which can be called.
In any case, each rule will be called the same way:

 $code->(MESSAGE, PART)

The return can be C<undef> or any complex construct based on a
L<Mail::Message::Part|Mail::Message::Part> or coerceable into such a part.  For each part,
all rules are called in sequence.  When a rule returns a changed object,
the rules will start all over again, however C<undef> will immediately
stop it.

=back

=back

=head1 DETAILS

=head2 Rebuilding a message

Modifying an existing message is a complicated job.  Not only do you need
to know what you are willing to change, but you have to take care about
multiparts (possibly nested in multiple levels), rfc822 encapsulated
messages, header field consistency, and so on.  The L<rebuild()|Mail::Message::Construct::Rebuild/"Constructing a message"> method
let you focus on the task, and takes care of the rest.

The L<rebuild()|Mail::Message::Construct::Rebuild/"Constructing a message"> method uses rules to transform the one message into an
other.  If one or more of the rules apply, a new message will be returned.
A simple numeric comparison tells whether the message has changed.  For
example

 print "No change"
    if $message == $message->rebuild;

Transformation is made with a set of rules.  Each rule performs only a
small step, which makes is easily configurable.  The rules are ordered,
and when one makes a change to the result, the result will be passed
to all the rules again until no rule makes a change on the part anymore.
A rule may also return C<undef> in which case the part will be removed
from the (resulting) message.

=head3 General rules

This sections describes the general configuration rules: all quite straight
forward transformations on the message structure.  The rules marked with (*)
are used by default.

=over 4

=item * descendMultiparts (*)

Apply the rules to the parts of (possibly nested) multiparts, not only to
the top-level message.

=item * descendNested (*)

Apply the rules to the C<message/rfc822> encapsulated message as well.

=item * flattenEmptyMultiparts (*)

Multipart messages which do not have any parts left are replaced by
a single part which contains the preamble, epilogue and a brief
explanation.

=item * flattenMultiparts (*)

When a multipart contains only one part, that part will take the place of
the multipart: the removal of a level of nesting.  This way, the preamble
and epilogue of the multipart (which do not have a meaning, officially)
are lost.

=item * flattenNesting

Remove the C<message/rfc822> encapsulation.  Only the content related
lines of the encapsulated body are preserved one level higher.  Other
information will be lost, which is often not too bad.

=item * removeDeletedParts

All parts which are flagged for deletion are removed from the message
without leaving a trace.  If a nested message is encountered which has
its encapsulated content flagged for deletion, it will be removed as
a whole.

=item * removeEmptyMultiparts

Multipart messages which do not have any parts left are removed.  The
information in preamble and epiloge is lost.

=item * removeEmptyBodies

Simple message bodies which do not contain any lines of content are
removed.  This will loose the information which is stored in the
headers of these bodies.

=item * replaceDeletedParts (*)

All parts of the message which are flagged for deletion are replace
by a message which says that the part is deleted.

=back

You can specify a selection of these rules with L<rebuild(rules)|Mail::Message::Construct::Rebuild/"Constructing a message"> and
L<rebuild(extra_rules)|Mail::Message::Construct::Rebuild/"Constructing a message">.

=head3 Conversion rules

This section describes the rules which try to be smart with the
content.  Please contribute with ideas and implementations.

=over 4

=item * removeHtmlAlternativeToText

When a multipart alternative is encountered, which contains both a
plain text and an html part, then the html part is deleted.
Especially useful in combination with the C<flattenMultiparts> rule.

=item * textAlternativeForHtml

Any C<text/html> part which is not accompanied by an alternative
plain text part will have one added.  You must have a working
L<Mail::Message::Convert::HtmlFormatText|Mail::Message::Convert::HtmlFormatText>, which means that
HTML::TreeBuilder and HTML::FormatText  must be installed on
your system.

=item * removeExtraAlternativeText

[2.110] When a multipart alternative is encountered, deletes all its parts
except for the last part (the preferred part in accordance
with RFC2046). In practice, this normally results in the alternative
plain text part being deleted of an html message. Useful in combination
with the C<flattenMultiparts> rule.

=back

=head3 Adding your own rules

If you have designed your own rule, please consider contributing this
to Mail::Box; it may be useful for other people as well.

Each rule is called

 my $new = $code->($message, $part, %options)

where the C<%options> are defined by the C<rebuild()> method internals. At
least the C<rules> option is passed, which is a full expansion of all
the rules which will be applied.

Your subroutine shall return C<$part> if no changes are needed,
C<undef> if the part should be removed, and any newly constructed
C<Mail::Message::Part> when a change is required.  It is easiest to
start looking at the source code of this package, and copy from a
comparible routine.

When you have your own routine, you simply call:

 my $rebuild_message = $message->rebuild
  ( extra_rules => [ \&my_own_rule, 'other_rule' ] );

=head1 DIAGNOSTICS

=over 4

=item Error: No rebuild rule $name defined.

=back

=head1 SEE ALSO

This module is part of Mail-Box distribution version 2.112,
built on March 14, 2014. Website: F<http://perl.overmeer.net/mailbox/>

=head1 LICENSE

Copyrights 2001-2014 by [Mark Overmeer]. For other contributors see ChangeLog.

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
See F<http://www.perl.com/perl/misc/Artistic.html>