
NAME

XHTML::Util::Cookbook

Strip all HTML

Destructive

 print $xu->strip_tags(join(",", $xu->tags));

Note that this is a destructive action. The tags are gone from the object.

Non-destructive

 print $xu->text;

Remember that you have access to the underlying XML::LibXML::Document through the doc and root methods, so the above is really just a convenience shortcut for:

 print $xu->root->textContent;

This is non-destructive. The tags are still in the object.

Bag it

Strip scripts

Keeping the script content

 $xu->strip_tags("script");

Removing tag and its content

 $xu->remove("script");

Strip links, leaving text

 $xu = XHTML::Util->new(\q{Click <a href="#">here</a>});
 print $xu->strip_tags("a");

Strip external (non-relative) links, leaving text

 print $xu->strip_tags('a[href^="http"]');

Wrap pre content

Long lines in <pre> elements can break a layout, or overflow their container and become unreadable.
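This recipe has no code of its own yet. One possible approach, given as a rough sketch and not as part of XHTML::Util's API, is to work on the underlying XML::LibXML::Document directly and re-wrap the text of each pre element with the core Text::Wrap module. The helper name wrap_pre and the 72 column width are assumptions made for illustration. Elements that contain child markup are skipped, because replacing their text would flatten that markup.

```perl
use strict;
use warnings;
use XML::LibXML;
use Text::Wrap qw(wrap);

# Assumption: 72 is an arbitrary illustrative width. Text::Wrap keeps
# every output line at most $columns - 1 characters long.
$Text::Wrap::columns  = 72;
# Do not convert runs of leading spaces into tabs inside <pre>.
$Text::Wrap::unexpand = 0;

# Hypothetical helper, not an XHTML::Util method.
# Takes an XML::LibXML::Document and wraps text-only <pre> elements in place.
sub wrap_pre {
    my ($doc) = @_;
    # local-name() matches <pre> whether or not the document is namespaced.
    for my $pre ( $doc->findnodes('//*[local-name()="pre"]') ) {
        # Skip <pre> elements with child elements; rewriting their
        # text content would discard the inline markup.
        my @child_elements = $pre->findnodes('*');
        next if @child_elements;

        my $wrapped = wrap( '', '', $pre->textContent );
        $pre->removeChildNodes;
        $pre->appendText($wrapped);
    }
    return $doc;
}

# Stand-alone demonstration on a small document.
my $doc = XML::LibXML->load_xml(
    string => '<div><pre>' . join( ' ', ('word') x 40 ) . '</pre></div>' );
wrap_pre($doc);
print $doc->documentElement->toString, "\n";
```

With an XHTML::Util object, the same helper could presumably be called as wrap_pre($xu->doc), assuming doc returns the XML::LibXML::Document as described above. Text::Wrap breaks at whitespace and splits words longer than the line width, so content whose exact spacing matters (source code, ASCII art) may be better handled with CSS instead.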

Downgrade headers

To do.

Transform text

To do.

Custom tags

To do.

SEE ALSO

XHTML::Util.