Module Version: 0.03

NAME

HTML::Split - Split HTML by number of characters while preserving the DOM structure.

SYNOPSIS

  use HTML::Split;

  my $html = <<HTML;
  <div class="pkg">
  <h1>HTML::Split</h1>
  <p>Splitting HTML by number of characters.</p>
  </div>
  HTML

  my @pages = HTML::Split->split(html => $html, length => 50);

  # $pages[0] <div class="pkg">
  #           <h1>HTML::Split</h1>
  #           <p>Splittin</p></div>
  # $pages[1] <div class="pkg">
  #           <p>g HTML by number of characters.</p></div>

DESCRIPTION

HTML::Split is a module that splits HTML by number of characters while preserving the DOM structure.

On some mobile devices, mainly cell phones, the amount of data that can be fetched over HTTP is limited, so HTML has to be split into smaller pages.

This module provides a way to split HTML for such devices without breaking the DOM tree.

CLASS METHODS

split

Splits HTML text by number of characters and returns the resulting pages as a list. It accepts the following parameters as a hash:

html

The HTML string to split.

length

The length, in characters, of each page.

extend_tags

Regular expressions describing custom notations that must not be split. Each entry gives full, begin, and end patterns. For example, if your own markup uses '[E:foo]' to show an emoticon:

  extend_tags => [
      {
          full  => qr/\[E:[\w\-]+\]/,
          begin => qr/\[[^\]]*?/,
          end   => qr/[^\]]+\]/,
      },
  ]
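To see what the three patterns in this example match, the sketch below cuts a string in the middle of '[E:smile]' and tests each half. The helper cut_hits_tag and the \z / \A anchoring are assumptions made for illustration; how HTML::Split applies begin and end internally is not described here.

```perl
use strict;
use warnings;

# The three patterns from the extend_tags example.
my %tag = (
    full  => qr/\[E:[\w\-]+\]/,
    begin => qr/\[[^\]]*?/,
    end   => qr/[^\]]+\]/,
);

# Hypothetical helper, not part of HTML::Split: report whether cutting $text
# at offset $pos leaves a "begin" fragment at the end of the first half and
# an "end" fragment at the start of the second half that together form a
# complete notation.
sub cut_hits_tag {
    my ($text, $pos, $patterns) = @_;
    my ($full, $begin, $end) = @{$patterns}{qw(full begin end)};

    my $head = substr $text, 0, $pos;
    my $tail = substr $text, $pos;

    return 0 unless $head =~ /($begin)\z/;
    my $left = $1;
    return 0 unless $tail =~ /\A($end)/;
    my $right = $1;

    return "$left$right" =~ /\A$full\z/ ? 1 : 0;
}

my $text = 'Nice work [E:smile] everyone';

print cut_hits_tag($text, 14, \%tag), "\n";   # cut between "[E:s" and "mile]": prints 1
print cut_hits_tag($text, 5,  \%tag), "\n";   # cut inside plain text: prints 0
```

Under these assumptions, a splitter could move a cut that returns 1 to just before or just after the notation instead of breaking it.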

new

Creates an instance of HTML::Split. Accepts the same arguments as the split method.

INSTANCE METHODS

current_page

Gets or sets the current page.

total_pages

Returns the total number of pages.

next_page

Returns the next page number. If the next page doesn't exist, returns undef.

prev_page

Returns the previous page number. If the previous page doesn't exist, returns undef.

text

Returns the text of the current page.
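The instance methods can be combined to build page navigation. The following is a minimal sketch: nav_line is a hypothetical helper, not part of HTML::Split, and it relies only on the methods documented above. The usage shown in the trailing comment assumes that pages are numbered from 1, which this documentation does not state.

```perl
use strict;
use warnings;

# Hypothetical helper: build a one-line navigation summary from any object
# that offers current_page, total_pages, prev_page and next_page.
sub nav_line {
    my ($pager) = @_;

    my $prev = $pager->prev_page;   # undef when there is no previous page
    my $next = $pager->next_page;   # undef when there is no next page

    my @parts;
    push @parts, "prev: $prev" if defined $prev;
    push @parts, sprintf('page %d of %d', $pager->current_page, $pager->total_pages);
    push @parts, "next: $next" if defined $next;

    return join ' | ', @parts;
}

# Typical use, assuming HTML::Split is installed and $html holds the document:
#
#   my $pager = HTML::Split->new(html => $html, length => 50);
#   $pager->current_page(1);        # assumption: the first page is number 1
#   print $pager->text, "\n";
#   print nav_line($pager), "\n";
```

Because prev_page and next_page return undef at the ends, the helper simply omits the missing link rather than checking page numbers itself.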

AUTHOR

Hiroshi Sakai <ziguzagu@cpan.org>

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.