HTTP-Response-Encoding
# Revision history for HTTP-Response-Encoding
#
# $Id: Changes,v 0.6 2009/07/28 21:25:25 dankogai Exp dankogai $
#
$Revision: 0.6 $ $Date: 2009/07/28 21:25:25 $
! lib/HTTP/Response/Encoding.pm t/01-file.t
Addressed RT#47033:
new libwww-perl-5.827 release from 15.06.2009 breaks all tests
(Tested both on lwp5.826 and lwp5.830)
http://rt.cpan.org/Ticket/Display.html?47033
0.05 2007/05/12 09:24:15
! lib/HTTP/Response/Encoding.pm
removed method
- decoded_content() because HTTP::Message already has that
added methods
+ charset() -- returns the charset as-is
+ encoder() -- encoding object that can be used to decode
0.4 2007/04/20 05:40:37
! lib/HTTP/Response/Encoding.pm
When you require Carp, you should surround arguments with ().
Message-Id: <200704200454.l3K4sMHL008173@franz.ak.mind.de>
0.03 2007/04/18 04:50:40
! MANIFEST
+ t/t-null.html
forgot to add. sorry.
0.02 2007/04/17 13:56:12
! lib/HTTP/Response/Encoding.pm
t/01-file.pm
+ more descriptive error message for decoded_content().
+ test case for failure added
0.01 2007/04/17 13:14:24
+ *
First version.
Changes
MANIFEST
META.yml # Will be created by "make dist"
Makefile.PL
README
lib/HTTP/Response/Encoding.pm
t/00-load.t
t/01-file.t
t/boilerplate.t
t/pod-coverage.t
t/pod.t
t/t-euc-jp.html
t/t-iso-2022-jp.html
t/t-null.html
t/t-shiftjis.html
t/t-utf-8.html
--- #YAML:1.0
name: HTTP-Response-Encoding
version: 0.06
abstract: Adds encoding() to HTTP::Response
author:
- Dan Kogai <dankogai@dan.co.jp>
license: unknown
distribution_type: module
configure_requires:
ExtUtils::MakeMaker: 0
build_requires:
ExtUtils::MakeMaker: 0
requires:
Encode: 2
HTTP::Response: 0
Test::More: 0
no_index:
directory:
- t
- inc
generated_by: ExtUtils::MakeMaker version 6.54
meta-spec:
url: http://module-build.sourceforge.net/META-spec-v1.4.html
version: 1.4
Makefile.PL
use 5.008001;
use strict;
use warnings;
use ExtUtils::MakeMaker;
WriteMakefile(
NAME => 'HTTP::Response::Encoding',
AUTHOR => 'Dan Kogai <dankogai@dan.co.jp>',
VERSION_FROM => 'lib/HTTP/Response/Encoding.pm',
ABSTRACT_FROM => 'lib/HTTP/Response/Encoding.pm',
PL_FILES => {},
PREREQ_PM => {
'Encode' => 2.00,
'Test::More' => 0,
# 'HTTP::Message' => 5.827,
'HTTP::Response' => 0,
},
dist => { COMPRESS => 'gzip -9f', SUFFIX => 'gz', },
clean => { FILES => 'HTTP-Response-Encoding-*' },
);
NAME
HTTP::Response::Encoding - Adds encoding() to HTTP::Response
VERSION
$Id: README,v 0.2 2007/05/12 09:24:15 dankogai Exp $
SYNOPSIS
use LWP::UserAgent;
use HTTP::Response::Encoding;
my $ua = LWP::UserAgent->new();
my $res = $ua->get("http://www.example.com/");
warn $res->encoding;
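The same calls can be tried without a network round trip. The following is a minimal sketch, assuming HTTP::Response and this module are installed; the header value and the body are made up for illustration and do not come from a real server.

```perl
use strict;
use warnings;
use HTTP::Response;
use HTTP::Response::Encoding;

# Hand-built response; the Content-Type value and body are illustrative only.
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=EUC-JP' ],
    '<html><body><p>test</p></body></html>',
);

my $charset  = $res->charset;     # charset label found for the response
my $encoding = $res->encoding;    # canonical Encode name, or undef

print "charset:  ", ( defined $charset  ? $charset  : 'undef' ), "\n";
print "encoding: ", ( defined $encoding ? $encoding : 'undef' ), "\n";
```

Both values are checked with defined before printing because either method may return undef when no usable charset is found.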
EXPORT
Nothing.
METHODS
This module adds the following methods to HTTP::Response objects.
"$res->charset"
Returns the charset *exactly as it appears* in the "Content-Type:" header.
Note that the presence of a charset does not guarantee that the
response content is decodable via Encode.
To normalize the name, try
$res->encoder->mime_name; # with Encode 2.21 or above
or
use I18N::Charset;
# ...
mime_charset_name($res->encoding);
"$res->encoder"
Returns the corresponding Encode encoding object, or undef if none can be found.
"$res->encoding"
Returns the content encoding as its canonical name in Encode. Returns
undef if the encoding cannot be determined.
In most cases you are more likely to find the encoding after a GET
than after a HEAD. HTTP::Response is smart enough to parse
<meta http-equiv="Content-Type" content="text/html; charset=whatever"/>
but it needs the content to do so. If you are interested in the
encoding but do not want to retrieve the whole content, try something
like this:
my $req = HTTP::Request->new(GET => $uri);
$req->headers->header(Range => "bytes=0-4095"); # just 1st 4k
my $res = $ua->request($req);
warn $res->encoding;
"$res->decoded_content"
Discontinued since HTTP::Message already has this method.
See HTTP::Message for details.
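The relationship between a charset label, the canonical Encode name, and the MIME name can be explored with Encode alone. The sketch below assumes only core Perl; it calls Encode::find_encoding directly, which is what encoder does with the value of charset in the module source. The labels and the byte string are illustrative values, not taken from a real response.

```perl
use strict;
use warnings;
use Encode ();

# Charset labels as they might appear in a Content-Type header (illustrative).
my @labels = ( 'utf-8', 'Shift_JIS', 'euc-jp', 'x-no-such-charset' );

my %canonical;    # label => canonical Encode name, or undef if unknown
for my $label (@labels) {
    # encoder() hands a label like this to Encode::find_encoding
    my $enc = Encode::find_encoding($label);
    $canonical{$label} = $enc ? $enc->name : undef;

    if ( !$enc ) {
        print "$label: unknown to Encode, so encoder and encoding would be undef\n";
        next;
    }

    # mime_name needs Encode 2.21 or above, so check before calling it
    my $mime = $enc->can('mime_name') ? $enc->mime_name : '(not available)';
    print "$label: Encode name ", $enc->name, ", MIME name $mime\n";
}

# An encoding object can also decode raw bytes into Perl characters.
my $bytes  = "\xE6\xBC\xA2\xE5\xAD\x97";    # two kanji encoded as UTF-8
my $nbytes = length $bytes;                 # measured before decoding
my $text   = Encode::find_encoding('utf-8')->decode($bytes);
print "decoded ", length($text), " characters from $nbytes bytes\n";
```

With a response object, the same decoding step would be written against the object returned by encoder, after checking that it is defined.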
INSTALLATION
To install this module, run the following commands:
perl Makefile.PL
make
make test
make install
AUTHOR
Dan Kogai, "<dankogai at dan.co.jp>"
BUGS
Please report any bugs or feature requests to
"bug-http-response-encoding at rt.cpan.org", or through the web
interface at
<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=HTTP-Response-Encoding>.
I will be notified, and then you'll automatically be notified of
progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc HTTP::Response::Encoding
You can also look for information at:
* AnnoCPAN: Annotated CPAN documentation
<http://annocpan.org/dist/HTTP-Response-Encoding>
* CPAN Ratings
<http://cpanratings.perl.org/d/HTTP-Response-Encoding>
* RT: CPAN's request tracker
<http://rt.cpan.org/NoAuth/Bugs.html?Dist=HTTP-Response-Encoding>
* Search CPAN
<http://search.cpan.org/dist/HTTP-Response-Encoding>
ACKNOWLEDGEMENTS
GAAS for LWP.
MIYAGAWA for suggestions.
COPYRIGHT & LICENSE
Copyright 2007 Dan Kogai, all rights reserved.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
lib/HTTP/Response/Encoding.pm
package HTTP::Response::Encoding;
use warnings;
use strict;
our $VERSION = sprintf "%d.%02d", q$Revision: 0.6 $ =~ /(\d+)/g;
sub HTTP::Response::charset {
my $self = shift;
return $self->{__charset} if exists $self->{__charset};
if ($self->can('content_charset')){
# To suppress:
# Parsing of undecoded UTF-8 will give garbage when decoding entities
local $SIG{__WARN__} = sub {};
my $charset = $self->content_charset;
$self->{__charset} = $charset;
return $charset;
}
my $content_type = $self->headers->header('Content-Type');
return unless $content_type;
# Capture in list context so a failed match yields undef rather than a stale $1
my ($charset) = $content_type =~ /charset=([A-Za-z0-9_\-]+)/i;
$self->{__charset} = $charset;
}
sub HTTP::Response::encoder {
require Encode;
my $self = shift;
return $self->{__encoder} if exists $self->{__encoder};
my $charset = $self->charset or return;
my $enc = Encode::find_encoding($charset);
$self->{__encoder} = $enc;
}
sub HTTP::Response::encoding {
my $enc = shift->encoder or return;
$enc->name;
}
=head1 NAME
HTTP::Response::Encoding - Adds encoding() to HTTP::Response
=head1 VERSION
$Id: Encoding.pm,v 0.6 2009/07/28 21:25:25 dankogai Exp dankogai $
=cut
=head1 SYNOPSIS
use LWP::UserAgent;
use HTTP::Response::Encoding;
my $ua = LWP::UserAgent->new();
my $res = $ua->get("http://www.example.com/");
warn $res->encoding;
=head1 EXPORT
Nothing.
=head1 METHODS
This module adds the following methods to L<HTTP::Response> objects.
=over 2
=item C<< $res->charset >>
Returns the charset I<exactly as it appears> in the C<Content-Type:> header.
Note that the presence of a charset does not guarantee that the
response content is decodable via Encode.
To normalize the name, try
$res->encoder->mime_name; # with Encode 2.21 or above
or
use I18N::Charset;
# ...
mime_charset_name($res->encoding);
=item C<< $res->encoder >>
Returns the corresponding Encode encoding object, or undef if none can be found.
=item C<< $res->encoding >>
Returns the content encoding as its canonical name in L<Encode>.
Returns undef if the encoding cannot be determined.
In most cases you are more likely to find the encoding after a GET
than after a HEAD. HTTP::Response is smart enough to parse
<meta http-equiv="Content-Type" content="text/html; charset=whatever"/>
but it needs the content to do so.
If you are interested in the encoding but do not want to retrieve the
whole content, try something like this:
my $req = HTTP::Request->new(GET => $uri);
$req->headers->header(Range => "bytes=0-4095"); # just 1st 4k
my $res = $ua->request($req);
warn $res->encoding;
=item C<< $res->decoded_content >>
Discontinued since HTTP::Message already has this method.
See L<HTTP::Message> for details.
=back
=head1 INSTALLATION
To install this module, run the following commands:
perl Makefile.PL
make
make test
make install
=head1 AUTHOR
Dan Kogai, C<< <dankogai at dan.co.jp> >>
=head1 BUGS
Please report any bugs or feature requests to
C<bug-http-response-encoding at rt.cpan.org>, or through the web interface at
L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=HTTP-Response-Encoding>.
I will be notified, and then you'll automatically be notified of progress on
your bug as I make changes.
=head1 SUPPORT
You can find documentation for this module with the perldoc command.
perldoc HTTP::Response::Encoding
You can also look for information at:
=over 4
=item * AnnoCPAN: Annotated CPAN documentation
L<http://annocpan.org/dist/HTTP-Response-Encoding>
=item * CPAN Ratings
L<http://cpanratings.perl.org/d/HTTP-Response-Encoding>
=item * RT: CPAN's request tracker
L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=HTTP-Response-Encoding>
=item * Search CPAN
L<http://search.cpan.org/dist/HTTP-Response-Encoding>
=back
=head1 ACKNOWLEDGEMENTS
GAAS for L<LWP>.
MIYAGAWA for suggestions.
=head1 COPYRIGHT & LICENSE
Copyright 2007 Dan Kogai, all rights reserved.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
=cut
1; # End of HTTP::Response::Encoding
t/00-load.t
#!perl -T
use Test::More tests => 1;
BEGIN {
use_ok( 'HTTP::Response::Encoding' );
}
diag( "Testing HTTP::Response::Encoding $HTTP::Response::Encoding::VERSION, Perl $], $^X" );
t/01-file.t
#!perl -T
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response::Encoding;
use File::Spec;
use Encode;
use Cwd;
use URI;
use Test::More tests => 13;
my $ua = LWP::UserAgent->new;
my $cwd = getcwd;
#BEGIN{
# package LWP::Protocol;
# $^W = 0;
#}
for my $meth (qw/charset encoder encoding decoded_content/){
can_ok('HTTP::Response', $meth);
}
my %charset = qw(
UTF-8 utf-8-strict
EUC-JP EUC-JP
Shift_JIS SHIFT_JIS
ISO-2022-JP ISO-2022-JP
);
my %filename = qw(
UTF-8 t-utf-8.html
EUC-JP t-euc-jp.html
Shift_JIS t-shiftjis.html
ISO-2022-JP t-iso-2022-jp.html
);
for my $charset (sort keys %charset){
my $uri = URI->new('file://');
$uri->path(File::Spec->catfile($cwd, "t", $filename{$charset}));
my $res;
{
local $^W = 0; # to quiet LWP::Protocol
$res = $ua->get($uri);
}
die unless $res->is_success;
is $res->charset, $charset, "\$res->charset eq '$charset'";
my $canon = find_encoding($charset)->name;
is $res->encoding, $canon, "\$res->encoding eq '$canon'";
}
my $uri = URI->new('file://');
$uri->path(File::Spec->catfile($cwd, "t", "t-null.html"));
my $res = $ua->get($uri);
die unless $res->is_success;
if (defined $res->encoding){
is $res->encoding, "ascii", "res->encoding is ascii";
}else{
ok !$res->encoding, "res->encoding is undef";
}
t/boilerplate.t
#!perl -T
use strict;
use warnings;
use Test::More tests => 3;
sub not_in_file_ok {
my ($filename, %regex) = @_;
open my $fh, "<", $filename
or die "couldn't open $filename for reading: $!";
my %violated;
while (my $line = <$fh>) {
while (my ($desc, $regex) = each %regex) {
if ($line =~ $regex) {
push @{$violated{$desc}||=[]}, $.;
}
}
}
if (%violated) {
fail("$filename contains boilerplate text");
diag "$_ appears on lines @{$violated{$_}}" for keys %violated;
} else {
pass("$filename contains no boilerplate text");
}
}
not_in_file_ok(README =>
"The README is used..." => qr/The README is used/,
"'version information here'" => qr/to provide version information/,
);
not_in_file_ok(Changes =>
"placeholder date/time" => qr(Date/time)
);
sub module_boilerplate_ok {
my ($module) = @_;
not_in_file_ok($module =>
'the great new $MODULENAME' => qr/ - The great new /,
'boilerplate description' => qr/Quick summary of what the module/,
'stub function definition' => qr/function[12]/,
);
}
module_boilerplate_ok('lib/HTTP/Response/Encoding.pm');
t/pod-coverage.t
#!perl -T
use Test::More;
eval "use Test::Pod::Coverage 1.04";
plan skip_all => "Test::Pod::Coverage 1.04 required for testing POD coverage" if $@;
all_pod_coverage_ok();
t/pod.t
#!perl -T
use Test::More;
eval "use Test::Pod 1.14";
plan skip_all => "Test::Pod 1.14 required for testing POD" if $@;
all_pod_files_ok();
t/t-euc-jp.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=EUC-JP"/>
<title>Test</title>
</head>
<body>
<p>´Á»ú¡¢¥«¥¿¥«¥Ê¡¢¤Ò¤é¤¬¤Ê¤ÎÆþ¤Ã¤¿html.</p>
</body>
</html>
t/t-iso-2022-jp.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-2022-JP"/>
<title>Test</title>
</head>
<body>
<p>$B4A;z!"%+%?%+%J!"$R$i$,$J$NF~$C$?(Bhtml.</p>
</body>
</html>
t/t-null.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Test</title>
</head>
<body>
<p>The quick brown fox jumps over the black lazy dog.</p>
</body>
</html>
t/t-shiftjis.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS"/>
<title>Test</title>
</head>
<body>
<p>¿AJ^JiAÐçªÈÌüÁ½html.</p>
</body>
</html>
t/t-utf-8.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Test</title>
</head>
<body>
<p>æ¼¢åãã«ã¿ã«ããã²ãããªã®å
¥ã£ãhtml.</p>
</body>
</html>