NAME

URI::Fast - A fast(er) URI parser

SYNOPSIS

  use URI::Fast qw(uri);

  my $uri = uri 'http://www.example.com/some/path?fnord=slack&foo=bar';

  if ($uri->scheme =~ /^https?$/) {
    my @path  = $uri->path;
    my $fnord = $uri->param('fnord');
    my $foo   = $uri->param('foo');
  }

  if ($uri->path =~ /\/login/ && $uri->scheme ne 'https') {
    $uri->scheme('https');
    $uri->param('upgraded', 1);
  }

DESCRIPTION

URI::Fast is a faster alternative to URI. It is written in C and provides basic parsing and modification of a URI.

URI is an excellent module; it is battle-tested, robust, and handles many edge cases. That thoroughness has a cost: for simple tasks, such as inspecting the path or updating a single query parameter, URI is slower than it needs to be.

EXPORTED SUBROUTINES

uri

Accepts a URI string, minimally parses it, and returns a URI::Fast object.

iri

Similar to "uri", but returns a URI::Fast::IRI object. A URI::Fast::IRI differs from a URI::Fast in that UTF-8 characters are permitted and will not be percent-encoded when modified.

uri_split

Behaves (hopefully) identically to the uri_split function in URI::Split, but runs roughly twice as fast.

ATTRIBUTES

Unless otherwise specified, all attributes serve as full accessors, allowing the URI segment to be both retrieved and modified.

Each attribute further has a matching clearer method (clear_*) which unsets its value.

scheme

Gets or sets the scheme portion of the URI (e.g. http), excluding ://.

auth

The authorization section (the "authority" component in RFC 3986 terms) is composed of an optional username and password, the host name, and an optional port number:

  hostname.com
  someone@hostname.com
  someone:secret@hostname.com:1234

Setting this field may be done with a string (see the note below about "ENCODING") or a hash reference of individual field names (usr, pwd, host, and port). In both cases, the existing values are completely replaced by the new values and any values not present are deleted.

usr

The username segment of the authorization string. Updating this value alters "auth".

pwd

The password segment of the authorization string. Updating this value alters "auth".

host

The host name segment of the authorization string. May be a domain string or an IP address. Updating this value alters "auth".

port

The port number segment of the authorization string. Updating this value alters "auth".
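
To make the relationship between "auth" and its four segments concrete, here is a minimal sketch in plain Perl. The split_auth helper is hypothetical and is not part of URI::Fast (whose parser is written in C); it only illustrates how a string such as the examples under "auth" breaks down into usr, pwd, host, and port.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of URI::Fast: split an authorization string
# into the four fields named above. Fields that are absent come back undef.
sub split_auth {
    my ($auth) = @_;
    my ($usr, $pwd, $host, $port);

    # Everything before the last '@' is the user info, if present.
    if ($auth =~ s/^(.*)\@//) {
        ($usr, $pwd) = split /:/, $1, 2;
    }

    # A trailing ":digits" is the port; whatever remains is the host.
    $port = $1 if $auth =~ s/:(\d+)$//;
    $host = $auth;

    return { usr => $usr, pwd => $pwd, host => $host, port => $port };
}

my $fields = split_auth('someone:secret@hostname.com:1234');
printf "%s %s %s %s\n", @{$fields}{qw(usr pwd host port)};
# prints "someone secret hostname.com 1234"
```

The hash reference returned by this sketch has the same field names that "auth" accepts when it is set from a hash reference.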

path

In scalar context, returns the entire path string. In list context, returns a list of path segments, split by /.

The path may also be updated using either a string or an array ref of segments:

  $uri->path('/foo/bar');
  $uri->path(['foo', 'bar']);

query

In scalar context, returns the complete query string, excluding the leading ?. The query string may be set in several ways.

  $uri->query("foo=bar&baz=bat"); # note: no percent-encoding performed
  $uri->query({foo => 'bar', baz => 'bat'}); # foo=bar&baz=bat
  $uri->query({foo => 'bar', baz => 'bat'}, ';'); # foo=bar;baz=bat

In list context, returns a hash ref mapping query keys to array refs of their values (see "query_hash").

query_keys

Quickly scans the query string and returns a list of the unique parameter names it contains.

query_hash

Scans the query string and returns a hash ref of key/value pairs. Each value is an array ref, since a key may appear multiple times.
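
The shape of that return value can be illustrated with a short plain-Perl sketch. The sketch_query_hash function is hypothetical and is not how URI::Fast does it internally; details such as giving a key without "=" an empty-string value, and leaving decoded values as raw bytes, are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of URI::Fast: build the hash-of-array-refs
# shape described for query_hash from a raw query string.
sub sketch_query_hash {
    my ($query, $sep) = @_;
    $sep = '&' unless defined $sep;

    my %params;
    for my $pair (split /\Q$sep\E/, $query) {
        next if $pair eq '';
        my ($key, $value) = split /=/, $pair, 2;

        # Assumption of this sketch: a key with no '=' gets an empty value.
        $value = '' unless defined $value;

        # Decode percent-escapes in the key and value individually
        # (the result is raw bytes; UTF-8 decoding is omitted here).
        for ($key, $value) {
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        }

        # Repeated keys accumulate in the same array ref.
        push @{ $params{$key} }, $value;
    }
    return \%params;
}

my $hash = sketch_query_hash('foo=bar&foo=baz&fnord=slack');
print join(',', @{ $hash->{foo} }), "\n";   # prints "bar,baz"
print scalar(@{ $hash->{fnord} }), "\n";    # prints "1"
```

Because every value is an array ref, callers can treat single and repeated parameters uniformly.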

param

Gets or sets a parameter value. Setting a parameter value will replace existing values completely; the "query" string will also be updated. Setting a parameter to undef deletes the parameter from the URI.

  $uri->param('foo', ['bar', 'baz']);
  $uri->param('fnord', 'slack');

  my $value_scalar = $uri->param('fnord'); # fnord appears once
  my @value_list   = $uri->param('foo');   # foo appears twice
  my $value_croaks = $uri->param('foo');   # croaks; expected single value but foo has multiple

  # Delete 'foo'
  $uri->param('foo', undef);

An optional third parameter may be specified to control the character used to separate key/value pairs.

  $uri->param('foo', 'bar', ';'); # foo=bar
  $uri->param('baz', 'bat', ';'); # foo=bar;baz=bat

frag

The fragment section of the URI, excluding the leading #.

ENCODING

URI::Fast tries to do the right thing in most cases with regard to reserved and non-ASCII characters. URI::Fast will fully encode reserved and non-ASCII characters when setting individual values. However, the "right thing" is a bit ambiguous when it comes to setting compound fields like "auth", "path", and "query".

When setting these fields with a string value, reserved characters are expected to be present, and are therefore accepted as-is. However, any non-ASCII characters will be percent-encoded (since they are unambiguous and there is no risk of double-encoding them).

  $uri->auth('someone:secret@Ῥόδος.com:1234');
  print $uri->auth; # "someone:secret@%E1%BF%AC%CF%8C%CE%B4%CE%BF%CF%82.com:1234"

On the other hand, when setting these fields with a reference value, each field is fully percent-encoded:

  $uri->auth({usr => 'some one', host => 'somewhere.com'});
  print $uri->auth; # "some%20one@somewhere.com"

The same goes for return values. For compound fields returning a string, non-ASCII characters are decoded but reserved characters are not. When a compound field is returned as a list or reference of its deconstructed parts, both reserved and non-ASCII characters in each individual value are decoded.

encode

Percent-encodes a string for use in a URI. By default, both reserved characters (! * ' ( ) ; : @ & = + $ , / ? # [ ] %) and non-ASCII (UTF-8) characters are encoded.

A second (optional) parameter provides a string containing any characters the caller does not wish to be encoded. An empty string will result in the default behavior described above.

For example, to encode all characters in a query-like string except for those used by the query:

  my $encoded = URI::Fast::encode($some_string, '?&=');
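
The effect of the second parameter may be easier to see in a plain-Perl sketch. The sketch_encode function below is hypothetical and is not URI::Fast's C implementation. It assumes that every byte outside the RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~") is encoded unless the caller lists it as allowed; the real function's exact character set may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not URI::Fast's implementation: percent-encode every
# byte that is not an unreserved character, except for characters the caller
# lists in $allowed.
sub sketch_encode {
    my ($string, $allowed) = @_;
    $allowed = '' unless defined $allowed;

    # Work on UTF-8 bytes so non-ASCII characters become %XX sequences.
    utf8::encode($string) if utf8::is_utf8($string);

    # Character class of bytes to leave alone: unreserved plus allowed.
    my $safe = 'A-Za-z0-9\-._~' . quotemeta($allowed);
    $string =~ s/([^$safe])/sprintf('%%%02X', ord($1))/ge;
    return $string;
}

print sketch_encode('a b&c=d'), "\n";        # prints "a%20b%26c%3Dd"
print sketch_encode('a b&c=d', '&='), "\n";  # prints "a%20b&c=d"
```

Passing an empty string as the second argument leaves the allowed set empty, which matches the default behavior described above.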

decode

Decodes a percent-encoded string.

  my $decoded = URI::Fast::decode($some_string);
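
For comparison, decoding can be sketched in a few lines of plain Perl. The sketch_decode function is hypothetical and is not URI::Fast's implementation; it assumes the escapes represent UTF-8 bytes, consistent with the encoded output shown under "ENCODING".

```perl
use strict;
use warnings;

# Hypothetical helper, not URI::Fast's implementation: replace each %XX
# escape with its byte, then interpret the resulting bytes as UTF-8.
sub sketch_decode {
    my ($string) = @_;
    $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    utf8::decode($string);   # turn UTF-8 byte sequences into characters
    return $string;
}

print sketch_decode('some%20one'), "\n";      # prints "some one"
print length(sketch_decode('%CF%8C')), "\n";  # prints "1" (one character)
```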

SPEED

See URI::Fast::Benchmarks.

SEE ALSO

URI

The de facto standard.

Panda::URI

Written in C++ and purportedly very fast, but appears to only support Linux.

ACKNOWLEDGEMENTS

Thanks to ZipRecruiter for encouraging their employees to contribute back to the open source ecosystem. Without their dedication to quality software development this distribution would not exist.

AUTHOR

Jeff Ober <sysread@fastmail.fm>

COPYRIGHT AND LICENSE

This software is copyright (c) 2018 by Jeff Ober.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.