The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Plack::Middleware::ReverseProxyPath - adjust proxied env to match client-facing

SYNOPSIS

Generally you'll simple use the middle-ware:

  enable "ReverseProxy";
  enable "ReverseProxyPath";

Below is an elaborate example that includes both a dummy reverse proxy front-end and the back-end using ReverseProxyPath. Run with something like:

  PLACK_SERVER=Starman perl -x -Ilib ./lib/Plack/Middleware/ReverseProxyPath.pm

(Sample output below)

#!perl -MPlack::Runner #line 85 sub mw(&);

  use Plack::Builder;

  # Configure your reverse proxy (perlbal, varnish, apache, squid)
  # to send X-Forwarded-Script-Name and X-Traversal-Path headers.

  # This example just uses Plack::App::Proxy to demonstrate:
  sub proxy_builder {
    require Plack::App::Proxy;

    # imagine this is https://somehost/fepath/from
    mount "http://localhost/fepath/from" => builder {
      enable mw {
          my ($app, $env) = @_;

          # Headers for ReverseProxyPath
          $env->{'HTTP_X_FORWARDED_SCRIPT_NAME'} = '/fepath/from';
          $env->{'HTTP_X_TRAVERSAL_PATH'}        = '/bepath/to';

          # Headers for ReverseProxy (often already sent)
          $env->{'HTTP_X_FORWARDED_HOST'}        = 'somehost'; # pretending..
          $env->{'HTTP_X_FORWARDED_PORT'}        = 443;

          die "Need MP" if !$env->{'psgi.multiprocess'}
                        && !$env->{'psgi.multithread'};
          $app->($env);
      };
      Plack::App::Proxy->new(
            remote => 'http://0:5000/bepath/to' ) ->to_app;
    };
  };

  # In your Plack back-end
  my $app = builder {

    # /bepath/to/* is proxied (can also be accessed directly)
    mount "/bepath/to" => builder {        # base adjustments:
                                           # 1) http://0:5000/bepath/to/x
      # ReverseProxy fixes scheme/host/port
      enable "ReverseProxy";
                                           # 2) https://somehost/bepath/to/x
      # ReverseProxyPath uses new headers
      # fixes SCRIPT_NAME and PATH_INFO
      enable "ReverseProxyPath";
                                           # 3) https://somehost/fepath/from/x

      # $req->base + $req->path now is the client-facing url
      # so URLs, Set-Cookie, Location can work naively
      mount "/base" => \&echo_base;
      mount "/env"  => \&echo_env;
    };
    mount "/env" => \&echo_env;

    # proxy to myself to keep the synopsis short (needs >1 worker)
    proxy_builder();
  };

  # synopsis plumbing:
  sub echo_base { require Plack::Request;
      [200, [ qw(Content-type text/plain) ],
            [ Plack::Request->new(shift)->base . "\n" ] ]
  }
  sub echo_env {
      my ($env) = @_;
      [200, [ qw(Content-type text/plain) ],
            [ map { "$_: $env->{$_}\n" } keys %$env ] ]
  }
  sub mw(&) { my $code = shift;
    sub { my $app = shift; sub { $code->($app, @_); } } };

  Plack::Runner->new->run($app);
__END__

 # with ReverseProxyPath and ReverseProxy applied
 GET http://localhost:5000/fepath/from/base
 https://somehost/fepath/from/base

 # talking directly to the back-end
 GET http://localhost:5000/bepath/to/base
 http://localhost:5000/bepath/to/base

DESCRIPTION

Use case: reverse proxying /sub/path/ to http://0:5000/other/path/ . This middleware sits on the back-end and uses headers sent by the proxy to hide the proxy plumbing from the back-end app.

Plack::Middleware::ReverseProxy does the host, port and scheme.

Plack::Middleware::ReverseProxyPath adds handling of paths.

The goal is to allow proxied back-end apps to reconstruct and use the client-facing url. ReverseProxy does most of the work and ReverseProxyPath does the paths. The inner app can simply use $req->base to redirect, set cookies and the like.

I find the term reverse proxy leads to confusion, so I'll use front-end to refer to the reverse proxy (eg. squid) which the client hits first, and back-end to refer to the server that runs your PSGI application (eg. starman).

Plack::Middleware::ReverseProxyPath adjusts SCRIPT_NAME and PATH_INFO based on headers from a front-end so that it's inner app can pretend there is no proxy there. This is useful when you aren't proxying and entire server, but only a deeper path. In Apache terms:

  ProxyPass /mirror/foo/ http://localhost:5000/bar/

It should be used with Plack::Middleware::ReverseProxy which does equivalent adjustments to the scheme, host and port environment attributes.

Required Headers

In order for this middleware to perform the path adjustments you will need to configure your reverse proxy to send the following headers (as applicable):

X-Forwarded-Script-Name

The front-end prefix being forwarded FROM. This is the replacement.

The value of SCRIPT_NAME on the front-end.

X-Traversal-Path

The back-end prefix being forwarded TO. This is to be replaced.

This is the part of the back-end URI that is just plumbing which should be hidden from the app.

If you aren't forwarding to the root of a server, but to some deeper path, this contains the deeper path portion. So if you forward to http://localhost:8080/myapp, and there is a request for /article/1, then the full path forwarded to will be /myapp/article/1. X-Traversal-Path will contain /myapp.

Path Adjustment Logic

If there is either X-Traversal-Path or X-Forwarded-Script-Name, roughly:

  SCRIPT_NAME . PATH_INFO =~ s/^${X-Traversal-Path}/${X-Forwarded-Script-Name}/

The X-Traversal-Path prefix will be stripped from SCRIPT_NAME (borrowing from PATH_INFO if needed) and SCRIPT_NAME will be prefixed with X-Forwarded-Script-Name.

In the absence of reverse proxy headers, leave SCRIPT_NAME and PATH_INFO alone. This allows direct connections to the back-end to function. Also, leave REQUEST_URI alone with the old/original value.

Front-ends should clear client-sent X-Traversal-Path, and X-Forwarded-Script-Name (for security).

Note that while it is intended that this module operates on one directory segment at a time, that is not enforced at present. For example, /script_name adjusted with ( /script => /cgi ) would result in /cgi_name.

Examples

See the examples directory.

If you do use this with a new front-end then please send the configuration for inclusion.

Exceptions

If there are problems with the configuration or headers then a Plack::Middleware::HTTPException compatible exception will be thrown. It will be a 500 that stringifies with information about the bad headers. This might be considered sensitive information, in which case, you should catch and handle them.

TODO

 * Should REQUEST_URI be touched?
 * Plack::Middleware::Lint

LICENSE

This software is licensed under the same terms as Perl itself.

AUTHOR

Brad Bowman

Feedback from Chris Prather (perigrin)

SEE ALSO

Plack::Middleware::ReverseProxy

http://pythonpaste.org/wsgiproxy/ python middleware used as a template (although it uses X-Script-Name, instead).