NAME

cpansite -- extend CPAN with private packages

SYNOPSIS

 cpansite version
 cpansite [OPTIONS] index
 cpansite [OPTIONS] mirror PACKAGE

  OPTIONS:                                  via %ENV:
    --verbose  -v -vv -vvv --mode=DEBUG
    --no-lazy       redo everything
    --cpan <url>    some CPAN mirror        CPANSITE_GLOBAL
    --env-proxy     read additional proxy settings
    --site <dir>    local archive directory CPANSITE_LOCAL
    --stand-alone   no fallback to global CPAN
    --no-undefs     do not include "undef" versions in index

DESCRIPTION

WARNING: A lot has changed with the 1.01 release. Please read more about these changes in the file "explain_101.txt" included in the distribution.

The cpansite script is used to create your own CPAN server. The logic is implemented in CPAN::Site::Index which you may use directly. You only need to install this module on one server in your network.

There are two kinds of local CPANs which can be constructed with this software:

1. local CPAN with fallback to the global CPAN

When you generate a new index for your local set-up, the default behavior is to merge that knowledge with the global CPAN. When you install a module on a client, it will first attempt to fetch it from your own set-up. If not found, it will automatically continue to look at the global CPAN.

2. pure local CPAN, without fallback

When you choose to generate the index without fallback, the installation of a module will fail when you do not have a local copy of the module in your set-up. You can use the mirror action to collect the latest version of a module into your own structure.

Indexing options

The following options are available with all actions:

--verbose -v -vv -vvv --mode=DEBUG

Produce verbose output via Log::Report.

--site <dir> or -s <dir> or $CPANSITE_LOCAL

The location of your local CPAN archive set-up.

Example:

  export CPANSITE_LOCAL="/www/websites/cpan.example.com"
  cpansite index

  cpansite --site $CPANSITE_LOCAL index  # alternative

--cpan <url> or -c <url> or $CPANSITE_GLOBAL

Update the list of "real" CPAN modules regularly (daily or more often) from this URL. By default, ftp://ftp.cpan.org is used, which redirects to a server close to you.

--env-proxy

Let LWP::UserAgent read the proxy settings from environment variables. See the env_proxy method in the LWP::UserAgent manual page.

--stand-alone or -a

The "real" CPAN list is not included. Use this, for instance, when you have downloaded all the releases you need from CPAN and do not want unexpected extra downloads. Without this option, your downloaded versions still prevail over newer releases on CPAN, but you may end up downloading modules from the global CPAN that you did not expect.

--no-lazy or --lazy or -l

Try to avoid redo-ing everything. By default, the indexer is lazy: it will process only new distributions. When not lazy, all distributions on the local disk are processed and a new table is created. The default of this option was reversed with release 1.00 of CPAN::Site.

--no-undefs or --undefs or -u

Whether to include package names with "undef" version in the packages list. Those packages cannot be used for dependencies, so are hardly useful.

DETAILS

Configuring the Clients

To reach your own CPAN archive, clients have to be given its URL explicitly. Add it to the urllist option in your CPAN.pm configuration file (usually ~/.cpan/CPAN/MyConfig.pm). Since release 1.01, there is no need to install the CPAN::Site software on your clients.

You probably also want to set the variable index_expire (expressed in days) to a very short interval: clients should reload your local index as soon as possible rather than wait a day, so that a new local release becomes visible to them right after it is added to your local index.

You may also consider having the CPAN install cache cleaned by the system. Certainly when you enlarge the cache (required for more complex recursive installations), it is nice to have it removed after a (short) while. Set keep_source_where to a temporary directory.

Example for ~/.cpan/CPAN/MyConfig.pm

 $CPAN::Config =
  { ...
  , index_expire      => 1/288     # in days: 5 minutes
  , urllist           => [ $MYCPAN_URL, $BIGCPAN_MIRROR ]
  , keep_source_where => '/tmp/cpan-cache'
  , build_cache       => 100       # MegaByte cache
  , ...
  };

To avoid manually editing the CPAN config file, you can also add $MYCPAN_URL from the CPAN shell:

  cpan> o conf urllist unshift $MYCPAN_URL
  cpan> o conf index_expire 0.001  # 86 seconds
  cpan> o conf commit
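
Because index_expire is expressed in days, short intervals are easy to get wrong. The following shell sketch converts minutes into the fraction of a day that the option expects. The function name minutes_to_days is only an illustration; it is not part of CPAN.pm or cpansite.

```shell
#!/bin/sh
# Convert a number of minutes into days, the unit used by index_expire.
# minutes_to_days is a hypothetical helper name, not part of CPAN.pm.
minutes_to_days() {
    awk -v m="$1" 'BEGIN { printf "%.6f\n", m / 1440 }'
}

minutes_to_days 5    # prints 0.003472
```

The printed value can be pasted into the configuration file or passed to "o conf index_expire" in the CPAN shell.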

Configuring the Server

Starting your own CPAN

You need to have an FTP or HTTP server running. Create a directory from which you will distribute the data, here named $MYCPAN. With a web server, it is advised to create a virtual host like cpan.example.com which has $MYCPAN as DocumentRoot.

Define a fake pause-id (here the demo is MYID), because if you use an existing pause-id your clients will start producing warnings about missing checksums on files retrieved from the public archive.

  MYMODS=$MYCPAN/authors/id/M/MY/MYID
  mkdir -p $MYMODS

Although CPAN.pm claims to support a directory format of $MYCPAN/authors/id/MYID, experience shows that this does not work correctly with some recursive dependencies.
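
The nested layout follows a simple rule: the first letter of the id, then its first two letters, then the full id. A minimal shell sketch of that rule is given here; author_dir is a hypothetical helper name used for illustration only.

```shell
#!/bin/sh
# Print the archive sub-directory for a given author id.
# author_dir is a hypothetical helper; the layout is authors/id/M/MY/MYID.
author_dir() {
    id=$1
    first=$(printf '%s' "$id" | cut -c1)
    two=$(printf '%s' "$id" | cut -c1-2)
    printf 'authors/id/%s/%s/%s\n' "$first" "$two" "$id"
}

author_dir MYID    # prints authors/id/M/MY/MYID
```

It could be combined with the directory creation step, for instance as mkdir -p "$MYCPAN/$(author_dir MYID)".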

Adding your own modules to the local archive

Put your own modules in $MYMODS and then rerun the indexer.

  mv MyDist-1.00.tar.gz $MYMODS   # local
  scp MyDist-1.00.tar.gz cpan.example.com:$MYMODS   # remote
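
The copy and re-index steps can be wrapped in one small shell function. This is a sketch only: publish_dist and the DRY_RUN variable are hypothetical names introduced here, and the function assumes that cpansite is installed on the machine holding the archive. When DRY_RUN is set, the indexer command is printed instead of executed.

```shell
#!/bin/sh
# Copy distribution tarballs into the author directory, then rerun the indexer.
# publish_dist and DRY_RUN are hypothetical names used for this sketch only.
publish_dist() {
    site=$1 mods=$2; shift 2
    mkdir -p "$mods" || return 1
    cp "$@" "$mods/" || return 1
    if [ -n "$DRY_RUN" ]; then
        # Show the indexer call without running it.
        echo "cpansite --site $site index"
    else
        cpansite --site "$site" index
    fi
}

# Possible usage:
#   publish_dist "$MYCPAN" "$MYMODS" MyDist-1.00.tar.gz
```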

Generating an index with fallback

Your own software probably depends on a lot of modules which are found on the global CPAN. And those modules require even more modules from CPAN. By default, your local CPAN index will know about all modules which you have yourself plus all modules on the global CPAN.

The index only contains the latest (highest) version of each file. This means that each file must contain a version number; otherwise the text undef is used as the version. In any case, the local packages get preference over the global CPAN packages, even when they have a lower version number.
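
The preference rule can be illustrated with a toy merge of two package lists. This is not the code in CPAN::Site::Index, only a sketch under the assumption that each line holds a package name, a version and a path, and that the local list is read first. The helper name merge_local_first and all file contents are made up for the example.

```shell
#!/bin/sh
# Toy illustration of "local wins": keep the first entry seen per package.
# merge_local_first is a hypothetical helper, not the CPAN::Site::Index code.
merge_local_first() {
    # $1 = local list, $2 = global list; one "Package Version Path" per line
    awk '!seen[$1]++' "$1" "$2" | sort
}

tmp=$(mktemp -d)
printf '%s\n' 'My::Mod 0.90 M/MY/MYID/My-Mod-0.90.tar.gz' > "$tmp/local"
printf '%s\n' 'My::Mod 1.20 X/XY/XYZ/My-Mod-1.20.tar.gz' \
              'Other::Mod 2.00 X/XY/XYZ/Other-Mod-2.00.tar.gz' > "$tmp/global"
merge_local_first "$tmp/local" "$tmp/global"
rm -rf "$tmp"
```

In the merged list, My::Mod stays at the local 0.90 although the global list offers 1.20, while Other::Mod is taken from the global list.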

With fallback:

 cpansite --site $MYCPAN index
 cpansite index    # when   CPANSITE_LOCAL=$MYCPAN

The script traverses $MYCPAN/authors/id and merges this with the $MYCPAN/global/02packages.details.txt.gz data, a copy from the original CPAN. It creates a CHECKSUMS file. The result is a private $MYCPAN/modules/02packages.details.txt.gz file.

The files $MYCPAN/authors/01mailrc.txt.gz and $MYCPAN/modules/03modlist.data.gz are downloaded from CPAN. This will reduce the number of failing retrievals when you start installing software.
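
To check what the generated index says about a package, the file can be read directly. The sketch below assumes the usual layout of 02packages.details.txt: a header block ended by an empty line, followed by one line per package with name, version and distribution path. The helper name index_lookup is hypothetical.

```shell
#!/bin/sh
# Look up which distribution file the generated index lists for a package.
# index_lookup is a hypothetical helper; it only reads the index file.
index_lookup() {
    site=$1 pkg=$2
    gzip -dc "$site/modules/02packages.details.txt.gz" |
        awk -v pkg="$pkg" 'body && $1 == pkg { print $3 } /^$/ { body = 1 }'
}

# Possible usage:
#   index_lookup "$MYCPAN" Mail::Box
```

An empty result means that the package is not listed in the index.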

Generating an index without fallback

When you want a controlled environment, where all your systems run the same versions of the modules, you should disable the fallback to the global CPAN.

Without fallback:

 cpansite --site $MYCPAN --stand-alone index
 cpansite --stand-alone index    # when   CPANSITE_LOCAL=$MYCPAN

The index is now very small. But when you start installing your software on systems, it will start complaining that modules cannot be found on CPAN. Now, add specific distribution versions from the global CPAN to your own archive. See the next section.

Adding distributions from global CPAN to your own

When you want a fixed version of a distribution to be used on your systems, you can manually download it and insert it in the $MYCPAN tree.

However, there is also a simple way to retrieve the most recent version. The next example shows how to insert the latest versions of the distributions which include the packages Mail::Box and Test::More into your local CPAN archive.

 cpansite --site $MYCPAN --cpan $GLOBAL mirror Mail::Box Test::More

 # when CPANSITE_LOCAL=$MYCPAN and CPANSITE_GLOBAL=$GLOBAL
 cpansite mirror Mail::Box Test::More
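
When the set of required packages grows, keeping the names in a file may be easier than typing them on the command line. The following sketch reads such a file and passes the names to the mirror action. The function mirror_from_list, the list file and the DRY_RUN variable are hypothetical, and CPANSITE_LOCAL and CPANSITE_GLOBAL are assumed to be set in the environment.

```shell
#!/bin/sh
# Mirror every package named in a list file, one package name per line.
# mirror_from_list, the list file and DRY_RUN are hypothetical names.
mirror_from_list() {
    list=$1
    # Drop blank lines and comment lines, then join the names with spaces.
    pkgs=$(grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$list" | tr '\n' ' ')
    [ -n "$pkgs" ] || return 0
    if [ -n "$DRY_RUN" ]; then
        # $pkgs is left unquoted on purpose, so it splits into separate names.
        echo cpansite mirror $pkgs
    else
        cpansite mirror $pkgs
    fi
}

# Possible usage:
#   mirror_from_list packages.txt
```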

AUTHORS

Mark Overmeer <perl@overmeer.net>.