The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
#!perl -w

# Copyright 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Kevin Ryde

# This file is part of Upfiles.
#
# Upfiles is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 3, or (at your option) any later
# version.
#
# Upfiles is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
# more details.
#
# You should have received a copy of the GNU General Public License along
# with Upfiles.  If not, see <http://www.gnu.org/licenses/>.

use 5.010;
use strict;
use warnings;
use App::Upfiles;

our $VERSION = 12;

my $upf = App::Upfiles->new;
exit $upf->command_line;

__END__

=for stopwords upfiles Upfiles username SQLite3 toplevel modtimes symlinks arrayref filename filenames filesystem versa PID tuxfamily Ryde arg ftpd UTC http SSL SFTP LF kbytes mbytes gbytes

=head1 NAME

upfiles -- upload files to FTP or SFTP server, for push mirroring

=head1 SYNOPSIS

 upfiles [--options] [filename...]

=head1 DESCRIPTION

Upfiles uploads changed files from your local disk to an FTP or SFTP server,
for a simple kind of "push" mirroring.

Create files locally with the same directory structure as the target, and in
a F<~/.upfiles.conf> file give the locations,

    upfiles (local  => '/my/directory',
             remote => 'ftp://fred@example.com/pub/fred');

This is actually Perl code, so you can put comment lines with C<#>, write
some conditionals, use C<< $ENV{HOME} >>, etc.  Then to upload run

    upfiles

Or to upload just some selected files give the local filenames,

    upfiles /my/directory/foo.txt
    upfiles bar.txt ../something/quux.pl

Your username on the remote system is in the C<ftp://> remote URL.
A password is taken from file F<~/.netrc> the same as for the C<ftp> program
and similar programs.  See L<netrc(5)> or L<Net::Netrc> for the format of
that file.

Note that FTP transmits username and password in clear text over the
network.  See L</FTP Security> below for secure connections (or L</SSH File
Transfer (SFTP)>).

B<upfiles> records what has been sent in an SQLite3 database file
F<.upfiles.sqdb> in each local toplevel directory, for example
F</my/directory/.upfiles.sqdb>.  Local changes are identified by traversing
the local tree and comparing file modtimes and sizes against what was last
sent in the database.  This is faster than asking the remote server what
it's got.

For convenience some local files are always excluded from the upload.
Currently these are

    .upfiles.sqdb    which is upfiles itself
    foo~             Emacs backups
    #foo#            Emacs autosaves
    .#foo            Emacs lockfiles

Files are uploaded one by one.  The upload goes first to a temporary
filename and is then renamed.  This means an incomplete file isn't left if
the connection is lost or upfiles is killed during transfer.  Temporary
files are noted in the database and leftovers are deleted on the next
upfiles run if necessary.

File modification times are copied to an FTP server if it has the draft
standard C<MFMT> command or the common C<SITE UTIME> extension (either 2-arg
or 5-arg).

Plain RFC959 FTP doesn't have a notion of symlinks or hard links so
C<upfiles> follows any local links to actual content to upload.  Perhaps in
the future the C<SITE SYMLINK> extension could be used for links within the
uploaded tree, if available.  (Or SFTP below has symlinks.)

Filenames containing CR or LF characters (\r, \n) cannot be sent by FTP
protocol and Upfiles will refuse to operate on them.

=head2 FTP Security

FTP can be used with SSL encryption and end-to-end destination verification
in one of two ways.  Either is highly recommended when supported by the
server.  Recent C<Net::FTP> (version 1.28 circa 2014) and its SSL
dependencies (L<IO::Socket::SSL>) are required.

C<ftps> is immediate SSL on both command and data channels.  The server
listens on a different port (990) than plain FTP (port 21).  Give C<ftps> in
the remote URL,

    upfiles (local  => '/my/directory',
             remote => 'ftps://fred@example.com/pub/fred');

Or, on the usual FTP connection a C<TLS> command is given to encrypt the
command channel (and data channels) from that point onwards.  For this give
remote URL with plain C<ftp>, plus C<use_TLS> option,

    upfiles (local   => '/my/directory',
             remote  => 'ftp://fred@example.com/pub/fred',
             use_TLS => 1);

Upfiles gives C<TLS> immediately on connecting to the server, so both
username and password are only sent after the connection is encrypted with
SSL.

See L</DEBUGGING> below on protocol tracing.

=head2 SSH File Transfer (SFTP)

The C<ssh> "secure shell" program and protocol incorporates an SFTP file
transfer.  Upfiles can use this by setting the remote URL to C<sftp>.

    upfiles (local  => '/my/directory',
             remote => 'sftp://fred@example.com/home/fred/mydir');

This requires Perl L<Net::SFTP::Foreign> and suitable L<ssh(1)> program.
The C<remote> URL target directory is a full filesystem path, the same as
seen in the C<ssh> shell.

C<ssh> has various user authentication methods.  No-password operation can
be had by putting your public key on the remote machine.  For password
operation, C<upfiles> looks in F<~/.netrc> for a password by username and
host similar to plain FTP.  If not found there then the C<ssh> program will
query.  (Not sure if querying is a good idea, might prefer to immediately
abort.)

See L</DEBUGGING> below on protocol tracing.

=head1 CONFIGURATION

Each C<upfiles> call in F<~/.upfiles.conf> takes the following key/value
parameters,

=over 4

=item C<local> (string)

The local directory to upload from.

=item C<remote> (string)

The remote FTP or SFTP server to upload to, as a URL.  The path in the URL
is the target directory, and your username on the remote machine is included
with "@" syntax, like

    remote => 'ftp://fred@example.com/pub/fredsdir',

The "scheme" part can be C<ftp>, C<ftps> or C<sftp>.  A password is sought
in F<~/.netrc> by username and machine name.

=item C<use_TLS> (0 or 1, default 0)

For an C<ftp> remote, if set to 1 then use the C<TLS> command to give an SSL
secure protocol with the server.  This is highly recommended when supported
by the server (or C<ftps> or C<sftp>).

For C<ftps> this option is ignored, as the connection is secured before any
commands at all.

=item C<exclude_regexps> (arrayref of regexps)

Additional filenames to exclude.  For example to exclude a local F<Makefile>

    upfiles (local => '/my/directory',
             remote => 'ftp://example.com/pub/fred',
             exclude_regexps => [ qr{/(^|/)[Mm]akefile$} ]);

=item C<copy_utime> (0, 1, default C<"if_possible">)

Whether to copy file modification times to the server.  Usually this is
desirable.  The default C<"if_possible"> means do so if the server supports
it.  0 means don't try.  1 means must copy.

For FTP, this uses the C<MFMT> or C<SITE UTIME> commands, which are widely
available extensions to the basic FTP protocol.  For SFTP, copying times is
always possible.

=back

=head1 COMMAND-LINE OPTIONS

The command line options are

=over 4

=item -n, --dry-run

Show what would be uploaded or deleted on the server, but don't actually do
anything.

    upfiles -n

=item --help

Print some brief help information.

=item -V, --verbose, --verbose=N

Print some diagnostics about what's being done.  With --verbose=2 or
--verbose=3 print some technical details too.

    upfiles --verbose

=item --version

Print the upfiles program version number.  With C<--verbose> also print the
version numbers of some modules used.

=back

=head1 DEBUGGING

The C<--verbose> option above shows what upfiles is considering for uploads,
and also enables various levels of debug or verbosity from C<Net::FTP> (its
C<Debug> option), C<IO::Socket::SSL> (its C<$DEBUG>), and
C<Net::SFTP::Foreign> (C<-v> to the C<ssh> program).  SSL and SSH
diagnostics include information about certificates checked and encryption
negotiated.

It can be helpful to check first that a connection to the remote works.  For
FTP this can be the usual C<ftp> program, or even C<telnet>  (C<quit> to exit).

    ftp example.com
    telnet example.com 21

Server support for C<TLS> command can be checked in the usual way by the
C<FEAT> command (which in an ftp client might have to be sent by C<quot
feat>) and checking for C<AUTH TLS>.  The initial text banner message often
says something for human readers too.  The C<use_TLS> option should then
work, or not, according to server support.

For C<ftps> (FTP with SSL on port 990), if no server is listening on that
port then the host might either immediately give "connection refused", or
could ignore and C<upfiles> time-out.  If a suitable SSL-enabled C<ftp>
program is not at hand then raw telnet equivalent though SSL can be had with
either

    openssl s_client -connect example.com:990
    gnutls-cli --verbose -p 990 example.com

For SSH SFTP, check C<ssh> works by running it directly (C<-l> for remote
username).  Its C<-v> can be repeated for ever greater verbosity.

    ssh -v -l fred example.com

On first connection to a new host C<ssh> will usually ask whether to accept
the host certificate.  This happens when run through C<upfiles> too.
Preferably you have some external way to validate or get a certificate to
know to whom you're speaking.  Likewise SSL setups.

=head1 ENVIRONMENT VARIABLES

=over 4

=item C<HOME>

For F<~/.upfiles.conf> directory, and C<Net::Netrc> for F<~/.netrc>.

=back

=head1 FILES

=over 4

=item F<~/.upfiles.conf>

Configuration file.

=item F<~/.netrc>

FTP password file.

=item F</my/local/dir/.upfiles.sqdb>

SQLite3 database of information about what has been sent so far from the
tree F</my/local/dir>.

=back

Upfiles determines the home directory F<~> using L<File::HomeDir>.
C<Net::Netrc> has something similar for the F<.netrc> file, and looks also
for F<_netrc>.

=head1 BUGS

Changing a local file from a file to a directory or vice versa probably
doesn't work very well.  Remove it and upload, then create the new and
upload that.

FTP and SFTP make a couple of round trip command/responses to the server for
every file.  When uploading many small files something streaming or parallel
would be faster.  The temp file and rename adds a round trip too, but is
desirable so anyone looking doesn't see a half file.  Perhaps an option
could turn this off if not needed (such as upfiles to a remote backup).  For
reference, C<Net::SFTP::Foreign> has its own atomic option in C<put()>, but
think it's still just a client side action so recording temporaries in the
database better ensures cleanup if interrupted.

The temporary files are named using the local C<$$> PID added to the target
filename.  This is enough to protect against simultaneous uploads from the
same source machine, but potentially not if you're networked and are foolish
enough to C<upfiles> the same tree simultaneously from two different source
machines.  For FTP, havoc probably ensues.  For SFTP, use of no-overwrite
should make an error.  FTP C<STOU> would guarantee uniqueness, but does it
have a window while the name comes back which if interrupted could leave the
file created but unknown to upfiles?  C<Net::FTP> C<put_unique()> doesn't
return the name until after the whole transfer too.

For FTP, there's a small chance of a 2-arg C<SITE UTIME> attempt looking
like 5-arg to a C<pure-ftpd> pre-1.0.33 server (circa 2011), if the filename
uploaded happens to be dates and "UTC".  Perhaps more care could be taken to
identify the C<SITE UTIME> style the server expects.  In practice such
filenames should be unlikely, and there's no problem with recent
C<pure-ftpd> which has C<MDTM>.

There's no support for a proxy connection to the remote system.  The squid
CONNECT style which gives a pass-through socket wouldn't be too hard, and
C<ssh> can be made to speak through such things too.  The http-only style,
where http PUT must be given for FTP upload, is irritating and makes the
proxy a man-in-the-middle which you have to trust.  If a http PUT upload
method was of interest then it might follow from that easily enough.

File sizes and totals are reported in kbytes (of 1024 bytes).  Probably
should jump to mbytes or even gbytes at some sensible threshold.

For a given local directory, if the remote URL is changed then upfiles will
want to send everything again.  This is good if the remote is genuinely
somewhere new, or if uploading a local directory to multiple remote places.
But if it's just a change of hostname or protocol then would want no
resending.  Hopefully this is rare.  There's a secret undocumented
maybe-working C<--catchup> for marking everything already sent which could
help.  Some sort of force option to resend particular files could be useful
too.

=head1 SEE ALSO

L<Net::FTP>, L<Net::SFTP::Foreign>, L<ssh(1)>, L<netrc(5)>, L<Net::Netrc>,
L<DBD::SQLite>, L<sqlite3(1)>

L<sitecopy(1)>, L<ftpmirror(1)>, L<ftp-upload(1)>, L<rsync(1)>

=head1 HOME PAGE

L<http://user42.tuxfamily.org/upfiles/index.html>

(Upfiles is good for uploading to tuxfamily, either FTP or SFTP.)

=head1 LICENSE

Copyright 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Kevin Ryde

Upfiles is free software; you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation; either version 3, or (at your option) any later version.

Upfiles is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
details.

You should have received a copy of the GNU General Public License along with
Upfiles.  If not, see L<http://www.gnu.org/licenses/>.

=cut