NAME

Git::Hooks - Framework for implementing Git (and Gerrit) hooks

VERSION

version 1.3.0

SYNOPSIS

A single script can implement several Git hooks:

        #!/usr/bin/env perl

        use Git::Hooks;

        PRE_COMMIT {
            my ($git) = @_;
            # ...
        };

        COMMIT_MSG {
            my ($git, $msg_file) = @_;
            # ...
        };

        run_hook($0, @ARGV);

Or you can use Git::Hooks plugins or external hooks, driven by the single script below. These hooks are enabled by Git configuration options. (More on this later.)

        #!/usr/bin/env perl

        use Git::Hooks;

        run_hook($0, @ARGV);

INTRODUCTION

"Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals. (Git README)"

In order to really understand what this is all about you need to understand Git and its hooks. You can read everything about them in the documentation available on the Git website.

A Git hook is a specifically named program that is called by the git program during the execution of some operations. At the last count, there were exactly 16 different hooks which can be used. They must reside under the .git/hooks directory in the repository. When you create a new repository, you get some template files in this directory, all of them having the .sample suffix and helpful instructions inside explaining how to convert them into working hooks.

When Git is performing a commit operation, for example, it calls these four hooks in order: pre-commit, prepare-commit-msg, commit-msg, and post-commit. The first three can gather all sorts of information about the specific commit being performed and decide to reject it in case it doesn't comply with the specified policies. The post-commit hook can be used to log or alert interested parties about the commit just done.

There are several useful hook scripts available elsewhere, e.g. https://github.com/gitster/git/tree/master/contrib/hooks and http://google.com/search?q=git+hooks. However, when you try to combine the functionality of two or more of those scripts in a single hook you normally end up facing two problems.

Complexity

In order to integrate the functionality of more than one script you have to write a driver script that's called by Git and calls all the other scripts in order, passing to them the arguments they need. Moreover, some of those scripts may have configuration files to read and you may have to maintain several of them.

Inefficiency

This arrangement is inefficient in two ways. First, each script runs as a separate process, which usually has a high startup cost because they are, well, scripts and not binaries. (For a dissenting view on this, see this.) Second, as each script is called in turn, it has no memory of the scripts called before and has to gather the information about the transaction again and again, normally by calling the git command, which spawns yet another process.

Git::Hooks is a framework for implementing Git hooks and driving existing external hooks in a way that tries to solve these problems.

Instead of having separate scripts implementing different functionality you may have a single script implementing all the functionality you need either directly or using some of the existing plugins, which are implemented by Perl scripts in the Git::Hooks:: namespace. This single script can be used to implement all standard hooks, because each hook knows when to perform based on the context in which the script was called.

If you already have some handy hooks and want to keep using them, don't worry. Git::Hooks can drive external hooks very easily.

USAGE

There are a few simple steps you should follow in order to set up Git::Hooks so that you can configure it to use some predefined plugins or start coding your own hooks.

The first step is to create a generic script that will be invoked by Git for every hook. If you are implementing hooks in your local repository, go to its .git/hooks sub-directory. If you are implementing the hooks in a bare repository in your server, go to its hooks sub-directory.

You should see there a bunch of files with names ending in .sample which are hook examples. Create a three-line script called, e.g., git-hooks.pl, in this directory like this:

        $ cd /path/to/repo/.git/hooks

        $ cat >git-hooks.pl <<'EOT'
        #!/usr/bin/env perl
        use Git::Hooks;
        run_hook($0, @ARGV);
        EOT

        $ chmod +x git-hooks.pl

Now you should create symbolic links pointing to it for each hook you are interested in. For example, if you are interested in a commit-msg hook, create a symbolic link called commit-msg pointing to the git-hooks.pl file. This way, Git will invoke the generic script for all hooks you are interested in. (You may create symbolic links for all 16 hooks, but this will make Git call the script for all hooked operations, even for those that you may not be interested in. Nothing wrong will happen, but the server will be doing extra work for nothing.)

        $ ln -s git-hooks.pl commit-msg
        $ ln -s git-hooks.pl post-commit
        $ ln -s git-hooks.pl pre-receive

As is, the script won't do anything. You have to implement some hooks in it, use some of the existing plugins, or set up some external hooks to be invoked properly. Either way, the script should end with a call to run_hook passing to it the name with which it was called ($0) and all the arguments it received (@ARGV).

Implementing Hooks

You may implement your own hooks using one of the hook directives described in the HOOK DIRECTIVES section below. Your hooks may be implemented in the generic script you have created. They must be defined after the use Git::Hooks line and before the run_hook() line.

A hook should return a boolean value indicating if it was successful. run_hook dies after invoking all hooks if at least one of them returned false.

run_hook invokes the hooks inside an eval block to catch any exception, such as when die is used inside them. When an exception is detected the hook is considered to have failed and the exception string ($@) is shown to the user.

The best way to produce an error message is to invoke the Git::More::error method passing a prefix and a message for uniform formatting.

For example:

    # Check if every added/updated file is smaller than a fixed limit.

    my $LIMIT = 10 * 1024 * 1024; # 10MB

    PRE_COMMIT {
        my ($git) = @_;

        my @changed = $git->command(qw/diff --cached --name-only --diff-filter=AM/);

        # Nothing to check; without this, ls-files would list every file.
        return 1 unless @changed;

        my $errors = 0;

        foreach ($git->command('ls-files' => '-s', '--', @changed)) {
            chomp;
            # Each line looks like "<mode> <sha> <stage>\t<name>".
            my ($mode, $sha, $n, $name) = split ' ', $_, 4;
            my $size = $git->command_oneline('cat-file' => '-s', $sha);
            if ($size > $LIMIT) {
                $git->error('CheckSize', "File '$name' has $size bytes, more than our limit of $LIMIT");
                $errors++;
            }
        }

        return $errors == 0;
    };

    # Check if every added/changed Perl file respects Perl::Critic's code
    # standards.

    PRE_COMMIT {
        my ($git) = @_;
        my %violations;

        my @changed = grep {/\.p[lm]$/} $git->command(qw/diff --cached --name-only --diff-filter=AM/);

        # Nothing to check; without this, ls-files would list every file.
        return 1 unless @changed;

        require Perl::Critic;
        my $critic = Perl::Critic->new(-severity => 'stern', -top => 10);

        foreach ($git->command('ls-files' => '-s', '--', @changed)) {
            chomp;
            # Each line looks like "<mode> <sha> <stage>\t<name>".
            my ($mode, $sha, $n, $name) = split ' ', $_, 4;
            my $contents = $git->command('cat-file' => 'blob', $sha);
            my @violations = $critic->critique(\$contents);
            $violations{$name} = \@violations if @violations;
        }

        if (%violations) {
            # FIXME: this is a lame way to format the output.
            require Data::Dumper;
            $git->error('Perl::Critic Violations', Data::Dumper::Dumper(\%violations));
            return 0;
        }

        return 1;
    };

Note that you may define several hooks for the same operation. In the above example, we've defined two PRE_COMMIT hooks. Both are going to be executed when Git invokes the generic script during the pre-commit phase.

You may implement different kinds of hooks in the same generic script. The function run_hook() will activate just the ones for the current Git phase.
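The dispatch rules described in this section (several routines per hook, each one run inside an eval, failure signalled either by a false return value or by an exception) can be pictured with the following self-contained sketch. It is not the Git::Hooks implementation: the names register_hook and run_registered and the %registry hash are hypothetical and exist only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical registry: hook name => list of code refs.
# This is NOT Git::Hooks' own data structure.
my %registry;

sub register_hook {
    my ($name, $code) = @_;
    push @{ $registry{$name} }, $code;
    return;
}

# Run every routine registered for $name, catching exceptions, and
# return the list of failure messages (empty when all succeeded).
sub run_registered {
    my ($name, @args) = @_;
    my @failures;
    foreach my $code (@{ $registry{$name} || [] }) {
        my $ok = eval { $code->(@args) };
        if (my $err = $@) {
            chomp $err;
            push @failures, "exception: $err";
        } elsif (!$ok) {
            push @failures, 'hook returned false';
        }
    }
    return @failures;
}

register_hook('pre-commit',  sub { return 1 });                # succeeds
register_hook('pre-commit',  sub { die "file too big\n" });    # fails by exception
register_hook('pre-commit',  sub { return 0 });                # fails by return value
register_hook('post-commit', sub { die "other phase\n" });     # not run below

# Every pre-commit routine runs, even after an earlier one has failed.
my @failures = run_registered('pre-commit');
print scalar(@failures), " hook(s) failed\n";
print "  $_\n" for @failures;
```

Collecting the failures and reporting them only after the loop mirrors the statement that run_hook dies after invoking all hooks, so the user sees every complaint in a single run.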

Using Plugins

There are several hooks already implemented as plugin modules, which you can use. Please see their own documentation for more details.

Each plugin may be used in one or, sometimes, multiple hooks. Their documentation is explicit about this.

These plugins are configured by Git's own configuration framework, using the git config command or by directly editing Git's configuration files. (See git help config to know more about Git's configuration infrastructure.)

To enable a plugin you must add it to the githooks.plugin configuration option.

The CONFIGURATION section below explains this in more detail.

Invoking external hooks

Since the default Git hook scripts are taken by the symbolic links to the Git::Hooks generic script, you must install any other hooks somewhere else. By default, the run_hook routine will look for external hook scripts in the directory .git/hooks.d (which you must create) under the repository. Below this directory you should have another level of directories, named after the default hook names, under which you can drop your external hooks.

For example, let's say you want to use some of the hooks in the standard Git package. You should copy each of those scripts to a file under the appropriate hook directory.

Note that you may install more than one script under the same hook-named directory. The driver will execute all of them in a non-specified order.

If any of them exits abnormally, run_hook dies with an appropriate error message.

Gerrit Hooks

Gerrit is a web-based code review and project management tool for Git-based projects. It's based on JGit, which is a pure Java implementation of Git.

Gerrit doesn't support Git's standard hooks. However, it implements its own special hooks. Git::Hooks currently supports only three of Gerrit's hooks:

ref-update

The ref-update hook is executed synchronously when a user performs a push to a branch. Its purpose is the same as Git's update hook and Git::Hooks's plugins usually support them both together.

patchset-created

The patchset-created hook is executed asynchronously when a user performs a push to one of Gerrit's virtual branches (refs/for/*) in order to record a new review request. This means that one cannot stop the request from happening just by dying inside the hook. Instead, what one needs to do is to use Gerrit's API to accept or reject the new patchset as a reviewer.

Git::Hooks does this using a Gerrit::REST object. There are a few configuration options to set up this Gerrit interaction, which are described below.

This hook's purpose is usually to verify the project's policy compliance. Plugins that implement pre-commit, commit-msg, update, or pre-receive hooks usually also implement this Gerrit hook.

Since draft patchsets are visible only to their owners, the patchset-created hook is unusable because it uses a fixed user to authenticate. So, Git::Hooks exits prematurely when invoked as the patchset-created hook for a draft change.

draft-published

The draft-published hook is executed when the user publishes a draft change, making it visible to other users. Since the patchset-created hook doesn't work for draft changes, the draft-published hook is a good time to work on them. All plugins that work on the patchset-created also work on the draft-published hook to cast a vote when drafts are published.

CONFIGURATION

Git::Hooks is configured via Git's own configuration infrastructure. There are a few global options which are described below. Each plugin may define other specific options which are described in their own documentation. The options specific to a plugin usually are contained in a configuration subsection of section githooks, named after the plugin base name. For example, the Git::Hooks::CheckAcls plugin has its options contained in the configuration subsection githooks.checkacls.

You should get comfortable with the git config command (read git help config) to know how to configure Git::Hooks.

When you invoke run_hook, the command git config --list is invoked to grok all configuration affecting the current repository. Note that this will fetch all --system, --global, and --local options, in this order. You may use this mechanism to define configuration global to a user or local to a repository.

Gerrit keeps its repositories in a hierarchy and its specific configuration mechanism takes advantage of that to allow a configuration definition in a parent repository to trickle down to its child repositories. Git::Hooks uses Git's native configuration mechanisms and doesn't support Gerrit's mechanism, which is based on configuration files kept in a detached refs/meta/config branch. But you can implement a hierarchy of configuration files by using Git's inclusion mechanism. Please read the "Includes" section of git help config to know how.

githooks.plugin PLUGIN...

To enable one or more plugins you must add them to this configuration option, like this:

    $ git config --add githooks.plugin "CheckAcls CheckJira"

You can add another list to the same variable to enable more plugins, like this:

    $ git config --add githooks.plugin CheckLog

This is useful, for example, to enable some plugins globally and others locally, per repository.

A plugin may hook itself to one or more hooks. CheckJira, for example, hooks itself to three: commit-msg, pre-receive, and update. It's important that the corresponding symbolic links be created pointing from the hook names to the generic script so that the hooks are effectively invoked.

In the previous examples, the plugins were referred to by their short names. In this case they are looked for in three places, in this order:

  1. In the githooks directory under the repository path (usually in .git/githooks), so that you may have repository specific hooks (or repository specific versions of a hook).
  2. In every directory specified with the githooks.plugins option. You may set it more than once if you have more than one directory holding your hooks.
  3. In the Git::Hooks installation.

The first match is taken as the desired plugin, which is executed (via do) and the search stops. So, you may want to copy one of the standard plugins and change it to suit your needs better. (Don't shy away from sending your changes back to the author, please.)
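The "first match wins" search can be sketched as follows. This is only an illustration of the lookup order, not the code Git::Hooks uses: the routine name find_plugin, the assumption that a plugin called NAME lives in a file called NAME.pm, and the three temporary directories standing in for the real locations are all hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw/tempdir/;

# Return the first file called "<plugin>.pm" found in @dirs, searched
# in the order given, or undef if there is none. The ".pm" file name
# is an assumption made for this sketch.
sub find_plugin {
    my ($plugin, @dirs) = @_;
    foreach my $dir (@dirs) {
        my $candidate = File::Spec->catfile($dir, $plugin . '.pm');
        return $candidate if -f $candidate;
    }
    return;
}

# Three stand-in directories, in search order: the repository's
# githooks directory, a githooks.plugins directory, the installation.
my $root = tempdir(CLEANUP => 1);
my @dirs = map { File::Spec->catdir($root, $_) }
    qw/repo-githooks extra-plugins installation/;
foreach my $dir (@dirs) {
    mkdir($dir) or die "cannot create '$dir': $!";
}

# CheckLog exists both in the repository and in the installation;
# CheckJira exists only in the installation.
foreach my $spec ([0, 'CheckLog'], [2, 'CheckLog'], [2, 'CheckJira']) {
    my $path = File::Spec->catfile($dirs[ $spec->[0] ], $spec->[1] . '.pm');
    open my $fh, '>', $path or die "cannot write '$path': $!";
    print {$fh} "1;\n";
    close $fh;
}

foreach my $plugin (qw/CheckLog CheckJira CheckAcls/) {
    my $found = find_plugin($plugin, @dirs);
    printf "%-10s %s\n", $plugin, defined $found ? $found : '(not found)';
}
```

Because the repository directory is searched first, the copy of CheckLog placed there shadows the installed one, which is what makes repository-specific versions of a plugin possible.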

However, if you use the fully qualified module name of the plugin in the configuration, then it will be simply required as a normal module. For example:

    $ git config --add githooks.plugin My::Hook::CheckSomething

githooks.disable PLUGIN...

This option disables plugins enabled by the githooks.plugin option. It's useful if you want to enable a plugin globally and only disable it for some repositories. For example:

    $ git config --global --add githooks.plugin  CheckJira

    $ git config --local  --add githooks.disable CheckJira

You may also temporarily disable a plugin by setting an environment variable named after it to "0". This is useful sometimes, when you are denied some perfectly fine commit by one of the check plugins. For example, suppose you got an error from the CheckLog plugin because you used an uncommon word that is not in the system's dictionary yet. If you don't intend to use the word again you can bypass all CheckLog checks this way:

    $ CheckLog=0 git commit

This works for every hook. For plugins specified by fully qualified module names, the environment variable name has to match the last part of it. For example, to disable the My::Hook::CheckSomething plugin you must define an environment variable called CheckSomething.

Note, however, that this works for local hooks only. Remote hooks (like update or pre-receive) are run on the server. You can set up the server so that it defines the appropriate variable, but this isn't as useful as it is for local hooks, since the mechanism is intended for once-in-a-while events.

githooks.plugins DIR

This option specifies a list of directories where plugins are looked for besides the default locations, as explained in the githooks.plugin option above.

githooks.externals [01]

By default the driver script will look for external hooks after executing every enabled plugin. You may disable the invocation of external hooks by setting this option to 0.

githooks.hooks DIR

You can tell this plugin to look for external hooks in other directories by specifying them with this option. The directories specified here are searched after the default directory .git/hooks.d, so that you can use this option to have some global external hooks shared by all of your repositories.

Please see the plugins' documentation to know about their own configuration options.

githooks.groups GROUPSPEC

You can define user groups in order to make it easier to configure access control plugins. A group is specified by a GROUPSPEC, which is a multiline string containing a sequence of group definitions, one per line. Each line defines a group like this, where spaces are significant only between users and group references:

    groupA = userA userB @groupB userC

Note that a group can reference other groups by name. To make a group reference, simply prefix its name with an at sign (@). Group references must reference groups previously defined.

A GROUPSPEC may be in the format file:PATH/TO/FILE, which means that the external text file PATH/TO/FILE contains the group definitions. The path may be absolute or relative to the hook's current directory, which is usually the repository's root in the server. Its syntax is very simple. Blank lines are skipped. The hash (#) character starts a comment that goes to the end of the current line. The remaining lines must define groups in the same format exemplified above.

There may be multiple definitions of this variable, each one defining different groups. You can't redefine a group.
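The group syntax described above is small enough to parse in a few lines. The following sketch is not the parser used by Git::Hooks; the routine name parse_groups and its return value (a hash mapping each group name to the set of its members, with references already expanded) are choices made only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Parse group definitions of the form "name = user user @group ...".
# Returns a hash ref mapping each group name to a hash ref whose keys
# are the members of the group, with group references expanded.
sub parse_groups {
    my ($spec) = @_;
    my %groups;
    foreach my $line (split /\n/, $spec) {
        $line =~ s/#.*//;              # a hash starts a comment
        next unless $line =~ /\S/;     # blank lines are skipped
        my ($name, $members) = $line =~ /^\s*([^\s=]+)\s*=\s*(.*?)\s*$/
            or die "invalid group definition: '$line'\n";
        die "group '$name' is already defined\n" if exists $groups{$name};
        my %set;
        foreach my $member (split ' ', $members) {
            if ($member =~ /^\@(.+)/) {
                # A reference must point to a previously defined group.
                my $referenced = $groups{$1}
                    or die "group '$name' references undefined group '$1'\n";
                @set{ keys %$referenced } = ();
            } else {
                $set{$member} = undef;
            }
        }
        $groups{$name} = \%set;
    }
    return \%groups;
}

my $groups = parse_groups(<<'EOS');
# groupB must come first because groupA references it
groupB = userD userE

groupA = userA userB @groupB userC
EOS

print join(' ', sort keys %{ $groups->{groupA} }), "\n";
```

Expanding references while parsing is what makes the "previously defined" rule natural: a group can only copy members from groups that already exist, so circular definitions cannot occur.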

githooks.userenv STRING

When Git is performing its chores in the server to serve a push request it's usually invoked via SSH or a web service, which takes care of the authentication procedure. These services normally make the authenticated user name available in an environment variable. You may tell this hook which environment variable it is by setting this option to the variable's name. If not set, the hook will try to get the user's name from the GERRIT_USER_EMAIL or the USER environment variable, in this order, and leave it undefined if it can't figure it out.

The Gerrit hooks unfortunately do not have access to the user's id. But they get the user's full name and email instead. Git::Hooks takes care so that two environment variables are defined in the hooks, as follows:

If the user name is not directly available in an environment variable you may set this option to a code snippet by prefixing it with eval:. The code will be evaluated and its value will be used as the user name.

For example, if the Gerrit user email is not what you want to use as the user id, you can set the githooks.userenv configuration option to grok the user id from one of these environment variables. If the user id is always identical to the part of the email before the at sign, you can configure it like this:

    git config githooks.userenv \
      'eval:(exists $ENV{GERRIT_USER_EMAIL} && $ENV{GERRIT_USER_EMAIL} =~ /([^@]+)/) ? $1 : undef'

This variable is useful for any hook that needs to authenticate the user performing the git action.

githooks.admin USERSPEC

There are several hooks that perform access control checks before allowing a git action, such as the ones installed by the CheckAcls and the CheckJira plugins. It's useful to allow some people (the "administrators") to bypass those checks. These hooks usually allow the users specified by this variable to do whatever they want to the repository. You may want to set it to a group of "super users" in your team so that they can "fix" things more easily.

The value of each option is interpreted in one of three ways: as a username, as a @group reference, or as a ^regex.

githooks.abort-commit [01]

This option is true (1) by default, meaning that the pre-commit and the commit-msg hooks will abort the commit if they detect anything wrong in it. This may not be the best way to handle errors, because you must remember to retrieve your carefully worded commit message from the .git/COMMIT_EDITMSG to try it again, and it is easy to forget about it and lose it.

Setting this to false (0) makes these hooks simply warn the user via STDERR but let the commit succeed. This way, the user can correct any mistake with a simple git commit --amend and doesn't run the risk of losing the commit message.

githooks.nocarp [01]

By default all errors produced by Git::Hooks use Carp::croak, so that they contain a suffix telling where the error occurred. Sometimes you may not want this. For instance, if you receive the error message produced by a server hook you won't be able to use that information.

So, for server hooks you may want to set this configuration variable to 1 to strip those suffixes from the error messages.

githooks.gerrit.url URL

githooks.gerrit.username USERNAME

githooks.gerrit.password PASSWORD

These three options are required if you enable Gerrit hooks. They are used to construct the Gerrit::REST object that is used to interact with Gerrit.

githooks.gerrit.review-label LABEL

This option defines the label that must be used in Gerrit's review process. If not specified, the standard Code-Review label is used.

githooks.gerrit.vote-ok +N

This option defines the vote that must be used to approve a review. If not specified, +1 is used.

githooks.gerrit.vote-nok -N

This option defines the vote that must be used to reject a review. If not specified, -1 is used.

githooks.gerrit.comment-ok COMMENT

By default, when approving a review Git::Hooks simply casts a positive vote but does not add any comment to the change. If you set this option, it adds a comment like this in addition to casting the vote:

  [Git::Hooks] COMMENT

You may want to use a simple comment like 'OK'.

MAIN FUNCTION

run_hook(NAME, ARGS...)

This is the main routine, responsible for invoking the right hooks depending on the context in which it was called.

Its first argument must be the name of the hook that was called. Usually you just pass $0 to it, since it knows to extract the basename of the parameter.

The remaining arguments depend on the hook for which it's being called. Usually you just pass @ARGV to it. And that's it. Mostly.

        run_hook($0, @ARGV);

HOOK DIRECTIVES

Hook directives are routines you use to register routines as hooks. Each one of the hook directives gets a routine-ref or a single block (anonymous routine) as argument. The routine/block will be called by run_hook with proper arguments, as indicated below. These arguments are the ones gotten from @ARGV, with the exception of the ones identified by 'GIT' which are Git::More objects that can be used to grok detailed information about the repository and the current transaction. (Please, refer to Git::More specific documentation to know how to use them.)

Note that the hook directives resemble function definitions but they aren't. They are function calls, and as such must end with a semicolon.

Some hooks are invoked before an action (e.g., pre-commit) so that one can check some condition. If the condition holds, they must simply end without returning anything. Otherwise, they should invoke the error method on the GIT object passing a suitable error message. On some hooks, this will prevent Git from finishing its operation.

Other hooks are invoked after the action (e.g., post-commit) so that their outcome cannot affect the action. Those are usually used to send notifications or to signal the completion of the action in some way.

You may learn about every Git hook by invoking the command git help hooks. Gerrit hooks are documented in the project site.

Also note that each hook directive can be called more than once if you need to implement more than one specific hook.

METHODS FOR PLUGIN DEVELOPERS

Plugins should start by importing the utility routines from Git::Hooks:

    use Git::Hooks qw/:utils/;

Usually at the end, the plugin should use one or more of the hook directives defined above to install its hook routines in the appropriate hooks.

Every hook routine receives a Git::More object as its first argument. You should use it to infer all needed information from the Git repository.

Please, take a look at the code for the standard plugins under the Git::Hooks:: namespace in order to get a better understanding about this. Hopefully it's not that hard.

The utility routines implemented by Git::Hooks are the following:

post_hook SUB

Plugin developers may be interested in performing some action depending on the overall result of every check made by every other hook. As an example, Gerrit's patchset-created hook is invoked asynchronously, meaning that the hook's exit code doesn't affect the action that triggered the hook. The proper way to signal the hook result for Gerrit is to invoke its API to make a review. But we want to perform the review once, at the end of the hook execution, based on the overall result of all enabled checks.

To do that plugin developers can use this routine to register callbacks that are invoked at the end of run_hook. The callbacks are called with the following arguments:

The callbacks may see if there were any errors signalled by the plugin hook by invoking the get_errors method on the GIT object. They may be used to signal the hook result in any way they want, but they should not die or they will prevent other post hooks from running.

is_ref_enabled(REF, SPEC, ...)

This routine returns a boolean indicating if REF matches one of the ref-specs in SPECS. REF is the complete name of a Git ref and SPECS is a list of strings, each one specifying a rule for matching ref names.

As a special case, it returns true if REF is undef or if there is no SPEC whatsoever, meaning that by default all refs/commits are enabled.

You may want to use it, for example, in an update, pre-receive, or post-receive hook which may be enabled depending on the particular refs being affected.

Each SPEC rule may indicate the matching refs as the complete ref name (e.g. "refs/heads/master") or by a regular expression starting with a caret (^), which is kept as part of the regexp.
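The matching rules above can be restated as a short routine. This is a sketch written from the description only, not the code of is_ref_enabled itself; the name ref_matches is hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for is_ref_enabled, following the rules
# described above.
sub ref_matches {
    my ($ref, @specs) = @_;

    # Special case: with no ref or no specs, everything is enabled.
    return 1 if !defined $ref || @specs == 0;

    foreach my $spec (@specs) {
        if ($spec =~ /^\^/) {
            # A regular expression; the leading caret is part of it,
            # so it anchors the match at the start of the ref name.
            return 1 if $ref =~ /$spec/;
        } else {
            # Anything else must be the complete ref name.
            return 1 if $ref eq $spec;
        }
    }
    return 0;
}

my @specs = ('refs/heads/master', '^refs/heads/release/');
foreach my $ref (qw{refs/heads/master refs/heads/release/1.3 refs/heads/topic}) {
    printf "%-24s %s\n", $ref, ref_matches($ref, @specs) ? 'enabled' : 'skipped';
}
```

Keeping the caret in the pattern means a spec such as ^refs/heads/release/ can never match in the middle of a ref name, which is usually what one wants when protecting a branch namespace.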

im_memberof(GIT, USER, GROUPNAME)

This routine tells if USER belongs to GROUPNAME. The groupname is looked for in the specification given by the githooks.groups configuration variable.

match_user(GIT, SPEC)

This routine checks if the authenticated user (as returned by the Git::More::authenticated_user method) matches the specification, which may be given in one of the three different forms acceptable for the githooks.admin configuration variable above, i.e., as a username, as a @group, or as a ^regex.

im_admin(GIT)

This routine checks if the authenticated user (again, as returned by the Git::More::authenticated_user method) matches the specifications given by the githooks.admin configuration variable.

eval_gitconfig(VALUE)

This routine makes it easier to grok config values as Perl code. If VALUE is a string beginning with eval:, the remainder of it is evaluated as a Perl expression and the resulting value is returned. If VALUE is a string beginning with file:, the remainder of it is treated as a file name whose contents are evaluated as Perl code and the resulting value is returned. Otherwise, VALUE itself is returned.
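The three cases can be illustrated with a small stand-in routine. This is a sketch of the described behaviour, not the source of eval_gitconfig; the names eval_perl and eval_config_value are hypothetical, and dying on evaluation errors is an assumption made for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Evaluate a string of Perl code and return its value, dying with a
# message that names the origin of the code if it does not compile
# or throws an exception.
sub eval_perl {
    my ($code, $origin) = @_;
    my $result = eval $code;
    die "cannot evaluate $origin: $@" if $@;
    return $result;
}

# Hypothetical stand-in for eval_gitconfig.
sub eval_config_value {
    my ($value) = @_;
    if ($value =~ /^eval:(.*)/s) {
        return eval_perl($1, 'eval: value');
    } elsif ($value =~ /^file:(.*)/s) {
        my $file = $1;
        open my $fh, '<', $file or die "cannot open '$file': $!\n";
        my $code = do { local $/; <$fh> };   # slurp the whole file
        close $fh;
        return eval_perl($code, "file '$file'");
    }
    return $value;                           # plain values pass through
}

# A temporary file whose contents evaluate to an array reference.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "[ 'alice', 'bob' ]\n";
close $fh;

print eval_config_value('plain text'), "\n";
print eval_config_value('eval:6 * 7'), "\n";
print join(',', @{ eval_config_value("file:$file") }), "\n";
```

Since the value is executed as code with the privileges of the hook, anyone who can change the repository's configuration can run arbitrary Perl, which is worth keeping in mind on shared servers.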

redirect_output

This routine redirects STDOUT and STDERR to a temporary file and returns a reference that should be passed to the routine restore_output to restore the handles to their original state.

restore_output REF

This routine gets a reference returned by redirect_output, restores STDOUT and STDERR to their previous state and returns a string containing all output produced since the previous call to redirect_output.
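The technique behind this pair of routines, saving the standard handles, pointing them at a temporary file, and later restoring them and reading the file back, can be sketched in plain Perl. The routines start_capture and stop_capture below are hypothetical re-implementations written for illustration; they are not the Git::Hooks code and their internal state format is an assumption.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw/tempfile/;

# Save STDOUT and STDERR, then redirect both to a temporary file.
# Returns the state needed by stop_capture.
sub start_capture {
    open my $saved_out, '>&', \*STDOUT or die "cannot dup STDOUT: $!";
    open my $saved_err, '>&', \*STDERR or die "cannot dup STDERR: $!";
    my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
    close $tmp_fh;
    open STDOUT, '>', $tmp_name or die "cannot redirect STDOUT: $!";
    open STDERR, '>&', \*STDOUT  or die "cannot redirect STDERR: $!";
    # Flush on every write so both streams interleave in the file.
    STDOUT->autoflush(1);
    STDERR->autoflush(1);
    return [$saved_out, $saved_err, $tmp_name];
}

# Restore the handles and return everything written in the meantime.
sub stop_capture {
    my ($state) = @_;
    my ($saved_out, $saved_err, $tmp_name) = @$state;
    open STDOUT, '>&', $saved_out or die "cannot restore STDOUT: $!";
    open STDERR, '>&', $saved_err or die "cannot restore STDERR: $!";
    open my $in, '<', $tmp_name or die "cannot read '$tmp_name': $!";
    my $output = do { local $/; <$in> };
    close $in;
    return defined $output ? $output : '';
}

my $state = start_capture();
print "to stdout\n";
print STDERR "to stderr\n";
# Child processes inherit the redirected descriptors as well.
system($^X, '-e', 'print "from a child process\n"');
my $captured = stop_capture($state);

print "captured:\n$captured";
```

Redirecting the underlying file descriptors rather than just the Perl-level handles is what allows the output of external commands, such as external hooks, to be captured too.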

file_temp REV, FILE, ARGS...

This routine returns a File::Temp object representing a temporary file into which the contents of the file FILE in revision REV have been copied.

The object's filehandle is closed before being returned.

It's useful for hooks that need to read the contents of changed files in order to check anything in them.

These objects are cached so that if more than one hook needs to get at them they're created only once.

By default, all temporary files are removed when the hook exits.

Any remaining ARGS are passed as arguments to File::Temp::new so that you can have more control over the temporary file creation.

SEE ALSO

REPOSITORY

https://github.com/gnustavo/Git-Hooks

AUTHOR

Gustavo L. de M. Chaves <gnustavo@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by CPqD <www.cpqd.com.br>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
