The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

makepp_signatures -- How makepp knows to rebuild files

DESCRIPTION

Each file is associated with a signature, which is a number or string that changes if the file has changed. Makepp compares signatures to see whether it needs to rebuild anything. The default signature for files is the file's modification time, unless you're executing a C/C++ compilation command, in which case the default signature is a cryptographic checksum on the file's contents, ignoring comments and whitespace. If you want, you can switch to a different method, or you can define your own signature functions.

In addition to the file's signature, it is also possible to control how makepp compares these signature values. For example, the exact_match method requires that file signatures be exactly the same as on the last build, whereas the target_newer method only requires that all dependencies be older than the target.

If makepp is building a file, and you don't think it should be, you might want to check the build log (.makepp_log). Makepp writes an explanation of what it thought each file depended on, and why it chose to rebuild.

At present, there are four signature checking methods included in makepp. Usually, makepp picks the most appropriate signature method automatically. However, you can change the signature method for an individual rule by using :signature modifier on the rule (see "signature" in makepp_rules), or for all rules in a makefile by using the signature statement (see "signature" in makepp_statements, or for all makefiles at once using the -m or --signature-method command line option (see "-m" in makepp_command).

Signature methods included in the distribution

exact_match

This method uses the modification dates on the file as signatures. It rebuilds the targets unless all of the following conditions are true:

  • The signature of each dependency is the same as it was on the last build.

  • The signature of each target is the same as it was on the last build.

  • The build command has changed.

  • The default directory for the build command has changed.

  • The machine architecture (or what perl thinks it is) has changed.

Makepp stores all the signature information and the build command from the last build, so that it can do these comparisons. (All the stored information is in the subdirectory .makepp of each directory.)

This is makepp's default algorithm unless it is trying to rebuild a makefile or compile C/C++ code. This is a highly reliable way of ensuring correct builds, and is almost always what you want. However, it does have a few side effects that may be surprising:

  • If you've been compiling with the traditional make, and then switch to makepp, everything is recompiled the first time you run makepp.

  • If you damage makepp's information about what happened on the last build (e.g., you delete the subdirectory .makepp, or don't copy it when you copy everything else), then a rebuild is triggered.

    -item *

    If you replace a file with an older version, a rebuild is triggered. This is normally what you want, but it might be surprising.

  • If you modify a file outside of the control of makepp (e.g., you run the compilation command yourself), then makepp will rebuild the file next time.

  • Architecture-independent files are rebuild when you switch to a different architecture. This is usually not a problem, because they often don't take long to build. The reason why all files are tagged with the architecture, instead of just binary files, is that often times even ASCII files are architecture-dependent. For example, output from the solaris lex program won't compile on linux (or at least this used to be true the last time I tried it).

target_newer

Rebuilds only if the target is newer than all of its dependencies. The dependencies may change their time stamp, but as long as they are older than the target, the target is not rebuilt. The target is also not rebuilt even if the command or the architecture has changed. (This is the signature method that the traditional make uses.)

This is makepp's default algorithm if it is trying to build the makefile before reading it in. (It loads the makefile and checks for a rule within the makefile to rebuild itself, and if such a rule is present and the makefile needs rebuilding, it is rebuild and then reread.) This is because it is common to modify a makefile using commands that are not under the control of makepp, e.g., running a configure procedure. Thus makepp doesn't insist that the last modification to the makefile be made by itself.

Using target_newer compared to exact_match has the following disadvantages:

  • makepp can be confused by clock synchronization or by bogus dates. For example, if a file somehow gets a date in the far future, anything that depends on it will always be rebuilt, no matter what.

  • Replacing a file with an older version of the same file won't trigger a rebuild.

  • Changing the build command (e.g., changing compilation options) won't trigger a rebuild.

  • Changing the architecture (e.g., switching from linux to solaris) won't trigger a rebuild.

md5

This is the same as exact_match, except that instead of using the file date as the signature, an MD5 checksum of the files contents is used. This means that if you change the date on the file but don't change its contents, makepp won't try to rebuild anything that depends on it.

This is particularly useful if you have some file which is often regenerated during the build process that other files depend on, but which usually doesn't change. If you use the md5 signature checking method, makepp will realize that the file's contents haven't changed even if the file's date has changed. (Of course, this won't help if the files have a timestamp written inside of them, as archive files do for example.)

For C/C++ source code, you should use c_compilation_md5 instead since it achieves the same thing but in a more powerful way.

c_compilation_md5

This is the same as exact_match, except that signatures for files which look like C or C++ source files are computed by an MD5 checksum of the file, ignoring comments and whitespace. (Technically, comments are replaced by a single space, and multiple whitespace characters are collapsed to a single space, before computing the MD5 checksum.) Ordinary file times are still used for signatures for object files, and any other files that don't have an extension typical of a C or C++ source file. (A file is considered to be source code if it has an extension of c, h, cc, hh, cxx, hxx, hpp, cpp, h++, c++, moc, or upper case versions of these.) If you use this, you can reindent your code or add or change comments without triggering a rebuild.

This method is particularly useful for the following situations:

  • You want to make changes to the comments in a commonly included header file, or you want to reformat or reindent part of it. For one project that I worked on a long time ago, we were very unwilling to correct inaccurate comments in a common header file, even when they were seriously misleading, because doing so would trigger several hours of rebuilds. With this signature method, this is no longer a problem.

  • You like to save your files often, and your editor (unlike emacs) will happily write a new copy out even if nothing has changed.

  • You have C/C++ source files which are generated automatically by other build commands (e.g., yacc or some other preprocessor). For one system I work with, we have a preprocessor which (like yacc) produces two output files, a .cxx and a .h file:

        %.h %.cxx: %.qtdlg $(HLIB)/Qt/qt_dialog_generator
            $(HLIB)/Qt/qt_dialog_generator $(input)

    However, most of the time when the input file changes, the resulting .h file contents are unchanged (except for a comment about the build time written by the preprocessor), although its date will change. This could trigger unnecessary rebuilds of many modules without this kind of cryptographic signature checking.

This is the default signature method for C or C++ compilation. It overrides any default specified with the -m or --signature-method command line option, but is overridden by any signature method specified by the signature</a> statement or the :signature rule modifier. Makepp determines that you are doing a C/C++ compilation if it recognizes your command line as an invocation of a C/C++ compiler (see makepp_scanning).

Custom methods

You can, if you want, define your own methods for calculating file signatures and comparing them. You will need to write a perl module to do this. Have a look at the comments in Signature.pm in the distribution, and also at the existing signature algorithms in Signature/*.pm for details.