NAME

makepp_cookbook -- The best way to set up makefiles for various situations

DESCRIPTION

I discovered that practically no one ever reads a manual for a make tool, because frankly no one really is interested in the make process itself--we are only interested in results. So this cookbook was put together in hopes that people will be able to get what they need quickly from the examples without wading through the manual.

Building libraries

Do you really need a library?

I have seen a number of large programs which consist of a large number of modules, each of which lives in its own directory. Commonly, each directory is put into its own library, and then the final program links with all of the libraries.

In many cases, I think rather than use a library, there is a better approach. Libraries are not really the right solution if each module cannot or will not be reused in any other program, because then you get all of the drawbacks of libraries and none of the advantages. In my opinion, libraries should only be used in two situations:

  1. When you have a bunch of subroutines which have to be linked with several different programs, and no program actually uses 100% of the subroutines--each program uses a different subset. In this case, it probably is a good idea to use a static library (a .a file, or an archive file).

  2. When you have a module which should be linked into several different programs, and you want to load it dynamically so each program doesn't have to have a separate copy of the library. Dynamic libraries can save executable file space and sometimes enhance system performance because there is only one copy of the library loaded for all of the different programs that use it.

Using static libraries has one main disadvantage: on some systems (e.g., Linux), the order in which you link the libraries is critically important. The linker processes libraries in the order specified on its command line. It grabs everything it thinks it needs from each library, then moves on to the next library. If some subsequent library refers to a symbol which hasn't yet been incorporated from a previous library, the linker does not know to go back and grab it from the previous library. As a result, it can be necessary to list a library multiple times on the linker command line. (I worked on a project where we had to repeat the whole list of libraries three times. This project is what made me prefer the alternative approach suggested below, that of incremental linking.)
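
As a rough illustration of the problem, suppose two hypothetical static libraries, libfoo.a and libbar.a, call into each other (the names are invented for this example). A link rule might then have to name one of them twice:

    # Hypothetical example: libfoo.a needs symbols from libbar.a, and
    # libbar.a in turn needs further symbols from libfoo.a, so libfoo.a
    # is listed a second time after libbar.a.
    program: main.o libfoo.a libbar.a
        $(CC) main.o libfoo.a libbar.a libfoo.a -o $(output)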

Using dynamic libraries has two main disadvantages. First, your program can be slightly slower to start up if the library isn't already being used by some other program, because it has to be found and loaded. Second, it can be a real hassle to get all the dynamic libraries installed in the correct locations; you can't just copy the program executable, you also have to make sure that you copy all of its libraries.

If your module will never be used in any other program, then there is no reason to use a library: you get all of the disadvantages of using libraries and none of the advantages. The technique I prefer is to use incremental linking, where it is available.

Here is how you can do this on Linux:

   my_module.o : $(filter_out my_module.o, $(wildcard *.o))
       ld -r -o $(output) $(inputs)

What this will do is to create another .o file called my_module.o, which will consist of all of the .o files in this subdirectory. The linker will resolve as many of the references as it can, and will leave the remaining references to be resolved in a subsequent stage of linking. At the top level, when you finally build your program, instead of linking with libmy_module.a or libmy_module.so, you would simply link with my_module.o. When you link .o files, you don't have problems with order-dependency in the linker command line.
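
A top-level rule along the following lines might then tie the incrementally linked modules together. This is only a sketch: the subdirectory names module_a and module_b are made up for illustration, and it assumes each subdirectory's makefile contains a rule like the one above.

    # Sketch only; module_a and module_b are hypothetical subdirectories,
    # each producing its own my_module.o with ld -r.
    program: main.o module_a/my_module.o module_b/my_module.o
        $(CC) $(inputs) -o $(output)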

Letting makepp figure out which library modules are needed

Even if you have a true library, where a given program needs only a few files from it (rather than every single module), makepp might be able to figure out which modules are needed from the library and include only those in the build. This can save compilation time if you are developing the library along with a program, because you don't bother to compile library modules that aren't needed for the particular program you are working on.

If your library strictly observes the convention that all functions or classes declared in a file xyz.h are completely implemented in a source file that compiles to xyz.o (i.e., you don't split up the implementation into xyz1.o and xyz2.o), then you can use the $(infer_objects ) function to tell makepp to pull out only the relevant modules from the library. This can work surprisingly well for libraries with even dozens of include files. Basically, $(infer_objects ) examines the list of .h files that are included, and looks for corresponding .o files. If you're rapidly developing a library and a program together, this can save compilation time, because you never bother to compile modules of the library that the program doesn't use.

Here's an example of the way I use it:

    my_program: $(infer_objects *.o, $(LIB1)/*.o $(LIB2)/*.o)
        $(CXX) $(inputs) -o $(output) $(SYSTEM_LIBRARIES)

The $(infer_objects ) function returns its first argument (after doing wildcard expansion on it), and also scans through the list of files in its second argument, looking for files whose name is the same as the name of any .h files included by any file in its first argument. If any such files are found, these are added to the list.
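
To make this concrete, here is a sketch with invented file names. It assumes the library follows the one-header-per-object convention described above and that makepp knows how to build every .o file involved.

    # Hypothetical layout (all file names invented for illustration):
    #   main.o is compiled from a source that includes parser.h and lexer.h
    #   $(LIB1) can build parser.o, lexer.o and unused.o
    #
    # Going by the description above, the dependency list should come out
    # as main.o $(LIB1)/parser.o $(LIB1)/lexer.o, so unused.o is never
    # compiled for this program.
    my_program: $(infer_objects main.o, $(LIB1)/*.o)
        $(CXX) $(inputs) -o $(output)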

Building a static library

If you are sure you actually need a library and incremental linking isn't available or isn't what you want to do, there are a couple of ways to do it. First, here is an example where all the files are explicitly listed:

    LIBRARY_FILES = a b c d e

    libmine.a: $(LIBRARY_FILES).o
       $(RM) -f $(output)
       $(AR) cr $(output) $(inputs)
       ranlib $(output)     # May not be necessary, depending on your OS.

If you're used to writing makefiles, you may be a little surprised by this command; you may be used to something more like this:

    libmine.a: $(LIBRARY_FILES).o
       $(AR) ru $@ $?      # Not recommended!!!!!!!
       ranlib $(output)

where $? is an automatic variable that means any files which have changed since the last time the library was built, and $@ is roughly the same as $(output).

This approach is not recommended for several reasons:

  • Suppose you remove a source file from the current directory. It's still in the library, because you didn't rebuild the library from scratch. As a result, anything that links with this library will have the stale .o file, and that can screw up your builds. (I once got thoroughly confused by this when I was trying to remove dead code from a project: I kept deleting files and it all still linked because the libraries still had the modules I removed. However, when someone else rebuilt the project from scratch, it didn't link any more!)

    This is why we first remove the library file, and create it from scratch. This will take slightly longer than just updating modules in a library, but not much longer; the amount of time consumed by the ar program is minuscule compared to what the C compiler takes up in a typical build, so it's just not worth worrying about.

  • One of the ways that makepp attempts to guarantee correct builds is that it will automatically rebuild if the command line to build a given target has changed. But using the $? variable can cause problems, because each time the library is updated, the build command is different. To prevent this, makepp doesn't actually implement the $? variable like other makes; $? is actually roughly equivalent to $(inputs).

Sometimes you may find that listing all the files is a bit of a pain, especially if a project is undergoing rapid development and the list of files is constantly changing. It may be easier to build the library using wildcards, like this:

    libmine.a: $(only_targets *.o)
        $(RM) -f $(output)
        $(AR) cr $(output) $(inputs)

This puts all the .o files in the current directory into the library. The wildcard matches any .o file which exists or can be built, so it will work even if the files don't exist yet.

The only_targets function is used to exclude .o files which don't have corresponding source files any more. Suppose you had a file called xyz.c that you used to put into your library. This means that there's an xyz.o file lying around. Now you delete xyz.c because it's obsolete, but you forget to delete xyz.o. Without the only_targets function, xyz.o would still be included in the list of .o files included in the library.

Building a dynamic library

The process of building dynamic libraries is entirely system dependent. I would highly recommend using libtool to build a dynamic library (see http://www.gnu.org/software/libtool), so you don't have to figure out how to do it on your platform, and so that your makefile will continue to work even when you switch to a different OS. See the libtool documentation for details. Here's a sample Makefile:

    LIBTOOL := libtool

    libflick.la : $(only_targets *.lo)
        $(LIBTOOL) --mode=link $(CC) $(inputs) -o $(output)

    %.lo : %.c
        $(LIBTOOL) --mode=compile $(CC) $(CFLAGS) $(INCLUDES) -c $(input) -o $(output)

Tips for using wildcards

Matching all files except a certain subset

Makepp's wildcards do not have any way at present of matching all files except a certain set, but you can do it with a combination of functions.

For example, suppose you have a test program for each module in a library, but you don't want to include the test programs in the library. If all the test programs begin with test, then you can exclude them like this:

    libproduction.a: $(filter_out test*, $(wildcard *.o))

The $(filter ) and $(filter_out ) functions are a very powerful set of filters to do all kinds of set intersection and difference operations. For example,

    SUBDIRS := $(filter_out *test* *$(ARCH)*, $(shell find . -type d -print))
                            # Returns all subdirectories that don't have
                            # "test" or $(ARCH) in them.

    $(filter $(patsubst test_dir/test_%.o, %.o, $(wildcard test_dir/*.o)), \
             $(wildcard *.o))
                            # Returns a list of .o files in the current 
                            # directory for which there is a corresponding
                            # test_*.o file in the test_dir subdirectory.
    $(filter_out $(patsubst man/man3/%.3, %.o, $(wildcard man/man3/*.3)), \
                 $(wildcard *.o))
                            # Returns a list of .o files in the current
                            # directory for which there is not a manual page
                            # with the same filename in the man/man3 subdirectory.

Using the $(only_targets ) function to eliminate stale .o files

Suppose you are building a program or a library with a build command like this:

   program: *.o
       $(CC) $(inputs) -o $(output)

Suppose you now delete a source file. If you forget to delete the corresponding .o file, it will still be linked in even though there is no way to build it any more. In the future, makepp will probably recognize this situation automatically and exclude it from the wildcard list, but at present, you have to tell it to exclude it manually:

    program: $(only_targets *.o)
       $(CC) $(inputs) -o $(output)

Makepp does not know any way to build the stale .o file any more since its source file is gone, so the $(only_targets ) function will exclude it from the dependency list.

Tips for multiple directories

One of the main reasons for writing makepp was to simplify handling of multiple directories. Makepp is able to combine build commands from multiple makefiles, so it can properly deal with a rule in one makefile that depends on a file which is built by a different makefile.

What to do in place of recursive make

Makepp supports recursive make for backward compatibility, but it is highly recommended that you not use it. If you don't know what it is, good.

See "Better system for hierarchical builds" in makepp for details on why you don't want to use recursive make, or else search on the web for "recursive make considered harmful".

Instead of doing a recursive make to make the all target in every makefile, it is usually easier to let makepp figure out which targets will actually need to be built. Furthermore, if you put all of your .o and library files in the same directory as the makefiles, then makepp will automatically figure out which makefiles are needed too--the only thing that's needed is to have your top-level makefile list the files that are needed for the final linking step. See the examples below.

One makefile for each directory: with implicit loading

The most common way to handle multiple directories is to put a makefile in each directory which describes how to build everything in that directory. If you put .o files in the same directory as the source files, then implicit loading (see "Implicit loading" in makepp_build_algorithm) will automatically find all the makefiles. If you put your .o files in a different directory (e.g., in an architecture-dependent subdirectory), then you will have to load all the relevant makefiles using the load_makefile statement.

Here is a sample top-level makefile for a directory hierarchy that uses implicit loading to build a program that consists of many shared libraries (but see "Do you really need a library?" in makepp_cookbook, because making a program out of a bunch of shared libraries is not necessarily a good idea):

   # Top level makefile:
   program : main.o **/*.la  # Link in shared libraries from all subdirectories.
       $(LIBTOOL) --mode=link $(CC) $(CFLAGS) $(inputs) -o $(output) $(LIBS)

That's pretty much all you need in the top level makefile. In each subdirectory, you would probably do something like this:

   # Makefile in each subdirectory:
   include standard_defs.mk   # Searches ., .., ../.., etc. until it
                              # finds the indicated include file.
   # override some variable definitions here
   SPECIAL_FLAGS := -do_something_different

Each makefile can probably be pretty much the same if the commands to build the targets are quite similar.

Finally, you would put the following into the standard_defs.mk file, which should probably be located in the top-level directory:

    # Common variable settings and build rules for all directories.
    CFLAGS := -g -O2
    INCLUDE_DIR := $(find_upwards includes)
                              # Searches ., .., ../.., etc. for a file or
                              # directory called includes, so if you put
                              # all your include files in there, this will
                              # find them.
    INCLUDES := -I$(INCLUDE_DIR)

    %.lo : %.c
        $(LIBTOOL) --mode=compile $(CC) $(CFLAGS) $(INCLUDES) -c $(input) -o $(output)
    
    lib$(relative_to ., ..).la: $(only_targets *.lo)
        $(LIBTOOL) --mode=link $(CC) $(CFLAGS) -o $(output) $(inputs)
                      # $(relative_to ., ..) returns the name of the current
                      # subdirectory relative to the upper level
                      # subdirectory.  So if this makefile is xyz/Makefile,
                      # this rule will build xyz/libxyz.la.

    # Copy public include files into the top-level include directory:
    $(INCLUDE_DIR)/public_%.h : public_%.h
        cp $(input) $(output)

One makefile for each directory: explicit loading

If you want to put all of your .o files into an architecture-dependent subdirectory, then the above example should be modified to be something like this:

   # Top level makefile:
   MAKEFILES := $(wildcard **/Makeppfile)  # List of all subdirectories to
                                           # get makefiles from.

   load_makefile $(MAKEFILES)       # Load them all in.

   include standard_defs.mk         # Get compile command for main.o.

   program : $(ARCH)/main.o */**/$(ARCH)/*.la 
       $(LIBTOOL) --mode=link $(CC) $(CFLAGS) $(inputs) -o $(output) $(LIBS)
                                    # */**/$(ARCH) excludes the subdirectory
                                    # $(ARCH), where we don't want to build
                                    # a shared library.        

Each makefile would be exactly the same as before:

   # Makefile in each subdirectory:
   include standard_defs.mk
   # ... variable overrides here

And finally, standard_defs.mk would contain something like the following:

    # Common variable settings and build rules for all directories.
    ARCH := $(shell uname -m)
    dummy := $(shell test -d $(ARCH) || mkdir -p $(ARCH))
                              # Make sure the output directory exists.
    CFLAGS := -g -O2
    INCLUDE_DIR := $(find_upwards includes)
                              # Searches ., .., ../.., etc. for a file or
                              # directory called includes, so if you put
                              # all your include files in there, this will
                              # find them.
    INCLUDES := -I$(INCLUDE_DIR)

    $(ARCH)/%.lo : %.c
        $(LIBTOOL) --mode=compile $(CC) $(CFLAGS) $(INCLUDES) -c $(input) -o $(output)
    
    $(ARCH)/lib$(relative_to ., ..).la: $(only_targets $(ARCH)/*.lo)
        $(LIBTOOL) --mode=link $(CC) $(CFLAGS) -o $(output) $(inputs)
                      # $(relative_to ., ..) returns the name of the current
                      # subdirectory relative to the upper level
                      # subdirectory.  So if this makefile is xyz/Makefile,
                      # this rule will build xyz/$(ARCH)/libxyz.la.

    # Copy public include files into the top-level include directory:
    $(INCLUDE_DIR)/public_%.h : public_%.h
        cp $(input) $(output)

Automatically making the makefiles

If your makefiles are all extremely similar (as in the above example), you can tell Makepp to build them automatically if they don't exist. Just add the following to your top-level makefile:

   SUBDIRS := $(filter_out unwanted_dir1 unwanted_dir2, $(shell find . -type d))
   $(foreach)/Makeppfile: : foreach $(SUBDIRS)
       echo "include standard_defs.mk" > $(output)
       echo "_include additional_defs.mk" >> $(output)
                             # If the file additional_defs.mk exists, then
                             # it will be included, but if it doesn't exist,
                             # the _include statement will be ignored.

Now the makefiles themselves will be automatically built.

One makefile only at the top level

If all your makefiles are identical, you may ask: why should I have a makefile at each level? Why not put that all into the top-level makefile?

Yes, this can be done. The main disadvantage is that it becomes harder to specify different build options for each subdirectory. A second disadvantage is that your makefile will probably become a bit harder to read.

Here's an example of doing just that:

    # Top-level makefile for directory hierarchy.  Builds the program
    # out of a set of shared libraries as an example.  (See caveats above 
    # for why you might want to use incremental linking or some other
    # approach rather than shared libraries.)
    percent_subdirs := 1     # Allow % to match multiple directories.
    SUBDIRS := $(filter_out *CVS* other-unwanted_dirs, $(shell find . -type d -print))
    CFLAGS := -g -O2
    INCLUDES := -Iincludes

    %.lo: %.c
        $(LIBTOOL) --mode=compile $(CC) $(INCLUDES) $(CFLAGS) -c $(input) -o $(output)

    $(foreach)/lib$(notdir $(foreach)).la: $(foreach)/*.lo : foreach $(SUBDIRS)
        $(LIBTOOL) --mode=link $(CC) $(CFLAGS) -o $(output) $(inputs)
                               # Rule to make all of the libraries.

    program : main.o **/*.la
        $(LIBTOOL) --mode=link $(CC) $(CFLAGS) -o $(output) $(inputs)
    
    includes/$(notdir $(foreach)) : $(foreach) : foreach **/public_*.h
        cp $(input) $(output)
                               # Sample rule for copying the publicly
                               # accessible .h files to the right place.

A clean target

Many makefiles have a phony target called clean which is just a name for a set of commands to remove all files that result from the make process. Usually a clean target looks something like this:

    $(phony clean):
        $(RM) -rf *.o .makepp*   # .makepp* gets rid of all of makepp's junk.

Instead of explicitly listing the files you want to delete, you can also tell makepp to remove everything it knows how to build, like this:

    $(phony clean):
        $(RM) -rf .makepp* $(only_targets *)

This has the advantage that if any of your source files can be built from other files, they will be deleted too; on the other hand, stale .o files (files which used to be buildable but whose source file has since been removed) will not be deleted.

If you have a build that involves makefiles in several different directories, your top-level makefile may reference the clean target (or any other phony target) in a different makefile:

    # Top-level makefile
    SUBDIRS := sub1 sub2

    # build rules here

    # Clean up after the build:
    $(phony clean): $(SUBDIRS)/clean
        $(RM) -rf .makepp* $(only_targets *)

Alternatively, you can put your clean target only in the top-level makefile, and have it process all of the directories, like this:

    $(phony clean):
        $(RM) -rf $(wildcard **/.makepp*) $(only_targets **/*)

The $(wildcard ) has to be used because the default unix shell doesn't know about the ** wildcard to match 0 or more intervening directories.

If you have a really huge build or really long filenames, this kind of a command can run into trouble with command line length limitations on some operating systems. Another approach will work:

    $(phony clean):
        find . -name '*.[oa]' -print | perl -nle unlink
        $(RM) -rf $(wildcard **/.makepp*)

(find . -name '*.[oa]' | perl -nle unlink does exactly the same thing as find . -name '*.[oa]' | xargs -r rm -f but is much faster.)

Using Qt's moc preprocessor

This example shows a makefile for a utility that uses Troll Tech's Qt GUI library (see http://www.trolltech.com). The only thing that's slightly unusual about this is that you must run a preprocessor called moc on most .h files that contain widget definitions, but you don't want to run moc on any .h files that don't use the Q_OBJECT macro.

Automatically determining which files need moc files

You could, of course, just list all of the .h files that need to have moc run on them. If you're rapidly developing new widgets, however, it may be something of a nuisance to keep updating the list in the makefile. You can get around the need to list the moc modules explicitly with something like this:

    MOC := $(QTDIR)/bin/moc
    MODULES     := whatever modules you happen to have in your program
    MOC_MODULES := $(patsubst %.h, moc_%, $(shell grep -l Q_OBJECT *.h))
                        # Scans all the .h files for the Q_OBJECT macro.
    
    my_program: $(MODULES).o $(MOC_MODULES).o
        $(CXX) $(inputs) -o $(output)
    
    moc_%.cxx: %.h              # Makes the moc files from the .h files.
        $(MOC) $(input) -o $(output)
    
    %.o: %.cxx
        $(CXX) $(CXXFLAGS) -c $(input) -o $(output)

This approach scans each of your .h files every time makepp is run, looking for the Q_OBJECT macro. This sounds expensive, but it probably won't take long at all. (The .h files will all have to be loaded from disk anyway by the compilation process, so they will be cached.)

#include the .moc file

Another approach is to #include the output from the moc preprocessor in your widget implementation file. This means you have to remember to write the #include, but it has the advantage that there are fewer modules to compile, and so compilation goes faster. (For most C++ compilation, the majority of the time is spent reading the header files, and the output from the preprocessor needs to include almost as many files as your widget anyway.) For example:

    // my_widget.h
    class MyWidget : public QWidget { 
      Q_OBJECT
    // ...
    };
    
    // my_widget.cpp
    
    #include "my_widget.h"
    #include "my_widget.moc"    // my_widget.moc is the output from the
                                // moc preprocessor.
    // Other implementation things here.
    MyWidget::MyWidget(QWidget * parent, const char * name) :
      QWidget(parent, name)
    {
     // ...
    }

Now you need to have a rule in your makefile to make all the .moc files, like this:

    MOC := $(QTDIR)/bin/moc
    # Rule to make .moc files:
    %.moc: %.h
        $(MOC) $(input) -o $(output)

Makepp is smart enough to realize that it needs to make my_widget.moc if it doesn't already exist, or if it's out of date.

This second approach is the one that I usually use because it speeds up compilation.

Replacements for deprecated make idioms

MAKECMDGOALS

Sometimes people have rules in their makefile depend on what target they are building, using the special variable MAKECMDGOALS. For example, one sometimes sees things like this:

   ifneq ($(filter production, $(MAKECMDGOALS)),)
     CFLAGS := -O2
   else
     CFLAGS := -g
   endif

This will work fine with makepp. However, I recommend against using MAKECMDGOALS for such cases (and so does the GNU make manual). You are better off putting your optimized and debug-compiled .o files in separate directories, or giving them different prefixes or suffixes, or using repositories, to keep them separate.
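
As a sketch of the prefix idea, the following fragment keeps the two sets of object files apart by name. The debug_ prefix and the target names are assumptions chosen for illustration; the separate-directory variant is shown under "Recursive make to change value of a variable" below.

    # Sketch only: the debug_ prefix and the target names are illustrative.
    program: a.o b.o
        $(CC) $(inputs) -o $(output)

    debug_program: debug_a.o debug_b.o
        $(CC) $(inputs) -o $(output)

    %.o: %.c                  # Optimized objects.
        $(CC) -O2 -c $(input) -o $(output)

    debug_%.o: %.c            # Debug objects, kept apart by their prefix.
        $(CC) -g -c $(input) -o $(output)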

Probably the only time when you might actually want to reference MAKECMDGOALS is if it takes a long time to load your makefiles, and you don't need that for your clean target. For example,

   ifneq ($(MAKECMDGOALS),clean)
     load_makefile $(wildcard **/Makeppfile)
   else
     no_implicit_load . # Prevent automatic loading of any other makefiles.
   endif

   $(phony clean):
       rm -rf $(wildcard **/*.o)

Recursive make to build in different directories

See "Tips for multiple directories" in makepp_cookbook.

Recursive make to change value of a variable

Some makefiles reinvoke themselves with a different value of a variable, e.g., the debug target in the following makefile fragment:

    .PHONY: optimized debug

    optimized:
        $(MAKE) program CFLAGS=-O2

    debug:
        $(MAKE) program CFLAGS=-g

    program: a.o b.o
        $(CC) $(CFLAGS) $^ -o $@

    %.o : %.c
        $(CC) $(CFLAGS) -c $< -o $@

If the user types "make debug", it builds the program in default mode with debug enabled instead of with optimization.

A better way to do it is to build two different programs, with two different sets of object files, like this:

    CFLAGS := -O2
    DEBUG_FLAGS := -g
    MODULES := a b

    program: $(MODULES).o
         $(CC) $(CFLAGS) $(inputs) -o $(output)

    debug/program: debug/$(MODULES).o
         $(CC) $(DEBUG_FLAGS) $(inputs) -o $(output)

    %.o : %.c
         $(CC) $(CFLAGS) -c $(input) -o $(output)

    debug/%.o : %.c
         $(CC) $(DEBUG_FLAGS) -c $(input) -o $(output)

    $(phony debug): debug/program

The advantage of doing it this way is that you don't need to rebuild everything when you switch from debug to optimized and back again.

The above can be written somewhat more concisely using repositories. The following makefile is exactly equivalent:

    repository debug=.     # Makes the debug subdirectory look like a copy of 
                           # the current subdirectory.
    load_makefile debug CFLAGS=-g    
                           # Override CFLAGS when invoked in debug subdirectory
    CFLAGS := -O2          # Value of CFLAGS when invoked in this subdirectory

    program: a.o b.o
        $(CC) $(CFLAGS) $^ -o $@

    %.o : %.c
        $(CC) $(CFLAGS) -c $< -o $@

    $(phony debug): debug/program
                           # If user types "makepp debug", builds
                           # debug/program instead of program.

Miscellaneous tips

How do I make sure my output directories exist?

You can specify a rule to build the output directory, then make sure that each file that goes in the output directory depends on it. But it's usually easier to do something like this:

   dummy := $(shell test -d $(OUTPUT_DIRECTORY) || mkdir -p $(OUTPUT_DIRECTORY))
                # This is usually easier than making all files depend on
                # $(OUTPUT_DIRECTORY) and having a rule to make it.
                # Note that you must use := instead of = to force it to
                # execute immediately.
   # An alternative approach: using perl code
   perl_begin
   -d $OUTPUT_DIRECTORY or mkdir $OUTPUT_DIRECTORY;
   perl_end

These statements should be near the top of your makefile, so they are executed before anything that could possibly need the directory.

How do I force a command to execute on every build?

The easiest way is not to use the rule mechanism at all, but simply to execute it, like this:

   dummy := $(shell date > last_build_timestamp)

Or put it in a perl block, like this:

   perl_begin
   system("command to execute");
   perl_end

This approach has the disadvantage that it will be executed even if the clean target is being run.
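
If that matters, one possible workaround is to skip the command when clean is the only goal requested. The following is a sketch that combines the MAKECMDGOALS test shown earlier in this document with the $(shell ) trick above:

    # Sketch: skip the timestamp command when the only goal is clean.
    ifneq ($(MAKECMDGOALS),clean)
      dummy := $(shell date > last_build_timestamp)
    endif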