Friday, October 7, 2011

Syncevolution build system work

In August Openismus asked me to work on Syncevolution's build system. The most important part of this work was converting it from recursive Automake to a non-recursive setup while keeping some features of the previous build system, namely version number generation and, if possible, not having to list the files to distribute by hand. A side benefit of the conversion would be a faster parallel build. An additional objective was to make the build system less confusing for newcomers, but I doubt I succeeded in that one.

Before I started the work, the build system of Syncevolution consisted of two sort-of-configure.in files (configure-pre.in and configure-post.in), Makefile-gen.am in src/, gen-autotools.sh, configure-sub.in files in every backend directory and a bunch of regular Makefile.am files. How did this work? autogen.sh called gen-autotools.sh, which created a proper configure.in by sandwiching the contents of all configure-sub.in files between configure-pre.in and configure-post.in and substituting the version in AC_INIT with the one it computed. It also did some find-and-sed magic on Makefile-gen.am to generate a Makefile.am with the list of found backend directories in the SUBDIRS variable. After finishing these steps, it called the autotools, in a different order than autoreconf does, but that was a minor issue. Pretty messy, eh? Another issue was that the toplevel Makefile.am and src/Makefile-gen.am were a mess: lots of nested ifs mixed with some custom rules and variables here and there.

While I liked the idea of injecting the contents of configure-sub.in into the final configure.in, I didn't quite like how it was done. I wanted autogen.sh to call just autoreconf with some flags. I wanted no sort-of-configure.in files. I wanted no Makefile-gen.am. I wanted no script doing sed on either configure.ac or Makefile.am.

For version number generation I just stole an idea from autoconf (it uses m4_esyscmd in AC_INIT), so now configure.ac calls gen-git-version.sh from within AC_INIT.
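
As an illustration only, here is a minimal sketch of what a script of this kind might look like. It is not the real gen-git-version.sh: the function name print_version, the use of `git describe` and the fallback string are all assumptions. The one detail that matters for m4_esyscmd is that the output carries no trailing newline, since the output is pasted into AC_INIT verbatim.

```shell
#!/bin/sh
# Hypothetical sketch, not the real gen-git-version.sh.
# Prints a version string WITHOUT a trailing newline, because
# m4_esyscmd inserts the command output verbatim.
print_version() {
    # Assumption: the version is derived from git tags via `git describe`.
    version=$(git describe --tags --always 2>/dev/null) || version=
    # Outside a git checkout (or without git) fall back to a placeholder.
    printf '%s' "${version:-unknown}"
}

print_version
```

A script like this would be referenced from the version argument of AC_INIT through m4_esyscmd.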

I put the code injecting the contents of the configure-sub.in files inside an m4 macro (which in fact calls a script), which made it possible to merge the sort-of-configure.in files into a single configure.ac.
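
The script behind such a macro can be quite small. The following is a sketch under assumptions, not the actual Syncevolution script: the function name emit_backend_configure is made up, and it assumes one configure-sub.in per backend directory somewhere below the directory given as the argument.

```shell
#!/bin/sh
# Hypothetical sketch: print the contents of every configure-sub.in found
# below a directory, in a stable (sorted) order, so that the result can be
# spliced into configure.ac by an m4 macro.
emit_backend_configure() {
    # $1: directory to search (assumed layout: one configure-sub.in per backend)
    find "$1" -name configure-sub.in | LC_ALL=C sort | while IFS= read -r f; do
        cat "$f"
    done
}
```

Sorting the file list keeps the generated configure script reproducible between runs.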

Doing magic on Makefile-gen.am turned out to be unnecessary after the conversion to non-recursive Automake, because SUBDIRS is hardly used there. Instead, a backends.am file with 'include <backend_name>/<backend_name>.am' lines is generated by yet another script.
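
A generator for such a file could look roughly like this. It is only a sketch: the function name gen_backends_am is hypothetical, and it assumes that every backend lives in its own subdirectory and ships a fragment named after that directory.

```shell
#!/bin/sh
# Hypothetical sketch: print one Automake `include` line for every backend
# directory that contains a fragment named after itself (foo/foo.am).
gen_backends_am() {
    # $1: directory holding the backend subdirectories (assumed layout)
    for dir in "$1"/*/; do
        name=$(basename "$dir")
        # Skip directories that do not ship a <name>.am fragment.
        if [ -f "$dir$name.am" ]; then
            echo "include $name/$name.am"
        fi
    done
}
```

The output would be redirected into backends.am, which the main .am file then includes.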

The above steps clearly don't help the readability of the build system. Maybe at some point such tricks could be removed.

Automake's documentation says that Automake itself should have enough support for generating a non-recursive build system. But there were still some hurdles to clear.

  1. Automake has a useful feature for installing/distributing a directory structure without having to specify foodir and foo_DATA variables for every subdirectory: the nobase_ prefix. Apparently it is not that useful in non-recursive Automake. Why? I explained it in detail in a feature request I reported to Automake.
  2. At some point 'make -j4 distcheck' failed. After some digging I noticed that libtool was trying to relink a backend against a library that should have been installed at that point but seemingly wasn't. That looked like a race condition caused by incomplete dependencies. There are already some reports/feature requests for the ability to specify install-time dependencies. In general Automake generates install-am rules as follows:

    install-am: all-am
     @$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
    

    so install-exec-am and install-data-am can be executed in parallel. The install-exec-am rule installs all libraries, scripts and programs that should reside directly in one of the standard directories ($(libdir), $(bindir), $(libexecdir) and so on), while install-data-am installs the rest. So the library is installed during install-exec-am and the backend during install-data-am. Seemingly, the make job running the latter rule installs (and relinks) the backend before the library is installed.

    My temporary hack was to override the install-am rule to do these steps sequentially:

    install-am: all-am
      @$(MAKE) $(AM_MAKEFLAGS) install-exec-am
      @$(MAKE) $(AM_MAKEFLAGS) install-data-am
    

    This is ugly because it invades Automake's namespace.
  3. Distributing files, that is, creating a tarball, is a special task in Automake. The list of files to distribute does not depend on conditionals. Unfortunately Automake's documentation is not clear about it. Misunderstanding this often leads to a situation like the one below:

    in configure.ac:

    # need rst2html for HTML version of README
    AC_ARG_WITH(rst2html,
                ...,
                [AC_PATH_PROG(RST2HTML, rst2html, "no")])
    AM_CONDITIONAL([COND_HTML_README], [test "$RST2HTML" != "no"])
    

    in Makefile.am:

    if COND_HTML_README
    dist_doc_DATA += README.html
    endif
    ...
    README.html: README.rst
      $(RST2HTML) --initial-header-level=3 --exit-status=3 $< >$@
    

    The usual justification for such a setup is to avoid a hard dependency on rst2html while still being able to ship README.html in the tarball. Now suppose one clones the repository, calls autogen.sh to generate the build system, calls configure with --with-rst2html=no (or just configure, if rst2html is not installed) to generate the Makefiles, and then calls make && make dist to build the project and generate the tarball. The build will fail while executing the dist target. That is because make wants to put README.html into the tarball and thus executes the rule generating it, but the RST2HTML variable is 'no'. This often goes unnoticed for a long time.

    Why does make insist on README.html? Because Automake strives to produce tarballs that always have the same content, regardless of the flags passed to configure and regardless of what software happens to be installed. With this in mind there are two clean solutions: either never distribute README.html or always do it. For the former, changing the line:

    dist_doc_DATA += README.html
    
    into:
    nodist_doc_DATA += README.html
    
    should be enough. For the latter, make rst2html a hard dependency and thus remove the COND_HTML_README conditional.

    There is also a sort of solution that keeps README.html always distributed:
    dist_doc_DATA += README.html
    
    if COND_HTML_README
    
    README.html: README.rst
      $(RST2HTML) --initial-header-level=3 --exit-status=3 $< >$@
    
    else
    
    README.html:
      if test ! -f README.html ; \
      then \
        echo "no rst2html and README.html is not found!"; \
        exit 1; \
      fi
    
    endif
    
    Understanding this solution is left to the reader as an exercise.
  4. Non-recursive Automake means that only one Makefile (the toplevel one) is generated, which forces the developer to be careful when using variables. All .am files are included into the toplevel Makefile.am, so some variables may get clobbered. To avoid clobbering a variable meant as an internal one (say: my_sources), it is a good idea to prefix it with the escaped path of its .am file, for instance src_dbus_server_my_sources. To avoid clobbering an Automake variable (say: lib_LTLIBRARIES), it is good to initialize it with an empty value at the beginning of Makefile.am and later only append values to it:

    # beginning of Makefile.am
    lib_LTLIBRARIES =
    ...
    # somewhere later, or in another file included by Makefile.am
    lib_LTLIBRARIES += src/foo/libfoo.la
    src_foo_libfoo_la_SOURCES = ...
    

    Since I had about twenty such variables (MAINTAINERCLEANFILES, DISTCLEANFILES, bin_PROGRAMS, dist_noinst_DATA and so on), I created a separate setup-variables.am file which contained only such initializations and included it in the toplevel Makefile.am.

    With such a system it is not only variables that may be clobbered, but also local rules meant to be run as hooks, like installcheck-local, and some special make targets (.PHONY). My solution was to add these lines to setup-variables.am:

    all_dist_hooks =
    all_phonies =
    

    Add lines to toplevel Makefile.am:

    .PHONY: $(all_phonies) ;
    
    dist-hook: $(all_dist_hooks) ;
    

    And to the .am file with the dist check routine:

    all_dist_hooks += src_dist_hook
    src_dist_hook:
     ...
    ...
    all_phonies += $(TEST_FILES_GENERATED)
    $(TEST_FILES_GENERATED):
     ...
    
  5. Another undocumented behavior (or maybe a bug) is that one has to define an explicit foo_DEPENDENCIES variable when foo_LIBADD (or foo_LDADD) contains an AC_SUBSTed variable holding a path to an in-project library. Otherwise Automake won't generate a dependency on that library and a race condition ensues (foo may be linked before the library is built). This is described in detail in the bug report I filed.
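
The prefixing convention from item 4 mirrors the way Automake canonicalizes names when it builds variable references such as src_foo_libfoo_la_SOURCES: every character except letters, digits, '@' and '_' is turned into an underscore. A small helper can compute such a prefix from a path; the function name am_canonicalize is made up for this sketch.

```shell
#!/bin/sh
# Sketch: turn a path into a variable prefix the way Automake canonicalizes
# names: every character except letters, digits, '@' and '_' becomes '_'.
# The function name is hypothetical.
am_canonicalize() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^A-Za-z0-9_@]/_/g'
}

am_canonicalize src/foo/libfoo.la   # prints src_foo_libfoo_la
```

Using the same rule for internal variables keeps their names predictable: anything defined in a given .am file starts with the escaped path of that file.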

When I file bug reports for any project, I try to at least find where the problem is in the source code and create a patch. But Automake, being a script of 8k lines of scarcely documented functions, scared me away. I suppose modularisation should have been done early on instead of writing such a monster. Maybe later I'll try looking at it again.

In the end I must say that I am not happy with the outcome. Some generated files probably do not need to be generated anymore, so they should live in the repository, be visible to newcomers, and the scripts that used to generate them should be removed. That would certainly improve clarity. The toplevel Makefile.am and src.am in src/ are still a mess. configure.ac is still a mess too. Whether non-recursive Automake is less confusing to newcomers is also questionable. I suppose that most people using Automake are used to a recursive build system and tend to treat .am files as Makefile.am files. That is an obvious pitfall, because $(srcdir) and $(builddir) then refer to the toplevel directory instead of the directory of the .am file. But I suppose that Autotools are confusing for newcomers in general.

tl;dr - I noticed that when converting a complex recursive Autotools-based build system (such as the one Syncevolution had) to a non-recursive one, some problems that never appeared before may (or rather: will) appear. Some of them appear to stem from behavior that is not clearly described in Automake's documentation.
