
1. Introduction

AutoGen is a tool designed for generating program files that contain repetitive text with varied substitutions. Its goal is to simplify the maintenance of programs that contain large amounts of repetitious text. This is especially valuable if there are several blocks of such text that must be kept synchronized.

One common example is the problem of maintaining the code required for processing program options. Processing options requires that a minimum of four different constructs be kept in proper order in different places in your program. You need at least:

  1. the flag character in the flag string,
  2. code to process the flag when it is encountered,
  3. a global state variable or two, and
  4. a line in the usage text.

You will need more than this if you choose to implement long option names, rc/ini file processing, environment variables and so on. All of this can be done mechanically with the proper templates and this program. In fact, it has already been done, and AutoGen itself uses it (see section 7. Automated Option Processing). For a simple example of automated option processing, see section 7.3 Quick Start. For a full list of the automated option features, see section 7.1 AutoOpts Features.

1.1 The Purpose of AutoGen  
1.2 A Simple Example  
1.3 csh/zsh caveat  
1.4 A User's Perspective  


1.1 The Purpose of AutoGen

The idea of this program is to have a text file, a template if you will, that contains the general text of the desired output file. That file includes substitution expressions and sections of text that are replicated under the control of separate definition files.

AutoGen was designed with the following features:

  1. The definitions are completely separate from the template. Completely isolating the definitions from the template greatly increases the flexibility of the template implementation. A secondary goal is that a template user only needs to specify those data that are necessary to describe their application of a template.

  2. Each datum in the definitions is named. Thus, definitions can be rearranged, augmented or made obsolete without the need to go back and clean up older definition files. This reduces incompatibilities.

  3. Multiple values for a given name create an array of values. These arrays of values are used to control the replication of sections of the template.

  4. There are named collections of definitions. They form a nested hierarchy. Associated values are collected and associated with a group name. These associated data are used collectively in sets of substitutions.

  5. The template has special markers to indicate where substitutions are required, much like the ${VAR} construct in a shell here doc. These markers are not fixed strings. They are specified at the start of each template. Template designers know best what fits into their syntax and can avoid marker conflicts.

    We did this because it is burdensome and difficult to avoid conflicts using either M4 tokenization or C preprocessor substitution rules. It also makes it easier to specify expressions that transform the value. Of course, our expressions are less cryptic than the shell methods.

  6. These same markers are used, in conjunction with enclosed keywords, to indicate sections of text that are to be skipped and sections of text that are to be repeated. This is a major improvement over using C preprocessing macros. With the C preprocessor, you have no way of selecting output text because it is an unvarying, mechanical substitution process.

  7. Finally, we supply methods for carefully controlling the output. Sometimes it is simply easier and clearer to compute some text or a value in one context when it is needed only later. So, functions are available for saving text or values for later use.


1.2 A Simple Example

This is just one simple example that shows a few basic features. If you are interested, you may also run "make check" with the VERBOSE environment variable set and see a number of other examples in the `agen5/test/testdir' directory.

Assume you have an enumeration of names and you wish to associate some string with each name. Assume also, for the sake of this example, that it is either too complex or too large to maintain easily by hand. We will start by writing an abbreviated version of what the result is supposed to be. We will use that to construct our output templates.

In a header file, `list.h', you define the enumeration and the global array containing the associated strings:

typedef enum {
        IDX_ALPHA,
        IDX_BETA,
        IDX_OMEGA }  list_enum;

extern const char* az_name_list[ 3 ];

Then you also have `list.c' that defines the actual strings:

#include "list.h"
const char* az_name_list[] = {
        "some alpha stuff",
        "more beta stuff",
        "final omega stuff" };

First, we will define the information that is unique for each enumeration name/string pair.

autogen definitions list;
list = { list_element = alpha;
         list_info    = "some alpha stuff"; };
list = { list_info    = "more beta stuff";
         list_element = beta; };
list = { list_element = omega;
         list_info    = "final omega stuff"; };

The autogen definitions list; entry defines the file as an AutoGen definition file that uses a template named list. That is followed by three list entries that define the associations between the enumeration names and the strings. The order of the differently named elements inside of list is unimportant. They are reversed inside of the beta entry and the output is unaffected.

Now, to actually create the output, we need a template or two that can be expanded into the files you want. In this program, we use a single template that is capable of multiple output files.

It looks something like this. (For a full description, See section 3. AutoGen Template.)

[+ AutoGen5 template h c +]
[+ CASE (suffix) +][+
   ==  h  +]
typedef enum {[+
   FOR list "," +]
        IDX_[+ (string-upcase! (get "list_element")) +][+
   ENDFOR list +] }  list_enum;

extern const char* az_name_list[ [+ (count "list") +] ];
[+

   ==  c  +]
#include "list.h"
const char* az_name_list[] = {[+
  FOR list "," +]
        "[+list_info+]"[+
  ENDFOR list +] };[+

ESAC +]

The [+ AutoGen5 template h c +] text tells AutoGen that this is an AutoGen version 5 template file; that it is to be processed twice; that the start macro marker is [+; and the end marker is +]. The template will be processed first with a suffix value of h and then with c. Normally, the suffix values are appended to the `base-name' to create the output file name.

The [+ == h +] and [+ == c +] CASE selection clauses select different text for the two different passes. In this example, the output is nearly disjoint and could have been put in two separate templates. However, sometimes there are common sections and this is just an example.

The [+FOR list "," +] and [+ ENDFOR list +] clauses delimit a block of text that will be repeated for every definition of list. Inside of that block, the definition name-value pairs that are members of each list are available for substitutions.

The remainder of the macros are expressions. Some of these contain special expression functions that are dependent on AutoGen named values; others are simply Scheme expressions, the result of which will be inserted into the output text. Other expressions are names of AutoGen values. These values will be inserted into the output text. For example, [+list_info+] will result in the value associated with the name list_info being inserted between the double quotes and (string-upcase! (get "list_element")) will first "get" the value associated with the name list_element, then change the case of all the letters to upper case. The result will be inserted into the output document.

If you have compiled AutoGen, you can copy out the template and definitions, run `autogen' and produce exactly the hypothesized desired output.

One more point. Let's say you decided it was too much trouble to figure out how to use AutoGen, so you created this enumeration and string list by hand, with thousands of entries. Now, requirements have changed and it has become necessary to map a string containing the enumeration name into the enumeration number. With AutoGen, you just alter the template to emit the table of names. It will be guaranteed to be in the correct order, missing none of the entries. If you want to do that by hand, well, good luck.
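
As an illustration of that last point, here is a sketch of what the additional output might look like for the three-entry example, together with a hand-written lookup function. The table name az_enum_name, the LIST_COUNT macro and the function list_name_to_index are hypothetical. Only the table would come from the altered template, and its entries follow the enumeration order because both are produced from the same list definitions.

```c
#include <assert.h>
#include <string.h>

typedef enum {
        IDX_ALPHA,
        IDX_BETA,
        IDX_OMEGA }  list_enum;

#define LIST_COUNT 3

/* Hypothetical generated table: enumeration names in enumeration order. */
static const char *const az_enum_name[LIST_COUNT] = {
        "alpha",
        "beta",
        "omega" };

/* Map a name to its enumeration number; returns -1 if the name is unknown. */
int list_name_to_index(const char *name)
{
    int i;

    assert(name != NULL);
    for (i = 0; i < LIST_COUNT; i++) {
        if (strcmp(name, az_enum_name[i]) == 0)
            return i;
    }
    return -1;
}
```

A linear search is enough for a sketch. With thousands of entries a sorted table and a binary search, or a hash, would be the more likely choice, and either could be generated from the same definitions.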


1.3 csh/zsh caveat

AutoGen tries to use your normal shell so that you can supply shell code in a manner you are accustomed to using. If, however, you use csh or zsh, you cannot do this. Csh is sufficiently difficult to program that it is unsupported. Zsh, though largely programmable, also has some anomalies that make it incompatible with AutoGen usage. Therefore, when invoking AutoGen from these environments, you must be certain to set the SHELL environment variable to a Bourne-derived shell, e.g., sh, ksh or bash.
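
A program that launches AutoGen on a user's behalf could check the SHELL setting before doing so. The following is only a sketch of such a check and is not part of AutoGen. The function name is_bourne_derived is hypothetical, and the accepted names are just the three examples given above, so a real check might well accept others.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if the path names one of the Bourne-derived shells mentioned
 * in the text (sh, ksh, bash), and 0 for anything else (csh, zsh, ...). */
int is_bourne_derived(const char *shell_path)
{
    static const char *const ok_shells[] = { "sh", "ksh", "bash" };
    const char *base;
    size_t i;

    assert(shell_path != NULL);

    /* Compare only the last path component, e.g. "bash" in "/bin/bash". */
    base = strrchr(shell_path, '/');
    base = (base != NULL) ? base + 1 : shell_path;

    for (i = 0; i < sizeof ok_shells / sizeof ok_shells[0]; i++) {
        if (strcmp(base, ok_shells[i]) == 0)
            return 1;
    }
    return 0;
}
```

A caller would typically pass the result of getenv("SHELL") and, if the check fails, set SHELL to a suitable shell with setenv() before starting AutoGen.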

Any shell you choose for your own scripts needs to meet these basic requirements:

  1. It handles trap $sig ":" without writing anything to standard output. This is done when the server shell is first started. If your shell does not handle this, it may be able to do so by loading functions from its start-up files.
  2. At the beginning of each scriptlet, the command \\cd $PWD is inserted. This ensures that cd is not aliased to something peculiar and each scriptlet starts life in the execution directory.
  3. At the end of each scriptlet, the command echo mumble is appended. The program you use as a shell must emit the single argument mumble on a line by itself.


1.4 A User's Perspective

Alexandre wrote:
> I'd appreciate opinions from others about advantages/disadvantages of
> each of these macro packages.

I am using AutoGen in my pet project, and find one of its best points to be that it separates the operational data from the implementation.

Indulge me for a few paragraphs, and all will be revealed: In the manual, Bruce cites the example of maintaining command line flags inside the source code; traditionally spreading usage information, flag names, letters and processing across several functions (if not files). Investing the time in writing a sort of boiler plate (a template in AutoGen terminology) pays by moving all of the option details (usage, flags names etc.) into a well structured table (a definition file if you will), so that adding a new command line option becomes a simple matter of adding a set of details to the table.

So far so good! Of course, now that there is a template, writing all of that tedious optargs processing and usage functions is no longer an issue. Creating a table of the options needed for the new project and running AutoGen generates all of the option processing code in C automatically from just the tabular data. AutoGen in fact already ships with such a template... AutoOpts.

One final consequence of the good separation in the design of AutoGen is that it is retargetable to a greater extent. The egcs/gcc/fixinc/inclhack.def can equally be used (with different templates) to create a shell script (inclhack.sh) or a c program (fixincl.c).

This is just the tip of the iceberg. AutoGen is far more powerful than these examples might indicate, and has many other varied uses. I am certain Bruce or I could supply you with many and varied examples, and I would heartily recommend that you try it for your project and see for yourself how it compares to m4.

As an aside, I would be interested to see whether someone might be persuaded to rationalise autoconf with AutoGen in place of m4... Ben, are you listening? autoconf-3.0! `kay? =)O|

        Gary V. Vaughan


This document was generated by Bruce Korb on October, 20 2001 using texi2html
