M4 - A Macro Expansion Language


Macro names, or tokens, are any sequence of letters, digits and underscores that starts with a letter or an underscore. Any character that does not match this terminates the token and is passed straight to the output. So TOK1 TOK2 are two separate tokens, as are TOK1.TOK2, TOK1()TOK2 and so on...

Macros are defined using the define macro, which, note, is itself a macro and not a function! In general a define looks like this:

define(`macro-name', `what-macro-expands-to')

Where ` (a backtick) is the opening quote and ' (an apostrophe) is the closing quote. Take the following example. If M4 is fed the following input...

define(`jehtech', `a notes website coz I forget')
jehtech

Then M4 will output:

a notes website coz I forget

M4 parses the input. The define block expands to the empty string but has the side effect of adding the mapping from "jehtech" to "a notes website coz I forget" to the macro table.

Thus, when the next line of input is entered, M4 recognises the token "jehtech" as a macro for expansion, and expands it to "a notes website coz I forget". In later sections you will see that it will then recursively re-evaluate and expand this until no expansion is possible, at which point it is fed to the output.

It's worth noting that you can define macros on multiple lines to make reading easier. So, for example, we could re-write the above to this:

define(`jehtech',
   `a notes website coz I forget' <- Note this newline is included as part of the string jehtech expands to.
)

Note, however, that the output when the macro "jehtech" is expanded, will contain an extra newline at the end of the expansion because there is a newline between the trailing end-quote and the macro-terminating bracket on the last line which M4 interprets to be part of what the macro should expand to.

The easiest solution is to just write:

define(`jehtech',
    `a notes website coz I forget') <- Note bracket on same line now

To remove a macro definition use the undefine macro.

Macro Expansion, Quoting & Parameters

It's worth noting that the following examples are a little contrived but should serve to show how evaluation and expansion of tokens occurs. Also note that the parameters to define are usually quoted and the define is usually followed by a dnl, but for these examples we won't do this.


As the input is read, each token is checked to see if it is a macro. If it is, it is expanded. The result is then re-scanned and macro expansion is applied recursively. This is also true for macro parameters/arguments: they are recursively expanded before being "passed" to the macro.

We can investigate this by turning on M4's debug features. The following examples will demonstrate both recursive macro expansion and recursive macro parameter expansion.

The Most Basic Expansion

$ m4 --debug=V -
define(XXX, joey)
m4trace:stdin:21: -1- id 23: define ...                    | The token 'define' is seen.
m4trace:stdin:21: -1- id 23: define(`XXX', `joey') -> ???  | It is a known macro so the following tokens are parameters. What does it expand to?
m4trace:stdin:21: -1- id 23: define(...)                   | No further expansion possible, so it expands to what it is; pass to interpreter.
                                                           | Mapping XXX -> joey added to macro table

XXX says hello
m4trace:stdin:22: -1- id 24: XXX ...
m4trace:stdin:22: -1- id 24: XXX -> ???
m4trace:stdin:22: -1- id 24: XXX -> `joey'
joey says hello

Recursive Macro Expansion

A slightly different example:

$ m4 --debug=V -
define(play, dodge)
m4trace:stdin:1: -1- id 1: define ...                           | The normal define stuff here....
m4trace:stdin:1: -1- id 1: define(`play', `dodge') -> ???
m4trace:stdin:1: -1- id 1: define(...)

define(dodgeball, HURTS)
m4trace:stdin:2: -1- id 2: define ...
m4trace:stdin:2: -1- id 2: define(`dodgeball', `HURTS') -> ???
m4trace:stdin:2: -1- id 2: define(...)

play()ball
m4trace:stdin:3: -1- id 3: play ...                             | First token is play
m4trace:stdin:3: -1- id 3: play(`') -> ???                      | It's identified as a macro and implicitly passed the empty string
m4trace:stdin:3: -1- id 3: play(...) -> `dodge'                 | The macro evaluates to "dodge" and is expanded as such
m4trace:stdin:3: -1- id 4: dodgeball ...                        | After expansion the input is now "dodgeball"
m4trace:stdin:3: -1- id 4: dodgeball -> ???                     | Recursively re-evaluate the input for further expansions
m4trace:stdin:3: -1- id 4: dodgeball -> `HURTS'                 | A further expansion is possible... the input is now just one token
HURTS                                                           | For which a mapping exists, so the final output is this!

Recursive Macro Parameter Expansion

$ m4 --debug=V -
define(james, john)
m4trace:stdin:1: -1- id 1: define ...                      | The token 'define' is seen.
m4trace:stdin:1: -1- id 1: define(`james', `john') -> ???  | It is a known macro so the following tokens are parameters. What does it expand to?
m4trace:stdin:1: -1- id 1: define(...)                     | No further expansion possible, so it expands to what it is; pass to interpreter.
                                                           | Mapping james -> john added to macro table
define(john, mark)
m4trace:stdin:2: -1- id 2: define ...                      | The token 'define' is seen.
m4trace:stdin:2: -1- id 2: define(`john', `mark') -> ???   | It is a known macro so the following tokens are parameters. What does it expand to?
m4trace:stdin:2: -1- id 2: define(...)                     | No further expansion possible, so it expands to what it is; pass to interpreter.
                                                           | Mapping john -> mark added to macro table
define(james, eric)
m4trace:stdin:3: -1- id 3: define ...                      | The token 'define' is seen.
m4trace:stdin:3: -2- id 4: james ...                       | Now evaluating the first parameter.
m4trace:stdin:3: -2- id 4: james -> ???                    | Does 1st param expand to anything - consult macro table.
m4trace:stdin:3: -2- id 4: james -> `john'                 | Mapping found (from the first define) so do expansion.
m4trace:stdin:3: -2- id 5: john ...                        | Recursively evaluate token(s) resulting from previous expansion.
m4trace:stdin:3: -2- id 5: john -> ???                     | Consult macro table.
m4trace:stdin:3: -2- id 5: john -> `mark'                  | Mapping found (from the second define) so do expansion.
m4trace:stdin:3: -1- id 3: define(`mark', `eric') -> ???   | All expansions done so re-evaluate input, now the define macro.
m4trace:stdin:3: -1- id 3: define(...)                     | No further expansion possible, so it expands to what it is; pass to interpreter.
                                                           | Mapping mark -> eric added to macro table

The GNU manual explains it like so:

When the arguments, if any, to a macro call have been collected, the macro is expanded, and the expansion text is pushed back onto the input (unquoted), and reread. The expansion text from one macro call might therefore result in more macros being called, if the calls are included, completely or partially, in the first macro calls’ expansion.


The opening quote character is the backtick (`) and the closing quote character is the apostrophe (').

The main use of quotes is to defer the expansion of tokens/macros, but they are also used to do things like write "james,john" as one single token, which would otherwise be two tokens in a define, for example.

The GNU M4 manual has this to say about quoting:

The value of a string token is the text, with one level of quotes stripped off. Thus

`'
⇒

is the empty string, and double-quoting turns into single-quoting.

``quoted''
⇒`quoted'


I like the Wikipedia explanation better:

Unquoted identifiers which match defined macros are replaced with their definitions. Placing identifiers in quotes suppresses expansion until possibly later, such as when a quoted string is expanded as part of macro replacement.

So, bearing this definition in mind, considering our examples so far, without quotes we might write something like this:

$ m4 -
define(jehtech, bob)dnl
... A shed load of code later in the file we forgot about the jehtech macro define!... 
define(bob, eric)dnl  | The intention here is that jehtech should still translate to bob, but this will stop that!
jehtech               | The user inputs the string jehtech   
eric                  | But gets eric and not bob as s/he might expect. We have seen why in the above sections

The user's intention was probably not to have the string "jehtech" suddenly expand to "eric". In this example we imagine there are several pages, perhaps, of code between the define for "jehtech" and the define for "bob".

This behaviour is surprising to the user and not particularly useful. So how do we stop this? The answer is to use quotes to defer the expansion of tokens. But remember what we saw in the previous section: the parameter "james" was expanded to "mark" before it was passed to the define macro. Let's try quoting things and see if we get the expected answer...

$ m4 --debug=V -
m4debug: input read from stdin
define(`jehtech', `bob')
m4trace:stdin:1: -1- id 1: define ...
m4trace:stdin:1: -1- id 1: define(`jehtech', `bob') -> ???| Remember "The value of a string token is the text, with one level of quotes stripped off."
m4trace:stdin:1: -1- id 1: define(...)                    | So, the token passed to define is bob without the quotes

define(`bob', `eric')
m4trace:stdin:2: -1- id 2: define ...
m4trace:stdin:2: -1- id 2: define(`bob', `eric') -> ???   | AH HA! Look - "bob" was not expanded here because of the quotes!
m4trace:stdin:2: -1- id 2: define(...)                    | So, we should have stopped the unexpected behaviour right?!
                                                          | Here the quotes are stripped from the parameters, but they have the effect
                                                          |    of stopping any more recursive expansion, so it does not happen and therefore
                                                          |    bob is not re-evaluated (not that there was an expansion available, but if there was...).

jehtech                                                   | The user inputs the string "jehtech"
m4trace:stdin:3: -1- id 3: jehtech ...  
m4trace:stdin:3: -1- id 3: jehtech -> ???  
m4trace:stdin:3: -1- id 3: jehtech -> `bob'               | The input token matches a rule and is expanded to "bob"
m4trace:stdin:3: -1- id 4: bob ...                        | The expansion is recursively re-evaluated.... Because the quotes were stripped
                                                          |    during the define parameter's evaluation, this can now be re-evaluated.
m4trace:stdin:3: -1- id 4: bob -> ???                     | The re-evaluated input matches a rule and is expanded...
m4trace:stdin:3: -1- id 4: bob -> `eric'                  |    ... to "eric", which cannot be expanded further so is output
eric                                                      | Oh... we still got an unexpected result, as explained above!

So, the quoting didn't change the end result! But it did stop the surprise expansion of the define parameters. This is why the parameters to a define are generally quoted.

So, really this is now behaving how we want it. The expansion should be done recursively like this. What shouldn't happen is the expansion of the define parameters before the define is evaluated, so we've done our job.

But, for interest, let's just say we didn't want the recursive expansion to happen. How do we stop that? The answer is still quotes.

$ m4 --debug=V -
define(`jehtech', ``bob'')
m4trace:stdin:1: -1- id 1: define ...
m4trace:stdin:1: -1- id 1: define(`jehtech', ``bob'') -> ???  | NOTE how bob is double quoted
m4trace:stdin:1: -1- id 1: define(...)                        |   So, the outer quotes will be stripped off so jehtech maps
                                                              |   to `bob', with single quotes included

define(`bob', `eric')
m4trace:stdin:2: -1- id 2: define ...
m4trace:stdin:2: -1- id 2: define(`bob', `eric') -> ???       | In the same way as before the quotes here defer expansion
m4trace:stdin:2: -1- id 2: define(...)

jehtech                                                       | Now when the user inputs jehtech
m4trace:stdin:3: -1- id 3: jehtech ...
m4trace:stdin:3: -1- id 3: jehtech -> ???                     | It is treated as a token and mapped to `bob', quotes included
m4trace:stdin:3: -1- id 3: jehtech -> ``bob''
bob                                                           | As `bob' is evaluated the outer quotes are stripped off. But
                                                              |    further expansion is deferred because of the quotes just stripped
                                                              |    off. So bob does not get re-evaluated and expanded further to eric!


A macro can access its parameters using the variables $1 through $N. This resembles standard bash syntax. The number of arguments is accessed by $# and all the arguments using $@.

$ m4 --debug=V -
dnl --
dnl -- Access specific arguments
dnl --
define(`mymacro', `$1 -- $2')
m4trace:stdin:5: -1- id 3: define ...
m4trace:stdin:5: -1- id 3: define(`mymacro', `$1 -- $2') -> ???
m4trace:stdin:5: -1- id 3: define(...)

mymacro(jjj, ggg)
m4trace:stdin:6: -1- id 4: mymacro ...
m4trace:stdin:6: -1- id 4: mymacro(`jjj', `ggg') -> ???
m4trace:stdin:6: -1- id 4: mymacro(...) -> `jjj -- ggg'
jjj -- ggg

dnl --
dnl -- Show the number of arguments
dnl --
define(`james', `$#')dnl
james()    | This is interesting!! The num args is 1 because...
1          | ...a nothing argument becomes the empty string!

dnl --
dnl -- Access all the arguments
dnl --
define(`john', `$@')

john(`a', `b', `c')

Why Is There A "dnl" At The End Of So Many "define"s?

In M4 the input is parsed and any macros are expanded (recursively) before being passed to the output, and define(sym, val) is itself a macro that expands to the empty string whilst having the important side effect of recording the mapping of sym to val.

This means that where define(`some_name', `some_val') appears in the code an empty string results. As a newline normally follows the statement, an empty string and a newline, i.e., a blank line, appears in the output, which is often not wanted: we want the define to just disappear leaving absolutely no trace.

The macro dnl discards everything after it up to AND including the newline. So it can be used like this:

$ m4 -
define(`jehtech', `www.jeh-tech.com')
dnl This is the next line. As the define evaluates to the empty string a blank line appears in the output.
dnl We can get around this as follows...

define(`jehtech', `www.jeh-tech.com')dnl
dnl Now this is the next line - no blank line as dnl has eaten everything after it AND the newline


Macros can be conditionally expanded using the predefined macros ifdef, which tests if a macro is defined, and ifelse, which provides if-then-else-like functionality. The predefined shift macro helps enable recursion.


$ m4 -
ifdef(`some_macro', `yes', `no')
define(`some_macro', `some text')dnl
ifdef(`some_macro', `yes', `no')


$ m4 -
ifelse(`some string', `some string', `matches')
ifelse(`some string', `some other string', `matches')

ifelse(`some string', `some other string', `matches', `different')

Above are the most common variations. Note that the 4th, does-not-match, parameter is optional. When it is omitted and no match occurs the ifelse macro expands to the empty string.

Note again, ifelse is a macro. It is not a function! It just expands to one of its parameters conditionally.

The ifelse macro, in the form ifelse(a, b, equal_ab), can be extended to ifelse(a, b, equal_ab, c, d, equal_cd, e, f, equal_ef, ....). In this case it acts like an if (a == b) equal_ab else if (c == d) equal_cd else if (e == f) equal_ef ... chain.


Because macros are continually expanded we can do boneheaded things like this:

$ m4 --debug=V -
define(`test', `test')
m4trace:stdin:1: -1- id 1: define ...
m4trace:stdin:1: -1- id 1: define(`test', `test') -> ???
m4trace:stdin:1: -1- id 1: define(...)

test
m4trace:stdin:2: -1- id 2: test ...
m4trace:stdin:2: -1- id 2: test -> ???
m4trace:stdin:2: -1- id 2: test -> `test'
m4trace:stdin:2: -1- id 3: test ...
m4trace:stdin:2: -1- id 3: test -> ???
m4trace:stdin:2: -1- id 3: test -> `test'
m4trace:stdin:2: -1- id 4: test ...
m4trace:stdin:2: -1- id 4: test -> ???
m4trace:stdin:2: -1- id 4: test -> `test'
m4trace:stdin:2: -1- id 5: test ...
m4trace:stdin:2: -1- id 5: test -> ???
m4trace:stdin:2: -1- id 5: test -> `test'
m4trace:stdin:2: -1- id 6: test ...
m4trace:stdin:2: -1- id 6: test -> ???
m4trace:stdin:2: -1- id 6: test -> `test'
... continues forever repeating the above

That's not useful. But... what if we could get the continual recursive evaluation to end? Now that we have conditional evaluation and expansion, we can do this!

Let's try and do something that doesn't break the interpreter:

$ m4 --debug=V -
m4debug: input read from stdin
define(`test', `ifelse(`$1', `aaaaa', `done', `test(`$1a')')')#
m4trace:stdin:1: -1- id 1: define ...
m4trace:stdin:1: -1- id 1: define(`test', `ifelse(`$1', `aaaaa', `done', `test(`$1a')')') -> ???
m4trace:stdin:1: -1- id 1: define(...)

test(`a')
m4trace:stdin:2: -1- id 2: test ...
m4trace:stdin:2: -1- id 2: test(`a') -> ???                                            | Evaluate test(`a') and expand
m4trace:stdin:2: -1- id 2: test(...) -> `ifelse(`a', `aaaaa', `done', `test(`aa')')'   | Expands to this, re-evaluate
m4trace:stdin:2: -1- id 3: ifelse ...
m4trace:stdin:2: -1- id 3: ifelse(`a', `aaaaa', `done', `test(`aa')') -> ???           | Evaluate conditional and expand
m4trace:stdin:2: -1- id 3: ifelse(...) -> `test(`aa')'                                 | Expands to test(`aa')
m4trace:stdin:2: -1- id 4: test ...
m4trace:stdin:2: -1- id 4: test(`aa') -> ???                                           | This recursive re-evaluation keeps going until
m4trace:stdin:2: -1- id 4: test(...) -> `ifelse(`aa', `aaaaa', `done', `test(`aaa')')' |    the string matches `aaaaa'
m4trace:stdin:2: -1- id 5: ifelse ...
m4trace:stdin:2: -1- id 5: ifelse(`aa', `aaaaa', `done', `test(`aaa')') -> ???
m4trace:stdin:2: -1- id 5: ifelse(...) -> `test(`aaa')'
m4trace:stdin:2: -1- id 6: test ...
m4trace:stdin:2: -1- id 6: test(`aaa') -> ???
m4trace:stdin:2: -1- id 6: test(...) -> `ifelse(`aaa', `aaaaa', `done', `test(`aaaa')')'
m4trace:stdin:2: -1- id 7: ifelse ...
m4trace:stdin:2: -1- id 7: ifelse(`aaa', `aaaaa', `done', `test(`aaaa')') -> ???
m4trace:stdin:2: -1- id 7: ifelse(...) -> `test(`aaaa')'
m4trace:stdin:2: -1- id 8: test ...
m4trace:stdin:2: -1- id 8: test(`aaaa') -> ???
m4trace:stdin:2: -1- id 8: test(...) -> `ifelse(`aaaa', `aaaaa', `done', `test(`aaaaa')')'
m4trace:stdin:2: -1- id 9: ifelse ...
m4trace:stdin:2: -1- id 9: ifelse(`aaaa', `aaaaa', `done', `test(`aaaaa')') -> ???
m4trace:stdin:2: -1- id 9: ifelse(...) -> `test(`aaaaa')'
m4trace:stdin:2: -1- id 10: test ...
m4trace:stdin:2: -1- id 10: test(`aaaaa') -> ???
m4trace:stdin:2: -1- id 10: test(...) -> `ifelse(`aaaaa', `aaaaa', `done', `test(`aaaaaa')')'
m4trace:stdin:2: -1- id 11: ifelse ...
m4trace:stdin:2: -1- id 11: ifelse(`aaaaa', `aaaaa', `done', `test(`aaaaaa')') -> ???  | The two strings finally match and expand into our end condition
m4trace:stdin:2: -1- id 11: ifelse(...) -> `done'                                      |    which when re-evaluated does not expand to anything so
done                                                                                   |    this process stops. Recursion done!


The shift() macro expands to all its arguments except the first:




This macro is pretty central to recursion. Let's have an attempt at making a reverse macro. (Note that the GNU M4 manual already gives a reverse macro as an example, so this is just for a play.)

This was my first attempt:

define(`jump', ifelse($#, 1, end, `jump(shift($@)),$1'))

The first attempt just doesn't work, full stop. The reason is that because the ifelse is not quoted, it is immediately expanded due to parameter expansion. A good example of why quoting to defer parameter expansion is needed.

So... second attempt... quote the ifelse, so that it is not expanded as a parameter to the define macro and instead is expanded after the define expands. (I've added indents to show the level of recursion.)

define(`jump', `ifelse($#, 1, end, jump(shift($@)),$1'))
jump(1,2,3)
m4trace:stdin:3: -1- id 2: jump ...
m4trace:stdin:3: -1- id 2: jump(`1', `2', `3') -> ???
m4trace:stdin:3: -1- id 2: jump(...) -> `ifelse(3, 1, end, jump(shift(`1',`2',`3')),1'
                           Jump evaluates to this. Quotes stripped and returned to input and re-evaluated.
m4trace:stdin:3: -1- id 3: ifelse ...
                           The new input is unquoted so ifelse is immediately evaluated
   m4trace:stdin:3: -2- id 4: jump ...
                              Evaluate ifelse parameters. First expandable param is jump
      m4trace:stdin:3: -3- id 5: shift ...
                                 To evaluate jump its parameters must be evaluated
      m4trace:stdin:3: -3- id 5: shift(`1', `2', `3') -> ???
      m4trace:stdin:3: -3- id 5: shift(...) -> ``2',`3''
                                 The shift macro expands to all its parameters bar the first
   m4trace:stdin:3: -2- id 4: jump(`2', `3') -> ???
                              All the parameters to jump have been evaluated and expanded, so now jump can
                              be expanded, because this was the parameter to the ifelse.
   m4trace:stdin:3: -2- id 4: jump(...) -> `ifelse(2, 1, end, jump(shift(`2',`3')),2'
                              jump expands again - this is recursion
   m4trace:stdin:3: -2- id 6: ifelse ...
      m4trace:stdin:3: -3- id 7: jump ...
                                 Again, jump's parameters are evaluated and expanded
         m4trace:stdin:3: -4- id 8: shift ...
         m4trace:stdin:3: -4- id 8: shift(`2', `3') -> ???
         m4trace:stdin:3: -4- id 8: shift(...) -> ``3''
                                    The same process occurs with shift
      m4trace:stdin:3: -3- id 7: jump(`3') -> ???
                                 Shift was fully expanded to just '3'. That's all of jump's parameters expanded so...
      m4trace:stdin:3: -3- id 7: jump(...) -> `ifelse(1, 1, end, jump(shift(`3')),3'
                                 ...Now jump's parameters are fully expanded, jump itself can be expanded
      m4trace:stdin:3: -3- id 9: ifelse ...
                                 Again, jump recursively expands into the ifelse macro.
         m4trace:stdin:3: -4- id 10: jump ...
                                     The no-match parameter of the ifelse is evaluated and expanded.
            m4trace:stdin:3: -5- id 11: shift ...
                                        It expands into the shift macro
            m4trace:stdin:3: -5- id 11: shift(`3') -> ???
            m4trace:stdin:3: -5- id 11: shift(...)
                                        The shift macro drops the 1st parameter. There are no others so it expands
                                        into the empty string.
         m4trace:stdin:3: -4- id 10: jump(`') -> ???
                                     All jump's parameters expanded so now expand the jump macro itself
         m4trace:stdin:3: -4- id 10: jump(...) -> `ifelse(1, 1, end, jump(shift(`')),'
                                     Jump now expands to the above
                                     !!! This is where one might expect the recursion to end !!! 
                                     !!! because now 1 == 1 so the ifelse should expand to 'end'. And !!! 
                                     !!! indeed it would, but before that can happen all its parameters !!! 
                                     !!! must be expanded and therefore the process will now recurse !!! 
                                     !!! infinitely !!!
         m4trace:stdin:3: -4- id 12: ifelse ...
            m4trace:stdin:3: -5- id 13: jump ...
               m4trace:stdin:3: -6- id 14: shift ...
               m4trace:stdin:3: -6- id 14: shift(`') -> ???
               m4trace:stdin:3: -6- id 14: shift(...)
... recurses forever 

Oh dear... this produces infinite recursion. The explanation is highlighted above. The problem was that the parameters get continually expanded before they are passed to the macro that would do the parameter comparison and end the recursion. Thus, the recursion never ends.

So, how can we stop the parameters being continually expanded before being passed to the macro itself? The answer is QUOTING! What we want is for jump to be expanded, but then the arguments to NOT be expanded so that the ifelse macro that jump is expanded to has a chance to evaluate and compare the first two parameters.

Let's try with some more quoting...

define(`jump', `ifelse($#, 1, end, `jump(shift($@)),$1')')

So, now if we input jump(1,2,3), the macro is expanded to ifelse($#, 1, end, `jump(shift($@)),$1'). BUT NOW the 4th parameter does NOT get expanded because the quotes defer expansion.

So now, the ifelse macro can be expanded BEFORE the last parameter, which will allow it to compare the number of arguments to 1, giving it a chance to end the recursion. The output is end,2,1.

So, the last tweak is to replace "end" with the last parameter. And we'll also use some more quotes to stop other things getting unexpectedly expanded. So the final go is this:

define(`jump', `ifelse(`$#', `1', `$1', `jump(shift($@)),$1')')


Sweet :)

Diverting Output

You can get M4 to divert output to the standard output, to numbered temporary diversions (buffers that are held back and emitted later), or to a null-like device. This can be used to reorder the output of one M4 script, or even just to stop macros creating annoying whitespace in the output.

Use divert(0) or just divert() to divert output to the standard output.

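Use divert(1) through divert(9) to divert output to one of 9 numbered diversions. The diverted text is held back and then appended to the standard output, in numerical order, at the end of the input (or earlier if undivert is called). GNU M4 does not limit you to 9 diversions.

Use divert(-1) to divert output to a null-like device (actually M4 just discards the output, it's not really a device).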

Including Files

Use include(filename). The include macro expands to the file contents which is then recursively re-evaluated - like any other macro expansion.

So, for example, if I was using an M4 macro to include a text file with example HTML/C/C++/whatever code in it into my HTML page, I could use the following to escape HTML special characters (the & has to be escaped first, otherwise the & in &lt; and &gt; would itself get escaped):

patsubst(patsubst(patsubst(include(`path/to/html-snippet'), `&', `&amp;'), `<', `&lt;'), `>', `&gt;')

This could also be bundled up into a helpful little macro:

define(`HTML_SNIPPET', `<pre>patsubst(patsubst(patsubst(include($1), `&', `&amp;'), `<', `&lt;'), `>', `&gt;')</pre>')dnl

BALLS - it doesn't quite work if you have M4 quote characters in the file being included :( We can improve this slightly by disabling quoting whilst parsing the file contents [Ref]:

define(`HTML_SNIPPET', `changequote(`',`')<pre>patsubst(patsubst(patsubst(include($1), `&', `&amp;'), `<', `&lt;'), `>', `&gt;')</pre>'changequote())dnl

BALLS again - even this isn't quite okay, because any content in the imported file that accidentally matches a macro, user or pre-defined, will still wrongly be expanded, because the output of macro evaluations is always rescanned. Doh! It appears that GNU m4 offers several mechanisms or techniques for inhibiting the recognition of names as macro calls [Ref], which may be handy (see --prefix-builtins)...

...but then we need to make sure, first, that all the M4 builtins are prefixed with "m4_" and, second, that the included snippet does not contain any user defined macros. Mind you, using --prefix-builtins is probably a good idea when using M4 with HTML anyway, just in case you do use an M4 builtin by accident.

I have not been able to see a good way around this. What will prove easier is to use syscmd() [Ref]. This macro will pass the command output straight to the M4 output and evaluate to nothing. So, we can execute something that will correctly escape the HTML snippet for us. The downside is we now need to rely on an external program. If we have to run the M4 script on different platforms, this could get fun.

And SIGH, on Windows, at least, my M4 doesn't inherit the environment of the shell it's executed from, so woopy-doo it doesn't even recognise the Python executable without it being prefixed by its absolute path... grrr.

If you wanted to include a file into the output, without letting M4 parse the file, i.e., send it straight to output without any more M4 involvement, you can use the GNU M4 extension undivert(`path/to/file'). The macro dumps the file to output but evaluates to nothing, so M4 won't be able to operate on the file contents.