
# (m4.info.gz) Changeword

7.4 Changing the lexical structure of words
===========================================

The macro `changeword' and all associated functionality are
experimental.  It is only available if the `--enable-changeword'
option was given to `configure', at GNU `m4' installation time.
The functionality will go away in the future, to be replaced by
other new features that are more efficient at providing the same
capabilities.  _Do not rely on it_.  Please direct your comments
about it the same way you would do for bugs.

A file being processed by `m4' is split into quoted strings, words
(potential macro names) and simple tokens (any other single character).
Initially a word is defined by the following regular expression:

     [_a-zA-Z][_a-zA-Z0-9]*

Using `changeword', you can change this regular expression:

 -- Optional builtin: changeword (REGEX)
     Changes the regular expression for recognizing macro names to be
     REGEX.  If REGEX is empty, use `[_a-zA-Z][_a-zA-Z0-9]*'.  REGEX
     must obey the constraint that every prefix of the desired final
     pattern is also accepted by the regular expression.  If REGEX
     contains grouping parentheses, the macro invoked is the portion
     that matched the first group, rather than the entire matching
     string.

     The expansion of `changeword' is void.  The macro `changeword' is
     recognized only with parameters.

Relaxing the lexical rules of `m4' might be useful (for example) if
you wanted to apply translations to a file of numbers:

     ifdef(`changeword', `', `errprint(` skipping: no changeword support
     ')m4exit(`77')')dnl
     changeword(`[_a-zA-Z0-9]+')
     =>
     define(`1', `0')1
     =>0
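
The effect of relaxing the word pattern can be modelled outside `m4'.
The following Python sketch is not part of GNU `m4'.  It assumes that
Python's `re' module is an adequate stand-in for the GNU regex matcher
on these two simple patterns, and the names `DEFAULT_WORD',
`RELAXED_WORD' and `is_word' are hypothetical.  It only illustrates why
`1' cannot name a macro by default, but can after the `changeword' call
above.

```python
import re

# Default m4 word pattern, as given in the text above.
DEFAULT_WORD = r"[_a-zA-Z][_a-zA-Z0-9]*"
# Relaxed pattern from the changeword example: a digit may start a word.
RELAXED_WORD = r"[_a-zA-Z0-9]+"


def is_word(pattern, text):
    """Return True if the whole of text is a word under pattern."""
    return re.fullmatch(pattern, text) is not None


print(is_word(DEFAULT_WORD, "1"))       # False
print(is_word(RELAXED_WORD, "1"))       # True
print(is_word(DEFAULT_WORD, "define"))  # True
print(is_word(RELAXED_WORD, "define"))  # True
```

Ordinary names such as `define' remain words under the relaxed pattern,
which is why the example can still call `define' after `changeword'.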

Tightening the lexical rules is less useful, because it will
generally make some of the builtins unavailable.  You could use it to
prevent accidental calls to builtins, for example:

     ifdef(`changeword', `', `errprint(` skipping: no changeword support
     ')m4exit(`77')')dnl
     define(`_indir', defn(`indir'))
     =>
     changeword(`_[_a-zA-Z0-9]*')
     =>
     esyscmd(`foo')
     =>esyscmd(foo)
     _indir(`esyscmd', `echo hi')
     =>hi
     =>

Because `m4' constructs its words a character at a time, there is a
restriction on the regular expressions that may be passed to
`changeword'.  Specifically, if your regular expression accepts `foo',
it must also accept `f' and `fo'.

     ifdef(`changeword', `', `errprint(` skipping: no changeword support
     ')m4exit(`77')')dnl
     define(`foo
     ', `bar
     ')
     =>
     dnl This example wants to recognize changeword, dnl, and `foo\n'.
     dnl First, we check that our regexp will match.
     regexp(`changeword', `[cd][a-z]*\|foo[
     ]')
     =>0
     regexp(`foo
     ', `[cd][a-z]*\|foo[
     ]')
     =>0
     regexp(`f', `[cd][a-z]*\|foo[
     ]')
     =>-1
     foo
     =>foo
     changeword(`[cd][a-z]*\|foo[
     ]')
     =>
     dnl Even though `foo\n' matches, we forgot to allow `f'.
     foo
     =>foo
     changeword(`[cd][a-z]*\|fo*[
     ]?')
     =>
     dnl Now we can call `foo\n'.
     foo
     =>bar
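
The prefix restriction can be checked before a pattern is handed to
`changeword'.  The Python sketch below is a simplified model, not the
scanner used by GNU `m4': it assumes that a word keeps growing only
while the text read so far is matched in full, and it rewrites the two
patterns from the example in Python syntax (`|' instead of `\|', and
`\n' instead of a bracketed literal newline).  The names `scan_word' and
`rejected_prefixes' are hypothetical.

```python
import re

# Python spellings of the two patterns tried in the example above.
FIRST = r"[cd][a-z]*|foo\n"     # accepts "foo\n" but not "f" or "fo"
SECOND = r"[cd][a-z]*|fo*\n?"   # accepts every prefix of "foo\n"


def scan_word(pattern, text):
    """Grow a word one character at a time, stopping as soon as the
    text read so far is no longer matched in full (simplified model)."""
    regex = re.compile(pattern)
    end = 0
    while end < len(text) and regex.fullmatch(text, 0, end + 1):
        end += 1
    return text[:end]


def rejected_prefixes(pattern, word):
    """List the non-empty prefixes of word that pattern does not accept."""
    regex = re.compile(pattern)
    prefixes = (word[:i] for i in range(1, len(word) + 1))
    return [p for p in prefixes if not regex.fullmatch(p)]


print(rejected_prefixes(FIRST, "foo\n"))   # ['f', 'fo', 'foo']
print(rejected_prefixes(SECOND, "foo\n"))  # []
print(repr(scan_word(FIRST, "foo\n")))     # ''
print(repr(scan_word(SECOND, "foo\n")))    # 'foo\n'
```

An empty list from `rejected_prefixes' means that the intended word can
be assembled character by character, which is the condition the second
`changeword' call in the example satisfies and the first does not.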

`changeword' has another function.  If the regular expression
supplied contains any grouped subexpressions, then text outside the
first of these is discarded before symbol lookup.  So:

     ifdef(`changeword', `', `errprint(` skipping: no changeword support
     ')m4exit(`77')')dnl
     ifdef(`__unix__', ,
           `errprint(` skipping: syscmd does not have unix semantics
     ')m4exit(`77')')dnl
     changecom(`/*', `*/')dnl
     define(`foo', `bar')dnl
     changeword(`#\([_a-zA-Z0-9]*\)')
     =>
     #esyscmd(`echo foo \#foo')
     =>foo bar
     =>

`m4' now requires a `#' mark at the beginning of every macro
invocation, so one can use `m4' to preprocess plain text without losing
various words like `divert'.
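
The way a grouped pattern selects the name used for lookup can be
sketched as follows.  This Python fragment is an illustration under
assumptions, not code from GNU `m4': the pattern is the Python spelling
of `#\([_a-zA-Z0-9]*\)', and `lookup_name' is a hypothetical helper that
returns the text a symbol lookup would use for an already scanned word.

```python
import re

# Python spelling of the pattern `#\([_a-zA-Z0-9]*\)' used above.
HASH_WORD = r"#([_a-zA-Z0-9]*)"
# Default word pattern, which has no group.
DEFAULT_WORD = r"[_a-zA-Z][_a-zA-Z0-9]*"


def lookup_name(pattern, word):
    """Return the name to look up for word, or None if word does not
    match.  With a group, only the first group's text is used."""
    match = re.fullmatch(pattern, word)
    if match is None:
        return None
    if match.re.groups:
        return match.group(1)
    return match.group(0)


print(lookup_name(HASH_WORD, "#esyscmd"))   # esyscmd
print(lookup_name(HASH_WORD, "#foo"))       # foo
print(lookup_name(HASH_WORD, "divert"))     # None
print(lookup_name(DEFAULT_WORD, "divert"))  # divert
```

Under the grouped pattern, a bare `divert' is not a word at all, so it
passes through as plain text, while `#foo' is looked up as `foo'.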

In `m4', macro substitution is based on text, while in TeX, it is
based on tokens.  `changeword' can throw this difference into relief.
For example, here is the same idea represented in TeX and `m4'.  First,
the TeX version:

     \def\a{\message{Hello}}
     \catcode`\@=0
     \catcode`\\=12
     @a
     @bye
     =>Hello

Then, the `m4' version:

     ifdef(`changeword', `', `errprint(` skipping: no changeword support
     ')m4exit(`77')')dnl
     define(`a', `errprint(`Hello')')dnl
     changeword(`@\([_a-zA-Z0-9]*\)')
     =>
     @a
     =>errprint(Hello)

In the TeX example, the first line defines a macro `a' to print the
message `Hello'.  The second line defines <@> to be usable instead of
<\> as an escape character.  The third line defines <\> to be a normal
printing character, not an escape.  The fourth line invokes the macro
`a'.  So, when TeX is run on this file, it displays the message `Hello'.

When the `m4' example is passed through `m4', it outputs
`errprint(Hello)'.  The reason for this is that TeX does lexical
analysis of a macro definition when the macro is _defined_.  `m4' just
stores the text, postponing the lexical analysis until the macro is
_used_.  By then, the word pattern requires a leading `@', so the
`errprint' in the stored text is no longer recognized as a macro name.

You should note that using `changeword' will slow `m4' down by a
factor of about seven, once it is changed to something other than the
default regular expression.  You can invoke `changeword' with the empty
string to restore the default word definition, and regain the parsing
speed.
