mk
is pretty useful. With the
addition of some printf-like expansions mk
is a
much more
powerful processor. By snipping bits of text from the target filename,
target file, a lookup table, the environment and the command line
options passed to mk
itself
we construct a more specific and powerful shell command.
The compact notation presented here is used to
express the same string operations that are commonly done with
sed
, expr
, basename
, or dirname
in a shell script. The intent is to remove clutter from the command
in the source file, and to make the
customized command more portable and precise.
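For comparison, here is the sort of string surgery a plain shell script does by hand, which is the clutter the compact notation is meant to remove. This is only a sketch: the path in `target` is a made-up example, and no claim is made here about which expander replaces which line.

```shell
#!/bin/sh
# Hypothetical target path, used only for illustration.
target=/usr/src/local/foo.c

dir=$(dirname "$target")                          # directory part
file=$(basename "$target")                        # last path component
stem=$(basename "$target" .c)                     # component without ".c"
ext=$(printf '%s\n' "$file" | sed -e 's/.*\.//')  # text after the last dot

printf '%s %s %s %s\n' "$dir" "$file" "$stem" "$ext"
```

Four processes are forked here just to name the pieces of one path, before the real command is even started.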
The other purpose of the expander is to enable some "meta" operations,
like changing the marker or the submarker mk
is processing.
In some cases a file might not meet a criterion for the marker
presented; mk
offers expanders to prevent execution of any
command in that case.
mk
are often used to
process the file as a stand-alone module.
The commands are selected and extracted from the target file,
expanded for the present context,
then passed to a shell for execution.
The usefulness of this tool, like make
, is driven by the
use to which it is put. There are few limits placed on the tool
by the implementation.
It is also common to keep the marked lines at the top of the file:
mk only reads about 99 lines of the file by default while
searching for marked lines. This prevents mk from reading a
whole binary file (only to find gibberish at the end).
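The effect of that line limit can be approximated in a few lines of shell. This is a crude sketch, not how mk is implemented: the function name `find_marked` is invented here, and the pattern assumes a marked line looks like "$Marker:" or "$Marker(", as in the examples in this document.

```shell
#!/bin/sh
# find_marked MARKER [LINES] <file
# Print candidate marked lines found in the first LINES (default 99)
# lines of standard input.  "[$]" matches a literal dollar sign.
find_marked() {
    head -n "${2:-99}" | grep -e "[$]$1[:(]"
}

# Demo: one marked line near the top, one far below the 99-line window.
{
    echo '# $Compile: echo near the top'
    i=0
    while [ "$i" -lt 120 ]; do
        echo "filler line $i"
        i=$((i + 1))
    done
    echo '# $Compile: echo too deep to be seen'
} | find_marked Compile
```

Only the first marked line is printed; the second sits past the window and is never read.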
It is also a common convention to use a shell variable
to reference the "main" program we are about to run. For example
${cc-cc}
to mean run "$cc if set, else find cc in $PATH"
(see sh(1)). This allows the User's environment to replace "cc" with
"gcc", "acc", or "bsdcc" as needed.
mk
is reading from the input file is
called marked if it contains the marker specified, in
the correct context.
A marker can be thought of as a verb, a message, a recipe, or just a name for
a command hidden in the target file. A submarker can be thought of
as a direct object, detail, ingredient, or a specific destination for the
marker to act on (or against) the target file.
The difference is in how you
view mk
's relation to the target, or the target's relation
to the context in which it exists.
The marker
mk
searches for is specified with the
-m command line option, or defaults to
the name mk
was called (in argv[0]), or the word "Compile" if
the program is called "mk".
The submarker
mk
searches for is specified with
the -d command line option. The default is none.
When mk
is processing a file it is searching either a template
or the file itself for a line formatted like either:
In each case the leading "..." is usually a comment delimiter, or
white space (which is ignored). The trailing "$$ ..." is a trick to
delete closing comment delimiters for any enclosing processors that
require them.
The template
is expanded to form
the command
.
Examples:
Other examples:
mk
, or the sequence of data sources
selected and searched.
%.
%;
%^
%J
) to build a temporary file for use in other
templates marked commands (as %j
is not reset per template).
$Test(*): %J: build %j%^
%m Bar no
%m Test yes
. . default
$Test(*): echo %<%0> $$%j
In that code
%0
is the first here document in the file,
so "mk -mTest" on that file outputs "yes".
To make this even more useful use %|
to remove leading
comment symbols.
%g
mk
to search the Makefile
for a command
marked with "Compile", rather than this file:
// $Compile: Makefile%g
mapfile
%g
replaces
the current mapfile
with
the result string, and starts the search over again.
%h
mk
to change the default "Compile" marker to
the "Display" marker for the nroff format file:
.\" $Compile: Display%h .\" $Display: groff -man -tbl %f
%H
# $*(*): debug%H
%=/
exp1
/
exp2
/
/
) may be any character that
doesn't appear in exp1
or
exp2
%!/
exp1
/
exp2
/
exp1
is the same string as the expansion of
exp2
.
%<
mapexp
>
mapexp
, use that as a mapfile
(see below)
mapexp
must expand to less than MAXPATHLEN characters
%.
or %;
in the expansion of mapfile is honored for this expansion
%g
replaces the current mapfile
, not the current template file
%0
, %1
, %2
through %9
%J
to %j
) this may be used
to access more than one document. In that case %1
is the first %J
document, %2
is the second and so on (even beyond %9
).
mk's options
Since mk
is going to run a shell command it stands to
reason that it might want to call itself, or a program that looks
a lot like itself.
In that case being able to pass our command line options on would
be clever.
Access to mk
's command line options, and state:
%a
mk
has -a in effect, else the empty string.
%A
mk
has -a in effect, else the empty string.
Using the mixer, one might also write this as %(a,$)
.
%+
%-
%b
Mk
's, that is as much of the path
as mk
was provided in argv[0].
%B
%[b/$]
mk
was called with.
The name mk
was called at execution time. Some subsystems
call mk
by several symbolic links: "Compile", "Run", "Clean",
"Test", thus using mk
as an "object oriented" message passing
agent. This escape allows passing the "message" on to another program.
%c
mk
has -c in effect, else the empty string.
%C
mk
has -c in effect, else the empty string.
%e
%E
%i
mk
has -i in effect, else the empty string.
%I
mk
has -i in effect, else the empty string.
%k
mk
's marker prefix.
%K
%k%k
mk
's end marker token.
%l
mk
's searches from
each target file.
%L
mk
's active notion of how many lines to search from
each target file.
%m
mk
's active marker.
%M
mk
's active marker in all lowercase.
%n
mk
's -n option is active, else the empty string
%N
mk
's -n option is active, else the empty string
%o
%A%C%I%N%V
mk
in a bundle
%O
%A%C%I%N%V
%o
without the leading dash.
%s
mk
's active submarker
%S
mk
's active submarker in all lowercase
%t
-t
templates
",
the active templates option as a command-line specification
%T
%v
mk
has verbose set, else "-s"
%V
mk
has verbose set, else "s"
%w
mk
searches for a template under -t or -e this
expansion records which directory we are presently checking.
%W
%w
, but if we are not
searching the template options reject the command.
Thus one could tell the difference between a template file
as a template, or as a target itself. I doubt anyone would use that
distinction.
%z
%Z
%[z/$]
%~
mk
searches, like mk
's home directory.
Functions of the target file:
%d
%D
%d
above, but reject the command rather than present the empty string.
%f
%F
%G
%p
%q.
).
%P
%Q.
).
%q
x
x
, reject the command if no x
in target.
%Q
x
x
, reject the command if no x
in target.
%r
%R
%u
x
x
, reject the command if no x
in target.
%U
x
x
, reject the command if no x
in target.
%X
%x
%u.
or %U.
).
%y
%Y
x
x
.
%Y~
x
%#!
%#/
%[#!/$]
).
%#
[nbytes
] [@
seek
] [%
] fmt
[size
Insert data from the target file.
Read nbytes of data (taken as a decimal integer)
at position seek (default 0).
Then use the printf formatted fmt to output
them in units of size (b for byte, w for word, or l for long).
Note that fmt cannot contain a '*', since there is not a way to pass
the width parameter to sprintf(3).
$Info: %=/1fff/%#%04xw/echo %f is in compact format
%{
ENV
}
%`
ENV
`
%"
ENV
"
ENV
}", reject the command when $ENV
is not set.
This is largely used to see if an X11 DISPLAY environment variable is available, to avoid forking an X client application that would only fail. Since the command is rejected when the variable is not set we can prefer the X version of an application (e.g. browser or spreadsheet), then fall back to the text-only version.
%[
expression
separator
field
...]
xapply
dicer rules starting with separator
and field, as in xapply
the separator and
field specification may be repeated as needed.
The expansion is broken into fields at each occurrence of the character separator, then field number field is selected. A negative field inverts the selection to mean "all but field". In the case of a literal blank the separator is any non-zero number of white space characters. A backslash may be used to remove special meaning from space, backslash, digits, and close bracket.
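The field selection described above can be sketched in plain shell. The function name `dice` is invented for this illustration. It handles one single-character separator and one field number, does not implement the blank-separator or backslash rules, and prints an empty line's worth of nothing when the field does not exist (what the real dicer does in that case is not stated here).

```shell
#!/bin/sh
# dice STRING SEP FIELD
# Split STRING at each SEP and print field FIELD (counting from 1).
# A negative FIELD prints every field except that one, rejoined with SEP.
dice() {
    rest=$1 sep=$2 field=$3
    want=${field#-}     # the field number without any leading minus
    n=0 keep= kept=0 more=1
    while [ "$more" -eq 1 ]; do
        n=$((n + 1))
        case $rest in
        *"$sep"*) piece=${rest%%"$sep"*}; rest=${rest#*"$sep"} ;;
        *)        piece=$rest; more=0 ;;
        esac
        if [ "$field" -gt 0 ]; then
            if [ "$n" -eq "$want" ]; then
                printf '%s\n' "$piece"
                return 0
            fi
        elif [ "$n" -ne "$want" ]; then
            if [ "$kept" -eq 0 ]; then keep=$piece; else keep=$keep$sep$piece; fi
            kept=$((kept + 1))
        fi
    done
    if [ "$field" -lt 0 ]; then printf '%s\n' "$keep"; fi
    return 0
}

dice src/cmd/mk/main.c / 3      # selects the third field
dice src/cmd/mk/main.c / -1     # everything but the first field
```

With the hypothetical path above, the first call prints "mk" and the second prints "cmd/mk/main.c".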
%(
expression
mixer
)
xapply
, process the
expression
then select characters from
it with the mixer.
The rules for the mixer are too complex to fully explain here
(see explode's dicer.html
for details).
In brief, ranges of characters indexed from 1 to the end of
the string are selected by index from
the left (integer
), index from
the right (~
integer
),
last character ($
),
the whole range (*
),
or augmented with a quoted string
(`
text
'
, or
"
text
"
).
Results may be filtered again with repeated application of these expressions
in parentheses.
%<
mapexp
>
.
For example a file's magic number to the name of the program
that builds that type, or unpacks that type of file.
See the default templates for many clever uses.
Blank lines and lines that start with a hash (#) are ignored. All the other lines in the file should be matching lines.
Matching lines have three columns separated by white space, all of which are expanded before being used:
If any of the expansions rejects the expansion the next matching line in the file is tested. If no line matches the expansion rejects the command. To put a literal hash or white space in the test string or the RE use the backslash escapes below.
As a bonus the strings matched by \(..\) pairs are available as
%1
, %2
, %3
... to %9
.
And %0
is the whole matched string.
As a corner-case the empty string may be specified as
\e
.
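The matching rule for a mapfile can be sketched in shell. This is an approximation under stated assumptions: the function name `maplookup` is invented, only a bare %m in the first column is "expanded" (to the marker given as an argument), the RE is handed to grep as an unanchored basic regular expression, and the backslash escapes and %1 .. %9 captures are not modelled.

```shell
#!/bin/sh
# maplookup MARKER <mapfile
# Print the result column of the first line whose test string matches
# its RE; fail when no line matches.
maplookup() {
    while read -r subject re result; do
        case $subject in
        ''|'#'*) continue ;;          # blank line or comment
        esac
        # Stand-in for real expansion: only a bare %m is understood.
        if [ "$subject" = '%m' ]; then subject=$1; fi
        if printf '%s\n' "$subject" | grep -q -e "$re"; then
            printf '%s\n' "$result"
            return 0
        fi
    done
    return 1    # no line matched: the caller would reject the command
}

maplookup Test <<'EOF'
# test-string  RE    result
%m             Bar   no
%m             Test  yes
.              .     default
EOF
```

Run as shown it prints "yes"; a marker that is neither "Bar" nor "Test" falls through to the last line and gets "default".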
mk
supports the standard C backslash (\) escapes.
This might also allow the comment character(s), from the
file's native processor, to be included in the embedded command.
For example if double-dash (--) is
the comment ending token, and the command needs a double-dash option
(e.g. --help) in it
one could use any of these expander forms:
\055-help -\055help -\e-help
To break up multi-character tokens I prefer
\e
, viz. "*\e/"
to avoid a C comment termination.
In some files the comment character is all but impossible to include
in a comment. For example a hash (#) character might have to be
expressed as a \043 to hide it from the native processor,
viz. make
.
spelling | expands to |
---|---|
\\ | a literal backslash |
\n | newline |
\t | tab |
\s | space |
\b | bell |
\r | carriage return |
\f | form-feed |
\v | vertical tab |
\e | the empty string |
\$ | a literal dollar, often used to defeat $$ |
\000 .. \777 | literal octal ASCII codes |
\else | any other character is taken as a literal |
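The octal forms can be checked from a shell prompt, because printf(1) reads \ddd in its format string the same way. This only illustrates the octal codes used in the text above; the \e and \s spellings belong to mk and are not something printf understands.

```shell
#!/bin/sh
# \055 is the octal code for '-' and \043 is the code for '#'.
dashes=$(printf '\055\055help')   # both dashes spelled in octal
hash=$(printf '\043')             # a hash the native processor never sees

printf '%s\n%s\n' "$dashes" "$hash"
```

The two lines printed are "--help" and "#".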
%*
-E
option, as there is no actual marker.
%j
%j
and no preceding
%J
is always rejected.
This prevents the null command from being selected at the end
of a here document.
%J
%J
in it builds a temporary file much as a
shell "here document".
Lines from the current file are copied to the temporary file
up to the next marked line (matching the same marker and submarker)
with a %j
anywhere in it is read.
By anywhere we mean before the marker, in the command, or after the
end token $$
.
These lines (not including the last) are presented in the filename
reported by %j
.
Consumed lines are not re-inspected for marked lines. The example below outputs lines numbered from 1 to 4.
$Here: %J tail -r %j
4
3
2
1
$Here(*): %.%j thing
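For readers who think in shell, the same job written with an ordinary here document looks roughly like this. It is an analogy only, not what mk runs: the temporary file stands in for %j, the function name `here_demo` is made up, and sed is used to reverse the lines because tail -r is not available on every system.

```shell
#!/bin/sh
here_demo() {
    tmp=$(mktemp) || return 1     # plays the part of %j
    cat >"$tmp" <<'EOF'
4
3
2
1
EOF
    # Reverse the lines: append the hold space to every line but the
    # first, save the result, and print only at the last line.
    sed -e '1!G;h;$!d' "$tmp"
    rm -f "$tmp"
}

here_demo
```

The mk version keeps the data and the command that consumes it together in the target file, with no explicit temporary-file handling in the command.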
The line which ended the here document is inspected
for a command if the controlling line is rejected,
even when it ended the previous document.
Mapfiles get their here document data from the marked file,
not the mapfile (%J
is not allowed in
a mapfile, but %j
is).
%|/
expr
/
# $Here: %|/#/%J tail -r %j
#4
#3
#2
#1
# $Here(*): %.%j thing
%?
%J
, %?
expands to
the text which follows the %j
which terminates the
here document. In the example above it would be "thing". Note that
leading white space is consumed. A good use of this is to fetch
data from the end of a block that was computed as the block was produced,
e.g. the standard deviation and mean of a list of numbers.
This is also a good use of $$
, we can end the current
here document, and use the $$
feature to prevent the
interpretation of the %j
if that command is ever expanded.
... $*(*): ${false-false}%; $$ %j end of the world
%$
%$
. When there was no $$ token this
expansion rejects the command. In the example above the
result would be " %j
end of the world".
A strange side-effect of %J
is
that this expander sees the text on the end of
the marked line that ends the here document,
not the text on the end of the current template. This is thought to
be a feature rather than a bug.
mk
looks for all the matching
marked commands. This is culturally used to follow a step-by-step recipe,
much as a make(1) file, but the steps might also be stand-alone
targets themselves.
There are two expanders that are used strictly for their side-effects
to set and end "step mode": %+
to turn it on,
and %-
to turn it off.
Here is an example from the regression tests:
$Compile: %+Step(*)%h
# This file checks to be sure mk honors %+ to do multistep tasks
$Step(1): true
$Step(2)=~0: false
$Step(3): exit 0
$Step(4): true%-
$Step(fail): false
When we are asked to "Compile" the file we shift the marker to "Step" and the submarker to "*", so we'll match all the "Step(n)" lines. We also set -a (via
%+
) so we do not stop at the first
one. On step four (Step(4)
) we execute "true" and
end the step mode so we don't fail on
the next marked step (Step(fail)
)
Other applications might just search for a specific Step
for
another purpose, since they would all work as "stand alone" commands (even Step(4)
).
mk
, and then
some there are other really subtle uses.
The here documents are an if-then-elif-else-endif type construction.
In this outline we see the three alternatives (submarkers 1, 2, 3),
each of which has a here document block. The last alternative
doesn't start a here document, but does end the third one.
$Test(1): %J something %j
first block
$Test(2): %J something other %j
second block
$Test(3): %J another way %j
third block
$Test(*): default case $$ %j
Use "mk -VmTest -i" on a file with those lines to see it go. If you quit from the command prompt
mk
leaves the file in /tmp
for you, that might be a feature or a bug, see untmp
(1).
The other way to view a here document set is like a shell archive.
The sections could be installed into other filenames (then processed
with mk
or even executed). It is not an error to
remove or rename the here document files (%j
) in your command.
Under -a (all matched commands) we can unpack all the here document data in the file, which makes mk into a pretty smart archive unpacker. I would use uudecode or perl to unpack the data to be safer.
The 2 additional forms (%$
and
%?
) are largely used for automation.
Assume that some processor only knows the total of the numbers it
is producing after it has written them out. It can put the
total on the end of the here block:
$Numbers: %Jecho Total %$ ; cat %j
100
200
300
$Numbers: %j %^ $$ 600
Then "mk -mNumbers $file" outputs the header line and the list.
%_
%,
or %&
-A
continue with the next marked line
when this one succeeded. Group marked lines into a script, kind of the
opposite of %.
.
%\
, %>
,
%)
, %]
, %}
mk
has all of the expander magic above, we still fall back
on the shell variable expansion to trap the main program.
This allows the calling application to
replace xterm
with echo
to debug, or with
a script to trace actions, then execute the xterm
.
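The shell side of that convention is easy to try. The sketch below assumes the marked command names its main program as ${xterm-xterm}; the helper `main_program` is invented here just to show which program name the shell would pick.

```shell
#!/bin/sh
# Report which program a command written as "${xterm-xterm} ..." would run.
main_program() {
    printf '%s\n' "${xterm-xterm}"
}

unset xterm
main_program            # variable unset: the default name is used

xterm=echo
main_program            # caller's override: run echo to debug
unset xterm
```

Note the form without a colon: ${xterm-xterm} substitutes the default only when the variable is unset, so a caller can also set it to the empty string on purpose.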
Here are some examples to clear up some of the expansions:
xterm
for the file
with a pager in it, default to "less" if $PAGER is not set. Don't
try this unless $DISPLAY is set in the environment.
mk
is run with "-d color".
stderr
and fail. If this command is
not picked then fail anyway.
lex
over this file, recursively call mk
by
the name provided with the single letter options given on the output
C file, then remove the output C file, and move the a.out build to
our name without an extender.
Use mv -i if we don't confirm the Update marker.
cron
based task every hour to poll a list of
hosts. Using mk
we might break this problem down like this:
# $Poll: grep -v '\043' %f |xapply -P3 -f '$HOME/libexec/pollme' - host1 host2 ...
mk
on the list of hosts
27 * * * * /usr/local/bin/mk -smPoll $HOME/lib/poll.list
This has several advantages over combining the loop with the list of hosts.
pollme
script with a different
list of hosts for other tasks or to diagnose failed polls.
With another marker in the poll.list file we can reuse the host
list for a different task. We can mark the poll.list with
RCS/CVS/SVN keywords, as long as we comment them.
xapply
's -P
, option but
we don't have to change the crontab to tune that factor. And we
know what the list of hosts is used for by reading the file.
If we don't do these we risk losing track of what the list of hosts does, or which program uses it.
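The filter at the front of the Poll command is worth a second look. Once mk has turned '\043' into a hash, the shell is running grep -v '#', which drops every comment line (the marked line and any revision keywords included) and leaves only the hosts. A small sketch, with a made-up function name and a stand-in host list:

```shell
#!/bin/sh
# list_hosts <poll.list
# Drop every line that contains a hash; what is left is the host list.
list_hosts() {
    grep -v '#'
}

list_hosts <<'EOF'
# $Poll: the marked command would live on a comment line like this
# $Id$
host1
host2
EOF
```

This prints host1 and host2, one per line, ready to be fed to whatever loops over them.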
Mk
treats the comments in
a file as compacted shell commands.
These commands are extracted by matching a "marker" name to a token
prefix, then expanding a lot of percent expressions into a shell
command. The command is passed to the shell to do whatever the
marker means. Any "meaning" assigned to a marked command is
from the perspective of the person that wrote the markup:
mk
doesn't pretend to assign intent
to such names, much as make
doesn't.
This puts the details about a file's use in the file, rather than someplace harder to locate (like a crontab, Makefile, or script in another directory).
Mk
has a strong templating structure which allows files to only
include the marker commands that are different than the customary versions.
The culture around mk includes heavy use of shell code as well as recursive calls to mk.
Mk
is used as a back-end for:
Version identifier: $Id: expand.html,v 5.28 2010/08/13 19:26:04 ksb Exp $