To understand this document

This document assumes you are quite familiar with the standard UNIX macro processor m4(1), and the shell sh(1). If you've already used hxmd (and/or msrc) to manage the configuration of multiple hosts, this program makes a lot more sense.

Although distrib is older than hxmd, it looks as if it were coded after that program to make ad hoc command-line updates to remote hosts; the reverse is closer to the truth. Distrib was coded to make off-the-fingers updates easy, at a time when it looked like such updates were a good idea.

Really, that kind of easy update leads to madness later. Lots of hand-typed fixes just make the mess worse in the long run. After 25 years of running UNIX™ systems and other data center hosts I've found that I have little use for any spell that is not automated in a recipe file (with make, a script, or mk) and revision controlled.

What is distrib

Distrib sends files to remote hosts via rdist. It does this by macro processing a template distfile (see the rdist manual page) through m4. There is some ad hoc markup that may be applied to the distfile in addition to the common m4 markup to denote files which should also be processed with the same m4 context.

The same attributes bound to each host by hxmd are bound by distrib for macro processing (given the same configuration file), but distrib is limited to a single configuration file. To merge multiple configuration files use efmd in filter mode; see the HTML document for more details. The latest version of distrib accepts configurations on stdin just to support sites that are converting from one to the other.

The purpose of this program is to update binary compatible hosts with exact copies of files, or nearly identical copies when the file is marked-up with m4. In an effort to reduce typing most of distrib's command-line options are intended to simplify shell access to complex operations.

I suggest using msrc and a recipe file rather than distrib and an interactive shell: as explained later in this document, msrc is faster, safer, and easier to explain to another person.

Here are the common usage forms:

distrib [-I | -a | -t type] [-S] [-bHnqRvxy] [-C config] [-d var=value] [-D var=value] [-f distfile] [-G guard] [-k key] [-o rdist-opts] [-p rdistd] [-P rsh] [hosts]
Update the specified hosts via the distfile specified.

This is the form used in most recipe files to send a set of files to a lot of hosts. Then some other recipe usually gets a shell on the remote host(s) to finish the update. This was the core of the older master source structure.

distrib -m machine [-S] [-c] [-bHnqRvxy] [-C config] [-d var=value] [-D var=value] [-f distfile] [-G guard] [-k key] [-o rdist-opts] [-p rdistd] [-P rsh] [labels]
Update a single machine with the given distfile (or the default one), and specify which parts of that file to use via the listed labels. Note that "single" is a lie: you can provide a list with commas. That's where the dmz script in the msrc source gets its usage (see the HTML document).

This was (rarely) used to push a subset of files on a per-platform basis to some target hosts. We use mmsrc or msrc (in local mode) to select a different SEND or MAP file-set in the new structure.

distrib -c [-I | -a | -t type] [-S] [-bHnqRvxy] [-C config] [-d var=value] [-D var=value] [-G guard] [-k key] [-o rdist-opts] [-p rdistd] [-P rsh] [-s cmd] [hosts]
Generate a temporary distfile from the command-line specification and update selected hosts with it.

This is used from the command-line or a recipe file to send some updated file to a set of hosts. Quite often this gun is loaded with more bullets than you expect.

distrib -h
Display a really complex usage message.

This might be the program that convinced me that on-line help was not optional.

distrib -V
The standard version output from any ksb tool.

I'm pretty sure this was the program that taught me to put a version string in all my programs.

History first

Distrib used to be a key part of the master source structure: every file pushed to every host (source, binary, or configuration data) would be pushed with it. This made pull versions of the master source hard to do (in fact mostly impossible in the general case). I set about to fix the structure by creating the hxmd/xapply/ptbw wrapper stack to remove the single threaded nature, and the msrc/xclate parts to deal with the input/output issue. Those work quite a bit better than distrib.

I expected to just remove distrib from my hosts.

It turns out that distrib is still useful when you run a lot of hosts that don't want to compile every little program. I don't do that (normally) -- I just compile every program on every host, or build a system package to install on the hosts that don't have a compiler, then use apt-get, yum, or some other package manager to fetch the update from a locally controlled repository.

Others still love the automation they've created around distrib, and so we keep it working for them.

Primitive ancestor to msrc

If you've already learned to use the newer msrc chain (see the HTML document) then this is all going to seem like a kludge. Bear in mind that this was the prototype, not the result of 20 years of effort. If you've not seen msrc you might browse the quick start page there before you read this; you might never come back to this page.

Distrib just made it way easier to send installed files and source code to many remote hosts from the command-line, or a recipe file. It was built as a "better than what we have" effort, not designed with a lot of operational experience. For what it is, it works pretty well.

I'll admit that I almost never use this program anymore. That's because msrc is faster, safer, and easier to explain to another person. Yet, in a strange way, distrib is still useful for ad hoc updates to common text files, at the very least. For example, to send an updated message of the day file to all my servers I might run:

# distrib -ac -C npc.cf /etc/motd HOST:/etc/motd
The issue I have is that I don't really have the exact same message of the day on every host, so that command doesn't do what I really want. But if it does for your environment, keep reading.

The configuration file

We take the same configuration file as hxmd, which is better documented in the hxmd HTML document. The one slight change is that the default header is more than just the HOST macro:
%HOST SHORTHOST HOSTTYPE HOSTOS HASSRC

Unlike msrc, distrib expects a few more attributes for each host:

HOST -- a name that targets the host's preferred network interface, or its FQDN
This is the unique key for each node. It is the destination for each rdist file transfer.
SHORTHOST -- a command-line abbreviation for the host
This is just a name for the host that might be easier to type on the command-line. That really was not a great idea in the long term (because it makes typing errors more common).
HOSTTYPE -- hosts that run the same binary files should have the same value
I have a set of them I've used for years. You can make up your own, or use mine.
HOSTOS -- the version of the operating system
I make this a 3 digit "base 100" number for easy comparison in m4 and cpp. For example "11.22" becomes "112200" (because 11.22.0 is the same thing), and 4.1.3 becomes 40103 (since we get 2 base 10 digits for each base 100 digit). I would stick to integer numbers, but you could use strings.
HASSRC -- this host has a compiler, otherwise undefine this (with .)
If this is set to a non-empty string it means that the host matches the -S specification. This is mostly a shorthand for the command-line.
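
The HOSTOS encoding can be sketched in shell as follows. This is only an illustration of the arithmetic described above: hostos_encode is a hypothetical helper name, not something distrib provides, and it assumes at most three dotted numeric fields.

```shell
# hostos_encode: hypothetical helper (not part of distrib).
# Pack up to three dotted version fields into one integer, two decimal
# digits per field, so that plain integer comparison orders releases.
hostos_encode() {
	# awk reads a missing third field as 0, so 11.22 == 11.22.0
	printf '%s\n' "$1" |
	awk -F. '{ printf "%d\n", ($1 * 100 + $2) * 100 + $3 }'
}

hostos_encode 11.22	# prints 112200
hostos_encode 4.1.3	# prints 40103
```

Because the results are plain integers, a shell test such as [ "$(hostos_encode 4.1.3)" -lt "$(hostos_encode 11.22)" ] works, which is the same convenience the encoding buys in m4 and cpp.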

Lots of the logic in distrib knows about these macros. They are each special in some way, and the logic for them is all over the code. Under hxmd the only macro with a fixed name is HOST, and (like distrib) -k can change that name.

You can change the header markup line (as in hxmd), but when you don't define some of these macros things just don't work as you'd expect. Think of it as an invariant that you must define these (except for HASSRC) to use distrib.

Host selection

Compared to hxmd's options these might seem odd: that's because they were built as we needed them, more than as a unified structure. The "unified" part here is the set of macros expected in every configuration file.

The configuration file (-C config)

The list of machines and attributes is in hxmd's configuration format. If the name is dash (-) then stdin is read.

By default distrib processes files with the same HOSTTYPE as the local host (assuming it can find that information). If the host is not listed in the specified config it checks for a definition of the attribute MYTYPE. It also draws a parallel default from MYOS.

There is also a compiled-in default (shown under -V).

$ distrib -V
distrib: $Id: distrib.m,v 5.9 2009/10/12 20:23:20 ksb Exp $
distrib: using configuration from `/usr/local/lib/distrib/distrib.cf'
distrib: default column headers: %HOST SHORTHOST HOSTTYPE HOSTOS HASSRC
distrib: compiled HOSTTYPE is FREEBSD
distrib: library path "/usr/local/lib/distrib"
distrib: rdist binary is "/usr/local/bin/rdist"
distrib: m4 binary is "/usr/bin/m4"
distrib: template for private file space: fdistXXXXXX

Include myself (-I)

By default distrib excludes the current host from an update (since that might clobber a file with itself). Under -I it includes itself in the target list.

This is mostly used under -H, so that we include ourselves in the list of hosts. Nowadays I'd use efmd to get the list of hosts from a configuration file.

$ distrib -Cnpc.cf -H | grep `hostname`
$ distrib -Cnpc.cf -H -I | grep `hostname`
sulaco.example.com

All hosts (-a)

Ignore any selection to update all machines in the specified configuration. This is usually used for common configuration files or text files like /etc/motd.

As in the example above, this defeats some of the internal restrictions in the host selection logic. In this case only hosts with the same HOSTTYPE would be selected, while under -a all hosts are:

$ distrib -Cnpc.cf -H | wc -l
22
$ distrib -Cnpc.cf -H -a | wc -l
98

Select hosts via a guard (-G guard)

An m4 clause to select target hosts. This is not exactly like hxmd's option with the same name: this guard must produce a non-zero, non-empty string for each node that should be selected. This proves to be quite limiting, as a host may only select itself.

For example we'll select the hosts on this subnet by producing the string "yes" for them (rather than the empty string).

$ distrib -C npc.cf -H -G "ifelse(DEFNET,192.168.87.0,yes)"
w01.example.com
w03.example.com
...
(I'm not showing where DEFNET comes from, assume it must be part of the configuration file.)
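
One way to experiment with a guard before handing it to distrib is to run the same m4 expression by hand. The sketch below is an approximation under stated assumptions: guard_selects is a hypothetical helper, DEFNET is set with m4's -D rather than from a configuration file, and the real selection code inside distrib may differ in detail.

```shell
# guard_selects: hypothetical helper (not part of distrib).
# $1 = value to bind to DEFNET for this node, $2 = guard text.
# Succeeds when the guard expands to a non-empty, non-zero string.
guard_selects() {
	result=$(printf '%s\n' "$2" | m4 -DDEFNET="$1")
	case "$result" in
	''|0)	return 1 ;;	# empty or zero: node is not selected
	*)	return 0 ;;	# anything else: node is selected
	esac
}

GUARD='ifelse(DEFNET,192.168.87.0,yes)'
guard_selects 192.168.87.0 "$GUARD" && echo "selected"
guard_selects 10.0.0.0 "$GUARD" || echo "skipped"
```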

Select hosts via HASSRC (-S)

This is the more primitive version of hxmd's -B specification: it only works on the HASSRC macro.

If the macro is set to a value for a node then it meets the conjunction. This was quite often used in the older master source structure to pick the hosts used to compile binary files. For example:

$ make HOSTS=-SCnpc.cf install

In that make control recipe file there is a set of default values to lever this:

INTO=	/usr/src/local/bin/mkcmd
HOSTS= -S
MDEFS=
DDEFS=	-dINTO=${INTO} ${MDEFS} ${HOSTS}

And some macros to lever those values (or overrides from the command-line) to take generic actions:

SSH=ssh
LOOP=	-for i in `distrib -H ${HOSTS}` ; do \
		echo $$i: ;\
		${SSH} $$i -n sh -c '". /usr/local/lib/distrib/local.defs && cd ${INTO} && ${MAKE} DESTDIR=${DESTDIR} DEBUG=${DEBUG} $@"' ;\
	done
HERE=	distrib -E -f Make.host -m `hostname` | ${MAKE} -f - $@
Note that nowadays we spell the platform recipe as Makefile.host, but we used to spell it wrong.

We'll push the current directory to each target host with:

rsource: Distfile msource
	distrib ${DDEFS}

Later in the recipe we can run a target on each target remote host with:

install:
	${LOOP}
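
Unrolled into plain shell, the loop looks roughly like the sketch below. It is a dry run under stated assumptions: HOST_LIST holds sample names standing in for the output of distrib -H ${HOSTS}, the sourcing of local.defs and the DESTDIR and DEBUG settings are left out, and echo replaces ssh so that no host is contacted.

```shell
# Dry-run sketch of the LOOP macro; nothing is sent to any host.
HOST_LIST="w01.example.com w03.example.com"	# sample names only
INTO=/usr/src/local/bin/mkcmd

remote_loop() {
	# $1 = make target to run in ${INTO} on each host
	for i in $HOST_LIST; do
		echo "$i:"
		echo "ssh $i -n sh -c 'cd ${INTO} && make $1'"
	done
}

remote_loop install
```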

Select hosts by HOSTTYPE (-t type)

This is the more primitive version of hxmd's -E specification: it only works on the HOSTTYPE macro, and only for equality.

With no hosts listed this selects only the hosts that have one of the listed types (a comma separated list). For example:

$ distrib -Cnpc.cf -H -t SUN5 | wc -l
33
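
The equality test behind -t can be pictured as a filter over the configuration file. The sketch below is not distrib's code: select_by_type is a hypothetical name, the input is a simplified configuration (whitespace-separated columns in the default header order, no quoting), and the sample rows, including the HPUX11 type, are made up.

```shell
# select_by_type: hypothetical sketch (not part of distrib).
# $1 = comma separated list of types; config is read from stdin.
# Prints HOST (column 1) when HOSTTYPE (column 3) equals a listed type.
select_by_type() {
	awk -v types="$1" '
		BEGIN { n = split(types, want, ",") }
		/^[#%]/ || NF == 0 { next }	# skip header, comments, blanks
		{
			for (i = 1; i <= n; i++)
				if ($3 == want[i]) { print $1; break }
		}'
}

printf '%s\n' \
	'%HOST SHORTHOST HOSTTYPE HOSTOS HASSRC' \
	'w01.example.com w01 SUN5 51000 yes' \
	'w03.example.com w03 FREEBSD 70200 .' \
	'w05.example.com w05 HPUX11 110000 .' |
select_by_type SUN5,FREEBSD
```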

Select hosts by name (hosts)

Each node's HOST and SHORTHOST macros are compared to each of the hosts; if either matches, the node is selected. This is very much like the hxmd -G specification, but more forgiving in that it accepts an abbreviated hostname.

For example we can convert the abbreviated hostname into the key value:

$ distrib -Cnpc.cf -H -e nostromo
nostromo.example.com
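
The name match can be sketched the same way as any other column filter. This is an illustration, not distrib's implementation: select_by_name is a hypothetical name, the configuration is simplified to whitespace-separated columns in the default header order, and the sample rows are made up.

```shell
# select_by_name: hypothetical sketch (not part of distrib).
# Arguments are host names; config is read from stdin.
# Prints HOST (column 1) when HOST or SHORTHOST (column 2) matches.
select_by_name() {
	awk -v names="$*" '
		BEGIN { n = split(names, want, " ") }
		/^[#%]/ || NF == 0 { next }	# skip header, comments, blanks
		{
			for (i = 1; i <= n; i++)
				if ($1 == want[i] || $2 == want[i]) { print $1; break }
		}'
}

printf '%s\n' \
	'%HOST SHORTHOST HOSTTYPE HOSTOS HASSRC' \
	'nostromo.example.com nostromo FREEBSD 70200 yes' \
	'sulaco.example.com sulaco FREEBSD 70200 .' |
select_by_name nostromo
```

Either spelling of a name selects the node, but the key that comes out is always the HOST value.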

Select by another key (-k key)

Specify a different comparison key for a node's name. This was dangerous (at best) and still is under hxmd. It does the same thing in both programs: replace the HOST macro with a user specified macro name.

Don't do this when you can use efmd. It is very likely to confuse distrib.

The interface to m4 and the shell

Distrib's interface to m4 is not as complete as hxmd's: there is no way to send -d, -I, or -U down, or send a file as hxmd's -j does.

Send a command-line define to m4 (-D var=value)

This sends the define on to m4. It doesn't know anything about recursion or building a merged configuration file, so that's as far as it goes.

This is used to tune the target files, for example a spell that needs a file with the present Julian day might have a recipe which includes:

DDEFS= -DDAY_STAMP=`date +%Y/%j`
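
To see what m4 will receive, the command substitution can be run by hand. This is just the recipe line above unrolled in the shell; the echo stands in for the distrib invocation.

```shell
# Build the same -D argument the recipe line produces.
# date +%Y/%j gives the four digit year and the three digit day of year.
DAY_STAMP=$(date +%Y/%j)
DDEFS="-DDAY_STAMP=${DAY_STAMP}"
echo "distrib would be given: ${DDEFS}"
```

In a recipe file make normally keeps the backquotes as literal text in the macro, so date runs each time a command using ${DDEFS} is handed to the shell.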

Trace and debug options (-n and -x)

Distrib is sometimes hard to use, so having a "don't really hurt me" option (-n), and a "please show me why you are doing this!" option (-x) is pretty useful.

When getting started with distrib you can make liberal use of these options. Note that it doesn't stop the host selection logic (or show that to you).

The interface to rdist

In effect distrib is just a front-end for rdist, so it stands to reason that it should take some of the same options.

Since it was coded before rdist version 6 was released it takes the older rdist command-line specification. It may translate or pass the options on as given. I'm not going to limit that because I don't know which one is more sane.

This super-tight coupling to rdist is the primary bug in distrib. Note that msrc pushed the file distribution much further down the stack, and is designed to allow completely local operation.

Perform binary comparisons (-b)

Turn on rdist's binary comparison feature.

This is not usually all that useful, unless the clocks on your hosts drift a lot.

Pass a definition on to rdist (-d var=value)

Exactly as given.

This is used for the "dead-man's switch" that assures that the recipe file in the current directory is calling rdist through distrib:

DDEFS=	-dINTO=${INTO} ${MDEFS} ${HOSTS}

Name a distfile to process (-f distfile)

Since rdist defaults to distfile or Distfile you might want to specify something else -- but in my youth I made the mistake of using the same name for the marked-up file.

Of course running plain rdist over the m4 and @file@ marked-up file leads to insane results. So I put in an undefined rdist macro (${INTO}) which would stop rdist before it could break anything. Nice save, but I shouldn't start by painting myself into a corner before I finished the first can of paint.

This is largely used under -E (see below).

Pass on options (-o rdist-opts)

Exactly as given. This was added when version 6 of rdist became common.

For example to update the script but not the binary executables in a local directory I might run:

distrib -c -o noexec /usr/local/libexec/hostlint HOST:/usr/local/libexec/hostlint

Specify the remote rdistd path (-p rdistd)

Exactly as given.

In the new system this is a host attribute (RDISTD_PATH).

Specify the local path to the transport program (-P transport-path)

Exactly as given.

In the new system this is a host attribute (SSH).

Silence rdist (-q)

Tell rdist to be quiet, as given.

Tell rdist to remove extraneous files (-R)

Exactly as given.

In the new system I would use a clean target in the platform recipe, or use rm itself on the msrc command line.

Insert a special command under -c (-s cmd)

Nowadays we'd use msrc and just put the command on the end of the line. This was largely used to run install or some other update tool. It is way more sane to put the command in a recipe file and revision control it, manage it, and run it without using your hands so much.

And that is before the quoting issues of getting the command through the shell, rdist, and the remote shell with the phrasing you meant.

Younger files mode (-y)

Exactly as given. Used when you are thinking the remote host may have made a local update to a file.

This usually means you've lost control.

Tell rdist to verify (-v)

Ask rdist for a verification report only.

This is one you can test yourself. It doesn't break anything.

Command-line modes of operation

Distrib is really more than 1 tool, or maybe it is a Swiss Army Knife™ with a few too many flashlights included. In any case these options select a command line usage to engage any of the many blades, flashlights, and/or cyanide tablets available.

Use a command-line distfile (-c)

In this mode the same list of target files is sent to a set of target hosts. The last parameter is a host:directory destination, where the host part is almost always the word HOST (so distrib can iterate over the list). Hey, if you really want to send the files to the wrong host, you may.

For example, let's look at how we might update two files from /etc (note the -nx flags):

$ distrib -cnx -t AARDVARK /etc/pf.conf /etc/inetd.conf HOST:/etc
distrib: echo "define(MYTYPE,\`FREEBSD')define(...)dnl" |
distrib: m4 - /tmp/fdistxD0ar4/cdistU5QbzF |
distrib: rdist -P /usr/local/bin/ssh-x -p /usr/local/bin/rdistd -f - -n
updating host nostromo.example.com
install -onochkgroup,nochkowner,noexec /etc/pf.conf /etc
install -onochkgroup,nochkowner,noexec /etc/inetd.conf /etc

Output just the list of hosts to update (-H)

Nowadays one might use efmd to do this. Since all the logic to process host selection is spread throughout the code, this option must be internal.

This is the mode we've used throughout the examples in this document.

Change the usage to allow labels (-m machine)

This changes the command-line usage to treat parameters on the end as rdist labels. That allows more complex distfile access. Nowadays we'd use hxmd and call rdist ourselves, or use msrc and a PRE_CMD hook. See the hxmd HTML document.

It is also often used in combination with -E (below).

Output the m4 processed distfile on stdout (-E)

Nowadays one would use efmd to do this (or even hxmd). This doesn't work completely, as the @file@ markup may create temporary files which have already been removed when the file is being read. Hxmd was designed to overcome exactly this issue.

This is almost never used to look at Distfile; it is much more often used to look at the platform recipe file to find markup errors in it. For example, to see the Make.host as it is in the payload:

distrib -E -m sulaco -Cnpc.cf -f Make.host 2>&1 |less

Force @file@ processing in files not named distfile (-F)

This is a total hack. The @file@ markup is triggered by the name of the file we are processing, and this adds that markup to any file (no matter the name). I'd never do this today.

Bugs

The command-line options don't match the msrc_base chain options and they should. There is little point to reworking distrib to make it look like a (modern) master source tool, since it is actually easier to process a marked-up distfile via efmd and pass the output to rdist to send the required files (which could also be processed with efmd or hxmd).

The program is not well maintained: I don't use it nearly as much as I use msrc, or even dd.


$Id: distrib.html,v 5.13 2012/08/17 19:03:37 ksb Exp $