To understand this document

This document describes one of the two master source platform pull services. The other is mpull. If you want the client to do the configuration of the shadow copy, then you want to read about that client. If you need to use remote meta-information to configure the shadow copy, then you are in the right place.

If you have never read the msrc primer, please read it before you continue with this document.

To install this service you must be able to edit system configuration files (viz. /etc/inetd.conf, /etc/tcpmux.conf, or /etc/xinetd.d/msrcmux.conf). Then you must be able to restart (reload) the appropriate service.

Most implementations of inetd included an implementation of tcpmux, until recently. Since some versions of xinetd do not include one, I coded a stand-alone version that you may configure to support local services. See tcpmux, if your local version of inetd doesn't internally support the RFC1078 mux.

What is msrcmux?

The manual page describes msrcmux as offering platform archives of layer 2 (product) master source directories. This allows recognized clients to request the configured shadow of a specific source directory for download, via an anonymous data service. The client must have an IP address that reverse-maps to a hostname contained in the meta-configuration data on the server, and it must specify a legitimate path to the product directory.

This sounds like it could be a security issue, but there are filters one can put in to mitigate such risks. More about that later, because putting a cipher or other checks in makes this really hard to debug. So please start with clear-text tar files, then compress them, then encrypt them, only as needed.

When would a client connect to msrcmux?

Some people push updated (or new) configuration files to the instances they manage. Some set up pull structures which allow their instances to `self configure' by requesting updates from the (local) configuration management service. I am not an advocate for either method. I push when needed, and pull when needed. Both work and scale just the same, if you engineer the structure well. It is clear to me that you need a push to set up a pull, and need a pull to assure that all the pushes assumed to be installed were, or still are. See hostlint.

Clients connect to msrcmux to pull updated configuration data from their configuration management server. In the msrcmux case, that data comes only from master source directories which are under a specific directory hierarchy. The service may also limit the meta-configuration files available to the clients by directory. These limits are intended to allow enough restriction to assure no private data radiates outside known clients, while still allowing delivery of useful updates.

That is a fine line. Allowing global client access to your CVS repository is a much larger data-leak risk.

As for the specific time a client would request any given update, that is an engineering issue. I will say that having a large population of clients request services at exactly the same second is a bad idea. It is far better to configure a window for each client (or group of clients) to assure they do not all pick the same time to pounce on the service.
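
One way to give each client its own slot in a window is to hash the host name into a delay. The following is only a sketch: pull_offset is a hypothetical helper, and the one-hour default window is an assumption, not anything msrcmux provides.

```shell
#!/bin/sh
# Sketch: spread client pulls across a window by hashing the host name.
# pull_offset is a hypothetical helper; the default window is an assumption.
pull_offset() {
	host=$1 window=${2:-3600}
	# cksum prints "<crc> <length>"; keep the CRC as a stable number.
	crc=$(printf '%s' "$host" | cksum | cut -d' ' -f1)
	# Reduce the CRC to a delay in seconds inside the window.
	echo $((crc % window))
}
# Example use on a client (not run here):
#   sleep "$(pull_offset "$(hostname)" 3600)" && your-pull-script
```

The same host always gets the same delay, so the load on the service stays spread out from one cycle to the next.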

It is also a good idea to provide locally cached copies of the master source to reduce network latency. Disk space is cheap and rsync is really nifty. That is not to say that every client should cache all parts of your source hierarchy: set a local policy that works for you.

How do clients connect to msrcmux?

Each client connects with muxcat, see muxcat(1l). There is a recipe file which pulls the whole build-chain (msrc_base, install_base, and all the level 2 packages) into the home directory of a mortal login. At npcguild.org you might get it from msrc/opt/ksb/home/, but your local site policy may have a custom location.

What if the reverse lookup for my hosts doesn't work?

This is actually quite common, so I fixed it. Provide a reverse map file under -R on the msrcmux command-line. This forces the mux to call mk to select a shell command from the reverse file to map the host to the correct name in the config file:
mk -s -l0 -mhostname -sIP-address -DCONFIG=config reverse
Note that the hostname will be @ for unmapped IP addresses.

If the reverse specified on the command line is dot (.), then the configuration file is searched for a matching marked line. All the normal mk judo works: build a script to find the name or match it from a map file of regular expressions, or chain to some other program. The client connection is actually still open in stdin, but this is not a good way to augment the protocol, really.

Example reverse map file

This reverse file maps the loop-back address to our hostname, 3 RFC 1918 addresses to local hostnames, and every other host to the name the local resolver returns (in lowercase).
# $Id: ....
# $*(127.0.0.1): hostname
# $@(10.5.34.83): ${echo:-echo} predator.example.com
# $@(10.5.34.144): ${echo:-echo} sulaco.example.com
# $@(10.5.34.233): ${echo:-echo} nostomo.example.com
# $@(*): ${false:-false} "unmapped reverse for an unknown IP"
# $*(*): ${echo:-echo} %M
The unmapped reverse for an unknown IP line should issue some warning to the client, and log the failure to trigger the remediation of the client's lack of a reverse mapping, policy configuration, and owner. Moreover, use this to find the party trying to pull configuration data from your configuration management server to reconnoiter your hosts.
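
A minimal sketch of such a warn-and-log step follows. The names reject_unmapped and LOGGER are hypothetical, the syslog facility is an assumption, and how the helper is wired into the catch-all line of your reverse file is left to local policy.

```shell
#!/bin/sh
# Sketch of a helper the catch-all line could run for an unmapped client.
# reject_unmapped and LOGGER are hypothetical names, not part of msrcmux.
reject_unmapped() {
	ip=$1
	# Say why the pull failed (on stderr, as the catch-all examples do).
	echo "msrcmux: no reverse mapping for $ip, contact the site administrator" 1>&2
	# Leave a record so someone follows up on the missing mapping.
	${LOGGER:-logger} -t msrcmux -p auth.warning "unmapped reverse for $ip"
	# Fail, so no host name is produced for this client.
	return 1
}
```

The log record is the part that closes the loop: it names the address, so the owner of the client (or the intruder) can be tracked down.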

This is a clear opportunity to close-the-loop on many levels. Don't let it slip away.

Example services

On my workstation I run this service from /etc/tcpmux.conf
# supply test hosts with msrc configurations
msrcmux stream tcp nowait bob /usr/local/libexec/msrcmux msrcmux -Zmux.zf /usr/msrc
The login bob is the builder of all the configured tar archives. Bob can write in /tmp and has a home directory that is empty and owned by source.source, not bob. The ownership prevents any build from writing in $HOME, but the group (source) allows read access to things like common shell configuration files.

My local rsync'd copy of the master source is kept where I have space to do it (under /home/local). Clients have no visibility into that location. But it does set the default INTO to /home/local/src if it is not set in any master recipe file. Since I always set INTO, that never happens. I could use a symbolic link from /usr/msrc to /home/local/msrc to trick the makefile code.

If you run a platform that uses xinetd then you'll need this stanza:

# This is the configuration of the xinetd tcpmux service.
service msrcmux
{
	disable      = no
	id           = tcpmux-msrcmux
	type         = TCPMUX UNLISTED
	wait         = no
	socket_type  = stream
	protocol     = tcp
	env          = PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/bin:/usr/bin
	user         = bob
	group        = source
	server       = /usr/local/libexec/msrcmux
	server_args  = -R /usr/local/lib/policy/dmz.rev -Z mux.zf /usr/msrc
}
Install that file (replacing the site-specific values in accordance with local site policy). Also enable tcpmux-server. Then reload xinetd to activate the new service.

Building the extracted directory

A client host pulls the tar archive via the generic muxcat client, see muxcat. For example:
$ muxcat msrc.rocks.example.com msrcmux local/bin/oue >/tmp/$$.tar
$ file /tmp/$$.tar
/tmp/2063.tar: POSIX tar archive
Then builds it as a platform source directory:
$ D=`mktemp -d /tmp/${USER:-${LOGIN:-nobody}}XXXXXX`
$ cd $D
$ tar xvf /tmp/$$.tar
x ./
x ./ITO.spec
x ./machine.h
x ./README
x ./TODO
x ./oue.html
x ./oue.m
x ./oue.man
x ./ouereg.sh
x ./Makefile
$ make
explode -s dicer.c
mkcmd oue.m
(cmp -s prog.c main.c || (cp prog.c main.c && echo main.c updated))
main.c updated
(cmp -s prog.h main.h || (cp prog.h main.h && echo main.h updated))
main.h updated
rm -f prog.[ch]
explode -s dicer.h
/usr/bin/gcc  -DFREEBSD -I/usr/local/include -o oue main.c  -L/usr/local/lib -lgdbm
$ ./oue -V
oue: $Id: ...
...
To clean up after the build:
$ cd ..
$ rm -rf $D $$.tar

This allows a client to pull any number of archives, build them, install them and leave no local debris (other than the updated configuration and any install backup files). Thus no large /usr/src/local spool is required for this model.

However this model also doesn't support building layer 3 packages. Those must be converted to some binary package format.
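
The pull, build, and clean-up steps above can be wrapped into one function. This is a sketch under stated assumptions: pull_build is a hypothetical name, and MUXCAT and BUILD are hypothetical override hooks added here so the function can be tried without a live service.

```shell
#!/bin/sh
# Sketch: pull one product directory, build it in a scratch directory,
# and remove the scratch directory afterwards.
# pull_build, MUXCAT and BUILD are hypothetical names, not part of msrcmux.
pull_build() {
	server=$1 product=$2
	D=$(mktemp -d "${TMPDIR:-/tmp}/${USER:-nobody}XXXXXX") || return 1
	(
		cd "$D" &&
		# Fetch the platform archive from the RFC1078 service.
		${MUXCAT:-muxcat} "$server" msrcmux "$product" >pull.tar &&
		tar xf pull.tar &&
		# Build with the platform recipe (a plain make by default).
		${BUILD:-make}
	)
	status=$?
	# Leave no local debris, whether or not the build worked.
	rm -rf "$D"
	return $status
}
# Example (not run here):
#   pull_build msrc.rocks.example.com local/bin/oue
```

Keeping the archive inside the scratch directory means a single rm removes everything the pull created.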

Building a map file for hosts that don't reverse correctly

It is better to map the reverse of a host to the FQDN of the host. Sometimes for other network topology reasons the name used in site.cf is an alias for 1 interface, rather than the FQDN that might map to multiple network interfaces. In that case the reverse name will not match the forward used by site.cf.

These are easy to find. For each IP, if the name used forwards to and maps back to the same name we are good. Otherwise we need to build a map from the name we got to the name we have:

#!/usr/bin/env ksh
# $Id: ...
# Build a prototype reverse file from an hxmd configuration file. (ksb)
hxmd -C${1:-site.cf} 'echo HOST $(dig HOST +short)' |
xapply -f 'Rev=$(dig -x  %[1 2] +short |sed -e s/\\.\$//)
	: ${Rev:=$(host %[1 2] |sed -ne "s/\\.\$//" -e "s/.*pointer //p")}
	: ${Rev:=@}
	[ _"$Rev" = _%[1 1] ] || echo "# \$$Rev(%[1 2]): \${echo:-echo} %[1 1]"' -

That spell builds a new reverse map file. Make the map file depend on the configuration file in a make recipe and you are in business. You may have to add some catch-all lines at the end to map unknown clients. For example:

# map unknown clients to their reverse name (lower cased)
# $@(*): ${echo:-echo} "IP %d: not authorized" 1>&2 ; ${false:-false}
# $*(*): ${echo:-echo} %M
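
The line format the spell emits can be seen in isolation with a few lines of shell. This is a sketch only: map_line is a hypothetical helper that takes the configured name, the IP address, and the reverse name (defaulting to @ when there is none).

```shell
#!/bin/sh
# Sketch: emit one reverse-map line when a host's reverse name differs
# from the name in the configuration file.  map_line is hypothetical.
map_line() {
	name=$1 ip=$2 rev=${3:-@}	# "@" stands for no reverse mapping
	# Forward and reverse agree: no map line is needed.
	[ "_$rev" = "_$name" ] && return 0
	printf '# $%s(%s): ${echo:-echo} %s\n' "$rev" "$ip" "$name"
}
map_line predator.example.com 10.5.34.83
# prints: # $@(10.5.34.83): ${echo:-echo} predator.example.com
```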

Default values provide useful clues

Some clues are built into the stock structure via 2 files in hxmd's library: auto.cf and mux.zf.

We'll talk about mux.zf first. Here is an example from my installation:

# $Id: mux.zf...
# Zero configuration file for msrcmux pull actions.			(ksb)
#
# Provided in the RFC1078 service name:
MSRCMUX=`msrcmux'
# Clients access this service by this CNAME, A record, or SRV:
MSRCMUX_MPS=`msrc.example.com'
# We were provisioned with this configuration file:
MSRCMUX_CFG=`HXMD_OPT_C'

That zero-configuration file sets the attribute MSRCMUX to the name of the (default) RFC1078 service, "msrcmux". This allows checks for -BMSRCMUX or ifdef-markup (based on that macro) to allow any master source level 2 directory to modify its output when called from our msrcmux service.

Since it is a zero configuration file the values are defaults, and any host definition might provide an explicit value to replace them.

Limit access to specific directories

A client is allowed to pull any directory if it is in any valid configuration file, and its IP reverses to a known host. Any directory might include a Remote.hxmd (aka. Msrcmux.hxmd) which limits access to hosts that do not (or do) have a specific attribute.

One might wish to limit access to files which carry protected information (ssh keys, passwords, SSL certificates, etc.), which would normally be pushed over ssh. To implement that policy, let's use the MSRCMUX from the zero configuration file (above) to flag a general limit for pull-clients. Then each product which should not allow such clients may include these lines in Remote.hxmd:

# Never allow msrcmux client pull access because ...
# See mux.zf in hxmd's lib.
-B!MSRCMUX

This clearly documents the reason for the limit in each directory, and the location of the attribute definition. The provision spell to make that happen is the -Z option to msrcmux, which names the file mux.zf that msrc_base builds in hxmd's library directory. You are welcome to build one which better fits your needs. This could be specified by local site policy: for example, local policy might require an attribute name like MSRCMUX_ACCESS_REQUEST.

A local example of this is the hostlint-policy directory, which only allows hosts which have the policycache SERVICE defined to pull that policy. There is no restriction on rsync access to that directory, so clients could pull the directory and run mmsrc to install it. The limit is in place to prevent distribution of a policy directory to hosts that might keep an out-of-date copy: by limiting the places we have to update the policy, we assure a more timely update for all.

Using the auto.cf cycle

The auto.cf file then adds attribute assignments for some macros that might allow the client to reconnect (as listed in the zero-configuration file, mux.zf, above):
MSRCMUX
The service polled.
MSRCMUX_MPS
The CNAME that the service wants clients to connect to, or the name of the host when none are provided.
MSRCMUX_CFG
The configuration file selected for the host.
These automate the pull update cycle by intent, and also allow an update to change the enclosing loop for the next update cycle by changing the name of the host to poll. This requires the current service to advertise a new CNAME, which may be moved to the new service after at least 1 whole update cycle. A flip of that CNAME then moves the update record atomically.

Display the configuration of a client

This is a really good trick. It is much easier to debug issues with a client if you can see the hxmd configuration data the server is using for your client. But it seems like it might be hard to fetch that data in a clear presentation, since it might be drawn from any number of configuration files.

But with msrcmux it is really not as hard as you'd think. It does have a lot of levels of quote and markup, but then again think about who coded it. There are 5 files needed to construct the output:

A Makefile to drive the data collection:

# $Id: Makefile,v ...
# Reflect our configuration back to a pull client, as best we can. (ksb)
INTO=/usr/src/opt/ksb/which

GEN=
SOURCE=Makefile Makefile.host README

source: ${SOURCE} ${GEN}

FRC:

# msrc hook
__msrc: source

An hxmd cache directory to capture the output from an m4 dumpdef call. We'll call the directory defs.host so msrc will add it to MAP.

For that directory we need a defs.host/Cache.m4 recipe file:

`# $Id: Cache.m4,v ...
# Record all the m4 definitions we know about for this host.	(ksb)
'changequote([,])dnl
[Q='
O=`
]changequote(`,')
`all: suppress
	@TF=${O}mktemp $${TEMPDIR:-/var/tmp}/defXXXXXX${O} &&\
	grep -v ${Q}^#${Q} suppress |oue -D$$TF >/dev/null &&\
	efmd'dnl
ifdef(`HXMD_OPT_C',`` -C 'HXMD_OPT_C')`'dnl
ifdef(`HXMD_OPT_X',`` -X 'HXMD_OPT_X')`'dnl
ifdef(`HXMD_OPT_Z',`` -Z 'HXMD_OPT_Z')`'dnl
` -G 'HOST` -F0 dump.host 2>&1 |oue -k ${Q}%[1 1]${Q} -R ${Q}%1${Q} -I$$TF |\
	grep . ;\
	rm -f $$TF

defs: all

HOST: all
'dnl

That recipe depends on a file defs.host/suppress. This file removes any macros from the dumpdef output that either are noise or should never be made public.

# $Id: suppress,v ...
`builtin'
`sysval'
`dnl'
`dumpdef'
`indir'
`paste'
`divnum'
`m4wrap'
`substr'
`maketemp'
`__line__'
`expr'
`undivert'
`len'
`include'
`divert'
`__file__'
`undefine'
`regexp'
`shift'
`index'
`decr'
`spaste'
`pushdef'
`translit'
`define'
`popdef'
`patsubst'
`errprint'
`traceon'
`ifdef'
`incr'
`sinclude'
`changequote'
`traceoff'
`esyscmd'
`eval'
`unix'
`m4exit'
`changecom'
`ifelse'
`defn'
`syscmd'
`PEGPROXY'
I only stuck PEGPROXY in there as an example; it is really not double-secret. It might be clever to remove some of the hxmd internal macros: HXMD_B, HXMD_OPT_C, HXMD_OPT_X, HXMD_OPT_Z, HXMD_PHASE, HXMD_U, HXMD_U_COUNT, HXMD_U_MERGED, and HXMD_U_SELECTED. These do not radiate a lot of information, but putting them into a configuration file is poor form.

The value of MSRCMUX_CFG is literally "HXMD_OPT_C", forcing a dependency on m4 to expand that to the filename given to msrcmux by the client: removing that definition makes the output far less useful. So we don't remove the internal definitions at this level.

The cache recipe also uses a file defs.host/dump.host that is super anticlimactic:

dnl $Id: dump.host,v ....
dumpdef
This file could be omitted in favor of literal markup in the recipe file, but that would make the spell far less flexible.

A Makefile.host to drive the display of the data on the client host (after it becomes Makefile):

`# $Id: Makefile.host,v ...
# The platform recipe for this just displays the data we produced	(ksb)
# the recorded data and the list.  Most functions just use the defs
# file directly.
'changequote([,])dnl
[Q='
C=`
]changequote(`,')
`SOURCE=Makefile README defs

BUILT="'syscmd(`date|tr -d "\\n\\r"')`"

all: defs .SILENT
	echo ${Q}# $$From: $${echo:-echo} ${BUILT}${Q}
	sort defs | sed -e "s/^\${C}\([^${Q}]*\)${Q}	/\\1=/"

clean:

${SOURCE}:
	echo "Bad push for $@, sorry." ; $${false:-false}

source: ${SOURCE}

FRC:
'dnl
Note that there is a TAB character in the sed command between ${Q} and the following slash (/).

With those files installed, a pull of that directory (which I called opt/ksb/which) builds a directory with a make recipe file, and the defs file built by the cache recipe in defs.host. The defs file is the raw output from a dumpdef for the client host, with the macros listed in suppress removed. The all target in the recipe formats this output to look more like an hxmd configuration file.
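
The formatting step is easier to follow outside of the m4 quoting. The sketch below applies the same sed expression to one sample line. It assumes the dumpdef output has the shape the recipe's expression matches (a quoted macro name, a TAB, then the quoted value); format_defs is a hypothetical name.

```shell
#!/bin/sh
# Sketch: the all target's sort and sed, outside of m4 quoting.
# Assumes dumpdef lines look like: `NAME'<TAB>`value'
Q="'"
C='`'
TAB=$(printf '\t')
format_defs() {
	# Turn the leading `NAME'<TAB> into NAME= and keep the quoted value.
	sort | sed -e "s/^${C}\([^${Q}]*\)${Q}${TAB}/\1=/"
}
printf '%sHOST%s\t%sw01.example.com%s\n' "$C" "$Q" "$C" "$Q" | format_defs
# prints: HOST=`w01.example.com'
```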

Run make all| ${PAGER:-less} in that directory to see the configuration specification for your client host. Here is an example script to drive the process from a client:

#!/usr/bin/env ksh
# $Id: ...
# Report this host's configuration from our master-pull-source's	(ksb)
# the hxmd file used to build auto.cf.
set -e
TF=`mktemp -d ${TMPDIR:-/var/tmp}/${USER:-${LOGIN:-$$}}XXXXXX`
cd $TF
hxmd -Cauto.cf -Glocalhost -BMSRCMUX_MPS \
	'muxcat MSRCMUX_MPS msrcmux ${1:-opt/ksb/which} MSRCMUX_CFG |tar xf - &&
		make all'
cd /
exec rm -rf $TF
Note that this outputs nothing when the client didn't pull auto.cf (because MSRCMUX_MPS is not defined for -B). If you'd like more feedback for that case add a -N like:
hxmd -Cauto.cf -Glocalhost -BMSRCMUX_MPS \
	-N "echo 'No pull information for localhost in auto.cf%0'" \
	'muxcat MSRCMUX_MPS msrcmux ${1:-opt/ksb/which} MSRCMUX_CFG |tar xf - &&
		make all'
Otherwise the output looks like:
# $From: ${echo:-echo} "Wed Sep 26 18:23:08 CDT 2012"
BINDVER=`9'
CLASS=`w'
ENTRY_DEFS=`/usr/local/lib/distrib/local.defs'
HOST=`w01.example.com'
HOSTOS=`61000'
HOSTTYPE=`NETBSD'...

Given a local site policy that specifies which host attributes are provided to define each host, we could reformat that output into auto.cf to build a local configuration file on the client that mocks the central one quite effectively.

The perl script to do just that is left as a puzzle for the reader. And this would be the place to remove the HXMD_* macros from the stream (replacing them in any definitions). Next handle multiple-line values and all unbalanced quotes.
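
As a starting point only, the first part of that puzzle (dropping the internal macro lines) fits in one sed command. This sketch does not replace HXMD_* references inside other values and does not handle multiple-line values; strip_internal is a hypothetical name.

```shell
#!/bin/sh
# Partial sketch: drop the hxmd internal macro lines from the formatted
# stream.  References to HXMD_* inside other values are left alone.
strip_internal() {
	sed -e '/^HXMD_/d'
}
# Example (not run here):
#   make all | strip_internal
```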

Inadvertently limiting access

If the master recipe (e.g. Makefile, or Msrc.mk) must write in the local directory, then you may be out of luck. Almost all tcpmux services run as a non-privileged user (viz. nobody or vanilla) and possibly a group to allow read-only access to some data (viz. source). Processes running under such restrictions do not have access to write in cache directories or to co files from RCS.

Most MAP'd files should function because they only update temporary files via m4 filters. But if they use syscmd they may execute commands that must create files and fail as well. A good example of this is hxmd cache directories that are owned by the superuser or some dedicated pseudo-user.

What this means is that you can only pull products that were coded with care. Actually this is true for any complex structure, so it shouldn't be news to you.
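
A quick check for this class of problem is to list the directories the service login cannot write. The sketch below is an assumption about how one might do that: unwritable_dirs is a hypothetical helper, and it must be run as the login the tcpmux service uses to tell you anything useful.

```shell
#!/bin/sh
# Sketch: list directories under a product tree that the current login
# cannot write.  Run it as the service login to see what a pull will hit.
unwritable_dirs() {
	find "${1:-.}" -type d -print | while IFS= read -r d
	do
		[ -w "$d" ] || printf '%s\n' "$d"
	done
}
# Example (not run here):
#   unwritable_dirs /usr/msrc/local/bin/oue
```

Any cache directory that shows up in the list is a candidate for the failure described above.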

Out of sync sources

Note that msrcmux does not do anything to assure that the source directory is stable. If you need to add an rcsvg filter to your process, then you'll have to edit the script, or root the master source copy on a clean shadow of the working directories. Space is cheap. See the HTML document on rcsvg.

Some care should be taken to assure that the revision control label used to checkout the stable code is, itself, always stable.

Summary

The tcpmux service allows access to some, but not all, remote clients over simple network services. This service provides client-based limits on access, granular to directory, and a clean way to update level 2 packages and the configuration of the service itself. The auto.cf file contains some macro support to allow clients to find the service.
$Id: msrcmux.html,v 1.23 2012/11/09 20:13:52 ksb Exp $