Authorization is not authentication

Op uses PAM to authenticate access requests, and that works most of the time. It also sanity checks parameters, all the better. But sometimes you need another person or process to authorize a command that makes a meaningful change to the instance.

To understand this document

You need some configuration experience with op rules and in-line scripts to understand why one might need a helmet. A clear understanding of the UNIX™ process model would be really helpful, because we are using wait(2) and pipe(2) here in clever ways.

Helmets allow authorization checks before an escalation

Op does a pretty good job of limiting access to privilege escalation rules in the configuration file to authentic logins (the session is owned by the correct login, in an allowed group or netgroup, and they might know the right password). But sometimes a rule should only be accessed based on some criteria that op doesn't understand. For example, the time of day, or the lack of a key process, or a missing, full, or read-only filesystem. Or a permission credential provided by another user.

Many other examples are ridiculously site specific, so there is no way anyone could code for all of them.

Even more to the point, all of those possible rules would be impossible to express in the limits of the declarative op configuration file's language. That format is otherwise adequate to describe escalation rules: so I don't want to change it. But most of these limits are easy to check in a shell or perl script, or a C program. Thus op out-sources the work to a co-process that should be coded in the most appropriate style for the task at hand.

That leaves a simple set of data-flow tasks:

Each of those tasks is described below.

Parameters passed to a helmet define context

See op's HTML page for an overview of the mechanism. In this section we are going to talk about the parameters passed to a helmet, and why you might need each of them:
helmet [-C config] [-f file] [-g group] [-j job] [-m mac] [-R root] [-u user] mnemonic program euid:egid cred_type:cred
helmet
This is the program to execute. Of course we need that.
-C config
This provides the configuration file that authorized the escalation. You could look for meta information in the file with mk, or you could only allow escalation from that file in a time-box.
-f file
This provides the file specification given under -f, which has already passed any %f or !f checks. Since op has a fairly complete set of checks for files, I can only guess why you'd need this. I suppose it could be a local domain socket that you need to chat with, heck I'd do that.
-g group
This provides the group specification given under -g, which has already passed any %g or !g checks. I don't know what you might check with a group, but I'm sure you'll think of something.
-j job
This provides the job specification given under -j, which is only available in op version 3.
-m mac
This is the MAC specification given under -m.
-R root
This is the chroot value derived from the formula given (which might include $f or $d, for example). Since this might tell you where to mount a file system, or install a wrapper socket, it might be the most useful data passed to a helmet.
-u user
This provides the user specification given under -u, which has already passed any %u or !u checks. I have heard of sites that use a RADIUS service for passwords, but I think I'd use the pam module before I'd use a perl script.
mnemonic
This provides the name of the selected mnemonic. This could be used as a submarker (or marker) for mk, for example.
program
This is the path to the selected program, which might be in a chroot, so take care if you are going to stat(2) it. With all the checks op does under %_ and !_ I don't really know what else you might be looking at.
euid:egid
These provide the escalated effective user and group ids (as decimal numbers). I don't know what you'd check these against in a helmet, but you might know more than I do about that. Which is the whole point of a helmet.
cred_type:cred
Which of the three credential types allowed the access:
groups
The name (or gid) of the group that allows the access. Note that if the group is allowed by number you will get the number, not the name.
users
The name (or uid) of the user that allows the access. Like a group you may get the uid, rather than the name.
netgroup
The name of the netgroup that allows the access. If you use netgroups you should know how this could be used.

In all of these cases you could make some other local check on the attribute that allows the escalation, or check the other credential (viz. check user if group allowed) since op uses "or" logic to allow access, and you might prefer "and" logic.

All of the parameters listed above are provided to give a complete description of the context of the escalation. Other information is always passed in the environment. This allows every helmet to have a separate set of specifications that do not overlap any other, so multiple helmets could be chained together (see the helmet named coat, which does just that).
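
The command line described above can be picked apart with getopts. This is only a sketch of one way a shell helmet might do it: the function name parse_helmet_args and the variable names it sets are my own choices, not anything op defines, and -P is included only because a jacket receives it.

```shell
# parse_helmet_args "$@": split a helmet (or jacket) command line into
# shell variables.  The names used here are this sketch's own choice.
parse_helmet_args() {
    config= file= group= job= mac= root= user= pid=
    OPTIND=1
    while getopts 'C:f:g:j:m:P:R:u:' opt; do
        case $opt in
        C) config=$OPTARG ;;
        f) file=$OPTARG ;;
        g) group=$OPTARG ;;
        j) job=$OPTARG ;;
        m) mac=$OPTARG ;;
        P) pid=$OPTARG ;;        # only passed to jackets
        R) root=$OPTARG ;;
        u) user=$OPTARG ;;
        *) return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    [ $# -eq 4 ] || return 1     # mnemonic program euid:egid cred_type:cred
    mnemonic=$1
    program=$2
    euid=${3%%:*}
    egid=${3#*:}
    cred_type=${4%%:*}
    cred=${4#*:}
}
```

After a call such as parse_helmet_args -u root su /bin/sh 0:0 groups:wheel the variable cred_type holds groups and cred holds wheel, which is enough to start an "and" check against some other credential.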

The additional environment variables usually start with the name of the helmet (or jacket) that reads them and an under-bar (_). As they are consumed the helmet process requests that op delete them from the escalated environment. This reduces the radiation of information to black-hat crackers who are trying to suborn an escalation rule. These bits are still in the process table (for the life of the helmet), but there is little to be done about that. Should we use a pipe? That would force all the helmets to read and parse a stream. I think not.

For example, the stamp helmet reads STAMP_SPEC for the specification of the required authorization.

The jacket specification

A jacket process is a more advanced version of a helmet. Other than an additional command-line option it has exactly the same interface as a helmet:
jacket -P pid [-C config]...same options... mnemonic program euid:egid cred_type:cred
-P pid
The escalated process-id.

The jacket program runs after the helmet; it is expected to start the escalated process (which is waiting for the jacket to close stdout). Once the process is running the jacket may take any actions required to manage the escalation. When the escalated process exits the jacket should wait for the process and produce an appropriate exit code based on the status returned from the escalation.

A jacket has only 2 more clues than a helmet: it knows the process-id of the proposed escalation (which has not changed effective uid or gid yet) and it can see the environment proposed by the helmet. These are not (generally) useful to make any new authorization decisions, but it is possible to use a jacket to reject or modify the access, see below. Many jackets work as helmets without the -P option.

How to start the jacketed process

The jacket starts the escalation process by closing stdout, after which the jacket continues to run in parallel with the escalated process. But how does one do this in the common scripting languages?

In perl one may:

open STDOUT, ">/dev/null";
In sh one may:
exec 1>&-
In C one would:
close(1);
or
fclose(stdout);

If the last suggested exit code is 0 (or none were suggested), op starts the process under the pid specified under -P. This process is a child of the jacket process, so it is the jacket's task to wait for the child process and interpret the success/failure of the escalated child.

In addition the jacket may provide services to the process, may clean-up any pre-escalation setup, or may kill the escalated process after some time limit or event.

Jacket credentials

Note that both helmets and jackets run with the setuid (and/or setgid) permissions that op gained when executed, or those assumed by any sentinel configuration. If you need to drop effective privileges as the client process did, then consult the euid:egid specified on the command-line.

In the rest of the text assume that anything a helmet can do a jacket can do just the same. The only special feature the jacket has is the ability to start the escalation, and run as a co-process for that escalated process. The last action of the jacket should be to wait for the escalated process, then return an exit code based on the status returned.

Taking action to allow, modify, or deny access

There are several actions a helmet might take. Each of them has a specific use-case (goal). We'll start with the output stream from the helmet, then the exit code.

Every output to stdout by a helmet is read by op for single-line commands. Each command is processed in the order read.

# comment
Comments are only used for debugging: if op is compiled for debugging the comments from helmets are output to stderr to help debug the configuration and processing of the helmet (or jacket).
-VAR
Remove the specified environment variable from the escalated process's environment list.
$VAR=value
Force the given value for the specified environment variable.
$VAR
Copy the specified environment value from the original list.
~prefix
Remove the given prefix from any matching environment variables. For example, the variable "hide_PATH" becomes "PATH" after a ~hide_ command. This allows the helmet to have a different forced PATH from the escalated process, while both are specified in the configuration of the rule. While this is a little strange, the operation makes some specifications much more clear, and easier to understand.
&0redirection
&1redirection
&2redirection
Redirect one of the standard I/O channels to/from the given redirection, which has one of the following forms (mocking the shell):
file
The default redirection is sane for the specified descriptor, but you can override that.
<file
Force read-only.
>file
Force write-only.
<>file
Force read-write.
>>file
Force append-only.
socket
Connect to the named local (UNIX) domain socket.
& fd
Close all file descriptors above fd. The typical value of fd is 3. This prevents escalated processes from finding unexpected open files. Because almost no program expects non-standard open descriptors there is no way (in a helmet) to redirect other fds.
exit-code (a decimal number)
Provide a proposed exit code for the helmet. Any non-zero exit code denies the escalation. The last proposed code is the one that matters.
"cmd"
Any of the above commands may be enclosed in double-quotes to quote internal newlines and double-quotes. To protect the special characters op might confuse in the authorization stream, these characters should be replaced as follows:
  • " with \d (for double quote)
  • ` with \o (for open m4 quote)
  • ' with \q (for quotes closed, in m4)
  • a literal newline with \n
  • a literal tab with \t

Other single letter C escapes (e.g. \r) may be optionally replaced, as op doesn't treat these characters as special.

Note that octal escapes are not supported. This does limit the use of non-Roman languages or 8-bit characters in the text of environment variables. This limitation may be removed in the 3.1 release of op.

Also note that the terminating newline must follow the closing quote; trailing white-space is not allowed.
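
The replacements listed above are mechanical, so a shell helmet could wrap them in a small function. This is a sketch only: op_quote is a name I made up, it relies on sed and awk, and it passes backslashes through untouched because the text above does not say how a literal backslash should be sent.

```shell
# op_quote TEXT: print TEXT as one double-quoted command line for op,
# applying the \d \o \q \n \t replacements described above.
op_quote() {
    tab=$(printf '\t')
    body=$(printf '%s\n' "$1" |
        sed -e 's/"/\\d/g' -e 's/`/\\o/g' -e "s/'/\\\\q/g" -e "s/$tab/\\\\t/g" |
        awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }')
    # the closing quote is followed directly by the newline
    printf '"%s"\n' "$body"
}
```

For example, op_quote '$MSG=say "hi"' prints "$MSG=say \dhi\d" with nothing after the closing quote but the newline.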

On exit make the final call

The final word from the helmet is the exit code. Any non-zero exit fails the escalation. If the program is a jacket specification, then the code represents the success of the escalated process.

Configuration of helmets and jackets

Each escalation rule may have 1 helmet and 1 jacket configured. The keywords helmet and jacket specify the path to the trusted program that acts on behalf of the superuser (or sentinel user), so we generally expect an absolute path to be specified.

Later we'll see that there is a way to get more than one helmet and jacket, but that hinders some of the features of the native structure.

The examples in this section do not provide any authorization services. These are designed to be easy to read and understand, not to provide much additional functionality. Which is not to say they are useless, just not super useful for authorization.

An example helmet timebox

If we want to limit an escalation based on the time of day we can use the timebox helmet, which does exactly that. Most of the helmets and jackets I've coded use the standard -H and -V options to output usage and version information. The version information includes the names of the environment variables each expects as specifications:
$ /usr/local/libexec/jacket/timebox -V
timebox: ...timebox,v 1.7 2012/...
timebox: TIMEBOX_REVEAL, TIMEBOX_INSIDE, TIMEBOX_FORBID, TIMEBOX_WARN
That doesn't give you the format of the specification, or the semantics -- but it is a good reminder of what to look for in the documentation or code. Some helmets take -H to output a better reminder of the configuration options.
$ /usr/local/libexec/jacket/timebox -H
TIMEBOX_FORBID   comma separated list of excluded times: [!]*strftime[!=]=strftime
TIMEBOX_INSIDE   comma separated list of time relations: [!]strftime(<=?strftime)+;
TIMEBOX_REVEAL   remove prefix from environment entries
TIMEBOX_WARNING  escalation denied message for the customer (Sorry)

These conventions are not mandatory (your local site policy may vary), but I've used the same suffix to mean the same thing in each helmet that supports it:

_REVEAL
This string is sent back in the output commands to op prefixed with the ~ command, the name of the variable is sent prefixed with a - command. The effect of this is to reveal some environment variables and remove the one that did it.
_WARN
If set this is the message sent to stderr when the client access is rejected. Otherwise some common default like the vanilla "Sorry." or "Access denied." is output.
_STALL
The number of seconds allowed to wait for access. If a lock is required, the time of day matters, or some other issue requires a re-attempt the check will only block approximately this many seconds.
The timebox helmet uses 2 of the above and 2 specific to the task at hand: TIMEBOX_FORBID and TIMEBOX_INSIDE.
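
The _REVEAL and _WARN conventions above might look like this inside a shell helmet. The helmet name demo, and therefore DEMO_REVEAL and DEMO_WARN, are hypothetical; this is not the code of timebox or any other shipped helmet.

```shell
# Sketch for a hypothetical helmet named "demo".

# demo_reveal: send the reveal commands to op on stdout.
demo_reveal() {
    [ -n "${DEMO_REVEAL-}" ] || return 0
    printf '~%s\n' "$DEMO_REVEAL"       # ask op to strip this prefix
    printf '%s\n' '-DEMO_REVEAL'        # and to drop the variable itself
}

# demo_deny: tell the customer why, and propose a failing exit code.
demo_deny() {
    printf '%s\n' "${DEMO_WARN:-Sorry.}" >&2   # message on stderr
    printf '%s\n' 1                            # proposed non-zero exit code
    return 1
}
```

A helmet built this way would call demo_reveal on the way to success, or demo_deny followed by a non-zero exit when its check fails.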

Here is a base example of an op rule that allows anyone in group wheel (aka group 0) to get a superuser shell:

su	MAGIC_SHELL ;			# what
	groups=^wheel$,#^0$		# whom
	uid=0 gid=0 initgroups=root	# escalation
	PERP=$l RCSINIT=-w$l		# details

That rule could be run anytime by any Admin. If we want to limit it to off-peak hours (for our fish store peak 10:30 to 16:30) we can install a timebox helmet specification into that rule:

	...
	PERP=$l RCSINIT=-w$l		# details
	helmet=/usr/local/libexec/jacket/timebox
	$TZ=America/Denver
	$TIMEBOX_INSIDE=!1030.00<=%H%M.%S<1630.00
	$TIMEBOX_WARN=Peak$.hours,$.try$.later.
Note that we explicitly set a time zone, so that we don't take the system default. You might want to force $TZ in a DEFAULT stanza for your rule-base.
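
To make the window test concrete, here is roughly the comparison that rule asks for, written as a shell function. It is a sketch and not how timebox itself is coded: the names offpeak and offpeak_now are mine, and it ignores the seconds field that the rule above includes.

```shell
# offpeak HHMM: succeed unless 1030 <= HHMM < 1630.
# test(1) compares the digits as decimal, so 0930 is fine here.
offpeak() {
    hhmm=$1
    if [ "$hhmm" -ge 1030 ] && [ "$hhmm" -lt 1630 ]; then
        return 1        # inside the peak window: deny
    fi
    return 0
}

# offpeak_now: the same test against the clock in the forced time zone.
offpeak_now() {
    offpeak "$(TZ=America/Denver date +%H%M)"
}
```

Note that the lower bound is inclusive and the upper bound is not, matching the <= and < in the specification.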

If we would rather not have any changes on Thursday, because the boss is not in the store to help, we would use this helmet:

	...
	helmet=/usr/local/libexec/jacket/timebox
	$TZ=America/Denver
	$TIMEBOX_FORBID=%u!=Thu
	$TIMEBOX_WARN=No$.su$.on$.Thursday.
This takes advantage of the fact that timebox converts day of week (in English) into a number for comparison (following %u's rules).

The timebox helmet lets you deny access based on the current time. That's all it does, but there are other helmets to do other things.

Also note that long running processes are not killed when they leave the specified time-range. That could be implemented as a jacket, but it was never needed (as starting a service should not leave a time-bomb in the process table). If someone is leaving a superuser shell around there are other ways to deal with that.

An example jacket: xdisplay

If we want a superuser shell with access to our X display we might be able to just set the DISPLAY environment variable and lie about HOME to find the .xauth database. But when we want to become a different mortal login we need to copy the authentication data to the new login's authentication database.

That's what the xdisplay jacket does. To do this it requires some specific configuration. The jacket requires the current display name (in DISPLAY) and the target login's home directory (in HOME):

...
	$DISPLAY $HOME=$H
	jacket=/usr/local/libexec/helmet/xdisplay
That extracts the current key from the active .Xauthority and installs it into the target login's .Xauthority (as the same display name). After the escalation exits it may remove the installed data (when used as a jacket, the helmet usage cannot).

Authorization via a stamp

First we need to distinguish authentication from authorization. Op does a good job of authentication: it assures that the login, group membership or netgroup membership required is met by the calling process. It makes sure that the parameters provided meet any restrictions that would make them unsafe (aka not `authentic'). It consults PAM to make sure the person at the keyboard is the person represented.

All the work above is matching the person to the task. That's authentication in a nut-shell: matching a person to the escalation.

Authorization means that another person or process agrees that now is the time to act. This is very different from knowing who is acting. You could think of the timebox helmet as using the clock to authorize an action (or possibly to deny it, when the escalation would be inappropriate).

As an example where people take action: imagine that there is a 24 hour by 365.24 day monitoring group at a major data center for example.com. They get alerts when an application service sends error messages. The monitors then filter those alerts, only calling application support when there is an actual service disruption.

The management at example.com doesn't want application support to randomly or accidentally stop or restart running applications. They do want them to be able to act when the service is not working correctly. So they give the operations monitoring team a rule to enable the application support members to control the application on a given host for a fixed time. The theory here is that the operations team only calls the production support for the application when there is a service disruption, so restarting the application might be better than no service.

So the application team has a control rule ("tiger stop", "tiger restart", and so forth). But those rules only work for the application support account "joe" when an operations team member runs "tiger enable joe" and joe is a member of the support team. This enchantment lasts for some fixed window, or until it has not been used for some idle time limit. The notion above is that an interactive session might be granted a window of time in which repeated authorizations are not needed. Otherwise the operators would have to type 1 command for each escalation required to bring the service back on-line (which is also possible, but has proved to be a really bad usage pattern).

It is also common to use a 2 key commit style of authorization. In this case any two people from a group are required to run the escalated command. One creates a stamp for himself via a group access rule, with a -u option to specify the team member that will use the stamp. The stamp-check allows any stamp built by someone else to authorize the action (viz. Owner!$l:Allow=$l). This prevents anyone from authorizing their own changes. See stamp(7l).

This might also be used by a person to authorize themselves to access a rule-base repeatedly. This is parallel to the "timestamp" feature built-in to the popular sudo escalation program. But it is not "part" of op, and it requires a specific escalation (via local site policy) to build the timed stamp. It does make almost the same thing possible, and adds some features that add security and a lot of versatility.

How stamps are made and destroyed

Under op the stamp helmet connects to an existing socket in a specific place in the filesystem. The process listening on that socket is (usually) created with the stampctl program, which has several modes of operation (see stampctl(8l)):
stampctl -V

Output all the version information compiled into stampctl.

stampctl -V | tr -s '\t ' ' '| sed -n -e 's/.*cache directory: \([^ ]*\).*/\1/p'

Output the top-level stamp directory. This is often used to purge the stamp directory of dead domain sockets at system boot.
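
A boot-time purge built on that pipeline might look like the sketch below. It assumes that every socket left under the stamp directory at boot is dead (no stamp process survives a reboot) and that plain files must be kept, since they act as locks. The function name purge_dead_sockets is hypothetical.

```shell
# purge_dead_sockets DIR: remove socket files under DIR, leaving plain
# files (permanent locks) and the facility directories alone.
purge_dead_sockets() {
    dir=$1
    [ -d "$dir" ] || return 1
    find "$dir" -type s -exec rm -f {} +
}

# Possible use from a start-up script, with the pipeline shown above:
#   dir=$(stampctl -V | tr -s '\t ' ' ' |
#         sed -n -e 's/.*cache directory: \([^ ]*\).*/\1/p')
#   [ -n "$dir" ] && purge_dead_sockets "$dir"
```

The guard on an empty directory name matters: if the sed expression ever fails to match, the purge should do nothing rather than walk some other tree.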

stampctl [-g group] [-m mode] [-u user] [facilities]

At system boot time an init.d (aka rc.d) script should make a call to the helmet (as the superuser) to setup the stamp directory structure. The facilities are simply the names of subdirectories that must be instanced. The optional owner, group, mode may be mixed with the names of directories (the last one set is used in each case).

PATH=/usr/local/libexec/jacket:$PATH
# su: sudo-like root stamps; stamps: test stamps for mortals;
# tiger, puma: applications; msrc: master source
# (a comment after a trailing backslash would break the continuation)
STAMP_FACILITY=. stampctl -B -m 755 -u root -g 0 . \
	-m  750  su \
	-m 1777 -g daemon stamps \
	-m  750 -g cats  tiger puma \
	-m  700 -u source -g staff  msrc
Note that dot (.) is a synonym for the top-level directory which sets the default mode, uid, and gid if it exists; otherwise the default mode is 750, uid 0, gid 0. Existing unmentioned directories are not changed, so multiple start-up scripts may each build their own facility. Absolute paths are allowed, but discouraged. The code will not build implied directories; this is a safety feature, since the modes of any such directory cannot be specified. Modes may contain optional bits as in instck(8l) (as 750/1 or rwxr-x--?); the modes on the enclosing directory are used to mask optional bits (which is local site policy).

Since the environment variable STAMP_FACILITY specifies the default top-level directory (absolute) or subdirectory (relative), it is always a good idea to explicitly set it in any superuser run script.

It is poor form to use the facility name "OLD", since install might build a backup directory by that name. It would also be a bad idea to let the Customer specify a stamp path in an arbitrary directory.

stampctl -M name [-n] [-max] [-E end] [-I idle] [name=values]

This creates a new stamp entry for name. The name is taken relative to either STAMP_FACILITY or the hard-coded directory that all stamps are under, unless it starts with a leading slash. Most applications use the name of the facility followed by the Customer's login name or uid. This allows other op rules to find that same stamp by name.

The -n switch does the opposite: it builds a stamp that denies every authorization request. This is called "penalty mode". I'm not sure why you'd want to lock escalations for a limited time, but you may.

The max integer specifies a limit on the number of authorizations allowed (denied) by the stamp. Common values are less than 3, for a single-shot access, usually with an idle time limit of 20m or so.

Lock escalations more permanently by building a plain file where the stamp would be placed. The file is displayed as a rejection message for every escalation request that would open the stamp, and is not removed by stampctl (use rm to remove the lock).

Note that running stampctl itself is usually an escalated operation, and that action may, itself, require a stamp: this allows as many "sign-off" levels as required.

stampctl -M name -R remote[:roap] [-nX] [-max] [-E end] [-I idle] [-T timeout] [name=values]

Build a stamp which includes tableau entries from an off-host service. The remote host must provide a tcpmux service which accepts credential information from the client. See RFC 1078. Any proposed tokens from the command-line are replaced by those sent from the remote service. (Under -X this measure is reversed, then the command-line overrides the remote service.)

The remote specification may include the name of the tcpmux service after a colon (:). The default service name is roapmux, which happens to be a real program. See roapmux, and the HTML document for it. The API for the service requires the client to connect; the server replies with a positive connect message (like any tcpmux service) or a failure; then the client sends a single line:

login:groups:netgroups:domain:query
The five parameters sent should be valid, but the remote service is allowed to reject even a valid request. Note that plural elements are separated by spaces.

Unauthorized clients get a negative reply. This starts with a leading dash (-) followed by a rejection message, terminated with a newline. Any isspace character may be removed from the end of the reply. For example, to disallow without radiating any useful information a service may reply with:

-Sorry

Authorized clients get a positive reply, which may begin with three integer values, like this:

+max,idle,timeout ignored-text
These values limit the values specified on the command-line. Values of zero are ignored.

The service then produces a tableau list (one per line), then closes the connection. Each tableau entry may be enclosed in double quotes to allow embedded newlines and the common C single-letter backslash escapes.

For example, to allow 10 escalated commands with an idle timeout of 13 minutes, while keeping the local timeout, the reply might look like:

+10,780,0
MAY_SHOVEL=yes
MAY_START_tiger=yes
MAY_STOP_tiger=no

A value of -1 for max forces the stamp into penalty mode (which forbids any access until the stamp timeout expires). For example we might reject an invalid login for a kilosecond with:

+-1,1000,1000 no login john
REASON="You do not exist.\nSee an admin, john."

Note that stampctl removes 1 carriage return character from the end of each line, to compensate for RFC 1078 network encoding. Missing carriage returns are silently ignored.
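
The exchange above is simple enough to sketch in shell, which may help when building a test service. Both function names are hypothetical, the query value tiger in the usage note is invented, the line terminator on the request is an assumption, and quoted multi-line tableau entries are not handled. The tcpmux connect handshake itself is assumed to have happened already.

```shell
# roap_request LOGIN GROUPS NETGROUPS DOMAIN QUERY
# Plural fields are passed as single space-separated strings.
roap_request() {
    printf '%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# parse_roap_reply: read a service reply on stdin.  Prints "deny MESSAGE"
# and fails, or prints "allow MAX IDLE TIMEOUT" then the tableau lines.
parse_roap_reply() {
    cr=$(printf '\r')
    IFS= read -r first || [ -n "$first" ] || return 1
    first=${first%"$cr"}            # drop one trailing carriage return
    case $first in
    -*)
        printf 'deny %s\n' "${first#-}"
        return 1
        ;;
    +*)
        limits=${first#+}
        limits=${limits%% *}        # drop the ignored text
        max=0 idle=0 timeout=0      # zero: no limit from the service
        case $limits in
        *,*,*)
            max=${limits%%,*}
            rest=${limits#*,}
            idle=${rest%%,*}
            timeout=${rest#*,}
            ;;
        esac
        printf 'allow %s %s %s\n' "$max" "$idle" "$timeout"
        ;;
    *)
        return 1
        ;;
    esac
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%"$cr"}"
    done
}
```

For instance roap_request joe 'cats staff' '' example.com tiger produces joe:cats staff::example.com:tiger, with an empty netgroups field.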

stampctl -k [stamps]
stampctl -K [stamps]

Kill the sessions which are associated with each name. When none are specified, the implied one is a session named for the real uid. Note that this is mostly useful to kill sessions at logout time. Sockets owned by the real uid or spelled with the real uid or login name as the name of the socket are terminated, if connect(2) allows the access and the stamp is not in penalty mode. The uppercase version tries to signal the stamp to quit to overcome any penalty restrictions, but requires a connection to the socket to find the process ID.

stampctl -N [stamps]

Convert the given stamp to penalty mode, thus blocking any future escalations with that authorization until the stamp times out, or it is destroyed via -K.

stampctl -Q stamp [-F] tableaus
Output the values for each of the requested tableau entries. If any of those entries does not exist, fail. Under the -F option the format of each entry (1 per line) is the format op expects from a helmet to force an environment variable to a known value, otherwise it is the decoded values separated by spaces.

Usually there is an escalation rule to control the stamp creation, and one to end the session explicitly. There may also be other services that create stamps as part of a work-flow process. These tend to be owned by an application login that has ownership of a facility directory reserved for the application.

There may also be an escalation rule that allows an application login to create the directory on system (or application) startup. These rules are usually also available to operations to restore service after a storage or service failure.

How stamps are used to provide authorization services

The helmet stamp checks a specific stamp for access, tableau values, and possibly for environment information. It accepts only the standard helmet options, so any specifications are provided in the environment. (This is quite common for helmets.)

Each socket represents an active session, via a process which is attached to the socket. The "credentials" of the stamp are 3 fold:

The existence of the stamp and the process attached to it

If no stamp exists then no authorization is implied. So a successful connection to the stamp at the required location means that the question of authorization has been asked and answered. But we have to query the socket for that answer.

An unresponsive stamp (the process has exited or the system start-up did not remove the dead domain socket) gives us an answer without any follow up: the stamp request always fails. A plain file is taken as a failed request, so stamp presents the contents of the file as a failure message on stderr.

The existence and values in the tableau

The stamp jacket may request the verification of tableau entries, either by existence or by value. Thus a stamp might only be authorized for a particular tty, login name, parent process-id, or day of the week -- or perhaps the intersection of all of those. Note that there is no disjunction operation included for this specification.

If any required entries or values fail to match the specification the escalation fails. There is no practical way to tell a configuration error from a legitimately denied authorization. This is because creating a stamp that could never allow any access is not necessarily an error.

There should be local site policy conventions for the meaning of common tableau names. I use Owner, Perp, and some others, but you might choose terms that are already common at your site.

The status of the stamp (penalty or authorized)

The stamp helmet requests the authorization status of each stamp (actually that's the last thing it does). If the reply is "y" (yes), it allows the escalation. Other replies might include "p" (penalty) and "n" (no such stamp) -- they all fail the escalation.

Five environment variables are consulted for a stamp helmet request (these are documented in the stamp(7l) manual page as well):

STAMP_FACILITY=path

This specifies the subdirectory under the system stamp directory (usually /var/op) which should contain the specified stamp. It may be an absolute path, but that is poor form.

STAMP_SPEC=stamp[:name=value]*

This specifies the path to the stamp, which may contain slashes, but never dot-dot (..). This may also be an absolute path, but you might guess that I think that's a bad idea.

The name-value check actually allows 5 forms:

name
The given name must exist in the tableau.
name=value
The given name must exist in the tableau and must have exactly the specified value.
name!value
The given name must exist in the tableau and must not have the specified value.
namerelopvalue
Where relop is one of the C numeric relational operators: <, <=, >, >=, ==, or !=. In this case leading integers are converted from text to their numeric values. The given relational operator must be true between the values. Any trailing characters are also compared with strcmp, when provided.
namematchopRE
Where matchop is one of the perl matching operators: =~ or !~. The given tableau entry must match (=~) or not match (!~) the given regular expression.

These allow checks to assure that the same session that gained the stamp is the one returning for additional escalated commands.
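
To see how a specification breaks into the five forms above, here is a sketch that splits a STAMP_SPEC on colons and labels each check. It is not the stamp helmet's parser: split_stamp_spec and its labels are invented here, and it assumes no value or regular expression contains a colon.

```shell
# split_stamp_spec SPEC: print "stamp PATH", then one labelled check
# per line.  The longer operators must be tried before "=" and "!".
split_stamp_spec() {
    spec=$1
    printf 'stamp %s\n' "${spec%%:*}"
    case $spec in
    *:*) checks=${spec#*:} ;;
    *)   return 0 ;;
    esac
    old_ifs=$IFS
    IFS=:
    set -f                      # no filename expansion while splitting
    for check in $checks; do
        case $check in
        *=~*|*!~*)              kind=match ;;
        *==*|*!=*|*'<'*|*'>'*)  kind=relop ;;
        *=*)                    kind=equal ;;
        *!*)                    kind=differ ;;
        *)                      kind=exists ;;
        esac
        printf '%s %s\n' "$kind" "$check"
    done
    set +f
    IFS=$old_ifs
}
```

Given 'ksb:TTY=/dev/ttyp1:Owner!ksb:Count<=3:Perp' it reports the stamp ksb followed by an equal, a differ, a relop, and an exists check, in that order.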

Very rarely one needs to consult multiple stamps to authorize an escalation. Any environment variable that begins with STAMP_SPEC is actually permissible. A parallel usage for _WARN and _SET is honored. See an example below.

STAMP_SET=names

The listed tableau values are pushed into the escalated environment only if they are defined in the tableau. (Any undefined name is left as-is in the environment. It is not a configuration error to set a default value in the configuration of a given rule.)

STAMP_REVEAL=prefix

This supports the common reveal logic in op, see op-jacket(7l).

STAMP_WARN=sorry

When the authorization fails we output this string (else "Sorry"). This is not displayed when the rejection is provided by a plain file.

When you need additional checks for authentication, see coat below to nest jackets.

When called in jacket mode, the stamp process reconnects to the stamp socket at intervals to prevent the stamp from reaching the timeout limit. This doesn't defeat the session expiry limit.

Example stamp rule

In the system startup the stamp directory must be created under /var/run so we need a rule to do that as the correct login (since the stamps do not have to be run as the superuser):
DEFAULT uid=games gid=games	# test only stamp owner

games	MAGIC_SHELL ;		# get a test shell as games
	initgroups=games users=^ksb$

This rule allows anyone to open a session. It could be limited by any authentication required by local site policy. As an example, it just opens a default session for the client login (the $l after -M):

stamp	/usr/local/libexec/jacket/stampctl -M $l -I 6m
		BLAME=$l:$b TTY=$y
		TERM TERMCAP EDITOR DISPLAY ORIG_PATH=${PATH}
	;
	$1=^-M$,^make$,^open$,^start$,^touch$
	users=^.*$
	environment
	initgroups=%l
The session contains a lot of tableau data so we can poke at it. We pass the complete environment so the stamp may copy TERM and the rest. This is better than trying to pass the values on the command line as:
		BLAME=$l:$b TTY=$y
		TERM=${TERM} TERMCAP=${TERMCAP} EDITOR=${EDITOR}... ;
Because that sets unset environment variables to an empty string in the tableau, which makes it hard to tell if it had been empty or unset at the time the stamp was created. But for most applications I suppose that doesn't really matter, unless you use an existence check. It doesn't hurt to pass the whole environment to the stamp, as it never fork's or execve's in any case. (Also, elements not listed in the tableau are not recoverable from the socket interface.)

This rule allows anyone to terminate their own default session, if the session process allows it (they might be in the penalty state):

stamp	/usr/local/libexec/jacket/stampctl $1 $l ;
	$1=^-k$,^-K$,^-n$
	users=^.*$
I usually don't allow the -K spell in production, as it removes penalty stamps. If you never use a penalty stamp then it doesn't matter. Access to -K could allow operators to cleanup stamps, which would be a feature.

This rule checks for the authorization stamp and recovers the TERMCAP variable for inspection:

check	{ echo TERMCAP="$TERMCAP"
	} ;
	users=^.*$
	helmet=/usr/local/libexec/jacket/stamp
	$STAMP_SPEC=$l:TTY=$y
	$STAMP_SET=TERMCAP:TERM:DISPLAY
	$STAMP_WARN=There$.is$.no$.stamp$.for$.$l
This rule allows anyone to reset the idle timer on their stamp:
stamp	/usr/local/libexec/jacket/stampctl -v $l ;
	$1=^-v$,^ping$,^refresh$
	users=^.*$
This rule may be merged with the kill rule above if the same authentication is required for both.

In this example the tableau for the new stamp comes from an authorization server (central.example.com):

central	/usr/local/libexec/jacket/stampctl -M $l -R $0.example.com -I 60m
		TTY=$y RT=$1 ;
	$1=^[0-9][0-9]*$
	$STAMP_FACILITY=rt $STAMP_SPEC=$l
	$STAMP_WARN=$l$.no$.love$.from$.$0

Two of a group

In some cases a support organization may require a second person to authorize a change to production (or a commit to a source repository). I use jacket stamp to do that by accepting a stamp from anyone in the group (e.g. group commit) to create a stamp in the commit directory named for a ticket number. The key feature here is that the stamp contains the login name (in the tableau entry Auth=$l):
ready	/usr/local/libexec/jacket/stampctl -M commit/ready-$1 -I 15m Auth=$l Ticket=$1 ;
	$1=^[0-9][0-9]*$
	groups=^commit$,^qm$
	uid=stamp gid=audit
This is run by someone in group qm or group commit to allow a commit operation under a given ticket number (you could change the RE to match letters as well):
$ op ready 121393

Next we configure a rule to make the commit. The key is to only accept a login that is not the same login, but is in any allowed group:

commit	commit-command... $@ ;
	$1=^[0-9][0-9]*$
	groups=^commit$,^appdev$
	$STAMP_SPEC=commit/ready-$1:Auth!$l
	$STAMP_WARN=No$.authorization$.stamp$.for$.$1
	$PATH $TERM $ENV $TERMCAP $EDITOR
	uid=source gid=source
This is accessed by anyone in group appdev or group commit to execute a commit operation under the authorized ticket:
$ op commit 121393 Makefile stamp.m

To make that rule a little more useful we can tell the commit-command the name of the login that authorized the commit by passing the "Auth" tableau entry to the environment (as $Auth) and the ticket number (as $Ticket):

commit	commit-command... $@ ;
	...
	$STAMP_SET=Auth:Ticket

Because we used the ticket number as part of the path to the stamp, we must open a stamp for each ticket. The stamp timeout of 15 minutes is an idle timeout, so as long as the commits keep coming the ticket will stay open. If you didn't make a script to run the commits, then you are not really serious about it; because that script is the best list to review.

Two to allow a multi-player escalation

In a different situation we might require two managers to authorize an escalation. For example a Quality Manager (in group qm) and an Operations Manager (from ops). They both need to get on-board with a change to allow the technical staff (in group admin) to get a superuser shell.
qm	/usr/local/sbin/stampctl -M qm/$2 -E 4h ;
	groups=^qm$
	$1=^allow$,^ok$
	$2=^[0-9][0-9]*$	# the RT number
	....
ops	/usr/local/sbin/stampctl -M ops/$2 -E 4h ;
	$1=^allow$,^ok$
	$2=^[0-9][0-9]*$	# the RT number
	....
Which gives both managers a command to type (or select from a GUI) like:
$ op qm ok 317811

After both managers have approved the change, the admin has about 4 hours to run the install command as the superuser. We configure her rule as follows:

admin	/bin/sh -c $* ;		# local site policy, of course
	groups=^admin$
	uid=root gid=wheel
	initgroups=root		# more local site policy here
	$1=^[0-9][0-9]*$		# the RT number
	helmet=/usr/local/libexec/jacket/stamp
	$PERP=$l $RT=$1
	environment=^TERMCAP$,^TERM$,^DISPLAY,....
	$STAMP_SPEC_1=ops/$1
	$STAMP_SPEC_2=qm/$1
	$STAMP_WARN=There$.is$.no$.stamp$.for$.RT$1
Which gives her a usage like:
$ op admin 317811  make install

We could add more restrictions: we could make the managers specify the admin's login name (then check that from the tableau in each stamp). Actually there is a lot you could do, but the question is really what does your local site policy require. Yeah, we can do that.

Note that the parallel usage for STAMP_SET means that STAMP_SET_5 is read when consulting the stamp specified under STAMP_SPEC_5, and STAMP_WARN_5 will be output in favor of the common STAMP_WARN. In the example I chose to radiate very little information, which is usually a good idea.

And finally the last specification (in alpha order) is taken as the stamp to refresh in jacket mode.
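
The pairing of suffixes and the "last in alpha order" rule can be sketched in sh. This is my illustration under stated assumptions, not the stamp source: the function names are hypothetical, I assume plain byte (C locale) ordering for "alpha order", and I assume no environment value contains a newline.

```sh
# list_specs: print the name of every environment variable that begins
# with STAMP_SPEC, one per line, in C-locale sort order.
list_specs() {
	env | sed -n 's/^\(STAMP_SPEC[A-Za-z0-9_]*\)=.*/\1/p' | LC_ALL=C sort
}

# warn_for SPECNAME: print the warning paired with a spec variable.
# STAMP_SPEC_5 pairs with STAMP_WARN_5; when that is unset or empty we
# fall back to the common STAMP_WARN, then to the default "Sorry".
# Only pass names produced by list_specs (they are safe for eval).
warn_for() {
	suffix=${1#STAMP_SPEC}
	eval "specific=\${STAMP_WARN$suffix-}"
	if [ -n "$specific" ]; then
		printf '%s\n' "$specific"
	else
		printf '%s\n' "${STAMP_WARN:-Sorry}"
	fi
}

# refresh_spec: the last spec name in sort order, which names the stamp
# to refresh in jacket mode.
refresh_spec() {
	list_specs | tail -n 1
}
```

Under this ordering STAMP_SPEC_10 sorts before STAMP_SPEC_2, so pick suffixes with that in mind if the refresh target matters.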

The boot cheat via stampctl

When a rule requires a stamp to run, but the rule also needs to be run on reboot, we have a problem. One could duplicate the rule to allow the superuser to run it without the stamp, or we could just use stampctl to build a stamp in the init.d (aka rc.d) script that offers permission to start the application.

This adds a line to the boot script to create the required stamp. It also might include a -k call to end the stamp after the application is started. This allows the normal audit check to remain in-place, and should not violate local site policy. Just document the use-case in the op rules-base with a comment.

Allow a reverse-pickup escalation for ~/.ssh/authorized_keys

I want to submit my authorized_keys file to the local accounting system. This system pushes each login's ssh configuration to each newly created home directory to provide a much better customer experience. We'd really like to allow each customer to submit the file they want installed, but how can we do it securely?

The Customer must submit the files (or file names) securely, and we want to enable the accounting user to fetch the files only when requested to do so. If we don't secure both ends a suborner could install her keys in someone else's account, or collect keys from someone's account (possibly to their account).

So let's give every login a rule to run stampctl to build a stamp in a secure directory with a random name. This rule then sends the name of the stamp file to the accounting system over a plain-text tcpmux service.

acct	{
		# Send our stamp name (pid.op-pid.five-random-digits) to
		# the local accounting hub for a key recovery.
		if ! [ -f "$Where" ] ; then
			echo "$0: $Where: no such file"
			exit 66 # NOINPUT
		fi
		set -e
		export From=`pwd`
		StN=acct/$$.$1.`jot -r 1 10000 89999`
		stampctl -M $StN -Dnobody:acct -E10m -uacct:acct -m660 \
			Intention=recovery Who Permit Where From
		muxsend -xf/dev/null acct.example.com key-pickup "$STAMP_FACILITY/$StN" 2>&1 |
		grep -v '^[+]' || echo "Submission complete."
		#echo "$0: requested pickup of $From/$Where."
	} $0 $Z ;
	users=^.*$
	$1=^submit$,^authorized_keys$,^authkeys$,^keys$
	uid=0 gid=acct
	$STAMP_FACILITY=/var/op/stamp
	$SHELL=/bin/sh umask=0007
	$PATH=/usr/local/libexec/jacket:/usr/local/bin:/bin:/usr/bin:/usr/sbin
	$Where=.ssh/authorized_keys $Who=$l $Permit=acct
Stamp access is allowed for the login acct and group acct, which you should change to local site policy values. If the acct directory under /var/op/stamp exists and it is owned by some login other than the acct, then you may change the uid to that login name (or to 0 for the superuser). Update the accounting host name while you are here, because your local accounting host is not in example.com. Also check the gid's spelling.

This specification runs the stamp process as nobody, while the filesystem permissions allow access by acct. This is a good policy in general, because it prevents signals from removing the effects of penalty mode (but it is not really needed here). We have to be the superuser to make the permission not match the process owner. We keep the group to allow the process to remove the stamp socket as it exits.

You could run the rule as acct and drop the -D specification from the stampctl spell. That matches my local site policy much better, as the rule would never need superuser access. If the muxsend fails, then the useless stamp times out after 10 minutes (which doesn't hurt anyone).
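
The in-line script above depends on jot(1), which not every host has. Here is a minimal sketch of building the same pid.op-pid.five-random-digits name with od(1) and /dev/urandom instead. The function names rand5 and stamp_name are mine, and I assume the shell's arithmetic is wide enough to hold a 32-bit unsigned value.

```sh
# rand5: print a number in 10000..89999, the same range that
# `jot -r 1 10000 89999` covers, from four bytes of /dev/urandom.
rand5() {
	n=$(od -An -N4 -tu4 /dev/urandom | tr -d ' \t\n')
	echo $(( $n % 80000 + 10000 ))
}

# stamp_name OP_PID: build "acct/<our-pid>.<op-pid>.<five-digits>".
stamp_name() {
	echo "acct/$$.$1.$(rand5)"
}
```

In the rule above that would read StN=`stamp_name $1` in place of the jot line.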

We'll see code for that below that loops-back an ssh to the host as a mortal login (usually acct) to fetch the login name from the newly created stamp. We must ssh to the host because we don't want to send the client's login name over the unenciphered tcpmux connection.

Back to the client for the pickup

After we have both the login name and the stamp path we can run an op rule to become the Customer with the stamp file (under -f) and the login name (under -u).

The stamp spell for that rule checks the specified stamp (%f) for the owner (%u) and for a confirming token attribute (viz. Intention=recovery) which the stamp creation rule added to the tableau. This makes it a lot harder to use some other stamp as a shill to trick the accounting system.

That provides enough security that we can pass an audit. Here is the rule we need:

acct	{
		# We must cd to the original directory
		cd "$From" || exit 77 # NOPERM
		# Optional check modes on the file, or exit non-zero.
		# Optional check the name of the file, or exit non-zero.
		# Optional add other files (.profile, .forward, ...)
		tar chf - $Where
	} $0 $u $f ;
	$1=^recover$
	%f.path=^/var/op/stamp/acct/[0-9][0-9]*[.][0-9][0-9]*[.][0-9][0-9]*$
	users=^acct$,^root$
	uid=%u initgroups=%u
	$SHELL=/bin/sh
	$PATH=/usr/local/bin:/bin:/usr/bin:/usr/sbin
	$STAMP_FACILITY=. $STAMP_SPEC=$f:Intention=recovery:Who=$u
	$STAMP_SET=Where:Who:Permit:From
	$STAMP_WARN=Insufficient$.stamp$.credentials
	jacket=/usr/local/libexec/jacket/stamp

Note that you might get this strange error from a client:

muxsend: acct.example.com: /var/op/stamp/acct/17711.28657.46368:\
	-ssh 192.168.89.233 stampctl exits 65280
That usually means you don't have a valid SSH_AUTH_SOCK, or no permission to connect to it. If you used your agent to test it and that had a random name, then you bought your own troubles. Or something caused the agent to exit on the accounting server.

When all that passes muster, the rule collects the file (as a tar stream) and outputs that back to the accounting system. That archive is extracted as the mortal login to build their .ssh directory when their account is created on new instances. We don't check the permissions anymore. If they require an empty file or mode that breaks ssh, then we give that to them.

This is really hard to suborn: the stamp has to be in a local directory that only the superuser (or acct) may write to -- with an active process bound to that name; the tableau must have an entry in it that no other rule installs; the stamp only lives for a relatively short time. The plain-text message with the random stamp name radiates almost no information. The ssh to the client host runs only an escalated rule, of which the accounting system is the only possible client. The tar archive is generated as the requesting customer, and is benign until it is unpacked. It may be frisked for traps before any deployment: then it is unpacked as the new mortal account from their home directory. This is forward compatible with collecting .profile, .exrc, .forward and the like.

The only issue I see is that a login name on the requesting system may overlap a login name on some other host: that is clearly a matter of local site policy. So the accounting system must autonomously map the name of the recovered tar archive, since we don't provide a name for the file -- so it is up to that structure to find a unique home for the archive.

Setup for the rules above

Here is a session where I test the recover rule as myself. I first change the rules listed above to replace the login acct with ksb, and the group acct with my primary login group.

Then I build the op stamp directory we need with install and stampctl:

# mkdir -p /var/op/stamp
# vinst /usr/local/lib/op/acct.cf
# /usr/local/bin/install -drv -m 2775 -o root /var/op
drwxrwsr-x. 3 root root 4096 Aug 29 11:10 /var/op
# /usr/local/bin/install -drv -m 2775 -o root -g ksb /var/op/stamp
drwxrwsr-x. 3 root  ksb 4096 Aug 29 11:13 /var/op/stamp
# stampctl -u ksb -g ksb acct
# ls -ld /var/op/stamp/acct
drwxr-xr-x. 2 ksb ksb 4096 Aug 29 14:55 /var/op/stamp/acct
Then I become myself to test the first rule (setup the stamp):
# su - ksb
$ op acct submit
muxsend: acct.example.com: key-pickup:		# no service running there, of course
$ ls -la /var/op/stamp/acct
total 8
drwxr-xr-x. 2 ksb  ksb  4096 Aug 29 11:17 ./
drwxr-xr-x. 3 root root 4096 Aug 29 11:13 ../
srw-rw----. 1 ksb  ksb     0 Aug 29 11:17 17974.19166.19331=
$ stampctl -Q acct/17974.19166.19331 Intention
recovery
$ stampctl -Q acct/17974.19166.19331 From Where Permit Who
/home/ksb .ssh/authorized_keys root ksb
That looks great. We build a stamp with a random name, it has all the information we need to recover the authorized keys file. Next I'll try the recover rule. If it builds a tar file we are good:
$ op -u ksb -f /var/op/stamp/acct/17974.19166.19331 acct recover | tar tvf -
-rw-r----- ksb/ksb         911 2000-09-11 09:11 .ssh/authorized_keys
$ stampctl -k acct/17974.19166.19331
$ exit
# vinst /usr/local/lib/op/acct.cf
With that check done, I can put the owner back and change the host we notify from acct.example.com to the local accounting hub (or at least one in our security zone). Then re-run the install commands with the accounting owner rather than myself. Note that the rm command should be redundant, but some very old versions of stampctl failed to unlink the socket. (But if you are running 1.26 or better you are good to go without it.)

The accounting tcpmux support

On the accounting host we need a tcpmux service that reads 1 line. The line is the path to the stamp on the target host. The IP address of the peer is available from getpeername. That is enough to assure that the host is one we manage, then trigger an ssh to the host (as the accounting user) to fetch the name of the user via stampctl. With a valid stamp we'll ssh back to collect the file(s) with op -u login -f stamp acct recover. The output of that (if not empty) is the tar archive to add to all the user's new home directory builds. If the exit code is non-zero ignore the output.
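
Since the one line we read comes from an untrusted peer and ends up on an ssh command line, it is worth checking its shape before doing anything else. A minimal sh sketch of that first step follows. The real acctmux.pl is perl and may check differently; the function names here are hypothetical and the pattern assumes the stamp names built by the submit rule above.

```sh
# valid_stamp_path PATH: succeed only for /var/op/stamp/acct/N.N.N,
# where each N is one or more digits.
valid_stamp_path() {
	printf '%s\n' "$1" |
	grep -Eq '^/var/op/stamp/acct/[0-9]+[.][0-9]+[.][0-9]+$'
}

# read_request: read exactly one line from stdin, drop a trailing CR
# (tcpmux clients usually send CR LF), and print the path only when it
# passes the check above.  Returns non-zero on anything else.
read_request() {
	IFS= read -r req || [ -n "$req" ] || return 1
	req=${req%"$(printf '\r')"}
	valid_stamp_path "$req" || return 1
	printf '%s\n' "$req"
}
```

Anything that fails the check should get the failure reply and no ssh at all.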

Another feature you must enable is a secure way to credential the accounting user to reach all the client hosts. I use an instance of ssh-agent which is started when the machine boots, in a fairly secure fashion. It is true that if you have a shell as the accounting login on that host you can get a password-less login on any managed instance, so mayhap that allows you to break everything. But that's how we update /etc/passwd, /etc/shadow, and /etc/netgroups -- so likewise does the ability to submit arbitrary changes to the accounting system. So you are already a superuser if you can do that. Your local site policy may vary. We, therefore, assume that the accounting user's known_hosts already has all the client machines.

From the commands above we can code such a script. Because they all do the same thing, this is going to look like recvmux, msrcmux, explmux, or roapmux. See the whole perl program, in acctmux.pl, but these are the good bits:

...
unless ($login = `ssh -anT -o \"VerifyHostKeyDNS yes\" $remote_as\@$remoteIP /usr/local/sbin/stampctl -Q $stamp_name Who 2>/dev/null` and 0 == $?) {
	print $opts{'M'} ? $mask : "-ssh $remote_as\@$remoteIP stampctl exit code $?\r\n";
	exit EX_NOPERM();
}
# Hey, local site policy: login names alpha + up to 15 alpha-numerics %%
$login =~ s/\r?\n$//;
if (not $login =~ m/^(\w[\w\d]{0,15})$/o) {
	print $opts{'M'} ? $mask : "-invalid login name \"$login\"\r\n";
	exit EX_NOPERM();
}
$login = $1;

# Find a place to stash the tar file.
unless (open($keep, ">userkeys/$login.tar")) {
	print $opts{'M'} ? $mask : "-no space available for $login\r\n";
	exit EX_NOPERM();
}

# Open the command to capture the file.
my($a) = '&';
unless (open($fh, "exec </dev/null $a$a ssh -nT $remote_as\@$remoteIP . /usr/local/lib/distrib/local.defs \\$a\\$a op -u $login -f $stamp_name acct recover \\$a\\$a exec /usr/local/sbin/stampctl -k $stamp_name|")) {
	print $opts{'M'} ? $mask : "-ssh $remoteIP recover: $!\r\n";
	exit EX_NOPERM();
}

# Release the client and finish the download async.
print "+$login\r\n";
open(STDOUT, ">/dev/null");
...
The code is marked with comments matching /Hey.*%%/ that help you fill in all the site policy. Fix the code and check it into local revision control so you can update later to better upstream versions.

We are a tcpmux service, so our messages back to the client start with either a positive (+) or a negative (-) reply character. Because of my local site policy there is an option to mask any radiated information about all failures.

On the client host we source local.defs by local site policy to set the environment (PATH, TZ, and their like). Then we collect the requested file. There is no reason why we couldn't collect more files, but we don't need others for this task. The process could be extended to include most other dot files. Or there might be more than one pick-up for files, which must be submitted selectively: your call.

The server-side tcpmux configuration is a little long:

tcpmux/key-pickup  stream tcp nowait acct:acct /usr/local/libexec/acctmux acctmux \
	PATH=/usr/local/bin:/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin \
	HOME=/home/acct SSH_AUTH_SOCK=/home/acct/private/agent

With all that installed we can test the whole flow:

ksb$ time op acct submit
Submission complete.

real    0m0.41s
user    0m0.02s
sys     0m0.02s
ksb$  exit

Now make your accounting system deploy the tarball when it builds new home directories. The meta-reference here is that the acct account itself must have an up-to-date .ssh/authorized_keys file to make this work. In configuration management it usually comes down to a chicken-and-egg issue in the end: if you have it it works, if you don't it is a total breeze to get going.

Other helmets

These are other helmets I've coded, some of which do not make a lot of sense unless you know the context in which they are needed.
envauth -- match environment variables to REs
 $ENVAUTH_VAR_name=re
 $ENVAUTH_NOT_name=forbidden
 $ENVAUTH_WARN=warning
 $ENVAUTH_REVEAL=prefix

This checks the contents of the dynamic environment variables set by other helmets. Since op only checks the environment as presented by the client, we might need this after a helmet has added new elements.

sheval -- assign dynamic environment variables from shell command output
 $SHEVAL_SET_var=cmd
 $SHEVAL_UNSET=list
 $SHEVAL_WARN=sorry
 $SHEVAL_REVEAL=prefix

This is a good way to get some information recorded in the environment while still escalated. Can be replaced by in-line scripts, or replace an in-line script. See sheval(7l).

This is also a great way to check for dual group membership. Given that sheval fails when a command exits non-zero, we can match the group list $a plus the client's real group $r against a group with expr. In this example we'll match the group disk in the rule, and the group wheel in the jacket (you may have to pass $PATH to make the jacket work):

ktest1  echo You are in both wheel and disk ;
	groups=disk
	helmet=/usr/local/libexec/jacket/sheval
	$SHEVAL_SET_trapWheel=/usr/bin/expr$.',$r,$a,'$.:$.'.*,wheel,.*'
	$SHEVAL_UNSET=trapWheel
Or get the system architecture via uname, which the make recipe below depends on:
ktest2  make install clean ;
	users=^.*$
	helmet=/usr/local/libexec/jacket/sheval
	$SHEVAL_SET_MyARCH=/usr/bin/uname$.-p
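
The expr match in the ktest1 rule above is easy to try at a shell prompt. This sketch wraps it in a function (the name in_group is mine) so you can see why the list is wrapped in commas:

```sh
# in_group LIST NAME: succeed when NAME is one element of the
# comma-separated LIST.  The added leading and trailing commas keep
# "wheel" from matching inside a longer name such as "bigwheel".
# expr exits 0 only when the pattern matches at least one character.
in_group() {
	expr ",$1," : ".*,$2,.*" >/dev/null
}

in_group staff,wheel,disk wheel && echo "in wheel"
```

Because sheval fails the escalation when the command exits non-zero, a failed match is all it takes to deny the rule. NAME is used as a regular expression here, so keep it to plain group names.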
wrope -- escalated environment access client's diversions via wrapw
 $WROPE_TO=template
 $WROPE_REVEAL=prefix

Runs an instance of wrapw as the client, but changes the ownership of the diversion socket to the escalated login. This allows the escalated-process access to the client's active diversions. Very useful to run regression tests as a mortal login. Unlike other jackets, there is no way to get an authorization warning from wrope: a failure to start wrapw is treated as an error, and fails the escalation as an OS error. (Which does get logged as a failed escalation.) See wrope(7l) and wrapw(1l).

proxy-agent -- escalated environment accesses client's ssh-agent socket
 $SPROXY_FROM=env
 $SPROXY_ENV=env
 $SPROXY_TO=template

Grant the escalated login access to any single local socket service. Almost always used to gain access to ssh's $SSH_AUTH_SOCK socket (which is the default). Actually this will proxy any local domain socket. It builds a safe directory under template with mkdtemp(3) and mktemp(3). We use 2 (or more) sets of XXXXXX's to assure that the make temporary filename call is secure (since mkdtemp is atomic, the mktemp in the newly created directory is also secure, due to limited permissions). See proxy-agent(7l).

signed -- check signature hash or checksum on the proposed program
 $SIGNED_FILTER_cmd=output
 $SIGNED_CMD_cmd=output
 $SIGNED_REVEAL=prefix
 $SIGNED_WARN=sorry

This may be used to assure that a script or binary program has not been changed since the rule-base was last updated by keeping one or more hashes or checksums in the rule-base. These are compared to a run-time version of the same hash which must match as an additional authorization check. See signed(7l).
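
The idea behind the check can be shown with POSIX cksum(1). This is a sketch of the comparison only, not of how signed reads its $SIGNED_ variables (see the manual page for that); the function name unchanged is hypothetical.

```sh
# unchanged FILE WANT: succeed when the cksum output for FILE (the CRC
# and byte count, read from stdin so no file name is printed) equals
# the string WANT recorded earlier.
unchanged() {
	have=$(cksum <"$1" | tr -s ' \t' ' ')
	[ "$have" = "$2" ]
}
```

Record the string once with cksum <script, keep it next to the rule, and any later edit to the script makes the comparison fail. A CRC only catches accidents; use a real hash where you worry about a deliberate change.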

manifest -- match the proposed program to a list of REs
 $MANIFEST_LISTspec=file
 $MANIFEST_WARNspec=sorry
 $MANIFEST_WARN=sorry
 $MANIFEST_REVEAL=prefix

This helmet allows the consolidation of many parallel rules into a single rule, as long as all the escalated scripts have the same usage and client authentication constraints. The path to the program specified by op in the helmet parameters must match one of the regular expressions in at least one file. The RE may be prefixed with a forced non-zero exit code to deny the escalation.

For example this list allows ls, cat, and true, but not other commands:
^/bin/ls$
^/bin/cat$
^/bin/true$
77=.*
The op configuration might look like:
do	$@ ;
	helmet=/usr/local/libexec/jacket/manifest
	$MANIFEST_LIST=/path/to/file/above
	users=... uid=...

This reduces the number of mnemonic rules, while adding little cost. Manifest is most often used to allow many similar application support scripts to be grouped into a single rule definition. See manifest(7l).
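
To make the list format concrete, here is a small sh sketch of a matcher for files like the one above. It is not the manifest source. I assume the first matching line decides, that the REs are extended regular expressions, and that a program matching nothing is denied; the function name manifest_check is hypothetical.

```sh
# manifest_check LISTFILE PROGRAM
# Each line is either "RE" (allow, exit 0) or "NN=RE" (deny with exit
# code NN).  The first line whose RE matches PROGRAM decides the result.
manifest_check() {
	while IFS= read -r line; do
		[ -n "$line" ] || continue	# skip empty lines
		code=0 re=$line
		case $line in
		[0-9]*=*)
			prefix=${line%%=*}
			case $prefix in
			*[!0-9]*) ;;		# not all digits: plain RE
			*) code=$prefix re=${line#*=} ;;
			esac ;;
		esac
		if printf '%s\n' "$2" | grep -Eq -- "$re"; then
			return "$code"
		fi
	done <"$1"
	return 1	# no line matched: deny (an assumption of this sketch)
}
```

With the example list, /bin/ls returns 0 and /bin/rm falls through to the final 77=.* line and returns 77.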

coat -- apply multiple jackets to an escalation
 $COAT=jackets
 $COAT_REVEAL=prefix

This allows more than a single jacket or helmet. Both may be done with the reveal hook: due to the limits of the environment specification, the jacket must be hidden from the helmet with a prefix.

Each of the required jackets in $COAT is started in turn. After all the external command output is consumed and the exit code is still 0, the next is started. If all the listed jackets start we allow the escalation. See coat(7l).
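
The start-each-in-turn logic is the familiar short-circuit loop. A minimal sh sketch, with two loud assumptions: I use a colon to separate the names in the list (check coat(7l) for the real $COAT syntax), and I run each command with no arguments, where the real coat hands each jacket the helmet parameters. The function name run_coat is hypothetical.

```sh
# run_coat LIST: run each command named in the colon-separated LIST, in
# order.  Stop at the first non-zero exit and return that code; return
# 0 only when every command succeeded.
run_coat() {
	old_ifs=$IFS
	IFS=:
	set -f			# no filename expansion while splitting
	set -- $1
	set +f
	IFS=$old_ifs
	for jacket in "$@"; do
		"$jacket" || return $?
	done
	return 0
}
```

The first jacket to object decides the exit code, and the jackets listed after it are never started.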

ttyowner -- replaced by stamp
 do not use
This jacket has been replaced by stampctl and stamp. It was intended to prevent 2 login sessions from sharing a common authorization, but it was very hard to assure that it actually checked the correct conditions.

Summary

Jackets and helmets give op a wider range of application and power, without adding every possible escalation limit into the core tool. Use them with care.

The facility provided by the stampctl and stamp services allows multi-key authorization as well as services that keep short-term state to create a whole range of options unavailable through other escalation structures.



$Id: jacket.html,v 1.44 2013/09/03 18:18:28 ksb Exp $ by .