What you need to know to understand this document

This document assumes you are familiar with UNIX shell commands and have run some system-level utilities. It also assumes that you have access to a UNIX system as the superuser. The document also uses my "code.css" style sheet to distinguish markup terms, parameter designations, command-line options, environment variables, and a path /seen/in/the filesystem.

Privilege escalation in general

UNIX™ and Linux services use the least privilege required to perform each task, which makes the whole system more secure. Special groups (viz. "operator", "lp", "mail") give some applications access to protected resources (data, devices, directories), rather than running all local services as the superuser. Everyone takes great care to use secure network protocols (viz. ssh and https) for private data, and to avoid injection attacks or releasing private data to third parties.

Any tools that escalate privilege must also be capable of very fine-grained control, and be as secure as possible by default. Part of that control would include rejection of simple typographical errors and overt acts of subornation, and a clear audit trail. This document describes how to use op (with my modifications) to get exactly what you want, and nothing else.

As a first example allow anyone to change their own shell (under Solaris):

chsh	/bin/passwd -e $l ;
	users=.*
	uid=root

A slightly more complex example: allow any Customer in group "web" to restart Apache:

apachectl /usr/local/sbin/apachectl $1 ;
	$1=^(restart)$
	groups=^web$
	uid=root
With that in place anyone in group "web" may run:
op apachectl restart
to restart the running Apache instance. They can't pass any other verb (stop, configcheck, etc.) unless it is added to the $1 regular expression list in that rule definition, or another rule is created.

Tip: I almost always use group membership as the key to escalated access. It is easy to maintain: as my Customers change political groups, I don't have to change the op configuration, because their login names never appear in the rules, just their groups. And group access is key to other features of UNIX -- so use it.

Op allows much more complex checking and control, some of which may be out-sourced to an arbitrary helper application. Before we dig into all that we need to explain the basis of design.

Think of op as a firewall. It keeps Bad Guys away from sensitive commands while allowing Good Customers access to make their tasks easier. This is what an IP firewall does, but op does it with shell commands rather than network resources. We follow the same paradigms as any firewall.

UNIX models

General-purpose privilege escalation on a UNIX system comes in two flavors: the setuid bits, and proxy access through a system daemon.

Any program with setuid (setgid) on in its permission bits runs with effective-uid (effective-gid) set to the uid (gid) of the file owner (see chmod(1)). This allows a mortal login to run an application with the privilege of the owner of the application. The application can "drop" back to the original login at any time (for example after opening a protected file).
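
To see which programs on a host carry these bits, a short Python check of the file mode is enough. This is only a sketch for exploring a system, not part of op; the candidate paths in the demo loop are assumptions and may not exist (or may not be setuid) on your host.

```python
import os
import stat


def escalation_bits(path):
    """Return the names of the setuid/setgid bits that are set on path."""
    mode = os.stat(path).st_mode
    bits = []
    if mode & stat.S_ISUID:
        bits.append("setuid")
    if mode & stat.S_ISGID:
        bits.append("setgid")
    return bits


if __name__ == "__main__":
    # Hypothetical candidates: adjust for the local system.
    for candidate in ("/usr/bin/passwd", "/bin/ls"):
        if os.path.exists(candidate):
            print(candidate, escalation_bits(candidate) or "no escalation bits")
```

An empty list means the program runs with the credentials of whoever starts it.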

Other privilege escalation is done by connecting to an existing process with any of the interprocess communication facilities: sockets, FIFOs, shared memory, semaphores, message queues, ptrace, and/or signals. In this case the client cannot start the process, so when it doesn't exist no escalation is possible. The system daemons which accept these connections may appear to "start on demand" from inetd/tcpmux, sshd, launchd or the like, but the IPC connection is made to an existing end-point that was present shortly after the host booted, or the user logged in. These services are started with escalated privilege to allow access to private resources. The sendmail MTA is an example of this type of escalation.

How op does it

For op we want to focus on the first tactic: op runs with a setuid bit and an owner of the superuser ("root"). But op is designed to "drop" to a particular login and group set specifically for each configured application.

For example, a program that needs to remove a mailbox from the local e-mail spool might only have to run with the "mail" group. Coding a whole new application just to run rm would be a waste of time: we can tell op (in part):

rm-mail	/bin/rm -f /var/mail/$1 ;
	uid=.
	gid=mail

That specification tells op to treat anyone running the command:

$ op rm-mail lark
as if the login running this command were in the group "mail" and ran:
$ rm -f /var/mail/lark

The advantages to this:

There is no special path to find "rm-mail".
Either you fill "/usr/local/bin" with lots of adapter scripts, or you add new directories to each Customer's $PATH variables. By putting the adapter logic in the op rule, we obviated the need to add most scripts or extend $PATH.
Setuid shell scripts are a security issue.
Don't ever make a setgid (or setuid) script: there is a race condition in the indirect execution of shell scripts via a symbolic link that allows Bad Guys to break them. Op provides a secure path to any script it runs, and doesn't need setgid (or setuid) bits on the script, because it drops to the correct credentials before it executes the target application.
We can revision control and audit the op rule-base
So we remember who built "rm-mail", why she did it, and who needed it. If we do have to put some adapter logic into a script we use the op rule-base as an index to keep track of where they are, not the Customer's $PATH.
We keep the filesystem cleaner
Over the years we've found that little adapter scripts get lost, out-of-date, or otherwise mismanaged. Either they never get deleted, or get deleted while still (rarely) needed. With a well-known structure and policy to grant access and index them we have the elements we need to manage the ones we do need.

The disadvantages to the example presented:

Insecure in that we might be able to remove unexpected files
As coded (in the example) that would allow the removal of any file on the filesystem that the group "mail" could remove. We can tighten that up later -- it is just an example.

Think about giving that rule "/var/mail/../*/*": since we didn't forbid the "/../" string, that might be able to remove some other files not under /var/mail.

Anyone can access the rule
We didn't limit the rule to a list of groups or users. We should always be explicit when we make a rule as to who we expect to access it.

Later we'll see how to remedy these issues, and how to code more complex rules.
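
The "/../" problem is easy to demonstrate outside of op. The sketch below models only the textual substitution of $1 into /var/mail/$1 and then normalizes the result; the function names are mine, and normpath works on the text of the path only, so it says nothing about symbolic links.

```python
import os.path

SPOOL = "/var/mail"


def resolved_target(name):
    """Return the normalized path that /var/mail/$1 names for this argument."""
    # Plain string substitution, as the rule's argument list would do it.
    return os.path.normpath(SPOOL + "/" + name)


def stays_in_spool(name):
    """True only when the argument names a file directly inside the spool."""
    return os.path.dirname(resolved_target(name)) == SPOOL


if __name__ == "__main__":
    for arg in ("lark", "../log/messages", ".."):
        print(repr(arg), "->", resolved_target(arg), stays_in_spool(arg))
```

An argument such as "../log/messages" lands in /var/log, which is exactly the kind of surprise a tighter rule has to exclude.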

Mechanics of escalation: building rules

"Who can execute this mnemonic?" is the first question we need to answer for each rule. That question is answered in op's configuration file in three parts: parameter matching, basic authorization, detailed authorization.

To describe these we are going to jump right into op's configuration file, because it is the best way to get you started. To find op's configuration file you should ask op with the version option (-V):

$ op -V
op: $Id: op.m,v 2.78 2010/03/25 18:18:14 ksb... $
op: access file `/usr/local/lib/op/access.cf'
op: using regex
op: multiple configuration files accepted
op: inline script and $s accepted
op: with pam support, default application "op"

That tells us to look in /usr/local/lib/op/access.cf. Because op is really picky about that file, you would have to su to root to see it. When the file is insecure, the version command above will complain about it.

Starting with access.cf

Looking at the default configuration file for op we see the "help" spell, which looks something like this:
help	/usr/local/libexec/op/help ;
	users=.*
	dir=/usr/local/lib/op

In brief that says, in English:

the mnemonic "help" means run /usr/local/libexec/op/help with no arguments, let any login run this and change the process's current working directory to /usr/local/lib/op.

If you actually run, as yourself:

$ op help
then you should see a list of the op mnemonics configured for your local host. This doesn't mean you have permission to escalate to use any of them -- we don't want to give the Bad Guys a list of the accounts to attack for free.

In fact op makes a lot of effort to skate on a fine line between telling the Bad Guy too much and telling the Good Customer too little. Sometimes a local admin might change op to be a little less verbose if there might be more Bad Guys about. In fact the "help" script may not be installed at your site, or -l might show an error message about such listings being "forbidden by site policy".

With more examples like the one above we will poke at the other features of op. This only works if you have access to a machine where you can edit access.cf (as root). If you don't have a host to do that on you can just read the manual page, as this tutorial won't help you very much.

See also the more technical review of the configuration file format.

Which rule should we select?

The configuration file may define more than one rule for a given mnemonic, but the first one that matches the input arguments is the only one that the Customer may access. If they do not have credentials to run that one, op rejects the attempt.

After the mnemonic name matches, additional attributes in the rule definition specify expressions that the argument list must match to select the proper rule.

mnemonic
The first thing that has to match is the mnemonic name itself. That is a literal string match, because RE matching proved to produce unexpected results. By convention mnemonic names should be short strings with no shell-special characters in them.
$#=number
Force the count of the number of words allowed on the command-line to be exactly number. When two mnemonics have the same name, forcing a different number of allowed arguments disambiguates them.
$N
$N=REs
The named positional parameter (viz. $1, $2, $3, and so on) must match one of these REs, otherwise another mnemonic may be selected. The default RE is a single dot (.).
$*=REs
Every other positional parameter must match one of these REs. This is often used to prevent leading dashes in parameters.

Examples of parameter matching

My Customers want to run rndc with several keyword options: start, stop, reload, status, reconfig, and querylog.

Most of these the native rndc will do, but both "start" and "restart" need to call the /etc/rc.d/named script:

rndc	/etc/rc.d/named $1 ;
	$1=^(start|restart)$
	...

rndc	/usr/sbin/rndc $* ;
	$1=^(stop|reload|status|reconfig|querylog|help)$
	...

In the example above I anchored the "stop,reload,..." list because that helps op build the correct usage message under -l. I also could build another rule with "freeze" and "thaw" if I needed one. By keying on $1 we make it look like the command namespace is not as flat as it really is.
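
The first-match selection used by the two rndc rules can be modeled in a few lines of Python. This is a simplified sketch of the behavior described above, not op's code: it handles only the mnemonic and $N checks, and it assumes that Python's re.search is a fair stand-in for op's regex matching on these simple patterns.

```python
import re

# Each rule: (mnemonic, program, {position: [REs]}), in configuration-file order.
RULES = [
    ("rndc", "/etc/rc.d/named", {1: [r"^(start|restart)$"]}),
    ("rndc", "/usr/sbin/rndc",
     {1: [r"^(stop|reload|status|reconfig|querylog|help)$"]}),
]


def select_rule(mnemonic, args, rules=RULES):
    """Return the program of the first rule matching mnemonic and args, or None."""
    for name, program, positional in rules:
        if name != mnemonic:  # the mnemonic is a literal string match
            continue
        matched = True
        for position, res in positional.items():
            if position > len(args):
                matched = False
                break
            word = args[position - 1]
            if not any(re.search(expr, word) for expr in res):
                matched = False
                break
        if matched:
            return program
    return None


if __name__ == "__main__":
    for verb in ("restart", "status", "freeze"):
        print(verb, "->", select_rule("rndc", [verb]))
```

A verb that no rule lists, such as "freeze", selects nothing, so the request is rejected before any access check is made.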

In older versions of op we would have named the rules as:

rndc-start	/etc/rc.d/named start ;
	...
rndc-restart	/etc/rc.d/named restart ;
	...
rndc-stop	/usr/sbin/rndc stop ;
	...
That really wastes space in the configuration file and causes the Customers to wonder if it was a hyphen or an under-bar they needed. It is almost as bad as the little adapter scripts. The better solution is to match on $1 and the like.

In this example I need to pass a helper script some values:

apachectl /opt/web/bin/apachectl $1 $2 ;
	$1=^unsecure$,^secure$
	$2=^(start|stop|restart|graceful|configtest)$
This uses a little different matching tactic for $1: a list of REs that each match a single word. Op knows how to output either under -l:
op apachectl unsecure|secure start|stop|restart|graceful|configtest

Who can run an escalated rule?

Now that we've filtered down to a single rule we need to check to see if the Customer is allowed access to it. Op controls access to each mnemonic based on seven attributes. A client must match at least one of this group of three:
groups=REs
Allow execution only for clients that have a group matching one of these REs. If any RE is prefixed with an octothorp (hash, "#") then the match is against the numeric gid, not the group name.
users=REs
Allow execution only for clients that have a login matching one of these REs. If any RE is prefixed with an octothorp (hash, "#") then the match is against the numeric uid, not the login name.
netgroups=words
Allow execution only for clients that are a member of one of the listed netgroups. See innetgr(3).
If they don't pass any of those they get a nice error message and op exits with a non-zero status (usually 1).

As I said above: use groups in preference to users to make your life easier. I also prefer netgroups to users, but I don't use them much as group membership works almost every time. I do use "users=.*" to mean "anyone", which op even outputs under -w.
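
The "at least one of three" check, including the octothorp prefix, can be sketched in Python as follows. This is a model of the description above rather than op's implementation: the rule is a plain dictionary of my own design, netgroup membership is passed in as a list instead of being asked of innetgr(3), and re.search stands in for op's regex matching.

```python
import re


def matches_any(res, name, number):
    """True if any RE matches; a leading '#' matches the numeric id instead."""
    for expr in res:
        if expr.startswith("#"):
            if re.search(expr[1:], str(number)):
                return True
        elif re.search(expr, name):
            return True
    return False


def client_allowed(rule, login, uid, groups, netgroups=()):
    """groups is a list of (name, gid) pairs the client belongs to."""
    if matches_any(rule.get("users", []), login, uid):
        return True
    if any(matches_any(rule.get("groups", []), name, gid) for name, gid in groups):
        return True
    # netgroups are listed as plain words, never as REs.
    return any(word in rule.get("netgroups", []) for word in netgroups)


if __name__ == "__main__":
    rule = {"groups": ["^web$"]}
    print(client_allowed(rule, "lark", 1001, [("staff", 20), ("web", 80)]))
    print(client_allowed(rule, "lark", 1001, [("webmasters", 81)]))
```

Note how the anchored "^web$" keeps a group like "webmasters" out, which is the reason to anchor the REs in the first place.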

If any of the above match, then these four optional attributes may check deeper:

password
Ask for the user's password to credential the execution. Before being asked they must have been allowed by one of the first three attributes. If this is set as a DEFAULT (see below) then it cannot be reset per-rule.
password=logins
As above, but check against the password of each of the specified logins. If a PAM authentication fails a password specification may still allow the escalation. When a list of logins is configured and matched, three additional specifications are allowed, besides a literal name:
.
The client's password.
%u
The password for the login specified under -u (see below).
%f
The password for the owner of the file specified under -f (see below).
pam
Unlike some other attributes, the empty value turns off PAM authentication. This allows a rule with a common set of DEFAULT attributes to skip PAM authentication.
pam=application
The specified PAM application must authenticate the requesting user before any escalation is allowed. The requesting user and the remote user are both set to the requesting login, the remote host is "localhost".

Commonly specified applications: "su", "login" or "system". Using "sudo" would tie op and sudo to the same policy, which could be clever.

This option fulfills (skips) any password check when satisfied. Other specifications:

.
Dot is taken as the default application, which is listed in the version output under -V, usually "op".
helmet=path
Run the program specified by path as root; if it exits zero the access is allowed. Such a program is only consulted if one of the first three rules above allowed the access, and any password specification was met.
jacket=path
Run the program specified by path as root to monitor the progress (and completion) of the new process. Such a program is only executed after all other authorization checks, including any helmet provided.

This is not really intended to deny access, but it can and should in some cases, see "jacket", below.

Helmets and jackets have other uses, see "Helmet and jacket programs", below. But for now just take it on faith that these provide some additional checks that you might want someday, and move ahead.

Examples for matching for access

To explicitly match all customers (to prevent the sanity check from complaining):
users=.*
To match members of group 0 as a client, by gid:
groups=#^0$

To allow everyone in group staff that knows the "operator" password:

groups=^staff$
password=operator

To allow any member of group "wheel" that can also su:

groups=^wheel$
pam=su

To allow the owner of a workstation access to install a set of mnemonics put them in a netgroup named "owner" and set:

netgroups=owner
This is how we specify host-based access in op's configuration. There is no other way to directly limit the scope of a mnemonic to a given host: that is a job for msrc, hxmd, or a helmet.

This is broken, as the netgroups code cannot get a list of netgroups to match REs against.

netgroups=.*
Always list netgroups explicitly without RE markup:
netgroups=localadmin,netadmin,operator

This allows anyone to try the rule, but the helmet rejects clients without LDAP authentication:

users=.*
helmet=/usr/local/libexec/ldapcred
$LDAP_CRED=$l:$r
$LDAP_LEVEL=admin
The "ldapcred" helmet exits 0 for success and may remove the two parameter environment variables via the API below. Note that LDAP_CRED and LDAP_LEVEL are arbitrary names I picked, while $l and $r are op markup described below.

Required options may be added

In addition to those access limits above, a rule may require any combination of three op command-line options (specified before the mnemonic):
-f file
The specified file is matched against (possibly many) attributes before the name of the file or an open file descriptor on that file is passed to the escalated command. A failure to match any attribute denies access to the mnemonic.
-g group
The specified group is matched against several attribute checks before allowing access to the mnemonic.
-u login
The specified login is matched against several attribute checks before allowing access to the mnemonic.

These options are mandatory whenever a rule calls upon one of them for a value. If the option's value is never required, then specifying the option denies access with an error message, for example:

$ op -f /dev/null help
op: Command line -f /dev/null not allowed

There are two ways op calls for a value from the command-line: by a reference in the context of an attribute as a percent macro (%f), or as a parameter specification when building the actual command as a dollar expansion ($f).

The first type allows op options in the rule definition to reference any %f, %g, or %u as the target value for the option. The second expands the appropriate value from the command-line option of the same letter when building a command (or environment variable). For example, using %f when a login name is expected substitutes the owner of the file. Using $f in the args section of the rule definition substitutes the path as part of the executed command. We don't use the percent form there because we don't want to make percents special in that context, and we don't use dollar in the other place as that is a legitimate value for some options (for example part of an RE).

In the password description above we've already seen that in that context %u expands to the login specified under -u.

Permission configuration for option values

These attributes specify who may access the mnemonic.
%g=REs
The command-line -g's group must match one of these REs.
!g=REs
The group specified on the command-line must not match any of the listed REs.
%u=REs
The login specified on the command-line must match one of the listed REs.
!u=REs
The login specified on the command-line must not match any of the listed REs.
%u@g=REs
The login specified on the command-line must be a member of a group that matches one of the listed REs.
!u@g=REs
The login specified on the command-line must not be a member of any group that matches one of the listed REs.
%f.attr=REs
The file specified on the command-line has its stat(2) attribute checked against the listed REs, one of which must match.
!f.attr=REs
Same as above, but none of the REs are allowed to match the attribute.
For the case of %f the attr must come from this list, most of which are taken from struct stat members with the leading "st_" removed.
dev
The file's device number in decimal.
ino
The file's inode number in decimal.
nlink
The file's link count in decimal.
atime
The file's access time in decimal.
mtime
The file's modification time in decimal.
ctime
The file's change time in decimal.
btime or birthtime
The file's birth time in decimal (not available on platforms other than FreeBSD).
size
The file's size in decimal bytes.
blksize
The file's block size in decimal.
blocks
The file's size in 512 byte blocks.
uid
The file's owner as a decimal uid.
login
The file's owner converted to a login name.
gid
The file's group as a decimal gid.
group
The file's group, converted to a group name.
login@g
The file's owner is treated as %u@g: the owner must be a member of a group matching one of the given REs for the file to pass. Inverted under !f, of course.
mode
The file's mode as a four-digit octal number.
perms
The file's permissions as ls might display it.
path
The file's absolute path.
access
A four character string representing the return values from four calls to access(2) against the file: "rwxf" would indicate all access, while "----" would indicate no access at all.
type
The file's type letter as ls would display it in the first column of the symbolic permissions. After the type some other details about the file may be available: if the file is a directory the letter 'm' will be suffixed if it is an active mount-point, and the letter 'e' will be suffixed if it is empty. If the file is a symbolic link and %f.type matches the letter 'l', then the type of the file the symbolic link points to will be added.
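
To get a feel for the strings these attributes produce, the Python sketch below builds a few of them for a file and matches an RE list against one. It is an approximation for experimenting, not op's code: the function names are mine, it uses lstat (an assumption about how symbolic links are examined), it omits the 'm' and 'e' suffixes on type, and the exact text op produces for perms may differ from what is shown here.

```python
import os
import re
import stat


def file_attrs(path):
    """Build string forms of some %f attributes for path."""
    st = os.lstat(path)
    perms = stat.filemode(st.st_mode)  # e.g. "-rw-r--r--", as ls shows it
    access = "".join(
        letter if os.access(path, flag) else "-"
        for letter, flag in (("r", os.R_OK), ("w", os.W_OK),
                             ("x", os.X_OK), ("f", os.F_OK))
    )
    return {
        "size": str(st.st_size),
        "nlink": str(st.st_nlink),
        "uid": str(st.st_uid),
        "gid": str(st.st_gid),
        "mode": format(stat.S_IMODE(st.st_mode), "04o"),  # four octal digits
        "perms": perms,
        "type": perms[0],  # first column of the symbolic permissions
        "path": os.path.abspath(path),
        "access": access,
    }


def attr_matches(attrs, name, res):
    """Model %f.name=REs: one of the REs must match the attribute's string."""
    return any(re.search(expr, attrs[name]) for expr in res)


if __name__ == "__main__":
    info = file_attrs("/tmp")
    print(info["type"], info["mode"], info["access"])
```

The negated form (!f.attr=REs) would simply require attr_matches to be false.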

Examples of argument specifications

To fix the first example (rm-mail) we can make sure the name of the mailbox is a valid login name:

rm-mail	/bin/rm -f /var/mail/$u ;
	uid=.
	gid=mail
	%u=^.*$

To allow any file under /tmp, not owned by login sshd:

	%f.path=^/tmp/.*$
	!f.login=^sshd$

To allow anyone in group source to specify another member of group source (even themselves):

	groups=^source$
	%u@g=^source$

To allow any group that contains "web" in the name we could use an unanchored RE, but then sanity will carp at us; better to be more explicit:

	%g=^.*web.*$

And lastly the ever popular "anyone but the superuser":

	!u=#^0$

What can op change about a process?

There are about 20 attributes of a process one might escalate or remove to make escalation safer. That is to say any tool like op should be able to change any of these attributes in a predictable way to assure that the new privileged command is as safe as it can be. Under op most of those attributes may be forced to specific values.

By default op modifies the environment for a mnemonic command by changing the effective uid to 0 (the superuser), and removing any supplementary groups, then cleaning the environment. This default is modified by putting attribute settings on the mnemonic "DEFAULT" (which is never used as a Customer driven mnemonic).

For a specific rule, any default should be replaced with an explicit value that gives the minimal privilege required to meet the intent of the rule.

Below we list the process attributes that op might change, and some idea of why that is something op might do.

$VAR
Pass the given environment variable as-is: don't remove it from the original environment. This might be used to pass $TERM for example.
$VAR=value
Set the given environment variable to the exact value. This might be used to set a $PATH, or $TZ.

There is a limit in op's parser that doesn't allow values with embedded white-space to be set directly. Later we'll see how to do that with a helmet.

basename=word
Force a different argv[0] for the new process. Some programs (like sbp) look at the name of the program to force command-line options. Also most shells look for a leading dash (`-') in the name to start a login shell. Not often used.
chroot=directory
Change root for the process. Used to start network programs that use a restricted environment. On some systems this is way harder to setup than others.
daemon
Double-fork the process into the background, redirect I/O to /dev/null. Used to start daemon processes: these also don't stay connected to the controlling terminal device, see setsid(2).
dir=directory
Change directory here first. Has the obvious use. Usually changes to the root directory, or a directory that the Customer normally could not access.
environment
Allow all existing environment variables to pass. This is only used when the effective uid and gid are left as they were; then we can pass the environment as-is because no possible compromising escalation was done.
environment=REs
Allow any existing environment variable which match any of the listed REs through to the new process. This is often used to allow access to one of ksb's wrapper applications.
gid=words
Force the real gid to this group, gid, %g, %f, or . (use the invoker's gid). The real group identifier might be used by the escalated process to restore the user's original group.

Note that this can be a list of groups, the first will be the real group id, the others are added with setgroups(2) in place of a call to initgroups(3). When the rule has both a gid list and an initgroups attribute the results should be the unique elements from both the list and the groups from the specified login, but we may run out of slots.

egid=word
The effective group may be different from the real only if these are both set. Takes the same specification as gid. This is also always the first group in the group list, because that's the convention on a lot of UNIX systems.
uid=word
The real user identifier is forced to the specified value. The word may be a login, uid, %u, %f or . (use the invoker's uid). The default uid is the effective uid given by the setuid bit on the op binary, usually the superuser. Since that might give away too much privilege the sanity check asks that you specify an explicit uid for each command. You can suppress that check by setting one for the DEFAULT stanza.
euid=word
The effective user identifier is forced to that given value. The default is the value of uid.
initgroups
When no word is provided op calls initgroups(3) on the same login as the uid set (either effective or real).
initgroups=word
Force an initgroups call on a specific login with this specification: one of %u, %f, or . (the current list), or any valid login name.
fib=number
Use the setfib(2) system call to set the routing table for the new process. This is only available on FreeBSD systems, see -H output.
nice=number
This allows tasks to run with greater priority than the default. The nice value ranges from -20 to 20 on most UNIX systems.
stdin=redir
stderr=redir
stdout=redir
Force the input (output, error) channel of the new process to redir. Redir may be prefixed with the standard shell input/output redirection markups (<, <>, >, >>) to modify the open(2) flags. The file may be specified as %f, or a path to an existing file. Op won't create a file with such redirection.
session
Turn off any PAM session default.
session=login
Set up a PAM session for the given login. The application requesting the session is always the default one listed under -V, usually "op". The remote user is the requesting login, the remote host is "localhost".

The login may also be specified as one of the following:

.
The original user.
%i
The initgroups login, or the value of uid or euid in that order. This value is only available under session, since usually it makes the most sense to provide a session for the same login as the initgroups.
%u
Forcing -u, the command-line specified login.
%f
Forcing -f, the owner of the specified file.
cleanup
Turn off any cleanup default.
cleanup=login
Taking the same specification as session, fork(2) a process to call pam_close_session(3) after the escalated process exits. The same specifications as session are allowed, with the special dot (.) specification interpreted as a request to exactly copy the session specification (even if empty).

This specification is normally not required unless the session started a co-process (for example an instance of ssh-agent in the case of pam_ssh).

umask=octal
Set the process's umask (default 022). Mostly used to ensure that the client doesn't make a file with escalated privileges that is insecure.
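
The effect of a umask value is plain bit arithmetic: every bit set in the mask is cleared from the mode a program asks for. A small Python sketch (the function name is mine) shows what the default 022 and a tighter 026 leave behind.

```python
def created_mode(requested, umask):
    """Return the permission bits left after the umask is applied."""
    return requested & ~umask


if __name__ == "__main__":
    for mask in (0o022, 0o026):
        # Most programs ask for 0666 on plain files and 0777 on directories.
        print(format(mask, "04o"),
              format(created_mode(0o666, mask), "04o"),
              format(created_mode(0o777, mask), "04o"))
```

With 026 a new plain file comes out as 0640, so "other" logins cannot read what the escalated process writes.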

Examples of limits

To start the real-time Large Hadron Collider process with elevated scheduler priority:

cruncher /opt/atomic/bin/smasher $* ;
	stderr=>>/var/atomic/errors
	nice=-4  umask=0026
	uid=mighty gid=mouse
	groups=^lhc$,^eotw$,^admin$ ...
	$PATH=/opt/atomic/bin:${PATH}

To allow users in group "operator" to cat any single plain file on the filesystem:

cat	/bin/cat ;
	groups=^operator$
	%f.type=^-$
	stdin=<%f
	uid=. gid=.
The only part of the escalation that runs as the superuser is the open of the file. The cat process runs as the mortal that ran op. That is so cool.

Seeing the impact of each one

Most of a process's attributes may be displayed by running the showme.sh script. I often use this script to test op's environment logic in new rules I crafted. For more advanced checks you might need a perl or C program to produce special output.

Building the command to run

If the first word after the mnemonic is a full path to a shell program, then the args after that are all positional parameters to that utility.

When the first word after the mnemonic is a lone open curly brace ({ followed by white-space) then the lexical part of the configuration file parser builds an in-line script out of all the characters until it finds a close curly brace (}) as the first non-white-space character on a line. The script is effectively replaced with 3 tokens ($S -c $s). The args after that are positional parameters to that script.

If the first word is MAGIC_SHELL then something totally different happens. The MAGIC_SHELL token is discarded. If that leaves no args then a default argument list will be constructed later. Otherwise the argument list is expanded as given, but the meaning of $* and $@ changes, see the explanation below.

Expander markup for building commands

The arguments to the new process are expanded from the list of words after the mnemonic and before the delimiting semicolon (";") or ampersand ("&"). These words are expanded via a shell-like substitution. The dollar-sign ("$") is the only special character. No backslashes, no quotes for white-space. This is an attempt to make it clear to an auditor what the expansion will output, while still allowing useful replacement operations.

Expanded from the rule definition

$0
The mnemonic specified on the command-line.
$_
The path to the program we are going to execute. This cannot be used to create itself, of course.
$s
The in-line script provided in place of a command path, without the delimiting curly braces. This is an exception to the rule about case (below), this is just the best letter to represent the script text, and it is often used with $S.
$w
The name of the configuration file that defined the access rule.
$W
The line number in $w that started the rule stanza.

Expanded from the UNIX credentials

In general each lower-case letter is a string, while the upper-case version is a number. For example $l is a login name with $L as that login's uid.

These are expanded from the credentials that the UNIX process holds (uids real and effective and the like), and the provided environment, and the command-line:

$a
The group list of the client process. For example "staff,wiz" when the process was in two groups. Which makes $A a list of the gids.
$i
The target login for any initgroups(3) call from the rule. Which makes $I the uid of that login.
$h
The home directory of the client login.
$H
The home directory of the target login. This doesn't follow the case convention, but the numeric rule doesn't really apply to a home directory.
$l
The client login name. Which makes $L their uid.
$n
The new group list given to setgroups(2), which makes $N a gid list.
$o
The target real group, which makes $O the group's gid.
$r
The client's real group, which makes $R that group's gid.
$t
The target login, which makes $T be the target uid.
${ENV}
The value of the environment variable ENV as it was in the original environment.
$S
The value of $SHELL if allowed from the original environment, or /bin/sh when no value for SHELL is present (or allowed). This is for MAGIC_SHELL support.

Expansion based on the command-line presented

There are several expansions used to construct the utility command and parameters executed by op:

$1, $2, ...
Each positional parameter after the mnemonic is available by the same name a shell script would use. Mentioning the name as a word, or substring of a word expands the actual value in place of the markup.
$N
The N-th command-line parameter (like the shell) $1, $2, $3 and so on.
$*
$@
The arguments specified on the command line that were matched by the $* attribute below. When called as $* empty parameter words are squeezed out.
$* (under MAGIC_SHELL)
Under MAGIC_SHELL the expansion of this changes to a single word. Used to form a -c specification to the specified shell.
$#
The number of words presented on the command-line. This doesn't always match the number of words in $@ as some of them might have matched fixed $N's.
$f
The absolute path to the file given on the command-line.
$F
Substitutes an open file descriptor (small integer in decimal) open to the file specified on the command-line. This descriptor will be read/write if possible, else read-only. It is safer to ask for read or write with the stdin, or stdout attributes below when possible, viz. stdin=%f.
$g
The group name provided to -g, or the part of the login specification after the colon. Any mention of this in a rule forces -g on the command-line (unless a -u specified as login:group is included).
$G
The gid of the group name provided to -g (also honors the group name after the colon under -u).
$u
The login provided to -u. Mention of this anywhere in the parameter list forces the need for a -u on the command-line.
$U
The uid of the login provided to -u (which also forces the need for that option).
$d
The directory part of the -f's file value. Substitutes the absolute path to the directory containing the file specified on the command-line.
$D
Substitutes an open read-only file descriptor (small integer in decimal) on the directory part of file. Caution: this can break the chroot attribute (below).

Fixed strings

These are fixed strings that are useful markup, largely to overcome limitations in op's configuration parser.

$|
The empty string. This is useful to deny a semicolon its special meaning, see the examples below.
$\c
Allow any of the backslash escapes tr(1) allows for special characters:
a   alert character
b   backspace
f   form-feed
n   newline
r   carriage return
t   tab
v   vertical tab
s   a space (not from tr)
This was added in a hollow attempt to allow spaces in command arguments. It is just too much noise to type $1.$2$\s$3 for long commands.
$$
A literal dollar sign.
$|
The empty string. This allows "$1" to be followed directly by digit "3", as "$1$|3".

Examples of expander markup

To remember the original login name in $ORIG_LOGIN:

rule	command ... ;
	$ORIG_LOGIN=$l

To get a rule to echo a semicolon use:

echo	/bin/echo $|;$| ;
	uid=. gid=. users=.*

To get a parameter with an embedded space:

date	/bin/date +%a$\s%b$\s%e ;
	uid=. gid=. users=.*
or you could use an in-line script, but I don't use them often for these reasons. Here is an example where I might:
date	{
		/bin/date +"%a %b %e"
	}
	uid=. gid=. users=.*

A different use of the in-line script is a dynamic DNS update with nsupdate. Since we need to keep the authentication key secret, and the op rule-base is already protected, we can stash the whole update script here.

dnsSet	{
		exec nsupdate -v <<-!
		server 10.10.10.254
		zone dynamic.example.com.
		key dynamicKey aabbCCddEE00FF==
		update delete sulaco.dynamic.example.com. A
		update add sulaco.dynamic.example.com. 86400 A ${1}
		show
		send
		!
	} $1 ;
	$1=^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*$
	uid=. gid=. users=^root$,#^0$ netgroups=owner
This is not a perfect solution: the process table may display the dynamic key for a very short window of time, but in practice it has been much more likely that the key file leaked in a more public way. This rule is usually only run by dhclient from /etc/dhclient-exit-hooks as the superuser, or as the owner of each workstation.

To get the rule above to the target host without giving away the crypto-key is a whole topic all by itself. Here I'll just say that msrc can merge the dynamic name of the host and the appropriate key from a file that mortals cannot normally read, and op works for that step as well.

To set some of the same environment variables sudo might:

rule	...
	environment=^(COLORS|DISPLAY|HOSTNAME|KRB5CCNAME|LS_COLORS|MAIL|PATH|PS1|PS2|TZ|XAUTHORITY|XAUTHORIZATION)$
	$SUDO_COMMAND=$_
	$SUDO_USER=$l
	$SUDO_UID=$L
	$SUDO_GID=$G
	$LOGNAME=$t
	$USER=$t
	$USERNAME=$t
	$HOME=$H
	TERM=unknown
Note that SUDO_COMMAND will not be exactly the same: we didn't put the argument list on it. If I really wanted to emulate sudo I think I would code a helmet to do it, because I believe the rules structure in sudo is far too complex to really audit.

To set some of the same environment variables super might:

rule	...
	ORIG_USER=$l ORIG_LOGNAME=$l ORIG_HOME=$h
	USER=$t LOGNAME=$t HOME=$H
	IFS=$\s$\t$\n
	PATH=/bin:/usr/bin
	SUPERCMD=$0
I'm not really sure about HOME, since the wording in the manual page is unclear (to me).

Helmet and jacket programs

Other privilege escalation tools try to think of every aspect one might wish to use to limit access, or to log about an access: op has a simple, but complete API to allow the administrator to "plug in" any check they can code in a program.

The basic protection is provided by a helmet (that is a play on the idea that it protects your head). A helmet is a program that is called to approve the access just before op is ready to run the program. It receives a long list of options and parameters that explain the context that op is in, and op expects the helmet to exit 0 only if the access should be granted.

The helmet may add or delete environment variables from the target process's environment with an overly simple protocol. And it may recommend a failure exit-code back to op. Usually a helmet is passed any additional specifications via environment variables set in the rule's configuration. These are removed by the helmet if they should not be leaked to others (although they may appear in the process table for a very short time as the helmet executes).

For example below I made up a rule that called the helmet "time-box" to check a time-box of 22:00 to 06:00 for the start of the "backup" mnemonic:

backup	/usr/local/libexec/doBackups ... ;
	groups=^operator$
	helmet=/usr/local/libexec/helmet/time-box
	$TIME_BOX=2200-2400,0000-0600
	$TZ=...
That helmet should remove $TIME_BOX from the environment for two reasons: (1) other programs might use it in conjunction for some unrelated filter when set, and we don't want to trigger that by accident, and (2) we don't need to radiate information about the allowed access. Also note that we need to set $TZ if we are going to look at the clock, right?

There is an example perl script (jacket.pl in the source to op, or you can view an HTML version of the same code). Checking to see if the current hour and minute are in a range of integer values is left as an exercise to the reader. But if we had that check we could use the jacket.pl code above and add to our "CHECKS AND REPARATIONS":

	use POSIX qw(strftime);
	my($hhmm);
	$hhmm = strftime "%H%M", localtime;
	if (your code here) {
		print "-TIME_BOX\n";
		exit 0;
	}
	print STDERR "It is not your time\n";
	exit 69;
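
The range check itself can be sketched in a few lines of POSIX shell rather than Perl. This is only a sketch under stated assumptions: it takes $TIME_BOX to be comma-separated start-end pairs in HHMM form, as in the rule above, and it treats each range as including its start and excluding its end. The function name in_time_box is made up for the illustration; it is not part of op or jacket.pl.

```shell
# Succeed when HHMM ($1) falls inside any range of a TIME_BOX-style
# list ($2), for example "2200-2400,0000-0600".
# ASSUMPTION: ranges are start-inclusive and end-exclusive.
in_time_box() {
	now=$1
	rest=$2
	while [ -n "$rest" ]; do
		range=${rest%%,*}		# first range in the list
		case $rest in
		*,*) rest=${rest#*,} ;;		# drop it and its comma
		*)   rest= ;;			# that was the last one
		esac
		start=${range%-*}
		end=${range#*-}
		# test(1) compares in decimal, so leading zeros are harmless
		if [ "$now" -ge "$start" ] && [ "$now" -lt "$end" ]; then
			return 0
		fi
	done
	return 1
}
```

A helmet written in sh might call it as in_time_box "$(date +%H%M)" "$TIME_BOX", print "-TIME_BOX" and exit 0 on success, or complain on stderr and exit 69 otherwise, mirroring the Perl fragment.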

A jacket

A jacket wraps the escalated process so it can wait for it to exit. Along the way it could time the process, kill the process, restart failed processes, or block other people from using the same resources as the process. It might even run a wrapper diversion for the task, like ptbw.

It can do anything any co-process might. It is up to you to justify the use of the extra process. It may cleanup after the escalated program or run the parts of the application that need elevated privileges.

The original op process becomes the jacket program, while a child process (already fork'd) is about to execve the escalated program. Before it does, the child blocks reading from the jacket's stdout.

The jacket sets up any timers, file locks, or what-not it needs, then closes stdout to free the blocked child. If it can't gain the locks it needs or smells something bad, it can write a non-zero exit code to the child (say 75) and op will abort the execution and exit without starting the target program.

The same program might be both a helmet and a jacket: it can tell which context it is in by the -P option, only jackets get that one.
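
The gate a jacket applies before it releases the child might look something like the shell sketch below. It is an illustration only: the function name and the lock directory are hypothetical, and sending the exit code as decimal text on stdout is an assumption about the wire format, which the description above does not spell out. Check the op source before relying on it.

```shell
# Try to take an exclusive lock by creating a directory (mkdir either
# creates it or fails, never both for two callers).
# On success print nothing, so the caller can simply close stdout to
# release the blocked child. On failure print an exit code for op.
# ASSUMPTION: the code travels as decimal text; 75 is EX_TEMPFAIL.
jacket_gate() {
	lockdir=$1
	if mkdir "$lockdir" 2>/dev/null; then
		return 0
	fi
	echo 75
	return 1
}
```

A jacket built on this would also remove the lock directory once the escalated process exits.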

The MAGIC_SHELL form

This form usually allows arbitrary command execution, so I don't favor it. In fact the number of times I've installed a production magic shell rule is exactly two. I'd rather use ssh and a well controlled authorized_keys file as local policy at NPC Guild.org. At the very least you should be really picky about who can use this access: I'd limit it to a single (special purpose) login.

Change to $*

The first change from a standard rule is that $* gets all of the command-line parameters bound to it merged into a single parameter. Any sense of word separation is lost (replaced with a single space). So, for example:
test1	MAGIC_SHELL /bin/$1 -c $* ;
	users=^anybloke$
	$1=^sh$,^ksh$,^bash$
	uid=. gid=.
The -c's single argument is all the words on the original command-line after $1. To test this I ran
op test1 sh date \; hostname
and got the output from date and hostname. Note that I quoted the semicolon with a backslash to get it passed as a literal word to op.

If you remove the MAGIC_SHELL token you'll only get the output from date, as the semicolon and hostname tokens are now parameters to /bin/sh, rather than included in -c's argument.
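
The difference is easy to reproduce without op. The sketch below assumes only what the text says, that MAGIC_SHELL joins the words bound to $* with single spaces. The helper names are hypothetical stand-ins for op's expansion, and echo stands in for date and hostname so the output is predictable.

```shell
# Join every argument into one word, separated by single spaces.
join_words() {
	printf '%s\n' "$*"
}

# With the join: -c receives "echo one ; echo two", so both commands run.
run_joined() {
	sh -c "$(join_words "$@")"
}

# Without the join: -c receives only "echo"; the other words become
# positional parameters of that shell, so a lone empty line is printed.
run_split() {
	sh -c "$@"
}
```

Running run_joined echo one \; echo two prints both words on separate lines, while run_split with the same words prints nothing useful, just as the rule without MAGIC_SHELL only ran date.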

Default argument template

The next strange thing is that no args need be specified: op builds one of two default argument lists. When presented with parameters it builds:
$S -c $*
When presented with no arguments it builds:
$S

With no command-line parameters this provides an interactive shell; otherwise it runs the command you gave it with the selected shell under -c.

There is one other magic element: if the shell you set contains the string "perl" then the "-c" is changed to "-e" because that's what perl wants.
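
The three cases above can be restated as a short shell sketch. It is not op's code, only a model of the description: magic_default_args is a hypothetical name, its first argument plays the part of $S, and the remaining arguments play the part of the command-line parameters.

```shell
# Print, one word per line, the default argument list described above.
# $1 is the selected shell; any other words are the command line.
magic_default_args() {
	shell=$1
	shift
	if [ "$#" -eq 0 ]; then
		printf '%s\n' "$shell"		# no parameters: interactive shell
		return 0
	fi
	case $shell in
	*perl*) flag=-e ;;			# perl takes its program under -e
	*)      flag=-c ;;
	esac
	# the parameters are joined into a single word after the flag
	printf '%s\n' "$shell" "$flag" "$*"
}
```

For example, magic_default_args /bin/sh date \; hostname yields the three words /bin/sh, -c, and "date ; hostname".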

Alternatives

By putting $* in the context of an environment specification the same transformation (joining the words together with spaces) may be rendered:
	$OP_ARGV=$*
	$OP_ARGC=$#
which does give the new process a different way to get the command-line parameters joined into a single word. This might also be useful to pass the arguments to a helmet or a jacket.

Examples of MAGIC_SHELL specifications

To allow anyone in group "wheel" to become the login "operator" on demand:

operator MAGIC_SHELL ;
	groups=^wheel$
	uid=operator initgroups=operator

To force that access through the restricted shell and limit the first word a little:

operator MAGIC_SHELL /bin/ksh -r -c $* ;
	groups=^wheel$
	uid=operator initgroups=operator
	$1=^/sbin/dump$,^/sbin/restore$
Note that the $1 limitation doesn't really add any security, as $2 might be an option that makes the command exit to allow a command after a literal semicolon to run anything. By giving access to a shell you are removing almost any limit from the Customer.
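
A sketch with harmless stand-ins makes the point concrete: echo and true play the part of /sbin/dump and /sbin/restore, plain sh plays the part of the restricted shell, and both function names are hypothetical. The first-word check passes, yet the shell still runs whatever follows the semicolon.

```shell
# Stand-in for the $1 restriction: accept only two command names.
first_word_ok() {
	case $1 in
	echo|true) return 0 ;;
	*) return 1 ;;
	esac
}

# Stand-in for the magic shell: check the first word, then hand the
# joined words to sh -c, much as the rule hands them to ksh -c.
run_like_magic_shell() {
	first_word_ok "$1" || return 77		# 77 is EX_NOPERM
	sh -c "$*"
}
```

Here run_like_magic_shell echo allowed \; echo "not checked" prints both lines, because only the first word was ever examined.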

This is a feature that people like, but I don't really think you need to use it in production. The operator example above could be better thought out and expressed as the 3 things you really need to do as the operator login, not a general shell access. Maybe a command to remove a dump, create a dump, and one to get started with restore. I doubt there are so many commands one would run as operator that they cannot be matched by op's RE logic.

My favorite exploit was passing the text below to a magic shell to get an interactive session as the target login:

$(DISPLAY=localhost:11 /usr/local/bin/xterm -ls)

You can depend on your Customers to be more creative than you thought they could be. If they can load commands that are then run by op, then they will load a command with an option to get a shell, count on it.

Sanity checks

Under the -S option op tests each rule in the rule-base against my own ideas of what is "sane". When it finds something it should flag as "insane", it produces an error message on stderr. If the error is serious it exits with a non-zero exit-code from <sysexits.h>.

The sanity check drops to the real uid of the invoker when frisking any files given on the command-line. But an op rule to allow a mortal administrator to run "op -S" is not out of the question.

Example sanity checks

Allowing unanchored regular expressions to match users or groups might be silly (as noted in the examples in that section).

Giving non-numeric values to nice, umask, or $#, or N in $N is really frowned upon.

Giving specifications for -u, -g, or -f then not using them in the rule (or the opposite).

Requiring a password from a login that doesn't exist on the host.

Setting a DEFAULT rule that is never used.

Adding rules that match the same patterns as one above it. Or putting the same mnemonic in more than one file (as you can't be sure which is consulted first).

There are a lot of path checks as well, and some others that I hope you'll never see.
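
To give a feel for the first of these checks, here is a naive shell sketch of an "is every pattern anchored?" test. It is not op's actual logic, and all_anchored is a hypothetical name. It assumes a comma-separated pattern list as seen in the rules in this document, and it strips a leading # because an earlier example writes a uid pattern as #^0$; that reading of the prefix is an assumption. It also cannot tell a trailing anchor from an escaped dollar sign.

```shell
# Succeed only when every pattern in a comma-separated list ($1)
# starts with ^ and ends with $.
all_anchored() {
	rest=$1
	while [ -n "$rest" ]; do
		pat=${rest%%,*}			# first pattern in the list
		case $rest in
		*,*) rest=${rest#*,} ;;		# drop it and its comma
		*)   rest= ;;
		esac
		pat=${pat#"#"}			# ASSUMPTION: # marks a uid pattern
		case $pat in
		^*\$) ;;			# anchored at both ends
		*) return 1 ;;
		esac
	done
	return 0
}
```

Under this test '^wheel$,^staff$' passes, while '.*' and '^web' are flagged.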

The command line

The command-line for op has more modes than most tools:
op [-f file] [-g group] [-u login] mnemonic [args]
In this mode op looks up the mnemonic that matches the args given, then uses the context of the current process to authorize the escalation. When all goes well the op process becomes (or jackets) the requested process.

Any command-line login and group must match any rules for %u, !u, %u@g, !u@g, and %g, !g. Any command-line file must match all the %f, !f qualifiers.

op -l [login]
This mode requests the list of rules the current user might request. The superuser may specify a login to request another's list.
op -r [login]
op -w [login]
List what op might run as the result of the corresponding -l command. This sometimes helps Customers see which rule they want. Under the -w option also list why each rule is allowed. This is great for audits.
op -S [files]
This mode is only used to sanity check the addition of new files to the existing op rules, or when no files are provided to sanity check the existing policy.
op -Sn [files]
Under this switch op does not read the existing rule-base to check for integration with the current rules. If the file you are checking is a new version of an existing file you'll need this to remove errors about duplicate rules.
op -h
The standard on-line help all my tools provide.
op -H
This is an extended help text to allow rule writers a "quick reference" view of the configuration attributes. It is built from the same list the sanity checker uses to list unknown attribute settings.
op -V
The standard on-line version information, plus any compile-time option settings.

Strange addition to -f

One may specify a nonexistent file under -f, as long as the attributes only match:
path
To force the nonexistent file's location.
perms (always n---------),
type (always n),
To check for the file's nonexistence.
access (always ----)
A nonexistent file has no access available.
anything else
A match against any other %f or !f attribute rejects the attempt.

What I didn't add

I didn't add a bunch of checks that you'll never need. If op gets through the basic allow checks and the checks for the -f, -g, and -u options then any additional checks you build into a helmet are usually just for the over-cautious, or for a local policy no other site would use or understand.

Features that are not missing, just less obvious

Op eats its own dog food to allow administrators to access -l for other logins. Use this rule to allow anyone in groups "wheel" or "staff" to see anyone but root's allowed rule list:

op	/usr/local/bin/op $1 $2 ;
	groups=^wheel$,^staff$
	$1=^-[lrw]$
	uid=0 gid=.
Run this rule as:
op op -l ksb

More dog food: use op to let anyone in group zero run a sanity check as the superuser:

op	/usr/local/bin/op $1 $* ;
	groups=^0$
	$1=^-S$
	uid=0 gid=. initgroups=root
I don't match $2 because op does a fine job of that for me.

Summary

Op offers all the features you want in a privilege escalation structure with an easy to audit configuration. Simple rules are clear and isolated in definition and in scope, while complex rules are possible. Support for some less often required services (such as timeboxing access to rules) is out-sourced to helmet or jacket co-processes.

The op rule-base is broken into separate files to allow deployment of subsets of the rule-base to different hosts. Host specific access may be granted by selective deployment of rules, or by the netgroups facility.

Customer access may be granted by login name, uid, group membership, lack of group membership or any rule coded in a helmet. Resources are accessed by name, ownership, group ownership or any other stat(2) attribute, as well as by existence (or nonexistence).

These features together offer a complete privilege escalation structure without forcing the administrator to give away arbitrary superuser access.


$Id: op.html,v 2.58 2010/08/13 19:22:25 ksb Exp $