To understand this document

This document assumes you are quite familiar with the standard UNIX process model and the shell sh(1). In addition you should have a working understanding of the xapply(1) wrapper stack (for the examples).

What is `wrapw`?

The wrapw wrapper compresses many wrapper name spaces into a single name space. This compression enables the specification of an entire set of diversions as a single string. That compression is useful to pass a set of diversions to an escalated or remote process, which is the basis for the sshw program.

Only programs coded against the common divconnect.m mkcmd module (or the like) know how to accept the descriptor passing that wrapw employs to forward connections. Naive clients usually hang in read(2).

In client mode, the program recreates the original name space under a new directory, creates a set of environment variables to restore access to the control sockets for each wrapper, then proxies connections made in the proxy environment back to the original wrapped environment.

What is `wrapw` used for?

Primarily this program is used by sshw on both sides of the remote shell: once in master mode on the local side, and again on the remote side in proxy-client mode.

Additionally the program can create a snapshot of the wrapper environment at the present stack depth. The marker can be passed out-of-band to a descendant process (perhaps in a file, or environment variable) to allow the descendant access to an environment without interference from any diversions installed in the interim.

Since wrapw's diversion socket proxies multiple wrappers through a single name, it allows access to a whole wrapper-stack via a single path.

Let me show you

I'm going to assume that you've read some of ptbw's HTML document, or are a super genius and can figure it all out from a few examples. I'm also assuming you have no active diversions in your shell.

First we are going to start with a shell (I used ksh for these examples) that has a prompt we can recognize, and we're not going to fall into csh by mistake. I'm using an interactive shell here to avoid having to type in many small scripts. When the examples were all scripts it was really hard to follow in a single linear reading.

We start with a shell on a machine that allows us superuser access:

$ PS1='base$ ' SHELL=`which ksh` ksh -i
base$

Next I'm going to ask env to output the current diversion stack. Yeah, this is a trick, we look for any environment variables that have an underscore followed by a number, or a 'd' or an 'l' (for "link"); the '=' helps assure that the matched letters are part of the name:

base$ env | grep '_[0-9ld].*='
base$

If you have any output between the command and the prompt you should clean out the open diversions before you proceed.
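
The grep above is deliberately loose. A tighter filter can be sketched as below, assuming the only tags in use are a number, "link", "list", and "d" (the ones wrapw -V reports later in this document) and that wrapper names contain no underscore. The function name diversion_vars is made up for this example.

```sh
# Read NAME=value lines on stdin and keep only those that look like
# wrapper diversion variables: <wrapper>_<number|link|list|d>=...
# Assumption: wrapper names are plain alphanumerics with no underscore.
diversion_vars() {
    grep -E '^[A-Za-z][A-Za-z0-9]*_([0-9]+|link|list|d)='
}
```

Run it as `env | diversion_vars`; no output means no open diversions were found by this pattern.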

Let's build some diversion we can test with. Since ptbw is totally harmless we'll use that, and we'll go ahead and see what a wrapw diversion does to that environment:

base$ PS1='p1$ ' ptbw -m $SHELL
p1$ env | grep ptbw_
ptbw_1=/tmp/ptbd5QIgV7/sta0
ptbw_link=1
p1$ PS1='w1$ ' orig=$ptbw_1 wrapw -m $SHELL
w1$ env | grep ptbw_
ptbw_link=1
ptbw_1=/tmp/wrapwIepg39/ww0/0
w1$ env | grep wrapw_
wrapw_link=1
wrapw_1=/tmp/wrapwIepg39/ww0

What we see above is that we created a shell wrapped in a ptbw called "p1", we wrapped another shell in a wrapw called "w1", and that changed the name of the ptbw socket. It also created a diversion socket for wrapw.
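
A script can make the same observation by checking whether one diversion path sits underneath another. This is only a sketch of that string comparison; the helper name under_wrapw is hypothetical and nothing here talks to either socket.

```sh
# Succeed when path $1 lies below path $2 (a plain prefix test).
under_wrapw() {
    case "$1" in
        "$2"/*) return 0 ;;   # $1 starts with "$2/"
        *)      return 1 ;;
    esac
}
```

In the w1 shell, `under_wrapw "$ptbw_1" "$wrapw_1"` would succeed, while the same test against $orig would fail.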

With that in-place we can look at the filesystem for the diversion sockets and status output from the open diversions:

w1$ ls -als $orig $wrapw_1
0 srwxr-xr-x  1 ksb  wheel  0 Oct 18 10:06 /tmp/ptbd5QIgV7/sta0=
0 srwxr-xr-x  1 ksb  wheel  0 Oct 18 10:08 /tmp/wrapwIepg39/ww0=
w1$ ls $ptbw_1
ls: /tmp/wrapwIepg39/ww0/0: Not a directory
w1$ wrapw -V
wrapw: $Id: wrapw.m,v 1...
 ...
wrapw: environment tags: "link", "list", "d", "1"
wrapw:  1: /tmp/wrapwIepg39/ww0 [target]
w1$ ptbw -
ptbw: master has 8 tokens, max length 1 (total space 16)
 Master tokens from the internal function: iota
  Index  Lock    Tokens (request in groups of 1)
     0  0       0
 ...
     7  0       7
w1$

That shows us that even though the path in $ptbw_1 does not exist in the filesystem, ptbw can still show us the tableau. That is not really a surprise, because it is exactly what wrapw is supposed to allow.

We can also access the same diversion through the original socket, since we have not changed to a context where the socket is really unavailable (and we kept the name in orig):

w1$ ptbw -t $orig -R4 -A echo
0 1 2 3

If we wanted to store the current diversion stack state in a file we could create one:

w1$ wrapw -R one.ds
w1$ ls -als one.ds
2 -rw-r--r--  1 ksb  source  113 Oct 18 10:27 one.ds
w1$ strings one.ds
/tmp/wrapwIepg39/ww0
wrapw_link=1
wrapw_1=/tmp/wrapwIepg39/ww0
ptbw_1=/tmp/wrapwIepg39/ww0/0
ptbw_link=1
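
To see which wrappers a snapshot covers, the strings output can be filtered for the "_link" entries. This sketch assumes nothing about the file beyond what the strings output above shows; the function name snapshot_wrappers is invented for the example.

```sh
# Read strings(1) output on stdin and print the wrapper name from
# every <wrapper>_link=... line, in the order found.
snapshot_wrappers() {
    sed -n 's/^\([A-Za-z][A-Za-z0-9]*\)_link=.*$/\1/p'
}
```

Use it as `strings one.ds | snapshot_wrappers`.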

To use that, open another window as the same login and move to the same directory. From that shell, start a client instance of wrapw to clone the environment from the other instance:

$ PS1='w2$ ' wrapw -t one.ds ksh -i
w2$ ptbw -V
ptbw: $Id: ptbw.m,v 1...
 ...
ptbw:  1: /tmp/wrapwIepg39/ww0/0 [target]
w2$ wrapw -V
wrapw: $Id: wrapw.m,v 1...
 ...
wrapw:  1: /tmp/wrapwIepg39/ww0 [target]
w2$

That means any xapply processes started from those windows would share the instance of ptbw for any token allocations. It also means that other processes that read one.ds could work in that same environment.

More to the point, that container could contain any set of diversions, including additional wrapw instances with whole diversion stacks open in them. The number of addressable diversions is not limitless, but it is more than the 64000-odd available TCP/IP ports on any given UNIX host. And individual logins can manage their own pools without denial of service to any other process. Consider my process binding to port 46600: unlike a diversion under $TMPDIR, I block the similar use of that port by any other process.
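
The contrast with a fixed port can be sketched with nothing but mktemp(1): every caller gets a private directory of its own to hold sockets, so two logins never compete for one name. The function name private_pool_dir and the "demo" prefix are made up for this illustration; the wrappers pick their own directory names.

```sh
# Create a fresh directory for this caller's sockets and print its path.
# mktemp -d picks a unique name and creates the directory mode 0700.
private_pool_dir() {
    mktemp -d "${TMPDIR:-/tmp}/demoXXXXXX"
}
```

Two calls return two different directories, which is the property a single well-known port number cannot offer.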

We don't have to put all the diversion indexes into the environment at the same time: we can "load" a diversion set into the environment for a process, which will unwind when that process exits. Usually the open diversions on a machine are spread out across many processes, not formed as a single large cluster.

Using the filesystem to hold state saves a lot of space in the environment, which gets copied for every process fork, but nowadays RAM is cheap.

Let's end the second window and restart it with a merged diversion stack. I'm going to create a ptbw instance and merge the one from window 1 in with it (via -I):

w2$ exit
$ PS1='w3$ ' ptbw -m -R2 -J2 wrapw -It one.ds ksh -i
w3$ ptbw -V
ptbw: $Id: ptbw.m,v 1...
 ...
ptbw:  2: /tmp/wrapwIepg39/ww0/0 [target]
ptbw:  1: /tmp/ptbdF8pGFn/sta0
w3$

This environment has access to a local diversion with 4 tokens, and a common one with 8 tokens:

w3$ ptbw -1 -
ptbw: master has 4 tokens, max length 1 (total space 8)
 ...
     3  0       3
w3$ ptbw -
ptbw: master has 8 tokens, max length 1 (total space 16)
 ...
     7  0       7
w3$

From there we can go "the other way" and remove the diversion stack by giving wrapw an empty list:

w3$ exec wrapw -t /dev/null $SHELL
w3$ ptbw -
ptbw: no enclosing diversion
w3$

But the process is still above us in the process tree, and if we dig with our mad UNIX skills we can still find the socket:

w3$ ptree $$
54961 sshd: ksb@ttyr9
  54963 -ksh
    56793 ptbw -m -R2 -J2 wrapw -It one.ds ksh -i
      56794 /bin/ksh
	57039 ptree 56794
w3$ lsof -p 56793 | grep unix
ptbw    56793  ksb    3u  unix 0xcda84b60      0t0        /tmp/ptbdF8pGFn/sta0
w3$ ptbw -t /tmp/ptbdF8pGFn/sta0 -A echo
0
w3$ exit

This is not a good data-flow for applications, but it does help debug missing links. Knowing the path you are missing really helps trace where it got lost.
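
When a diversion path fails, it helps to know how much of it still exists. The sketch below walks up a path until it finds something present in the filesystem; for a proxied name such as /tmp/wrapwIepg39/ww0/0 it would stop at the wrapw socket itself. The helper name nearest_existing is hypothetical.

```sh
# Print the deepest part of path $1 that exists in the filesystem.
nearest_existing() {
    p=$1
    # Strip trailing components until one exists (or we reach / or .).
    while [ -n "$p" ] && [ "$p" != "/" ] && [ ! -e "$p" ]; do
        p=$(dirname "$p")
    done
    printf '%s\n' "$p"
}
```

If the answer is only /tmp, the whole diversion directory is gone; if it is the socket one level up, the wrapw instance is still there.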

Exit the test shells in the first window:

w1$ exit
p1$ exit
base$

Using a fixed-path persistent instance

Here is an example I set up on my workstation to use a fixed-path instance of wrapw to allow access from a jail (or chroot) out to the larger scope.

I started with an instance of ptbw with some random numbers as the tableau:

base$ jot -r 11 100 200 | oue >/tmp/foo.cl
base$ PS1='me$ ' ptbw -m -t /tmp/foo.cl $SHELL
me$

I made a new root (a new "slash") in /home/slash, made the top-level directories, and put in them the content we need to run a shell:

me$ su -
 ...
# umask 022
# cd /home
# mkdir slash
# cd slash
# mkdir bin etc lib libexec tmp usr var usr/bin usr/local usr/local/bin
# cp -p /bin/* bin/
# cp -p /etc/passwd /etc/group /etc/master.passwd etc/
# cp -p /lib/*.so* lib/
# cp -p /libexec/*.so* libexec/
# chmod 1777 tmp
# cp /usr/local/bin/wrapw /usr/local/bin/ptbw usr/local/bin/
# cp /usr/local/bin/xclate /usr/local/bin/xapply usr/local/bin/

That makes enough of a support structure to run a chroot'd shell under FreeBSD UNIX, for example. Now let's start a gateway instance of wrapw that presents itself as /tmp/out in the chroot:

# wrapw -m -N /home/slash/tmp/out : &
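
In a script, as opposed to this interactive session, the background wrapw may not have created its socket by the time the next command runs. A minimal polling sketch, assuming only that the gateway shows up as a filesystem entry at the path given to -N; wait_for_path is a hypothetical helper, not part of wrapw.

```sh
# Wait for path $1 to appear, checking once a second for up to $2 tries
# (default 10).  Succeeds as soon as the path exists, fails on timeout.
wait_for_path() {
    path=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # -e accepts any file type; use -S to insist on a socket
        [ -e "$path" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

For the example here that would be `wait_for_path /home/slash/tmp/out` before the chroot.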

To run a shell in there we use chroot, then pull the wrapper environment from the diversion socket and test with ptbw:

# PS1='in# ' chroot /home/slash /bin/sh -i
in# exec wrapw -t /tmp/out $SHELL
in# ptbw -A -R3 echo
109 108 136

That worked fine, so we can shut it all down. We'll use the quit option to tell the external wrapw that we are done with our mission:

in# wrapw -Q -t /tmp/out echo Fini
Fini
in# exit
[1] + Done                 wrapw -m -N /home/slash/tmp/out :
# : remove chroot if you are done testing
# exit
me$ exit

In a real application, the exit of the external instance of wrapw would trigger the next iteration of a loop, or the successful exit of that process back to another level of automation, while the cleanup inside the chroot, VM, or jail might continue for a while after that notification. Breaking the synchronous bonds of processing makes the throughput of the whole much better.
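
The shape of such a loop can be sketched in plain sh. The gateway command is passed in as arguments so that any command can stand in for the wrapw line used above; run_rounds is a made-up name, and a real driver would do its own work between rounds.

```sh
# Run the given command in the background $1 times, one after another,
# moving on to the next round as soon as the previous one exits.
run_rounds() {
    rounds=$1; shift
    i=1
    while [ "$i" -le "$rounds" ]; do
        "$@" &            # stand-in for the gateway instance
        gw=$!
        wait "$gw"        # returns when the gateway is told to quit
        echo "round $i: gateway exited with status $?"
        i=$((i + 1))
    done
}
```

Nothing in the loop waits for the cleanup inside the chroot; only the exit of the gateway process holds up the next round.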

Any required results would have been sent either to stdout or to another diversion via a client access.

How is `wrapw` useful?

Scripting very large tasks (like `big data' searches with Hadoop) is largely done today with an opaque job controller that schedules tasks internally. I use a shell-driven engine that balances resources with my wrappers for much better performance. This structure stops all processing when it finds a result with the right weight (or answer, combination) by the most effective method: kill the processes.

Since I'm going to tear down the entire structure I don't care about any `cleanup' operations. I can grow a new machine, data pool, and manager faster than I can assure the old one is in a known-good state. Think about the working data store as being /tmp, not as being something you have to backup and handle with care.

Wrappers push the idea of temporary space into the abstraction of data-flow and processing a lot harder than a `script' alone. When you design a task think about what you could do with many of them in parallel, and what resources they would have to share: can you split the work with xapply, and manage the output with xclate, or share common locks with ptbw? Could you build those resources, group them under a wrapw and give access to the whole `cluster' with a single socket?

Could you project that resource pool out to a cluster of machines to spread the work across them all? Now you are getting someplace. If you can, then think about how many clusters you could light in parallel. Can you break that into parallel clusters as well? How many levels can you decompose, and which ones split the work the best? At what level do you drop from `control logic' into `application logic', that is from shell and wrappers to perl, Python, Java, or compiled code?

Conclusion

There really isn't an end to the ideas here. I could explain with many more examples how useful wrapper technology is, but either you get it or you don't. Building vast engines to solve complex configuration management or "big data" problems is a skill you refine over time. It is a whole new way to think about what is possible with the large compute resources we have in a modern data center (or at home for that matter).

Rather than depending on Java or C++ code we can use these tactics to allow the shell to scale out to drive very large clusters. That allows VM, Hadoop, or other data clusters to be purpose-built, used, then recycled moments after that task is done (like a process). By allowing thousands of co-operating services to run on a single OS instance without using any network-visible ports we can hide the implementation of internal services in a model that is secure by nature, and scales by design.

$Id: wrapw.html,v 1.6 2012/03/21 16:15:04 ksb Exp $