		Technical implementation details
	for the Zero-Install kernel-userspace interface

	
Overview

The zero-install system is a high-performance caching HTTP based read-only
network filesystem. Programs and resources on remote machines can be read via
standard Unix paths, such as /uri/0install/zero-install.sourceforge.net/demo.
Because speed is of utmost importance, all resources accessed are cached on
the local machine.

Having fetched a resource, the system will always satisfy requests directly
from the cache without even checking that they are up-to-date. Users must
explicitly request a refresh. Software can also force a refresh if it knows
that the cache is out-of-date (for example, if a program requires libfoo-1.2.3,
but the cache says that 1.2.1 is the latest version).

The cached downloads may be kept anywhere. On a traditional Linux filesystem,
/var/cache/zero-install is the obvious choice. The network filesystem itself
must also be mounted somewhere. '/uri' must be used for this, since software
must be able to rely on absolute paths to identify linked resources. For
example, a python program might start with:
	
	#!/uri/0install/www.python.org/python-latest
	print "Hello world from python!"


LazyFS

LazyFS implements the kernel side of things. A lazyfs filesystem should
be mounted on /uri. The path of the cache directory must be passed as
the mount option:

	mount lazyfs -t lazyfs0d1d22 /uri/0install -o /var/cache/zero-inst

The equivalent line for /etc/fstab is:

	lazyfs	/uri/0install	lazyfs0d1d22	/var/cache/zero-inst	0 0


Because the system must show files even before they have been downloaded, the
directory listing for a directory in /uri/0install is fetched from a file
called '...' in the corresponding cache directory. If this doesn't exist yet,
LazyFS will send a request to the user-space helper to create it. If no
helper is running, an IO error is reported instead.

The '...' files begin with the magic string 'LazyFS\n', then continue with a
list of lines in the form:

d 100 1234 name\0
x 5 12 name2\0
...

The first character is 'd', 'x', 'f', or 'l' (for directory, executable,
file, or symlink). Then space, then the size, another space, the mtime,
another space, the name and the terminator. Symlinks are then followed by
the link target, terminated by '\0'. Note that entries end in '\0', not '\n'
(since filenames can contain newlines).

Because it wouldn't be safe to change a '...' file while in use (the kernel
might try to read it half way though) you must create a new file and then
rename it to '...'. Otherwise, the kernel won't even notice the change.
(note that the kernel should still handle corrupted files safely)

Having got the directory listing, lazyfs will allow applications to browse
around the /uri/0install system, reading more '...' files as it goes. When a
file or directory is first opened by an application, lazyfs opens the
corresponding file (or '...' file for directories) in the cache.

Because lazyfs has a listing for each directory, it can automatically
respond to requests for files which don't exist. However, if the listing
indicates that the file does exist on the remote machine, but it is not
present locally, lazyfs will ask a user-space helper application to
fetch the missing files.


The user-space helper

LazyFS can operate without any userspace helper. It creates the virtual
directory structure from the '...' files. When a virtual file is opened, it
opens the corresponding host file and proxys to that.

If we need to access a host file or directory which doesn't exist, we
need a helper application. If no helper is registered, we return EIO
(I/O error).

LazyFS creates a symlink, '/uri/0install/.lazyfs-cache', to the actual cache
directory, so that user-space programs can find it easily.

There can only be one registered helper at a time. It registers by opening
the /uri/0install/.lazyfs-helper pipe and reading requests from it. When a
process opens a file or directory which has a missing host inode, it is put to
sleep and a request sent to the helper in the form of a file handle.
The pipe is given the same owner as the cache directory, so there is no
need for the helper to run as root.

The request is a number (ASCII) followed by " uid=%d\n". You have to read the
whole line in a single read() operation (and you'll only get one per
read). The number is a file handle which lazyfs has opened for you.
The UID is the id of the process requesting the file. Reading from the file
will give you the pathname of the file to create relative to the mount point
(ie, / means /uri/0install). Again, you must read the whole path in one
operation.

When the handle is closed, the requesting process wakes up (hopefully to
find that the missing file has appeared). If more processes request the
file, they will also be put to sleep until the request is handled. New
requests will be sent for each different user accessing the same file. This is
so that the helper can provide information to each user about what requests are
currently in progress for that user, and allows requests to be cancelled on
a per-user basis.

If the helper closes the /uri/0install/.lazyfs-helper pipe then all pending
requests (those not yet delivered to the helper) return EIO errors. The helper
can be safely restarted without having to remount the filesystem.

The user-space helper may handle any number of requests in parallel, and
can pass the file handles to subprocesses (remember to close them in the
parent, though!). It may also choose to limit the number of fetches.

As well as normal directories, the helper can create a 'dynamic' directory
by creating a '...' file containing only 'LazyFS Dynamic\n'. In this case,
any lookup performed in the directory will create a subdirectory with that
name and ask the helper to fill it in. On error, it is removed again. The
caller sleeps, but other applications can see the temporary directory,
although they'll block if they try to open it.


Kernel implementation details

See the HACKING file in the lazyfs-linux module for details of the
implementation.


Notes on possible kernel DoS attacks:
 
When a user opens a non-cached file, we create a request object.
The request is not freed until the file is fetched (or the fetch aborted),
but the user can close the file and open an unlimited number of uncached
files, creating a new request for each without being limited by the number
of open files limit. There should be some form of resource limiting to stop
users having too many open requests at once.

Also, we assign a new inode number (creating inode and dentry structs) to
each virtual inode. These can never be freed (unless we actually discover
that the remote object has been deleted) because we'd get a different inode
number when we next looked it up. There should be some way to prune the
tree if we run low on memory, even if that means the inode numbers change.
