Seer Trace Collection

Overview

As part of the Seer research project, I collected a large body of traces of user behavior on mobile machines. The traces consist of about half a gigabyte of compressed (gzipped) trace files that record all file activity, except reads and writes, by nine different users who were working both connected and disconnected on Linux-based laptop computers. The trace lengths range from about 1 to about 6 months.

This Web page provides access to the individual traces, and describes the trace format. All traces have been "sanitized" by removing user names and private pathnames, but most pathnames have been left untouched (it turns out to be important for many research purposes to know what program a user was running, or to have an idea of the type of file being accessed).

Traces

The traces themselves are currently available in a gzipped binary format. The binary format is not yet fully documented (I'm working on it, but it's a slow process).

In the meantime, Linux/i386 users can download a program called dumpobs that will read the compressed binary format and produce ASCII output. This output can then be processed to your heart's content. (If the version above doesn't work, try this alternate version. The alternate is known to be required on RedHat 7.1; I don't know if it runs on other systems.)

Using Dumpobs

Dumpobs is a normal Unix program. That means that it can read a file from standard input or accept multiple filenames on the command line. Files provided on stdin must be uncompressed, but if a file specified on the command line ends with ".gz" then it will be uncompressed automatically.

Two switches are accepted. The -n switch causes the name of the file being dumped to be prepended to each output line, followed by a colon in the manner of grep. This can be useful if you are searching for a particular record in a lot of files. The -o n switch specifies a starting value other than zero for the record offset (see below).
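
As an illustration of handling -n output, here is a minimal sketch in C, not part of the Seer distribution. The function name split_name_prefix is hypothetical, and the code assumes that the file name itself contains no colon and that the line really was produced with -n (ordinary output lines can contain colons of their own, for example in the time of day).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a line produced by "dumpobs -n" into the file name and the record.
 * ASSUMPTION: the line looks like "name:record" and the name has no colon.
 * Copies the name into name_buf (truncating if needed) and returns a pointer
 * to the start of the record, or NULL if the line contains no colon.
 */
const char *split_name_prefix(const char *line, char *name_buf, size_t buf_len)
{
    const char *colon;
    size_t name_len;

    assert(line != NULL && name_buf != NULL);
    colon = strchr(line, ':');
    if (colon == NULL || buf_len == 0)
        return NULL;

    name_len = (size_t)(colon - line);
    if (name_len >= buf_len)
        name_len = buf_len - 1;
    memcpy(name_buf, line, name_len);
    name_buf[name_len] = '\0';

    /* Skip any blanks after the colon, in case some are present. */
    colon++;
    while (*colon == ' ' || *colon == '\t')
        colon++;
    return colon;
}
```

The returned pointer can be handed to whatever code processes an unprefixed output line.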

ASCII Output Format

The ASCII output of dumpobs contains three types of lines. All three types begin with a decimal number that represents the offset of the record from the start of the input file.

The record type is identified by the second field, which will be one of RESTART, TIMESTAMP, or UID.
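
A program reading this output will typically start by dispatching on that second field. The following is a minimal sketch in C; the enum and function names are hypothetical, and the code assumes the line carries no -n file-name prefix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { REC_UNKNOWN, REC_RESTART, REC_TIMESTAMP, REC_UID } record_kind_t;

/*
 * Classify one line of dumpobs output by its second whitespace-separated
 * field.  ASSUMPTION: the line has no "-n" file-name prefix.
 */
record_kind_t classify_record(const char *line)
{
    long offset;      /* parsed only to confirm the line starts with a number */
    char kind[16];

    assert(line != NULL);
    if (sscanf(line, "%ld %15s", &offset, kind) != 2)
        return REC_UNKNOWN;
    if (strcmp(kind, "RESTART") == 0)
        return REC_RESTART;
    if (strcmp(kind, "TIMESTAMP") == 0)
        return REC_TIMESTAMP;
    if (strcmp(kind, "UID") == 0)
        return REC_UID;
    return REC_UNKNOWN;
}
```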

The RESTART record occurs whenever the Seer system was restarted, usually due to a reboot of the machine being traced. It is formatted as follows:

offset RESTART (type) at timestamp ascii-time

where

offset
is the record offset discussed above.
type
is either scratch or reexec (in practice, the latter never appears in the trace files).
timestamp
is the UNIX timestamp (seconds since midnight UTC, January 1st, 1970) expressed as a decimal number.
ascii-time
is the human-readable conversion of the timestamp in your own timezone, in standard ctime format. If you want the timestamp expressed in the timezone used to collect the traces, run dumpobs with the TZ environment variable set to PST8PDT.

The TIMESTAMP record is generally produced once every ten minutes while the traced computer is up and running. It also (usually) appears at the beginning of every trace file. It is formatted as follows:

offset TIMESTAMP timestamp ascii-time

where offset, timestamp, and ascii-time are all interpreted as for the RESTART record above.
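
The two bookkeeping records can be picked apart with sscanf. This is a sketch under the layouts given above, with hypothetical type and function names; it ignores the ascii-time field, since the numeric timestamp carries the same information.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    long offset;      /* record offset within the input */
    long timestamp;   /* seconds since the UNIX epoch */
    char type[16];    /* "scratch" or "reexec" for RESTART, "" for TIMESTAMP */
} time_record_t;

/* Parse a RESTART line; returns 1 on success, 0 if the line does not match. */
int parse_restart(const char *line, time_record_t *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%ld RESTART (%15[^)]) at %ld",
                  &out->offset, out->type, &out->timestamp) == 3;
}

/* Parse a TIMESTAMP line; returns 1 on success, 0 if the line does not match. */
int parse_timestamp(const char *line, time_record_t *out)
{
    assert(line != NULL && out != NULL);
    out->type[0] = '\0';
    return sscanf(line, "%ld TIMESTAMP %ld",
                  &out->offset, &out->timestamp) == 2;
}
```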

The final record type is the actual trace record, identified by a second field equal to UID. It is formatted as follows:

offset UID uid PID pid program flag timestamp.microseconds call(args) = result (error)

where

offset
is the record offset discussed above.
uid
is the numeric user ID of the process traced. For UIDs other than that of the user who owned the machine, only exit, rename, unlink, and rmdir were traced.
pid
is the process ID traced. Note that UNIX reuses process IDs, so this is not a globally unique identifier.
program
is the name of the program generating the trace record, or two question marks (??) if dumpobs was unable to determine the program name.
flag
is one of B, A, S, or G. B and A indicate that the trace was taken before or after actual execution of the system call, respectively. Most traces are taken after the call completes, but exec calls are traced both before and after the call. S indicates that the call was traced through certain special hooks in the kernel, and applies only to fork and exit calls; this is of limited value to most users. Finally, G indicates that the system call is a "fake" one generated internally by the Seer system. This is used to generate chdir indications that will correctly reflect the working directory of a process when it first appears in the trace.
timestamp
is the UNIX timestamp (seconds since midnight UTC, January 1st, 1970) expressed as a decimal number.
microseconds
is the microsecond portion of the timestamp above.
call
is the name of the UNIX system call issued.
args
is a list of the important arguments to the system call. String arguments (pathnames) are enclosed in double quotes; at present there is no encoding of pathnames containing quotes, so that a potential ambiguity exists (but to the best of my knowledge, none of the traces contain the ambiguous strings). Most numeric arguments are expressed in decimal, but the flags to open are given symbolically. Note that sometimes arguments are abbreviated, truncated, or "faked"; see below for more information.
result
is the return value provided to the calling process. Usually this is 0 (success) or -1 (failure), but sometimes it is a process ID or file descriptor. As with the arguments, this is occasionally faked to make it simpler to process the output; see below for more information.
error
and the associated parentheses are shown only if (a) the system call was traced after it executed and (b) the call failed. In this case, this field gives the numeric error code (taken from errno.h). Note that in some of the early trace files, failed system calls were not recorded, so this information is not always available.
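
The fields above can be extracted along the following lines. This is a minimal sketch in C with hypothetical names, not a reference parser. It assumes that program names contain no whitespace, that result is printed in decimal, and that the last occurrence of ") = " on the line separates the arguments from the result; the quoting ambiguity mentioned under args is not resolved.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    long offset;          /* record offset */
    int  uid;
    int  pid;
    char program[64];     /* "??" when dumpobs could not determine it */
    char flag;            /* 'B', 'A', 'S', or 'G' */
    long seconds;         /* UNIX timestamp */
    long microseconds;
    char call[32];        /* system call name */
    char args[512];       /* raw text between the outer parentheses */
    long result;
    int  has_error;       /* nonzero if an "(error)" field was present */
    int  error;           /* errno value, valid only if has_error */
} trace_record_t;

/* Copy at most dst_len - 1 bytes of src[0..n) into dst and terminate it. */
static void copy_span(char *dst, size_t dst_len, const char *src, size_t n)
{
    if (n >= dst_len)
        n = dst_len - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Parse one UID line; returns 1 on success, 0 if the line does not match. */
int parse_uid_record(const char *line, trace_record_t *out)
{
    int consumed = 0, fields;
    const char *rest, *lparen, *rparen = NULL, *p;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "%ld UID %d PID %d %63s %c %ld.%ld %n",
               &out->offset, &out->uid, &out->pid, out->program, &out->flag,
               &out->seconds, &out->microseconds, &consumed) != 7)
        return 0;
    rest = line + consumed;           /* "call(args) = result (error)" */

    lparen = strchr(rest, '(');
    /* ASSUMPTION: the last ") = " ends the argument list. */
    for (p = strstr(rest, ") = "); p != NULL; p = strstr(p + 1, ") = "))
        rparen = p;
    if (lparen == NULL || rparen == NULL || rparen < lparen)
        return 0;

    copy_span(out->call, sizeof out->call, rest, (size_t)(lparen - rest));
    copy_span(out->args, sizeof out->args, lparen + 1,
              (size_t)(rparen - lparen - 1));

    fields = sscanf(rparen + 4, "%ld (%d)", &out->result, &out->error);
    if (fields < 1)
        return 0;
    out->has_error = (fields == 2);
    return 1;
}
```

The args field is kept as raw text because its contents depend on the system call.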

As mentioned above, some system calls are traced with only limited argument information, and others are traced with "fake" arguments or return values. These include:

open
Only two arguments are shown, and only the O_CREAT, O_TRUNC, and O_EXCL flags are displayed in addition to the open mode. The full flags are available in the binary trace file, however.
creat
Only the pathname created is shown.
execv/execve
Only the pathname executed is shown, and only in the trace captured before the call was made. The program interpreting the output must record this pathname and correlate it with the subsequent "after" trace record for that PID (which may not follow immediately). It is only valid to assume that this pathname was accessed if the "after" trace record shows success.
mknod
Only the pathname created is shown.
stat/lstat/oldstat/oldlstat
Only the pathname tested is shown. The results of the stat (e.g., file size and times) are not available even in the binary trace file.
utime/utimes
Only the pathname updated is shown.
access
Only the pathname tested is shown.
readlink
Only the pathname of the link being read is shown. The value returned to the program is not available even in the binary trace file.
fcntl
The second argument is shown symbolically if possible; otherwise it is given in decimal. For F_DUPFD, F_GETFD, F_SETFD, F_GETOWN, and F_SETOWN, the file descriptor involved is shown in decimal as the third argument. For other fcntl calls, the third argument is given in hex.
truncate
Only the pathname truncated is shown.

Except as noted above, the omitted arguments are missing from the binary trace file as well as the dumpobs output.
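
The correlation required for execv/execve can be sketched as a small table keyed by PID. The code below is an illustration with hypothetical names and an arbitrary fixed table size. The caller decides whether the "after" record shows success (for example, by the absence of an error field); that test is an assumption, since the exact form of a successful "after" record is not spelled out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_PENDING  64     /* arbitrary limit on outstanding exec calls */
#define MAX_PATH_LEN 256

/* One remembered "before" exec record; pid == 0 marks a free slot. */
typedef struct {
    int  pid;
    char path[MAX_PATH_LEN];
} pending_exec_t;

static pending_exec_t pending[MAX_PENDING];

/* Remember the pathname from a "B" record.  Returns 0 if the table is full. */
int exec_before(int pid, const char *path)
{
    int i, slot = -1;

    assert(pid > 0 && path != NULL);
    for (i = 0; i < MAX_PENDING; i++) {
        if (pending[i].pid == pid) {    /* overwrite a stale entry */
            slot = i;
            break;
        }
        if (pending[i].pid == 0 && slot < 0)
            slot = i;                   /* first free slot seen so far */
    }
    if (slot < 0)
        return 0;
    pending[slot].pid = pid;
    snprintf(pending[slot].path, sizeof pending[slot].path, "%s", path);
    return 1;
}

/*
 * Handle the matching "A" record.  Copies the remembered pathname into out
 * and returns 1 only if the exec succeeded; otherwise returns 0.  The
 * pending entry is released in either case.
 */
int exec_after(int pid, int succeeded, char *out, size_t out_len)
{
    int i;

    assert(out != NULL);
    for (i = 0; i < MAX_PENDING; i++) {
        if (pending[i].pid != pid)
            continue;
        pending[i].pid = 0;
        if (!succeeded || out_len == 0)
            return 0;
        snprintf(out, out_len, "%s", pending[i].path);
        return 1;
    }
    return 0;   /* no "before" record was seen for this PID */
}
```

Keying on the PID alone is enough here because a process has at most one exec call outstanding at a time.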

Binary Trace Format

The traces are stored in a gzipped binary format. Each trace record is described by the following "C" structure:

typedef struct {
	unsigned char callcode;	/* System call code */
	unsigned char errcode;	/* Error return (errno) from system call */
	unsigned short flags;	/* Flags associated with entry */
	short argsize;		/* Size of arguments, ints */
	uid_t uid;		/* UID of process */
	pid_t pid;		/* PID being traced */
	int retval;		/* Return value from call, if success */
	struct timeval calltime; /* Time call was recorded */
	int args[1];		/* Argument[s] to system call */
} seerbuf_t;

With certain exceptions, each record represents a system call by one user. The records appear in the order they were seen by the Linux kernel.
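
The one-element args array together with argsize suggests the usual C idiom for a variable-length record: a fixed header followed by argsize integers. The sketch below computes a record's size on that assumption. It is only an illustration: this page does not state how records are packed in the file, and the sizes of uid_t, pid_t, and struct timeval on the system compiling this code may differ from those on the Linux/i386 machines that produced the traces. The function name seer_record_size is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>

typedef struct {
    unsigned char callcode;   /* System call code */
    unsigned char errcode;    /* Error return (errno) from system call */
    unsigned short flags;     /* Flags associated with entry */
    short argsize;            /* Size of arguments, ints */
    uid_t uid;                /* UID of process */
    pid_t pid;                /* PID being traced */
    int retval;               /* Return value from call, if success */
    struct timeval calltime;  /* Time call was recorded */
    int args[1];              /* Argument[s] to system call */
} seerbuf_t;

/*
 * Size in bytes of a record carrying `argsize` argument words, using the
 * layout chosen by the compiler building this code.
 * ASSUMPTION: a record is the fixed header followed by argsize ints.
 */
size_t seer_record_size(short argsize)
{
    assert(argsize >= 0);
    return offsetof(seerbuf_t, args) + (size_t)argsize * sizeof(int);
}
```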

The fields in the structure have the following meaning:

callcode
is a numerical indication of the system call issued (see below).
errcode
is the error code if the call failed, or zero if it succeeded (but see the flags field below).

This page maintained by Geoff Kuenning.