Workload Model Language

For a tutorial on WML read Writing Workload Models.

FileBench uses a synthetic application model description, which can be used to derive analytic models and to reconstruct the footprint of the application. This allows prediction and characterization at greatly reduced time and cost.

The language allows accurate representation of application workloads and thus facilitates prediction, modeling and measurement of system performance.

Synthetic benchmarking is achieved by using the model described in the language to generate load on a test system, much as if the real application were running on it. The performance of the system can be measured during load generation.

For benchmarking, the model is realized by dynamically recreating the correct number of processes (or threads), memory footprint and I/O, together with all the inter-process synchronization seen in the real application.

The language could also be used to drive an analytical model. This allows forward prediction of system performance changes when sub-components of the system are changed.

The workload is described as a series of processes, threads and flows. Each process represents an address space, which contains one or more threads. Each thread represents a flow of sequential execution of a closed queue of flow operations. Each flow operation is a pre-defined system operation, such as a file read or write.

  model -> process -> thread -> {flowop, flowop, flowop...}
                   -> thread -> ...
        -> process -> ...

For example, a random I/O workload in the f language could be represented by the following:

  define process randomizer
  define thread random-thread procname=randomizer
  {
        flowop random-read type=read,filename=bigfile,
                                   random,iosize=2k
        flowop random-write type=write,filename=bigfile,
                                   random,iosize=2k
  }

We also introduce synchronization between flows, to facilitate replication of inter-flow constraints as seen in real applications. For example, a database workload may consist of two critical processes that depend on each other. In this case the flow operation loops of the two processes are linked:

  {read, read, read, block on other flowop} <--+
                                               |
  {write, write, write, wakeup other flowop}  -+

A simple database model consisting of three processes could be represented by:

  define process logwr
  define process dbwr instances=1
  define process shadow instances=$shadows
  
  define thread logwr procname=logwr,memsize=10m
  { 
    flowop log-write type=write,filename=log,
        iosize=1m,workingset=1m,random,dsync
    flowop log-block type=semblock,value=40
  }
  
  define thread dbwr procname=dbwr,memsize=10m
  { 
    flowop dbwr-write type=write,filename=datafile,
        iosize=1m,workingset=1m,random,dsync
    flowop dbwr-block type=semblock,value=10,highwater=1000
  }
  
  define thread shadow procname=shadow,memsize=10m
  { 
    flowop shadowread-a type=read,filename=datafile,
        iosize=2k,workingset=10m,random,dsync
    flowop shadow-post-log type=sempost,value=1,
                                  target=log-block
  }

Workload Model Language Example

(Subject to lots of change at the moment)

#!/usr/benchmarks/filebench/go_filebench -f

debug 1

define file name=bigfile1,path=$datadir1/myrandfile,size=50g,prealloc,reuse,paralloc

define process name=rand-read,instances=1
{
  thread name=rand-thread,memsize=10m,instances=100
  {
    flowop read name=rand-read1,filename=bigfile1,iosize=$iosize,random,directio
    flowop eventlimit name=rand-rate
  }
}

Usage:
filebench: interpret f script and generate file workload
Options:
   [-h] Display verbose help
   [-p] Disable opening /proc to set uacct to enable truss

Workload Model 'f' language definition:

(Subject to change as new features are added)

Variables:

set $var = value
    $var   - regular variables
    ${var} - internal special variables
    $(var) - environment variables
define randvar name = $random_var
                       [, type=<uniform | gamma>]
                       [, seed=<value>]
                       [, gamma=<value>]
                       [, mean=<value>]
                       [, min=<value>]
                       [, round=<value>]
                       [, randsrc=<urandom | rand48>]
                       [[, type=tabular], randtable = {{<value>, <value>, <value>}, {...}, ... }]
set $random_var.<type=<uniform | gamma>|seed=<value>|gamma=<value>|mean=<value>|min=<value>|round=<value>|randsrc=<urandom | rand48>>

Files and Filesets:

define file name=<file-name>,path=<pathname>,size=<size>
                        [,paralloc]
                        [,prealloc]
                        [,reuse]

define fileset name=<fileset-name>,path=<pathname>,entries=<number>,size=<size>
                        [,dirwidth=<width>]
                        [,dirdepthrv=<$random_var>]
                        [,dirgamma=[100-10000]] (Gamma * 1000)
                        [,sizegamma=[100-10000]] (Gamma * 1000)
                        [,prealloc[=percent]]

Processes and Threads:

define process name=<name>[,instances=<count>]
{
  thread ...
  thread ...
  thread ...
}

  thread  name=<name>[,instances=<count>]

  {
    flowop ...
    flowop ...
    flowop ...
  }

Flowops:

flowop [aiowrite|write|read] name=<name>, 
                        filename|filesetname=<file-name|fileset-name>,
                        iosize=<size>
                        [,fd=<number>]
                        [,directio]
                        [,dsync]
                        [,opennext]
                        [,iters=<count>]
                        [,random]
                        [,workingset=<size>]
                        [,indexed=<file-index>]

flowop [appendfile|appendfilerand] name=<name>,
                        filename|filesetname=<file-name|fileset-name>
                        [,fd=<number>]
                        [,directio]
                        [,dsync]
                        [,opennext]
                        [,iters=<count>]
                        [,workingset=<size>]
                        [,indexed=<file-index>]

flowop [writewholefile|readwholefile] name=<name>, 
                        filename|filesetname=<file-name|fileset-name>
                        [,fd=<number>]
                        [,directio]
                        [,dsync]
                        [,opennext]
                        [,iters=<count>]
                        [,workingset=<size>]
                        [,indexed=<file-index>]

flowop aiowait name=<name>,target=<aiowrite-flowop>

flowop createfile name=<name>, filesetname=<fileset_name>
                        [,fd=<number>]
                        [,indexed=<file-index>]

flowop [deletefile|statfile] name=<name>,
                        filesetname=<fileset_name>|fd=<number>
                        [,indexed=<file-index>]

flowop closefile name=<name>, fd=<number>

flowop openfile name=<name>,
                        filename|filesetname=<file_name|fileset_name>
                        [,opennext]
                        [,fd=<number>]

flowop [makedir|removedir] name=<name>, filesetname=<fileset_name>
                        [,indexed=<file-index>]

flowop listdir name=<name>, filesetname=<fileset_name>

flowop fsync name=<name> [,fd=<number>]


flowop sempost name=<name>,target=<semblock-flowop>,
                        value=<increment-to-post>

flowop semblock name=<name>,value=<decrement-to-receive>,
                        highwater=<inbound-queue-max>

flowop block name=<name>

flowop hog name=<name>,value=<number-of-mem-ops>

flowop wakeup name=<name>,target=<block-flowop>

flowop eventlimit name=<name>
flowop [bwlimit|iopslimit|opslimit] name=<name>
                        [,target=<io-producing-flowop>]

flowop finishoncount name=<name>, value=<number-of-ops>
                        [,target=<any-flowop>]
flowop finishonbytes name=<name>, value=<number-of-megabytes>
                        [,target=<io-producing-flowop>]

Commands:

eventgen rate=<rate>
create [files|processes]
stats [clear|snap]
stats command "shell command $var1,$var2..."
stats directory <directory>
run <run-time>
sleep <sleep-value>
quit

Filebench Language Reference

This section details the Workload Model Language (WML), which runs to nearly 70 words. There are four main categories of interest: Commands, Entities, Flowops and Attributes.

Commands set global parameters, create Entities and Flowops, and control benchmark runs. Entities are specific resources such as files and threads of control. Flowops implement the actions taken by the workload that the f program defines. Finally, many Commands, Entities and Flowops can be passed parameters through the use of Attributes.

General Commands

eventgen

The eventgen command sets the rate (per second) at which internal Filebench events are generated. Events are then used by eventlimit, iopslimit, opslimit, and bwlimit flowops. If there are no events available then a flowop blocks until new events are posted. There is one general pool of events used by all flowops in all threads.

Example:

eventgen rate = 100

echo

The echo command prints text to the standard output. The text should be enclosed in quotes and can contain variables. It is usually used to print a description of the workload.

Example:

echo "Bringover Version 1.12"

echo "Number of files in workload is $files"

enable

lathist

Enables per-flowop latency histograms. Disabled by default.

Syntax:

enable lathist

multi

Enables multinode mode. Disabled by default. The master's hostname and the client's name must be specified.

Syntax:

enable multi master=<hostname>,client=<clientname>

run

Performs a Filebench run. It calls routines to create filesets, files, and processes, then resets the statistics counters and sleeps, in 1 second increments, for the runtime passed as its argument. When it has finished sleeping, it collects a snapshot of the statistics and ends the run.

Syntax:

run runtime|$integervarname

set

The set command searches for the var named by its first argument and, if it is not found, creates a new var of that name. It then assigns the integer, string, or $varname value given on the rest of the line to that var.

Syntax:

set $<random varname>.type = [ uniform | gamma | tabular ]

set $<random varname>.randsrc = [ urandom | rand48 ]

set $<random varname>.[ gamma | mean | min | round | seed ] = <value>

set $<random varname>.randtable = {{ <%>, <min value>, <max value>}, ...}

set mode

The set mode command is used to put FileBench into various special modes of operation. It is followed by a subcommand, of which quit is the only one presently defined. The default is quit timeout, which ends the run when the runtime specified in a run command has expired or when an explicit shutdown command is encountered. If the workload is expected to end when it runs out of resources, such as files to delete, then use either quit alldone to quit once all the threads have quit because of resource exhaustion, or quit firstdone, to quit as soon as the first thread detects resource exhaustion.

Syntax:

set mode quit [ timeout | alldone | firstdone ]
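
As an illustration of the resource-exhaustion case, the sketch below pairs quit firstdone with threads that delete files until none remain. It uses only syntax shown elsewhere in this document. The fileset name delset, its path and sizes, and the process, thread and flowop names are all hypothetical, and the exact point at which the run ends may depend on the FileBench version:

# Hypothetical workload: all names and values are illustrative only.
set mode quit firstdone

define fileset name=delset,path=/tmp,entries=1000,size=16k,prealloc=100

define process name=deleter,instances=1
{
  thread name=deleter-thread,memsize=10m,instances=4
  {
    # Each thread deletes files until the fileset has none left
    flowop deletefile name=delete1,filesetname=delset
  }
}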

shutdown

Shuts down filebench if process or processes is specified as an argument. Anything else is currently an error.

Syntax:

shutdown process[es]

system

Executes a quoted UNIX command, waits for the command to finish, and prints its output to the screen.

Syntax:

system "<unixcommand>"

Examples:

system "echo 3 > /proc/sys/vm/drop_caches"

system "ls $dir"

define

Defines filesets, files, processes, and flowops.

fileset

Information about a group of related files is contained in a fileset entity. Fileset entities are specified using the define fileset command, which must provide a name for the fileset and the path to the directory where the fileset will reside. In addition, it can specify the number of files to be created, their average size, the average width of the subdirectories in the directory tree that contains the files of the fileset, whether to preallocate files with null data (and if so, what percentage), and whether to reuse the fileset or recreate it if it already exists.

Syntax:

define fileset name=<name>,path=<pathname>,entries=<number> [,dirwidth=<width>] [,dirgamma=<directory gamma value>] [,size=<mean file size>] [,sizegamma=<file size gamma value>] [,prealloc=<percent to preallocate>]
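
A minimal sketch of a fileset definition using the attributes above. The name, path and numeric values are arbitrary choices for illustration, not recommended settings:

# Hypothetical fileset: 1000 files averaging 16k each under /tmp,
# an average directory width of 20, and 80 percent preallocated.
define fileset name=bigfileset,path=/tmp,entries=1000,dirwidth=20,size=16k,prealloc=80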

file

Defines a single file. Mandatory attributes of the file are name, path (the directory where the file will reside), and size. Optional attributes are prealloc (create the file and fill it with data), paralloc (allocate in parallel with other files), and reuse (reuse the file if it already exists).

XXX: What about readonly and cached?

Syntax:

define file name=<name>,path=<pathname>,size=<size> [,paralloc] [,prealloc] [,reuse]

Examples:

define file name=myfile,path=/tmp,size=100mb,prealloc

create

Creates all previously defined files and filesets. The run and psrun commands do this automatically, but it is sometimes handy to separate the file creation stage, for example when Filebench is used only to generate a file system tree, or when system commands (such as dropping caches) must be executed after file creation.

Syntax:

create files
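
The separation described above might look like the following sketch, which creates the files, flushes and drops the page cache, and only then starts the measured run. It assumes a Linux host and sufficient privileges (drop_caches is a Linux facility) and an arbitrary 60 second run time. The file, fileset and process definitions are assumed to appear earlier in the same script:

# Files, filesets and processes are assumed to be defined above this point.
create files

# Linux-specific and requires root: write out dirty pages, then drop caches.
system "sync"
system "echo 3 > /proc/sys/vm/drop_caches"

# Arbitrary run time, in seconds.
run 60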

Entities

The workload language defines the following entities:

var (a variable)
file (a single file)
fileset (a set of files)
process (an operating system process)
thread (an operating system thread)
flowop (a Filebench workload operation)

This section will cover all the entities except for the flowops, which have their own section following this one.

vars

A workload model can make use of variables, known as vars, from within the model. Vars can be user defined, internally defined by FileBench, or externally defined environment variables. All vars are identified by a string preceded by a dollar sign "$". User defined vars can be created and assigned either integer or string values using the set command. Internal vars are accessed by enclosing the appropriate name in braces, while environment vars are accessed by enclosing the appropriate name in parentheses.

Var Name Syntax:

$user_defined_varname

${stats | rate | date | scriptname | hostname}

$(environment_varname)

Variables can be used to set attributes for files, filesets, processes, threads, and flowops. Each var can hold either a boolean, 64 bit unsigned integer, double, or character string, or be empty.

Regular vars

Regular, user defined vars can be set to a value with a set command.

Var Set Syntax:

set $<user_defined_varname> = [true | false | <positive integer value> | <double precision floating point value> | <character string>]
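
For instance, the following sketch sets two user defined vars and then uses them as attribute values. The var names, the fileset name varset and all values are hypothetical; the echo line reuses the form shown under the echo command:

# Hypothetical vars: names and values are illustrative only.
set $dir = /tmp
set $nfiles = 1000

# Vars can then supply attribute values.
define fileset name=varset,path=$dir,entries=$nfiles,size=16k
echo "Number of files in workload is $nfiles"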

Random Vars

Random variables are user defined entities that are defined with a random distribution, which is used to pick a random value to return with each use. They are created with a define randvar command, and may have individual parameters set with set commands. They are used just like regular variables, but return a different value each time they are accessed. They are particularly useful with flowops.

Random Variable Define Syntax:

define randvar name = $<user_defined_varname> [, type=[uniform | gamma | tabular]] [, seed=<value>] [, mean=<value>] [, gamma=<value>] [, min=<value>] [, round=<value>] [, randsrc=[urandom | rand48]] [, randtable={{<%>,<min value>,<max value>}, ...}]

Random Variable Set Syntax:

set $<user_defined_varname>.<attribute name> = <value>
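
A short sketch of defining a random variable and then adjusting it with set. The var name $rsize and every numeric value are assumptions chosen for illustration. How mean, min and round interact for each distribution type is not spelled out here, so the comments only restate what the commands do:

# Hypothetical random var drawing from a gamma distribution.
define randvar name=$rsize, type=gamma, min=512, mean=8192, round=512

# Individual parameters can be changed afterwards with set.
set $rsize.seed = 42
set $rsize.randsrc = rand48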

Processes and Threads Entities

A Filebench process represents an operating system process and contains one or more threads. In turn, each Filebench thread represents an operating system thread of control and contains a collection of flowops. The following example illustrates how to specify a process, thread and collection of flowops.

Example:

set $nthreads=1
define process name=filewriter,instances=1
{
  thread name=filewriterthread,memsize=10m,instances=$nthreads
  {
    flowop appendfile name=write-file, filesetname=bigfileset, iosize=1m, fd=1, iters=20
    flowop closefile name=close,fd=1
    flowop finishoncount name=finish,value=1
  }
}

process

Process entities are used to hold attributes and other information about each operating system process. The define command is used to instantiate a given process entity, which may spawn multiple, identical copies if instances is specified as greater than 1 (the default is 1). The process(es) corresponding to the process entity may be run at a lower priority level by setting a value for the nice attribute.

Syntax:

define process name=<process name> [,instances=<number of instances>] [,nice=<additional niceness>]

{
  thread ...; [thread ...]; ...
}

thread

Thread entities are used to hold attributes and other information about each operating system thread. They are defined within a process definition, and become part of that process, which may spawn multiple, identical copies if instances is specified as greater than 1 (the default is 1).

Threads can also have a region of memory allocated which certain flowops will then use as buffer space for I/O. This region is created by setting a value for the memsize attribute. If the useism attribute is set then IPC shared memory will be used, otherwise thread local memory will be used.

Syntax:

thread name=<thread name> [,instances=<number of instances>] [,memsize=<size of thread's memory>] [,useism]

{
  flowop ...; [flowop ...]; ...
}

Flowops

The flowop clause determines what a thread actually does. The operations and their syntax are covered in this section. The flowop actions can be divided up into Basic I/O, Asynchronous I/O, Synchronization constructs and Misc other operations.

Some of the flowop syntax is common to all flowops. They are defined with the flowop keyword within a thread definition. Immediately following the flowop keyword is the name of the particular operation that is to be performed. Following that is a list of attributes, of which the name and iters attributes are common to all flowops. The name attribute is required and provides the particular instance of the flowop with a name by which it can be referenced elsewhere. The name must be globally unique. The iters attribute is optional, but if specified allows the action specified by the flowop to happen multiple times each time the flowop is invoked.

Common Flowop Syntax:

flowop <operation type name> name=<name> [,iters=<number of iterations per invocation>] [additional attributes...]

Flowop I/O Operations

Reading and writing to files and filesets. On opening or creating a file, a file descriptor number can be specified, which will save the returned file descriptor for later use. Then operations on the already open file can reference it by its file descriptor number. Other read and write flowops will implicitly open a file for use if the filename or fileset name is provided without a file descriptor number. For filesets, specific files can be accessed by passing a file index number to the flowop, which can be obtained from a random variable to provide random file accesses. Otherwise files will be accessed round robin style.

read

Emulates a POSIX read or pread. The flowop must include either a fileset or a filename attribute. If a fileset attribute is supplied, the read is done on a file from the fileset; otherwise the file specified by the filename attribute is read. If a fileset is specified along with an fd attribute, then the file referenced by that fd is read. Otherwise the default fd=0 is used or, if the opennext attribute is set, the flowop picks the next file in sequence to read. If the file is not already open, this flowop opens it, using the directio and dsync attributes as described for the openfile flowop. The read transfers iosize bytes into a random offset in the threadflow's thread memory, from a random disk offset within the working set if the random attribute is set, or from the next sequential location otherwise. The workingset attribute specifies the working set size used in choosing the random disk offset.

Syntax:

flowop read name=<name>, filename|fileset=<fname>, iosize=<size> [,directio] [,dsync] [,iters=<count>] [,random] [,opennext] [,workingset=<size>] [,fd=<file-desc-number>] [,index=<file-index>]

readwholefile

Emulates a read of a whole file. The file from the supplied fileset that is referenced by the fd attribute (if it is supplied), or by the default fd=0, must be open. Reading from ordinary files (fileobjs, as would be specified by the filename attribute) is not currently supported. The readwholefile flowop reads from the beginning of the file to the end, using zero or more reads of iosize bytes, followed by one read of whatever remains if the remainder is smaller than iosize. If iosize is not defined or is set to zero, the file is read in a single read of filesize bytes.

Syntax:

flowop readwholefile name=<name>, fileset=<fname>, iosize=<size> [,iters=<count>] [,fd=<file-desc-number>] [,index=<file-index>]
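
Because readwholefile needs an already open file, it is typically bracketed by openfile and closefile on the same file descriptor number. The sketch below assumes a fileset named bigfileset defined elsewhere, uses the filesetname attribute as in the process example earlier in this document, and picks fd=1 and iosize=1m arbitrarily. All process, thread and flowop names are hypothetical. With iosize=1m, a 2.5 MB file would be read as two 1 MB reads followed by one 0.5 MB read:

define process name=filereader,instances=1
{
  thread name=filereaderthread,memsize=10m,instances=1
  {
    # Open a file from the fileset and remember it as fd 1
    flowop openfile name=open1,filesetname=bigfileset,fd=1
    # Read the whole file behind fd 1 in chunks of at most 1m
    flowop readwholefile name=read1,filesetname=bigfileset,fd=1,iosize=1m
    flowop closefile name=close1,fd=1
  }
}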

write

Emulate a write to a file. The size of the write is specified by the iosize attribute. If a fileset is specified, it writes to a file from the fileset referenced by the fd attribute, if it is supplied, to the default fd=0 file, or to the next file of the sequence if the opennext attribute is set. If a filename attribute is supplied instead, it will write to the named file. If the file is not already open, this flowop will open it, using the directio and dsync attributes as described for the openfile flowop. The flowop's workingset attribute will be used to set the maximum file size if it is non-zero, otherwise the filesetentry's fse_size will be used. The actual write is done to a random offset in the threadflow's thread memory, with a size set by the iosize attribute and at a random disk offset within the working set size if the random attribute is set, or at the next sequential location.

Syntax:

flowop write name=<name>, filename|fileset=<fname>, iosize=<size> [,directio] [,dsync] [,iters=<count>] [,random] [,opennext] [,workingset=<size>] [,fd=<file-desc-number>] [,index=<file-index>]

writewholefile

Emulates a write of a whole file. The size of the file is taken from a filesetentry identified by the srcfd attribute, while the file used for the write is identified by the fd attribute. Both default to the filesetentry associated with fd=0. Does multiple writes of iosize length until the full file has been written. If iosize is not defined or is set to zero, then a single write of the size of the source file is done.

Syntax:

flowop writewholefile name=<name>, fileset=<fname>, iosize=<size> [,dsync] [,iters=<count>] [,srcfd=<source file desc num>] [,fd=<file-desc-number>] [,index=<file-index>]

appendfile

Emulates a fixed size append to a file. Appends data to a file chosen from a fileset if one is specified with the fileset attribute, or if the fd attribute is non-zero and the file associated with that file descriptor is open. If a fileset is specified but the referenced file is not open, appendfile will open it. If neither a fileset nor a non-zero fd attribute is specified, then the file named by the filename attribute is used. If no appropriate file can be found, filebench will terminate. While the workingset attribute is accepted, it is not currently used, so repeated invocation of the flowop on a given file will cause the file to grow arbitrarily large. The size of each append is set by the iosize attribute.

Syntax:

flowop appendfile name=<name>, filename|fileset=<fname>, iosize=<size> [,dsync] [,iters=<count>] [,workingset=<size>] [,fd=<file-desc-number>] [,index=<file-index>]

appendfilerand

Emulates a random size append to a file. Appends data to a file chosen from a fileset if one is specified with the fileset attribute, or if the fd attribute is non-zero and the file associated with that file descriptor is open. If a fileset is specified but the referenced file is not open, appendfilerand will open it. If neither a fileset nor a non-zero fd attribute is specified, then the file named by the filename attribute is used. If no appropriate file can be found, filebench will terminate. While the workingset attribute is accepted, it is not currently used, so repeated invocation of the flowop on a given file will cause the file to grow arbitrarily large. Each invocation writes to the current end of the file with a random transfer size of at most iosize bytes.

Syntax:

flowop appendfilerand name=<name>, filename|fileset=<fname>, iosize=<size> [,dsync] [,iters=<count>] [,workingset=<size>] [,fd=<file-desc-number>] [,index=<file-index>]

File Meta Data Flowops

Opening, closing and stating files and filesets. On opening or creating a file, a file descriptor number can be specified, which will save the returned file descriptor for later use. Then operations on the already open file can reference it by its file descriptor number. For filesets, specific files can be opened or created by passing a file index number to the flowop, which can be obtained from a random variable to provide random file accesses. Otherwise files will be accessed round robin style.

createfile

Emulate create of a file. Associates createfile's fd attribute, if supplied, with the created file's operating system specific file descriptor so it can be referenced by other file operations. Selects a file entry (filesetentry) from the fileset whose file does not currently exist for the file create operation. Then performs an open operation on the file with the O_CREAT flag set to create the file. The file can be created with direct (as opposed to buffered) I/O by including the directio attribute and writes can be forced to behave as defined by synchronized I/O data integrity constraints by setting the dsync attribute.

Syntax:

flowop createfile name=<name>,fileset=<fname> [,fd=<file-desc-number>] [,directio] [,dsync] [,index=<file-index>]

openfile

Emulates a file open operation. Associates openfile's fd attribute, if supplied, with the opened file's operating system specific file descriptor so it can be referenced by other file operations. However, openfile will fail if the supplied fd attribute is already associated with an open file. Selects a file entry (filesetentry) from the fileset whose file exists for the file open operation. Then performs a file open operation on the filesetentry's associated file. The file can be opened with direct (as opposed to buffered) I/O by including the directio attribute and writes can be forced to behave as defined by synchronized I/O data integrity constraints by setting the dsync attribute.

Syntax:

flowop openfile name=<name>,fileset=<fname> [,fd=<file-desc-number>] [,directio] [,dsync] [,index=<file-index>]

closefile

Emulate close of a file. The file referenced by the fd attribute (if it is supplied) or by the default fd=0, must be open. Simply does a close operation on the referenced file.

Syntax:

flowop closefile name=<name> [,fd=<file-desc-number>]

fsync

Emulates fsync of a file. The file referenced by the fd attribute (if it is supplied) or by the default fd=0, must be open. Doing an fsync on an ordinary file (fileobj) works with "fd=0" (or "fd" not set). Simply does an fsync operation on the referenced file.

Syntax:

flowop fsync name=<name> [,fd=<file-desc-number>]

fsyncset

Emulate fsync of an entire fileset. Does an fsync operation on every open file of the fileset.

Syntax:

flowop fsyncset name=<name>,fileset=<fname>

statfile

Emulate stat of a file. Picks an arbitrary fileset entry with an existing file from the fileset specified by the fileset attribute, then performs a stat() operation on it.

Syntax:

flowop statfile name=<name>,fileset=<fname> [,fd=<file-desc-number>] [,index=<file-index>]

deletefile

Emulates delete of a file. Picks either an arbitrary, index specified, or file-descriptor-number specified, filesetentry whose file exists from the fileset specified by the fileset attribute, and deletes it.

Syntax:

flowop deletefile name=<name>,fileset=<fname> [,fd=<file-desc-number>] [,index=<file-index>]

Directory Flowops

Making, listing and removing directories. The fileset must have been defined to include empty directory entries in addition to or instead of files. Specific directories can be accessed by passing a directory index number to the flowop, which can be obtained from a random variable to provide random directory accesses. Otherwise directories will be accessed round robin style.

makedir

Emulates the mkdir command. Picks either an arbitrary or index specified filesetentry for a directory which does not exist in storage from the fileset specified by the fileset attribute, and makes it.

Syntax:

flowop makedir name=<name>,fileset=<fname> [,fd=<directory-desc-number>] [,index=<directory-index>]

listdir

Emulates ls of a directory. Picks either an arbitrary or index specified filesetentry for a directory which exists in storage from the fileset specified by the fileset attribute, and lists it.

Syntax:

flowop listdir name=<name>,fileset=<fname> [,fd=<directory-desc-number>] [,index=<directory-index>]

removedir

Emulates an rm of a directory. Picks either an arbitrary or index specified filesetentry for a directory which exists in storage from the fileset specified by the fileset attribute, and removes it.

Syntax:

flowop removedir name=<name>,fileset=<fname> [,fd=<directory-desc-number>] [,index=<directory-index>]

Flowop Asynchronous I/O Operations

Filebench supports asynchronous writes and implements a mechanism to wait for their completion. An Asynchronous I/O (aio) element is used to associate the asynchronous write request with its subsequent completion. An aiowrite flowop will add an aio to the thread's aio list. An aiowait flowop will wait for half the current list to complete (minimum of 1), removing completed ones from the list.

aiowrite

Emulate posix aiowrite(). The size of the write is specified by the iosize attribute. If a fileset is specified, it writes to a file from the fileset referenced by the fd attribute, if it is supplied, to the default fd=0 file, or to the next file of the sequence if the opennext attribute is set. If a filename attribute is supplied instead, it will write to the named file. If the file is not already open, this flowop will open it, using the directio and dsync attributes as described for the openfile flowop. The flowop's workingset attribute will be used to set the maximum file size if it is non-zero, otherwise the filesetentry's fse_size will be used. The flowop issues the asynchronous write from a random offset in the threadflow's thread memory, with a size set by the iosize attribute and at a random disk offset within the working set size. This operation is currently only valid for random I/O, and returns an error if the flowop is set for sequential I/O.

Syntax:

flowop aiowrite name=<name>, filename|fileset=<fname>, iosize=<size>, random [,directio] [,dsync] [,iters=<count>] [,opennext] [,workingset=<size>] [,fd=<file desc num>]

aiowait

Emulate POSIX aiowait(). Waits for the completion of half the outstanding asynchronous I/Os, or a single I/O, whichever is larger. The routine will return after a sufficient number of asynchronous writes issued by any thread in the procflow have completed, or a 1 second time-out elapses. All completed I/O operations are deleted from the thread's list of asynchronous I/Os in progress.

Syntax:

flowop aiowait name=<name>,target=<aiowrite-flowop>
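
A sketch of how the two flowops pair up inside a thread: the aiowrite queues an asynchronous random write, and the aiowait names it as its target. The process, thread and flowop names and the sizes are hypothetical, and bigfile1 is assumed to be defined as in the define file example earlier in this document:

define process name=aiowriter,instances=1
{
  thread name=aiowriterthread,memsize=10m,instances=1
  {
    # aiowrite is only valid for random I/O, so random is required
    flowop aiowrite name=aio-write1,filename=bigfile1,iosize=8k,random
    # Wait for outstanding asynchronous writes to complete
    flowop aiowait name=aio-wait1,target=aio-write1
  }
}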

Flowop Synchronization Operations

block

Blocks the threadflow until woken up by the wakeup flowop.

Syntax:

flowop block name=<name>

wakeup

Wakes up one or more blocked target flowops. The set of targets consists of all flowops whose name matches this flowop's target attribute.

Syntax:

flowop wakeup name=<name>,target=<block-flowop>
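
A minimal sketch of a block and wakeup pair, assuming two threads in one process, is shown here. All names and values are hypothetical and chosen only for illustration.

# hypothetical names and values
define process name=blockdemo,instances=1
{
  thread name=sleeper,memsize=10m,instances=1
  {
    flowop block name=sleeper-block
    flowop hog name=sleeper-work,value=1000
  }
  thread name=waker,memsize=10m,instances=1
  {
    flowop delay name=waker-pause,value=1
    flowop wakeup name=waker-wakeup,target=sleeper-block
  }
}

The sleeper thread stops at sleeper-block on each pass. The waker thread delays for one second and then wakes every flowop whose name matches its target, so the sleeper is expected to run its hog flowop roughly once per wakeup.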

semblock

Attempts to pass a semaphore and blocks if necessary. Filebench can be compiled to use either System V or POSIX semaphores; the source code defines, as currently set, produce the System V version.

Syntax:

flowop semblock name=<name>, value=<decrement-to-receive>, highwater=<inbound-queue-max>

sempost

Posts to the set of semblock flowops identified by the target attribute. Either System V or POSIX semaphores are used, as described for semblock.

Syntax:

flowop sempost name=<name>, target=<semblock-flowop>, value=<increment-to-post>

Flowop Misc Operation

hog

Consumes CPU cycles and memory bandwidth by looping for value iterations, while setting the first byte of the thread's memory region to 1 on each iteration.

Syntax:

flowop hog name=<name>,value=<number-of-mem-ops>

delay

Delays for value number of seconds using the user sleep routine.

Syntax:

flowop delay name=<name>,value=<number-of-seconds>

eventlimit

Completes one invocation per posted event. If events are available, it removes one and continues to the next flowop, otherwise it blocks until one or more new events are posted.

Syntax:

flowop eventlimit name=<name>
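
As a sketch of how eventlimit throttles a thread, the fragment below combines it with the eventgen command described under the rate attribute. The names and values are hypothetical, and the reading of the rate as events per second is an assumption.

# hypothetical names; rate assumed to be events per second
eventgen rate=100

define process name=limited,instances=1
{
  thread name=limitedthread,memsize=10m,instances=1
  {
    flowop hog name=work,value=1000
    flowop eventlimit name=throttle
  }
}

Because throttle consumes one event per invocation, the thread should complete on average no more than 100 passes through its flowops per second under that assumption.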

bwlimit

Blocks the calling thread if the number of bytes of I/O issued exceeds one megabyte times the number of posted events, thus limiting the average I/O byte rate to one megabyte times the event rate. To set the event rate, see rate. If a target flowop is specified, then the I/O bandwidth produced by that particular flowop (separately for each thread) sets the limit.

Syntax:

flowop bwlimit name=<name>[,target=<io-producing-flowop>]
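
The arithmetic behind bwlimit can be illustrated with a short fragment. The names, sizes and rate are hypothetical, the file logfile is assumed to be defined elsewhere, and the rate is assumed to be in events per second.

# hypothetical names; assumes "logfile" is defined elsewhere
eventgen rate=10

define process name=bwdemo,instances=1
{
  thread name=bwthread,memsize=10m,instances=1
  {
    flowop write name=seqwrite,filename=logfile,iosize=1m
    flowop bwlimit name=throttle-bw
  }
}

With rate=10, the limit works out to 10 events times one megabyte, that is, an average of about 10 megabytes of I/O per second if the assumption about the rate unit holds.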

iopslimit

Blocks the calling thread if the number of issued I/O operations exceeds the number of posted events, thus limiting the average I/O operation rate to one I/O per event. If a target flowop is specified, then the I/O operations produced by that particular flowop (separately for each thread) set the limit.

Syntax:

flowop iopslimit name=<name>[,target=<io-producing-flowop>]
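
The effect of the optional target can be sketched as follows. The names and values are hypothetical and the file datafile is assumed to be defined elsewhere.

# hypothetical names; assumes "datafile" is defined elsewhere
eventgen rate=50

define process name=iopsdemo,instances=1
{
  thread name=iopsthread,memsize=10m,instances=1
  {
    flowop read name=rand-read,filename=datafile,iosize=8k,random
    flowop write name=rand-write,filename=datafile,iosize=8k,random
    flowop iopslimit name=read-limit,target=rand-read
  }
}

Here only the operations issued by rand-read are counted against the posted events. Without the target attribute, the reads and the writes would both count toward the limit.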

opslimit

Blocks the calling thread if the number of issued filebench operations exceeds the number of posted events, thus limiting the average filebench operation rate to one per event. If a target flowop is specified, then the operations (generally the number of times it is called) produced by that particular flowop (separately for each thread) set the limit.

Syntax:

flowop opslimit name=<name>[,target=<any-flowop>]

finishoncount

Stops the filebench run when the number of I/O operations specified by value has been performed. If a target flowop is specified, then the operation count produced by that particular flowop (separately for each thread) determines when to stop.

Syntax:

flowop finishoncount name=<name>,value=<number-of-ops>[,target=<any-flowop>]
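
A minimal sketch of a run that ends after a fixed amount of work, assuming a file named datafile is defined elsewhere, might look like this. All names and values are hypothetical.

# hypothetical names; assumes "datafile" is defined elsewhere
define process name=countdemo,instances=1
{
  thread name=countthread,memsize=10m,instances=1
  {
    flowop read name=rand-read,filename=datafile,iosize=8k,random
    flowop finishoncount name=finish,value=10000,target=rand-read
  }
}

The run is expected to stop once rand-read has been performed 10000 times, rather than after a fixed amount of time.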

finishonbytes

Stops the filebench run when the number of bytes of I/O specified by value has been read and/or written. If a target flowop is specified, then the I/O bandwidth produced by that particular flowop (separately for each thread) determines when to stop.

Syntax:

flowop finishonbytes name=<name>,value=<bytes>[,target=<io-producing-flowop>]

Attributes

The behavior of processes, threads and flowops can be modified by supplying them with attributes. Some attributes are required; others are optional and have default values which are used if they are not supplied. Some attributes are booleans, that is, they are true if supplied and false if not. Other attributes take numeric or string values, sometimes with default values if a value is not supplied. The supplied value may be either a constant string or integer, as appropriate, or a string or integer variable which has had a value assigned to it by a set command. For example, directing a fileset to preallocate 100% of its files can be done by any of the following:

define fileset name=foo, prealloc

define fileset name=foo, prealloc=100

set $preallocpercent = 100
define fileset name=foo, prealloc=$preallocpercent

In this section, <intval> will be used to designate either an integer or an integer variable, while <strval> will be used to designate either a string or a string variable.

cached

When specified, this attribute prevents the filesystem caches from being flushed after file or fileset creation or reuse. If not specified, filebench will attempt to flush filesystem caches so as to prevent a workload's I/Os from running completely out of filesystem cache.

Note: for ZFS, the 'zpool export/import' is still done via the 'fs_flush' script (to get rid of the cached copy in the ARC) even if "cached" is set to true.

Syntax:

define file | fileset ... ,cached ...

dirwidth

The dirwidth attribute of a fileset specifies the average number of entries in each directory created as part of the fileset. Filebench also uses this in combination with the total number of files in the fileset to calculate the required mean depth of the fileset's directory tree. The default is a dirwidth of 0, which specifies a single level directory containing all the files of the fileset.

Syntax:

define fileset ... ,dirwidth=<intval>

dirgamma

The dirgamma attribute of a fileset specifies the alpha parameter of the gamma distribution which will be used to decide whether a given subdirectory contains files or additional subdirectories. If the dirgamma attribute is not specified, it defaults to 1500. The value can range from 100 to 10000, which corresponds to a gamma of 0.1 to 10.

Syntax:

define fileset ... ,dirgamma=<intval>

namelength

The namelength attribute is not presently supported.

filesize

The filesize attribute of a file or fileset specifies the size of the file(s) that will be created. For a fileset whose filesizegamma attribute is non-zero, the filesize attribute specifies the mean file size, and the actual size of each file is drawn from a gamma distribution whose alpha is based on the filesizegamma attribute.

Syntax:

define file | fileset ... ,filesize=<intval>

filesizegamma

The filesizegamma attribute of a fileset specifies the alpha parameter for the gamma distribution used to select the size of each file created as part of a fileset, where the mean file size is specified by the filesize attribute. If the filesizegamma attribute is set to zero, all files will be created with filesize number of bytes. This attribute also specifies the mean width of each created subdirectory, with the actual width selected from a gamma distribution with alpha of dirgamma. If the filesizegamma attribute is not specified, it defaults to 1500. The value can range from 100 to 10000, which corresponds to a gamma of 0.1 to 10.

Syntax:

define fileset ... ,filesizegamma=<intval>
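
To illustrate the difference between fixed and distributed file sizes, the following sketch defines two filesets using only the attributes documented in this section. The fileset names, path and sizes are hypothetical.

# hypothetical names and values
set $dir=/tmp
define fileset name=fixedset,path=$dir,filesize=16k,filesizegamma=0,entries=100
define fileset name=variedset,path=$dir,filesize=16k,filesizegamma=1500,entries=100

Every file in fixedset should be exactly 16k, while the files in variedset should have a mean size of 16k with individual sizes drawn from a gamma distribution. The value 1500 corresponds to a gamma of 1.5 and is also the default.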

directio

Specifies opening the file in direct, rather than buffered, I/O mode. Essentially bypasses filesystem caches, so each I/O request results in an actual I/O to the attached device. Needs to be specified with the flowop that opens the file, which is often openfile, but can be one of the other I/O flowops.

Syntax:

flowop ... ,directio

dsync

Specifies the use of synchronous writes, which do not complete until the attached device has written the data to non volatile storage. Not only does this disable filesystem write back caching, it also is supposed to prevent device (such as an attached disk drive) from doing write back caching. This attribute needs to be specified with all flowops that might open the file, as the file must be opened as a synchronous file for this attribute to be effective. While openfile is often used for that purpose, any of the other flowops will open a file if it is not already open, so they may need the attribute defined as well.

Syntax:

flowop ... ,dsync

fd

The fd attribute of a flowop explicitly sets the file descriptor on which the file is opened. This is useful when the script emulates an application that has a number of files open on different descriptors, or that performs complex sequences of opens and closes using a limited or extended range of descriptors. In the example, an arbitrary file from the fileset bigfileset is opened by the first flowop and assigned file descriptor 1. The read flowop which follows also references file descriptor 1, so it will read whichever file was opened by the first flowop. Finally, the closefile flowop will close the file referenced by file descriptor 1.

Example:

thread name=filereaderthread,memsize=10m,instances=$nthreads
  {
    flowop openfile name=openfile,filesetname=bigfileset,fd=1
    flowop read name=readfile1,fd=1
    flowop closefile name=closefile,fd=1
  }

Syntax:

flowop ... ,fd=<intval>

srcfd

The srcfd attribute specifies the file descriptor to use as a source of filesize information when invoking the writewholefile flowop. In the example below the code is emulating a copy file operation, where the file is read in then written out to a new file, which, of course, would end up with the same size as the original.

Example:

thread name=filereaderthread,memsize=10m,instances=$nthreads
  {
    flowop openfile name=openfile1,filesetname=bigfileset,fd=1
    flowop readwholefile name=readfile1,fd=1
    flowop createfile name=createfile2,filesetname=destfiles,fd=2
    flowop writewholefile name=writefile2,filesetname=destfiles,fd=2,srcfd=1
    flowop closefile name=closefile1,fd=1
    flowop closefile name=closefile2,fd=2
  }

Syntax:

flowop writewholefile ... ,srcfd=<integer>

opennext

The opennext attribute is used with I/O flowops to indicate that the flowop should open a different file with each invocation. This attribute only applies to filesets and is meaningless when used with a simple file.

Syntax:

flowop ... ,opennext

filename

The filename attribute specifies the name of a file. It is used with I/O flowops to specify which particular file to access.

Syntax:

flowop ... ,filename=<strval>

filesetname

The filesetname attribute specifies the name of a fileset. It is used with I/O flowops to specify which particular fileset to access.

Syntax:

flowop ... ,filesetname=<strval>

instances

When processes and threads are specified using the define command, the instances attribute may be used to request multiple copies of the defined process or thread. If used with the process definition, the requested number of operating system processes will be created, each with its own copy of the threads and flowops included in the definition. Similarly, if used with a thread definition, the requested number of threads will be created for each operating system process, each with its own copy of the specified flowops. If the instances attribute is not included, then a single instance of the process or thread is created.

Syntax:

define process name=<procname> ... ,instances=<intval>

thread name=<threadname> ... ,instances=<intval>
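
As a worked illustration of how the two levels of instances multiply, consider the following fragment. The names and sizes are hypothetical, and a file named bigfile is assumed to be defined elsewhere.

# hypothetical names; assumes "bigfile" is defined elsewhere
define process name=filereader,instances=2
{
  thread name=filereaderthread,memsize=10m,instances=4
  {
    flowop read name=readfile1,filename=bigfile,iosize=8k
  }
}

This requests 2 operating system processes with 4 threads each, for a total of 2 x 4 = 8 threads, each running its own copy of the readfile1 flowop.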

iosize

The iosize attribute is used with I/O flowops to specify the size of the I/O operations (for example, disk reads) that they will perform. It defaults to zero if omitted.

Syntax:

flowop ... ,iosize=<intval>

iters

Individual flowops may be executed multiple times each time they are invoked by setting the iters attribute to the desired number of executions. If not specified, the flowop will only be executed once each time it is invoked.

Syntax:

flowop ... ,iters=<intval>

memsize

This parameter in the thread clause sets the size of the private memory segment of the thread.

Example:

define process name=filewriter,instances=1
{
  thread name=filewriterthread,memsize=10m,instances=1
  {
    flowop ....
  }
}

Syntax:

thread ... ,memsize=<intval>

name

Files, filesets, processes, threads and flowops all require names, which are set with the name attribute. Names must be globally unique within an entity type. Thus if you have two processes, each of which has a read type flowop, you must make sure that the two read flowops have different names, such as name=read1 and name=read2.

Syntax:

define file|define fileset|define process|thread|flowop name=<strval> ...

nice

The nice attribute allows you to lower the priority of the process (or set of processes if multiple instances are requested) below what it otherwise would be. Note that all processes are automatically set to a lower priority than the master process controlling the run. If you want a particular process to run at a lower priority than the others, specify nice with an integer value.

Syntax:

process ... ,nice=<intval>

entries

Filesets are typically used to create a group of files, and the entries attribute is used to set the number of such files. If the entries attribute is not specified, only one file will be created.

Syntax:

define fileset ... ,entries=<intval>

prealloc

The files defined by the file or fileset entities can exist either as potential files or as actual ones. For a potential file, information about it is kept by the file or fileset entity, but it does not occupy disk space or exist in a directory. Files that do not yet exist can be created later with the createfile flowop. When used with a file, the prealloc attribute specifies that the file should actually exist. When used with a fileset, it specifies the percentage of files that should actually exist, with a default value of 100%.

Syntax:

define file | fileset ... ,prealloc[=<intval>]
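
As a worked example of partial preallocation, the following hypothetical definition asks for half of a fileset to exist up front. The name, path and sizes are illustrative only.

# hypothetical name and values
define fileset name=halfset,path=/tmp,size=16k,entries=1000,prealloc=50

Of the 1000 entries, about 50%, or roughly 500 files, should be created on disk when the fileset is created. The remaining entries stay potential files that a createfile flowop can create during the run.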

paralloc

Use of this attribute can speed up the preallocation of files by creating and writing them in parallel. However, it only works with files at present, not filesets.

Syntax:

define file ... ,paralloc

reuse

The reuse attribute allows the reuse of existing files or filesets which have the same name as the specified file or fileset. If the file is too large it will be truncated, and if it is too small it will be rewritten. A fileset with a matching name will also be reused, with individual files adjusted to match their new specified sizes.

Syntax:

define file | fileset ... ,reuse

path

The path directive is used in the fileset clause to set the prefix path for the dataset. It is often set at the head of the script or passed in as a parameter:

Example:

set $dir=/tmp
define fileset name=myset,path=$dir,size=16k,entries=1000,dirwidth=20

Syntax:

define fileset ... ,path=<strval>

random

The random attribute is used with the read, write and aiowrite flowops to specify that a random location within the file be picked for access. Without this attribute, the next sequential file blocks will be read or written.

Syntax:

flowop read | aiowrite | write ... ,random

rate

The rate attribute is used to set the event generation rate for the event generator as part of the eventgen command.

Syntax:

eventgen rate=<intval>

size

The size attribute is used with the define file and define fileset commands. For define file it sets the size of the file. For define fileset, it sets the mean size of the files, with the actual size set by the gamma random distribution specified for the fileset.

Syntax:

define file | fileset ... ,size=<intval>

target

Certain flowops depend on actions in other flowops, in which case they need a target attribute to tell them the name of the flowop on which they are depending. A prime example is the semaphore flowops, as illustrated in the following example from the oltp.f workload:

Example:

   ...

    flowop semblock name=lg-block,value=3200,highwater=1000

   ...

    flowop sempost name=shadow-post-lg,value=1,target=lg-block,blocking

   ...

Here the sempost flowop has a target of lg-block, the name of the associated semblock flowop. The actual semaphore is created by the semblock flowop, but the sempost flowop will use the lg-block target to find it and do the requested post operation on it.

Syntax:

flowop ... ,target=<strval>

useism

The useism attribute tells the thread to use shared memory for its thread memory region.

Syntax:

thread ... ,useism

value

The value attribute is used for passing miscellaneous integer values to the flowops. The exact meaning is dependent on the particular use.

Syntax:

flowop ... ,value=<intval>

workingset

The workingset attribute is used by some I/O flowops to specify a maximum byte range of the file that will actually be read from or written to. This can be less than the actual file size, or for writes may also be larger than the current size, where it serves to set the maximum size the file can grow to.

Syntax:

flowop read | aiowrite | write | appendfile | appendfilerand ... ,workingset=<intval>
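
A short sketch of a flowop restricted to part of a file is shown here. The names and sizes are hypothetical, and the file bigfile is assumed to be defined elsewhere with a size larger than the working set.

# hypothetical names; assumes "bigfile" is defined elsewhere
define process name=wsdemo,instances=1
{
  thread name=wsthread,memsize=10m,instances=1
  {
    flowop write name=rand-write,filename=bigfile,iosize=8k,random,workingset=100m
  }
}

The random 8k writes should land only within a 100m byte range of bigfile, even if the file itself is larger.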

blocking

Used with semaphore flowops.

Syntax:

flowop sempost|semblock ... ,blocking

highwater

Used with the semblock semaphore flowop.

Syntax:

flowop semblock ... ,highwater=<intval>

Debug Commands

This section lists commands that are mainly used for debugging.

list

Depending on the argument, the list command prints information about either all defined filesets or all defined flowops.

Syntax:

list fileset

list flowop

debug

Sets the debugging verbosity level. Allowed values are from 0 to 10. The higher the value, the more debugging information is printed. By default this is set to 2 at startup.

Example:

debug 4

quit

Ends the Filebench execution. Can be used to terminate WML parsing in the middle of a WML script.

Example:

quit

sleep

Causes Filebench to sleep for the number of seconds supplied by the argument. Note that this is not a flowop executed by worker threads but a command that Filebench's master process executes.

Example:

sleep 10

sleep $varname

version

Prints Filebench version.

Example:

version
