Filebench Workload Language
Caution: This page is currently under major construction! We are updating it for the upcoming Filebench Version 1.5 Release. Some information on this page might be inaccurate or completely wrong!
Contents
- 1 Workload Model Language Introduction
- 2 FileBench Language Reference
- 2.1 General Commands
- 2.2 Statistics Commands
- 2.3 Entities
- 2.4 Flowops
- 2.5 Attributes
- 2.5.1 cached
- 2.5.2 dirwidth
- 2.5.3 dirgamma
- 2.5.4 namelength
- 2.5.5 filesize
- 2.5.6 filesizegamma
- 2.5.7 directio
- 2.5.8 dsync
- 2.5.9 fd
- 2.5.10 srcfd
- 2.5.11 opennext
- 2.5.12 filename
- 2.5.13 filesetname
- 2.5.14 instances
- 2.5.15 iosize
- 2.5.16 iters
- 2.5.17 memsize
- 2.5.18 name
- 2.5.19 nice
- 2.5.20 entries
- 2.5.21 prealloc
- 2.5.22 paralloc
- 2.5.23 reuse
- 2.5.24 path
- 2.5.25 random
- 2.5.26 rate
- 2.5.27 size
- 2.5.28 target
- 2.5.29 useism
- 2.5.30 value
- 2.5.31 workingset
- 2.5.32 blocking
- 2.5.33 highwater
Workload Model Language Introduction
XXX: Add new primitives: ioprio, posset, noreadahed
FileBench uses a synthetic application model description which can be used to derive analytic models and reconstruct the footprint of the application. This allows prediction and characterization at greatly reduced time and cost.
The language allows accurate representation of application workloads and thus facilitates prediction, modeling and measurement of system performance.
Synthetic benchmarking is achieved by using the model described in the language to generate load on a test system, much as if the real application were running on it. The performance of the system can be measured during load generation.
For benchmarking, the model is realized by dynamically recreating the correct number of processes (or threads), memory footprint and I/O, together with all the inter-process synchronization seen in the real application.
The language could also be used to drive an analytical model. This allows forward prediction of system performance changes when sub-components of the system are changed.
The workload is described as a series of processes, threads and flows. Each process represents an address space, which contains one or more threads. Each thread represents a flow of sequential execution of a closed queue of flow operations. Each flow operation is a pre-defined system operation, such as a file read or write.
model -> process -> thread -> {flowop, flowop, flowop...} -> thread -> ... -> process -> ...
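The hierarchy above can be modeled with a small Python sketch. It is not Filebench code: the `Process` and `Thread` classes and the `first_ops` helper are hypothetical names, and the entity names are borrowed from the random I/O example that follows. The sketch only illustrates that a thread executes its flowops in order and then starts over, as a closed queue.

```python
from dataclasses import dataclass, field
from itertools import cycle, islice


@dataclass
class Thread:
    name: str
    flowops: list = field(default_factory=list)  # executed in order, then repeated


@dataclass
class Process:
    name: str
    threads: list = field(default_factory=list)


def first_ops(thread, count):
    """Return the first `count` flowops a thread would execute in its closed loop."""
    return list(islice(cycle(thread.flowops), count))


# One process holding one thread with two flowops.
model = [Process("randomizer", [Thread("random-thread", ["random-read", "random-write"])])]
print(first_ops(model[0].threads[0], 5))
```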
For example, a random I/O workload in the f language could be represented by the following:
define process randomizer
define thread random-thread procname=randomizer
{
  flowop random-read type=read,filename=bigfile,random,iosize=2k
  flowop random-write type=write,filename=bigfile,random,iosize=2k
}
We also introduce synchronization between flows to facilitate replication of inter-flow constraints as seen in real applications. For example, a database workload may consist of two critical processes which depend on each other. The flow operation loops of the two processes are linked in this case:
{read, read, read, block on other flowop}  <--+
                                              |
{write, write, write, wakeup other flowop}   -+
A simple database representation consisting of three processes would be represented by:
define process logwr
define process dbwr instances=1
define process shadow instances=$shadows

define thread logwr procname=logwr,memsize=10m
{
  flowop log-write type=write,filename=log,iosize=1m,workingset=1m,random,dsync
  flowop log-block type=semblock,value=40
}

define thread dbwr procname=dbwr,memsize=10m
{
  flowop dbwr-write type=write,filename=datafile,iosize=1m,workingset=1m,random,dsync
  flowop dbwr-block type=semblock,value=10,highwater=1000
}

define thread shadow procname=shadow,memsize=10m
{
  flowop shadowread-a type=read,filename=datafile,iosize=2k,workingset=10m,random,dsync
  flowop shadow-post-log type=sempost,value=1,target=log-block
}
Workload Model Language Example
(Subject to lots of change at the moment)
#!/usr/benchmarks/filebench/go_filebench -f

debug 1

define file name=bigfile1,path=$datadir1/myrandfile,size=50g,prealloc,reuse,paralloc

define process name=rand-read,instances=1
{
  thread name=rand-thread,memsize=10m,instances=100
  {
    flowop read name=rand-read1,filename=bigfile1,iosize=$iosize,random,directio
    flowop eventlimit name=rand-rate
  }
}
Usage: filebench: interpret f script and generate file workload
Options:
  [-h] Display verbose help
  [-p] Disable opening /proc to set uacct to enable truss
Workload Model 'f' language definition:
(Subject to change as new features are added)
Variables:
set $var = value
$var   - regular variables
${var} - internal special variables
$(var) - environment variables

define randvar name = $random_var
    [, type=<uniform | gamma>]
    [, seed=<value>]
    [, gamma=<value>]
    [, mean=<value>]
    [, min=<value>]
    [, round=<value>]
    [, randsrc=<urandom | rand48>]
    [[, type=tabular], randtable = {{<value>, <value>, <value>}, {...}, ... }]

set $random_var.<type=<uniform | gamma>|seed=<value>|gamma=<value>|mean=<value>|min=<value>|round=<value>|randsrc=<urandom | rand48>>
Files and Filesets:
define file name=<file-name>,path=<pathname>,size=<size>
    [,paralloc]
    [,prealloc]
    [,reuse]

define fileset name=<fileset-name>,path=<pathname>,entries=<number>,size=<size>
    [,dirwidth=<width>]
    [,dirdepthrv=<$random_var>]
    [,dirgamma=[100-10000]] (Gamma * 1000)
    [,sizegamma=[100-10000]] (Gamma * 1000)
    [,prealloc[=percent]]
Processes and Threads:
define process name=<name>[,instances=<count>]
{
  thread ...
  thread ...
  thread ...
}

thread name=<name>[,instances=<count>]
{
  flowop ...
  flowop ...
  flowop ...
}
Flowops:
flowop [aiowrite|write|read] name=<name>, filename|filesetname=<file-name|fileset-name>, iosize=<size> [,fd=<number>] [,directio] [,dsync] [,opennext] [,iters=<count>] [,random] [,workingset=<size>] [,indexed=<file-index>]
flowop [appendfile|appendfilerand] name=<name> filename|filesetname=<file-name|fileset-name> [,fd=<number>] [,directio] [,dsync] [,opennext] [,iters=<count>] [,workingset=<size>] [,indexed=<file-index>]
flowop [writewholefile|readwholefile] name=<name>, filename|filesetname=<file-name|fileset-name> [,fd=<number>] [,directio] [,dsync] [,opennext] [,iters=<count>] [,workingset=<size>] [,indexed=<file-index>]
flowop aiowait name=<name>,target=<aiowrite-flowop>
flowop createfile name=<name>, filesetname=<fileset_name> [,fd=<number>] [,indexed=<file-index>]
flowop [deletefile|statfile] name=<name>, filesetname=<fileset_name>|fd=<number> [,indexed=<file-index>]
flowop closefile name=<name>, fd=<number>
flowop openfile name=<name>, filename|filesetname=<file_name|fileset_name>, [,opennext] [,fd=<number>]
flowop [makedir|removedir] name=<name>, filesetname=<fileset_name> [,indexed=<file-index>]
flowop listdir name=<name>, filesetname=<fileset_name>
flowop fsync name=<name> [,fd=<number>]
flowop sempost name=<name>,target=<semblock-flowop>, value=<increment-to-post>
flowop semblock name=<name>,value=<decrement-to-receive>, highwater=<inbound-queue-max>
flowop block name=<name>
flowop hog name=<name>,value=<number-of-mem-ops>
flowop wakeup name=<name>,target=<block-flowop>,
flowop eventlimit name=<name>
flowop [bwlimit|iopslimit|opslimit] name=<name> [,target=<io-producing-flowop>]
flowop finishoncount name=<name>, value=<number-of-ops> [,target=<any-flowop>]
flowop finishonbytes name=<name>, value=<number-of-megabytes> [,target=<io-producing-flowop>]
Commands:
eventgen rate=<rate>
create [files|processes]
stats [clear|snap]
stats command "shell command $var1,$var2..."
stats directory <directory>
run <run-time>
sleep <sleep-value>
quit
FileBench Language Reference
In this section, the f language, which runs to nearly 70 words, is detailed. There are four main categories of interest: Commands, Entities, Flowops and Attributes.
Commands set global parameters, create Entities and Flowops, and control benchmark runs. Entities are specific resources such as files and threads of control. Flowops implement the actions taken by the workload that the f program defines. Finally, many Commands, Entities and Flowops can be passed parameters through the use of Attributes.
General Commands
debug
The level of debugging information is set with the argument to debug. By default this is set to 2 at startup. If debugging is enabled, output lines of the following form are sent to the standard output: <Process ID, Seconds since start of script, text of message>
Example:
debug 1
Setting debug to 0 suppresses all debugging messages. Setting debug to 3 allows flowop level messages, including statistics for flowops.
Syntax:
debug <debug level>
echo
The echo command outputs text to the standard output, as in most other languages. The text should be enclosed in quotes. See also the usage command.
Example:
echo "Bringover Version 1.12 2005/06/21 21:18:52 personality successfully loaded"
Prints the following to the standard output and the log file:
7831: 8.900: Bringover Version 1.12 2005/06/21 21:18:52 personality successfully loaded
exit
Ends the filebench run.
Syntax:
exit
foreach
Assigns the designated variable successive values from the supplied comma separated integer or string list. After each successive value assignment, it executes the bracket enclosed list of commands.
Example:
foreach $iosize in 2k, 4k, 8k
{
  run 60
}
The above example will repeatedly run the loaded workload with increasing I/O sizes of 2 KB, 4 KB and 8 KB.
Syntax:
foreach $varname in [integer[,integer]... | "string"[,"string"]...] {
}
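The iteration that foreach performs can be modeled with a short Python sketch. It is not Filebench code: `parse_size`, `foreach` and `run_workload` are hypothetical names, `run_workload` stands in for the bracketed command list, and the treatment of the k, m and g suffixes as powers of 1024 is an assumption that matches the "2 KB, 4 KB and 8 KB" reading of the example.

```python
# Illustrative model of foreach; not part of Filebench.
SUFFIXES = {"k": 1024, "m": 1024**2, "g": 1024**3}  # assumed binary multipliers


def parse_size(text):
    """Convert a size such as '2k' or '1m' to bytes (assumed 1024-based)."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def foreach(values, body):
    """Call body once per value, in list order, and collect the results."""
    return [body(value) for value in values]


def run_workload(iosize):
    # Hypothetical stand-in for "run 60": report the I/O size in bytes.
    return parse_size(iosize)


print(foreach(["2k", "4k", "8k"], run_workload))
```

Each value is assigned in list order and the body runs once per value, which is the behavior the description of foreach states.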
quit
Ends the filebench run.
Syntax:
quit
help
Prints usage string if the string exists, else just a message requesting load of a personality. Note that the usage string is typically created by usage commands embedded in the workload files.
Syntax:
help
list
The list command prints information about all the file objects that have been instantiated to the Filebench log. The information printed is the file object's name, path name, and size.
Syntax:
list
load
The load command loads a workload specification pathname.f, to be used in addition to the current workload specification file.
Syntax:
load pathname
log
The log command prints the values of a list of variables to the log file and also the terminal. The list of variables is placed on the command line, separated by commas, and the entire list is enclosed in quotes. For example, if $dir contains "/export/home/tmp" and $filesize is set to 1048576, then typing log "$dir, $filesize" at the prompt gives the output shown in the example.
Example:
filebench> log "$dir, $filesize"
6838: 24.459: log /export/home/tmp, 1048576
Syntax:
log "$varname1[, $varname2...]"
run
Performs a Filebench run. Calls routines to create filesets, files, and processes. It resets the statistics counters, then sleeps in 1 second increments for the runtime passed as its argument. When it is finished sleeping, it collects a snapshot of the statistics and ends the run.
Syntax:
run runtime|$integervarname
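The order of steps described for run can be sketched in Python as follows. This is only a model of the sequence in the text, not Filebench's implementation: the function name, the step labels and the injectable `sleep` parameter are all hypothetical.

```python
import time


def run(runtime, log, sleep=time.sleep):
    """Append the steps of one run to `log`; `sleep` is injectable for testing."""
    for step in ("create filesets", "create files", "create processes", "reset statistics"):
        log.append(step)
    for _ in range(runtime):
        sleep(1)  # the runtime elapses in one-second increments
    log.append("snapshot statistics")
    return log


# Demonstrate with a no-op sleep so the example finishes immediately.
print(run(3, [], sleep=lambda seconds: None))
```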
create
TODO: Create processes, filesets, etc.
define
TODO: Define processes, filesets, random variables,
eventgen
set
The set command searches for the varname supplied as the first argument and, if it is not found, creates a new var of that name. It then assigns the integer, string, or $varname value on the rest of the line to that var.
Syntax:
set $<varname> = [ true | false | <integer> | <double> | "<string>" | $<othervarname> ]
set $<random varname>.type = [ uniform | gamma | tabular ]
set $<random varname>.randsrc = [ urandom | rand48 ]
set $<random varname>.[ gamma | mean | min | round | seed ] = <value>
set $<random varname>.randtable = {{ <%>, <min value>, <max value>}, ...}
set mode
The set mode command is used to put FileBench into various special modes of operation. It is followed by a subcommand, of which quit is the only one presently defined. The default is quit timeout, which ends the run when the runtime specified in a run command has expired or when an explicit shutdown command is encountered. If the workload is expected to end when it runs out of resources, such as files to delete, then use either quit alldone to quit once all the threads have quit because of resource exhaustion, or quit firstdone, to quit as soon as the first thread detects resource exhaustion.
Syntax:
set mode quit [ timeout | alldone | firstdone ]
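The three quit modes can be summarized as a decision function. The Python sketch below is a model of the rules stated above, not Filebench code: `run_should_end` and its parameters are hypothetical, and whether the runtime limit also applies in the alldone and firstdone modes is not stated in the text, so the sketch leaves it out.

```python
def run_should_end(mode, elapsed, runtime, threads_done, threads_total, shutdown=False):
    """Return True when a run would end under the given quit mode."""
    if mode == "timeout":
        # Default: the runtime expired or an explicit shutdown was encountered.
        return shutdown or elapsed >= runtime
    if mode == "alldone":
        # Every thread has quit because of resource exhaustion.
        return threads_done == threads_total
    if mode == "firstdone":
        # At least one thread has detected resource exhaustion.
        return threads_done >= 1
    raise ValueError(f"unknown quit mode: {mode}")


print(run_should_end("firstdone", elapsed=5, runtime=60, threads_done=1, threads_total=4))
```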
shutdown
Shuts down filebench if process or processes is specified as an argument. Anything else is currently an error.
Syntax:
shutdown process[es]
sleep
The sleep command causes the master process to sleep for the number of seconds supplied by the argument, one second at a time.
Syntax:
sleep integer|$varname
system
The system command runs the quoted unix command. It waits for the command to finish.
Syntax:
system "unixcommand"
usage
Adds the string supplied as the argument to the usage command to the end of the string printed by the help command.
Syntax:
usage "additional help string"
Statistics Commands
The statistics subsystem is controlled using stats commands. Using various subcommands, the user can initialize and examine statistics collected by the flowops.
stats all
The all subcommand is not yet implemented.
Syntax:
stats all
stats clear
The clear subcommand clears all the statistics associated with all of the flowops and the global summary statistics.
Syntax:
stats clear
stats command
The command subcommand runs the quoted unix command as a background process. It is intended for running statistics gathering utilities such as mpstat while the filebench workload is running. It also records the PIDs of the background processes so that the "stats snap" command can terminate them when the run completes. For example, the following would run mpstat in the background and save the results to mpstat_log:
Example:
stats command "mpstat 5 > mpstat_log"
Syntax:
stats command "unixcommand"
stats directory
The directory subcommand changes the directory into which the statistics files will be placed.
Syntax:
stats directory <pathname>
stats snap
The snap subcommand kills off background statistics collection processes, then takes a snapshot of the filebench run's collected statistics and rolls them up into summary statistics for each named flowop.
Syntax:
stats snap
stats dump
The dump subcommand updates the global dump filename with the filename supplied as the command's argument. It then dumps the statistics of each worker flowop into the dump file, followed by a summary of overall totals.
Syntax:
stats dump <filename>
stats xmldump
The xmldump subcommand updates the global dump filename with the filename supplied as the command's argument. It then dumps the statistics of each worker flowop into the dump file, followed by a summary of overall totals, in XML format.
Syntax:
stats xmldump <filename>
Entities
The workload language defines the following entities:
- var (a variable)
- file (a single file)
- fileset (a set of files)
- process (an operating system process)
- thread (an operating system thread)
- flowop (a Filebench workload operation)
- eventgen (a thread that generates events)
This section will cover all the entities except for the flowops, which have their own section following this one.
vars
A workload model can make use of variables, known as vars. Vars can be user defined, FileBench internally defined items, or externally defined environment variables. All vars are identified by a string preceded by a dollar sign "$". User defined vars can be created and assigned either integer or string values using the set command. Internal vars can be accessed by specifying the appropriate name enclosed in braces, while environment vars can be accessed by specifying the appropriate name enclosed in parentheses.
Var Name Syntax:
$user_defined_varname
${stats | rate | date | scriptname | hostname}
$(environment_varname)
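The three naming forms can be told apart by their delimiters alone. The Python sketch below illustrates that classification. It is not how Filebench parses variables: `classify_var` is a hypothetical helper, and restricting names to letters, digits and underscores is an assumption.

```python
import re

# Order matters: the braced and parenthesized forms are checked before the bare form.
PATTERNS = [
    ("internal", re.compile(r"^\$\{(\w+)\}$")),     # ${var}
    ("environment", re.compile(r"^\$\((\w+)\)$")),  # $(var)
    ("regular", re.compile(r"^\$(\w+)$")),          # $var
]


def classify_var(token):
    """Return (kind, name) for a var reference, or raise ValueError."""
    for kind, pattern in PATTERNS:
        match = pattern.match(token)
        if match:
            return kind, match.group(1)
    raise ValueError(f"not a var reference: {token!r}")


for token in ("$iosize", "${hostname}", "$(HOME)"):
    print(token, classify_var(token))
```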
Variables can be used to set attributes for files, filesets, processes, threads, and flowops. Each var can hold either a boolean, 64 bit unsigned integer, double, or character string, or be empty.
Regular vars
Regular, user defined vars can be set to a value with a set command.
Var Set Syntax:
set $<user_defined_varname> = [true | false | <positive integer value> | <double precision floating point value> | <character string>]
Random Vars
Random variables are user defined entities that are defined with a random distribution, which is used to pick a random value to return with each use. They are created with a define randvar command, and may have individual parameters set with set commands. They are used just like regular variables, but return a different value each time they are accessed. They are particularly useful with flowops.
Random Variable Define Syntax:
define randvar name = $<user_defined_varname> [, type=[uniform | gamma | tabular]] [, seed=<value>] [, mean=<value>] [, gamma=<value>] [, min=<value>] [, round=<value>] [, randsrc=[urandom | rand48]] [, randtable={{<%>,<min value>,<max value>}, ...}]
Random Variable Set Syntax:
set $<user_defined_varname>.<attribute name> = <value>
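One plausible reading of the randtable rows {<%>, <min value>, <max value>} is sketched below in Python. This is an assumption, not documented behavior: the sketch picks a row with probability equal to its percentage, then draws an integer uniformly between that row's minimum and maximum. The function name and the sample table are hypothetical.

```python
import random


def sample_tabular(table, rng):
    """Draw one value from a list of (percent, low, high) rows.

    Assumes the percentages sum to 100 and that values within a row are
    uniformly distributed integers; neither is confirmed by the text.
    """
    pick = rng.uniform(0, 100)
    cumulative = 0
    for percent, low, high in table:
        cumulative += percent
        if pick <= cumulative:
            return rng.randint(low, high)
    # Rounding edge case: fall back to the last row.
    return rng.randint(table[-1][1], table[-1][2])


# Hypothetical table: half of the values are small, half are large.
TABLE = [(50, 0, 9), (50, 100, 199)]
rng = random.Random(0)
print([sample_tabular(TABLE, rng) for _ in range(5)])
```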
file
Information about a single file is contained in a file entity. File entities are specified using the define file command. The define file command must provide a name for the file, the path to the directory where the file will reside and the size of the file. In addition, the file can be specified to be allocated in parallel with any other files, pre allocated with null data, and reused if it already exists.
If 'prealloc' is not set, then regardless of whether 'size' is specified, the file will not be preallocated (as expected). A file specified with 'prealloc' set and 'size' set to 0 will preallocate a zero length file. If 'prealloc' is set, but 'size' is not set, then a default file size of 0 will be used (the same as specifying 'size' as 0).
Syntax:
define file name=<name>,path=<pathname>,size=<size> [,paralloc] [,prealloc] [,reuse]
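The preallocation rules above reduce to a small decision, modeled here in Python. The function name and its use of `None` for "not preallocated" and "size not set" are hypothetical conventions chosen for the sketch.

```python
def prealloc_bytes(prealloc, size=None):
    """Return the number of bytes to preallocate, or None if nothing is preallocated."""
    if not prealloc:
        return None  # size is ignored when prealloc is not set
    # With prealloc set, a missing size behaves like size=0.
    return 0 if size is None else size


print(prealloc_bytes(False, 1024), prealloc_bytes(True), prealloc_bytes(True, 1024))
```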
fileset
Information about a group of related files is contained in a fileset entity. Fileset entities are specified using the define fileset command. The define fileset command must provide a name for the fileset and the path to the directory where the fileset will reside. In addition, it specifies the number of files to be created, their average size, the average width of the subdirectories in the directory tree which contains the files of the fileset, whether to pre-allocate files with null data (and if so, what percentage), and whether to reuse the fileset or recreate it if it already exists.
Syntax:
define fileset name=<name>,path=<pathname>,entries=<number> [,dirwidth=<width>] [,dirgamma=<directory gamma value>] [,size=<mean file size>] [,sizegamma=<>] [,prealloc=<percent to preallocate>]
Processes and Threads Entities
A Filebench process represents an operating system process and contains one or more threads. In turn, each Filebench thread represents an operating system thread of control and contains a collection of flowops. The following example illustrates how to specify a process, thread and collection of flowops.
Example:
set $nthreads=1

define process name=filewriter,instances=1
{
  thread name=filewriterthread,memsize=10m,instances=$nthreads
  {
    flowop appendfile name=write-file, filesetname=bigfileset, iosize=1m, fd=1, iters=20
    flowop closefile name=close,fd=1
    flowop finishoncount name=finish,value=1
  }
}
process
Process entities are used to hold attributes and other information about each operating system process. The define command is used to instantiate a given process entity, which may spawn multiple, identical copies if instances is specified as greater than 1 (the default is 1). The process(es) corresponding to the process entity may be run at a lower priority level by setting a value for the nice attribute.
Syntax:
define process name=<process name> [,instances=<number of instances>] [,nice=<additional niceness>] {
}
thread
Thread entities are used to hold attributes and other information about each operating system thread. They are defined within a process definition, and become part of that process, which may spawn multiple, identical copies if instances is specified as greater than 1 (the default is 1).
Threads can also have a region of memory allocated which certain flowops will then use as buffer space for I/O. This region is created by setting a value for the memsize attribute. If the useism attribute is set then IPC shared memory will be used, otherwise thread local memory will be used.
Syntax:
thread name=<thread name> [,instances=<number of instances>] [,memsize=<size of thread's memory>] [,useism] {
}
eventgen
Generates events at a specified rate (in events per second). Used by the eventlimit flowop. There is just one global event generator: you cannot define multiple event generators and assign them to different flowops.
Syntax:
eventgen rate=<rate>
Flowops
The flowop clause determines what a thread actually does. The operations and their syntax are covered in this section. The flowop actions can be divided into Basic I/O, Asynchronous I/O, Synchronization constructs and miscellaneous other operations.
Some of the flowop syntax is common to all flowops. They are defined with the flowop keyword within a thread definition. Immediately following the flowop keyword is the name of the particular operation that is to be performed. Following that is a list of attributes, of which the name and iters attributes are common to all flowops. The name attribute is required and provides the particular instance of the flowop with a name by which it can be referenced elsewhere. The name must be globally unique. The iters attribute is optional, but if specified allows the action specified by the flowop to happen multiple times each time the flowop is invoked.
Common Flowop Syntax:
flowop <operation type name> name=<name> [,iters=<number of iterations per invocation>] [additional attributes...]
Flowop I/O Operations
Reading and writing to files and filesets. On opening or creating a file, a file descriptor number can be specified, which will save the returned file descriptor for later use. Then operations on the already open file can reference it by its file descriptor number. Other read and write flowops will implicitly open a file for use if the filename or fileset name is provided without a file descriptor number. For filesets, specific files can be accessed by passing a file index number to the flowop, which can be obtained from a random variable to provide random file accesses. Otherwise files will be accessed round robin style.
read
Emulate POSIX read or pread. The flowop must include either a fileset or filename attribute. If a fileset attribute is supplied, the read operation will be done to a file from the fileset. Otherwise it will read the file specified by the filename attribute. If a fileset is specified along with an fd attribute, then the referenced file will be read. Otherwise the default fd=0 will be used or, if the opennext attribute is set, the flowop will pick the next file in sequence to read. If the file is not already open, this flowop will open it, using the directio and dsync attributes as described for the openfile flowop. The actual read is done to a random offset in the threadflow's thread memory, with a size set by the iosize attribute and at a random disk offset within the working set size if the random attribute is set, or at the next sequential location. The workingset attribute specifies the working set size for use in choosing the random disk offset.
Syntax:
flowop read name=<name>, filename|fileset=<fname>, iosize=<size> [,directio] [,dsync] [,iters=<count>] [,random] [,opennext] [,workingset=<size>] [,fd=<file-desc-number>] [,index=<file-index>]
readwholefile
Emulate a read of a whole file. The file from the supplied fileset fname that is referenced by the fd attribute (if it is supplied) or by the default fd=0, must be open. Reading from ordinary files (fileobjs, as would be specified by the filename attribute) is not currently supported. The readwholefile flowop then reads from the beginning of the file to the end, using zero or more reads of iosize bytes, followed by a read of any remainder smaller than iosize. If iosize is not defined or set to zero, then the file will be read in one read of filesize bytes.
Syntax:
flowop readwholefile name=<name>, fileset=<fname>, iosize=<size> [,iters=<count>] [,fd=<file-desc-number>] [,index=<file-index>]
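The way readwholefile splits a file into reads can be worked through with a short Python sketch. The function name is hypothetical and the code only models the arithmetic described above, not the flowop itself.

```python
def whole_file_reads(filesize, iosize=0):
    """Return the sizes of the reads needed to cover a file of `filesize` bytes."""
    if iosize <= 0:
        return [filesize]  # iosize unset or zero: one read of the whole file
    full_reads, remainder = divmod(filesize, iosize)
    sizes = [iosize] * full_reads
    if remainder:
        sizes.append(remainder)  # final short read for the leftover bytes
    return sizes


print(whole_file_reads(10 * 1024, 4 * 1024))
```

For a 10 KB file and a 4 KB iosize this yields two full reads followed by a 2 KB read.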
write
Emulate a write to a file. The size of the write is specified by the iosize attribute. If a fileset is specified, it writes to a file from the fileset referenced by the fd attribute, if it is supplied, to the default fd=0 file, or to the next file of the sequence if the opennext attribute is set. If a filename attribute is supplied instead, it will write to the named file. If the file is not already open, this flowop will open it, using the directio and dsync attributes as described for the openfile flowop. The flowop's workingset attribute will be used to set the maximum file size if it is non-zero, otherwise the filesetentry's fse_size will be used. The actual write is done to a random offset in the threadflow's thread memory, with a size set by the iosize attribute and at a random disk offset within the working set size if the random attribute is set, or at the next sequential location.
Syntax:
flowop write name=<name>, filename|fileset=<fname>, iosize=<size> [,directio] [,dsync] [,iters=<count>] [,random] [,opennext] [,workingset=<size>] [,fd=<file-desc-number>] [,index=<file-index>]
writewholefile
Emulate a write of a whole file. The size of the file is taken from a filesetentry identified by the srcfd attribute, while the file used for the write is identified by the fd attribute. Both default to the filesetentry associated with fd=0. Performs multiple writes of iosize length until the full file has been written. If iosize is not defined or set to zero, then a single write of the size of the source file is done.
Syntax:
flowop writewholefile name=<name>, fileset=<fname>, iosize=<size> [,dsync] [,iters=<count>] [,srcfd=<source file desc num>] [,fd=<file-desc-number>] [,index=<file-index>]
appendfile
Emulate a fixed size append to a file. Appends data to a file chosen from a fileset if one is specified with the fileset attribute, or if the fd attribute is non-zero and the file associated with that file descriptor is open. If a fileset is specified but the referenced file is not open, appendfile will open it. If no fileset or non-zero fd attribute is specified, then a file named by the filename attribute will be used. If no appropriate file can be found, filebench will terminate. While the workingset attribute is accepted, it is not currently used. Thus repeated invocation of the flowop for a given file will cause the file to grow arbitrarily large. The size of each append is set by the iosize attribute.
Syntax:
flowop appendfile name=<name>, filename|fileset=<fname>, iosize=<size> [,dsync] [,iters=<count>] [,workingset=<size>] [,fd=<file-desc-number>] [,index=<file-index>]
appendfilerand
Emulate a random size append to a file. Appends data to a file chosen from a fileset if one is specified with the fileset attribute, or if the fd attribute is non-zero and the file associated with that file descriptor is open. If a fileset is specified but the referenced file is not open, appendfilerand will open it. If no fileset or non-zero fd attribute is specified, then a file named by the filename attribute will be used. If no appropriate file can be found, filebench will terminate. While the workingset attribute is accepted, it is not currently used. Thus repeated invocation of the flowop for a given file will cause the file to grow arbitrarily large. Each invocation writes to the current end of the file with a random transfer size of at most iosize bytes.
Syntax:
flowop appendfilerand name=<name>, filename|fileset=<fname>, iosize=<size> [,dsync] [,iters=<count>] [,workingset=<size>] [,fd=<file-desc-number>] [,index=<file-index>]
File Meta Data Flowops
Opening, closing and stating files and filesets. On opening or creating a file, a file descriptor number can be specified, which will save the returned file descriptor for later use. Then operations on the already open file can reference it by its file descriptor number. For filesets, specific files can be opened or created by passing a file index number to the flowop, which can be obtained from a random variable to provide random file accesses. Otherwise files will be accessed round robin style.
createfile
Emulate create of a file. Associates createfile's fd attribute, if supplied, with the created file's operating system specific file descriptor so it can be referenced by other file operations. Selects a file entry (filesetentry) from the fileset whose file does not currently exist for the file create operation. Then performs an open operation on the file with the O_CREAT flag set to create the file. The file can be created with direct (as opposed to buffered) I/O by including the directio attribute, and writes can be forced to behave as defined by synchronized I/O data integrity constraints by setting the dsync attribute.
Syntax:
flowop createfile name=<name>,fileset=<fname> [,fd=<file-desc-number>] [,directio] [,dsync] [,index=<file-index>]
openfile
Emulates a file open operation. Associates openfile's fd attribute, if supplied, with the opened file's operating system specific file descriptor so it can be referenced by other file operations. However, openfile will fail if the supplied fd attribute is already associated with an open file. Selects a file entry (filesetentry) from the fileset whose file exists for the file open operation. Then performs a file open operation on the filesetentry's associated file. The file can be opened with direct (as opposed to buffered) I/O by including the directio attribute, and writes can be forced to behave as defined by synchronized I/O data integrity constraints by setting the dsync attribute.
Syntax:
flowop openfile name=<name>,fileset=<fname> [,fd=<file-desc-number>] [,directio] [,dsync] [,index=<file-index>]
closefile
Emulate close of a file. The file referenced by the fd attribute (if it is supplied) or by the default fd=0, must be open. Simply does a close operation on the referenced file.
Syntax:
flowop closefile name=<name> [,fd=<file-desc-number>]
fsync
Emulates fsync of a file. The file referenced by the fd attribute (if it is supplied) or by the default fd=0, must be open. Doing an fsync on an ordinary file (fileobj) works with "fd=0" (or "fd" not set). Simply does an fsync operation on the referenced file.
Syntax:
flowop fsync name=<name> [,fd=<file-desc-number>]
fsyncset
Emulate fsync of an entire fileset. Does an fsync operation on every open file of the fileset.
Syntax:
flowop fsyncset name=<name>,fileset=<fname>
statfile
Emulate stat of a file. Picks an arbitrary fileset entry with an existing file from the fileset specified by the fileset attribute, then performs a stat() operation on it.
Syntax:
flowop statfile name=<name>,fileset=<fname> [,fd=<file-desc-number>] [,index=<file-index>]
deletefile
Emulates delete of a file. Picks either an arbitrary, index specified, or file-descriptor-number specified, filesetentry whose file exists from the fileset specified by the fileset attribute, and deletes it.
Syntax:
flowop deletefile name=<name>,fileset=<fname> [,fd=<file-desc-number>] [,index=<file-index>]
Directory Flowops
Making, listing and removing directories. The fileset must have been defined to include empty directory entries in addition to or instead of files. Specific directories can be accessed by passing a directory index number to the flowop, which can be obtained from a random variable to provide random directory accesses. Otherwise directories will be accessed round robin style.
MakeDir
Emulates the mkdir command. Picks either an arbitrary or index specified filesetentry for a directory which does not exist in storage from the fileset specified by the fileset attribute, and makes it.
Syntax:
flowop makedir name=<name>,fileset=<fname> [,fd=<directory-desc-number>] [,index=<directory-index>]
ListDir
Emulates ls of a directory. Picks either an arbitrary or an index-specified filesetentry for a directory which exists in storage from the fileset specified by the fileset attribute, and lists it.
Syntax:
flowop listdir name=<name>,fileset=<fname> [,fd=<directory-desc-number>] [,index=<directory-index>]
RemoveDir
Emulates removal (rmdir) of a directory. Picks either an arbitrary or an index-specified filesetentry for a directory which exists in storage from the fileset specified by the fileset attribute, and removes it.
Syntax:
flowop removedir name=<name>,fileset=<fname> [,fd=<directory-desc-number>] [,index=<directory-index>]
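The three directory flowops can be chained in one thread, as in the hedged sketch below. It assumes a fileset named dirset (a hypothetical name) that was defined with empty directory entries; how such a fileset is defined is not covered in this section. No index is passed, so directories would be picked round robin as described above.

```text
# Hypothetical names; dirset is assumed to contain directory entries.
thread name=dirthread,memsize=1m,instances=1
{
  # Create a directory entry that does not yet exist in storage.
  flowop makedir name=mkdir1,filesetname=dirset
  # List a directory that exists in storage.
  flowop listdir name=lsdir1,filesetname=dirset
  # Remove a directory that exists in storage.
  flowop removedir name=rmdir1,filesetname=dirset
}
```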
Flowop Asynchronous I/O Operations
Filebench supports asynchronous writes and implements a mechanism to wait for their completion. An asynchronous I/O (aio) element associates each asynchronous write request with its subsequent completion. An aiowrite flowop adds an aio to the thread's aio list. An aiowait flowop waits for half of the current list to complete (minimum of 1), removing completed entries from the list.
aiowrite
Emulates a POSIX asynchronous write (aio_write()). The size of the write is specified by the iosize attribute. If a fileset is specified, the flowop writes to the file referenced by the fd attribute if it is supplied, to the default fd=0 file otherwise, or to the next file of the sequence if the opennext attribute is set. If a filename attribute is supplied instead, it writes to the named file. If the file is not already open, this flowop opens it, using the directio and dsync attributes as described for the openfile flowop. If the flowop's workingset attribute is non-zero it sets the maximum file size; otherwise the filesetentry's fse_size is used. The flowop issues the asynchronous write from a random offset in the threadflow's thread memory, with a size set by the iosize attribute, to a random disk offset within the working set size. This operation is currently valid only for random I/O, and returns an error if the flowop is set for sequential I/O.
Syntax:
flowop aiowrite name=<name>, filename|fileset=<fname>, iosize=<size>, random [,directio] [,dsync] [,iters=<count>] [,opennext] [,workingset=<size>] [,fd=<file-desc-number>]
aiowait
Emulates waiting for POSIX asynchronous I/O to complete. Waits for the completion of half the outstanding asynchronous I/Os, or a single I/O, whichever is larger. The routine returns after a sufficient number of asynchronous writes issued by any thread in the procflow have completed, or after a 1 second time-out elapses. All completed I/O operations are deleted from the thread's list of asynchronous I/Os in progress.
Syntax:
flowop aiowait name=<name>,target=<aiowrite-flowop>
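A minimal sketch of pairing the two flowops follows. The names aiothread, datafiles, aiowrite1 and aiowait1 are hypothetical, and the sizes are arbitrary; the random attribute is included because the text above states that aiowrite is only valid for random I/O.

```text
# Hypothetical names; datafiles is assumed to be a fileset defined elsewhere.
thread name=aiothread,memsize=10m,instances=1
{
  # Queue an 8k asynchronous write at a random offset within a 100m working set.
  flowop aiowrite name=aiowrite1,filesetname=datafiles,iosize=8k,random,workingset=100m
  # Wait for outstanding writes queued by aiowrite1 before the next pass.
  flowop aiowait name=aiowait1,target=aiowrite1
}
```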
Flowop Synchronization Operations
block
Blocks the threadflow until woken up by the wakeup flowop.
Syntax:
flowop block name=<name>
wakeup
Wakes up one or more blocked target flowops. The set of targets consists of all flowops whose name matches this flowop's target attribute.
Syntax:
flowop wakeup name=<name>,target=<block-flowop>
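The pairing of block and wakeup can be sketched with two threads, one that sleeps and one that wakes it. All names below are hypothetical, and the one-second delay is only there to pace the waker.

```text
# Hypothetical names; illustrates target matching between wakeup and block.
define process name=blockdemo,instances=1
{
  thread name=sleeper,memsize=1m,instances=1
  {
    # Blocks here until a wakeup flowop targets sleeper-block.
    flowop block name=sleeper-block
    flowop hog name=sleeper-work,value=1000
  }
  thread name=waker,memsize=1m,instances=1
  {
    flowop delay name=waker-delay,value=1
    # Wakes every blocked flowop whose name matches the target.
    flowop wakeup name=waker-wakeup,target=sleeper-block
  }
}
```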
semblock
Attempts to pass a semaphore and blocks if necessary. Can be compiled to use either System V or POSIX semaphores; the currently set source code defines produce the System V version.
Syntax:
flowop semblock name=<name>, value=<decrement-to-receive>, highwater=<inbound-queue-max>
sempost
Posts to a set of semblock flowops identified by the target attribute. Either System V or POSIX semaphores are used, as described for semblock.
Syntax:
flowop sempost name=<name>, target=<semblock-flowop>, value=<increment-to-post>
Flowop Misc Operation
hog
Consumes CPU cycles and memory bandwidth by looping for value iterations, while setting the first byte of the thread's memory region to 1 on each iteration.
Syntax:
flowop hog name=<name>,value=<number-of-mem-ops>
delay
Delays for value number of seconds using the user sleep routine.
Syntax:
flowop delay name=<name>,value=<number-of-seconds>
eventlimit
Completes one invocation per posted event. If events are available, it removes one and continues to the next flowop; otherwise it blocks until one or more new events are posted. Events are generated by the eventgen object.
Syntax:
flowop eventlimit name=<name>
bwlimit
Blocks the calling thread if the number of bytes of I/O issued exceeds one megabyte times the number of posted events, thus limiting the average I/O byte rate to one megabyte times the event rate. To set the event rate, see the rate attribute. If a target flowop is specified, then the I/O bandwidth produced by that particular flowop (separately for each thread) sets the limit.
Syntax:
flowop bwlimit name=<name> [,target=<io-producing-flowop>]
iopslimit
Blocks the calling thread if the number of issued I/O operations exceeds the number of posted events, thus limiting the average I/O operation rate to one I/O per event. If a target flowop is specified, then the I/O operations produced by that particular flowop (separately for each thread) set the limit.
Syntax:
flowop iopslimit name=<name> [,target=<io-producing-flowop>]
opslimit
Blocks the calling thread if the number of issued filebench operations exceeds the number of posted events, thus limiting the average filebench operation rate to one per event. If a target flowop is specified, then the operations (generally the number of times called) produced by that particular flowop (separately for each thread) set the limit.
Syntax:
flowop opslimit name=<name> [,target=<any-flowop>]
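The rate-limiting flowops above all consume events posted by the event generator, so a sketch needs both an eventgen command and a limiting flowop. The example below assumes the rate is expressed in events per second, which this page does not state explicitly; all names are hypothetical.

```text
# Assumption: rate is in events per second.
eventgen rate=100

# Hypothetical names; bigfileset is assumed to be defined elsewhere.
thread name=limitedreader,memsize=10m,instances=1
{
  flowop read name=randread1,filesetname=bigfileset,iosize=8k,random
  # Limits randread1 to about one I/O per posted event in each thread.
  flowop iopslimit name=iops-limit,target=randread1
}
```

Replacing iopslimit with bwlimit in the same position would, under the same assumption, cap the targeted flowop at roughly one megabyte per posted event instead.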
finishoncount
Stops the filebench run when the number of I/O operations specified by value has been performed. If a target flowop is specified, then the operation count produced by that particular flowop (separately for each thread) determines when to stop.
Syntax:
flowop finishoncount name=<name>,value=<ops-count> [,target=<any-flowop>]
finishonbytes
Stops the filebench run when the number of bytes of I/O specified by value has been read and/or written. If a target flowop is specified, then the I/O bytes produced by that particular flowop (separately for each thread) determine when to stop.
Syntax:
flowop finishonbytes name=<name>,value=<bytes> [,target=<io-producing-flowop>]
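As a hedged illustration of the two stop conditions, the thread below ends the run after a fixed number of writes by one flowop. The names and the count of 10000 are arbitrary choices for the sketch.

```text
# Hypothetical names; bigfileset is assumed to be defined elsewhere.
thread name=boundedwriter,memsize=10m,instances=1
{
  flowop write name=write1,filesetname=bigfileset,iosize=8k,random
  # Stop the run once write1 has performed 10000 operations.
  flowop finishoncount name=finish1,value=10000,target=write1
}
```

Using finishonbytes with a byte count in value would stop the run on the amount of data transferred rather than on the operation count.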
Attributes
The behavior of processes, threads and flowops can be modified by supplying them with attributes. Some attributes are required; others are optional and have default values which are used if they are not supplied. Some attributes are booleans, that is, they are true if supplied and false if not. Other attributes take numeric or string values, sometimes with default values if a value is not supplied. The supplied values may either be a constant string or integer, as appropriate, or may be a string or integer variable which has had a value assigned to it by a set command. For example, directing a fileset to preallocate 100% of its files can be done by any of the following:
define fileset name=foo, prealloc

define fileset name=foo, prealloc=100

set $preallocpercent = 100
define fileset name=foo, prealloc=$preallocpercent
In this section, <intval> will be used to designate either an integer or an integer variable, while <strval> will be used to designate either a string or a string variable.
cached
When specified, this attribute prevents the filesystem caches from being flushed after file or fileset creation or reuse. If not specified, filebench will attempt to flush filesystem caches so as to prevent a workload's I/Os from running completely out of filesystem cache.
Note: for ZFS we will still do the 'zpool export/import' (via the 'fs_flush' script to get rid of the cached copy in the ARC) even if "cached" is set to true.
Syntax:
define file | fileset ... ,cached ...
dirwidth
The dirwidth attribute of a fileset specifies the average number of entries in each directory created as part of the fileset. Filebench also uses this in combination with the total number of files in the fileset to calculate the required mean depth of the fileset's directory tree. The default is a dirwidth of 0, which specifies a single level directory containing all the files of the fileset.
Syntax:
define fileset ... ,dirwidth=<intval>
dirgamma
The dirgamma attribute of a fileset specifies the alpha parameter of the gamma distribution which is used to decide whether a given subdirectory contains files or additional subdirectories. If the dirgamma attribute is not specified, it defaults to 1500. The value can range from 100 to 10000, which corresponds to a gamma of 0.1 to 10.
Syntax:
define fileset ... ,dirgamma=<intval>
namelength
The namelength attribute is not presently supported.
filesize
The filesize attribute of a file or fileset specifies the size of the file(s) that will be created. For a fileset whose filesizegamma attribute is non-zero, the filesize attribute specifies the mean file size, with the actual size of each file drawn from a gamma distribution whose alpha is set by the filesizegamma attribute.
Syntax:
define file | fileset ... ,filesize=<intval>
filesizegamma
The filesizegamma attribute of a fileset specifies the alpha parameter for the gamma distribution used to select the size of each file created as part of a fileset, where the mean file size is specified by the filesize attribute. If the filesizegamma attribute is set to zero, all files will be created with filesize number of bytes. This attribute also specifies the mean width of each created subdirectory, with the actual width selected from a gamma distribution with alpha of dirgamma. If the filesizegamma attribute is not specified, it defaults to 1500. The value can range from 100 to 10000, which corresponds to a gamma of 0.1 to 10.
Syntax:
define fileset ... ,filesizegamma=<intval>
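The two definitions below sketch how filesize and filesizegamma interact, under the assumption (implied by the stated ranges) that the attribute value is the gamma multiplied by 1000. The fileset names, path and sizes are hypothetical.

```text
set $dir=/tmp

# Mean file size 16k; sizes drawn from a gamma distribution with
# filesizegamma=1500 (the stated default, i.e. a gamma of 1.5).
define fileset name=variedset,path=$dir,entries=1000,dirwidth=20,filesize=16k,filesizegamma=1500

# filesizegamma=0: every file is created with exactly filesize bytes.
define fileset name=fixedset,path=$dir,entries=1000,dirwidth=20,filesize=16k,filesizegamma=0
```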
directio
Specifies opening the file in direct, rather than buffered, I/O mode. Essentially bypasses filesystem caches, so each I/O request results in an actual I/O to the attached device. Needs to be specified with the flowop that opens the file, which is often openfile, but can be one of the other I/O flowops.
Syntax:
flowop read | aiowrite | openfile | write | appendfile | appendfilerand ... ,directio
dsync
Specifies the use of synchronous writes, which do not complete until the attached device has written the data to non-volatile storage. Not only does this disable filesystem write-back caching, it is also supposed to prevent the device (such as an attached disk drive) from doing write-back caching. This attribute needs to be specified with all flowops that might open the file, as the file must be opened as a synchronous file for this attribute to be effective. While openfile is often used for that purpose, any of the other flowops will open a file if it is not already open, so they may need the attribute defined as well.
Syntax:
flowop read | aiowrite | openfile | write | appendfile | appendfilerand ... ,dsync
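A small sketch of where the directio and dsync attributes would go is shown below. Because any flowop that finds the file closed may open it, dsync is repeated on the write flowop as the text above recommends. All names are hypothetical.

```text
# Hypothetical names; bigfileset is assumed to be defined elsewhere.
thread name=syncwriter,memsize=10m,instances=1
{
  # Open in direct I/O mode with synchronous writes.
  flowop openfile name=open-sync,filesetname=bigfileset,fd=1,directio,dsync
  # Repeat dsync here in case this flowop ends up opening the file.
  flowop write name=write-sync,fd=1,iosize=8k,random,dsync
  flowop closefile name=close-sync,fd=1
}
```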
fd
The fd attribute of a flowop explicitly sets the file descriptor on which the file is opened. This is useful where the script emulates an application that has a number of files open on different descriptors, or that does involved opens and closes using a limited or extended range of descriptors. In the example, an arbitrary file from the fileset bigfileset is opened by the first flowop and assigned file descriptor 1. The read flowop which follows also references file descriptor 1, so it reads whichever file was opened by the first flowop. Finally, the closefile flowop closes the file referenced by file descriptor 1.
Example:
thread name=filereaderthread,memsize=10m,instances=$nthreads
{
  flowop openfile name=openfile,filesetname=bigfileset,fd=1
  flowop read name=readfile1,fd=1
  flowop closefile name=closefile,fd=1
}
Syntax:
flowop read | aiowrite | openfile | write | writewholefile | appendfile | appendfilerand ... ,fd=<integer>
srcfd
The srcfd attribute specifies the file descriptor to use as a source of filesize information when invoking the writewholefile flowop. In the example below the code is emulating a copy file operation, where the file is read in then written out to a new file, which, of course, would end up with the same size as the original.
Example:
thread name=filereaderthread,memsize=10m,instances=$nthreads
{
  flowop openfile name=openfile1,filesetname=bigfileset,fd=1
  flowop readwholefile name=readfile1,fd=1
  flowop createfile name=createfile2,filesetname=destfiles,fd=2
  flowop writewholefile name=writefile2,filesetname=destfiles,fd=2,srcfd=1
  flowop closefile name=closefile1,fd=1
  flowop closefile name=closefile2,fd=2
}
Syntax:
flowop writewholefile ... ,srcfd=<integer>
opennext
The opennext attribute is used with I/O flowops to indicate that the flowop should open a different file with each invocation. This attribute only applies to filesets and is meaningless when used with a simple file.
Syntax:
flowop read | aiowrite | openfile | write | appendfile | appendfilerand ... ,opennext
filename
The filename attribute specifies the name of a file. It is used with I/O flowops to specify which particular file to access.
Syntax:
flowop read | aiowrite | openfile | write | appendfile | appendfilerand ... ,filename=<strval>
filesetname
The filesetname attribute specifies the name of a fileset. It is used with I/O flowops to specify which particular fileset to access.
Syntax:
flowop read | aiowrite | openfile | write | writewholefile | appendfile | appendfilerand ... ,filesetname=<strval>
instances
When processes and threads are specified using the define command, the instances attribute may be used to request multiple copies of the defined process or thread. If used with the process definition, the requested number of operating system processes will be created, each with its own copy of the threads and flowops included in the definition. Similarly, if used with a thread definition, the requested number of threads will be created for each operating system process, each with its own copy of the specified flowops. If the instances attribute is not included, then a single instance of the process or thread is created.
Syntax:
define process name=<procname> ... ,instances=<intval>
thread name=<threadname> ... ,instances=<intval>
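The multiplication of process and thread instances can be seen in a short sketch. With the hypothetical values below, two operating system processes are requested, each with four threads, for eight threads in total, each running its own copy of the flowop.

```text
# Hypothetical names; bigfileset is assumed to be defined elsewhere.
define process name=filereader,instances=2
{
  thread name=filereaderthread,memsize=10m,instances=4
  {
    flowop read name=read1,filesetname=bigfileset,iosize=8k,random
  }
}
```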
iosize
The iosize attribute is used with I/O flowops to specify the size of the I/O operations (e.g. disk reads) that they will perform. It defaults to zero if omitted.
Syntax:
flowop read|aiowrite|openfile|write|appendfile|appendfilerand ... ,iosize=<intval>
iters
Individual flowops may be executed multiple times each time they are invoked by setting the iters attribute to the desired number of executions. If not specified, the flowop will only be executed once each time it is invoked.
Syntax:
flowop ... ,iters=<intval>
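As a brief sketch, the thread below uses iters so that a single open is followed by eight reads before the file is closed on each pass through the flowop list. Names and sizes are hypothetical.

```text
# Hypothetical names; bigfileset is assumed to be defined elsewhere.
thread name=burstreader,memsize=10m,instances=1
{
  flowop openfile name=open1,filesetname=bigfileset,fd=1
  # Executed eight times each time the thread reaches this flowop.
  flowop read name=read8,fd=1,iosize=8k,iters=8
  flowop closefile name=close1,fd=1
}
```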
memsize
This parameter in the thread clause sets the size of the private memory segment of the thread.
Example:
define process name=filewriter,instances=1
{
  thread name=filewriterthread,memsize=10m,instances=1
  {
    flowop ....
  }
}
Syntax:
thread ... ,memsize=<intval>
name
Files, filesets, processes, threads and flowops all require names, which are set with the name attribute. Names must be globally unique within an entity type. Thus if you have two processes, each of which has a read type flowop, you must make sure that both read flowops have unique names, such as name=read1 name=read2.
Syntax:
define file|define fileset|define process|thread|flowop name=<strval> ...
nice
The nice attribute allows you to lower the priority of the process (or set of processes, if multiple instances are requested) below what it would otherwise be. Note that all processes are automatically set to a lower priority than the master process controlling the run. If you want a particular process to run at a lower priority than the others, specify nice with an integer value.
Syntax:
process ... ,nice=<intval>
entries
Filesets are typically used to create a group of files, and the entries attribute is used to set the number of such files. If the entries attribute is not specified, only one file will be created.
Syntax:
define fileset ... ,entries=<intval>
prealloc
The files defined by the file or fileset entities can exist either as potential files or as actual ones. For a potential file, information about it is kept by the file or fileset entity, but it does not occupy disk space or exist in a directory. Files that do not exist can be created later with the createfile flowop. When used with a file, the prealloc attribute specifies that the file should actually exist. When used with a fileset, it specifies the percentage of files that should actually exist, with a default value of 100%.
Syntax:
define file | fileset ... ,prealloc[=<intval>]
paralloc
Use of this attribute can speed up the preallocation of files by creating and writing them in parallel. However, it only works with files at present, not filesets.
Syntax:
define file ... ,paralloc
reuse
The reuse attribute allows the reuse of existing files or filesets which have the same name as the specified file or fileset. If the file is too large it will be truncated, and if it is too small it will be rewritten. A fileset with a matching name will also be reused, with individual files adjusted to match their new specified sizes.
Syntax:
define file | fileset ... ,reuse
path
The path directive is used in the fileset clause to set the prefix path for the dataset. It is often set at the head of the script or passed in as a parameter:
Example:
set $dir=/tmp
define fileset name=myset,path=$dir,size=16k,entries=1000,dirwidth=20
Syntax:
define fileset ... ,path=<strval>
random
The random attribute is used with the read, write and aiowrite flowops to specify that a random location within the file be picked for access. Without this attribute, the next sequential file blocks will be read or written.
Syntax:
flowop read | aiowrite | write ... ,random
rate
The rate attribute is used to set the event generation rate for the event generator as part of the eventgen command.
Syntax:
eventgen rate=<intval>
size
The size attribute is used with the define file and define fileset commands. For define file it sets the size of the file. For define fileset, it sets the mean size of the files, with the actual size set by the gamma random distribution specified for the fileset.
Syntax:
define file | fileset ... ,size=<intval>
target
Certain flowops depend on actions in other flowops, in which case they need a target attribute to tell them the name of the flowop on which they depend. Prime examples are the semaphore flowops, as illustrated in the following excerpt from the oltp.f workload:
Example:
...
flowop semblock name=lg-block,value=3200,highwater=1000
...
flowop sempost name=shadow-post-lg,value=1,target=lg-block,blocking
...
Here the sempost flowop has a target of lg-block, the name of the associated semblock flowop. The actual semaphore is created by the semblock flowop, but the sempost flowop will use the lg-block target to find it and do the requested post operation on it.
Syntax:
flowop ... ,target=<strval>
useism
The useism attribute tells the thread to use shared memory for its thread memory region.
Syntax:
thread ... ,useism
value
The value attribute is used for passing miscellaneous integer values to the flowops. The exact meaning is dependent on the particular use.
Syntax:
flowop ... ,value=<intval>
workingset
The workingset attribute is used by some I/O flowops to specify a maximum byte range of the file that will actually be read from or written to. This can be less than the actual file size, or for writes may also be larger than the current size, where it serves to set the maximum size the file can grow to.
Syntax:
flowop read | aiowrite | write | appendfile | appendfilerand ... ,workingset=<intval>
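A hedged sketch of the workingset attribute follows: the read below is limited to a byte range of at most 64m, regardless of how large the underlying file is. The names and sizes are hypothetical.

```text
# Hypothetical names; bigfileset is assumed to be defined elsewhere.
thread name=hotsetreader,memsize=10m,instances=1
{
  # Random 8k reads confined to a byte range of at most 64m.
  flowop read name=hotread,filesetname=bigfileset,iosize=8k,random,workingset=64m
}
```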
blocking
Used with semaphore flowops.
Syntax:
flowop sempost|semblock ... ,blocking
highwater
Used with the semblock semaphore flowop.
Syntax:
flowop semblock ... ,highwater=<intval>