Man page of COLLGUI


Section: User Commands (1)
Updated: LOCAL


collgui - a graphical front-end for collect  


collgui [-d] [-vga] [-size <num>] [<file> [<file> [...]]]  


-vga
    Allows collgui to be used on a VGA-sized (640x480) display.
-size <num>
    Sets the font size to <num>, which influences the size of the window. The default size is 12.
-d
    Turns on debugging output.


collgui is intended to help evaluate data collected by collect. It operates as a go-between among collect, cfilt, and gnuplot. That is why it is a good idea to understand cfilt, especially if you want to do complicated or non-standard things. Help on cfilt can be obtained by reading the cfilt(1) manpage or running cfilt -h.

collgui automates the extraction of information from a binary data file written by collect, and feeds it to gnuplot to produce a, hopefully, useful graphical rendition of the data. It saves you having to type the same stuff repeatedly.

collgui, and, more importantly, cfilt, are dogs. They just don't run very fast. As far as collgui is concerned, speed is not important -- it just takes a while to start! However, cfilt must wade through oodles of text to get the data it wants. Perl is very nice for wading through data, though. I also realize that cfilt is unreadable and very much of a hack; however, I've made a pretty serious attempt to make sure that it does what you tell it to.

As you will see, collgui offers two different methods for selecting samples for graphing. Close to the top, you can set the START: and END: times. These arguments get passed to collect, so, if your data-file contains lots of samples, but you only want a fraction of them, using START: and END: will substantially speed up your extraction because the selection is handled by collect itself. Further down, you can set an X Range for gnuplot. This has more-or-less the same effect as setting START: and END:, but collect provides all samples, and cfilt must extract for all selected subsystems from this data. This is slower. The difference is where the time-selection is done. Of course, you may want to set START: and END:, and still set the X Range in order to give gnuplot explicit instructions as to what should be displayed -- gnuplot tends to want to use round numbers for the beginning and end of ranges.

When you save a user-defined setting/configuration, a unique ID is saved with it, consisting of the filename (no path) plus the file size. When you recall this setting, if the unique ID of your current 'open' data-file matches the saved one, things like START:, END:, X-range, Y-range, average samples, X-units, and 'samples w/process data' are also restored. If the unique IDs don't match, then only the subsystem settings are restored.
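
The matching rule above can be illustrated with a short Python sketch. This is not collgui's actual code: the function names `settings_id` and `restore_scope` and the ':' separator between filename and size are assumptions made only for illustration.

```python
import os

def settings_id(path):
    """Build a hypothetical unique ID: base filename (no path) plus file size."""
    name = os.path.basename(path)
    size = os.path.getsize(path)  # size in bytes
    # Assumption: the ':' separator is invented here; the text above only
    # says the ID consists of the filename plus the file size.
    return f"{name}:{size}"

def restore_scope(saved_id, current_path):
    """Report what would be restored when a saved setting is recalled."""
    if saved_id == settings_id(current_path):
        # IDs match: time ranges, axis ranges, etc. come back as well.
        return "all"
    # IDs differ: only the subsystem settings are restored.
    return "subsystems-only"
```

Because the path is not part of the ID, moving a data file to another directory would not change the result, whereas appending more samples to it (and so changing its size) would.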


There are certain 'features' of collgui that are worthy of mention. In particular, the mechanism by which one of many objects (such as LSM Volumes, Disks, Tapes, Single CPUs) can be selected is a bit particular. If there are fewer than a fixed number of objects (~30), a MenuButton is created (when 'Add' is pressed, a vertical list is presented). If the number is greater than this constant, a separate window is created with a listbox containing all possible objects. A double-click on an object in the listbox will add it to the selection listbox.

The selection mechanism for processes deserves special mention: this is always a separate window with a listbox, a slider marked 'sample', and a button marked 'List Processes' next to it. Using the slider, a sample (record) can be selected from the collection period, and when 'List Processes' is pressed, all processes from that sample are displayed. Double-clicking on a process will enter its PID in the selection listbox. Currently, it is only possible to select processes using their PIDs. In the future it may be possible to select using usernames or commands. At the top of each column is a button which turns red when the mouse is over it. Pressing the button will cause the list to be sorted using the values in the button's column.



A look at the main window of collgui, from top to bottom:


File

'Opens' a collect binary data file. A dialog box with a directory browser will open. At the top of the box is a window with a list of input datafiles that have been selected. Select one or more binary collect files and press "OK" when finished. The order in which the input files appear in the list is the order in which collect will read them.
guess!

Options

Sets the position of the labels for the various lines graphed.
Labels (or not) the X Axis. This can save valuable graph real estate.
Controls the format of the X-Axis tic-mark labels.
Allows the user to specify the label for the Y-Axis, rather than using the default, "KB/Transfers/Packets/Pages/etc".
Allows the user to choose JPEG, PPM (Portable Pixmap - Color), or PBM (Portable Bitmap - Grayscale).

Settings

Saves the current configuration -- that is, all the information needed to reproduce the current graph. The information is saved in $HOME/.collguirc and read in on start.
You are presented with a list of user-defined settings. Double-clicking on an entry will remove it from the list; clicking on 'commit' will save your changes.
I have defined some basic settings for looking at data, not necessarily very useful.
Here you can choose settings previously stored.

FILE indicator, START: and END:

The FILE indicator shows you the name of the currently 'open' collect binary data file.

The START: and END: areas allow you to specify a time-range for samples to be extracted from the binary data file. They are set to the times of the first and last sample respectively (i.e., the whole run) when you open a file. The 'RESET' button at the bottom of the window will restore these to their default values. These values are passed directly to collect during playback, so this is the fastest method for extracting a sub-range from your collection period.

X: From: To: and Y: From: To:

These areas allow you to specify gnuplot X and Y ranges. The X entry-windows are wider than those of Y because you can specify a time when 'time' is chosen for X Axis Units.

Average Samples and X Units:

The average samples area allows you to cause cfilt to take an average over N samples. That is, for N samples read from the binary data file, cfilt will produce 1 output line with average values. X Units allows you to specify either 'time' or 'samples' for the horizontal axis of the graph.
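
The averaging described above can be sketched in Python as follows. This is an illustration only, not cfilt's implementation (cfilt is a Perl script); the function name `average_samples` is hypothetical, and the handling of a short trailing block is an assumption.

```python
def average_samples(values, n):
    """Collapse each run of n consecutive samples into one averaged value."""
    if n < 1:
        raise ValueError("n must be at least 1")
    averaged = []
    for start in range(0, len(values), n):
        block = values[start:start + n]
        # Assumption: a short trailing block is averaged over its own length;
        # what cfilt does with leftover samples is not stated in this page.
        averaged.append(sum(block) / len(block))
    return averaged
```

For example, `average_samples([1, 2, 3, 4, 5, 6], 3)` yields two output values, one per group of three input samples.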

Samples w/Process Data:

This solves the problem of graphing intermittently gathered process data against constantly gathered other data. For example, if you specified an interval for collect of -i1,4, thereby collecting process data only every four seconds, and you try to graph that against, say, cpu idle time, which was gathered every second, you will get zeros for the process data for samples in which no process data appears. Clicking on the 'Samples w/Process Data' checkbutton will cause, via 'cfilt -p', only samples with process data to be used. In the above example, it would be the equivalent of taking every 4th sample (actually it would mean taking samples #1,5,9,13,17,etc).
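
The sample numbers in the example above can be reproduced with a small Python sketch. The function name `samples_with_process_data` is hypothetical, and the sketch assumes, as the example does, that the first sample carries process data.

```python
def samples_with_process_data(total_samples, process_every):
    """Return the 1-based sample numbers that include process data.

    Assumption: sample #1 includes process data, and every
    `process_every`-th sample after it does too (e.g. 4 for -i1,4).
    """
    return list(range(1, total_samples + 1, process_every))
```

With 17 samples and process data every fourth sample, this gives samples 1, 5, 9, 13 and 17, matching the list above.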

Subsystem Widgets:

The first 6 subsystems have multiline output from collect. Therefore it is possible to select using certain criteria (see cfilt). The last 2 only have a single output line from collect. When you 'open' a binary data file, the 'Add' and 'Expressions' controls are enabled only for those subsystems for which data was collected or that exist. Otherwise 'Expressions' will be grayed-out, and the 'Add' button will say 'No Data'. The widget controls do nothing more than give you a graphical front-end to cfilt. The 'Expressions' menu offers a comfortable way of choosing cfilt expressions. The 'sum' checkbutton corresponds to the '+' sign at the end of the subsystem name in an expression. The values in the listbox correspond to the selection criterion for cfilt, that is <tag>=<value>,<value>,...,<value>. You are perfectly welcome to type in the expression window itself. If you specify an illegal expression, you will get an error message.


Single CPU

This will create a graph consisting of a set of horizontal 'bands', one per CPU. It is recommended to select all CPUs when using this display option.

CPU Summary

This graphs four values for all CPUs: USER+SYS+IDLE+WAIT, USER+SYS+IDLE, USER+SYS, and USER, as in 'SMP Stack' under 'Single CPU'.


DISPLAY will graph your selections to the gnuplot 'x11' output device, whereas PRINT will set the device to 'postscript', and ask you for an output filename (which may be "|lpr -Pfoobar", to route directly to printer). You can also set the environment variable COLLGUI_PRINT to a default you like (you still get prompted for a filename, but get the contents of this variable COLLGUI_PRINT as default). The {JPEG|PPM|PBM} button produces an image file in the corresponding format. RESET clears most settings, and sets the rest to reasonable startup values (like START: and END: time).



Here is a quick guide for those who want to jump in without looking at the cfilt readme/help:

Let's take 'Disks' for an example. If you click on 'Expressions' and select 'KB/Sec', and then, without selecting any specific disks, click on the 'DISPLAY' button, you will get the TOTAL KiloBytes/Second throughput for all disks for which data was collected. Data is totalled because the list on the right is empty. cfilt assumes, since you have not selected any particular disk(s), that you want a grand total. If, however, you now add 'rz0' and 'rz1' (assuming these disks exist on your system, and you collected data for them), you will get two lines graphed: KB/Sec for rz0 and KB/Sec for rz1. Now if you click on Expressions and select '%Busy', you will get 4 lines: 'KB/Sec' and '%Busy' for rz0, and 'KB/Sec' and '%Busy' for rz1. Now if you click on the 'sum' checkbutton (and then on DISPLAY), you will get only 2 lines this time: 'KB/Sec' for rz0+rz1, and '%Busy' for rz0+rz1. 'sum' sums over all objects in the listbox, or over all objects for which data was collected if no specific object has been selected (i.e., the listbox is empty).
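
The walk-through above boils down to a simple rule for how many lines end up on the graph. The Python sketch below models only that rule; the function name `graphed_lines` and the label strings it returns are hypothetical and are not what collgui or cfilt actually print.

```python
def graphed_lines(expressions, objects, use_sum):
    """List one illustrative label per line that would be graphed."""
    if not objects:
        # Empty listbox: each expression is totalled over all objects
        # for which data was collected.
        return [f"{e} (total)" for e in expressions]
    if use_sum:
        # 'sum' checked: one line per expression, summed over the listbox.
        joined = "+".join(objects)
        return [f"{e} for {joined}" for e in expressions]
    # Otherwise: one line per (object, expression) pair.
    return [f"{e} for {o}" for o in objects for e in expressions]
```

Calling `graphed_lines(["KB/Sec", "%Busy"], ["rz0", "rz1"], False)` returns four labels, and the same call with `use_sum=True` returns two, as in the Disks example.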

It is sometimes useful to graph dissimilar data together, for example cpu idle and disk KB/sec. Using gnuplot (at the moment) one only has one vertical scale. In order to get such incongruous data together in a reasonable fashion on the same graph, data may have to be 'normalized' ('scaled' to fit into a particular range, typically 0-100). Sticking a percent sign ('%') on the end of an expression will cause this data to be normalized. For all reasonable expression possibilities, I have offered 'Normalized' and 'Raw' options. The only difference is the '%' on the end. You can also choose the end of the normalized range yourself by giving that value after the '%', for example: disk:rkb/s+wkb/s%150
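
To make the idea of normalizing concrete, here is a minimal Python sketch. It assumes a simple linear scaling in which the largest value in the series is mapped to the end of the range; how cfilt actually computes its normalized values is not described here, so both the function name `normalize` and the scaling rule are assumptions.

```python
def normalize(values, range_end=100.0):
    """Scale values linearly so the largest one lands on range_end.

    Assumption: peak-based scaling from zero; cfilt's real method may differ.
    """
    peak = max(values, default=0)
    if peak == 0:
        # Nothing to scale (empty or all-zero series).
        return [0.0 for _ in values]
    return [v * range_end / peak for v in values]
```

Under this assumption, a trailing '%' corresponds to the default `range_end` of 100, and an expression ending in '%150' corresponds to `range_end=150`.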



You can customize some things in collgui using the usual Tk class names. For example, you could set your default background to red by adding something like:

Collgui*background: red

to your ~/.Xdefaults file, and then merging this change into your in-memory resource database using:

xrdb -merge ~/.Xdefaults



Default printer file; can also be "|lpr -lpr -flags -here" (see gnuplot help "set output").
The name/path of collect, if not collect, or if collect is not in your path (for example 'collect3' or '/usr/foo/bin/collect4').
The name/path for cfilt, if not cfilt, or if cfilt is not in your path.
The name/path for gnuplot, if not gnuplot, or if gnuplot is not in your path.
The name/path for the cjpeg program, which is used to convert PPM image files to JPEG.
The name/path for the ppmtogif program, which is used to convert PPM image files to GIF format.




This document was created by man2html, using the manual pages.
Time: 02:42:46 GMT, October 02, 2010