colgui --machines machinesfile [-switches]
colgui --hosts pattern [-switches]
colgui --address addresses [-switches]
Provides a graphical user interface to collectl, displaying real-time graphs for one or more hosts. By default, plots are generated for the local system. Other or additional systems can be specified in three ways: via a file containing a list of their addresses, by applying an appropriate filter to the hosts listed in /etc/hosts, or by giving one or more addresses on the command line.
The easiest way to get started is to use one or more of the following switches, which many people find meet most of their needs. Over time the need may arise to change the way the display looks, modify the data collection itself, simultaneously log the data as it is being collected, or even change the way colgui connects to remote systems. More advanced switches are provided for those situations.
When first getting started, you can use the following switches to generate plots for your local system. To generate remote plots see the following section on "Host Selection".
- The frequency at which data should be collected. This is passed unaltered to collectl as -i.
- The number of plots displayed in a row before starting a new row. By default, a new row is automatically started for each host. See --geometry to alter this behavior.
- Select the plots to display by the "standard" subsystems that collectl uses. This too is passed unaltered to collectl as -s.
- The hosts are chosen from the /etc/hosts file by executing the command "grep pattern /etc/hosts". The display form of the hostname will be taken from the second field if it is defined. When using "--geometry nd", the third column will be used.
- One or more host names, separated by spaces and quoted if necessary. To display a shorter hostname when "--geometry nd" is chosen, append that synonym to the hostname, separated by a colon.
- The machinesfile is a text file similar in format to the /etc/hosts file. See below for more details on the format and how the 2nd and 3rd names (if specified) will be used.
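The "hostname:synonym" convention described above for address lists can be illustrated with a small Python sketch. This is not colgui's own code; the function names parse_addresses and dense_label are hypothetical, and the sketch assumes plain host names or IPv4 addresses (an IPv6 literal would collide with the colon separator).

```python
def parse_addresses(value):
    """Split a space-separated address list into (host, short_name) pairs.

    short_name is None when no ':synonym' suffix was given.
    Hypothetical helper, for illustration only.
    """
    pairs = []
    for token in value.split():
        host, sep, short = token.partition(":")
        pairs.append((host, short if sep and short else None))
    return pairs


def dense_label(host, short_name):
    """Name to show in dense geometry: the synonym if given, else the host."""
    return short_name or host


pairs = parse_addresses("node1.example.com:n1 node2")
labels = [dense_label(host, short) for host, short in pairs]
```

Here labels would hold the short name for the first host and the full name for the second.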
Alternate Plot Selection
These additional plot selection options can be used in any combination with or without -s.
- Select one or more plots, many of which can also be selected by -s. For more information see "Plot Selection" further below.
- Select one or more custom, user-developed plots. See /opt/colplot/examples/*cfg for examples of how these work.
ADVANCED PLOTTING SELECTIONS
The first set of these affect the size of individual plots and how they are displayed.
- Change the size of the x-axis to be n-intervals wide, where an interval corresponds to "-i int" seconds.
- Change the size of the y-axis to be "int" pixels high.
--geometry [n, c, nd, cd]
Choose the display geometry. By default, everything displays in "normal" mode,
that is a new row is started for each host. In "compact" mode, each row is
filled to the number of plots specified by -r.
Dense modes, specified by adding the "d" modifier to one of the other two modes, remove many of the elements common to each plot and display them elsewhere, providing more efficient use of screen real estate, something that becomes more important as the number of plots grows.
NOTE - colgui always generates the same number of plots for all systems. This means that when doing detail plots, where the number of networks, disks, etc. can differ between systems, colgui will pad unused entries with blank plots which won't have an active sweeper line in them.
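The difference between normal and compact geometry, and the padding described in the NOTE, can be pictured with the following Python sketch. It is only one reading of the description above, not colgui's layout code; pad_plots and layout_rows are hypothetical names, and None stands in for a blank plot.

```python
def pad_plots(plots_by_host):
    """Pad every host's plot list with None (a blank plot) to a common length."""
    width = max((len(p) for p in plots_by_host.values()), default=0)
    return {h: p + [None] * (width - len(p)) for h, p in plots_by_host.items()}


def layout_rows(plots_by_host, per_row, compact=False):
    """Return rows of (host, plot) cells.

    Normal mode: each host starts a new row, wrapping after per_row plots.
    Compact mode: rows are filled to per_row regardless of host boundaries.
    """
    padded = pad_plots(plots_by_host)
    rows = []
    if compact:
        cells = [(h, p) for h, plots in padded.items() for p in plots]
        for i in range(0, len(cells), per_row):
            rows.append(cells[i:i + per_row])
    else:
        for h, plots in padded.items():
            for i in range(0, len(plots), per_row):
                rows.append([(h, p) for p in plots[i:i + per_row]])
    return rows


# Hypothetical plot names; host "b" has one device fewer than host "a".
example = {"a": ["cpu", "disk", "net"], "b": ["cpu", "disk"]}
normal = layout_rows(example, per_row=2)
compact = layout_rows(example, per_row=2, compact=True)
```

With two plots per row, the normal layout needs four rows for this input while the compact layout needs three, which is why compact and dense modes matter as the number of hosts grows.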
Some of the less common plotting switches are:
- When colgui starts up, it queries each node for its configuration, since nodes can have different numbers of devices or different device names. With a large number of nodes this can slow down the whole startup process. This switch sets the configuration of all nodes to that of the first one queried and can significantly speed startup. Be very careful when doing detail reporting, because if two systems have a different number of devices you will get either errors or incorrect data displayed. If any device names differ (and this is always the case with lustre), all systems will show the same names, which can be confusing.
--plottype [l, p, b, s, r]
Line plots, the default, are displayed using connected solid lines,
the beginning Y axis value. A "point" plot, also known as a scatter plot but
the "s" was taken, is one in which the points are not connected. "Bar" plots
are vertical bars, more often associated with business graphics.
Appending "s" to any of the first three plot types (I told you the "s" was taken) produces "stacked" plots when multiple values are being plotted: rather than each point being drawn relative to the base of the y-axis, it is stacked on top of the previous one.
Radial or "radar" plots are actually circular plots; "r" must be combined with l or p and optionally s. At this time, radial plots may produce some oddly formatted displays.
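To make the idea of stacking concrete, here is a minimal Python sketch of how stacked y-values can be derived from several series sampled at the same intervals. It illustrates the general technique only and is not taken from colgui; stack_series is a hypothetical name.

```python
def stack_series(series):
    """Return stacked y-values for a list of equal-length series.

    The first series is drawn relative to the base of the y-axis; each
    later series is drawn on top of the running total of those before it.
    """
    stacked = []
    running = [0] * len(series[0]) if series else []
    for values in series:
        running = [r + v for r, v in zip(running, values)]
        stacked.append(running)
    return stacked


# Three series sampled at two intervals.
result = stack_series([[1, 2], [3, 4], [5, 6]])
```

The last entry of the result is the per-interval total of all series, so the top edge of a stacked plot shows the combined value.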
- By default, a radial plot has the same number of intervals as an "xy" plot, that is, it is based on the value of --xaxis. This switch allows setting that interval independently.
- Some data can appear very spiky; this switch lets you provide a smoothing value that softens those spikes.
- For those who want a wider plotting line, this is the way to go. Enter the width in pixels.
- This is actually the horizontal distance between points in pixels. Changing either this or --xaxis affects the width of the plot, but this does it without changing the number of data points that will fit on it.
- The number of samples to collect. This is passed unaltered to collectl as -c.
- If collectl is stored somewhere other than /usr/sbin on the target machine, use this to specify its location. However, remember that this path will be passed to ALL machines being monitored.
- Like --colbin, this allows you to change the location of where to look for colmux.
- This defines what types of lustre machines are being monitored when -sl is selected, since there is no a priori way for colgui to know that. Choose any combination of "cmo" to select client, mds or oss, noting that these types of plots will be displayed for ALL machines selected. It is passed unaltered to collectl as -L. Also be aware that for any machines NOT configured as running lustre, at least version 1.5.3 of collectl will be required.
- Collectl is capable of monitoring nfs clients or servers, supporting either nfs version 2 or 3, but only one of the four combinations during any single run. By default, it is assumed a machine is running as a v3 server. To change the version or to make the target machine a client, use this switch. It is passed unaltered to collectl as -O.
In addition to displaying plots, colgui can also be requested to log the data simultaneously.
- Write a copy of each record received to the terminal. Naturally, the speed of the display can affect how quickly the plots can be updated.
- Create a file in the specified directory, named for the host this is running on and the date/time of the data collection. Each record will be preceded by the name of the host (or address) from which the data was collected.
- Similar to --log1file, except that a separate file is created for each host, named for that host as well as the date/time that the collection was started.
You can combine --log1file and --logfiles with --logterm but not each other.
If Compress::Zlib is installed, the logs will automatically be compressed. If logging to the terminal AND a file simultaneously, compression will be turned off.
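The combination rules above can be summarised in a short Python sketch. It only restates the rules as given here and is not colgui's implementation; resolve_logging and its keyword names are hypothetical, and zlib_available stands for whether Compress::Zlib is installed.

```python
def resolve_logging(logterm=False, log1file=None, logfiles=None,
                    zlib_available=True):
    """Apply the documented logging rules and report the outcome.

    - --log1file and --logfiles may each be combined with --logterm,
      but not with each other.
    - File logs are compressed when Compress::Zlib is installed, unless
      logging to the terminal and a file at the same time.
    """
    if log1file and logfiles:
        raise ValueError("--log1file and --logfiles cannot be combined")
    to_file = bool(log1file or logfiles)
    compress = zlib_available and to_file and not logterm
    return {"terminal": logterm, "file": to_file, "compress": compress}


file_only = resolve_logging(log1file="/tmp/logs")
both = resolve_logging(logterm=True, logfiles="/tmp/logs")
```

In this sketch file_only reports compression on, while both reports it off because the terminal is also in use.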
- By default, colgui communicates over port 1234. This option allows you to select a different one.
- If colgui cannot directly connect to the target machines, one can put the "colmux" program on a machine that can, using it as a proxy. Specify the address of that machine with this switch.
- When communicating through a proxy, this machine's address is hidden from other machines. Enter the address that needs to be used to connect back to this machine.
- By default, colgui uses ssh for all communications. If ssh is not available but rsh is, select this switch.
- If rsh or ssh requires a username other than the one colgui is running under, use this switch to change it.
One can actually select plots in one of three ways. Using -s, one selects a default plot that matches the associated subsystem(s). Some of these plots contain multiple y-axes so that they can present the maximum amount of information in the minimal amount of space.
Using -p, one selects specific plots by name. These names can be either comma separated (no whitespace) or separated by whitespace and quoted. The list of available plots can be displayed with --showplots; some of them are the same plots displayed via -s. Many of these plots are actually the multi-y-axis plots broken into two single-axis plots. A number of these plots contain data fields not available as -s plots, so it's worth familiarizing yourself with them.
Finally, when nothing quite fits the bill, one can use custom plots, referred to by -c. Here too one can specify one or more names; in this case, however, the names refer to actual files, whose default extension is "cfg". These files contain user-defined plots, so you can plot essentially any data field known to collectl!
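The two accepted forms of a plot name list (comma separated, or whitespace separated and quoted so the shell passes it as one argument) can be handled by one splitting rule, as in this Python sketch. It is an illustration rather than colgui's own parser; parse_plot_names is a hypothetical name and plotA, plotB and plotC are placeholder plot names, not real ones. Use --showplots for the actual list.

```python
import re


def parse_plot_names(value):
    """Split a plot list on commas and/or whitespace, dropping empty items."""
    return [name for name in re.split(r"[,\s]+", value.strip()) if name]


from_commas = parse_plot_names("plotA,plotB")
from_spaces = parse_plot_names("plotA plotB  plotC")
```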
The rules for defining a custom plot are contained in the sample mem.cfg, which can be found in the examples directory. There are also a number of custom lustre plots that display a broad set of information and can be used as a starting point for building your own. In addition, there are FAQs for both colplot and colgui that may provide additional help.
One thing to remember is that colgui and colplot share ALL the plots, both standard and custom. This means that any custom plots constructed for colgui can be used by colplot and vice versa. If there appear to be problems using custom plots, either because colgui is reporting errors or because the data being displayed doesn't look correct, you can see the parameters colgui will use to generate its plots with --showparams, which shows ALL plot definitions, not just custom ones.
Finally, you CAN mix -s, -p and -c in any combination you like.
This is a file that names the machines which are to be monitored. At a minimum, it lists one machine per line. Each entry must be an address or a name that can be resolved to an address. Additional names may be specified, separated by whitespace.
If a second name exists, it will be used when a title is displayed on a plot. If it doesn't exist, the value of the first field will be displayed.
When displaying plots in compressed/dense format, host names are displayed vertically. In some cases the names are simply too long to fit; if a third field is specified its value will be used, otherwise the second field will be used.
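The field rules above can be expressed as a short Python sketch. It is not colgui's parser; parse_machines is a hypothetical name, the addresses and host names are made up, and the fallback to the first field for the dense name when only one field is present is an assumption. Comment handling is not described here, so the sketch does none.

```python
def parse_machines(text):
    """Parse machinesfile-style text into address/title/dense entries.

    Field 1: address or resolvable name.
    Field 2: title shown on plots (falls back to field 1).
    Field 3: name used in dense format (falls back to the title).
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        address = fields[0]
        title = fields[1] if len(fields) > 1 else address
        dense = fields[2] if len(fields) > 2 else title
        entries.append({"address": address, "title": title, "dense": dense})
    return entries


sample = """\
10.0.0.1 node1.example.com n1
10.0.0.2 node2.example.com
10.0.0.3
"""
machines = parse_machines(sample)
```

The three sample lines exercise each case: all three fields given, only two, and only the address.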
USING COLMUX AS A PROXY
This is a feature that allows you to monitor systems to which you have no direct connectivity. This is typically the case when a machine that does have connectivity isn't configured to run X. This feature has been successfully tested in a number of configurations, but certainly not all. If you do encounter problems, be sure to report them.
To use this feature, you need to find a machine to act as a proxy and which is capable of accessing the target machines via both rsh/ssh and a socket connection. If there are firewalls involved they may have to be opened up, at least for a specific port which can then be specified with "--port".
Since machines can have multiple interfaces on them, be sure to use addresses that the machine running colmux can see. If you do encounter problems, try logging into the machine on which colmux is running and try to run it manually using the same node list but without --proxy. Often this will reveal connectivity/reachability problems you didn't realize you had.
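When tracking down reachability problems of this kind, it can help to confirm that a plain TCP connection to the relevant port succeeds at all. The following Python sketch is a generic check and not part of colgui or colmux; can_connect is a hypothetical name, 1234 is the default port mentioned earlier, and which machine must be able to reach which depends on your configuration, so run it in whichever direction applies.

```python
import socket


def can_connect(host, port=1234, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A False result suggests a firewall, routing or interface-selection problem rather than anything specific to colgui.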
Requires at least collectl V1.5.6.
When displaying detail data normal/dense using --geometry nd, there is only a single title line displayed for all systems. This means that if the devices are not the same, the titles can be misleading. If you're not sure what you're displaying, use --showparams to see this level of information.
AUTHOR
This program was written by Mark Seger.
Copyright 2005 Hewlett-Packard Development Company, L.P.