pmccabe(1) calculate McCabe cyclomatic complexity or non-commented line counts for C and C++ programs

SYNOPSIS

pmccabe [-bcCdfFntTvV?] [file(s)]

DESCRIPTION

pmccabe processes the named files, or standard input if none are named. In default mode it calculates statistics including McCabe cyclomatic complexity for each function. The files are expected to be either C (ANSI or K&R) or C++.
-?
Print an informative usage message.
-v
Print column headers.
-V
Print pmccabe version number.

De-commenting mode

-d
Intended to help count non-commented source lines via something like:
pmccabe -d *.c | grep -v '^[<blank><tab>]*$' | wc -l

Comments are removed, cpp directives are replaced by cpp, string literals are replaced by STRINGLITERAL, character constants are replaced by CHARLITERAL. The resulting source code is much easier to parse. This is the first step performed by pmccabe so that its parser can be simpler.

None of the other options work sensibly with -d.
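
Where grep supports POSIX character classes, the blank-line filter shown above can be written without typing literal blank and tab characters. The following is a minimal sketch and not part of pmccabe; the function name count_nonblank is hypothetical.

```shell
# Hypothetical helper, not part of pmccabe: count the lines on standard
# input that contain at least one non-whitespace character.
count_nonblank() {
    # -v selects lines that are NOT entirely whitespace; -c prints how many
    grep -cv '^[[:space:]]*$'
}

# Self-contained demonstration: four input lines, two of them blank
printf 'int x;\n\n \t\nreturn x;\n' | count_nonblank    # prints 2
```

With pmccabe installed, the pipeline above would then read: pmccabe -d *.c | count_nonblank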

Line-counting mode

-n
Counts non-commented source lines. The output format is identical to that of the anac program except that column headers and totals must be requested if desired. If you want column headers add -v. If you want totals add -t. If all you want is totals add -T.

Complexity mode (default)

-C
Custom output format - don't use it.
-c
Report non-commented, non-blank lines per function (and file) instead of the raw number of lines. Note that pre-processor directives are NOT counted.
-b
Output format compatible with compiler error browsing tools which understand "classic" compiler errors. Numerical sorting on this format is possible using:
sort -n -t% -k2
-t
Print column totals. Note the total number of lines is *NOT* the number of non-commented source lines - it's the same as would be reported by "wc -l".
-T
Print column totals *ONLY*.
-f
Include per-file totals along with the per-function totals.
-F
Print per-file totals but NOT per-function totals.

Parsing

pmccabe ignores all cpp preprocessor directives, so it measures the complexity of the code as it appears rather than the complexity after the preprocessor has mangled it. This is especially important since even simple things like getchar(3) may expand into macros which increase complexity.

Output Format

For each function found, a line of the following form is written to standard output:
Modified McCabe Cyclomatic Complexity
|   Traditional McCabe Cyclomatic Complexity
|       |    # Statements in function
|       |        |   First line of function
|       |        |       |   # lines in function
|       |        |       |       |  filename(definition line number):function
|       |        |       |       |           |
5       6       11      34      27      gettoken.c(35): matchparen

Column 1 contains cyclomatic complexity calculated by adding 1 (for the function) to the occurrences of for, if, while, switch, &&, ||, and ?. Unlike "normal" McCabe cyclomatic complexity, each case in a switch statement is not counted as additional complexity. This treatment of switch statements and complexity may be more useful than the "normal" measure for judging maintenance effort and code difficulty.

Column 2 is the cyclomatic complexity calculated in the "usual" way with regard to switch statements. Specifically, it is calculated as in column 1 but counts each case rather than the switch, and it may be more useful than column 1 for judging testing effort.
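
The difference between columns 1 and 2 can be illustrated by naive token counting. The following is a rough sketch under stated assumptions, not pmccabe's actual parser: the helper names count_matches and complexity are hypothetical, the input is assumed to be free of comments and string literals (for example, the output of -d), default labels are assumed not to count as cases, and grep is assumed to support -o and \b (GNU and BSD grep do).

```shell
# Hypothetical helpers, not part of pmccabe.
# count_matches ERE: print how many times ERE matches in standard input.
count_matches() {
    grep -oE "$1" | wc -l | tr -d ' '
}

# complexity: read C text on standard input, print "modified traditional".
complexity() {
    text=$(cat)
    decisions=$(printf '%s\n' "$text" | count_matches '\b(for|if|while)\b|&&|\|\||\?')
    switches=$(printf '%s\n' "$text" | count_matches '\bswitch\b')
    cases=$(printf '%s\n' "$text" | count_matches '\bcase\b')
    # Column 1 counts each switch once; column 2 counts each case instead.
    echo "$((1 + decisions + switches)) $((1 + decisions + cases))"
}

sample='int classify(int c, int strict)
{
    if (c > 0 && strict) {
        switch (c) {
        case 1: return 10;
        case 2: return 20;
        default: return 0;
        }
    }
    return c ? -1 : 0;
}'

printf '%s\n' "$sample" | complexity    # prints: 5 6
```

Here if, && and ? contribute 3; adding 1 for the function and 1 for the switch gives 5, while adding 1 for the function and 2 for the case labels gives 6.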

Column 3 contains a statement count. It is calculated by counting each occurrence of for, if, while, switch, ?, and semicolon within the function. One possible surprise is that for statements have a minimum statement count of 3. This is realistic since for(A; B; C){...} is really shorthand for A; while (B) { ... C;}. The number of statements within a file is the sum of the number of statements for each function implemented within that file, plus one for each of those functions (because functions are statements too), plus one for each other file-scoped statement (usually declarations).
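
The minimum count of 3 for a for statement follows directly from the counting rule: the for keyword plus the two semicolons inside its parentheses. A small sketch of that rule, again by naive token counting rather than pmccabe's own parser (count_statements is a hypothetical name, and the input is assumed to contain no comments or string literals):

```shell
# Hypothetical helper, not part of pmccabe: count for, if, while, switch,
# ? and semicolons in the C text on standard input.
count_statements() {
    grep -oE '\b(for|if|while|switch)\b|\?|;' | wc -l | tr -d ' '
}

# An empty loop: "for" plus two semicolons
printf '%s\n' 'for (;;) { }' | count_statements                         # prints 3

# A loop with a one-statement body: one more semicolon
printf '%s\n' 'for (i = 0; i < n; i++) sum += i;' | count_statements    # prints 4
```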

Column 4 contains the first line number in the function. This is not necessarily the same line on which the function name appears.

Column 5 is the number of lines in the function, counted from the line given in column 4 through the line containing the closing curly brace.

The final column contains the file name, line number on which the function name occurs, and the name of the function.

APPLICATIONS

The obvious application of pmccabe is illustrated by the following which gives a list of the "top ten" most complex functions:
pmccabe *.c | sort -nr | head -10

Many files contain more than one C function and sometimes it would be useful to extract each function separately. matchparen() (see example output above) can be extracted from gettoken.c by extracting 27 lines starting with line 34. This can form the basis of tools which operate on functions instead of files (e.g., use as a front-end for diff(1)).
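
For the example output above, extracting 27 lines starting at line 34 amounts to printing lines 34 through 60 of gettoken.c. The sketch below generalizes this by reading default-format pmccabe lines and printing the source of a named function. It is an illustration only: extract_named is a hypothetical helper, and it assumes file names without whitespace or parentheses and function names without whitespace.

```shell
# Hypothetical helper, not part of pmccabe: read default-format pmccabe
# lines on standard input and print the source of each function named $1.
extract_named() {
    # Field 4 is the first line, field 5 the line count, field 6 is
    # "filename(line):" and the last field is the function name.
    awk -v want="$1" '$NF == want { split($6, part, "("); print part[1], $4, $5 }' |
    while read -r file start count; do
        sed -n "${start},$((start + count - 1))p" "$file"
    done
}

# Self-contained demonstration using a temporary file and a hand-written
# line in the format pmccabe would print for f() (lines 2 through 6).
demo=$(mktemp -d)
printf '%s\n' 'int a;' 'int' 'f(void)' '{' '    return 1;' '}' 'int b;' > "$demo/x.c"
printf '1\t1\t1\t2\t5\t%s(3): f\n' "$demo/x.c" | extract_named f
rm -r "$demo"
```

With pmccabe installed, the same helper would be used as: pmccabe gettoken.c | extract_named matchparen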

DIAGNOSTICS

pmccabe returns a nonzero exit status if a file cannot be opened or when it encounters certain parsing errors.

Error messages are written to standard error. They usually explain that the parser is confused about something, and they mimic classic C compiler error messages.

WARNINGS

pmccabe is confused by unmatched curly braces or parentheses, which sometimes result from hasty use of cpp directives. In these cases a diagnostic is printed and the complexity results for the files named may be unreliable. In most cases the "#ifdef" directives can be modified so that the curly braces match. Note that if pmccabe is confused by a cpp directive, most pretty printers will be too. In some cases, preprocessing with unifdef(1) may be appropriate.

Statement counting could arguably be improved by counting occurrences of the comma operator, multiple assignments, assignments within conditional tests, and logical conjunction. However, since there is no crisp statement definition from the language or from people I've queried, statement counting will probably not be improved. If you have a crisp definition I'll be happy to consider it.

Templates cause pmccabe's scanner to exit.

It's a shame that ctags output isn't provided.

AUTHOR

Paul Bame