pkoptsvm(1) program to optimize parameters for SVM classification


pkoptsvm -t training [options] [advanced options]


The support vector machine depends on several parameters, which should ideally be optimized for each classification problem. For a radial basis kernel function, two important parameters are cost and gamma. The pkoptsvm utility can optimize these two parameters based on an accuracy assessment (the Kappa value). If an input test set (-i) is provided, it is used for the accuracy assessment. If not, the accuracy assessment is based on a cross-validation (-cv) of the training sample.
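As an illustration of the accuracy measures mentioned above, the following Python sketch computes Cohen's kappa and the overall accuracy from a confusion matrix. It is not taken from the pkoptsvm source; the function names and the row/column layout (rows: reference classes, columns: predicted classes) are assumptions made for this example.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("empty confusion matrix")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: (observed - expected) / (1 - expected).

    'expected' is the agreement expected by chance, derived from the
    row and column totals of the confusion matrix.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = overall_accuracy(confusion)
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / (total * total)
    if expected == 1.0:
        # Kappa is undefined here; returning 0.0 is a convention chosen
        # for this sketch only.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Two-class example: 35 of 50 samples are classified correctly.
example = [[20, 5], [10, 15]]
print(overall_accuracy(example), cohen_kappa(example))
```

Kappa corrects the overall accuracy for agreement expected by chance, which makes it less sensitive to unbalanced class sizes.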

The optimization routine uses a grid search. The initial and final values of the parameters can be set with -cc startvalue -cc endvalue and -g startvalue -g endvalue for cost and gamma respectively. The search uses a multiplicative step for iterating the parameters (set with the options -stepcc and -stepg). A common approach is to define a relatively large multiplicative step first (e.g., 10) to obtain an initial estimate for both parameters. The estimate can then be refined by defining a smaller step (still greater than 1) with constrained start and end values for the parameters cost and gamma.
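The coarse-then-fine strategy described above can be sketched in Python as follows. This is a minimal illustration, not the pkoptsvm implementation: the helper names are hypothetical, and toy_score is a stand-in for the real accuracy assessment (training an SVM and computing kappa), with an artificial optimum at cost=40 and gamma=0.02.

```python
import math


def multiplicative_grid(start, end, step):
    """Return start, start*step, start*step**2, ... up to end (inclusive)."""
    if start <= 0 or end < start or step <= 1:
        raise ValueError("need 0 < start <= end and step > 1")
    values = []
    value = start
    # Small relative tolerance so rounding does not drop the end value.
    while value <= end * (1 + 1e-9):
        values.append(value)
        value *= step
    return values


def grid_search(score, cost_range, gamma_range, step):
    """Evaluate score(cost, gamma) on the grid; return (best_score, cost, gamma)."""
    best = None
    for cost in multiplicative_grid(cost_range[0], cost_range[1], step):
        for gamma in multiplicative_grid(gamma_range[0], gamma_range[1], step):
            current = score(cost, gamma)
            if best is None or current > best[0]:
                best = (current, cost, gamma)
    return best


def toy_score(cost, gamma):
    """Hypothetical stand-in for kappa; peaks at cost=40, gamma=0.02."""
    return -((math.log10(cost / 40.0)) ** 2) - (math.log10(gamma / 0.02)) ** 2


# Coarse pass with a large multiplicative step.
coarse = grid_search(toy_score, (0.1, 1000.0), (0.0001, 10.0), 10)
# Fine pass: smaller step, range constrained around the coarse estimate.
_, c, g = coarse
fine = grid_search(toy_score, (c / 10, c * 10), (g / 10, g * 10), 2)
print("coarse:", coarse)
print("fine:", fine)
```

Because the step is multiplicative, the grid is evenly spaced on a logarithmic scale, which suits parameters such as cost and gamma that can vary over several orders of magnitude.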


-t filename, --training filename
training vector file. A single vector file contains all training features (must be set as: b0, b1, b2,...) for all classes (class numbers identified by label option).
-i filename, --input filename
input test vector file
-cc startvalue -cc endvalue, --ccost startvalue --ccost endvalue
min and max boundaries for the parameter C of C-SVC, epsilon-SVR, and nu-SVR (optional: initial value)
-g startvalue -g endvalue, --gamma startvalue --gamma endvalue
min and max boundaries for gamma in kernel function (optional: initial value)
-step stepsize, --step stepsize
multiplicative step for ccost and gamma in GRID search
-v level, --verbose level
use 1 to output intermediate results for plotting

Advanced options

-tln layer, --tln layer
training layer name(s)
-label attribute, --label attribute
identifier for class label in training vector file. (default: label)
-bal size, --balance size
balance the input data to this number of samples for each class (default: 0)
-random, --random
in case of balance, randomize input data
-min number, --min number
if the number of training pixels is less than min, do not take this class into account
-b band, --band band
band index (starting from 0, either use band option or use start to end)
-sband band, --startband band
start band sequence number
-eband band, --endband band
end band sequence number
-offset value, --offset value
offset value for each spectral band input features: refl[band]=(DN[band]-offset[band])/scale[band]
-scale value, --scale value
scale value for each spectral band input features: refl[band]=(DN[band]-offset[band])/scale[band] (use 0 to scale the min and max in each band to -1.0 and 1.0)
-svmt type, --svmtype type
type of SVM (C_SVC, nu_SVC, one_class, epsilon_SVR, nu_SVR)
-kt type, --kerneltype type
type of kernel function (linear, polynomial, radial, sigmoid)
-kd value, --kd value
degree in kernel function
-c0 value, --coef0 value
coef0 in kernel function
-nu value, --nu value
the parameter nu of nu-SVC, one-class SVM, and nu-SVR
-eloss value, --eloss value
the epsilon in loss function of epsilon-SVR
-cache number, --cache number
cache memory size in MB (default: 100)
-etol value, --etol value
the tolerance of termination criterion (default: 0.001)
-shrink, --shrink
whether to use the shrinking heuristics
-cv value, --cv value
n-fold cross validation mode (default: 0)
-cf, --cf
use Overall Accuracy instead of kappa
-maxit number, --maxit number
maximum number of iterations
-tol value, --tolerance value
relative tolerance for stopping criterion (default: 0.0001)
-a value, --algorithm value
GRID, or any optimization algorithm from
-c name, --class name
list of class names.
-r value, --reclass value
list of class values (use same order as in --class option).
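
The feature scaling described for the --offset and --scale options above, refl[band]=(DN[band]-offset[band])/scale[band], can be sketched in Python as follows. This is an illustration under stated assumptions, not the pkoptsvm code: the function name is hypothetical, and the handling of scale 0 (a linear mapping of each band's minimum to -1.0 and its maximum to 1.0) is one reading of the option description.

```python
def rescale_features(samples, offsets, scales):
    """Rescale band values per sample.

    samples: list of rows, one list of band values (DN) per sample.
    offsets, scales: one value per band. A scale of 0 is interpreted
    here (assumption) as min/max scaling of that band to [-1.0, 1.0].
    """
    nband = len(samples[0])
    mins = [min(row[b] for row in samples) for b in range(nband)]
    maxs = [max(row[b] for row in samples) for b in range(nband)]
    result = []
    for row in samples:
        scaled = []
        for b in range(nband):
            if scales[b] == 0:
                span = maxs[b] - mins[b]
                # A constant band carries no information; map it to 0.0.
                value = 0.0 if span == 0 else 2.0 * (row[b] - mins[b]) / span - 1.0
            else:
                value = (row[b] - offsets[b]) / scales[b]
            scaled.append(value)
        result.append(scaled)
    return result


dn = [[0, 10], [5, 20], [10, 30]]
print(rescale_features(dn, [0, 0], [0, 0]))      # min/max scaling per band
print(rescale_features(dn, [0, 10], [10, 20]))   # explicit offset and scale
```

Bringing all bands to a comparable range keeps a band with large numeric values from dominating the kernel computation.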