pksvm(1) classify raster image using Support Vector Machine


pksvm -t training [-i input] [-o output] [-cv value] [options] [advanced options]


pksvm implements a support vector machine (SVM) to solve a supervised classification problem. The implementation is based on the open source C++ library libSVM. Both raster and vector files are supported as input. The output contains the classification result in raster or vector format, matching the format of the input. A training sample must be provided as an OGR vector dataset that contains the class labels and the features for each training point. The point locations are not considered in the training step. You can use the same training sample to classify different images, provided the images have the same number of bands. Use the utility pkextract to create a suitable training sample, based on a sample of points or polygons. For raster output maps you can attach a color table using the option -ct.


-t filename, --training filename
Training vector file. A single vector file contains all training features (must be set as: b0, b1, b2,...) for all classes (class numbers identified by label option). Use multiple training files for bootstrap aggregation (alternative to the --bag and --bagsize options, where a random subset is taken from a single training file).
-i filename, --input filename
Input image
-o filename, --output filename
Output classification image
-cv value, --cv value
N-fold cross validation mode (default: 0)
-tln layer, --tln layer
Training layer name(s)
-c name, --class name
List of class names.
-r value, --reclass value
List of class values (use same order as in --class option).
-of GDALformat, --oformat GDALformat
Output image format (see also gdal_translate(1)).
-f format, --f format
Output ogr format for active training sample
-co NAME=VALUE, --co NAME=VALUE
Creation option for output file. Multiple options can be specified.
-ct filename, --ct filename
Color table in ASCII format having 5 columns: id R G B ALPHA (0: transparent, 255: solid)
-label attribute, --label attribute
Identifier for class label in training vector file. (default: label)
-prior value, --prior value
Prior probabilities for each class (e.g., -prior 0.3 -prior 0.3 -prior 0.2). Used for input only (ignored for cross validation).
-g gamma, --gamma gamma
Gamma in kernel function
-cc cost, --ccost cost
The parameter C of C_SVC, epsilon_SVR, and nu_SVR
-m filename, --mask filename
Only classify within specified mask (vector or raster). For raster mask, set nodata values with the option --msknodata.
-msknodata value, --msknodata value
Mask value(s) not to consider for classification. These values are copied to the classification image.
-nodata value, --nodata value
Nodata value to put where image is masked as nodata
-v level, --verbose level
Verbose level

Advanced options

-b band, --band band
Band index (starting from 0, either use --band option or use --startband to --endband)
-sband band, --startband band
Start band sequence number
-eband band, --endband band
End band sequence number
-bal size, --balance size
Balance the input data to this number of samples for each class
-min number, --min number
If the number of training pixels is less than min, do not take this class into account (0: consider all classes)
-bag value, --bag value
Number of bootstrap aggregations (default is no bagging: 1)
-bagsize value, --bagsize value
Percentage of features used from available training features for each bootstrap aggregation (one size for all classes, or a different size for each class respectively)
-comb rule, --comb rule
How to combine bootstrap aggregation classifiers (0: sum rule, 1: product rule, 2: max rule). Also used to aggregate classes with rc option.
-cb filename, --classbag filename
Output for each individual bootstrap aggregation
-prob filename, --prob filename
Probability image.
-offset value, --offset value
Offset value for each spectral band input features: refl[band]=(DN[band]-offset[band])/scale[band]
-scale value, --scale value
Scale value for each spectral band input features: refl[band]=(DN[band]-offset[band])/scale[band] (use 0 to scale the min and max in each band to -1.0 and 1.0)
-svmt type, --svmtype type
Type of SVM (C_SVC, nu_SVC, one_class, epsilon_SVR, nu_SVR)
-kt type, --kerneltype type
Type of kernel function (linear, polynomial, radial, sigmoid)
-kd value, --kd value
Degree in kernel function
-c0 value, --coef0 value
Coef0 in kernel function
-nu value, --nu value
The parameter nu of nu-SVC, one-class SVM, and nu-SVR
-eloss value, --eloss value
The epsilon in loss function of epsilon-SVR
-cache number, --cache number
Cache memory size in MB (default: 100)
-etol value, --etol value
The tolerance of the termination criterion (default: 0.001)
-shrink, --shrink
Whether to use the shrinking heuristics
-na number, --nactive number
Number of active training points
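
The --offset and --scale formula above can be illustrated with a short Python sketch. This is not pksvm code: the function names and pixel values are hypothetical, and the min/max mapping is only one plausible reading of "--scale 0", since this page does not give the exact arithmetic.

```python
def to_reflectance(dn, offset, scale):
    """Apply refl[band] = (DN[band] - offset[band]) / scale[band] per band."""
    return [(d - o) / s for d, o, s in zip(dn, offset, scale)]


def minmax_to_unit_range(values):
    """Linearly map values so the minimum becomes -1.0 and the maximum 1.0.

    Assumption: an illustration of what '--scale 0' is described as doing;
    the tool's own implementation may differ.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant band has no range to stretch; map it to the midpoint.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Hypothetical digital numbers for bands b0, b1, b2 of one pixel.
pixel_dn = [1200, 3400, 560]
offsets = [0, 0, 0]
scales = [10000, 10000, 10000]

refl = to_reflectance(pixel_dn, offsets, scales)
stretched = minmax_to_unit_range([0, 5, 10])
print(refl)
print(stretched)
```

With a scale of 10000 per band, this corresponds to passing -scale 10000 once per band on the command line.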


Classify input image input.tif with a support vector machine. The training sample is provided as an OGR vector dataset that contains all features (same dimensionality as input.tif) in its fields (see pkextract(1) for how to obtain such a file from a "clean" vector file containing locations only). A two-fold cross validation (cv) is performed (output on screen). The parameters cost and gamma of the support vector machine are set to 1000 and 0.1 respectively. A colourtable (a five column text file: image value, RED, GREEN, BLUE, ALPHA) is also provided.

pksvm -i input.tif -t training.sqlite -o output.tif -cv 2 -ct colourtable.txt -cc 1000 -g 0.1
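
A colour table such as colourtable.txt could be generated with a few lines of Python. The sketch below is only an illustration: the class values and colours are hypothetical, and the use of single spaces as column separator is an assumption (this page only states that the file is ASCII with five columns).

```python
# Hypothetical mapping: image value -> (R, G, B, ALPHA).
# ALPHA 0 is transparent, 255 is solid.
colours = {
    0: (0, 0, 0, 0),         # nodata, fully transparent
    1: (0, 128, 0, 255),     # e.g. forest
    2: (255, 255, 0, 255),   # e.g. cropland
}


def format_colour_table(table):
    """Return the table as text, one 'id R G B ALPHA' row per class value."""
    rows = []
    for value in sorted(table):
        r, g, b, alpha = table[value]
        rows.append(f"{value} {r} {g} {b} {alpha}")
    return "\n".join(rows) + "\n"


text = format_colour_table(colours)
print(text, end="")
```

Saving the printed text to a file gives a table that can be passed with the -ct option.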

Classification using bootstrap aggregation. The training sample is randomly split into three subsamples (33% of the original sample each).

pksvm -i input.tif -t training.sqlite -o output.tif -bagsize 33 -bag 3
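
The idea behind the --bag, --bagsize and --comb options can be sketched in Python. This is a conceptual illustration, not pksvm's implementation: the helper names and the per-bag probabilities are hypothetical, and it assumes each bag is drawn independently without replacement and that the combination rule acts on per-class probabilities.

```python
import random


def draw_bags(samples, n_bags, bag_size_percent, seed=0):
    """Draw n_bags random subsets, each holding bag_size_percent of samples."""
    rng = random.Random(seed)
    k = max(1, round(len(samples) * bag_size_percent / 100.0))
    return [rng.sample(samples, k) for _ in range(n_bags)]


def combine(probabilities, rule):
    """Combine per-bag class probabilities (0: sum, 1: product, 2: max)."""
    n_classes = len(probabilities[0])
    combined = []
    for c in range(n_classes):
        column = [p[c] for p in probabilities]
        if rule == 0:
            combined.append(sum(column))
        elif rule == 1:
            product = 1.0
            for value in column:
                product *= value
            combined.append(product)
        elif rule == 2:
            combined.append(max(column))
        else:
            raise ValueError("rule must be 0, 1 or 2")
    return combined


training = list(range(100))                  # stand-in for 100 training points
bags = draw_bags(training, n_bags=3, bag_size_percent=33)

# Hypothetical class probabilities for one pixel, one row per bag classifier.
per_bag = [[0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
scores = combine(per_bag, rule=0)            # sum rule
winner = scores.index(max(scores))           # index of the winning class
print(len(bags), [len(b) for b in bags], winner)
```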

Classification using prior probabilities for each class. The priors are automatically normalized. The order in which the -prior options are provided should respect the alphanumeric order of the class names (class 10 comes before 2...)

pksvm -i input.tif -t training.sqlite -o output.tif -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 0.2 -prior 1 -prior 1 -prior 1
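
The two points above (normalization and class ordering) can be checked with a small Python sketch. It is an illustration only: the class names and prior values are hypothetical, and it assumes that "alphanumeric order" means a plain string sort and that normalization means dividing by the sum.

```python
def normalize_priors(priors):
    """Scale the priors so that they sum to 1."""
    total = sum(priors)
    if total <= 0:
        raise ValueError("priors must have a positive sum")
    return [p / total for p in priors]


# Hypothetical class names as they might appear in the training file.
class_names = ["1", "2", "10", "3"]

# A string sort places "10" before "2", which is the order the priors follow.
ordered = sorted(class_names)

# One prior per class, listed in the sorted order above.
priors = normalize_priors([1, 1, 0.2, 1])

for name, prior in zip(ordered, priors):
    print(name, round(prior, 4))
```

Here the third prior (0.2) applies to class 2, not class 10, because class 10 sorts ahead of it.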