zpaq(1) PAQ open standard maximum compressor

SYNOPSIS


create : zpaq [opnsitqv]c<config>[,N...] archive.zpaq file [file ...]
append : zpaq [opnsitqv]a<config>[,N...] archive.zpaq file [file ...]
list : zpaq l archive.zpaq
extract: zpaq [opntq]x[N] archive.zpaq
debug : zpaq [pthv]rF[,N...] [args...]

DESCRIPTION

General

PAQ is a series of open source data compression archivers that have evolved through collaborative development to top rankings on several benchmarks measuring compression ratio, although at the expense of speed and memory usage.

Zpaq is a proposed standard format for highly compressed data that allows new compression algorithms to be developed without breaking compatibility with older programs. Zpaq is based on PAQ-like context mixing algorithms, which are top ranked on many benchmarks. The format supports archivers, single file compressors, and memory to memory compression.

ZPAQ is a configurable file compressor and archiver. Its goal is a high compression ratio in an open format without loss of compatibility between versions as advanced compression techniques are discovered.

By default, compression uses built-in configuration files. Three examples are supplied:

  min.cfg - Fast, minimal compression (LZP + order 3). Requires 4 MB memory.
  mid.cfg - Average compression and speed. Requires 111 MB.
  max.cfg - Slow but good compression. Requires 278 MB.

The config file is not needed to extract.

NOTE: in extract mode, if FILES are listed, the extracted files are renamed as they are written out.

Commands

a
Append to archive.
c
Create archive.
i
Don't store file sizes as comments (saves a few bytes). Normally the input file size is stored as a decimal string, taking a few bytes. The comment field has no effect on the program except that it is displayed by the l and x commands.
l
List contents of archive.
n
In create mode: Don't store filenames (names will be needed to decompress). In extract mode: decompress all to one file. The effect is to require that filenames be given during decompression.

During extraction, ignore all stored filenames and append all output to one file, the first file in [files...].

o
Optimize (run faster). You need a C++ compiler installed to use this option. If not, drop the ``o''. You can still use zpaq but it will take about twice as long to run.

If successful, compression is typically 50% to 100% faster. Zpaq will look for a program named "zpaq_X" in the temporary directory, where X is derived from the SHA1 checksum of the block header produced by config file CONFIG with arguments N. If the program exists, then Zpaq will call it with the same arguments to perform the compression. If it does not exist then Zpaq will create a source code file "zpaq_X.cpp" in the temporary directory, compile it, and link it to "zpaq.cpp" or "zpaq.o" depending on the installation.

The temporary directory is specified by the environment variable TEMP if it exists, or else the current directory.

The program "zpaq_X" will compress its input in the same format as described by CONFIG, but faster. If CONFIG specifies a preprocessor, then "zpaq_X" will expect to find it too. It will also decompress archive blocks in the same configuration but fail if it attempts to decompress blocks in any other configuration.

Program "zpaq_X" will accept the c, a and x commands with all of the same modifiers, but will ignore the v and o modifiers and ignore any CONFIG file and arguments passed to it. It will not accept the l or r commands. Extraction requires a block number (``x1'', ``x2'', etc). A different optimized program is used to extract each block.

Zpaq will call the external program "zpaqmake" to compile "zpaq_X.cpp", passing it "zpaq_X" as an argument. Normally this will be a script that calls a C++ compiler to produce "zpaq_X.o", links to "zpaq.o" and outputs "zpaq_X". The script could link to "zpaq.cpp" instead of "zpaq.o".
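The lookup-then-build behaviour of the o modifier can be sketched as a dry run in shell. This is only an illustration: the function name optimized_plan, the compiler name c++, the -O2 flag and the location of "zpaq.o" are assumptions, and X is passed in by hand because the exact derivation from the SHA1 checksum is not described here. Treating an empty TEMP like an unset one is also an assumption.

```shell
# Dry run: print what the optimize step might do for a given X.
# Nothing is compiled or executed; all commands are only printed.
optimized_plan() {
    x=$1
    dir=${TEMP:-.}            # TEMP if set (and non-empty), else current directory
    prog=$dir/zpaq_$x
    if [ -x "$prog" ]; then
        # The optimized program already exists: it would be called directly.
        printf 'run %s\n' "$prog"
    else
        # Otherwise zpaq_X.cpp would be generated, compiled and linked.
        printf 'build %s\n' "$prog"
        printf 'c++ -O2 -c %s.cpp -o %s.o\n' "$prog" "$prog"   # assumed compiler and flags
        printf 'c++ %s.o zpaq.o -o %s\n' "$prog" "$prog"       # assumed location of zpaq.o
    fi
}
```

Calling optimized_plan with some value of X prints either a single "run" line or the "build" line followed by the two assumed compiler commands.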

p
In create mode: Store filename paths in archive. The default is to store the name without the path. For example:

    zpaq pc<CONFIG> books.zpaq dir/file

will store the name as "dir/file". If the p option is also given during extraction, then ZPAQ will attempt to extract "file" to the subdirectory instead of the current directory. This will fail if the directory does not exist. ZPAQ does not create directories as needed.

In extract mode: extract to stored paths instead of the current directory.

The default is to extract to the current directory regardless of how the file names are stored. Stored paths must be relative to the current directory: they must not start with a ``/'', a ``\'', or a drive letter like ``C:'', and must not contain ``../'' or ``..\''. If extracting to a subdirectory, it must already exist; it will not be created.

[files...] overrides and has no restrictions on file names. Each segment extracts to a different file. If any segments do not have a stored filename then they can only be extracted using the p or n modifiers.
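The restrictions on stored paths described above can be expressed as a small shell test. This is a minimal sketch that mirrors the rules as stated here; the function name is_safe_stored_path is hypothetical and the code is not taken from the zpaq source.

```shell
# Return 0 if a stored path satisfies the stated rules, 1 otherwise.
# Hypothetical helper for illustration only.
is_safe_stored_path() {
    case $1 in
        /* | \\*)        return 1 ;;  # starts with "/" or "\"
        [A-Za-z]:*)      return 1 ;;  # starts with a drive letter like "C:"
        *../* | *..\\*)  return 1 ;;  # contains "../" or "..\"
    esac
    return 0
}
```

A caller could run this over each stored name before extracting with the p modifier and skip any name that fails the check.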

q
Quiet mode. Don't display compression progress on the screen.
s
Don't store SHA1 checksums (saves 20 bytes). The decompressor will not check that the output is identical to the original input.
t
In create mode: Append locator tag to non-ZPAQ data.

The tag is a string of 13 bytes that allows ZPAQ and UNZPAQ to find the start of a sequence of ZPAQ blocks embedded in other data. Program "zpaqsfx" already has this tag at the end. However, if a new stub is compiled from the source, then the t modifier should be used when appending the first file.

In extract mode: don't post-process (for debugging). Expect checksum errors.

v
Verbose mode. Show CONFIG file as it compiles. This is useful for error checking.
x
Extract. Use ``ox'' to extract fast. You can extract more slowly with plain ``x'' if you don't have C++ installed. If output names are given, files are renamed in the same order they are stored and listed. Otherwise, the files are extracted to the current directory with the same names they had when stored.
,N
Used in create mode. Pass numeric arguments to the CONFIG file. An appended suffix such as ``,2'' means use 4 times as much memory. Each increment doubles usage. You need the same memory to decompress.
N
Used in extract mode. Extract only block N (1, 2, 3...), where 1 is the first block. Otherwise all blocks are extracted. The l command shows which files are in each block.

Debug and Development Options

To debug a CONFIG file, use:

  zpaq [pthv]r<CONFIG>[,N...] [args...]

The r command runs the ZPAQL program in the HCOMP section of configuration file F. The program is run once for each byte of input from the file named in the first argument and once at EOF, with the input byte (or -1 at EOF) in the A register. Output is written to the file named in the second argument. If run with no arguments, input is taken from stdin and output goes to stdout. Modifiers are listed below.

h
When tracing, display register and memory contents in hexadecimal instead of decimal.
p
Run PCOMP (default is to run HCOMP).
t
Trace (single step). With t, the arguments are numeric inputs rather than file names; without it, they are the input and output files (default stdin and stdout). The program is run once for each argument with the value in the A register. As each instruction is executed, the register contents are shown. At HALT, memory contents are displayed.
v
Verbose compile. Display the CONFIG file as it is being compiled. If an error occurs, it will be easier to locate. Modifier v is also useful for displaying jump targets.
,N
Pass up to 9 numeric arguments to the CONFIG file (like the c and a commands).

OPTIONS

-h
Display short help.

EXAMPLES

Create

To create an archive:

    zpaq c<CONFIG> archive.zpaq files ...

If the archive exists then it is overwritten. File names are stored without a path.

Append

To append to an existing archive (if the archive does not exist, it is created as with the c command):

    zpaq a<CONFIG> archive.zpaq files ...

List

To list the contents of an archive (files are listed in the same order they were added):

    zpaq l archive.zpaq

To extract the contents of the archive to the current directory (new files are created and named according to the stored filenames, and existing files are not clobbered):

    zpaq x archive.zpaq

If the files to be extracted already exist, then zpaq will refuse to clobber them and skip to the next file. If the files are compressed with a path (folder or directory), then that directory must exist when the file is extracted. zpaq will not create directories.
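Because zpaq will not create directories, they have to be prepared before extracting files that were stored with a path. A minimal sketch, assuming a hypothetical helper make_parent_dirs that reads stored names, one per line, from standard input; how the names are obtained is left open, since the output format of the l command is not described here.

```shell
# Create any missing parent directories for a list of stored names.
# Names are read one per line from stdin; names without a path are skipped.
make_parent_dirs() {
    while IFS= read -r name; do
        dir=$(dirname -- "$name")
        # "." means the file goes to the current directory: nothing to create.
        [ "$dir" = "." ] || mkdir -p -- "$dir"
    done
}
```

For instance, feeding it the single line "dir/sub/file" creates "dir/sub" under the current directory, after which an extraction that uses stored paths can write the file there.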

To extract files and rename them in the order they were added to the archive, list the new names on the command line. Any existing output files are clobbered. The number of files extracted is the smaller of the number of filenames on the command line and the number of files in the archive:

    zpaq x archive.zpaq file ...

Extract

To extract and rename:

    zpaq x archive.zpaq files ...
    unzpaq x archive.zpaq files ...

Files are extracted in the order they were stored and are renamed to the names given. Unlike extraction with stored names, an existing file is overwritten (clobbered). Only as many files as are named on the command line are extracted. Any additional files in the archive are ignored. For example:

    zpaq x archive.zpaq foo bar

To extract files like x, but without post-processing (this may be useful for debugging or developing config files):

    zpaq tx archive.zpaq [files ...]

Config file

The distribution contains several default CONFIG files:

  min.cfg - for fast but poor compression.
  max.cfg - for slow but good compression.
  mid.cfg - for moderate speed and compression (default).

Other config files are available as add-on options or you can write them as explained later.

A numeric argument may be appended to CONFIG to increase memory usage for better compression. Each increment doubles usage. There should be no space before or after the comma. For example:

  zpaq cmax.cfg archive files...    = 246 MB
  zpaq cmax.cfg,1 archive files...  = 476 MB
  zpaq cmax.cfg,2 archive files...  = 938 MB
  zpaq cmax.cfg,3 archive files...  = 1861 MB
  zpaq cmax.cfg,-1 archive files... = 130 MB (negative values allowed)
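The doubling rule gives a quick way to estimate memory for a given N. The sketch below assumes a hypothetical helper estimate_mem_mb that takes the memory for N = 0 in MB and applies pure doubling or halving. The figures listed above differ slightly from exact powers of two, so the result is only a rough guide.

```shell
# Rough memory estimate in MB: base (for N = 0) doubled N times,
# or halved for negative N. Integer arithmetic only.
estimate_mem_mb() {
    mem=$1
    n=$2
    while [ "$n" -gt 0 ]; do mem=$((mem * 2)); n=$((n - 1)); done
    while [ "$n" -lt 0 ]; do mem=$((mem / 2)); n=$((n + 1)); done
    printf '%s\n' "$mem"
}
```

With a base of 246 MB, N = 2 gives 984 and N = -1 gives 123, close to but not the same as the listed 938 MB and 130 MB.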

Modifiers may be in any order before the ``c'' or ``a'' command. The modifiers, command, and configuration file must be written together without any spaces. For example, to create an archive with modifiers i, p and s and a configuration file such as "max.cfg":

  zpaq ipsc<CONFIG> archive.zpaq file1 file2
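Since modifiers, command, configuration file and numeric arguments form a single word, a script can assemble that word before calling zpaq. A minimal sketch, assuming a hypothetical helper zpaq_first_arg; it only builds the string and does not run zpaq.

```shell
# Build the first zpaq argument: modifiers + command + config [+ ,N ...].
# Usage: zpaq_first_arg MODIFIERS COMMAND CONFIG [N ...]
zpaq_first_arg() {
    mods=$1
    cmd=$2
    config=$3
    shift 3
    arg=$mods$cmd$config
    for n in "$@"; do
        arg=$arg,$n          # no spaces around the comma
    done
    printf '%s\n' "$arg"
}
```

For example, zpaq_first_arg ips c max.cfg 2 prints "ipscmax.cfg,2", which could then be passed to zpaq followed by the archive name and the files.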

ENVIRONMENT

The temporary directory TEMPDIR is used by the optimize modifier o.

FILES

Compression commands c and a need a configuration file. See examples in directory "/usr/share/doc/zpaq".

STANDARDS

See zpaq*.pdf (ZPAQ Level 1 and later) in section AVAILABILITY. It is anticipated that future levels (ZPAQ-2, ZPAQ-3, etc.) will be backward compatible, such that newer levels can read archives produced by older programs.

AUTHORS

The program was written by Matt Mahoney <[email protected]>

This manual page was put together by Jari Aalto <[email protected]> under the GNU GPL license, version 2 or (at your option) any later version. For more information about the license, visit <http://www.gnu.org/copyleft/gpl.html>.