apt-cacher(8) caching proxy for Debian packages

SYNOPSIS

Server:

[ -h|--help ] [ -v|--version ] [ -i|-d ] [ -c configfile ] [ -p pidfile ] [ -r directory ] [ -R retries ] [ config_option=value ]

DESCRIPTION

Apt-cacher is a caching proxy for Debian packages, allowing a number of computers to share a single cache. Packages requested from the cache only need to be downloaded from the Debian mirrors once, no matter how many local machines need to install them. This saves network bandwidth, improves performance for users, and reduces the load on the mirrors.

In addition to proxying and caching HTTP requests, apt-cacher can proxy and cache FTP and HTTPS GET/HEAD requests, Debian bugs SOAP requests and proxy (but not cache) HTTPS CONNECT requests.

COMMAND-LINE OPTIONS

-c configfile
Specify alternative configuration file to default [/etc/apt-cacher/apt-cacher.conf]
-d
Stand-alone daemon-mode. Fork and run in the background
-h, --help
Print brief usage.
-i
Inetd daemon-mode: Only use in /etc/inetd.conf
-p pidfile
Write PID of running process to this file.
-r directory
Experimental option to chroot to given directory
-R retries
Number of attempts to bind to daemon port.
-v, --version
Show version and exit.
config_option=value
Override values in configuration file. Can be given multiple times.

USAGE

Setting up apt-cacher involves two stages: installing apt-cacher itself on a single machine on your network to act as a server and configuring all client machines to use the server's cache.

Apt-cacher can be installed to run either as a daemon [preferred] or as a CGI script on a web server such as Apache [deprecated]. When a client (apt-get(1), aptitude(8), synaptic(8) etc.) requests a package from the cache machine, the request is handled by apt-cacher which checks whether it already has that particular package. If so, the package is returned immediately to the client for installation. If not, or if the package in the local cache has been superseded by a more recent version, the package is fetched from the specified mirror. While being fetched it is simultaneously streamed to the client, and also saved to the local cache for future use.

Other client machines on your network do not need apt-cacher installed in order to use the server cache. The only modification on each client computer is to direct it to use the server cache. See CLIENT CONFIGURATION below for ways of doing this.

SERVER INSTALLATION

Apt-cacher can be installed in various ways on the server. The recommended way is by running the program as a daemon. This should give the best performance and the lowest overall memory usage.

Daemon Mode

Stand-alone Daemon:

Edit the file /etc/default/apt-cacher and set AUTOSTART=1, then run (as root)
/etc/init.d/apt-cacher start

to start the daemon.

Inetd Daemon:

Edit /etc/inetd.conf and add the line
3142 stream tcp nowait www-data /usr/sbin/apt-cacher apt-cacher -i

Restart or send SIGHUP to inetd after saving the file. This is a good method if you do not wish the daemon to be loaded all the time.

In either daemon mode, clients can access the server using http://apt-cacher.server:port/

NOTE: in inetd mode access control checks are not performed and the allowed_hosts and denied_hosts options have no effect. Access controls for inetd can be implemented using inetd or the tcpd wrapper. See inetd.conf(5) and hosts_access(5) for further details.

CGI Mode

This mode is deprecated and not recommended for long-term use because it brings a visible performance impact on the network and server speed. To use it you will need to ensure your webserver supports CGI. Clients can access the server using http://apt-cacher.server[:port]/cgi-bin/apt-cacher/.

Migration away from deprecated CGI mode can be smoothed using the following configuration options

cgi_advise_to_use []
This is a custom error message that is used to advise clients to use an alternative to CGI.
cgi_redirect []
If set, this option is an absolute URL that is used to redirect any CGI requests. It can be used to seamlessly redirect CGI access to an instance of apt-cacher running as an INETD or stand-alone daemon.

SERVER CONFIGURATION OPTIONS

Apt-cacher uses a configuration file for setting important options. Additionally there are a few command-line options to control behaviour. See COMMAND-LINE OPTIONS above.

The default configuration file is /etc/apt-cacher/apt-cacher.conf. It is read every time the daemon starts or an inetd/CGI instance is executed. Therefore a stand-alone daemon may need to be restarted or reloaded using the init script in order to reread its configuration. A running daemon will also reread the configuration file on receiving SIGHUP (see SIGNALS below).

As an alternative to editing the configuration file, configuration fragments to override the defaults can also be placed in a directory named conf.d in the same directory as the main configuration file, e.g. '/etc/apt-cacher/conf.d/'. Files placed here can have an arbitrary name. They are read using glob(7) semantics: case-insensitive, ascending ASCII order; dot-files (beginning with '.') are ignored. Backup files ending with '~' and any files ending with '.disabled' or '.dpkg-{old,new,dist,tmp}' are also ignored. Duplicate settings read later will override any previous ones.
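
The selection and ordering rules for conf.d fragments described above can be sketched in Python. This is only an illustration of the stated rules, not apt-cacher's own code; the function name select_fragments and the regular expression are assumptions made for the example.

```python
import re

# Suffixes the text says are ignored; the pattern itself is an illustration.
_IGNORED = re.compile(r"(~|\.disabled|\.dpkg-(old|new|dist|tmp))$")


def select_fragments(names):
    """Return the fragment names that would be read, in reading order."""
    kept = [
        n for n in names
        if not n.startswith(".") and not _IGNORED.search(n)
    ]
    # Case-insensitive ascending order; ties are broken by the raw name.
    return sorted(kept, key=lambda n: (n.lower(), n))
```

Because later settings override earlier ones, a fragment that sorts last has the final say on any option it sets.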

Each line in the file consists of

configuration_option = value

Long lines can be split by preceding the newlines with '\'. Whitespace is ignored. Lines beginning with '#' are comments and are ignored. If multiple assignments of the same option occur, only the last one will take effect. For binary options, 0 means off or disabled, any other integer means on or enabled. Options which can accept lists may use either ';' or ',' to separate the individual list members. To include these separators within a list item escape them with '\'.
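
As a rough illustration of the syntax rules above (comments, continuation lines, last assignment wins, list separators), here is a minimal Python sketch. It is not apt-cacher's parser, which is written in Perl; the names parse_config and split_list are hypothetical, and stripping whitespace only around the option name and value is a simplifying assumption.

```python
import re


def split_list(value):
    """Split on unescaped ';' or ','; backslash-escaped separators are kept."""
    items = re.split(r"(?<!\\)[;,]", value)
    return [re.sub(r"\\([;,])", r"\1", i).strip() for i in items if i.strip()]


def parse_config(text):
    """Parse 'option = value' lines into a dict; the last assignment wins."""
    # Join continuation lines: a backslash directly before the newline.
    text = re.sub(r"\\\n", "", text)
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; this sketch silently skips it
        options[key.strip()] = value.strip()
    return options
```

For example, a file that sets debug twice yields only the second value, and a path_map value split across two lines with '\' is joined before its list members are separated.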

The options available in the config file (and their default settings) are:

Universal Options

admin_email [root@localhost]
The email address of the administrator is displayed in the info page and traffic reports.
allowed_locations
Only allow access to specific upstream mirrors. The requested URL must match an item in this list for access to be granted. The part of the URL referring to the apt-cacher server itself (http://apt-cacher.server:port[/apt-cacher]/) is ignored. Matching begins immediately after that. If '%PATH_MAP%' is included in this option, it will be expanded to the keys of the path_map setting. Note this item contains string(s), not regexps.
allowed_ssl_locations []
Only allow HTTPS/SSL proxy CONNECT to hosts or IPs which match an item in this list.
allowed_ssl_ports [443]
Only allow HTTPS/SSL proxy CONNECT to ports which match an item in this list. Adding further items to this can pose a significant security risk. DO NOT do it unless you understand the full implications.
cache_dir [/var/cache/apt-cacher]
The directory where apt-cacher will store local copies of all packages requested. This can grow to many hundreds of MB, so make sure it is on a partition with plenty of room. NOTE: the cache directory needs to contain some subdirectories for correct storage management. If you try to create a custom directory, please use the script /usr/share/apt-cacher/install.pl or use the initially created cache directory as an example.
concurrent_import_limit [Number of CPU cores from /proc/cpuinfo or 0]
Importing new checksums can cause high CPU usage on slower systems. This option sets a limit to the number of index files that are imported simultaneously, thereby limiting CPU load average, but, possibly, taking longer. Leave unset or set to 0 for no limit.
checksum [0]
Switches on checksum validation of cached files. Checksum validation will slow apt-cacher response as requested files have to be downloaded completely before validation can occur. This slow down can be prevented by setting this value to 'lazy' in which case files will be passed on as they are received and checked afterwards. Requires package libberkeleydb-perl to be installed.
checksum_files_regexp [see default /etc/apt-cacher/apt-cacher.conf]
Perl regular expression (perlre(1)) which matches the URL filename of all index files from which checksums are imported into the checksum database if checksum mode is enabled.
clean_cache [1]
Whether to flush obsolete versions of packages from your cache daily. You can check what will be done by running
/usr/share/apt-cacher/apt-cacher-cleanup.pl -s
which will just show what would be done to the contents of the cache. A package is only obsolete if none of the (optional) namespaces, distributions (stable, testing, etc) or architectures you use reference it. It should be safe to leave this on.
curl_idle_timeout [120]
The maximum time in seconds the libcurl backend will wait, unused, before exiting.
curl_ssl_insecure []
If this is set to 1, HTTPS GET requests (which are inherently insecure as transfer from the proxy to the client is unverified) are even less secure, as the libcurl backend skips peer verification with the upstream source. You really shouldn't use this. HTTPS CONNECT proxying is more secure.
curl_throttle [10]
Controls how fast the libcurl process runs. Increasing this setting will reduce the CPU load of the libcurl process, possibly at the expense of slower response times and a lower throughput. On most systems this option should be left unchanged.
data_timeout [120]
Time in seconds after which a request will time out if no data is received from upstream. This option used to be known as fetch_timeout. The old name is still recognised for backwards compatibility.
debug [0]
Whether debug mode is enabled. Off by default. When turned on (non-zero), lots of extra debug information will be written to the error log. This can make the error log become quite big, so only use it when trying to debug problems. Additional information from the libcurl backend can be obtained by increasing this parameter. The correspondence between this setting and curl_infotype is:
1
CURLINFO_TEXT
2
CURLINFO_HEADER_IN
3
CURLINFO_HEADER_OUT
4
CURLINFO_DATA_IN
5
CURLINFO_DATA_OUT
6
CURLINFO_SSL_DATA_IN
7
CURLINFO_SSL_DATA_OUT
See CURLOPT_DEBUGFUNCTION in curl_easy_setopt(3) for further information.
disk_usage_limit []
Optional upper limit for the disk space usage of cache_dir in bytes. Units (k, KiB, M, MiB etc.) are recognised with the same semantics as the limit option below.
distinct_namespaces [0]
Set this to 1 to enable support for caching multiple Distributions (e.g. Debian and Ubuntu) from a single apt-cacher instance. When enabled, package files are stored in a subdirectory, the name of which is derived from the matching key of path_map or the part of the URL preceding 'pool' or 'dists'. This is typically 'debian', 'ubuntu' or 'security'. This mechanism prevents clashes between the Distributions.
If you enable this option, any existing namespace specific package files which are not in the correct subdirectory of cache_dir would be deleted by apt-cacher-cleanup.pl. If you wish to keep them and import them into the correct namespace then run (as root)
/usr/share/apt-cacher/apt-cacher-import.pl -u {cache_dir}/packages
If you wish to limit the possible namespaces see path_map.
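
To make the namespace rule concrete, the following Python sketch derives a namespace from a request URL by taking the path component immediately before 'pool' or 'dists'. Treating "the part of the URL preceding" as a single path component is an assumption, as are the function name guess_namespace and the example package filename; apt-cacher's real derivation (which also consults path_map keys) may differ.

```python
from urllib.parse import urlsplit


def guess_namespace(url):
    """Return the path component just before 'pool' or 'dists', or None."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    for i, part in enumerate(parts):
        if part in ("pool", "dists") and i > 0:
            return parts[i - 1]
    return None
```

With this rule, requests routed through ftp.debian.org/debian and archive.ubuntu.com/ubuntu land in 'debian' and 'ubuntu' respectively, so identically named package files cannot clash.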
expire_hours [0]
How many hours Package and Release files are cached before they are assumed to be too old and must be re-fetched. Setting 0 means that the validity of these files is checked on each access by comparing time stamps in HTTP headers on the server with those stored locally. Use of this setting is deprecated as HTTP header validation by sending If-Modified-Since requests is much more efficient.
generate_reports [1]
Whether to generate traffic reports daily. Traffic reports can be accessed by pointing a browser to
http://apt-cacher.server:3142/report/ [daemon mode]
or
http://apt-cacher.server[:port]/apt-cacher/report/ [CGI mode].
group [www-data]
The effective group id to change to.
http_proxy []
Apt-cacher can pass all its requests to an external http proxy like Squid, which could be very useful if you are using an ISP that blocks port 80 and requires all web traffic to go through its proxy. The option takes a URI of the form
[protocol://][user[:password]@]hostname[:port]
The default protocol is http and the default port is 1080.
http_proxy_auth []
External HTTP proxies sometimes need authentication to get full access. The format is 'username:password', e.g. 'proxyuser:proxypass'. This option is deprecated. Proxy authentication details should be specified in the http_proxy URI.
index_files_regexp [see default /etc/apt-cacher/apt-cacher.conf]
Perl regular expression (perlre(1)) which matches the URL filename of all index-type files (files that are uniquely identified by their full path and need to be checked for freshness).
installer_files_regexp [see default /etc/apt-cacher/apt-cacher.conf]
Perl regular expression (perlre(1)) which matches the URL filename of all files used by Aptitude, Apt, the Debian/Ubuntu installer or Debian Live (files that are uniquely identified by their full path but don't need to be checked for freshness). Within this item the shorthand '%VALID_UBUNTU_RELEASE_NAMES%' will be expanded to the list configured in ubuntu_release_names as regexp alternatives.
interface []
Specify a particular interface to use for the upstream connection. Can be an interface name, IP address or host name. If unset, the default route is used.
libcurl []
This is a list of configuration options for the libcurl backend. Each item consists of the curl_easy_setopt(3) option (without the CURLOPT_ prefix) and the desired setting. For example:
libcurl = dns_cache_timeout 300, maxredirs 10, noproxy localhost;
Be very careful with this. Apt-cacher depends on the libcurl backend working in a predictable way. You can very easily break things by configuring this.
limit [0]
Rate limiting sets the maximum rate in bytes per second used for fetching files from the upstream mirrors. Optionally, use SI unit abbreviations ('k', 'M', 'G' etc.) for decimal multiples (1000) or 'KiB', 'MiB', or 'GiB' etc. for binary (1024) multiples. Legacy lowercase suffixes based on wget(1) syntax are interpreted as decimal for backwards compatibility, but should be avoided in new configurations. Use 0 for no rate limiting. By default this setting is per libcurl connection. For global limiting, see limit_global below.
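
The unit handling described for limit can be illustrated with a small Python sketch. It is an approximation written for this page, not apt-cacher's implementation: the function name parse_rate, the supported unit set and the case-insensitive matching of suffixes are all assumptions.

```python
import re

# Decimal (1000-based) suffixes, including legacy lowercase forms.
_DECIMAL = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}
# Binary (1024-based) suffixes.
_BINARY = {"kib": 2**10, "mib": 2**20, "gib": 2**30, "tib": 2**40}


def parse_rate(value):
    """Convert a string such as '500k' or '2MiB' to bytes per second."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", value)
    if not match:
        raise ValueError(f"unrecognised rate: {value!r}")
    number, unit = float(match.group(1)), match.group(2).lower()
    if unit in _BINARY:
        factor = _BINARY[unit]
    elif unit in _DECIMAL:
        factor = _DECIMAL[unit]
    else:
        raise ValueError(f"unrecognised unit: {unit!r}")
    return int(number * factor)
```

A result of 0 corresponds to no rate limiting.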
limit_global [0]
If set, this makes the specified rate limit apply overall. The libcurl backend does not have good support for this, so it is implemented by a simple division of the rate by the total number of downloads. There is no way for downloads, dynamically, to use bandwidth released by another idle/finished download. If you really want good global bandwidth control, don't use this option at all; use traffic shaping instead.
log_dir [/var/log/apt-cacher]
Directory to use for the access and error log files and traffic report. The access log records all successful package requests using a timestamp, whether the request was fulfilled from cache, the IP address of the requesting computer, the size of the package transferred, and the name of the package. The error log records major faults, and is also used for debug messages if the debug directive is set to 1. Debugging is toggled by sending SIGUSR1 (see SIGNALS below).
This option was formerly named 'logdir', but was renamed for consistency. logdir is still recognised but should not be used for new installations.
max_loadavg []
If set, this limits the maximum 1 minute loadavg permitted for apt-cacher to attempt to handle a client connection.
offline_mode [0]
Avoid making any outgoing connections, return files available in the cache or just return errors if they are missing.
package_files_regexp [see default /etc/apt-cacher/apt-cacher.conf]
Perl regular expression (refer to the perlre(1) manpage) which matches the URL filename of all package-type files (files that are uniquely identified, within their Distribution, by their filename).
path_map []
A mapping scheme to rewrite URLs, which converts the first part of the URL after the apt-cacher server name (the key) to a remote mirror. For example, if you set
path_map = debian ftp.debian.org/debian
retrieving
http://apt-cacher.server:3142/debian/dists/stable/Release
will actually fetch
http://ftp.debian.org/debian/dists/stable/Release
If distinct_namespaces is set, then you can use multiple mappings to cache different Distributions separately. For example
path_map = debian ftp.debian.org/debian; ubuntu archive.ubuntu.com/ubuntu
Multiple, space separated, mirrors can be given for each mapping. They will be tried in order in the event of failure/unavailability of the previous one. For example
path_map = debian ftp.uk.debian.org/debian ftp.debian.org/debian ;
All path_map keys need to be included in allowed_locations, if the latter is used. The shorthand '%PATH_MAP%' can be used to simplify this.
There are 4 internal path_map settings for the Debian and Ubuntu changelog and AppStream servers which will be merged with the specified configuration:
debian-changelogs packages.debian.org metadata.ftp-master.debian.org; ubuntu-changelogs changelogs.ubuntu.com; debian-appstream appstream.debian.org; ubuntu-appstream appstream.ubuntu.com;
These can be overridden by specifying an alternative mirror for that key, or deleted by just specifying the key with no mirror.
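
The key-to-mirror rewriting and the fallback order between mirrors can be sketched in Python as follows. This is a simplified model for illustration only: the function names are hypothetical, the upstream scheme is assumed to be http, and the merging of the internal settings and reverse_path_map are not modelled.

```python
def parse_path_map(value):
    """Turn 'key mirror [mirror...]; ...' into {key: [mirrors]}."""
    mapping = {}
    for entry in value.replace(",", ";").split(";"):
        fields = entry.split()
        if fields:
            # A key given with no mirror maps to an empty list.
            mapping[fields[0]] = fields[1:]
    return mapping


def candidate_urls(path, mapping):
    """List upstream URLs to try, in order, for a request path."""
    key, _, rest = path.lstrip("/").partition("/")
    if key not in mapping:
        # No mapping: the path itself is taken to name the upstream host.
        return ["http://" + path.lstrip("/")]
    return [f"http://{mirror}/{rest}" for mirror in mapping[key]]
```

With the two-mirror example above, a request for /debian/dists/stable/Release would be tried against ftp.uk.debian.org first and ftp.debian.org second.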
pdiff_files_regexp [see default /etc/apt-cacher/apt-cacher.conf]
Perl regular expression (perlre(1)) which matches APT pdiff files. These are ed(1) scripts which APT uses to patch index files rather than redownloading the whole file afresh.
return_buffer_size [1048576]
The buffer size that is used for reads when returning a file using the slower read/write loop. The default is 1MB. You may wish to adjust this to trade speed against memory consumption. By default files are returned using sendfile(2) which is much faster and does not make use of this setting.
reverse_path_map [1]
This setting enables a reverse map from the requested URL to the path_map key. It helps prevent having multiple copies of the same file cached under different file names.
skip_checksum_files_regexp [see default /etc/apt-cacher/apt-cacher.conf]
Perl regular expression (perlre(1)) which matches the URL filename for which checksum validation is not performed. Note that all files matched by installer_files_regexp are automatically skipped and do not need to be added here as well.
soap_url_regexp [see default /etc/apt-cacher/apt-cacher.conf]
Perl regular expression (perlre(1)) which matches URLs that are permitted as upstream source for apt-listbugs(1) requests.

apt-listbugs(1) makes requests to the Debian Bugs server via SOAP POST requests. These are not cached, but are simply passed through as a convenience.

supported_archs [see default /etc/apt-cacher/apt-cacher.conf]
A list of architectures that are supported and that you wish to allow in fetched filenames. 'all' and 'any' are automatically added to this list, which is then used to expand %VALID_ARCHS% in *_files_regexp.
use_proxy use_proxy_auth [obsolete]
Use of external proxy and proxy authentication used to be turned on or off with these options. They are now ignored and an upstream proxy will always be used if configured.
use_sendfile [Sys::Syscall::sendfile_defined()]
By default, if sendfile(2) is available on the system, it is used to return files to the client and is much more efficient than read/write. The default, which is determined by the value returned by Sys::Syscall::sendfile_defined(), can be explicitly overridden.
user [www-data]
The effective user id to change to after allocating the ports.

Stand-alone Daemon-mode Options

allowed_hosts []
If your apt-cacher server is directly connected to the Internet and you are worried about unauthorised fetching of packages through it, you can specify a range of IP addresses that are allowed to use it. Localhost (127.0.0.1/8, ::1) is always allowed, other addresses must be matched by allowed_hosts and not by denied_hosts to be permitted to use the cache. This option can be a single item, list, IP address with netmask or IP range, resolvable hostname, or '*' to allow all access. See the default configuration file for further details and examples.
allowed_hosts_6
Deprecated option analogous to allowed_hosts, but for IPv6 clients. allowed_hosts can now take IPv6 addresses directly.
daemon_addr []
The daemon can be restricted to listen only on particular local IP address(es). If unset, the daemon will listen on all available addresses. Single item or list of IPs. Use with care.
daemon_port [3142]
The TCP port to bind to.
denied_hosts
The opposite of allowed_hosts setting, excludes hosts from the list of allowed hosts. Not used in inetd daemon mode.
denied_hosts_6
Deprecated option analogous to denied_hosts, but for IPv6 clients. denied_hosts can now take IPv6 addresses directly.
request_empty_lines [5]
The number of empty lines tolerated before an incoming connection is closed.
request_timeout [10]
Maximum time in seconds that will be waited for an incoming request before closing the connection.

CLIENT CONFIGURATION

There are two different ways of configuring clients to use apt-cacher's cache. Ensure that you do not use a mixture of both methods. Changing both proxy settings and base URLs can create some confusion.

Access cache like a mirror
To use the cache in this way, edit /etc/apt/sources.list on each client and prepend the address of the apt-cacher server to each deb/src line. This mode is limited to using HTTP protocol for the upstream source.
For example, if you have:
deb http://ftp.debian.org stable main
change it to read either
deb http://apt-cacher.server[:port]/ftp.debian.org stable main [server in daemon mode]
or
deb http://apt-cacher.server[:port]/apt-cacher/ftp.debian.org stable main [server in CGI mode]
Access cache like a proxy
For clients to use the cache in this way, set the apt-cacher server as a proxy on each client by setting the proxy URL in apt.conf. For example:
Acquire::http::Proxy "http://apt-cacher.server:port";
See apt.conf(5) for further details.
It is not recommended to set the http_proxy environment variable as this may affect a wide variety of applications using a variety of URLs. Apt-cacher will not work as a general purpose web cache!
In this mode apt-cacher supports HTTP, FTP, HTTPS CONNECT, HTTPS GET and Debian Bugs SOAP proxying.
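
For the "access cache like a mirror" method, the edit to each sources.list line is mechanical and could be scripted. The Python sketch below shows one way to do it for a server in daemon mode; the function name via_cache and the server address are assumptions, and it does not check whether a line already points at the cache.

```python
import re


def via_cache(line, server):
    """Prefix the mirror in one sources.list line with the cache address."""
    # Only http:// deb/deb-src lines are rewritten; others pass through.
    return re.sub(
        r"^(\s*deb(?:-src)?\s+(?:\[[^\]]*\]\s+)?)http://",
        lambda m: f"{m.group(1)}http://{server}/",
        line,
    )
```

Applied to the example line above with the server given as apt-cacher.server:3142, this produces the daemon-mode form shown there. Lines using other protocols are left untouched, in keeping with the HTTP-only limitation of this method.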

FAQ

Q: Can I just copy some .debs into the cache dir and have it work?

A: Almost! A little additional work is required to make them usable and persistent in the cache.

First: alongside the .debs apt-cacher stores the HTTP headers that were sent from the server. If you copy .debs straight into the storage directory and don't add those things, fetching them *will* fail.

Fortunately Apt-cacher now comes with an import helper script to make things easier. Just put a bunch of .debs into /var/cache/apt-cacher/import (or a directory called 'import' inside whatever you've set your cache dir to be), and run /usr/share/apt-cacher/apt-cacher-import.pl (you can specify an alternative source directory with the first parameter). The script will run through all the package files it finds in that dir and move them around to the correct locations plus create additional flag/header files. Run it with "-h" to get more information about how to use additional features -- it can work in recursive mode while discovering the files and save space by making links to files located elsewhere in the filesystem.

Second: if the daily cleanup operation is enabled (see clean_cache option above) and there is no Packages.gz (or .bz2) file that refers to the new files, the package files will be removed really soon. From another point of view: if there are potential clients that would download these packages and the clients did run "apt-get update" using apt-cacher once, there is no reason to worry.

Q: I have an invalid/corrupt file in the cache, how can I remove it?

A: There are several possibilities for this:

1)
In either daemon mode, apt-cacher tries to be a well-behaved cache and respects Cache-Control and Pragma headers. Refreshing a file can be forced by adding Cache-Control: no-cache to the request. The easiest way of doing this is with the --no-cache option of wget(1), for example:
wget -O/dev/null --no-cache http://localhost:3142/debian/dists/stable/Release
2)
Enable checksumming which will validate cached files as they are fetched.
3)
Wait until after apt-cacher-cleanup.pl has run (which should remove invalid files).
4)
Manually delete the file from /var/cache/apt-cacher/packages.
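
The first possibility, forcing a refresh with a no-cache request, can also be done without wget. The following Python sketch builds an equivalent request with the standard library; the function names are hypothetical, and the URL in the comment assumes a daemon on localhost:3142 as in the wget example.

```python
import urllib.request


def refresh_request(url):
    """Build a GET request that asks the cache to refetch from upstream."""
    return urllib.request.Request(
        url,
        headers={"Cache-Control": "no-cache", "Pragma": "no-cache"},
    )


def force_refresh(url):
    """Send the request and discard the body, like wget -O/dev/null."""
    with urllib.request.urlopen(refresh_request(url)) as response:
        response.read()
        return response.status

# Example (needs a running daemon):
# force_refresh("http://localhost:3142/debian/dists/stable/Release")
```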

Q: Does the daily generation of reports or cleaning the cache depend on whether apt-cacher is running continuously as a daemon?

A: No, the regular maintenance jobs are independent of a running server. They are executed by cron(8) and use only static data like logs and cached index files and package directory listing. However, apt-cacher should be configured correctly because cleanup runs it directly (in inetd mode) to refresh the Packages/Sources files.

Q: Are host names permissible in the configuration file?

A: Since 1.7.0 DNS resolvable hostnames are permissible.

Unlike some other software such as Apache, there is no configurable checking order for access control. Instead, a client host is checked against both filters, allowed_hosts and denied_hosts. The following combinations are possible: allowed_hosts=* and denied_hosts is empty, then every host is allowed; allowed_hosts=<ip data> and denied_hosts is empty, then only the defined hosts are permitted; allowed_hosts=* and denied_hosts=<ip data>, then every host is accepted except those matched by denied_hosts; allowed_hosts=<ip data> and denied_hosts=<ip data>, then only the clients from allowed_hosts are accepted, except those matched by denied_hosts. An empty allowed_hosts blocks everything (except localhost). denied_hosts must not have a "*" value; use an empty allowed_hosts setting if you want that.
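
The combined effect of the two filters can be modelled with a few lines of Python. This is a sketch of the decision logic only, under the assumption that list items are '*', single addresses or CIDR networks; hostnames and IP ranges, which apt-cacher also accepts, are not handled, and the function names are hypothetical.

```python
import ipaddress


def _matches(client, items):
    """True if the client address falls inside any listed address or network."""
    addr = ipaddress.ip_address(client)
    for item in items:
        if item == "*":
            return True
        if addr in ipaddress.ip_network(item, strict=False):
            return True
    return False


def is_permitted(client, allowed_hosts, denied_hosts):
    """Apply both filters: matched by allowed_hosts and not by denied_hosts."""
    if ipaddress.ip_address(client).is_loopback:
        return True  # localhost is always allowed
    return _matches(client, allowed_hosts) and not _matches(client, denied_hosts)
```

Note that there is no ordering between the two lists in this model: a client matched by denied_hosts is refused even if allowed_hosts also matches it.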

Q: generate_reports: how does being able to view the reports depend on the web server you are running? Are they only available if apt-cacher is running on port 80?

A: The report is generated using a script (started by a cron job, see above) and is stored as $log_dir/report.html. You can access it using the "/report" path in the access URL. If apt-cacher is running in CGI mode, then the URL for the browser looks like http://apt-cacher.server[:port]/apt-cacher/report/.

LIMITATIONS

Apt-cacher currently handles forwarding GET requests to HTTP, FTP and HTTPS sources. Support for other access methods (ssh, rsync) is not currently planned.

SIGNALS

Apt-cacher handles the following signals:
HUP
Causes the configuration file to be re-read.
USR1
Toggles printing of debug output to /var/log/apt-cacher/error.log
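
If the daemon was started with -p pidfile, a signal can be delivered by reading the PID from that file. The Python sketch below shows the idea; the function name signal_daemon is hypothetical, and the pidfile location is whatever was passed to -p (no particular path is assumed here).

```python
import os
import signal


def signal_daemon(pidfile, signum=signal.SIGHUP):
    """Send signum to the PID recorded in pidfile and return that PID."""
    with open(pidfile) as handle:
        pid = int(handle.read().strip())
    os.kill(pid, signum)
    return pid
```

Calling it with the default asks the daemon to re-read its configuration; passing signal.SIGUSR1 toggles debug output instead.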

FILES

/etc/apt-cacher/apt-cacher.conf
main configuration file
/etc/apt-cacher/conf.d/
configuration fragments directory
/var/cache/apt-cacher/
cache/working directory
/var/log/apt-cacher
log directory, rotated by logrotate(8) if available
/var/log/apt-cacher/report.html
report page, generated by the helper script

AUTHOR

Apt-cacher was originally written by Nick Andrews. This manual page was originally written by Jonathan Oxer, for the Debian GNU/Linux system (but may be used by others). It was maintained by Eduard Bloch, and it is now maintained by Mark Hindley.