dgit(7) principles of operation

SUMMARY

dgit treats the Debian archive as a version control system, and bidirectionally gateways between the archive and git. The git view of the package can contain the usual upstream git history, and will be augmented by commits representing uploads done by other developers not using dgit. This git history is stored in a canonical location known as dgit-repos which lives on a dedicated git server.

MODEL

You may use any suitable git workflow with dgit, provided you satisfy dgit's requirements:

dgit maintains a pseudo-remote called dgit, with one branch per suite. This remote cannot be used with plain git.

The dgit-repos repository for each package contains one ref per suite named refs/dgit/suite. These refs should be pushed to only by dgit. They are fast forwarding, and each push to one of them corresponds to an upload (or attempted upload).

However, it is perfectly fine to have other branches in dgit-repos; normally the dgit-repos repo for the package will be accessible via the remote name `origin'.

dgit push will also make signed tags called debian/version (a la DEP-14) and push them to dgit-repos. These are used at the server to authenticate pushes.

dgit push can operate on any commit which is a descendant of the current dgit/suite tip in dgit-repos.
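The descendant requirement amounts to a fast-forward check. The following is a minimal sketch of that check on a toy commit graph, not dgit's actual implementation; the function name is_descendant, the dictionary representation of history, and the single-letter commit names are all hypothetical.

```python
from typing import Dict, List


def is_descendant(parents: Dict[str, List[str]], candidate: str, tip: str) -> bool:
    """Return True if `tip` is reachable from `candidate` by following parents.

    A commit counts as a descendant of itself here, which matches the usual
    notion of a fast-forward (pushing the tip again changes nothing).
    """
    stack = [candidate]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == tip:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


# Hypothetical history: A <- B <- C is the suite branch, D forks from A.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"]}

print(is_descendant(history, "C", "B"))  # C builds on the tip B
print(is_descendant(history, "D", "B"))  # D does not contain B
```

In a real repository the equivalent question can be asked with git merge-base --is-ancestor, which exits successfully when its first argument is an ancestor of its second.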

Uploads made by dgit contain an additional field Dgit in the source package .dsc. (This is added by dgit push.) This specifies a commit (an ancestor of the dgit/suite branch) whose tree is identical to the unpacked source upload.
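As an illustration, the commit named by the Dgit field could be read out of a .dsc roughly as follows. This is a sketch under assumptions: it takes the first whitespace-separated token of the field value to be the commit id, it ignores any signature armour around the file, and the helper name and sample .dsc contents are hypothetical.

```python
from typing import Optional


def dgit_commit_from_dsc(dsc_text: str) -> Optional[str]:
    """Return the commit id from the Dgit field, or None if the field is absent.

    Assumption: the commit id is the first token of the field value.
    """
    for line in dsc_text.splitlines():
        if line[:1].isspace():
            continue  # continuation line of a previous field
        name, sep, value = line.partition(":")
        # Control field names are compared case-insensitively.
        if sep and name.lower() == "dgit":
            tokens = value.split()
            return tokens[0] if tokens else None
    return None


# Hypothetical .dsc excerpt, for illustration only.
sample_dsc = """\
Format: 3.0 (quilt)
Source: hello
Version: 1.0-1
Dgit: 0123456789abcdef0123456789abcdef01234567
"""

print(dgit_commit_from_dsc(sample_dsc))
```

A .dsc without the field yields None, which corresponds to an upload not made by dgit.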

Uploads not made by dgit are represented in git by commits which are synthesised by dgit. The tree of each such commit corresponds to the unpacked source; there is an origin commit with the contents, and a pseudo-merge from the last known upload - that is, from the contents of the dgit/suite branch.

dgit expects repos that it works with to have a dgit remote. This refers to the well-known dgit-repos location (on a dedicated Debian VM). dgit fetch updates the remote tracking branch for dgit/suite.

dgit does not (currently) represent the orig tarball(s) in git. The orig tarballs are downloaded (by dgit clone) into the parent directory, as with a traditional (non-gitish) dpkg-source workflow. You need to retain these tarballs in the parent directory for dgit build and dgit push.

dgit repositories can be cloned with standard (git) methods. The only exception is that for sourceful builds / uploads the orig tarball(s) need to be present in the parent directory.
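After a plain git clone it can be useful to check whether the orig tarballs are where a sourceful build expects them. The sketch below looks in the parent of the working tree for files following the dpkg-source naming convention package_upstreamversion.orig.tar.ext, plus the orig-component variant. The helper name find_orig_tarballs and the example package are hypothetical, and this is not how dgit itself locates the files.

```python
import re
import tempfile
from pathlib import Path
from typing import List


def find_orig_tarballs(work_tree: Path, package: str, upstream_version: str) -> List[Path]:
    """List orig tarballs for the package in the parent of `work_tree`."""
    prefix = re.escape(f"{package}_{upstream_version}")
    # Matches e.g. hello_1.0.orig.tar.gz and hello_1.0.orig-docs.tar.xz,
    # but not detached signatures such as hello_1.0.orig.tar.gz.asc.
    pattern = re.compile(prefix + r"\.orig(-[A-Za-z0-9-]+)?\.tar\.[a-z0-9]+$")
    parent = work_tree.resolve().parent
    return sorted(p for p in parent.iterdir() if p.is_file() and pattern.match(p.name))


# Demonstration in a throwaway directory with a hypothetical package.
with tempfile.TemporaryDirectory() as tmp:
    parent = Path(tmp)
    (parent / "hello").mkdir()                      # the git working tree
    (parent / "hello_1.0.orig.tar.gz").touch()      # the orig tarball
    (parent / "hello_1.0-1.debian.tar.xz").touch()  # not an orig tarball
    found = find_orig_tarballs(parent / "hello", "hello", "1.0")
    print([p.name for p in found])
```

An empty result means the tarballs still have to be obtained before running dgit build or dgit push for a sourceful upload.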

To a user looking at the archive, changes pushed using dgit look like changes made in an NMU: in a `3.0 (quilt)' package the delta from the previous upload is recorded in a new patch constructed by dpkg-source.

READ-ONLY DISTROS

Distros which do not maintain a set of dgit history git repositories can still be used in a read-only mode with dgit. Currently Ubuntu is configured this way.

PACKAGE SOURCE FORMATS

If you are not the maintainer, you do not need to worry about the source format of the package. You can just make changes as you like in git. If the package is a `3.0 (quilt)' package, the patch stack will usually not be represented in the git history.

FORMAT 3.0 (QUILT)

For a format `3.0 (quilt)' source package, dgit may have to make a commit on your current branch to contain metadata used by quilt and dpkg-source.

This is because the `3.0 (quilt)' source format represents the patch stack as files in debian/patches/ actually inside the source tree. This means that, taking the whole tree (as seen by git or ls), (i) dpkg-source cannot represent certain trees, and (ii) packing up a tree in `3.0 (quilt)' and then unpacking it does not always yield the same tree.

dgit will automatically work around this for you when building and pushing. The only thing you need to know is that dgit build, sbuild, etc., may make new commits on your HEAD. If you're not a quilt user these commits won't contain any changes to files you care about.

You can explicitly request that dgit do just this fixup, by running dgit quilt-fixup.

If you are a quilt user you need to know that dgit's git trees are `patches applied packaging branches' and do not contain the .pc directory (which is used by quilt to record which patches are applied). If you want to manipulate the patch stack you probably want to be looking at tools like git-dpm.

FILES IN THE SOURCE PACKAGE BUT NOT IN GIT - AUTOTOOLS ETC.

This section is mainly of interest to maintainers who want to use dgit with their existing git history for the Debian package.

Some developers like to have an extra-clean git tree which lacks files which are normally found in source tarballs and therefore in Debian source packages. For example, it is conventional to ship ./configure in the source tarball, but some people prefer not to have it present in the git view of their project.

dgit requires that the source package unpacks to exactly the same files as are in the git commit on which dgit push operates. So if you just try to dgit push directly from one of these extra-clean git branches, it will fail.

As the maintainer you therefore have the following options:

  • Persuade upstream that the source code in their git history and the source they ship as tarballs should be identical. Of course simply removing the files from the tarball may make the tarball hard for people to use.
  • One answer is to commit the (maybe autogenerated) files, perhaps with some simple automation to deal with conflicts and spurious changes. This has the advantage that someone who clones the git repository finds the program just as easy to build as someone who uses the tarball.
  • Have separate git branches which do contain the extra files, and after regenerating the extra files (whenever you would have to anyway), commit the result onto those branches.
  • Provide source packages which lack the files you don't want in git, and arrange for your package build to create them as needed. This may mean not using upstream source tarballs and makes the Debian source package less useful for people without Debian build infrastructure.

Of course it may also be that the differences are due to build system bugs, which cause unintended files to end up in the source package. dgit will notice this and complain. You may have to fix these bugs before you can unify your existing git history with dgit's.
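The requirement that the unpacked source package and the git commit contain exactly the same files can be pictured as a comparison of two trees. The following sketch models each tree as a mapping from path to file contents; the names TreeDiff and compare_trees and the example files are hypothetical, and dgit's own check is not implemented this way.

```python
from typing import Dict, List, NamedTuple


class TreeDiff(NamedTuple):
    only_in_git: List[str]
    only_in_source: List[str]
    differing: List[str]


def compare_trees(git_tree: Dict[str, bytes], source_tree: Dict[str, bytes]) -> TreeDiff:
    """Report paths that are missing on either side or whose contents differ."""
    git_paths, src_paths = set(git_tree), set(source_tree)
    return TreeDiff(
        only_in_git=sorted(git_paths - src_paths),
        only_in_source=sorted(src_paths - git_paths),
        differing=sorted(
            p for p in git_paths & src_paths if git_tree[p] != source_tree[p]
        ),
    )


# An extra-clean git branch without ./configure, and a source package
# (hypothetical contents) that ships the generated file.
git_tree = {"configure.ac": b"AC_INIT", "src/main.c": b"int main(void) { return 0; }"}
source_tree = dict(git_tree, configure=b"#!/bin/sh")

diff = compare_trees(git_tree, source_tree)
print(diff.only_in_source)
```

Any non-empty field in the result corresponds to a situation in which a push from that branch would be refused.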

FILES IN THE SOURCE PACKAGE BUT NOT IN GIT - DOCS, BINARIES ETC.

Some upstream tarballs contain build artifacts which upstream expects some users not to want to rebuild (or indeed to find hard to rebuild), but which in Debian we always rebuild.

Examples sometimes include crossbuild firmware binaries and documentation. To avoid problems when building updated source packages (in particular, to avoid trying to represent as changes in the source package uninteresting or perhaps unrepresentable changes to such files) many maintainers arrange for the package clean target to delete these files.

dpkg-source does not (with any of the commonly used source formats) represent deletion of files (outside debian/) present in upstream. Thus deleting such files in a dpkg-source working tree does not actually result in them being deleted from the source package. In effect, deleting the files in rules clean merely sweeps this problem under the rug.

However, git does always properly record file deletion. Since dgit's principle is that the dgit git tree is the same as the output of dpkg-source -x, that means that a dgit-compatible git tree always contains these files.

For the non-maintainer, this can be observed in the following suboptimal occurrences:

  • The package clean target often deletes these files, making the git tree dirty when trying to build the source package, etc. This can be fixed by using dgit -wg aka --clean=git, so that the package clean target is never run.
  • The package build modifies these files, so that builds make the git tree dirty. This can be worked around by using `git reset --hard' after each build (or at least before each commit or push).

From the maintainer's point of view, the main consequence is that to make a dgit-compatible git branch it is necessary to commit these files to git. The maintainer has a few additional options for mitigation: for example, it may be possible for the rules file to arrange to do the build in a temporary area, which avoids updating the troublesome files; they can then be left in the git tree without seeing trouble.

PROBLEMS WITH PACKAGE CLEAN TARGETS ETC.

A related problem is other unexpected behaviour by a package's clean target. If a package's rules modify files which are distributed in the package, or simply forget to remove certain files, dgit will complain that the tree is dirty.

Again, the solution is to use dgit -wg aka --clean=git, which instructs dgit to use git clean instead of the package's clean target, along with perhaps git reset --hard before each build.

This is 100% reliable, but has the downside that if you forget to git add or to commit, and then use dgit -wg or git reset --hard, your changes may be lost.