ara(1) a utility for doing boolean regexp queries on the Debian package database

SYNOPSIS

Batch mode:

ara [options] query

In batch mode, ara takes one or more queries as arguments, reads the database files according to its configuration, and outputs the results to stdout.

Interactive mode:

ara [options] -i

With the -i or -interactive option, ara reads the database files and then prompts the user for queries or commands. The results are displayed (with the help of a pager such as more or less if necessary), and ara prompts the user again. Interactive mode is strongly recommended: loading the package databases can take a long time, but once they are loaded, queries run quite fast. This is a major advantage of ara over tools such as dpkg-iasearch or dpkg-dctrl.

For key bindings see KEY BINDINGS.

Graphical interface (GTK2):

A graphical interface, xara(1), is provided by the Debian package xara-gtk.

Query syntax

See the EXAMPLES section for a quick introduction; xara has some built-in help. The syntax is described in detail below.

DESCRIPTION

ara and xara allow the user to search the Debian software package database (which includes installed and uninstalled packages) using powerful queries made of boolean combinations of regular expressions acting on fields given by patterns.

For example, the query section=utils & depends:(gtk or tk8 or xlibs or kde or gnome or qt) & debian & package will display packages in the section utils that have graphical interfaces (because they depend on graphical toolkits or X11 libraries), and whose description contains the words debian and package.

RATIONALE

Debian users can easily install software with the commands dselect or apt-get install. They can choose (on Debian 3.1 unstable) from over 30,000 packages. Finding the right package can be quite difficult. Although packages are categorized in crude sections, there are still too many packages and reading all descriptions is out of the question.

The database files are huge and their mail-like syntax makes them hard to search with line-oriented tools like grep. There exist commands such as dpkg-iasearch(1) or dpkg-dctrl(1) but their capabilities are limited. Graphical package management tools such as aptitude or synaptic have search capabilities. Although ara can call apt to install or remove packages, its orientation is that of a powerful search tool. Indeed, the name ara comes from the imperative form of the Turkish verb aramak which means "to search".

THE DEBIAN PACKAGE DATABASE

The database of Debian packages is a huge text file at /var/lib/dpkg/available (or a collection of text files under /var/lib/apt/lists/). These files are in a mailbox-like format, and a typical entry looks like this:

Priority: required
Section: base
Installed-Size: 460
Origin: debian
Maintainer: Dpkg Development <[email protected]>
Bugs: debbugs://bugs.debian.org
Architecture: i386
Source: dpkg
Version: 1.10.24
Replaces: dpkg (<< 1.10.3)
Depends: libc6 (>= 2.3.2.ds1-4), ....
Filename: pool/main/d/dpkg/dselect_1.10.24_i386.deb
Size: 119586
MD5sum: c740f7f68dab08badf4f60b51a33500a
Description: a user tool to manage Debian packages
 dselect is the primary user interface for installing, removing and
 managing Debian packages. It is a front-end to dpkg.

Each package is thus described by a set of fields (like Package, Description, Version...).
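The mapping from an entry to its set of fields can be sketched in Python. This is only an illustration of the file format shown above, not ara's own parser; the function name parse_stanza is hypothetical, and the sketch assumes that continuation lines start with a space or tab.

```python
def parse_stanza(text):
    """Parse one mailbox-style package entry into a dict of field -> value."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate entries; ignore them here
        if line[0] in " \t" and current is not None:
            # Continuation line: it extends the value of the previous field.
            fields[current] += "\n" + line.strip()
        else:
            # "Name: value" line; split on the first colon only.
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields


entry = """Priority: required
Section: base
Description: a user tool to manage Debian packages
 dselect is the primary user interface for installing, removing and
 managing Debian packages. It is a front-end to dpkg."""

fields = parse_stanza(entry)
print(sorted(fields))
print(fields["Description"].splitlines()[0])
```

Splitting on the first colon only keeps values such as debbugs://bugs.debian.org intact.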

QUERY SYNTAX AND SEMANTICS

Here we describe the query syntax in some detail. As of version 1.0, ara introduces a new, simplified syntax which is quite traditional and should be familiar to anyone who has used search engines. Search terms are simply combined with AND, OR and NOT boolean operators. Having a look at the EXAMPLES section at the end of this manual should provide you with a starting point.

Consider the set D of Debian package descriptions contained in the file /var/lib/dpkg/available (or in files under /var/lib/apt/lists/). Each description is a set of couples of the form (f,v) where f and v are strings: f is the name of the field (namely, Package, Description, Filename, Depends, etc.); v is its value. Thus D is a set of sets of couples, forming the universe. Queries select subsets of the universe D. Output options select which fields of the selected part of the universe to display, and how to display them.

Queries

A query is a boolean combination of atomic expressions. An atomic expression selects a subset of the set D of descriptions. I call this set the meaning of the expression; if e denotes an atomic expression, its meaning is denoted by [e]. The meaning of a boolean combination of atomic expressions is just the boolean combination of the meanings of its constituents. In other words, if e1 and e2 are atomic expressions, then e1 & e2 is a query, whose meaning is the intersection of the meanings of e1 and e2; and the meaning of e1 | e2 is the union of the meanings of e1 and e2.

Atomic expressions

Atomic expressions can be of the forms pattern, /regexp/, quoted_string, fieldspec operator1 string, or fieldspec operator2 regexp.

Boolean operators and constants

e1 & e2 (also e1 AND e2, e1 and e2)
This is logical conjunction (set intersection). Returns the intersection of [e1] and [e2], i.e. packages satisfying both e1 and e2.

e1 | e2 (also e1 OR e2, e1 or e2)
This is logical disjunction (set union). Union of [e1] and [e2], i.e. packages satisfying e1, e2 or both.

!e1 (also NOT e1, not e1)
This is logical negation (set complementation). Complement of [e1], i.e. packages not satisfying e1.

Please note that ~ stands for the current default field specifier and is not an alias for the complementation operator.

true (also all)
The set of all descriptions, i.e. all packages.

false (also none)
The empty set, i.e. no packages.
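The set semantics of the operators and constants above can be sketched with Python sets. The three-package universe and the helper names (meaning, conj, disj, neg) are hypothetical and only illustrate the definitions; they say nothing about how ara is implemented.

```python
# Hypothetical universe D: package name -> description fields.
D = {
    "xcalc": {"Section": "x11", "Depends": "xlibs"},
    "bc": {"Section": "math", "Depends": "libc6"},
    "xterm": {"Section": "x11", "Depends": "xlibs, libc6"},
}
UNIVERSE = frozenset(D)


def meaning(predicate):
    """[e]: the set of packages whose description satisfies predicate."""
    return frozenset(name for name, fields in D.items() if predicate(fields))


def conj(a, b):
    return a & b  # e1 & e2: intersection


def disj(a, b):
    return a | b  # e1 | e2: union


def neg(a):
    return UNIVERSE - a  # !e1: complement within the universe


TRUE = UNIVERSE        # true / all
FALSE = frozenset()    # false / none

x11 = meaning(lambda f: f["Section"] == "x11")
libc = meaning(lambda f: "libc6" in f["Depends"])
print(sorted(conj(x11, libc)))  # ['xterm']
print(sorted(disj(x11, libc)))  # ['bc', 'xcalc', 'xterm']
print(sorted(neg(x11)))         # ['bc']
```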

Field specifiers

A field specifier fieldspec is a comma-separated list of field patterns.

Field patterns are like simple shell patterns and they may contain star characters (which stand for anything) or question marks (which stand for any single character). They are case-insensitive. They specify a set of fields.

For example description and Description specify the set of fields { Description }, whereas de* specifies { Description, Depends }.

The special specifier ~ denotes the current default specifier (see below).
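As an illustration, field patterns behave much like the shell-style patterns of Python's fnmatch module. The sketch below is an approximation under that assumption: expand_fieldspec and the FIELDS list are hypothetical names, the ~ specifier is not handled, and fnmatch also accepts bracket expressions, which the text above does not mention.

```python
from fnmatch import fnmatchcase


def expand_fieldspec(fieldspec, field_names):
    """Return the set of field names selected by a comma-separated fieldspec."""
    # Lowercase both sides so that matching is case-insensitive.
    patterns = [p.strip().lower() for p in fieldspec.split(",")]
    return {
        name
        for name in field_names
        if any(fnmatchcase(name.lower(), p) for p in patterns)
    }


FIELDS = ["Package", "Description", "Depends", "Section", "Size"]
print(sorted(expand_fieldspec("de*", FIELDS)))  # ['Depends', 'Description']
```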

Current field specifier and simplified atomic expressions

The need to repeat the field specifier can make the above syntax cumbersome. That is why there is a current field specifier, which is Description,Package by default. Simplified atomic expressions are simply words or simplified shell expressions (which do not need to be enclosed in double quotes), and they are searched for in the fields of the current field specifier. They can be made of letters, digits, underscores, dashes and periods. They may contain stars or question marks, which are interpreted as for field patterns (i.e., as simplified shell expressions). If double quotes are used, other characters and spaces can be used.

The default field specifier in a query query can be changed to fieldspec by simply prefixing the query with fieldspec:. This gives fieldspec:query. However if query is complex (i.e., contains binary boolean operators) you need to enclose query in parentheses, as in fieldspec:(query1 or query2).

String literals

String literals can be given with or without double quotes. Without double quotes, the syntax is as for C identifiers, except that you can use dashes: you must start with a Latin letter ([a-zA-Z]) and you can continue with Latin letters, decimal digits or underscores ([a-zA-Z0-9_]). Inside double quotes, all characters are allowed, except double quotes, which must be preceded by a backslash.

Variables

Results of queries can be stored in variables, which may be recalled later. This isn't very useful in batch mode but is useful in interactive and graphical modes.

Variable names start with a dollar and follow usual conventions for variables, i.e., they can be any mix of alphanumeric characters and symbols such as underscore, dash, etc.

Variable names are case-sensitive so that $Installed and $installed are different.

To assign the result of a query (which is a set of packages) to a variable named $variable, just execute the query $variable := query. You may then recall this particular set by simply writing $variable.

Example: $installed := status:(installed & !not-installed)
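Conceptually, variables are a mapping from names to sets of packages. A minimal Python sketch of that idea follows; the Session class and its methods are hypothetical and only model assignment, recall and case sensitivity.

```python
class Session:
    """Holds named query results, as in interactive mode."""

    def __init__(self):
        self.variables = {}

    def assign(self, name, result):
        # Models "$variable := query": store the resulting set of packages.
        if not name.startswith("$"):
            raise ValueError("variable names start with a dollar")
        self.variables[name] = frozenset(result)
        return self.variables[name]

    def recall(self, name):
        # Models writing "$variable" in a later query.
        return self.variables[name]


session = Session()
session.assign("$installed", {"dpkg", "dselect"})
session.assign("$Installed", {"bc"})  # a different variable: names are case-sensitive
print(sorted(session.recall("$installed")))  # ['dpkg', 'dselect']
```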

Operators

Hierarchical comparison operators can be negated by changing the direction of the angle bracket and adding or removing the equality sign at the end (for example, <= becomes >). Other operators are negated as follows: = becomes != and =~ becomes !~.

fieldspec=string
Atomic expression selecting packages having a field in fieldspec whose value is exactly equal to string.

fieldspec<string (fieldspec<=string, fieldspec>string, fieldspec>=string)
Atomic expression selecting packages having a field in fieldspec whose value is strictly less than string. The order used is the Debian versioning order. This order is compatible with the natural order on integers and with Debian version numbers. When comparing strings not containing special characters, letters sort before numbers, as opposed to the lexicographic ASCII order we are used to. This means that hexadecimal numbers (such as MD5 sums) will not have their usual order.

Note that string must be on the right side of the operator (i.e., you cannot write 1000 < Size).
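To see why an order compatible with integers matters, the following Python sketch compares digit runs numerically instead of character by character. It is not the Debian versioning order used by ara: it ignores epochs, special characters and the letters-before-numbers rule described above, and natural_key and natural_less are hypothetical names.

```python
import re


def natural_key(value):
    """Split value into (non-digit run, integer) pairs for comparison."""
    pairs = re.findall(r"(\D*)(\d*)", value)
    # Drop the empty pair that findall may report at the end of the string.
    return [(text, int(digits or 0)) for text, digits in pairs if text or digits]


def natural_less(a, b):
    """True if a sorts strictly before b when digit runs compare as integers."""
    return natural_key(a) < natural_key(b)


print("460" < "1000")                       # False: plain string comparison
print(natural_less("460", "1000"))          # True: 460 < 1000 as integers
print(natural_less("1.10.3", "1.10.24"))    # True: 3 < 24 in the last component
```

This is why a query such as size<100000 gives the expected result even though field values are strings.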

fieldspec=~/expression/ (also fieldspec:/expression/)
Selects descriptions whose field named fieldspec exists and whose value matches, case-sensitively, the regular expression expression.

fieldspec=~/expression/i (also fieldspec:/expression/i)
Same as above, but the regular expression is case-insensitive.

fieldspec=~/expression/w (also fieldspec:/expression/w)
Same as above, but the regular expression is case-sensitive and matches only at word boundaries. Note that letter-to-digit or digit-to-letter transitions are considered to be word boundaries.

fieldspec=~/expression/iw (also fieldspec:/expression/iw)
The regular expression here is case-insensitive and matched at word boundaries.
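The effect of the i and w flags can be approximated in Python as follows. This is a sketch under assumptions: it uses Python's re syntax rather than ara's, it treats a word boundary as any change between the classes letter, digit and other, and the helper names (char_class, is_boundary, search) are hypothetical. It also checks only the leftmost match at each starting position.

```python
import re


def char_class(ch):
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


def is_boundary(text, i):
    """True if the character classes on both sides of position i differ."""
    before = char_class(text[i - 1]) if i > 0 else "other"
    after = char_class(text[i]) if i < len(text) else "other"
    return before != after


def search(expression, flags, text):
    """Return True if expression matches text under the given flag letters."""
    pattern = re.compile(expression, re.IGNORECASE if "i" in flags else 0)
    pos = 0
    while pos <= len(text):
        m = pattern.search(text, pos)
        if m is None:
            return False
        if "w" not in flags:
            return True
        # With w, both ends of the match must fall on word boundaries.
        if is_boundary(text, m.start()) and is_boundary(text, m.end()):
            return True
        pos = m.start() + 1  # try again from the next position
    return False


print(search("gtk", "", "libgtk2.0"))           # True
print(search("gtk", "w", "libgtk2.0"))          # False: "b" to "g" is no boundary
print(search("GTK", "iw", "uses gtk2 toolkit"))  # True: "k" to "2" is a boundary
```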

Regular expressions

Regular expressions are given between a pair of slashes; the last slash can be followed by a sequence of letters denoting flags, in any order. Regular expression syntax is sed-like: grouping parentheses and alternation must be backslashed. For more details, see the Objective Caml manual chapter on the Str module. In short (x,x1,x2 are meta-symbols denoting regular expressions):

/./
Any character.

/toto/
Literal string toto.

/x1x2/
Concatenation.

/x1\|x2/
Alternation.

\(x1\)*
Star closure.

[c-d]
Character range.

\b
Word boundaries.

/x/i
Case insensitive.

/x/w
At word boundaries.

Remark

Most queries will contain an appreciable amount of shell metacharacters. For example, logical disjunction is denoted by the pipe character, which is used by all known shells. The problem is aggravated by the fact that names of real commands are likely to appear in the expressions used; successfully setting up a UNIX pipeline by mistake is therefore plausible.

When calling ara from the command line in batch mode, you are strongly urged to protect your queries by surrounding them with single quotes. Never write something like ara Pack*=~/halt|reboot|shutdown/, as this will very likely reboot your system (and is incorrect regular expression syntax if halt or reboot or shutdown is meant: pipes must be backslashed). Instead, one should write ara 'Pack*=~/halt\|reboot\|shutdown/'
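When a command line for ara is built by a script rather than typed by hand, the same protection can be obtained programmatically. The sketch below uses Python's standard shlex module to quote a query and to check that a POSIX shell would hand it over as a single argument; the variable names are illustrative, and ara itself is not run.

```python
import shlex

# The query exactly as ara should receive it (raw string keeps the backslashes).
query = r"Pack*=~/halt\|reboot\|shutdown/"

# shlex.quote wraps the query in single quotes because it contains
# characters that are special to the shell (*, ~, \ and |).
command_line = "ara " + shlex.quote(query)
print(command_line)

# shlex.split applies POSIX shell word splitting: the query comes back intact.
argv = shlex.split(command_line)
print(argv[1] == query)
```

Passing the arguments as a list to a process-spawning function, instead of a single command string, avoids the shell altogether.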

OPTIONS

Operation

-interactive, -i
Interactive mode; prompt for a query, display its results.

-config <path> (also for xara)
Set configuration file name (default $HOME/.ara/ara.config).

-noconfig
Don't attempt to create a configuration file.

-nohistory
Don't save command history.

Help options

-help (also for xara)
Display some help.

-about
Display copyright, thanks and dedication.

-version, -about (also for xara)
Print author, license, version and dedication (and exit if called from CLI).

-examples
Display some documentation including examples, then exit.

-q <query>
Query (e.g., depends:xlibs & !package:xcalc).

-query <query>
Ditto.

Options pertaining to the terminal

-progress (-noprogress)
Show or don't show the progress indicator when loading the database.

-lines <height>
Set height of terminal for interactive display. By default this is taken from the environment variable LINES or as 25 if it is undefined.

-columns <width>
Set width of terminal for interactive display. By default this is taken from the environment variable COLUMNS or as 25 if it is undefined.

-pager (-nopager)
Use (or don't use) a pager for displaying long output in interactive mode. The pager command is defined in the configuration file $HOME/.ara/ara.config. By default this is /etc/alternatives/pager. The pager is only used when the output size exceeds the terminal height.

-debug (also for xara)
Enable debugging information.

-debug-level (also for xara)
Set debugging level (higher is more verbose, max is 100, default is 10).

Display styles

-new
Show only the newest version of each package.

-old
List all versions of packages.

-short <query>
Display names of packages satisfying query (and their version if -old is set), with multiple packages per line.

-list <query>
Same, but display one package name per line, and no curly braces (default).

-raw <query>
For each package satisfying the query, display all selected fields.

-table <query>
Display results as a table.

-noborders
Don't draw ASCII borders for tabular output.

-borders
Draw ASCII borders for tabular output.

-count <query>
Display number of matching packages.

-fields <field_1[:width_1],...>
Limit output to specified fields. The optional width specifiers are used with the -table option and ignored otherwise. Use * to display all fields (but remember to escape the star character from your shell).

-ast
Dump the abstract syntax tree of parsed queries to stderr.
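The argument of the -fields option described above is a comma-separated list of names with optional widths. A minimal Python sketch of how such an argument could be split follows; parse_fields and format_cell are hypothetical names, and truncating and padding to the given width is an assumption about how the width is applied in tables.

```python
def parse_fields(spec):
    """Split "field_1[:width_1],..." into a list of (name, width or None)."""
    result = []
    for item in spec.split(","):
        name, _, width = item.partition(":")
        result.append((name, int(width) if width else None))
    return result


def format_cell(value, width):
    """Cut or pad a value to width characters; leave it alone if width is None."""
    if width is None:
        return value
    return value[:width].ljust(width)


print(parse_fields("Package,Size,Maintainer:20"))
# [('Package', None), ('Size', None), ('Maintainer', 20)]
print(repr(format_cell("Dpkg Development", 8)))  # 'Dpkg Dev'
```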

EXAMPLES

ara 'Section=utils'
List the name of every package in section utils.

ara 'Section=utils and !Depends:(gnome|kde|gtk)'

 ... except those whose dependency field matches the regexp gnome\|kde\|gtk

ara -list 'Section=utils and Status:(installed & !not-installed)'
List all installed packages in section utils.

ara -short 'section=utils and !depends:(gtk|gnome|kde) and priority=optional'

 ... list multiple names per line, and show only optional packages.

ara -short 'section=utils & (!depends:(gtk|gnome|kde) | size<100000) & priority=optional'
Well, exclude gtk, gnome or kde stuff only if the size is 100000 bytes or greater.

ara -noborders -fields Package,Size,Maintainer:20 -table \
    -short 'section=utils & (!depends:(gtk|gnome|kde) | size<100000) & priority=optional'

 ... show Package, Size and Maintainer fields from the above results as a nice ASCII table, limiting the maintainer field to 20 characters, but without crude ASCII borders.

ara -old -fields Package:8,Size,Description:100 \
    -table 'Section=games and not (Depends:(gtk|sdl|kde|opengl|gnome|qt) or /shoot\|kill\|destroy\|blast\|race\|bomb/iw or /multi\(-\|\)player\|strategy\|conquest\|3\(-\|\)d/iw) and Depends:(xlibs or vga) and Size <= 1000000'

Assuming a 125-column display, display the first eight characters of the package name, the size in bytes, and the first hundred characters of the first line of the description of all packages in the games section whose size does not exceed one million bytes, and which do not depend on fancy stuff like GTK, SDL, KDE, OpenGL, Qt or Gnome, do not mention some form of violence (to shoot, to kill, etc.) in their description, are not described as multi-player, strategy, conquest or three-dimensional, and yet depend on either xlibs or svga to exclude console-based games.

SPEED

ara reads the whole database into memory and then processes queries. Since the database is usually big, this takes some time. However, queries then run quite fast. So specify multiple queries or use the -interactive option to amortize the cost of reading the database.

LICENSE

ara is released under the GNU General Public License, version 2, a copy of which is included in the source distribution.

THANKS

Many thanks to George Danchev, Thomas Schoepf and Sven Luther for doing the Debian packaging of ara and many helpful comments.

CONFIGURATION FILES

The system-wide configuration file for ara is /etc/ara.config. Its syntax is self-evident and follows the OCaml lexical conventions.

Values in the user-specific configuration file $HOME/.ara/ara.config override those of /etc/ara.config.
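The override rule amounts to a layered lookup in which user values win. The Python sketch below shows the idea with plain dictionaries; the function name effective_config and the keys pager and progress are hypothetical and are not taken from the actual configuration file.

```python
def effective_config(system_wide, user):
    """Merge two configurations; user values override system-wide ones."""
    merged = dict(system_wide)  # start from /etc/ara.config values
    merged.update(user)         # then apply $HOME/.ara/ara.config values
    return merged


# Hypothetical keys, for illustration only.
system_wide = {"pager": "/etc/alternatives/pager", "progress": True}
user = {"pager": "less"}

print(effective_config(system_wide, user))
# {'pager': 'less', 'progress': True}
```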

OTHER FILES

Command line history is saved in $HOME/.ara/ara.history.

The following databases are loaded by default:

/var/lib/dpkg/available
/var/lib/dpkg/status
/var/lib/apt/lists/*_Packages
/var/lib/apt/lists/*_Sources

ENVIRONMENT VARIABLES

In ara, the variables LINES and COLUMNS are used to determine the dimensions of the terminal. Note that these variables are not exported by default in your shell; add export LINES COLUMNS to your .zshrc or .bashrc.
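The lookup with a fallback value can be sketched in Python as follows. The function name terminal_size is hypothetical; the fallback of 25 for both dimensions is taken from the descriptions of the -lines and -columns options, and treating a non-numeric value like an undefined one is an assumption.

```python
import os


def terminal_size(environ=os.environ):
    """Return (lines, columns) from the environment, falling back to 25."""

    def read(name, default):
        try:
            return int(environ[name])
        except (KeyError, ValueError):
            # Variable not exported, or not a number: use the default.
            return default

    return read("LINES", 25), read("COLUMNS", 25)


print(terminal_size({}))                                 # (25, 25)
print(terminal_size({"LINES": "50", "COLUMNS": "132"}))  # (50, 132)
```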

KNOWN BUGS

Due to lack of Unicode support, non-ASCII characters lead to problems under Unicode terminals. Note that the database files are encoded in Latin1.