bidiv(1) bidirectional text filter

SYNOPSIS

bidiv [ -plj ] [ -w width ] [file...]

DESCRIPTION

bidiv is a filter, or viewer, for bidirectional text stored in logical order. It converts such text into visual-order text that can be viewed on terminals that do not handle bidirectionality. The visual-order output is formatted assuming a fixed number of characters per line (determined automatically or given with the -w option).

bidiv is oriented towards Hebrew, and assumes the input to be Hebrew and ASCII text encoded in one of the two common logical-order encodings: ISO-8859-8-i or UTF-8. In fact, bidiv guesses the encoding of its input on a character-by-character basis, so the input may be a mix of ISO-8859-8-i and Hebrew UTF-8. bidiv's output is visual-order text, in either the ISO-8859-8 or the UTF-8 encoding, depending on your locale setting.
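As an illustration of what per-character encoding guessing could look like, here is a minimal Python sketch. It is not taken from bidiv's source: the function name decode_mixed_hebrew and the exact byte ranges tested are assumptions made for this example. It relies only on two facts about the encodings: ISO-8859-8 places the Hebrew letters at bytes 0xE0..0xFA, and UTF-8 encodes the same letters (U+05D0..U+05EA) as 0xD7 followed by a byte in 0x90..0xAA.

```python
def decode_mixed_hebrew(data: bytes) -> str:
    """Decode bytes that may mix ASCII, ISO-8859-8 Hebrew and UTF-8 Hebrew.

    Hypothetical helper for illustration only; bidiv's real heuristic may differ.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        # UTF-8 Hebrew letters U+05D0..U+05EA are the byte pairs 0xD7 0x90..0xAA.
        if b == 0xD7 and i + 1 < len(data) and 0x90 <= data[i + 1] <= 0xAA:
            out.append(data[i:i + 2].decode("utf-8"))
            i += 2
        # ISO-8859-8 Hebrew letters occupy the single bytes 0xE0..0xFA.
        elif 0xE0 <= b <= 0xFA:
            out.append(chr(0x05D0 + (b - 0xE0)))
            i += 1
        else:
            # Assumption of this sketch: every other byte is passed through
            # unchanged as the code point with the same number.
            out.append(chr(b))
            i += 1
    return "".join(out)
```

Because the decision is made byte by byte, a single input stream can switch between the two encodings without any marker, which is the behavior described above.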

bidiv reads each file in sequence, converts it into visual order and writes it on the standard output. Thus:

$ bidiv file

prints file on your terminal (assuming it has the appropriate fonts, but no bidirectionality support), and:

$ bidiv file1 file2 | less

concatenates file1 and file2, and shows the results using the pager less.

If no input file is given, bidiv reads from the standard input.

For more ideas on how to use bidiv, see the EXAMPLES section below.

OPTIONS

-p
Paragraph-based direction (default): When formatting a bidirectional output line, bidiv needs to be aware of that line's base direction. A line whose base direction is RTL (right to left) gets right-justified and its first element appears on the right. Otherwise, the line is left-justified and its first element appears on the left.

The -p option tells bidiv to choose a base direction per paragraph, where paragraphs are delimited by empty lines. This is bidiv's default behavior, and it gives the expected results on most texts and emails.

The direction of the entire paragraph is chosen according to the first strongly-directional character (i.e., an alphabetic character) appearing in the paragraph. Currently, if the first output line of a paragraph has no directional characters (e.g., a line of minus signs before an email signature, or a line containing only numbers), that line is output with the same direction as the previous paragraph, but it does not determine the direction of the rest of the paragraph. If the first line of the first paragraph does not have a direction, the RTL direction is arbitrarily chosen.
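The first-strong-character rule can be sketched in a few lines of Python. This is a simplified illustration, not bidiv's implementation: the function names are hypothetical, the classification uses Python's unicodedata module rather than bidiv's own tables, and the special handling of a neutral first line is not modeled. A paragraph with no strong character at all simply inherits the previous paragraph's direction, which is an assumption of this sketch.

```python
import unicodedata


def first_strong_direction(text):
    """Return "RTL" or "LTR" for the first strongly directional character, or None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew and Arabic letters
            return "RTL"
        if bidi_class == "L":          # Latin and other left-to-right letters
            return "LTR"
    return None                        # only digits, punctuation, spaces, ...


def paragraph_directions(text, default="RTL"):
    """Return (paragraph, direction) pairs for paragraphs separated by empty lines."""
    previous = default                 # RTL is used when nothing has been seen yet
    result = []
    for paragraph in text.split("\n\n"):
        direction = first_strong_direction(paragraph) or previous
        result.append((paragraph, direction))
        previous = direction
    return result
```

For example, a paragraph that begins with a Hebrew word is classified RTL even if it later contains English words, while a paragraph consisting only of "----" takes whatever direction preceded it.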

-l
Line-based direction: This option selects an alternative method of choosing each output line's base direction. When this option is enabled, the base direction of each output line is determined on its own (again, according to the first character on the line with a strong direction). This method may give wrong results when a line starts with a word of the opposite direction. This case is rare, but it does happen when lines happen to be split at an unfortunate point, or when the text is defining words of a foreign language.

-j
Do not justify: By default, RTL lines are right-justified, i.e., they are padded with spaces on the left when shorter than the required line width (see the -w option). The -j option tells bidiv not to perform this justification, and to leave short lines unpadded.

-w width
bidiv formats its output for lines of the given width. Lines are split when longer than this width, and RTL lines are right-justified to fill that width unless the -j option is given.

When the -w option is not given, bidiv uses the value of the COLUMNS variable, which is usually defined automatically by the user's shell. When both the -w option and the COLUMNS variable are missing, the default of 80 columns is used.
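The width fallback order (-w, then COLUMNS, then 80) and the effect of -j on short RTL lines can be summarized in a short Python sketch. The helper names resolve_width and pad_rtl are hypothetical, and ignoring a non-numeric COLUMNS value is an assumption of this sketch rather than documented bidiv behavior.

```python
import os


def resolve_width(w_option=None, environ=None):
    """Pick the output width: the -w value, else COLUMNS, else 80."""
    environ = os.environ if environ is None else environ
    if w_option is not None:
        return int(w_option)
    columns = environ.get("COLUMNS", "")
    if columns.isdigit():          # assumption: ignore unset or non-numeric values
        return int(columns)
    return 80


def pad_rtl(visual_line, width, justify=True):
    """Left-pad a visual-order RTL line with spaces up to `width`.

    With justify=False (the effect of -j) the line is returned unpadded.
    Width is counted in characters, which is adequate for unpointed Hebrew.
    """
    if not justify:
        return visual_line
    return visual_line.rjust(width)
```

A call such as pad_rtl(line, resolve_width()) mirrors the default behavior, and passing justify=False corresponds to running bidiv with -j.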

OPERANDS

The following operand is supported:
file
A path name of an input file. If no file is specified, the standard input is used.

EXAMPLES

1.
bidiv README | less
2.
man something | bidiv | less

(or groff -man -Tlatin1 something.1 |sed 's/.^H\(.\)/\1/g' |../bidiv -w 65)

3.
Set bidiv as a filter in your mail program (mutt, pine, etc.) for viewing mail in the ISO-8859-8-i character set, as well as Hebrew UTF-8 mail.

ENVIRONMENT

COLUMNS
The default output width; see the -w option.

EXIT STATUS

The following exit values are returned:
0
All input files were output successfully.
>0
An error occurred.

AUTHOR

Written by Nadav Har'El, http://nadav.harel.org.il.

Please send bug reports and comments to [email protected].

The latest version of this software can be found in ftp://ftp.ivrix.org.il/pub/ivrix/src/cmdline