mchars_alloc(3) character table for mandoc

Other Alias

mchars_free, mchars_num2char, mchars_num2uc, mchars_spec2cp, mchars_spec2str

LIBRARY

Mandoc Macro Compiler Library (libmandoc, -lmandoc)

SYNOPSIS

#include <sys/types.h>
#include <mandoc.h>

struct mchars *
mchars_alloc(void);

void
mchars_free(struct mchars *table);

char
mchars_num2char(const char *decimal, size_t sz);

int
mchars_num2uc(const char *hexadecimal, size_t sz);

int
mchars_spec2cp(const struct mchars *table, const char *name, size_t sz);

const char *
mchars_spec2str(const struct mchars *table, const char *name, size_t sz, size_t *rsz);

const char *
mchars_uc2str(int codepoint);

DESCRIPTION

These functions translate Unicode character numbers and roff(7) character names into glyphs. See mandoc_char(7) for a list of roff(7) special characters. These functions are intended for external use by programs formatting mdoc(7) and man(7) pages for output, for example the mandoc(1) output formatter modules and makewhatis(8). The decimal, hexadecimal, name, and sz input arguments are usually obtained from the mandoc_escape(3) parser function.

The function mchars_num2char() converts a decimal string representation of a character number consisting of sz digits into a printable ASCII character. If the input string is non-numeric or does not represent a printable ASCII character, the NUL character ('\0') is returned. For example, the mandoc(1) -Tascii, -Tutf8, and -Thtml output modules use this function to render roff(7) \N escape sequences.
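The contract described above can be illustrated with a small stand-alone sketch. The function sketch_num2char() is a hypothetical stand-in written for this example, not the libmandoc implementation; it assumes that "printable ASCII" means the range 0x20 to 0x7e.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in mirroring the documented contract of
 * mchars_num2char(); this is NOT the code from chars.c.
 * Reads exactly sz bytes of decimal digits and returns the
 * character if it is printable ASCII, or '\0' otherwise.
 */
char
sketch_num2char(const char *decimal, size_t sz)
{
	size_t i;
	int n = 0;

	assert(decimal != NULL);
	if (sz == 0)
		return '\0';
	for (i = 0; i < sz; i++) {
		if (decimal[i] < '0' || decimal[i] > '9')
			return '\0';	/* non-numeric input */
		n = n * 10 + (decimal[i] - '0');
		if (n > 0x7e)
			return '\0';	/* past printable ASCII; also bounds n */
	}
	/* Assumption: 0x20..0x7e is the printable range. */
	return n >= 0x20 ? (char)n : '\0';
}
```

Because the length is passed separately, the digits need not be NUL-terminated: a call such as sketch_num2char("65'", 2) looks only at the two digits and ignores the trailing delimiter.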

The function mchars_num2uc() converts a hexadecimal string representation of a Unicode codepoint consisting of sz digits into an integer representation. If the input string is non-numeric or represents an ASCII character, the NUL character ('\0') is returned. For example, the mandoc(1) -Tutf8 and -Thtml output modules use this function to render roff(7) \[uXXXX] and \C'uXXXX' escape sequences.
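As a rough illustration of this contract, the following stand-alone sketch parses sz hexadecimal digits into a codepoint. The name sketch_num2uc() is hypothetical, and rejecting values above 0x10FFFF is an assumption of this example rather than something stated above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in mirroring the documented contract of
 * mchars_num2uc(); this is NOT the code from chars.c.
 * Returns the codepoint, or 0 if the input is non-numeric or
 * denotes an ASCII character.
 */
int
sketch_num2uc(const char *hexadecimal, size_t sz)
{
	size_t i;
	int cp = 0, digit;
	char c;

	assert(hexadecimal != NULL);
	if (sz == 0)
		return 0;
	for (i = 0; i < sz; i++) {
		c = hexadecimal[i];
		if (c >= '0' && c <= '9')
			digit = c - '0';
		else if (c >= 'a' && c <= 'f')
			digit = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			digit = c - 'A' + 10;
		else
			return 0;	/* non-numeric input */
		cp = cp * 16 + digit;
		/* Assumption: values beyond Unicode's range are rejected. */
		if (cp > 0x10FFFF)
			return 0;
	}
	return cp > 0x7f ? cp : 0;	/* ASCII yields 0 */
}
```

With this sketch, the digits of \[u2014] yield 0x2014, while the digits of \[u0041] yield 0 because U+0041 is an ASCII character.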

The function mchars_alloc() allocates an opaque struct mchars * table object for subsequent use by the following two lookup functions. When no longer needed, this object can be destroyed with mchars_free().

The function mchars_spec2cp() looks up a roff(7) special character name consisting of sz characters in the table and returns the corresponding Unicode codepoint. If the name is not recognized, -1 is returned. For example, the mandoc(1) -Tutf8 and -Thtml output modules use this function to render roff(7) \[name] and \C'name' escape sequences.
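The lookup by a length-delimited name can be sketched without the library. Everything named sketch_* below is hypothetical, and the three-entry table stands in for the opaque struct mchars object; the real table, its contents, and its lookup strategy are not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature table; a stand-in for struct mchars. */
struct sketch_cp_entry {
	const char	*name;
	int		 codepoint;
};

static const struct sketch_cp_entry sketch_cp_table[] = {
	{ "em", 0x2014 },	/* em-dash */
	{ "bu", 0x2022 },	/* bullet */
	{ "co", 0x00A9 },	/* copyright sign */
};

/*
 * Look up a name of exactly sz bytes (not NUL-terminated) and
 * return its codepoint, or -1 if the name is not recognized.
 */
int
sketch_spec2cp(const char *name, size_t sz)
{
	size_t i, n;

	assert(name != NULL);
	n = sizeof(sketch_cp_table) / sizeof(sketch_cp_table[0]);
	for (i = 0; i < n; i++)
		if (strlen(sketch_cp_table[i].name) == sz &&
		    memcmp(sketch_cp_table[i].name, name, sz) == 0)
			return sketch_cp_table[i].codepoint;
	return -1;
}
```

Comparing both the length and the bytes matters because the name usually points into the middle of an escape sequence such as \[em], so the byte after the name is a delimiter rather than a terminator.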

The function mchars_spec2str() looks up a roff(7) special character name consisting of sz characters in the table and returns an ASCII string representation. The length of the representation is returned in rsz. In many cases, the meaning of such ASCII representations is not quite obvious, so using roff(7) special characters in documents intended for ASCII rendering is usually a bad idea. If the name is not recognized, NULL is returned. For example, makewhatis(8) and the mandoc(1) -Tascii output module use this function to render roff(7) \[name] and \C'name' escape sequences.
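The way the result length comes back through rsz can be shown with a similar stand-alone sketch. The sketch_* names are hypothetical, and the ASCII renderings in the table are illustrative assumptions that may differ from what mandoc actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature table; renderings are illustrative only. */
struct sketch_str_entry {
	const char	*name;
	const char	*ascii;
};

static const struct sketch_str_entry sketch_str_table[] = {
	{ "em", "--" },
	{ "co", "(C)" },
};

/*
 * Look up a name of exactly sz bytes and return its ASCII
 * rendering, storing the rendering's length in *rsz.
 * Returns NULL and leaves *rsz untouched if the name is unknown.
 */
const char *
sketch_spec2str(const char *name, size_t sz, size_t *rsz)
{
	size_t i, n;

	assert(name != NULL && rsz != NULL);
	n = sizeof(sketch_str_table) / sizeof(sketch_str_table[0]);
	for (i = 0; i < n; i++) {
		if (strlen(sketch_str_table[i].name) == sz &&
		    memcmp(sketch_str_table[i].name, name, sz) == 0) {
			*rsz = strlen(sketch_str_table[i].ascii);
			return sketch_str_table[i].ascii;
		}
	}
	return NULL;
}
```

A caller should check for NULL before reading the length, since this sketch does not write to rsz when the lookup fails.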

The function mchars_uc2str() performs a reverse lookup of the Unicode codepoint and returns an ASCII string representation, or the string "<?>" if none is available.
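The reverse lookup with its "<?>" fallback can be sketched the same way. The function sketch_uc2str() and its two-entry table are hypothetical, and the renderings are illustrative assumptions rather than mandoc's actual output.

```c
#include <stddef.h>

/* Hypothetical miniature table; renderings are illustrative only. */
struct sketch_uc_entry {
	int		 codepoint;
	const char	*ascii;
};

static const struct sketch_uc_entry sketch_uc_table[] = {
	{ 0x00A9, "(C)" },
	{ 0x2014, "--" },
};

/*
 * Map a Unicode codepoint back to an ASCII rendering.
 * Returns the string "<?>" if the codepoint is not in the table.
 */
const char *
sketch_uc2str(int codepoint)
{
	size_t i, n;

	n = sizeof(sketch_uc_table) / sizeof(sketch_uc_table[0]);
	for (i = 0; i < n; i++)
		if (sketch_uc_table[i].codepoint == codepoint)
			return sketch_uc_table[i].ascii;
	return "<?>";
}
```

Unlike the name lookups, this direction never returns NULL, so the result can be passed straight to an output routine.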

FILES

These functions are implemented in the file chars.c.

HISTORY

These functions and their predecessors have been available since the following mandoc versions:
function            since    predecessor         since
mchars_alloc()      1.11.3   ascii2htab()        1.5.3
mchars_free()       1.11.2   asciifree()         1.6.0
mchars_num2char()   1.11.2   chars_num2char()    1.10.10
mchars_num2uc()     1.11.3   ---                 ---
mchars_spec2cp()    1.11.2   chars_spec2cp()     1.10.5
mchars_spec2str()   1.11.2   a2ascii()           1.5.3
mchars_uc2str()     1.13.2   ---                 ---

AUTHORS

Kristaps Dzonsons
Ingo Schwarze