codes of individual characters.
* Character Sets:: The space of possible character codes
is divided into various character sets.
-* Chars and Bytes:: More information about multibyte encodings.
-* Splitting Characters:: Converting a character to its byte sequence.
* Scanning Charsets:: Which character sets are used in a buffer?
* Translation of Characters:: Translation tables are used for conversion.
* Coding Systems:: Coding systems are conversions for saving files.
unique number, called a @dfn{codepoint}, to each and every character.
The range of codepoints defined by Unicode, or the Unicode
@dfn{codespace}, is @code{0..10FFFF} (in hex) inclusive. Emacs
-extends this range with codepoints in the range @code{3FFF80..3FFFFF},
-which it uses for representing raw 8-bit bytes that cannot be
-interpreted as characters. Thus, a character codepoint in Emacs is a
-22-bit integer number.
+extends this range with codepoints in the range @code{110000..3FFFFF},
+which it uses for representing characters that are not unified with
+Unicode and raw 8-bit bytes that cannot be interpreted as characters
+(the latter occupy the range @code{3FFF80..3FFFFF}). Thus, a
+character codepoint in Emacs is a 22-bit integer number.
@cindex internal representation of characters
@cindex characters, representation in buffers and strings
writes text to a disk file or passes it to some other process.
Occasionally, Emacs needs to hold and manipulate encoded text or
-binary non-text data in its buffer or string. For example, when Emacs
-visits a file, it first reads the file's text verbatim into a buffer,
-and only then converts it to the internal representation. Before the
-conversion, the buffer holds encoded text.
+binary non-text data in its buffers or strings. For example, when
+Emacs visits a file, it first reads the file's text verbatim into a
+buffer, and only then converts it to the internal representation.
+Before the conversion, the buffer holds encoded text.
@cindex unibyte text
Encoded text is not really text, as far as Emacs is concerned, but
@end defun
@defun byte-to-position byte-position
-Return the buffer position, in character units, corresponding to
-byte-position @var{byte-position} in the current buffer. If
-@var{byte-position} is out of range, the value is @code{nil}.
+Return the buffer position, in character units, corresponding to the
+given @var{byte-position} in the current buffer. If
+@var{byte-position} is out of range, the value is @code{nil}. In a
+multibyte buffer, an arbitrary value of @var{byte-position} need not
+fall on a character boundary; it can point inside the multibyte
+sequence that represents a single character. In this case, the
+function returns the buffer position of the character whose multibyte
+sequence includes @var{byte-position}. In other words, the value is
+the same for all byte positions that belong to the same character.
@end defun
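+
+As a rough illustration (a sketch, assuming a fresh multibyte temporary
+buffer, in which the character with code #xE9 occupies two bytes), the
+first two byte positions map to the same character position:
+
+@example
+(with-temp-buffer
+  ;; Insert a two-byte character followed by a one-byte character.
+  (insert #xE9 ?a)
+  (list (byte-to-position 1)
+        (byte-to-position 2)
+        (byte-to-position 3)))
+     @result{} (1 1 2)
+@end example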
@defun multibyte-string-p string
@section Converting Text Representations
Emacs can convert unibyte text to multibyte; it can also convert
-multibyte text to unibyte, though this conversion loses information. In
-general these conversions happen when inserting text into a buffer, or
-when putting text from several strings together in one string. You can
-also explicitly convert a string's contents to either representation.
+multibyte text to unibyte, provided that the multibyte text contains
+only @acronym{ASCII} and 8-bit characters. In general, these
+conversions happen when inserting text into a buffer, or when putting
+text from several strings together in one string. You can also
+explicitly convert a string's contents to either representation.
Emacs chooses the representation for a string based on the text that
it is constructed from. The general rule is to convert unibyte text to
user that cannot be overridden automatically.
Converting unibyte text to multibyte text leaves @acronym{ASCII} characters
-unchanged, and likewise character codes 128 through 159. It converts
-the non-@acronym{ASCII} codes 160 through 255 by adding the value
-@code{nonascii-insert-offset} to each character code. By setting this
-variable, you specify which character set the unibyte characters
-correspond to (@pxref{Character Sets}). For example, if
-@code{nonascii-insert-offset} is 2048, which is @code{(- (make-char
-'latin-iso8859-1) 128)}, then the unibyte non-@acronym{ASCII} characters
-correspond to Latin 1. If it is 2688, which is @code{(- (make-char
-'greek-iso8859-7) 128)}, then they correspond to Greek letters.
-
- Converting multibyte text to unibyte is simpler: it discards all but
-the low 8 bits of each character code. If @code{nonascii-insert-offset}
-has a reasonable value, corresponding to the beginning of some character
-set, this conversion is the inverse of the other: converting unibyte
-text to multibyte and back to unibyte reproduces the original unibyte
-text.
-
-@defvar nonascii-insert-offset
-This variable specifies the amount to add to a non-@acronym{ASCII} character
-when converting unibyte text to multibyte. It also applies when
-@code{self-insert-command} inserts a character in the unibyte
-non-@acronym{ASCII} range, 128 through 255. However, the functions
-@code{insert} and @code{insert-char} do not perform this conversion.
-
-The right value to use to select character set @var{cs} is @code{(-
-(make-char @var{cs}) 128)}. If the value of
-@code{nonascii-insert-offset} is zero, then conversion actually uses the
-value for the Latin 1 character set, rather than zero.
-@end defvar
+unchanged, and converts bytes with codes 128 through 159 to the
+multibyte representation of raw eight-bit bytes.
-@defvar nonascii-translation-table
-This variable provides a more general alternative to
-@code{nonascii-insert-offset}. You can use it to specify independently
-how to translate each code in the range of 128 through 255 into a
-multibyte character. The value should be a char-table, or @code{nil}.
-If this is non-@code{nil}, it overrides @code{nonascii-insert-offset}.
-@end defvar
+ Converting multibyte text to unibyte converts all @acronym{ASCII}
+and eight-bit characters to their single-byte form, but loses
+information for non-@acronym{ASCII} characters by discarding all but
+the low 8 bits of each character's codepoint. Converting unibyte text
+to multibyte and back to unibyte reproduces the original unibyte text.
-The next three functions either return the argument @var{string}, or a
+The next two functions either return the argument @var{string}, or a
newly created string with no text properties.
-@defun string-make-unibyte string
-This function converts the text of @var{string} to unibyte
-representation, if it isn't already, and returns the result. If
-@var{string} is a unibyte string, it is returned unchanged. Multibyte
-character codes are converted to unibyte according to
-@code{nonascii-translation-table} or, if that is @code{nil}, using
-@code{nonascii-insert-offset}. If the lookup in the translation table
-fails, this function takes just the low 8 bits of each character.
-@end defun
-
-@defun string-make-multibyte string
-This function converts the text of @var{string} to multibyte
-representation, if it isn't already, and returns the result. If
-@var{string} is a multibyte string or consists entirely of
-@acronym{ASCII} characters, it is returned unchanged. In particular,
-if @var{string} is unibyte and entirely @acronym{ASCII}, the returned
-string is unibyte. (When the characters are all @acronym{ASCII},
-Emacs primitives will treat the string the same way whether it is
-unibyte or multibyte.) If @var{string} is unibyte and contains
-non-@acronym{ASCII} characters, the function
-@code{unibyte-char-to-multibyte} is used to convert each unibyte
-character to a multibyte character.
-@end defun
-
@defun string-to-multibyte string
This function returns a multibyte string containing the same sequence
-of character codes as @var{string}. Unlike
-@code{string-make-multibyte}, this function unconditionally returns a
-multibyte string. If @var{string} is a multibyte string, it is
-returned unchanged.
+of characters as @var{string}. If @var{string} is a multibyte string,
+it is returned unchanged.
+@end defun
+
+@defun string-to-unibyte string
+This function returns a unibyte string containing the same sequence of
+characters as @var{string}. It signals an error if @var{string}
+contains a non-@acronym{ASCII} character. If @var{string} is a
+unibyte string, it is returned unchanged.
@end defun
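+
+The following sketch illustrates these two functions on pure
+@acronym{ASCII} text, where neither conversion can lose information:
+
+@example
+;; The result is multibyte even though the text is all ASCII.
+(multibyte-string-p (string-to-multibyte "abc"))
+     @result{} t
+;; Converting back gives a unibyte string and signals no error.
+(multibyte-string-p
+ (string-to-unibyte (string-to-multibyte "abc")))
+     @result{} nil
+@end example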
@defun multibyte-char-to-unibyte char
This function converts the multibyte character @var{char} to a unibyte
-character, based on @code{nonascii-translation-table} and
-@code{nonascii-insert-offset}.
+character. If @var{char} is neither an @acronym{ASCII} nor an
+eight-bit character, the value is -1.
@end defun
@defun unibyte-char-to-multibyte char
This function converts the unibyte character @var{char} to a multibyte
-character, based on @code{nonascii-translation-table} and
-@code{nonascii-insert-offset}.
+character.
@end defun
@node Selecting a Representation
is @code{nil}, the buffer becomes unibyte.
This function leaves the buffer contents unchanged when viewed as a
-sequence of bytes. As a consequence, it can change the contents viewed
-as characters; a sequence of two bytes which is treated as one character
-in multibyte representation will count as two characters in unibyte
-representation. Character codes 128 through 159 are an exception. They
-are represented by one byte in a unibyte buffer, but when the buffer is
-set to multibyte, they are converted to two-byte sequences, and vice
-versa.
+sequence of bytes. As a consequence, it can change the contents
+viewed as characters; a sequence of three bytes which is treated as
+one character in multibyte representation will count as three
+characters in unibyte representation. Eight-bit characters
+representing raw bytes are an exception. They are represented by one
+byte in a unibyte buffer, but when the buffer is set to multibyte,
+they are converted to two-byte sequences, and vice versa.
This function sets @code{enable-multibyte-characters} to record which
representation is in use. It also adjusts various data in the buffer
@defun string-as-unibyte string
This function returns a string with the same bytes as @var{string} but
treating each byte as a character. This means that the value may have
-more characters than @var{string} has.
+more characters than @var{string} has. Eight-bit characters
+representing raw bytes are an exception: each one of them is converted
+to a single byte.
If @var{string} is already a unibyte string, then the value is
@var{string} itself. Otherwise it is a newly created string, with no
-text properties. If @var{string} is multibyte, any characters it
-contains of charset @code{eight-bit-control} or @code{eight-bit-graphic}
-are converted to the corresponding single byte.
+text properties.
@end defun
@defun string-as-multibyte string
This function returns a string with the same bytes as @var{string} but
-treating each multibyte sequence as one character. This means that the
-value may have fewer characters than @var{string} has.
+treating each multibyte sequence as one character. This means that
+the value may have fewer characters than @var{string} has. If a byte
+sequence in @var{string} is invalid as a multibyte representation of a
+single character, each byte in the sequence is treated as a raw 8-bit
+byte.
If @var{string} is already a multibyte string, then the value is
@var{string} itself. Otherwise it is a newly created string, with no
-text properties. If @var{string} is unibyte and contains any individual
-8-bit bytes (i.e.@: not part of a multibyte form), they are converted to
-the corresponding multibyte character of charset @code{eight-bit-control}
-or @code{eight-bit-graphic}.
+text properties.
@end defun
@node Character Codes
The unibyte and multibyte text representations use different
character codes. The valid character codes for unibyte representation
range from 0 to 255---the values that can fit in one byte. The valid
-character codes for multibyte representation range from 0 to 4194303,
-but not all values in that range are valid. The values 128 through
-255 do not usually show up in multibyte text, but they can occur if
-you do explicit encoding and decoding (@pxref{Explicit Encoding}).
-Some other character codes cannot occur at all in multibyte text.
-Only the @acronym{ASCII} codes 0 through 127 are completely legitimate
-in both representations.
+character codes for multibyte representation range from 0 to 4194303
+(#x3FFFFF). In this code space, values 0 through 127 are for
+@acronym{ASCII} characters, and values 128 through 4194175 (#x3FFF7F)
+are for non-@acronym{ASCII} characters. Values 0 through 1114111
+(#x10FFFF) correspond to Unicode characters of the same codepoint,
+while values 4194176 (#x3FFF80) through 4194303 (#x3FFFFF) are for
+representing eight-bit raw bytes.
@defun characterp charcode
This returns @code{t} if @var{charcode} is a valid character, and
@example
(characterp 65)
@result{} t
-(characterp 256)
- @result{} nil
(characterp 4194303)
@result{} t
(characterp 4194304)
@end example
@end defun
+@defun get-byte pos &optional string
+This function returns the byte at character position @var{pos} in the
+current buffer. If the current buffer is unibyte, this is literally
+the byte at that position. If the buffer is multibyte, byte values of
+@acronym{ASCII} characters are the same as character codepoints,
+whereas eight-bit raw bytes are converted to their 8-bit codes. The
+function signals an error if the character at @var{pos} is
+non-@acronym{ASCII}.
+
+The optional argument @var{string} means to get a byte value from that
+string instead of the current buffer.
+@end defun
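+
+For instance (a minimal sketch using only @acronym{ASCII} text, so
+that the byte value equals the character code; note that buffer
+positions count from 1 while string positions count from 0):
+
+@example
+(with-temp-buffer
+  (insert "A")
+  (get-byte 1))
+     @result{} 65
+(get-byte 0 "A")
+     @result{} 65
+@end example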
+
@node Character Sets
@section Character Sets
@cindex character sets
- Emacs classifies characters into various @dfn{character sets}, each of
-which has a name which is a symbol. Each character belongs to one and
-only one character set.
-
- In general, there is one character set for each distinct script. For
-example, @code{latin-iso8859-1} is one character set,
-@code{greek-iso8859-7} is another, and @code{ascii} is another. An
-Emacs character set can hold at most 9025 characters; therefore, in some
-cases, characters that would logically be grouped together are split
-into several character sets. For example, one set of Chinese
-characters, generally known as Big 5, is divided into two Emacs
-character sets, @code{chinese-big5-1} and @code{chinese-big5-2}.
-
- @acronym{ASCII} characters are in character set @code{ascii}. The
-non-@acronym{ASCII} characters 128 through 159 are in character set
-@code{eight-bit-control}, and codes 160 through 255 are in character set
-@code{eight-bit-graphic}.
+@cindex charset
+@cindex coded character set
+An Emacs @dfn{character set}, or @dfn{charset}, is a set of characters
+in which each character is assigned a numeric code point. (The
+Unicode standard calls this a @dfn{coded character set}.) Each
+charset has a name which is a symbol. A single character can belong
+to any number of different character sets, but it will generally have
+a different code point in each charset. Examples of character sets
+include @code{ascii}, @code{iso-8859-1}, @code{greek-iso8859-7}, and
+@code{windows-1255}. The code point assigned to a character in a
+charset is usually different from its code point used in Emacs buffers
+and strings.
+
+@cindex @code{emacs}, a charset
+@cindex @code{unicode}, a charset
+@cindex @code{eight-bit}, a charset
+ Emacs defines several special character sets. The character set
+@code{unicode} includes all the characters whose Emacs code points are
+in the range @code{0..10FFFF}. The character set @code{emacs}
+includes all @acronym{ASCII} and non-@acronym{ASCII} characters.
+Finally, the @code{eight-bit} charset includes the 8-bit raw bytes;
+Emacs uses it to represent raw bytes encountered in text.
@defun charsetp object
Returns @code{t} if @var{object} is a symbol that names a character set,
The value is a list of all defined character set names.
@end defvar
-@defun charset-list
-This function returns the value of @code{charset-list}. It is only
-provided for backward compatibility.
+@defun charset-priority-list &optional highestp
+This function returns a list of all defined character sets ordered by
+their priority. If @var{highestp} is non-@code{nil}, the function
+returns a single character set of the highest priority.
+@end defun
+
+@defun set-charset-priority &rest charsets
+This function makes @var{charsets} the highest priority character sets.
@end defun
@defun char-charset character
-This function returns the name of the character set that @var{character}
-belongs to, or the symbol @code{unknown} if @var{character} is not a
-valid character.
+This function returns the name of the character set of highest
+priority that @var{character} belongs to. @acronym{ASCII} characters
+are an exception: for them, this function always returns @code{ascii}.
@end defun
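+
+For example, following the rule for @acronym{ASCII} characters
+described for @code{char-charset}:
+
+@example
+(char-charset ?a)
+     @result{} ascii
+@end example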
@defun charset-plist charset
-This function returns the charset property list of the character set
-@var{charset}. Although @var{charset} is a symbol, this is not the same
-as the property list of that symbol. Charset properties are used for
-special purposes within Emacs.
+This function returns the property list of the character set
+@var{charset}. Although @var{charset} is a symbol, this is not the
+same as the property list of that symbol. Charset properties include
+important information about the charset, such as its documentation
+string, short name, etc.
@end defun
-@deffn Command list-charset-chars charset
-This command displays a list of characters in the character set
-@var{charset}.
-@end deffn
-
-@node Chars and Bytes
-@section Characters and Bytes
-@cindex bytes and characters
-
-@cindex introduction sequence (of character)
-@cindex dimension (of character set)
- In multibyte representation, each character occupies one or more
-bytes. Each character set has an @dfn{introduction sequence}, which is
-normally one or two bytes long. (Exception: the @code{ascii} character
-set and the @code{eight-bit-graphic} character set have a zero-length
-introduction sequence.) The introduction sequence is the beginning of
-the byte sequence for any character in the character set. The rest of
-the character's bytes distinguish it from the other characters in the
-same character set. Depending on the character set, there are either
-one or two distinguishing bytes; the number of such bytes is called the
-@dfn{dimension} of the character set.
-
-@defun charset-dimension charset
-This function returns the dimension of @var{charset}; at present, the
-dimension is always 1 or 2.
+@defun put-charset-property charset propname value
+This function sets the @var{propname} property of @var{charset} to the
+given @var{value}.
@end defun
-@defun charset-bytes charset
-This function returns the number of bytes used to represent a character
-in character set @var{charset}.
+@defun get-charset-property charset propname
+This function returns the value of @var{charset}'s property
+@var{propname}.
@end defun
- This is the simplest way to determine the byte length of a character
-set's introduction sequence:
-
-@example
-(- (charset-bytes @var{charset})
- (charset-dimension @var{charset}))
-@end example
-
-@node Splitting Characters
-@section Splitting Characters
-@cindex character as bytes
-
- The functions in this section convert between characters and the byte
-values used to represent them. For most purposes, there is no need to
-be concerned with the sequence of bytes used to represent a character,
-because Emacs translates automatically when necessary.
-
-@defun split-char character
-Return a list containing the name of the character set of
-@var{character}, followed by one or two byte values (integers) which
-identify @var{character} within that character set. The number of byte
-values is the character set's dimension.
-
-If @var{character} is invalid as a character code, @code{split-char}
-returns a list consisting of the symbol @code{unknown} and @var{character}.
+@deffn Command list-charset-chars charset
+This command displays a list of characters in the character set
+@var{charset}.
+@end deffn
-@example
-(split-char 2248)
- @result{} (latin-iso8859-1 72)
-(split-char 65)
- @result{} (ascii 65)
-(split-char 128)
- @result{} (eight-bit-control 128)
-@end example
+@defun decode-char charset code-point
+This function decodes a character that is assigned a @var{code-point}
+in @var{charset}, to the corresponding Emacs character, and returns
+that character. If @var{charset} doesn't contain a character of that
+code point, the value is @code{nil}. If @var{code-point} doesn't fit
+in a Lisp integer (@pxref{Integer Basics, most-positive-fixnum}), it
+can be specified as a cons cell @code{(@var{high} . @var{low})}, where
+@var{low} is the lower 16 bits of the value and @var{high} is the
+high 16 bits.
@end defun
-@c FIXME: update split-char and make-char
-@cindex generate characters in charsets
-@defun make-char charset &optional code1 code2
-This function returns the character in character set @var{charset} whose
-position codes are @var{code1} and @var{code2}. This is roughly the
-inverse of @code{split-char}. Normally, you should specify either one
-or both of @var{code1} and @var{code2} according to the dimension of
-@var{charset}. For example,
-
-@example
-(make-char 'latin-iso8859-1 72)
- @result{} 2248
-@end example
-
-Actually, the eighth bit of both @var{code1} and @var{code2} is zeroed
-before they are used to index @var{charset}. Thus you may use, for
-instance, an ISO 8859 character code rather than subtracting 128, as
-is necessary to index the corresponding Emacs charset.
+@defun encode-char char charset
+This function returns the code point assigned to the character
+@var{char} in @var{charset}. If @var{charset} doesn't contain
+@var{char}, the value is @code{nil}.
@end defun
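+
+As a hedged illustration, the charset @code{iso-8859-1} assigns its
+characters the same code points as Unicode, so a round trip through
+@code{decode-char} and @code{encode-char} is expected to give back the
+same number:
+
+@example
+(decode-char 'iso-8859-1 #xE9)
+     @result{} 233
+(encode-char 233 'iso-8859-1)
+     @result{} 233
+;; A character that the charset does not contain yields nil.
+(encode-char 233 'ascii)
+     @result{} nil
+@end example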
@node Scanning Charsets
of the text in question.
@defun charset-after &optional pos
-This function return the charset of a character in the current buffer
-at position @var{pos}. If @var{pos} is omitted or @code{nil}, it
-defaults to the current value of point. If @var{pos} is out of range,
-the value is @code{nil}.
+This function returns the charset of highest priority containing the
+character in the current buffer at position @var{pos}. If @var{pos}
+is omitted or @code{nil}, it defaults to the current value of point.
+If @var{pos} is out of range, the value is @code{nil}.
@end defun
@defun find-charset-region beg end &optional translation
-This function returns a list of the character sets that appear in the
-current buffer between positions @var{beg} and @var{end}.
+This function returns a list of the character sets of highest priority
+that contain characters in the current buffer between positions
+@var{beg} and @var{end}.
The optional argument @var{translation} specifies a translation table to
be used in scanning the text (@pxref{Translation of Characters}). If it
@end defun
@defun find-charset-string string &optional translation
-This function returns a list of the character sets that appear in the
-string @var{string}. It is just like @code{find-charset-region}, except
-that it applies to the contents of @var{string} instead of part of the
-current buffer.
+This function returns a list of the character sets of highest priority
+that contain characters in @var{string}. It is just like
+@code{find-charset-region}, except that it applies to the contents of
+@var{string} instead of part of the current buffer.
@end defun
@node Translation of Characters
@cindex character translation tables
@cindex translation tables
- A @dfn{translation table} is a char-table that specifies a mapping
-of characters into characters. These tables are used in encoding and
-decoding, and for other purposes. Some coding systems specify their
-own particular translation tables; there are also default translation
-tables which apply to all other coding systems.
+ A @dfn{translation table} is a char-table (@pxref{Char-Tables}) that
+specifies a mapping of characters into characters. These tables are
+used in encoding and decoding, and for other purposes. Some coding
+systems specify their own particular translation tables; there are
+also default translation tables which apply to all other coding
+systems.
- For instance, the coding-system @code{utf-8} has a translation table
-that maps characters of various charsets (e.g.,
-@code{latin-iso8859-@var{x}}) into Unicode character sets. This way,
-it can encode Latin-2 characters into UTF-8. Meanwhile,
-@code{unify-8859-on-decoding-mode} operates by specifying
-@code{standard-translation-table-for-decode} to translate
-Latin-@var{x} characters into corresponding Unicode characters.
+ A translation table has two extra slots. The first is either
+@code{nil} or a translation table that performs the reverse
+translation; the second is the maximum number of characters to look up
+for translation.
@defun make-translation-table &rest translations
This function returns a translation table based on the argument
@var{to-alt}.
@end defun
- In decoding, the translation table's translations are applied to the
-characters that result from ordinary decoding. If a coding system has
-property @code{translation-table-for-decode}, that specifies the
-translation table to use. (This is a property of the coding system,
-as returned by @code{coding-system-get}, not a property of the symbol
-that is the coding system's name. @xref{Coding System Basics,, Basic
-Concepts of Coding Systems}.) Otherwise, if
-@code{standard-translation-table-for-decode} is non-@code{nil},
-decoding uses that table.
-
- In encoding, the translation table's translations are applied to the
-characters in the buffer, and the result of translation is actually
-encoded. If a coding system has property
-@code{translation-table-for-encode}, that specifies the translation
-table to use. Otherwise the variable
-@code{standard-translation-table-for-encode} specifies the translation
-table.
+ During decoding, the translation table's translations are applied to
+the characters that result from ordinary decoding. If a coding system
+has property @code{:decode-translation-table}, that specifies the
+translation table to use, or a list of translation tables to apply in
+sequence. (This is a property of the coding system, as returned by
+@code{coding-system-get}, not a property of the symbol that is the
+coding system's name. @xref{Coding System Basics,, Basic Concepts of
+Coding Systems}.) Finally, if
+@code{standard-translation-table-for-decode} is non-@code{nil}, the
+resulting characters are translated by that table.
+
+ During encoding, the translation table's translations are applied to
+the characters in the buffer, and the result of translation is
+actually encoded. If a coding system has property
+@code{:encode-translation-table}, that specifies the translation table
+to use, or a list of translation tables to apply in sequence. In
+addition, if the variable @code{standard-translation-table-for-encode}
+is non-@code{nil}, it specifies the translation table to use for
+translating the result.
@defvar standard-translation-table-for-decode
-This is the default translation table for decoding, for
-coding systems that don't specify any other translation table.
+This is the default translation table for decoding. If a coding
+system specifies its own translation tables, the table that is the
+value of this variable, if non-@code{nil}, is applied after them.
@end defvar
@defvar standard-translation-table-for-encode
-This is the default translation table for encoding, for
-coding systems that don't specify any other translation table.
+This is the default translation table for encoding. If a coding
+system specifies its own translation tables, the table that is the
+value of this variable, if non-@code{nil}, is applied after them.
@end defvar
+@defun make-translation-table-from-vector vec
+This function returns a translation table made from @var{vec}, an
+array of 256 elements that maps byte values 0 through 255 to
+characters. Elements may be @code{nil} for untranslated bytes. The
+returned table has a translation table for reverse mapping in the
+first extra slot.
+
+This function provides an easy way to make a private coding system
+that maps each byte to a specific character. You can specify the
+returned table and the reverse translation table using the properties
+@code{:decode-translation-table} and @code{:encode-translation-table}
+respectively in the @var{props} argument to
+@code{define-coding-system}.
+@end defun
+
+@defun make-translation-table-from-alist alist
+This function is similar to @code{make-translation-table} but returns
+a complex translation table rather than a simple one-to-one mapping.
+Each element of @var{alist} is of the form @code{(@var{from}
+. @var{to})}, where @var{from} and @var{to} are either a character or
+a vector specifying a sequence of characters. If @var{from} is a
+character, that character is translated to @var{to} (i.e.@: to a
+character or a character sequence). If @var{from} is a vector of
+characters, that sequence is translated to @var{to}. The returned
+table has a translation table for reverse mapping in the first extra
+slot.
+@end defun
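+
+The following sketch shows a single-character mapping made with
+@code{make-translation-table-from-alist}, together with the reverse
+table kept in the first extra slot. The variable name @code{tbl} is
+only illustrative, and the result shown assumes that single-character
+entries are stored directly in the char-table:
+
+@example
+(let ((tbl (make-translation-table-from-alist '((?a . ?b)))))
+  (list (aref tbl ?a)                              ; forward: a -> b
+        (aref (char-table-extra-slot tbl 0) ?b)))  ; reverse: b -> a
+     @result{} (98 97)
+@end example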
+
@node Coding Systems
@section Coding Systems