From ce9b56fe13c59f2c718c03d6ea7f9dc3a0619e42 Mon Sep 17 00:00:00 2001
From: Kenichi Handa
Date: Thu, 15 Sep 2005 02:55:22 +0000
Subject: [PATCH] Fix the paragraph describing the limitation of UTF-8/16/7.

---
 etc/ChangeLog |  5 +++++
 etc/PROBLEMS  | 20 ++++++++++----------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/etc/ChangeLog b/etc/ChangeLog
index bcd49751247..316ec3e4cd1 100644
--- a/etc/ChangeLog
+++ b/etc/ChangeLog
@@ -1,3 +1,8 @@
+2005-09-15 Kenichi Handa
+
+	* PROBLEMS: Fix the paragraph describing the limitation of
+	UTF-8/16/7.
+
 2005-09-14 Romain Francoise
 
 	* NEWS: Add entry for write-region-inhibit-fsync.
diff --git a/etc/PROBLEMS b/etc/PROBLEMS
index ae9a42bde6d..3b9dc6b17ff 100644
--- a/etc/PROBLEMS
+++ b/etc/PROBLEMS
@@ -841,9 +841,16 @@ mule-unicode-0100-24ff:-gnu-unifont-*-iso10646-1
 
 ** The UTF-8/16/7 coding systems don't encode CJK (Far Eastern) characters.
 
-Emacs by default only supports the parts of the Unicode BMP whose code
-points are in the ranges 0000-33ff and e000-ffff. This excludes: most
-of CJK, Yi and Hangul, as well as everything outside the BMP.
+Emacs directly supports the Unicode BMP whose code points are in the
+ranges 0000-33ff and e000-ffff, and indirectly supports the parts of
+CJK characters belonging to these legacy charsets:
+
+    GB2312, Big5, JISX0208, JISX0212, JISX0213-1, JISX0213-2, KSC5601
+
+The latter support is done in Utf-Translate-Cjk mode (turned on by
+default). Which Unicode CJK characters are decoded into which Emacs
+charset is decided by the current language environment. For instance,
+in Chinese-GB, most of them are decoded into chinese-gb2312.
 
 If you read UTF-8 data with code points outside these ranges, the
 characters appear in the buffer as raw bytes of the original UTF-8
@@ -853,13 +860,6 @@ If you read such characters from UTF-16 or UTF-7 data, they are
 substituted with the Unicode `replacement character', and you lose
 information.
 
-To edit such UTF data, turn on Utf-Translate-Cjk mode, which makes
-many common CJK characters available for encoding and decoding and can
-be extended by updating the tables it uses. This also allows you to
-save as UTF buffers containing characters decoded by the chinese-,
-japanese- and korean- coding systems, e.g. cut and pasted from
-elsewhere.
-
 ** Mule-UCS loads very slowly.
 
 Changes to Emacs internals interact badly with Mule-UCS's `un-define'
-- 
2.39.2