* regex.c (RE_TARGET_MULTIBYTE_P): New macro.
(GET_CHAR_BEFORE_2): Check target_multibyte, not multibyte. If
that is zero, convert an eight-bit char to multibyte.
(MAKE_CHAR_MULTIBYTE, CHAR_LEADING_CODE): New dummy macros for
non-emacs case.
(PATFETCH): Convert an eight-bit char to multibyte.
(HANDLE_UNIBYTE_RANGE): New macro.
(regex_compile): Set up the compiled pattern for multibyte chars
even if the given regex string is unibyte. Use PATFETCH_RAW
instead of PATFETCH in many places. To handle `charset'
specification of unibyte, call HANDLE_UNIBYTE_RANGE. Use bitmap
only for ASCII chars.
(analyse_first) <exactn>: Simplified because the compiled pattern
is multibyte.
<charset_not>: Set up fastmap from bitmap only for ASCII chars.
<charset>: Use CHAR_LEADING_CODE to get leading codes.
<categoryspec>: If multibyte, set up fastmap only for ASCII chars
here.
(re_compile_fastmap) [emacs]: Call analyse_first with the arg
multibyte always 1.
(re_search_2): In emacs, set the local variable multibyte to 1,
otherwise to 0. New local variable target_multibyte. Check it
to decide the multibyteness of STR1 and STR2. If
target_multibyte is zero, convert unibyte chars to multibyte
before translating and checking fastmap.
(TARGET_CHAR_AND_LENGTH): New macro.
(re_match_2_internal): In emacs, set the local variable multibyte
to 1, otherwise to 0. New local variable target_multibyte. Check
it to decide the multibyteness of STR1 and STR2. Use
TARGET_CHAR_AND_LENGTH to fetch a character from D.
<charset, charset_not>: If multibyte is nonzero, check fastmap
only for ASCII chars. Call bcmp_translate with
target_multibyte, not with multibyte.
<begline>: Declare the local variable C as `unsigned'.
(bcmp_translate): Change the last arg name to target_multibyte.
(internal_self_insert): In a multibyte buffer, insert C
as is without converting it to unibyte. In a unibyte buffer,
convert C to multibyte before checking the syntax.
(Fset_unibyte_charset): If the dimension of CHARSET is
not 1, signal an error. Update the elements of
unibyte_to_multibyte_table.
(init_charset_once): Initialize unibyte_to_multibyte_table.
(syms_of_charset): Define the charset `iso-8859-1'.
(unibyte_to_multibyte_table): New variable.
(unibyte_char_to_multibyte): Moved to character.h and defined as a
macro.
(multibyte_char_to_unibyte): If C is an eight-bit character,
convert it to the corresponding byte value.
(LEADING_CODE_LATIN_1_MIN)
(LEADING_CODE_LATIN_1_MAX): New macros.
(unibyte_to_multibyte_table): Extern it.
(unibyte_char_to_multibyte): New macro.
(MAKE_CHAR_MULTIBYTE): Use unibyte_to_multibyte_table.
(CHAR_LEADING_CODE): New macro.
(FETCH_STRING_CHAR_AS_MULTIBYTE_ADVANCE): New macro.
Kenichi Handa [Wed, 21 Aug 2002 12:53:56 +0000 (12:53 +0000)]
(coding_set_destination): Fix coding->destination for
the case of converting a region.
(encode_coding_utf_8): Encode eight-bit chars as single byte.
(encode_coding_object): Fix coding->dst_pos and
coding->dst_pos_byte for the case of converting a region.
Kenichi Handa [Mon, 19 Aug 2002 06:12:31 +0000 (06:12 +0000)]
(fontset-plain-name): If the fontset
name doesn't end with "-fontset-*", use family name as the first
part of the plain name.
(create-fontset-from-ascii-font): If "fontset-startup" is not yet
created, use that name for the fontset. Fix arguments to
subst-char-in-string.
Kenichi Handa [Thu, 15 Aug 2002 02:28:42 +0000 (02:28 +0000)]
(charset_unibyte): Renamed from charset_primary.
(Funibyte_charset): Renamed from Fprimary_charset.
(Fset_unibyte_charset): Renamed from Fset_primary_charset.
(syms_of_charset): Adjusted for the above changes.
Kenichi Handa [Thu, 15 Aug 2002 02:28:08 +0000 (02:28 +0000)]
(unibyte_char_to_multibyte): Refer to
charset_unibyte, not charset_primary.
(multibyte_char_to_unibyte): Likewise.
(Funibyte_char_to_multibyte): Likewise.
Kenichi Handa [Thu, 15 Aug 2002 02:27:11 +0000 (02:27 +0000)]
(reset-language-environment): Don't
set nonascii-translation-table and nonascii-insert-offset. Call
set-unibyte-charset, not set-primary-charset.
(nonascii-translation-table, nonascii-insert-offset): Declare
these variables as obsolete.
(set-language-environment): Call set-unibyte-charset, not
set-primary-charset. Call set-charset-priority with `charset'
info of the language environment.
Kenichi Handa [Thu, 1 Aug 2002 12:36:17 +0000 (12:36 +0000)]
Call map-charset-chars on big5
(not chinese-big5-1/2) to set categories `c', `C', and `|'.
(next-word-boundary-han): New function. Register it in
next-word-boundary-function-table.
(next-word-boundary-kana): Likewise.
Kenichi Handa [Thu, 1 Aug 2002 12:33:55 +0000 (12:33 +0000)]
(Vnext_word_boundary_function_table): New variable.
(syms_of_syntax): Declare it as a Lisp variable.
(scan_words): Call functions in Vnext_word_boundary_function_table
if any.
Dave Love [Wed, 31 Jul 2002 22:28:25 +0000 (22:28 +0000)]
(leim): Don't put PARALLEL in environment.
($(srcdir)/src/config.in, $(srcdir)/src/stamp-h.in): New.
(install-arch-indep, install-arch-indep): Merge changes from
trunk.
(tar-file-name-coding-system): New variable. Make
it permanent-local.
(tar-header-block-tokenize): Decode filename and linkname by
tar-file-name-coding-system.
(tar-header-block-checksum): Call multibyte-char-to-unibyte to get
the byte value of eight-bit chars.
(tar-summarize-buffer): Call set-buffer-multibyte with METHOD
`to'. Delete unnecessary call of position-bytes.
(tar-mode): Set tar-file-name-coding-system. Delete unnecessary
call of position-bytes.
(tar-extract): Simplified by calling decode-coding-region with
DESTINATION argument. Don't toggle multibyteness of tar buffer.
(tar-copy): Don't toggle multibyteness of tar buffer.
(tar-expunge): Likewise.
(tar-clear-modification-flags): Delete unnecessary call of
position-bytes.
(tar-rename-entry): Call tar-alter-one-field with encoded new
name.
(tar-alter-one-field): Don't toggle multibyteness of tar buffer.
Convert new-data-string by string-to-multibyte before inserting
it.
(tar-subfile-save-buffer): Don't toggle multibyteness of tar
buffer. Simplified by calling encode-coding-region with
DESTINATION argument.
(tar-mode-write-file): Delete unnecessary call of
byte-to-position.
(archive-file-name-coding-system): New variable.
Make it permanent-local.
(byte-after, bref, insert-unibyte): New functions. Replace most
uses of char-after, aref, and insert with them, respectively.
(archive-mode): Set archive-file-name-coding-system.
(archive-summarize): Don't change the buffer's multibyteness.
(archive-extract): Inherit archive-file-name-coding-system from
archive-superior-buffer. Bind coding-system-for-write to
archive-file-name-coding-system.
(archive-*-write-file-member): Encode ENAME by
archive-file-name-coding-system. Bind coding-system-for-write to
no-conversion.
(archive-rename-entry): Encode the filename by
archive-file-name-coding-system.
(archive-mode-revert): Don't change the buffer's multibyteness.
(archive-arc-summarize, archive-lzh-summarize,
archive-zoo-summarize): Don't change the buffer's multibyteness.
Decode filenames by archive-file-name-coding-system.
(archive-arc-rename-entry, archive-zip-chmod-entry): Don't change
the buffer's multibyteness.