Kenichi Handa [Tue, 1 Oct 2002 01:33:07 +0000 (01:33 +0000)]
(set-case-syntax-set-multibyte): Variable deleted.
(set-case-syntax-charset): New variable.
(set-case-syntax-1): New function.
(set-case-syntax-delims, set-case-syntax-pair, set-case-syntax):
Call set-case-syntax-1 on arguments.
Dave Love [Sat, 14 Sep 2002 11:47:38 +0000 (11:47 +0000)]
(ucs-bengali-to-is13194-alist, ucs-assamese-to-is13194-alist)
(ucs-gurmukhi-to-is13194-alist, ucs-gujarati-to-is13194-alist)
(ucs-oriya-to-is13194-alist, ucs-tamil-to-is13194-alist)
(ucs-telugu-to-is13194-alist, ucs-malayalam-to-is13194-alist):
Remove declarations and let-bind them in the rewritten top-level
loop over scripts, including ucs-devanagari-to-is13194-alist.
Dave Love [Thu, 5 Sep 2002 17:43:48 +0000 (17:43 +0000)]
(message-posting-charset): defvar when compiling.
(rfc2047-header-encoding-alist): Add `address-mime' part.
(rfc2047-charset-encoding-alist): Use B for iso-8859-7. Doc fix.
(rfc2047-q-encoding-alist): Augment header list.
(rfc2047-encodable-p): Use mm-find-mime-charset-region.
(rfc2047-special-chars, rfc2047-non-special-chars): New.
(rfc2047-dissect-region, rfc2047-encode-region, rfc2047-encode):
Rewritten to avoid charset stuff and to take account of rfc2822
tokens.
(rfc2047-encode-message-header): Don't include header name field
in encoding. Add `address-mime' case and bind
rfc2047-special-chars for `mime' case.
(char_quoted): Use FETCH_CHAR_AS_MULTIBYTE to convert
unibyte chars to multibyte.
(back_comment): Likewise.
(scan_words): Likewise.
(skip_chars): The arg syntaxp is deleted, and the code for
handling syntaxes is moved to skip_syntaxes. Callers changed.
Fix the case where the multibyteness of STRING and that of the
current buffer don't match.
(skip_syntaxes): New function.
(SYNTAX_WITH_MULTIBYTE_CHECK): Check C by ASCII_CHAR_P, not by
SINGLE_BYTE_CHAR_P.
(Fforward_comment): Use FETCH_CHAR_AS_MULTIBYTE to convert unibyte
chars to multibyte.
(scan_lists): Likewise.
(Fbackward_prefix_chars): Likewise.
(scan_sexps_forward): Likewise.
(compile_pattern_1): Don't adjust the multibyteness of
the regexp pattern and the matching target. Set cp->buf.multibyte
to the multibyteness of the regexp pattern. Set
cp->buf.target_multibyte to the multibyteness of the matching
target.
(wordify): Use FETCH_STRING_CHAR_AS_MULTIBYTE_ADVANCE instead of
FETCH_STRING_CHAR_ADVANCE.
(Freplace_match): Convert unibyte chars to multibyte.
* regex.c (RE_TARGET_MULTIBYTE_P): New macro.
(GET_CHAR_BEFORE_2): Check target_multibyte, not multibyte. If
that is zero, convert an eight-bit char to multibyte.
(MAKE_CHAR_MULTIBYTE, CHAR_LEADING_CODE): New dummy macros for
non-emacs case.
(PATFETCH): Convert an eight-bit char to multibyte.
(HANDLE_UNIBYTE_RANGE): New macro.
(regex_compile): Set up the compiled pattern for multibyte chars
even if the given regex string is unibyte. Use PATFETCH_RAW
instead of PATFETCH in many places. To handle `charset'
specification of unibyte, call HANDLE_UNIBYTE_RANGE. Use bitmap
only for ASCII chars.
(analyse_first) <exactn>: Simplified because the compiled pattern
is multibyte.
<charset_not>: Set up fastmap from bitmap only for ASCII chars.
<charset>: Use CHAR_LEADING_CODE to get leading codes.
<categoryspec>: If multibyte, set up fastmap only for ASCII chars
here.
(re_compile_fastmap) [emacs]: Call analyse_first with the arg
multibyte always 1.
(re_search_2): In emacs, set the local variable multibyte to 1,
otherwise to 0. New local variable target_multibyte. Check it
to decide the multibyteness of STR1 and STR2. If
target_multibyte is zero, convert unibyte chars to multibyte
before translating and checking fastmap.
(TARGET_CHAR_AND_LENGTH): New macro.
(re_match_2_internal): In emacs, set the local variable multibyte
to 1, otherwise to 0. New local variable target_multibyte. Check
it to decide the multibyteness of STR1 and STR2. Use
TARGET_CHAR_AND_LENGTH to fetch a character from D.
<charset, charset_not>: If multibyte is nonzero, check fastmap
only for ASCII chars. Call bcmp_translate with
target_multibyte, not with multibyte.
<begline>: Declare the local variable C as `unsigned'.
(bcmp_translate): Change the last arg name to target_multibyte.
(internal_self_insert): In a multibyte buffer, insert C
as is without converting it to unibyte. In a unibyte buffer,
convert C to multibyte before checking the syntax.
(Fset_unibyte_charset): If the dimension of CHARSET is
not 1, signal an error. Update the elements of
unibyte_to_multibyte_table.
(init_charset_once): Initialize unibyte_to_multibyte_table.
(syms_of_charset): Define the charset `iso-8859-1'.