From 6040dfd121dd5c66eb76d16447b03605e1d4c31d Mon Sep 17 00:00:00 2001
From: Eli Zaretskii
Date: Sat, 27 Jan 2024 10:11:32 +0200
Subject: [PATCH] Fix description of when "\xNNN" is considered a unibyte character

* doc/lispref/objects.texi (Non-ASCII in Strings): More accurate
description of when a hexadecimal escape sequence yields a unibyte
character. (Bug#68751)

(cherry picked from commit 53481cc954641256602830a6d74def86440ac4a9)
---
 doc/lispref/objects.texi | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/doc/lispref/objects.texi b/doc/lispref/objects.texi
index 07ceb0d7a98..b8fd5ed4345 100644
--- a/doc/lispref/objects.texi
+++ b/doc/lispref/objects.texi
@@ -1180,13 +1180,14 @@ character), Emacs automatically assumes that it is multibyte.
 
 You can also use hexadecimal escape sequences (@samp{\x@var{n}}) and
 octal escape sequences (@samp{\@var{n}}) in string constants.
-@strong{But beware:} If a string constant contains hexadecimal or
-octal escape sequences, and these escape sequences all specify unibyte
-characters (i.e., less than 256), and there are no other literal
-non-@acronym{ASCII} characters or Unicode-style escape sequences in
-the string, then Emacs automatically assumes that it is a unibyte
-string. That is to say, it assumes that all non-@acronym{ASCII}
-characters occurring in the string are 8-bit raw bytes.
+@strong{But beware:} If a string constant contains octal escape
+sequences or one- or two-digit hexadecimal escape sequences, and these
+escape sequences all specify unibyte characters (i.e., codepoints less
+than 256), and there are no other literal non-@acronym{ASCII}
+characters or Unicode-style escape sequences in the string, then Emacs
+automatically assumes that it is a unibyte string. That is to say, it
+assumes that all non-@acronym{ASCII} characters occurring in the
+string are 8-bit raw bytes.
 
 In hexadecimal and octal escape sequences, the escaped character code
 may contain a variable number of digits, so the first subsequent
-- 
2.39.5
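
The rule described by the revised paragraph can be checked from Lisp with `multibyte-string-p`. The snippet below is a minimal sketch and is not part of the patch. The sample strings are illustrative choices, and the expected values in the comments are what the revised wording implies, not output recorded from a particular Emacs build.

```elisp
;; Octal escape below 256, with nothing else that is non-ASCII:
;; the string is assumed to be unibyte.
(multibyte-string-p "\351")     ; expected: nil

;; Two-digit hexadecimal escape below 256: also unibyte.
(multibyte-string-p "\xe9")     ; expected: nil

;; Hexadecimal escape written with more than two digits: the
;; unibyte rule no longer applies, even though the value is below 256.
(multibyte-string-p "\x0e9")    ; expected: t

;; Unicode-style escape: the string is multibyte.
(multibyte-string-p "\u00e9")   ; expected: t
```

Evaluating these forms in `*scratch*` or with `M-:` is enough to compare a given build against the documented behavior.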