From: Eli Zaretskii
Date: Sun, 30 May 2021 10:20:02 +0000 (+0300)
Subject: Improve documentation of regexp ranges
X-Git-Tag: emacs-28.0.90~2246^2
X-Git-Url: http://git.eshelyaron.com/gitweb/?a=commitdiff_plain;h=5dfa5e26dd;p=emacs.git

Improve documentation of regexp ranges

* doc/lispref/searching.texi (Regexp Special): Document the effect
of using octal escapes in regexp ranges. (Bug#17758)
---

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 8b900da616f..1ee4be7dd13 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -363,7 +363,7 @@ preceding expression either once or not at all. For example,
 @anchor{Non-greedy repetition}
 @item @samp{*?}, @samp{+?}, @samp{??}
 @cindex non-greedy repetition characters in regexp
-These are @dfn{non-greedy} variants of the operators @samp{*}, @samp{+}
+are @dfn{non-greedy} variants of the operators @samp{*}, @samp{+}
 and @samp{?}. Where those operators match the largest possible
 substring (consistent with matching the entire containing expression),
 the non-greedy variants match the smallest possible substring
@@ -438,6 +438,13 @@ including newline. However, a reversed range should always be from
 the letter @samp{z} to the letter @samp{a} to make it clear that it is
 not a typo; for example, @samp{[+-*/]} should be avoided, because it
 matches only @samp{/} rather than the likely-intended four characters.
+
+@item
+If the end points of a range are raw 8-bit bytes (@pxref{Text
+Representations}), or if the range start is ASCII and the end is a raw
+byte (as in @samp{[a-\377]}), the range will match only ASCII
+characters and raw 8-bit bytes, but not non-ASCII characters. This
+feature is intended for searching text in unibyte buffers and strings.
 @end enumerate
 
 Some kinds of character alternatives are not the best style even
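
The two range behaviors described in the second hunk can be checked interactively. The forms below are a minimal sketch and not part of the patch. They assume an Emacs session (for example `M-x ielm`), and the test strings are illustrative choices. Return values are deliberately not asserted in the comments; compare what you see against the documentation text.

```elisp
;; Sketch for trying out the ranges discussed in the patch.
;; Evaluate each form separately and compare with the manual text.

;; Reversed range: `+' (code 43) sorts after `*' (code 42), so the
;; range `+-*' is empty and the alternative contains only `/'.
(string-match-p "[+-*/]" "+")      ; the text says only `/' is matched
(string-match-p "[+-*/]" "/")

;; Range from an ASCII start to a raw byte written as an octal escape.
(string-match-p "[a-\377]" "z")       ; ASCII character inside the range
(string-match-p "[a-\377]" "\377")    ; raw 8-bit byte in a unibyte string
(string-match-p "[a-\377]" "\u00e9")  ; non-ASCII character; per the new
                                      ; paragraph this should not match
```

`string-match-p` is used rather than `string-match` so that evaluating the forms leaves the match data untouched.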