From: Eli Zaretskii
Date: Wed, 23 Nov 2022 14:54:01 +0000 (+0200)
Subject: Improve documentation of locale-specific string comparison
X-Git-Tag: emacs-29.0.90~1573
X-Git-Url: http://git.eshelyaron.com/gitweb/?a=commitdiff_plain;h=43e616aca56daa438e47051e15f8d2a7454a5cb1;p=emacs.git

Improve documentation of locale-specific string comparison

* doc/lispref/strings.texi (Text Comparison):
* src/fns.c (Fstring_collate_equalp): Improve documentation of
'string-collate-equalp' and 'string-collate-lessp'. (Bug#59275)
---

diff --git a/doc/lispref/strings.texi b/doc/lispref/strings.texi
index 4454188cc4a..2f277ea73ae 100644
--- a/doc/lispref/strings.texi
+++ b/doc/lispref/strings.texi
@@ -558,11 +558,13 @@ differences, like @code{char-equal} when @code{case-fold-search} is
 @cindex locale-dependent string equivalence
 @defun string-collate-equalp string1 string2 &optional locale ignore-case
 This function returns @code{t} if @var{string1} and @var{string2} are
-equal with respect to collation rules. A collation rule is not only
+equal with respect to the collation rules of the specified
+@var{locale}, which defaults to your current system locale. A
+collation rule is not only
 determined by the lexicographic order of the characters contained in
-@var{string1} and @var{string2}, but also further rules about
+@var{string1} and @var{string2}, but also by further rules about
 relations between these characters. Usually, it is defined by the
-@var{locale} environment Emacs is running with and by the Standard C
+locale environment with which Emacs is running and by the Standard C
 library against which Emacs was linked@footnote{
 For more information about collation rules and their locale
 dependencies, see @uref{https://unicode.org/reports/tr10/, The Unicode
@@ -589,8 +591,12 @@ dependent; a @var{locale} @code{"en_US.UTF-8"} is applicable on POSIX
 systems, while it would be, e.g., @code{"enu_USA.1252"} on MS-Windows
 systems.
 
-If @var{ignore-case} is non-@code{nil}, characters are converted to lower-case
-before comparing them.
+If @var{ignore-case} is non-@code{nil}, characters are compared
+case-insensitively, by converting them to lower-case. However, if the
+underlying system library doesn't provide locale-specific collation
+rules, this function falls back to @code{string-equal}, in which case
+the @var{ignore-case} argument is ignored, and the comparison will
+always be case-sensitive.
 
 @vindex w32-collate-ignore-punctuation
 To emulate Unicode-compliant collation on MS-Windows systems,
@@ -672,11 +678,13 @@ This function returns the result of comparing @var{string1} and
 @cindex locale-dependent string comparison
 @defun string-collate-lessp string1 string2 &optional locale ignore-case
 This function returns @code{t} if @var{string1} is less than
-@var{string2} in collation order. A collation order is not only
+@var{string2} in collation order of the specified @var{locale}, which
+defaults to your current system locale. A collation order is not only
 determined by the lexicographic order of the characters contained in
-@var{string1} and @var{string2}, but also further rules about
+@var{string1} and @var{string2}, but also by further rules about
 relations between these characters. Usually, it is defined by the
-@var{locale} environment Emacs is running with.
+locale environment with which Emacs is running, and by the Standard C
+library against which Emacs was linked.
 
 For example, punctuation and whitespace characters might be ignored
 for sorting (@pxref{Sequence Functions}):
@@ -706,8 +714,12 @@ systems. The @var{locale} value of @code{"POSIX"} or @code{"C"} lets
 @end group
 @end example
 
-If @var{ignore-case} is non-@code{nil}, characters are converted to lower-case
-before comparing them.
+If @var{ignore-case} is non-@code{nil}, characters are compared
+case-insensitively, by converting them to lower-case. However, if the
+underlying system library doesn't provide locale-specific collation
+rules, this function falls back to @code{string-lessp}, in which case
+the @var{ignore-case} argument is ignored, and the comparison will
+always be case-sensitive.
 
 To emulate Unicode-compliant collation on MS-Windows systems,
 bind @code{w32-collate-ignore-punctuation} to a non-@code{nil} value, since
diff --git a/src/fns.c b/src/fns.c
index e337c0958d5..7cc6d00afef 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -644,7 +644,8 @@ bind `w32-collate-ignore-punctuation' to a non-nil value, since
 the codeset part of the locale cannot be \"UTF-8\" on MS-Windows.
 
 If your system does not support a locale environment, this function
-behaves like `string-equal'.
+behaves like `string-equal', and in that case the IGNORE-CASE argument
+is ignored.
 
 Do NOT use this function to compare file names for equality. */)
   (Lisp_Object s1, Lisp_Object s2, Lisp_Object locale, Lisp_Object ignore_case)