From f67b6c12766d0ba5768cf31344ad112b54e5d694 Mon Sep 17 00:00:00 2001 From: Luc Teirlinck Date: Mon, 27 Oct 2003 15:54:13 +0000 Subject: [PATCH] (Creating Strings): Argument START to `substring' can not be `nil'. Expand description of `substring-no-properties'. Correct description of `split-string', especially with respect to empty matches. Prevent very bad line break in definition of `split-string-default-separators'. (Text Comparison): `string=' and `string<' also accept symbols as arguments. (String Conversion): More completely describe argument BASE in `string-to-number'. (Formatting Strings): `%s' and `%S" in `format' do require corresponding object. Clarify behavior of numeric prefix after `%' in `format'. (Case Conversion): The argument to `upcase-initials' can be a character. --- lispref/strings.texi | 144 +++++++++++++++++++++++++++++-------------- 1 file changed, 97 insertions(+), 47 deletions(-) diff --git a/lispref/strings.texi b/lispref/strings.texi index 79aeb976f1e..b0106f9a73b 100644 --- a/lispref/strings.texi +++ b/lispref/strings.texi @@ -172,7 +172,7 @@ In this example, the index for @samp{e} is @minus{}3, the index for @samp{f} is @minus{}2, and the index for @samp{g} is @minus{}1. Therefore, @samp{e} and @samp{f} are included, and @samp{g} is excluded. -When @code{nil} is used as an index, it stands for the length of the +When @code{nil} is used for @var{end}, it stands for the length of the string. Thus, @example @@ -208,10 +208,11 @@ For example: @result{} [b (c)] @end example -A @code{wrong-type-argument} error is signaled if either @var{start} or -@var{end} is not an integer or @code{nil}. An @code{args-out-of-range} -error is signaled if @var{start} indicates a character following -@var{end}, or if either integer is out of range for @var{string}. +A @code{wrong-type-argument} error is signaled if @var{start} is not +an integer or if @var{end} is neither an integer nor @code{nil}. An +@code{args-out-of-range} error is signaled if @var{start} indicates a +character following @var{end}, or if either integer is out of range +for @var{string}. Contrast this function with @code{buffer-substring} (@pxref{Buffer Contents}), which returns a string containing a portion of the text in @@ -219,9 +220,12 @@ the current buffer. The beginning of a string is at index 0, but the beginning of a buffer is at index 1. @end defun -@defun substring-no-properties string start &optional end -This works like @code{substring} but discards all text properties -from the value. +@defun substring-no-properties string &optional start end +This works like @code{substring} but discards all text properties from +the value. Also, @var{start} may be omitted or @code{nil}, which is +equivalent to 0. Thus, @w{@code{(substring-no-properties +@var{string})}} returns a copy of @var{string}, with all text +properties removed. @end defun @defun concat &rest sequences @@ -264,7 +268,7 @@ description of @code{mapconcat} in @ref{Mapping Functions}, Lists}. @end defun -@defun split-string string separators omit-nulls +@defun split-string string &optional separators omit-nulls This function splits @var{string} into substrings at matches for the regular expression @var{separators}. Each match for @var{separators} defines a splitting point; the substrings between the splitting points @@ -285,7 +289,7 @@ null strings are always omitted from the result. Thus: @example (split-string " two words ") -@result{} ("two" "words") + @result{} ("two" "words") @end example The result is not @samp{("" "two" "words" "")}, which would rarely be @@ -294,33 +298,62 @@ useful. If you need such a result, use an explict value for @example (split-string " two words " split-string-default-separators) -@result{} ("" "two" "words" "") + @result{} ("" "two" "words" "") @end example More examples: @example (split-string "Soup is good food" "o") -@result{} ("S" "up is g" "" "d f" "" "d") + @result{} ("S" "up is g" "" "d f" "" "d") (split-string "Soup is good food" "o" t) -@result{} ("S" "up is g" "d f" "d") + @result{} ("S" "up is g" "d f" "d") (split-string "Soup is good food" "o+") -@result{} ("S" "up is g" "d f" "d") + @result{} ("S" "up is g" "d f" "d") @end example -Empty matches do count, when not adjacent to another match: +Empty matches do count, except that @code{split-string} will not look +for a final empty match when it already reached the end of the string +using a non-empty match or when @var{string} is empty: @example -(split-string "Soup is good food" "o*") -@result{}("S" "u" "p" " " "i" "s" " " "g" "d" " " "f" "d") -(split-string "Nice doggy!" "") -@result{}("N" "i" "c" "e" " " "d" "o" "g" "g" "y" "!") +(split-string "aooob" "o*") + @result{} ("" "a" "" "b" "") +(split-string "ooaboo" "o*") + @result{} ("" "" "a" "b" "") +(split-string "" "") + @result{} ("") +@end example + +However, when @var{separators} can match the empty string, +@var{omit-nulls} is usually @code{t}, so that the subtleties in the +three previous examples are rarely relevant: + +@example +(split-string "Soup is good food" "o*" t) + @result{} ("S" "u" "p" " " "i" "s" " " "g" "d" " " "f" "d") +(split-string "Nice doggy!" "" t) + @result{} ("N" "i" "c" "e" " " "d" "o" "g" "g" "y" "!") +(split-string "" "" t) + @result{} nil +@end example + +Somewhat odd, but predictable, behavior can occur for certain +``non-greedy'' values of @var{separators} that can prefer empty +matches over non-empty matches. Again, such values rarely occur in +practice: + +@example +(split-string "ooo" "o*" t) + @result{} nil +(split-string "ooo" "\\|o+" t) + @result{} ("o" "o" "o") @end example @end defun @defvar split-string-default-separators The default value of @var{separators} for @code{split-string}, initially -@samp{"[ \f\t\n\r\v]+"}. +@w{@samp{"[ \f\t\n\r\v]+"}}. @end defvar @node Modifying Strings @@ -367,7 +400,8 @@ in case if @code{case-fold-search} is non-@code{nil}. @defun string= string1 string2 This function returns @code{t} if the characters of the two strings -match exactly. +match exactly. Symbols are also allowed as arguments, in which case +their print names are used. Case is always significant, regardless of @code{case-fold-search}. @example @@ -441,6 +475,9 @@ no characters is less than any other string. @result{} nil @end group @end example + +Symbols are also allowed as arguments, in which case their print names +are used. @end defun @defun string-lessp string1 string2 @@ -545,8 +582,10 @@ negative. @example (number-to-string 256) @result{} "256" +@group (number-to-string -23) @result{} "-23" +@end group (number-to-string -23.5) @result{} "-23.5" @end example @@ -560,20 +599,22 @@ See also the function @code{format} in @ref{Formatting Strings}. @defun string-to-number string &optional base @cindex string to number This function returns the numeric value of the characters in -@var{string}. If @var{base} is non-@code{nil}, integers are converted -in that base. If @var{base} is @code{nil}, then base ten is used. -Floating point conversion always uses base ten; we have not implemented -other radices for floating point numbers, because that would be much -more work and does not seem useful. If @var{string} looks like an -integer but its value is too large to fit into a Lisp integer, +@var{string}. If @var{base} is non-@code{nil}, it must be an integer +between 2 and 16 (inclusive), and integers are converted in that base. +If @var{base} is @code{nil}, then base ten is used. Floating point +conversion only works in base ten; we have not implemented other +radices for floating point numbers, because that would be much more +work and does not seem useful. If @var{string} looks like an integer +but its value is too large to fit into a Lisp integer, @code{string-to-number} returns a floating point result. -The parsing skips spaces and tabs at the beginning of @var{string}, then -reads as much of @var{string} as it can interpret as a number. (On some -systems it ignores other whitespace at the beginning, not just spaces -and tabs.) If the first character after the ignored whitespace is -neither a digit, nor a plus or minus sign, nor the leading dot of a -floating point number, this function returns 0. +The parsing skips spaces and tabs at the beginning of @var{string}, +then reads as much of @var{string} as it can interpret as a number in +the given base. (On some systems it ignores other whitespace at the +beginning, not just spaces and tabs.) If the first character after +the ignored whitespace is neither a digit in the given base, nor a +plus or minus sign, nor the leading dot of a floating point number, +this function returns 0. @example (string-to-number "256") @@ -675,16 +716,12 @@ Starting in Emacs 21, if the object is a string, its text properties are copied into the output. The text properties of the @samp{%s} itself are also copied, but those of the object take priority. -If there is no corresponding object, the empty string is used. - @item %S Replace the specification with the printed representation of the object, made with quoting (that is, using @code{prin1}---@pxref{Output Functions}). Thus, strings are enclosed in @samp{"} characters, and @samp{\} characters appear where necessary before special characters. -If there is no corresponding object, the empty string is used. - @item %o @cindex integer to octal Replace the specification with the base-eight representation of an @@ -747,12 +784,17 @@ operation} error. @cindex padding All the specification characters allow an optional numeric prefix between the @samp{%} and the character. The optional numeric prefix -defines the minimum width for the object. If the printed representation -of the object contains fewer characters than this, then it is padded. -The padding is on the left if the prefix is positive (or starts with -zero) and on the right if the prefix is negative. The padding character -is normally a space, but if the numeric prefix starts with a zero, zeros -are used for padding. Here are some examples of padding: +defines the minimum width for the object. If the printed +representation of the object contains fewer characters than this, then +it is padded. The padding is on the left if the prefix is positive +(or starts with zero) and on the right if the prefix is negative. The +padding character is normally a space, but if the numeric prefix +starts with a zero, zeros are used for padding. Some of these +conventions are ignored for specification characters for which they do +not make sense. That is, %s, %S and %c accept a numeric prefix +starting with 0, but still pad with @emph{spaces} on the left. Also, +%% accepts a numeric prefix, but ignores it. Here are some examples +of padding: @example (format "%06d is padded on the left with zeros" 123) @@ -872,11 +914,15 @@ When the argument to @code{capitalize} is a character, @code{capitalize} has the same result as @code{upcase}. @example +@group (capitalize "The cat in the hat") @result{} "The Cat In The Hat" +@end group +@group (capitalize "THE 77TH-HATTED CAT") @result{} "The 77th-Hatted Cat" +@end group @group (capitalize ?x) @@ -885,16 +931,20 @@ has the same result as @code{upcase}. @end example @end defun -@defun upcase-initials string -This function capitalizes the initials of the words in @var{string}, -without altering any letters other than the initials. It returns a new -string whose contents are a copy of @var{string}, in which each word has +@defun upcase-initials string-or-char +If @var{string-or-char} is a string, this function capitalizes the +initials of the words in @var{string-or-char}, without altering any +letters other than the initials. It returns a new string whose +contents are a copy of @var{string-or-char}, in which each word has had its initial letter converted to upper case. The definition of a word is any sequence of consecutive characters that are assigned to the word constituent syntax class in the current syntax table (@pxref{Syntax Class Table}). +When the argument to @code{upcase-initials} is a character, +@code{upcase-initials} has the same result as @code{upcase}. + @example @group (upcase-initials "The CAT in the hAt") -- 2.39.5