From: Stefan Monnier Date: Fri, 28 Mar 2025 04:46:53 +0000 (-0400) Subject: (replace-region-contents): Improve and promote (bug#76313) X-Git-Url: http://git.eshelyaron.com/gitweb/?a=commitdiff_plain;h=db2f934f4a3b1c4b2ee4d287f9468c22690d49e0;p=emacs.git (replace-region-contents): Improve and promote (bug#76313) Swap the role of `replace-region-contents` and `replace-buffer-contents`, so `replace-region-contents` is the main function, implemented in C, and `replace-buffer-contents` is a mere wrapper (marked as obsolete). Also remove the need to rely on narrowing and on describing the new text as a function. Finally, allow MAX-SECS==0 to require a cheap replacement, and add an INHERIT argument. * src/editfns.c: Include `coding.h`. (Freplace_region_contents): Rename from `Freplace_buffer_contents`. Change calling convention to that of `replace-region-contents`. Add more options for the SOURCE argument. Add INHERIT argument. Skip the costly algorithm if MAX-SECS is 0. * src/insdel.c (replace_range): Allow NEW to be a buffer. * lisp/subr.el (replace-buffer-contents): New implementation. * lisp/emacs-lisp/subr-x.el (replace-region-contents): Delete. * doc/lispref/text.texi (Replacing): Document new API for `replace-region-contents`. Remove documentation of `replace-buffer-contents`. * test/src/editfns-tests.el (replace-buffer-contents-1) (replace-buffer-contents-2, replace-buffer-contents-bug31837): Use `replace-region-contents`. (editfns--replace-region): Delete. (editfns-tests--replace-region): Use `replace-region-contents`. Adds tests for new types of SOURCE args. (cherry picked from commit 7c82cc8b975175aebbad1c43ec1cd98b3232f482) --- diff --git a/doc/lispref/text.texi b/doc/lispref/text.texi index a9a3d8c1e1a..447b093953e 100644 --- a/doc/lispref/text.texi +++ b/doc/lispref/text.texi @@ -4783,30 +4783,42 @@ all markers unrelocated. @node Replacing @section Replacing Buffer Text - You can use the following function to replace the text of one buffer -with the text of another buffer: + You can use the following function to replace some the text of the +current buffer: -@deffn Command replace-buffer-contents source &optional max-secs max-costs -This function replaces the accessible portion of the current buffer -with the accessible portion of the buffer @var{source}. @var{source} -may either be a buffer object or the name of a buffer. When -@code{replace-buffer-contents} succeeds, the text of the accessible -portion of the current buffer will be equal to the text of the -accessible portion of the @var{source} buffer. +@defun replace-region-contents beg end source &optional max-secs max-costs inherit +This function replaces the region between @var{beg} and @var{end} +of the current buffer with the text found in @var{source} which +is usually a string or a buffer, in which case it will use the +accessible portion of that buffer. This function attempts to keep point, markers, text properties, and overlays in the current buffer intact. One potential case where this -behavior is useful is external code formatting programs: they -typically write the reformatted text into a temporary buffer or file, -and using @code{delete-region} and @code{insert-buffer-substring} -would destroy these properties. However, the latter combination is -typically faster (@xref{Deletion}, and @ref{Insertion}). - -For its working, @code{replace-buffer-contents} needs to compare the -contents of the original buffer with that of @var{source} which is a -costly operation if the buffers are huge and there is a high number of -differences between them. In order to keep -@code{replace-buffer-contents}'s runtime in bounds, it has two +behavior is useful is external code formatting programs: they typically +write the reformatted text into a temporary buffer or file, and using +@code{insert} and @code{delete-region} would destroy these properties. + +However, in order to do that, @code{replace-region-contents} needs to +compare the contents of the original buffer with that of @var{source}, +using a costly algorithm which makes the operation much slower than +a simple @code{insert} and @code{delete-region}. In many cases, you may +not need that refinement, and you will then want to pass 0 as +@var{max-secs} argument, so as to short-circuit that costly algorithm: +It will then be just as fast as @code{insert} and @code{delete-region} +while still preserving point and markers marginally better. + +Beyond that basic usage, if you need to use as source a subset of the +accessible portion of a buffer, @var{source} can also be a vector +@code{[@var{sbuf} @var{sbeg} @var{send}]} where the region between +@var{sbeg} and @var{send} in buffer @var{sbuf} is the text +you want to use as source. + +If you need the inserted text to inherit text-properties +from the adjoining text, you can pass a non-@code{nil} value as +@var{inherit} argument. + +When you do want the costly refined replacement, in order to keep +@code{replace-region-contents}'s runtime in bounds, it has two optional arguments. @var{max-secs} defines a hard boundary in terms of seconds. If given @@ -4817,26 +4829,14 @@ and exceeded, it will fall back to @code{delete-region} and the actual costs exceed this limit, heuristics are used to provide a faster but suboptimal solution. The default value is 1000000. -@code{replace-buffer-contents} returns @code{t} if a non-destructive +@code{replace-region-contents} returns @code{t} if a non-destructive replacement could be performed. Otherwise, i.e., if @var{max-secs} was exceeded, it returns @code{nil}. -@end deffn -@defun replace-region-contents beg end replace-fn &optional max-secs max-costs -This function replaces the region between @var{beg} and @var{end} -using the given @var{replace-fn}. The function @var{replace-fn} is -run in the current buffer narrowed to the specified region and it -should return either a string or a buffer replacing the region. - -The replacement is performed using @code{replace-buffer-contents} (see -above) which also describes the @var{max-secs} and @var{max-costs} -arguments and the return value. - -Note: If the replacement is a string, it will be placed in a temporary -buffer so that @code{replace-buffer-contents} can operate on it. -Therefore, if you already have the replacement in a buffer, it makes -no sense to convert it to a string using @code{buffer-substring} or -similar. +Note: When using the refined replacement algorithm, if the replacement +is a string, it will be internally copied to a temporary buffer. +Therefore, all else being equal, it is preferable to pass a buffer than +a string as @var{source} argument. @end defun @node Decompression diff --git a/lisp/emacs-lisp/subr-x.el b/lisp/emacs-lisp/subr-x.el index be6cbf33a97..50b10a32400 100644 --- a/lisp/emacs-lisp/subr-x.el +++ b/lisp/emacs-lisp/subr-x.el @@ -281,35 +281,6 @@ the string." (declare (pure t) (side-effect-free t)) (string-remove-suffix "\n" string)) -(defun replace-region-contents (beg end replace-fn - &optional max-secs max-costs) - "Replace the region between BEG and END using REPLACE-FN. -REPLACE-FN runs on the current buffer narrowed to the region. It -should return either a string or a buffer replacing the region. - -The replacement is performed using `replace-buffer-contents' -which also describes the MAX-SECS and MAX-COSTS arguments and the -return value. - -Note: If the replacement is a string, it'll be placed in a -temporary buffer so that `replace-buffer-contents' can operate on -it. Therefore, if you already have the replacement in a buffer, -it makes no sense to convert it to a string using -`buffer-substring' or similar." - (save-excursion - (save-restriction - (narrow-to-region beg end) - (goto-char (point-min)) - (let ((repl (funcall replace-fn))) - (if (bufferp repl) - (replace-buffer-contents repl max-secs max-costs) - (let ((source-buffer (current-buffer))) - (with-temp-buffer - (insert repl) - (let ((tmp-buffer (current-buffer))) - (set-buffer source-buffer) - (replace-buffer-contents tmp-buffer max-secs max-costs))))))))) - ;;;###autoload (defmacro named-let (name bindings &rest body) "Looping construct taken from Scheme. diff --git a/lisp/subr.el b/lisp/subr.el index 0f5de9c9c04..844b75e60d2 100644 --- a/lisp/subr.el +++ b/lisp/subr.el @@ -4352,6 +4352,19 @@ Point in BUFFER will be placed after the inserted text." (with-current-buffer buffer (insert-buffer-substring current start end)))) +(defun replace-buffer-contents (source &optional max-secs max-costs) + "Replace accessible portion of current buffer with that of SOURCE. +SOURCE can be a buffer or a string that names a buffer. +Interactively, prompt for SOURCE. + +The replacement is performed using `replace-region-contents' +which also describes the MAX-SECS and MAX-COSTS arguments and the +return value." + (declare (obsolete replace-region-contents "31.1")) + (interactive "bSource buffer: ") + (replace-region-contents (point-min) (point-max) (get-buffer source) + max-secs max-costs)) + (defun replace-string-in-region (string replacement &optional start end) "Replace STRING with REPLACEMENT in the region from START to END. The number of replaced occurrences are returned, or nil if STRING diff --git a/src/coding.c b/src/coding.c index b0bd5d3a9ab..63b0dbeb18b 100644 --- a/src/coding.c +++ b/src/coding.c @@ -7898,6 +7898,8 @@ code_conversion_save (bool with_work_buf, bool multibyte) bset_enable_multibyte_characters (current_buffer, multibyte ? Qt : Qnil); if (EQ (workbuf, Vcode_conversion_reused_workbuf)) reused_workbuf_in_use = true; + /* FIXME: Maybe we should stay in the new workbuf, because we often + switch right back to it anyway in order to initialize it further. */ set_buffer_internal (current); } diff --git a/src/editfns.c b/src/editfns.c index 53d6cce7c82..25625793c42 100644 --- a/src/editfns.c +++ b/src/editfns.c @@ -54,6 +54,7 @@ along with GNU Emacs. If not, see . */ #include "buffer.h" #include "window.h" #include "blockinput.h" +#include "coding.h" #ifdef WINDOWSNT # include "w32common.h" @@ -1914,11 +1915,14 @@ static bool compareseq_early_abort (struct context *); #include "minmax.h" #include "diffseq.h" -DEFUN ("replace-buffer-contents", Freplace_buffer_contents, - Sreplace_buffer_contents, 1, 3, "bSource buffer: ", - doc: /* Replace accessible portion of current buffer with that of SOURCE. -SOURCE can be a buffer or a string that names a buffer. -Interactively, prompt for SOURCE. +DEFUN ("replace-region-contents", Freplace_region_contents, + Sreplace_region_contents, 3, 6, 0, + doc: /* Replace the region between BEG and END with that of SOURCE. +SOURCE can be a buffer, a string, or a vector [SBUF SBEG SEND] +denoting the subtring SBEG..SEND of buffer SBUF. + +If optional argument INHERIT is non-nil, the inserted text will inherit +properties from adjoining text. As far as possible the replacement is non-destructive, i.e. existing buffer contents, markers, properties, and overlays in the current @@ -1940,18 +1944,85 @@ computation. If the actual costs exceed this limit, heuristics are used to provide a faster but suboptimal solution. The default value is 1000000. +Note: If the replacement is a string, it’ll usually be placed internally +in a temporary buffer. Therefore, all else being equal, it is preferable +to pass a buffer rather than a string as SOURCE argument. + This function returns t if a non-destructive replacement could be performed. Otherwise, i.e., if MAX-SECS was exceeded, it returns -nil. */) - (Lisp_Object source, Lisp_Object max_secs, Lisp_Object max_costs) +nil. + +SOURCE can also be a function that will be called with no arguments +and with current buffer narrowed to BEG..END, and should return +a buffer or a string. But this is deprecated. */) + (Lisp_Object beg, Lisp_Object end, Lisp_Object source, + Lisp_Object max_secs, Lisp_Object max_costs, Lisp_Object inherit) { - struct buffer *a = current_buffer; - Lisp_Object source_buffer = Fget_buffer (source); - if (NILP (source_buffer)) - nsberror (source); - struct buffer *b = XBUFFER (source_buffer); - if (! BUFFER_LIVE_P (b)) + validate_region (&beg, &end); + ptrdiff_t min_a = XFIXNUM (beg); + ptrdiff_t size_a = XFIXNUM (end) - min_a; + eassume (size_a >= 0); + bool a_empty = size_a == 0; + bool inh = !NILP (inherit); + + if (FUNCTIONP (source)) + { + specpdl_ref count = SPECPDL_INDEX (); + record_unwind_protect (save_restriction_restore, + save_restriction_save ()); + Fnarrow_to_region (beg, end); + source = calln (source); + unbind_to (count, Qnil); + } + ptrdiff_t min_b, size_b; + struct buffer *b; + if (STRINGP (source)) + { + min_b = BEG; /* Assuming we'll copy it into a buffer. */ + size_b = SCHARS (source); + b = NULL; + } + else if (BUFFERP (source)) + { + b = XBUFFER (source); + min_b = BUF_BEGV (b); + size_b = BUF_ZV (b) - min_b; + } + else + { + CHECK_TYPE (VECTORP (source), + list (Qor, Qstring, Qbuffer, Qvector), source); + /* Let `Faref' signal an error if it's too small. */ + Lisp_Object send = Faref (source, make_fixnum (2)); + Lisp_Object sbeg = AREF (source, 1); + CHECK_BUFFER (AREF (source, 0)); + b = XBUFFER (AREF (source, 0)); + specpdl_ref count = SPECPDL_INDEX (); + record_unwind_current_buffer (); + set_buffer_internal (b); + validate_region (&sbeg, &send); + unbind_to (count, Qnil); + min_b = XFIXNUM (sbeg); + size_b = XFIXNUM (send) - min_b; + } + bool b_empty = size_b == 0; + if (b && !BUFFER_LIVE_P (b)) error ("Selecting deleted buffer"); + + /* Handle trivial cases where at least one accessible portion is + empty. */ + + if (a_empty && b_empty) + return Qt; + else if (a_empty || b_empty + || EQ (max_secs, make_fixnum (0)) + || EQ (max_costs, make_fixnum (0))) + { + replace_range (min_a, min_a + size_a, source, true, false, inh); + return Qt; + } + + struct buffer *a = current_buffer; if (a == b) error ("Cannot replace a buffer with itself"); @@ -1977,36 +2048,8 @@ nil. */) time_limit = tlim; } - ptrdiff_t min_a = BEGV; - ptrdiff_t min_b = BUF_BEGV (b); - ptrdiff_t size_a = ZV - min_a; - ptrdiff_t size_b = BUF_ZV (b) - min_b; - eassume (size_a >= 0); - eassume (size_b >= 0); - bool a_empty = size_a == 0; - bool b_empty = size_b == 0; - - /* Handle trivial cases where at least one accessible portion is - empty. */ - - if (a_empty && b_empty) - return Qt; - - if (a_empty) - { - Finsert_buffer_substring (source, Qnil, Qnil); - return Qt; - } - - if (b_empty) - { - del_range_both (BEGV, BEGV_BYTE, ZV, ZV_BYTE, true); - return Qt; - } - specpdl_ref count = SPECPDL_INDEX (); - ptrdiff_t diags = size_a + size_b + 3; ptrdiff_t del_bytes = size_a / CHAR_BIT + 1; ptrdiff_t ins_bytes = size_b / CHAR_BIT + 1; @@ -2020,6 +2063,18 @@ nil. */) unsigned char *deletions_insertions = memset (buffer + 2 * diags, 0, del_bytes + ins_bytes); + /* The rest of the code is not prepared to handle a string SOURCE. */ + if (!b) + { + Lisp_Object workbuf + = code_conversion_save (true, STRING_MULTIBYTE (source)); + b = XBUFFER (workbuf); + set_buffer_internal (b); + CALLN (Finsert, source); + set_buffer_internal (a); + } + Lisp_Object source_buffer = make_lisp_ptr (b, Lisp_Vectorlike); + /* FIXME: It is not documented how to initialize the contents of the context structure. This code cargo-cults from the existing caller in src/analyze.c of GNU Diffutils, which appears to @@ -2053,7 +2108,7 @@ nil. */) Lisp_Object src = CALLN (Fvector, source_buffer, make_fixnum (BUF_BEGV (b)), make_fixnum (BUF_ZV (b))); - replace_range (BEGV, ZV, src, true, false, false); + replace_range (min_a, min_a + size_a, src, true, false, inh); SAFE_FREE_UNBIND_TO (count, Qnil); return Qnil; } @@ -2069,7 +2124,7 @@ nil. */) modification hooks, because then they don't want that. */ if (!inhibit_modification_hooks) { - prepare_to_modify_buffer (BEGV, ZV, NULL); + prepare_to_modify_buffer (min_a, min_a + size_a, NULL); specbind (Qinhibit_modification_hooks, Qt); modification_hooks_inhibited = true; } @@ -2102,10 +2157,9 @@ nil. */) eassert (beg_a <= end_a); eassert (beg_b <= end_b); eassert (beg_a < end_a || beg_b < end_b); - /* FIXME: Use 'replace_range'! */ ASET (src, 1, make_fixed_natnum (beg_b)); ASET (src, 2, make_fixed_natnum (end_b)); - replace_range (beg_a, end_a, src, true, false, false); + replace_range (beg_a, end_a, src, true, false, inh); } --i; --j; @@ -2115,8 +2169,8 @@ nil. */) if (modification_hooks_inhibited) { - signal_after_change (BEGV, size_a, ZV - BEGV); - update_compositions (BEGV, ZV, CHECK_INSIDE); + signal_after_change (min_a, size_a, size_b); + update_compositions (min_a, min_a + size_b, CHECK_INSIDE); /* We've locked the buffer's file above in prepare_to_modify_buffer; if the buffer is unchanged at this point, i.e. no insertions or deletions have been made, unlock @@ -4787,7 +4841,7 @@ it to be non-nil. */); defsubr (&Sinsert_buffer_substring); defsubr (&Scompare_buffer_substrings); - defsubr (&Sreplace_buffer_contents); + defsubr (&Sreplace_region_contents); defsubr (&Ssubst_char_in_region); defsubr (&Stranslate_region_internal); defsubr (&Sdelete_region); diff --git a/src/insdel.c b/src/insdel.c index 9b770725971..20267265ab8 100644 --- a/src/insdel.c +++ b/src/insdel.c @@ -1409,9 +1409,9 @@ adjust_after_insert (ptrdiff_t from, ptrdiff_t from_byte, adjust_after_replace (from, from_byte, Qnil, newlen, len_byte); } -/* Replace the text from character positions FROM to TO with NEW. - NEW could either be a string, the replacement text, or a vector - [BUFFER BEG END], where BUFFER is the buffer with the replacement +/* Replace the text from character positions FROM to TO with the + replacement text NEW. NEW could either be a string, a buffer, or + a vector [BUFFER BEG END], where BUFFER is the buffer with the replacement text and BEG and END are buffer positions in BUFFER that give the replacement text beginning and end. If PREPARE, call prepare_to_modify_buffer. @@ -1439,6 +1439,12 @@ replace_range (ptrdiff_t from, ptrdiff_t to, Lisp_Object new, insbeg = 0; inschars = SCHARS (new); } + else if (BUFFERP (new)) + { + insbuf = XBUFFER (new); + insbeg = BUF_BEGV (insbuf); + inschars = BUF_ZV (insbuf) - insbeg; + } else { CHECK_VECTOR (new); diff --git a/test/src/editfns-tests.el b/test/src/editfns-tests.el index c3f825c6149..3da9d4e8acd 100644 --- a/test/src/editfns-tests.el +++ b/test/src/editfns-tests.el @@ -289,7 +289,7 @@ (narrow-to-region 8 13) (goto-char 12) (should (looking-at " \\'")) - (replace-buffer-contents source) + (replace-region-contents (point-min) (point-max) source) (should (looking-at " \\'"))) (should (equal (marker-buffer marker) (current-buffer))) (should (equal (marker-position marker) 16))) @@ -306,7 +306,7 @@ (let ((source (current-buffer))) (with-temp-buffer (insert "foo BAR baz qux") - (replace-buffer-contents source) + (replace-region-contents (point-min) (point-max) source) (should (equal-including-properties (buffer-string) "foo bar baz qux")))))) @@ -318,44 +318,44 @@ (switch-to-buffer "b") (insert-char (char-from-name "SMILE")) (insert "5678") - (replace-buffer-contents "a") + (replace-region-contents (point-min) (point-max) (get-buffer "a")) (should (equal (buffer-substring-no-properties (point-min) (point-max)) (concat (string (char-from-name "SMILE")) "1234")))) -(defun editfns--replace-region (from to string) - (save-excursion - (save-restriction - (narrow-to-region from to) - (let ((buf (current-buffer))) - (with-temp-buffer - (let ((str-buf (current-buffer))) - (insert string) - (with-current-buffer buf - (replace-buffer-contents str-buf)))))))) - (ert-deftest editfns-tests--replace-region () ;; :expected-result :failed (with-temp-buffer - (insert "here is some text") - (let ((m5n (copy-marker (+ (point-min) 5))) - (m5a (copy-marker (+ (point-min) 5) t)) - (m6n (copy-marker (+ (point-min) 6))) - (m6a (copy-marker (+ (point-min) 6) t)) - (m7n (copy-marker (+ (point-min) 7))) - (m7a (copy-marker (+ (point-min) 7) t))) - (editfns--replace-region (+ (point-min) 5) (+ (point-min) 7) "be") - (should (equal (buffer-string) "here be some text")) - (should (equal (point) (point-max))) - ;; Markers before the replaced text stay before. - (should (= m5n (+ (point-min) 5))) - (should (= m5a (+ (point-min) 5))) - ;; Markers in the replaced text can end up at either end, depending - ;; on whether they're advance-after-insert or not. - (should (= m6n (+ (point-min) 5))) - (should (<= (+ (point-min) 5) m6a (+ (point-min) 7))) - ;; Markers after the replaced text stay after. - (should (= m7n (+ (point-min) 7))) - (should (= m7a (+ (point-min) 7)))))) + (let ((tmpbuf (current-buffer))) + (insert " be ") + (narrow-to-region (+ (point-min) 2) (- (point-max) 2)) + (dolist (args `((,tmpbuf) + (,(vector tmpbuf (point-min) (point-max))) + (,"be") + (,(vector tmpbuf (point-min) (point-max)) 0) + (,"be" 0))) + (with-temp-buffer + (insert "here is some text") + (let ((m5n (copy-marker (+ (point-min) 5))) + (m5a (copy-marker (+ (point-min) 5) t)) + (m6n (copy-marker (+ (point-min) 6))) + (m6a (copy-marker (+ (point-min) 6) t)) + (m7n (copy-marker (+ (point-min) 7))) + (m7a (copy-marker (+ (point-min) 7) t))) + (apply #'replace-region-contents + (+ (point-min) 5) (+ (point-min) 7) args) + (should (equal (buffer-string) "here be some text")) + (should (equal (point) (point-max))) + ;; Markers before the replaced text stay before. + (should (= m5n (+ (point-min) 5))) + (should (= m5a (+ (point-min) 5))) + ;; Markers in the replaced text can end up at either end, depending + ;; on whether they're advance-after-insert or not. + (should (= m6n (+ (point-min) 5))) + (should (<= (+ (point-min) 5) m6a (+ (point-min) 7))) + ;; Markers after the replaced text stay after. + (should (= m7n (+ (point-min) 7))) + (should (= m7a (+ (point-min) 7))))) + (widen))))) (ert-deftest delete-region-undo-markers-1 () "Make sure we don't end up with freed markers reachable from Lisp."