Fix the new PEG library

author Eli Zaretskii <eliz@gnu.org>

Sun, 31 Mar 2024 07:29:34 +0000 (10:29 +0300)

committer Eshel Yaron <me@eshelyaron.com>

Sun, 31 Mar 2024 18:05:43 +0000 (20:05 +0200)
author Eli Zaretskii <eliz@gnu.org>
Sun, 31 Mar 2024 07:29:34 +0000 (10:29 +0300)
committer Eshel Yaron <me@eshelyaron.com>
Sun, 31 Mar 2024 18:05:43 +0000 (20:05 +0200)
diff --git a/doc/lispref/peg.texi b/doc/lispref/peg.texi

index ef4dfa7653e3e787ab404444ac03c71714653687..fbf57852ee00a7e15a64345f7bf6c0d5f225b679 100644 (file)
--- a/doc/lispref/peg.texi
+++ b/doc/lispref/peg.texi
@@ -7,29 +7,34 @@
  @chapter Parsing Expression Grammars
  @cindex text parsing
  @cindex parsing expression grammar
+@cindex PEG
  
    Emacs Lisp provides several tools for parsing and matching text,
  from regular expressions (@pxref{Regular Expressions}) to full
-@acronym{LL} grammar parsers (@pxref{Top,, Bovine parser
-development,bovine}).  @dfn{Parsing Expression Grammars}
+left-to-right (a.k.a.@: @acronym{LL}) grammar parsers (@pxref{Top,,
+Bovine parser development,bovine}).  @dfn{Parsing Expression Grammars}
  (@acronym{PEG}) are another approach to text parsing that offer more
  structure and composibility than regular expressions, but less
  complexity than context-free grammars.
  
-A @acronym{PEG} parser is defined as a list of named rules, each of
-which matches text patterns, and/or contains references to other
+A Parsing Expression Grammar (@acronym{PEG}) describes a formal language
+in terms of a set of rules for recognizing strings in the language.  In
+Emacs, a @acronym{PEG} parser is defined as a list of named rules, each
+of which matches text patterns and/or contains references to other
  rules.  Parsing is initiated with the function @code{peg-run} or the
  macro @code{peg-parse} (see below), and parses text after point in the
  current buffer, using a given set of rules.
  
  @cindex parsing expression
-The definition of each rule is referred to as a @dfn{parsing
-expression} (@acronym{PEX}), and can consist of a literal string, a
-regexp-like character range or set, a peg-specific construct
-resembling an elisp function call, a reference to another rule, or a
-combination of any of these.  A grammar is expressed as a tree of
-rules in which one rule is typically treated as a ``root'' or
-``entry-point'' rule.  For instance:
+@cindex root, of parsing expression grammar
+@cindex entry-point, of parsing expression grammar
+Each rule in a @acronym{PEG} is referred to as a @dfn{parsing
+expression} (@acronym{PEX}), and can be specified a a literal string, a
+regexp-like character range or set, a peg-specific construct resembling
+an Emacs Lisp function call, a reference to another rule, or a
+combination of any of these.  A grammar is expressed as a tree of rules
+in which one rule is typically treated as a ``root'' or ``entry-point''
+rule.  For instance:
  
  @example
  @group
@@ -56,14 +61,17 @@ first rule is considered the ``entry-point'':
  @end group
  @end example
  
-This macro represents the simplest use of the @acronym{PEG} library,
-but also the least flexible, as the rules must be written directly
-into the source code.  A more flexible approach involves use of three
-macros in conjunction: @code{with-peg-rules}, a @code{let}-like
-construct that makes a set of rules available within the macro body;
-@code{peg-run}, which initiates parsing given a single rule; and
-@code{peg}, which is used to wrap the entry-point rule name.  In fact,
-a call to @code{peg-parse} expands to just this set of calls.  The
+@c FIXME: These two should be formally defined using @defmac and @defun.
+@findex with-peg-rules
+@findex peg-run
+The @code{peg-parse} macro represents the simplest use of the
+@acronym{PEG} library, but also the least flexible, as the rules must be
+written directly into the source code.  A more flexible approach
+involves use of three macros in conjunction: @code{with-peg-rules}, a
+@code{let}-like construct that makes a set of rules available within the
+macro body; @code{peg-run}, which initiates parsing given a single rule;
+and @code{peg}, which is used to wrap the entry-point rule name.  In
+fact, a call to @code{peg-parse} expands to just this set of calls.  The
  above example could be written as:
  
  @example
@@ -79,33 +87,43 @@ above example could be written as:
  This allows more explicit control over the ``entry-point'' of parsing,
  and allows the combination of rules from different sources.
  
+@c FIXME: Use @defmac.
+@findex define-peg-rule
  Individual rules can also be defined using a more @code{defun}-like
  syntax, using the macro @code{define-peg-rule}:
  
  @example
+@group
  (define-peg-rule digit ()
    [0-9])
+@end group
  @end example
  
  This also allows for rules that accept an argument (supplied by the
-@code{funcall} PEG rule).
+@code{funcall} PEG rule, @pxref{PEX Definitions}).
  
+@c FIXME: Use @defmac.
+@findex define-peg-ruleset
  Another possibility is to define a named set of rules with
  @code{define-peg-ruleset}:
  
  @example
+@group
  (define-peg-ruleset number-grammar
          '((number sign digit (* digit))
            digit  ;; A reference to the definition above.
            (sign (or "+" "-" ""))))
+@end group
  @end example
  
  Rules and rulesets defined this way can be referred to by name in
  later calls to @code{peg-run} or @code{with-peg-rules}:
  
  @example
+@group
  (with-peg-rules number-grammar
    (peg-run (peg number)))
+@end group
  @end example
  
  By default, calls to @code{peg-run} or @code{peg-parse} produce no
@@ -125,11 +143,11 @@ act upon parsed strings, rules can include @dfn{actions}, see
  Parsing expressions can be defined using the following syntax:
  
  @table @code
-@item (and E1 E2 ...)
-A sequence of @acronym{PEX}s that must all be matched.  The @code{and} form is
-optional and implicit.
+@item (and @var{e1} @var{e2}@dots{})
+A sequence of @acronym{PEX}s that must all be matched.  The @code{and}
+form is optional and implicit.
  
-@item (or E1 E2 ...)
+@item (or @var{e1} @var{e2}@dots{})
  Prioritized choices, meaning that, as in Elisp, the choices are tried
  in order, and the first successful match is used.  Note that this is
  distinct from context-free grammars, in which selection between
@@ -141,43 +159,43 @@ Matches any single character, as the regexp ``.''.
  @item @var{string}
  A literal string.
  
-@item (char @var{C})
-A single character @var{C}, as an Elisp character literal.
+@item (char @var{c})
+A single character @var{c}, as an Elisp character literal.
  
-@item (* @var{E})
-Zero or more instances of expression @var{E}, as the regexp @samp{*}.
+@item (* @var{e})
+Zero or more instances of expression @var{e}, as the regexp @samp{*}.
  Matching is always ``greedy''.
  
-@item (+ @var{E})
-One or more instances of expression @var{E}, as the regexp @samp{+}.
+@item (+ @var{e})
+One or more instances of expression @var{e}, as the regexp @samp{+}.
  Matching is always ``greedy''.
  
-@item (opt @var{E})
-Zero or one instance of expression @var{E}, as the regexp @samp{?}.
+@item (opt @var{e})
+Zero or one instance of expression @var{e}, as the regexp @samp{?}.
  
-@item SYMBOL
+@item @var{symbol}
  A symbol representing a previously-defined PEG rule.
  
-@item (range CH1 CH2)
-The character range between CH1 and CH2, as the regexp @samp{[CH1-CH2]}.
+@item (range @var{ch1} @var{ch2})
+The character range between @var{ch1} and @var{ch2}, as the regexp
+@samp{[@var{ch1}-@var{ch2}]}.
  
-@item [CH1-CH2 "+*" ?x]
+@item [@var{ch1}-@var{ch2} "+*" ?x]
  A character set, which can include ranges, character literals, or
  strings of characters.
  
  @item [ascii cntrl]
  A list of named character classes.
  
-@item (syntax-class @var{NAME})
+@item (syntax-class @var{name})
  A single syntax class.
  
-@item (funcall E ARGS...)
-Call @acronym{PEX} E (previously defined with @code{define-peg-rule})
-with arguments @var{ARGS}.
+@item (funcall @var{e} @var{args}@dots{})
+Call @acronym{PEX} @var{e} (previously defined with
+@code{define-peg-rule}) with arguments @var{args}.
  
  @item (null)
  The empty string.
-
  @end table
  
  The following expressions are used as anchors or tests -- they do not
@@ -210,19 +228,19 @@ Beginning of symbol.
  @item (eos)
  End of symbol.
  
-@item (if E)
-Returns non-@code{nil} if parsing @acronym{PEX} E from point succeeds (point
-is not moved).
-
-@item (not E)
-Returns non-@code{nil} if parsing @acronym{PEX} E from point fails (point
-is not moved).
+@item (if @var{e})
+Returns non-@code{nil} if parsing @acronym{PEX} @var{e} from point
+succeeds (point is not moved).
  
-@item (guard EXP)
-Treats the value of the Lisp expression EXP as a boolean.
+@item (not @var{e})
+Returns non-@code{nil} if parsing @acronym{PEX} @var{e} from point fails
+(point is not moved).
  
+@item (guard @var{exp})
+Treats the value of the Lisp expression @var{exp} as a boolean.
  @end table
  
+@c FIXME: peg-char-classes should be mentioned in the text below.
  @vindex peg-char-classes
  Character class matching can use the same named character classes as
  in regular expressions (@pxref{Top,, Character Classes,elisp})
@@ -234,12 +252,13 @@ in regular expressions (@pxref{Top,, Character Classes,elisp})
  @cindex parsing stack
  By default the process of parsing simply moves point in the current
  buffer, ultimately returning @code{t} if the parsing succeeds, and
-@code{nil} if it doesn't.  It's also possible to define ``actions''
-that can run arbitrary Elisp at certain points in the parsed text.
-These actions can optionally affect something called the @dfn{parsing
-stack}, which is a list of values returned by the parsing process.
-These actions only run (and only return values) if the parsing process
-ultimately succeeds; if it fails the action code is not run at all.
+@code{nil} if it doesn't.  It's also possible to define @dfn{parsing
+actions} that can run arbitrary Elisp at certain points in the parsed
+text.  These actions can optionally affect something called the
+@dfn{parsing stack}, which is a list of values returned by the parsing
+process.  These actions only run (and only return values) if the parsing
+process ultimately succeeds; if it fails the action code is not run at
+all.
  
  Actions can be added anywhere in the definition of a rule.  They are
  distinguished from parsing expressions by an initial backquote
@@ -247,12 +266,13 @@ distinguished from parsing expressions by an initial backquote
  of hyphens (@samp{--}) somewhere within it.  Symbols to the left of
  the hyphens are bound to values popped from the stack (they are
  somewhat analogous to the argument list of a lambda form).  Values
-produced by code to the right are pushed to the stack (analogous to
-the return value of the lambda).  For instance, the previous grammar
-can be augmented with actions to return the parsed number as an actual
-integer:
+produced by code to the right of the hyphens are pushed onto the stack
+(analogous to the return value of the lambda).  For instance, the
+previous grammar can be augmented with actions to return the parsed
+number as an actual integer:
  
  @example
+@group
  (with-peg-rules ((number sign digit (* digit
                                         `(a b -- (+ (* a 10) b)))
                           `(sign val -- (* sign val)))
@@ -261,6 +281,7 @@ integer:
                             (and ""  `(-- 1))))
                   (digit [0-9] `(-- (- (char-before) ?0))))
    (peg-run (peg number)))
+@end group
  @end example
  
  There must be values on the stack before they can be popped and
@@ -271,43 +292,53 @@ only left-hand terms will consume (and discard) values from the stack.
  At the end of parsing, stack values are returned as a flat list.
  
  To return the string matched by a @acronym{PEX} (instead of simply
-moving point over it), a rule like this can be used:
+moving point over it), a grammar can use a rule like this:
  
  @example
+@group
  (one-word
    `(-- (point))
    (+ [word])
    `(start -- (buffer-substring start (point))))
+@end group
  @end example
  
-The first action pushes the initial value of point to the stack.  The
-intervening @acronym{PEX} moves point over the next word.  The second
-action pops the previous value from the stack (binding it to the
-variable @code{start}), and uses that value to extract a substring
-from the buffer and push it to the stack.  This pattern is so common
-that @acronym{PEG} provides a shorthand function that does exactly the
-above, along with a few other shorthands for common scenarios:
+@noindent
+The first action above pushes the initial value of point to the stack.
+The intervening @acronym{PEX} moves point over the next word.  The
+second action pops the previous value from the stack (binding it to the
+variable @code{start}), then uses that value to extract a substring from
+the buffer and push it to the stack.  This pattern is so common that
+@acronym{PEG} provides a shorthand function that does exactly the above,
+along with a few other shorthands for common scenarios:
  
  @table @code
-@item (substring @var{E})
-Match @acronym{PEX} @var{E} and push the matched string to the stack.
-
-@item (region @var{E})
-Match @var{E} and push the start and end positions of the matched
-region to the stack.
-
-@item (replace @var{E} @var{replacement})
-Match @var{E} and replaced the matched region with the string @var{replacement}.
-
-@item (list @var{E})
-Match @var{E}, collect all values produced by @var{E} (and its
-sub-expressions) into a list, and push that list to the stack.  Stack
+@findex substring (a PEG shorthand)
+@item (substring @var{e})
+Match @acronym{PEX} @var{e} and push the matched string onto the stack.
+
+@findex region (a PEG shorthand)
+@item (region @var{e})
+Match @var{e} and push the start and end positions of the matched
+region onto the stack.
+
+@findex replace (a PEG shorthand)
+@item (replace @var{e} @var{replacement})
+Match @var{e} and replaced the matched region with the string
+@var{replacement}.
+
+@findex list (a PEG shorthand)
+@item (list @var{e})
+Match @var{e}, collect all values produced by @var{e} (and its
+sub-expressions) into a list, and push that list onto the stack.  Stack
  values are typically returned as a flat list; this is a way of
  ``grouping'' values together.
  @end table
  
  @node Writing PEG Rules
  @section Writing PEG Rules
+@cindex PEG rules, pitfalls
+@cindex Parsing Expression Grammar, pitfalls in rules
  
  Something to be aware of when writing PEG rules is that they are
  greedy.  Rules which can consume a variable amount of text will always
@@ -319,9 +350,10 @@ backtracking.  For instance, this rule will never succeed:
  (forest (+ "tree" (* [blank])) "tree" (eol))
  @end example
  
-The @acronym{PEX} @code{(+ "tree" (* [blank]))} will consume all
-repetitions of the word ``tree'', leaving none to match the final
-@code{"tree"}.
+@noindent
+The @acronym{PEX} @w{@code{(+ "tree" (* [blank]))}} will consume all
+the repetitions of the word @samp{tree}, leaving none to match the final
+@samp{tree}.
  
  In these situations, the desired result can be obtained by using
  predicates and guards -- namely the @code{not}, @code{if} and
@@ -331,6 +363,7 @@ predicates and guards -- namely the @code{not}, @code{if} and
  (forest (+ "tree" (* [blank])) (not (eol)) "tree" (eol))
  @end example
  
+@noindent
  The @code{if} and @code{not} operators accept a parsing expression and
  interpret it as a boolean, without moving point.  The contents of a
  @code{guard} operator are evaluated as regular Lisp (not a
@@ -345,6 +378,7 @@ rule:
  (end-game "game" (eob))
  @end example
  
+@noindent
  when run in a buffer containing the text ``game over'' after point,
  will move point to just after ``game'' then halt parsing, returning
  @code{nil}.  Successful parsing will always return @code{t}, or the
diff --git a/etc/NEWS b/etc/NEWS

index 2ba1268ddad3ebb4917d2f3d923017edd6aeaa4f..d168c63894b81f292951a202e649f158b6b200b1 100644 (file)
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1697,8 +1697,8 @@ preventing the installation of Compat if unnecessary.
  
  +++
  ** New package PEG.
-Emacs now includes a library for writing (P)arsing (E)xpression
-(G)rammars, an approach to text parsing that provides more structure
+Emacs now includes a library for writing Parsing Expression
+Grammars (PEG), an approach to text parsing that provides more structure
  than regular expressions, but less complexity than context-free
  grammars.  The Info manual "(elisp) Parsing Expression Grammars" has
  documentation and examples.
diff --git a/test/lisp/peg-tests.el b/test/lisp/peg-tests.el

deleted file mode 100644 (file)

index 864e09b..0000000
--- a/test/lisp/peg-tests.el
+++ /dev/null
@@ -1,367 +0,0 @@
-;;; peg-tests.el --- Tests of PEG parsers            -*- lexical-binding: t; -*-
-
-;; Copyright (C) 2008-2023  Free Software Foundation, Inc.
-
-;; This program is free software; you can redistribute it and/or modify
-;; it under the terms of the GNU General Public License as published by
-;; the Free Software Foundation, either version 3 of the License, or
-;; (at your option) any later version.
-
-;; This program is distributed in the hope that it will be useful,
-;; but WITHOUT ANY WARRANTY; without even the implied warranty of
-;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-;; GNU General Public License for more details.
-
-;; You should have received a copy of the GNU General Public License
-;; along with this program.  If not, see <https://www.gnu.org/licenses/>.
-
-;;; Commentary:
-
-;; Tests and examples, that used to live in peg.el wrapped inside an `eval'.
-
-;;; Code:
-
-(require 'peg)
-(require 'ert)
-
-;;; Tests:
-
-(defmacro peg-parse-string (pex string &optional noerror)
-  "Parse STRING according to PEX.
-If NOERROR is non-nil, push nil resp. t if the parse failed
-resp. succeeded instead of signaling an error."
-  (let ((oldstyle (consp (car-safe pex)))) ;PEX is really a list of rules.
-    `(with-temp-buffer
-       (insert ,string)
-       (goto-char (point-min))
-       ,(if oldstyle
-            `(with-peg-rules ,pex
-               (peg-run (peg ,(caar pex))
-                        ,(unless noerror '#'peg-signal-failure)))
-          `(peg-run (peg ,pex)
-                    ,(unless noerror '#'peg-signal-failure))))))
-
-(define-peg-rule peg-test-natural ()
-  [0-9] (* [0-9]))
-
-(ert-deftest peg-test ()
-  (should (peg-parse-string peg-test-natural "99 bottles" t))
-  (should (peg-parse-string ((s "a")) "a" t))
-  (should (not (peg-parse-string ((s "a")) "b" t)))
-  (should (peg-parse-string ((s (not "a"))) "b" t))
-  (should (not (peg-parse-string ((s (not "a"))) "a" t)))
-  (should (peg-parse-string ((s (if "a"))) "a" t))
-  (should (not (peg-parse-string ((s (if "a"))) "b" t)))
-  (should (peg-parse-string ((s "ab")) "ab" t))
-  (should (not (peg-parse-string ((s "ab")) "ba" t)))
-  (should (not (peg-parse-string ((s "ab")) "a" t)))
-  (should (peg-parse-string ((s (range ?0 ?9))) "0" t))
-  (should (not (peg-parse-string ((s (range ?0 ?9))) "a" t)))
-  (should (peg-parse-string ((s [0-9])) "0" t))
-  (should (not (peg-parse-string ((s [0-9])) "a" t)))
-  (should (not (peg-parse-string ((s [0-9])) "" t)))
-  (should (peg-parse-string ((s (any))) "0" t))
-  (should (not (peg-parse-string ((s (any))) "" t)))
-  (should (peg-parse-string ((s (eob))) "" t))
-  (should (peg-parse-string ((s (not (eob)))) "a" t))
-  (should (peg-parse-string ((s (or "a" "b"))) "a" t))
-  (should (peg-parse-string ((s (or "a" "b"))) "b" t))
-  (should (not (peg-parse-string ((s (or "a" "b"))) "c" t)))
-  (should (peg-parse-string (and "a" "b") "ab" t))
-  (should (peg-parse-string ((s (and "a" "b"))) "abc" t))
-  (should (not (peg-parse-string (and "a" "b") "ba" t)))
-  (should (peg-parse-string ((s (and "a" "b" "c"))) "abc" t))
-  (should (peg-parse-string ((s (* "a") "b" (eob))) "b" t))
-  (should (peg-parse-string ((s (* "a") "b" (eob))) "ab" t))
-  (should (peg-parse-string ((s (* "a") "b" (eob))) "aaab" t))
-  (should (not (peg-parse-string ((s (* "a") "b" (eob))) "abc" t)))
-  (should (peg-parse-string ((s "")) "abc" t))
-  (should (peg-parse-string ((s "" (eob))) "" t))
-  (should (peg-parse-string ((s (opt "a") "b")) "abc" t))
-  (should (peg-parse-string ((s (opt "a") "b")) "bc" t))
-  (should (not (peg-parse-string ((s (or))) "ab" t)))
-  (should (peg-parse-string ((s (and))) "ab" t))
-  (should (peg-parse-string ((s (and))) "" t))
-  (should (peg-parse-string ((s ["^"])) "^" t))
-  (should (peg-parse-string ((s ["^a"])) "a" t))
-  (should (peg-parse-string ["-"] "-" t))
-  (should (peg-parse-string ((s ["]-"])) "]" t))
-  (should (peg-parse-string ((s ["^]"])) "^" t))
-  (should (peg-parse-string ((s [alpha])) "z" t))
-  (should (not (peg-parse-string ((s [alpha])) "0" t)))
-  (should (not (peg-parse-string ((s [alpha])) "" t)))
-  (should (not (peg-parse-string ((s ["][:alpha:]"])) "z" t)))
-  (should (peg-parse-string ((s (bob))) "" t))
-  (should (peg-parse-string ((s (bos))) "x" t))
-  (should (not (peg-parse-string ((s (bos))) " x" t)))
-  (should (peg-parse-string ((s "x" (eos))) "x" t))
-  (should (peg-parse-string ((s (syntax-class whitespace))) " " t))
-  (should (peg-parse-string ((s (= "foo"))) "foo" t))
-  (should (let ((f "foo")) (peg-parse-string ((s (= f))) "foo" t)))
-  (should (not (peg-parse-string ((s (= "foo"))) "xfoo" t)))
-  (should (equal (peg-parse-string ((s `(-- 1 2))) "") '(2 1)))
-  (should (equal (peg-parse-string ((s `(-- 1 2) `(a b -- a b))) "") '(2 1)))
-  (should (equal (peg-parse-string ((s (or (and (any) s)
-                                          (substring [0-9]))))
-                                  "ab0cd1ef2gh")
-                '("2")))
-  ;; The PEG rule `other' doesn't exist, which will cause a byte-compiler
-  ;; warning, but not an error at run time because the rule is not actually
-  ;; used in this particular case.
-  (should (equal (peg-parse-string ((s (substring (or "a" other)))
-                                    ;; Unused left-recursive rule, should
-                                    ;; cause a byte-compiler warning.
-                                    (r (* "a") r))
-                                   "af")
-                 '("a")))
-  (should (equal (peg-parse-string ((s (list x y))
-                                   (x `(-- 1))
-                                   (y `(-- 2)))
-                                  "")
-                '((1 2))))
-  (should (equal (peg-parse-string ((s (list (* x)))
-                                    (x "" `(-- 'x)))
-                                   "xxx")
-                 ;; The empty loop body should be matched once!
-                 '((x))))
-  (should (equal (peg-parse-string ((s (list (* x)))
-                                    (x "x" `(-- 'x)))
-                                   "xxx")
-                 '((x x x))))
-  (should (equal (peg-parse-string ((s (region (* x)))
-                                   (x "x" `(-- 'x)))
-                                  "xxx")
-                 ;; FIXME: Since string positions start at 0, this should
-                 ;; really be '(3 x x x 0) !!
-                '(4 x x x 1)))
-  (should (equal (peg-parse-string ((s (region (list (* x))))
-                                   (x "x" `(-- 'x 'y)))
-                                  "xxx")
-                '(4 (x y x y x y) 1)))
-  (should (equal (with-temp-buffer
-                  (save-excursion (insert "abcdef"))
-                  (list
-                   (peg-run (peg "a"
-                                 (replace "bc" "x")
-                                 (replace "de" "y")
-                                 "f"))
-                   (buffer-string)))
-                '(t "axyf")))
-  (with-temp-buffer
-    (insert "toro")
-    (goto-char (point-min))
-    (should (peg-run (peg "to")))
-    (should-not (peg-run (peg "to")))
-    (should (peg-run (peg "ro")))
-    (should (eobp)))
-  (with-temp-buffer
-    (insert "   ")
-    (goto-char (point-min))
-    (peg-run (peg (+ (syntax-class whitespace))))
-    (should (eobp)))
-  )
-
-;;; Examples:
-
-;; peg-ex-recognize-int recognizes integers.  An integer begins with a
-;; optional sign, then follows one or more digits.  Digits are all
-;; characters from 0 to 9.
-;;
-;; Notes:
-;; 1) "" matches the empty sequence, i.e. matches without consuming
-;;    input.
-;; 2) [0-9] is the character range from 0 to 9.  This can also be
-;;    written as (range ?0 ?9).  Note that 0-9 is a symbol.
-(defun peg-ex-recognize-int ()
-  (with-peg-rules ((number   sign digit (* digit))
-                  (sign     (or "+" "-" ""))
-                  (digit    [0-9]))
-    (peg-run (peg number))))
-
-;; peg-ex-parse-int recognizes integers and computes the corresponding
-;; value.  The grammar is the same as for `peg-ex-recognize-int'
-;; augmented with parsing actions.  Unfortunaletly, the actions add
-;; quite a bit of clutter.
-;;
-;; The actions for the sign rule push -1 on the stack for a minus sign
-;; and 1 for plus or no sign.
-;;
-;; The action for the digit rule pushes the value for a single digit.
-;;
-;; The action `(a b -- (+ (* a 10) b)), takes two items from the stack
-;; and pushes the first digit times 10 added to the second digit.
-;;
-;; The action `(sign val -- (* sign val)), multiplies val with the
-;; sign (1 or -1).
-(defun peg-ex-parse-int ()
-  (with-peg-rules ((number sign digit (* digit
-                                        `(a b -- (+ (* a 10) b)))
-                          `(sign val -- (* sign val)))
-                   (sign (or (and "+" `(-- 1))
-                            (and "-" `(-- -1))
-                            (and ""  `(-- 1))))
-                  (digit [0-9] `(-- (- (char-before) ?0))))
-    (peg-run (peg number))))
-
-;; Put point after the ) and press C-x C-e
-;; (peg-ex-parse-int)-234234
-
-;; Parse arithmetic expressions and compute the result as side effect.
-(defun peg-ex-arith ()
-  (peg-parse
-   (expr _ sum eol)
-   (sum product (* (or (and "+" _ product `(a b -- (+ a b)))
-                      (and "-" _ product `(a b -- (- a b))))))
-   (product value (* (or (and "*" _ value `(a b -- (* a b)))
-                        (and "/" _ value `(a b -- (/ a b))))))
-   (value (or (and (substring number) `(string -- (string-to-number string)))
-             (and "(" _ sum ")" _)))
-   (number (+ [0-9]) _)
-   (_ (* [" \t"]))
-   (eol (or "\n" "\r\n" "\r"))))
-
-;; (peg-ex-arith)   1 + 2 * 3 * (4 + 5)
-;; (peg-ex-arith)   1 + 2 ^ 3 * (4 + 5)  ; fails to parse
-
-;; Parse URI according to RFC 2396.
-(defun peg-ex-uri ()
-  (peg-parse
-   (URI-reference (or absoluteURI relativeURI)
-                 (or (and "#" (substring fragment))
-                     `(-- nil))
-                 `(scheme user host port path query fragment --
-                          (list :scheme scheme :user user
-                                :host host :port port
-                                :path path :query query
-                                :fragment fragment)))
-   (absoluteURI (substring scheme) ":" (or hier-part opaque-part))
-   (hier-part ;(-- user host port path query)
-    (or net-path
-       (and `(-- nil nil nil)
-            abs-path))
-    (or (and "?" (substring query))
-       `(-- nil)))
-   (net-path "//" authority (or abs-path `(-- nil)))
-   (abs-path "/" path-segments)
-   (path-segments segment (list (* "/" segment)) `(s l -- (cons s l)))
-   (segment (substring (* pchar) (* ";" param)))
-   (param (* pchar))
-   (pchar (or unreserved escaped [":@&=+$,"]))
-   (query (* uric))
-   (fragment (* uric))
-   (relativeURI (or net-path abs-path rel-path) (opt "?" query))
-   (rel-path rel-segment (opt abs-path))
-   (rel-segment (+ unreserved escaped [";@&=+$,"]))
-   (authority (or server reg-name))
-   (server (or (and (or (and (substring userinfo) "@")
-                       `(-- nil))
-                   hostport)
-              `(-- nil nil nil)))
-   (userinfo (* (or unreserved escaped [";:&=+$,"])))
-   (hostport (substring host) (or (and ":" (substring port))
-                                 `(-- nil)))
-   (host (or hostname ipv4address))
-   (hostname (* domainlabel ".") toplabel (opt "."))
-   (domainlabel alphanum
-               (opt (* (or alphanum "-") (if alphanum))
-                    alphanum))
-   (toplabel alpha
-            (* (or alphanum "-") (if alphanum))
-            alphanum)
-   (ipv4address (+ digit) "." (+ digit) "." (+ digit) "." (+ digit))
-   (port (* digit))
-   (scheme alpha (* (or alpha digit ["+-."])))
-   (reg-name (or unreserved escaped ["$,;:@&=+"]))
-   (opaque-part uric-no-slash (* uric))
-   (uric (or reserved unreserved escaped))
-   (uric-no-slash (or unreserved escaped [";?:@&=+$,"]))
-   (reserved (set ";/?:@&=+$,"))
-   (unreserved (or alphanum mark))
-   (escaped "%" hex hex)
-   (hex (or digit [A-F] [a-f]))
-   (mark (set "-_.!~*'()"))
-   (alphanum (or alpha digit))
-   (alpha (or lowalpha upalpha))
-   (lowalpha [a-z])
-   (upalpha [A-Z])
-   (digit [0-9])))
-
-;; (peg-ex-uri)http://luser@www.foo.com:8080/bar/baz.html?x=1#foo
-;; (peg-ex-uri)file:/bar/baz.html?foo=df#x
-
-;; Split STRING where SEPARATOR occurs.
-(defun peg-ex-split (string separator)
-  (peg-parse-string ((s (list (* (* sep) elt)))
-                    (elt (substring (+ (not sep) (any))))
-                    (sep (= separator)))
-                   string))
-
-;; (peg-ex-split "-abc-cd-" "-")
-
-;; Parse a lisp style Sexp.
-;; [To keep the example short, ' and . are handled as ordinary symbol.]
-(defun peg-ex-lisp ()
-  (peg-parse
-   (sexp _ (or string list number symbol))
-   (_ (* (or [" \n\t"] comment)))
-   (comment ";" (* (not (or "\n" (eob))) (any)))
-   (string "\"" (substring  (* (not "\"") (any))) "\"")
-   (number (substring (opt (set "+-")) (+ digit))
-          (if terminating)
-          `(string -- (string-to-number string)))
-   (symbol (substring (and symchar (* (not terminating) symchar)))
-          `(s -- (intern s)))
-   (symchar [a-z A-Z 0-9 "-;!#%&'*+,./:;<=>?@[]^_`{|}~"])
-   (list "("           `(-- (cons nil nil)) `(hd -- hd hd)
-        (* sexp        `(tl e -- (setcdr tl (list e))))
-        _ ")"  `(hd _tl -- (cdr hd)))
-   (digit [0-9])
-   (terminating (or (set " \n\t();\"'") (eob)))))
-
-;; (peg-ex-lisp)
-
-;; We try to detect left recursion and report it as error.
-(defun peg-ex-left-recursion ()
-  (eval '(peg-parse (exp (or term
-                            (and exp "+" exp)))
-                   (term (or digit
-                             (and term "*" term)))
-                   (digit [0-9]))
-        t))
-
-(defun peg-ex-infinite-loop ()
-  (eval '(peg-parse (exp (* (or "x"
-                               "y"
-                               (action (foo))))))
-        t))
-
-;; Some efficiency problems:
-
-;; Find the last digit in a string.
-;; Recursive definition with excessive stack usage.
-(defun peg-ex-last-digit (string)
-  (peg-parse-string ((s (or (and (any) s)
-                           (substring [0-9]))))
-                   string))
-
-;; (peg-ex-last-digit "ab0cd1ef2gh")
-;; (peg-ex-last-digit (make-string 50 ?-))
-;; (peg-ex-last-digit (make-string 1000 ?-))
-
-;; Find the last digit without recursion.  Doesn't run out of stack,
-;; but probably still too inefficient for large inputs.
-(defun peg-ex-last-digit2 (string)
-  (peg-parse-string ((s `(-- nil)
-                       (+ (* (not digit) (any))
-                          (substring digit)
-                          `(_d1 d2 -- d2)))
-                    (digit [0-9]))
-                   string))
-
-;; (peg-ex-last-digit2 "ab0cd1ef2gh")
-;; (peg-ex-last-digit2 (concat (make-string 500000 ?-) "8a9b"))
-;; (peg-ex-last-digit2 (make-string 500000 ?-))
-;; (peg-ex-last-digit2 (make-string 500000 ?5))
-
-(provide 'peg-tests)
-;;; peg-tests.el ends here
diff --git a/test/lisp/progmodes/peg-tests.el b/test/lisp/progmodes/peg-tests.el

new file mode 100644 (file)

index 0000000..864e09b
--- /dev/null
+++ b/test/lisp/progmodes/peg-tests.el
@@ -0,0 +1,367 @@
+;;; peg-tests.el --- Tests of PEG parsers            -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2008-2023  Free Software Foundation, Inc.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;; Tests and examples, that used to live in peg.el wrapped inside an `eval'.
+
+;;; Code:
+
+(require 'peg)
+(require 'ert)
+
+;;; Tests:
+
+(defmacro peg-parse-string (pex string &optional noerror)
+  "Parse STRING according to PEX.
+If NOERROR is non-nil, push nil resp. t if the parse failed
+resp. succeeded instead of signaling an error."
+  (let ((oldstyle (consp (car-safe pex)))) ;PEX is really a list of rules.
+    `(with-temp-buffer
+       (insert ,string)
+       (goto-char (point-min))
+       ,(if oldstyle
+            `(with-peg-rules ,pex
+               (peg-run (peg ,(caar pex))
+                        ,(unless noerror '#'peg-signal-failure)))
+          `(peg-run (peg ,pex)
+                    ,(unless noerror '#'peg-signal-failure))))))
+
+(define-peg-rule peg-test-natural ()
+  [0-9] (* [0-9]))
+
+(ert-deftest peg-test ()
+  (should (peg-parse-string peg-test-natural "99 bottles" t))
+  (should (peg-parse-string ((s "a")) "a" t))
+  (should (not (peg-parse-string ((s "a")) "b" t)))
+  (should (peg-parse-string ((s (not "a"))) "b" t))
+  (should (not (peg-parse-string ((s (not "a"))) "a" t)))
+  (should (peg-parse-string ((s (if "a"))) "a" t))
+  (should (not (peg-parse-string ((s (if "a"))) "b" t)))
+  (should (peg-parse-string ((s "ab")) "ab" t))
+  (should (not (peg-parse-string ((s "ab")) "ba" t)))
+  (should (not (peg-parse-string ((s "ab")) "a" t)))
+  (should (peg-parse-string ((s (range ?0 ?9))) "0" t))
+  (should (not (peg-parse-string ((s (range ?0 ?9))) "a" t)))
+  (should (peg-parse-string ((s [0-9])) "0" t))
+  (should (not (peg-parse-string ((s [0-9])) "a" t)))
+  (should (not (peg-parse-string ((s [0-9])) "" t)))
+  (should (peg-parse-string ((s (any))) "0" t))
+  (should (not (peg-parse-string ((s (any))) "" t)))
+  (should (peg-parse-string ((s (eob))) "" t))
+  (should (peg-parse-string ((s (not (eob)))) "a" t))
+  (should (peg-parse-string ((s (or "a" "b"))) "a" t))
+  (should (peg-parse-string ((s (or "a" "b"))) "b" t))
+  (should (not (peg-parse-string ((s (or "a" "b"))) "c" t)))
+  (should (peg-parse-string (and "a" "b") "ab" t))
+  (should (peg-parse-string ((s (and "a" "b"))) "abc" t))
+  (should (not (peg-parse-string (and "a" "b") "ba" t)))
+  (should (peg-parse-string ((s (and "a" "b" "c"))) "abc" t))
+  (should (peg-parse-string ((s (* "a") "b" (eob))) "b" t))
+  (should (peg-parse-string ((s (* "a") "b" (eob))) "ab" t))
+  (should (peg-parse-string ((s (* "a") "b" (eob))) "aaab" t))
+  (should (not (peg-parse-string ((s (* "a") "b" (eob))) "abc" t)))
+  (should (peg-parse-string ((s "")) "abc" t))
+  (should (peg-parse-string ((s "" (eob))) "" t))
+  (should (peg-parse-string ((s (opt "a") "b")) "abc" t))
+  (should (peg-parse-string ((s (opt "a") "b")) "bc" t))
+  (should (not (peg-parse-string ((s (or))) "ab" t)))
+  (should (peg-parse-string ((s (and))) "ab" t))
+  (should (peg-parse-string ((s (and))) "" t))
+  (should (peg-parse-string ((s ["^"])) "^" t))
+  (should (peg-parse-string ((s ["^a"])) "a" t))
+  (should (peg-parse-string ["-"] "-" t))
+  (should (peg-parse-string ((s ["]-"])) "]" t))
+  (should (peg-parse-string ((s ["^]"])) "^" t))
+  (should (peg-parse-string ((s [alpha])) "z" t))
+  (should (not (peg-parse-string ((s [alpha])) "0" t)))
+  (should (not (peg-parse-string ((s [alpha])) "" t)))
+  (should (not (peg-parse-string ((s ["][:alpha:]"])) "z" t)))
+  (should (peg-parse-string ((s (bob))) "" t))
+  (should (peg-parse-string ((s (bos))) "x" t))
+  (should (not (peg-parse-string ((s (bos))) " x" t)))
+  (should (peg-parse-string ((s "x" (eos))) "x" t))
+  (should (peg-parse-string ((s (syntax-class whitespace))) " " t))
+  (should (peg-parse-string ((s (= "foo"))) "foo" t))
+  (should (let ((f "foo")) (peg-parse-string ((s (= f))) "foo" t)))
+  (should (not (peg-parse-string ((s (= "foo"))) "xfoo" t)))
+  (should (equal (peg-parse-string ((s `(-- 1 2))) "") '(2 1)))
+  (should (equal (peg-parse-string ((s `(-- 1 2) `(a b -- a b))) "") '(2 1)))
+  (should (equal (peg-parse-string ((s (or (and (any) s)
+                                          (substring [0-9]))))
+                                  "ab0cd1ef2gh")
+                '("2")))
+  ;; The PEG rule `other' doesn't exist, which will cause a byte-compiler
+  ;; warning, but not an error at run time because the rule is not actually
+  ;; used in this particular case.
+  (should (equal (peg-parse-string ((s (substring (or "a" other)))
+                                    ;; Unused left-recursive rule, should
+                                    ;; cause a byte-compiler warning.
+                                    (r (* "a") r))
+                                   "af")
+                 '("a")))
+  (should (equal (peg-parse-string ((s (list x y))
+                                   (x `(-- 1))
+                                   (y `(-- 2)))
+                                  "")
+                '((1 2))))
+  (should (equal (peg-parse-string ((s (list (* x)))
+                                    (x "" `(-- 'x)))
+                                   "xxx")
+                 ;; The empty loop body should be matched once!
+                 '((x))))
+  (should (equal (peg-parse-string ((s (list (* x)))
+                                    (x "x" `(-- 'x)))
+                                   "xxx")
+                 '((x x x))))
+  (should (equal (peg-parse-string ((s (region (* x)))
+                                   (x "x" `(-- 'x)))
+                                  "xxx")
+                 ;; FIXME: Since string positions start at 0, this should
+                 ;; really be '(3 x x x 0) !!
+                '(4 x x x 1)))
+  (should (equal (peg-parse-string ((s (region (list (* x))))
+                                   (x "x" `(-- 'x 'y)))
+                                  "xxx")
+                '(4 (x y x y x y) 1)))
+  (should (equal (with-temp-buffer
+                  (save-excursion (insert "abcdef"))
+                  (list
+                   (peg-run (peg "a"
+                                 (replace "bc" "x")
+                                 (replace "de" "y")
+                                 "f"))
+                   (buffer-string)))
+                '(t "axyf")))
+  (with-temp-buffer
+    (insert "toro")
+    (goto-char (point-min))
+    (should (peg-run (peg "to")))
+    (should-not (peg-run (peg "to")))
+    (should (peg-run (peg "ro")))
+    (should (eobp)))
+  (with-temp-buffer
+    (insert "   ")
+    (goto-char (point-min))
+    (peg-run (peg (+ (syntax-class whitespace))))
+    (should (eobp)))
+  )
+
+;;; Examples:
+
+;; peg-ex-recognize-int recognizes integers.  An integer begins with a
+;; optional sign, then follows one or more digits.  Digits are all
+;; characters from 0 to 9.
+;;
+;; Notes:
+;; 1) "" matches the empty sequence, i.e. matches without consuming
+;;    input.
+;; 2) [0-9] is the character range from 0 to 9.  This can also be
+;;    written as (range ?0 ?9).  Note that 0-9 is a symbol.
+(defun peg-ex-recognize-int ()
+  (with-peg-rules ((number   sign digit (* digit))
+                  (sign     (or "+" "-" ""))
+                  (digit    [0-9]))
+    (peg-run (peg number))))
+
+;; peg-ex-parse-int recognizes integers and computes the corresponding
+;; value.  The grammar is the same as for `peg-ex-recognize-int'
+;; augmented with parsing actions.  Unfortunaletly, the actions add
+;; quite a bit of clutter.
+;;
+;; The actions for the sign rule push -1 on the stack for a minus sign
+;; and 1 for plus or no sign.
+;;
+;; The action for the digit rule pushes the value for a single digit.
+;;
+;; The action `(a b -- (+ (* a 10) b)), takes two items from the stack
+;; and pushes the first digit times 10 added to the second digit.
+;;
+;; The action `(sign val -- (* sign val)), multiplies val with the
+;; sign (1 or -1).
+(defun peg-ex-parse-int ()
+  (with-peg-rules ((number sign digit (* digit
+                                        `(a b -- (+ (* a 10) b)))
+                          `(sign val -- (* sign val)))
+                   (sign (or (and "+" `(-- 1))
+                            (and "-" `(-- -1))
+                            (and ""  `(-- 1))))
+                  (digit [0-9] `(-- (- (char-before) ?0))))
+    (peg-run (peg number))))
+
+;; Put point after the ) and press C-x C-e
+;; (peg-ex-parse-int)-234234
+
+;; Parse arithmetic expressions and compute the result as side effect.
+(defun peg-ex-arith ()
+  (peg-parse
+   (expr _ sum eol)
+   (sum product (* (or (and "+" _ product `(a b -- (+ a b)))
+                      (and "-" _ product `(a b -- (- a b))))))
+   (product value (* (or (and "*" _ value `(a b -- (* a b)))
+                        (and "/" _ value `(a b -- (/ a b))))))
+   (value (or (and (substring number) `(string -- (string-to-number string)))
+             (and "(" _ sum ")" _)))
+   (number (+ [0-9]) _)
+   (_ (* [" \t"]))
+   (eol (or "\n" "\r\n" "\r"))))
+
+;; (peg-ex-arith)   1 + 2 * 3 * (4 + 5)
+;; (peg-ex-arith)   1 + 2 ^ 3 * (4 + 5)  ; fails to parse
+
+;; Parse URI according to RFC 2396.
+(defun peg-ex-uri ()
+  (peg-parse
+   (URI-reference (or absoluteURI relativeURI)
+                 (or (and "#" (substring fragment))
+                     `(-- nil))
+                 `(scheme user host port path query fragment --
+                          (list :scheme scheme :user user
+                                :host host :port port
+                                :path path :query query
+                                :fragment fragment)))
+   (absoluteURI (substring scheme) ":" (or hier-part opaque-part))
+   (hier-part ;(-- user host port path query)
+    (or net-path
+       (and `(-- nil nil nil)
+            abs-path))
+    (or (and "?" (substring query))
+       `(-- nil)))
+   (net-path "//" authority (or abs-path `(-- nil)))
+   (abs-path "/" path-segments)
+   (path-segments segment (list (* "/" segment)) `(s l -- (cons s l)))
+   (segment (substring (* pchar) (* ";" param)))
+   (param (* pchar))
+   (pchar (or unreserved escaped [":@&=+$,"]))
+   (query (* uric))
+   (fragment (* uric))
+   (relativeURI (or net-path abs-path rel-path) (opt "?" query))
+   (rel-path rel-segment (opt abs-path))
+   (rel-segment (+ unreserved escaped [";@&=+$,"]))
+   (authority (or server reg-name))
+   (server (or (and (or (and (substring userinfo) "@")
+                       `(-- nil))
+                   hostport)
+              `(-- nil nil nil)))
+   (userinfo (* (or unreserved escaped [";:&=+$,"])))
+   (hostport (substring host) (or (and ":" (substring port))
+                                 `(-- nil)))
+   (host (or hostname ipv4address))
+   (hostname (* domainlabel ".") toplabel (opt "."))
+   (domainlabel alphanum
+               (opt (* (or alphanum "-") (if alphanum))
+                    alphanum))
+   (toplabel alpha
+            (* (or alphanum "-") (if alphanum))
+            alphanum)
+   (ipv4address (+ digit) "." (+ digit) "." (+ digit) "." (+ digit))
+   (port (* digit))
+   (scheme alpha (* (or alpha digit ["+-."])))
+   (reg-name (or unreserved escaped ["$,;:@&=+"]))
+   (opaque-part uric-no-slash (* uric))
+   (uric (or reserved unreserved escaped))
+   (uric-no-slash (or unreserved escaped [";?:@&=+$,"]))
+   (reserved (set ";/?:@&=+$,"))
+   (unreserved (or alphanum mark))
+   (escaped "%" hex hex)
+   (hex (or digit [A-F] [a-f]))
+   (mark (set "-_.!~*'()"))
+   (alphanum (or alpha digit))
+   (alpha (or lowalpha upalpha))
+   (lowalpha [a-z])
+   (upalpha [A-Z])
+   (digit [0-9])))
+
+;; (peg-ex-uri)http://luser@www.foo.com:8080/bar/baz.html?x=1#foo
+;; (peg-ex-uri)file:/bar/baz.html?foo=df#x
+
+;; Split STRING where SEPARATOR occurs.
+(defun peg-ex-split (string separator)
+  (peg-parse-string ((s (list (* (* sep) elt)))
+                    (elt (substring (+ (not sep) (any))))
+                    (sep (= separator)))
+                   string))
+
+;; (peg-ex-split "-abc-cd-" "-")
+
+;; Parse a lisp style Sexp.
+;; [To keep the example short, ' and . are handled as ordinary symbol.]
+(defun peg-ex-lisp ()
+  (peg-parse
+   (sexp _ (or string list number symbol))
+   (_ (* (or [" \n\t"] comment)))
+   (comment ";" (* (not (or "\n" (eob))) (any)))
+   (string "\"" (substring  (* (not "\"") (any))) "\"")
+   (number (substring (opt (set "+-")) (+ digit))
+          (if terminating)
+          `(string -- (string-to-number string)))
+   (symbol (substring (and symchar (* (not terminating) symchar)))
+          `(s -- (intern s)))
+   (symchar [a-z A-Z 0-9 "-;!#%&'*+,./:;<=>?@[]^_`{|}~"])
+   (list "("           `(-- (cons nil nil)) `(hd -- hd hd)
+        (* sexp        `(tl e -- (setcdr tl (list e))))
+        _ ")"  `(hd _tl -- (cdr hd)))
+   (digit [0-9])
+   (terminating (or (set " \n\t();\"'") (eob)))))
+
+;; (peg-ex-lisp)
+
+;; We try to detect left recursion and report it as error.
+(defun peg-ex-left-recursion ()
+  (eval '(peg-parse (exp (or term
+                            (and exp "+" exp)))
+                   (term (or digit
+                             (and term "*" term)))
+                   (digit [0-9]))
+        t))
+
+(defun peg-ex-infinite-loop ()
+  (eval '(peg-parse (exp (* (or "x"
+                               "y"
+                               (action (foo))))))
+        t))
+
+;; Some efficiency problems:
+
+;; Find the last digit in a string.
+;; Recursive definition with excessive stack usage.
+(defun peg-ex-last-digit (string)
+  (peg-parse-string ((s (or (and (any) s)
+                           (substring [0-9]))))
+                   string))
+
+;; (peg-ex-last-digit "ab0cd1ef2gh")
+;; (peg-ex-last-digit (make-string 50 ?-))
+;; (peg-ex-last-digit (make-string 1000 ?-))
+
+;; Find the last digit without recursion.  Doesn't run out of stack,
+;; but probably still too inefficient for large inputs.
+(defun peg-ex-last-digit2 (string)
+  (peg-parse-string ((s `(-- nil)
+                       (+ (* (not digit) (any))
+                          (substring digit)
+                          `(_d1 d2 -- d2)))
+                    (digit [0-9]))
+                   string))
+
+;; (peg-ex-last-digit2 "ab0cd1ef2gh")
+;; (peg-ex-last-digit2 (concat (make-string 500000 ?-) "8a9b"))
+;; (peg-ex-last-digit2 (make-string 500000 ?-))
+;; (peg-ex-last-digit2 (make-string 500000 ?5))
+
+(provide 'peg-tests)
+;;; peg-tests.el ends here
author	Eli Zaretskii <eliz@gnu.org>
	Sun, 31 Mar 2024 07:29:34 +0000 (10:29 +0300)
committer	Eshel Yaron <me@eshelyaron.com>
	Sun, 31 Mar 2024 18:05:43 +0000 (20:05 +0200)
doc/lispref/peg.texi		patch \| blob \| history
etc/NEWS		patch \| blob \| history
test/lisp/peg-tests.el	[deleted file]	patch \| blob \| history
test/lisp/progmodes/peg-tests.el	[new file with mode: 0644]	patch \| blob