It's often useful to be able to identify and find certain @dfn{things} in
a buffer, like function and class definitions, statements, code blocks,
-strings, comments, etc. Emacs allows users to define what kind of
-tree-sitter node corresponds to a ``thing''. This enables handy
-features like jumping to the next function, marking the code block at
-point, or transposing two function arguments.
+strings, comments, etc., in terms of node types defined by the
+tree-sitter grammar used in the buffer. Emacs allows Lisp programs to
+define what kinds of tree-sitter nodes correspond to each ``thing''.
+This enables handy features like jumping to the next function, marking
+the code block at point, transposing two function arguments, etc.
The ``things'' feature in Emacs is independent of the pattern matching
-feature of tree-sitter, and comparatively less powerful, but more
-suitable for navigation and traversing the parse tree.
+feature of tree-sitter (@pxref{Pattern Matching}), and comparatively
+less powerful, but more suitable for navigation and traversing the
+buffer text in terms of the tree-sitter parse tree.
@findex treesit-thing-definition
@findex treesit-thing-defined-p
test if a thing is defined with @code{treesit-thing-defined-p}.
@defvar treesit-thing-settings
-This is an alist of thing definitions for each language. The key of
-each entry is a language symbol, and the value is a list of thing
-definitions of the form @w{@code{(@var{thing} @var{pred})}}, where
-@var{thing} is a symbol representing the thing, like @code{defun},
-@code{sexp}, or @code{sentence}; and @var{pred} specifies what kind of
-tree-sitter node is this @var{thing}.
+This is an alist of thing definitions for each language supported by the
+grammar used in a buffer; it should be defined by the buffer's major
+mode (the default value is @code{nil}). The key of each entry is a
+language symbol (e.g., @code{c} for C, @code{cpp} for C@t{++}, etc.),
+and the value is a list of thing definitions of the form
+@w{@code{(@var{thing} @var{pred})}}, where @var{thing} is a symbol
+representing the thing, and @var{pred} specifies what kinds of
+tree-sitter nodes qualify as this @var{thing}.
+
+@cindex @code{sexp}, treesit-defined thing
+@cindex @code{list}, treesit-defined thing
+The symbol used to define the @var{thing} can be anything meaningful for
+the major mode: @code{defun}, @code{defclass}, @code{sentence},
+@code{comment}, @code{string}, etc. To support tree-sitter based
+navigation commands (@pxref{List Motion}), the mode should define two
+things: @code{list} and @code{sexp}.
@var{pred} can be a regexp string that matches the type of the node; it
can be a function that takes a node as the argument and returns a
Finally, @var{pred} can refer to other @var{thing}s defined in this
list. For example, @w{@code{(or sexp sentence)}} defines something
that's either a @code{sexp} thing or a @code{sentence} thing, as defined
-by some other rule in the alist.
+by some other rules in the alist.
+@cindex @code{named}, treesit-defined thing
+@cindex @code{anonymous}, treesit-defined thing
There are two pre-defined predicates: @code{named} and @code{anonymous},
-which qualify, respectively, named and anonymous nodes. They can be
-combined with @code{and} to narrow down the match.
+which qualify, respectively, named and anonymous nodes of the
+tree-sitter grammar. They can be combined with @code{and} to narrow
+down the match.
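+
+As a minimal sketch of such a combination (the language symbol
+@code{mylang}, the thing name @code{symbol}, and the node type
+@code{"identifier"} are hypothetical, chosen only for illustration), a
+definition that restricts a thing to named nodes might look like this:
+
+@example
+@group
+;; Hypothetical names: `mylang', `symbol' and "identifier" are
+;; illustrative only and do not refer to a real grammar.
+(setq-local treesit-thing-settings
+            '((mylang
+               ;; Qualify only named nodes whose type matches
+               ;; the regexp "identifier".
+               (symbol (and named "identifier")))))
+@end group
+@end example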
-Here's an example @code{treesit-thing-settings} for C and C++:
+Here's an example @code{treesit-thing-settings} for C and C@t{++}:
@example
@group
(comment "comment")
(string "raw_string_literal")
(text (or comment string)))
+@end group
+@group
(cpp
(defun ("function_definition" . cpp-ts-mode-defun-valid-p))
(defclass "class_specifier")
@noindent
Note that this example is modified for didactic purposes, and isn't
-exactly how C and C@t{++} modes define things.
+exactly how tree-sitter based C and C@t{++} modes define things.
@end defvar
-Emacs builtin functions already make use some thing definitions.
+Emacs builtin functions already make use of some thing definitions.
Command @code{treesit-forward-sexp} uses the @code{sexp} definition if
-major mode defines it; @code{treesit-forward-list},
+major mode defines it (@pxref{List Motion}); @code{treesit-forward-list},
@code{treesit-down-list}, @code{treesit-up-list},
@code{treesit-show-paren-data} use the @code{list} definition (its
symbol @code{list} has the symbol property @code{treesit-thing-symbol}
Defun movement functions like @code{treesit-end-of-defun} uses the
@code{defun} definition (@code{defun} definition is overridden by
@var{treesit-defun-type-regexp} for backward compatibility). Major
-modes can also define @code{comment}, @code{string}, @code{text}
-(generally comments and strings).
+modes can also define @code{comment}, @code{string}, and @code{text}
+things (to match comments and strings).
The rest of this section lists a few functions that take advantage of
the thing definitions. Besides the functions below, some other
@code{treesit-induce-sparse-tree}, etc. @xref{Retrieving Nodes}.
@defun treesit-node-match-p node thing &optional ignore-missing
-This function checks whether @var{node} is a @var{thing}.
+This function checks whether @var{node} represents a @var{thing}.
-If @var{node} is a @var{thing}, return non-@code{nil}, otherwise return
-@code{nil}. For convenience, if @code{node} is @code{nil}, this
+If @var{node} represents @var{thing}, return non-@code{nil}, otherwise
+return @code{nil}. For convenience, if @code{node} is @code{nil}, this
function just returns @code{nil}.
The @var{thing} can be either a thing symbol like @code{defun}, or
@end defun
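+
+As a hedged sketch of how this function might be called from Lisp (the
+helper name @code{my-node-defun-p} is hypothetical, and the code
+assumes that a non-@code{nil} @var{ignore-missing} makes the function
+return @code{nil} instead of signaling an error when the major mode
+doesn't define a @code{defun} thing):
+
+@example
+@group
+;; Hypothetical helper, not part of Emacs.
+(defun my-node-defun-p (node)
+  "Return non-nil if NODE qualifies as a `defun' thing."
+  ;; The third argument is IGNORE-MISSING (assumed semantics:
+  ;; don't signal an error if `defun' is not defined).
+  (treesit-node-match-p node 'defun t))
+@end group
+@end example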
@defun treesit-thing-prev position thing
-This function returns the first node before @var{position} that is the
-specified @var{thing}. If no such node exists, it returns @code{nil}.
+This function returns the first node before @var{position} in the
+current buffer that is the specified @var{thing}. If no such node
+exists, it returns @code{nil}.
It's guaranteed that, if a node is returned, the node's end position is
less or equal to @var{position}. In other words, this function never
returns a node that encloses @var{position}.
A positive @var{arg} means moving forward that many instances of
@var{thing}; negative @var{arg} means moving backward. If @var{side} is
-@code{beg}, this function stops at the beginning of @var{thing}; if
-@code{end}, stop at the end of @var{thing}.
+@code{beg}, this function returns the position of the beginning of
+@var{thing}; if it's @code{end}, it returns the position at the end of
+@var{thing}.
Like in @code{treesit-thing-prev}, @var{thing} can be a thing symbol
defined in @code{treesit-thing-settings}, or a predicate.
@var{position}.
If @var{strict} is non-@code{nil}, this function uses strict comparison,
-i.e., start position must be strictly greater than @var{position}, and end
-position must be strictly less than @var{position}.
+i.e., start position must be strictly smaller than @var{position}, and end
+position must be strictly greater than @var{position}.
@var{thing} can be either a thing symbol defined in
@code{treesit-thing-settings}, or a predicate.
The `list' type uses the `list' thing defined in `treesit-thing-settings'.
See `treesit-thing-at-point'. With this type commands use syntax tables to
-navigate symbols and treesit definition to navigate lists.
+navigate symbols and treesit definitions to navigate lists.
The `sexp' type uses the `sexp' thing defined in `treesit-thing-settings'.
-With this type commands use only the treesit definition of parser nodes,
-without distinction between symbols and lists."
+With this type commands use only the treesit definitions of parser nodes,
+without distinction between symbols and lists. Since tree-sitter grammars
+could group node types in arbitrary ways, navigation by `sexp' might not
+match your expectations, and might produce different results in different
+treesit-based modes."
(interactive "p")
(if (not (treesit-thing-defined-p 'list (treesit-language-at (point))))
(user-error "No `list' thing is defined in `treesit-thing-settings'")
(treesit--thing-sibling pos thing nil))
(defun treesit-thing-at (pos thing &optional strict)
- "Return the smallest THING enclosing POS.
+ "Return the smallest node enclosing POS for THING.
-The returned node, if non-nil, must enclose POS, i.e., its start
-<= POS, its end > POS. If STRICT is non-nil, the returned node's
-start must < POS rather than <= POS.
+The returned node, if non-nil, must enclose POS, i.e., its
+start <= POS, its end > POS. If STRICT is non-nil, the returned
+node's start must be < POS rather than <= POS.
-THING should be a thing defined in `treesit-thing-settings', or
-it can be a predicate described in `treesit-thing-settings'."
+THING should be a thing defined in `treesit-thing-settings' for
+the current buffer's major mode, or it can be a predicate
+described in `treesit-thing-settings'."
(let* ((cursor (treesit-node-at pos))
(iter-pred (lambda (node)
(and (treesit-node-match-p node thing t)
(if (eq counter 0) pos nil)))
(defun treesit-thing-at-point (thing tactic)
- "Return the THING at point, or nil if none is found.
+ "Return the node for THING at point, or nil if no THING is found at point.
THING can be a symbol, a regexp, a predicate function, and more;
-see `treesit-thing-settings' for details.
+for details, see `treesit-thing-settings' as defined by the
+current buffer's major mode.
-Return the top-level THING if TACTIC is `top-level'; return the
-smallest enclosing THING as POS if TACTIC is `nested'."
+Return the top-level node for THING if TACTIC is `top-level'; return
+the smallest node for THING that encloses point if TACTIC is `nested'."
(let ((node (treesit-thing-at (point) thing)))
(if (eq tactic 'top-level)
doc:
/* A list defining things.
-The value should be an alist of (LANGUAGE . DEFINITIONS), where
-LANGUAGE is a language symbol, and DEFINITIONS is a list of
+The value should be defined by the major mode, and should be an alist
+of the form (LANGUAGE . DEFINITIONS), where LANGUAGE is a language
+symbol and DEFINITIONS is a list whose elements are of the form
(THING PRED)
-THING is a symbol representing the thing, like `defun', `sexp', or
-`sentence'; PRED defines what kind of node can be qualified as THING.
+THING is a symbol representing the thing, like `defun', `defclass',
+`sexp', `sentence', `comment', or any other symbol that is meaningful
+for the major mode; PRED defines what kind of node can be qualified
+as THING.
PRED can be a regexp string that matches the type of the node; it can
be a predicate function that takes the node as the sole argument and
cons (REGEXP . FN), which is a combination of a regexp and a predicate
function, and the node has to match both to qualify as the thing.
-PRED can also be recursively defined. It can be (or PRED...), meaning
-satisfying anyone of the inner PREDs qualifies the node; or (and
-PRED...) meaning satisfying all of the inner PREDs qualifies the node;
-or (not PRED), meaning not satisfying the inner PRED qualifies the node.
+PRED can also be recursively defined. It can be:
-There are two pre-defined predicates, `named' and `anonymous`. They
+ (or PRED...), meaning satisfying any of the inner PREDs qualifies the node;
+ (and PRED...), meaning satisfying all of the inner PREDs qualifies the node;
+ (not PRED), meaning not satisfying the inner PRED qualifies the node.
+
+There are two pre-defined predicates, `named' and `anonymous'. They
match named nodes and anonymous nodes, respectively.
Finally, PRED can refer to other THINGs defined in this list by using