schemes. The tree-sitter counterpart of @var{font-lock-keywords} is
@var{treesit-font-lock-settings}.
-@c FIXME: The ``query'' part here and thereafter comes ``out of the
-@c blue''. There should be some text here explaining what those
-@c ``queries'' are and how are they related to fontifications, or a
-@c cross-reference to another place with such an explanation.
+In general, tree-sitter fontification works as follows: a Lisp
+program provides a @dfn{query} consisting of @dfn{patterns} with
+@dfn{capture names}. Tree-sitter finds the nodes in the parse tree
+that match these patterns, tags the corresponding capture names onto
+the nodes, and returns them to the Lisp program. The Lisp program
+takes these nodes and highlights the corresponding buffer text of
+each node depending on the tagged capture name of the node. For
+example, a node tagged @code{font-lock-keyword} would simply be
+highlighted in @code{font-lock-keyword} face. For more information on
+queries, patterns and capture names, see @ref{Pattern Matching}.
+
@defun treesit-font-lock-rules :keyword value query...
This function is used to set @var{treesit-font-lock-settings}. It
takes care of compiling queries and other post-processing, and outputs
@item @tab @code{keep} @tab Fill-in regions without an existing face
@end multitable
-@c FIXME: The ``capture names'' part should be expl,ained before it is
-@c first used: what it is and how it's related to fontifications.
-Capture names in @var{query} should be face names like
+Lisp programs mark patterns in the query with capture names (names
+that start with @code{@@}), and tree-sitter will return matched nodes
+with capture names tagged onto them. For the purpose of
+fontification, capture names in @var{query} should be face names like
@code{font-lock-keyword-face}. The captured node will be fontified
with that face. Capture names can also be function names, in which
case the function is called with 3 arguments: @var{start}, @var{end},
@code{font-lock-maximum-decoration} controls which levels are
activated.
-@c FIXME: This should be rewritten using our style: ``each element of
-@c the list is a list of the form (FOO BAR BAZ), where FOO...'' etc.
-Inside each sublist are feature symbols, which correspond to the
+Each element of the list is a list of the form @w{@code{(@var{feature}
+@dots{})}}, where each @var{feature} corresponds to the
@code{:feature} value of a query defined in
@code{treesit-font-lock-rules}. Removing a feature symbol from this
list disables the corresponding query during font-lock.
Major modes should set this variable before calling
@code{treesit-font-lock-enable}.
-@c FIXME: ``for further changes''? This should clarify when this
-@c function has to be called.
@findex treesit-font-lock-recompute-features
-In addition, for further changes to this variable to take effect, call
-@code{treesit-font-lock-recompute-features}.
+For this variable to take effect, a Lisp program should call
+@code{treesit-font-lock-recompute-features} (which resets
+@code{treesit-font-lock-settings} accordingly).
@end defvar
@defvar treesit-font-lock-settings
A list of settings for tree-sitter based font lock. The exact format
of this variable is considered internal. One should always use
@code{treesit-font-lock-rules} to set this variable.
-
-@c FIXME: If the format is considered ``internal'', why do we need to
-@c describe it here?
-Each @var{setting} is of form
-
-@example
-(@var{query} @var{enable} @var{feature} @var{override})
-@end example
-
-@var{query} must be a compiled query (@pxref{Pattern Matching}).
-
-For @var{setting} to be activated for font-lock, @var{enable} must be
-@code{t}. To disable this @var{setting}, set @var{enable} to
-@code{nil}.
-
-@var{feature} is the ``feature name'' of the query, users can control
-which features are enabled with @code{font-lock-maximum-decoration}
-and @code{treesit-font-lock-feature-list}.
-
-@var{override} is the override flag for this query. Its value can be
-@code{t}, @code{nil}, @code{append}, @code{prepend}, or @code{keep}.
-@c FIXME: See where?
-See more in @code{treesit-font-lock-rules}.
+@c Because the format is internal, we don't document it here.
+@c Though we do have explanations in the docstring.
@end defvar
Multi-language major modes should provide range functions in
@var{language} is a language symbol, and @var{rules} is a list of the
form @w{@code{(@var{matcher} @var{anchor} @var{offset})}}.
-@c FIXME: ``node''?
-First, Emacs passes the node at point to @var{matcher}; if it returns
-non-@code{nil}, this rule is applicable. Then Emacs passes the node
-to @var{anchor}, which returns a buffer position. Emacs takes the
-column number of that position, adds @var{offset} to it, and the
-result is the indentation column for the current line.
+First, Emacs passes the smallest tree-sitter node at the beginning of
+the current line to @var{matcher}; if it returns non-@code{nil}, this
+rule is applicable. Then Emacs passes the node to @var{anchor}, which
+returns a buffer position. Emacs takes the column number of that
+position, adds @var{offset} to it, and the result is the indentation
+column for the current line.
The @var{matcher} and @var{anchor} are functions, and Emacs provides
convenient defaults for them.
-@c FIXME: Clarify the following description. In particular, how to
-@c find/compute ``the largest node'' and its ``parent''?
Each @var{matcher} or @var{anchor} is a function that takes three
arguments: @var{node}, @var{parent}, and @var{bol}. The argument
@var{bol} is the buffer position whose indentation is required: the
position of the first non-whitespace character after the beginning of
the line. The argument @var{node} is the largest (highest-in-tree)
node that starts at that position; and @var{parent} is the parent of
-@var{node}. @var{matcher} should return non-@code{nil} if the rule is
-applicable, and @var{anchor} should return a buffer position that is
-the basis of the indentation.
+@var{node}. Emacs finds @var{bol}, @var{node} and @var{parent} and
+passes them to each @var{matcher} and @var{anchor}. @var{matcher}
+should return non-@code{nil} if the rule is applicable, and
+@var{anchor} should return a buffer position.
@end defvar
@defvar treesit-simple-indent-presets
@ftable @code
@item no-node
-This matcher is a symbol that matches the case where @var{node} is
+This matcher is a function that matches the case where @var{node} is
@code{nil}, i.e., there is no node that starts at @var{bol}. This is
the case when @var{bol} is on an empty line or inside a multi-line
string, etc.
@item parent-is
-This matcher is a function of one argument, @var{type}; it matches if
-the type of the parent node is @var{type}.
+This matcher is a function of one argument, @var{type}; it returns a
+function that given @w{@code{(@var{node} @var{parent} @var{bol})}},
+matches if @var{parent}'s type is @var{type}.
@item node-is
-This matcher is a function of one argument, @var{type}; it matches if
-the node's type is @var{type}.
+This matcher is a function of one argument, @var{type}; it returns a
+function that given @w{@code{(@var{node} @var{parent} @var{bol})}},
+matches if @var{node}'s type is @var{type}.
-@c FIXME: The description of this matcher is unclear. What is
-@c ``parent'' and what does it mean ``captures NODE''?
@item query
-This matcher is a function of one argument, @var{query}; it matches if
-querying @var{parent} with @var{query} captures @var{node}. The
-capture name does not matter. @c Why is this bit important?
+This matcher is a function of one argument, @var{query}; it returns a
+function that given @w{@code{(@var{node} @var{parent} @var{bol})}},
+matches if querying @var{parent} with @var{query} captures @var{node}
+(@pxref{Pattern Matching}).
@item match
This matcher is a function of 5 arguments: @var{node-type},
@var{parent-type}, @var{node-field}, @var{node-index-min}, and
-@var{node-index-max}). It matches if @var{node}'s type is @var{node-type},
-@var{parent}'s type is @var{parent-type}, @var{node}'s field name in
-@var{parent} is @var{node-field}, and @var{node}'s index among its
-siblings is between @var{node-index-min} and @var{node-index-max}. If
-@c FIXME: ``constraint''?
-the value of a constraint is nil, this matcher doesn't check for that
-constraint. For example, to match the first child where parent is
+@var{node-index-max}. It returns a function that given
+@w{@code{(@var{node} @var{parent} @var{bol})}}, matches if
+@var{node}'s type is @var{node-type}, @var{parent}'s type is
+@var{parent-type}, @var{node}'s field name in @var{parent} is
+@var{node-field}, and @var{node}'s index among its siblings is between
+@var{node-index-min} and @var{node-index-max}. If the value of an
+argument is @code{nil}, this matcher doesn't check for that argument.
+For example, to match the first child where parent is
@code{argument_list}, use
@example
(match nil "argument_list" nil nil 0 0)
@end example
-@c FIXME: ``PARENT''? is that an argument of the anchor function
@item first-sibling
-This anchor returns the start of the first child of @var{parent}.
+This anchor is a function that given @w{@code{(@var{node} @var{parent}
+@var{bol})}}, returns the start of the first child of @var{parent}.
@item parent
-This anchor returns the start of @var{parent}. @c FIXME: Likewise.
+This anchor is a function that given @w{@code{(@var{node} @var{parent}
+@var{bol})}}, returns the start of @var{parent}.
@item parent-bol
-This anchor returns the first non-space character on the line of
+This anchor is a function that given @w{@code{(@var{node} @var{parent}
+@var{bol})}}, returns the first non-space character on the line of
@var{parent}.
-@c FIXME: ``NODE''?
@item prev-sibling
-This anchor returns the start of the previous sibling of @var{node}.
+This anchor is a function that given @w{@code{(@var{node} @var{parent}
+@var{bol})}}, returns the start of the previous sibling of @var{node}.
@item no-indent
-This anchor returns the start of @var{node}, i.e., no indent. @c ???
+This anchor is a function that given @w{@code{(@var{node} @var{parent}
+@var{bol})}}, returns the start of @var{node}.
@item prev-line
-This anchor returns the first non-whitespace charater on the previous
-line.
+This anchor is a function that given @w{@code{(@var{node} @var{parent}
+@var{bol})}}, returns the first non-whitespace character on the
+previous line.
@end ftable
@end defvar
@item (symbol-error @var{error-msg})
This means Emacs could not find in the library the expected function
that every language definition library should export.
-@item (version_mismatch @var{error-msg})
+@item (version-mismatch @var{error-msg})
This means the version of language definition library is incompatible
with that of the tree-sitter library.
@end table
The grammar file is usually @file{grammar.js} in a language
definition's project repository. The link to a language definition's
home page can be found on
-@uref{https://tree-sitter.github.io/tree-sitter, the tree-sitter's
+@uref{https://tree-sitter.github.io/tree-sitter, tree-sitter's
homepage}.
The grammar definition is written in JavaScript. For example, the
@end defun
There is no need to explicitly parse a buffer, because parsing is done
-automatically and lazily. A parser only parses when the mode queris
-for a node in its syntax tree. Therefore, when a parser is first
-created, it doesn't parse the buffer; it waits until the mode queries
-for a node for the first time. Similarly, when some change is made in
-the buffer, a parser doesn't re-parse immediately.
+automatically and lazily. A parser only parses when a Lisp program
+queries for a node in its syntax tree. Therefore, when a parser is
+first created, it doesn't parse the buffer; it waits until the Lisp
+program queries for a node for the first time. Similarly, when some
+change is made in the buffer, a parser doesn't re-parse immediately.
@vindex treesit-buffer-too-large
When a parser does parse, it checks for the size of the buffer.
@group
;; Find the node at point in a C parser's syntax tree.
(treesit-node-at (point) 'c)
- @result{} #<treesit-node from 1 to 4 in *scratch*>
+ @result{} #<treesit-node (primitive_type) in *scratch*>
@end group
@end example
@end defun
@group
;; Get the child that has "body" as its field name.
(treesit-child-by-field-name node "body")
- @result{} #<treesit-node from 3 to 11 in *scratch*>
+ @result{} #<treesit-node (compound_statement) in *scratch*>
@end group
@end example
@end defun
By default, this function only traverses named nodes, but if @var{all}
is non-@code{nil}, it traverses all the nodes. If @var{backward} is
-@c FIXME: What does it mean to ``traverse backward''?
-non-nil, it traverses backwards. If @var{limit} is non-@code{nil}, it
+non-@code{nil}, it traverses backwards (that is, it visits the last child
+first when traversing down the tree). If @var{limit} is non-@code{nil}, it
must be a number that limits the tree traversal to that many levels
down the tree.
@end defun
@defun treesit-search-forward start predicate &optional all backward up
-@c FIXME: Explain better what is the differencve between this function
-@c and the previous one.
-This function is somewhat similar to @code{treesit-search-subtree}.
-It also traverse the parse tree and matches each node with
-@var{predicate} (except for @var{start}), where @var{predicate} can be
-a (case-insensitive) regexp or a function. For a tree like the below
-where @var{start} is marked 1, this function traverses as numbered:
+While @code{treesit-search-subtree} traverses the subtree of a node,
+this function usually starts with a leaf node and traverses every node
+that comes after it in terms of buffer position. It is useful for
+answering questions like ``what is the first node after @var{start} in
+the buffer that satisfies some condition?''
+
+Like @code{treesit-search-subtree}, this function also traverses the
+parse tree and matches each node with @var{predicate} (except for
+@var{start}), where @var{predicate} can be a (case-insensitive) regexp
+or a function. For a tree like the one below where @var{start} is marked
+1, this function traverses as numbered:
@example
@group
@cindex tree-sitter extra node
@cindex extra node, tree-sitter
-A node can be ``extra'': extra nodes represent things like comments,
+A node can be ``extra'': such nodes represent things like comments,
which can appear anywhere in the text.
@cindex tree-sitter node that has changes
@heading More query syntax
-Besides node type and capture, tree-sitter's query syntax can express
-anonymous node, field name, wildcard, quantification, grouping,
-alternation, anchor, and predicate.
+Besides node types and captures, tree-sitter's pattern syntax can
+express anonymous nodes, field names, wildcards, quantification,
+grouping, alternation, anchors, and predicates.
@subheading Anonymous node
@subheading Wild card
-In a query pattern, @samp{(_)} matches any named node, and @samp{_}
-matches any named and anonymous node. For example, to capture any
-named child of a @code{binary_expression} node, the pattern would be
+In a pattern, @samp{(_)} matches any named node, and @samp{_} matches
+any named or anonymous node. For example, to capture any named child
+of a @code{binary_expression} node, the pattern would be
@example
(binary_expression (_) @@in_biexp)
@subheading Field name
-It is possible to capture child nodes that have specific field names:
+It is possible to capture child nodes that have specific field names.
+In the pattern below, @code{declarator} and @code{body} are field
+names, indicated by the colon following them.
-@c FIXME: The significance of ``:'' should be explained, and also what
-@c are ``declarator'' and ``body''.
@example
@group
(function_definition
@samp{*} matches the preceding pattern zero or more times, @samp{+}
matches one or more times, and @samp{?} matches zero or one time.
-@c FIXME: ``pattern'' or :''query''? Or maybe ``query pattern''?
For example, the following pattern matches @code{type_declaration}
nodes that has @emph{zero or more} @code{long} keyword.
@subheading Alternation
Again, similar to regular expressions, we can express ``match anyone
-from this group of patterns'' in the query pattern. The syntax is a
-list of patterns enclosed in square brackets. For example, to capture
-some keywords in C, the query pattern would be
+from this group of patterns'' in a pattern. The syntax is a list of
+patterns enclosed in square brackets. For example, to capture some
+keywords in C, the pattern would be
@example
@group
@subheading Predicate
It is possible to add predicate constraints to a pattern. For
-example, with the following query pattern:
+example, with the following pattern:
@example
@group
@heading S-expression patterns
-@cindex query patterns as sexps
+@cindex patterns as sexps
@cindex patterns, tree-sitter, in sexp form
-Besides strings, Emacs provides a s-expression based syntax for query
+Besides strings, Emacs provides an s-expression based syntax for
patterns. It largely resembles the string-based syntax. For example,
-the following pattern
+the following query
@example
@group