</div>
<hr>
<span id="Tree_002dsitter-Language-Definitions"></span><h3 class="section">37.1 Tree-sitter Language Definitions</h3>
+<span id="index-language-definitions_002c-for-tree_002dsitter"></span>
<span id="Loading-a-language-definition"></span><h3 class="heading">Loading a language definition</h3>
+<span id="index-loading-language-definition-for-tree_002dsitter"></span>
+<span id="index-language-argument_002c-for-tree_002dsitter"></span>
<p>Tree-sitter relies on language definitions to parse text in that
-language. In Emacs, A language definition is represented by a symbol.
-For example, C language definition is represented as <code>c</code>, and
-<code>c</code> can be passed to tree-sitter functions as the <var>language</var>
-argument.
+language. In Emacs, a language definition is represented by a symbol.
+For example, the C language definition is represented as the symbol
+<code>c</code>, and <code>c</code> can be passed to tree-sitter functions as the
+<var>language</var> argument.
</p>
<span id="index-treesit_002dextra_002dload_002dpath"></span>
<span id="index-treesit_002dload_002dlanguage_002derror"></span>
<p>Tree-sitter language definitions are distributed as dynamic libraries.
In order to use a language definition in Emacs, you need to make sure
that the dynamic library is installed on the system. Emacs looks for
-language definitions under load paths in
-<code>treesit-extra-load-path</code>, <code>user-emacs-directory</code>/tree-sitter,
-and system default locations for dynamic libraries, in that order.
-Emacs tries each extensions in <code>treesit-load-suffixes</code>. If Emacs
-cannot find the library or has problem loading it, Emacs signals
-<code>treesit-load-language-error</code>. The signal data is a list of
-specific error messages.
+language definitions in several places, in the following order:
+</p>
+<ul>
+<li> first, in the list of directories specified by the variable
+<code>treesit-extra-load-path</code>;
+</li><li> then, in the <samp>tree-sitter</samp> subdirectory of the directory
+specified by <code>user-emacs-directory</code> (see <a href="Init-File.html">The Init File</a>);
+</li><li> and finally, in the system’s default locations for dynamic libraries.
+</li></ul>
+
+<p>In each of these directories, Emacs looks for a file with file-name
+extensions specified by the variable <code>treesit-load-suffixes</code>.
+</p>
+<p>If Emacs cannot find the library or has problems loading it, Emacs
+signals the <code>treesit-load-language-error</code> error. The data of
+that signal could be one of the following:
+</p>
+<dl compact="compact">
+<dt><span><code>(not-found <var>error-msg</var> …)</code></span></dt>
+<dd><p>This means that Emacs could not find the language definition library.
+</p></dd>
+<dt><span><code>(symbol-error <var>error-msg</var>)</code></span></dt>
+<dd><p>This means that Emacs could not find in the library the expected function
+that every language definition library should export.
+</p></dd>
+<dt><span><code>(version-mismatch <var>error-msg</var>)</code></span></dt>
+<dd><p>This means that the version of the language definition library is
+incompatible with that of the tree-sitter library.
+</p></dd>
+</dl>
+
+<p>In all of these cases, <var>error-msg</var> might provide additional
+details about the failure.
</p>
<dl class="def">
-<dt id="index-treesit_002dlanguage_002davailable_002dp"><span class="category">Function: </span><span><strong>treesit-language-available-p</strong> <em>language</em><a href='#index-treesit_002dlanguage_002davailable_002dp' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function checks whether the dynamic library for <var>language</var> is
-present on the system, and return non-nil if it is.
+<dt id="index-treesit_002dlanguage_002davailable_002dp"><span class="category">Function: </span><span><strong>treesit-language-available-p</strong> <em>language &optional detail</em><a href='#index-treesit_002dlanguage_002davailable_002dp' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This function returns non-<code>nil</code> if the language definition for
+<var>language</var> exists and can be loaded.
+</p>
+<p>If <var>detail</var> is non-<code>nil</code>, return <code>(t . nil)</code> when
+<var>language</var> is available, and <code>(nil . <var>data</var>)</code> when it’s
+unavailable. <var>data</var> is the signal data of
+<code>treesit-load-language-error</code>.
</p></dd></dl>
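<p>As a rough sketch of how the optional <var>detail</var> argument might be
used, the following turns the return value into a short description.
The helper <code>my-treesit-describe-availability</code> is hypothetical
and not part of Emacs; it assumes only the return values and signal
data described above.
</p>
<div class="example">
<pre class="example">;; Hypothetical helper, not part of Emacs.
(defun my-treesit-describe-availability (language result)
  "Return a string describing RESULT for LANGUAGE.
RESULT has the form returned by `treesit-language-available-p'
when its DETAIL argument is non-nil."
  (if (car result)
      (format "%s: available" language)
    ;; (cdr result) is the signal data, e.g. (not-found MSG ...),
    ;; so (cadr result) is the symbol naming the kind of failure.
    (pcase (cadr result)
      ('not-found (format "%s: library not found" language))
      ('symbol-error (format "%s: expected function missing" language))
      ('version-mismatch (format "%s: incompatible ABI version" language))
      (_ (format "%s: cannot be loaded" language)))))

;; Usage; the real function is called only where it is defined.
(when (fboundp 'treesit-language-available-p)
  (message "%s" (my-treesit-describe-availability
                 'c (treesit-language-available-p 'c t))))
</pre></div>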
<span id="index-treesit_002dload_002dname_002doverride_002dlist"></span>
-<p>By convention, the dynamic library for <var>language</var> is
-<code>libtree-sitter-<var>language</var>.<var>ext</var></code>, where <var>ext</var> is the
-system-specific extension for dynamic libraries. Also by convention,
+<p>By convention, the file name of the dynamic library for <var>language</var> is
+<samp>libtree-sitter-<var>language</var>.<var>ext</var></samp>, where <var>ext</var> is the
+system-specific extension for dynamic libraries. Also by convention,
the function provided by that library is named
-<code>tree_sitter_<var>language</var></code>. If a language definition doesn’t
-follow this convention, you should add an entry
+<code>tree_sitter_<var>language</var></code>. If a language definition library
+doesn’t follow this convention, you should add an entry
</p>
<div class="example">
<pre class="example">(<var>language</var> <var>library-base-name</var> <var>function-name</var>)
</pre></div>
-<p>to <code>treesit-load-name-override-list</code>, where
-<var>library-base-name</var> is the base filename for the dynamic library
-(conventionally <code>libtree-sitter-<var>language</var></code>), and
+<p>to the list in the variable <code>treesit-load-name-override-list</code>, where
+<var>library-base-name</var> is the basename of the dynamic library’s file name
+(usually, <samp>libtree-sitter-<var>language</var></samp>), and
<var>function-name</var> is the function provided by the library
-(conventionally <code>tree_sitter_<var>language</var></code>). For example,
+(usually, <code>tree_sitter_<var>language</var></code>). For example,
</p>
<div class="example">
<pre class="example">(cool-lang "libtree-sitter-coool" "tree_sitter_cooool")
</pre></div>
-<p>for a language too cool to abide by conventions.
+<p>for a language that considers itself too “cool” to abide by
+conventions.
</p>
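<p>In an init file, such an entry could be registered roughly as follows.
This is only a sketch: it reuses the fictional names from the example
above and assumes that the <code>treesit</code> library can be loaded.
</p>
<div class="example">
<pre class="example">(require 'treesit)

;; The names below are the fictional ones from the example above.
(add-to-list 'treesit-load-name-override-list
             '(cool-lang "libtree-sitter-coool" "tree_sitter_cooool"))
</pre></div>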
+<span id="index-language_002ddefinition-version_002c-compatibility"></span>
<dl class="def">
<dt id="index-treesit_002dlanguage_002dversion"><span class="category">Function: </span><span><strong>treesit-language-version</strong> <em>&optional min-compatible</em><a href='#index-treesit_002dlanguage_002dversion' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>Tree-sitter library has a <em>language version</em>, a language
-definition’s version needs to match this version to be compatible.
-</p>
-<p>This function returns tree-sitter library’s language version. If
-<var>min-compatible</var> is non-nil, it returns the minimal compatible
-version.
+<dd><p>This function returns the version of the language-definition
+Application Binary Interface (<acronym>ABI</acronym>) supported by the
+tree-sitter library. By default, it returns the latest ABI version
+supported by the library, but if <var>min-compatible</var> is
+non-<code>nil</code>, it returns the oldest ABI version that the library
+can still support. Language definition libraries must be built for
+ABI versions between the oldest and the latest versions supported by
+the tree-sitter library, otherwise the library will be unable to load
+them.
</p></dd></dl>
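<p>The following sketch shows one way to report the supported ABI range.
The helper <code>my-treesit-abi-supported-p</code> and the ABI number 14
passed to it are hypothetical, chosen only for illustration. The code
calls <code>treesit-language-version</code> under the name documented
above, and only if that function is defined.
</p>
<div class="example">
<pre class="example">;; Hypothetical helper, not part of Emacs.
(defun my-treesit-abi-supported-p (abi oldest latest)
  "Return non-nil if ABI lies between OLDEST and LATEST, inclusive."
  (and (>= abi oldest) (>= latest abi)))

(when (fboundp 'treesit-language-version)
  (let ((latest (treesit-language-version))
        (oldest (treesit-language-version t)))
    (message "Supported ABI versions: %d to %d" oldest latest)
    ;; 14 is only an illustrative ABI number.
    (message "ABI 14 supported: %s"
             (my-treesit-abi-supported-p 14 oldest latest))))
</pre></div>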
<span id="Concrete-syntax-tree"></span><h3 class="heading">Concrete syntax tree</h3>
+<span id="index-syntax-tree_002c-concrete"></span>
<p>A syntax tree is what a parser generates. In a syntax tree, each node
represents a piece of text, and is connected to each other by a
+------------+ +--------------+ +------------+
</pre></div>
-<p>We can also represent it in s-expression:
+<p>We can also represent it as an s-expression:
</p>
<div class="example">
<pre class="example">(root (expression (number) (operator) (number)))
</pre></div>
<span id="Node-types"></span><h4 class="subheading">Node types</h4>
-
-<span id="index-tree_002dsitter-node-type"></span>
-<span id="tree_002dsitter-node-type"></span><span id="index-tree_002dsitter-named-node"></span>
-<span id="tree_002dsitter-named-node"></span><span id="index-tree_002dsitter-anonymous-node"></span>
-<p>Names like <code>root</code>, <code>expression</code>, <code>number</code>,
-<code>operator</code> are nodes’ <em>type</em>. However, not all nodes in a
-syntax tree have a type. Nodes that don’t are <em>anonymous nodes</em>,
-and nodes with a type are <em>named nodes</em>. Anonymous nodes are
-tokens with fixed spellings, including punctuation characters like
-bracket ‘<samp>]</samp>’, and keywords like <code>return</code>.
+<span id="index-node-types_002c-in-a-syntax-tree"></span>
+
+<span id="index-type-of-node_002c-tree_002dsitter"></span>
+<span id="tree_002dsitter-node-type"></span><span id="index-named-node_002c-tree_002dsitter"></span>
+<span id="tree_002dsitter-named-node"></span><span id="index-anonymous-node_002c-tree_002dsitter"></span>
+<p>Names like <code>root</code>, <code>expression</code>, <code>number</code>, and
+<code>operator</code> specify the <em>type</em> of the nodes. However, not all
+nodes in a syntax tree have a type. Nodes that don’t have a type are
+known as <em>anonymous nodes</em>, and nodes with a type are <em>named
+nodes</em>. Anonymous nodes are tokens with fixed spellings, including
+punctuation characters like bracket ‘<samp>]</samp>’, and keywords like
+<code>return</code>.
</p>
<span id="Field-names"></span><h4 class="subheading">Field names</h4>
+<span id="index-field-name_002c-tree_002dsitter"></span>
<span id="index-tree_002dsitter-node-field-name"></span>
-<span id="tree_002dsitter-node-field-name"></span><p>To make the syntax tree easier to
-analyze, many language definitions assign <em>field names</em> to child
-nodes. For example, a <code>function_definition</code> node could have a
-<code>declarator</code> and a <code>body</code>:
+<span id="tree_002dsitter-node-field-name"></span><p>To make the syntax tree easier to analyze, many language definitions
+assign <em>field names</em> to child nodes. For example, a
+<code>function_definition</code> node could have a <code>declarator</code> and a
+<code>body</code>:
</p>
<div class="example">
<pre class="example">(function_definition
<dl class="def">
<dt id="index-treesit_002dinspect_002dmode"><span class="category">Command: </span><span><strong>treesit-inspect-mode</strong><a href='#index-treesit_002dinspect_002dmode' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This minor mode displays the node that <em>starts</em> at point in
-mode-line. The mode-line will display
+<dd><p>This minor mode displays on the mode-line the node that <em>starts</em>
+at point. The mode-line will display
</p>
<div class="example">
-<pre class="example"><var>parent</var> <var>field-name</var>: (<var>child</var> (<var>grand-child</var> (...)))
+<pre class="example"><var>parent</var> <var>field</var>: (<var>child</var> (<var>grandchild</var> (…)))
</pre></div>
-<p><var>child</var>, <var>grand-child</var>, and <var>grand-grand-child</var>, etc, are
-nodes that have their beginning at point. And <var>parent</var> is the
-parent of <var>child</var>.
+<p><var>child</var>, <var>grandchild</var>, <var>grand-grandchild</var>, etc., are nodes that
+begin at point. <var>parent</var> is the parent node of <var>child</var>.
</p>
<p>If there is no node that starts at point, i.e., point is in the middle
of a node, then the mode-line only displays the smallest node that
-spans point, and its immediate parent.
+spans the position of point, and its immediate parent.
</p>
<p>This minor mode doesn’t create parsers on its own. It simply uses the
first parser in <code>(treesit-parser-list)</code> (see <a href="Using-Parser.html">Using Tree-sitter Parser</a>).
</p></dd></dl>
<span id="Reading-the-grammar-definition"></span><h3 class="heading">Reading the grammar definition</h3>
+<span id="index-reading-grammar-definition_002c-tree_002dsitter"></span>
<p>Authors of language definitions define the <em>grammar</em> of a
-language, and this grammar determines how does a parser construct a
-concrete syntax tree out of the text. In order to use the syntax
-tree effectively, we need to read the <em>grammar file</em>.
+programming language, which determines how a parser constructs a
+concrete syntax tree out of the program text. In order to use the
+syntax tree effectively, you need to consult the <em>grammar file</em>.
</p>
-<p>The grammar file is usually <code>grammar.js</code> in a language
-definition’s project repository. The link to a language definition’s
-home page can be found in tree-sitter’s homepage
-(<a href="https://tree-sitter.github.io/tree-sitter">https://tree-sitter.github.io/tree-sitter</a>).
+<p>The grammar file is usually <samp>grammar.js</samp> in a language
+definition’s project repository. The link to a language definition’s
+home page can be found on
+<a href="https://tree-sitter.github.io/tree-sitter">tree-sitter’s
+homepage</a>.
</p>
-<p>The grammar is written in JavaScript syntax. For example, the rule
-matching a <code>function_definition</code> node looks like
+<p>The grammar definition is written in JavaScript. For example, the
+rule matching a <code>function_definition</code> node looks like
</p>
<div class="example">
<pre class="example">function_definition: $ => seq(
)
</pre></div>
-<p>The rule is represented by a function that takes a single argument
+<p>The rules are represented by functions that take a single argument
<var>$</var>, representing the whole grammar. The function itself is
-constructed by other functions: the <code>seq</code> function puts together a
-sequence of children; the <code>field</code> function annotates a child with
-a field name. If we write the above definition in BNF syntax, it
-would look like
+constructed by other functions: the <code>seq</code> function puts together
+a sequence of children; the <code>field</code> function annotates a child
+with a field name. If we write the above definition in the so-called
+<em>Backus-Naur Form</em> (<acronym>BNF</acronym>) syntax, it would look like
</p>
<div class="example">
<pre class="example">function_definition :=
body: (compound_statement))
</pre></div>
-<p>Below is a list of functions that one will see in a grammar
-definition. Each function takes other rules as arguments and returns
-a new rule.
+<p>Below is a list of functions that one can see in a grammar definition.
+Each function takes other rules as arguments and returns a new rule.
</p>
-<ul>
-<li> <code>seq(rule1, rule2, ...)</code> matches each rule one after another.
-
-</li><li> <code>choice(rule1, rule2, ...)</code> matches one of the rules in its
-arguments.
-
-</li><li> <code>repeat(rule)</code> matches <var>rule</var> for <em>zero or more</em> times.
+<dl compact="compact">
+<dt><span><code>seq(<var>rule1</var>, <var>rule2</var>, …)</code></span></dt>
+<dd><p>matches each rule one after another.
+</p></dd>
+<dt><span><code>choice(<var>rule1</var>, <var>rule2</var>, …)</code></span></dt>
+<dd><p>matches one of the rules in its arguments.
+</p></dd>
+<dt><span><code>repeat(<var>rule</var>)</code></span></dt>
+<dd><p>matches <var>rule</var> for <em>zero or more</em> times.
This is like the ‘<samp>*</samp>’ operator in regular expressions.
-
-</li><li> <code>repeat1(rule)</code> matches <var>rule</var> for <em>one or more</em> times.
+</p></dd>
+<dt><span><code>repeat1(<var>rule</var>)</code></span></dt>
+<dd><p>matches <var>rule</var> for <em>one or more</em> times.
This is like the ‘<samp>+</samp>’ operator in regular expressions.
-
-</li><li> <code>optional(rule)</code> matches <var>rule</var> for <em>zero or one</em> time.
+</p></dd>
+<dt><span><code>optional(<var>rule</var>)</code></span></dt>
+<dd><p>matches <var>rule</var> for <em>zero or one</em> time.
This is like the ‘<samp>?</samp>’ operator in regular expressions.
-
-</li><li> <code>field(name, rule)</code> assigns field name <var>name</var> to the child
-node matched by <var>rule</var>.
-
-</li><li> <code>alias(rule, alias)</code> makes nodes matched by <var>rule</var> appear as
-<var>alias</var> in the syntax tree generated by the parser. For example,
-
+</p></dd>
+<dt><span><code>field(<var>name</var>, <var>rule</var>)</code></span></dt>
+<dd><p>assigns field name <var>name</var> to the child node matched by <var>rule</var>.
+</p></dd>
+<dt><span><code>alias(<var>rule</var>, <var>alias</var>)</code></span></dt>
+<dd><p>makes nodes matched by <var>rule</var> appear as <var>alias</var> in the syntax
+tree generated by the parser. For example,
+</p>
<div class="example">
<pre class="example">alias(preprocessor_call_exp, call_expression)
</pre></div>
-<p>makes any node matched by <code>preprocessor_call_exp</code> to appear as
+<p>makes any node matched by <code>preprocessor_call_exp</code> appear as
<code>call_expression</code>.
-</p></li></ul>
+</p></dd>
+</dl>
-<p>Below are grammar functions less interesting for a reader of a
+<p>Below are grammar functions of lesser importance for reading a
language definition.
</p>
-<ul>
-<li> <code>token(rule)</code> marks <var>rule</var> to produce a single leaf node.
-That is, instead of generating a parent node with individual child
-nodes under it, everything is combined into a single leaf node.
-
-</li><li> Normally, grammar rules ignore preceding whitespaces,
-<code>token.immediate(rule)</code> changes <var>rule</var> to match only when
-there is no preceding whitespaces.
-
-</li><li> <code>prec(n, rule)</code> gives <var>rule</var> a level <var>n</var> precedence.
-
-</li><li> <code>prec.left([n,] rule)</code> marks <var>rule</var> as left-associative,
-optionally with level <var>n</var>.
-
-</li><li> <code>prec.right([n,] rule)</code> marks <var>rule</var> as right-associative,
-optionally with level <var>n</var>.
-
-</li><li> <code>prec.dynamic(n, rule)</code> is like <code>prec</code>, but the precedence
-is applied at runtime instead.
-</li></ul>
-
-<p>The tree-sitter project talks about writing a grammar in more detail:
-<a href="https://tree-sitter.github.io/tree-sitter/creating-parsers">https://tree-sitter.github.io/tree-sitter/creating-parsers</a>.
-Read especially “The Grammar DSL” section.
+<dl compact="compact">
+<dt><span><code>token(<var>rule</var>)</code></span></dt>
+<dd><p>marks <var>rule</var> to produce a single leaf node. That is, instead of
+generating a parent node with individual child nodes under it,
+everything is combined into a single leaf node.
+</p></dd>
+<dt><span><code>token.immediate(<var>rule</var>)</code></span></dt>
+<dd><p>Normally, grammar rules ignore preceding whitespace; this
+changes <var>rule</var> to match only when there is no preceding
+whitespace.
+</p></dd>
+<dt><span><code>prec(<var>n</var>, <var>rule</var>)</code></span></dt>
+<dd><p>gives <var>rule</var> the level-<var>n</var> precedence.
+</p></dd>
+<dt><span><code>prec.left([<var>n</var>,] <var>rule</var>)</code></span></dt>
+<dd><p>marks <var>rule</var> as left-associative, optionally with level <var>n</var>.
+</p></dd>
+<dt><span><code>prec.right([<var>n</var>,] <var>rule</var>)</code></span></dt>
+<dd><p>marks <var>rule</var> as right-associative, optionally with level <var>n</var>.
+</p></dd>
+<dt><span><code>prec.dynamic(<var>n</var>, <var>rule</var>)</code></span></dt>
+<dd><p>this is like <code>prec</code>, but the precedence is applied at runtime
+instead.
+</p></dd>
+</dl>
+
+<p>The documentation of the tree-sitter project has
+<a href="https://tree-sitter.github.io/tree-sitter/creating-parsers">more
+about writing a grammar</a>. Read especially the “The Grammar DSL”
+section.
</p>
</div>
<hr>
<link href="Index.html" rel="index" title="Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Parsing-Program-Source.html" rel="up" title="Parsing Program Source">
-<link href="Tree_002dsitter-C-API.html" rel="next" title="Tree-sitter C API">
+<link href="Tree_002dsitter-major-modes.html" rel="next" title="Tree-sitter major modes">
<link href="Pattern-Matching.html" rel="prev" title="Pattern Matching">
<style type="text/css">
<!--
<div class="section" id="Multiple-Languages">
<div class="header">
<p>
-Next: <a href="Tree_002dsitter-C-API.html" accesskey="n" rel="next">Tree-sitter C API Correspondence</a>, Previous: <a href="Pattern-Matching.html" accesskey="p" rel="prev">Pattern Matching Tree-sitter Nodes</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Next: <a href="Tree_002dsitter-major-modes.html" accesskey="n" rel="next">Developing major modes with tree-sitter</a>, Previous: <a href="Pattern-Matching.html" accesskey="p" rel="prev">Pattern Matching Tree-sitter Nodes</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<span id="Parsing-Text-in-Multiple-Languages"></span><h3 class="section">37.6 Parsing Text in Multiple Languages</h3>
-<p>Sometimes, the source of a programming language could contain sources
-of other languages, HTML + CSS + JavaScript is one example. In that
-case, we need to assign individual parsers to text segments written in
-different languages. Traditionally this is achieved by using
-narrowing. While tree-sitter works with narrowing (see <a href="Using-Parser.html#tree_002dsitter-narrowing">narrowing</a>), the recommended way is to set ranges in which
-a parser will operate.
+<span id="index-multiple-languages_002c-parsing-with-tree_002dsitter"></span>
+<span id="index-parsing-multiple-languages-with-tree_002dsitter"></span>
+<p>Sometimes, the source of a programming language could contain snippets
+of other languages; <acronym>HTML</acronym> + <acronym>CSS</acronym> + JavaScript is one
+example. In that case, text segments written in different languages
+need to be assigned different parsers. Traditionally, this is
+achieved by using narrowing. While tree-sitter works with narrowing
+(see <a href="Using-Parser.html#tree_002dsitter-narrowing">narrowing</a>), the recommended way is
+instead to set regions of buffer text in which a parser will operate.
</p>
<dl class="def">
<dt id="index-treesit_002dparser_002dset_002dincluded_002dranges"><span class="category">Function: </span><span><strong>treesit-parser-set-included-ranges</strong> <em>parser ranges</em><a href='#index-treesit_002dparser_002dset_002dincluded_002dranges' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function sets the range of <var>parser</var> to <var>ranges</var>. Then
-<var>parser</var> will only read the text covered in each range. Each
-range in <var>ranges</var> is a list of cons <code>(<var>beg</var>
-. <var>end</var>)</code>.
+<dd><p>This function sets up <var>parser</var> to operate on <var>ranges</var>. The
+<var>parser</var> will only read the text of the specified ranges. Each
+range in <var>ranges</var> is a cons cell of the form <code>(<var>beg</var> . <var>end</var>)</code><!-- /@w -->.
</p>
-<p>Each range in <var>ranges</var> must come in order and not overlap. That
-is, in pseudo code:
+<p>The ranges in <var>ranges</var> must come in order and must not overlap.
+That is, in pseudo code:
</p>
<div class="example">
<pre class="example">(cl-loop for idx from 1 to (1- (length ranges))
<span id="index-treesit_002drange_002dinvalid"></span>
<p>If <var>ranges</var> violates this constraint, or something else went
-wrong, this function signals a <code>treesit-range-invalid</code>. The
-signal data contains a specific error message and the ranges we are
-trying to set.
+wrong, this function signals the <code>treesit-range-invalid</code> error.
+The signal data contains a specific error message and the ranges we
+are trying to set.
</p>
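<p>To avoid the error, a Lisp program could check the ordering constraint
before setting the ranges. The predicate below is a minimal sketch; its
name is hypothetical, and it checks only the constraint described
above, not any other failure the function may signal.
</p>
<div class="example">
<pre class="example">;; Hypothetical helper, not part of Emacs.
(defun my-treesit-ranges-valid-p (ranges)
  "Return non-nil if RANGES are in order and do not overlap.
RANGES is a list of cons cells of the form (BEG . END)."
  (let ((valid t)
        (prev-end nil))
    (dolist (range ranges valid)
      (let ((beg (car range))
            (end (cdr range)))
        ;; Each range must not end before it begins, and must not
        ;; begin before the previous range ends.
        (unless (and (>= end beg)
                     (or (null prev-end) (>= beg prev-end)))
          (setq valid nil))
        (setq prev-end end)))))

;; Usage sketch, assuming PARSER holds an existing parser:
;; (when (my-treesit-ranges-valid-p ranges)
;;   (treesit-parser-set-included-ranges parser ranges))
</pre></div>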
<p>This function can also be used for disabling ranges. If <var>ranges</var>
-is nil, the parser is set to parse the whole buffer.
+is <code>nil</code>, the parser is set to parse the whole buffer.
</p>
<p>Example:
</p>
<dt id="index-treesit_002dparser_002dincluded_002dranges"><span class="category">Function: </span><span><strong>treesit-parser-included-ranges</strong> <em>parser</em><a href='#index-treesit_002dparser_002dincluded_002dranges' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function returns the ranges set for <var>parser</var>. The return
value is the same as the <var>ranges</var> argument of
-<code>treesit-parser-included-ranges</code>: a list of cons
-<code>(<var>beg</var> . <var>end</var>)</code>. And if <var>parser</var> doesn’t have any
-ranges, the return value is nil.
+<code>treesit-parser-set-included-ranges</code>: a list of cons cells of the form
+<code>(<var>beg</var> . <var>end</var>)</code><!-- /@w -->. If <var>parser</var> doesn’t have any
+ranges, the return value is <code>nil</code>.
</p>
<div class="example">
<pre class="example">(treesit-parser-included-ranges parser)
<var>parser-or-lang</var> could be either a parser or a language. If it is
a language, this function looks for the first parser in
<code>(treesit-parser-list)</code> for that language in the current buffer,
-and set range for it.
+and sets the ranges for it.
</p></dd></dl>
<dl class="def">
<dl class="def">
<dt id="index-treesit_002dquery_002drange"><span class="category">Function: </span><span><strong>treesit-query-range</strong> <em>source query &optional beg end</em><a href='#index-treesit_002dquery_002drange' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function matches <var>source</var> with <var>query</var> and returns the
-ranges of captured nodes. The return value has the same shape of
-other functions: a list of <code>(<var>beg</var> . <var>end</var>)</code>.
+ranges of captured nodes. The return value is a list of cons cells of
+the form <code>(<var>beg</var> . <var>end</var>)</code><!-- /@w -->, where <var>beg</var> and
+<var>end</var> specify the beginning and the end of a region of text.
</p>
<p>For convenience, <var>source</var> can be a language symbol, a parser, or a
-node. If a language symbol, this function matches in the root node of
-the first parser using that language; if a parser, this function
-matches in the root node of that parser; if a node, this function
-matches in that node.
-</p>
-<p>Parameter <var>query</var> is the query used to capture nodes
-(see <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>). The capture names don’t matter. Parameter
-<var>beg</var> and <var>end</var>, if both non-nil, limits the range in which
-this function queries.
-</p>
-<p>Like other query functions, this function raises an
-<var>treesit-query-error</var> if <var>query</var> is malformed.
-</p></dd></dl>
-
-<dl class="def">
-<dt id="index-treesit_002dlanguage_002dat"><span class="category">Function: </span><span><strong>treesit-language-at</strong> <em>point</em><a href='#index-treesit_002dlanguage_002dat' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function tries to figure out which language is responsible for
-the text at <var>point</var>. It goes over each parser in
-<code>(treesit-parser-list)</code> and see if that parser’s range covers
-<var>point</var>.
+node. If it’s a language symbol, this function matches in the root
+node of the first parser using that language; if a parser, this
+function matches in the root node of that parser; if a node, this
+function matches in that node.
+</p>
+<p>The argument <var>query</var> is the query used to capture nodes
+(see <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>). The capture names don’t matter. The
+arguments <var>beg</var> and <var>end</var>, if both non-<code>nil</code>, limit the
+range in which this function queries.
+</p>
+<p>Like other query functions, this function signals the
+<code>treesit-query-error</code> error if <var>query</var> is malformed.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002drange_002dfunctions"><span class="category">Variable: </span><span><strong>treesit-range-functions</strong><a href='#index-treesit_002drange_002dfunctions' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>A list of range functions. Font-locking and indenting code uses
-functions in this alist to set correct ranges for a language parser
-before using it.
+<dd><p>This variable holds the list of range functions. Font-locking and
+indenting code use functions in this list to set correct ranges for
+a language parser before using it.
</p>
-<p>The signature of each function should be
+<p>The signature of each function in the list should be:
</p>
<div class="example">
<pre class="example">(<var>start</var> <var>end</var> &rest <var>_</var>)
</pre></div>
-<p>where <var>start</var> and <var>end</var> marks the region that is about to be
-used. A range function only need to (but not limited to) update
+<p>where <var>start</var> and <var>end</var> specify the region that is about to be
+used. A range function only needs to (but is not limited to) update
ranges in that region.
</p>
-<p>Each function in the list is called in-order.
+<p>The functions in the list are called in order.
</p></dd></dl>
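<p>As an illustration, a range function for a buffer that mixes
<acronym>HTML</acronym> and <acronym>CSS</acronym> might look roughly like this. The
function name is hypothetical, and the sketch assumes that parsers for
<code>html</code> and <code>css</code> can be created with
<code>treesit-get-parser-create</code>, as in the example later in this
section.
</p>
<div class="example">
<pre class="example">;; Hypothetical range function, not part of Emacs.
(defun my-html-update-css-ranges (_start _end &rest _)
  "Set the ranges of the CSS parser from the HTML parse tree."
  ;; For simplicity, recompute the ranges for the whole buffer
  ;; instead of only the region about to be used.  If the query
  ;; captures nothing, RANGES is nil and the CSS parser is set to
  ;; parse the whole buffer.
  (let ((ranges (treesit-query-range
                 'html
                 "(style_element (raw_text) @capture)")))
    (treesit-parser-set-included-ranges
     (treesit-get-parser-create 'css) ranges)))

;; Install it in the current buffer, for example from a major mode.
(setq-local treesit-range-functions
            (list #'my-html-update-css-ranges))
</pre></div>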
<dl class="def">
<dt id="index-treesit_002dupdate_002dranges"><span class="category">Function: </span><span><strong>treesit-update-ranges</strong> <em>&optional start end</em><a href='#index-treesit_002dupdate_002dranges' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function is used by font-lock and indent to update ranges before
-using any parser. Each range function in
+<dd><p>This function is used by font-lock and indentation to update ranges
+before using any parser. Each range function in
<code>treesit-range-functions</code> is called in order. Arguments
<var>start</var> and <var>end</var> are passed to each range function.
</p></dd></dl>
+<span id="index-treesit_002dlanguage_002dat_002dpoint_002dfunction"></span>
+<dl class="def">
+<dt id="index-treesit_002dlanguage_002dat"><span class="category">Function: </span><span><strong>treesit-language-at</strong> <em>pos</em><a href='#index-treesit_002dlanguage_002dat' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This function tries to figure out which language is responsible for
+the text at buffer position <var>pos</var>. Under the hood it just calls
+<code>treesit-language-at-point-function</code>.
+</p>
+<p>Various Lisp programs use this function. For example, the indentation
+program uses this function to determine which language’s rule to use
+in a multi-language buffer. So it is important to provide
+<code>treesit-language-at-point-function</code> for a multi-language major
+mode.
+</p></dd></dl>
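<p>A multi-language major mode could provide
<code>treesit-language-at-point-function</code> along the following lines.
This is only a sketch: both function names are hypothetical, and it
assumes that the function is called with a buffer position and should
return a language symbol, and that the embedded parsers already have
their ranges set.
</p>
<div class="example">
<pre class="example">(require 'seq)

;; Hypothetical helper, not part of Emacs.
(defun my-pos-in-ranges-p (pos ranges)
  "Return non-nil if POS lies in one of RANGES, a list of (BEG . END)."
  (seq-some (lambda (range)
              (and (>= pos (car range)) (> (cdr range) pos)))
            ranges))

;; Hypothetical function for a buffer mixing HTML, CSS and JavaScript.
(defun my-html-language-at-point (pos)
  "Return the language responsible for the text at POS."
  (cond
   ((my-pos-in-ranges-p
     pos (treesit-parser-included-ranges
          (treesit-get-parser-create 'css)))
    'css)
   ((my-pos-in-ranges-p
     pos (treesit-parser-included-ranges
          (treesit-get-parser-create 'javascript)))
    'javascript)
   ;; Everything else belongs to the major language.
   (t 'html)))

(setq-local treesit-language-at-point-function
            #'my-html-language-at-point)
</pre></div>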
+
<span id="An-example"></span><h3 class="heading">An example</h3>
<p>Normally, in a set of languages that can be mixed together, there is a
-major language and several embedded languages. We first parse the
-whole document with the major language’s parser, set ranges for the
-embedded languages, then parse the embedded languages.
+major language and several embedded languages. A Lisp program usually
+first parses the whole document with the major language’s parser, sets
+ranges for the embedded languages, and then parses the embedded
+languages.
</p>
-<p>Suppose we want to parse a very simple document that mixes HTML, CSS
-and JavaScript:
+<p>Suppose we need to parse a very simple document that mixes
+<acronym>HTML</acronym>, <acronym>CSS</acronym> and JavaScript:
</p>
<div class="example">
<pre class="example"><html>
</html>
</pre></div>
-<p>We first parse with HTML, then set ranges for CSS and JavaScript:
+<p>We first parse with <acronym>HTML</acronym>, then set ranges for <acronym>CSS</acronym>
+and JavaScript:
</p>
<div class="example">
<pre class="example">;; Create parsers.
(setq html (treesit-get-parser-create 'html))
(setq css (treesit-get-parser-create 'css))
(setq js (treesit-get-parser-create 'javascript))
+</pre><pre class="example">
-;; Set CSS ranges.
+</pre><pre class="example">;; Set CSS ranges.
(setq css-range
(treesit-query-range
'html
"(style_element (raw_text) @capture)"))
(treesit-parser-set-included-ranges css css-range)
+</pre><pre class="example">
-;; Set JavaScript ranges.
+</pre><pre class="example">;; Set JavaScript ranges.
(setq js-range
(treesit-query-range
'html
"(script_element (raw_text) @capture)"))
(treesit-parser-set-included-ranges js js-range)
</pre></div>
-<p>We use a query pattern <code>(style_element (raw_text) @capture)</code> to
-find CSS nodes in the HTML parse tree. For how to write query
-patterns, see <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>.
+<p>We use a query pattern <code><span class="nolinebreak">(style_element</span> <span class="nolinebreak">(raw_text)</span> @capture)</code><!-- /@w -->
+to find <acronym>CSS</acronym> nodes in the <acronym>HTML</acronym> parse tree. For how
+to write query patterns, see <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>.
</p>
</div>
<hr>
<div class="header">
<p>
-Next: <a href="Tree_002dsitter-C-API.html">Tree-sitter C API Correspondence</a>, Previous: <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Next: <a href="Tree_002dsitter-major-modes.html">Developing major modes with tree-sitter</a>, Previous: <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
</div>
<hr>
<span id="Parser_002dbased-Font-Lock-1"></span><h4 class="subsection">24.6.10 Parser-based Font Lock</h4>
+<span id="index-parser_002dbased-font_002dlock"></span>
<p>Besides simple syntactic font lock and regexp-based font lock, Emacs
-also provides complete syntactic font lock with the help of a parser,
-currently provided by the tree-sitter library (see <a href="Parsing-Program-Source.html">Parsing Program Source</a>).
+also provides complete syntactic font lock with the help of a parser.
+Currently, Emacs uses the tree-sitter library (see <a href="Parsing-Program-Source.html">Parsing Program Source</a>) for this purpose.
</p>
-<dl class="def">
-<dt id="index-treesit_002dfont_002dlock_002denable"><span class="category">Function: </span><span><strong>treesit-font-lock-enable</strong><a href='#index-treesit_002dfont_002dlock_002denable' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function enables parser-based font lock in the current buffer.
-</p></dd></dl>
-
-<p>Parser-based font lock and other font lock mechanism are not mutually
+<p>Parser-based font lock and other font lock mechanisms are not mutually
exclusive. By default, if enabled, parser-based font lock runs first,
-then the simple syntactic font lock (if enabled), then regexp-based
-font lock.
+replacing syntactic font lock, then the regexp-based font lock.
</p>
<p>Although parser-based font lock doesn’t share the same customization
-variables with regexp-based font lock, parser-based font lock uses
-similar customization schemes. The tree-sitter counterpart of
-<var>font-lock-keywords</var> is <var>treesit-font-lock-settings</var>.
+variables with regexp-based font lock, it uses similar customization
+schemes. The tree-sitter counterpart of <var>font-lock-keywords</var> is
+<var>treesit-font-lock-settings</var>.
+</p>
+<span id="index-tree_002dsitter-fontifications_002c-overview"></span>
+<span id="index-fontifications-with-tree_002dsitter_002c-overview"></span>
+<p>In general, tree-sitter fontification works as follows:
+</p>
+<ul>
+<li> A Lisp program (usually, part of a major mode) provides a <em>query</em>
+consisting of <em>patterns</em>, each pattern associated with a
+<em>capture name</em>.
+
+</li><li> The tree-sitter library finds the nodes in the parse tree
+that match these patterns, tags the nodes with the corresponding
+capture names, and returns them to the Lisp program.
+
+</li><li> The Lisp program uses the returned nodes to highlight the portions of
+buffer text corresponding to each node as appropriate, using the
+tagged capture names of the nodes to determine the correct
+fontification. For example, a node tagged <code>font-lock-keyword-face</code>
+would be highlighted in the <code>font-lock-keyword-face</code> face.
+</li></ul>
+
+<p>For more information about queries, patterns, and capture names, see
+<a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>.
+</p>
+<p>To set up tree-sitter fontification, a major mode should first set
+<code>treesit-font-lock-settings</code> with the output of
+<code>treesit-font-lock-rules</code>, then call
+<code>treesit-major-mode-setup</code>.
</p>
<dl class="def">
<dt id="index-treesit_002dfont_002dlock_002drules"><span class="category">Function: </span><span><strong>treesit-font-lock-rules</strong> <em>:keyword value query...</em><a href='#index-treesit_002dfont_002dlock_002drules' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function is used to set <var>treesit-font-lock-settings</var>. It
-takes care of compiling queries and other post-processing and outputs
-a value that <var>treesit-font-lock-settings</var> accepts. An example:
+takes care of compiling queries and other post-processing, and outputs
+a value that <var>treesit-font-lock-settings</var> accepts. Here’s an
+example:
</p>
<div class="example">
<pre class="example">(treesit-font-lock-rules
:language 'javascript
+ :feature 'constant
:override t
'((true) @font-lock-constant-face
(false) @font-lock-constant-face)
:language 'html
+ :feature 'script
"(script_element) @font-lock-builtin-face")
</pre></div>
<p>This function takes a list of text or s-exp queries. Before each
-query, there are <var>:keyword</var> and <var>value</var> pairs that configure
-that query. The <code>:lang</code> keyword sets the query’s language and
-every query must specify the language. Other keywords are optional:
+query, there are <var>:keyword</var>-<var>value</var> pairs that configure
+that query. The <code>:language</code> keyword sets the query’s language and
+every query must specify the language. The <code>:feature</code> keyword
+sets the feature name of the query. Users can control which features
+are enabled with <code>font-lock-maximum-decoration</code> and
+<code>treesit-font-lock-feature-list</code> (see below).
+</p>
+<p>Other keywords are optional:
</p>
<table>
<thead><tr><th width="15%">Keyword</th><th width="15%">Value</th><th width="60%">Description</th></tr></thead>
<tr><td width="15%"></td><td width="15%"><code>append</code></td><td width="60%">Append the new face to existing ones</td></tr>
<tr><td width="15%"></td><td width="15%"><code>prepend</code></td><td width="60%">Prepend the new face to existing ones</td></tr>
<tr><td width="15%"></td><td width="15%"><code>keep</code></td><td width="60%">Fill-in regions without an existing face</td></tr>
-<tr><td width="15%"><code>:toggle</code></td><td width="15%"><var>symbol</var></td><td width="60%">If non-nil, its value should be a variable name. The variable’s value
-(nil/non-nil) disables/enables the query during fontification.</td></tr>
-<tr><td width="15%"></td><td width="15%">nil</td><td width="60%">Always enable this query.</td></tr>
-<tr><td width="15%"><code>:level</code></td><td width="15%"><var>integer</var></td><td width="60%">If non-nil, its value should be the decoration level for this query.
-Decoration level is controlled by <code>font-lock-maximum-decoration</code>.</td></tr>
-<tr><td width="15%"></td><td width="15%">nil</td><td width="60%">Always enable this query.</td></tr>
</table>
-<p>Note that a query is applied only when both <code>:toggle</code> and
-<code>:level</code> permit it. <code>:level</code> is used for global,
-coarse-grained control, whereas <code>:toggle</code> is for local,
-fine-grained control.
-</p>
-<p>Capture names in <var>query</var> should be face names like
+<p>Lisp programs mark patterns in the query with capture names (names
+that start with <code>@</code>), and tree-sitter will return matched nodes
+tagged with those same capture names. For the purpose of
+fontification, capture names in <var>query</var> should be face names like
<code>font-lock-keyword-face</code>. The captured node will be fontified
-with that face. Capture names can also be function names, in which
-case the function is called with (<var>start</var> <var>end</var> <var>node</var>),
-where <var>start</var> and <var>end</var> are the start and end position of the
-node in buffer, and <var>node</var> is the node itself. If a capture name
-is both a face and a function, the face takes priority. If a capture
-name is not a face name nor a function name, it is ignored.
+with that face.
+</p>
+<span id="index-treesit_002dfontify_002dwith_002doverride"></span>
+<p>Capture names can also be function names, in which case the function
+is called with 4 arguments: <var>node</var>, <var>override</var>, <var>start</var>,
+and <var>end</var>, where <var>node</var> is the node itself, <var>override</var> is
+the override property of the rule which captured this node, and
+<var>start</var> and <var>end</var> limit the region in which this function
+should fontify. (If this function wants to respect the <var>override</var>
+argument, it can use <code>treesit-fontify-with-override</code>.)
+</p>
+<p>In addition to these 4 arguments, the function should accept further
+optional arguments, for future extensibility.
+</p>
+<p>If a capture name is both a face and a function, the face takes
+priority. If a capture name is neither a face nor a function, it is
+ignored.
</p></dd></dl>
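+<p>As an illustration of a function used as a capture name, here is a
+minimal sketch. The function name <code>my-fontify-comment</code> is
+hypothetical, and the call to <code>treesit-fontify-with-override</code>
+assumes that its first four arguments are the start position, the end
+position, the face, and the override value:
+</p>
+<div class="example">
+<pre class="example">;; Hypothetical fontification function, not part of Emacs.
+(defun my-fontify-comment (node override start end &rest _)
+  "Fontify NODE as a comment, restricted to the region START..END."
+  (let ((beg (max start (treesit-node-start node)))
+        (stop (min end (treesit-node-end node))))
+    ;; Do nothing when NODE lies entirely outside the region.
+    (when (> stop beg)
+      (treesit-fontify-with-override
+       beg stop 'font-lock-comment-face override))))
+
+(treesit-font-lock-rules
+ :language 'javascript
+ :feature 'comment
+ '((comment) @my-fontify-comment))
+</pre></div>
+<p>The <code>&rest _</code> parameter lets the function ignore any
+arguments that future versions might add.
+</p>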
+<p>Contextual entities, like multi-line strings, or <code>/* */</code> style
+comments, need special care, because a change in these entities might
+cause changes in a large portion of the buffer. For example, inserting
+the closing comment delimiter <code>*/</code> will change all the text
+between it and the opening delimiter to comment face. Such entities
+should be captured under the special name <code>contextual</code>, so Emacs can
+correctly update their fontification. Here is an example for
+comments:
+</p>
+<div class="example">
+<pre class="example">(treesit-font-lock-rules
+ :language 'javascript
+ :feature 'comment
+ :override t
+ '((comment) @font-lock-comment-face
+ (comment) @contextual))
+</pre></div>
+
<dl class="def">
-<dt id="index-treesit_002dfont_002dlock_002dsettings"><span class="category">Variable: </span><span><strong>treesit-font-lock-settings</strong><a href='#index-treesit_002dfont_002dlock_002dsettings' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>A list of <var>setting</var>s for tree-sitter font lock. The exact format
-of this variable is considered internal. One should always use
-<code>treesit-font-lock-rules</code> to set this variable.
+<dt id="index-treesit_002dfont_002dlock_002dfeature_002dlist"><span class="category">Variable: </span><span><strong>treesit-font-lock-feature-list</strong><a href='#index-treesit_002dfont_002dlock_002dfeature_002dlist' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This is a list of lists of feature symbols. Each element of the list
+is a list that represents a decoration level.
+<code>font-lock-maximum-decoration</code> controls which levels are
+activated.
</p>
-<p>Each <var>setting</var> is of form
+<p>Each element of the list is a list of the form <code>(<var>feature</var> …)</code><!-- /@w -->, where each <var>feature</var> corresponds to the
+<code>:feature</code> value of a query defined in
+<code>treesit-font-lock-rules</code>. Removing a feature symbol from this
+list disables the corresponding query during font-lock.
</p>
-<div class="example">
-<pre class="example">(<var>language</var> <var>query</var>)
+<p>Common feature names, for many programming languages, include
+function-name, type, variable-name (left-hand-side or <acronym>LHS</acronym> of
+assignments), builtin, constant, keyword, string-interpolation,
+comment, doc, string, operator, preprocessor, escape-sequence, and key
+(in key-value pairs). Major modes are free to subdivide or extend
+these common features.
+</p>
+<p>For example, the value of this variable could be:
+</p><div class="example">
+<pre class="example">((comment string doc) ; level 1
+ (function-name keyword type builtin constant) ; level 2
+ (variable-name string-interpolation key)) ; level 3
</pre></div>
-<p>Each <var>setting</var> controls one parser (often of different language).
-And <var>language</var> is the language symbol (see <a href="Language-Definitions.html">Tree-sitter Language Definitions</a>); <var>query</var> is the query (see <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>).
+<p>Major modes should set this variable before calling
+<code>treesit-major-mode-setup</code>.
+</p>
+<span id="index-treesit_002dfont_002dlock_002drecompute_002dfeatures"></span>
+<p>For this variable to take effect, a Lisp program should call
+<code>treesit-font-lock-recompute-features</code> (which resets
+<code>treesit-font-lock-settings</code> accordingly), or
+<code>treesit-major-mode-setup</code> (which calls
+<code>treesit-font-lock-recompute-features</code>).
+</p></dd></dl>
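+<p>For instance, a feature could be switched off by removing its symbol
+from every level and then recomputing. The helper
+<code>my-remove-feature</code> below is hypothetical; only
+<code>treesit-font-lock-recompute-features</code> and the variable itself
+come from Emacs:
+</p>
+<div class="example">
+<pre class="example">;; Hypothetical helper, not part of Emacs.
+(defun my-remove-feature (feature feature-list)
+  "Return FEATURE-LIST with FEATURE removed from every level."
+  ;; `remq' is non-destructive, so the original list is left intact.
+  (mapcar (lambda (level) (remq feature level)) feature-list))
+
+;; Disable the `doc' feature in the current buffer.
+(setq-local treesit-font-lock-feature-list
+            (my-remove-feature 'doc treesit-font-lock-feature-list))
+(treesit-font-lock-recompute-features)
+</pre></div>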
+
+<dl class="def">
+<dt id="index-treesit_002dfont_002dlock_002dsettings"><span class="category">Variable: </span><span><strong>treesit-font-lock-settings</strong><a href='#index-treesit_002dfont_002dlock_002dsettings' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>A list of settings for tree-sitter based font lock. The exact format
+of this variable is considered internal. One should always use
+<code>treesit-font-lock-rules</code> to set this variable.
</p></dd></dl>
<p>Multi-language major modes should provide range functions in
</div>
<hr>
<span id="Parser_002dbased-Indentation-1"></span><h4 class="subsection">24.7.2 Parser-based Indentation</h4>
+<span id="index-parser_002dbased-indentation"></span>
-<p>When built with the tree-sitter library (see <a href="Parsing-Program-Source.html">Parsing Program Source</a>), Emacs could parse program source and produce a syntax tree.
-And this syntax tree can be used for indentation. For maximum
-flexibility, we could write a custom indent function that queries the
-syntax tree and indents accordingly for each language, but that would
-be a lot of work. It is more convenient to use the simple indentation
-engine described below: we only need to write some indentation rules
+<p>When built with the tree-sitter library (see <a href="Parsing-Program-Source.html">Parsing Program Source</a>), Emacs is capable of parsing the program source and producing
+a syntax tree. This syntax tree can be used for guiding the program
+source indentation commands. For maximum flexibility, it is possible
+to write a custom indentation function that queries the syntax tree
+and indents accordingly for each language, but that is a lot of work.
+It is more convenient to use the simple indentation engine described
+below: then the major mode needs only to provide some indentation rules
and the engine takes care of the rest.
</p>
-<p>To enable the indentation engine, set the value of
+<p>To enable the parser-based indentation engine, either set
+<var>treesit-simple-indent-rules</var> and call
+<code>treesit-major-mode-setup</code>, or equivalently, set the value of
<code>indent-line-function</code> to <code>treesit-indent</code>.
</p>
<dl class="def">
<dt id="index-treesit_002dindent_002dfunction"><span class="category">Variable: </span><span><strong>treesit-indent-function</strong><a href='#index-treesit_002dindent_002dfunction' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This variable stores the actual function called by
<code>treesit-indent</code>. By default, its value is
-<code>treesit-simple-indent</code>. In the future we might add other
+<code>treesit-simple-indent</code>. In the future we might add other,
more complex indentation engines.
</p></dd></dl>
<span id="Writing-indentation-rules"></span><h3 class="heading">Writing indentation rules</h3>
+<span id="index-indentation-rules_002c-for-parser_002dbased-indentation"></span>
<dl class="def">
<dt id="index-treesit_002dsimple_002dindent_002drules"><span class="category">Variable: </span><span><strong>treesit-simple-indent-rules</strong><a href='#index-treesit_002dsimple_002dindent_002drules' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This local variable stores indentation rules for every language. It is
-a list of
-</p>
-<div class="example">
-<pre class="example">(<var>language</var> . <var>rules</var>)
-</pre></div>
-
-<p>where <var>language</var> is a language symbol, and <var>rules</var> is a list
-of
-</p>
-<div class="example">
-<pre class="example">(<var>matcher</var> <var>anchor</var> <var>offset</var>)
-</pre></div>
-
-<p>First Emacs passes the node at point to <var>matcher</var>, if it return
-non-nil, this rule applies. Then Emacs passes the node to
-<var>anchor</var>, it returns a point. Emacs takes the column number of
-that point, add <var>offset</var> to it, and the result is the indent for
-the current line.
+<dd><p>This local variable stores indentation rules for every language. It is
+a list whose elements have the form <code>(<var>language</var> . <var>rules</var>)</code><!-- /@w -->, where
+<var>language</var> is a language symbol, and <var>rules</var> is a list of
+rules, each of the form <code>(<var>matcher</var> <var>anchor</var> <var>offset</var>)</code><!-- /@w -->.
+</p>
+<p>First, Emacs passes the smallest tree-sitter node at the beginning of
+the current line to <var>matcher</var>; if it returns non-<code>nil</code>, this
+rule is applicable. Then Emacs passes the node to <var>anchor</var>, which
+returns a buffer position. Emacs takes the column number of that
+position, adds <var>offset</var> to it, and the result is the indentation
+column for the current line.
</p>
<p>The <var>matcher</var> and <var>anchor</var> are functions, and Emacs provides
-convenient presets for them. You can skip over to
-<code>treesit-simple-indent-presets</code> below, those presets should be
-more than enough.
-</p>
-<p>A <var>matcher</var> or an <var>anchor</var> is a function that takes three
-arguments (<var>node</var> <var>parent</var> <var>bol</var>). Argument <var>bol</var> is
-the point at where we are indenting: the position of the first
-non-whitespace character from the beginning of line; <var>node</var> is the
-largest (highest-in-tree) node that starts at that point; <var>parent</var>
-is the parent of <var>node</var>. A <var>matcher</var> returns nil/non-nil, and
-<var>anchor</var> returns a point.
+convenient defaults for them.
+</p>
+<p>Each <var>matcher</var> or <var>anchor</var> is a function that takes three
+arguments: <var>node</var>, <var>parent</var>, and <var>bol</var>. The argument
+<var>bol</var> is the buffer position whose indentation is required: the
+position of the first non-whitespace character after the beginning of
+the line. The argument <var>node</var> is the largest (highest-in-tree)
+node that starts at that position; and <var>parent</var> is the parent of
+<var>node</var>. However, when that position is in whitespace or inside
+a multi-line string, no node starts at that position, so
+<var>node</var> is <code>nil</code>. In that case, <var>parent</var> would be the
+smallest node that spans that position.
+</p>
+<p>Emacs finds <var>bol</var>, <var>node</var> and <var>parent</var> and
+passes them to each <var>matcher</var> and <var>anchor</var>. <var>matcher</var>
+should return non-<code>nil</code> if the rule is applicable, and
+<var>anchor</var> should return a buffer position.
</p></dd></dl>
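+<p>To illustrate the calling convention, here is a minimal sketch of a
+hand-written matcher and anchor. Both function names are hypothetical,
+and the node type <code>"}"</code> is an assumption about the grammar in
+use:
+</p>
+<div class="example">
+<pre class="example">;; Hypothetical matcher: non-nil when NODE is a closing brace.
+(defun my-closing-brace-p (node _parent _bol &rest _)
+  (equal (treesit-node-type node) "}"))
+
+;; Hypothetical anchor: the first non-whitespace position on the
+;; line where PARENT starts, or nil when there is no PARENT.
+(defun my-parent-line-start (_node parent _bol &rest _)
+  (when parent
+    (save-excursion
+      (goto-char (treesit-node-start parent))
+      (back-to-indentation)
+      (point))))
+</pre></div>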
<dl class="def">
<dt id="index-treesit_002dsimple_002dindent_002dpresets"><span class="category">Variable: </span><span><strong>treesit-simple-indent-presets</strong><a href='#index-treesit_002dsimple_002dindent_002dpresets' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This is a list of presets for <var>matcher</var>s and <var>anchor</var>s in
-<code>treesit-simple-indent-rules</code>. Each of them represent a function
-that takes <var>node</var>, <var>parent</var> and <var>bol</var> as arguments.
-</p>
-<div class="example">
-<pre class="example">no-node
-</pre></div>
-
-<p>This matcher matches the case where <var>node</var> is nil, i.e., there is
-no node that starts at <var>bol</var>. This is the case when <var>bol</var> is
-at an empty line or inside a multi-line string, etc.
-</p>
-<div class="example">
-<pre class="example">(parent-is <var>type</var>)
-</pre></div>
-
-<p>This matcher matches if <var>parent</var>’s type is <var>type</var>.
-</p>
-<div class="example">
-<pre class="example">(node-is <var>type</var>)
-</pre></div>
-
-<p>This matcher matches if <var>node</var>’s type is <var>type</var>.
-</p>
-<div class="example">
-<pre class="example">(query <var>query</var>)
-</pre></div>
-
-<p>This matcher matches if querying <var>parent</var> with <var>query</var>
-captures <var>node</var>. The capture name does not matter.
-</p>
-<div class="example">
-<pre class="example">(match <var>node-type</var> <var>parent-type</var>
- <var>node-field</var> <var>node-index-min</var> <var>node-index-max</var>)
-</pre></div>
-
-<p>This matcher checks if <var>node</var>’s type is <var>node-type</var>,
-<var>parent</var>’s type is <var>parent-type</var>, <var>node</var>’s field name in
-<var>parent</var> is <var>node-field</var>, and <var>node</var>’s index among its
-siblings is between <var>node-index-min</var> and <var>node-index-max</var>. If
-the value of a constraint is nil, this matcher doesn’t check for that
-constraint. For example, to match the first child where parent is
-<code>argument_list</code>, use
+<dd><p>This is a list of defaults for <var>matcher</var>s and <var>anchor</var>s in
+<code>treesit-simple-indent-rules</code>. Each of them represents a function
+that takes 3 arguments: <var>node</var>, <var>parent</var> and <var>bol</var>. The
+available default functions are:
+</p>
+<dl compact="compact">
+<dt id='index-no_002dnode'><span><code>no-node</code><a href='#index-no_002dnode' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This matcher is a function that is called with 3 arguments:
+<var>node</var>, <var>parent</var>, and <var>bol</var>, and returns non-<code>nil</code>,
+indicating a match, if <var>node</var> is <code>nil</code>, i.e., there is no
+node that starts at <var>bol</var>. This is the case when <var>bol</var> is on
+an empty line or inside a multi-line string, etc.
+</p>
+</dd>
+<dt id='index-parent_002dis'><span><code>parent-is</code><a href='#index-parent_002dis' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This matcher is a function of one argument, <var>type</var>; it returns a
+function that is called with 3 arguments: <var>node</var>, <var>parent</var>,
+and <var>bol</var>, and returns non-<code>nil</code> (i.e., a match) if
+<var>parent</var>’s type matches regexp <var>type</var>.
+</p>
+</dd>
+<dt id='index-node_002dis'><span><code>node-is</code><a href='#index-node_002dis' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This matcher is a function of one argument, <var>type</var>; it returns a
+function that is called with 3 arguments: <var>node</var>, <var>parent</var>,
+and <var>bol</var>, and returns non-<code>nil</code> if <var>node</var>’s type matches
+regexp <var>type</var>.
+</p>
+</dd>
+<dt id='index-query'><span><code>query</code><a href='#index-query' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This matcher is a function of one argument, <var>query</var>; it returns a
+function that is called with 3 arguments: <var>node</var>, <var>parent</var>,
+and <var>bol</var>, and returns non-<code>nil</code> if querying <var>parent</var>
+with <var>query</var> captures <var>node</var> (see <a href="Pattern-Matching.html">Pattern Matching Tree-sitter Nodes</a>).
+</p>
+</dd>
+<dt id='index-match'><span><code>match</code><a href='#index-match' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This matcher is a function of 5 arguments: <var>node-type</var>,
+<var>parent-type</var>, <var>node-field</var>, <var>node-index-min</var>, and
+<var>node-index-max</var>. It returns a function that is called with 3
+arguments: <var>node</var>, <var>parent</var>, and <var>bol</var>, and returns
+non-<code>nil</code> if <var>node</var>’s type matches regexp <var>node-type</var>,
+<var>parent</var>’s type matches regexp <var>parent-type</var>, <var>node</var>’s
+field name in <var>parent</var> matches regexp <var>node-field</var>, and
+<var>node</var>’s index among its siblings is between <var>node-index-min</var>
+and <var>node-index-max</var>. If the value of an argument is <code>nil</code>,
+this matcher doesn’t check that argument. For example, to match the
+first child where parent is <code>argument_list</code>, use
</p>
<div class="example">
<pre class="example">(match nil "argument_list" nil 0 0)
</pre></div>
-<div class="example">
-<pre class="example">first-sibling
-</pre></div>
-
-<p>This anchor returns the start of the first child of <var>parent</var>.
-</p>
-<div class="example">
-<pre class="example">parent
-</pre></div>
-
-<p>This anchor returns the start of <var>parent</var>.
-</p>
-<div class="example">
-<pre class="example">parent-bol
-</pre></div>
-
-<p>This anchor returns the beginning of non-space characters on the line
-where <var>parent</var> is on.
-</p>
-<div class="example">
-<pre class="example">prev-sibling
-</pre></div>
-
-<p>This anchor returns the start of the previous sibling of <var>node</var>.
-</p>
-<div class="example">
-<pre class="example">no-indent
-</pre></div>
-
-<p>This anchor returns the start of <var>node</var>, i.e., no indent.
-</p>
-<div class="example">
-<pre class="example">prev-line
-</pre></div>
-
-<p>This anchor returns the first non-whitespace charater on the previous
-line.
-</p></dd></dl>
+</dd>
+<dt id='index-first_002dsibling'><span><code>first-sibling</code><a href='#index-first_002dsibling' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>,
+<var>parent</var>, and <var>bol</var>, and returns the start of the first child
+of <var>parent</var>.
+</p>
+</dd>
+<dt id='index-parent'><span><code>parent</code><a href='#index-parent' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>,
+<var>parent</var>, and <var>bol</var>, and returns the start of <var>parent</var>.
+</p>
+</dd>
+<dt id='index-parent_002dbol'><span><code>parent-bol</code><a href='#index-parent_002dbol' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>,
+<var>parent</var>, and <var>bol</var>, and returns the first non-space character
+on the line of <var>parent</var>.
+</p>
+</dd>
+<dt id='index-prev_002dsibling'><span><code>prev-sibling</code><a href='#index-prev_002dsibling' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>,
+<var>parent</var>, and <var>bol</var>, and returns the start of the previous
+sibling of <var>node</var>.
+</p>
+</dd>
+<dt id='index-no_002dindent'><span><code>no-indent</code><a href='#index-no_002dindent' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>,
+<var>parent</var>, and <var>bol</var>, and returns the start of <var>node</var>.
+</p>
+</dd>
+<dt id='index-prev_002dline'><span><code>prev-line</code><a href='#index-prev_002dline' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This anchor is a function that is called with 3 arguments: <var>node</var>,
+<var>parent</var>, and <var>bol</var>, and returns the first non-whitespace
+character on the previous line.
+</p></dd>
+</dl>
+
+</dd></dl>
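+<p>Putting the defaults together, a rule set for a hypothetical
+JavaScript-like mode might look like the sketch below. The node type
+names and the offset of 2 columns are assumptions for illustration:
+</p>
+<div class="example">
+<pre class="example">(setq-local treesit-simple-indent-rules
+            '((javascript
+               ;; A closing brace lines up with the line of its parent.
+               ((node-is "}") parent-bol 0)
+               ;; Statements inside a block are indented by 2 columns.
+               ((parent-is "statement_block") parent-bol 2)
+               ;; No node at BOL (e.g. an empty line): follow the parent.
+               (no-node parent-bol 0))))
+(treesit-major-mode-setup)
+</pre></div>
+<p>Rules are tried in order, so the more specific closing-brace rule is
+placed before the general block rule.
+</p>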
<span id="Indentation-utilities"></span><h3 class="heading">Indentation utilities</h3>
+<span id="index-utility-functions-for-parser_002dbased-indentation"></span>
-<p>Here are some utility functions that can help writing indentation
-rules.
+<p>Here are some utility functions that can help writing parser-based
+indentation rules.
</p>
<dl class="def">
<dt id="index-treesit_002dcheck_002dindent"><span class="category">Function: </span><span><strong>treesit-check-indent</strong> <em>mode</em><a href='#index-treesit_002dcheck_002dindent' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function checks current buffer’s indentation against major mode
-<var>mode</var>. It indents the current buffer in <var>mode</var> and compares
-the indentation with the current indentation. Then it pops up a diff
-buffer showing the difference. Correct indentation (target) is in
-green, current indentation is in red.
-</p></dd></dl>
-
-<p>It is also helpful to use <code>treesit-inspect-mode</code> when writing
-indentation rules.
+<dd><p>This function checks the current buffer’s indentation against major
+mode <var>mode</var>. It indents the current buffer according to
+<var>mode</var> and compares the results with the current indentation.
+Then it pops up a buffer showing the differences. Correct
+indentation (target) is shown in green color, current indentation is
+shown in red color. </p></dd></dl>
+
+<p>It is also helpful to use <code>treesit-inspect-mode</code> (see <a href="Language-Definitions.html">Tree-sitter Language Definitions</a>) when writing indentation rules.
</p>
</div>
<hr>
<hr>
<span id="Parsing-Program-Source-1"></span><h2 class="chapter">37 Parsing Program Source</h2>
+<span id="index-syntax-tree_002c-from-parsing-program-source"></span>
<p>Emacs provides various ways to parse program source text and produce a
-<em>syntax tree</em>. In a syntax tree, text is no longer a
-one-dimensional stream but a structured tree of nodes, where each node
-representing a piece of text. Thus a syntax tree can enable
-interesting features like precise fontification, indentation,
+<em>syntax tree</em>. In a syntax tree, text is no longer considered a
+one-dimensional stream of characters, but a structured tree of nodes,
+where each node represents a piece of text. Thus, a syntax tree can
+enable interesting features like precise fontification, indentation,
navigation, structured editing, etc.
</p>
<p>Emacs has a simple facility for parsing balanced expressions
-(see <a href="Parsing-Expressions.html">Parsing Expressions</a>). There is also SMIE library for generic
-navigation and indentation (see <a href="SMIE.html">Simple Minded Indentation Engine</a>).
+(see <a href="Parsing-Expressions.html">Parsing Expressions</a>). There is also the SMIE library for
+generic navigation and indentation (see <a href="SMIE.html">Simple Minded Indentation Engine</a>).
</p>
-<p>Emacs also provides integration with tree-sitter library
-(<a href="https://tree-sitter.github.io/tree-sitter">https://tree-sitter.github.io/tree-sitter</a>) if compiled with
-it. The tree-sitter library implements an incremental parser and has
-support from a wide range of programming languages.
+<p>In addition to those, Emacs also provides integration with
+<a href="https://tree-sitter.github.io/tree-sitter">the tree-sitter
+library</a> if support for it was compiled in. The tree-sitter library
+implements an incremental parser and supports a wide range of
+programming languages.
</p>
<dl class="def">
<dt id="index-treesit_002davailable_002dp"><span class="category">Function: </span><span><strong>treesit-available-p</strong><a href='#index-treesit_002davailable_002dp' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function returns non-nil if tree-sitter features are available
-for this Emacs instance.
+<dd><p>This function returns non-<code>nil</code> if tree-sitter features are
+available for the current Emacs session.
</p></dd></dl>
-<p>For tree-sitter integration with existing Emacs features,
-see <a href="Parser_002dbased-Font-Lock.html">Parser-based Font Lock</a>, <a href="Parser_002dbased-Indentation.html">Parser-based Indentation</a>, and
-<a href="List-Motion.html">Moving over Balanced Expressions</a>.
-</p>
-<p>About naming convention: use “tree-sitter” when referring to it as a
-noun, like <code>python-use-tree-sitter</code>, but use “treesit” for
-prefixes, like <code>python-treesit-indent-function</code>.
-</p>
-<p>To access the syntax tree of the text in a buffer, we need to first
-load a language definition and create a parser with it. Next, we can
-query the parser for specific nodes in the syntax tree. Then, we can
-access various information about the node, and we can pattern-match a
-node with a powerful syntax. Finally, we explain how to work with
-source files that mixes multiple languages. The following sections
-explain how to do each of the tasks in detail.
+<p>To be able to parse the program source using the tree-sitter library
+and access the syntax tree of the program, a Lisp program needs to
+load a language definition library, and create a parser for that
+language and the current buffer. After that, the Lisp program can
+query the parser about specific nodes of the syntax tree. Then, it
+can access various kinds of information about each node, and search
+for nodes using a powerful pattern-matching syntax. This chapter
+explains how to do all this, and also how a Lisp program can work with
+source files that mix multiple programming languages.
</p>
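+<p>As a rough sketch of that workflow, assuming a C language definition
+is installed and the current buffer contains C source, a Lisp program
+might do something like the following. The function name
+<code>my-c-number-nodes</code> is hypothetical and serves only as an
+illustration.
+</p>
+<div class="example">
+<pre class="example">;; A hypothetical helper, not part of Emacs.
+(defun my-c-number-nodes ()
+  "Return the `number_literal' nodes of the current buffer, or nil."
+  (when (treesit-available-p)
+    ;; Create (or reuse) a C parser for the current buffer.
+    (let* ((parser (treesit-parser-create 'c))
+           ;; The root node covers the whole buffer.
+           (root (treesit-parser-root-node parser)))
+      ;; Capture every number literal; the final argument asks
+      ;; for a plain list of nodes.
+      (treesit-query-capture
+       root '((number_literal) @number) nil nil t))))
+</pre></div>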
<ul class="section-toc">
<li><a href="Language-Definitions.html" accesskey="1">Tree-sitter Language Definitions</a></li>
<li><a href="Using-Parser.html" accesskey="2">Using Tree-sitter Parser</a></li>
<li><a href="Retrieving-Node.html" accesskey="3">Retrieving Node</a></li>
-<li><a href="Accessing-Node.html" accesskey="4">Accessing Node Information</a></li>
+<li><a href="Accessing-Node-Information.html" accesskey="4">Accessing Node Information</a></li>
<li><a href="Pattern-Matching.html" accesskey="5">Pattern Matching Tree-sitter Nodes</a></li>
<li><a href="Multiple-Languages.html" accesskey="6">Parsing Text in Multiple Languages</a></li>
-<li><a href="Tree_002dsitter-C-API.html" accesskey="7">Tree-sitter C API Correspondence</a></li>
+<li><a href="Tree_002dsitter-major-modes.html" accesskey="7">Developing major modes with tree-sitter</a></li>
+<li><a href="Tree_002dsitter-C-API.html" accesskey="8">Tree-sitter C API Correspondence</a></li>
</ul>
</div>
<hr>
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Parsing-Program-Source.html" rel="up" title="Parsing Program Source">
<link href="Multiple-Languages.html" rel="next" title="Multiple Languages">
-<link href="Accessing-Node.html" rel="prev" title="Accessing Node">
+<link href="Accessing-Node-Information.html" rel="prev" title="Accessing Node Information">
<style type="text/css">
<!--
a.copiable-anchor {visibility: hidden; text-decoration: none; line-height: 0em}
<div class="section" id="Pattern-Matching">
<div class="header">
<p>
-Next: <a href="Multiple-Languages.html" accesskey="n" rel="next">Parsing Text in Multiple Languages</a>, Previous: <a href="Accessing-Node.html" accesskey="p" rel="prev">Accessing Node Information</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Next: <a href="Multiple-Languages.html" accesskey="n" rel="next">Parsing Text in Multiple Languages</a>, Previous: <a href="Accessing-Node-Information.html" accesskey="p" rel="prev">Accessing Node Information</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<span id="Pattern-Matching-Tree_002dsitter-Nodes"></span><h3 class="section">37.5 Pattern Matching Tree-sitter Nodes</h3>
+<span id="index-pattern-matching-with-tree_002dsitter-nodes"></span>
-<p>Tree-sitter let us pattern match with a small declarative language.
-Pattern matching consists of two steps: first tree-sitter matches a
-<em>pattern</em> against nodes in the syntax tree, then it <em>captures</em>
-specific nodes in that pattern and returns the captured nodes.
+<span id="index-capturing_002c-tree_002dsitter-node"></span>
+<p>Tree-sitter lets Lisp programs match patterns using a small
+declarative language. This pattern matching consists of two steps:
+first tree-sitter matches a <em>pattern</em> against nodes in the syntax
+tree, then it <em>captures</em> specific nodes that matched the pattern
+and returns the captured nodes.
</p>
<p>We describe first how to write the most basic query pattern and how to
-capture nodes in a pattern, then the pattern-match function, finally
-more advanced pattern syntax.
+capture nodes in a pattern, then the pattern-matching function, and
+finally the more advanced pattern syntax.
</p>
<span id="Basic-query-syntax"></span><h3 class="heading">Basic query syntax</h3>
-<span id="index-Tree_002dsitter-query-syntax"></span>
-<span id="index-Tree_002dsitter-query-pattern"></span>
+<span id="index-tree_002dsitter-query-pattern-syntax"></span>
+<span id="index-pattern-syntax_002c-tree_002dsitter-query"></span>
+<span id="index-query_002c-tree_002dsitter"></span>
<p>A <em>query</em> consists of multiple <em>patterns</em>. Each pattern is an
s-expression that matches a certain node in the syntax tree. A
-pattern has the following shape:
+pattern has the form <code>(<var>type</var> (<var>child</var>…))</code><!-- /@w -->
</p>
-<div class="example">
-<pre class="example">(<var>type</var> <var>child</var>...)
-</pre></div>
-
<p>For example, a pattern that matches a <code>binary_expression</code> node that
contains <code>number_literal</code> child nodes would look like
</p>
<pre class="example">(binary_expression (number_literal))
</pre></div>
-<p>To <em>capture</em> a node in the query pattern above, append
-<code>@capture-name</code> after the node pattern you want to capture. For
-example,
+<p>To <em>capture</em> a node using the query pattern above, append
+<code>@<var>capture-name</var></code> after the node pattern you want to
+capture. For example,
</p>
<div class="example">
<pre class="example">(binary_expression (number_literal) @number-in-exp)
</pre></div>
<p>captures <code>number_literal</code> nodes that are inside a
-<code>binary_expression</code> node with capture name <code>number-in-exp</code>.
+<code>binary_expression</code> node with the capture name
+<code>number-in-exp</code>.
</p>
-<p>We can capture the <code>binary_expression</code> node too, with capture
-name <code>biexp</code>:
+<p>We can capture the <code>binary_expression</code> node as well, with, for
+example, the capture name <code>biexp</code>:
</p>
<div class="example">
<pre class="example">(binary_expression
<span id="Query-function"></span><h3 class="heading">Query function</h3>
-<p>Now we can introduce the query functions.
+<span id="index-query-functions_002c-tree_002dsitter"></span>
+<p>Now we can introduce the <em>query functions</em>.
</p>
<dl class="def">
<dt id="index-treesit_002dquery_002dcapture"><span class="category">Function: </span><span><strong>treesit-query-capture</strong> <em>node query &optional beg end node-only</em><a href='#index-treesit_002dquery_002dcapture' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function matches patterns in <var>query</var> in <var>node</var>.
-Parameter <var>query</var> can be either a string, a s-expression, or a
+<dd><p>This function matches patterns in <var>query</var> within <var>node</var>.
+The argument <var>query</var> can be either a string, an s-expression, or a
compiled query object. For now, we focus on the string syntax;
s-expression syntax and compiled query are described at the end of the
section.
</p>
-<p>Parameter <var>node</var> can also be a parser or a language symbol. A
+<p>The argument <var>node</var> can also be a parser or a language symbol. A
parser means using its root node; a language symbol means finding or
creating a parser for that language in the current buffer and using its
root node.
</p>
-<p>The function returns all captured nodes in a list of
-<code>(<var>capture_name</var> . <var>node</var>)</code>. If <var>node-only</var> is
-non-nil, a list of node is returned instead. If <var>beg</var> and
-<var>end</var> are both non-nil, this function only pattern matches nodes
-in that range.
+<p>The function returns all the captured nodes in a list of the form
+<code>(<var><span class="nolinebreak">capture_name</span></var> . <var>node</var>)</code><!-- /@w -->. If <var>node-only</var> is
+non-<code>nil</code>, it returns the list of nodes instead. By default the
+entire text of <var>node</var> is searched, but if <var>beg</var> and <var>end</var>
+are both non-<code>nil</code>, they specify the region of buffer text where
+this function should match nodes. Any matching node whose span
+overlaps with the region between <var>beg</var> and <var>end</var> is captured;
+it doesn’t have to be completely inside the region.
</p>
<span id="index-treesit_002dquery_002derror"></span>
-<p>This function raise a <var>treesit-query-error</var> if <var>query</var> is
-malformed. The signal data contains a description of the specific
-error. You can use <code>treesit-query-validate</code> to debug the query.
+<span id="index-treesit_002dquery_002dvalidate"></span>
+<p>This function signals the <code>treesit-query-error</code> error if
+<var>query</var> is malformed. The signal data contains a description of
+the specific error. You can use <code>treesit-query-validate</code> to
+validate and debug the query.
</p></dd></dl>
-<p>For example, suppose <var>node</var>’s content is <code>1 + 2</code>, and
+<p>For example, suppose <var>node</var>’s text is <code>1 + 2</code>, and
<var>query</var> is
</p>
<div class="example">
(number_literal) @number-in-exp) @biexp")
</pre></div>
-<p>Querying that query would return
+<p>Matching that query would return
</p>
<div class="example">
<pre class="example">(treesit-query-capture node query)
(number-in-exp . <var><node for "2"></var>))
</pre></div>
-<p>As we mentioned earlier, a <var>query</var> could contain multiple
-patterns. For example, it could have two top-level patterns:
+<p>As mentioned earlier, <var>query</var> could contain multiple patterns.
+For example, it could have two top-level patterns:
</p>
<div class="example">
<pre class="example">(setq query
<dl class="def">
<dt id="index-treesit_002dquery_002dstring"><span class="category">Function: </span><span><strong>treesit-query-string</strong> <em>string query language</em><a href='#index-treesit_002dquery_002dstring' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function parses <var>string</var> with <var>language</var>, pattern matches
-its root node with <var>query</var>, and returns the result.
+<dd><p>This function parses <var>string</var> with <var>language</var>, matches its
+root node with <var>query</var>, and returns the result.
</p></dd></dl>
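+<p>For instance, assuming a C language definition is installed, the
+following sketch parses a short string of C code and captures the
+binary expression in it, without needing a buffer that contains the
+code:
+</p>
+<div class="example">
+<pre class="example">;; The query is written in the string syntax described above.
+(treesit-query-string
+ "int a = 1 + 2;"
+ "(binary_expression) @biexp"
+ 'c)
+</pre></div>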
<span id="More-query-syntax"></span><h3 class="heading">More query syntax</h3>
-<p>Besides node type and capture, tree-sitter’s query syntax can express
-anonymous node, field name, wildcard, quantification, grouping,
-alternation, anchor, and predicate.
+<p>Besides node types and captures, tree-sitter’s pattern syntax can
+express anonymous nodes, field names, wildcards, quantification,
+grouping, alternation, anchors, and predicates.
</p>
<span id="Anonymous-node"></span><h4 class="subheading">Anonymous node</h4>
<span id="Wild-card"></span><h4 class="subheading">Wild card</h4>
-<p>In a query pattern, ‘<samp>(_)</samp>’ matches any named node, and ‘<samp>_</samp>’
-matches any named and anonymous node. For example, to capture any
-named child of a <code>binary_expression</code> node, the pattern would be
+<p>In a pattern, ‘<samp>(_)</samp>’ matches any named node, and ‘<samp>_</samp>’ matches
+any named or anonymous node. For example, to capture any named child
+of a <code>binary_expression</code> node, the pattern would be
</p>
<div class="example">
<pre class="example">(binary_expression (_) @in_biexp)
<span id="Field-name"></span><h4 class="subheading">Field name</h4>
-<p>We can capture child nodes that has specific field names:
+<p>It is possible to capture child nodes that have specific field names.
+In the pattern below, <code>declarator</code> and <code>body</code> are field
+names, indicated by the colon following them.
</p>
<div class="example">
<pre class="example">(function_definition
body: (_) @func-body)
</pre></div>
-<p>We can also capture a node that doesn’t have certain field, say, a
-<code>function_definition</code> without a <code>body</code> field.
+<p>It is also possible to capture a node that doesn’t have a certain
+field, say, a <code>function_definition</code> without a <code>body</code> field.
</p>
<div class="example">
<pre class="example">(function_definition !body) @func-no-body
<span id="Quantify-node"></span><h4 class="subheading">Quantify node</h4>
+<span id="index-quantify-node_002c-tree_002dsitter"></span>
<p>Tree-sitter recognizes quantification operators ‘<samp>*</samp>’, ‘<samp>+</samp>’ and
‘<samp>?</samp>’. Their meanings are the same as in regular expressions:
‘<samp>*</samp>’ matches the preceding pattern zero or more times, ‘<samp>+</samp>’
matches one or more times, and ‘<samp>?</samp>’ matches zero or one time.
</p>
-<p>For example, this pattern matches <code>type_declaration</code> nodes
-that has <em>zero or more</em> <code>long</code> keyword.
+<p>For example, the following pattern matches <code>type_declaration</code>
+nodes that have <em>zero or more</em> <code>long</code> keywords.
</p>
<div class="example">
<pre class="example">(type_declaration "long"*) @long-type
</pre></div>
-<p>And this pattern matches a type declaration that has zero or one
+<p>The following pattern matches a type declaration that has zero or one
<code>long</code> keyword:
</p>
<div class="example">
<span id="Grouping"></span><h4 class="subheading">Grouping</h4>
-<p>Similar to groups in regular expression, we can bundle patterns into a
-group and apply quantification operators to it. For example, to
+<p>Similar to groups in regular expressions, we can bundle patterns into
+groups and apply quantification operators to them. For example, to
express a comma-separated list of identifiers, one could write
</p>
<div class="example">
<span id="Alternation"></span><h4 class="subheading">Alternation</h4>
<p>Again, similar to regular expressions, we can express “match any one
-from this group of patterns” in the query pattern. The syntax is a
-list of patterns enclosed in square brackets. For example, to capture
-some keywords in C, the query pattern would be
+from this group of patterns” in a pattern. The syntax is a list of
+patterns enclosed in square brackets. For example, to capture some
+keywords in C, the pattern would be
</p>
<div class="example">
<pre class="example">[
<div class="example">
<pre class="example">;; Anchor the child with the end of its parent.
(compound_expression (_) @last-child .)
+</pre><pre class="example">
-;; Anchor the child with the beginning of its parent.
+</pre><pre class="example">;; Anchor the child with the beginning of its parent.
(compound_expression . (_) @first-child)
+</pre><pre class="example">
-;; Anchor two adjacent children.
+</pre><pre class="example">;; Anchor two adjacent children.
(compound_expression
(_) @prev-child
.
</p>
<span id="Predicate"></span><h4 class="subheading">Predicate</h4>
-<p>We can add predicate constraints to a pattern. For example, if we use
-the following query pattern
+<p>It is possible to add predicate constraints to a pattern. For
+example, with the following pattern:
</p>
<div class="example">
<pre class="example">(
)
</pre></div>
-<p>Then tree-sitter only matches arrays where the first element equals to
+<p>tree-sitter only matches arrays where the first element is equal to
the last element. To attach a predicate to a pattern, we need to
-group then together. A predicate always starts with a ‘<samp>#</samp>’.
+group them together. A predicate always starts with a ‘<samp>#</samp>’.
Currently there are two predicates, <code>#equal</code> and <code>#match</code>.
</p>
<dl class="def">
<dt id="index-equal-1"><span class="category">Predicate: </span><span><strong>equal</strong> <em>arg1 arg2</em><a href='#index-equal-1' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>Matches if <var>arg1</var> equals to <var>arg2</var>. Arguments can be either a
-string or a capture name. Capture names represent the text that the
+<dd><p>Matches if <var>arg1</var> is equal to <var>arg2</var>. Arguments can be either
+strings or capture names. Capture names represent the text that the
captured node spans in the buffer.
</p></dd></dl>
<dl class="def">
-<dt id="index-match"><span class="category">Predicate: </span><span><strong>match</strong> <em>regexp capture-name</em><a href='#index-match' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>Matches if the text that <var>capture-name</var>’s node spans in the buffer
+<dt id="index-match-1"><span class="category">Predicate: </span><span><strong>match</strong> <em>regexp capture-name</em><a href='#index-match-1' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>Matches if the text that <var>capture-name</var>’s node spans in the buffer
matches regular expression <var>regexp</var>. Matching is case-sensitive.
</p></dd></dl>
-<p>Note that a predicate can only refer to capture names appeared in the
-same pattern. Indeed, it makes little sense to refer to capture names
-in other patterns anyway.
+<p>Note that a predicate can only refer to capture names that appear in
+the same pattern. Indeed, it makes little sense to refer to capture
+names in other patterns.
</p>
<span id="S_002dexpression-patterns"></span><h3 class="heading">S-expression patterns</h3>
-<p>Besides strings, Emacs provides a s-expression based syntax for query
-patterns. It largely resembles the string-based syntax. For example,
-the following pattern
+<span id="index-tree_002dsitter-patterns-as-sexps"></span>
+<span id="index-patterns_002c-tree_002dsitter_002c-in-sexp-form"></span>
+<p>Besides strings, Emacs provides an s-expression-based syntax for
+tree-sitter patterns. It largely resembles the string-based syntax.
+For example, the following query
</p>
<div class="example">
<pre class="example">(treesit-query-capture
["return" "break"] @keyword))
</pre></div>
-<p>Most pattern syntax can be written directly as strange but
-never-the-less valid s-expressions. Only a few of them needs
-modification:
+<p>Most patterns can be written directly as strange but nevertheless
+valid s-expressions. Only a few of them need modification:
</p>
<ul>
<li> Anchor ‘<samp>.</samp>’ is written as <code>:anchor</code>.
<span id="Compiling-queries"></span><h3 class="heading">Compiling queries</h3>
-<p>If a query will be used repeatedly, especially in tight loops, it is
-important to compile that query, because a compiled query is much
-faster than an uncompiled one. A compiled query can be used anywhere
-a query is accepted.
+<span id="index-compiling-tree_002dsitter-queries"></span>
+<span id="index-queries_002c-compiling"></span>
+<p>If a query is intended to be used repeatedly, especially in tight
+loops, it is important to compile that query, because a compiled query
+is much faster than an uncompiled one. A compiled query can be used
+anywhere a query is accepted.
</p>
<dl class="def">
<dt id="index-treesit_002dquery_002dcompile"><span class="category">Function: </span><span><strong>treesit-query-compile</strong> <em>language query</em><a href='#index-treesit_002dquery_002dcompile' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function compiles <var>query</var> for <var>language</var> into a compiled
query object and returns it.
</p>
-<p>This function raise a <var>treesit-query-error</var> if <var>query</var> is
-malformed. The signal data contains a description of the specific
-error. You can use <code>treesit-query-validate</code> to debug the query.
+<p>This function signals the <code>treesit-query-error</code> error if
+<var>query</var> is malformed. The signal data contains a description of
+the specific error. You can use <code>treesit-query-validate</code> to
+validate and debug the query.
+</p></dd></dl>
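+<p>As an illustration, a program that runs the same query many times
+could compile it once and reuse the resulting object. The sketch
+below assumes a C language definition is installed; the names
+<code>my-c-number-query</code> and <code>my-c-count-numbers</code> are
+hypothetical.
+</p>
+<div class="example">
+<pre class="example">;; Compile the query once, when this code is loaded.
+(defvar my-c-number-query
+  (treesit-query-compile 'c '((number_literal) @number))
+  "Compiled query that captures C number literals.")
+
+(defun my-c-count-numbers ()
+  "Return the number of number literals in the current buffer."
+  ;; Passing the language symbol `c' as the first argument finds
+  ;; or creates a C parser for the current buffer.
+  (length (treesit-query-capture
+           'c my-c-number-query nil nil t)))
+</pre></div>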
+
+<dl class="def">
+<dt id="index-treesit_002dquery_002dlanguage"><span class="category">Function: </span><span><strong>treesit-query-language</strong> <em>query</em><a href='#index-treesit_002dquery_002dlanguage' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This function returns the language of <var>query</var>.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dquery_002dexpand"><span class="category">Function: </span><span><strong>treesit-query-expand</strong> <em>query</em><a href='#index-treesit_002dquery_002dexpand' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function expands the s-expression <var>query</var> into a string
-query.
+<dd><p>This function converts the s-expression <var>query</var> into the string
+format.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dpattern_002dexpand"><span class="category">Function: </span><span><strong>treesit-pattern-expand</strong> <em>pattern</em><a href='#index-treesit_002dpattern_002dexpand' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function expands the s-expression <var>pattern</var> into a string
-pattern.
+<dd><p>This function converts the s-expression <var>pattern</var> into the string
+format.
</p></dd></dl>
-<p>Finally, tree-sitter project’s documentation about
-pattern-matching can be found at
+<p>For more details, read the tree-sitter project’s documentation about
+pattern-matching, which can be found at
<a href="https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries">https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries</a>.
</p>
</div>
<hr>
<div class="header">
<p>
-Next: <a href="Multiple-Languages.html">Parsing Text in Multiple Languages</a>, Previous: <a href="Accessing-Node.html">Accessing Node Information</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Next: <a href="Multiple-Languages.html">Parsing Text in Multiple Languages</a>, Previous: <a href="Accessing-Node-Information.html">Accessing Node Information</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<link href="Index.html" rel="index" title="Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Parsing-Program-Source.html" rel="up" title="Parsing Program Source">
-<link href="Accessing-Node.html" rel="next" title="Accessing Node">
+<link href="Accessing-Node-Information.html" rel="next" title="Accessing Node Information">
<link href="Using-Parser.html" rel="prev" title="Using Parser">
<style type="text/css">
<!--
<div class="section" id="Retrieving-Node">
<div class="header">
<p>
-Next: <a href="Accessing-Node.html" accesskey="n" rel="next">Accessing Node Information</a>, Previous: <a href="Using-Parser.html" accesskey="p" rel="prev">Using Tree-sitter Parser</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Next: <a href="Accessing-Node-Information.html" accesskey="n" rel="next">Accessing Node Information</a>, Previous: <a href="Using-Parser.html" accesskey="p" rel="prev">Using Tree-sitter Parser</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<span id="Retrieving-Node-1"></span><h3 class="section">37.3 Retrieving Node</h3>
+<span id="index-retrieve-node_002c-tree_002dsitter"></span>
+<span id="index-tree_002dsitter_002c-find-node"></span>
+<span id="index-get-node_002c-tree_002dsitter"></span>
-<span id="index-tree_002dsitter-find-node"></span>
-<span id="index-tree_002dsitter-get-node"></span>
-<p>Before we continue, lets go over some conventions of tree-sitter
-functions.
+<span id="index-terminology_002c-for-tree_002dsitter-functions"></span>
+<p>Here are some terminology and conventions we use when documenting
+tree-sitter functions.
</p>
<p>We talk about a node being “smaller” or “larger”, and “lower” or
“higher”. A smaller and lower node is lower in the syntax tree and
-therefore spans a smaller piece of text; a larger and higher node is
-higher up in the syntax tree, containing many smaller nodes as its
-children, and therefore spans a larger piece of text.
+therefore spans a smaller portion of buffer text; a larger and higher
+node is higher up in the syntax tree, contains many smaller nodes as
+its children, and therefore spans a larger portion of text.
</p>
-<p>When a function cannot find a node, it returns nil. And for the
-convenience for function chaining, all the functions that take a node
-as argument and returns a node accept the node to be nil; in that
-case, the function just returns nil.
+<p>When a function cannot find a node, it returns <code>nil</code>. For
+convenience, all functions that take a node as an argument and return
+a node also accept <code>nil</code> as the node argument, and in that case
+just return <code>nil</code>.
</p>
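+<p>This makes it possible to chain such functions without checking
+each intermediate result. For example, assuming the current buffer
+has a C parser, the following sketch returns the grandparent of the
+node at point, or <code>nil</code> if that node has no parent or no
+grandparent:
+</p>
+<div class="example">
+<pre class="example">(treesit-node-parent
+ (treesit-node-parent
+  (treesit-node-at (point) 'c)))
+</pre></div>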
<span id="index-treesit_002dnode_002doutdated"></span>
<p>Nodes are not automatically updated when the associated buffer is
-modified. And there is no way to update a node once it is retrieved.
-Using an outdated node throws <code>treesit-node-outdated</code> error.
+modified, and there is no way to update a node once it is retrieved.
+Using an outdated node signals the <code>treesit-node-outdated</code> error.
</p>
<span id="Retrieving-node-from-syntax-tree"></span><h3 class="heading">Retrieving node from syntax tree</h3>
+<span id="index-retrieving-tree_002dsitter-nodes"></span>
+<span id="index-syntax-tree_002c-retrieving-nodes"></span>
<dl class="def">
-<dt id="index-treesit_002dnode_002dat"><span class="category">Function: </span><span><strong>treesit-node-at</strong> <em>beg end &optional parser-or-lang named</em><a href='#index-treesit_002dnode_002dat' class='copiable-anchor'> ¶</a></span></dt>
+<dt id="index-treesit_002dnode_002dat"><span class="category">Function: </span><span><strong>treesit-node-at</strong> <em>pos &optional parser-or-lang named</em><a href='#index-treesit_002dnode_002dat' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function returns the <em>smallest</em> node that starts at or after
-the <var>point</var>. In other words, the start of the node is equal or
-greater than <var>point</var>.
+the buffer position <var>pos</var>. In other words, the start of the node
+is greater than or equal to <var>pos</var>.
</p>
-<p>When <var>parser-or-lang</var> is nil, this function uses the first parser
-in <code>(treesit-parser-list)</code> in the current buffer. If
-<var>parser-or-lang</var> is a parser object, it use that parser; if
-<var>parser-or-lang</var> is a language, it finds the first parser using
-that language in <code>(treesit-parser-list)</code> and use that.
+<p>When <var>parser-or-lang</var> is <code>nil</code> or omitted, this function uses
+the first parser in <code>(treesit-parser-list)</code> of the current
+buffer. If <var>parser-or-lang</var> is a parser object, it uses that
+parser; if <var>parser-or-lang</var> is a language, it finds the first
+parser using that language in <code>(treesit-parser-list)</code>, and uses
+that.
</p>
-<p>If <var>named</var> is non-nil, this function looks for a named node
+<p>If <var>named</var> is non-<code>nil</code>, this function looks for a named node
only (see <a href="Language-Definitions.html#tree_002dsitter-named-node">named node</a>).
</p>
+<p>When <var>pos</var> is after all the text in the buffer, technically there
+is no node after <var>pos</var>. But for convenience, this function will
+return the last leaf node in the parse tree. If <var>strict</var> is
+non-<code>nil</code>, this function will strictly comply with the semantics and
+return <code>nil</code>.
+</p>
<p>Example:
-</p><div class="example">
+</p>
+<div class="example">
<pre class="example">;; Find the node at point in a C parser's syntax tree.
(treesit-node-at (point) 'c)
- </pre></div>
+ ⇒ #<treesit-node (primitive_type) in 23-27>
+</pre></div>
</dd></dl>
<dl class="def">
<dt id="index-treesit_002dnode_002don"><span class="category">Function: </span><span><strong>treesit-node-on</strong> <em>beg end &optional parser-or-lang named</em><a href='#index-treesit_002dnode_002don' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function returns the <em>smallest</em> node that covers the span
-from <var>beg</var> to <var>end</var>. In other words, the start of the node is
-less or equal to <var>beg</var>, and the end of the node is greater or
-equal to <var>end</var>.
+<dd><p>This function returns the <em>smallest</em> node that covers the region
+of buffer text between <var>beg</var> and <var>end</var>. In other words, the
+start of the node is before or at <var>beg</var>, and the end of the node
+is at or after <var>end</var>.
</p>
-<p><em>Beware</em> that calling this function on an empty line that is not
-inside any top-level construct (function definition, etc) most
+<p><em>Beware:</em> calling this function on an empty line that is not
+inside any top-level construct (function definition, etc.) most
probably will give you the root node, because the root node is the
smallest node that covers that empty line. Most of the time, you want
-to use <code>treesit-node-at</code>.
+to use <code>treesit-node-at</code>, described above, instead.
</p>
-<p>When <var>parser-or-lang</var> is nil, this function uses the first parser
-in <code>(treesit-parser-list)</code> in the current buffer. If
-<var>parser-or-lang</var> is a parser object, it use that parser; if
+<p>When <var>parser-or-lang</var> is <code>nil</code>, this function uses the first
+parser in <code>(treesit-parser-list)</code> of the current buffer. If
+<var>parser-or-lang</var> is a parser object, it uses that parser; if
<var>parser-or-lang</var> is a language, it finds the first parser using
-that language in <code>(treesit-parser-list)</code> and use that.
+that language in <code>(treesit-parser-list)</code>, and uses that.
</p>
-<p>If <var>named</var> is non-nil, this function looks for a named node only
-(see <a href="Language-Definitions.html#tree_002dsitter-named-node">named node</a>).
+<p>If <var>named</var> is non-<code>nil</code>, this function looks for a named node
+only (see <a href="Language-Definitions.html#tree_002dsitter-named-node">named node</a>).
</p></dd></dl>
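+<p>For example, assuming the current buffer has a C parser and the
+region is active, the following sketch returns the smallest named node
+that covers the whole region:
+</p>
+<div class="example">
+<pre class="example">;; The last argument restricts the search to named nodes.
+(treesit-node-on (region-beginning) (region-end) 'c t)
+</pre></div>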
<dl class="def">
<dl class="def">
<dt id="index-treesit_002dbuffer_002droot_002dnode"><span class="category">Function: </span><span><strong>treesit-buffer-root-node</strong> <em>&optional language</em><a href='#index-treesit_002dbuffer_002droot_002dnode' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function finds the first parser that uses <var>language</var> in
-<code>(treesit-parser-list)</code> in the current buffer, and returns the
-root node of that buffer. If it cannot find an appropriate parser,
-nil is returned.
+<code>(treesit-parser-list)</code> of the current buffer, and returns the
+root node generated by that parser. If it cannot find an appropriate
+parser, it returns <code>nil</code>.
</p></dd></dl>
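<p>As a minimal sketch, the root node can be fetched and inspected as
shown here. The helper name <code>my-treesit-root-type</code> is
hypothetical, and <code>treesit-node-type</code> is documented with the other
node-information functions; the example assumes that a parser for
<var>language</var> is available in the current buffer.
</p>
<div class="example">
<pre class="example">(defun my-treesit-root-type (language)
  "Return the type of the root node for LANGUAGE, or nil."
  (let ((root (treesit-buffer-root-node language)))
    ;; ROOT is nil when no suitable parser could be found.
    (when root
      (treesit-node-type root))))
</pre></div>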
-<p>Once we have a node, we can retrieve other nodes from it, or query for
-information about this node.
+<p>Given a node, a Lisp program can retrieve other nodes starting from
+it, or query for information about this node.
</p>
<span id="Retrieving-node-from-other-nodes"></span><h3 class="heading">Retrieving node from other nodes</h3>
+<span id="index-syntax-tree-nodes_002c-retrieving-from-other-nodes"></span>
<span id="By-kinship"></span><h4 class="subheading">By kinship</h4>
+<span id="index-kinship_002c-syntax-tree-nodes"></span>
+<span id="index-nodes_002c-by-kinship"></span>
+<span id="index-syntax-tree-nodes_002c-by-kinship"></span>
<dl class="def">
<dt id="index-treesit_002dnode_002dparent"><span class="category">Function: </span><span><strong>treesit-node-parent</strong> <em>node</em><a href='#index-treesit_002dnode_002dparent' class='copiable-anchor'> ¶</a></span></dt>
<dl class="def">
<dt id="index-treesit_002dnode_002dchild"><span class="category">Function: </span><span><strong>treesit-node-child</strong> <em>node n &optional named</em><a href='#index-treesit_002dnode_002dchild' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function returns the <var>n</var>’th child of <var>node</var>. If
-<var>named</var> is non-nil, then it only counts named nodes
-(see <a href="Language-Definitions.html#tree_002dsitter-named-node">named node</a>). For example, in a node
-that represents a string: <code>"text"</code>, there are three children
-nodes: the opening quote <code>"</code>, the string content <code>text</code>, and
-the enclosing quote <code>"</code>. Among these nodes, the first child is
-the opening quote <code>"</code>, the first named child is the string
-content <code>text</code>.
+<var>named</var> is non-<code>nil</code>, it counts only named nodes
+(see <a href="Language-Definitions.html#tree_002dsitter-named-node">named node</a>).
+</p>
+<p>For example, in a node that represents a string <code>"text"</code>, there
+are three children nodes: the opening quote <code>"</code>, the string text
+<code>text</code>, and the closing quote <code>"</code>. Among these nodes, the
+first child is the opening quote <code>"</code>, and the first named child
+is the string text.
+</p>
+<p>This function returns <code>nil</code> if there is no <var>n</var>’th child.
+<var>n</var> can be negative; e.g., <code>-1</code> represents the last child.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dnode_002dchildren"><span class="category">Function: </span><span><strong>treesit-node-children</strong> <em>node &optional named</em><a href='#index-treesit_002dnode_002dchildren' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function returns all of <var>node</var>’s children in a list. If
-<var>named</var> is non-nil, then it only retrieves named nodes.
+<dd><p>This function returns all of <var>node</var>’s children as a list. If
+<var>named</var> is non-<code>nil</code>, it retrieves only named nodes.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dnext_002dsibling"><span class="category">Function: </span><span><strong>treesit-next-sibling</strong> <em>node &optional named</em><a href='#index-treesit_002dnext_002dsibling' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function finds the next sibling of <var>node</var>. If <var>named</var> is
-non-nil, it finds the next named sibling.
+non-<code>nil</code>, it finds the next named sibling.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dprev_002dsibling"><span class="category">Function: </span><span><strong>treesit-prev-sibling</strong> <em>node &optional named</em><a href='#index-treesit_002dprev_002dsibling' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function finds the previous sibling of <var>node</var>. If
-<var>named</var> is non-nil, it finds the previous named sibling.
+<var>named</var> is non-<code>nil</code>, it finds the previous named sibling.
</p></dd></dl>
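<p>The kinship functions combine naturally. The following sketch uses
two hypothetical helpers (the <code>my-</code> names are not part of Emacs)
and assumes <var>node</var> is a live node obtained from a parser:
</p>
<div class="example">
<pre class="example">(defun my-named-child-types (node)
  "Return the types of NODE's named children, in order."
  ;; A non-nil second argument restricts the list to named nodes.
  (mapcar #'treesit-node-type (treesit-node-children node t)))

(defun my-last-child (node)
  "Return the last child of NODE, or nil if NODE has no children."
  ;; A negative index counts from the end of the child list.
  (treesit-node-child node -1))
</pre></div>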
<span id="By-field-name"></span><h4 class="subheading">By field name</h4>
+<span id="index-nodes_002c-by-field-name"></span>
+<span id="index-syntax-tree-nodes_002c-by-field-name"></span>
<p>To make the syntax tree easier to analyze, many language definitions
assign <em>field names</em> to child nodes (see <a href="Language-Definitions.html#tree_002dsitter-node-field-name">field name</a>). For example, a <code>function_definition</code> node
-could have a <code>declarator</code> and a <code>body</code>.
+could have a <code>declarator</code> node and a <code>body</code> node.
</p>
<dl class="def">
<dt id="index-treesit_002dchild_002dby_002dfield_002dname"><span class="category">Function: </span><span><strong>treesit-child-by-field-name</strong> <em>node field-name</em><a href='#index-treesit_002dchild_002dby_002dfield_002dname' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function finds the child of <var>node</var> that has <var>field-name</var>
-as its field name.
+<dd><p>This function finds the child of <var>node</var> whose field name is
+<var>field-name</var>, a string.
</p>
<div class="example">
<pre class="example">;; Get the child that has "body" as its field name.
(treesit-child-by-field-name node "body")
- </pre></div>
+ ⇒ #<treesit-node (compound_statement) in 45-89>
+</pre></div>
</dd></dl>
<span id="By-position"></span><h4 class="subheading">By position</h4>
+<span id="index-nodes_002c-by-position"></span>
+<span id="index-syntax-tree-nodes_002c-by-position"></span>
<dl class="def">
<dt id="index-treesit_002dfirst_002dchild_002dfor_002dpos"><span class="category">Function: </span><span><strong>treesit-first-child-for-pos</strong> <em>node pos &optional named</em><a href='#index-treesit_002dfirst_002dchild_002dfor_002dpos' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function finds the first child of <var>node</var> that extends beyond
-<var>pos</var>. “Extend beyond” means the end of the child node >=
-<var>pos</var>. This function only looks for immediate children of
-<var>node</var>, and doesn’t look in its grand children. If <var>named</var> is
-non-nil, it only looks for named child (see <a href="Language-Definitions.html#tree_002dsitter-named-node">named node</a>).
+buffer position <var>pos</var>. “Extends beyond” means the end of the
+child node is greater than or equal to <var>pos</var>. This function only looks
+for immediate children of <var>node</var>, and doesn’t look in its
+grandchildren. If <var>named</var> is non-<code>nil</code>, it looks for the
+first named child (see <a href="Language-Definitions.html#tree_002dsitter-named-node">named node</a>).
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dnode_002ddescendant_002dfor_002drange"><span class="category">Function: </span><span><strong>treesit-node-descendant-for-range</strong> <em>node beg end &optional named</em><a href='#index-treesit_002dnode_002ddescendant_002dfor_002drange' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function finds the <em>smallest</em> child/grandchild... of
-<var>node</var> that spans the range from <var>beg</var> to <var>end</var>. It is
-similar to <code>treesit-node-at</code>. If <var>named</var> is non-nil, it only
-looks for named child.
+<dd><p>This function finds the <em>smallest</em> descendant node of <var>node</var>
+that spans the region of text between positions <var>beg</var> and
+<var>end</var>. It is similar to <code>treesit-node-at</code>. If <var>named</var>
+is non-<code>nil</code>, it looks for the smallest named descendant.
</p></dd></dl>
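<p>For instance, a program might want the smallest named node that covers
a region of text. This is only a sketch: the helper name
<code>my-node-covering-region</code> is hypothetical, and it assumes the
current buffer already has at least one parser.
</p>
<div class="example">
<pre class="example">(defun my-node-covering-region (beg end)
  "Return the smallest named node spanning BEG to END, or nil."
  (let ((root (treesit-buffer-root-node)))
    (when root
      ;; The final t asks for a named node only.
      (treesit-node-descendant-for-range root beg end t))))
</pre></div>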
<span id="Searching-for-node"></span><h3 class="heading">Searching for node</h3>
<dl class="def">
-<dt id="index-treesit_002dsearch_002dsubtree"><span class="category">Function: </span><span><strong>treesit-search-subtree</strong> <em>node predicate &optional all backward limit</em><a href='#index-treesit_002dsearch_002dsubtree' class='copiable-anchor'> ¶</a></span></dt>
+<dt id="index-treesit_002dsearch_002dsubtree"><span class="category">Function: </span><span><strong>treesit-search-subtree</strong> <em>node predicate &optional backward all limit</em><a href='#index-treesit_002dsearch_002dsubtree' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function traverses the subtree of <var>node</var> (including
-<var>node</var>), and match <var>predicate</var> with each node along the way.
-And <var>predicate</var> is a regexp that matches (case-insensitively)
-against each node’s type, or a function that takes a node and returns
-nil/non-nil. If a node matches, that node is returned, if no node
-ever matches, nil is returned.
+<var>node</var> itself), looking for a node for which <var>predicate</var>
+returns non-<code>nil</code>. <var>predicate</var> is a regexp that is matched
+(case-insensitively) against each node’s type, or a predicate function
+that takes a node and returns non-<code>nil</code> if the node matches. The
+function returns the first node that matches, or <code>nil</code> if none
+does.
</p>
-<p>By default, this function only traverses named nodes, if <var>all</var> is
-non-nil, it traverses all nodes. If <var>backward</var> is non-nil, it
-traverses backwards. If <var>limit</var> is non-nil, it only traverses
-that number of levels down in the tree.
+<p>By default, this function only traverses named nodes, but if <var>all</var>
+is non-<code>nil</code>, it traverses all the nodes. If <var>backward</var> is
+non-<code>nil</code>, it traverses backwards (i.e., it visits the last child first
+when traversing down the tree). If <var>limit</var> is non-<code>nil</code>, it
+must be a number that limits the tree traversal to that many levels
+down the tree.
</p></dd></dl>
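<p>The two kinds of <var>predicate</var> can be illustrated with a short
sketch. The helper names are hypothetical, and the node types
<code>function_definition</code> and <code>comment</code> are assumptions that
depend on the language definition in use:
</p>
<div class="example">
<pre class="example">(defun my-first-function-definition (root)
  "Return the first node under ROOT whose type matches a regexp."
  ;; A string predicate is matched against each node's type.
  (treesit-search-subtree root "function_definition"))

(defun my-first-comment (root)
  "Return the first node under ROOT whose type is exactly \"comment\"."
  ;; A function predicate receives each node in turn.
  (treesit-search-subtree
   root
   (lambda (node) (equal (treesit-node-type node) "comment"))))
</pre></div>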
<dl class="def">
-<dt id="index-treesit_002dsearch_002dforward"><span class="category">Function: </span><span><strong>treesit-search-forward</strong> <em>start predicate &optional all backward up</em><a href='#index-treesit_002dsearch_002dforward' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function is somewhat similar to <code>treesit-search-subtree</code>.
-It also traverse the parse tree and match each node with
-<var>predicate</var> (except for <var>start</var>), where <var>predicate</var> can be
-a (case-insensitive) regexp or a function. For a tree like the below
-where <var>start</var> is marked 1, this function traverses as numbered:
+<dt id="index-treesit_002dsearch_002dforward"><span class="category">Function: </span><span><strong>treesit-search-forward</strong> <em>start predicate &optional backward all</em><a href='#index-treesit_002dsearch_002dforward' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>Like <code>treesit-search-subtree</code>, this function also traverses the
+parse tree and matches each node with <var>predicate</var> (except for
+<var>start</var>), where <var>predicate</var> can be a (case-insensitive) regexp
+or a function.  For a tree like the one below where <var>start</var> is marked
+S, this function traverses as numbered from 1 to 12:
</p>
<div class="example">
-<pre class="example"> o
+<pre class="example"> 12
|
- 3--------4-----------8
- | | |
-o--o-+--1 5--+--6 9---+-----12
-| | | | | |
-o o 2 7 +-+-+ +--+--+
- | | | | |
- 10 11 13 14 15
+ S--------3----------11
+ | | |
+o--o-+--o 1--+--2 6--+-----10
+| | | |
+o o +-+-+ +--+--+
+ | | | | |
+ 4 5 7 8 9
</pre></div>
-<p>Same as in <code>treesit-search-subtree</code>, this function only searches
-for named nodes by default. But if <var>all</var> is non-nil, it searches
-for all nodes. If <var>backward</var> is non-nil, it searches backwards.
+<p>Note that this function doesn’t traverse the subtree of <var>start</var>,
+and it always traverses leaf nodes first, then moves upwards.
+</p>
+<p>Like <code>treesit-search-subtree</code>, this function only searches for
+named nodes by default, but if <var>all</var> is non-<code>nil</code>, it
+searches for all nodes. If <var>backward</var> is non-<code>nil</code>, it
+searches backwards.
+</p>
+<p>While <code>treesit-search-subtree</code> traverses the subtree of a node,
+this function starts with node <var>start</var> and traverses every node
+that comes after it in the buffer position order, i.e., nodes with
+start positions greater than the end position of <var>start</var>.
</p>
-<p>If <var>up</var> is non-nil, this function will only traverse to siblings
-and parents. In that case, only 1 3 4 8 would be traversed.
+<p>In the tree shown above, <code>treesit-search-subtree</code> traverses node
+S (<var>start</var>) and nodes marked with <code>o</code>, whereas this function
+traverses the nodes marked with numbers. This function is useful for
+answering questions like “what is the first node after <var>start</var> in
+the buffer that satisfies some condition?”
</p></dd></dl>
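<p>Because each call excludes its starting node and only moves forward,
the function can be called repeatedly to collect every later match.
The following is a sketch under that assumption; the helper name
<code>my-nodes-after</code> is hypothetical.
</p>
<div class="example">
<pre class="example">(defun my-nodes-after (start regexp)
  "Return the nodes after START whose type matches REGEXP."
  (let ((node (treesit-search-forward start regexp))
        (result nil))
    ;; Restart the search from each match until nothing is left.
    (while node
      (push node result)
      (setq node (treesit-search-forward node regexp)))
    (nreverse result)))
</pre></div>
<p>Since leaf nodes are visited before their parents, a match nested
inside another match appears earlier in the returned list.
</p>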
<dl class="def">
-<dt id="index-treesit_002dsearch_002dforward_002dgoto"><span class="category">Function: </span><span><strong>treesit-search-forward-goto</strong> <em>predicate side &optional all backward up</em><a href='#index-treesit_002dsearch_002dforward_002dgoto' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function jumps to the start or end of the next node in buffer
-that matches <var>predicate</var>. Parameters <var>predicate</var>, <var>all</var>,
-<var>backward</var>, and <var>up</var> are the same as in
-<code>treesit-search-forward</code>. And <var>side</var> controls which side of
-the matched no do we stop at, it can be <code>start</code> or <code>end</code>.
+<dt id="index-treesit_002dsearch_002dforward_002dgoto"><span class="category">Function: </span><span><strong>treesit-search-forward-goto</strong> <em>node predicate &optional start backward all</em><a href='#index-treesit_002dsearch_002dforward_002dgoto' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This function moves point to the start or end of the next node after
+<var>node</var> in the buffer that matches <var>predicate</var>. If <var>start</var>
+is non-<code>nil</code>, stop at the beginning rather than the end of a node.
+</p>
+<p>This function guarantees that the matched node it returns makes
+progress in terms of buffer position: the start/end position of the
+returned node is always greater than that of <var>node</var>.
+</p>
+<p>Arguments <var>predicate</var>, <var>backward</var> and <var>all</var> are the same
+as in <code>treesit-search-forward</code>.
</p></dd></dl>
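<p>A simple navigation command could be built on this function. The
sketch below is illustrative only: the command name
<code>my-goto-next-function</code> is hypothetical, and the node type
<code>function_definition</code> is an assumption about the language
definition.
</p>
<div class="example">
<pre class="example">(defun my-goto-next-function ()
  "Move point to the start of the next function definition, if any."
  (interactive)
  (let ((node (treesit-node-at (point))))
    (when node
      ;; The third argument, t, stops at the beginning of the match.
      (treesit-search-forward-goto node "function_definition" t))))
</pre></div>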
<dl class="def">
<dt id="index-treesit_002dinduce_002dsparse_002dtree"><span class="category">Function: </span><span><strong>treesit-induce-sparse-tree</strong> <em>root predicate &optional process-fn limit</em><a href='#index-treesit_002dinduce_002dsparse_002dtree' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function creates a sparse tree from <var>root</var>’s subtree.
</p>
-<p>Basically, it takes the subtree under <var>root</var>, and combs it so only
-the nodes that match <var>predicate</var> are left, like picking out grapes
-on the vine. Like previous functions, <var>predicate</var> can be a regexp
-string that matches against each node’s type case-insensitively, or a
-function that takes a node and return nil/non-nil.
+<p>It takes the subtree under <var>root</var>, and combs it so only the nodes
+that match <var>predicate</var> are left. Like previous functions, the
+<var>predicate</var> can be a regexp string that matches against each
+node’s type case-insensitively, or a function that takes a node and
+returns non-<code>nil</code> if it matches.
</p>
<p>For example, for a subtree on the left that consists of both numbers
and letters, if <var>predicate</var> is “letter only”, the returned tree
e 5 e
</pre></div>
-<p>If <var>process-fn</var> is non-nil, instead of returning the matched
+<p>If <var>process-fn</var> is non-<code>nil</code>, instead of returning the matched
nodes, this function passes each node to <var>process-fn</var> and uses the
-returned value instead. If non-nil, <var>limit</var> is the number of
+returned value instead. If non-<code>nil</code>, <var>limit</var> is the number of
levels to go down from <var>root</var>.
</p>
-<p>Each node in the returned tree looks like <code>(<var>tree-sitter
-node</var> . (<var>child</var> ...))</code>. The <var>tree-sitter node</var> of the root
-of this tree will be nil if <var>ROOT</var> doesn’t match <var>pred</var>. If
-no node matches <var>predicate</var>, return nil.
+<p>Each node in the returned tree looks like
+<code>(<var><span class="nolinebreak">tree-sitter-node</span></var> . (<var>child</var> …))</code><!-- /@w -->. The
+<var>tree-sitter-node</var> of the root of this tree will be <code>nil</code> if
+<var>root</var> doesn’t match <var>predicate</var>. If no node matches
+<var>predicate</var>, the function returns <code>nil</code>.
</p></dd></dl>
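<p>One possible use is building an outline of a buffer. In this sketch
the helper name <code>my-function-outline</code> is hypothetical, the node
type <code>function_definition</code> is an assumption about the language
definition, and <code>treesit-node-start</code> is documented with the other
node-information functions.
</p>
<div class="example">
<pre class="example">(defun my-function-outline (root)
  "Return a sparse tree of start positions of functions under ROOT."
  (treesit-induce-sparse-tree
   root
   "function_definition"
   ;; Replace each matched node by its start position in the result.
   (lambda (node) (treesit-node-start node))))
</pre></div>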
-<span id="More-convenient-functions"></span><h3 class="heading">More convenient functions</h3>
+<span id="More-convenience-functions"></span><h3 class="heading">More convenience functions</h3>
<dl class="def">
-<dt id="index-treesit_002dfilter_002dchild"><span class="category">Function: </span><span><strong>treesit-filter-child</strong> <em>node pred &optional named</em><a href='#index-treesit_002dfilter_002dchild' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function finds immediate children of <var>node</var> that satisfies
-<var>pred</var>.
+<dt id="index-treesit_002dfilter_002dchild"><span class="category">Function: </span><span><strong>treesit-filter-child</strong> <em>node predicate &optional named</em><a href='#index-treesit_002dfilter_002dchild' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This function finds immediate children of <var>node</var> that satisfy
+<var>predicate</var>.
</p>
-<p>Function <var>pred</var> takes the child node as the argument and should
-return non-nil to indicated keeping the child. If <var>named</var>
-non-nil, this function only searches for named nodes.
+<p>The <var>predicate</var> function takes a node as the argument and should
+return non-<code>nil</code> to indicate that the node should be kept. If
+<var>named</var> is non-<code>nil</code>, this function only examines the named
+nodes.
</p></dd></dl>
<dl class="def">
-<dt id="index-treesit_002dparent_002duntil"><span class="category">Function: </span><span><strong>treesit-parent-until</strong> <em>node pred</em><a href='#index-treesit_002dparent_002duntil' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function repeatedly finds the parent of <var>node</var>, and returns
-the parent if it satisfies <var>pred</var> (which takes the parent as the
-argument). If no parent satisfies <var>pred</var>, this function returns
-nil.
+<dt id="index-treesit_002dparent_002duntil"><span class="category">Function: </span><span><strong>treesit-parent-until</strong> <em>node predicate</em><a href='#index-treesit_002dparent_002duntil' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This function repeatedly finds the parents of <var>node</var>, and returns
+the nearest parent that satisfies <var>predicate</var>, a function that takes a
+node as the argument. If no parent satisfies <var>predicate</var>, this
+function returns <code>nil</code>.
</p></dd></dl>
<dl class="def">
-<dt id="index-treesit_002dparent_002dwhile"><span class="category">Function: </span><span><strong>treesit-parent-while</strong><a href='#index-treesit_002dparent_002dwhile' class='copiable-anchor'> ¶</a></span></dt>
+<dt id="index-treesit_002dparent_002dwhile"><span class="category">Function: </span><span><strong>treesit-parent-while</strong> <em>node predicate</em><a href='#index-treesit_002dparent_002dwhile' class='copiable-anchor'> ¶</a></span></dt>
<dd><p>This function repeatedly finds the parent of <var>node</var>, and keeps
-doing so as long as the parent satisfies <var>pred</var> (which takes the
-parent as the single argument). I.e., this function returns the
-farthest parent that still satisfies <var>pred</var>.
+doing so as long as the nodes satisfy <var>predicate</var>, a function that
+takes a node as the argument. That is, this function returns the
+farthest parent that still satisfies <var>predicate</var>.
+</p></dd></dl>
+
+<dl class="def">
+<dt id="index-treesit_002dnode_002dtop_002dlevel"><span class="category">Function: </span><span><strong>treesit-node-top-level</strong> <em>node &optional type</em><a href='#index-treesit_002dnode_002dtop_002dlevel' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This function returns the highest parent of <var>node</var> that has the
+same type as <var>node</var>. If no such parent exists, it returns
+<code>nil</code>. Therefore this function is also useful for testing
+whether <var>node</var> is top-level.
+</p>
+<p>If <var>type</var> is non-<code>nil</code>, this function matches each parent’s
+type with <var>type</var> as a regexp, rather than using <var>node</var>’s type.
</p></dd></dl>
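<p>As a brief sketch of how these convenience functions might be used,
the two hypothetical helpers below (the <code>my-</code> names are not part
of Emacs) look for an enclosing node of a given type and select
children of a given type:
</p>
<div class="example">
<pre class="example">(defun my-enclosing-node-of-type (node type)
  "Return the nearest parent of NODE whose type is TYPE, or nil."
  (treesit-parent-until
   node
   (lambda (parent) (equal (treesit-node-type parent) type))))

(defun my-children-of-type (node type)
  "Return the named immediate children of NODE whose type is TYPE."
  (treesit-filter-child
   node
   (lambda (child) (equal (treesit-node-type child) type))
   t))
</pre></div>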
</div>
<hr>
<div class="header">
<p>
-Next: <a href="Accessing-Node.html">Accessing Node Information</a>, Previous: <a href="Using-Parser.html">Using Tree-sitter Parser</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Next: <a href="Accessing-Node-Information.html">Accessing Node Information</a>, Previous: <a href="Using-Parser.html">Using Tree-sitter Parser</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<link href="Index.html" rel="index" title="Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Parsing-Program-Source.html" rel="up" title="Parsing Program Source">
-<link href="Multiple-Languages.html" rel="prev" title="Multiple Languages">
+<link href="Tree_002dsitter-major-modes.html" rel="prev" title="Tree-sitter major modes">
<style type="text/css">
<!--
a.copiable-anchor {visibility: hidden; text-decoration: none; line-height: 0em}
<div class="section" id="Tree_002dsitter-C-API">
<div class="header">
<p>
-Previous: <a href="Multiple-Languages.html" accesskey="p" rel="prev">Parsing Text in Multiple Languages</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Previous: <a href="Tree_002dsitter-major-modes.html" accesskey="p" rel="prev">Developing major modes with tree-sitter</a>, Up: <a href="Parsing-Program-Source.html" accesskey="u" rel="up">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
-<span id="Tree_002dsitter-C-API-Correspondence"></span><h3 class="section">37.7 Tree-sitter C API Correspondence</h3>
+<span id="Tree_002dsitter-C-API-Correspondence"></span><h3 class="section">37.8 Tree-sitter C API Correspondence</h3>
<p>Emacs’ tree-sitter integration doesn’t expose every feature
-tree-sitter’s C API provides. Missing features include:
+provided by tree-sitter’s C API. Missing features include:
</p>
<ul>
<li> Creating a tree cursor and navigating the syntax tree with it.
</li><li> Setting a timeout and a cancellation flag for a parser.
</li><li> Setting the logger for a parser.
-</li><li> Printing a DOT graph of the syntax tree to a file.
-</li><li> Coping and modifying a syntax tree. (Emacs doesn’t expose a tree
+</li><li> Printing a <acronym>DOT</acronym> graph of the syntax tree to a file.
+</li><li> Copying and modifying a syntax tree. (Emacs doesn’t expose a tree
object.)
</li><li> Using (row, column) coordinates as position.
-</li><li> Updating a node with changes. (In Emacs, retrieve a new node instead
+</li><li> Updating a node with changes. (In Emacs, retrieve a new node instead
of updating the existing one.)
</li><li> Querying statistics of a language definition.
</li></ul>
convenient and idiomatic:
</p>
<ul>
-<li> Instead of using byte positions, the ELisp API uses character
+<li> Instead of using byte positions, the Emacs Lisp API uses character
positions.
</li><li> Null nodes are converted to <code>nil</code>.
</li></ul>
<hr>
<div class="header">
<p>
-Previous: <a href="Multiple-Languages.html">Parsing Text in Multiple Languages</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
+Previous: <a href="Tree_002dsitter-major-modes.html">Developing major modes with tree-sitter</a>, Up: <a href="Parsing-Program-Source.html">Parsing Program Source</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index.html" title="Index" rel="index">Index</a>]</p>
</div>
</div>
<hr>
<span id="Using-Tree_002dsitter-Parser"></span><h3 class="section">37.2 Using Tree-sitter Parser</h3>
-<span id="index-Tree_002dsitter-parser"></span>
+<span id="index-tree_002dsitter-parser_002c-using"></span>
-<p>This section described how to create and configure a tree-sitter
+<p>This section describes how to create and configure a tree-sitter
parser. In Emacs, each tree-sitter parser is associated with a
-buffer. As we edit the buffer, the associated parser and the syntax
-tree is automatically kept up-to-date.
+buffer. As the user edits the buffer, the associated parser and
+syntax tree are automatically kept up-to-date.
</p>
<dl class="def">
<dt id="index-treesit_002dmax_002dbuffer_002dsize"><span class="category">Variable: </span><span><strong>treesit-max-buffer-size</strong><a href='#index-treesit_002dmax_002dbuffer_002dsize' class='copiable-anchor'> ¶</a></span></dt>
<code>treesit-available-p</code> and <code>treesit-max-buffer-size</code>.
</p></dd></dl>
-<span id="index-Creating-tree_002dsitter-parsers"></span>
+<span id="index-creating-tree_002dsitter-parsers"></span>
+<span id="index-tree_002dsitter-parser_002c-creating"></span>
<dl class="def">
<dt id="index-treesit_002dparser_002dcreate"><span class="category">Function: </span><span><strong>treesit-parser-create</strong> <em>language &optional buffer no-reuse</em><a href='#index-treesit_002dparser_002dcreate' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>To create a parser, we provide a <var>buffer</var> and the <var>language</var>
-to use (see <a href="Language-Definitions.html">Tree-sitter Language Definitions</a>). If <var>buffer</var> is nil, the
-current buffer is used.
+<dd><p>Create a parser for the specified <var>buffer</var> and <var>language</var>
+(see <a href="Language-Definitions.html">Tree-sitter Language Definitions</a>). If <var>buffer</var> is omitted or
+<code>nil</code>, it stands for the current buffer.
</p>
<p>By default, this function reuses a parser if one already exists for
-<var>language</var> in <var>buffer</var>, if <var>no-reuse</var> is non-nil, this
-function always creates a new parser.
+<var>language</var> in <var>buffer</var>, but if <var>no-reuse</var> is
+non-<code>nil</code>, this function always creates a new parser.
</p></dd></dl>
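<p>A cautious way to create a parser is to check first that tree-sitter
and the language definition are usable. The following is only a
sketch: the helper name <code>my-ensure-c-parser</code> is hypothetical, and
it assumes that <code>treesit-language-available-p</code> is provided by
your Emacs build.
</p>
<div class="example">
<pre class="example">(defun my-ensure-c-parser ()
  "Return a C parser for the current buffer, or nil if unavailable."
  (when (and (treesit-available-p)
             (treesit-language-available-p 'c))
    ;; An existing C parser for this buffer is reused by default.
    (treesit-parser-create 'c)))
</pre></div>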
-<p>Given a parser, we can query information about it:
+<p>Given a parser, we can query information about it.
</p>
<dl class="def">
<dt id="index-treesit_002dparser_002dbuffer"><span class="category">Function: </span><span><strong>treesit-parser-buffer</strong> <em>parser</em><a href='#index-treesit_002dparser_002dbuffer' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>Returns the buffer associated with <var>parser</var>.
+<dd><p>This function returns the buffer associated with <var>parser</var>.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dparser_002dlanguage"><span class="category">Function: </span><span><strong>treesit-parser-language</strong> <em>parser</em><a href='#index-treesit_002dparser_002dlanguage' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>Returns the language that <var>parser</var> uses.
+<dd><p>This function returns the language used by <var>parser</var>.
</p></dd></dl>
<dl class="def">
<dt id="index-treesit_002dparser_002dp"><span class="category">Function: </span><span><strong>treesit-parser-p</strong> <em>object</em><a href='#index-treesit_002dparser_002dp' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>Checks if <var>object</var> is a tree-sitter parser. Return non-nil if it
-is, return nil otherwise.
+<dd><p>This function checks if <var>object</var> is a tree-sitter parser, and
+returns non-<code>nil</code> if it is, and <code>nil</code> otherwise.
</p></dd></dl>
<p>There is no need to explicitly parse a buffer, because parsing is done
-automatically and lazily. A parser only parses when we query for a
-node in its syntax tree. Therefore, when a parser is first created,
-it doesn’t parse the buffer; it waits until we query for a node for
-the first time. Similarly, when some change is made in the buffer, a
-parser doesn’t re-parse immediately.
+automatically and lazily. A parser only parses when a Lisp program
+queries for a node in its syntax tree. Therefore, when a parser is
+first created, it doesn’t parse the buffer; it waits until the Lisp
+program queries for a node for the first time. Similarly, when some
+change is made in the buffer, a parser doesn’t re-parse immediately.
</p>
<span id="index-treesit_002dbuffer_002dtoo_002dlarge"></span>
-<p>When a parser do parse, it checks for the size of the buffer.
+<p>When a parser does parse, it checks the size of the buffer.
Tree-sitter can only handle buffers no larger than about 4GB.  If the
-size exceeds that, Emacs signals <code>treesit-buffer-too-large</code>
-with signal data being the buffer size.
+size exceeds that, Emacs signals the <code>treesit-buffer-too-large</code>
+error with signal data being the buffer size.
</p>
<p>Once a parser is created, Emacs automatically adds it to the
internal parser list. Every time a change is made to the buffer,
</p>
<dl class="def">
<dt id="index-treesit_002dparser_002dlist"><span class="category">Function: </span><span><strong>treesit-parser-list</strong> <em>&optional buffer</em><a href='#index-treesit_002dparser_002dlist' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>This function returns the parser list of <var>buffer</var>. And
-<var>buffer</var> defaults to the current buffer.
+<dd><p>This function returns the parser list of <var>buffer</var>. If
+<var>buffer</var> is <code>nil</code> or omitted, it defaults to the current
+buffer.
</p></dd></dl>
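<p>The parser list can be combined with the query functions described
above to summarize the parsers of a buffer. This is a sketch, and the
helper name <code>my-describe-parsers</code> is hypothetical.
</p>
<div class="example">
<pre class="example">(defun my-describe-parsers ()
  "Return a list of (LANGUAGE . BUFFER-NAME) for the current buffer."
  (mapcar (lambda (parser)
            (cons (treesit-parser-language parser)
                  (buffer-name (treesit-parser-buffer parser))))
          (treesit-parser-list)))
</pre></div>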
<dl class="def">
</p></dd></dl>
<span id="index-tree_002dsitter-narrowing"></span>
-<span id="tree_002dsitter-narrowing"></span><p>Normally, a parser “sees” the whole
-buffer, but when the buffer is narrowed (see <a href="Narrowing.html">Narrowing</a>), the
-parser will only see the visible region. As far as the parser can
-tell, the hidden region is deleted. And when the buffer is later
-widened, the parser thinks text is inserted in the beginning and in
-the end. Although parsers respect narrowing, narrowing shouldn’t be
-the mean to handle a multi-language buffer; instead, set the ranges in
-which a parser should operate in. See <a href="Multiple-Languages.html">Parsing Text in Multiple Languages</a>.
+<span id="tree_002dsitter-narrowing"></span><p>Normally, a parser “sees” the whole buffer, but when the buffer is
+narrowed (see <a href="Narrowing.html">Narrowing</a>), the parser will only see the accessible
+portion of the buffer. As far as the parser can tell, the hidden
+region was deleted. When the buffer is later widened, the parser
+thinks text is inserted at the beginning and at the end. Although
+parsers respect narrowing, modes should not use narrowing as a means
+to handle a multi-language buffer; instead, set the ranges in which the
+parser should operate. See <a href="Multiple-Languages.html">Parsing Text in Multiple Languages</a>.
</p>
-<p>Because a parser parses lazily, when we narrow the buffer, the parser
-is not affected immediately; as long as we don’t query for a node
-while the buffer is narrowed, the parser is oblivious of the
-narrowing.
+<p>Because a parser parses lazily, when the user or a Lisp program
+narrows the buffer, the parser is not affected immediately; as long as
+the mode doesn’t query for a node while the buffer is narrowed, the
+parser is oblivious of the narrowing.
</p>
<span id="index-tree_002dsitter-parse-string"></span>
-<dl class="def">
-<dt id="index-treesit_002dparse_002dstring"><span class="category">Function: </span><span><strong>treesit-parse-string</strong> <em>string language</em><a href='#index-treesit_002dparse_002dstring' class='copiable-anchor'> ¶</a></span></dt>
-<dd><p>Besides creating a parser for a buffer, we can also just parse a
-string. Unlike a buffer, parsing a string is a one-time deal, and
+<span id="index-parse-string_002c-tree_002dsitter"></span>
+<p>Besides creating a parser for a buffer, a Lisp program can also parse a
+string. Unlike a buffer, parsing a string is a one-off operation, and
there is no way to update the result.
</p>
-<p>This function parses <var>string</var> with <var>language</var>, and returns the
-root node of the generated syntax tree.
+<dl class="def">
+<dt id="index-treesit_002dparse_002dstring"><span class="category">Function: </span><span><strong>treesit-parse-string</strong> <em>string language</em><a href='#index-treesit_002dparse_002dstring' class='copiable-anchor'> ¶</a></span></dt>
+<dd><p>This function parses <var>string</var> using <var>language</var>, and returns
+the root node of the generated syntax tree.
</p></dd></dl>
</div>
(treesit-available-p)
-For your major mode, first create a tree-sitter switch:
-
-#+begin_src elisp
-(defcustom python-use-tree-sitter nil
- "If non-nil, `python-mode' tries to use tree-sitter.
-Currently `python-mode' can utilize tree-sitter for font-locking,
-imenu, and movement functions."
- :type 'boolean)
-#+end_src
-
-Then in other places, we decide on whether to enable tree-sitter by
-
-#+begin_src elisp
-(and python-use-tree-sitter
- (treesit-can-enable-p))
+Users toggle tree-sitter for each major mode with a central variable,
+‘treesit-settings’. You can check whether to enable tree-sitter with
+‘treesit-ready-p’, which takes a major-mode symbol and one or more
+language symbols. The major mode body should use a branch like this:
+
+#+begin_src emacs-lisp
+(cond
+ ;; Tree-sitter setup.
+ ((treesit-ready-p 'python-mode 'python)
+ ...)
+ (t
+ ;; Non-tree-sitter setup.
+ ...))
#+end_src
* Naming convention
-When referring to tree-sitter as a noun, use “tree-sitter”, like
-python-use-tree-sitter. For prefix use “treesit”, like
-python-treesit-indent.
+Use “tree-sitter” in text (documentation, comments), and “treesit” in
+symbols (variables, functions).
* Font-lock
tag the corresponding capture names onto the nodes and return them to
you. The query function returns a list of (capture-name . node). For
font-lock, we use face names as capture names. And the captured node
-will be fontified in their capture name. The capture name could also
-be a function, in which case (START END NODE) is passed to the
-function for font-lock. START and END is the start and end the
-captured NODE.
+will be fontified in the face named by its capture name.
+
+The capture name could also be a function, in which case (NODE
+OVERRIDE START END) is passed to the function for fontification. START
+and END are the start and end of the region to be fontified. The
+function should only fontify within that region. The function should
+also allow more optional arguments with (&rest _), for future
+extensibility. For OVERRIDE, see the docstring of
+treesit-font-lock-rules.
+
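As a rough sketch of such a capture function: the name
‘my-mode--fontify-identifier’ is hypothetical, and the code assumes
that ‘treesit-fontify-with-override’ is available and takes its
arguments in the order START END FACE OVERRIDE. It clamps the node to
the requested region before applying a face:

#+begin_src elisp
(defun my-mode--fontify-identifier (node override start end &rest _)
  "Fontify NODE as a variable name, but only between START and END.
OVERRIDE is passed through unchanged.  The trailing &rest argument
absorbs any arguments added in the future.  This function and
`my-mode' are hypothetical names used only for illustration."
  (treesit-fontify-with-override
   ;; Never fontify outside the region we were asked to handle.
   (max start (treesit-node-start node))
   (min end (treesit-node-end node))
   'font-lock-variable-name-face
   override))
#+end_src

The function symbol would then take the place of a face name in a
query's capture name.
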
+Contextual syntax, like multi-line comments and multi-line strings,
+needs special care, because a change in this kind of construct can
+affect a large portion of the buffer. Think of inserting a closing
+comment delimiter: it causes all the text before it (up to the opening
+comment delimiter) to change to comment face. Such constructs need to
+be captured with the special name “contextual”, so that Emacs can give
+them special treatment. See the example below for what this looks like.
** Query syntax
** Debugging queries
-If your query has problems, it usually cannot compile. In that case
-use ‘treesit-query-validate’ to debug the query. It will pop a buffer
-containing the query (in text format) and mark the offending part in
-red.
+If your query has problems, use ‘treesit-query-validate’ to debug the
+query. It will pop a buffer containing the query (in text format) and
+mark the offending part in red.
** Code
-To enable tree-sitter font-lock, set ‘treesit-font-lock-settings’
-buffer-locally and call ‘treesit-font-lock-enable’. For example, see
+To enable tree-sitter font-lock, set ‘treesit-font-lock-settings’ and
+‘treesit-font-lock-feature-list’ buffer-locally and call
+‘treesit-major-mode-setup’. For example, see
‘python--treesit-settings’ in python.el. Below I paste a snippet of
it.
Note that like the current font-lock, if the to-be-fontified region
already has a face (ie, an earlier match fontified part/all of the
-region), the new face is discarded rather than applied. If you want
+region), the new face is discarded rather than applied. If you want
later matches to always override earlier matches, use the :override
keyword.
+Each rule should have a :feature, like function-name,
+string-interpolation, builtin, etc. Users can then enable/disable each
+feature individually.
+
#+begin_src elisp
(defvar python--treesit-settings
(treesit-font-lock-rules
+ :feature 'comment
+ :language 'python
+ '((comment) @font-lock-comment-face)
+
+ :feature 'string
:language 'python
- :override t
- `(;; Queries for def and class.
- (function_definition
- name: (identifier) @font-lock-function-name-face)
+ '((string) @font-lock-string-face
+ (string) @contextual) ; Contextual special treatment.
- (class_definition
- name: (identifier) @font-lock-type-face)
+ :feature 'function-name
+ :language 'python
+ '((function_definition
+ name: (identifier) @font-lock-function-name-face))
- ;; Comment and string.
- (comment) @font-lock-comment-face
+ :feature 'class-name
+ :language 'python
+ '((class_definition
+ name: (identifier) @font-lock-type-face))
- ...)))
+ ...))
#+end_src
Then in ‘python-mode’, enable tree-sitter font-lock:
#+begin_src elisp
(treesit-parser-create 'python)
-;; This turns off the syntax-based font-lock for comments and
-;; strings. So it doesn’t override tree-sitter’s fontification.
-(setq-local font-lock-keywords-only t)
-(setq-local treesit-font-lock-settings
- python--treesit-settings)
-(treesit-font-lock-enable)
+(setq-local treesit-font-lock-settings python--treesit-settings)
+(setq-local treesit-font-lock-feature-list
+ '((comment string function-name)
+ (class-name keyword builtin)
+ (string-interpolation decorator)))
+...
+(treesit-major-mode-setup)
#+end_src
Concretely, something like this:
#+begin_src elisp
(define-derived-mode python-mode prog-mode "Python"
...
-
- (treesit-parser-create 'python)
-
- (if (and python-use-tree-sitter
- (treesit-can-enable-p))
- ;; Tree-sitter.
- (progn
- (setq-local font-lock-keywords-only t)
- (setq-local treesit-font-lock-settings
- python--treesit-settings)
- (treesit-font-lock-enable))
+ (cond
+ ;; Tree-sitter.
+ ((treesit-ready-p 'python-mode 'python)
+ (treesit-parser-create 'python)
+ (setq-local treesit-font-lock-settings python--treesit-settings)
+ (setq-local treesit-font-lock-feature-list
+ '((comment string function-name)
+ (class-name keyword builtin)
+ (string-interpolation decorator)))
+ (treesit-major-mode-setup))
+ (t
;; No tree-sitter
- (setq-local font-lock-defaults ...))
-
- ...)
+ (setq-local font-lock-defaults ...)
+ ...)))
#+end_src
-You’ll notice that tree-sitter’s font-lock doesn’t respect
-‘font-lock-maximum-decoration’, major modes are free to set
-‘treesit-font-lock-settings’ based on the value of
-‘font-lock-maximum-decoration’, or provide more fine-grained control
-through other mode-specific means. (Towards that end, the :toggle option in treesit-font-lock-rules is very useful.)
-
* Indent
Indent works like this: We have a bunch of rules that look like
OFFSET to it (eg, 0), and that is the column we want to indent the
current line to (4 + 0 = 4).
+Matchers and anchors are functions that take (NODE PARENT BOL &rest
+_). Matchers return nil/non-nil for no match/match, and anchors return
+the anchor point. Below are some convenient builtin matchers and anchors.
+
For MATCHER we have
- (parent-is TYPE)
- (node-is TYPE)
+ (parent-is TYPE) => matches if PARENT’s type matches TYPE as regexp
+ (node-is TYPE) => matches NODE’s type
(query QUERY) => matches if querying PARENT with QUERY
captures NODE.
first-sibling => start of the first sibling
parent => start of parent
parent-bol => BOL of the line parent is on.
- prev-sibling
- no-indent => don’t indent
- prev-line => same indent as previous line
+ prev-sibling => start of previous sibling
+ no-indent => current position (don’t indent)
+ prev-line => start of previous line
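
Besides the builtin ones, a mode could define its own matcher and
anchor. The following is only a sketch: both function names are
hypothetical, and "comment" is an assumed node type that depends on the
grammar in use. Each function accepts (NODE PARENT BOL &rest _) as
described above:

#+begin_src elisp
(defun my-mode--parent-is-comment-p (_node parent _bol &rest _)
  "Matcher: return non-nil if PARENT is a comment node.
The node type \"comment\" is an assumption about the grammar."
  (and parent
       (equal (treesit-node-type parent) "comment")))

(defun my-mode--grandparent-start (_node parent _bol &rest _)
  "Anchor: return the start position of PARENT's parent.
Fall back to the start of PARENT when it has no parent."
  (treesit-node-start (or (treesit-node-parent parent) parent)))
#+end_src

A rule should then be able to name these functions directly in its
MATCHER and ANCHOR slots, in place of the builtin forms.
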
There is also a manual section for indent: "Parser-based Indentation".
((node-is ")") parent-bol 0)
((node-is "]") parent-bol 0)
((node-is ">") parent-bol 0)
- ((node-is ".") parent-bol ,offset)
+ ((node-is "\\.") parent-bol ,offset)
((parent-is "ternary_expression") parent-bol ,offset)
((parent-is "named_imports") parent-bol ,offset)
((parent-is "statement_block") parent-bol ,offset)
...))))
#+end_src
-Then you set ‘treesit-simple-indent-rules’ to your rules, and set
-‘indent-line-function’:
+Then you set ‘treesit-simple-indent-rules’ to your rules, and call
+‘treesit-major-mode-setup’:
#+begin_src elisp
(setq-local treesit-simple-indent-rules typescript-mode-indent-rules)
-(setq-local indent-line-function #'treesit-indent)
+(treesit-major-mode-setup)
#+end_src
* Imenu
Not much to say except for utilizing ‘treesit-induce-sparse-tree’.
-See ‘python--imenu-treesit-create-index-1’ in python.el for an
-example.
+See ‘js--treesit-imenu-1’ in js.el for an example.
-Once you have the index builder, set ‘imenu-create-index-function’.
+Once you have the index builder, set ‘imenu-create-index-function’ to
+it.
* Navigation
(treesit-search-forward-goto "function_definition" 'end)
where "function_definition" matches the node type of a function
-definition node, and ’end means we want to go to the end of that
-node.
-
-Something like this should suffice:
-
-#+begin_src elisp
-(defun js--treesit-beginning-of-defun (&optional arg)
- (let ((arg (or arg 1)))
- (if (> arg 0)
- ;; Go backward.
- (while (and (> arg 0)
- (treesit-search-forward-goto
- "function_definition" 'start nil t))
- (setq arg (1- arg)))
- ;; Go forward.
- (while (and (< arg 0)
- (treesit-search-forward-goto
- "function_definition" 'start))
- (setq arg (1+ arg))))))
-
-(defun xxx-end-of-defun (&optional arg)
- (let ((arg (or arg 1)))
- (if (< arg 0)
- ;; Go backward.
- (while (and (< arg 0)
- (treesit-search-forward-goto
- "function_definition" 'end nil t))
- (setq arg (1+ arg)))
- ;; Go forward.
- (while (and (> arg 0)
- (treesit-search-forward-goto
- "function_definition" 'end))
- (setq arg (1- arg))))))
-
-(setq-local beginning-of-defun-function #'xxx-beginning-of-defun)
-(setq-local end-of-defun-function #'xxx-end-of-defun)
-#+end_src
+definition node, and ’end means we want to go to the end of that node.
+
+Tree-sitter has default implementations for
+‘beginning-of-defun-function’ and ‘end-of-defun-function’. So for
+ordinary languages, it suffices to set ‘treesit-defun-type-regexp’
+to something that matches all the defun struct types in the language,
+and call ‘treesit-major-mode-setup’. For example,
+
+#+begin_src emacs-lisp
+(setq-local treesit-defun-type-regexp (rx bol
+ (or "function" "class")
+ "_definition"
+ eol))
+(treesit-major-mode-setup)
+#+end_src
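
To check which node types such a regexp accepts, it can be tried
against a few type names outside of any buffer. This is only a sketch:
‘my-mode--defun-type-regexp’ is a hypothetical variable, and
"decorated_definition" is used merely as an example of a type name that
should not match:

#+begin_src elisp
(require 'rx)

(defvar my-mode--defun-type-regexp
  (rx bol (or "function" "class") "_definition" eol)
  "Hypothetical regexp matching the node types treated as defuns.")

;; Each element is the match position, or nil when there is no match.
(list (string-match-p my-mode--defun-type-regexp "function_definition")
      (string-match-p my-mode--defun-type-regexp "class_definition")
      (string-match-p my-mode--defun-type-regexp "decorated_definition"))
;; => (0 0 nil)
#+end_src

Anchoring with bol and eol keeps the regexp from matching longer type
names that merely contain one of the defun types.
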
* Which-func
-You can find the current function by going up the tree and looking for
-the function_definition node. See ‘python-info-treesit-current-defun’
-in python.el for an example. Since Python allows nested function
-definitions, that function keeps going until it reaches the root node,
-and records all the function names along the way.
+If you have an imenu implementation, set ‘which-func-functions’ to
+nil, and which-func will automatically use imenu’s data.
+
+If you want an independent implementation for which-func, you can find
+the current function by going up the tree and looking for the
+function_definition node. See the function below for an example.
+Since Python allows nested function definitions, that function keeps
+going until it reaches the root node, and records all the function
+names along the way.
#+begin_src elisp
(defun python-info-treesit-current-defun (&optional include-type)