the-mikedavis
Member

This adds two new symbol picker commands that use tree-sitter rather than LSP. We run a new symbols.scm query across the file and extract tagged items such as function definitions, types, and classes. For languages with unambiguous syntax this behaves roughly the same as the LSP symbol picker (<space>s). It is less precise, though, since we don't have semantic information about the language. For example, it can easily produce false positives for C/C++ because of preprocessor magic. Prior art for this feature is GitHub's imprecise code navigation, which I believe works the same way and leverages tags.scm queries. (I have no internal GitHub knowledge, so this is an educated guess.) It should be possible to find definitions and references as well, like gd and gr; this is left as a follow-up.

The hope is to start introducing LSP-like navigation features that work without installing or running a language server. I made these two pickers in particular because I don't like the LSP equivalents in ErlangLS or ELP: the document symbol picker can take a long time to show up during boot, and the workspace symbol picker only searches for module names. The other motivation is to have some navigation features when running a language server is too cumbersome, either to install or because of resource constraints. For example, clangd needs a fair amount of setup (compile_commands.json) that you might not want to do when quickly reading through a codebase.

This PR also adds commands that open the LSP symbol picker when a language server is available and fall back to the syntax one otherwise. This way you can customize a language to not use the LSP symbol pickers, for example:

[[language]]
name = "erlang"
language-servers = [{ name = "erlang-ls", except-features = ["document-symbols", "workspace-symbols"] }]

With this configuration, <space>s on an Erlang file will use the syntax symbol picker, while <space>s on a Rust file will still prefer the language server.
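The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SymbolPicker`, `DocumentCapabilities`, and `choose_picker` are hypothetical names, and the two booleans stand in for whatever checks the editor actually performs. The error string is the one reported later in this thread.

```rust
/// Which picker a symbol command should open (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum SymbolPicker {
    /// Ask the language server for symbols.
    Lsp,
    /// Run the tree-sitter symbols query instead.
    Syntax,
}

/// Hypothetical summary of what is available for the current document.
struct DocumentCapabilities {
    /// A configured language server offers the symbols feature, and the
    /// feature has not been excluded with `except-features`.
    lsp_supports_symbols: bool,
    /// The document's language has a symbols query.
    has_symbols_query: bool,
}

/// Prefer the language server, fall back to syntax, otherwise report an error.
fn choose_picker(caps: &DocumentCapabilities) -> Result<SymbolPicker, &'static str> {
    if caps.lsp_supports_symbols {
        Ok(SymbolPicker::Lsp)
    } else if caps.has_symbols_query {
        Ok(SymbolPicker::Syntax)
    } else {
        Err("No language server supporting document symbols or syntax info available")
    }
}
```

Under these assumptions, the Erlang configuration above corresponds to `lsp_supports_symbols: false` with `has_symbols_query: true`, which selects the syntax picker. A Rust file with rust-analyzer running corresponds to `lsp_supports_symbols: true`, which selects the LSP picker.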

Some prior discussion of a feature like this is in #3518, which talks about Ctags support. The idea here is similar but extracts tags/symbols with tree-sitter instead.

Outstanding question: how closely should we try to match the LSP symbol kinds? Not at all? Should we have markup-specific symbol kinds? (For example, see Markdown's symbols.scm.)

@the-mikedavis the-mikedavis added the A-tree-sitter (Area: Tree-sitter), E-medium (Call for participation: Experience needed to fix: Medium / intermediate), and A-command (Area: Commands) labels Dec 16, 2024
nikvoid added a commit to nikvoid/helix that referenced this pull request Dec 28, 2024
@EricHenry
Contributor

I'm having trouble getting this to work. I pulled down the branch, but when I try to open the symbol picker without LSP enabled, I get the error "No language server supporting document symbols or syntax info available". I am testing this on a Rust project.

Any ideas?

@the-mikedavis
Member Author

There are only a few languages with symbols.scm queries so far: C, C++, Erlang, Elixir, Markdown and Python. Rust queries would need to be added. (Feel free to send a PR to this branch if you'd like. I always have rust-analyzer going so I haven't felt the need to add Rust yet.)

@cgahr
Contributor

cgahr commented Feb 6, 2025

I added symbols for typst: #12793

};

use arc_swap::ArcSwapAny;
use dashmap::DashMap;
Contributor

dashmap can be very slow to free memory (though it's fast on all other operations).

In my personal experience, scc is better at memory reclamation, which is probably good for a picker (avoid lingering effects from mapping the whole workspace for example).

Of course, we probably want benchmarking results first; it may not be an issue at all in practice.

Member Author

For now I like dashmap just because it's already a transitive dependency. It'd be good to check whether it's being wasteful in terms of memory and, if so, switch to something else. It's already a little bit overkill for what it's used for here.

@the-mikedavis the-mikedavis marked this pull request as draft February 17, 2025 19:14
@tda-ableton

As another use case or data point, I just tried this branch in a large cross-platform C++ project on Windows, where the project doesn't compile with clang itself and thus clangd is fairly cumbersome to set up for it. This tree-sitter based symbol lookup works pretty great and is surprisingly fast, given the size of the project.

I found one inconsistency compared to the LSP-based symbol picker: The tree-sitter based one seems to be always case sensitive, where the LSP picker uses smart case.

@the-mikedavis the-mikedavis force-pushed the syntax-symbol-pickers branch from 378bb8c to 057bc91 Compare July 1, 2025 14:01
@the-mikedavis
Member Author

Yeah, I find it mostly useful for C codebases so far, since clangd can be tricky to set up. Also, with the queries for Rust now (#12859), I occasionally find it useful to use the workspace symbol searcher before opening a Rust file (so rust-analyzer isn't running yet), when I'm looking for a specific symbol but don't know exactly where it is.

> I found one inconsistency compared to the LSP-based symbol picker: The tree-sitter based one seems to be always case sensitive, where the LSP picker uses smart case.

I think I had seen this as well, should be fixed in the most recent push.
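For readers unfamiliar with the term, "smart case" usually means that a query is matched case-insensitively unless it contains an uppercase character. A minimal sketch of that rule follows. It assumes plain substring matching rather than the fuzzy matching a picker would use, and `smart_case_matches` is a hypothetical helper, not a function from this PR.

```rust
/// Smart-case substring match (hypothetical helper for illustration).
/// An uppercase character anywhere in `query` makes the match case sensitive.
fn smart_case_matches(query: &str, candidate: &str) -> bool {
    let case_sensitive = query.chars().any(char::is_uppercase);
    if case_sensitive {
        // Match the query exactly as typed.
        candidate.contains(query)
    } else {
        // Fold both sides to lowercase before comparing.
        candidate.to_lowercase().contains(&query.to_lowercase())
    }
}
```

With this rule the query `parse` matches both `parse_error` and `ParseError`, while the query `Parse` matches only `ParseError`.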

@the-mikedavis the-mikedavis force-pushed the syntax-symbol-pickers branch from 057bc91 to 5ea92d1 Compare July 15, 2025 00:00
@the-mikedavis the-mikedavis marked this pull request as ready for review July 15, 2025 00:02
the-mikedavis and others added 4 commits July 18, 2025 11:12
Co-authored-by: cgahr <[email protected]>
Co-authored-by: eh <[email protected]>
Neither language server robustly supports workspace symbol search.
`erlang-ls`'s symbol picker takes a long time to open successfully on
boot. `elp`'s is faster but not faster than the tags query.
@the-mikedavis the-mikedavis force-pushed the syntax-symbol-pickers branch from 5ea92d1 to 4418e33 Compare July 18, 2025 15:17
@the-mikedavis the-mikedavis merged commit 4418e33 into master Jul 18, 2025
7 checks passed
@the-mikedavis the-mikedavis deleted the syntax-symbol-pickers branch July 18, 2025 15:33