mubeen-zulfiqar

@mubeen-zulfiqar mubeen-zulfiqar commented Sep 9, 2025

Related GitHub Issue

Closes: #5238 #6048

Description

Problem Summary

Roo-Code was failing to process and embed C# files despite having complete infrastructure support for C# language parsing. C# files were being detected but not properly indexed, resulting in missing code embeddings.

Root Cause Analysis

Issue Location

File: src/services/tree-sitter/queries/c-sharp.ts

Problem: Incorrect tree-sitter query patterns using malformed syntax that captured only node names instead of full AST definitions.

Technical Details

How Roo-Code Processes Files

  1. Language Detection: src/services/code-index/processors/file-watcher.ts detects file types

  2. Tree-sitter Parsing: src/services/tree-sitter/languageParser.ts loads language parsers

  3. AST Query Execution: src/services/code-index/processors/parser.ts executes queries

  4. Embedding Generation: Captured code blocks are sent for embedding

Failure Point

In src/services/code-index/processors/parser.ts, the processing logic:

 
const captures = query.captures(tree.rootNode);
console.log(`[DEBUG] Query captures: ${captures.length}`);
if (captures.length === 0) {
    console.log('[DEBUG] No captures found - language may be unsupported');
    // Falls back to basic text processing
}
 

Result: C# queries returned captures.length === 0, triggering "unsupported language" fallback behavior.
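
The fallback can be pictured with a minimal sketch. The `Capture` shape and the `chooseStrategy` helper are hypothetical names used only for illustration, not Roo-Code's actual API; the sketch assumes only what the snippet above shows, namely that an empty capture list sends the file down the plain-text path.

```typescript
// Hypothetical, simplified capture: real tree-sitter captures carry a node object.
interface Capture {
  name: string; // capture name, e.g. "definition.class"
  text: string; // source text covered by the captured node
}

interface Strategy {
  kind: "semantic" | "fallback";
  blocks: string[]; // text blocks that would be sent for embedding
}

// Assumption: zero captures means the whole file is embedded as one text block.
function chooseStrategy(captures: Capture[], fileContent: string): Strategy {
  if (captures.length === 0) {
    return { kind: "fallback", blocks: [fileContent] };
  }
  // Otherwise keep only the captures that cover a full definition node.
  const blocks = captures
    .filter((c) => c.name.startsWith("definition."))
    .map((c) => c.text);
  return { kind: "semantic", blocks };
}
```

With an empty capture list the helper returns the whole file as a single fallback block; with a `name` plus `definition.class` pair it returns only the definition text.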

Broken Query Patterns

Original (Incorrect):

 
// BROKEN: Only captures node names as text, not full AST nodes
queries.push(`(class_declaration name: (identifier) @name.definition.class)`);
queries.push(`(method_declaration name: (identifier) @name.definition.method)`);
queries.push(`(property_declaration name: (identifier) @name.definition.property)`);
 

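Problem: The capture @name.definition.class is attached to the identifier node only. Tree-sitter reads a dotted name as one flat capture name, not as nested attributes, so the enclosing declaration node is never captured; it needs its own separate capture.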

Query Pattern Syntax Explanation

Why We Use Separate Captures:

  • CORRECT: (node_type name: (identifier) @name) @definition.type
  • INCORRECT: (node_type name: (identifier) @name.definition.type)

Technical Reason:
The correct syntax uses two separate captures, (... @name) @definition.type, to capture both:

  1. @name - captures the identifier text for the name
  2. @definition.type - captures the full AST node for the definition

The incorrect syntax (@name.definition.type) places a single dotted capture on the identifier. Tree-sitter does not treat dotted names as nested captures, so no definition node is captured, which caused the "unsupported language" fallback behavior.

Future Contributors: Always use separate captures to prevent regressions. This pattern is consistent across all working language queries (Python, JavaScript, etc.).
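
A lightweight guard against this regression could be a string-level check on each pattern. The sketch below is an assumption-laden illustration: `usesSeparateCaptures` is a hypothetical helper, not part of Roo-Code or tree-sitter, and it only inspects the tail of the pattern text rather than parsing the query.

```typescript
// Matches "@name" closed by one or more parens, followed by a separate
// "@definition.<kind>" capture at the end of the pattern.
const SEPARATE_CAPTURES = /@name\)+\s*@definition\.[\w.]+\s*$/;

// Hypothetical helper: true when the pattern ends with the
// "(... @name) @definition.<kind>" shape described above.
function usesSeparateCaptures(pattern: string): boolean {
  return SEPARATE_CAPTURES.test(pattern);
}

const fixed = "(class_declaration name: (identifier) @name) @definition.class";
const broken = "(class_declaration name: (identifier) @name.definition.class)";

console.log(usesSeparateCaptures(fixed));
console.log(usesSeparateCaptures(broken));
```

A check like this is deliberately shallow. It would catch the specific mistake fixed in this PR, but it is no substitute for running the queries against real C# source.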

Fix Applied

Updated C# Query Patterns

Fixed Syntax:

 
// Classes
queries.push(`(class_declaration name: (identifier) @name) @definition.class`);

// Methods
queries.push(`(method_declaration name: (identifier) @name) @definition.method`);

// Properties
queries.push(`(property_declaration name: (identifier) @name) @definition.property`);

// Events
queries.push(`(event_field_declaration (variable_declarator name: (identifier) @name)) @definition.event`);

// Delegates
queries.push(`(delegate_declaration name: (identifier) @name) @definition.delegate`);

// Namespaces (including file-scoped)
queries.push(`(namespace_declaration name: (qualified_name) @name) @definition.namespace`);
queries.push(`(file_scoped_namespace_declaration name: (qualified_name) @name) @definition.namespace`);

// Interfaces
queries.push(`(interface_declaration name: (identifier) @name) @definition.interface`);

// Structs
queries.push(`(struct_declaration name: (identifier) @name) @definition.struct`);

// Enums
queries.push(`(enum_declaration name: (identifier) @name) @definition.enum`);
 
 

Key Changes

  1. Separated Captures: Changed from @name.definition.type to @name) @definition.type

  2. Proper Node Targeting: Each query now captures both the identifier name and the full AST node

  3. Consistent Syntax: Matches the working pattern used by Python, JavaScript, and other supported languages

  4. Qualified Namespace Support: Uses (qualified_name) instead of (identifier) to capture nested namespaces like MyCompany.MyProduct.MyModule

  5. File-Scoped Namespace Support: Added support for C# 10+ file-scoped namespace declarations
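
To see at a glance which constructs the updated patterns cover, the definition kinds can be read back out of the pattern strings. This is a sketch only: `definitionKinds` is a hypothetical helper, and the array simply repeats the patterns listed under "Fixed Syntax".

```typescript
// Patterns copied from the "Fixed Syntax" list in this description.
const csharpQueries: string[] = [
  "(class_declaration name: (identifier) @name) @definition.class",
  "(method_declaration name: (identifier) @name) @definition.method",
  "(property_declaration name: (identifier) @name) @definition.property",
  "(event_field_declaration (variable_declarator name: (identifier) @name)) @definition.event",
  "(delegate_declaration name: (identifier) @name) @definition.delegate",
  "(namespace_declaration name: (qualified_name) @name) @definition.namespace",
  "(file_scoped_namespace_declaration name: (qualified_name) @name) @definition.namespace",
  "(interface_declaration name: (identifier) @name) @definition.interface",
  "(struct_declaration name: (identifier) @name) @definition.struct",
  "(enum_declaration name: (identifier) @name) @definition.enum",
];

// Hypothetical helper: collect the distinct <kind> suffixes of the
// trailing @definition.<kind> captures, sorted alphabetically.
function definitionKinds(patterns: string[]): string[] {
  const kinds = new Set<string>();
  for (const pattern of patterns) {
    const match = pattern.match(/@definition\.([\w.]+)\s*$/);
    if (match && match[1]) {
      kinds.add(match[1]);
    }
  }
  return Array.from(kinds).sort();
}

console.log(definitionKinds(csharpQueries).join(", "));
// prints "class, delegate, enum, event, interface, method, namespace, property, struct"
```

Both namespace patterns map to the same kind, so ten patterns yield nine distinct kinds. A list like this could be compared against the cases covered in the C# spec file.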

Verification Process

Debug Test Results

Before Fix:

  • Query captures: 0

  • Status: "language may be unsupported"

  • Embedding: Fallback to full file content

After Fix:

  • Query captures: >0 (depends on C# file content)

  • Status: Proper semantic parsing

  • Embedding: Individual code blocks (classes, methods, properties, etc.)
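
How paired captures might become individual code blocks can be sketched as follows. The `Capture` and `Definition` types and the `toDefinitions` helper are hypothetical and simplified: each inner array stands in for the captures of one query match, and the real parser works on tree-sitter nodes rather than plain strings.

```typescript
// Hypothetical, simplified capture with plain text instead of an AST node.
interface Capture {
  name: string; // "name" or "definition.<kind>"
  text: string; // source text covered by the capture
}

interface Definition {
  kind: string; // e.g. "class", "method"
  name: string; // identifier text from the @name capture
  source: string; // full definition text from the @definition.<kind> capture
}

// Pair the @name capture with the @definition.<kind> capture of each match.
function toDefinitions(matches: Capture[][]): Definition[] {
  const result: Definition[] = [];
  for (const captures of matches) {
    const nameCap = captures.find((c) => c.name === "name");
    const defCap = captures.find((c) => c.name.startsWith("definition."));
    if (!nameCap || !defCap) continue; // a match missing either capture is skipped
    result.push({
      kind: defCap.name.slice("definition.".length),
      name: nameCap.text,
      source: defCap.text,
    });
  }
  return result;
}

const blocks = toDefinitions([
  [
    { name: "name", text: "Fibonacci" },
    { name: "definition.method", text: "public static long Fibonacci(int n) { /* ... */ }" },
  ],
  // A name-only match, as the old patterns would have produced, yields no block.
  [{ name: "name.definition.method", text: "Factorial" }],
]);

console.log(blocks.length, blocks[0].kind, blocks[0].name);
// prints: 1 method Fibonacci
```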

Test Procedure

This update was tested on a project containing several hundred C# files, where the queries returned non-zero capture counts. An example test is shown below.

Test File Used

 
// test-math-functions.cs
using System;

namespace MathFunctions
{
    public static class MathFunctions1
    {
        public static long Fibonacci(int n)
        {
            if (n <= 0) return 0;
            if (n == 1) return 1;
            long a = 0, b = 1;
            for (int i = 2; i <= n; i++)
            {
                long temp = a + b;
                a = b;
                b = temp;
            }
            return b;
        }

        public static double AreaOfTriangle(double baseLength, double height)
        {
            if (baseLength <= 0 || height <= 0)
                throw new ArgumentException("Base and height must be positive values");
            return 0.5 * baseLength * height;
        }

        public static long Factorial(int n)
        {
            if (n < 0)
                throw new ArgumentException("Factorial is not defined for negative numbers");
            if (n == 0 || n == 1) return 1;
            long result = 1;
            for (int i = 2; i <= n; i++)
            {
                result *= i;
            }
            return result;
        }
    }
}
 
[Image: Roo-Code indexing C# files]

Build Result

Impact

  • C# Support: Now fully functional with proper semantic parsing

  • Other Languages: Unaffected (already working correctly)

  • Performance: No degradation - same processing pipeline

  • Compatibility: Maintains all existing Roo-Code functionality

Get in Touch

Discord: mubeen_zulfiqar

Email: [email protected]


Important

Fixes C# tree-sitter query patterns in c-sharp.ts to enable proper AST node capturing and embedding generation.

  • Behavior:
    • Fixes C# tree-sitter query patterns in c-sharp.ts to correctly capture AST nodes.
    • Resolves issue where C# files were detected but not indexed, leading to missing embeddings.
  • Technical Details:
    • Changes capture syntax from @name.definition.type to @name) @definition.type.
    • Supports qualified and file-scoped namespaces, classes, methods, properties, events, delegates, interfaces, structs, enums, records, attributes, type parameters, and LINQ expressions.
  • Verification:
    • Debugging shows query captures now return >0, enabling proper semantic parsing and embedding of C# code blocks.

This description was created by Ellipsis for aa9213b. You can customize this summary. It will automatically update as commits are pushed.

@hannesrudolph hannesrudolph added the Issue/PR - Triage label Sep 9, 2025

@roomote roomote bot left a comment


Thank you for your contribution! I've reviewed the changes and the fix correctly addresses the C# indexing issues reported in #5238 and #6048. The query syntax change aligns with tree-sitter's expected format and all tests pass. I have some suggestions for improvement below.

@mubeen-zulfiqar mubeen-zulfiqar marked this pull request as ready for review September 9, 2025 10:50
@dosubot dosubot bot added the size:M and bug labels Sep 9, 2025
@mubeen-zulfiqar mubeen-zulfiqar marked this pull request as draft September 9, 2025 11:04
@mubeen-zulfiqar mubeen-zulfiqar marked this pull request as ready for review September 10, 2025 07:32
@mubeen-zulfiqar
Author

@mrubens @jr @cte @hannesrudolph I wanted to start by saying that Roo Code is a huge contribution to open source—its versatility and widespread usage are truly inspiring, and I’m proud to contribute.

I’ve just submitted PR #7813, “fix: corrected C# tree-sitter query,” which resolves the issue where C# files were detected but not properly indexed, resulting in fallback to basic text processing.

Key improvements:
• Root cause: malformed capture syntax (@name.definition.type) resulted in zero AST captures.
• Fix: separated captures properly (for example, (identifier) @name) @definition.class), bringing C# indexing in line with other supported languages.
• Now supports a broad range of C# constructs: classes, methods, properties, events, delegates, namespaces (including file-scoped), interfaces, structs, enums, and more.
• Tests on several hundred C# files now show positive capture counts, enabling semantic parsing and embedding without regressions or performance issues.

Automated Review Feedback: roomote[bot] has already reviewed the change and confirmed that it successfully addresses the indexing issues described in issues #5238 and #6048.

Could you please take a look when you have a moment? I’d appreciate any feedback or approval so we can merge this and restore full C# support.

Thanks, and keep up the amazing work with Roo Code!

Best regards,
@mubeen-zulfiqar

@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Sep 10, 2025
@hannesrudolph hannesrudolph added the PR - Needs Preliminary Review label and removed the Issue/PR - Triage label Sep 10, 2025
@adamhill

adamhill commented Sep 13, 2025

@mubeen-zulfiqar This is so awesome! I had a strange feeling that CBI was worse than it was originally. And it turns out that was because 95% of my codebases are C#.

Thank you! ❤️

@KJ7LNW Do you think we need to vet all the queries for at least basic parsing? I don't understand how this slipped by with all the corpus tests there are in most t-s project directories.

@mubeen-zulfiqar
Author

@adamhill Thanks a lot! 🙌 Seeing someone hit that issue in real life makes this work worthwhile — this definitely helps folks whose projects are heavily C#. I appreciate you sharing that feedback.

Really glad to be contributing alongside you on Roo Code!

@KJ7LNW

KJ7LNW commented Sep 13, 2025

@KJ7LNW Do you think we need to vet all the queries for at least basic parsing? I don't understand how this slipped by with all the corpus tests there are in most t-s project directories.

I am not familiar enough with C# to check this directly, but make sure that src/services/tree-sitter/__tests__/parseSourceCodeDefinitions.c-sharp.spec.ts has a test for every item that you plan to validate, to prevent future regressions.

Collaborator

@daniel-lxs daniel-lxs left a comment


Thank you @mubeen-zulfiqar!

@dosubot dosubot bot added the lgtm label Sep 15, 2025
@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Needs Review] in Roo Code Roadmap Sep 15, 2025
@mrubens mrubens merged commit 1b4819c into RooCodeInc:main Sep 15, 2025
22 checks passed
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Sep 15, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 15, 2025
@mubeen-zulfiqar
Author

mubeen-zulfiqar commented Sep 15, 2025

@mrubens @daniel-lxs
I truly appreciate Roo-Code’s open source work and feel honored to contribute. Thank you! 🙏

Labels
bug Something isn't working lgtm This PR has been approved by a maintainer PR - Needs Preliminary Review size:M This PR changes 30-99 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Codebase indexing doesnt work with dotnet
6 participants