Skip to content

Conversation

a7medev
Copy link
Member

@a7medev a7medev commented Sep 13, 2025

Description

Separate signature and parameter documentation by parsing the documentation returned by sourcekitd using swift-markdown and extracting the parameter documentation.

Alternatives considered

The parameter extraction implementation is almost ported from the implementation in the Swift compiler codebase. The problem with doing that in the Swift compiler codebase is that once you parse a the comment as markdown into a Document you cannot easily convert it back into markdown (we'd need to write our own markdown formatter). Besides, cmark doesn't handle Doxygen commands.

We considered using swift-docc but we faced some problems with it:

  1. It's not easy to use. You need to retrieve a symbol graph (typically through a cursor info request) and use that along with other info to interact with DocC.
  2. The result returned by DocC can't be directly converted to markdown, we'd need to provide our own DocC markdown renderer.

Implementing this using swift-markdown allows us to easily parse the comment, process it, convert it back to markdown and it also provides minimal parsing for Doxygen commands (we're only interested in \param) allowing us to use the same implementation for Clang-based declarations too.

Although this approach involves code duplication, it's works for our simple use case and we can probably improve it a bit in the future.

Dependencies

  1. Add preceding newlines before Doxygen commands swift-markdown#237: Fixes a bug in swift-markdown where Doxygen commands are formatted without preceding newlines.
  2. [IDE] Add internal parameter names in signature help swift#84272: Adds the parameter name to the signature's parameters allowing matching parameters with their documentation using their name.

let parametersPrefix = "parameters:"
let headingContent = headingText.string.trimmingCharacters(in: .whitespaces)

guard headingContent.lowercased().hasPrefix(parametersPrefix) else {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We check for a prefix instead of headingContent == parametersPrefix to match the behavior in Xcode.

@a7medev
Copy link
Member Author

a7medev commented Sep 13, 2025

@a7medev a7medev force-pushed the feat/signature-help-separate-param-doc branch from b94a4a3 to da27530 Compare September 13, 2025 16:57
@a7medev
Copy link
Member Author

a7medev commented Sep 13, 2025

@a7medev
Copy link
Member Author

a7medev commented Sep 13, 2025

@a7medev a7medev force-pushed the feat/signature-help-separate-param-doc branch from da27530 to 0ac09ca Compare September 13, 2025 23:28
@a7medev
Copy link
Member Author

a7medev commented Sep 13, 2025

@a7medev
Copy link
Member Author

a7medev commented Sep 13, 2025

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool. I would have preferred to use docc here but maybe this is the simpler approach.

Do you know how hard it would be to get the symbol graph for a completion item, similar to how we get the documentation comment of a completion item. I’m just wondering that maybe it wouldn’t be all to hard to do so. But I haven’t tried.

@a7medev
Copy link
Member Author

a7medev commented Sep 14, 2025

Do you know how hard it would be to get the symbol graph for a completion item, similar to how we get the documentation comment of a completion item. I’m just wondering that maybe it wouldn’t be all to hard to do so. But I haven’t tried.

@ahoppen That should be doable using a cursor info request provided the USR I think. There are other things that DocC needs (and I'm not entirely aware of what each of them is used for/represents). Besides, conversion from DocC response to markdown would require implementing a renderer within SourceKit-LSP (maybe that should be done in the long-term anyway but I'm not sure if it fits in the timeline here).

I can attempt to use DocC again and see how it goes though. Last time I tried it wasn't a very pleasant experience. 😅


QQ, tests are failing on Windows due to issues with CMakeLists.txt, is there any documentation for how the CMake configuration with Swift is structured and how that maps to the information in Package.swift? It'd also be great if I can test the build locally on macOS before running the Windows tests (which take too long), is that doable?

Comment on lines 171 to 174
let name = String(components[0]).trimmingCharacters(in: .whitespaces)
guard !name.isEmpty else {
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about raw identifiers e.g - `foo bar`: hello?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, can you please recheck? 🙏🏼

@a7medev
Copy link
Member Author

a7medev commented Sep 17, 2025

@ahoppen @hamishknight I added the discussion for why we didn't use swift-docc, eliminated mutable state in ParametersDocumentationExtractor, added support for raw identifiers, and added more tests. Can you please recheck? 🙏🏼

@a7medev
Copy link
Member Author

a7medev commented Sep 17, 2025

@a7medev
Copy link
Member Author

a7medev commented Sep 18, 2025

@ahoppen
Copy link
Member

ahoppen commented Sep 20, 2025

That should be doable using a cursor info request provided the USR I think. There are other things that DocC needs (and I'm not entirely aware of what each of them is used for/represents). Besides, conversion from DocC response to markdown would require implementing a renderer within SourceKit-LSP (maybe that should be done in the long-term anyway but I'm not sure if it fits in the timeline here).

Definitely out-of-scope for this PR, I would just like to explore future options so we can decide whether it is something we want to investigate in the future and file an actionable issue. Do you know if it would be possible to get the symbol graph of a completion item without having to go through a cursor info (@hamishknight, maybe you know)? Because if it was, I think the main thing to do would be to implement the symbol graph to Markdown renderer, which while being a bit of work might come in handy in the future anyway.

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test cases and refining this. I have a few more comments on the implementation, the behavior looks great to me.

/// - single: Whether the parameter is a single parameter or part of a parameter outline
///
/// - Returns: A tuple containing the parameter name and documentation if a parameter was found, nil otherwise.
private func extractParam(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple minor-ish things that I find confusing with this implementation:

  • firstTextContent kind of repeats some information that is already present in listItem, just in a slightly modified form. It’s not trivial to reason about when the first text in listItem should be used and when firstTextContent. If it’s possible, I would prefer if we could modify listItem in the extractSingle case to strip away the parameter: prefix. Alternatively, I would either re-implement the parameter: stripping here (I guess it shouldn’t be too difficult) or add documentation that explains exactly how firstTextContent and listItem relate to each other.
  • extractRawIdentifier takes a completely different code path than the non-raw identifier path, which makes it easy for bugs to creep in in one of the paths that aren’t present in any of the others. Would it be possible to unify the two implementations so that both use the technique using split(separator:) or the technique of eating elements from the markdown structure until they hit a :?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • firstTextContent kind of repeats some information that is already present in listItem, just in a slightly modified form. It’s not trivial to reason about when the first text in listItem should be used and when firstTextContent. If it’s possible, I would prefer if we could modify listItem in the extractSingle case to strip away the parameter: prefix. Alternatively, I would either re-implement the parameter: stripping here (I guess it shouldn’t be too difficult) or add documentation that explains exactly how firstTextContent and listItem relate to each other.

I removed the firstTextContent parameter and passed a modified ListItem instead in extractSingleParameter as you suggested.

  • extractRawIdentifier takes a completely different code path than the non-raw identifier path, which makes it easy for bugs to creep in in one of the paths that aren’t present in any of the others. Would it be possible to unify the two implementations so that both use the technique using split(separator:) or the technique of eating elements from the markdown structure until they hit a :?

The problem is the markdown structure is different for raw vs. normal parameter name:

  1. Normal parameter names exist within a Text node as the first child of Paragraph and you can find the name by simply splitting Text around :.
  2. Raw parameter names split parameter documentation into multiple inline markup nodes:
    1. In a parameter outline, the structure is: InlineCode (name), Text (": " + part of the parameter documentation).
    2. In a single parameter, the structure is: Text ("parameter "), InlineCode (name), Text (": " + part of parameter documentation).

That makes merging these into a single flow a bit difficult. Parsing until a : is found can also cause issues depending on the implementation:

  1. If we just format the list item and look for the first : anywhere in the formatted string it'd cause problems with raw identifiers containing a colon (e.g. - `interesting: param`: abc).
  2. If we specifically look for the first Text item containing a colon, we'd still need to enforce the valid parameter structures. (e.g. - `foo` **bar**: baz isn't a valid parameter). It's not impossible, but it'd still have some complexity.

I'm curious to know what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess with your explanation I agree that having two entirely different code paths for the documentation extraction is the best way to do it. A little unfortunate but that’s the way it is.

@a7medev a7medev requested a review from ahoppen September 20, 2025 20:32
Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve got a few more nitpicky comments, otherwise this looks great to me now.

/// - Parameter param: description
/// ```
///
/// - Returns: True if the list item has single parameter documentation, false otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not noticing this before but I think the doc comment is out-of-date. Same for some other Returns clauses as well.

return extractParameter(listItem: remainingListItem, single: true)
}

/// Extracts a parameter field from a list item provided the relevant first text content allowing reuse in ``extractParametersOutline`` and ``extractSingleParameter``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this comment is out-of-date now.


let remainingContent = String(paragraphContent[prefixEnd...]).trimmingCharacters(in: .whitespaces)

var remainingParagraph = paragraph
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a short comment saying something like the following would be helpful here.

Remove the Parameter: prefix from the list item from the list item so we can extract the parameter’s documentation using extractParameter

/// - single: Whether the parameter is a single parameter or part of a parameter outline
private func extractParameterWithRawIdentifier(from listItem: ListItem, single: Bool) -> Parameter? {
/// The index of ``InlineCode`` for the raw identifier parameter name in the first paragraph of ``listItem``
let inlineCodeIndex = single ? 1 : 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we avoid needing to passing through the single parameter if you change extractSingleParameter to do something like the following? I think that would be nicer because the markdown structure of single and multiple parameters would be more similar this way.

if remainingContent.isEmpty {
  remainingParagraph.replaceChildrenInRange(0..<1, with: [])
} else {
  remainingParagraph.replaceChildrenInRange(0..<1, with: [Text(remainingContent)])
}

/// - single: Whether the parameter is a single parameter or part of a parameter outline
///
/// - Returns: A tuple containing the parameter name and documentation if a parameter was found, nil otherwise.
private func extractParam(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess with your explanation I agree that having two entirely different code paths for the documentation extraction is the best way to do it. A little unfortunate but that’s the way it is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants