Skip to content

Conversation

dcreager
Copy link
Member

This PR updates our iterator protocol machinery to return a tuple spec describing the elements that are returned, instead of a type. That allows us to track heterogeneous iterators more precisely, and consolidates the logic in unpacking and splatting, which are the two places where we can take advantage of that more precise information. (Other iterator consumers, like for loops, have to collapse the iterated elements down to a single type regardless, and we provide a new helper method on TupleSpec to perform that summarization.)

dcreager added 4 commits July 21, 2025 17:05
* main:
  [ty] Use `ThinVec` for sub segments in `PlaceExpr` (#19470)
  [ty] Splat variadic arguments into parameter list (#18996)
  [`flake8-pyi`] Skip fix if all `Union` members are `None` (`PYI016`)  (#19416)
  Skip notebook with errors in ecosystem check (#19491)
  [ty] Consistent use of American english (in rules) (#19488)
  [ty] Support iterating over enums (#19486)
  Fix panic for illegal `Literal[…]` annotations with inner subscript expressions (#19489)
  Move fix suggestion to subdiagnostic (#19464)
  [ty] Implement non-stdlib stub mapping for classes and functions (#19471)
  [ty] Disallow illegal uses of `ClassVar` (#19483)
  [ty] Disallow `Final` in function parameter/return-type annotations (#19480)
  [ty] Extend `Final` test suite (#19476)
  [ty] Minor change to diagnostic message for invalid Literal uses (#19482)
  [ty] Detect illegal non-enum attribute accesses in Literal annotation (#19477)
  [ty] Reduce size of `TypeInference` (#19435)
  Run MD tests for Markdown-only changes (#19479)
  Revert "[ty] Detect illegal non-enum attribute accesses in Literal annotation"
  [ty] Detect illegal non-enum attribute accesses in Literal annotation
  [ty] Added semantic token support for more identifiers (#19473)
  [ty] Make tuple subclass constructors sound (#19469)
@dcreager dcreager added the ty Multi-file analysis & type inference label Jul 22, 2025
Copy link
Contributor

github-actions bot commented Jul 22, 2025

mypy_primer results

No ecosystem changes detected ✅
No memory usage changes detected ✅

@dcreager dcreager marked this pull request as ready for review July 22, 2025 21:14
Comment on lines -982 to -986
// Some awkward special handling is required here because of the fact
// that calling `try_iterate()` on `Never` returns `Never`,
// but `tuple[Never, ...]` eagerly simplifies to `tuple[()]`,
// which will cause us to emit false positives if we index into the tuple.
// Using `tuple[Unknown, ...]` avoids these false positives.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this logic over into Type::try_iterate, but it seems inconsistent with how we interpret iterating over Never in e.g. a for loop. Before, this would reveal Never:

def _(never: Never):
    for x in never:
        reveal_type(x)  # revealed: Never

i.e., the body is never executed, just like when we iterate over the empty tuple:

for x in ():
    reveal_type(x)  # revealed: Never

This is consistent with simplifying tuple[Never, ...] to tuple[()].

The interpretation described here is that iterating over Never might produce an arbitrary number of elements. That requires changing the for loop test to reveal Unknown.

To me, the interpretation without this special case seems more intuitive, especially when you consider it in terms of the for loop example. @AlexWaygood, you added this special case in #19469. Were there examples in the ecosystem check of indexing into this kind of tuple, which necessitated this special case?

Copy link
Member

@AlexWaygood AlexWaygood Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I added the special case in #18987, not #19469 — it was to tackle false positives I saw in the ecosystem report for sympy on early versions of that PR. If you look at the revisions to the primer report in #18987 (comment), the false positives for sympy are present in the penultimate version of the comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me the thing that seems wrong in this comment is that we simplify tuple[Never, ...] to tuple[()]. Shouldn't it simplify to Never instead, and then we wouldn't get false positives indexing into it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it simplify to Never instead

No because that would imply the type is uninhabited. tuple[Never, ...] is inhabited by the empty tuple (), the same as list[Never] cannot simplify to Never because it is inhabited by the empty list [].

The situation for Never type arguments to homogeneous tuples here is closer to the situation for other generic containers than it is to the situation for heterogeneous tuples

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that makes sense, but I do wonder if there is any practical benefit to that simplification. It seems like there is a practical cost, because tuple[Never, ...] is a type that can arise in unreachable code, and the indexing errors aren't desirable there. It seems like there might be only upside to just leaving tuple[Never, ...] as-is, instead of simplifying it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcreager added that particular simplification so I'll defer to him, but I'm a bit reluctant to give up on it. It feels obviously correct from the perspective of set-theoretic types.

But we probably don't need the somewhat unprincipled special case I added here that Doug was asking me about -- a better fix would probably just be to check here whether the node is reachable before emitting the diagnostic:

(Type::Tuple(tuple_ty), Type::IntLiteral(int), _) if i32::try_from(int).is_ok() => {
let tuple = tuple_ty.tuple(self.db());
tuple
.py_index(
self.db(),
i32::try_from(int).expect("checked in branch arm"),
)
.unwrap_or_else(|_| {
report_index_out_of_bounds(
&self.context,
"tuple",
value_node.into(),
value_ty,
tuple.len().display_minimum(),
int,
);
Type::unknown()
})
}

I think the machinery @sharkdp added should allow us to query that realtively easily?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think currently in order to query reachability for a node we have to know in advance that that node might need its reachability queried, and proactively store it in semantic indexing. Which has a memory cost if the category of nodes we are adding is large.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility would be to special-case that iterating Never gives Never, not tuple[Never, ...] or tuple[Unknown, ...]. That is, it is "not iterable", as opposed to "iterable returning no elements" or "iterable returning unknown elements". That would let us keep the simplification of tuple[Never, ...] to tuple[()], which I agree seems consistent with first principles.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If iterating Never were to emit a "not iterable" diagnostic, then I think we'd have to special-case avoiding that diagnostic in unreachable code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we probably don't need the somewhat unprincipled special case I added here that Doug was asking me about -- a better fix would probably just be to check here whether the node is reachable before emitting the diagnostic:

I've left the Never special case for now, and added a TODO to consider doing this as a replacement. (This also leaves in place the simplification of tuple[Never, ...] to tuple[()].)

string_literal_ty.python_len(self.db()),
)))
}
Type::LiteralString => Cow::Owned(Tuple::homogeneous(Type::LiteralString)),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This special case isn't needed any more, since we can now interpret str's iteration behavior in the typeshed correctly.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

dcreager added 2 commits July 23, 2025 13:12
* main:
  [ty] Fix narrowing and reachability of class patterns with arguments (#19512)
  [ty] Implemented partial support for "find references" language server feature. (#19475)
  [`flake8-use-pathlib`] Add autofix for `PTH101`, `PTH104`, `PTH105`, `PTH121` (#19404)
  [`perflint`] Parenthesize generator expressions (`PERF401`) (#19325)
  [`pep8-naming`] Fix `N802` false positives for `CGIHTTPRequestHandler` and `SimpleHTTPRequestHandler` (#19432)
  [`pylint`] Handle empty comments after line continuation (`PLR2044`) (#19405)
  Move concise diagnostic rendering to `ruff_db` (#19398)
  [ty] highlight the argument in `static_assert` error messages (#19426)
  [ty] Infer single-valuedness for enums based on `int`/`str` (#19510)
  [ty] Restructure submodule query around `File` dependency
  [ty] Make `Module` a Salsa ingredient
  [ty] Reachability analysis for `isinstance(…)` branches (#19503)
  [ty] Normalize single-member enums to their instance type (#19502)
  [ty] Invert `ty_ide` and `ty_project` dependency (#19501)
  [ty] Implement mock language server for testing (#19391)
  [ty] Detect enums if metaclass is a subtype of EnumType/EnumMeta (#19481)
  [ty] perform type narrowing for places marked `global` too (#19381)
@dcreager dcreager merged commit e0149cd into main Jul 23, 2025
37 checks passed
@dcreager dcreager deleted the dcreager/iterate-tuple branch July 23, 2025 21:11
UnboundVariable pushed a commit to UnboundVariable/ruff that referenced this pull request Jul 24, 2025
…hlight

* 'main' of https://github.com/astral-sh/ruff:
  [ty] Minor: fix incomplete docstring (astral-sh#19534)
  [ty] Move server tests as integration tests (astral-sh#19522)
  [`ruff`] Offer fixes for `RUF039` in more cases (astral-sh#19065)
  [ty] Support `dataclasses.InitVar` (astral-sh#19527)
  [`ruff`] Fix `RUF033` breaking with named default expressions (astral-sh#19115)
  Update pre-commit hook name (astral-sh#19530)
  Bump 0.12.5 (astral-sh#19528)
  [ty] Rename type_api => ty_extensions (astral-sh#19523)
  [ty] Added support for "go to references" in ty playground. (astral-sh#19516)
  [ty] Return a tuple spec from the iterator protocol (astral-sh#19496)
  [ty] Exhaustiveness checking & reachability for `match` statements (astral-sh#19508)
  [ty] Fix narrowing and reachability of class patterns with arguments (astral-sh#19512)
sharkdp added a commit to astral-sh/ty that referenced this pull request Jul 25, 2025
Changes which I chose not to include; let me know if one of these should
be added:

```
- Add warning for unknown `TY_MEMORY_REPORT` value ([#19465](astral-sh/ruff#19465))
- Add goto definition to playground ([#19425](astral-sh/ruff#19425))
- Added support for "go to references" in ty playground. ([#19516](astral-sh/ruff#19516))
- Fall back to `Unknown` if no type is stored for an expression ([#19517](astral-sh/ruff#19517))
- Make `Module` a Salsa ingredient ([#19495](astral-sh/ruff#19495))
- Return a tuple spec from the iterator protocol ([#19496](astral-sh/ruff#19496))
```

---------

Co-authored-by: Alex Waygood <[email protected]>
AlexWaygood pushed a commit that referenced this pull request Jul 25, 2025
This PR updates our iterator protocol machinery to return a tuple spec
describing the elements that are returned, instead of a type. That
allows us to track heterogeneous iterators more precisely, and
consolidates the logic in unpacking and splatting, which are the two
places where we can take advantage of that more precise information.
(Other iterator consumers, like `for` loops, have to collapse the
iterated elements down to a single type regardless, and we provide a new
helper method on `TupleSpec` to perform that summarization.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ty Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants