Commit 5bebbf7

Python 3.13 support (#13823)
In order to support Python 3.13, we had to migrate to Cython 3.0. This caused some tricky interaction with our Pydantic usage, because Cython 3 uses the from __future__ import annotations semantics, which causes type annotations to be saved as strings.

The end result is that we can't have Language.factory decorated functions in Cython modules anymore, as the Language.factory decorator expects to inspect the signature of the functions and build a Pydantic model. If the function is implemented in Cython, an error is raised because the type is not resolved.

To address this I've moved the factory functions into a new module, spacy.pipeline.factories. I've added __getattr__ importlib hooks to the previous locations, in case anyone was importing these functions directly. The change should have no backwards compatibility implications.

Along the way I've also refactored the registration of functions for the config. Previously these ran as import-time side-effects, using the registry decorator. I've instead created a new module, spacy.registrations. When the registry is accessed it calls a function ensure_populated(), which causes the registrations to occur. I've made a similar change to the Language.factory registrations in the new spacy.pipeline.factories module.

I want to remove these import-time side-effects so that we can speed up the loading time of the library, which can be especially painful on the CLI. I also find that I'm often working to track down the implementations of functions referenced by strings in the config. Having the registrations all happen in one place will make this easier.

With these changes I've fortunately avoided the need to migrate to Pydantic v2 properly: we're still using the v1 compatibility shim. We might not be able to hold out forever though: Pydantic (reasonably) aren't actively supporting the v1 shims. I put a lot of work into v2 migration when investigating the 3.13 support, and it's definitely challenging.

In any case, it's a relief that we don't have to do the v2 migration at the same time as the Cython 3.0/Python 3.13 support.
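
To illustrate the annotation problem described in the message, here is a minimal pure-Python sketch. It writes the annotations as string literals by hand to mimic what from __future__ import annotations semantics produce. The function make_component is hypothetical and not part of spaCy, and the sketch does not reproduce the Cython-specific failure. It only shows that signature inspection sees strings rather than types, which any code building a model from the signature then has to resolve.

```python
import inspect
import typing


# Hypothetical factory; string annotations mimic postponed evaluation.
def make_component(name: "str", threshold: "float" = 0.5) -> "dict":
    return {"name": name, "threshold": threshold}


# inspect.signature does not evaluate string annotations by default,
# so each parameter annotation is still a plain string here.
signature = inspect.signature(make_component)
raw = {key: param.annotation for key, param in signature.parameters.items()}

# typing.get_type_hints evaluates the strings against the function's
# namespace and returns real type objects, if every name can be resolved.
resolved = typing.get_type_hints(make_component)

print(raw)
print(resolved)
```

If a name used in a string annotation cannot be found in the namespace used for resolution, the resolution step raises an error, which is the kind of failure the message attributes to Cython-implemented functions.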
1 parent 911539e commit 5bebbf7
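
The __getattr__ hooks mentioned in the commit message presumably build on module-level __getattr__ (PEP 562), which Python calls when normal attribute lookup on a module fails. The sketch below is an assumption about the general shape of such a hook, not the code from this commit. The module names demo_factories and demo_tagger and the function make_tagger are hypothetical, and the modules are created in memory so the example is self-contained.

```python
import importlib
import sys
import types

# Hypothetical "new location" module holding the moved function.
new_mod = types.ModuleType("demo_factories")


def make_tagger(name):
    return f"tagger:{name}"


new_mod.make_tagger = make_tagger
sys.modules["demo_factories"] = new_mod

# Hypothetical "old location" module that no longer defines make_tagger.
old_mod = types.ModuleType("demo_tagger")

# Maps each moved name to the module that now owns it.
_MOVED = {"make_tagger": "demo_factories"}


def _module_getattr(name):
    # Called only when normal lookup on old_mod fails (PEP 562).
    if name in _MOVED:
        target = importlib.import_module(_MOVED[name])
        return getattr(target, name)
    raise AttributeError(f"module 'demo_tagger' has no attribute {name!r}")


old_mod.__getattr__ = _module_getattr
sys.modules["demo_tagger"] = old_mod

# Existing imports from the old location keep working.
from demo_tagger import make_tagger as legacy_make_tagger

print(legacy_make_tagger("en"))
```

In a real package the hook would be a function named __getattr__ defined at the top level of the old module's source file. Importing the new module lazily inside the hook also avoids a circular import at load time.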

84 files changed, +2226 -1272 lines changed

.github/workflows/tests.yml

Lines changed: 7 additions & 6 deletions
@@ -45,11 +45,12 @@ jobs:
         run: |
           python -m pip install flake8==5.0.4
           python -m flake8 spacy --count --select=E901,E999,F821,F822,F823,W605 --show-source --statistics
-      - name: cython-lint
-        run: |
-          python -m pip install cython-lint -c requirements.txt
-          # E501: line too log, W291: trailing whitespace, E266: too many leading '#' for block comment
-          cython-lint spacy --ignore E501,W291,E266
+      # Unfortunately cython-lint isn't working after the shift to Cython 3.
+      #- name: cython-lint
+      # run: |
+      # python -m pip install cython-lint -c requirements.txt
+      # # E501: line too log, W291: trailing whitespace, E266: too many leading '#' for block comment
+      # cython-lint spacy --ignore E501,W291,E266

   tests:
     name: Test
@@ -58,7 +59,7 @@ jobs:
       fail-fast: true
       matrix:
         os: [ubuntu-latest, windows-latest, macos-latest]
-        python_version: ["3.9", "3.12"]
+        python_version: ["3.9", "3.12", "3.13"]

     runs-on: ${{ matrix.os }}

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -4,5 +4,6 @@ include README.md
 include pyproject.toml
 include spacy/py.typed
 recursive-include spacy/cli *.yml
+recursive-include spacy/tests *.json
 recursive-include licenses *
 recursive-exclude spacy *.cpp

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 [build-system]
 requires = [
     "setuptools",
-    "cython>=0.25,<3.0",
+    "cython>=3.0,<4.0",
     "cymem>=2.0.2,<2.1.0",
     "preshed>=3.0.2,<3.1.0",
     "murmurhash>=0.28.0,<1.1.0",

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ setuptools
 packaging>=20.0
 # Development dependencies
 pre-commit>=2.13.0
-cython>=0.25,<3.0
+cython>=3.0,<4.0
 pytest>=5.2.0,!=7.1.0
 pytest-timeout>=1.3.0,<2.0.0
 mock>=2.0.0,<3.0.0

setup.cfg

Lines changed: 2 additions & 2 deletions
@@ -30,11 +30,11 @@ project_urls =
 [options]
 zip_safe = false
 include_package_data = true
-python_requires = >=3.9,<3.13
+python_requires = >=3.9,<3.14
 # NOTE: This section is superseded by pyproject.toml and will be removed in
 # spaCy v4
 setup_requires =
-    cython>=0.25,<3.0
+    cython>=3.0,<4.0
     numpy>=2.0.0,<3.0.0; python_version < "3.9"
     numpy>=2.0.0,<3.0.0; python_version >= "3.9"
     # We also need our Cython packages here to compile against

spacy/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@
 from .errors import Errors
 from .glossary import explain # noqa: F401
 from .language import Language
+from .registrations import REGISTRY_POPULATED, populate_registry
 from .util import logger, registry # noqa: F401
 from .vocab import Vocab
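
The import added above ties in with the lazy registration scheme described in the commit message, where registrations run on first registry access rather than at import time. The following is a minimal sketch of that idea under stated assumptions. The LazyRegistry class, its methods, and the registered name demo.WhitespaceTokenizer.v1 are all hypothetical, and the actual spacy.registrations module may be structured quite differently.

```python
from typing import Callable, Dict


class LazyRegistry:
    """Hypothetical registry that fills itself on first lookup."""

    def __init__(self, populate: Callable[["LazyRegistry"], None]) -> None:
        self._populate = populate
        self._funcs: Dict[str, Callable] = {}
        self.populated = False

    def register(self, name: str, func: Callable) -> None:
        self._funcs[name] = func

    def ensure_populated(self) -> None:
        # Run the registrations at most once, and only when needed.
        if not self.populated:
            self.populated = True
            self._populate(self)

    def get(self, name: str) -> Callable:
        self.ensure_populated()
        return self._funcs[name]


def create_whitespace_tokenizer() -> Callable[[str], list]:
    # Hypothetical registered function; returns a trivial tokenizer.
    return str.split


def populate_registry(reg: LazyRegistry) -> None:
    # All registrations live in one function instead of scattered decorators.
    reg.register("demo.WhitespaceTokenizer.v1", create_whitespace_tokenizer)


# Creating the registry does no registration work yet.
registry = LazyRegistry(populate_registry)
```

Nothing is registered until the first call to registry.get, so importing the module stays cheap, and every string name can be traced to its implementation by reading a single function.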

spacy/lang/ja/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -32,7 +32,6 @@
 """


-@registry.tokenizers("spacy.ja.JapaneseTokenizer")
 def create_tokenizer(split_mode: Optional[str] = None):
     def japanese_tokenizer_factory(nlp):
         return JapaneseTokenizer(nlp.vocab, split_mode=split_mode)

spacy/lang/ko/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -20,7 +20,6 @@
 """


-@registry.tokenizers("spacy.ko.KoreanTokenizer")
 def create_tokenizer():
     def korean_tokenizer_factory(nlp):
         return KoreanTokenizer(nlp.vocab)

spacy/lang/th/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -13,7 +13,6 @@
 """


-@registry.tokenizers("spacy.th.ThaiTokenizer")
 def create_thai_tokenizer():
     def thai_tokenizer_factory(nlp):
         return ThaiTokenizer(nlp.vocab)

spacy/lang/vi/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -22,7 +22,6 @@
 """


-@registry.tokenizers("spacy.vi.VietnameseTokenizer")
 def create_vietnamese_tokenizer(use_pyvi: bool = True):
     def vietnamese_tokenizer_factory(nlp):
         return VietnameseTokenizer(nlp.vocab, use_pyvi=use_pyvi)
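
The four hunks above drop the registry decorator from each tokenizer creator, leaving plain functions. Since a decorator is just a call applied to the function, the same registration can be made explicitly from another place. The sketch below shows that equivalence with a hypothetical TokenizerRegistry stand-in and a hypothetical name demo.Tokenizer. It does not show how this commit actually performs the registration.

```python
from typing import Callable, Dict, Optional


class TokenizerRegistry:
    """Hypothetical stand-in for a decorator-style registry."""

    def __init__(self) -> None:
        self.funcs: Dict[str, Callable] = {}

    def __call__(self, name: str) -> Callable[[Callable], Callable]:
        def decorator(func: Callable) -> Callable:
            self.funcs[name] = func
            return func

        return decorator


tokenizers = TokenizerRegistry()


# A plain creator function: nothing is registered when this is defined.
def create_tokenizer(split_mode: Optional[str] = None):
    def tokenizer_factory(nlp):
        # Placeholder dict standing in for a real tokenizer object.
        return {"vocab": nlp, "split_mode": split_mode}

    return tokenizer_factory


def register_tokenizers() -> None:
    # Same effect as stacking @tokenizers("demo.Tokenizer") on the def above,
    # but it only happens when this function is called.
    tokenizers("demo.Tokenizer")(create_tokenizer)
```

Calling register_tokenizers() once, for example from a central registrations module, leaves the registry in the same state the decorator would have produced.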
