Numpy v2.0.0 breaks the ability to download models using spaCy #13528

Open
afogel opened this issue Jun 16, 2024 · 7 comments
Labels
bug Bugs and behaviour differing from documentation

@afogel

afogel commented Jun 16, 2024

How to reproduce the behaviour

In my dockerfile, I run these commands:

FROM --platform=linux/amd64 python:3.12.4

RUN pip install --upgrade pip

RUN pip install torch --index-url https://download.pytorch.org/whl/cpu
RUN pip install spacy

RUN python -m spacy download en_core_web_lg

It returns the following error (and stacktrace):

2.519 Traceback (most recent call last):
2.519   File "<frozen runpy>", line 189, in _run_module_as_main
2.519   File "<frozen runpy>", line 148, in _get_module_details
2.519   File "<frozen runpy>", line 112, in _get_module_details
2.519   File "/usr/local/lib/python3.12/site-packages/spacy/__init__.py", line 6, in <module>
2.521     from .errors import setup_default_warnings
2.522   File "/usr/local/lib/python3.12/site-packages/spacy/errors.py", line 3, in <module>
2.522     from .compat import Literal
2.522   File "/usr/local/lib/python3.12/site-packages/spacy/compat.py", line 39, in <module>
2.522     from thinc.api import Optimizer  # noqa: F401
2.522     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2.522   File "/usr/local/lib/python3.12/site-packages/thinc/api.py", line 1, in <module>
2.522     from .backends import (
2.522   File "/usr/local/lib/python3.12/site-packages/thinc/backends/__init__.py", line 17, in <module>
2.522     from .cupy_ops import CupyOps
2.522   File "/usr/local/lib/python3.12/site-packages/thinc/backends/cupy_ops.py", line 16, in <module>
2.522     from .numpy_ops import NumpyOps
2.522   File "thinc/backends/numpy_ops.pyx", line 1, in init thinc.backends.numpy_ops
2.524 ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Locking to the previous version of numpy will resolve this issue:

FROM --platform=linux/amd64 python:3.12.4

RUN pip install --upgrade pip

RUN pip install torch --index-url https://download.pytorch.org/whl/cpu
RUN pip install numpy==1.26.4 spacy

RUN python -m spacy download en_core_web_lg
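
As a rough illustration of the workaround above, the sketch below checks which numpy series is installed before spaCy gets imported, so a build can report the mismatch with a readable message instead of the binary-incompatibility traceback. The helper names (`major_version`, `is_numpy_1x`, `installed_numpy_version`, `check_numpy`) are hypothetical and not part of spaCy, thinc, or numpy; only the standard-library `importlib.metadata` lookup is a real API.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version_string.split(".")[0])


def is_numpy_1x(version_string: str) -> bool:
    """True when the version belongs to the 1.x series used in the workaround."""
    return major_version(version_string) < 2


def installed_numpy_version() -> Optional[str]:
    """Look up the installed numpy version without importing numpy itself."""
    try:
        return version("numpy")
    except PackageNotFoundError:
        return None


def check_numpy() -> str:
    """Describe whether the installed numpy matches the pinned workaround."""
    found = installed_numpy_version()
    if found is None:
        return "numpy is not installed"
    if is_numpy_1x(found):
        return f"numpy {found}: 1.x series, matches the workaround"
    return f"numpy {found}: 2.x or later, consider pinning numpy==1.26.4"


if __name__ == "__main__":
    print(check_numpy())
```

Saved under a hypothetical name such as check_numpy.py, this could run as a step before `python -m spacy download en_core_web_lg`. It only reads package metadata, so it does not trigger the failing thinc import.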
@gborodin

+1

@svlandeg svlandeg added the bug Bugs and behaviour differing from documentation label Jun 17, 2024
@rustammdev

This solution helped, thank you.

@supert56

+1 I also had this problem. Thanks for posting the solution 👍

@DoctorManhattan123

DoctorManhattan123 commented Jun 18, 2024

Those solutions do work, but I would still like to see a fix in the codebase itself. The issue is that in the project's requirements.txt (just an assumption after a short look at the codebase), the version is specified like this:

numpy>=1.15.0; python_version < "3.9"
numpy>=1.19.0; python_version >= "3.9"

I am a huge fan, in all of my projects, of always pinning dependencies even up to the patch version.

I would suggest a PR that looks like this:

numpy>=1.15.0,<2.0.0; python_version < "3.9"
numpy>=1.19.0,<2.0.0; python_version >= "3.9"

This at least caps the version below the next major release, which should always be the case anyway, as a major version can (and most likely always will) contain breaking changes.
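
To make the effect of the proposed bound concrete, here is a minimal sketch of how a range like `>=1.19.0,<2.0.0` accepts 1.26.4 and rejects 2.0.0. It is a simplification that only handles plain dotted numeric versions; real installers follow the full PEP 440 rules (pre-releases, post-releases, and so on), and the function names `parse_release` and `satisfies` are made up for this illustration.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '1.26.4' into (1, 26, 4)."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, lower: str, upper: str) -> bool:
    """Check lower <= version < upper, mirroring '>=lower,<upper'."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


if __name__ == "__main__":
    # Bounds taken from the suggested requirement for python_version >= "3.9".
    for candidate in ("1.18.5", "1.26.4", "2.0.0"):
        allowed = satisfies(candidate, "1.19.0", "2.0.0")
        print(candidate, "allowed" if allowed else "rejected")
```

With the upper bound in place, a fresh install would resolve to the latest 1.x release rather than 2.0.0, which is what the manual `numpy==1.26.4` pin achieves by hand.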

@afogel
Author

afogel commented Jun 18, 2024

@DoctorManhattan123 To clarify, the solution I posted is only meant to be a stopgap.

Ideally, all downstream consumers of numpy (including library maintainers) should complete the migration to leverage numpy 2.0.0. I imagine, given the size of the release, that this will take time.

The pinned version is meant to tide over people who need a quick fix for their CI/CD, or whatever other process is broken, until a more robust solution is implemented in the affected codebases.

@bendennescma

This issue with thinc has been noted in explosion/thinc#939.

mortii added a commit to mortii/anki-morphs that referenced this issue Jun 19, 2024
there is a spaCy bug that hopefully will be fixed soon: explosion/spaCy#13528
SoulHarsh007 added a commit to SoulHarsh007/gitAPy that referenced this issue Jun 19, 2024
spacy is not compatible with numpy 2.x, see:
explosion/spaCy#13528 and thus the CI fails
locking numpy to latest 1.x release fixes this problem

Signed-off-by: SoulHarsh007 <[email protected]>
@lucas-mdsena

It helped. Thanks!
