
Numpy pinning going forward #4816

Open · h-vetinari opened this issue Aug 21, 2023 · 16 comments

Comments

@h-vetinari (Member)

Not sure if people have seen this already, but numpy 1.25 introduced a pretty big change:

Compiling against the NumPy C API is now backwards compatible by default

NumPy now defaults to exposing a backwards compatible subset of the
C-API. This makes the use of oldest-supported-numpy unnecessary.
Libraries can override the default minimal version to be compatible with
using:

#define NPY_TARGET_VERSION NPY_1_22_API_VERSION

before including NumPy or by passing the equivalent -D option to the
compiler. The NumPy 1.25 default is NPY_1_19_API_VERSION. Because the
NumPy 1.19 C API was identical to the NumPy 1.16 one, resulting programs
will be compatible with NumPy 1.16 (from a C-API perspective). This
default will be increased in future non-bugfix releases. You can still
compile against an older NumPy version and run on a newer one.

For more details please see for-downstream-package-authors.

(numpy/numpy#23528)
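As a quick illustration of that mechanism (a minimal sketch, not from the release notes; the file name and build invocation are assumptions), the define has to come before the first numpy include, and the headers then derive NPY_FEATURE_VERSION, the C-API level actually compiled against, from NPY_TARGET_VERSION:

/* sketch.c: select the 1.22 C-API before any numpy header is included;
 * numpyconfig.h resolves NPY_FEATURE_VERSION from NPY_TARGET_VERSION,
 * so we can print which C-API level this build targets. */
#include <stdio.h>

#define NPY_TARGET_VERSION NPY_1_22_API_VERSION  /* must precede numpy includes */
#include <numpy/numpyconfig.h>

int main(void) {
    printf("compiled C-API feature version: 0x%x\n",
           (unsigned int)NPY_FEATURE_VERSION);
    return 0;
}

/* assumed build invocation (adjust the include path as needed):
 *   cc sketch.c -I"$(python -c 'import numpy; print(numpy.get_include())')"
 */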

Also from those release notes: numpy is now planning the long-mythical 2.0 release to follow 1.26 (which is roughly 1.25 + meson + CPython 3.12 support), so we will have to touch this setup in the not-too-distant future anyway.

We're currently on 1.22 as per NEP 29, so AFAICT we could consider using numpy 1.25 with NPY_1_22_API_VERSION as an equivalent setup (this probably needs to go into an activation script for numpy...?).

CC @conda-forge/numpy @conda-forge/core

@xhochy (Member) commented Aug 21, 2023

Reading this, I see the drawback that we would have an activation script with numpy, and thus some (unexpected) hurdles for maintainers with numpy as a build dependency (if they want a newer numpy version). What would be the benefit of providing numpy=1.25 as the default? I don't see it.

@h-vetinari (Member, Author)

It's perhaps possible to do this without an activation script; that was just the first thing that came to mind...

What would be the benefit of providing numpy=1.25 as the default?

I don't have a strong argument (or preference) here. But whenever we get to numpy>=1.25 as a default, we'd IMO have to adapt the run-export. It would also be a bit weird to jump from (a future) >=1.24 back to >=1.19 (based on the API default of 1.25), but I guess that could be a one-time transition. It also wouldn't match NEP 29 anymore...

@rgommers

Wouldn't it be better to deal with it for 2.0? That's less than 6 months away, and at that point there is a hard necessity to deal with C API/ABI stuff.

@h-vetinari (Member, Author)

Yeah, that's part of what I wanted to discuss here: not just the backwards compat by default, but also 2.0.

It also doesn't need an immediate decision, there's no urgency AFAICT.

@hmaarrfk (Contributor)

I think this will be useful with the 2.0 release; we could pin to 2 and set the environment variables at build time, like we do for the C compilers.

@isuruf (Member) commented Aug 30, 2023

I would argue to not move away from the current setup. Even if we set NPY_TARGET_VERSION using an environment variable, there are two issues:

  1. It might not get picked up by the build system.
  2. If a project itself sets NPY_TARGET_VERSION, the metadata will not be correct.

However, if we build with NumPy 1.25 and have >=1.25, we are guaranteed that the metadata is correct, even though it could have been looser.

(This is exactly what we do with the macOS SDK and deployment target, by setting them to the same version by default. For example, if SDK = 11 and target = 10.9, the symbols introduced in 10.15 are visible, but they need to be treated as weak symbols on 10.9, which requires the developer to handle it correctly in their C/C++ code.)
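To make the analogy concrete, here is a hedged C sketch of the developer-side handling just described (clang on an Apple target is assumed; the printf calls stand in for real 10.15-only API usage): with SDK = 11 and deployment target = 10.9, newer symbols link weakly, and clang's __builtin_available guard selects the code path at runtime.

/* weak_symbol_demo.c: build with e.g. -mmacosx-version-min=10.9 against a
 * newer SDK; anything introduced after 10.9 must be guarded like this. */
#include <stdio.h>

void do_work(void) {
    if (__builtin_available(macOS 10.15, *)) {
        /* running on 10.15 or newer: the weakly-linked symbol is present */
        printf("using the 10.15+ code path\n");
    } else {
        /* running on 10.9..10.14: the weak symbol may be NULL; fall back */
        printf("using the 10.9-compatible fallback\n");
    }
}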

@ocefpaf (Member) commented Aug 30, 2023

However, if we build with NumPy 1.25 and have >=1.25, we are guaranteed that the metadata is correct, even though it could have been looser.

Also, a looser pin in that case is not necessarily better. Most users will want an updated numpy anyway, and having that in place will make it easier (faster) for the solver to provide a solution with it.

Sure, there may be a small portion of users who may need an older numpy and won't be able to install it, but I believe the advantages outweigh the disadvantages.

@h-vetinari (Member, Author)

If a project itself sets NPY_TARGET_VERSION, the metadata will not be correct.

Isn't that a general problem that we'll have to look out for in any case?

I'm not sure if that is something we could easily determine from a compiled artefact (numpy does embed the C-API level AFAIK), but it seems it would be good to check, after building, which numpy target version actually got used.

That way we could verify that things didn't get lost or overridden by the project or the build system.
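For what it's worth, a hedged sketch of what such a sanity check could look like from the runtime side (my addition; NPY_FEATURE_VERSION and PyArray_GetNDArrayCFeatureVersion() are, to the best of my knowledge, the relevant numpy C-API names as of 1.25): the compiled-in target is baked into the extension, and import_array() already compares it against the running numpy.

/* inside an extension module, after a successful import_array(): */
#include <Python.h>
#define NPY_TARGET_VERSION NPY_1_22_API_VERSION
#include <numpy/arrayobject.h>

static void log_numpy_api_levels(void) {
    /* NPY_FEATURE_VERSION: the C-API level this module was compiled for
     * (derived from NPY_TARGET_VERSION at build time);
     * PyArray_GetNDArrayCFeatureVersion(): the level the running numpy
     * provides. import_array() fails if the former exceeds the latter. */
    printf("compiled target: 0x%x, runtime provides: 0x%x\n",
           (unsigned int)NPY_FEATURE_VERSION,
           (unsigned int)PyArray_GetNDArrayCFeatureVersion());
}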

@isuruf (Member) commented Sep 10, 2023

Isn't that a general problem that we'll have to look out for in any case?

No. See my comment highlighted below:

However, if we build with NumPy 1.25 and have >=1.25, we are guaranteed that the metadata is correct, even though it could have been looser.

@h-vetinari (Member, Author)

I spoke with @rgommers recently, and he mentioned one thing about this that wasn't clear to me before:

Packages compiled with numpy 2.0 will continue to be compatible with the 1.x ABI.

In other words, if this works out as planned, we could support numpy 2.0 right away without having to do a full CI-bifurcation of all numpy-dependent packages. It would mean using 2.0 as a baseline earlier than we'd do it through NEP29, but given the now built-in backwards compatibility, we could set the pinning to 2.0, and manually set the numpy run-export to do something like numpy >=1.19 (which apparently won't be changed until numpy drops python 3.9 support).

No. See my comment highlighted below:

I wasn't talking about the tightness/looseness of the constraints, but about projects setting NPY_TARGET_VERSION in their build scripts somewhere, which has the potential to conflict (in terms of expectations, not constraints) with whatever we do.

@h-vetinari (Member, Author)

Now that we've started migrating for CPython 3.13 (which requires numpy 2.1), the migrator has a major-only pin (i.e. numpy: 2), while the numpy 2 migrator pins to 2.0:

numpy:
- 2.0
- 2.0
- 2.0
- 2.0

Do we want a major-only pin, or do we still want to decide explicitly when we update the baseline numpy version (in a post-numpy-2.0 world)? For example, using a major-only pin means we'll start pulling in 2.1 as soon as it's available, and this creates a tighter run-export (>=1.21) than 2.0 does (>=1.19), which is something we may want to do consciously rather than accidentally. OTOH, both of those are substantially looser than what NEP 29 suggests (cf. conda-forge/numpy-feedstock#324).

I think both approaches are workable; we should just decide on one or the other.

@jakirkham (Member)

Would suggest the Python 3.13 migrator be updated to use 2.1 instead of 2. Reasons being:

  • NumPy 2.1.0 is the first version to support Python 3.13
  • Already we are pinning to NumPy 2.0 for older Pythons
  • As you point out, we want to be intentional about setting lower bounds

Everything else can stay the same

Think if we want to change this more dramatically, we should probably wait for these migrators to complete and reassess. It is always easier to relax things later (as opposed to tightening). Also, trying to work in more changes with multiple in-flight migrators is hairy.

Though open to discussion if others have different opinions.

@h-vetinari (Member, Author)

Would suggest the Python 3.13 migrator be updated to use 2.1 instead of 2.

I support this for the reasons you stated, though at least there's no immediate urgency on this. As there are no numpy 2.0 builds for 3.13, the two are equivalent in this particular case (they wouldn't be for py<313 though, hence my question).

@rgommers

I'd probably choose the major-only flavor, because that's the actual requirement. But it doesn't really matter either way, since builds are going to be using 2.1 anyway now that that is available.

@h-vetinari (Member, Author)

But it doesn't really matter either way, since builds are going to be using 2.1 anyway now that that is available.

That's not the case; if we pin 2.0, then that's what gets installed in host while building (but 2.1 at runtime, of course).

@rgommers

That's not the case; if we pin 2.0, then that's what gets installed in host while building (but 2.1 at runtime, of course).

Major-only meant 2, not 2.0. What I meant was that both 2 and 2.1 will yield 2.1 at build time (until 2.2 comes out of course).

I think 2 is the more correct choice, as it will avoid having to manually bump the minor version all the time.
