-
Notifications
You must be signed in to change notification settings - Fork 937
Updated ABI generation code and new libraries #13280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
hppritcha
wants to merge
15
commits into
open-mpi:main
Choose a base branch
from
hppritcha:abi-generate-ver2
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+81,737
−1,668
Draft
Changes from all commits
Commits
Show all changes
15 commits
Select commit
Hold shift + click to select a range
c14c64c
ABI: Add initial ABI generation code and new libraries
jtronge 9466f42
ABI: Move ABI support into big count binding code
hppritcha d07b17b
ABI: Switch to using MPI Standard ABI values
Joe-Downs fd3847c
ABI: multiple fixes
hppritcha 24e85d1
remove unused function MPI_Abi_supported
hppritcha 71da9a4
REMOVE ME: patch the coll libnbc to free data arrays
hppritcha cb2063b
minor docs updates
hppritcha c976a5c
set mpi version compliance to 5.0
hppritcha 2c76d18
minor comment fixes
hppritcha c6dd262
grammar fix
hppritcha 434dcd2
minor updates to ABI README
hppritcha 7e4a2a6
white space changes
hppritcha 4e9d689
undo changes that are not needed
hppritcha 505b211
remove confusing comment
hppritcha 4110e99
more minor cleanup
hppritcha File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Submodule pympistandard
updated
4 files
| +7 −0 | src/pympistandard/_kinds.py | |
| +7,265 −3,104 | src/pympistandard/data/apis.json | |
| +22 −7 | src/pympistandard/isoc.py | |
| +6 −0 | tests/test_iso_c.py |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I built MPICH 5.0b1 and see that it installs libmpi_abi.so.0[.0.0].
Per our conventions, will we be shifting this CRA value to something other than 0:0:0 on release branches (e.g., 0:1:0, so that C-A is still 0, and we still produce libmpi_abi.so.0 to match MPICH)? I ask this only because Libtool recommends not shipping 0:0:0.
Thinking a little further down this rabbit hole: does having both ABI-enabled Open MPI and MPICH allow installing Open MPI and MPICH into the same
$prefix(with all default sub directories, like$includedirbeing$prefix/include)? I'm thinking "no" for at least the following reasons:mpi.hwill have all the ABI things being identical, but we'll have other differences from MPICH'smpi.h(right?).$prefix/include/mpi.h../configure --includedir=...could workaround that.libmpi_abi.so.*-- you couldn't tell if it was from Open MPI or MPICH. From a user perspective, that might be ok, but from a package manager and/or system administrator point of view -- that might get a little weird. For example, what if we both shiplibmpi_abi.so.0.a.bwith the sameaandbvalues?libmpi_abi.so.0.0.0.aandbvalues are different than MPICH's somehow? (I really haven't thought this through to know if this is even possible in a sustainable way over time -- nor what the consequences are outside of Linux)I guess I'm wondering if it's useful to build Open MPI and MPICH with something like:
This would keep a single libdir so that we don't introduce (more) LD_LIBRARY_PATH complexity, but still allow unique
mpi.h.But then again, there's still problems with
mpirunandmpiexecfilename clashes in$bindir(not to mention CLI flag differences). Maybe something like Linux-style alternatives could be useful here...? Shrug.I understand how ABI between MPI libraries solves some perceived problems for users, but trying to go the next steps to actually hide the differences between MPI implementations gets pretty tricky pretty quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of having a unique ABI is to be able to switch between different backends, which effectively requires same sonames (and filenames), so that's pretty much by design. You'd either have a single MPI library at a time in a single-prefix scenario, or as many as you want in a multiple-prefix scenario (think of Spack).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another use case I could see is the usual "modules" setup on an HPC cluster. the user could just switch between the mpich module and the openmpi module without needing to recompile/relink. that's basically how one would use spack modules system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot just blindly switch from one module to another UNLESS the ABI's are an exact match - in other words, you have to ensure that the version of the ABI you are switching to is the same as the one you currently are using. An ABI is a rigid specification - there is no such thing as a minor revision to it. Any change - be it an addition, subtraction, or (heaven forbid) a modification - results in a new ABI, and the version number (in libtool parlance, the .so number) must change.
That's the entire point of the libtool .so number - to guide the linker to picking the library that matches the signature required by the executable. In this case, that's the ABI.
People who have been building ABIs learned this the hard way. As @jsquyres pointed out, there are a ton of other problems - but setting the .so number to the ABI version is a basic necessity. Having an unchanging .so number even when the ABI changes is a disaster. The linker will basically be playing russian roulette, and users will rapidly find it...let's politely say, less than useful.
In this case, you want the ABI library to have a .so that matches the ABI it supports. You benefit from having a second library - the actual backend implementation - that can change as it is modified. Key is to tie the ABI library to the matching backend, and then change that connection as you update backends. In other words, you update the ABI-backend combination when the backend gets updated.
So the "module" is an ABI-backend combination, and the user picks the ABI they want supported along with the underlying implementation that supports that ABI. In other words, "give me MPI v2 ABI and the OMPI v6.2.1 backend". If you don't care about ABI, then just pick the implementation library. If you don't care about backend, then select the ABI module and let it pick the default backend.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsquyres IMHO, the way this should be handled is the following: The CURRENT and AGE number are implicitly defined from the MPI_ABI_VERSION/MPI_ABI_SUBVERSION numbers, that is, by the set of backward-compatible additions or the backward-incompatible changes. I am assuming that
MPI_ABI_VERSIONwill stay at1as long as there are no backward-incompatible changes, andMPI_ABI_SUBVERSIONwill bump on backward-compatible additions/updates.The REVISION number could be left for use by the MPI implementation, this way multiple revisions can be installed in the same prefix location, with ldconfig going through its usual cache update rules.
I tried to layout the rules here mpi-forum/mpi-abi-stubs#28. The "formulas" there would produce a soname
libmpi_abi.so.1, but if we want a.0suffix, that's trivial to fix by subtracting1from the formula for current.Can you point me to such recommendation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear: I know what the use cases are for ABI (e.g., env modules-style or spack-style environments). My question about whether two installs could share a common
$prefixwas probably more of a hypothetical musing more than anything else. I even answered my own question -- the answer is probably "no", for the reasons already discussed.I admit that I don't quite understand the purpose of
MPI_ABI_VERSIONandMPI_ABI_SUBVERSION. As you pointed out earlier in this thread, there's essentially 2 common styles of maintaining binary compatibility these days:How exactly do a pair of compile-time constants fit into either of those schemes? It's not described in MPI-5.0, nor is an alternate (i.e., 3rd) scheme described. MPI-5.0:20.2 loosely implies that
MPI_ABI_SUBVERSIONcan be used as a proxy forMPI_VERSIONandMPI_SUBVERSION(i.e., be used for conditional compilation of various MPI symbols / types / functions / etc.). But that seems odd -- why have new constants for a mechanism that has worked for decades?Sure,
MPI_ABI_VERSIONcould be a proxy for a Linux SONAME. But then what's the point ofMPI_ABI_SUBVERSIONat compile time (or even run time, viaMPI_ABI_GET_VERSION())?If we intend
MPI_ABI_VERSIONto be a proxy for Linux SONAME, that seems fine. Is there a scheme for howMPI_ABI_SUBVERSIONshould factor in here? The way thatMPI_ABI_SUBVERSIONis (loosely) defined in MPI-5.0 does not seem like a hypothetical scheme such as -- for example --(MPI_ABI_VERSION * 100 + MPI_ABI_SUBVERSION)would be a good candidate as a proxy for the Linux SONAME. So what do implementations and/or users useMPI_ABI_SUBVERSIONfor?I'm digressing from the main question here, and I don't mean to open a whole debate about these 2 compile-time constants here in OMPI -- such issues can be discussed at the Forum level.
For an ABI to satisfy the use cases described above (e.g., swapping out the back end), the questions of how to create linker-compatible shared library versions should really be resolved into some kind of scheme that both Open MPI and MPICH -- and our various derivative implementations -- follow. This doesn't necessarily have to be in the MPI spec itself; it's probably better as an agreement between the Open MPI and MPICH communities. My point: we need to have the discussion and then publicly document the scheme so that anyone can follow it (e.g., even outside of Open MPI and MPICH).
Doh! I just re-read the LT docs and I cannot find such a recommendation. So I guess I'm wrong here. Perhaps I'm remembering some super-old conversation about how we (OMPI?) didn't want to release with 0:0:0 because that's what's on
mainand we don't release offmain(similar to the project version number) -- i.e., more of a release philosophy kind of thing than a strict technical requirement.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My assumption at that time is now part of the standard, MPI 5.0 says (sec 20.2 pp 844):
Backwards-compatible changes, such as the addition of new handle types, will incre-
ment the minor version. Backwards-incompatible changes will increment the major version.
The addition of new functions to the MPI API does not change the ABI version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I saw that. But can you provide an example of how it would be useful / used?
I.e., how exactly is it different than
MPI_VERSIONanMPI_SUBVERSION?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long ago, somewhere, I claimed that MPI_ABI_VERSION/SUBVERSION was not that useful. As usual, I was ignored ;'-), and now we have these version numbers in the standard. Anyway, now I believe it is still good to have the version defined (though not necessary the C macros). The MPI ABI version/subversion values are to be updated following similar rules as libtool, therefore we can use them to define the C/R/A tuple the following way:
and then under these rules we get a SONAME
libmpi_abi.so.0and then all the planets are aligned.Could you please give a bit of though to this claim of mine? Think again about the rules for updating the MPI ABI version/subversion, the libtool c/r/a update rules, my formulas above, my claim about the SONAME, and confirm whether am I right?
The other obvious uses if conditional compilation with the macros. I use the presence of MPI_ABI_VERSION in mpi4py to conditionally-compile if building against the MPI standard ABI. Regarding the use of the values of MPI_ABI_VERSION/SUBVERSION, there is definitely some overlap with MPI_VERSION/SUBVERSION.
MPI_VERSION/SUBVERSION follow the version of the MPI standard, this is unrelated to ABI or even API. The MPI standard version is not only about API but also about runtime behavior changes. MPI_VERSION/SUBVERSION evolve in ways that are totally unrelated to whether the API/ABI changes are backward compatible or not, while MPI_ABI_VERSION will stay at 1 for as long as the MPI Forum does not introduce backward incompatible changes.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting CRA proposal. I don't think it's quite right, though -- I think you have to give the MPI_ABI_[SUB]VERSION their own distinct digits (somewhat akin to bit mapping). E.g.:
This gives you unique values. Otherwise, you could end up with
Put differently: the original scheme only works if MPI_ABI_VERSION never increases (which would be morally equivalent to hard-coding C=MPI_ABI_SUBVERSION-1, A=MPI_ABI_SUBVERSION). Otherwise, we can get repeat C and A values for different values of MPI_ABI_[SUB]VERSION.
I guess what I'm asking for: can you give an example of something you'd need to #if on that is based on ABI and not API. This might be a failure of imagination on my part to come up with a useful example here...
And FWIW, I tend to prefer always defining preprocessor macros (as opposed to undefining them vs. defining them). If you always define them, you protect against typos:
whereas this will result in a compilation error: