Add C++ modules support #2291
|
Great work. I also thought about this. The approach is very similar to what was done for https://github.com/nlohmann/json in nlohmann/json#4799. Then, I noticed ... Is this temporary approach with a file with This approach is not using modules natively but rather as an interface to the original way. Does this method work without disadvantages? |
|
To my knowledge, this is the best way to support modules on top of a header-only or header/source library: it keeps support for older language versions while offering the newer feature as an option. I'm not aware of any disadvantages besides there being an additional translation unit to compile, but if I am wrong please correct me. The only glaring difference in API is that detail symbols are hidden because they are not exported, but in my opinion it's probably better not to expose detail symbols and flood IDE suggestions with implementation details. |
|
What about compiled libraries? Is it possible to have the traditional method and modules installed in parallel? I am thinking of repositories that ship compiled This is relevant here, https://aur.archlinux.org/packages/cpp-httplib-compiled . |
|
Yes, I believe it's possible to use shared/static libraries with modules. All of my modular projects compile to shared libraries that an executable consumes. |
|
@mikomikotaishi, thanks for the fine pull request! It's fantastic, but my concern is that someone needs to update @sum01 @jimmy-park @Tachi107 do you have any thoughts about this pull request? |
|
I could create a Python script, or some other kinds of automated means of updating, which you could run every time it is updated. Until then I would be OK with maintaining this file, as it is a simple process. Such a script would probably comb through the file and add any symbols not part of a detail or internal namespace, or prefixed with an underscore, etc. However, I am curious why it is not feasible to update the file manually. In case it isn't clear how, one can update the file by adding a |
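As a rough illustration of the kind of script described here, the sketch below scans a header line by line and collects type names declared directly inside the library namespace, skipping anything nested in `detail` or prefixed with an underscore. It is not the script from this pull request: `collect_public_symbols`, the regular expressions, and the sample header are all hypothetical, and a regex scan like this ignores comments, string literals, templates, free functions, and macros.

```python
import re
from typing import List, Optional

# Hypothetical patterns: a namespace opening, and a class/struct/enum definition
# whose opening brace is on the same line (forward declarations are ignored).
NAMESPACE_RE = re.compile(r"^\s*namespace\s+(\w+)\s*\{")
DECL_RE = re.compile(r"^\s*(?:class|struct|enum\s+class|enum)\s+([A-Za-z_]\w*)\b[^;]*\{")


def collect_public_symbols(header_text: str, root: str = "httplib") -> List[str]:
    """Return type names defined directly inside `root`, in order of appearance."""
    stack: List[Optional[str]] = []  # one entry per open brace: namespace name or None
    symbols: List[str] = []
    for line in header_text.splitlines():
        ns = NAMESPACE_RE.match(line)
        decl = DECL_RE.match(line)
        # Only accept declarations whose enclosing scope is exactly the root namespace.
        if decl and stack == [root]:
            name = decl.group(1)
            if not name.startswith("_") and name not in symbols:
                symbols.append(name)
        # Update the brace stack; the first "{" on a namespace line carries its name.
        pending = ns.group(1) if ns else None
        for ch in line:
            if ch == "{":
                stack.append(pending)
                pending = None
            elif ch == "}" and stack:
                stack.pop()
    return symbols


SAMPLE_HEADER = """\
namespace httplib {
namespace detail {
struct BufferStream {
  int x;
};
} // namespace detail
struct Request {
  int a;
};
class Client {
public:
  void get();
};
enum class Error {
  Success = 0,
};
struct _Hidden {};
class Forward;
} // namespace httplib
"""

if __name__ == "__main__":
    print(collect_public_symbols(SAMPLE_HEADER))
```

A real header has many constructs this scan would misread, which is the concern raised later in the thread about parsing without a proper C++ parser.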
|
I have also seen some repositories use bots to push some commits too. Potentially one such bot could be set up to automatically populate the module with new changes each time there is a mismatch. I don't know anything about how to set this up, but I have seen this before and it could potentially be a solution (but I think the simplest one is just to run a Python script each time any update to the library happens). |
|
Anyway, I think this could be one such way of automatically updating the module. |
|
@mikomikotaishi thanks for the additional explanation. I am OK with following your suggestion.
We could automatically generate |
|
OK, that makes sense to me. (I don't know anything about how to run GitHub Actions or write scripts for it however, so I'm afraid the most I can do is create a script for this.) |
|
I'm not sure why there were failing workflows as I didn't change anything in the core library |
|
Never mind, it seems the failing CI is happening upstream too. |
Thanks for adding this GitHub Action. But after further consideration, I now think this process should be done in CMakeLists.txt. You could generate modules/httplib.cppm under this line
Line 216 in 87c2b4e
| if(HTTPLIB_COMPILE) |
We actually do something similar to generate httplib.h and httplib.cc with split.py and put them in a distribution package. By doing this, we no longer need to keep modules/httplib.cppm in this repository. It will be created on the fly only when necessary.
@sum01 @jimmy-park Is my comment above correct?
I think it would be difficult to parse such a large file without syntax analysis or a parsing library. I don't have a lot of experience with this sort of thing, and because the header contains things like method definitions outside of the class, it seems to be a very complex task. Including a C++ parser would also add a dependency just for a source generation script, which would be additional bloat.
I guess this could be done by attaching to the header file that split.py generates, but I haven't actually seen what that looks like.
As for dynamically generating the module, this was raised before on the Dear ImGui library which currently does something like this. I think this is a bad choice for module API (having a static file is obviously best for consumers to be able to just read it from a distance without additional interaction, i.e. a "what-you-see-is-what-you-get" sort of API), but ultimately as you are in charge I'll look into this if you prefer it.
I'm looking more into this and I think if we want to allow CMake to compile the module if it's generated into an "out" directory, we have to force the out directory to be named "out" so that it can be written into the CMakeLists.txt script
Anyway @yhirose do you have any opinion on the output directory having a hard-coded name?
@sum01 @jimmy-park @Tachi107 could you please answer this question?
|
If the script can be run during the build process (and if the output consists of machine-generated files, it should *only* run at build time), then the destination directory should be configurable (maybe defaulting to the current working directory). Even better, if the output is a single file, the script should allow the user to specify the output file itself (full path).
This is because downstreams (like Debian, which is what I maintain the meson build scripts for) may have some requirements on where build products should be stored.
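One way to meet this requirement, sketched under assumptions, is to let the generator take the full output path as a command-line argument, defaulting to a file in the current working directory. The function names, the `-o/--output` flag, and the default file name `httplib.cppm` below are hypothetical and are not taken from the script in this pull request.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse a hypothetical command line: <header> [-o OUTPUT]."""
    parser = argparse.ArgumentParser(
        description="Generate a C++ module interface file (illustrative CLI only)."
    )
    parser.add_argument("header", type=Path, help="path to the input header")
    parser.add_argument(
        "-o",
        "--output",
        type=Path,
        # Assumed default: a file in the current working directory.
        default=Path.cwd() / "httplib.cppm",
        help="full path of the file to write",
    )
    return parser.parse_args(argv)


def write_output(output: Path, text: str) -> None:
    """Write the generated text, creating the destination directory if needed."""
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(text, encoding="utf-8")


if __name__ == "__main__":
    args = parse_args()
    write_output(args.output, "// generated from " + args.header.name + "\n")
```

With an interface like this, the build system decides where the file lands and passes that path in, so the script never needs a hard-coded output directory.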
|
|
@Tachi107 CMake needs to know what the output directory is ahead of time to compile the module. How do you propose to solve this? |
|
@yhirose I think there is one possible solution to allow both the directory to be user-specified while still supporting CMake module building, which is probably just to have the Python script generate the CMake file too. I don't know if this is too convoluted or awkward of a design though, so please do tell me your thoughts. |
|
Hi Miko,
On Tue Dec 9, 2025 at 10:46 PM CET, Miko wrote:
@Tachi107 CMake needs to know what the output directory is ahead of time to compile the module. How do you propose to solve this?
You should use CMake's add_custom_command() function to invoke the
script and pass it the output file path
https://cmake.org/cmake/help/latest/command/add_custom_command.html
Meson has a similar function, but I can do that myself after this gets
merged.
|
Tachi107
left a comment
Some feedback now that I have been able to look at the whole code :)
target_sources(httplib_module
    PUBLIC
        FILE_SET CXX_MODULES FILES
            httplib.cppm
)
This requires CMake 3.28, but this project only requires CMake 3.0.
C++ modules require a later version of CMake, so enabling modules should require that the user has the correct version.
option(HTTPLIB_USE_NON_BLOCKING_GETADDRINFO "Enables the non-blocking alternatives for getaddrinfo." ON)
option(HTTPLIB_REQUIRE_ZSTD "Requires ZSTD to be found & linked, or fails build." OFF)
option(HTTPLIB_USE_ZSTD_IF_AVAILABLE "Uses ZSTD (if available) to enable zstd support." ON)
option(HTTPLIB_BUILD_MODULES "Build httplib modules (requires HTTPLIB_COMPILE to be ON)." OFF)
Why would this need to be an option? Does it add any other dependency? Can't it always be built, without an option?
I want to leave it as opt-in because C++ module support is compiler dependent. Most newer compiler versions offer it, but older ones do not. The build could break if we force it on older toolchains.
Which Python version does this script require? Is there a reason why, for example, Optional[str] is used instead of the built-in str | None introduced by PEP 604?
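For context on the version question: `Optional[T]` comes from the `typing` module, while the `T | None` spelling from PEP 604 only evaluates at runtime on Python 3.10 and later. The small check below is an illustrative sketch (the function name `same_annotation` is made up) showing that the two spellings describe the same type where both are available.

```python
import sys
from typing import Optional, Union


def same_annotation() -> bool:
    """Compare Optional[str] with the equivalent union spelling."""
    if sys.version_info >= (3, 10):
        # PEP 604 syntax: only evaluated on interpreters that support it.
        return Optional[str] == (str | None)
    # Older interpreters: Optional[str] is shorthand for Union[str, None].
    return Optional[str] == Union[str, None]


if __name__ == "__main__":
    print(same_annotation())
```

So the choice between the two is mostly about which minimum Python version the script wants to support.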
I am just used to writing Optional[T], if T | None is better I can replace it
Anyway I believe this requires a minimum of Python 3.5 now
def get_git_diff(file_path: str, base_ref: str = "HEAD") -> Optional[str]:
    """
    Get the git diff for a specific file.
    @param file_path Path to the file to diff
    @param base_ref Git reference to compare against (default: HEAD)
    @return The git diff output, or None if error
    """
    try:
        result: CompletedProcess = subprocess.run(
            ["git", "diff", base_ref, "--", file_path],
            capture_output=True,
            text=True,
            check=True
        )
        return result.stdout
    except CalledProcessError as e:
        print(f"Error getting git diff: {e}", file=sys.stderr)
        return None
Requiring a Git repository to build the modules is something many users would prefer to avoid. For example, users building from the release tarballs produced by GitHub will not have any Git history, so this would not work.
A way to resolve this would be to always generate the modules file from scratch, so that no diffing operations are required.
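A from-scratch generator could be as simple as rendering the whole file from a list of symbol names on every run, with no Git involved. The sketch below assumes the wrapper-module layout discussed in this thread (include the header in the global module fragment, then re-export names with using-declarations); the function `render_module` and its parameters are hypothetical, and the actual layout of the file in this pull request is not shown here.

```python
from typing import Iterable


def render_module(
    symbols: Iterable[str],
    header: str = "httplib.h",
    module: str = "httplib",
    namespace: str = "httplib",
) -> str:
    """Render a complete module interface file from a list of symbol names."""
    lines = [
        "module;",  # global module fragment: the header is included here
        "",
        f'#include "{header}"',
        "",
        f"export module {module};",
        "",
        f"export namespace {namespace} {{",
    ]
    # Sorted, de-duplicated output keeps the generated file stable across runs.
    lines += [f"    using {namespace}::{name};" for name in sorted(set(symbols))]
    lines += ["}", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_module(["Server", "Client", "Request"]))
```

Because the output depends only on the input list, the same command works in a Git checkout and in an unpacked release tarball.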
This was the suggestion @yhirose gave, and while I personally would prefer to have a static module for the purpose of having an obvious or clear module API (rather than behind a script) this is the approach I'm taking because the author does not maintain the module file or build system scripts.
That being said, this Python script is only supposed to be run by a GitHub Action script, so it does rely on the assumption of Git setup.
@mikomikotaishi you probably misunderstood what I mentioned in the following comment. I said exactly the same thing as @Tachi107 mentioned above: "Could you please simply regenerate httplib.cppm from scratch with the latest httplib.h? It will make this script much simpler".
#2291 (comment)
Yes, that is what I was saying, we would be generating it on demand.
There are some problems I am having with implementing this, for example having to insert the #ifdefs to add to the generated module all of the SSL classes. Do you have any suggestions for this?
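One possible way to handle this, offered only as a sketch, is to group the exported names by the preprocessor macro that guards them and emit a matching `#ifdef`/`#endif` pair around each guarded group. The function `render_exports` is hypothetical, and the macro `CPPHTTPLIB_OPENSSL_SUPPORT` and the class names in the example are assumptions about the header rather than details taken from this thread. How the groups are discovered (hand-maintained table or header scan) is left open.

```python
from typing import Dict, List, Optional


def render_exports(
    groups: Dict[Optional[str], List[str]], namespace: str = "httplib"
) -> str:
    """Render an export block; a key of None means the names are unguarded."""
    lines = [f"export namespace {namespace} {{"]
    for guard, names in groups.items():
        if guard is not None:
            lines.append(f"#ifdef {guard}")
        lines += [f"    using {namespace}::{name};" for name in names]
        if guard is not None:
            lines.append("#endif")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed grouping: the SSL classes are only declared when the macro is defined.
    example = {
        None: ["Client", "Server"],
        "CPPHTTPLIB_OPENSSL_SUPPORT": ["SSLClient", "SSLServer"],
    }
    print(render_exports(example))
```

The guards in the generated file then mirror the guards in the header, so a using-declaration never names a class that was compiled out.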
Sorry for the confusion... What I meant by 'from scratch' was to create httplib.cppm entirely, rather than just updating a portion of it. So we would like you not to use get_git_diff.
Yes, I meant that I would be regenerating it entirely. My only problem is how to generate the symbols with the aforementioned.
This pull request adds support for C++20 modules through CMake. It is enabled by the HTTPLIB_BUILD_MODULES option (which requires HTTPLIB_COMPILE to be enabled, though it probably doesn't have to - I only forced this requirement because it seems to make the most sense to force the library to compile if modules are to be compiled).