Skip to content

Conversation

@mikomikotaishi
Copy link
Contributor

This pull request adds support for C++20 modules through CMake. It is enabled by the HTTPLIB_BUILD_MODULES option (which requires HTTPLIB_COMPILE to be enabled, though it probably doesn't have to - I only forced this requirement because it seems to make the most sense to force the library to compile if modules are to be compiled).

@Spixmaster
Copy link

Spixmaster commented Dec 8, 2025

Great work. I also thought about this.

The approach is very similiar to what was done to https://github.com/nlohmann/json with nlohmann/json#4799. Then, I noticed ...

Is this temporary approach with a file with using ... for the meantime while modules and no modules are supported?

This approach is not using modules natively but rather as an interface to the original way.

Does this method work without disadvantages?

@mikomikotaishi
Copy link
Contributor Author

mikomikotaishi commented Dec 8, 2025

This is the best way to my knowledge to support modules on top of a header-only or header/source library, allowing continued support for older versions while providing newer features as an option.

I'm not aware of any disadvantages to it besides a being additional translation unit to compile, but if I am wrong please correct me. The only glaring difference in API is that detail symbols are hidden as they are not exported, but in my opinion that's probably better not to expose detail symbols and flood IDE suggestions with implementation details.

@Spixmaster
Copy link

Spixmaster commented Dec 8, 2025

What about compiled libraries? Is it possible to have the traditional method and modules installed in parallel?

I am thinking of repositories that ship compiled .so or .a.

This is relevant here, https://aur.archlinux.org/packages/cpp-httplib-compiled .

@mikomikotaishi
Copy link
Contributor Author

mikomikotaishi commented Dec 8, 2025

Yes I believe it's possible to use shared/static libraries with modules, all of my modular projects compiled to shared libraries that an executable consumes

@yhirose
Copy link
Owner

yhirose commented Dec 8, 2025

@mikomikotaishi, thanks for the fine pull request! It's fantastic, but my concern is that someone needs to update modules/httplib.cppm whenever new external symbols are added or existing symbols are removed/renamed, and it could happen quite often. I am not planning to maintain this file. So if you cannot keep maintaining the file, I am probably not merge it since the file can be easily out-of-date. Is there a way to generate modules/httplib.cppm from httplib.h automatically?

@sum01 @jimmy-park @Tachi107 do you have any thought about this pull request?

@mikomikotaishi
Copy link
Contributor Author

mikomikotaishi commented Dec 8, 2025

I could create a Python script, or some other kinds of automated means of updating, which you could run every time it is updated. Until then I would be OK with maintaining this file, as it is a simple process. Such a script would probably comb through the file and add any symbols not part of a detail or internal namespace, or prefixed with an underscore, etc.

However, I am curious why it is not feasible to update the file manually. In case it isn't clear how, one can update the file by adding a using httplib::NEW_SYMBOL_HERE; into the export namespace httlib { } block, for each new symbol that is added. Do let me know if any more information is needed.

@mikomikotaishi
Copy link
Contributor Author

I have also seen some repositories use bots to push some commits too. Potentially one such bot could be set up to automatically populate the module with new changes each time there is a mismatch. I don't know anything about how to set this up, but I have seen this before and it could potentially be a solution (but I think the simplest one is just to run a Python script each time any update to the library happens).

@mikomikotaishi
Copy link
Contributor Author

Anyway, I think this could be one such way of automatically updating the module.

@yhirose
Copy link
Owner

yhirose commented Dec 9, 2025

@mikomikotaishi thanks for the additional explanation. I am ok with the following your suggestion.

I have also seen some repositories use bots to push some commits too. Potentially one such bot could be set up to automatically populate the module with new changes each time there is a mismatch. I don't know anything about how to set this up, but I have seen this before and it could potentially be a solution.

We could automatically generate modules/httplib.cppm in a GitHub Actions script when httplib.h is pushed. (I can't accept the way to run a script to update the file manually, since I don't maintain any build systems in this repository. Please see #955 (comment).)

@mikomikotaishi
Copy link
Contributor Author

OK, that makes sense to me. (I don't know anything about how to run GitHub Actions or write scripts for it however, so I'm afraid the most I can do is create a script for this.)

@mikomikotaishi
Copy link
Contributor Author

I'm not sure why there were failing workflows as I didn't change anything in the core library

@mikomikotaishi
Copy link
Contributor Author

Never mind, it seems the failing CI is happening upstream too.

Copy link
Owner

@yhirose yhirose Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this GitHub Action. But after further consideration, I now think this process should be done in CMakeLists.txt. You could generate modules/httplib.cppm under this line

if(HTTPLIB_COMPILE)

We actually do the similar to generate httplib.h and httplib.cc with split.py and put them in a distribution package. By doing this, we no longer need to keep modules/httplib.cppm in this repository. It will be created on the fly only when necessary.

@sum01 @jimmy-park Is my comment above correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be difficult to parse an especially large file without syntax analysis or parsing library. I don't have a lot of experience with this sort of thing, and because the header contains things like method definitions outside of the class it seems to be a very complex task. Then including a C++ parser would require a dependency even for a source generation script which would be additional bloat.

I guess this could be done by attaching to the header file that split.py generates, but I haven't actually seen what that looks like.

As for dynamically generating the module, this was raised before on the Dear ImGui library which currently does something like this. I think this is a bad choice for module API (having a static file is obviously best for consumers to be able to just read it from a distance without additional interaction, i.e. a "what-you-see-is-what-you-get" sort of API), but ultimately as you are in charge I'll look into this if you prefer it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking more into this and I think if we want to allow CMake to compile the module if it's generated into an "out" directory, we have to force the out directory to be named "out" so that it can be written into the CMakeLists.txt script

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway @yhirose do you have any opinion on the output directory having a hard-coded name?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sum01 @jimmy-park @Tachi107 could you please answer this question?

@Tachi107
Copy link
Contributor

Tachi107 commented Dec 9, 2025 via email

@mikomikotaishi
Copy link
Contributor Author

mikomikotaishi commented Dec 9, 2025

@Tachi107 CMake needs to know what the output directory is ahead of time to compile the module. How do you propose to solve this?

@mikomikotaishi
Copy link
Contributor Author

@yhirose I think there is one possible solution to allow both the directory to be user-specified while still supporting CMake module building, which is probably just to have the Python script generate the CMake file too. I don't know if this is too convoluted or awkward of a design though, so please do tell me your thoughts.

@Tachi107
Copy link
Contributor

Tachi107 commented Dec 11, 2025 via email

Copy link
Contributor

@Tachi107 Tachi107 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some feedback now that I have been able to look at the whole code :)

Comment on lines +3 to +7
target_sources(httplib_module
PUBLIC
FILE_SET CXX_MODULES FILES
httplib.cppm
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires CMake 3.28, but this project only requires CMake 3.0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++ modules require a later version of CMake, so activating modules should require the user has the correct version.

option(HTTPLIB_USE_NON_BLOCKING_GETADDRINFO "Enables the non-blocking alternatives for getaddrinfo." ON)
option(HTTPLIB_REQUIRE_ZSTD "Requires ZSTD to be found & linked, or fails build." OFF)
option(HTTPLIB_USE_ZSTD_IF_AVAILABLE "Uses ZSTD (if available) to enable zstd support." ON)
option(HTTPLIB_BUILD_MODULES "Build httplib modules (requires HTTPLIB_COMPILE to be ON)." OFF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would this need to be an option? Does it add any other dependency? Cannot it be always be built, without an option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to leave it as opt-in because C++ module support is compiler dependent. Most newer compiler versions offer it, but older ones do not. The build could break if we force it on older toolchains.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which Python version does this script require? Is there a reason why, for example, Optional[str] is used instead of the built-in str | None introduced by PEP 604?

Copy link
Contributor Author

@mikomikotaishi mikomikotaishi Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just used to writing Optional[T], if T | None is better I can replace it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway I believe this requires a minimum of Python 3.5 now

Comment on lines +62 to +80
def get_git_diff(file_path: str, base_ref: str = "HEAD") -> Optional[str]:
"""
Get the git diff for a specific file.
@param file_path Path to the file to diff
@param base_ref Git reference to compare against (default: HEAD)
@return The git diff output, or None if error
"""
try:
result: CompletedProcess = subprocess.run(
["git", "diff", base_ref, "--", file_path],
capture_output=True,
text=True,
check=True
)
return result.stdout
except CalledProcessError as e:
print(f"Error getting git diff: {e}", file=sys.stderr)
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requiring a Git repository to built the modules is something many users would prefer to avoid. For example, users building from the release tarballs produced by GitHub will not have any Git history, and this would not work.

A way to resolve this would be to always generate the modules file from scratch, so that no diffing operations are required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the suggestion @yhirose gave, and while I personally would prefer to have a static module for the purpose of having an obvious or clear module API (rather than behind a script) this is the approach I'm taking because the author does not maintain the module file or build system scripts.

That being said, this Python script is only supposed to be run by a GitHub Action script, so it does rely on the assumption of Git setup.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikomikotaishi you probably misunderstood what I mentioned in the following comment. I said exactly the same thing as @Tachi107 mentioned above: "Could you please simply regenerate httplib.cppm from scratch with the latest httplib.h? It will make this script much simpler".
#2291 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what I was saying, we would be generating it on demand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some problems I am having with implementing this, for example having to insert the #ifdefs to add to the generated module all of the SSL classes. Do you have any suggestions for this?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion... What I meant by 'from scratch' was to create httplib.cppm entirely, rather than just updating a portion of it. So we would like you not to use get_git_diff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant that I would be regenerating it entirely. My only problem is how to generate the symbols with the aforementioned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants