-
Notifications
You must be signed in to change notification settings - Fork 601
Davem/xspod #23795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: blead
Are you sure you want to change the base?
Davem/xspod #23795
Conversation
johannessen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments, mostly about minor things.
Overall, I found the document to be well structured and the prose to be easily readable, in spite of the fairly complex topic.
|
On Wed, Oct 08, 2025 at 04:53:18PM -0700, Leon Timmermans wrote:
Should we still use PREINIT? I see it's use-case in C89, but in C99 it
can be confusing (mainly because it runs before argument handling).
I didn't look too closely, but my quick experimentation seemed to
indicate that the flag(s) telling perl to use C99 weren't necessarily
being propagated to XS compilation. So I was playing safe. Also if people
want to write code that runs on older perls, or have any reason to
declare and possibly initialise a var before argument processing, then
PREINIT should be the official way.
|
|
Yes, the -std flag is set by the cflags script and not in Some systems (I think non-x86 BSDs) can use old gccs that don't default to c99 or later. |
|
On Sat, Oct 11, 2025 at 03:00:31PM -0700, Leon Timmermans wrote:
@Leont commented on this pull request.
> + require XSLoader;
+ XSLoader::load(__PACKAGE__, $VERSION);
That automatically adds the module name but not the expected version. Personally I wouldn't use it..
I was just following the form of the Foo.pm which h2xs generates. I have
no knowledge or strong opinions about the usage of XSLoader.
|
|
On Thu, Oct 16, 2025 at 05:44:45PM -0700, Matthew Horsfall (alh) wrote:
@wolfsage commented on this pull request.
Is this intended to be a link to other documentation? Because this is in
a code block, it won't be treated that way by pod parsers
Yes, it was originally intended to link. Then I realised that (as you point
out) it doesn't link, but I forgot to update it.
|
|
I'm done for now. |
|
On Mon, Oct 20, 2025 at 07:58:13PM -0700, Tony Cook wrote:
> +=head3 Perl OPs
...
99%* of XS code doesn't touch OPs, I don't think it needs to be the
first heading, or even mentioned at all here.
I mentioned OPs mainly as a way of introducing OP_ENTERSUB, which I needed
to do to explain how XSUBs get called.
Also, I wanted to give readers a bit of background knowledge abut how the
interpreter works, to make sense of things when debugging.
I'll think about this some more.
|
|
On Mon, Oct 20, 2025 at 08:35:05PM -0700, Tony Cook wrote:
I'm not sure it's good to go into so much detail on the internal
structure of SVs here.
I expect nearly all XS code is fine with newSViv()/sv_setiv()/SvIV() etc
rather than looking at any flags beyond SvOK() and SvUTF8().
I wanted to educate people into the difference between SvIVX() and SvIV(),
for example. I've emphasised that they should generally use the latter,
but at least now they're aware of the difference when they then try and
cargo-cult something from an existing distro that incorrectly uses
SvIVX().
|
wolfsage
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(more to come, don't want this to get lost)
wolfsage
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright that's enough for one night, more later
|
On Sun, Oct 26, 2025 at 05:34:58PM -0700, Tony Cook wrote:
+ This SV has a lifetime which is the same as the sub which the OP is
+ a part of
No, it lives as long as the sub
I don't understand. Aren't we both saying the same thing?
|
You and I are, wolfsage seemed to think otherwise. |
wolfsage
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Final review done! Few more small things found.
|
Thanks for all the feedback so far. The two commits I've just appended / pushed to this branch contain fixups for every comment from your reviews. As a general rule, if I didn't reply to a comment, it means I implicitly accepted the point. I made the standardisation of xsubpp version numbers into a separate commit because all the changes were related. The second commit is just a big pile of random minor fixes and rewordings. |
Please mark comments as resolved when you're done with them, that cleans up this thread immensely for anyone reading it. Now it isn't even rendering everything because there are so many of them. |
|
Does anyone still intend to do any further review? Or is this PR approaching being mergeable? |
The XS parser supported an extremely obscure bit of functionality which
made use of the %v package variable to maintain state between different
bits of typemap processing. This was accidentally broken in 5.10.0:
refactoring removed the 'use vars "%v"' line, and no one seemed to
notice or care.
Also, the sole example of its use in the docs seemed to be obscure,
confusing and probably wrong.
There was a consensus in the discussion at
http://nntp.perl.org/group/perl.perl5.porters/267667
that we should stop documenting this feature rather than trying to fix
it.
Rewrite (and retitle) these three subsections:
=head3 Default Parameter Values
=head3 The C<length(NAME)> Keyword
=head3 Variable-length Parameter Lists
Add text for the new '=head2 The XSUB Input Part' section, and rewrite the existing entry for the PREINIT keyword.
This commit completely rewrites this section and subsections:
=head3 The INPUT: Keyword
=head4 The NO_INIT Keyword
=head4 Initializing Function Parameters
=head4 The & Unary Operator
It de-emphasises the INPUT keyword and suggests using ANSI XS signatures
etc instead.
Add text for the new '=head2 The XSUB Init Part' section, and rewrite the existing entry for the INIT keyword.
Add text to the new
=head2 The XSUB Code Part
=head3 Auto-calling a C function
sections, and rewrite the existing
=head4 The C_ARGS: Keyword
section
Rewrite these sections:
=head3 The CODE: Keyword
=head3 The PPCODE: Keyword
This keyword formerly wasn't documented. The docs now say "this is what it is, but don't use it".
Add text to the new
=head2 The XSUB Output Part
section, and rewrite the text in these existing sections:
=head3 The POSTCALL: Keyword
=head3 The OUTPUT: Keyword
Add text to the new
=head2 The XSUB Cleanup Part
section, and rewrite the text in this existing section:
=head3 The CLEANUP: Keyword
Add text to the new
=head2 XSUB Generic Keywords
section, and rewrite the text in this existing section:
=head3 The PROTOTYPE: Keyword
Explain that a 'C' parameter type in an XSUB declaration can actually
be a Perl package name or similar, e.g.
Foo::Bar
f(Foo::Bar obj, char *s)
First, add a new subsection
=head3 T_PTROBJ and opaque handles
to the TYPEMAPs section explaining how this typemap can be used to
map between Perl objects and C library handles. It provides a
fully-worked example of wrapping a simple arithmetic library.
Then completely rewrite the
=head3 The OVERLOAD: Keyword
section. In particular, it now refers to the new T_PTROBJ example and
shows how it can be extended to use overloading.
This keyword was undocumented, even though it had been added 25 years ago.
Populate the introduction to this new section.
Rewrite this section:
=head3 The ALIAS: Keyword
Rewrite these sections: =head3 The INTERFACE: Keyword =head3 The INTERFACE_MACRO: Keyword also demote the second to be a head4 child of the first. Then expand the T_PTROBJ example to use INTERFACE as an alternative to ALIAS.
Rewrite this section:
=head3 The CASE: Keyword
Populate this new section (except for the T_PTROBJ subsection, which had already been added by an earlier commit within this branch). Note that the "Common typemaps" subsection could probably benefit from some further expansion by someone familiar with which built-in T_FOO entries are useful.
Rewrite this section: =head2 Using XS With C++ Disclaimer: I've never written a proper C++ program. I had to (literally) dust off my 34-year old copy of Stroustrup(*) and also do some Googling. Hopefully what I've written is sane. (*) This was bought back in the days when people used to to learn things by buying books, and when I thought that I ought to know something about this newfangled C++ thing. I never got round to reading all of it: I discovered Perl around the same time, which looked to be a lot more fun.
Revise the text in this section:
=head2 Safely Storing Static Data in XS
Rewrite this section:
=head1 EXAMPLES
Basically, delete the one big example in this section and instead
provide links to various other examples already present in this document
instead.
Tweak the final few sections of perlxs.pod.
After the big rewrite, various bits of text which describe when a particular feature was introduced or changed have ended up using a random mixture of Perl, ParseXS and xsubpp version numbers. This commit standardises on xsubpp version numbers. These are mostly the same as ParseXS, but this handles the cases before ParseXS was split off from xsubpp. Perl versions suffer from not exactly matching when an xsubpp version number was incremented, and not matching what is used by the "REQUIRE:" keyword. This commit also adds a short new section which tries to explain how the three sets of version numbers are related.
The many commits in this branch have completely rewritten perlxs.pod. This commit applies all the minor tweaks suggested by reviewers. (It was far too much like hard work to try and update each individual commit with the various changes.)
|
I will merge this PR within a few days unless anyone has any last-minute objections. |
leonerd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, a much bigger (and better) document than previous. So much so I've only had time to review about the top 20% so far but I wanted to commit these comments before they get lost, and to say I will review the rest in the next day or so.
Overall so far looking very much better, here's just a few tiny nits.
| which often requires extensive knowledge of the Perl interpreter's | ||
| internals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you calling me out here? I feel called out ;)
|
|
||
| (This next paragraph is definitely only for background on debugging.) | ||
|
|
||
| An C<OP> is is a data structure within the perl interpreter. It is used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate "is is"
| An C<OP> is is a data structure within the perl interpreter. It is used to | |
| An C<OP> is a data structure within the perl interpreter. It is used to |
| The various OPs executed by the Perl interpreter up until the function is | ||
| called will: push a mark indicating the start of a new argument stack | ||
| frame; push an SV containing the integer value 1; push the SV currently | ||
| bound to the variable C<$x>; and push the C<*foo> typeglob. Then the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "bound to" is likely to confuse folks not from a Lisp/Scheme background. Maybe "the SV currently representing the variable C<$x>" ?
| For a normal Perl subroutine call, C<pp_entersub()> will then: pop the | ||
| topmost mark off the mark stack; pop the SV pointers between that mark and | ||
| the top of the stack and store them in C<@_>; then set C<PL_op> to the | ||
| first OP pointed to by the C<&foo> CV. Those OPs will then be run by the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double space "first OP".
Also, I see some documentation I'm going to have to alter if I get the CVf_NOSNAIL flag working :)
|
|
||
| As mentioned above, almost all runtime data within the perl interpreter is | ||
| stored in an SV (scalar value) structure. The head of an SV structure | ||
| consists of three or four words: a reference count; a type and flags; a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "words" is possibly not a word [pardon the phrasing] that would make sense to folks outside of a machine-level programming background. I'd say "fields", just relating to the members of the C struct.
| STRLEN len; | ||
| char *pv = SvPV(sv, len); | ||
|
|
||
| which both retrieves a string pointer and sets C<len> to its length. Note |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might help casual readers to explicitly point out that since SvPV is a macro, it can do the evil trickery of modifying len directly, so you don't have to pass in &len.
leonerd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got around to reading and reviewing the entire rest.
I admit I somewhat skimmed over the C++ related parts as I don't know the technical details here, though I have read over all the wording and style at least.
Overall a few small comments but generally this is looking much better indeed.
| OUTPUT: | ||
| RETVAL | ||
|
|
||
| Finally, note that some very old (pre-1996) XS documentation suggested |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another double-space on this line
| PPCODE: | ||
| { | ||
| int i; | ||
| int max_ix = AvFILL(av); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we intending these code examples still run on (much) older perls? For a while now we've had av_count() which returns a count, so you don't have to do the more awkward + 1 logic elsewhere. But that wouldn't work before it was added - 87306e0 in time for 5.34.
| limited use and are typically only used to mimic the behaviour of Perl | ||
| builtins. For example there is no way to implement a C<push @a, ...;> | ||
| style function without a way of telling the Perl interpreter not to | ||
| flatten C<@a>. Outside of these narrow uses, it is generally a mistake to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double-space
|
|
||
| Note however that, due to a quirk in parsing, it is possible for a | ||
| C<TYPEMAP:> block which comes I<immediately after> an XSUB to affect any | ||
| entries used by that XSUB, as if the block had appeared just before the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double-space
Also: Oh gosh I can't imagine the crazy amount of parsing logic that results in this particular quirk. ;)
| An XSUB definition consists of a declaration (typically two lines), | ||
| followed by an optional body. The declaration specifies the XSUB's name, | ||
| parameters and return type. The body consists of sections started by | ||
| keywords, which may specify how its parameters and any any return value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated "any any"
| Note that C<OVERLOAD> shouldn't be mixed with the L<ALIAS|/The ALIAS: | ||
| Keyword> keyword; the value of C<ix> will be undefined for any overload | ||
| method call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside: Is this an implementation bug that we should consider fixing?
| information stored in the CV indicates which C library function should be | ||
| autocalled. | ||
|
|
||
| Finally, there is the C<CASE> keyword, which allows the whole body of an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double-space
| and the function pointer. | ||
| The alias name can be either a simple function name or can include a | ||
| package name. The alias value to the right of the C<=> may be either a | ||
| literal positive integer or a word (which is expected to be a CPP define). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... or enum constant. I've done that several times.
| my $color = Foo::Bar->new(0x10, 0x20, 0xff); | ||
| printf "blue=%d\n", $color->blue(); # prints 255 | ||
| $color->set_blue(0x80); | ||
| printf "blue=%d\n", $color->blue(); # prints 128 | ||
|
|
||
| =head2 Safely Storing Static Data in XS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this entire section is talking about mutable static data, yes? I've seen (and written) many cases of static const data structures being used all over .xs files. As they're const and set up purely at compiletime, there's no issues with mutations across threads.
Should there be perhaps some wording on that topic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, though it would also include a static SV* const and similar.
|
|
||
| =head1 AUTHOR | ||
|
|
||
| Originally written by Dean Roehrich <F<roehrich@cray.com>>. | ||
| Completely rewritten in 2025. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to add your name here of course :)
Rewrite perlxs.pod
This branch completely rewrites and modernises the XS reference manual,
perlxs.pod.
The new file is about twice the size of the old one.
This branch:
deletes some obsolete sections;
reorders the existing sections into a more logical order;
adds a large new introductory/overview part, which explains
all the background needed to understand what XSUBs do, including
SVs, the stack, reference counts, magic etc.
includes a BNF syntax section
modernises: e.g. it uses "ANSI" parameter syntax throughout
has a fully-worked example using T_PTROBJ
Note that although each commit in this branch may have a complex-looking
diff for the updating of a particular section, in reality most sections
haver been rewritten from scratch, and the diff output is showing
paragraph breaks as fixed unchanging points, so that it appears as lots of
individual paragraph changes rather than "delete all this text, add new
text". If reviewing, it may be easier to just read the final perlxs.pod
file instead of looking at the diffs.