Update api.rst

JorjMcKie · JorjMcKie · commit f779e1e2ceda · 2024-07-11T18:53:56.000-04:00
Clarify use of `hdr_info` parameter.
diff --git a/docs/src/api.rst b/docs/src/api.rst
@@ -18,7 +18,7 @@ API
 
     :arg list pages: optional, the pages to consider for output (caution: specify 0-based page numbers). If omitted all pages are processed.
 
-    :arg hdr_info: optional. Use this if you want to provide your own header detection logic. This may be a callable or an object having a method named `get_header_id`. It must accept a text span (a span dictionary as contained in `extractDict <https://pymupdf.readthedocs.io/en/latest/textpage.html#span-dictionary>`_) and has optional access to the owning `Page <https://pymupdf.readthedocs.io/en/latest/page.html>`_ object. It must return a string "" or up to 6 "#" characters followed by 1 space. If omitted, a full document scan will be performed to find the most popular font sizes and derive header levels based on this. For instance, avoid any headers by specifying `hdr_info=lambda s: ""`.
+    :arg hdr_info: optional. Use this if you want to provide your own header detection logic. This may be a callable or an object having a method named `get_header_id`. It must accept a text span (a span dictionary as contained in `extractDict <https://pymupdf.readthedocs.io/en/latest/textpage.html#span-dictionary>`_) and a keyword parameter "page" (which is the owning `Page <https://pymupdf.readthedocs.io/en/latest/page.html>`_ object). It must return either the empty string "" or a string of 1 to 6 "#" characters followed by one space. If omitted, a full document scan will be performed to find the most popular font sizes and derive header levels based on them. To suppress header detection entirely, specify `hdr_info=lambda s, page=None: ""` or `hdr_info=False`.
 
     :arg bool write_images: when encountering images or vector graphics, PNG images will be created from the respective page area and stored in the folder of the document. Markdown references will be generated pointing to these images. Any text contained in these areas will not be included in the text output (but appear as part of the images). Therefore, if your document has text written on full page images, make sure to set this parameter to `False`.
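
Note on the new `hdr_info` contract: the sketch below shows one way a custom detector could satisfy the signature described in the added line (a span dictionary plus a keyword parameter `page`, returning "" or 1 to 6 "#" characters followed by one space). The function name `size_based_header`, the class `SizeHeaderDetector`, and the font-size thresholds are hypothetical and chosen only for illustration; they are not part of the library.

```python
# Hypothetical header detectors matching the documented hdr_info contract.
# The thresholds are illustrative assumptions, not library defaults.

def size_based_header(span: dict, page=None) -> str:
    """Map a span's font size to a Markdown header prefix.

    `span` is a span dictionary (it carries a "size" key among others);
    `page` is the owning Page object and is unused in this sketch.
    """
    size = span.get("size", 0.0)
    if size >= 20:
        return "# "
    if size >= 16:
        return "## "
    if size >= 13:
        return "### "
    return ""  # body text: no header prefix


class SizeHeaderDetector:
    """Object form: the same logic exposed via a `get_header_id` method."""

    def __init__(self, body_size: float = 11.0):
        self.body_size = body_size

    def get_header_id(self, span: dict, page=None) -> str:
        # Treat anything at least 1.5x the assumed body size as a
        # level-2 header; everything else is body text.
        if span.get("size", 0.0) >= 1.5 * self.body_size:
            return "## "
        return ""


if __name__ == "__main__":
    print(repr(size_based_header({"size": 24.0, "text": "Title"})))
    print(repr(SizeHeaderDetector().get_header_id({"size": 11.0})))
```

Either form would then be supplied as the `hdr_info` argument, for example `hdr_info=size_based_header` or `hdr_info=SizeHeaderDetector()`.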