bug-gnu-emacs

bug#63125: 30.0.50; [BUG] last argument of libxml-parse-html-region has


From: Ruijie Yu
Subject: bug#63125: 30.0.50; [BUG] last argument of libxml-parse-html-region has no effect?
Date: Sat, 29 Apr 2023 08:58:03 +0800
User-agent: mu4e 1.9.22; emacs 30.0.50

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Ruijie Yu <ruijie@netyu.xyz>
>> Cc: Eli Zaretskii <eliz@gnu.org>, 63125@debbugs.gnu.org
>> Date: Fri, 28 Apr 2023 18:40:35 +0800
>> 
>> > I have filed an issue [1] in libxml2.  We'll see what they say about it.
>> >
>> > FTR, [2] is the documentation of the libxml2's htmlReadMemory()
>> > function -- though it does not say much.
>> >
>> > [1]: https://gitlab.gnome.org/GNOME/libxml2/-/issues/525
>> > [2]:
>> > https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-HTMLparser.html#htmlReadMemory.
>> 
>> I just got a response from one of libxml2's maintainers.
>> 
>> It seems that the docstring for `libxml-parse-html-region' is wrong:
>> this argument has never served the purpose of resolving relative URLs.
>> It was only used for error messages.  So I suggest that we modify the
>> docstring of this function and `libxml-parse-xml-region' to reflect this
>> fact.
>
> The response doesn't say much.  What is this "base URL" argument used
> for, and why is it named "base URL"?  What does "used for error
> messages" mean?  And where is the up-to-date and accurate documentation
> of this function, which explains what this argument is for?
>
> Without knowing all that, we cannot fix our documentation, let alone
> code.

The "base-url" is an argument to the Elisp function
`libxml-parse-html-region'.  I added Lars to the CC, who originally
introduced this function according to git-blame, and who may have a
better idea.

The following are my impressions, but I'm happy to pass any questions
you still have on to the libxml2 devs if you want (or you can comment
directly on the linked issue on GNOME's GitLab instance).

-----

As you pointed out, the arguments of the Elisp function are passed,
with minimal transformation, to the libxml2 function
`htmlReadMemory()'.  This C function takes an argument `url', which
receives the string `base-url', or the empty string if `base-url' is nil.

According to Nick (the libxml2 maintainer) and my interpretation, the
`url' parameter of the libxml2 function is simply stored inside the
`url' field of a `xmlDoc' struct, to be used when an error message needs
to be displayed.  So, the `url' parameter practically does nothing for
us, since we disable all libxml2-level warnings and errors in calling
`htmlReadMemory()'.
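
To make this concrete, here is a toy model in C of my understanding.
Every name in it (`toy_doc', `toy_read_memory', `toy_url_or_empty',
`TOY_NOERROR') is made up for illustration; none of this is libxml2's
actual code or API, and the "syntax error" check is a deliberately
silly stand-in for a real parser.  It only sketches the claim above:
the URL is stored and shows up in diagnostics, so with diagnostics
suppressed it has no observable effect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an "errors disabled" parser option. */
enum { TOY_NOERROR = 1 };

/* Hypothetical stand-in for a parsed document. */
typedef struct {
    const char *url;      /* stored verbatim; never consulted while parsing */
    size_t      length;   /* stand-in for the parse result */
    int         reported; /* number of diagnostics actually printed */
} toy_doc;

/* Mirrors the mapping described above: a nil base-url becomes "". */
static const char *toy_url_or_empty(const char *base_url)
{
    return base_url ? base_url : "";
}

/* Toy "parser": the result depends only on HTML; URL is used solely
   as a prefix for the diagnostic, and only if errors are enabled. */
static toy_doc toy_read_memory(const char *html, const char *url, int options)
{
    assert(html != NULL);
    toy_doc doc = { url, strlen(html), 0 };

    /* Pretend that a trailing '<' with no matching '>' is an error. */
    const char *lt = strrchr(html, '<');
    if (lt && !strchr(lt, '>') && !(options & TOY_NOERROR)) {
        fprintf(stderr, "%s: error: unterminated tag\n", doc.url);
        doc.reported = 1;
    }
    return doc;
}
```

In this model, calling `toy_read_memory' with `TOY_NOERROR' gives the
same `length' and prints nothing whether the URL is "" or a real
address; the URL only becomes visible once the option is dropped.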

I included this URL [1] in the issue, assuming that it is the
documentation, and Nick did not comment on it.  So this is probably the
up-to-date, albeit not very elaborate, documentation for the function.

[1]: 
https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-HTMLparser.html#htmlReadMemory

-- 
Best,


RY

[Please note that this mail might go to spam due to some
misconfiguration in my mail server -- still investigating.]




