emacs-bug-tracker

[debbugs-tracker] bug#27978: closed (Detection of section name in man.el)


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#27978: closed (Detection of section name in man.el)
Date: Fri, 18 Aug 2017 08:51:02 +0000

Your message dated Fri, 18 Aug 2017 11:49:57 +0300
with message-id <address@hidden>
and subject line Re: bug#27978: Detection of section name in man.el
has caused the debbugs.gnu.org bug report #27978,
regarding Detection of section name in man.el
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
27978: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=27978
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: Detection of section name in man.el
Date: Sun, 6 Aug 2017 01:44:19 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

When parsing manual pages in languages with non-ASCII letters, section names that use non-ASCII letters are not added to the table of contents.

I noticed the bug while reading the French bash manual: the quite useful "COMMANDES INTERNES DE l'INTERPRÉTEUR" section (SHELL BUILTIN COMMANDS) does not appear, because of the letter É.

I propose using character classes instead of ASCII ranges in the relevant regexp defvars. This should change nothing for English manuals and should work for many other languages.

 It works great for the bash manual in French.
 Grégory Mounié

Attachment: 0001-Unicode-support-for-man-section-name-detection.patch
Description: Text Data


--- End Message ---
--- Begin Message ---
Subject: Re: bug#27978: Detection of section name in man.el
Date: Fri, 18 Aug 2017 11:49:57 +0300

> From: Grégory Mounié
>       <address@hidden>
> Date: Sun, 6 Aug 2017 01:44:19 +0200
> 
>   When parsing manual pages in languages with non-ASCII letters, section
> names that use non-ASCII letters are not added to the table of contents.
> 
>   I noticed the bug while reading the French bash manual: the quite useful
> "COMMANDES INTERNES DE l'INTERPRÉTEUR" section (SHELL BUILTIN COMMANDS)
> does not appear, because of the letter É.
> 
>   I propose using character classes instead of ASCII ranges in the
> relevant regexp defvars. This should change nothing for English
> manuals and should work for many other languages.

Thanks, I pushed these changes with some minor adjustments.
Specifically:

> -(defvar Man-section-regexp "[0-9][a-zA-Z0-9+]*\\|[LNln]"
> +(defvar Man-section-regexp "[[:digit:]][[:alnum:]+]*\\|[LNln]"
>    "Regular expression describing a manpage section within parentheses.")

I didn't change this one, because I think a section always uses only
ASCII letters and numbers, as in ".1n".  If you disagree, can you show
an example where this is not so?
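
For illustration only, here is a minimal sketch of what the unchanged ASCII pattern accepts. The regexp is the original one from the diff quoted above; the helper name demo-section-p, the \` and \' anchors, and the sample strings are assumptions made for this example and are not part of man.el.

```elisp
;; Hypothetical helper, not part of man.el: non-nil if S as a whole
;; looks like a manpage section under the original ASCII-only pattern.
(defun demo-section-p (s)
  (let ((case-fold-search nil))          ; keep the letter ranges case-sensitive
    (and (string-match-p
          ;; Original pattern, wrapped in a shy group and anchored to
          ;; the start (\`) and end (\') of the string for this test.
          "\\`\\(?:[0-9][a-zA-Z0-9+]*\\|[LNln]\\)\\'"
          s)
         t)))

;; Made-up sample section names; only the last one is non-ASCII.
(mapcar #'demo-section-p '("1" "1n" "3pm" "n" "é"))
```

Sections such as "1n" or "3pm" are plain ASCII and pass, which is the case the pattern is kept for.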

> -(defvar Man-heading-regexp "^\\([A-Z][A-Z0-9 /-]+\\)$"
> +(defvar Man-heading-regexp "^\\([[:upper:]][[:upper:][:digit:] /-]+\\)$"
>    "Regular expression describing a manpage heading entry.")

I see no reason to replace 0-9 with [:digit:] here, since I think
non-ASCII digits will never be used in this context.  Do you agree?
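
As a rough sketch of the difference under discussion, the snippet below runs the old and the proposed heading patterns from the diff above against two sample headings. The function name demo-heading-matches and the headings "VOIR AUSSI" and "PARAMÈTRES" are made up for this example. It binds case-fold-search to nil because, with case folding on, both A-Z and [:upper:] would also match lower-case letters.

```elisp
;; Hypothetical demo, not part of man.el.  Both regexps are copied
;; from the quoted diff; the headings are made-up test strings.
(defun demo-heading-matches ()
  (let ((case-fold-search nil)           ; keep upper-case tests strict
        (ascii-re "^\\([A-Z][A-Z0-9 /-]+\\)$")
        (class-re "^\\([[:upper:]][[:upper:][:digit:] /-]+\\)$"))
    (list
     ;; ASCII-only heading: the old pattern matches.
     (and (string-match-p ascii-re "VOIR AUSSI") t)
     ;; Heading containing È: outside A-Z, so the old pattern fails.
     (and (string-match-p ascii-re "PARAMÈTRES") t)
     ;; Same heading: È counts as upper case, so the class matches.
     (and (string-match-p class-re "PARAMÈTRES") t))))

(demo-heading-matches)
```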

Incidentally, I see quite a few similar regexps elsewhere in man.el.
Did you audit all of them and establish that they don't need similar
changes?  If not, would you like to propose similar changes there?


--- End Message ---
