Re: [PATCH 4/6] xattr.7: wfix

From: Alejandro Colomar
Subject: Re: [PATCH 4/6] xattr.7: wfix
Date: Mon, 1 Aug 2022 15:28:03 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.0.2

CC -= Štěpán:
I don't think he's interested in a deep discussion about the use of \~ and '\ ' in man pages.
CC -= mtk:
He's already subscribed to the list, and quite silent these days.
CC += groff@:
People there are probably interested in this discussion.

Hi Branden,

On 7/30/22 19:53, Alejandro Colomar (man-pages) wrote:
Hi Štěpán and Branden!

On 7/30/22 16:15, Štěpán Němec wrote:

Hello Branden,

On Fri, 29 Jul 2022 15:58:23 -0500
G. Branden Robinson wrote:

-The VFS imposes limitations that an attribute names is limited to 255 bytes
-and an attribute value is limited to 64\ kB.
+The VFS-imposed limits on attribute names and values are 255 bytes
+and 64\ kB, respectively.

While you're tidying this up, I would convert the `\ ` escape sequence
to `\~`.  Both are non-breaking spaces, but the latter is adjustable.

groff_man(7) from groff 1.22.4 says:

  \~     Adjustable, non-breaking space character.  Use this escape to
         prevent a break inside a short phrase or between a numerical
         quantity and its corresponding unit(s).

                Before starting the motor, set the output speed to\~1.
                There are 1,024\~bytes in 1\~kiB.
                CSTR\~#8 documents the B language.

Thank you for the review!

I think I disagree: IMO a number+unit should be treated as a single
entity both semantically/logically and typographically (at least as far
as space stretching goes), i.e., say (if I understand the effect of '\ '
and '\~' right),

   255 bytes               and                64 kB, respectively.

would make a bit more sense to me than

   255        bytes        and         64         kB, respectively.
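The difference between the two layouts above can be sketched in a few lines of Python. This is only a rough illustration, not groff's actual adjustment algorithm: the placeholder characters `FIXED` and `ADJ` and the function `justify` are hypothetical names standing in for '\ ' and '\~', and the padding is simply spread evenly over the stretchable gaps.

```python
# Hypothetical stand-ins for the two roff escapes (not real groff input).
FIXED = "\x00"  # plays the role of '\ '  (non-breaking, not adjustable)
ADJ = "\x01"    # plays the role of '\~'  (non-breaking, adjustable)


def justify(line: str, width: int) -> str:
    """Pad `line` to `width` by widening only the stretchable gaps.

    Ordinary spaces and ADJ stretch; FIXED always stays one column wide.
    Simplification: extra columns are handed out left to right.
    """
    gaps = [i for i, ch in enumerate(line) if ch in (" ", ADJ)]
    extra = width - len(line)
    out = list(line)
    if gaps and extra > 0:
        base, rem = divmod(extra, len(gaps))
        for n, i in enumerate(gaps):
            out[i] = " " * (1 + base + (1 if n < rem else 0))
    # Whatever was not stretched is rendered as a single space.
    return "".join(out).replace(FIXED, " ").replace(ADJ, " ")


fixed_line = justify(f"255{FIXED}bytes and 64{FIXED}kB", 25)
adj_line = justify(f"255{ADJ}bytes and 64{ADJ}kB", 25)
print(repr(fixed_line))
print(repr(adj_line))
```

With the fixed variant, all of the padding lands around "and", so each value stays visually attached to its unit; with the adjustable variant, the gap inside each value+unit pair widens as well.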

Current Linux man-pages usage doesn't appear quite consistent, but '\ '
prevails over '\~' (about 6:1), and my cursory grep found only one
instance of '\~' used between a number and its unit

Would you mind sending a patch for that one between the number and its unit?

(vs. many instances
of '\ ' in that context).
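A count like the one described could be reproduced roughly as follows. This is a sketch under assumptions: the patterns only catch a digit followed directly by the escape and then a letter, so they will miss some value+unit pairs and may match things that are not units; `count_escapes` and the sample text are hypothetical.

```python
import re

# Digit, then the escape, then a letter: a crude proxy for "number + unit".
FIXED_RE = re.compile(r"\d\\ [A-Za-z]")   # e.g. 16\ MiB
ADJ_RE = re.compile(r"\d\\~[A-Za-z]")     # e.g. 1,024\~bytes


def count_escapes(text: str) -> tuple[int, int]:
    """Return (count of '\\ ', count of '\\~') between a digit and a letter."""
    return len(FIXED_RE.findall(text)), len(ADJ_RE.findall(text))


sample = (
    "is filesystem dependent and is typically 16\\ MiB.\n"
    "There are 1,024\\~bytes in 1\\ kiB.\n"
)
fixed_count, adj_count = count_escapes(sample)
print(fixed_count, adj_count)
```

Running it over the page sources (for example by reading each man page file and summing the tuples) would give a ratio comparable to the one quoted, subject to the limits of the patterns.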

That is just a matter of writers not knowing about \~ ('\ ' was documented in man-pages(7), but \~ wasn't).  I wouldn't give much weight to existing practice in this regard.

When I read this email I had no strong opinion; both variants made sense to me.  So I did some investigation, to see if the SI already specifies something about it; and it does:


5.2 Unit symbols

Unit symbols are printed in upright type regardless of the type used in the surrounding text.  They are printed in lower-case letters unless they are derived from a proper name, in which case the first letter is a capital letter.

An exception, adopted by the 16th CGPM (1979, Resolution 6), is that either capital L or lower-case l is allowed for the litre, in order to avoid possible confusion between the numeral 1 (one) and the lower-case letter l (el).

A multiple or sub-multiple prefix, if used, is part of the unit and precedes the unit symbol without a separator.  A prefix is never used in isolation and compound prefixes are never used.

Unit symbols are mathematical entities and not abbreviations. Therefore, they are not followed by a period except at the end of a sentence, and one must neither use the plural nor mix unit symbols and unit names within one expression, since names are not mathematical entities.

In forming products and quotients of unit symbols the normal rules of algebraic multiplication or division apply.  Multiplication must be indicated by a space or a half-high (centred) dot (⋅), since otherwise some prefixes could be misinterpreted as a unit symbol.  Division is indicated by a horizontal line, by a solidus (oblique stroke, /) or by negative exponents.  When several unit symbols are combined, care should be taken to avoid ambiguities, for example by using brackets or negative exponents.  A solidus must not be used more than once in a given expression without brackets to remove ambiguities.

It is not permissible to use abbreviations for unit symbols or unit names, such as sec (for either s or second), sq. mm (for either mm² or square millimetre), cc (for either cm³ or cubic centimetre), or mps (for either m/s or metre per second).  The use of the correct symbols for SI units, and for units in general, as listed in earlier chapters of this brochure, is mandatory.  In this way ambiguities and misunderstandings in the values of quantities are avoided.

5.4.3 Formatting the value of a quantity

The numerical value always precedes the unit and a space is always used to separate the unit from the number.  Thus the value of the quantity is the product of the number and the unit.  The space between the number and the unit is regarded as a multiplication sign (just as a space between units implies multiplication).  The only exceptions to this rule are for the unit symbols for degree, minute and second for plane angle, °, ′ and ′′, respectively, for which no space is left between the numerical value and the unit symbol.

This rule means that the symbol °C for the degree Celsius is preceded by a space when one expresses values of Celsius temperature t.

Even when the value of a quantity is used as an adjective, a space is left between the numerical value and the unit symbol.  Only when the name of the unit is spelled out would the ordinary rules of grammar apply, so that in English a hyphen would be used to separate the number from the unit.

In any expression, only one unit is used. An exception to this rule is in expressing the values of time and of plane angles using non-SI units.  However, for plane angles it is generally preferable to divide the degree decimally.  It is therefore preferable to write 22.20° rather than 22° 12′, except in fields such as navigation, cartography, astronomy, and in the measurement of very small angles.

Sorry for copying the full text, but I preferred to give enough context.

So, from the SI text quoted above, the space is not a word separator in that context (it is for example not allowed to hyphenate between the value and the unit even if it acts as an adjective; the SI disables normal language rules).  It is instead a mathematical symbol denoting multiplication, and the whole value+unit is a single mathematical expression; to me, that is better denoted with a single space, rather than an adjustable one.

Therefore, I'd say that it makes more sense in this case to use '\ '.

In view of the above, failing any instruction from a man-pages
maintainer to the contrary, I'd prefer leaving this as is.

In the general case, I prefer \~, but for value+unit I prefer '\ '.
Thank you both!

   With best wishes,




On 7/30/22 19:59, Alejandro Colomar (man-pages) wrote:
> On 7/30/22 19:53, Alejandro Colomar (man-pages) wrote:
>> Even when the value of a quantity is used as an adjective, a space is
>> left between the numerical value and the unit symbol.  Only when the
>> name of the unit is spelled out would the ordinary rules of grammar
>> apply, so that in English a hyphen would be used to separate the
>> number from the unit.
> Although, I missed this small paragraph.  According to that, it would be
> 255\~bytes but 64\ kB.
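The rule just described (a unit symbol gets '\ ', a spelled-out unit name gets '\~') can be sketched like this. It is only an illustration: `UNIT_SYMBOLS` is a hypothetical, far from complete set, and `value_with_unit` is a made-up helper, not part of any man-pages or groff tooling.

```python
# Hypothetical, non-exhaustive set of unit symbols; anything not listed
# here is treated as a spelled-out unit name.
UNIT_SYMBOLS = {"B", "kB", "MB", "GB", "KiB", "MiB", "GiB", "s", "ms", "Hz"}


def value_with_unit(value: str, unit: str) -> str:
    r"""Join a value and its unit with '\ ' (symbol) or '\~' (unit name)."""
    escape = r"\ " if unit in UNIT_SYMBOLS else r"\~"
    return f"{value}{escape}{unit}"


print(value_with_unit("255", "bytes"))
print(value_with_unit("64", "kB"))
```

For the sentence from the patch, this produces an adjustable space before "bytes" and a non-adjustable one before "kB".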

I left the whole original conversation in place so that groff@ readers can follow it without having to go to the linux-man@ archives.

I'd like to arrive at some consensus on the usage of \~ and '\ '.

For things related to the SI, we should follow SI conventions (they developed them for a reason, and I don't see a strong reason to deviate).

For things unrelated to the SI, we need to come up with some convention. I think mirroring what the SI does could be good.

For example, for commands, I'd use non-adjustable spaces. For pointer types, I'd also use the non-adjustable space. For compound names such as 'RFC 1234', I'd say normal language rules apply, and the space should be adjustable.

To be clear, I'll add some examples taken from the Linux man-pages (and some of them modified by me):

.I "struct termios2\ *"
.I (1\ <<\ oparg)
.I unice\ =\ 20\ \-\ knice
is filesystem dependent and is typically 16\ MiB.
.I (uid_t)\ \-1
Enables RFC\~7413 Fast Open support.
.I Power ISA, Book\~II - Section\~3.1 (Program Priority Registers)
Before starting the motor, set the output speed to\~1.
There are 1,024\~bytes in 1\ kiB.
CSTR\~#8 documents the B language.

What do you think?



Alejandro Colomar
