emacs-devel
Re: bidi-string-strip-control-characters


From: Eli Zaretskii
Subject: Re: bidi-string-strip-control-characters
Date: Thu, 20 Jan 2022 12:14:33 +0200

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Thu, 20 Jan 2022 10:29:26 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Lars, I'm not sure I understand the purpose of this function.  Can you
> > explain?
> 
> Like the NEWS item says, it's for cases where you want to ensure that
> there's no bidiness going on.

But when is that useful, the current specific use case aside?

> >   (textsec-email-address-header-suspicious-p
> >    "Lars Ingebrigtsen <larsi@\N{RIGHT-TO-LEFT OVERRIDE}gnus.org>")
> >   "Disallowed character: `' (#x202e, RIGHT-TO-LEFT OVERRIDE)"
> >
> > The empty string between quotes is the riddle.
> 
> Well...  perhaps not optimal, but not really a riddle.  But the function
> will probably be used elsewhere in textsec, too, but I haven't gotten
> round to auditing all the strings yet.

That's why I think we should discuss the issue now.  I don't think
removing the bidi controls is TRT, as it will make some text hard to
read and interpret.  We can do better.

> >   (insert (format "Disallowed character: `%s' (#x202e, RIGHT-TO-LEFT OVERRIDE)"
> >             (concat (string ?\x202e)
> >                     (propertize (string ?\x202c ?\x200e) 'invisible t))))
> >
> > This displays the RLO character, but doesn't mess up the description
> > after it.
> 
> The display is identical to the one we have now, though:
> 
>    "Disallowed character: `' (#x202e, RIGHT-TO-LEFT OVERRIDE)"

No, it isn't identical, because in the latter case the U+202E glyph is
retained on display.  (It disappeared from your email for some reason,
but if I eval the form, I see it between the quotes.)

> But removing the bidi chars is "obviously correct" (and impervious to
> future attacks) for somebody that's not that familiar with the bidi
> machinery, so I prefer to remove the chars instead here.

You make this stuff hard to read for a reason that doesn't sound right
to me: we do have better solutions that still avoid messing up the
display.  We use those other solutions elsewhere in Emacs, so why not
here?

> Isn't that bidi-string-mark-left-to-right?

Yes, but bidi-string-mark-left-to-right will not help with overrides;
it only helps with "normal" RTL characters.  We do need a new API,
just not one that removes the bidi controls entirely: that is too
drastic.  What we do in descr-text.el provides a full solution; we
just need to factor it out into a separate function.
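
A minimal sketch of what such a factored-out function might look
like.  The name my-bidi-neutralize-char is hypothetical, not an
existing Emacs function.  Only the RLO case (invisible PDF followed by
LRM) is taken from the form quoted above; the handling of the other
controls is an assumption modeled on the same idea and may not match
descr-text.el exactly.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-bidi-neutralize-char (char)
  "Return a string that shows CHAR without disturbing the text after it.
CHAR itself is kept; invisible bidi controls are appended to end
whatever embedding, override or isolate CHAR would start."
  (let ((s (string char)))
    (cond
     ;; LRE, LRO: end the embedding/override with an invisible PDF.
     ((memq char '(?\x202a ?\x202d))
      (concat s (propertize (string ?\x202c) 'invisible t)))
     ;; RLE, RLO: invisible PDF, then LRM so that following digits
     ;; and punctuation are not reordered (the case quoted above).
     ((memq char '(?\x202b ?\x202e))
      (concat s (propertize (string ?\x202c ?\x200e) 'invisible t)))
     ;; LRI, RLI, FSI: end the isolate with an invisible PDI
     ;; (assumption, by analogy with the PDF case).
     ((memq char '(?\x2066 ?\x2067 ?\x2068))
      (concat s (propertize (string ?\x2069) 'invisible t)))
     ;; Strong RTL characters: follow with an invisible LRM
     ;; (assumption).
     ((memq (get-char-code-property char 'bidi-class) '(R AL))
      (concat s (propertize (string ?\x200e) 'invisible t)))
     ;; Anything else needs no protection.
     (t s))))
```

With that, the message from the example could be produced by
something like:

```elisp
(insert (format "Disallowed character: `%s' (#x%x, %s)"
                (my-bidi-neutralize-char ?\x202e)
                ?\x202e
                (get-char-code-property ?\x202e 'name)))
```

The control character stays in the string, so its glyph can still be
shown between the quotes, while the invisible terminators keep the
rest of the line in left-to-right order.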


