Re: bidi-string-strip-control-characters
From: Eli Zaretskii
Subject: Re: bidi-string-strip-control-characters
Date: Thu, 20 Jan 2022 12:14:33 +0200

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: emacs-devel@gnu.org
> Date: Thu, 20 Jan 2022 10:29:26 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Lars, I'm not sure I understand the purpose of this function. Can you
> > explain?
>
> Like the NEWS item says, it's for cases where you want to ensure that
> there's no bidiness going on.

But when is that useful, the current specific use case aside?

> > (textsec-email-address-header-suspicious-p
> > "Lars Ingebrigtsen <larsi@\N{RIGHT-TO-LEFT OVERRIDE}gnus.org>")
> > "Disallowed character: `' (#x202e, RIGHT-TO-LEFT OVERRIDE)"
> >
> > The empty string between quotes is the riddle.
>
> Well... perhaps not optimal, but not really a riddle. But the function
> will probably be used elsewhere in textsec, too, but I haven't gotten
> round to auditing all the strings yet.

That's why I think we should discuss the issue now. I don't think
removing the bidi controls is TRT, as it will make some text hard to
read and interpret. We can do better.

> > (insert (format "Disallowed character: `%s' (#x202e, RIGHT-TO-LEFT OVERRIDE)"
> >                 (concat (string ?\x202e)
> >                         (propertize (string ?\x202c ?\x200e) 'invisible t))))
> >
> > This displays the RLO character, but doesn't mess up the description
> > after it.
>
> The display is identical to the one we have now, though:
>
> "Disallowed character: `' (#x202e, RIGHT-TO-LEFT OVERRIDE)"

No, it isn't identical, because in the latter case the U+202E glyph is
retained on display. (It disappeared from your email for some reason,
but if I eval the form, I see it between the quotes.)

> But removing the bidi chars is "obviously correct" (and impervious to
> future attacks) for somebody that's not that familiar with the bidi
> machinery, so I prefer to remove the chars instead here.

You make this stuff hard to read for a reason that doesn't sound right
to me: we do have better solutions that still avoid messing up the
display. We use those other solutions elsewhere in Emacs, so why not
here?

> Isn't that bidi-string-mark-left-to-right?

Yes, but bidi-string-mark-left-to-right will not help with overrides;
it only helps with "normal" RTL characters. We do need a new API,
just not one that removes the bidi controls entirely; that is too
drastic. What we do in descr-text.el provides a full solution; we
just need to factor it out into a separate function.
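
To make the idea concrete, here is a minimal sketch of what such a factored-out helper might look like. It is not the code in descr-text.el: the name bidi-sketch-isolate-control is hypothetical, and pairing the isolate initiators (U+2066..U+2068) with PDI plus LRM is my assumption, modeled on the PDF plus LRM form quoted above.

```elisp
;; Hypothetical helper, not an existing Emacs function.
(defun bidi-sketch-isolate-control (char)
  "Return a string that displays CHAR without affecting the text after it.
Embedding and override controls are followed by an invisible PDF and LRM;
isolate initiators are followed by an invisible PDI and LRM (an assumption).
Any other character is returned unchanged as a one-character string."
  (let ((s (string char)))
    (cond
     ;; LRE, RLE, LRO, RLO: terminated by PDF (U+202C).
     ((memq char '(#x202a #x202b #x202d #x202e))
      (concat s (propertize (string #x202c #x200e) 'invisible t)))
     ;; LRI, RLI, FSI: terminated by PDI (U+2069).
     ((memq char '(#x2066 #x2067 #x2068))
      (concat s (propertize (string #x2069 #x200e) 'invisible t)))
     (t s))))
```

A caller would then insert something like (format "Disallowed character: `%s'" (bidi-sketch-isolate-control #x202e)), so the control character stays in the text while the description after it keeps its left-to-right order.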