emacs-devel

Re: master 02bca34: Utilize new string decoding feature in GTK native in


From: Eli Zaretskii
Subject: Re: master 02bca34: Utilize new string decoding feature in GTK native input
Date: Sat, 19 Feb 2022 14:36:43 +0200

> From: Po Lu <luangruo@yahoo.com>
> Cc: emacs-devel@gnu.org
> Date: Sat, 19 Feb 2022 18:09:38 +0800
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Is this a good idea?  Consing a string when we process input increases
> > GC pressure, and what issues does this change solve as a
> > counter-weight for that disadvantage?  Is g_utf8_to_ucs4 a problematic
> > API or something?
> 
> No, but some input method modules don't always return valid UTF-8 like
> they're supposed to, thereby causing crashes in g_utf8_to_ucs4_fast.
> 
> I should have explained that in the commit message.

You can still explain that in a comment to the code.

> > But in general, decoding a UTF-8 encoded C string is better done without
> > consing a string and then using the coding.c stuff.  After all, if the
> > original string is 100% guaranteed to be in UTF-8, the decoding is
> > almost trivial.
> 
> It's supposedly guaranteed, but some input method modules break that
> guarantee.

And what do we want to do with those invalid UTF-8 sequences?  The way
you did it will produce raw bytes for them -- is that really TRT in
this case?

In any case, at the very least consider using decode_string_utf_8
instead of consing a Lisp string and then using the "usual" decoding
stuff -- decode_string_utf_8 will cons only one string.
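
To make the question about invalid sequences concrete, here is a minimal,
self-contained sketch of a strict UTF-8 decoder that never reads past the
input and never trusts the bytes it is given.  It is not Emacs code and
does not use coding.c; the function names utf8_decode_one and
utf8_decode_lossy are hypothetical.  The policy of replacing each invalid
byte with U+FFFD is only an assumption for illustration: as noted above,
the change under discussion yields raw bytes instead, and which policy is
appropriate is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence starting at S, with LEN bytes available.
   On success, store the code point in *CP and return the number of
   bytes consumed (1 to 4).  Return 0 if the bytes at S are not a valid
   sequence: bad lead byte, truncated or malformed continuation,
   overlong form, surrogate, or value above U+10FFFF.  */
size_t
utf8_decode_one (const unsigned char *s, size_t len, uint32_t *cp)
{
  size_t need;
  uint32_t c, min;

  if (len == 0)
    return 0;

  if (s[0] < 0x80)
    {
      *cp = s[0];
      return 1;
    }
  else if ((s[0] & 0xE0) == 0xC0)
    need = 2, c = s[0] & 0x1F, min = 0x80;
  else if ((s[0] & 0xF0) == 0xE0)
    need = 3, c = s[0] & 0x0F, min = 0x800;
  else if ((s[0] & 0xF8) == 0xF0)
    need = 4, c = s[0] & 0x07, min = 0x10000;
  else
    return 0;			/* Stray continuation or invalid lead byte.  */

  if (len < need)
    return 0;			/* Truncated sequence.  */

  for (size_t i = 1; i < need; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
	return 0;		/* Not a continuation byte.  */
      c = (c << 6) | (s[i] & 0x3F);
    }

  if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return 0;			/* Overlong, out of range, or surrogate.  */

  *cp = c;
  return need;
}

/* Decode LEN bytes of STR into OUT, which has room for OUT_CAP code
   points.  Each invalid byte becomes U+FFFD (an illustrative policy,
   not what Emacs does) and is counted in *N_INVALID.  Return the
   number of code points written.  */
size_t
utf8_decode_lossy (const char *str, size_t len,
		   uint32_t *out, size_t out_cap, size_t *n_invalid)
{
  assert (str != NULL && out != NULL && n_invalid != NULL);

  const unsigned char *s = (const unsigned char *) str;
  size_t i = 0, n = 0;

  *n_invalid = 0;
  while (i < len && n < out_cap)
    {
      uint32_t cp;
      size_t used = utf8_decode_one (s + i, len - i, &cp);

      if (used == 0)
	{
	  cp = 0xFFFD;		/* Replace the offending byte and resync.  */
	  used = 1;
	  ++*n_invalid;
	}
      out[n++] = cp;
      i += used;
    }
  return n;
}
```

A caller would pass the string committed by the input method together with
its byte length, then check *n_invalid to decide whether to accept, log, or
drop the result.  No Lisp string is consed in this sketch, which is the
trade-off being weighed above against reusing the existing decoding
machinery.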

