Re: Automatic (e)tags generation and incremental updates

From: Dmitry Gutov
Subject: Re: Automatic (e)tags generation and incremental updates
Date: Sat, 16 Jan 2021 05:57:21 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 13.01.2021 17:58, Eli Zaretskii wrote:

>>> Almost all the identifiers are ASCII, right?  So maybe optimize 99.9%
>>> of use cases by storing such tags tables in a unibyte buffer, read
>>> with insert-file-contents-literally?
>>
>> All right, and that option is probably handled well enough already by
>> the user choosing (l) in the prompt when the tags file is very big.
>
> Yes, but my idea was to do that automatically.  After all, the size
> threshold beyond which we prompt the user is customizable, so it could
> be very large.
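
To make the "do that automatically" idea concrete, here is a minimal language-neutral sketch in Python. It is not Emacs code, and `load_tags` is a hypothetical helper. It keeps the table as raw bytes when every byte is ASCII and decodes it only otherwise, so non-ASCII identifiers would still be handled on the slow path.

```python
def load_tags(data: bytes):
    """Return (table, is_unibyte) for the raw contents of a tags file.

    Hypothetical helper: pure-ASCII input is kept as bytes (the cheap,
    "literal" path); anything else is decoded as UTF-8.
    """
    if data.isascii():
        # Fast path: no decoding needed, every byte is one character.
        return data, True
    # Slow path: decode so that non-ASCII identifiers stay searchable.
    return data.decode("utf-8", errors="replace"), False
```

The decision is made from the file's contents alone, with no prompt involved.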

Even so, this mode of operation removes a feature (correct handling of non-ASCII identifiers). How frequently it's used, I have no idea, but it's better to have full functionality by default. There must be a reason why all those languages added support for Unicode characters in identifiers.

For the time being, I just disabled synchronization to disk, given that we don't yet know how to refresh an existing file anyway.

>> My (apparently faulty) intuition was that if utf-8-emacs is the memory
>> representation of buffer text, converting it into that encoding can be
>> faster because it could be done by copying from memory rather than
>> having to do the work of recoding every character.
>
> We don't recode characters when they are valid UTF-8 sequences, but
> you forget the raw bytes: they are converted from internal multibyte
> representation to single bytes, and that requires walking the buffer
> one character at a time.
>
> IOW, utf-8-emacs is the same as utf-8 for this purpose.
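
As a rough model of why a plain memory copy is not enough, consider the Python sketch below. It is not Emacs code. It assumes, as I understand Emacs's character.h, that a raw byte is stored internally as a two-byte sequence with lead byte 0xC0 or 0xC1; `encode_internal` is a hypothetical name. Valid UTF-8 sequences are copied through unchanged, but each raw byte has to be collapsed back into a single byte, which forces a walk over the buffer.

```python
def encode_internal(buf: bytes) -> bytes:
    """Model of encoding internal text: copy UTF-8, collapse raw bytes.

    Assumption: a raw byte B in 0x80..0xFF is stored as the pair
    (0xC0 | ((B >> 6) & 1), 0x80 | (B & 0x3F)).
    """
    out = bytearray()
    i = 0
    while i < len(buf):
        lead = buf[i]
        if lead in (0xC0, 0xC1) and i + 1 < len(buf):
            # Raw byte: rebuild the original single byte from the pair.
            out.append(((lead & 0x01) << 6) | 0x80 | (buf[i + 1] & 0x3F))
            i += 2
        else:
            # ASCII or part of a valid UTF-8 sequence: copied as-is.
            out.append(lead)
            i += 1
    return bytes(out)
```

Well-formed UTF-8 never uses 0xC0 or 0xC1 as a lead byte, so in this model the two cases cannot be confused, but finding them still requires inspecting the whole buffer.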

So utf-8-emacs is not the same as "internal multibyte representation"?
