Re: docs for insert-file-contents use 'bytes'

emacs-devel

From:	Stefan Monnier
Subject:	Re: docs for insert-file-contents use 'bytes'
Date:	Tue, 30 Sep 2008 11:58:12 -0400
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (gnu/linux)

>>> This is not a safe operation mode with multibyte sequences; is there a
>>> way to DTRT?  I'm specifically thinking about a paged buffer mode where
>>> you only see a small portion of the file (for editing large files, as we
>>> discussed in another newsgroup a while ago).

EZ> How about this idea: read a bit more than you want, then find safe
EZ> place to end this page-full?

> How do I find the next safe position in the byte flow?

It's a difficult problem for everyone, which is basically why Emacs
doesn't do it for you: I don't think anyone has made serious use of that
feature yet, so nobody has gone to the trouble of coming up with
a good solution.

Maybe you can simply look at the end of the previous insertion and count
the eight-bit-* chars that were inserted there (these correspond to the
bytes of the char that straddles the boundary), so as to find the end of
the last complete char you encountered.
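
As a rough illustration of the same idea outside Emacs, here is a minimal
Python sketch that finds where the last complete character ends in a chunk
of bytes. It assumes the file is UTF-8 and that the chunk starts on
a character boundary; the function name `utf8_safe_length` is made up for
this example and is not an Emacs or Python API.

```python
def utf8_safe_length(chunk: bytes) -> int:
    """Return how many leading bytes of chunk hold only complete chars.

    Assumption: chunk is UTF-8 and starts on a character boundary.
    """
    n = len(chunk)
    # A UTF-8 char is at most 4 bytes long, so the lead byte of the last
    # char (complete or not) is within the final 4 bytes.
    for back in range(1, min(4, n) + 1):
        byte = chunk[n - back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte (10xxxxxx): keep looking back
        if byte < 0x80:
            need = 1  # ASCII
        elif byte >> 5 == 0b110:
            need = 2
        elif byte >> 4 == 0b1110:
            need = 3
        elif byte >> 3 == 0b11110:
            need = 4
        else:
            need = 1  # invalid lead byte: treat it as a char of its own
        # If the last char has all its bytes, the whole chunk is safe;
        # otherwise stop just before its lead byte.
        return n if back >= need else n - back
    return n  # empty chunk, or no lead byte found in the last 4 bytes


data = "a\u00e9".encode("utf-8")      # b'a\xc3\xa9'
cut = data[:2]                         # the 2-byte char is cut in half
safe = utf8_safe_length(cut)           # only the "a" is complete
```

The next page would then start at the byte offset of this page plus
`safe`, so the straddling char is read whole the second time around.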

> I want to use it to implement a paged view of large files.  We discussed
> this in emacs-help and you suggested using insert-file-contents IIRC.

This is a very good application indeed.

> Because the text will be corrupted if you seek in the middle of a
> multibyte sequence, and there's no way to know in advance if a position
> is safe without at least some scanning.

It's not exactly "corrupted" in the sense that, while it is not
displayed correctly, it should be correctly saved back so no information
is lost.  Basically, some of the bytes are decoded with the wrong
coding-system, but this coding system is supposed to be safe.

No doubt it's not "good enough" in general.
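
By way of analogy only (this is Python, not what Emacs does internally),
the "displayed wrongly but saved back intact" behaviour can be sketched
with Python's `surrogateescape` error handler, which maps undecodable
bytes to placeholder code points and maps them back on encoding. The
helper name `roundtrip` is hypothetical.

```python
def roundtrip(page: bytes) -> bytes:
    """Decode a possibly truncated UTF-8 page, then encode it back."""
    # Bytes that cannot be decoded become lone surrogates (U+DC80..U+DCFF)
    # instead of raising an error or being dropped.
    text = page.decode("utf-8", errors="surrogateescape")
    # Encoding with the same handler turns those surrogates back into
    # the original bytes.
    return text.encode("utf-8", errors="surrogateescape")


data = "na\u00efve".encode("utf-8")   # the third char takes 2 bytes
page = data[:3]                        # cut in the middle of that char
restored = roundtrip(page)
```

The decoded text contains a stray placeholder where the half character
was, but `restored` is byte-for-byte equal to `page`.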

> There could be an insert-file-decoded-contents that seeks to a byte
> position and gets the next character at or after that position.  That's
> not too hard to implement and it's fast.

It wouldn't be good enough for your application because you might then
lose the chars that straddle a boundary.
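
To make the problem concrete, here is a small Python sketch, assuming
UTF-8 and fixed-size pages. Each page starts at the next character at or
after its byte offset, but still ends at a fixed byte offset. The names
`next_char_start` and `naive_pages` are invented for this illustration.

```python
def next_char_start(data: bytes, pos: int) -> int:
    """Return the first offset >= pos that is not a continuation byte."""
    while pos < len(data) and data[pos] & 0xC0 == 0x80:
        pos += 1
    return pos


def naive_pages(data: bytes, size: int) -> list:
    """Split data into pages of size bytes, decoding each on its own."""
    pages = []
    for start in range(0, len(data), size):
        begin = next_char_start(data, start)
        # The page still ends at a fixed byte offset, so a char cut by
        # that end is dropped here, and the next page skips its tail.
        pages.append(data[begin:start + size].decode("utf-8", errors="ignore"))
    return pages


data = "ab\u20accd".encode("utf-8")   # the 3-byte char sits at bytes 2..4
pages = naive_pages(data, 3)          # page boundary falls at byte 3
```

Joining `pages` gives back "abcd": the char that straddles the boundary
appears in neither page.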


        Stefan

Current Thread

docs for insert-file-contents use 'bytes', Ted Zlatanov, 2008/09/29
- Re: docs for insert-file-contents use 'bytes', Eli Zaretskii, 2008/09/29
  - Re: docs for insert-file-contents use 'bytes', Ted Zlatanov, 2008/09/29
    - Re: docs for insert-file-contents use 'bytes', Miles Bader, 2008/09/30
    - Re: docs for insert-file-contents use 'bytes', Eli Zaretskii, 2008/09/30
    - Re: docs for insert-file-contents use 'bytes', Ted Zlatanov, 2008/09/30
    - Re: docs for insert-file-contents use 'bytes', Stefan Monnier <=
    - Re: docs for insert-file-contents use 'bytes', Eli Zaretskii, 2008/09/30
    - Re: docs for insert-file-contents use 'bytes', Kenichi Handa, 2008/09/30
