Re: decode-coding-string gone awry?

emacs-devel

From:	Kenichi Handa
Subject:	Re: decode-coding-string gone awry?
Date:	Thu, 17 Feb 2005 21:08:02 +0900 (JST)
User-agent:	SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.3.50 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, Stefan Monnier <address@hidden> writes:

>>  Is it reasonable to operate with decode-coding-string on a multibyte
>>  string?  If that is nonsense, maybe we should make it get an error,
>>  to help people debug such problems.

> I think it would indeed make sense to signal errors when decoding
> a multibyte string or when encoding a unibyte string.

>>  If there are some few cases where decode-coding-string makes sense on
>>  a multibyte string, maybe we can make it get an error except in those
>>  few cases.

> The problem I suspect is that it's pretty common for ASCII-only strings to
> be arbitrarily marked unibyte or multibyte depending on the circumstance.
> So we would have to check for the case where the string is ASCII-only before
> signalling an error.

> I'm actually running right now with an Emacs that does signal such errors.
> I've changed the notion of "multibyte/unibyte" string by saying:
> - [same as now] if size_byte < 0, it's UNIBYTE.
> - [same as now] if size_byte > size, it's MULTIBYTE.
> - [changed]     if size_byte == size, it's neither/both (ASCII-only).

> Then I've changed several parts of the C code to try and set size_byte==size
> whenever possible (instead of marking the string as unibyte).

Even if size_byte == size, it may contain eight-bit-graphic
characters, and decoding such a string is a valid operation.
And even if size_byte > size, it may contain only ASCII,
eight-bit-graphic, and eight-bit-control characters.  Decoding
such a string is also a valid operation.
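
To make the point concrete, here is a minimal sketch in C of the
three-way rule quoted above and of why size_byte == size does not by
itself mean "ASCII-only".  The struct, the enum, and the function names
are hypothetical stand-ins for illustration; they are not the actual
Emacs definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical model of a string header.
   This is NOT the real Emacs Lisp string struct. */
struct model_string {
    ptrdiff_t size;             /* number of characters */
    ptrdiff_t size_byte;        /* number of bytes; negative means unibyte */
    const unsigned char *data;  /* raw bytes of the string */
};

enum string_kind { KIND_UNIBYTE, KIND_MULTIBYTE, KIND_EITHER };

/* The three-way rule from the quoted proposal. */
static enum string_kind classify(const struct model_string *s)
{
    if (s->size_byte < 0)
        return KIND_UNIBYTE;
    if (s->size_byte > s->size)
        return KIND_MULTIBYTE;
    return KIND_EITHER;         /* size_byte == size */
}

/* Return true only if every one of the nbytes bytes is below 0x80.
   The size fields alone cannot answer this; the bytes must be read. */
static bool ascii_only(const unsigned char *data, ptrdiff_t nbytes)
{
    for (ptrdiff_t i = 0; i < nbytes; i++)
        if (data[i] >= 0x80)
            return false;
    return true;
}
```

In this model, a three-byte string holding 'a', 0xE9, 'b' with
size == size_byte == 3 is classified as KIND_EITHER, yet ascii_only
returns false for it.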

It is not trivial work to change the current code (in
coding.c) so that it signals an error safely in the middle of
a code conversion.  So, to check whether decoding is valid, we
would have to check all characters in a string in advance,
which, I think, slows down the operation considerably.
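
The kind of advance check meant here could look roughly like the
following C sketch.  It assumes, purely for illustration, that the
string has already been split into integer character codes and that
codes 0 through 0xFF stand for ASCII plus the raw eight-bit
characters; both function names are hypothetical.  A real check would
have to parse the multibyte byte sequences first, which adds to the
cost.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: codes 0..0xFF cover ASCII and the raw
   eight-bit characters, i.e. the characters that may be decoded. */
static bool decodable_char(int c)
{
    return c >= 0 && c <= 0xFF;
}

/* Scan the whole string before any conversion starts.
   This is a full extra pass over the n characters, in addition to
   the pass the decoder itself makes. */
static bool can_decode(const int *chars, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!decodable_char(chars[i]))
            return false;
    return true;
}
```

The extra pass touches every character once, so the overhead grows
linearly with the length of the string.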

---
Ken'ichi HANDA
address@hidden
