Re: decode-coding-string gone awry?
From: Kenichi Handa
Subject: Re: decode-coding-string gone awry?
Date: Thu, 17 Feb 2005 21:08:02 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.3.50 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
In article <address@hidden>, Stefan Monnier <address@hidden> writes:
>> Is it reasonable to operate with decode-coding-string on a multibyte
>> string? If that is nonsense, maybe we should make it get an error,
>> to help people debug such problems.
> I think it would indeed make sense to signal errors when decoding
> a multibyte string or when encoding a unibyte string.
>> If there are some few cases where decode-coding-string makes sense on
>> a multibyte string, maybe we can make it get an error except in those
>> few cases.
> The problem I suspect is that it's pretty common for ASCII-only strings to
> be arbitrarily marked unibyte or multibyte depending on the circumstance.
> So we would have to check for the case where the string is ASCII-only before
> signalling an error.
> I'm actually running right now with an Emacs that does signal such errors.
> I've changed the notion of "multibyte/unibyte" string by saying:
> - [same as now] if size_byte < 0, it's UNIBYTE.
> - [same as now] if size_byte > size, it's MULTIBYTE.
> - [changed] if size_byte == size, it's neither/both (ASCII-only).
> Then I've changed several parts of the C code to try and set size_byte==size
> whenever possible (instead of marking the string as unibyte).
Even if size_byte == size, it may contain eight-bit-graphic
characters, and decoding such a string is a valid operation.
And even if size_byte > size, it may contain only ASCII,
eight-bit-graphic, and eight-bit-control characters. It's
also a valid operation to decode it.
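
To make the quoted three-way rule concrete, here is a minimal sketch in C.
The struct and the names toy_string, toy_kind and toy_classify are
hypothetical stand-ins, not the real Emacs definitions, and the last branch
(fewer bytes than characters) is an assumption the thread does not discuss.
Note that the classification looks only at the two length fields, so by
itself it says nothing about which characters the string holds.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Lisp string header; NOT the real Emacs struct. */
struct toy_string {
    ptrdiff_t size;       /* length in characters */
    ptrdiff_t size_byte;  /* length in bytes, negative for unibyte */
};

enum toy_kind {
    TOY_UNIBYTE,        /* size_byte < 0 */
    TOY_MULTIBYTE,      /* size_byte > size */
    TOY_NEITHER_BOTH,   /* size_byte == size */
    TOY_INVALID         /* assumption: anything else is malformed */
};

/* Classify a string from its length fields alone, following the
   three cases listed in the quoted proposal. */
static enum toy_kind toy_classify(const struct toy_string *s)
{
    assert(s != NULL);
    if (s->size_byte < 0)
        return TOY_UNIBYTE;
    if (s->size_byte > s->size)
        return TOY_MULTIBYTE;
    if (s->size_byte == s->size)
        return TOY_NEITHER_BOTH;
    return TOY_INVALID;
}
```

Because the contents are never inspected here, a string in either of the
last two valid classes could still be one that is legitimate to decode.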
It's not trivial work to change the current code (in
coding.c) to signal an error safely while doing a code
conversion. So, to check if decoding is valid or not, we
have to check all characters in a string in advance, which,
I think, slows down the operation considerably.
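
As a rough illustration of that advance check, the sketch below scans every
character before any decoding would start. It is a simplified model under
stated assumptions, not code from coding.c: characters are given as
already-extracted code values, values 0x00..0x7F stand for ASCII, values
0x80..0xFF stand for the eight-bit characters, and anything larger stands
for some other multibyte character. The names toy_char_is_raw_byte and
toy_decodable are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption of this model: code values up to 0xFF represent ASCII or
   eight-bit characters, i.e. something that can still be decoded. */
static bool toy_char_is_raw_byte(uint32_t c)
{
    return c <= 0xFF;
}

/* Return true if every character could be treated as a raw byte.
   The loop visits each character once, so the check costs time
   proportional to the string length on top of the decoding itself. */
static bool toy_decodable(const uint32_t *chars, size_t n)
{
    assert(chars != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!toy_char_is_raw_byte(chars[i]))
            return false;  /* found a character that is not a raw byte */
    }
    return true;
}
```

The scan can stop early on the first offending character, but for a valid
string it always reads the whole input, which is the extra pass in question.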
---
Ken'ichi HANDA
address@hidden