emacs-devel

Re: eight-bit char handling in emacs-unicode


From: Stefan Monnier
Subject: Re: eight-bit char handling in emacs-unicode
Date: 26 Nov 2003 09:14:03 -0500
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

>> but since the output is a unibyte string,
>> that restricts it to cases where the code-points can be encoded in 8 bits,
>> thus it doesn't sound very generic
> Yes.  But I thought generic or not is not a point here.

Except that if it's not generic (in the sense that it does not behave
meaningfully in all language environments), then it can't be used in generic
elisp code, right?

>> and I don't see any application for it
>> (nor do I see any practical difference with using encode-coding-string
>> since the output AFAIK would be the same).

> My examples show that we can't use encode-coding-string.
> How can we use encode-coding-string without knowing what
> coding system to use?  I haven't heard your answer yet.

I can't answer this question without knowing the answer to my question:
what is string-make-unibyte used for?  I'm not saying that we can do
something like:

  (defun string-make-unibyte (s) (encode-coding-string s <blabla>))

but I'm saying that everywhere the current string-make-unibyte is
used, we should be able to easily replace it by a call to
encode-coding-string or a call to my make-string-unibyte (which does
not pay attention to the language environment and only accepts multibyte
chars that correspond to bytes, i.e. eight-bit-control or
eight-bit-graphic, or ASCII, and multibyte chars whose internal code point
is 128-255).
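
A rough sketch of what such a strict conversion might look like, for
illustration only.  The name `my-strict-string-to-unibyte' is hypothetical,
and the code assumes a later Emacs (23 or newer) in which `unibyte-string'
exists and `multibyte-char-to-unibyte' returns -1 for a char that does not
represent a byte; neither assumption holds for the Emacs discussed in this
thread.

```elisp
;; Hypothetical sketch, not the function proposed in this thread.
;; Assumes Emacs 23+: `unibyte-string' exists and
;; `multibyte-char-to-unibyte' returns -1 for non-byte chars.
(defun my-strict-string-to-unibyte (s)
  "Return a unibyte string with one byte per character of S.
Signal an error if some character of S does not correspond to a byte.
The language environment is not consulted."
  (apply #'unibyte-string
         (mapcar (lambda (c)
                   ;; Chars below 256 and raw eight-bit chars map to a byte;
                   ;; anything else yields -1.
                   (let ((b (multibyte-char-to-unibyte c)))
                     (if (< b 0)
                         (error "Character %d does not correspond to a byte" c)
                       b)))
                 s)))
```

The result always has the same number of characters as the input, and
anything that is not a byte is rejected rather than silently mapped
through the current language environment.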

> But, my understanding is that
> string-make-unibyte/multibyte are designed not to change the
> number of characters to make the difference of
> unibyte/multibyte transparent in Lisp.

That is indeed an absolute requirement.

>> Of course: that's pretty much what I suggested: make-string-unibyte only
>> accepts multibyte chars that correspond to "bytes".

> I agree with that.  But, it just changes the behaviour of
> the function on error case.  It doesn't change the concept
> of what it does.

Except that I said "byte" not "code point", which makes a difference
in non-latin-1 locales.

>> I don't see any use of string-make-unibyte in your two examples.
> Again, I'd like to ask how to use encode-coding-string
> without knowing the proper coding-system in each case.

How could I know the coding-system to use when replacing
`string-make-unibyte' if I don't have any actual call to
string-make-unibyte to work with?


        Stefan



