Unibyte characters

emacs-devel

From:	Eli Zaretskii
Subject:	Unibyte characters
Date:	Fri, 31 Oct 2008 13:05:54 +0200

The ELisp manual has (in node "Text Representation") this explanation
of what a "unibyte character" is:

       In unibyte representation, each character occupies one byte and
    therefore the possible character codes range from 0 to 255.  Codes 0
    through 127 are ASCII characters; the codes from 128 through 255 are
    used for one non-ASCII character set [...]

But I think this is inaccurate and even misleading.  For starters,
unibyte buffers and strings can contain DBCS characters and UTF-8
encoded text, where a character certainly does not ``occupy one
byte''.

More generally, I think it is better to say that unibyte buffers and
strings hold raw 8-bit bytes, and that only for single-byte encodings,
such as ISO 8859-x and the single-byte Windows codepages, does each
such byte represent a single character.
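
To make the distinction concrete, here is a small sketch.  It is
written in Python rather than Emacs Lisp, and Python's bytes objects
merely stand in for the contents of a unibyte string; it says nothing
about how Emacs stores text internally.  The helper name `describe'
and the sample strings are my own, chosen only for illustration.

```python
def describe(text, encoding):
    """Return (number of characters, number of raw bytes) for text in encoding."""
    raw = text.encode(encoding)
    # Every element of a bytes object is a raw 8-bit value, 0 through 255.
    assert all(0 <= b <= 255 for b in raw)
    return len(text), len(raw)


word = "caf\u00e9"       # 4 characters; the last is e with acute accent
kanji = "\u65e5\u672c"   # 2 Japanese characters

# Single-byte encoding: one byte per character.
latin1 = describe(word, "iso-8859-1")

# UTF-8: the non-ASCII character needs two bytes.
utf8 = describe(word, "utf-8")

# A DBCS encoding (Shift_JIS): each of these characters needs two bytes.
sjis = describe(kanji, "shift_jis")

print("iso-8859-1:", latin1)
print("utf-8:     ", utf8)
print("shift_jis: ", sjis)
```

Only in the ISO 8859-1 case do the character count and the byte count
agree; in the other two the same kind of byte sequence holds fewer
characters than bytes, which is the situation the manual's wording
does not cover.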

Am I missing something?
