[Emacs-diffs] Changes to emacs/lispref/nonascii.texi
From: Eli Zaretskii
Subject: [Emacs-diffs] Changes to emacs/lispref/nonascii.texi
Date: Sun, 02 Nov 2003 01:46:27 -0500
Index: emacs/lispref/nonascii.texi
diff -c emacs/lispref/nonascii.texi:1.39 emacs/lispref/nonascii.texi:1.40
*** emacs/lispref/nonascii.texi:1.39 Mon Oct 6 12:59:45 2003
--- emacs/lispref/nonascii.texi Sun Nov 2 01:29:58 2003
***************
*** 4,14 ****
@c See the file elisp.texi for copying conditions.
@setfilename ../info/characters
@node Non-ASCII Characters, Searching and Matching, Text, Top
! @chapter Non-@sc{ascii} Characters
@cindex multibyte characters
! @cindex non-@sc{ascii} characters
This chapter covers the special issues relating to non-@sc{ascii}
characters and how they are stored in strings and buffers.
@menu
--- 4,14 ----
@c See the file elisp.texi for copying conditions.
@setfilename ../info/characters
@node Non-ASCII Characters, Searching and Matching, Text, Top
! @chapter Non-@acronym{ASCII} Characters
@cindex multibyte characters
! @cindex non-@acronym{ASCII} characters
This chapter covers the special issues relating to non-@acronym{ASCII}
characters and how they are stored in strings and buffers.
@menu
***************
*** 44,51 ****
@cindex unibyte text
In unibyte representation, each character occupies one byte and
therefore the possible character codes range from 0 to 255. Codes 0
! through 127 are @sc{ascii} characters; the codes from 128 through 255
! are used for one non-@sc{ascii} character set (you can choose which
character set by setting the variable @code{nonascii-insert-offset}).
@cindex leading code
--- 44,51 ----
@cindex unibyte text
In unibyte representation, each character occupies one byte and
therefore the possible character codes range from 0 to 255. Codes 0
! through 127 are @acronym{ASCII} characters; the codes from 128 through 255
! are used for one non-@acronym{ASCII} character set (you can choose which
character set by setting the variable @code{nonascii-insert-offset}).
@cindex leading code
***************
*** 134,147 ****
acceptable because the buffer's representation is a choice made by the
user that cannot be overridden automatically.
! Converting unibyte text to multibyte text leaves @sc{ascii} characters
unchanged, and likewise character codes 128 through 159. It converts
! the non-@sc{ascii} codes 160 through 255 by adding the value
@code{nonascii-insert-offset} to each character code. By setting this
variable, you specify which character set the unibyte characters
correspond to (@pxref{Character Sets}). For example, if
@code{nonascii-insert-offset} is 2048, which is @code{(- (make-char
! 'latin-iso8859-1) 128)}, then the unibyte non-@sc{ascii} characters
correspond to Latin 1. If it is 2688, which is @code{(- (make-char
'greek-iso8859-7) 128)}, then they correspond to Greek letters.
--- 134,147 ----
acceptable because the buffer's representation is a choice made by the
user that cannot be overridden automatically.
! Converting unibyte text to multibyte text leaves @acronym{ASCII} characters
unchanged, and likewise character codes 128 through 159. It converts
! the non-@acronym{ASCII} codes 160 through 255 by adding the value
@code{nonascii-insert-offset} to each character code. By setting this
variable, you specify which character set the unibyte characters
correspond to (@pxref{Character Sets}). For example, if
@code{nonascii-insert-offset} is 2048, which is @code{(- (make-char
! 'latin-iso8859-1) 128)}, then the unibyte non-@acronym{ASCII} characters
correspond to Latin 1. If it is 2688, which is @code{(- (make-char
'greek-iso8859-7) 128)}, then they correspond to Greek letters.
***************
*** 153,162 ****
text.
@defvar nonascii-insert-offset
! This variable specifies the amount to add to a non-@sc{ascii} character
when converting unibyte text to multibyte. It also applies when
@code{self-insert-command} inserts a character in the unibyte
! non-@sc{ascii} range, 128 through 255. However, the functions
@code{insert} and @code{insert-char} do not perform this conversion.
The right value to use to select character set @var{cs} is @code{(-
--- 153,162 ----
text.
@defvar nonascii-insert-offset
! This variable specifies the amount to add to a non-@acronym{ASCII} character
when converting unibyte text to multibyte. It also applies when
@code{self-insert-command} inserts a character in the unibyte
! non-@acronym{ASCII} range, 128 through 255. However, the functions
@code{insert} and @code{insert-char} do not perform this conversion.
The right value to use to select character set @var{cs} is @code{(-
***************
*** 263,269 ****
values in that range are valid. The values 128 through 255 are not
entirely proper in multibyte text, but they can occur if you do explicit
encoding and decoding (@pxref{Explicit Encoding}). Some other character
! codes cannot occur at all in multibyte text. Only the @sc{ascii} codes
0 through 127 are completely legitimate in both representations.
@defun char-valid-p charcode &optional genericp
--- 263,269 ----
values in that range are valid. The values 128 through 255 are not
entirely proper in multibyte text, but they can occur if you do explicit
encoding and decoding (@pxref{Explicit Encoding}). Some other character
! codes cannot occur at all in multibyte text. Only the @acronym{ASCII} codes
0 through 127 are completely legitimate in both representations.
@defun char-valid-p charcode &optional genericp
***************
*** 301,308 ****
characters, generally known as Big 5, is divided into two Emacs
character sets, @code{chinese-big5-1} and @code{chinese-big5-2}.
! @sc{ascii} characters are in character set @code{ascii}. The
! non-@sc{ascii} characters 128 through 159 are in character set
@code{eight-bit-control}, and codes 160 through 255 are in character set
@code{eight-bit-graphic}.
--- 301,308 ----
characters, generally known as Big 5, is divided into two Emacs
character sets, @code{chinese-big5-1} and @code{chinese-big5-2}.
! @acronym{ASCII} characters are in character set @code{ascii}. The
! non-@acronym{ASCII} characters 128 through 159 are in character set
@code{eight-bit-control}, and codes 160 through 255 are in character set
@code{eight-bit-graphic}.
***************
*** 336,343 ****
@cindex dimension (of character set)
In multibyte representation, each character occupies one or more
bytes. Each character set has an @dfn{introduction sequence}, which is
! normally one or two bytes long. (Exception: the @sc{ascii} character
! set and the @sc{eight-bit-graphic} character set have a zero-length
introduction sequence.) The introduction sequence is the beginning of
the byte sequence for any character in the character set. The rest of
the character's bytes distinguish it from the other characters in the
--- 336,343 ----
@cindex dimension (of character set)
In multibyte representation, each character occupies one or more
bytes. Each character set has an @dfn{introduction sequence}, which is
! normally one or two bytes long. (Exception: the @code{ascii} character
! set and the @code{eight-bit-graphic} character set have a zero-length
introduction sequence.) The introduction sequence is the beginning of
the byte sequence for any character in the character set. The rest of
the character's bytes distinguish it from the other characters in the
***************
*** 426,433 ****
@result{} (latin-iso8859-1 0)
@end example
! The character sets @sc{ascii}, @sc{eight-bit-control}, and
! @sc{eight-bit-graphic} don't have corresponding generic characters. If
@var{charset} is one of them and you don't supply @var{code1},
@code{make-char} returns the character code corresponding to the
smallest code in @var{charset}.
--- 426,433 ----
@result{} (latin-iso8859-1 0)
@end example
! The character sets @code{ascii}, @code{eight-bit-control}, and
! @code{eight-bit-graphic} don't have corresponding generic characters. If
@var{charset} is one of them and you don't supply @var{code1},
@code{make-char} returns the character code corresponding to the
smallest code in @var{charset}.
***************
*** 744,750 ****
return value is just one coding system, the one that is highest in
priority.
! If the region contains only @sc{ascii} characters, the value
is @code{undecided} or @code{(undecided)}.
@end defun
--- 744,750 ----
return value is just one coding system, the one that is highest in
priority.
! If the region contains only @acronym{ASCII} characters, the value
is @code{undecided} or @code{(undecided)}.
@end defun
***************
*** 846,857 ****
expression that matches certain file names. The element applies to file
names that match @var{pattern}.
! The @sc{cdr} of the element, @var{coding}, should be either a coding
system, a cons cell containing two coding systems, or a function name (a
symbol with a function definition). If @var{coding} is a coding system,
that coding system is used for both reading the file and writing it. If
! @var{coding} is a cons cell containing two coding systems, its @sc{car}
! specifies the coding system for decoding, and its @sc{cdr} specifies the
coding system for encoding.
If @var{coding} is a function name, the function must return a coding
--- 846,857 ----
expression that matches certain file names. The element applies to file
names that match @var{pattern}.
! The @acronym{CDR} of the element, @var{coding}, should be either a coding
system, a cons cell containing two coding systems, or a function name (a
symbol with a function definition). If @var{coding} is a coding system,
that coding system is used for both reading the file and writing it. If
! @var{coding} is a cons cell containing two coding systems, its @acronym{CAR}
! specifies the coding system for decoding, and its @acronym{cdr} specifies the
coding system for encoding.
If @var{coding} is a function name, the function must return a coding
***************
*** 975,981 ****
@example
;; @r{Read the file with no character code conversion.}
! ;; @r{Assume @sc{crlf} represents end-of-line.}
(let ((coding-system-for-write 'emacs-mule-dos))
(insert-file-contents filename))
@end example
--- 975,981 ----
@example
;; @r{Read the file with no character code conversion.}
! ;; @r{Assume @acronym{crlf} represents end-of-line.}
(let ((coding-system-for-write 'emacs-mule-dos))
(insert-file-contents filename))
@end example
***************
*** 1175,1183 ****
@section Input Methods
@cindex input methods
! @dfn{Input methods} provide convenient ways of entering non-@sc{ascii}
characters from the keyboard. Unlike coding systems, which translate
! non-@sc{ascii} characters to and from encodings meant to be read by
programs, input methods provide human-friendly commands. (@xref{Input
Methods,,, emacs, The GNU Emacs Manual}, for information on how users
use input methods to enter text.) How to define input methods is not
--- 1175,1183 ----
@section Input Methods
@cindex input methods
! @dfn{Input methods} provide convenient ways of entering non-@acronym{ASCII}
characters from the keyboard. Unlike coding systems, which translate
! non-@acronym{ASCII} characters to and from encodings meant to be read by
programs, input methods provide human-friendly commands. (@xref{Input
Methods,,, emacs, The GNU Emacs Manual}, for information on how users
use input methods to enter text.) How to define input methods is not