From 2e0e944840de65936a979b075aa2ea4177f49854 Mon Sep 17 00:00:00 2001
From: Richard Hansen
Date: Fri, 3 Jun 2022 01:04:41 -0400
Subject: [PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'

* doc/lispref/nonascii.texi (Converting Representations): Fix
erroneous description of `string-to-unibyte' (it does not signal an
error on eight-bit characters) and clarify its behavior. Update
documentation of `string-to-multibyte' to match.
---
 doc/lispref/nonascii.texi | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/doc/lispref/nonascii.texi b/doc/lispref/nonascii.texi
index d7d25dc36a..8746b79de8 100644
--- a/doc/lispref/nonascii.texi
+++ b/doc/lispref/nonascii.texi
@@ -271,20 +271,24 @@ Converting Representations
 @defun string-to-multibyte string
 This function returns a multibyte string containing the same sequence of
 characters as @var{string}. If @var{string} is a multibyte string,
-it is returned unchanged. The function assumes that @var{string}
-includes only @acronym{ASCII} characters and raw 8-bit bytes; the
-latter are converted to their multibyte representation corresponding
-to the codepoints @code{#x3FFF80} through @code{#x3FFFFF}, inclusive
-(@pxref{Text Representations, codepoints}).
+it is returned unchanged. Otherwise, byte values @code{#x00} through
+@code{#x7F} (@acronym{ASCII} characters) are mapped to their
+corresponding codepoints, and byte values @code{#x80} through
+@code{#xFF} (eight-bit characters) are mapped to codepoints
+@code{#x3FFF80} through @code{#x3FFFFF} (@pxref{Text Representations,
+codepoints}).
 @end defun
 
 @defun string-to-unibyte string
 This function returns a unibyte string containing the same sequence of
-characters as @var{string}. It signals an error if @var{string}
-contains a non-@acronym{ASCII} character. If @var{string} is a
-unibyte string, it is returned unchanged. Use this function for
-@var{string} arguments that contain only @acronym{ASCII} and eight-bit
-characters.
+characters as @var{string}. If @var{string} is a unibyte string, it
+is returned unchanged. Otherwise, codepoints @code{#x00} through
+@code{#x7F} (@acronym{ASCII} characters) are mapped to their
+corresponding byte values, and codepoints @code{#x3FFF80} through
+@code{#x3FFFFF} (eight-bit characters) are mapped to byte values
+@code{#x80} through @code{#xFF} (@pxref{Text Representations,
+codepoints}). It signals an error if any other codepoint is
+encountered.
 @end defun
 
 @defun byte-to-string byte
-- 
2.36.1
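
Reviewer note, not part of the patch: the mapping described in the revised text can be checked interactively with a short Emacs Lisp sketch such as the one below. The variable names `uni` and `multi` are illustrative only, and the sketch assumes an Emacs whose behavior matches the documentation as amended here.

```elisp
;; Illustrative sketch only; evaluate in *scratch* or with M-:.
;; `uni' and `multi' are hypothetical names chosen for this example.
(let* ((uni (unibyte-string #x61 #x80))   ; ASCII "a" followed by raw byte #x80
       (multi (string-to-multibyte uni)))
  (list (multibyte-string-p multi)        ; the result is a multibyte string
        (= (aref multi 0) #x61)           ; ASCII byte keeps its codepoint
        (= (aref multi 1) #x3FFF80)       ; byte #x80 maps to codepoint #x3FFF80
        ;; Converting back should yield the original bytes.
        (equal (string-to-unibyte multi) uni)))

;; A character that is neither ASCII nor eight-bit (here U+00E9) should
;; make `string-to-unibyte' signal an error, per the corrected description.
(condition-case err
    (string-to-unibyte (string #xE9))
  (error (list 'signaled (car err))))
```

The second form is the case the old text got wrong in the other direction: eight-bit characters convert cleanly, and only codepoints outside both ranges are rejected.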