
Re: string-as-unibyte

From: YAMAMOTO Mitsuharu
Subject: Re: string-as-unibyte
Date: Tue, 19 Jul 2005 07:41:33 +0900
User-agent: Wanderlust/2.14.0 (Africa) SEMI/1.14.6 (Maruoka) FLIM/1.14.6 (Marutamachi) APEL/10.6 Emacs/22.0.50 (sparc-sun-solaris2.8) MULE/5.0 (SAKAKI)

>>>>> On Mon, 18 Jul 2005 17:33:02 -0400, Stefan Monnier <address@hidden> said:

> Could you explain the need for the change below:

> 2005-07-16 YAMAMOTO Mitsuharu <address@hidden>

>       * mac.c [TARGET_API_MAC_CARBON] (Fmac_code_convert_string):
> Use Fstring_as_unibyte instead of string_make_unibyte.

The call is made while preparing the string for code conversion, so I
think the following comment in decode_coding_string (coding.c) also
applies to this case.

      /* Decoding routines expect the source text to be unibyte.  */
      str = Fstring_as_unibyte (str);
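
As a rough illustration of why the two functions are not interchangeable here, the sketch below contrasts them at the Lisp level. It assumes Emacs 23 or later, where characters are Unicode code points; the sample character and variable names are illustrative only and do not come from mac.c. `string-as-unibyte` keeps the bytes of the internal representation and merely relabels the string as unibyte, whereas `string-make-unibyte` converts character by character, so the byte sequence a decoder needs is lost for characters outside the 8-bit range.

```elisp
;; Sketch only: compare the two unibyte conversions on one character.
;; Both functions are obsolete in recent Emacs versions but still defined.
(let* ((str  (string #x3042))             ; one non-ASCII multibyte character
       (as   (string-as-unibyte str))     ; same bytes, relabelled as unibyte
       (made (string-make-unibyte str)))  ; converted one character at a time
  (list
   ;; `string-as-unibyte' preserves every byte of the internal representation.
   (= (length as) (string-bytes str))
   ;; `string-make-unibyte' yields one byte per original character.
   (= (length made) (length str))
   ;; Both results are unibyte strings.
   (not (multibyte-string-p as))
   (not (multibyte-string-p made))))
```

A decoding routine that walks the source as raw bytes can only work on the first form, which matches the comment quoted from coding.c.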

> My experience is that string-as-unibyte is extremely rarely the
> right answer to solve a problem.  If you described your motivation,
> I could add a comment in the code making it clear why this is needed
> here (or else come up with a better solution).

I was trying to make a coding system that almost works as utf-8, but
additionally does "HFS+ composition" (canonical composition with some
exclusions) on decoding.
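
For readers unfamiliar with the term, the following sketch shows what such a composition does to a decomposed string. It relies on the `ucs-normalize` library that ships with later Emacs versions (23 and up), which is an assumption relative to this message and is not the `mac-code-convert-string` path discussed here; the function name `example-hfs-compose` is hypothetical.

```elisp
;; Sketch only: HFS+-style canonical composition via ucs-normalize.
(require 'ucs-normalize)

(defun example-hfs-compose (str)
  "Return STR with HFS+-style canonical composition applied."
  (ucs-normalize-HFS-NFC-string str))

;; "e" followed by COMBINING ACUTE ACCENT (U+0301), the decomposed form
;; in which HFS+ stores file names, should come back as a single
;; precomposed character.
(example-hfs-compose (string ?e #x301))
```

The "exclusions" are the code point ranges that HFS+ leaves untouched when it decomposes file names, which is why plain NFC is not an exact substitute.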

                                     YAMAMOTO Mitsuharu

;; For the Carbon port, Mac OS X 10.2 or later.
(make-coding-system
 'mac-hfs+
 (coding-system-type 'utf-8)
 (coding-system-mnemonic 'utf-8)
 "Like utf-8, but additionally does Mac HFS+ composition on decoding."
 (coding-system-flags 'utf-8)
 (list (cons 'safe-charsets (coding-system-get 'utf-8 'safe-charsets))
       '(post-read-conversion . mac-hfs+-post-read-conversion)
       '(pre-write-conversion . mac-hfs+-pre-write-conversion)))

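(defun mac-hfs+-post-read-conversion (length)
  (save-excursion
    (save-restriction
      (narrow-to-region (point) (+ (point) length))
      (let ((str (mac-code-convert-string (buffer-string)
                                          'utf-8 'utf-8 'HFS+C)))
        ;; Replace the raw text with its composed form when the
        ;; conversion succeeded.
        (when str
          (delete-region (point-min) (point-max))
          (insert (if enable-multibyte-characters
                      (string-as-multibyte str) str)))
        (setq length (decode-coding-region (point-min) (point-max) 'utf-8))
        ;; We are inside a post-read-conversion function, so the
        ;; original post-read-conversion for utf-8 is not
        ;; automatically called.
        (goto-char (point-min))
        ;; Fall back to `identity' so that LENGTH is returned unchanged
        ;; when utf-8 has no post-read-conversion of its own.
        (funcall (or (coding-system-get 'utf-8 'post-read-conversion)
                     'identity)
                 length)))))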

(defun mac-hfs+-pre-write-conversion (beg end)
  (funcall (or (coding-system-get 'utf-8 'pre-write-conversion) 'ignore)
           beg (+ beg (encode-coding-region beg end 'utf-8))))

(setq default-file-name-coding-system 'mac-hfs+)
