help-gnu-emacs

call-process -> insert -> iso-latin-1-dos problem on Windows


From: Eduardo Ochs
Subject: call-process -> insert -> iso-latin-1-dos problem on Windows
Date: Sun, 14 Jan 2024 20:09:41 -0300

Hi list,

I have a function called `find-wget' that works well on *NIX-like
systems: it calls wget and puts the output in a temporary buffer, and
on those systems Emacs always chooses the right encoding. But when I
run it on Windows and call wget like this,

  wget -q -O - http://anggtwu.net/LUA/Dang1.lua

where Dang1.lua is a file in UTF-8, Emacs switches the encoding of the
output buffer to iso-latin-1-dos...
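
The symptom can be reproduced without wget or Windows. The sketch below is only an illustration: it assumes the problem is that the UTF-8 bytes are being decoded as Latin-1, and the names prefixed with `my-' are hypothetical and not part of eev. It decodes the two bytes c2 ab both ways and shows that text mis-decoded as Latin-1 can be re-encoded and decoded again as UTF-8.

```elisp
;; The two bytes that UTF-8 uses for U+00AB, as a unibyte string.
(defvar my-guillemet-bytes (unibyte-string #xC2 #xAB)
  "Hypothetical sample data: the raw bytes c2 ab.")

;; Decoded as UTF-8 the bytes become one character, U+00AB.
(defvar my-as-utf-8
  (decode-coding-string my-guillemet-bytes 'utf-8))

;; Decoded as Latin-1 every byte becomes its own character,
;; so the result is U+00C2 followed by U+00AB.
(defvar my-as-latin-1
  (decode-coding-string my-guillemet-bytes 'iso-latin-1))

(defun my-redecode-as-utf-8 (str)
  "Undo a wrong Latin-1 decoding of STR by decoding its bytes as UTF-8.
Hypothetical helper; it assumes every character of STR is in Latin-1."
  (decode-coding-string (encode-coding-string str 'iso-latin-1) 'utf-8))

;; Number of characters obtained with each decoding.
(list (length my-as-utf-8) (length my-as-latin-1))
```

If the Latin-1 result has two characters where the UTF-8 result has one, that matches what the Windows buffer shows.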

I probably wrote my code relying on undefined behavior... any
suggestions on how to fix it? I'm attaching the file with the test and
the comments below, and it's also here:

  http://anggtwu.net/elisp/find-wget-jan-2024.el.html
  http://anggtwu.net/elisp/find-wget-jan-2024.el

Thanks in advance =/,
  Eduardo Ochs
  http://anggtwu.net/eepitch.html
  http://anggtwu.net/#eev


--snip--snip--

;; This is a simplified version of the `find-wget' from eev:
;;
;;   http://anggtwu.net/eev-current/eev-plinks.el.html#find-wget
;;                       (find-eev "eev-plinks.el" "find-wget")
;;
;; Most functions were copied from the source code of eev without
;; changes; only the ones that are marked as "dummified" were replaced
;; by trivial versions.

(defvar ee-wget-program "wget")
(defvar ee-find-callprocess00-exit-status nil)

;; Dummified versions
(defun ee-expand (fname) fname)
(defun ee-goto-rest (list) ())
(defun ee-goto-position (&optional pos-spec &rest rest) ())

(defun find-ebuffer (buffer &rest pos-spec-list)
  "Hyperlink to an Emacs buffer (existing or not)."
  (interactive "bBuffer: ")
  (switch-to-buffer buffer)
  (apply 'ee-goto-position pos-spec-list))

(defun ee-split (str)
  "If STR is a string, split it on whitespace and return the resulting list.
If STR is a list, return it unchanged."
  (if (stringp str)
      (split-string str "[ \t\n]+")
    str))

(defun find-callprocess00-ne (program-and-args)
  (let ((argv (ee-split program-and-args)))
    (with-output-to-string
      (with-current-buffer standard-output
        (setq ee-find-callprocess00-exit-status
              (apply 'call-process (car argv) nil t nil (cdr argv)))))))

(defun find-wget (url &rest pos-spec-list)
  "Download URL with \"wget -q -O - URL\" and display the output.
If a buffer named \"*wget: URL*\" already exists then this
function visits it instead of running wget again.
If wget can't download URL then this function runs `error'."
  (let* ((eurl (ee-expand url))
         (wgetprogandargs (list ee-wget-program "-q" "-O" "-" eurl))
         (wgetbufname (format "*wget: %s*" eurl)))
    (if (get-buffer wgetbufname)
        (apply 'find-ebuffer wgetbufname pos-spec-list)
      ;;
      ;; If the buffer wgetbufname doesn't exist, then:
      (let* ((wgetoutput (find-callprocess00-ne wgetprogandargs))
             (wgetstatus ee-find-callprocess00-exit-status))
        ;;
        (if (not (equal wgetstatus 0))
            ;; See: (find-node "(wget)Exit Status")
            (error "wget can't download: %s" eurl))
        ;;
        (find-ebuffer wgetbufname) ; create buffer
        (insert wgetoutput)
        (goto-char (point-min))
        (apply 'ee-goto-position pos-spec-list)))))


;; Test: (eval-buffer)
;;       (find-wget "http://anggtwu.net/LUA/Dang1.lua")
;;
;; When we run the test above on Debian the double angle brackets in
;; line 12 of Dang1.lua are displayed correctly as single characters,
;; and when we run `M-x hexlify-buffer' we see that they are encoded
;; in two bytes each: c2ab and c2bb. From
;; /usr/share/unicode/UnicodeData.txt:
;;
;; 00AB;LEFT-POINTING DOUBLE ANGLE QUOTATION MARK;Pi;0;ON;;;;;Y;LEFT POINTING GUILLEMET;;;;
;; 00BB;RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK;Pf;0;ON;;;;;Y;RIGHT POINTING GUILLEMET;;;;
;;
;; When we run the `find-wget' above in Emacs 29 for Windows the
;; resulting buffer is put in the encoding "iso-latin-1-dos". `M-x
;; hexlify-buffer' shows that they are still two bytes each, c2ab and
;; c2bb, but each one is displayed as two characters, the first being
;; the character for the byte c2:
;;
;; 00C2;LATIN CAPITAL LETTER A WITH CIRCUMFLEX;Lu;0;L;0041 0302;;;;N;LATIN CAPITAL LETTER A CIRCUMFLEX;;;00E2;
;;
;; The wget that I am using on Windows was extracted from this zip:
;;
;;   https://eternallybored.org/misc/wget/releases/wget-1.21.2-win64.zip
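
One possible direction, offered as a sketch and not as a tested fix for the Windows case: `call-process' decodes the program's output with a coding system that Emacs chooses from variables such as `coding-system-for-read', `process-coding-system-alist' and `default-process-coding-system', and on Windows the default may follow the system codepage and not UTF-8. Binding `coding-system-for-read' around the call forces the decoding. The helper name below is hypothetical and not part of eev, and it assumes the downloaded file really is UTF-8.

```elisp
(defun my-call-process-to-string-utf-8 (program &rest args)
  "Run PROGRAM with ARGS and return its standard output decoded as UTF-8.
Hypothetical helper, not part of eev.  The exit status is discarded."
  (with-temp-buffer
    ;; `coding-system-for-read' overrides the coding system that
    ;; `call-process' would otherwise pick for decoding the output.
    (let ((coding-system-for-read 'utf-8))
      (apply #'call-process program nil t nil args))
    (buffer-string)))

;; Example call (not evaluated here; it needs wget and network access):
;; (my-call-process-to-string-utf-8
;;  "wget" "-q" "-O" "-" "http://anggtwu.net/LUA/Dang1.lua")
```

Because `coding-system-for-read' is a dynamically bound variable, the same `let' could instead be wrapped around the `call-process' form inside `find-callprocess00-ne', which would keep the exit status handling as it is. Forcing UTF-8 for every download would be wrong for files in other encodings, so this is a trade-off and not a general answer.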


