Re: [vile] problem with 'wide characters' (utf-8) under macosx


From: Thomas Dickey
Subject: Re: [vile] problem with 'wide characters' (utf-8) under macosx
Date: Sat, 6 Dec 2014 12:46:06 -0500
User-agent: Mutt/1.5.20 (2009-06-14)

On Sat, Dec 06, 2014 at 11:15:36AM +0100, j. van den hoff wrote:
> forgot to Cc the list. sorry for the noise, brendan ....
> 
> On Sat, 06 Dec 2014 07:49:29 +0100, Brendan O'Dea <address@hidden> wrote:
> 
> >On 6 December 2014 at 09:39, j. van den hoff
> ><address@hidden> wrote:
> >>[...] I want to use it in the native `Terminal.app' coming
> >>with macos. here's the problem: despite `Terminal.app' being set up for
> >>utf-8 character encoding, vile displays
> >>non-ascii characters by their hexcode such as \u00E4. [...]
> >
> >Hi Joerg,
> 
> hi brendan,
> 
> >
> >Could you paste the contents of the buffer produced by :show-variables
> >when you are in a file which has such a problem?
> 
> sure. I've saved this list for _both_ cases: editing from within urxvt
> (where everything is fine) and from within `Terminal.app' (where it is
> displaying the utf-8 hexcodes). here I list only the differences:
> 
> urxvt:                           Terminal.app:
> ======                           =============
> $curcol = 1                   |  $curcol = 6
> $encoding =                   |  $encoding = UTF-8
> $lcols = 9                    |  $lcols = 14
> $locale = de_DE               |  $locale = UTF-8
> $pagelen = 50                 |  $pagelen = 56
> $pagewid = 141                |  $pagewid = 181
> $pid = 33249                  |  $pid = 33243
> $term-cols = 141              |  $term-cols = 181
> $term-encoding = utf-8        |  $term-encoding = locale
> $term-lines = 50              |  $term-lines = 56
> $wlines = 48                  |  $wlines = 54

Testing the port (which seems to be old: it is "9.8", while "9.8o" is current),
I don't see any encoding differences.
 
> most of these differences are obviously irrelevant but the encoding
> related values differ, too...
> I think I will start to read up, what exactly they mean in `vile --help'
> ...
> 
> >
> >The output of the locale command from the shell and the value of $TERM
> >may also be useful.
> 
> the problem might lie in this area. in urxvt I get
> 
> LANG=
> LC_COLLATE="C"
> LC_CTYPE="de_DE.UTF-8"
> LC_MESSAGES="C"
> LC_MONETARY="C"
> LC_NUMERIC="C"
> LC_TIME="C"
> LC_ALL=

I have something comparable in uxterm (started in OSX):

        LANG=
        LC_COLLATE="C"
        LC_CTYPE="en_US.UTF-8"
        LC_MESSAGES="C"
        LC_MONETARY="C"
        LC_NUMERIC="C"
        LC_TIME="C"
        LC_ALL=

However - see below.

> where I explicitly set LC_CTYPE to that value in (the equivalent of)
> .xinitrc so that it is defined when the x11 window manager starts up (but
> is ignored, of course, by Terminal.app...). in Terminal.app I get instead:
> 
> LANG=
> LC_COLLATE="C"
> LC_CTYPE="UTF-8"
> LC_MESSAGES="C"
> LC_MONETARY="C"
> LC_NUMERIC="C"
> LC_TIME="C"
> LC_ALL=

I see - I have this in Terminal.app:

        LANG="en_US.UTF-8"
        LC_COLLATE="en_US.UTF-8"
        LC_CTYPE="en_US.UTF-8"
        LC_MESSAGES="en_US.UTF-8"
        LC_MONETARY="en_US.UTF-8"
        LC_NUMERIC="en_US.UTF-8"
        LC_TIME="en_US.UTF-8"
        LC_ALL=

In its preferences ("Advanced" tab), I have
        Character encoding: Unicode (UTF-8)
        Set locale environment variables on startup

Generally I don't set locale variables in my shell startup scripts
(for special cases, I set those in scripts around certain programs).

> which conforms to what I can select under "character encoding" in the
> `preferences' settings of that program. so it's not exactly the same
> locale but my (limited) understanding of these things is that "UTF-8"
> alone should suffice and the country specific qualifier (de_DE for me) has
> not much of an influence? (and both terminals identify as xterm-color).

Not exactly.  One might suppose that the names are well-standardized, but
they are not.  By itself, for instance, "UTF-8" as a locale setting likely
refers to an alias.  The names that I'm accustomed to using are those found
using "locale -a".

vile differs from the other editors because it will (if available)
use the "de_DE" part to infer a useful value for the "8bit" encoding.
(vile has built-in locale tables, about 70kb, so that it can do this
even if "de_DE" itself is not supplied on your machine.)
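
To illustrate why a bare "UTF-8" gives less to work with than "de_DE.UTF-8",
here is a rough Python sketch that splits a locale name of the common form
language[_territory][.codeset][@modifier].  It is not vile's code; the
function names are made up for this example, and it assumes the name follows
that form at all (which, as noted, is not guaranteed).

```python
def split_locale_name(name):
    """Split 'language[_territory][.codeset][@modifier]' into its parts.

    Missing parts come back as empty strings.
    """
    base, _, modifier = name.partition("@")
    base, _, codeset = base.partition(".")
    language, _, territory = base.partition("_")
    return {
        "language": language,
        "territory": territory,
        "codeset": codeset,
        "modifier": modifier,
    }


def language_hint(name):
    """Return the 'de_DE'-style part of a locale name, or None if absent.

    Simplification for this sketch: a name without a territory is treated
    as carrying no usable language/territory information.
    """
    parts = split_locale_name(name)
    if parts["territory"]:
        return parts["language"] + "_" + parts["territory"]
    return None


if __name__ == "__main__":
    for name in ("de_DE.UTF-8", "de_DE", "UTF-8"):
        print(name, split_locale_name(name), language_hint(name))
```

With "de_DE.UTF-8" the sketch yields a "de_DE" part; with "UTF-8" alone
there is no territory, so nothing comparable can be derived from the name.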

I experimented a little, and see that your locale settings are confusing
vile.  You can see this best by ":show-printable" and looking at the bottom
of the page (codes are showing as hexadecimal).

Using "de_DE.UTF-8" throughout (actually LC_CTYPE is the important one),
I don't see the hexadecimal characters in "9.8" or the current version.
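
For reference, the value that a program sees for LC_CTYPE follows the usual
POSIX precedence: LC_ALL if set and non-empty, otherwise LC_CTYPE itself,
otherwise LANG.  A small Python sketch of that lookup, applied to text in the
style of the "locale" output quoted above (the function names are
hypothetical, and the "C" fallback is an assumption about the typical
default):

```python
def parse_locale_output(text):
    """Parse NAME=value lines as printed by the 'locale' command.

    Surrounding double quotes and leading '>' quote markers are dropped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip().lstrip(">").strip()
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def effective_ctype(settings):
    """Return the locale used for LC_CTYPE: LC_ALL, then LC_CTYPE, then LANG."""
    for key in ("LC_ALL", "LC_CTYPE", "LANG"):
        if settings.get(key):
            return settings[key]
    return "C"  # assumed default when nothing is set


if __name__ == "__main__":
    sample = 'LANG=\nLC_COLLATE="C"\nLC_CTYPE="UTF-8"\nLC_ALL=\n'
    print(effective_ctype(parse_locale_output(sample)))
```

Fed the Terminal.app listing from the original report, this returns "UTF-8",
which is the value that confuses vile here.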

This might be related to the "??" problem - I'm not sure.

("xterm-color" is problematic as well - a different topic).

> the strange thing is that several other editors
> recognize these settings in a way that utf-8 is displayed correctly in
> both terminal emulators.

-- 
Thomas E. Dickey <address@hidden>
http://invisible-island.net
ftp://invisible-island.net
