Re: [vile] problem with 'wide characters' (utf-8) under macosx

From: j. van den hoff
Subject: Re: [vile] problem with 'wide characters' (utf-8) under macosx
Date: Sat, 06 Dec 2014 23:43:01 +0100
User-agent: Opera Mail/12.12 (MacIntel)

On Sat, 06 Dec 2014 22:53:46 +0100, Thomas Dickey <address@hidden> wrote:

On Sat, Dec 06, 2014 at 09:45:04PM +0100, j. van den hoff wrote:
>In its preferences ("Advanced" tab), I have
>    Character encoding: Unicode (UTF-8)
>    Set locale environment variables on startup

I have exactly the same there but end up with the strange `locale' settings
including LC_CTYPE=UTF-8.  this definitely is no longer a vile related
question, but do you have any idea from where it derives its
information about _what_ locale environment vars to set? (even in your case
they are not the same as in uxterm -- with the lucky exception of LC_CTYPE.)

hmm - no, I don't... When I set up my Macs (both macmini servers), I didn't
delve into their locale settings.  In the system preferences, I see the
language/region part, which is English/United States - which is probably
where it gets its information from.  I do recall that initially

that sounds probable. and maybe I managed to confuse it in that area by using
a custom setup declaring the region as 'Germany' and the currency as 'Euro' but
using English as format language and `.' as decimal separator -- that might
be the reason the algorithm gave up deciding which LC_CTYPE localisation
is right and fell back to just setting it to `UTF-8'.

OSX wasn't making useful locale settings that I could pass via ssh -- I used
to just override it on the remote end.  I'm running last year's release
Mavericks on both machines (am still seeing X as buggy in yosemite).

good to know. I, too, stick with 10.9.5, waiting until yosemite has converged
to something really usable (hopefully).

>Generally I don't set locale variables in my shell startup scripts
>(for special cases, I set those in scripts around certain programs).
>>which conforms to what I can select under "character encoding" in the
>>`preferences' settings of that program. so it's not exactly the same
>>locale but my (limited) understanding of these things is that "UTF-8"
>>alone should suffice and the country specific qualifier (de_DE for me) has
>>not much of an influence? (and both terminals identify as xterm-color).
>Not exactly. One might suppose that the names are well-standardized, but
>they are not.  By itself, for instance, "UTF-8" as a locale setting likely
>refers to an alias.  The names that I'm accustomed to using are those found
>using "locale -a".

understood -- but I have no idea whatsoever _how_ that `locale' setting
in comes about ...
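
The lookup against "locale -a" mentioned above can be sketched in Python. This is only an illustration, not something vile does itself: the helper names `available_locales` and `is_known_locale` are made up here, and ignoring case and the hyphen in the codeset is an assumption about how systems spell it (e.g. "UTF-8" vs "utf8").

```python
import subprocess


def available_locales():
    """Return the names printed by `locale -a`, one per list entry."""
    result = subprocess.run(
        ["locale", "-a"],
        capture_output=True,
        text=True,
        errors="replace",  # some systems list names in a non-UTF-8 encoding
        check=True,
    )
    return result.stdout.splitlines()


def is_known_locale(name, available):
    """Return True if `name` matches one of the `available` locale names.

    The comparison ignores case and hyphens (an assumption, see above),
    so "de_DE.UTF-8" and "de_DE.utf8" are treated as the same name.
    """
    def normalize(s):
        return s.strip().lower().replace("-", "")

    wanted = normalize(name)
    return any(normalize(entry) == wanted for entry in available)
```

Calling `is_known_locale(os.environ.get("LC_CTYPE", ""), available_locales())` would then tell whether the current LC_CTYPE value is one of the names the system itself lists (this needs `import os`).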

>vile's different from the other editors because it will (if available)
>use the "de_DE" to infer a useful value for the "8bit" encoding.
>(vile has built-in locale tables in case "de_DE" itself is not supplied
>on your machine, so that it can do this - about 70kb).

I see.

right.  If it cannot find the useful value, then it falls back to POSIX
or Latin-1, depending. The ":show-printable" I saw today looked like POSIX.

(I probably should revisit this and attempt to improve it - but the port
is old, too ...)

maybe you could get them to use the current version?

>I experimented a little, and see that your locale settings are confusing
>vile.  You can see this best by ":show-printable" and looking at the bottom
>of the page (codes are showing as hexadecimal).
>Using "de_DE.UTF-8" throughout (actually LC_CTYPE is the important one),
>I don't see the hexadecimal characters in "9.8" or the current version.

I see something similar but not quite: in and
with the `UTF-8' value for LC_CTYPE I see hexcodes for positions
128-159 (\x80 - \x9F)
and a verbatim `?' for positions 160-255. If I then manually set
LC_CTYPE=de_DE.UTF-8 in that window and restart vile I

1) still get the hexcodes for pos. 128-159 (but the same happens in urxvt)
2) get regular chars for 160-255
3) most important: the `??' problem when entering diacritical characters such as ü vanishes

only problem: I don't see any way to convince to use a
valid (fully qualified) value for LC_CTYPE
in the first place...
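
One way around this, sketched in Python: wrap the editor start and repair the environment first. The sketch assumes the usual POSIX precedence for the character-type category (LC_ALL, then LC_CTYPE, then LANG). The function names, the regular expression for "fully qualified" and the "de_DE.UTF-8" fallback are illustrative choices, not anything vile or the terminal program provides.

```python
import re

# language_TERRITORY.codeset[@modifier], e.g. "de_DE.UTF-8"
# (an illustrative definition of "fully qualified"; "C" and "POSIX" do not match)
_QUALIFIED = re.compile(r"^[a-z]{2,3}_[A-Z]{2}\.[A-Za-z0-9-]+(@\w+)?$")


def effective_ctype(env):
    """Return the value that decides LC_CTYPE: LC_ALL, then LC_CTYPE, then LANG.

    Empty values count as unset; with nothing set the result is "C".
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"


def repaired_env(env, fallback="de_DE.UTF-8"):
    """Return a copy of `env` whose LC_CTYPE is a fully qualified name.

    If the effective value (e.g. a bare "UTF-8") is not fully qualified,
    LC_CTYPE is set to `fallback` and LC_ALL is dropped, because LC_ALL
    would otherwise override LC_CTYPE again.
    """
    fixed = dict(env)
    if not _QUALIFIED.match(effective_ctype(env)):
        fixed["LC_CTYPE"] = fallback
        fixed.pop("LC_ALL", None)
    return fixed
```

A small launcher script could then end with `os.execvpe("vile", ["vile"] + sys.argv[1:], repaired_env(os.environ))` (after `import os, sys`), so that only the editor sees the corrected value and the shell's own settings stay untouched.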

>This might be related to the "??" problem - I'm not sure.

bingo ;-), see above (thanks!). the whole thing remains confusing for me, though.
for one, I don't understand in which way the LC_CTYPE=UTF-8 setting is
confusing vile (since, as explained, at least after a redraw the entered `ü'
(and similar) are rendered correctly in the buffer, while not being displayed
in the show-printable output). but obviously there'll be some hidden reason
for this. the other thing which remains unclear for me is how I manage to end
up with LC_CTYPE=UTF-8 in the first place. but that's
probably not a problem for this list...

>("xterm-color" is problematic as well - a different topic).

is it? what do you recommend here?

The closest for would be from ncurses:
which I added in 2012.

But for Macs that's hard:
        a) the terminal database in /usr/share/terminfo is very old.
        b) only has certain settings (actually in Mavericks,
           none are "xterm-color").  It does have "nsterm", which was

$TERM is reported as `xterm-color' in the window irrespective of whether it
is declared as `xterm-256color' or `nsterm' in the preferences. not sure
what actually is going on here :-(

           added to ncurses in 2001.  A quick check with that entry shows
           some problem with line-drawing.
        c) Unlike the Linux and *BSD's, the port for ncurses is still 5.9
           release (2011).  That includes the terminal database in

As xterm-color, the function keys are half-right.

OK, thx for clarifying.
