Re: [vile] problem with 'wide characters' (utf-8) under macosx


From: Thomas Dickey
Subject: Re: [vile] problem with 'wide characters' (utf-8) under macosx
Date: Sat, 6 Dec 2014 16:53:46 -0500
User-agent: Mutt/1.5.20 (2009-06-14)

On Sat, Dec 06, 2014 at 09:45:04PM +0100, j. van den hoff wrote:
> >In its preferences ("Advanced" tab), I have
> >     Character encoding: Unicode (UTF-8)
> >     Set locale environment variables on startup
> 
> I have exactly the same there but end up with the strange `locale' settings
> including LC_CTYPE=UTF-8.  this definitely is no longer a vile related
> question but do you have any idea from where Terminal.app derives its
> information _what_ locale environment vars to set (even in your case they
> are not the same -- with the lucky exception of LC_CTYPE -- as in uxterm).

hmm - no, I don't...  When I set up my Macs (both Mac mini servers), I didn't
delve into their locale settings.  In the system preferences, I see the
language/region part, which is English/United States - which is probably
where Terminal.app gets its information from.  I do recall that initially
OSX wasn't making useful locale settings that I could pass via ssh -- I used
to just override them on the remote end.  I'm running last year's release
(Mavericks) on both machines (am still seeing X as buggy in Yosemite).
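
For the "override on the remote end" case, a minimal sketch (in Python) of
checking whether a value such as the one in LC_CTYPE matches a name printed
by "locale -a".  The helper names are made up for illustration, and folding
the codeset spelling ("UTF-8" vs. "utf8") is an assumption about how systems
differ in listing the same locale:

```python
import subprocess

def normalize(name):
    """Fold codeset spelling so 'de_DE.UTF-8' and 'de_DE.utf8' compare equal."""
    base, dot, codeset = name.partition(".")
    if not dot:
        return base
    return base + "." + codeset.lower().replace("-", "")

def is_available(name, available):
    """True if `name` matches one of the names in `available`."""
    wanted = normalize(name)
    return any(normalize(entry) == wanted for entry in available)

def installed_locales():
    """Names printed by `locale -a` (empty list if the command is missing)."""
    try:
        result = subprocess.run(["locale", "-a"], capture_output=True,
                                encoding="utf-8", errors="replace", check=False)
    except OSError:
        return []
    return result.stdout.split()
```

Something like is_available(os.environ.get("LC_CTYPE", ""), installed_locales())
could then decide whether to override the value in a wrapper script.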
 
> >Generally I don't set locale variables in my shell startup scripts
> >(for special cases, I set those in scripts around certain programs).
> >
> >>which conforms to what I can select under "character encoding" in the
> >>`preferences' settings of that program. so it's not exactly the same
> >>locale but my (limited) understanding of these things is that "UTF-8"
> >>alone should suffice and the country specific qualifier (de_DE for me)
> >>has not much of an influence? (and both terminals identify as xterm-color).
> >
> >Not exactly.  One might suppose that the names are well-standardized, but
> >they are not.  By itself, for instance, "UTF-8" as a locale setting likely
> >refers to an alias.  The names that I'm accustomed to using are those found
> >using "locale -a".
> 
> understood -- but I have no idea whatsoever _how_ that `locale' setting
> in Terminal.app comes about ...
> 
> >
> >vile's different from the other editors because it will (if available)
> >use the "de_DE" to infer a useful value for the "8bit" encoding.
> >(vile has built-in locale tables, about 70kb, in case "de_DE" itself is
> >not supplied on your machine, so that it can still do this).
> 
> I see.

right.  If it cannot find the useful value, then it falls back to POSIX
or Latin-1, depending.  The ":show-printable" I saw today looked like POSIX.

(I probably should revisit this and attempt to improve it - but the port
is old, too ...)
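
To illustrate why a bare "UTF-8" gives vile nothing to work with, here is a
rough sketch in Python.  It is not vile's code: the table is a two-entry
stand-in for the built-in tables, the function name is made up, and always
falling back to "POSIX" is an assumption (the real choice between POSIX and
Latin-1 depends on conditions not covered here):

```python
# Illustrative table only; vile's built-in tables are not reproduced here.
ASSUMED_8BIT = {
    "de_DE": "ISO-8859-1",
    "en_US": "ISO-8859-1",
}

def guess_8bit(locale_name):
    """Pick an 8-bit encoding from the language_TERRITORY part of a locale name."""
    # Drop any "@modifier", then any ".codeset", leaving e.g. "de_DE".
    lang_terr = locale_name.split("@", 1)[0].split(".", 1)[0]
    # A bare "UTF-8" has no language/territory part, so the lookup fails.
    return ASSUMED_8BIT.get(lang_terr, "POSIX")

print(guess_8bit("de_DE.UTF-8"))
print(guess_8bit("UTF-8"))
```

With "de_DE.UTF-8" the lookup finds the "de_DE" entry; with "UTF-8" alone it
takes the fallback.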
 
> >I experimented a little, and see that your locale settings are confusing
> >vile.  You can see this best by ":show-printable" and looking at the bottom
> >of the page (codes are showing as hexadecimal).
> >
> >Using "de_DE.UTF-8" throughout (actually LC_CTYPE is the important one),
> >I don't see the hexadecimal characters in "9.8" or the current version.
> 
> I see something similar but not quite: in Terminal.app and with the `UTF-8'
> value for LC_CTYPE I can see hexcodes for positions 128-159 (\x80 - \x9F)
> and a verbatim `?' for positions 160-255. If I then manually set
> LC_CTYPE=de_DE.UTF-8 in that Terminal.app window and restart vile I
> 
> 1) still get the hexcodes for pos. 128-159 (but the same happens in urxvt)
> 2) get regular chars for 160-255
> 3) most important: the `??' problem when entering diacritical characters
>    such as ü vanishes
> 
> only problem: I don't see any way to convince Terminal.app to use a
> valid (fully qualified) value for LC_CTYPE in the first place...
> 
> >
> >This might be related to the "??" problem - I'm not sure.
> 
> bingo ;-), see above (thanks!).  the whole thing remains confusing for me,
> though.  for one, I don't understand in which way the LC_CTYPE=UTF-8 setting
> is confusing vile, since (as explained) at least after a redraw the entered
> `ü' (and similar) are rendered correctly in the buffer, while not being
> displayed in the show-printable output.  but obviously there'll be some
> hidden reason for this.  the other thing which remains unclear for me is how
> I manage to end up with LC_CTYPE=UTF-8 in Terminal.app in the first place.
> but that's probably not a problem for this list...
> 
> >
> >("xterm-color" is problematic as well - a different topic).
> 
> is it? what do you recommend here?

The closest for Terminal.app would be from ncurses:
        nsterm-256color
which I added in 2012.

But for Macs that's hard:
        a) the terminal database in /usr/share/terminfo is very old.
        b) Terminal.app only has certain settings (actually in Mavericks,
           none are "xterm-color").  It does have "nsterm", which was
           added to ncurses in 2001.  A quick check with that entry shows
           some problem with line-drawing.
        c) Unlike Linux and the *BSDs, the port for ncurses is still the
           5.9 release (2011).  That includes the terminal database in
           /opt/local/share/terminfo

As xterm-color, the function keys are half-right.
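
To see which of the two databases actually has an entry, a small sketch in
Python.  The helper names are made up, and it assumes (as far as I know) that
a terminfo tree stores entries under a subdirectory named either by the first
letter of the entry or by that letter's two-digit hex code:

```python
import os

def terminfo_candidates(name, base):
    """Possible file locations for entry `name` under terminfo tree `base`."""
    first = name[0]
    return [
        os.path.join(base, first, name),                # letter-named subdirectory
        os.path.join(base, "%02x" % ord(first), name),  # hex-named subdirectory
    ]

def find_entry(name, bases):
    """Return every existing file for `name` across the given terminfo trees."""
    return [path
            for base in bases
            for path in terminfo_candidates(name, base)
            if os.path.isfile(path)]

# The two trees mentioned above; either may be absent on a given machine.
BASES = ["/usr/share/terminfo", "/opt/local/share/terminfo"]
for entry in ("nsterm", "nsterm-256color", "xterm-color"):
    print(entry, find_entry(entry, BASES))
```

An empty list for a name means neither tree has that entry in either layout.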

-- 
Thomas E. Dickey <address@hidden>
http://invisible-island.net
ftp://invisible-island.net
