Re: ACS_VLINE characters shifted in UTF-8 xterm

From: Sebastian Kayser
Subject: Re: ACS_VLINE characters shifted in UTF-8 xterm
Date: Mon, 06 Apr 2009 22:21:18 +0200

Thomas Dickey wrote:
> On Sun, Apr 05, 2009 at 01:35:49PM +0200, Sebastian Kayser wrote:
>> I have an ncurses-based application that draws a vertical line via
>> ACS_VLINEs to the screen. When I run this application in a UTF-8 xterm
>> (xterm-243) the ACS_VLINEs come out as a kind of right-shifted
>> staircase. When I set NCURSES_NO_UTF8_ACS=1 or when I run the
>> application in dtterm the vertical line looks fine.
>> The application in question is mcabber [1], but I can recreate this
>> issue with a few lines of code [2]. It doesn't matter whether I link
>> against a regular ncurses or a --enable-widec one.
>> I have tried to wrap my head around this and read the relevant ncurses
>> FAQ items. From what I understand xterm is capable of interpreting the
>> ACS form characters just fine even in UTF-8 mode (so no need to set
>> NCURSES_NO_UTF8_ACS). I have truss'ed the application to see how the
>> vertical line gets written to the terminal.
>>   1B [ H     [ s t a t u s ]                          0E x0F1B [ m
>>   1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [
>>    m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B
>>    [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F
>>   1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x0F1B [ m1B [ 1 B\b0E x
>> So the line is basically drawn by repeating the sequence
>> "smacs, x, rmacs, sgr0, cud, \b". When I manually echo ACS_VLINE
>> characters to the terminal I can see that they consume two columns
>> instead of one, and I suppose that's why the \b isn't sufficient to
>> move completely backwards.
> Here's a guess:

First of all, thanks for your response. Indeed, with the -mk_width
option to xterm the vertical line characters only consume one column and
the vertical line shows up properly aligned.

Now I am just trying to understand what is going on here. :) Please
correct me if anything of the following is wrong.

> Both ncurses and xterm check the widths of characters using wcwidth.
> However, ncurses doesn't check the ACS_xxx widths (lots of things
> would break if they're allowed to be something other than 1).
> xterm may internally translate the VT100-style line-drawing to Unicode
> values (see charsets.c around line 310), and if wcwidth returns an
> unexpected value for that, it could produce the sort of effect
> you're describing.

So ncurses assumes ACS_VLINE to be 1 column wide and hence emits only a
single \b to move the cursor backwards.

On the other hand, xterm converts ACS_VLINE to 0x2502 ("BOX DRAWINGS
LIGHT VERTICAL") and calls wcwidth() to determine the width of this
character. According to Solaris locale handling it is 2 columns wide,
even though this character would fit perfectly into one column. I guess
that is one reason why you say Solaris locales have some quirks?

So now that xterm has put the vertical line character in two columns,
the single \b from ncurses only goes back one column (\b is always
column-oriented), which is not enough to line up with the vertical line
character in the line above.

>> I am using Solaris 10 in case that matters, LC_CTYPE is set to
>> en_US.UTF-8. It seems as if I am missing only a small part of the
>> puzzle. What's wrong with those vertical line characters and my
>> UTF-8 xterm?
> Solaris locale support seems to have some quirks (I could digress).
> I'd try setting the mkWidth resource, which _should_ tell xterm to
> use its built-in locale table.
> There's a command-line option (-mk_width).

Finally, with the -mk_width option xterm uses its internal
mk_wcwidth() to determine the column width, and this returns 1 for
0x2502. Now xterm and ncurses agree on the width of the vertical line
and everything is fine. (dtterm had been fine from the start because it
doesn't translate the ACS_xxx chars into Unicode.)

Would you say that -mk_width / mkWidth is a reasonable default for xterm
and UTF-8 locales in Solaris? There is a bug entry at Sun Solve [1]
(only accessible to Sun customers :/, bug description at [2]) that
explains the rationale behind the wcwidth() behaviour. It seems to be
that the _major_ target for localization is Asia, and in some fonts
used there the codepoints in the range of box characters do actually
consume two columns. I am not quite sure whether I got that right.
Please feel free to digress about Solaris locale support ;)

On a related note, I had a look at vttest as well as the UTF-8 demo file
from Markus Kuhn after setting -mk_width. Both look much better now in
xterm, the box alignment tests in particular. Again, thanks for pointing
me in the right direction.


[1] http://sunsolve.sun.com/search/document.do?assetkey=1-1-4474512-1
[2] http://bugs.opensolaris.org/view_bug.do?bug_id=4474512
