Re: [PATCH 1/4] python/utils: add enboxify() text decoration utility


From: John Snow
Subject: Re: [PATCH 1/4] python/utils: add enboxify() text decoration utility
Date: Wed, 16 Feb 2022 11:16:00 -0500


On Tue, Feb 15, 2022, 6:57 PM Philippe Mathieu-Daudé <f4bug@amsat.org> wrote:
On 16/2/22 00:53, John Snow wrote:
> On Tue, Feb 15, 2022 at 5:55 PM Eric Blake <eblake@redhat.com> wrote:
>>
>> On Tue, Feb 15, 2022 at 05:08:50PM -0500, John Snow wrote:
>>> >>> print(enboxify(msg, width=72, name="commit message"))
>>> ┏━ commit message ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
>>> ┃ enboxify() takes a chunk of text and wraps it in a text art box that ┃
>>> ┃  adheres to a specified width. An optional title label may be given, ┃
>>> ┃  and any of the individual glyphs used to draw the box may be        ┃
>>
>> Why do these two lines have a leading space,
>>
>>> ┃ replaced or specified as well.                                       ┃
>>
>> but this one doesn't?  It must be an off-by-one corner case when your
>> choice of space to wrap on is exactly at the wrap column.
>>
>
> Right, you're probably witnessing the right-pad *and* the actual space.
>
>>> ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>>   python/qemu/utils/__init__.py | 58 +++++++++++++++++++++++++++++++++++
>>>   1 file changed, 58 insertions(+)

>>> +    def _wrap(line: str) -> str:
>>> +        return os.linesep.join([
>>> +            wrapped_line.ljust(lwidth) + suffix
>>> +            for wrapped_line in textwrap.wrap(
>>> +                    line, width=lwidth, initial_indent=prefix,
>>> +                    subsequent_indent=prefix, replace_whitespace=False,
>>> +                    drop_whitespace=False, break_on_hyphens=False)
>>
>> Always nice when someone else has written the cool library function to
>> do all the hard work for you ;)  But this is probably where you have
>> the off-by-one I called out above.
>>
>
> Yeah, I just didn't want it to eat multiple spaces if they were
> present -- I wanted it to reproduce them faithfully. The tradeoff is
> some silliness near the margins.
>
> Realistically, if I want something any better than what I've done
> here, I should find a library to do it for me instead -- but for the
> sake of highlighting some important information, this may be
> just-enough-juice.

's/^┃  /┃ /' on top ;D
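The leading-space corner case discussed above can be reproduced with textwrap alone. This is a minimal sketch, not the patch itself: the sample string and the width of 4 are made up for illustration. It assumes that the stray space comes from drop_whitespace=False keeping the separator when the previous word ends exactly at the wrap column.

```python
import textwrap

# Illustrative input: "aaaa" ends exactly at the wrap column (width=4).
text = "aaaa bb cc"

# With drop_whitespace=False the separating space is not discarded, so it
# is carried over to the start of the next wrapped line.
kept = textwrap.wrap(text, width=4, drop_whitespace=False)

# With the default drop_whitespace=True the space at the wrap point is eaten.
dropped = textwrap.wrap(text, width=4)

print(kept)     # ['aaaa', ' bb ', 'cc']
print(dropped)  # ['aaaa', 'bb', 'cc']
```

The second element of the first result carries the same kind of leading space that shows up inside the box in the commit message.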

I have to admit that this function is actually very fragile. Last night, I did some reading on Unicode and emoji encodings and discovered that it's *basically impossible* to predict the "visual width" of a sequence of Unicode codepoints.

So, this function as written will only really work if we stick to single-codepoint glyphs that can be rendered 1:1 in a monospace font.

I could probably improve it to work with "some" (but certainly not all) wide glyphs and emoji, but it's a very complex topic and far outside my specialty. Support for multi-codepoint narrow/halfwidth glyphs is also an issue. (This affects some Latin characters outside of ASCII if they are encoded using combining codepoints.)

(See https://hsivonen.fi/string-length/ ... It's nasty.)
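To make the mismatch concrete, here is a rough sketch using only the standard unicodedata module. The helper name display_width is hypothetical and not part of the patch. It assumes a terminal that draws East Asian wide/fullwidth glyphs in two columns and combining marks in zero. It does not attempt emoji ZWJ sequences, variation selectors, or ambiguous-width characters, so treat it as an estimate only.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate terminal columns for text (hypothetical helper, heuristic)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous glyph.
            continue
        # 'W' (wide) and 'F' (fullwidth) are assumed to take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


composed = "\u00e9"      # "é" as a single codepoint
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
japanese = "日本語"

for sample in (composed, decomposed, japanese):
    # len() counts codepoints; display_width() guesses at columns.
    print(len(sample), display_width(sample))
```

str.ljust() pads by codepoint count, so any line where the two numbers differ ends up with a misplaced right border.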

So I must admit that this function has some very serious limitations to it. I want to explain why I wrote it, though.

First: Tracebacks make people's eyes glaze over. A traceback is a very long sequence of mumbo jumbo that most people don't read, because it's program debug information. I don't blame them. Setting apart the error summary visually is a helpful tool for drawing one's eyes to the most critical pieces of information.

Second: In my AQMP library, I use the ASCII vertical bar | as a left-hand border decoration to provide a kind of visual quoting mechanism to illustrate in the logfile which subsequent confusing lines of jargon belong to the same log entry. I really like this formatting mechanism, but...

Third: If a line of text becomes so long that it wraps in your terminal, the visual quote mechanism breaks, making the output messy and hard to read. Forcibly re-wrapping the text in a virtual box is a necessary mechanism to preserve readability in this circumstance - the lines from qemu-img et al may be much wider than your terminal column width.

And so, I drew a box instead of just a left border, because I needed to re-wrap the text anyway. I believed the box would help show visually that the output was being re-formatted to fit within a certain width. Unfortunately, it's inadequate.

So ... what to do.

(1) I can just remove the right margin decoration and call the function visual_quote or something. If any of the lines get too "long" because of emoji/日本語, it MAY break the indent line, but occasional uses of one or two wide characters probably won't cause wrapping that breaks the "visual quote line" on a terminal with at least 85 columns. Essentially it'd still be broken, but without a solid right border it'd be harder to notice *small* breakages.

(2) If there is a genuine interest in using visual highlighting techniques to make iotest failures easier to diagnose (and making sure it is properly multilingual), I could use the urwid helper library to estimate visual text width, which would make drawing terminal boxes more resilient than what I could manage on my own. The downside is a new third-party dependency. I already use urwid for the AQMP TUI that we're working on, but it's remained an optional dependency so far.

(3) I can take a swing at improving this text decoration utility and having it account for the most basic cases. East Asian language support is low-hanging fruit, though I have only rudimentary familiarity with Hangul. (And virtually no exposure to Thai or other Southeast Asian scripts.)

(4) Just leave it alone for now, don't you have IDE/FDC patches to work on?
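For reference, option (1) might look roughly like the sketch below. Only the name visual_quote comes from the suggestion above. The signature, the default width, and the "| " border are assumptions for illustration. It lets textwrap drop whitespace at wrap points, which sidesteps the leading-space quirk but does not preserve spacing there, and it still counts codepoints rather than columns.

```python
import textwrap


def visual_quote(text: str, width: int = 72, border: str = "| ") -> str:
    """Re-wrap text behind a left-hand border only (illustrative sketch)."""
    out = []
    for line in text.splitlines():
        wrapped = textwrap.wrap(
            line, width=width,
            initial_indent=border, subsequent_indent=border,
            break_on_hyphens=False)
        # textwrap.wrap() returns [] for an empty line; keep the border
        # there so the quote line stays unbroken.
        out.extend(wrapped or [border.rstrip()])
    return "\n".join(out)


quoted = visual_quote("aaaa bb cc\n\nsecond paragraph", width=12)
print(quoted)
```

With no right margin to line up, a wide glyph only pushes a line a column or two past the target width instead of visibly breaking a border.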

Sigh. The punishment for trying to do something nice is swift.

--js
