bug-apl

Re: [Bug-apl] Back to underline


From: Blake McBride
Subject: Re: [Bug-apl] Back to underline
Date: Sun, 16 Aug 2015 06:03:48 -0500

I understand what you are saying.  However, rather than bend the outcome to fit technical difficulty or complexity, I prefer to spend whatever technical effort it takes to produce the desired outcome.

⍴,'A̲'  and  ⍴,'ä'  should each produce exactly 1 regardless of the underlying technicalities.



On Sun, Aug 16, 2015 at 5:49 AM, Elias Mårtenson <address@hidden> wrote:
On 16 August 2015 at 18:35, Blake McBride <address@hidden> wrote:
My own opinion:

1.  Very strongly -  ⍴,'A̲'  has got to equal 1 no matter what !!

You may think so, but if you want to be consistent on that, you would have to implement a completely new character set and abandon Unicode.

I'll give you an example. What would you want ⍴,'ä' to be?

Right now, that could return either 1 or 2 depending on whether the ä is the precomposed character (U+00E4) or the base letter followed by a combining diaeresis (U+0061, U+0308). Visually, these are identical, and generally you'd expect them to compare equal.
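
To make the two encodings concrete, here is a minimal sketch in Python rather than APL (an assumption made purely for illustration; the variable names are hypothetical). Python strings are sequences of code points, so `len` plays roughly the role of ⍴, here:

```python
# Two encodings of the same visible character "ä".
precomposed = "\u00e4"   # single code point U+00E4
decomposed = "a\u0308"   # U+0061 followed by combining diaeresis U+0308

# Count code points in each encoding.
precomposed_len = len(precomposed)
decomposed_len = len(decomposed)

# A plain code point comparison treats the two strings as different.
naive_equal = precomposed == decomposed

print(precomposed_len, decomposed_len, naive_equal)
```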

In Unicode, strings that are equivalent but built from different characters are compared by performing a normalisation step first. There are 4 normalisation forms (NFC, NFD, NFKC and NFKD), each with different behaviour.

Now, the ä character has a precomposed form in Unicode, and if you couple that with the NFC normalisation form, you'd get the above expression to return 1.
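
A hedged sketch of that normalisation step, again in Python as a stand-in for whatever an APL interpreter would do internally. `unicodedata.normalize` is the standard-library call; the variable names are only illustrative:

```python
import unicodedata

precomposed = "\u00e4"   # U+00E4
decomposed = "a\u0308"   # U+0061, U+0308

# NFC composes a base letter and combining mark when a precomposed form exists.
nfc_precomposed = unicodedata.normalize("NFC", precomposed)
nfc_decomposed = unicodedata.normalize("NFC", decomposed)

# After normalisation both inputs are the same single code point.
same_after_nfc = nfc_precomposed == nfc_decomposed
nfc_length = len(nfc_decomposed)

print(same_after_nfc, nfc_length)
```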

However, ä works only because a precomposed form is available. The combining underline does not have one. So if you want to suggest that the expression applied to an underlined character should return 1, you also have to provide a suggestion as to what ⎕UCS X should return. Remember that ⎕UCS has to satisfy (X=⎕UCS ⎕UCS X).
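
The underline case can be sketched the same way. This is an illustration only, not a proposal for how GNU APL should behave: the `clusters` helper is hypothetical and uses a simplified rule (attach every combining mark to the preceding character) rather than full Unicode grapheme segmentation.

```python
import unicodedata

def clusters(text):
    """Group each character with the combining marks that follow it (simplified)."""
    out = []
    for ch in text:
        if out and unicodedata.combining(ch) != 0:
            out[-1] += ch      # combining mark: attach to the previous cluster
        else:
            out.append(ch)     # base character: start a new cluster
    return out

underlined = "A\u0332"         # "A" followed by combining low line U+0332

# NFC cannot shrink this to one code point: no precomposed form exists.
nfc_length = len(unicodedata.normalize("NFC", underlined))

# Counting clusters gives 1, but that one element holds two code points.
cluster_list = clusters(underlined)
cluster_count = len(cluster_list)
code_points = [ord(c) for c in cluster_list[0]]

print(nfc_length, cluster_count, code_points)
```

The last value is the crux of the ⎕UCS question: if the underlined A counts as one element, its code point is no longer a single number, so the round trip would need some agreed representation for it.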

Regards,
Elias

