Re: [PATCH] add 'string-distance' to calculate Levenshtein distance
From: Eli Zaretskii
Subject: Re: [PATCH] add 'string-distance' to calculate Levenshtein distance
Date: Sat, 14 Apr 2018 20:08:51 +0300
> From: Chen Bin <address@hidden>
> Cc: address@hidden
> Date: Sun, 15 Apr 2018 02:40:18 +1000
>
> Correct me if I'm wrong.
>
> I read the code and found the definition of Lisp_String:
> struct GCALIGNED Lisp_String
> {
>   ptrdiff_t size;
>   ptrdiff_t size_byte;
>   INTERVAL intervals; /* Text properties in this string. */
>   unsigned char *data;
> };
>
> I understand string text is encoded in UTF8 format and is stored in
> 'Lisp_String::data'. There is actually no difference between unibyte
> and multibyte text since UTF8 is compatible with ASCII and we only deal
> with 'data' field.
No, that's incorrect. The difference does exist; it just all but
disappears for unibyte strings encoded in UTF-8. But if you encode a
string in some other encoding, like Latin-1, you will see a very
different stream of bytes.
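To illustrate the point about encodings, here is a minimal sketch in
plain C. It does not use any Emacs internals; the function names
encode_utf8 and encode_latin1 are hypothetical and exist only for
this example. It shows that the same character can produce different
byte sequences depending on the encoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the UTF-8 encoding of code point CP
   into OUT (room for 4 bytes) and return the number of bytes.
   No check is made for surrogates or out-of-range values.  */
size_t
encode_utf8 (uint32_t cp, unsigned char *out)
{
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  out[0] = (unsigned char) (0xF0 | (cp >> 18));
  out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
  out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
  out[3] = (unsigned char) (0x80 | (cp & 0x3F));
  return 4;
}

/* Hypothetical helper: write the Latin-1 encoding of CP into OUT
   (room for 1 byte).  Returns 1 on success, or 0 if CP cannot be
   represented in Latin-1.  */
size_t
encode_latin1 (uint32_t cp, unsigned char *out)
{
  if (cp > 0xFF)
    return 0;
  out[0] = (unsigned char) cp;
  return 1;
}
```

For U+00E9 (e with acute accent), encode_utf8 yields the two bytes
0xC3 0xA9, while encode_latin1 yields the single byte 0xE9. For pure
ASCII such as U+0061 both yield the same single byte, which is why
the difference is easy to overlook.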
> I attached the latest patch.
Thanks.
> + ;; string containing unicode character (Hanzi)
> + (should (equal 6 (string-distance "ab" "ab我她")))
> + (should (equal 3 (string-distance "我" "她"))))
Should the distance be measured in bytes or in characters? I think
it's the latter, in which case the implementation should work in
characters, not bytes.
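
To make the bytes-versus-characters question concrete, here is a
rough sketch in standalone C. It is not the code from the patch; the
names utf8_to_codepoints, levenshtein and sketch_distance are
hypothetical, and the decoder assumes well-formed UTF-8 input with no
validation. It computes the Levenshtein distance either over raw
bytes or over decoded code points.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Decode well-formed, NUL-terminated UTF-8 into code points.
   OUT needs room for strlen (s) elements.  Returns the count.  */
size_t
utf8_to_codepoints (const char *s, uint32_t *out)
{
  const unsigned char *p = (const unsigned char *) s;
  size_t n = 0;
  while (*p)
    {
      uint32_t c;
      int extra;
      if (*p < 0x80) { c = *p; extra = 0; }
      else if ((*p & 0xE0) == 0xC0) { c = *p & 0x1F; extra = 1; }
      else if ((*p & 0xF0) == 0xE0) { c = *p & 0x0F; extra = 2; }
      else { c = *p & 0x07; extra = 3; }
      p++;
      while (extra-- > 0 && *p)
        c = (c << 6) | (*p++ & 0x3F);
      out[n++] = c;
    }
  return n;
}

/* Single-row Levenshtein distance over two arrays of units.
   Returns (size_t) -1 if memory allocation fails.  */
size_t
levenshtein (const uint32_t *a, size_t la, const uint32_t *b, size_t lb)
{
  size_t *row = malloc ((lb + 1) * sizeof *row);
  if (!row)
    return (size_t) -1;
  for (size_t j = 0; j <= lb; j++)
    row[j] = j;
  for (size_t i = 1; i <= la; i++)
    {
      size_t diag = row[0];     /* value for (i-1, j-1) */
      row[0] = i;
      for (size_t j = 1; j <= lb; j++)
        {
          size_t up = row[j];   /* value for (i-1, j) */
          size_t best = diag + (a[i - 1] != b[j - 1]);
          if (up + 1 < best)
            best = up + 1;
          if (row[j - 1] + 1 < best)
            best = row[j - 1] + 1;
          row[j] = best;
          diag = up;
        }
    }
  size_t result = row[lb];
  free (row);
  return result;
}

/* Distance between S and T, counted in code points if BY_CHARS is
   nonzero, otherwise in bytes.  Returns (size_t) -1 on allocation
   failure.  */
size_t
sketch_distance (const char *s, const char *t, int by_chars)
{
  size_t ls = strlen (s), lt = strlen (t);
  uint32_t *a = malloc ((ls + 1) * sizeof *a);
  uint32_t *b = malloc ((lt + 1) * sizeof *b);
  size_t d = (size_t) -1;
  if (a && b)
    {
      if (by_chars)
        {
          ls = utf8_to_codepoints (s, a);
          lt = utf8_to_codepoints (t, b);
        }
      else
        {
          for (size_t i = 0; i < ls; i++)
            a[i] = (unsigned char) s[i];
          for (size_t i = 0; i < lt; i++)
            b[i] = (unsigned char) t[i];
        }
      d = levenshtein (a, ls, b, lt);
    }
  free (a);
  free (b);
  return d;
}
```

With the UTF-8 strings from the tests quoted above, the byte-wise
mode gives 6 for "ab" versus "ab我她" and 3 for "我" versus "她",
matching the expected values in the patch, because each of these
Hanzi occupies three bytes in UTF-8. The character-wise mode gives 2
and 1 respectively.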