bug-gnu-emacs

bug#10494: 24.0.92; Syntax table and non-ASCII character interaction


From: Eli Zaretskii
Subject: bug#10494: 24.0.92; Syntax table and non-ASCII character interaction
Date: Thu, 11 Aug 2016 18:24:22 +0300

> From: address@hidden
> Date: Wed, 10 Aug 2016 20:29:05 -0400
> Cc: address@hidden
> 
> I confirm this is still the case in 25.1-rc1.
> 
> Aaron Ecay <address@hidden> writes:
> >
> > This bug relates to setting a non-ASCII character punctuation character
> > (U+2019, which is ’) to have word syntax, and using word-motion
> > commands.  Here’s a recipe from emacs -Q:
> >
> > M-x text-mode
> > don't
> > C-a M-f
> >   -> (as expected, the cursor moves to the end of the line)
> > RET RET
> > don M-x ucs-insert 2019 t
> 
> This should now use insert-char (C-x 8 RET) instead of ucs-insert.
> 
> >   -> (text in buffer: "don’t")
> > C-a M-f
> >   -> (cursor is on the quotation mark, as expected)
> > M-: (modify-syntax-entry ?’ "w" text-mode-syntax-table)
> > C-a M-f
> >   -> (BUG: cursor is on quotation mark, which should count as part of the 
> > word)
> >
> > If you re-run the experiment substituting - for ’ everywhere, there is a
> > difference in behavior – the cursor moves to the end of the line after
> > the call to modify-syntax-entry, as expected.  This leads me to think
> > that the problem has to do with ’ being outside the ASCII charset.

Indeed.  This is a feature: we don't let word-movement commands
cross into a different script.  IOW, if

  (aref char-script-table C1)

and

  (aref char-script-table C2)

return different values, then we decide that there's a word boundary
between C1 and C2.  See the function word_boundary_p, which is called
from scan_words.
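
To see this in the recipe above, one can compare the script of the
letter before the apostrophe with that of each candidate character.
What follows is only a sketch: my/different-script-p is a hypothetical
helper written for illustration, not something Emacs provides, and it
mirrors only the script comparison described above, not everything
word_boundary_p may take into account.

```elisp
;; Hypothetical helper, not part of Emacs: reports whether two
;; characters have different entries in `char-script-table'.
(defun my/different-script-p (c1 c2)
  "Return non-nil if C1 and C2 have different `char-script-table' entries."
  (not (eq (aref char-script-table c1)
           (aref char-script-table c2))))

;; Evaluate each form with M-: (or C-x C-e) and compare the results.
(my/different-script-p ?n ?-)       ; "n" vs. ASCII hyphen
(my/different-script-p ?n #x2019)   ; "n" vs. U+2019 RIGHT SINGLE QUOTATION MARK
```

If the second form returns non-nil while the first returns nil, that
matches the difference in M-f behavior reported in the recipe.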

Maybe we should document this somewhere, like the ELisp manual.




