bug#29871: 25.3; ZWJ word-boundaries in regexps
From: Mark Shoulson
Subject: bug#29871: 25.3; ZWJ word-boundaries in regexps
Date: Wed, 27 Dec 2017 14:07:40 -0500
According to http://unicode.org/reports/tr29/#Word_Boundaries rule WB4,
it would seem that a ZWJ character (U+200D ZERO WIDTH JOINER) between
two "word" characters should not constitute a word boundary. And yet:
(string-match "\\<" "foo\u200Dfbar" 1)
evaluates to 4 (the 1 is to skip the word-beginning at the start of the
string). Searching for "\\b" or "\\>" instead returns 3. Either way, Emacs
reports a word break at the ZWJ character. Is this correct?
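
For anyone who wants to reproduce this, here is a minimal sketch. The
helper name my-match-positions is made up for illustration and is not
part of Emacs; it only collects every index at which a regexp matches.
Emacs regexps take word-constituent status from the syntax table, so the
last form prints the syntax class that the current syntax table assigns
to ZWJ, which may help explain the result. No output is shown for the
ZWJ cases because it may differ between Emacs versions.

```elisp
;; Hypothetical helper, not part of Emacs: return the list of indices in
;; STRING at which REGEXP matches, scanning left to right.
(defun my-match-positions (regexp string)
  (let ((start 0)
        (positions '()))
    (while (and (<= start (length string))
                (string-match regexp string start))
      (push (match-beginning 0) positions)
      ;; Step past the match start so zero-width matches cannot loop forever.
      (setq start (1+ (match-beginning 0))))
    (nreverse positions)))

;; Word beginnings and word ends in the string from the report.
(message "word starts: %S" (my-match-positions "\\<" "foo\u200Dfbar"))
(message "word ends:   %S" (my-match-positions "\\>" "foo\u200Dfbar"))

;; Syntax class character of ZWJ in the current syntax table
;; ("w" would mean word constituent).
(message "ZWJ syntax:  %s" (string (char-syntax #x200D)))
```

Evaluating these forms in the *scratch* buffer, or loading them with
emacs --batch -l, writes the three lines to the message log.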
~mark
In GNU Emacs 25.3.1 (x86_64-redhat-linux-gnu, GTK+ Version 3.22.19)
of 2017-09-14 built on buildvm-29.phx2.fedoraproject.org
Configured using:
'configure --build=x86_64-redhat-linux-gnu
--host=x86_64-redhat-linux-gnu --program-prefix=
--disable-dependency-tracking --prefix=/usr --exec-prefix=/usr
--bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc
--datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64
--libexecdir=/usr/libexec --localstatedir=/var
--sharedstatedir=/var/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-dbus --with-gif --with-jpeg --with-png
--with-rsvg --with-tiff --with-xft --with-xpm --with-x-toolkit=gtk3
--with-gpm=no --with-xwidgets --with-modules
build_alias=x86_64-redhat-linux-gnu host_alias=x86_64-redhat-linux-gnu
'CFLAGS=-DMAIL_USE_LOCKF -O2 -g -pipe -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong
--param=ssp-buffer-size=4 -grecord-gcc-switches
-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic'
LDFLAGS=-Wl,-z,relro
PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'
Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND DBUS GCONF GSETTINGS NOTIFY
ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES XWIDGETS
Important settings:
value of $LANG: en_US.utf8
value of $XMODIFIERS: @im=none
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent messages:
../../usr/share/emacs/site-lisp/uim-el/uim-key.el: (lambda (x) ...) quoted with
' rather than with #' [3 times]
../../usr/share/emacs/site-lisp/uim-el/uim-preedit.el: (lambda (x) ...) quoted
with ' rather than with #'
../../usr/share/emacs/site-lisp/uim-el/uim-candidate.el: (lambda (x) ...)
quoted with ' rather than with #' [5 times]
../../usr/share/emacs/site-lisp/uim-el/uim-helper.el: (lambda (x) ...) quoted
with ' rather than with #' [2 times]
../../usr/share/emacs/site-lisp/uim-el/uim.el: (lambda (x) ...) quoted with '
rather than with #' [9 times]
../../usr/share/emacs/site-lisp/uim-el/uim-leim.el: (lambda (x) ...) quoted
with ' rather than with #'
uim.el: starting uim-el-helper-agent... done
uim.el: starting uim-el-agent... done
Loading /usr/share/emacs/site-lisp/site-start.d/uim-init.el (source)...done
For information about GNU Emacs and the GNU system, type C-_ C-a.
Load-path shadows:
~/lib/yaml-mode hides /home/mark/.emacs.d/elpa/yaml-mode-20170727.1531/yaml-mode
/usr/share/emacs/site-lisp/site-start.d/maxima-modes hides
/usr/share/emacs/site-lisp/maxima/site_start.d/maxima-modes
~/lib/mwheel hides /usr/share/emacs/25.3/lisp/mwheel
~/lib/css-mode hides /usr/share/emacs/25.3/lisp/textmodes/css-mode
~/lib/cperl-mode hides /usr/share/emacs/25.3/lisp/progmodes/cperl-mode
/home/mark/.emacs.d/elpa/org-20170606/ob-table hides
/usr/share/emacs/25.3/lisp/org/ob-table
/home/mark/.emacs.d/elpa/org-20170606/ob-sass hides
/usr/share/emacs/25.3/lisp/org/ob-sass
/home/mark/.emacs.d/elpa/org-20170606/ob-lilypond hides
/usr/share/emacs/25.3/lisp/org/ob-lilypond
/home/mark/.emacs.d/elpa/org-20170606/org-pcomplete hides
/usr/share/emacs/25.3/lisp/org/org-pcomplete
/home/mark/.emacs.d/elpa/org-20170606/ox-man hides
/usr/share/emacs/25.3/lisp/org/ox-man
/home/mark/.emacs.d/elpa/org-20170606/org-list hides
/usr/share/emacs/25.3/lisp/org/org-list
/home/mark/.emacs.d/elpa/org-20170606/ob-core hides
/usr/share/emacs/25.3/lisp/org/ob-core
/home/mark/.emacs.d/elpa/org-20170606/org-compat hides
/usr/share/emacs/25.3/lisp/org/org-compat
/home/mark/.emacs.d/elpa/org-20170606/ob-dot hides
/usr/share/emacs/25.3/lisp/org/ob-dot
/home/mark/.emacs.d/elpa/org-20170606/org-faces hides
/usr/share/emacs/25.3/lisp/org/org-faces
/home/mark/.emacs.d/elpa/org-20170606/org-mouse hides
/usr/share/emacs/25.3/lisp/org/org-mouse
/home/mark/.emacs.d/elpa/org-20170606/ob-makefile hides
/usr/share/emacs/25.3/lisp/org/ob-makefile
/home/mark/.emacs.d/elpa/org-20170606/ob-perl hides
/usr/share/emacs/25.3/lisp/org/ob-perl
/home/mark/.emacs.d/elpa/org-20170606/org-irc hides
/usr/share/emacs/25.3/lisp/org/org-irc
/home/mark/.emacs.d/elpa/org-20170606/org-mobile hides
/usr/share/emacs/25.3/lisp/org/org-mobile
/home/mark/.emacs.d/elpa/org-20170606/org-rmail hides
/usr/share/emacs/25.3/lisp/org/org-rmail
/home/mark/.emacs.d/elpa/org-20170606/ob-asymptote hides
/usr/share/emacs/25.3/lisp/org/ob-asymptote
/home/mark/.emacs.d/elpa/org-20170606/ob-matlab hides
/usr/share/emacs/25.3/lisp/org/ob-matlab
/home/mark/.emacs.d/elpa/org-20170606/org-indent hides
/usr/share/emacs/25.3/lisp/org/org-indent
/home/mark/.emacs.d/elpa/org-20170606/org hides
/usr/share/emacs/25.3/lisp/org/org
/home/mark/.emacs.d/elpa/org-20170606/ob-haskell hides
/usr/share/emacs/25.3/lisp/org/ob-haskell
/home/mark/.emacs.d/elpa/org-20170606/org-plot hides
/usr/share/emacs/25.3/lisp/org/org-plot
/home/mark/.emacs.d/elpa/org-20170606/org-feed hides
/usr/share/emacs/25.3/lisp/org/org-feed
/home/mark/.emacs.d/elpa/org-20170606/org-bibtex hides
/usr/share/emacs/25.3/lisp/org/org-bibtex
/home/mark/.emacs.d/elpa/org-20170606/org-src hides
/usr/share/emacs/25.3/lisp/org/org-src
/home/mark/.emacs.d/elpa/org-20170606/ob-awk hides
/usr/share/emacs/25.3/lisp/org/ob-awk
/home/mark/.emacs.d/elpa/org-20170606/org-gnus hides
/usr/share/emacs/25.3/lisp/org/org-gnus
/home/mark/.emacs.d/elpa/org-20170606/org-macs hides
/usr/share/emacs/25.3/lisp/org/org-macs
/home/mark/.emacs.d/elpa/org-20170606/ob-octave hides
/usr/share/emacs/25.3/lisp/org/ob-octave
/home/mark/.emacs.d/elpa/org-20170606/org-table hides
/usr/share/emacs/25.3/lisp/org/org-table
/home/mark/.emacs.d/elpa/org-20170606/ob-scala hides
/usr/share/emacs/25.3/lisp/org/ob-scala
/home/mark/.emacs.d/elpa/org-20170606/ox-org hides
/usr/share/emacs/25.3/lisp/org/ox-org
/home/mark/.emacs.d/elpa/org-20170606/org-version hides
/usr/share/emacs/25.3/lisp/org/org-version
/home/mark/.emacs.d/elpa/org-20170606/ox-beamer hides
/usr/share/emacs/25.3/lisp/org/ox-beamer
/home/mark/.emacs.d/elpa/org-20170606/ob-C hides
/usr/share/emacs/25.3/lisp/org/ob-C
/home/mark/.emacs.d/elpa/org-20170606/ob-ref hides
/usr/share/emacs/25.3/lisp/org/ob-ref
/home/mark/.emacs.d/elpa/org-20170606/ox hides /usr/share/emacs/25.3/lisp/org/ox
/home/mark/.emacs.d/elpa/org-20170606/ox-ascii hides
/usr/share/emacs/25.3/lisp/org/ox-ascii
/home/mark/.emacs.d/elpa/org-20170606/org-bbdb hides
/usr/share/emacs/25.3/lisp/org/org-bbdb
/home/mark/.emacs.d/elpa/org-20170606/ob-java hides
/usr/share/emacs/25.3/lisp/org/ob-java
/home/mark/.emacs.d/elpa/org-20170606/org-agenda hides
/usr/share/emacs/25.3/lisp/org/org-agenda
/home/mark/.emacs.d/elpa/org-20170606/ob-mscgen hides
/usr/share/emacs/25.3/lisp/org/ob-mscgen
/home/mark/.emacs.d/elpa/org-20170606/ob-org hides
/usr/share/emacs/25.3/lisp/org/ob-org
/home/mark/.emacs.d/elpa/org-20170606/ob-js hides
/usr/share/emacs/25.3/lisp/org/ob-js
/home/mark/.emacs.d/elpa/org-20170606/org-w3m hides
/usr/share/emacs/25.3/lisp/org/org-w3m
/home/mark/.emacs.d/elpa/org-20170606/ob-comint hides
/usr/share/emacs/25.3/lisp/org/ob-comint
/home/mark/.emacs.d/elpa/org-20170606/ob-sqlite hides
/usr/share/emacs/25.3/lisp/org/ob-sqlite
/home/mark/.emacs.d/elpa/org-20170606/org-protocol hides
/usr/share/emacs/25.3/lisp/org/org-protocol
/home/mark/.emacs.d/elpa/org-20170606/org-clock hides
/usr/share/emacs/25.3/lisp/org/org-clock
/home/mark/.emacs.d/elpa/org-20170606/ob-picolisp hides
/usr/share/emacs/25.3/lisp/org/ob-picolisp
/home/mark/.emacs.d/elpa/org-20170606/ob hides /usr/share/emacs/25.3/lisp/org/ob
/home/mark/.emacs.d/elpa/org-20170606/org-loaddefs hides
/usr/share/emacs/25.3/lisp/org/org-loaddefs
/home/mark/.emacs.d/elpa/org-20170606/ob-calc hides
/usr/share/emacs/25.3/lisp/org/ob-calc
/home/mark/.emacs.d/elpa/org-20170606/ob-lob hides
/usr/share/emacs/25.3/lisp/org/ob-lob
/home/mark/.emacs.d/elpa/org-20170606/org-eshell hides
/usr/share/emacs/25.3/lisp/org/org-eshell
/home/mark/.emacs.d/elpa/org-20170606/org-habit hides
/usr/share/emacs/25.3/lisp/org/org-habit
/home/mark/.emacs.d/elpa/org-20170606/ob-python hides
/usr/share/emacs/25.3/lisp/org/ob-python
/home/mark/.emacs.d/elpa/org-20170606/ob-fortran hides
/usr/share/emacs/25.3/lisp/org/ob-fortran
/home/mark/.emacs.d/elpa/org-20170606/org-archive hides
/usr/share/emacs/25.3/lisp/org/org-archive
/home/mark/.emacs.d/elpa/org-20170606/ob-clojure hides
/usr/share/emacs/25.3/lisp/org/ob-clojure
/home/mark/.emacs.d/elpa/org-20170606/org-timer hides
/usr/share/emacs/25.3/lisp/org/org-timer
/home/mark/.emacs.d/elpa/org-20170606/ob-exp hides
/usr/share/emacs/25.3/lisp/org/ob-exp
/home/mark/.emacs.d/elpa/org-20170606/ob-shen hides
/usr/share/emacs/25.3/lisp/org/ob-shen
/home/mark/.emacs.d/elpa/org-20170606/org-element hides
/usr/share/emacs/25.3/lisp/org/org-element
/home/mark/.emacs.d/elpa/org-20170606/org-docview hides
/usr/share/emacs/25.3/lisp/org/org-docview
/home/mark/.emacs.d/elpa/org-20170606/ox-md hides
/usr/share/emacs/25.3/lisp/org/ox-md
/home/mark/.emacs.d/elpa/org-20170606/org-ctags hides
/usr/share/emacs/25.3/lisp/org/org-ctags
/home/mark/.emacs.d/elpa/org-20170606/org-inlinetask hides
/usr/share/emacs/25.3/lisp/org/org-inlinetask
/home/mark/.emacs.d/elpa/org-20170606/ob-keys hides
/usr/share/emacs/25.3/lisp/org/ob-keys
/home/mark/.emacs.d/elpa/org-20170606/ob-ledger hides
/usr/share/emacs/25.3/lisp/org/ob-ledger
/home/mark/.emacs.d/elpa/org-20170606/org-entities hides
/usr/share/emacs/25.3/lisp/org/org-entities
/home/mark/.emacs.d/elpa/org-20170606/org-attach hides
/usr/share/emacs/25.3/lisp/org/org-attach
/home/mark/.emacs.d/elpa/org-20170606/ox-odt hides
/usr/share/emacs/25.3/lisp/org/ox-odt
/home/mark/.emacs.d/elpa/org-20170606/ob-ocaml hides
/usr/share/emacs/25.3/lisp/org/ob-ocaml
/home/mark/.emacs.d/elpa/org-20170606/ob-gnuplot hides
/usr/share/emacs/25.3/lisp/org/ob-gnuplot
/home/mark/.emacs.d/elpa/org-20170606/ob-maxima hides
/usr/share/emacs/25.3/lisp/org/ob-maxima
/home/mark/.emacs.d/elpa/org-20170606/ob-latex hides
/usr/share/emacs/25.3/lisp/org/ob-latex
/home/mark/.emacs.d/elpa/org-20170606/ox-latex hides
/usr/share/emacs/25.3/lisp/org/ox-latex
/home/mark/.emacs.d/elpa/org-20170606/ox-texinfo hides
/usr/share/emacs/25.3/lisp/org/ox-texinfo
/home/mark/.emacs.d/elpa/org-20170606/ob-scheme hides
/usr/share/emacs/25.3/lisp/org/ob-scheme
/home/mark/.emacs.d/elpa/org-20170606/org-crypt hides
/usr/share/emacs/25.3/lisp/org/org-crypt
/home/mark/.emacs.d/elpa/org-20170606/ob-eval hides
/usr/share/emacs/25.3/lisp/org/ob-eval
/home/mark/.emacs.d/elpa/org-20170606/ox-publish hides
/usr/share/emacs/25.3/lisp/org/ox-publish
/home/mark/.emacs.d/elpa/org-20170606/ob-lisp hides
/usr/share/emacs/25.3/lisp/org/ob-lisp
/home/mark/.emacs.d/elpa/org-20170606/org-info hides
/usr/share/emacs/25.3/lisp/org/org-info
/home/mark/.emacs.d/elpa/org-20170606/ob-ditaa hides
/usr/share/emacs/25.3/lisp/org/ob-ditaa
/home/mark/.emacs.d/elpa/org-20170606/ob-R hides
/usr/share/emacs/25.3/lisp/org/ob-R
/home/mark/.emacs.d/elpa/org-20170606/org-datetree hides
/usr/share/emacs/25.3/lisp/org/org-datetree
/home/mark/.emacs.d/elpa/org-20170606/ox-icalendar hides
/usr/share/emacs/25.3/lisp/org/ox-icalendar
/home/mark/.emacs.d/elpa/org-20170606/ob-io hides
/usr/share/emacs/25.3/lisp/org/ob-io
/home/mark/.emacs.d/elpa/org-20170606/org-footnote hides
/usr/share/emacs/25.3/lisp/org/org-footnote
/home/mark/.emacs.d/elpa/org-20170606/org-mhe hides
/usr/share/emacs/25.3/lisp/org/org-mhe
/home/mark/.emacs.d/elpa/org-20170606/org-colview hides
/usr/share/emacs/25.3/lisp/org/org-colview
/home/mark/.emacs.d/elpa/org-20170606/ob-css hides
/usr/share/emacs/25.3/lisp/org/ob-css
/home/mark/.emacs.d/elpa/org-20170606/ob-plantuml hides
/usr/share/emacs/25.3/lisp/org/ob-plantuml
/home/mark/.emacs.d/elpa/org-20170606/ob-emacs-lisp hides
/usr/share/emacs/25.3/lisp/org/ob-emacs-lisp
/home/mark/.emacs.d/elpa/org-20170606/ox-html hides
/usr/share/emacs/25.3/lisp/org/ox-html
/home/mark/.emacs.d/elpa/org-20170606/org-macro hides
/usr/share/emacs/25.3/lisp/org/org-macro
/home/mark/.emacs.d/elpa/org-20170606/ob-ruby hides
/usr/share/emacs/25.3/lisp/org/ob-ruby
/home/mark/.emacs.d/elpa/org-20170606/org-id hides
/usr/share/emacs/25.3/lisp/org/org-id
/home/mark/.emacs.d/elpa/org-20170606/ob-tangle hides
/usr/share/emacs/25.3/lisp/org/ob-tangle
/home/mark/.emacs.d/elpa/org-20170606/ob-screen hides
/usr/share/emacs/25.3/lisp/org/ob-screen
/home/mark/.emacs.d/elpa/org-20170606/ob-sql hides
/usr/share/emacs/25.3/lisp/org/ob-sql
/home/mark/.emacs.d/elpa/org-20170606/org-install hides
/usr/share/emacs/25.3/lisp/org/org-install
/home/mark/.emacs.d/elpa/org-20170606/org-capture hides
/usr/share/emacs/25.3/lisp/org/org-capture
Features:
(shadow sort mail-extr emacsbug message idna dired format-spec rfc822
mml mml-sec password-cache epg gnus-util mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util help-fns mail-prsvr mail-utils term/xterm
xterm time-date disp-table org-install finder-inf cl-seq cl-macs info rx
package epg-config seq byte-opt gv bytecomp byte-compile cl-extra
help-mode easymenu cconv cl-loaddefs pcase cl-lib uim-leim uim advice
uim-helper uim-candidate uim-preedit uim-key uim-util uim-debug
uim-keymap uim-var uim-version mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment
elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan
thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian
slovak czech european ethiopic indian cyrillic chinese charscript
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote dbusbind inotify
dynamic-setting system-font-setting font-render-setting xwidget-internal
move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)
Memory information:
((conses 16 131598 4734)
(symbols 48 23041 0)
(miscs 40 45 154)
(strings 32 23862 4596)
(string-bytes 1 696636)
(vectors 16 13413)
(vector-slots 8 423793 2418)
(floats 8 198 588)
(intervals 56 274 8)
(buffers 976 20))