From: Eli Zaretskii
Subject: [Emacs-diffs] master 95cee7f: Improve the commit-msg Git hook for unibyte environments
Date: Tue, 14 Apr 2015 18:58:42 +0000

branch: master
commit 95cee7f6a6c9332296e386ca6e6fcce3141e5d13
Author: Eli Zaretskii <address@hidden>
Commit: Eli Zaretskii <address@hidden>

    Improve the commit-msg Git hook for unibyte environments
    
    * build-aux/git-hooks/commit-msg: Set LC_ALL=C, before running Awk
    in unibyte environments.  (Suggested by Paul Eggert
    <address@hidden>.)  Use a more accurate approximation to
    [:print:], based on UTF-8 sequences of the unprintable characters.
---
 build-aux/git-hooks/commit-msg |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/build-aux/git-hooks/commit-msg b/build-aux/git-hooks/commit-msg
index 6e31dbc..9661376 100755
--- a/build-aux/git-hooks/commit-msg
+++ b/build-aux/git-hooks/commit-msg
@@ -36,8 +36,11 @@ at_sign=`$awk "$print_at_sign" </dev/null 2>/dev/null`
 if test "$at_sign" != @; then
   at_sign=`LC_ALL=en_US.UTF-8 $awk "$print_at_sign" </dev/null 2>/dev/null`
   if test "$at_sign" = @; then
-    LC_ALL=en_US.UTF-8; export LC_ALL
+    LC_ALL=en_US.UTF-8
+  else
+    LC_ALL=C
   fi
+  export LC_ALL
 fi
 
 # Check the log entry.
@@ -45,10 +48,13 @@ exec $awk -v at_sign="$at_sign" -v cent_sign="$cent_sign" '
   BEGIN {
     # These regular expressions assume traditional Unix unibyte behavior.
     # They are needed for old or broken versions of awk, e.g.,
-    # mawk 1.3.3 (1996), or gawk on MSYS (2015).
+    # mawk 1.3.3 (1996), or gawk on MSYS (2015), and/or for systems that
+    # cannot use UTF-8 as the codeset for the locale.
     space = "[ \f\n\r\t\v]"
     non_space = "[^ \f\n\r\t\v]"
-    non_print = "[\1-\37\177]"
+    # The non_print below rejects control characters and surrogates
+    # UTF-8 for: 0x01-0x1f 0x7f   0x80-0x9f  0xd800-0xdbff  0xdc00-0xdfff
+    non_print = "[\1-\37\177]|\302[\200-\237]|\355[\240-\277][\200-\277]"
 
     # Prefer POSIX regular expressions if available, as they do a
     # better job of checking.  Similarly, prefer POSIX negated
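
The unibyte non_print pattern introduced above can be tried outside the hook. The following is a minimal sketch, not part of the commit: classify is a hypothetical helper name, and it assumes an awk that treats input as bytes when LC_ALL=C is set, which is the case this change targets. It copies the new regular expression verbatim and reports whether any input line matches it.

```sh
#!/bin/sh
# Hypothetical helper (not in the hook): read stdin and print "reject"
# if any line matches the unibyte non_print pattern, else "accept".
classify () {
  # Force byte-oriented matching, as the hook does when no UTF-8
  # locale is usable.
  LC_ALL=C awk '
    BEGIN {
      # Pattern copied from the diff.  The alternatives are:
      #   [\1-\37\177]                 ASCII controls 0x01-0x1f and 0x7f
      #   \302[\200-\237]              UTF-8 for U+0080-U+009F (C1 controls)
      #   \355[\240-\277][\200-\277]   UTF-8 for U+D800-U+DFFF (surrogates)
      non_print = "[\1-\37\177]|\302[\200-\237]|\355[\240-\277][\200-\277]"
    }
    $0 ~ non_print { bad = 1 }
    END { print (bad ? "reject" : "accept") }
  '
}

printf 'plain ASCII text\n' | classify   # no control bytes
printf 'caf\303\251\n'      | classify   # U+00E9, a printable character
printf 'a\302\205b\n'       | classify   # U+0085, a C1 control
printf 'a\355\240\200b\n'   | classify   # U+D800, a surrogate
```

The octal escapes in the printf calls spell out the UTF-8 bytes directly, so the script does not depend on the encoding of the terminal or of the script file. The first two inputs should be accepted and the last two rejected; the old pattern, "[\1-\37\177]", would have let the last two through.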


