[RFA/RFC] extract strings from m4 skeletons
From: Paolo Bonzini
Subject: [RFA/RFC] extract strings from m4 skeletons
Date: Thu, 18 Jan 2007 12:04:13 +0100
User-agent: Thunderbird 1.5.0.9 (Macintosh/20061207)
This patch implements the idea I suggested a few hours ago.
Conventional wisdom is that only m4 can parse m4 correctly, so that's
what we do.
It also fixes a couple of messages that could not be translated correctly.
One is what Joel had anticipated:
> contained M4 macro invocations will not be expanded.
It's actually worse -- they're wrong, because the string must be fully
known at extraction time. My solution does not cater for this, and
indeed there are a couple of places where it happens. But we can put
big flashing warnings in bison.m4.
In the other case, instead of saying "%define variable `foo'" we have to
say "`%define foo'" to make the translator's life easier. The
problematic case occurred in bison.m4, but I did the same change to the
wording in parse-gram.y too, for consistency.
Anyway, back to the meat of the patch, which is data/m4-xgettext.sh.
1) What does it do? The script takes the list of skeletons and the
list of macros to trace, and produces a `fake' C output with the
strings. This C output is then fed to xgettext. Producing C has the
advantage that xgettext understands #line directives: this way the m4
source file names end up in bison.pot, instead of the fake C file.
2) How does it do that? The script will look inside macro invocations
to find the traced macros. We should use a technique similar to what is
done in autoupdate to achieve perfect results. For now, however, we
rely on something much simpler. We redefine a few key macros in order
to expand all arms of the conditional statements and in order to expand
the loops only once -- which is enough if we only need to look at
autom4te traces.
3) Is it good enough? There is a case in which this mechanism breaks,
that is, if a macro defines another macro. Of course, there is an
example of such a thing in bison.m4. However, the script is coded so
that workarounds can be easily established. In fact, m4 makes "macros
that define macros" complicated enough that I doubt the workaround will
need to be replicated in the future.
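To make points 1) and 2) a bit more concrete, here is a minimal sketch of
the last stage only: turning trace records into the fake C that xgettext
reads. It is not the sed pipeline from the patch. The name traces_to_c is
made up for illustration, and the record layout LINE,,FILE,,MACRO,,STRING
is assumed from the -t format that the script passes to autom4te.

```shell
#! /bin/sh
# Hypothetical helper (not part of the patch): read trace records of the
# assumed form LINE,,FILE,,MACRO,,STRING on stdin and print fake C.
traces_to_c ()
{
  while IFS= read -r rec; do
    # Split on the first three ",," separators; STRING keeps the rest.
    line=${rec%%,,*};   rest=${rec#*,,}
    file=${rest%%,,*};  rest=${rest#*,,}
    macro=${rest%%,,*}; str=${rest#*,,}
    # Drop one level of m4 quoting, if the argument is still wrapped in [...].
    case $str in
      \[*\]) str=${str#\[}; str=${str%\]} ;;
    esac
    # Escape backslashes first, then double quotes, for a C string literal.
    str=$(printf '%s\n' "$str" | LC_ALL=C sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    # The #line directive makes xgettext record the m4 file, not the fake C.
    printf '#line %s "%s"\n_("%s") /* %s */\n' "$line" "$file" "$str" "$macro"
  done
}
```

For example, feeding it the record
56,,data/glr.cc,,b4_fatal,,[%s: using %%defines is mandatory]
(the line number is invented) should give a `#line 56 "data/glr.cc"`
directive followed by `_("%s: using %%defines is mandatory") /* b4_fatal */`,
so that bison.pot points at the skeleton instead of at m4strings.c.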
Paul, Akim, what do you think? You are the real m4-fu-ers.
Paolo
2007-01-18 Paolo Bonzini <address@hidden>
* data/.cvsignore: Add m4strings.c.
* data/Makefile.am (m4_xgettext_files, dist_noinst_DATA,
dist_noinst_SCRIPTS): New.
(m4strings.c): New target.
* data/m4-xgettext.sh: New file.
* data/bison.m4: Fix strings that cannot be translated. Warn about
correct practices for errors issued in the skeleton.
* data/glr.cc, data/lalr1.cc: Don't invoke macros in translated strings.
* src/parse-gram.y: Adapt string to be consistent with data/bison.m4
change.
* po/POTFILES.in: Add m4strings.c.
* tests/input.at: Adjust expected output.
Index: data/.cvsignore
===================================================================
RCS file: /sources/bison/bison/data/.cvsignore,v
retrieving revision 1.2
diff -u -r1.2 .cvsignore
--- data/.cvsignore 20 Jun 2006 11:32:19 -0000 1.2
+++ data/.cvsignore 18 Jan 2007 10:21:29 -0000
@@ -1,2 +1,3 @@
Makefile.in
Makefile
+m4strings.c
Index: data/Makefile.am
===================================================================
RCS file: /sources/bison/bison/data/Makefile.am,v
retrieving revision 1.16
diff -u -r1.16 Makefile.am
--- data/Makefile.am 19 Dec 2006 00:34:36 -0000 1.16
+++ data/Makefile.am 18 Jan 2007 10:21:29 -0000
@@ -15,9 +15,22 @@
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301 USA
-dist_pkgdata_DATA = README bison.m4 \
+m4_xgettext_files = \
c-skel.m4 c.m4 yacc.c glr.c push.c \
c++-skel.m4 c++.m4 location.cc lalr1.cc glr.cc
+dist_pkgdata_DATA = README bison.m4 $(m4_xgettext_files)
+
m4sugardir = $(pkgdatadir)/m4sugar
dist_m4sugar_DATA = m4sugar/m4sugar.m4
+
+dist_noinst_DATA = m4strings.c
+dist_noinst_SCRIPTS = m4-xgettext.sh
+
+m4strings.c: $(m4_xgettext_files) m4-xgettext.sh
+ srcdir="$(srcdir)" AUTOM4TE="$(AUTOM4TE)" \
+ $(SHELL) $(srcdir)/m4-xgettext.sh -M -Bdata \
+ $(m4_xgettext_files) -- \
+ b4_warn:1 b4_warn_at:3 \
+ b4_complain:1 b4_complain_at:3 \
+ b4_fatal:1 b4_fatal_at:3 > $(srcdir)/m4strings.c
Index: data/bison.m4
===================================================================
RCS file: /sources/bison/bison/data/bison.m4,v
retrieving revision 1.14
diff -u -r1.14 bison.m4
--- data/bison.m4 18 Jan 2007 08:32:33 -0000 1.14
+++ data/bison.m4 18 Jan 2007 10:58:52 -0000
@@ -85,7 +85,9 @@
# b4_warn(FORMAT, [ARG1], [ARG2], ...)
# ------------------------------------
-# Write @warn(FORMAT@,ARG1@,ARG2@,...@) to diversion 0.
+# Write @warn(FORMAT@,ARG1@,ARG2@,...@) to diversion 0. Note that FORMAT
+# should not contain any macros, or the extraction of strings for translators
+# will not work correctly.
#
# As a simple test suite, this:
#
@@ -118,12 +120,17 @@
# b4_warn_at(START, END, FORMAT, [ARG1], [ARG2], ...)
# ---------------------------------------------------
# Write @warn(START@,END@,FORMAT@,ARG1@,ARG2@,...@) to diversion 0.
+# Note that FORMAT should not contain any macros, or the extraction of
+# strings for translators will not work correctly.
+
m4_define([b4_warn_at],
[b4_error_at([[warn]], $@)])
# b4_complain(FORMAT, [ARG1], [ARG2], ...)
# ----------------------------------------
# Write @complain(FORMAT@,ARG1@,ARG2@,...@) to diversion 0.
+# Note that FORMAT should not contain any macros, or the extraction of
+# strings for translators will not work correctly.
#
# See the test suite for b4_warn above.
m4_define([b4_complain],
@@ -132,12 +139,16 @@
# b4_complain_at(START, END, FORMAT, [ARG1], [ARG2], ...)
# -------------------------------------------------------
# Write @complain(START@,END@,FORMAT@,ARG1@,ARG2@,...@) to diversion 0.
+# Note that FORMAT should not contain any macros, or the extraction of
+# strings for translators will not work correctly.
m4_define([b4_complain_at],
[b4_error_at([[complain]], $@)])
# b4_fatal(FORMAT, [ARG1], [ARG2], ...)
# -------------------------------------
# Write @fatal(FORMAT@,ARG1@,ARG2@,...@) to diversion 0.
+# Note that FORMAT should not contain any macros, or the extraction of
+# strings for translators will not work correctly.
#
# See the test suite for b4_warn above.
m4_define([b4_fatal],
@@ -146,6 +157,8 @@
# b4_fatal_at(START, END, FORMAT, [ARG1], [ARG2], ...)
# ----------------------------------------------------
# Write @fatal(START@,END@,FORMAT@,ARG1@,ARG2@,...@) to diversion 0.
+# Note that FORMAT should not contain any macros, or the extraction of
+# strings for translators will not work correctly.
m4_define([b4_fatal_at],
[b4_error_at([[fatal]], $@)])
@@ -328,7 +341,7 @@
m4_pushdef([b4_end], m4_shift(m4_shift(b4_occurrence)))dnl
m4_ifndef($3[(]m4_quote(b4_user_name)[)],
[b4_warn_at([b4_start], [b4_end],
- [[%s `%s' is not used]],
+ [[`%s %s' is not used]],
[$1], [b4_user_name])])[]dnl
m4_popdef([b4_occurrence])dnl
m4_popdef([b4_user_name])dnl
@@ -391,12 +404,12 @@
## --------------------------------------------------------- ##
m4_define([b4_check_user_names_wrap],
-[m4_ifdef([b4_percent_]$1[_user_]$2[s],
- [b4_check_user_names([[%]$1 $2],
- [b4_percent_]$1[_user_]$2[s],
- [[b4_percent_]$1[_skeleton_]$2[s]])])])
+[m4_ifdef([b4_percent_]$1[_user_]$2,
+ [b4_check_user_names([[%]$1],
+ [b4_percent_]$1[_user_]$2,
+ [[b4_percent_]$1[_skeleton_]$2])])])
m4_wrap([
-b4_check_user_names_wrap([[define]], [[variable]])
-b4_check_user_names_wrap([[code]], [[qualifier]])
+b4_check_user_names_wrap([[define]], [[variables]])
+b4_check_user_names_wrap([[code]], [[qualifiers]])
])
Index: data/glr.cc
===================================================================
RCS file: /sources/bison/bison/data/glr.cc,v
retrieving revision 1.38
diff -u -r1.38 glr.cc
--- data/glr.cc 18 Jan 2007 08:32:33 -0000 1.38
+++ data/glr.cc 18 Jan 2007 10:21:31 -0000
@@ -53,7 +53,7 @@
# The header is mandatory.
b4_defines_if([],
- [b4_fatal([b4_skeleton[: using %%defines is mandatory]])])
+ [b4_fatal([[%s: using %%defines is mandatory]], [b4_skeleton])])
m4_include(b4_pkgdatadir/[c++.m4])
m4_include(b4_pkgdatadir/[location.cc])
Index: data/lalr1.cc
===================================================================
RCS file: /sources/bison/bison/data/lalr1.cc,v
retrieving revision 1.156
diff -u -r1.156 lalr1.cc
--- data/lalr1.cc 18 Jan 2007 08:32:33 -0000 1.156
+++ data/lalr1.cc 18 Jan 2007 10:21:33 -0000
@@ -27,7 +27,7 @@
# The header is mandatory.
b4_defines_if([],
- [b4_fatal([b4_skeleton[: using %%defines is mandatory]])])
+ [b4_fatal([[%s: using %%defines is mandatory]], [b4_skeleton])])
# Backward compatibility.
m4_define([b4_location_constructors])
Index: data/m4-xgettext.sh
===================================================================
RCS file: data/m4-xgettext.sh
diff -N data/m4-xgettext.sh
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ data/m4-xgettext.sh 18 Jan 2007 10:21:33 -0000
@@ -0,0 +1,186 @@
+#! /bin/sh
+
+# Extract translated strings from the skeletons.
+
+# This file is part of GNU Bison
+# Copyright 2007 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+# 02111-1307 USA
+
+# Usage: m4-xgettext.sh input-file.m4 ... -- MACRO:ARG MACRO:ARG ...
+#
+# The script will extract the ARG-th argument of each specified
+# macro.
+#
+# We should use a technique similar to what is done in autoupdate
+# to look inside every macro invocation. For now, however, we rely
+# on something much simpler. We produce completely garbled output
+# but there is actually no need to look at it because we only
+# need the autom4te traces. It only breaks in the case of a macro
+# that defines another macro: there is an example of such a thing in
+# bison.m4 and we work around it below.
+
+: ${srcdir=.}
+: ${AUTOM4TE=autom4te}
+
+gettext_m4=`pwd`/gettext.m4
+trap 'exit_status=$?
+ rm -f $gettext_m4
+ exit $exit_status' 0
+
+lc_all_sed ()
+{
+ LC_ALL=C sed "$@"
+}
+
+# ------------------------------
+# Gather command line arguments.
+# ------------------------------
+
+AUTOM4TE_ARGS="-l M4sugar --no-cache"
+while true; do
+ case "$1" in
+ -B)
+ AUTOM4TE_ARGS="$AUTOM4TE_ARGS -B $2"
+ file_prefix=$2/
+ shift
+ shift
+ ;;
+ -B*)
+ AUTOM4TE_ARGS="$AUTOM4TE_ARGS $1"
+ file_prefix=`echo $1 | sed 's,-B,,' `/
+ shift
+ ;;
+ -M|--melt|-v|--verbose|-d|--debug)
+ AUTOM4TE_ARGS="$AUTOM4TE_ARGS $1"
+ shift
+ ;;
+ *)
+ break
+ ;;
+ esac
+done
+
+# ---------------------------------------
+# Create the m4 file that will be traced.
+# ---------------------------------------
+
+cat > gettext.m4 <<\_EOF
+# Use a different prefix for macros that we need later.
+m4_copy([m4_builtin], [_m4_builtin])
+m4_copy([m4_ifdef], [_m4_ifdef])
+m4_copy([m4_define], [_m4_define])
+
+# Throw away all the output. We only use the autom4te traces.
+m4_define([m4_divert_push], [])
+m4_define([m4_divert_pop], [])
+m4_define([m4_divert], [])
+
+# Throw away include files, we include all we need explicitly.
+m4_define([m4_include], [$@])
+m4_define([m4_sinclude], [$@])
+
+# Process both sides of conditional macros.
+m4_define([m4_eval], [$1])
+m4_define([m4_foreach], [$3])
+m4_define([m4_map], [$2])
+m4_define([m4_map_sep], [$2])
+m4_define([m4_if], [$@])
+m4_define([m4_ifdef], [$2[]$3])
+m4_define([m4_ifndef], [$2[]$3])
+m4_define([m4_ifval], [$2[]$3])
+m4_define([m4_case], [$@])
+
+# Throw away the special meanings of macros defined by/for the skeletons:
+# just expand the definition in order to get the occurrences of traced
+# macros in the expansion, then define the macro to "$@".
+#
+# Don't redefine already defined macros, though, so that we can override
+# problematic cases below.
+
+_m4_define([m4_define], [$2[]_m4_ifdef([$1], [], [_m4_define([$1], [$@])])])
+_m4_define([m4_pushdef], [$2[]_m4_ifdef([$1], [], [_m4_define([$1], [$@])])])
+_m4_define([m4_define_default], [$2[]_m4_ifdef([$1], [], [_m4_define([$1], [$@])])])
+
+# Almost the same as above, but no need to expand the macro name.
+_m4_define([m4_copy], [_m4_ifdef([$1], [], [_m4_define([$1], [$@])])])
+
+# These add nothing, since we expand the definition of the macro below.
+_m4_define([m4_defn], [])
+_m4_define([m4_indir], [])
+_m4_define([m4_shift], [])
+
+# This is unnecessary because our m4_pushdef is equivalent to m4_define.
+_m4_define([m4_popdef], [])
+
+# A hack: we need to override the definition of b4_define_flag_if in bison.m4,
+# because it defines other macros.
+_m4_define([b4_define_flag_if],
+[_m4_define([b4_$1_if], [$][1][$][2])])])
+
+# Include bison.m4 first to get the b4_*_if macros.
+_m4_builtin([include], [bison.m4])
+
+_EOF
+
+# Add includes for the requested files.
+while [ $# -gt 0 ] && [ "$1" != -- ]; do
+ case "$1" in
+ bison.m4 | gettext.m4 ) ;;
+ *) echo "_m4_builtin([include], [$1])" ;;
+ esac
+ shift
+done >> gettext.m4
+
+# -------------------------------------
+# Compose the -t arguments to autom4te.
+# -------------------------------------
+
+shift
+args="$*"
+set fnord
+for i in $args; do
+ name=`echo $i | sed 's,:.*,,' `
+ arg=`echo $i | sed 's,.*:,,' `
+ fmt=
+
+ set -- "$@" -t "$name:\$l,,${file_prefix}\$f,,\$n,,\$$arg"
+done
+
+# Remove the fnord.
+shift
+
+# ------------------
+# Gather the traces.
+# ------------------
+
+# The sed script formats the traces as C source code so that xgettext
+# interprets the #line directives.
+#
+# We need to remove [...] to cater for double-quoted arguments, and
+# to format the argument as a C string.
+
+cd $srcdir
+$AUTOM4TE $AUTOM4TE_ARGS "$@" $gettext_m4 | lc_all_sed \
+ -e 's:\(.*\),,\(.*\),,\(.*\),,\(.*\):#line \1 "\2" \
+_(\4) /* \3 */: ' \
+ -e 'P' \
+ -e 's:.*\n::' \
+ -e 's:\\:\\\\:g' \
+ -e 's:":\\":g' \
+ -e 's:^_(\[\(.*\)]) /\*:_(\1) /*:g' \
+ -e 's:^_(\(.*\)) /\*:_("\1") /*:g' || exit 1
+
Index: po/POTFILES.in
===================================================================
RCS file: /sources/bison/bison/po/POTFILES.in,v
retrieving revision 1.23
diff -u -r1.23 POTFILES.in
--- po/POTFILES.in 1 Oct 2006 23:35:37 -0000 1.23
+++ po/POTFILES.in 18 Jan 2007 10:21:34 -0000
@@ -1,3 +1,5 @@
+data/m4strings.c
+
src/complain.c
src/conflicts.c
src/files.c
Index: src/parse-gram.y
===================================================================
RCS file: /sources/bison/bison/src/parse-gram.y,v
retrieving revision 1.117
diff -u -r1.117 parse-gram.y
--- src/parse-gram.y 18 Jan 2007 02:18:17 -0000 1.117
+++ src/parse-gram.y 18 Jan 2007 10:33:30 -0000
@@ -242,7 +242,7 @@
strcpy (name + sizeof name_prefix - 1, $2);
strcpy (name + sizeof name_prefix - 1 + length, ")");
if (muscle_find_const (name))
- warn_at (@2, _("%s `%s' redefined"), "%define variable", $2);
+ warn_at (@2, _("`%%define %s' redefined"), $2);
MUSCLE_INSERT_STRING (uniqstr_new (name), $3);
free (name);
muscle_grow_user_name_list ("percent_define_user_variables", $2, @2);
Index: tests/input.at
===================================================================
RCS file: /sources/bison/bison/tests/input.at,v
retrieving revision 1.75
diff -u -r1.75 input.at
--- tests/input.at 17 Jan 2007 08:36:07 -0000 1.75
+++ tests/input.at 18 Jan 2007 10:21:35 -0000
@@ -718,9 +718,9 @@
start: ;
]])
AT_CHECK([[bison input-c.y]], [0], [],
-[[input-c.y:1.7: warning: %code qualifier `q' is not used
-input-c.y:2.7-9: warning: %code qualifier `bad' is not used
-input-c.y:3.7-9: warning: %code qualifier `bad' is not used
+[[input-c.y:1.7: warning: `%code q' is not used
+input-c.y:2.7-9: warning: `%code bad' is not used
+input-c.y:3.7-9: warning: `%code bad' is not used
]])
AT_DATA([input-c-glr.y],
@@ -731,9 +731,9 @@
start: ;
]])
AT_CHECK([[bison input-c-glr.y]], [0], [],
-[[input-c-glr.y:1.7: warning: %code qualifier `q' is not used
-input-c-glr.y:2.7-9: warning: %code qualifier `bad' is not used
-input-c-glr.y:3.8-10: warning: %code qualifier `bad' is not used
+[[input-c-glr.y:1.7: warning: `%code q' is not used
+input-c-glr.y:2.7-9: warning: `%code bad' is not used
+input-c-glr.y:3.8-10: warning: `%code bad' is not used
]])
AT_DATA([input-c++.y],
@@ -744,9 +744,9 @@
start: ;
]])
AT_CHECK([[bison input-c++.y]], [0], [],
-[[input-c++.y:1.7: warning: %code qualifier `q' is not used
-input-c++.y:2.7-9: warning: %code qualifier `bad' is not used
-input-c++.y:3.8: warning: %code qualifier `q' is not used
+[[input-c++.y:1.7: warning: `%code q' is not used
+input-c++.y:2.7-9: warning: `%code bad' is not used
+input-c++.y:3.8: warning: `%code q' is not used
]])
AT_DATA([input-c++-glr.y],
@@ -757,9 +757,9 @@
start: ;
]])
AT_CHECK([[bison input-c++-glr.y]], [0], [],
-[[input-c++-glr.y:1.7-9: warning: %code qualifier `bad' is not used
-input-c++-glr.y:2.7: warning: %code qualifier `q' is not used
-input-c++-glr.y:3.7: warning: %code qualifier `q' is not used
+[[input-c++-glr.y:1.7-9: warning: `%code bad' is not used
+input-c++-glr.y:2.7: warning: `%code q' is not used
+input-c++-glr.y:3.7: warning: `%code q' is not used
]])
AT_DATA([special-char-@@.y],
@@ -770,9 +770,9 @@
start: ;
]])
AT_CHECK([[bison special-char-@@.y]], [0], [],
-[[special-char-@@.y:1.7-9: warning: %code qualifier `bad' is not used
-special-char-@@.y:2.7: warning: %code qualifier `q' is not used
-special-char-@@.y:3.7: warning: %code qualifier `q' is not used
+[[special-char-@@.y:1.7-9: warning: `%code bad' is not used
+special-char-@@.y:2.7: warning: `%code q' is not used
+special-char-@@.y:3.7: warning: `%code q' is not used
]])
AT_DATA([special-char-@:>@.y],
@@ -783,9 +783,9 @@
start: ;
]])
AT_CHECK([[bison special-char-@:>@.y]], [0], [],
-[[special-char-@:>@.y:1.7-9: warning: %code qualifier `bad' is not used
-special-char-@:>@.y:2.7: warning: %code qualifier `q' is not used
-special-char-@:>@.y:3.7: warning: %code qualifier `q' is not used
+[[special-char-@:>@.y:1.7-9: warning: `%code bad' is not used
+special-char-@:>@.y:2.7: warning: `%code q' is not used
+special-char-@:>@.y:3.7: warning: `%code q' is not used
]])
AT_CLEANUP
@@ -808,13 +808,13 @@
]])
AT_CHECK([[bison input.y]], [0], [],
-[[input.y:2.9-11: warning: %define variable `var' redefined
-input.y:3.10-12: warning: %define variable `var' redefined
-input.y:1.9-11: warning: %define variable `var' is not used
-input.y:2.9-11: warning: %define variable `var' is not used
-input.y:3.10-12: warning: %define variable `var' is not used
-input.y:4.9-16: warning: %define variable `special1' is not used
-input.y:5.9-16: warning: %define variable `special2' is not used
+[[input.y:2.9-11: warning: `%define var' redefined
+input.y:3.10-12: warning: `%define var' redefined
+input.y:1.9-11: warning: `%define var' is not used
+input.y:2.9-11: warning: `%define var' is not used
+input.y:3.10-12: warning: `%define var' is not used
+input.y:4.9-16: warning: `%define special1' is not used
+input.y:5.9-16: warning: `%define special2' is not used
]])
AT_CLEANUP