bug-gnu-utils

Re: xgettext: problems with PHP heredocs


From: Bruno Haible
Subject: Re: xgettext: problems with PHP heredocs
Date: Mon, 15 May 2006 14:19:36 +0200
User-agent: KMail/1.5

Gaëtan Frenoy wrote:
> Note that this problem was already reported a while ago
> in gnu.utils.bug
> (see
> http://groups.google.be/group/gnu.utils.bug/tree/browse_frm/thread/94bed260eb3dde71/e8c679cb94fb3b8a)

Thanks for re-reporting it; the original reporter had not provided a complete
testcase.

> But so far, I did not find any fix.
>
> First let's describe the problem.  Say you have the
> following PHP source file.
>
> -------- bug_heredoc.php -------------------------- start -
> <?php
>
> $foo = _("Bar");
> echo <<<_END_
> [...]
> <script language="javascript">
> [...]
> </script>
> _END_;
> $foo2 = _("Bar2");
>
> ?>
> -------- bug_heredoc.php ---------------------------- end -
>
> Now, run "xgettext" to extract marked strings :
>
> -- xgettext call ---------------------------------- start -
> $ xgettext -L PHP --omit-header bug_heredoc.php -o -
> #: bug_heredoc.php:3
> msgid "Bar"
> msgstr ""
> -- xgettext call ------------------------------------ end -
>
> Surprisingly, neither "Bar2" nor any of the marked strings
> located after the end of the PHP heredoc section is reported.
>
> If you are not familiar with PHP, here are some words
> about Heredoc syntax:
> http://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.heredoc

You're right; this is a bug in xgettext. Thanks also for the syntax
reference.

> By digging into the code, I found the following fix
> for gettext-tools/src/x-php.c :
>
> -- x-php.c patch --------------------------------- start -
> $ diff -abu x-php.orig.c x-php.c
> --- x-php.orig.c        2003-12-30 12:30:01.000000000 +0100
> +++ x-php.c     2006-05-04 19:14:43.434424200 +0200
> @@ -1087,12 +1087,18 @@
>                   {
>                     int bufidx = 0;
>
> +                   /* Skip blank lines before processing
> +                    * possible label */
> +                   do
> +                     c = phase1_getc ();
> +                    while (c != EOF && (c == '\n' || c == '\r'));
> +
>                     while (bufidx < bufpos)
>                       {
>                         c = phase1_getc ();
>                         if (c == EOF)
>                           break;
> -                       if (c != buffer[bufidx])
> +                       if (c != buffer[bufidx++])
>                           {
>                             phase1_ungetc (c);
>                             break;
> -- x-php.c patch ----------------------------------- end -


Thanks for the patch; the missing bufidx increment is indeed half of the
bug. However, your fix for the blank-lines bug does not work well. Try, for
example, this input file:

======================= foo.php =======================
<?
echo _("Egyptians");
echo <<<EOTMARKER
Ramses
EOTMARKER;
echo _("Babylonians");
echo <<<EOTMARKER
Nebukadnezar
EOTMARKER
echo _("Assyrians");
echo <<<EOTMARKER
Assurbanipal
EOT
echo _("Persians");
echo <<<EOTMARKER
Darius

echo _("Greeks");
echo <<<EOTMARKER
Alexander

EOTMARKER
echo _("Romans");
echo <<<EOTMARKER
Augustus
  EOTMARKER
echo _("Goths");
echo <<<EOTMARKER
Odoakar
EOTMARKER
echo _("Franks");
?>
===============================================

The expected xgettext output here is:

===============================================
msgid "Egyptians"
msgstr ""

msgid "Babylonians"
msgstr ""

msgid "Assyrians"
msgstr ""

msgid "Romans"
msgstr ""

msgid "Franks"
msgstr ""
===============================================
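
The rule that this expected output implies can be sketched as a small
line-based predicate. This is only an illustration under my reading of the
test case: the helper name is_heredoc_terminator is hypothetical and does not
exist in x-php.c, which works on a character stream rather than on whole
lines.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of x-php.c.  Returns true if LINE (given
   without its trailing newline) closes a heredoc opened with LABEL.
   Assumed rule: the label starts in column 0, matches completely, and is
   followed by nothing except an optional ';'.  */
bool
is_heredoc_terminator (const char *line, const char *label)
{
  size_t len = strlen (label);

  if (strncmp (line, label, len) != 0)
    return false;               /* blank, partial or indented label */
  line += len;
  if (*line == ';')
    line++;                     /* the semicolon is optional */
  return *line == '\0';         /* nothing else may follow on the line */
}
```

Under this rule the lines "EOT", the blank lines, and the indented
"  EOTMARKER" in foo.php do not end a heredoc, which is why "Persians",
"Greeks" and "Goths" are absent from the expected output.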

I'm using the appended patch.

*** gettext-0.14.5/gettext-tools/src/x-php.c.bak        2005-05-20 22:46:40.000000000 +0200
--- gettext-0.14.5/gettext-tools/src/x-php.c    2006-05-12 03:42:58.000000000 +0200
***************
*** 1097,1109 ****
                                    phase1_ungetc (c);
                                    break;
                                  }
                              }
!                           c = phase1_getc ();
!                           if (c != ';')
!                             phase1_ungetc (c);
!                           c = phase1_getc ();
!                           if (c == '\n' || c == '\r')
!                             break;
                          }
                      }
  
--- 1097,1113 ----
                                    phase1_ungetc (c);
                                    break;
                                  }
+                                 bufidx++;
                              }
!                             if (bufidx == bufpos)
!                               {
!                               c = phase1_getc ();
!                               if (c != ';')
!                                 phase1_ungetc (c);
!                               c = phase1_getc ();
!                               if (c == '\n' || c == '\r')
!                                 break;
!                               }
                          }
                      }
  
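
For readers without the gettext sources at hand, here is a self-contained
sketch of the same idea: the optional ';' and the newline are only looked for
once the whole label has matched. It is not the x-php.c code. The names
sketch_getc, sketch_ungetc and skip_heredoc_body are hypothetical, and the
in-memory reader merely stands in for phase1_getc and phase1_ungetc.

```c
#include <stddef.h>
#include <stdio.h>              /* EOF */
#include <string.h>

/* Hypothetical in-memory stand-ins for phase1_getc / phase1_ungetc.  */
static const char *input;
static size_t pos;

static int
sketch_getc (void)
{
  return input[pos] != '\0' ? (unsigned char) input[pos++] : EOF;
}

static void
sketch_ungetc (int c)
{
  if (c != EOF)
    pos--;
}

/* TEXT starts right after the newline that ends the "<<<LABEL" line.
   Returns the offset of the first character after the terminator line,
   or the length of TEXT if the heredoc is never terminated.  */
size_t
skip_heredoc_body (const char *text, const char *label)
{
  size_t label_len = strlen (label);
  int c;

  input = text;
  pos = 0;
  for (;;)
    {
      size_t idx = 0;

      /* At the start of a line: try to match the label.  */
      while (idx < label_len)
        {
          c = sketch_getc ();
          if (c == EOF)
            break;
          if (c != (unsigned char) label[idx])
            {
              sketch_ungetc (c);
              break;
            }
          idx++;
        }
      /* Only a complete match may end the heredoc.  */
      if (idx == label_len)
        {
          c = sketch_getc ();
          if (c != ';')
            sketch_ungetc (c);
          c = sketch_getc ();
          if (c == '\n' || c == '\r')
            return pos;
        }
      /* Not a terminator: skip the rest of this line.  */
      do
        c = sketch_getc ();
      while (c != EOF && c != '\n');
      if (c == EOF)
        return pos;
    }
}
```

Without the idx == label_len test, a blank line inside the heredoc body would
fall through to the newline check and end the heredoc too early.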

Bruno




