Re: [patch] i18n, l10n, gettext and something more

From: Daniel Skarda
Subject: Re: [patch] i18n, l10n, gettext and something more
Date: 02 Aug 2001 13:28:11 +0000
User-agent: Gnus/5.0806 (Gnus v5.8.6) Emacs/20.6

>     Daniel>      1) it does not record exact position of strings
> Out of interest, what does source-properties record?  Is it just the
> beginning of each parenthesized expression?

  It seems so (unfortunately, a string may not start on the same line as its
"parent" expression).

>     Daniel>      2) it always interprets escape sequences (even
>     Daniel> thought when you do not want to)
> When do you not want it to?

  In my opinion extracted strings should look the same in .po files and .scm
files. When there is \t inside a string in .scm, there should be \t in the .po
file and not "        " (and vice versa), unless the user turns on --escape
(escape every `suspicious' character :)
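
  To make the point concrete, here is a minimal sketch in C of "raw"
extraction: the body of a string literal is copied with its backslash sequences
left exactly as written, so \t stays \t. The function name
extract_raw_literal and its interface are hypothetical, chosen only for
illustration; this is not code from xgettext or from guile.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the body of a string literal that starts at
   src[0] == '"' into out, WITHOUT interpreting escape sequences.
   Returns the number of bytes copied, or -1 if the literal is not
   terminated or does not fit into out. */
long extract_raw_literal(const char *src, char *out, size_t out_size)
{
    size_t n = 0;

    assert(src != NULL && out != NULL && out_size > 0);
    if (src[0] != '"')
        return -1;

    for (const char *p = src + 1; *p != '\0'; p++) {
        if (*p == '"') {                /* unescaped quote ends the literal */
            out[n] = '\0';
            return (long)n;
        }
        if (*p == '\\' && p[1] != '\0') {
            if (n + 2 >= out_size)
                return -1;
            out[n++] = *p++;            /* keep the backslash itself */
            out[n++] = *p;              /* keep the escaped character as is */
            continue;
        }
        if (n + 1 >= out_size)
            return -1;
        out[n++] = *p;
    }
    return -1;                          /* ran off the end: unterminated */
}
```

  An --escape style option would then be a separate pass over the extracted
text, applied only when the user asks for it.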
>     Daniel>      3) it discards comments (surprising, is not it? :-)
> Why do you need to translate strings in comments?

  I do not need to translate strings from comments - I want to extract comments
(or xgettext options) to help translators in their work.
>     Daniel>    In some Japanese coding, Japanese part of string is
>     Daniel> encoded between two escape sequences - and between them
>     Daniel> there can be arbitrary characters - even #\" can be there!
>     Daniel> - so scheme parser gets confused on such sequences and
>     Daniel> string parsing is terminated too early.
> This sounds like a serious problem!

  As far as I know, no Japanese user has complained (so maybe it is not that
serious).

>     Daniel>    It was very entertaining to extend guile in this way. I
>     Daniel> have to admit that in one moment I had to say "stop" (I
>     Daniel> caught myself playing with idea about writing parser for C
>     Daniel> or Perl like language... it is a shame I have not enough
>     Daniel> spare time (or sponsor :))
> Perhaps not; we already have ctax to embarass us.  And Thomas Bushnell
> is already working on Python.

  I like it just for the fun of coding (and maybe I would like to work on
some other language once the infrastructure for adding new parsers to guile
settles).

  But infix syntax is already helpful, too. I _like_ scheme syntax, but for
typing more complex math expressions infix syntax is more convenient. So it
comes in handy to "turn off" scheme prefix syntax - just for a few characters :)

  BTW - what's the ctax status (I have not seen a stable release for months) -
who maintains the package?
  I have not checked ctax (or Thomas Bushnell's work) but I suppose that it is
one file - one syntax. (ice-9 infix) has a different intention.

>     Daniel>    - Parser extension (lookup array: char -> procedure) is
>     Daniel> global. That's bad thing - IMHO each port should have one
>     Daniel> such array (arrays could be shared so do not worry about
>     Daniel> wasted memory)
> Agreed.

  And same applies to read-hash-extension...

>     Daniel>      Modularize it. scm_lreadr now looks something like
>     Daniel> this:
>     Daniel>        switch (getc ()) { case '"':
>     Daniel>      I think the speed would not decrease too much if we
>     Daniel> write
>     Daniel>        call (array [c = getc ()], c, port)
> I think it would.  The biggest current Guile performance problem is
> startup time on older machines, and this would be right there in the
> inner loop!
> In my non-maintainerly opinion... Scheme code maybe, but C changes
> definitely not; the performance impact of your proposed reader changes
> is just too great (or at least too uncertain at this point).

  As soon as guile gets a VM and the ability to load/compile bytecode, 'read'
performance is not going to affect startup time anymore. That is the long-term
point of view.

  And what about performance of the 1.6 reader? I have not implemented all the
improvements I suggested, and I would be surprised if the enhanced read
function were noticeably slower. The (array [int] != SCM_BOOL_F) lookup is done
only once per "starting character" of a reader's "atom", so this should not
degrade reader performance ("starting character" means e.g. #\" for strings or
#\; for comments).

  Though one should run some serious tests to be sure...
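
  The idea can be sketched in a few lines of C. This is only an illustration
under my own assumptions, not guile's scm_lreadr: the struct reader, the
handler names and default_table are all hypothetical, and the "reader" merely
counts tokens. What it shows is that the table is consulted once for the first
character of each atom, while the characters inside a string, comment or
symbol are consumed by the handler's own loop. The table pointer lives in the
reader, so each port could have its own table or share one.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct reader reader;
typedef void (*read_handler)(reader *r, int first);

struct reader {
    const char *input;
    size_t pos;
    const read_handler *table;      /* per-reader table; may be shared */
    int strings, comments, atoms;   /* token counts, for demonstration */
};

/* Return the next character, or -1 at end of input. */
int next_char(reader *r)
{
    unsigned char c = (unsigned char)r->input[r->pos];
    if (c == '\0')
        return -1;
    r->pos++;
    return c;
}

/* Handler for '"': skip to the closing quote, honouring backslashes. */
void read_string(reader *r, int first)
{
    int c;
    (void)first;
    while ((c = next_char(r)) >= 0 && c != '"')
        if (c == '\\')
            next_char(r);
    r->strings++;
}

/* Handler for ';': skip to the end of the line. */
void read_comment(reader *r, int first)
{
    int c;
    (void)first;
    while ((c = next_char(r)) >= 0 && c != '\n')
        ;
    r->comments++;
}

/* Fallback: consume up to whitespace or a character that has a handler. */
void read_atom(reader *r, int first)
{
    (void)first;
    for (;;) {
        unsigned char c = (unsigned char)r->input[r->pos];
        if (c == '\0' || isspace(c) || r->table[c] != NULL)
            break;
        r->pos++;
    }
    r->atoms++;
}

/* Hypothetical default table: a NULL entry means "no special handler". */
const read_handler default_table[256] = {
    ['"'] = read_string,
    [';'] = read_comment,
};

reader scan(const char *text, const read_handler *table)
{
    reader r = { text, 0, table, 0, 0, 0 };
    int c;

    assert(text != NULL && table != NULL);
    while ((c = next_char(&r)) >= 0) {
        if (isspace(c))
            continue;
        read_handler h = table[c];  /* one lookup per starting character */
        if (h != NULL)
            h(&r, c);
        else
            read_atom(&r, c);
    }
    return r;
}
```

  Whether the indirect call is measurably slower than the switch is exactly
what the tests would have to show; the sketch says nothing about that.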

>     Daniel>         [+ (+ 1 (+ 1 [+ 1 (+ 1 (+ 1 (+ 1 1)))]))]
>     Daniel>      But scheme hackers say - no way! So what should such
>     Daniel> poor little fellow do? (right answer: change the catcode
>     Daniel> of braces and brackets so they would open/close lists...)
> Unless the little fellow plans never to share any of his/her code, the
> right answer is to learn the prevailing conventions and the tools that
> are available to help with them, IMO.

  You are right - that was a poor example. But one should think about how to
create powerful tools (and not how to prevent others from shooting themselves
in the foot :-)

  Another example I forgot to mention - a case-insensitive parser can be done
the catcode way: instead of a category (negative integers), the translation
table holds the uppercase character for each lowercase letter (and special
categories for other characters - #\(, #\), #\; etc...)
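
  A rough sketch of such a table in C follows. The category constants
(CAT_OPEN and friends), init_fold_table and read_symbol are hypothetical names
made up for this example, and the table assumes ASCII. Negative entries are
categories, non-negative entries are the character to use in place of the one
that was read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical categories: negative, so they never clash with characters. */
enum { CAT_OPEN = -1, CAT_CLOSE = -2, CAT_COMMENT = -3, CAT_SPACE = -4 };

/* Fill the table: identity by default, lowercase folded to uppercase,
   delimiters mapped to categories.  Assumes an ASCII character set. */
void init_fold_table(int table[256])
{
    for (int c = 0; c < 256; c++)
        table[c] = c;
    for (int c = 'a'; c <= 'z'; c++)
        table[c] = c - 'a' + 'A';
    table['('] = CAT_OPEN;
    table[')'] = CAT_CLOSE;
    table[';'] = CAT_COMMENT;
    table[' '] = table['\t'] = table['\n'] = CAT_SPACE;
}

/* Read one symbol from src into out, translating every character through
   the table and stopping at the first character that has a category.
   Returns the number of characters stored. */
size_t read_symbol(const int table[256], const char *src,
                   char *out, size_t out_size)
{
    size_t n = 0;

    assert(src != NULL && out != NULL && out_size > 0);
    while (src[n] != '\0' && n + 1 < out_size) {
        int t = table[(unsigned char)src[n]];
        if (t < 0)                  /* a category ends the symbol */
            break;
        out[n] = (char)t;
        n++;
    }
    out[n] = '\0';
    return n;
}
```

  With this table "define-Foo)" is read as the symbol DEFINE-FOO, and the
closing parenthesis is left for whatever handles CAT_CLOSE.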


ps: Language parsers are also related to the new module system. Does it still
have "waiting for Godot" status, or is somebody working on it? I remember Jost
Boekemeier was working on it, and I noticed "libguile/environment.[ch]", but I
could not find any code that uses it (maybe I am wrong).
