Re: 2 character comment starter bug

From: Stefan Monnier
Subject: Re: 2 character comment starter bug
Date: Wed, 23 Mar 2005 18:10:45 -0500
[ Please keep the discussions on the mailing-list. ]

>> >     (modify-syntax-entry ?\= "_ b12" st) ; comment start == 
>> Yes, it seems the problem is that your 2-char comment sequence is made
>> of symbol-chars, so there are cases where the code does things like "oh,
>> here's a symbol, let's skip it" without checking whether some of the
>> chars that compose the symbol happen to also be a comment-marker.
>> Does your = char really need to have "symbol" syntax (i.e. "_") or
>> could it have punctuation syntax instead (i.e. ".") ?

> Punctuation syntax seems to cause all kinds of problems.  The =
> character is commonly used as the first character in
> filenames.

But does it matter in such a case whether it has punctuation syntax or
symbol syntax?  Do you also give symbol syntax to the / directory
separator?

> It's also part of several operators such as := and ':=' and '=:' which
> would behave quite oddly without proper syntax.

Traditionally, punctuation syntax has been used specifically for things like
the above.  So, I'd say that punctuation *is* the proper syntax.  If you use
symbol syntax for those chars, things like M-C-f risk skipping over
"foo:=bar" in "foo:=bar + 1", rather than just skipping over "foo".

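To make the M-C-f point concrete, here is a minimal sketch.  The function
name `demo-skipped-by-forward-sexp' is made up for illustration, and the
syntax table only sets the class of `:' and `=' (no comment flags), on top
of the standard syntax table.

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-skipped-by-forward-sexp (class)
  "Return the text `forward-sexp' skips in \"foo:=bar + 1\".
CLASS is the syntax descriptor given to `:' and `=', e.g. \"_\" or \".\"."
  (let ((st (make-syntax-table)))      ; inherits from the standard table
    (modify-syntax-entry ?: class st)
    (modify-syntax-entry ?= class st)
    (with-temp-buffer
      (set-syntax-table st)
      (insert "foo:=bar + 1")
      (goto-char (point-min))
      (forward-sexp 1)                 ; what M-C-f does
      (buffer-substring (point-min) (point)))))

(demo-skipped-by-forward-sexp "_")     ; symbol syntax      => "foo:=bar"
(demo-skipped-by-forward-sexp ".")     ; punctuation syntax => "foo"
```
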
> It sounds like you are saying it might be a problem to fix the code.

Yes: it might take a bit of work; it risks slowing down syntax-based
operations in all buffers; and it could introduce bugs in other languages
where the current behavior is closer to what we want (after all, if you
define your language using lex and you say that a symbol can be [a-z=_]+
and a comment starter is ==, your lexer will take `foo==' to be a symbol
and won't see the comment starter in it).
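
The lex analogy can be mimicked with a plain regexp.  This is only a sketch
of longest-match tokenizing (it is not how the syntax code in Emacs works),
and `demo-first-token' is a made-up name:

```elisp
;; Hypothetical helper: apply the symbol rule [a-z=_]+ at the start of INPUT.
(defun demo-first-token (input)
  "Return the longest prefix of INPUT made of the chars a-z, `=' and `_', or nil."
  (let ((case-fold-search nil))        ; keep the match case-sensitive
    (when (string-match "\\`[a-z=_]+" input)
      (match-string 0 input))))

;; The symbol swallows the would-be comment starter:
(demo-first-token "foo== rest of line")   ; => "foo=="
```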

The current behavior is buggy (it doesn't behave consistently between
things like forward-sexp, backward-sexp, and parse-partial-sexp).

But before someone can convince me to try and fix these bugs, they should
first make a good case that the way they set up their syntax-tables is well
thought out.
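
For reference, a punctuation-based setup could look roughly like the sketch
below.  It assumes that a `==' comment runs to the end of the line (I don't
know your language, so the newline entry is a guess), and the names
`demo-punct-syntax-table' and `demo-in-comment-p' are invented for the
example.

```elisp
;; Hypothetical table: `=' is punctuation and `==' starts a comment.
(defvar demo-punct-syntax-table
  (let ((st (make-syntax-table)))
    ;; "." = punctuation; "1" and "2" mark `=' as the first and second
    ;; char of the two-char comment starter; "b" selects comment style b.
    (modify-syntax-entry ?= ". 12b" st)
    ;; Assumption: a newline ends the style-b comment.
    (modify-syntax-entry ?\n "> b" st)
    st)
  "Sketch of a syntax table where `=' has punctuation syntax.")

(defun demo-in-comment-p (text pos)
  "Return t if buffer position POS in TEXT is inside a comment."
  (with-temp-buffer
    (set-syntax-table demo-punct-syntax-table)
    (insert text)
    (and (nth 4 (parse-partial-sexp (point-min) pos)) t)))

;; In "a = b == c\nd": a lone `=' is not a comment starter, `==' is.
(demo-in-comment-p "a = b == c\nd" 5)    ; before "b" => nil
(demo-in-comment-p "a = b == c\nd" 10)   ; before "c" => t
(demo-in-comment-p "a = b == c\nd" 12)   ; before "d" => nil
```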


