Re: AW: font-locking and open parens in column 0

From: Stefan Monnier
Subject: Re: AW: font-locking and open parens in column 0
Date: Fri, 10 Nov 2006 23:52:50 -0500
>> The purpose of open-paren-in-column-0-is-defun-start is to enable to
>> disable this heuristic.  Setting it to nil disables the heuristic.
>> So I think his change is correct.

>     Huh?  What do you mean by "heuristic".

> Emacs is looking for an open-paren at top level in list structure.

When?  We're talking about a patch to beginning-of-defun, right?
Beginning-of-defun has until now been defined by regexps.

> The heuristic is to take an open-paren at column zero and assume
> it is at top level.

The docstring of beginning-of-defun doesn't say anything about "toplevel".

AFAIK beginning-of-defun is a function in the same category as
forward-paragraph and things like that, which do not care much about the
global syntax state, but instead only pay attention to the immediately
surrounding text.

> I am having trouble relating the rest of your message to this issue.
> This is just a matter of implementing
> open-paren-in-column-0-is-defun-start = nil to do what it says it will do.

The discussion about beginning-of-defun is intricately tied to syntax-ppss:

1 - it started with a bug-report about wrong font-locking because
    beginning-of-defun is apparently used by cc-mode to get a safe starting
    point for syntactic analysis.
2 - beginning-of-defun has been used as a syntax-begin-function on several
    occasions, as a heuristic to find a safe starting point for
    syntactic analysis.
3 - changing beginning-of-defun to use parse-partial-sexp (or worse
    syntax-ppss) defeats the purpose of using it as a syntax-begin-function.
    It may even break such uses (e.g. in emacs-lisp-mode).
4 - the original motivation for the patch (i.e. point 1 above) is better
    addressed by not using beginning-of-defun and relying on syntax-ppss's
    cache instead.
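
To make point 4 concrete, here is a minimal sketch of getting a safe
starting point from syntax-ppss instead of beginning-of-defun.  The name
my-syntactic-toplevel-start is made up for illustration (it is not an
existing Emacs function), and the sketch assumes the buffer's syntax table
describes the language's parens, strings and comments correctly.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-syntactic-toplevel-start (&optional pos)
  "Return the start of the outermost syntactic element around POS.
POS defaults to point.  Return nil if POS is already at toplevel."
  ;; `syntax-ppss' moves point when given POS, hence `save-excursion'.
  (let ((ppss (save-excursion (syntax-ppss (or pos (point))))))
    ;; Element 9 is the list of currently open parens, outermost first.
    ;; Element 8 is the start of the enclosing string or comment, if any.
    (or (car (nth 9 ppss))
        (nth 8 ppss))))
```

Unlike the column-0 heuristic, this is not fooled by an open paren that
happens to sit in column 0 inside a nested form, and repeated calls can
reuse the cache that syntax-ppss maintains.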

I think the problem is that beginning-of-defun has many different possible
uses, not all of which are compatible:

1 - it can be used as a "move to toplevel" (i.e. outside of any syntactic
    element).  Currently it's not a reliable way to do that, but it's been
    used as a good heuristic.  Note that in some languages such a concept
    may not even be very meaningful: for example, in languages whose files
    are commonly composed of only one toplevel element (typically a module
    or a class, which then contains other elements, maybe themselves
    classes or modules, ...).

2 - it can be used as a form of "backward-paragraph-for-prog-langs", to move
    to the beginning of a "block of text".  In cases where defuns can be
    nested, this first moves only to the beginning of the nested defun.

3 - a mix of the two: define some level of nesting (if any) as the main one
    (typically either the toplevel one, or if the toplevel is a single
    element, use the next level down) and move to the beginning of the
    defun at that level.
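
As a rough illustration of behavior 3, assuming a language where nesting is
expressed with parens, one could pick the enclosing group at a chosen
nesting level from the parse state.  The function name and the LEVEL
argument are hypothetical, not part of any existing API.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-defun-start-at-level (level &optional pos)
  "Return the start of the paren group enclosing POS at nesting LEVEL.
LEVEL 0 is the outermost group, 1 the next level down, and so on.
POS defaults to point.  Return nil if POS is not nested that deeply."
  ;; `syntax-ppss' moves point when given POS, hence `save-excursion'.
  (let ((ppss (save-excursion (syntax-ppss (or pos (point))))))
    ;; Element 9 lists the open parens around POS, outermost first.
    (nth level (nth 9 ppss))))
```

A mode whose files consist of a single toplevel module would use level 1 as
its "main" level, while a mode like emacs-lisp-mode would use level 0.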

Interactive use mostly wants behavior 2 or 3.
A reliable way to get behavior 1 is to use syntax-ppss rather than

The proposed patch basically tries to make beginning-of-defun follow the
behavior number 1 and to make it do so reliably.  Given the availability of
syntax-ppss to get the same result, I don't think this patch is such
a good idea.

OTOH it might be a good idea indeed to change beginning-of-defun so that it
ignores regexp-matches if they're inside comments or strings.  But that'd be
a different patch, which would apply regardless of


