possible bug in anchored font-lock functions

From: Sam Halliday
Subject: possible bug in anchored font-lock functions
Date: Sun, 11 Nov 2018 14:26:03 +0000

Dear all,

I think I might have found a bug in GNU Emacs but I would like to check
my understanding first, before filing a report or (ideally) fixing it.

In Search Based Fontification[1] it is possible to specify a function as a
matcher. It must obey the following contract:

> it receives one argument, the limit of the search; it should begin
> searching at point, and not search beyond the limit. It should return
> non-nil if it succeeds, and set the match data to describe the match
> that was found. Returning nil indicates failure of the search.
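
To make the contract concrete, here is a minimal sketch of a function
matcher. The name `my-match-todo` and the `TODO` pattern are
hypothetical, chosen only for illustration:

```elisp
;; Hypothetical function matcher: finds the symbol TODO.
(defun my-match-todo (limit)
  "Search for \"TODO\" from point, but not beyond LIMIT.
Return non-nil and set the match data on success, nil on failure."
  ;; With NOERROR set to t, `re-search-forward' returns nil on
  ;; failure; on success it sets the match data and returns point.
  (re-search-forward "\\_<TODO\\_>" limit t))

;; Keyword entry of the form (MATCHER SUBEXP FACESPEC).
(defvar my-font-lock-keywords
  '((my-match-todo 0 'font-lock-warning-face)))
```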

In addition, it is possible to extend the region to be fontified by
adding a routine to `font-lock-extend-region-functions` that updates
the `font-lock-beg` or `font-lock-end` variables in place (these
variables are not visible in a function matcher). Let's ignore
approaches based on the `font-lock-multiline` property; I'm not using
them.
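
As an illustration of such a routine, a sketch that widens the region
to paragraph boundaries might look like the following. The function
name and the choice of paragraphs are assumptions made for the
example, not what my mode actually does:

```elisp
;; font-lock binds these dynamically while the hook runs.
(defvar font-lock-beg)
(defvar font-lock-end)

;; Hypothetical region extender.
(defun my-extend-region-to-paragraph ()
  "Widen `font-lock-beg' and `font-lock-end' to paragraph boundaries.
Return non-nil if either variable was changed."
  (let ((changed nil))
    (save-excursion
      (goto-char font-lock-beg)
      (backward-paragraph)
      (when (< (point) font-lock-beg)
        (setq font-lock-beg (point)
              changed t))
      (goto-char font-lock-end)
      (forward-paragraph)
      (when (> (point) font-lock-end)
        (setq font-lock-end (point)
              changed t)))
    changed))

;; Buffer-local registration, typically done in the major mode body.
(add-hook 'font-lock-extend-region-functions
          #'my-extend-region-to-paragraph nil t)
```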

Indeed, I have confirmed that if I extend the region in a
`font-lock-extend-region-functions` routine, then the `limit` does
increase for my function matcher!

However, if I use an `anchored` matcher, having the form `(matcher .
anchored-highlighter)`, where the matcher inside the
`anchored-highlighter` is a `function`, my custom `font-lock-{beg,end}`
regions are ignored and `limit` is much reduced!
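
A reduced sketch of the kind of setup I mean is below. All names and
patterns are hypothetical; the `message` call is only there to show
the `limit` that the anchored function receives:

```elisp
;; Hypothetical anchor: a line starting with the word "import".
(defun my-anchor (limit)
  (re-search-forward "^import\\_>" limit t))

;; Hypothetical anchored matcher: each following symbol.
(defun my-anchored-matcher (limit)
  ;; Debugging aid: log the limit passed in by font-lock.
  (message "anchored limit: %d (point: %d)" limit (point))
  (re-search-forward "\\_<\\(?:\\sw\\|\\s_\\)+\\_>" limit t))

;; (MATCHER . ANCHORED-HIGHLIGHTER), where the anchored highlighter is
;; (ANCHORED-MATCHER PRE-FORM POST-FORM SUBEXP-HIGHLIGHTERS...).
(defvar my-anchored-keywords
  '((my-anchor . (my-anchored-matcher
                  nil   ; PRE-FORM
                  nil   ; POST-FORM
                  (0 'font-lock-constant-face)))))
```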

Is there something I need to do so that anchored matchers receive the
calculated regions or are they only designed to extend to the end of the
current line by default?

If I had to guess, I'd say the anchored matcher is forgetting to use
`font-lock-{beg,end}` and is instead calculating a new limit or using a
cached version of the limit from before
`font-lock-extend-region-functions` ran.

I would greatly appreciate it if somebody could please point me to the
source code in GNU Emacs where the `font-lock-keywords` are called for
anchored matchers. I suspect the limit is also broken for `regexp`
matchers, not just `function`, but I have no way of printing out `limit`
in that case.

A final note: this is the first time I've written a syntax table and
font-lock support for a programming language, and I have found the
experience to be much more pleasant than I expected! The
`font-lock-keywords` API is lovely to work with and I've been using the
`rx` macro to avoid writing regexps by hand... my code reads like a
simplified BNF description!


Best regards,

