2) sgml-indent-line calls sgml-parse-tag-backward, which does
(re-search-backward "[<>]"), finds "<" and performs a simple regexp check.
Thus, <% if a < 3 %> breaks indentation on the following lines, until the
first closing tag.
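
To illustrate, here is a minimal sketch (not the sgml-mode code itself)
that lists every position where the same regexp stops when searching
backward from the end of some text. The name my-angle-hits-backward is
made up for this example.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-angle-hits-backward (text)
  "Return a list of (POSITION . CHAR) for each \"[<>]\" match in TEXT.
Matches are collected while searching backward from the end of TEXT,
so the list starts with the match closest to the end."
  (with-temp-buffer
    (insert text)
    (let (hits)
      ;; Each successful search leaves point at the start of the match,
      ;; so the next iteration continues with the text before it.
      (while (re-search-backward "[<>]" nil t)
        (push (cons (point) (char-after)) hits))
      (nreverse hits))))

;; (my-angle-hits-backward "<p>\n<% if a < 3 %>\n")
```

For that input the "<" of the comparison (position 13) is reported
between the ">" of "%>" and the "<" of "<%". The regexp alone has no way
to tell it apart from a tag delimiter.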
I think we can treat this as a bug in sgml-indent-line, which should try
and use syntax-ppss or something like that instead of regexps.
I wonder how that could be fixed exactly. parse-partial-sexp doesn't look
helpful, because it works with single characters, and sgml is concerned with
full tags. It also has to handle unclosed tags like <br>; some closing tags
are optional, and HTML 4 has self-closing tags.
I think just checking after the regexp-match whether the match was found
within a "comment" should do the trick, assuming we can get syntax-ppss
(or some extension thereof) to treat "other modes" as comments.
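
A rough sketch of that check, under stated assumptions: the function
names are made up, and since nothing marks "other modes" as comments
today, the demo fakes it by putting a generic-comment-fence syntax-table
property on the "%" of each "<%" and "%>". Real multi-mode support would
have to provide that marking some other way.

```elisp
;; Hypothetical helper, not part of sgml-mode.
(defun my-sgml-find-angle-backward (&optional limit)
  "Move back to the nearest \"<\" or \">\" that is not inside a comment.
Return its position, or nil if there is none after LIMIT."
  (let (found)
    (while (and (not found)
                (re-search-backward "[<>]" limit t))
      ;; Element 4 of the parser state is non-nil inside a comment.
      (unless (nth 4 (save-excursion (syntax-ppss)))
        (setq found (point))))
    found))

;; Demo only: the fence marking below is an assumption standing in for
;; whatever would really flag submode regions.
(defun my-sgml-demo-hits (text)
  "Return the positions `my-sgml-find-angle-backward' finds in TEXT."
  (with-temp-buffer
    (insert text)
    (setq-local parse-sexp-lookup-properties t)
    (goto-char (point-min))
    (while (re-search-forward "<\\(%\\)\\|\\(%\\)>" nil t)
      (let ((beg (or (match-beginning 1) (match-beginning 2))))
        (put-text-property beg (1+ beg)
                           'syntax-table (string-to-syntax "!"))))
    (goto-char (point-max))
    (let (hits)
      (while (my-sgml-find-angle-backward)
        (push (point) hits))
      (nreverse hits))))

;; (my-sgml-demo-hits "<p>\n<% if a < 3 %>\n")
```

With the interior of <% ... %> parsed as a comment, the "<" of "a < 3"
is skipped, while the outer "<" and ">" of the chunk and of <p> are
still found.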
If parse-partial-sexp just starts from (point-min), and then skips over
"comments", it will never visit submode regions this way, no?
That's why we'd need to hook into syntax-ppss to run parse-partial-sexp
on a chunk-by-chunk basis, maybe. Also, parse-sexp-ignore-comments
affects (for|back)ward-sexp as well as up-list, which are important
building blocks for indentation algorithms.
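
A small sketch of that effect, using Emacs Lisp syntax only because it
is convenient for a demo; my-forward-sexp-end is a made-up name. The
text contains a ")" inside a comment, and where forward-sexp ends up
depends on parse-sexp-ignore-comments.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-forward-sexp-end (text ignore-comments)
  "Return where `forward-sexp' lands when started at the beginning of TEXT.
IGNORE-COMMENTS is the value bound to `parse-sexp-ignore-comments'."
  (with-temp-buffer
    ;; Borrow a syntax table in which ";" starts a comment.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (goto-char (point-min))
    (let ((parse-sexp-ignore-comments ignore-comments))
      (forward-sexp 1)
      (point))))

;; (my-forward-sexp-end "(a ; )\n b)" nil)  ; stops after the ")" in the comment
;; (my-forward-sexp-end "(a ; )\n b)" t)    ; stops after the final ")"
```

So if submode regions were treated as comments, the same variable would
decide whether sexp motion steps over them or stumbles into them.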
Another thing to consider: having "visibility" into previous chunks of the
same submode may be more harmful than useful in some cases.
That's OK: the low-level code can't know those things, but the
higher-level code which handles the various chunks can treat different
chunks differently. E.g. using narrow-to-region for chunks which need
to ignore previous text.
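
As a rough sketch of that last idea (my-chunk-ppss and its arguments are
hypothetical, and how chunk boundaries get computed is left out
entirely), the higher-level code could decide per chunk whether the
parse is allowed to see the preceding text:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-chunk-ppss (chunk-beg chunk-end pos &optional isolated)
  "Return the parser state at POS, which must lie in the chunk.
The chunk spans CHUNK-BEG to CHUNK-END.  When ISOLATED is non-nil,
narrow to the chunk first so text before CHUNK-BEG is ignored."
  (save-excursion
    (save-restriction
      (widen)
      (when isolated
        (narrow-to-region chunk-beg chunk-end))
      ;; Under narrowing, (point-min) is CHUNK-BEG, so the parse
      ;; starts fresh at the chunk boundary.
      (parse-partial-sexp (point-min) pos))))

;; Demo: the chunk starts at the second open paren of "(a (b c".
(defun my-chunk-demo-depth (isolated)
  "Return the paren depth at the end of a sample chunk."
  (with-temp-buffer
    (insert "(a (b c")
    (nth 0 (my-chunk-ppss 4 (point-max) (point-max) isolated))))
```

Here (my-chunk-demo-depth nil) gives depth 2, because the earlier open
paren is visible, while (my-chunk-demo-depth t) gives depth 1. This
calls parse-partial-sexp directly rather than syntax-ppss to sidestep
any question of how the syntax-ppss cache interacts with narrowing.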