Re: A vision for multiple major modes: some design notes

From: Eli Zaretskii
Subject: Re: A vision for multiple major modes: some design notes
Date: Sat, 23 Apr 2016 10:39:55 +0300

> Date: Fri, 22 Apr 2016 22:35:08 +0000
> Cc: address@hidden, address@hidden
> From: Alan Mackenzie <address@hidden>
> > > > Why whitespace? why not some new category?  By overloading whitespace,
> > > > you make things harder on the underlying infrastructure, like regexp
> > > > search and matching.
> > > I think it's clear that the "foreign" island's syntax has no interaction
> > > with the current island.
> > This is not a contradiction to what I suggested.  The new category
> > could be treated the same as whitespace, in its effect on
> > syntax-related issues.  By contrast, having whitespace regexp class be
> > indistinguishable from an island probably means complications on a
> > very low level of matching regular expressions and syntax constructs,
> > something that I fear will get in the way.
> > > If we treat it as whitespace, that should minimise the amount of
> > > adapting we need to do to existing major modes.
> > We need to consider the amount of adaptations in the low-level
> > infrastructure code as well, not only on the application level.
> I think the adaptations to the regexp engine would be far less work than
> adapting many thousands of regexps in major modes we want to use as
> sub-modes.  For example there are 115 occurrences in CC Mode of just the
> exact string "[ \t".

Please let's not forget that regexps are used in many places that have
no relation whatsoever to major modes, and searching for whitespace is
a very common operation using regular expressions.  Infecting all
those with this new meaning of whitespace, which is totally alien to
any code that doesn't deal with major modes, is IMO plain wrong.

More generally, I think our first and foremost goal should be a clean
and reasonably simple design, with the amount of change in major mode
code only a secondary goal.  Thinking about the changes in major modes
first could easily lead us astray.

> Bear in mind that this matching of an island by a whitespace regexp
> element would happen ONLY whilst `in-islands' was bound to non-nil, i.e.
> when a major mode is working in its own island chain.

I understand, but I don't think this goes far enough to address my
concerns.  And my suggestion to have a separate class/category will
serve your needs just as well, so I'm unsure why we need to piggyback
on whitespace.

> Are there any circumstances in which we would not want the major
> mode to see the gap between its islands as WS?

Who says that every major mode necessarily treats whitespace as you
assume?  Most (or even all) of those you know about might, but this is
not written anywhere as a limitation of a major mode.  By hard-wiring
this special meaning of [:space:] into your design, you are limiting
future (and possibly some rare extant) major modes.
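
To make the distinction concrete, here is a minimal C sketch of the
separate-class idea.  It is not Emacs code: `toy_buffer', `CLASS_ISLAND',
`classify' and `skip_blank' are hypothetical names, and a per-position
flag stands in for whatever would mark a foreign island.  Island text
gets its own class, and a caller that wants it skipped like whitespace
has to say so explicitly, so ordinary whitespace matching keeps its
current meaning.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical character classes; CLASS_ISLAND marks text that belongs
   to a foreign island.  None of these names exist in Emacs.  */
enum char_class { CLASS_OTHER, CLASS_SPACE, CLASS_ISLAND };

/* Toy buffer: TEXT holds the characters, FOREIGN[i] is true when
   position i lies inside a foreign island.  */
struct toy_buffer
{
  const char *text;
  const bool *foreign;
  size_t len;
};

static enum char_class
classify (const struct toy_buffer *b, size_t pos)
{
  if (b->foreign[pos])
    return CLASS_ISLAND;
  char c = b->text[pos];
  return (c == ' ' || c == '\t' || c == '\n') ? CLASS_SPACE : CLASS_OTHER;
}

/* Skip forward from POS over whitespace.  Island text is skipped too,
   but only when SKIP_ISLANDS is true.  Returns the first position that
   was not skipped.  */
size_t
skip_blank (const struct toy_buffer *b, size_t pos, bool skip_islands)
{
  while (pos < b->len)
    {
      enum char_class cls = classify (b, pos);
      if (cls == CLASS_SPACE || (skip_islands && cls == CLASS_ISLAND))
        pos++;
      else
        break;
    }
  return pos;
}
```

Code that never heard of islands calls `skip_blank' with the flag off and
stops at the island boundary; only island-aware code opts in.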

> When `in-islands' is nil (i.e. when the super mode's code is
> running, or the user is typing commands) the islands would NOT match
> a WS regexp.

Are you sure that none of the background processing will ever need to
treat islands as such?  I'm talking about stuff like timers, process
filters and sentinels, hook functions run by redisplay and the command
loop, etc.  If any of these might need to observe the island rules and
restrictions, the design which builds on in-islands being bound to
non-nil _only_ when the major mode is running its own code is
unreliable, and will cause unrelated code to find itself dealing with
island peculiarities.  E.g., JIT font-lock runs off an idle timer, but
clearly needs to observe islands, so it sounds like the problem I'm
worried about is pretty much in our faces.
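
The concern can be modelled in a few lines of C.  This is only a
sketch under stated assumptions: a plain global `in_islands' stands in
for the proposed dynamically bound Lisp variable, and
`matches_whitespace', `call_with_in_islands' and `fontify_step' are
hypothetical names, not existing Emacs functions.

```c
#include <stdbool.h>

/* Stand-in for the proposed dynamically bound `in-islands'.  */
static bool in_islands = false;

/* Generic helper that is not part of any major mode.  Under the
   proposal, text inside a foreign island matches whitespace only
   while in_islands is true.  */
bool
matches_whitespace (char c, bool inside_foreign_island)
{
  if (inside_foreign_island)
    return in_islands;
  return c == ' ' || c == '\t' || c == '\n';
}

typedef bool (*callback) (void);

/* Emulate (let ((in-islands t)) ...): bind, call FN, restore.  */
bool
call_with_in_islands (callback fn)
{
  bool saved = in_islands;
  in_islands = true;
  bool result = fn ();
  in_islands = saved;
  return result;
}

/* Hypothetical fontification step that examines one character lying
   inside a foreign island.  */
bool
fontify_step (void)
{
  return matches_whitespace ('x', true);
}
```

The same `fontify_step' gives different answers depending on who calls
it: run under `call_with_in_islands' it sees the island as whitespace,
run directly (as a timer would run it) it does not.  Either the timer
machinery has to establish the binding, or the step silently ignores
islands.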

> > By contrast, if we decide that whitespace matches an island, we are
> > opening a giant can of worms.  Here's one worm out of that can: some
> > low-level operations need to search the buffer using regexps
> > disregarding any narrowing -- what you suggest means these operations
> > cannot safely use whitespace in their regexps.  This is something to
> > stay away from, IMO.
> It depends on whether these low level operations are working within an
> island chain (`in-islands' non-nil) or on the buffer as a whole
> (`in-islands' nil).  I think such operations would typically be run with
> `in-islands' nil, hence would not run up against these problems.

"Typically" is not good enough, IMO.  We must convince ourselves that
this happens _always_, and there will _never_ be a reasonably
justifiable need to search the entire buffer for whitespace when
in-islands is non-nil, i.e. in any of the code that is running as a
side-effect of performing some major-mode related operation.

> > > CVAR would get the current chain from the `island' (or `chain') text
> > > property at the position.
> > If it is stored in the text property, then you will have to decide
> > what happens when text is copied and yanked elsewhere.
> It would be the job of the `island-after-change-function' to strip the
> unwanted text properties (both the `island' and `syntax-table' ones) and
> to apply any needed new ones to the yanked region.

The problem is the decision whether they are unwanted or not.  It's
usually not simple to make that decision for text properties that
change the way text is displayed, when surrounding text also affects

> > > Otherwise it would access the appropriate named element in the struct
> > > chain.  I think CVAR would take three parameters: the variable name, the
> > > buffer, and the buffer position.
> > Can you show pseudo-code for CVAR?  I'm afraid I'm missing something
> > here, because I don't see clearly what you have in mind.
> I'll try.  Something like this:
> #define CVAR(var, buf, position) \
>     chain = read_text_property (Qisland, buf, position), \
>     chain ? chain.var \
>           : BVAR (var, buf)
> , but I don't think that would be a valid Lvalue in C.  :-(

Didn't you talk about some alist to look up?  I see no alist lookup
in this pseudo-code.  And 'chain.var' sounds wrong, since 'chain' is
definitely a Lisp object, not a C struct.  Or maybe I don't understand
what hides behind read_text_property.

> > > Other chain local variables would be accessed through an alist in the
> > > struct chain holding miscellaneous variables, exactly as is done for
> > > the other buffer local variables in struct buffer.
> > There's no such alist in how we access buffer-local variables, not
> > AFAIK.  Again, I must be missing something here.
> Or, maybe I am.  I thought that the slot `local_var_alist_' in the struct
> buffer held the bindings of all the non-BVAR local variables, as an
> alist.

Ah, you were talking about local_var_alist_...  OK, but then I don't
see anything like that in CVAR above.
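
For comparison, here is a hedged sketch of what a lookup with an alist
step could look like if written as a function rather than a macro.
Everything in it is a toy: `binding', `toy_chain', `toy_buf',
`toy_assq' and `cvar_lookup' are hypothetical names, plain ints stand
in for Lisp values, and a linked list stands in for a Lisp alist.  It
is not how Emacs represents buffer-local bindings.

```c
#include <stddef.h>
#include <string.h>

/* One (NAME . VALUE) pair in a toy association list.  */
struct binding
{
  const char *name;
  int value;
  struct binding *next;
};

/* Toy stand-ins for an island chain and a buffer, each carrying an
   alist of its local variables.  */
struct toy_chain { struct binding *local_vars; };
struct toy_buf { struct binding *local_vars; };

/* Return the first binding of NAME in ALIST, or NULL.  */
static struct binding *
toy_assq (const char *name, struct binding *alist)
{
  for (; alist; alist = alist->next)
    if (strcmp (alist->name, name) == 0)
      return alist;
  return NULL;
}

/* Look NAME up in CHAIN first (CHAIN is NULL when the position lies
   outside any island), then fall back to BUF.  Returns a pointer to
   the value, so the caller can also assign through it, or NULL if
   NAME has no binding in either place.  */
int *
cvar_lookup (const char *name, struct toy_chain *chain, struct toy_buf *buf)
{
  struct binding *b = chain ? toy_assq (name, chain->local_vars) : NULL;
  if (!b)
    b = toy_assq (name, buf->local_vars);
  return b ? &b->value : NULL;
}
```

Returning a pointer is one way around the lvalue problem with the
macro: `*cvar_lookup (...) = v' is valid C where the comma expression
is not.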

> I'm not at all clear on when and how buffer local variable
> bindings get swapped in and out of, say, C variables like Vfoo.

This happens when we switch buffers, see set_buffer_internal_1.  But
that function is driven by an explicit event of switching buffers,
while in your design you need to do something similar when point
crosses some buffer position, which is a much more subtle event.
E.g., think about all the save-excursion and save-restriction code out
there.

> > > > This actually sounds like a simple extension of narrowing, so I wonder
> > > > why we need so many new object types and notions.
> > > I think it's more like a complicated extension of narrowing.  :-)
> > It's simple because instead of one region you have more than one, and
> > the user-level commands don't affect them.  All the other changes are
> > an exact reproduction of what narrowing does.
> > > I think that chain local variables are essential to multiple major
> > > modes - you can't have m.m.m. without some sort of chain locality.
> > What is "chain locality"?
> Having things (variables) which are local to a chain, as opposed to
> global variables or buffer local variables or frame local variables.

OK, but no one said that applying a restriction and making
island-specific bindings of variables must be parts of the same
feature.  They could be 2 separate features instead.

> >       base_face_id = it->string_from_prefix_prop_p
> >         ? (!NILP (Vface_remapping_alist)
> >            ? lookup_basic_face (it->f, DEFAULT_FACE_ID)
> >            : DEFAULT_FACE_ID)
> >         : underlying_face_id (it);
> > Another example (which I also mentioned) is standard-display-table:
> >   /* Use the standard display table for displaying strings.  */
> >   if (DISP_TABLE_P (Vstandard_display_table))
> >     it->dp = XCHAR_TABLE (Vstandard_display_table);
> > See? no BVAR anywhere in sight.
> OK.  But `face-remapping-alist' can definitely be made buffer local, and
> `standard-display-table' most probably can.

They both are.

> There will be some mechanism (which I don't currently understand) by
> which buffer local values are swapped into and out of
> Vface_remapping_alist when the current buffer changes.

See above: that mechanism is part of the function that switches to
another buffer.
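
As a loose model only (the real code in set_buffer_internal_1 and its
helpers is considerably more involved), the swapping idea can be
sketched like this.  All names are hypothetical: `toy_global_value'
plays the role of a C variable such as Vface_remapping_alist, and
`toy_set_buffer' plays the role of the buffer-switching function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a C-level variable that code reads directly.  */
static int toy_global_value = 0;
/* The default value, used by buffers without a local binding.  */
static int toy_default_value = 0;

struct toy_buffer
{
  bool has_local;   /* true if this buffer has its own binding */
  int local_value;  /* that binding; meaningful only if has_local */
};

static struct toy_buffer *toy_current = NULL;

/* Switch to buffer B (assumed non-NULL): save the C variable where it
   belongs for the old buffer, then load the value for the new one.  */
void
toy_set_buffer (struct toy_buffer *b)
{
  if (toy_current == b)
    return;
  if (toy_current && toy_current->has_local)
    toy_current->local_value = toy_global_value;
  else
    toy_default_value = toy_global_value;
  toy_current = b;
  toy_global_value = b->has_local ? b->local_value : toy_default_value;
}

/* Readers and writers touch only the C variable, with no per-access
   lookup.  */
int toy_read (void) { return toy_global_value; }
void toy_write (int v) { toy_global_value = v; }
```

The cost is paid once per switch, which is acceptable because switching
buffers is an explicit and relatively rare event.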

> Surely a similar mechanism could be created for when the current
> island changes.

The issue is to make it as cheap as possible, because redisplay code
is at liberty to move around the buffer at will, and the location
where it examines buffer text is not directly related to point.
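
To give a feel for what "cheap" might mean here, below is a minimal
sketch of finding the island for an arbitrary buffer position.  It
assumes, purely for illustration, that islands are kept in a sorted
array of non-overlapping half-open ranges; the proposal itself stores
the chain in a text property, so `island', `find_island' and
`find_island_cached' are hypothetical.  A one-entry cache covers the
common case of consecutive lookups landing in the same island.

```c
#include <stddef.h>

/* One island: positions START (inclusive) to END (exclusive) belong
   to the chain identified by CHAIN_ID.  */
struct island
{
  size_t start;
  size_t end;
  int chain_id;
};

/* Binary search in ISLANDS (N entries, sorted by START and
   non-overlapping).  Returns the index of the island containing POS,
   or -1 if POS lies in no island.  */
int
find_island (const struct island *islands, int n, size_t pos)
{
  int lo = 0, hi = n - 1;
  while (lo <= hi)
    {
      int mid = lo + (hi - lo) / 2;
      if (pos < islands[mid].start)
        hi = mid - 1;
      else if (pos >= islands[mid].end)
        lo = mid + 1;
      else
        return mid;
    }
  return -1;
}

/* Remembers the index of the last island found; -1 means empty.  */
struct island_cache { int last; };

/* Like find_island, but try the cached island first.  */
int
find_island_cached (const struct island *islands, int n, size_t pos,
                    struct island_cache *cache)
{
  int i = cache->last;
  if (i >= 0 && i < n
      && pos >= islands[i].start && pos < islands[i].end)
    return i;
  i = find_island (islands, n, pos);
  if (i >= 0)
    cache->last = i;
  return i;
}
```

Even so, this is a lookup per examined position rather than a swap per
buffer switch, and the cache would have to be invalidated whenever
islands change.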

> > Something bothers me there.  What will "M-<" and "M->" do, if
> > point-min and point-max are limited to the current island?  Likewise
> > the search commands -- they cannot be limited to the current island,
> > unless the user explicitly says so (and personally, I don't envision
> > users asking to be so limited).
> Those restrictions will only apply when `in-islands' is bound to non-nil,
> i.e. when major mode code is running.  It will be nil when the user types
> in M-<, hence point will move to the beginning of the (visible region of
> the) buffer.

See above: there might be some situations, like JIT font-lock, where
you will want to have in-islands non-nil while running async code, and
that might make the islands visible to code that is not strictly part
of any major mode, like the infrastructure which invokes these async
parts of Emacs code.  So I think you need to consider the effects of
those on more than just major modes.
