Before I begin, as has been pointed out, let us be clear that the discussion has changed. Originally I was interested in better call stack and traceback information, a run-time thing, which I was proposing as a Summer of Code project. The discussion is now about the locations the compiler reports at compile time.
So be it.
The problems however do have one thing in common: how to represent a location.
Let me also correct one earlier correction, in which the "better" location was construed to be a single line and column number. A better way, I believe, to think of a location is as
- a container, where "container" is defined to be something,
- an offset into that container, in some kind of "units" that are defined to be something, and
- an optional length in those units. When the length isn't given, it is assumed to be one.
For example, if you are interested only in representing a single position such as a line and column number, one value, the offset, would do it.
Note that this abstraction works equally well for other kinds of things like bytecode, where the offset would be the bytecode offset. Many times a contiguous sequence of bytecode maps to a contiguous sequence in the source code. Of course that's not always the case, but describing how to deal with that would take me astray from the proposal that follows. Let me say again, though: if you just care about a single bytecode instruction, set the length to 1 or leave out the length field.
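The container/offset/length idea above could be sketched roughly as follows. This is only an illustration in Python, not anything that exists in Emacs; the class name `Location`, its field names, the `span` helper and the function name "my-function" are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Location:
    """A location: a container, an offset into it, and an optional length.

    Hypothetical sketch; none of these names come from Emacs.
    """
    container: str                # e.g. a file name or a function name
    offset: int                   # position inside the container, in `units`
    units: str = "char"           # what the offset counts: "char", "bytecode", "node", ...
    length: Optional[int] = None  # None means "not given"

    def span(self) -> range:
        """Half-open range of units covered; a missing length counts as 1."""
        n = 1 if self.length is None else self.length
        return range(self.offset, self.offset + n)


# A single bytecode instruction: the length is simply left out.
one_insn = Location("my-function", 12, units="bytecode")
# A contiguous run of three bytecode units starting at the same offset.
run = Location("my-function", 12, units="bytecode", length=3)

print(list(one_insn.span()))  # [12]
print(list(run.span()))       # [12, 13, 14]
```

The same record works whether the units are characters in a file, bytecode offsets in a function, or node numbers in a tree; only the meaning of `container` and `units` changes.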
I know this might not be satisfying to some, but here is an extremely simple but accurate proposal that doesn't incur a lot of overhead and can deal with a lot of generality.
A unit of compilation, I think, is a function. That is the container part. Attach to the function its own location information in some other way (e.g. its container might be a file name if that is appropriate, or it might be defined inside a macro...)
Before the byte compiler compiles it, a function is a kind of lambda, which is a kind of S-expression. A location inside that could simply be a tree node's preorder number. Or the preorder number and a number of successor nodes in the preorder traversal. It is the same as with the simple-minded run-time error location proposal, where, given a bytecode offset, we mark that position in a disassembly. The same thing can be done here: show the position or range of nodes in the S-expression that you've got.
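To make the preorder-number idea concrete, here is a small sketch. It is Python rather than Emacs Lisp, with nested Python lists standing in for S-expressions, so the numbering is of list elements and atoms, not of cons cells; a real implementation might number things differently. The function names `preorder` and `render` and the `>>>`/`<<<` markers are assumptions made up for the example.

```python
def preorder(form):
    """Yield every node of `form` in preorder: the node itself, then its children."""
    yield form
    if isinstance(form, list):
        for child in form:
            yield from preorder(child)


def render(form, start, count=1):
    """Return `form` printed as an S-expression, with the nodes whose preorder
    numbers fall in [start, start + count) wrapped in >>> <<< markers.
    A marked node nested inside another marked node is not marked again."""
    counter = 0

    def walk(node, inside):
        nonlocal counter
        number = counter          # this node's preorder number
        counter += 1
        marked = start <= number < start + count
        if isinstance(node, list):
            parts = [walk(child, inside or marked) for child in node]
            text = "(" + " ".join(parts) + ")"
        else:
            text = str(node)
        return ">>>" + text + "<<<" if marked and not inside else text

    return walk(form, False)


# Stand-in for the S-expression (cons (car form) (cdr form)).
form = ["cons", ["car", "form"], ["cdr", "form"]]

print(list(preorder(form))[5])  # ['cdr', 'form']
print(render(form, 5))          # (cons (car form) >>>(cdr form)<<<)
print(render(form, 2, 3))       # (cons >>>(car form)<<< (cdr form))
```

A location is then just the pair (start, count) relative to the function's S-expression, and nothing needs to be attached to the individual nodes.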
What if the bytecode compiler has done some wild and weird optimization changes? Just show what S-exp you were working on and mark where you were.
I know for some or many it may not be satisfying, but it is the honest truth and I'd rather have that than nothing or the wrong guess.
Having done this first step, the problem is divided a little bit, so carry on: discuss and conquer. A separate tool outside of the compiler proper can be written to take this information and, given pointers to where the source might be located, figure out where in the source code the position might be. Maybe pattern matching would work, dunno, but let me not try to speculate too much.
Finally, in this proposal I am not suggesting changing the current default behavior: the additional precise, geeky information might be shown only in some sort of "super hacker" verbose compilation mode.
On Fri, Mar 20, 2020 at 4:10 PM Alan Mackenzie <address@hidden> wrote:
On Thu, Mar 19, 2020 at 17:41:30 -0400, Stefan Monnier wrote:
> > things like cconv.el here). More to the point, users' macros chew up and
> > spit out cons cells, and we have no control over them. So whilst we
> > could, with a lot of tedious effort, clean up our own software to
> > preserve cons cells (believe me, I've tried), this would fail in users'
> > macros.
> I think fat-cons cells are cheap to implement (with (hopefully) no
> performance impact when not used .....
They may be cheap to implement in themselves, but adapting the entire
byte compiler and all our macros to the heavily restricted semantics
they would impose would be an enormous job. I've tried something
similar, and gave up in exhaustion.
> or weird semantic artifacts like the fat-symbol approach you tried),
Er, not "tried" but "implemented", please. The implementation was
complete, and was capable of bootstrapping Emacs with correct positions
for all the (then plentiful) warning messages.
> and can work 99.9% right in the long term with an incremental way to
> get there.
Where does this 99.9% come from? How is this cons tracking you're
proposing supposed to work, when there are an infinite number of
occurrences of the likes of
(cons (car form) (cdr form))
in our code?
> Furthermore it matches the "usual" way to deal with this problem, so
> there's very little doubt about whether it can work or not.
Are you saying that this is how other Lisp compilers deal with source
code positions? How do they deal with the difficult problem of user
macros? Could you give me an example of a free Lisp system which works
this way? I'd be interested in having a look at it.
I think there's quite a bit of doubt as to whether this could work
effectively in Emacs. The way to dispel this doubt is for Somebody (tm)
to implement it.
> > Since then I've worked a fair bit on creating a "double" Emacs core,
> > one core being for normal use, the other for byte compiling.
> > There's a fair amount of work still to do on this, but I know how to
> > do it. The problem is that I have been discouraged by the prospect
> > of having this solution vetoed too, since it will make Emacs quite a
> > bit bigger.
> I'd probably try to veto it, indeed. It might be a good solution in
> the short-term but it'd just slow down our progress in the long term.
Fixing bugs slows down our progress?
To which the answer is to install the working solution pending the
implementation of something better, after which it can be superseded.
Somehow, even that strategy tends to get vetoed.
Alan Mackenzie (Nuremberg, Germany).