On Wed, 15 Jul 2020 23:55:19 -0400 Stefan Monnier wrote:
Sounds like a lot of information, which in turn implies a potentially
high overhead (e.g. the "exact string" sounds like it might cost O(N²)
in corner cases, yet provides redundant info that can be recovered from
begin+end points). Note also that while `read` returns a sexp made
exclusively of data coming from a particular buffer, the code after
macro-expansion can include chunks coming from other buffers, so if we
want to keep the same representation of "sexp with extra info" in both
cases, we can't just assume "the buffer".
Yes, when I last looked, there was bloat in the way source mappings are done. But let me explain:
As a Google Summer of Code project, this work has always been a bit behind. So the approach I had been taking was that if something is usable for now, go with it and move on to other uncharted territory. In other words, get something out, complete what remains, and only then go back and iterate on the parts that need improving. The C changes were a little bit different because of the (necessarily) long lead time to get things into master and because one can't put something inefficient into the core.
The source-code string is needed in the source map only at the top level. (Oddly, the member name for this is "code".) I had suggested that offsets should be relative to the beginning of the function, and that the function node would hold its position from the beginning of the container (e.g. file) that it is in. However, this isn't a big deal, since conversions are easily done.
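To make the conversion concrete, here is a minimal sketch in Python (an illustration only, not the project's Elisp code). The class and field names other than "code" are hypothetical, and offsets are 0-based here, whereas Emacs buffer positions start at 1.

```python
from dataclasses import dataclass


@dataclass
class FunctionNode:
    # Hypothetical top-level node. "code" mirrors the member name mentioned
    # above; "container_start" is an assumed name for the form's offset
    # within its container (e.g. a file).
    code: str
    container_start: int


@dataclass
class SexpNode:
    # Offsets relative to the beginning of the enclosing function.
    begin: int
    end: int


def to_absolute(fn: FunctionNode, node: SexpNode) -> tuple:
    """Convert function-relative offsets to container-relative ones."""
    return (fn.container_start + node.begin, fn.container_start + node.end)


def text_of(fn: FunctionNode, node: SexpNode) -> str:
    """Recover the exact text from begin/end plus the top-level string."""
    return fn.code[node.begin:node.end]


# Usage: a file with a header comment followed by one top-level form.
code = "(defun add1 (x) (+ x 1))"
file_text = ";; header\n" + code + "\n"
fn = FunctionNode(code=code, container_start=file_text.index("(defun"))
inner_begin = code.index("(+")
inner = SexpNode(begin=inner_begin, end=inner_begin + len("(+ x 1)"))
abs_begin, abs_end = to_absolute(fn, inner)
```

With relative offsets, only `container_start` has to change when text before the function is edited; the inner nodes stay valid.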
As for handling bits of S-expressions that represent the conglomeration of a number of containers/files, that's pretty easily handled inside the structure. I am not totally clear about how the container information is determined. I imagine some of it would be noticed in the parameters when the macro is defined, and some of it each time the macro is expanded. But once it is determined that certain S-expressions go with certain containers, it is trivial to add that to a source-map object.
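One possible way to record this, sketched in Python purely as an illustration: each node carries an optional container, and a node without one inherits its parent's. The `Node` class, its fields, and the file names are all hypothetical; how the container is actually discovered during macro definition or expansion is left open, as above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Node:
    begin: int
    end: int
    # None means "same container as the parent node" (an assumed convention).
    container: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def resolve_containers(
    node: Node, inherited: Optional[str] = None
) -> Iterator[Tuple[Optional[str], int, int]]:
    """Yield (container, begin, end) for a node and all of its descendants."""
    current = node.container if node.container is not None else inherited
    yield (current, node.begin, node.end)
    for child in node.children:
        yield from resolve_containers(child, current)


# Usage: a form read from "user.el" whose expansion contains a chunk that
# came from a macro defined in "macros.el".
expanded = Node(0, 40, container="user.el", children=[
    Node(5, 20),
    Node(100, 130, container="macros.el", children=[Node(110, 120)]),
])
spans = list(resolve_containers(expanded))
```

Storing the container only where it changes keeps the common single-buffer case as small as a plain begin/end pair.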
One cool thing about having the source string stored in the source-map object (whether just at the top level or in more places) is that tracebacks can give the exact source text without searching around. In fact, the source code may never have existed inside a file and this still works.
Another great thing about this is that it can tolerate mismatches between the Elisp that was compiled and the Elisp source that is available. If there were changes outside the top-level object but not inside it, then it is pretty easy to detect and correct for this. Even if the discrepancy is inside the object, the differences are also easily detected. Adjusting is a little more difficult, but still doable.
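A rough sketch of that detection step, in Python for illustration only: it compares the stored top-level string against the text currently available and classifies the result. The function name `locate` and the status strings are made up for this example, and it takes the first match rather than the one nearest the old position.

```python
from typing import Optional, Tuple


def locate(stored_code: str, stored_start: int,
           current_text: str) -> Tuple[str, Optional[int]]:
    """Classify how the available source relates to the stored string.

    Returns ("unchanged", start) if the text is where it was recorded,
    ("moved", new_start) if only text outside the object changed, and
    ("modified", None) if the object itself no longer appears verbatim.
    """
    end = stored_start + len(stored_code)
    if current_text[stored_start:end] == stored_code:
        return ("unchanged", stored_start)
    # Simplification: first occurrence anywhere in the container.
    found = current_text.find(stored_code)
    if found != -1:
        return ("moved", found)
    return ("modified", None)


# Usage: the same function against three versions of its file.
code = "(defun add1 (x) (+ x 1))"
header = ";; header\n"
added = ";; a new comment\n"
original = header + code + "\n"
edited_outside = header + added + code + "\n"
edited_inside = header + "(defun add1 (x) (+ x 2))\n"

same = locate(code, len(header), original)
moved = locate(code, len(header), edited_outside)
changed = locate(code, len(header), edited_inside)
```

In the "moved" case the correction is just the difference between the new and old start. The "modified" case would need an actual diff of the two strings to adjust offsets, which is the harder part mentioned above.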