Re: Regexp bytecode disassembler

From: Eli Zaretskii
Subject: Re: Regexp bytecode disassembler
Date: Sat, 21 Mar 2020 21:19:16 +0200

> From: Mattias Engdegård <address@hidden>
> Date: Sat, 21 Mar 2020 17:52:51 +0100
> Cc: address@hidden
> > First, please document this in NEWS and in the ELisp manual.  IMNSHO,
> > this feature will be much less useful without documentation.
> Sorry, I should have been clear on the point that this is primarily a debug 
> and maintenance aid for the regexp-engine developer and not intended as a 
> user-facing feature. Nobody is barred from using it, but they are expected to 
> read the circuit schematics that come with Emacs (i.e., the source code).
> In particular, there is no user interface to the regexp bytecode at all; 
> users can't write programs in it and have Emacs run them. It is also not 
> stable in the slightest. Documenting the inner workings of the regexp engine 
> would only put a burden on its maintainers.

I didn't mean the user manual, I meant the ELisp manual.  I don't
agree that this command should remain undocumented, and I don't
understand your opposition to making this more visible and more easily
used.  Having users read the C code is quite an obstacle to some.

> >> +;;;###autoload
> >> +(defun regexp-disasm (regexp)
> > 
> > Why do we need to auto-load this?
> Actually, a function that returns the bytecode in symbolic form turned out to 
> be useful in its own right, and I found it handy for some programmatic uses 
> like comparing the bytecodes of two regexps.

I don't think this answers the question.  Not every useful function is
auto-loaded, is it?  Why is it a problem to have to require this

> >> +         (read-u16 (lambda (ofs) (+ (aref bc ofs)
> >> +                                    (ash (aref bc (1+ ofs)) 8))))
> > 
> > Why lambda-forms and not functions (or defsubst)?
> Because they need to close over variables in scope.

So you are "saving" one more argument?

> With lexical binding, elisp almost feels like a real programming language!

Maybe so, but this style makes the code harder to read and modify,

> >> +               (pcase opcode
> >> +                 (0 (cons 'no-op 1))
> >> +                 (1 (cons 'succeed 1))
> > 
> > Is pcase really needed here?  It looks like a simple cond will do.
> Well, pcase is a lot more readable here, don't you think?

No, I don't, not in this case.  You are just selecting from a list of
fixed values.

> >> +  (interactive "XRegexp (evaluated): ")
> > 
> > This prompt should do a better job describing what kind of input is
> > expected here.
> I'm not sure what else to say in the prompt. I found it more useful to input 
> the regexp as a lisp expression than a string (for cut-and-paste from source 
> code, or for rx) but maybe that's just me.

I envision many people will think a string is expected, thus my

> +   Any changes here should be reflected in regexp-disasm.el as well.  */

I think the same comment should be near the definition of re_opcode_t.

