Re: [O] [parser] subscripts and underlines interacting badly

From: Nicolas Goaziou
Subject: Re: [O] [parser] subscripts and underlines interacting badly
Date: Wed, 18 Dec 2013 16:01:35 +0100


Aaron Ecay <address@hidden> writes:

> The attached patch implements this.  It also updates the fontification
> to match (by calling out to the parser, so there are potential
> performance issues although with the cache it will hopefully not be an
> issue in practice), and notes the new heuristic in the manual.  The test
> suite passes.

Thank you. Here are some comments and the usual nitpicks.

> From e2044312b95f8b427ddc662cd1abf10bf4d87b2d Mon Sep 17 00:00:00 2001
> From: Aaron Ecay <address@hidden>
> Date: Sun, 15 Dec 2013 21:30:27 -0500
> Subject: [PATCH] org-element: use brackets to disambiguate subscript/underline

You need a capital after colon.

> * lisp/org-element.el (org-element-sub/superscript-successor): use
> brackets to disambiguate subscript/underline

Ditto, and a period at the end of the sentence.

> * lisp/org.el (org-do-emphasis-faces): incorporate the above
> disambiguation

I'd rather not use `org-element-context' in fontification ATM. My plan
is, indeed, to use the parser for fontification, but in a planned-out
way. Doing it too early may be counter-productive.

For now, we can accept some discrepancies between fontification and
syntax (there are many other such occurrences anyway).

> * doc/org.texi: reflect these changes in the manual

See above.

> +When it follows an alphanumeric character, the underscore is always
> +interpreted as a subscript (@pxref{Subscripts and superscripts}), and when it
> +follows whitespace it is always the start of an underline (assuming a
> +matching underscore is found in a proper position further along).  However,
> +after a punctuation character (for example the apostrophe), the underscore
> +character can be ambiguous between these two interpretations.  Org uses a
> +simple heuristic for these cases: if the character following the underscore
> +is an opening brace @address@hidden or if no matching underscore is seen in the
> +following text, the underscore is considered to be the start of a subscript.
> +Otherwise, it is the start of underlining.

There is no harm in documenting it, but remember that it's not a feature
of the syntax.  Maybe it could be shortened and put into a footnote.
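
To make the quoted heuristic concrete, here is a rough standalone sketch. It
is not the patch's code and not Org's parser: `my-classify-underscore' is a
hypothetical name, it works on a plain string, it treats the beginning of the
string like whitespace, and it takes "a matching underscore" to mean any later
underscore in the string. All of these are simplifying assumptions.

```elisp
;; Hypothetical illustration only; not part of Org.
(defun my-classify-underscore (string pos)
  "Classify the underscore at index POS in STRING.
Return `subscript', `underline', or nil when neither applies."
  (let ((before (and (> pos 0) (aref string (1- pos))))
        (after (and (< (1+ pos) (length string))
                    (aref string (1+ pos))))
        ;; Assumption: any later underscore counts as the closing one.
        (closing (string-match-p "_" string (1+ pos))))
    (cond
     ;; After whitespace (or at the beginning of the string): underline,
     ;; provided a closing underscore exists.
     ((or (null before) (memq before '(?\s ?\t ?\n)))
      (and closing 'underline))
     ;; After an alphanumeric character: always a subscript.
     ((string-match-p "[[:alnum:]]" (char-to-string before))
      'subscript)
     ;; After punctuation: an opening brace, or no closing underscore,
     ;; means subscript.
     ((or (eq after ?{) (not closing))
      'subscript)
     ;; Otherwise, it starts an underline.
     (t 'underline))))
```

With these assumptions, (my-classify-underscore "a_b" 1) gives `subscript',
while (my-classify-underscore "'_word_" 1) gives `underline'.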

> +    (let (res)
> +      (while (and (not res)
> +               (re-search-forward org-match-substring-regexp nil t))
> +     (goto-char (match-beginning 0))
> +     (when (or
> +            ;; this subscript uses brackets -> handle as subscript
> +            ;; unconditionally

Comments need to start with a capital and end with a period.

> +            (eq (aref (match-string 3) 0) ?{)
> +            ;; it is not ambiguous with an underline -> handle as
> +            ;; subscript
> +            (not (looking-at-p org-emph-re)))

It should be `org-looking-at-p' for compatibility with other Emacsen.
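
For illustration, a compatibility wrapper of this kind could look roughly like
the sketch below. `my-looking-at-p' is a hypothetical name, and this is not
necessarily how `org-looking-at-p' itself is defined; it only shows the idea
of falling back to `looking-at' with the match data preserved when
`looking-at-p' is not available.

```elisp
;; Hypothetical wrapper; shown only to illustrate the compatibility idea.
(defun my-looking-at-p (regexp)
  "Return non-nil if text after point matches REGEXP.
The match data is left untouched."
  (if (fboundp 'looking-at-p)
      (looking-at-p regexp)
    ;; Fallback where `looking-at-p' is not defined.
    (save-match-data (looking-at regexp))))
```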

> +       (setq res (cons (if (string= (match-string 2) "_")
> +                           'subscript
> +                         'superscript)
> +                       (match-beginning 2))))
> +     ;; otherwise -> keep going, and let the underline
> +     ;; parser have it
> +     (goto-char (match-end 0)))

I think

  (save-excursion (goto-char (match-beginning 0)) ...) 

is better than 

  (goto-char (match-beginning 0)) ... (goto-char (match-end 0)).
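
A small sketch of what I mean, outside of the patch. The function name
`my-peek-at-match-start' is made up for the example: the check is done at the
beginning of the match, and `save-excursion' puts point back at the end of
the match without any explicit second `goto-char'.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-peek-at-match-start (regexp)
  "Search forward for REGEXP and return (CHAR . POINT), or nil.
CHAR is the character at the beginning of the match.  POINT is
the value of point after the check, i.e., the end of the match."
  (when (re-search-forward regexp nil t)
    (let ((char (save-excursion
                  ;; Move to the match start only for the inspection.
                  (goto-char (match-beginning 0))
                  (char-after))))
      ;; Point was restored by `save-excursion'.
      (cons char (point)))))
```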

> +      res)))

I suggest using (catch 'found ... (throw 'found (cons ...))) instead
of the RES variable: the fewer `setq', the better.
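
Roughly, the loop would take the shape sketched below. This is a generic
sketch, not the actual successor function: `my-first-unambiguous-match' and
its two arguments are hypothetical, and it returns only a position rather
than the (TYPE . POSITION) cons cell the patch builds.

```elisp
;; Hypothetical function illustrating the `catch'/`throw' loop shape.
(defun my-first-unambiguous-match (regexp ambiguous-re)
  "Return the start of the first match of REGEXP after point, or nil.
Matches whose beginning also matches AMBIGUOUS-RE are skipped."
  (catch 'found
    (while (re-search-forward regexp nil t)
      (when (save-excursion
              (goto-char (match-beginning 0))
              ;; Keep match data from the search above intact.
              (save-match-data (not (looking-at ambiguous-re))))
        (throw 'found (match-beginning 0))))
    ;; No suitable match was found.
    nil))
```

No accumulator variable is needed: the loop exits as soon as a suitable
match is found, and the function returns nil otherwise.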


Nicolas Goaziou
