Re: Lack of tooling slowing down contributions

From: Juri Linkov
Subject: Re: Lack of tooling slowing down contributions
Date: Wed, 19 Jun 2019 00:29:51 +0300
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (x86_64-pc-linux-gnu)

>> E.g. for example, should not become "E.g.  for example".
>> P. G. Wodehouse should not end up as "P.  G.  Wodehouse".
>> Prof. Moriarty should not end up as "Prof.  Moriarty"
> I believe it's possible to find some simple heuristics (maybe even
> just regexp-based) that would cover more than 90% of cases.

A simple heuristic relying on the sentence length would cover most cases.
Here is experimental code that works surprisingly well: it adds a double space
after any sentence end preceded by at least 5 characters that are neither
periods nor commas:

  (defun canonically-double-space-region (beg end)
    "Double-space sentence ends between BEG and END."
    (interactive "*r")
    (canonically-space-region beg end)
    ;; Use a marker so END stays valid as spaces are inserted.
    (unless (markerp end) (setq end (copy-marker end t)))
    (let* ((sentence-end-double-space nil) ; to get right regexp below
           (end-spc-re (concat "\\([^.,]\\{5,\\}\\)\\(?:" (sentence-end) "\\)")))
      (save-excursion
        (goto-char beg)
        (while (and (< (point) end)
                    (re-search-forward end-spc-re end t))
          ;; Skip matches already followed by two spaces or a newline.
          (unless (or (>= (point) end)
                      (looking-back "[[:space:]]\\{2\\}\\|\n"))
            (insert " "))))))

  (advice-add 'fill-paragraph :before
              (lambda (&rest _args)
                (when (use-region-p)
                  (canonically-double-space-region
                   (region-beginning) (region-end))))
              '((name . fill-paragraph-double-space)))

Its one drawback is that it doesn't handle such false negatives as short
sentences.  Alas. But such cases are extremely rare.  False positives
such as long abbreviations are rare too.
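
To make these cases concrete, here is a minimal sketch that tests the same
regexp against plain strings.  The helper `my-double-space-candidate-p' is
hypothetical and not part of the code above; it also assumes that the variable
`sentence-end' still has its default value of nil.

```elisp
;; Hypothetical helper for illustration only; not part of the code above.
;; Assumes the variable `sentence-end' is nil (its default), so that the
;; function `sentence-end' builds its regexp from
;; `sentence-end-double-space'.
(defun my-double-space-candidate-p (string)
  "Return t if the heuristic regexp matches somewhere in STRING."
  (let* ((sentence-end-double-space nil)
         (end-spc-re (concat "\\([^.,]\\{5,\\}\\)\\(?:" (sentence-end) "\\)")))
    (and (string-match-p end-spc-re string) t)))

(my-double-space-candidate-p "This is long enough. Next one") ; matched
(my-double-space-candidate-p "E.g. for example")              ; not matched
(my-double-space-candidate-p "Prof. Moriarty")                ; not matched
(my-double-space-candidate-p "Alas. But")                     ; false negative
(my-double-space-candidate-p "Messrs. Smith and Jones")       ; false positive
```

Note that `[^.,]' also admits spaces, so the count of 5 effectively runs from
the previous period or comma, not from the start of the last word.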

BTW, I tried to replace the regexp above with ‘rx’ that is easier to read:

  (rx (group (>= 5 (not (any " ."))))
      (group (regexp (sentence-end))))

but it fails.  Not sure if this will be fixed in bug#36237.
