From: David Maus
Subject: [Orgmode] Improve percent escaping links in Org mode (pull request / OK to push)
Date: Sun, 02 Jan 2011 20:37:24 +0100
User-agent: Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (Gojō) APEL/10.8 Emacs/23.2 (i486-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)

This is a pull request or push announcement for the first set of
patches to improve Org mode's percent escaping functions.  This set of
changes solves the problems with percent escaping non-ASCII characters.

address@hidden:dmj/dmj-org-mode.git feature/org-percent-escaping

I do have commit access but because this set of changes might break
things seriously I'd like to get an "OK to push" or someone who pulls
and reviews the changeset.

The problem:

The current implementation of percent escaping URIs uses a whitelist
approach, i.e. it only percent escapes characters that are in
`org-link-escape-chars' or in a user supplied list.  This is a problem
because using this function requires knowledge about all possible
characters that could occur in a URI -- and URIs are limited to plain
ASCII, meaning a call to the function must list literally all possible
characters and their escapings to get a properly percent escaped URI.

The changes:

- `org-link-escape' percent escapes every character that matches one
  of the following conditions:

  * equal 37 (percent sign)
  * equal 127 (DEL, control character)
  * below 32 (control character)
  * above 127 (non-ASCII character)
  * a character in the escaping table (e.g. `org-link-escape-chars')

  The character in question is first encoded in UTF-8, then all bytes
  of the resulting character are percent escaped.  If converting to
  UTF-8 fails, Org throws an error indicating this problem.

  The function got an optional third argument which can be set to merge
  the user-defined table with the default escaping table.

- `org-link-unescape' unescapes every percent-escape sequence.  It is
  no longer possible to supply a list of characters that should be
  unescaped.  No function in core used `org-link-unescape' with an
  unescaping table.

  Internally the `org-protocol-unhex-*' functions were renamed to
  `org-link-unescape-*', moved to org.el and refactored (thanks to
  Vincent Belaïche for suggesting some of the changes).  The old names
  are declared obsolete and aliased as of 2010-11-21.

  The unescaping function is backward compatible and unescapes the old
  percent escape format for non-ASCII characters (thanks to Sebastian

  It is possible that the new implementation will break links in at
  least this (known) case: If the user stored a link to a file or
  directory containing a percent sign.  Currently Org mode does not
  percent escape the percent sign and subsequently the new variant of
  `org-link-unescape' will try to unescape the alleged percent escape
  sequence [1].

- `org-link-escape-chars' format changed.  It is now just a list of
  characters to escape; the percent escape sequence is implied by the

  Functions in core that used a custom escaping table are changed
  accordingly to use the new table format.
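
To make the rules above easier to review, here is a minimal sketch of
the escaping and unescaping behaviour.  It is written in Python purely
for illustration and is not the Emacs Lisp code in the branch.  The
names `link_escape', `link_unescape' and the contents of `ESCAPE_CHARS'
are assumptions made up for this example, and the backward compatible
handling of the old escape format is not modelled.

```python
import re

# Assumed example table; the real `org-link-escape-chars' may differ.
ESCAPE_CHARS = [" ", "[", "]"]


def link_escape(text, table=None, merge=False):
    """Percent escape TEXT according to the conditions listed above.

    TABLE is an optional list of characters to escape.  If MERGE is
    true, TABLE is merged with the default table instead of replacing it.
    """
    if table is None:
        chars = set(ESCAPE_CHARS)
    elif merge:
        chars = set(ESCAPE_CHARS) | set(table)
    else:
        chars = set(table)

    out = []
    for ch in text:
        code = ord(ch)
        if code == 37 or code == 127 or code < 32 or code > 127 or ch in chars:
            # Encode to UTF-8 first, then escape every resulting byte.
            # encode() raises UnicodeEncodeError if conversion fails.
            out.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


def link_unescape(text):
    """Unescape every percent-escape sequence in TEXT.

    A run of consecutive %XX sequences is decoded as one UTF-8 byte
    string, so multi-byte characters are restored correctly.
    """
    def decode(match):
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        return raw.decode("utf-8")

    return re.sub(r"(?:%[0-9A-Fa-f]{2})+", decode, text)
```

With these definitions `link_escape("100%")' gives "100%25" and a
string such as "Belaïche [x] 50%" survives a round trip through
`link_escape' and `link_unescape' unchanged, because the percent sign
itself is always escaped.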

What is next:

  - check if we can fall back to use `url-hexify-string' and
    `url-unhex-string' instead of our own functions
  - check if the recent problems with percent escaping are solved

  -- David

[1] Not escaping the percent sign is actually a glitch: Try to store
and open a link to a file literally called "foo%20baz.org".
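
The glitch can be illustrated with any generic percent decoder.  The
sketch below uses Python's standard `urllib.parse.unquote' as a
stand-in for an unescape function that decodes every percent-escape
sequence; it is not Org's code, and the variable names are made up for
the example.

```python
from urllib.parse import unquote

# File name that literally contains "%20", stored without escaping "%".
stored = "foo%20baz.org"

# A decoder that unescapes every sequence turns it into another name.
opened = unquote(stored)  # "foo baz.org", not the file we stored

# Escaping the percent sign first makes the round trip safe.
escaped = stored.replace("%", "%25")  # "foo%2520baz.org"
restored = unquote(escaped)
```

Here `opened' no longer names the original file, while `restored' is
equal to `stored'.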
