CVSROOT: /cvsroot/storm
Module name: storm
Changes by: Benja Fallenstein <address@hidden> 03/04/06 16:46:22
Modified files:
doc/pegboard/simple_storm--benja: peg.rst
Log message:
update: specify canonicalization
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/storm/storm/doc/pegboard/simple_storm--benja/peg.rst.diff?tr1=1.3&tr2=1.4&r1=text&r2=text
Patches:
Index: storm/doc/pegboard/simple_storm--benja/peg.rst
diff -u storm/doc/pegboard/simple_storm--benja/peg.rst:1.3 storm/doc/pegboard/simple_storm--benja/peg.rst:1.4
--- storm/doc/pegboard/simple_storm--benja/peg.rst:1.3 Thu Apr 3 03:20:18 2003
+++ storm/doc/pegboard/simple_storm--benja/peg.rst Sun Apr 6 16:46:22 2003
@@ -4,8 +4,8 @@
:Author: Benja Fallenstein
:Date: 2003-02-16
-:Revision: $Revision: 1.3 $
-:Last-Modified: $Date: 2003/04/03 08:20:18 $
+:Revision: $Revision: 1.4 $
+:Last-Modified: $Date: 2003/04/06 20:46:22 $
:Type: Architecture
:Scope: Major
:Status: Current
@@ -106,13 +106,28 @@
(How long are the IDs going to be and whether
this will be a problem.)
- RESOLVED: Here's an example URI, 102 characters long:
+ RESOLVED: Here's an example URI, 106 characters long:
- urn:urn-?:application/rdf+xml,QLFYWY2RI5WZCTEP6MJKR
- 5CAFGP7FQ5X.VEKXTRSJPTZJLY2IKG5FQ2TCXK26SECFPP4DX7I
+ urn:urn-?:1.0:application/rdf+xml,QLFYWY2RI5WZCTEP6MJ
+ KR5CAFGP7FQ5X.VEKXTRSJPTZJLY2IKG5FQ2TCXK26SECFPP4DX7I
This is long, but IMO not 'too long.'
+- Why the ``1.0``?
+
+ RESOLVED: To have some kind of versioning information,
+ e.g. if we have to change the hash functions because
+ something is broken.
+
+- Are the rules for escaping too complex? What's with all this
+ escape this, don't escape that, quote this, don't quote that?
+
+ RESOLVED: The important things to notice are that
+ the common cases are simple (just a type, type plus charset),
+ and that canonicalization is *really* easy. The other
+ rules aren't that difficult, either, and they only
+ apply in uncommon cases. It should be ok.
+
- Why this syntax? Why not another?
RESOLVED: For similarity to ``data`` URLs.
@@ -122,27 +137,52 @@
=======
Storm blocks do not have headers any more; the hash in their URN
-is only of the body. Storm URNs have the following form:
-
- <namespace>:block:[<mediatype>],<data>
-
-``<namespace>`` is an informal URN namespace to be registered,
-like ``urn:urn-5``. ``<bitprint>`` is a Bitzi bitprint as defined
-by <http://bitzi.com/developer/bitprint>. ``<mediatype>`` is
-the token defined in [RFC2397]--
+is only of the body. Block URNs have the following form::
+ blockurn := namespace "1.0:" [ mediatype ] "," bitprint
mediatype := [ type "/" subtype ] *( ";" parameter )
parameter := attribute "=" value
-"where [...] 'type', 'subtype', 'attribute' and 'value' are
-the corresponding tokens from [RFC2045], represented using
-URL escaped encoding of [RFC2396] as necessary" [RFC2397].
-(Escaping is necessary when a character isn't in the set
-of allowed URN characters.)
+``namespace`` is an informal URN namespace to be registered,
+like ``urn:urn-5``. Before it is registered, ``urn:storm:``
+is used. ``bitprint`` is a Bitzi bitprint as defined
+by <http://bitzi.com/developer/bitprint>; this means it's
+32 characters, a dot, plus 39 more characters.
+
+The ``type``, ``subtype``, ``attribute`` and ``value``
+tokens are specified by [RFC2045]. All characters not
+in ``<URN chars>`` as defined by [RFC2141] MUST be
+percent escaped [RFC1630], with one special exception:
+The slash separating type from subtype MUST NOT be escaped.
+This is for easier readability, and is consistent with
+the use in ``data`` URLs [RFC2397] (it's also the thing
+most likely to be struck down in the namespace
+application process... but we can see whether it
+gets through or not).
+
+Block URNs are completely case-insensitive; they are
+canonicalized by lower-casing them, character by character.
+Two block URNs are thus considered equal when compared
+ignoring case.
+
+To make this work, in case-sensitive ``values``, upper-case
+characters MUST be percent escaped, since they are not allowed
+in the canonical form. This is admittedly ugly, but
+case-sensitive ``values`` are rare. For parameters whose ``value``
+is always a ``token`` as defined by [RFC2045] (for example
+``charset``), ``value`` SHOULD NOT be enclosed in quotation marks
+(prior to percent escaping). For parameters whose value may
+contain characters not allowed in ``token``, ``value`` SHOULD
+be enclosed in quotation marks. Quoting [RFC2045], ::
+
+ token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
+ or tspecials>
"X-" types aren't allowed, as they work against the persistence
of Storm blocks; ``application/octet-stream`` or similar
-must be used instead.
+must be used instead. There is an internet-draft
+[draft-eastlake-cturi-04] on the use of URIs as MIME types;
+if this becomes standard, it should be used for extension.
Unlike in [RFC2397], if no ``<mediatype>`` is given,
``application/octet-stream`` is assumed (not ``text/plain``).
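
The escaping and canonicalization rules in the patch above can be illustrated with a minimal Python sketch. It is not part of the commit and not the Storm implementation. The function names (``escape``, ``block_urn``, ``same_block``) and the per-parameter ``case_sensitive`` flag are hypothetical. The sketch assumes the provisional ``urn:storm:`` prefix, takes the ``<other>`` character set from [RFC2141], and emits escapes with lower-case hex digits so that the result is already in canonical form. It does not decide when a ``value`` should be quoted.

```python
import re

NAMESPACE = "urn:storm:"       # provisional prefix named in the text
VERSION = "1.0:"
URN_OTHER = "()+,-.:=@;$_!*'"  # <other> characters from RFC 2141
# 32 base32 characters, a dot, 39 more (checked after lower-casing).
BITPRINT_RE = re.compile(r"[a-z2-7]{32}\.[a-z2-7]{39}")


def escape(text, case_sensitive=False):
    """Percent-escape everything that may not appear in the canonical form."""
    if not case_sensitive:
        text = text.lower()
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if "a" <= ch <= "z" or "0" <= ch <= "9" or ch in URN_OTHER:
            out.append(ch)
        else:
            # Upper-case letters in case-sensitive values end up here too.
            out.append("%{:02x}".format(byte))
    return "".join(out)


def block_urn(bitprint, type_=None, subtype=None, params=()):
    """Build a canonical block URN; params are (attribute, value, case_sensitive)."""
    bitprint = bitprint.lower()
    if not BITPRINT_RE.fullmatch(bitprint):
        raise ValueError("not a bitprint: %r" % bitprint)
    mediatype = ""
    if type_ is not None:
        # The slash between type and subtype is never escaped.
        mediatype = escape(type_) + "/" + escape(subtype)
    for attribute, value, case_sensitive in params:
        mediatype += ";" + escape(attribute) + "=" + escape(value, case_sensitive)
    return NAMESPACE + VERSION + mediatype + "," + bitprint


def same_block(urn_a, urn_b):
    """Two block URNs are equal when compared ignoring case."""
    return urn_a.lower() == urn_b.lower()
```

With the bitprint from the example URI, ``block_urn(bitprint, "application", "rdf+xml")`` gives a lower-cased URN of the same shape as the 106-character example, and ``same_block`` treats the upper-case and lower-case spellings as the same block.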