From: Benja Fallenstein
Subject: Re: [Gzz] ``canon3_file_format``: A canonical, N3-based file format
Date: Wed, 02 Apr 2003 19:38:46 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030327 Debian/1.3-4

Tuomas Lukka wrote:
> On Wed, Apr 02, 2003 at 04:11:47PM +0200, Benja Fallenstein wrote:
>> If we want to use it with CVS, requiring LF would mean that the files would have to be added as binary; AFAIK CVS wouldn't do diffing then.
> Ahh, right. Hmm - maybe should have different subformats: strict and non-strict: strict which is absolutely specified, and non-strict which is used for CVS?
Ok.
>>> Capitalize "must" ;)
>> Which definition do you want to use?
> RFC one?
Ok.
> Anyway, might make sense to unconvolve the sentence: First Literals, then URIrefs, and finally anonymous nodes.
Ok.
>>>> - URIrefs are compared character-by-character, in the form as defined in [RFC 2396] (i.e., *after* Unicode characters outside the ASCII range have been escaped). Characters are compared by Unicode code point value.
>>> Is this the same as a lexicographic string comparison of the UTF-8 encoded one?
>> I don't know.
> Need to explain how to compare. I couldn't write a program yet.
In Java: string1.compareTo(string2), on the in-memory representation as used by Jena.
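
On the UTF-8 question, here is a small sketch (not part of the PEG) that puts the three orderings side by side. Comparing UTF-8 encodings byte by byte, with bytes taken as unsigned, gives the same order as comparing Unicode code points. Java's compareTo compares UTF-16 code units instead, which can differ only when a character outside the BMP meets a BMP character at or above U+E000. Since URIrefs are pure ASCII after escaping, all three should agree for them. The class and method names below are made up for illustration, and codePointAt needs Java 5 or later.

```java
import java.nio.charset.StandardCharsets;

class UriRefOrder {
    // Lexicographic comparison of the UTF-8 encodings, bytes taken as unsigned.
    static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (d != 0) return d;
        }
        return x.length - y.length;
    }

    // Comparison by Unicode code point value, as the PEG text asks for.
    static int compareCodePoints(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i), cb = b.codePointAt(j);
            if (ca != cb) return Integer.compare(ca, cb);
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // Whichever string still has characters left sorts later.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    public static void main(String[] args) {
        String p = "http://example.org/a", q = "http://example.org/b";
        System.out.println(Integer.signum(p.compareTo(q)));
        System.out.println(Integer.signum(compareUtf8(p, q)));
        System.out.println(Integer.signum(compareCodePoints(p, q)));
    }
}
```

For literals, which may contain non-ASCII characters, the choice between compareTo and code point order would still need to be pinned down.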
The full writer algorithm looks something like this:

- Get all Statements from the Model.
- Put them into a SortedSet. Normalization into NFC is done at this stage (note to self: find out how this works in Java.) The comparator uses the algorithm specified in the PEG, using Java compareTo to compare any strings.
- Create a UTF-8 writer and write the header to it.
- Write each statement in order. Escaping of literals is done at this stage.

- Benja
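
The writer steps listed above could be sketched roughly as follows. This is only an illustration under several assumptions: it uses plain stand-in classes (Node, Statement) rather than Jena's, the header string and the escaping rules are placeholders, statements are assumed to sort by subject, then predicate, then object, and anonymous nodes are simply compared by label, which is probably not what the PEG ends up specifying. NFC normalization uses java.text.Normalizer, available since Java 6.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

class CanonicalWriterSketch {
    // Declaration order doubles as sort order: literals, URIrefs, anonymous nodes.
    enum Kind { LITERAL, URIREF, ANONYMOUS }

    // Hypothetical stand-ins for the RDF API's node and statement types.
    static final class Node {
        final Kind kind; final String text;
        Node(Kind kind, String text) { this.kind = kind; this.text = text; }
    }
    static final class Statement {
        final Node subject, predicate, object;
        Statement(Node s, Node p, Node o) { subject = s; predicate = p; object = o; }
    }
    static Node uri(String u) { return new Node(Kind.URIREF, u); }
    static Node lit(String s) { return new Node(Kind.LITERAL, s); }

    // Strings are compared with String.compareTo, as suggested in the mail.
    static final Comparator<Node> NODE_ORDER =
        Comparator.comparing((Node n) -> n.kind).thenComparing(n -> n.text);
    // Assumption: subject, then predicate, then object.
    static final Comparator<Statement> STATEMENT_ORDER =
        Comparator.comparing((Statement s) -> s.subject, NODE_ORDER)
                  .thenComparing(s -> s.predicate, NODE_ORDER)
                  .thenComparing(s -> s.object, NODE_ORDER);

    static final String HEADER = "# placeholder header\n"; // not the real header

    static Node nfc(Node n) {
        return n.kind == Kind.LITERAL
            ? new Node(n.kind, Normalizer.normalize(n.text, Normalizer.Form.NFC)) : n;
    }

    // Placeholder escaping: backslash, double quote, newline.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    static String render(Node n) {
        switch (n.kind) {
            case LITERAL: return "\"" + escape(n.text) + "\"";
            case URIREF:  return "<" + n.text + ">";
            default:      return "_:" + n.text;
        }
    }

    static void write(Collection<Statement> statements, OutputStream out) throws IOException {
        // Normalize to NFC while filling the sorted set.
        SortedSet<Statement> sorted = new TreeSet<>(STATEMENT_ORDER);
        for (Statement s : statements)
            sorted.add(new Statement(nfc(s.subject), nfc(s.predicate), nfc(s.object)));
        // UTF-8 writer, header first, then each statement in order.
        Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        w.write(HEADER);
        for (Statement s : sorted)
            w.write(render(s.subject) + " " + render(s.predicate) + " " + render(s.object) + " .\n");
        w.flush();
    }

    static String toCanonicalString(Collection<Statement> statements) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try { write(statements, buf); } catch (IOException e) { throw new UncheckedIOException(e); }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Node s = uri("http://example.org/s"), p = uri("http://example.org/p");
        System.out.print(toCanonicalString(Arrays.asList(
            new Statement(s, p, uri("http://example.org/o")),
            new Statement(s, p, lit("hello")))));
    }
}
```

The point of the sorted set is that the output no longer depends on the order in which the Model hands out its statements, so two writes of the same graph should give byte-identical files.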