
[Monotone-devel] Text under revision control


From: hendrik
Subject: [Monotone-devel] Text under revision control
Date: Wed, 25 Feb 2009 19:01:34 -0500
User-agent: Mutt/1.5.13 (2006-08-11)

On Thu, Feb 26, 2009 at 11:16:35AM +1100, Daniel Carosone wrote:
> On Thu, Feb 26, 2009 at 12:09:45AM +0100, Philipp Gröschler wrote:
> > Philipp Gröschler schrieb:
> > > In the course of the current Mini Summit I spent the afternoon hacking
> > > on a (yet still) small XSLT file whose purpose will be the conversion of
> > > Monotone's Texinfo Documentation to a set of multiple files which can be
> > > used for the Wiki.
> > > ....
> > 
> > I just committed the first release of this thing, in a very *pre-alpha*
> > state. 
> 
> I saw the commits before this thread, and was curious what you were up to.
> Alas, I missed the mini-summit this time.
> 
> But - excellent!
> 
> As far as output format goes, mdwn or others can be dealt with by
> ikiwiki.  The limitations there are around some of the more specific
> semantic markup: noting that this represents a command, or an option,
> or a literal vs a variable, and getting this information through to
> the point where CSS can render it with visual distinctions.  
> 
> Markdown offers some basic notations, and the opportunity to revert to
> html elements for more detailed cases, but this can be a little
> disruptive as a document author writing a wiki page (it's a sudden
> shift from minimal to more extensive internal markup).  That is much
> less an issue if, at least in the first phases, we're talking about
> keeping the source in texinfo and rendering to something that ikiwiki
> can consume to produce a better-integrated output on the website
> (indexing, etc). 
> 
> These are good examples of the discontinuity, by the way, because many
> of these element types native to texinfo are focused on software
> documentation, where markdown is more focused on general writing.
> 
> Longer term, we need to develop a strategy for more unified
> documentation.  That may involve changing the markup source for some
> components, and potentially integrating your work into ikiwiki
> (allowing it to read essentially another markup input language).  It
> almost certainly involves unifying the stylesheet, both in terms of
> the output rendering and the selection of styles available.
> 
> It also would involve allowing the creation of narrative navigation
> paths through the page collection, both as a reading guide online and
> to structure the generation of offline formats (e.g. PDF output of
> documents similar to the current manual, in an organised sequence of
> chapters and sections).
> 
> This means we'll have pages on the site intended for different
> purposes, generated from a number of mechanisms (including automatic
> aggregation via some of ikiwiki's tricks), and potentially from
> sources in different markup styles.
> 
> The great thing about this work is it (begins to) breaks the coupling
> between purpose and style, which means content can be used for
> multiple purposes regardless of style, in turn meaning that
> "unification" doesn't get confused with "markup conversion".
> 
> So, really, yay, and yay again.
> 
> --
> Dan.

There's a real need for a document file format that

  (1) behaves well with version-control software (VCS) (i.e., independent 
changes are likely to be treated as such during merging and other 
operations),

  (2) follows international standard notations, or is easily converted 
to them, and

  (3) is easily converted to popular file formats, and to the file 
formats publishers demand (such as LaTeX, PDF, and Word).

I'd like to open the discussion on whether this involves innovation in 
the VCS or in the file formats, and what kind of innovation is needed.
 
Part of the problem is that a VCS usually treats change as being 
insertion, deletion, and possibly movement of lines.  Word-processors 
usually treat the division of text into lines either as completely fluid 
(resulting in lots of spurious changes in the eyes of the VCS) or as 
absent from the file format (resulting in entire paragraphs being single 
lines).

In either case, changes like single-character typo-fixes are promoted 
into paragraph replacements, and independent typo-fixes become 
conflicting changes.
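
A small sketch of that effect, using Python's standard difflib.  The 
sample paragraph, the helper names, and the notion of "conflict" used 
here (two edits replacing or deleting the same unit of the base text) 
are illustrative assumptions, not how any particular VCS merges.

```python
import difflib

def changed_indices(base, edited):
    """Indices of `base` units that `edited` replaces or deletes."""
    matcher = difflib.SequenceMatcher(a=base, b=edited, autojunk=False)
    touched = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            touched.update(range(i1, i2))
    return touched

def edits_overlap(base, ours, theirs):
    """Simplified conflict test: do both edits touch the same base unit?"""
    return bool(changed_indices(base, ours) & changed_indices(base, theirs))

# Hypothetical paragraph stored as one long line, with two typos.
paragraph = "Teh quick brown fox jumps over teh lazy dog."
ours = paragraph.replace("Teh quick", "The quick")    # first typo fix
theirs = paragraph.replace("teh lazy", "the lazy")    # second, independent fix

def as_lines(text):
    return text.split("\n")   # the unit a typical VCS compares

def as_words(text):
    return text.split()       # a finer-grained unit

# Compared line by line, both fixes replace the same (only) line.
line_conflict = edits_overlap(as_lines(paragraph), as_lines(ours), as_lines(theirs))
# Compared word by word, the fixes touch different words.
word_conflict = edits_overlap(as_words(paragraph), as_words(ours), as_words(theirs))

print("line units conflict:", line_conflict)
print("word units conflict:", word_conflict)
```

With the paragraph held as a single line the two fixes collide; with 
words as the unit of comparison they do not.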

Further, word-processors often compress their files, resulting in 
complete loss of structure for the VCS.  I had had hopes for .odt files 
(at last there was a real standard), but zipping them (which is the 
standard) turns them into binary gibberish.  .fodt files (the same XML 
stuff, but packaged into one text file instead of zipped into 
gibberish) could be better, but here the entire text of the document 
seems to end up being one single line.
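
As a rough sketch of what unpacking could look like: an .odt file is a 
zip package whose document body lives in a member named content.xml, so 
a filter can pull that member out and re-indent it so that elements land 
on separate lines.  The function name below is hypothetical, and the 
indentation adds whitespace that a word-processor might not accept back, 
so this is only meant for producing something to diff, not for 
round-tripping.

```python
import io
import zipfile
import xml.dom.minidom

def odt_content_as_lines(odt_bytes):
    """Return content.xml from an .odt package, one element per line.

    Assumption: the result is used only for comparison; the added
    indentation is not guaranteed to be harmless if written back.
    """
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as package:
        raw = package.read("content.xml")
    return xml.dom.minidom.parseString(raw).toprettyxml(indent=" ")

if __name__ == "__main__":
    import sys
    # Usage: python thisfile.py document.odt
    with open(sys.argv[1], "rb") as handle:
        sys.stdout.write(odt_content_as_lines(handle.read()))
```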

Now the VCS could use a different difference algorithm when processing 
them.  Or it could unpack them into something easier to process (like a 
sequence of words instead of lines).  Or the word-processor could use a 
better file-format, or be careful to preserve the locations of the 
meaningless line breaks in the file, or insert many of them in standard 
places (such as sentence breaks, or punctuation, or between every two 
words).
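
Two of those options are easy to sketch as text filters.  The function 
names are made up for illustration, and the sentence splitter is 
deliberately naive (it will also break after abbreviations such as 
"e.g."), so treat this as a sketch rather than a finished tool.

```python
import re

def one_sentence_per_line(paragraph):
    """Reflow a paragraph so each sentence sits on its own line."""
    # Discard the existing (meaningless) line breaks and extra spaces.
    flat = " ".join(paragraph.split())
    # Break after '.', '!' or '?' when followed by spaces.
    return re.sub(r"(?<=[.!?]) +", "\n", flat)

def one_word_per_line(text):
    """Unpack text into one word per line; original spacing is lost."""
    return "\n".join(text.split())

sample = "It's a\nstart.  I've cobbled together\nsome stuff."
print(one_sentence_per_line(sample))
print(one_word_per_line("between every two words"))
```

Either form gives a line-oriented VCS stable places to anchor on: a typo 
fix then changes one sentence or one word instead of a whole paragraph.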

But until some such VCS-compatible file format becomes well-established, 
and easy to convert to other standard forms (and nonstandard forms like 
Word) it's the VCSs that will have to deal, if they are to be widely 
used for managing frequently-edited text.

We're trying to cobble together disparate systems to get something that 
works for now.  A kind of "Documents in our time", perhaps.  It's a 
start.  I've cobbled together some stuff for my own use, too, but I 
wouldn't want to foist it on others.  Its main virtue is that when I 
need further features I can hack together further code.  The C++ code 
that translates it is almost part of the document, and no, I don't think 
that's an appropriate style to propagate.

Now there are file formats that meet some of the technical 
requirements.  Almost anything with explicit markup that's edited by 
emacs will do, as long as emacs isn't made to flow words from one line 
to the next, and the human wielding the editor knows not to do this.  
But few have the necessary social acceptance and easy convertibility to 
and from the other formats.

-- hendrik



