Re: [Gnu-arch-users] Feature suggestion: "tla inventory -0"

From: Tom Lord
Subject: Re: [Gnu-arch-users] Feature suggestion: "tla inventory -0"
Date: Thu, 25 Dec 2003 12:27:17 -0800 (PST)

    > From: Charles Duffy <address@hidden>

    > ...resulting in null-delimited output, suitable for piping into
    > xargs -0 or the like, and thus causing The Right Thing to happen
    > in cases involving filenames with spaces.

    > Thoughts?


What I would most like to avoid longer-term is a half-hearted
accumulation of features, each intended to make filenames-with-spaces
support closer, but in actuality not adding up to anything coherent.

The null-character convention used by GNU xargs (and GNU tar as I
recall) is one strategy for dealing with such filenames -- but I think
it is a problematic one.  For example: other textutils don't
understand that convention; it looks horrible in a text editor; and
although it is fine for filenames, it can't handle fields that
contain the null character.
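
To make the trade-off concrete, here is a minimal sketch of the
null-terminated convention. The helper names `write_nul_list` and
`read_nul_list` are hypothetical and are not part of tla or of any GNU
tool; the sketch only shows why filenames round-trip cleanly while a
field that itself contains a null cannot be represented.

```python
def write_nul_list(names):
    """Emit each byte-string field followed by a NUL terminator.

    Hypothetical helper for illustration; not part of tla.
    """
    for name in names:
        if b"\0" in name:
            # The terminator is the one byte a field can never contain.
            raise ValueError("a NUL-terminated list cannot carry a field containing NUL")
    return b"".join(name + b"\0" for name in names)


def read_nul_list(data):
    """Split a NUL-terminated list back into its byte-string fields."""
    if not data:
        return []
    if not data.endswith(b"\0"):
        raise ValueError("truncated list: missing final NUL")
    # Drop the final terminator, then split on the remaining ones.
    return data[:-1].split(b"\0")


names = [b"plain.c", b"with space.txt", b"line\nbreak"]
assert read_nul_list(write_nul_list(names)) == names
```

Spaces and newlines survive because only the null byte is special,
which is also why the output is unpleasant to read in a terminal or an
editor.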

We have other needs within arch for lists (in some cases multi-field
lists) which can include odd filenames.  I'd find it easier to say yes
to incrementally adding features to arch if we first had an overall
strategy for fields that can contain non-graphical characters.

So far as I know, the choices basically come down to:

~ use 0 specially 

  losses: not terminal or editor friendly,
          can't handle 0 in fields

  wins: GNU xargs and GNU tar support it

~ use a quotation syntax (which also then has to include escapes)
  to delimit fields with some kind of quote mark

  losses: whitespace-based field separation fails,
          tools need to translate fields for many operations

  wins: pick the string syntax of your favorite scripting language

~ use an escape syntax without delimiters to map all strings into
  strings of graphical characters

  losses: tools need to translate fields for many operations

  wins: whitespace-based field separation works
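
As a rough illustration of the third option, the sketch below maps
arbitrary bytes onto graphical ASCII so that fields can be separated by
plain whitespace. The `\xHH` escape form and the function names
`escape_field` and `unescape_field` are assumptions chosen for the
example; they are not the syntax used by arch or by `unfold.c`.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def escape_field(raw: bytes) -> str:
    """Map bytes to graphical ASCII; everything else becomes \\xHH.

    Hypothetical escape syntax, chosen only for this sketch.
    """
    out = []
    for b in raw:
        # Graphical ASCII is 0x21..0x7E; backslash (0x5C) is the escape
        # character itself, so it must be escaped too.
        if 0x21 <= b <= 0x7E and b != 0x5C:
            out.append(chr(b))
        else:
            out.append("\\x%02x" % b)
    return "".join(out)


def unescape_field(text: str) -> bytes:
    """Invert escape_field."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\":
            digits = text[i + 2:i + 4]
            if (text[i + 1:i + 2] != "x" or len(digits) != 2
                    or any(c not in HEX_DIGITS for c in digits)):
                raise ValueError("malformed escape at offset %d" % i)
            out.append(int(digits, 16))
            i += 4
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


fields = [b"with space.txt", b"a\\b", b"nul\x00here"]
line = " ".join(escape_field(f) for f in fields)
# Whitespace splitting recovers the fields, including one containing NUL.
assert [unescape_field(t) for t in line.split()] == fields
```

This shows both sides of the trade-off: `split()` works on the record,
but every tool has to unescape a field before using it as a filename.
An empty field would escape to the empty string and vanish under
whitespace splitting, so a real convention would need an explicit
marker for it; the sketch does not address that.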

Of these, I think I'm mostly inclined towards the last one (but see

If you look at my full devo tree (as opposed to devo.tla) you can see
that there's a lonely directory there containing just `unfold.c'.

One direction I think is worth exploring:

~ make a full plan for arch (changeset format, log file format,
  cached inventory file format ....)

~ make a coding standards spec for tools in general to handle 
  the new conventions

~ incrementally add stuff to arch according to the plan.
  also incrementally add utils to src/text-utils according
  to the plan

One difficulty is that it's probably worth thinking about Unicode
issues in the same plan.

Another difficulty is that it's probably worth thinking about
alternative record syntaxes at the same time -- e.g., a generic syntax
for multi-line records.


All that said, the other fork in the road goes this way: 

The "software tools" paradigm is really cool, but this is a specific
instance of a general problem: that it was developed around a very 
simplistic data model.  (That's also a strength of it, of course --
e.g., the way `grep' and `join' can see the same data in two quite
different ways.)

Without wishing to start YAXFW, there is an unsolved need for a more
general, more flexible exchange format between tools.  A small core
set of data types which is extensible, a read and write syntax for
those, etc.  Of course the most obvious choice here is s-expressions.

So that's another way to go: to form a plan mostly around generalized
and language-independent s-exps instead of simple

The two forks aren't necessarily incompatible: fully-general
records-of-string-fields could perhaps be defined just as a
specialized syntax for a subset of s-exps.
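
One way to picture that compatibility: a record of string fields can
be written as a flat list of quoted strings, which is already a valid
s-expression. The sketch below is an assumption about what such a
subset might look like, with hypothetical names `write_record` and
`read_record`; it is not a proposed arch format.

```python
def write_record(fields):
    """Write string fields as a flat s-expression of quoted strings."""
    quoted = []
    for f in fields:
        # Escape backslash first, then the quote character.
        body = f.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + body + '"')
    return "(" + " ".join(quoted) + ")"


def read_record(text):
    """Read back the subset produced by write_record."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("expected a parenthesized list")
    fields = []
    i, end = 1, len(text) - 1  # end is the index of the closing paren
    while i < end:
        if text[i].isspace():
            i += 1
            continue
        if text[i] != '"':
            raise ValueError("only quoted strings are allowed in this subset")
        i += 1  # skip opening quote
        chars = []
        while i < end and text[i] != '"':
            if text[i] == "\\":
                i += 1  # take the next character literally
                if i >= end:
                    raise ValueError("dangling escape")
            chars.append(text[i])
            i += 1
        if i >= end:
            raise ValueError("unterminated string")
        i += 1  # skip closing quote
        fields.append("".join(chars))
    return fields


record = write_record(["with space.txt", 'say "hi"', "back\\slash"])
assert read_record(record) == ["with space.txt", 'say "hi"', "back\\slash"]
```

A general s-expression reader would accept these records unchanged,
while a tool that only cares about string fields can get by with the
much smaller reader above.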

