Re: [Gnu-arch-users] Re: {arch} directory

From: Jason McCarty
Subject: Re: [Gnu-arch-users] Re: {arch} directory
Date: Thu, 25 Sep 2003 11:56:03 -0400
User-agent: Mutt/1.3.28i

Stephen J. Turnbull wrote:
> >>>>> "Jason" == Jason McCarty <address@hidden> writes:
>     Jason> Why would I encode ? as anything other than 0x3F? And why
>     Jason> would I complain if arch gave me back exactly that?
> ROTFLSHIPIMP!  I couldn't have asked for a better straight line!
> You see (well, actually I suppose you don't :-( ), your system
> substituted Unicode 003F for Unicode 0109, providing an apt example.

I don't get it. I expected mutt to encode the question mark as 0x3F, and
AFAICT it did. Unicode doesn't even enter the picture, just ASCII.
Admittedly I know very little about Unicode and encodings, but that at
least is what hexdump tells me. On a side note, why does Unicode have
its own question mark? I thought it incorporated ASCII as a subset.
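
The subset relationship can be checked from the shell. This is only a
minimal sketch, assuming a POSIX printf, od, and tr; the hexbytes helper
is a name made up for illustration. It dumps the bytes of the ASCII
question mark and of the UTF-8 encoding of U+0109 (c with circumflex).

```shell
# hexbytes is a hypothetical helper: print the bytes of its argument
# as lowercase hex with od's whitespace stripped.
hexbytes() { printf '%s' "$1" | od -An -tx1 | tr -d ' \n'; }

# The ASCII question mark is a single byte; UTF-8 encodes U+003F identically.
ascii_qmark=$(hexbytes '?')

# U+0109 in UTF-8 is the two bytes 0xC4 0x89, written here as octal escapes.
c_circumflex=$(hexbytes "$(printf '\304\211')")

echo "$ascii_qmark"     # 3f
echo "$c_circumflex"    # c489
```

A mailer that cannot represent U+0109 in its outgoing charset and
substitutes a question mark would produce the first byte sequence in
place of the second.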

> Fortunately, the difference between ? and ? is immediately evident to
> my eye, so the bug was localized without trouble.
> Now, what is fairly likely to happen on sophisticated systems is that
> that character will be translated into a non-Unicode "native" encoding
> of the _same_ character but different bytes, and any references to
> that filename embedded in file content (eg, Makefiles) will fail to
> match.  Arch will very precisely replicate that mistake, and transmit
> it to other systems, where it may randomly start working again!  Or
> randomly break in a different fashion.

That sounds very broken. If a system understands Unicode, why not
preserve every character with its original encoding in the filename?
Translating filenames is just asking for trouble, and will break more
than arch.
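
The failure mode in the quoted text can be sketched like this. The
filenames are hypothetical, and ISO-8859-3 is used only as one example
of a legacy encoding that contains the same character (as byte 0xE6, as
far as I know). The point is that a byte-wise comparison fails even
though both names display identically.

```shell
# The same visible name, "c-circumflex.txt", in two encodings.
# UTF-8 encodes U+0109 as 0xC4 0x89 (octal 304 211).
name_in_makefile=$(printf '\304\211.txt')

# Assumed legacy form: ISO-8859-3 encodes the same character as 0xE6 (octal 346).
name_on_disk=$(printf '\346.txt')

# A byte-wise comparison, which is what a Makefile lookup amounts to.
if [ "$name_in_makefile" = "$name_on_disk" ]; then
    match=yes
else
    match=no
fi
echo "$match"    # no
```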

> And it will be just as impossible to see on screen as the difference
> between a SPACE and a NO-BREAK SPACE, unless you know what to look for
> and deliberately prevent your system from being "helpful".

I can do without the kind of "help" that breaks things behind my back :-)
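
For reference, here is a small sketch of how the two characters differ
on disk, assuming UTF-8 filenames and a POSIX printf and od. The
hexbytes helper and the name "foo bar" are made up for illustration.

```shell
# hexbytes is a hypothetical helper: print the bytes of its argument
# as lowercase hex with od's whitespace stripped.
hexbytes() { printf '%s' "$1" | od -An -tx1 | tr -d ' \n'; }

# "foo bar" with an ordinary SPACE (0x20).
with_space=$(hexbytes 'foo bar')

# "foo bar" with NO-BREAK SPACE, U+00A0, which UTF-8 encodes as 0xC2 0xA0.
with_nbsp=$(hexbytes "$(printf 'foo\302\240bar')")

echo "$with_space"   # 666f6f20626172
echo "$with_nbsp"    # 666f6fc2a0626172
```

The two names usually render the same in a terminal, but a hex dump
tells them apart.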

>     >> The only interesting restrictions are on ASCII characters,
>     >> namely, the space.  (Which is a bloody stupid thing to have in
>     >> a file name; all the systems that support that also know about
>     >> NO-BREAK SPACE.)
>     Jason> Bah, space is a perfectly acceptable character in
>     Jason> filenames.
> That's simply false.  Even on systems that provide for it, there are
> many contexts where things break; for example, constructs like
> "less `which baseball.bat`" will necessarily fail if 'baseball.bat'
> lives in 'C:\Program Files'.  Now suppose 'baseball.bat' lives in
> 'C:\Program&nbsp;Files'.  You and the shell can both Just DTRT.  No?

Quoting is my panacea:
  less "`which baseball.bat`"
or in zsh,
  less =baseball.bat

That's rarely an issue for me anyway, as completion deals with spaces
fine, and zsh's globbing appears to be smart enough to quote/escape when
it needs to.
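
To make the quoting point concrete, here is a minimal sketch for a
POSIX-style shell (sh or bash) with the default IFS. fake_which is a
hypothetical stand-in for `which baseball.bat`, and the path is
invented; count_args just reports how many words the shell passed on.

```shell
# Hypothetical stand-in for `which baseball.bat`: prints a path with a space.
fake_which() { printf '%s\n' '/tmp/Program Files/baseball.bat'; }

# Report how many separate arguments were received.
count_args() { echo "$#"; }

# Unquoted command substitution undergoes word splitting on the space.
unquoted=$(count_args $(fake_which))

# Quoted command substitution is passed through as a single word.
quoted=$(count_args "$(fake_which)")

echo "$unquoted"   # 2
echo "$quoted"     # 1
```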

I suppose foo&nbsp;bar would eventually be acceptable if the shell
supported it, but it seems unnecessary, since spaces work for me _now_.
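
As a rough check of what sh or bash does with a literal NO-BREAK SPACE
(not the &nbsp; spelling) when nothing is quoted: the default IFS
contains only space, tab, and newline, so the NBSP bytes do not trigger
word splitting. The paths and the count_args helper are invented for
illustration, and UTF-8 is assumed.

```shell
# Report how many separate arguments were received.
count_args() { echo "$#"; }

# Path containing NO-BREAK SPACE (UTF-8 bytes 0xC2 0xA0, octal 302 240).
nbsp_path=$(printf '/tmp/Program\302\240Files/baseball.bat')

# Same path with an ordinary SPACE.
space_path='/tmp/Program Files/baseball.bat'

# Deliberately unquoted expansions, to show the splitting behaviour.
nbsp_words=$(count_args $nbsp_path)
space_words=$(count_args $space_path)

echo "$nbsp_words"    # 1
echo "$space_words"   # 2
```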

> So it is perfectly _unnecessary_.
> N.B. The user interface conventions that would provide convenient ways
> to enter NBSP in filename contexts are left as an exercise.

The ampersand and semicolon are just as problematic as the space --
if the shell supports that transparently, it can just as easily support

