Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?
From: Marcus Sundman
Subject: Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?
Date: Sun, 29 Aug 2004 14:57:37 +0300
User-agent: KMail/1.7
On Sunday 29 August 2004 03:02, Michael Poole wrote:
> Marcus Sundman writes:
> > On Saturday 28 August 2004 20:02, Michael Poole wrote:
> > > 1) You have not defined any specific problems that you want to solve,
> > > then you assume that we are too stupid to solve your problem. So
> > > far you have made complaints analogous to "arch should solve the
> > > code attribution problem."
> >
> > First of all I originally only tried to get answers to a few questions,
> > and almost immediately people started bashing wildly.
> >
> > That said, the problem here was very specific. No link in the chain
> > should lose the essential piece of metadata referred to as "encoding
> > info". If it is lost then there is no way to get it back. How is this
> > not specific?
>
> Where is this metadata established? I know of no editor on my Linux
> or Windows machines that records "encoding info," except within the
> byte stream of the files they work on.
The metadata is established when the string is encoded. Duh!
Sigh.. I've already said this, but sure, I can say it once more...
If the encoding isn't specified explicitly then it's implicitly the system's
default encoding, as defined by your environment settings. At least this is
how it's done in most systems today. E.g. when you write "echo foo >bar"
then the file "bar" will be created in the local system's default encoding.
This usually works reasonably well until the file leaves the local system.
Then you also have to send the encoding metadata along, or the file
becomes unusable.
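
To make the point concrete, here is a minimal Python sketch. The sample string and the helper name implicit_encoding are made up for illustration; the sketch only shows that the same text turns into different bytes depending on which encoding is applied, and that the "implicit" choice comes from the environment.

```python
import locale


def implicit_encoding() -> str:
    """Return the encoding the local environment implies when none is given."""
    # False: report the configured preference without changing the locale.
    return locale.getpreferredencoding(False)


text = "na\u00efve caf\u00e9"  # two non-ASCII characters

# The same string becomes different byte streams under different encodings.
as_utf8 = text.encode("utf-8")
as_cp1252 = text.encode("windows-1252")

print("environment default:", implicit_encoding())
print("utf-8 bytes:        ", as_utf8)
print("windows-1252 bytes: ", as_cp1252)
print("identical on disk?  ", as_utf8 == as_cp1252)
```

Nothing in either byte stream says which encoding produced it, which is why the label has to travel along with the file.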
> The kind of specifics I would like is a description like "I commit a file
> using ISO-8859-15 into arch, and someone who gets that file and opens it
> using an ISO-8859-1 editor gets the wrong non-ASCII characters."
OK.
"I commit a file using windows-1252 into arch, and someone who gets that
file and opens it using a UTF-8 editor gets the wrong characters."
So, why didn't he simply open the file as windows-1252 instead of UTF-8? Because
he didn't know what encoding the file was in, damn it! Why not? Because
arch threw away that piece of info!
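
The scenario above can be reproduced in a few lines of Python. This is only a sketch: the sample text is invented, and the "editor" is simulated by decoding the bytes with the wrong and then the right encoding.

```python
original = "caf\u00e9 \u20ac5"  # contains an e-acute and a euro sign

# What ends up in the archive: raw bytes, with no encoding label attached.
stored = original.encode("windows-1252")

# A UTF-8 reader that insists on strict decoding fails outright...
try:
    stored.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ...and a lenient one substitutes replacement characters instead.
lossy = stored.decode("utf-8", errors="replace")

# With the label, the text round-trips exactly.
correct = stored.decode("windows-1252")

print("strict UTF-8 decode succeeded:", strict_ok)
print("lenient UTF-8 decode:", lossy)
print("windows-1252 decode: ", correct)
```

The bytes are identical in all three cases; only knowledge of the encoding separates the correct reading from the broken ones.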
> The obvious question about that case is:
> Suppose arch records and can report the encoding. How does that help
> a user who needs arch's assistance to discover the encoding?
Huh? You have answered your own question. If you need to know the encoding
then obviously it helps if arch can tell it.
> > > 2) You insist that the best way to solve an uncommon problem (most
> > > users have no confusion about encoding systems) is by arch
> > > providing a special-purpose hook.
> >
> > I have insisted no such thing. Also, in my experience the problem is
> > way too common.
>
> If it is not a special-purpose hook, what generic mechanism exists
> that permits arch to record this metadata?
There are several alternatives. E.g., you could provide the info as command
line args, and you could have per-user, per-project, per-module and/or
per-filetype defaults, so that you don't have to use the command line
switch. The arch client could also detect the local system's default
encoding and default to that if nothing else is specified. There are
probably a lot more ways, too, but something tells me you're not in the
slightest interested in even thinking about it. No, since you haven't
experienced the problem (or at least think you haven't) then the problem
obviously doesn't exist, so you bitch and moan to your heart's content when
the issue is brought up. What a nice attitude.
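
As a rough sketch of the alternatives listed above (a command-line argument, then layered defaults, then the local system's default encoding), here is one way the lookup could work. Everything in it is hypothetical: the function name resolve_encoding, its parameters and the precedence order are assumptions made for illustration, not anything arch or tla provides, and per-module defaults are left out for brevity.

```python
import locale
import os
from typing import Mapping, Optional


def resolve_encoding(
    filename: str,
    cli_encoding: Optional[str] = None,
    filetype_defaults: Optional[Mapping[str, str]] = None,
    project_default: Optional[str] = None,
    user_default: Optional[str] = None,
) -> str:
    """Pick an encoding label for a file; the most specific source wins."""
    # 1. An explicit command-line value overrides everything else.
    if cli_encoding:
        return cli_encoding
    # 2. A per-filetype default, keyed by lower-cased extension (assumed layout).
    ext = os.path.splitext(filename)[1].lower()
    if filetype_defaults and ext in filetype_defaults:
        return filetype_defaults[ext]
    # 3. Per-project, then per-user defaults.
    for candidate in (project_default, user_default):
        if candidate:
            return candidate
    # 4. Last resort: whatever the local environment implies.
    return locale.getpreferredencoding(False)


print(resolve_encoding("notes.txt", cli_encoding="windows-1252"))
print(resolve_encoding("Main.java", filetype_defaults={".java": "utf-8"}))
print(resolve_encoding("README", project_default="iso-8859-15"))
print(resolve_encoding("README"))
```

The resolved label would then be stored next to the file's contents at commit time, so that whoever checks the file out can be told what it is.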
> I do not discard the value of your experience, but "way too common" is
> both subjective and vague. My experience is to the contrary -- mostly
> because people tend to know what coding system is used by files they
> open or edit -- and I do not know of any reason to accept your
> experience as more accurate than mine.
Huh? First of all, my experience is very "accurate". There's nothing
inaccurate about having trouble with different encodings in mixed systems
environments.
E.g., in my company we currently have two teams, one that uses UTF-8 and one
that uses a mix of ISO-8859-15 and windows-1252. We also have a library
"module" that is imported into both teams' source code trees. It's obvious
that this causes trouble, and there is nothing inaccurate about the fact.
Secondly, I disagree that people tend to know what encoding is used. Mostly
people seem to simply ignore the issue and hope for the best. Many have
decided to use only English, just because they've noticed those characters
look the same for all team members.
Still, even if the majority weren't experiencing problems, that doesn't
mean that you should just screw over the minority. Of course you have to
draw the line somewhere, but this particular minority isn't very small, and
it'll only get larger as a result of further internationalization.
> > > If you want us to take you seriously, it would be helpful to be very
> > > specific about how and where you believe your problem occurs and why
> > > arch is a good place to solve this problem.
> >
> > The problem occurs when one link in the chain behaves badly. Arch is
> > one link in the chain. Exactly what is it that you don't understand?
>
> As I explained above, I still do not understand what specific problem
> you want to solve.
Now there we have the comprehension problem again. Sigh...
Sorry, I just don't know how to say it more clearly.
> There is a chicken-and-egg problem with standards to record this:
> until some standard storage mechanism exists, tools will randomly
> destroy the metadata. But until tools exist, many implementors will
> reject a proposed storage mechanism as not truly standard.
How can anyone have such an amazingly narrow field of view?
Read my lips: you don't have to use EAs or similar. I have already mentioned
several alternatives. Get a clue already!
In general, you don't have to make something perfect from day one. That
doesn't mean that there is no way of making it good, or even perfect, in
steps.
> The main problem I see with common filesystems is that, in the general
> case, the metadata has to be stored in a separate file. When multiple
> streams per file are supported by more operating systems, a meaningful
> mechanism can be used. Until then, there can be only fragile kludges
> to address the problem.
One has to start somewhere. Otherwise it's impossible to get around
chicken-and-egg problems.
> If your proposal describes how to use EAs, named streams, or whatever
> other OS/FS-specific mechanism implements per-file metadata, I would
> like to hear it
It doesn't. Those are implementation details, and as such need to be worked
out by people more experienced with arch.
Oh, it just dawned on me that maybe we are miscommunicating because you
think I'm talking on the implementation level when I'm actually talking on
the conceptual level. I'm sorry if I've misled you.
> I apologize for offending you.
Apology accepted. I also apologize for using somewhat harsh words
occasionally.
- Marcus Sundman