monotone-devel

Re: [Monotone-devel] line endings as project policy


From: hendrik
Subject: Re: [Monotone-devel] line endings as project policy
Date: Wed, 22 Nov 2006 10:05:06 -0500
User-agent: Mutt/1.5.9i

On Wed, Nov 22, 2006 at 10:40:58AM +0100, Richard Levitte - VMS Whacker wrote:
> In message <address@hidden> on Tue, 21 Nov 2006 23:59:41 -0800, "Justin 
> Patrin" <address@hidden> said:
> 
> papercrane> I haven't read the line endings with 0.31 thread yet
> papercrane> but...ugh. Is it really necessary to mangle line endings
> papercrane> when checking out files? I mean really....shouldn't people
> papercrane> just use a capable text editor if they're contributing to
> papercrane> a project?
> 
> If it was as easy as the editor.  Trouble is, different systems have
> different standards, and a lot of programmers know only one of the
> systems with no understanding of the rest of the world (this goes for
> Windows, Unix and VMS programmers alike, and I think this discussion
> shows it).  So far, I've seen editors make a mess (think notepad.exe),
> at least one shell (/bin/sh on Solaris) barf all over the place when
> it sniffs the presence of a CR, and at least one C compiler (don't
> recall which, but it was fairly recent) do the same.
> 
> As soon as you're dealing with software that transfers files between
> different platforms, this becomes the eternal problem to deal with.
> FTP had to.  Editors are typically NOT the kind of software that
> should need to deal with this kind of problem, because editors do NOT
> typically transfer files between different platforms.  Same goes for C
> compilers, shells and so on.  You can't blame them for being fed
> something that's completely unexpected for the system they live in.
> Sticking our heads in the sand doesn't change this.
> 
> My point is, it's really up to monotone to do something that's at
> least sensible in most of the cases.  Right now, as soon as you start
> dealing with line endings (which is what you do as soon as you hack
> the lua function get_linesep_conv()), you take a shot at screwing up,
> royally.
> 
> There are a few proposals I actually liked, and most of all, the
> fella' that suggested monotone could check that line endings are
> consistent for anything it suspects being text.
> 
> Basically, it comes down to a few items, some of which I regard as
> fact, others I regard as questions:
> 
>  - We need to treat files as binary unless told otherwise.  This I
>    regard as a fact.  (see the problem with screwed up files without
>    the user knowing about it)

Agree.  This is an essential safety constraint.

> 
>  - We need to mark text files as such.  This I regard as fact, and it
>    seems to me like this is almost consensus.

Agree.

> 
>  - We need to convert line endings to the local standard on anything
>    that's assumed to be text on checkout.  This I regard as a fact.
>    (see the problem that some Unixly programs have with embedded \r)

This seems obvious, but I have some discomfort with the idea.  Perhaps
because I'm thinking of the wider issues involved in character set
incompatibility.  In any case, conversion on checkout should be
overridable in some way.
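
To make "overridable" concrete, here is a minimal sketch in Python.  It
is not monotone code: the function name, the is_text flag and the
convert/target parameters are all invented for illustration, and
treating a lone CR as a line ending is an assumption.

```python
import os
from typing import Optional


def convert_on_checkout(data: bytes, is_text: bool,
                        target: Optional[bytes] = None,
                        convert: bool = True) -> bytes:
    """Hypothetical checkout filter; binary files are never touched."""
    # Safety constraint: anything not marked as text passes through as-is.
    # convert=False is the user's override to keep the stored bytes.
    if not is_text or not convert:
        return data
    # Default to the local platform convention unless one is forced.
    if target is None:
        target = os.linesep.encode("ascii")
    # Fold CRLF and lone CR (assumed to be line endings) to LF first,
    # then expand every LF to the requested ending.
    normalised = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalised.replace(b"\n", target)
```

With convert=False, or with is_text unset, the bytes come back
unchanged, which is the behaviour a user would want when the guess
about a file turns out to be wrong.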

>  - We need to make a choice, either we treat all files as binary and
>    only mark them as text and what line ending they seem to go by, or
>    we need to convert to some internal line ending standard.  It seems
>    to me this is still a question, although most seem to lean toward
>    an internal line ending standard, which is what monotone does now.
>
>  - IF we go for an internal line ending standard, we need to CHOOSE
>    one and stick with it, not have the user choose one for us.  I
>    don't currently recall if it is already this way today or if we're
>    relying on the first element returned by get_linesep_conv().  If
>    it's the latter, we need to stop that.  This I regard as fact.

If we use an internal line ending standard, we should consider the
possibility of using the control character defined for exactly this
purpose: NEL, "Next Line", 0x85 in the C1 control set, Unicode U+0085.
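
One thing worth checking before picking NEL is how it is represented
on disk.  The Python sketch below is only an illustration (the helper
names are made up): it shows that NEL is the single byte 0x85 in
Latin-1 but the two bytes C2 85 in UTF-8, and maps the usual endings
to NEL.

```python
NEL = "\u0085"  # Unicode NEXT LINE


def nel_encodings() -> dict:
    """Byte representation of NEL in two common encodings."""
    return {"latin-1": NEL.encode("latin-1"),
            "utf-8": NEL.encode("utf-8")}


def to_internal(text: str) -> str:
    """Map CRLF, lone CR and LF to NEL (hypothetical internal form)."""
    return text.replace("\r\n", "\n").replace("\r", "\n").replace("\n", NEL)
```

So the choice of NEL is tied to the choice of internal encoding: a bare
0x85 byte is not valid UTF-8 on its own.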

> The rest, such as merge problems to deal with, will come and will have
> to be treated when they do.  But first, we need to make decisions and
> stick by them.  The discussion on line endings has popped up a little
> now and then, and been left off with a few question marks and nothing
> else happening, just to come up again a few months later.  It's time
> things get decided upon so we can actually get the work done, and I
> don't believe in someone just doing and that be the winning thing,
> because months later, there's gonna be a whiner who says we f*cked up
> royally.

whiner is almost an anagram of winner :-)

> Let's get it right and reach consensus instead, well
> grounded in our minds and our wills.

To get it really well-grounded, we might also consider it in the context 
of character set conversion.  Points that are easy to overlook with 
respect to line endings may be glaringly obvious in this larger context.  
Even if we don't solve the larger context, it may make decisions clear 
with the smaller one.

> 
> So, anything I forgot?

Just how do we mark files as being text in the database?  Will it
conceptually be part of the checked-in revision, and editable and
mergeable like anything else?

Just how does the user mark files as being text?  A specific parameter 
on initial checkin, to be changed later on checkin?  A default for new 
files based on the last few letters of the name?  A sanity check whether 
the file is really of the type claimed?
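
As a rough sketch of the last two ideas, in Python: a default guessed
from the file name suffix, and a sanity check that a file claimed to be
text has no NUL bytes and uses one line-ending style throughout.  The
suffix list and all function names are hypothetical, and "no NUL, one
style" is only one possible definition of plausible text.

```python
import os

# Hypothetical suffix list; a real project would configure its own.
TEXT_SUFFIXES = {".c", ".h", ".txt", ".lua"}


def default_is_text(name: str) -> bool:
    """Guess the default for a new file from its name suffix."""
    return os.path.splitext(name)[1].lower() in TEXT_SUFFIXES


def line_ending_styles(data: bytes) -> set:
    """Return which of CRLF, LF and CR occur in the data."""
    styles = set()
    crlf = data.count(b"\r\n")
    if crlf:
        styles.add("CRLF")
    # LF or CR bytes beyond those inside CRLF pairs are lone endings.
    if data.count(b"\n") > crlf:
        styles.add("LF")
    if data.count(b"\r") > crlf:
        styles.add("CR")
    return styles


def plausible_text(data: bytes) -> bool:
    """Sanity check: no NUL bytes and at most one line-ending style."""
    return b"\0" not in data and len(line_ending_styles(data)) <= 1
```

A file that fails plausible_text() would be left as binary, or the
checkin refused, rather than silently converted.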

Can we uncompress compressed files so as to better diff/merge the
contents and recompress on checkout?  This might be very helpful for
OpenOffice files.
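
OpenDocument files are ZIP archives, so the idea could look roughly
like the Python sketch below.  The unpack/repack names are invented
here, and this ignores format details such as member ordering and
per-member compression settings.

```python
import io
import zipfile


def unpack(archive: bytes) -> dict:
    """Return a mapping of member name to uncompressed contents."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def repack(members: dict) -> bytes:
    """Recompress the members into a new ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()
```

The catch is that the repacked archive has the same members but is
generally not byte-for-byte identical to the original, which matters
if the file is identified by a hash of its contents.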

How do we handle the transition between the current conventions and the 
new ones?

Are we currently storing files as Unicode or UTF-8?  (I think only admin
information such as file names.)  Should we store text files as
UTF-8?

-- hendrik



