[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Q: On Windows why not ignore CRLF?

From: Edward Welbourne
Subject: Re: Q: On Windows why not ignore CRLF?
Date: Fri, 2 Jun 2017 08:53:12 +0000

Paul Smith (31 May 2017 19:48:57 -0400)
>> >   #if !defined(WINDOWS32) && !defined(__MSDOS__) && !defined(__EMX__)
>> >         /* Check to see if the line was really ended with CRLF; if so 
>> > ignore
>> >            the CR.  */
>> >         if ((p - start) > 1 && p[-2] == '\r')
>> >           {
>> >             --p;
>> >             memmove (p-1, p, strlen (p) + 1);
>> >           }
>> >   #endif
>> >
>> > I'm not sure about this implementation (performance-wise) but leaving
>> > that aside, I don't understand why this code is ifdef'd out on
>> > Windows.

hmm ... yes, memmove() isn't something you want to do to the rest of the
file contents following each line-ending ... surely there's some better
way to ignore '\r's at the ends of lines ... even if it's just one
incremental memmove, increasing the offset at each '\r', instead of the
quadratic behaviour of doing memmove to "the rest of the data" after
each '\r'.

Eli Zaretskii writes:
>> Well, it's your code, or so it seems ;-)  And it was always ifdef'ed
>> away on those platforms, since the day it was committed back in 1999
>> (see commit 7052a57).

So Paul's had eighteen years to forget what lead to this - illustrating
the importance of writing reasons in comments (or, at least, commit
messages) ...

>> > I mean, CRLF is more prevalent on Windows so why wouldn't we have this
>> > there?

There would at least be no harm in it.
It just might take some oddities for this code to trigger there ...

>> > Is the idea that on Windows we want to preserve the CRLF, for some
>> > reason?  I'm not sure I see the point in doing that when we're parsing
>> > the makefile; I mean we'd throw away all the newlines on UNIX, so why
>> > would we preserve the CRLF on Windows?

>> I think Ray is right: the CRLF combination simply cannot happen on
>> those platforms, because Make there uses text-mode reads, so the CR
>> characters are simply gone by the time we get here.

address@hidden (1 June 2017 16:55)
> I concur that the code is disabled on Windows due to text mode
> stripping away the '\r' and leaving only '\n' in the input stream.

As ANSI X3.159-1989 Section 4.9.2, Streams, puts it:

  A text stream is an ordered sequence of characters composed into
  lines, each line consisting of zero or more characters plus a
  terminating new-line character.  [...]  Characters may have to be
  added, altered or deleted on input and output to conform to differing
  conventions for representing text in the host environment.  Thus,
  there need not be a one-to-one correspondence between the characters
  in a stream and those in the external representation. [...]

The MS-libc thus, for files opened in text mode, converts the native
CRLF (that binary mode sees as "\r\n") into C's canonical '\n'.
Likewise, old Mac-libc converted its '\r' to a '\n' in text mode (I trust).

> I surmise it's enabled on non-Windows builds to allow them to work
> wtih Makefiles created on a Windows machine.

Indeed, at least when files aren't opened with a 'b' flag, DOS-based
systems are exactly where CRLF will appear as '\n' without a '\r' to
complicate life.  If a contributor on MS hasn't configured line-ending
settings suitably (where their version control even has them, as in
git), their make-file may come out of version control with DOS line
endings that'll look like "\r\n" to everyone not on MS.

The remaining question is whether it's *possible* for a text-mode file
handle on MS to still have a '\r' just before its '\n' in a line,
presumably due to a CRCRLF mess in the raw file.  If so, the #if-ery
above will lead to make tripping over the stray '\r'; then again, on
non-MS, that "\r\r\n" sequence would trip up this code anyway, as it
only checks for one.  I would be tempted to get rid of the #if-ery, make
the if a while and find something better than memmove() by which to
ignore such '\r' bytes - but leaving it as it is is OK, too (albeit a
better comment might save a reprise of this conversation in the mid


reply via email to

[Prev in Thread] Current Thread [Next in Thread]