bug-sed



From: Assaf Gordon
Subject: bug#26574: v4.4: POSIX violation with respect to output of a trailing newline, even with --posix
Date: Thu, 20 Apr 2017 18:32:22 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

Hello,

On Thu, Apr 20, 2017 at 11:46:15AM -0500, Eric Blake wrote:
> On 04/20/2017 11:36 AM, Michael Klement wrote:
>> Thanks for the detailed feedback, Eric.
>>
>> The POSIX spec. is, unfortunately, vague on this topic:
>>
>> The definition of a line (which you quote) is complemented with the
>> definition of an incomplete line
>> <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_195>:
>>
>>   A sequence of one or more non-<newline> characters at the end of the file.
>>
>> So while the standard is aware of this possibility and gives it a name that
>> suggests it is a kind of line, but something's missing, there is precious
>> little behavior prescribed with respect to such incomplete lines.
>
> You're welcome to submit a bug report to get POSIX to more clearly word
> its intentions that a file with an incomplete line is NOT a text file
> (http://austingroupbugs.net/main_page.php), but everyone on the Austin
> Group (myself included) has already agreed that the intention is there
> (even if the wording could be improved): Omitting a trailing newline
> causes sed to enter into the realm of undefined behavior - and this is
> BECAUSE there are existing sed implementations that behave differently
> when a trailing newline is omitted.  Some do not do anything with an
> incomplete line (sed behaves as though the file were truncated at the
> last newline).
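
Whether a given input falls into this undefined territory can be checked
before handing it to sed. A minimal sketch, assuming a POSIX shell and
'tail -c'; the function name ends_with_newline is made up for illustration
and is not part of sed or POSIX:

```shell
# Hypothetical helper: succeeds when the file named by $1 is empty or
# its last byte is a newline, i.e. it has no incomplete last line.
ends_with_newline() {
    # Command substitution strips trailing newlines, so the result is
    # empty exactly when the last byte is a newline (or there is no byte).
    [ -z "$(tail -c 1 -- "$1")" ]
}
```

It could be used as: ends_with_newline input.txt || echo "incomplete last line" >&2
(input.txt is a placeholder name).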


For completeness, here's the behaviour of several implementations:

sed implementations that do not add a newline (like GNU sed):
  FreeBSD 10
  OpenBSD 5.9
  BusyBox 1.22
  ToyBox 7.2
  AIX 7

sed implementations that do add a newline:
  NetBSD 7.0
  Heirloom

SunOS 5.11's sed prints nothing if there is no newline:
  $ printf 'a' | sed '' | od -tx1
  0000000
  $ printf 'a\n' | sed '' | od -tx1
  0000000 61 0a
  0000002
  $ uname -a
  SunOS unstable11s 5.11 11.2 sun4u sparc SUNW,SPARC-Enterprise
  $ which sed
  /usr/bin/sed


The behaviour (when processing input whose last line has no trailing newline) also differs across other programs, languages and implementations:

  $ printf a | perl -npe '' | od -tx1
  0000000 61
  0000001

  $ printf a | perl -lnpe '' | od -tx1
  0000000 61 0a
  0000002

  $ printf a | awk '{print}' | od -tx1
  0000000 61 0a
  0000002

  $ printf 'a' | sh -c 'while read A ; do echo $A ; done' | od -tx1
  0000000

  $ printf 'a' \
     | python3 -c 'import sys; [print(x,end="") for x in sys.stdin]' \
     | od -tx1
  0000000 61
  0000001

  $ printf a | uniq-gnu | od -t x1
  0000000 61 0a
  0000002

  $ printf a | uniq-freebsd-11 | od -t x1
  0000000    61
  0000001

  $ printf a | cut-gnu -f1 | od -tx1
  0000000 61 0a
  0000002

  $ printf a | cut-freebsd-11 -f1 | od -tx1
  0000000    61
  0000001

  $ printf a | sort | od -t x1
  0000000 61 0a
  0000002
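
The probes above all follow the same pattern, so they can be wrapped in a
small function to try on other tools or systems. This is only a sketch,
assuming a POSIX shell with od and tr; probe_incomplete_line is a made-up
name:

```shell
# Hypothetical probe: feed the single byte "a" (no trailing newline) to the
# command given as arguments and report what comes out.
probe_incomplete_line() {
    # od -An -tx1 prints the bytes in hex without offsets; tr removes the
    # whitespace, whose layout differs between od implementations.
    out=$(printf 'a' | "$@" | od -An -tx1 | tr -d '[:space:]')
    case $out in
        61)   echo "kept as-is" ;;
        610a) echo "newline added" ;;
        '')   echo "line dropped" ;;
        *)    echo "other: $out" ;;
    esac
}
```

For example, probe_incomplete_line sed '' should report "kept as-is" for the
implementations in the first list above, and "newline added" for those in
the second.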


And this reinforces what Eric wrote: there is simply no
'one correct' (or agreed-upon) way to deal with files without newlines on the last line.
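
For the shell 'while read' loop shown earlier, which drops the incomplete
line, a common workaround is to also test the variable when read fails.
This is a sketch of that idiom, not something prescribed by POSIX or by
anyone in this thread; print_lines is a made-up name:

```shell
# Hypothetical helper: copy stdin to stdout line by line, including a final
# line that lacks a newline (which is emitted with one added).
print_lines() {
    # read returns non-zero at end of input but still stores any partial
    # line in $line, so the extra test catches an incomplete last line.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done
}
```

With this loop, printf 'a' | print_lines emits the line followed by a
newline, which is the same choice awk and GNU sort make above.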


regards,
- assaf




