bug-bash

Re: read problem


From: Robert Elz
Subject: Re: read problem
Date: Sun, 25 Sep 2022 03:44:35 +0700

    Date:        Sat, 24 Sep 2022 11:21:53 -0500
    From:        Dennis Williamson <dennistwilliamson@gmail.com>
    Message-ID:  <CANaoh6K=29x_w1NzTmhVn5YJJepjAEg9K7G4AJqPh+gp5T2F4Q@mail.gmail.com>


  | If IFS is unset, or its value is exactly <space><tab><newline>, the
  | default, then any sequence of IFS characters serves to delimit words.
  | If IFS has a value other than the default, then sequences of the
  | whitespace characters space and tab are ignored at the beginning and
  | end of the word, as long as the whitespace character is in the value
  | of IFS (an IFS whitespace character).  Any character in IFS that is
  | not IFS whitespace, along with any adjacent IFS whitespace characters,
  | delimits a field.  A sequence of IFS whitespace characters is also
  | treated as a delimiter.  If the value of IFS is null, no word
  | splitting occurs.

That's not exactly how it works (or how it should work anyway) though it
is close.   What matters are IFS white space (any of space, tab, or
newline that is in the value of IFS), and other IFS characters (other
characters that are in IFS) and which of those appear in the string being
split.   If the string is abc\s\tdef\t\t\s\tghi and IFS is \s\t\n!
(in this message, \s represents a space, purely for clarity; no shell
accepts \s as a space, anywhere) then the string is split exactly the
same way as if IFS were \s\t\n or \s\t (since there are no !'s nor \n's
in the string, those chars in IFS are irrelevant).

The rule is that leading and trailing IFS whitespace are deleted, then the
string is split at each terminator, where a terminator is a sequence of IFS
whitespace, a single other IFS char, and more IFS whitespace (each of those
3 is optional, as long as at least one appears; two sequences of IFS
whitespace with nothing between them are obviously just a single sequence
of IFS whitespace, so that's a meaningless case).
Each of those splits terminates a field (IFS really contains field terminators,
not field separators - it is badly named, but we've had it more than 40 years,
so it isn't changing now).  "Delimits a field" as written above is technically
correct, the field is ended, but not everyone would realise that some other
non IFS whitespace character must follow before a new field starts, which is
why I prefer to think of it as "terminates" rather than "delimits".
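
A short sketch of the "terminator" reading, in portable sh (it also runs
in bash), assuming IFS is a space plus a colon.  As before, split_with is
a hypothetical helper made up for this illustration, not anything the
shell provides.

```shell
# Hypothetical helper: split string $1 using $2 as IFS and print each
# resulting field in brackets (runs in a subshell, so nothing leaks out).
split_with() (
    IFS=$2
    set -f                    # no pathname expansion on the unquoted $1
    out=
    for field in $1; do       # unquoted on purpose: this is the field splitting
        out="${out}[${field}]"
    done
    printf '%s\n' "$out"
)

split_with 'a:b'   ' :'    # [a][b]
split_with 'a:b:'  ' :'    # [a][b]  the final ":" ends field b, nothing follows it
split_with 'a : b' ' :'    # [a][b]  whitespace, ":", whitespace is one terminator
split_with ' a:b ' ' :'    # [a][b]  leading and trailing IFS whitespace are dropped
```

The second call is the interesting one: if ":" were a separator there
would be an empty third field after it, but as a terminator it simply
ends b and no new field ever starts.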

After all of that (since we know the string wasn't quoted, or it wouldn't be
being split) any empty fields can be deleted.

The first sentence starts off correctly, but it is generally easier to
treat it as "If IFS is unset, it is treated as if its value is \s\t\n"
and then just apply the rest of it as if that were the value.   If IFS
is set, but empty (IFS=''), and any time its value contains no chars that
are in the string being split (which must be true if IFS is empty), then
no splitting happens, and the string is unchanged.
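
The unset, default, and empty cases can be compared with a sketch like
the following, in portable sh (it also runs in bash).  The function
count_fields and the variable names are hypothetical, chosen only for
this illustration; each case runs in a command substitution so the IFS
changes stay local to it.

```shell
# Hypothetical helper: report how many fields it was handed.
count_fields() { echo "$#"; }

tab=$(printf '\t')
nl='
'
str="one two${tab}three"

# IFS unset: behaves as if its value were space, tab, newline.
unset_count=$(unset IFS; set -f; count_fields $str)
# IFS explicitly set to the default value.
default_count=$(IFS=" ${tab}${nl}"; set -f; count_fields $str)
# IFS set but empty: no splitting at all.
empty_count=$(IFS=; set -f; count_fields $str)
# IFS contains only a char that is not in the string: no splitting either.
colon_count=$(IFS=:; set -f; count_fields $str)

echo "unset=$unset_count default=$default_count empty=$empty_count colon=$colon_count"
# unset=3 default=3 empty=1 colon=1
```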

kre



