From: Stahlman Family
Subject: Re: Trailing null fields are discarded while leading ones are preserved
Date: Wed, 20 Dec 2006 07:03:51 -0600
----- Original Message -----
From: "Chet Ramey" <chet.ramey@case.edu>
To: "Stahlman Family" <brettstahlman@comcast.net>
Cc: <bug-bash@gnu.org>; <chet@case.edu>
Sent: Tuesday, December 19, 2006 9:00 AM
Subject: Re: Trailing null fields are discarded while leading ones are preserved
Stahlman Family wrote:

I guess the question is, what is meant by "delimits a field" in the following excerpt from the Bash manual?

"Any character in IFS that is not IFS whitespace, along with any adjacent IFS whitespace characters, delimits a field."

I suppose I'm interpreting it to mean "separates fields". It occurred to me that perhaps it means "terminates a field". The problem with that definition, however, is that if I add any single character after the final '|' in the example above, string_extract_verbatim will extract a final field, which is not terminated by anything in IFS, but simply by the end of the string. In that case, the final IFS delimiter is separating the final two fields. The bottom line is that since the Bash manual does not appear to distinguish between the cases of leading and trailing null fields, it appears that an arbitrary design choice determines that leading null fields are kept, and trailing ones are not.

The Posix committee has debated this issue several times. In fact, there is a standards interpretation (from 1995!) declaring that "delimiter" must be used as "field terminator" (and the standard consistently uses "delimiter").
Ok. I have found an IEEE interpretation for 1003.2-1992 3.6.5 (interpretation #98) on the web, and I see that the behavior is correct. The one thing the interpretation didn't quite clarify is this question: "If IFS serves only to terminate fields, then how is it that, if I add any non-IFS character after the final field delimiter, a final field is created, which is 'delimited' not by anything in IFS, but by the end of the original (unsplit) word?"

The only satisfactory answer I could come up with is that the final field in that case is not being *created* by word splitting, but merely retained; i.e., the final field is all that is left of the original word as it existed prior to word splitting. All previous fields were created as a result of encountering an IFS delimiter. Thus, the additional fields are sliced off the front of the original word, and you are left either with nothing (if the final char in the original word was an IFS delimiter) or with some portion of the original word otherwise. Is this the correct way to look at it?
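To make the cases above concrete, here is a minimal sketch, assuming bash (or another POSIX shell) with IFS set to '|'. The helper name count_fields is hypothetical and used only for illustration; it is not taken from the Bash sources or from this thread. It simply counts the fields that an unquoted expansion produces.

```shell
#!/usr/bin/env bash
# count_fields is a hypothetical helper for illustration only.
# It runs in a subshell, so the IFS and noglob changes do not leak out.
count_fields() (
    IFS='|'      # a single non-whitespace IFS character
    set -f       # disable pathname expansion so only word splitting applies
    set -- $1    # unquoted on purpose: word splitting happens here
    echo "$#"    # number of resulting fields
)

count_fields 'a|b'      # 2: '|' terminates "a"; "b" is what remains of the word
count_fields '|a|b'     # 3: the leading '|' terminates an empty first field
count_fields 'a|b|'     # 2: the final '|' terminates "b"; nothing remains
count_fields 'a|b|x'    # 3: "x" remains after the last delimiter
```

Read this way, the leading and trailing cases are consistent: each '|' closes the field to its left, and the text left over after the last delimiter is kept only if it is non-empty.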
Thanks, Brett S.
The Posix rules are at http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_05

Bash follows them faithfully. The language isn't perfect, but there is practical consensus among shell implementations.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
Live Strong. No day but today.
Chet Ramey, ITS, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/