
Re: Insertion of extra OFS character into output string


From: H
Subject: Re: Insertion of extra OFS character into output string
Date: Tue, 14 Mar 2023 15:06:26 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 03/14/2023 02:58 AM, Neil R. Ormos wrote:
> H wrote:
>> "Neil R. Ormos" wrote:
>>> H wrote:
>>>> I am a newcomer to awk and have run into an
>>>> issue I have not figured out yet... My
>>>> platform is CentOS 7 running awk 4.0.2, the
>>>> default version. [...]
>>> I don't have 4.0.2 available to test, but I
>>> tested with older and newer versions.
>>> When I test, I get the result I think I expect
>>> from the code you posted. [...]
>>> It would be easier to help if you would please provide:
>>> the simplest input line that reproduces the problem;
>>> the output you expect; and
>>> the output you are getting.
>> I am not on my computer but typing this on my
>> phone. With that caveat, a /minimal/ example
>> would be:
>> echo "Alpha,Beta,Charlie,Delta" | awk 'BEGIN{FS=","; 
>> FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}'
>> I would expect to see:
>> Alpha<TAB>Beta<TAB>Charlie<TAB>Delta
>> but instead see
>> Alpha<TAB><TAB>Beta<TAB>Charlie<TAB>Delta
>> If you change $1=$1 to $2=$2 you will find that the extra tab character then 
>> moves to the next field.
>> Can anyone try this with the most recent version of awk?
> I tested with four versions of Gawk:
>   GNU Awk 3.1.7
>   GNU Awk 4.1.1
>   GNU Awk 4.1.4
>   GNU Awk 5.2.0
>
> and among those versions was able to reproduce the behavior that is vexing 
> you only in version 4.1.1.  
>
> It appears that issue was fixed no later than version 4.1.4.  
>
> Version 5.2.0 is fairly recent but not the latest, and, in any case, does not 
> exhibit the problem you have experienced.
>
>> I believe I had also tried without the
>> definition of FS with the same result.  Finally,
>> note that the FPAT expression comes from the awk
>> documentation and is thus expected to work.
> I wasn't saying that setting FS was causing the problem.  Just that setting 
> FS would be overridden by the subsequent setting of FPAT.
>
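
The point about FS and FPAT can be checked directly. This is only a minimal sketch, assuming a gawk recent enough to have FPAT and the PROCINFO["FS"] entry (both are gawk extensions); the shell variable names mode_fpat and mode_fs are made up for illustration. Whichever of the two variables is assigned last decides how records are split:

```shell
# Sketch: ask gawk which field-splitting mode is in effect after both
# FS and FPAT have been assigned. PROCINFO["FS"] is a gawk extension;
# mode_fpat and mode_fs are illustrative names, not from the thread.
mode_fpat=$(echo "Alpha,Beta" |
    gawk 'BEGIN { FS = ","; FPAT = "([^,]*)|(\"[^\"]+\")" } { print PROCINFO["FS"] }')
mode_fs=$(echo "Alpha,Beta" |
    gawk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")"; FS = "," } { print PROCINFO["FS"] }')

echo "FS assigned first, FPAT last: $mode_fpat"   # FPAT splitting is active
echo "FPAT assigned first, FS last: $mode_fs"     # FS splitting is active
```

So in the posted command the FS="," assignment is harmless but has no effect, because FPAT is set after it.
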
> ========================================
>
> gawk --version | head -1
> GNU Awk 3.1.7
>
> echo "Alpha,Beta,Charlie,Delta" | gawk 'BEGIN{FS=","; FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' | hexdump $hexdumparg:q
>      0      0  | 41 6c 70 68 61 09 42 65 | 065 108 112 104 097 009 066 101 |   A   l   p   h   a  \t   B   e
>      8      8  | 74 61 09 43 68 61 72 6c | 116 097 009 067 104 097 114 108 |   t   a  \t   C   h   a   r   l
>     10     16  | 69 65 09 44 65 6c 74 61 | 105 101 009 068 101 108 116 097 |   i   e  \t   D   e   l   t   a
>     18     24  | 0a                      | 010                             |  \n
>
> ========================================
>
> gawk --version | head -1
> GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU MP 6.0.0)
>
> echo "Alpha,Beta,Charlie,Delta" | gawk 'BEGIN{FS=","; FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' | hexdump $hexdumparg:q
>      0      0  | 41 6c 70 68 61 09 09 42 | 065 108 112 104 097 009 009 066 |   A   l   p   h   a  \t  \t   B
>      8      8  | 65 74 61 09 43 68 61 72 | 101 116 097 009 067 104 097 114 |   e   t   a  \t   C   h   a   r
>     10     16  | 6c 69 65 09 44 65 6c 74 | 108 105 101 009 068 101 108 116 |   l   i   e  \t   D   e   l   t
>     18     24  | 61 0a                   | 097 010                         |   a  \n
>
> ========================================
>
> gawk --version | head -1
> GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)
>
> echo "Alpha,Beta,Charlie,Delta" | gawk 'BEGIN{FS=","; FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' | hexdump $hexdumparg:q
>      0      0  | 41 6c 70 68 61 09 42 65 | 065 108 112 104 097 009 066 101 |   A   l   p   h   a  \t   B   e
>      8      8  | 74 61 09 43 68 61 72 6c | 116 097 009 067 104 097 114 108 |   t   a  \t   C   h   a   r   l
>     10     16  | 69 65 09 44 65 6c 74 61 | 105 101 009 068 101 108 116 097 |   i   e  \t   D   e   l   t   a
>     18     24  | 0a                      | 010                             |  \n
>
> ========================================
>
> gawk --version | head -1
> GNU Awk 5.2.0, API 3.2, PMA Avon 7, (GNU MPFR 3.1.5, GNU MP 6.1.2)
>
> echo "Alpha,Beta,Charlie,Delta" | gawk 'BEGIN{FS=","; FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' | hexdump $hexdumparg:q
>      0      0  | 41 6c 70 68 61 09 42 65 | 065 108 112 104 097 009 066 101 |   A   l   p   h   a  \t   B   e
>      8      8  | 74 61 09 43 68 61 72 6c | 116 097 009 067 104 097 114 108 |   t   a  \t   C   h   a   r   l
>     10     16  | 69 65 09 44 65 6c 74 61 | 105 101 009 068 101 108 116 097 |   i   e  \t   D   e   l   t   a
>     18     24  | 0a                      | 010                             |  \n
>
> ========================================
>
OK, thank you for looking into this.
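
For anyone who wants to check their own installation without the custom hexdump format used above, the same test can be reduced to counting tab characters. This is a rough sketch, assuming gawk is on the PATH and a POSIX shell with tr and wc; the variable names out and tabs are illustrative only:

```shell
#!/bin/sh
# Sketch: run the test case from this thread and count the OFS (tab)
# characters in the output. Assumes gawk is installed; "out" and "tabs"
# are illustrative names.
out=$(echo "Alpha,Beta,Charlie,Delta" |
    gawk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")"; OFS = "\t" }
          { $1 = $1; gsub(/"/, ""); print }')

# Delete everything except tabs, then count the remaining bytes.
# The arithmetic expansion strips any padding that wc adds.
tabs=$(( $(printf '%s' "$out" | tr -cd '\t' | wc -c) ))

# Four fields rebuilt with OFS should be joined by exactly three tabs.
if [ "$tabs" -eq 3 ]; then
    echo "ok: 3 tabs between 4 fields"
else
    echo "unexpected separator count: $tabs tabs"
fi
```

Going by the results quoted above, 3.1.7, 4.1.4 and 5.2.0 should report three tabs and 4.1.1 should report four.
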



