Re: Parse CVS in awk
From: Peter Brooks
Subject: Re: Parse CVS in awk
Date: Fri, 10 Apr 2020 05:52:47 +0100
You might find this a useful tool:
https://colin.maudry.com/csvtool-manual-page/
Sent from my iPad
> On 9 Apr 2020, at 18:53, Manuel Collado <address@hidden> wrote:
>
> On 09/04/2020 at 17:00, Manuel Collado wrote:
>>> On 09/04/2020 at 4:51, Peng Yu wrote:
>>> I'm wondering if the solution mentioned here is robust against all CSV
>>> format variations.
>>>
>>> https://www.gnu.org/software/gawk/manual/gawk.html#Splitting-By-Content
>
> This manual says:
>
> <quote>
> NOTE: Some programs export CSV data that contains embedded newlines between
> the double quotes. gawk provides no way to deal with this. Even though a
> formal specification for CSV data exists, there isn’t much more to be done;
> the FPAT mechanism provides an elegant solution for the majority of cases,
> and the gawk developers are satisfied with that.
> <endquote>
>
> Well, there is a trick that can handle fields with embedded newlines. The
> idea is to keep joining lines until the number of quotes is even, and to
> adjust NR and FNR so that they still count logical records:
>
> # Process CSV input records with embedded newlines
> {
>     # Collect multi-line data, if it is the case
>     CSVRECORD = $0
>     while (gsub("\"", "\"", CSVRECORD) % 2 == 1 && (_csv_multi = getline _csv_) > 0) {
>         CSVRECORD = CSVRECORD "\n" _csv_
>         NR--
>         FNR--
>     }
>     if (_csv_multi) {
>         $0 = CSVRECORD
>     }
> }
>
> HTH.
> --
> Manuel Collado - http://mcollado.z15.es
>
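
For anyone who wants to try the line-joining trick quoted above, here is a minimal sketch that can be pasted into a shell. The helper name join_csv_records and the sample input are made up for illustration and are not part of awk or gawk. It assumes every double quote in the data opens or closes a quoted field (an escaped pair "" counts as two quotes, so the parity check should still hold).

```sh
# join_csv_records is a hypothetical helper name used only for this sketch.
join_csv_records() {
    awk '
    {
        rec = $0
        lines = 1
        # gsub() returns how many quotes it replaced; an odd count means a
        # quoted field is still open, so append the next physical line.
        while (gsub(/"/, "\"", rec) % 2 == 1 && (getline more) > 0) {
            rec = rec "\n" more
            lines++
            NR--; FNR--      # keep NR and FNR counting logical records
        }
        $0 = rec
        printf "record %d spans %d physical line(s)\n", NR, lines
    }'
}

# Hypothetical sample input: record 2 has a quoted field containing a newline.
printf '%s\n' 'id,name,note' '1,"Smith, Ann","line one' 'line two"' '2,Bob,ok' |
    join_csv_records
```

This only reports how many physical lines each logical record was built from. In gawk, setting FPAT in a BEGIN rule as described in the manual section linked above should then split the joined record into fields, but that part is not exercised here.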