Re: CSV parsing and other issues (Re: LC_NUMERIC)
From: Maxim Nikulin
Date: Sat, 12 Jun 2021 21:41:48 +0700
On 11/06/2021 04:10, Stefan Monnier wrote:
>> There are plenty of CSV dialects. If decimal separator is
>> "," then office software uses ";" instead of comma as cell
>> (field) separator.
>
> But there's no reason to presume that a given CSV file was
> generated in the same locale as the one we're currently
> using.
>
> So the locale could be one ingredient in the machinery used
> to guess which separator was used, but I'm not sure it would
> be of much help.
You are right. Still, I expect that ";" is mostly used in locales with a
comma as the decimal separator, and in such cases it must be tried with
higher priority, because records may contain plenty of both characters:
1,2;3,45;56,789
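To illustrate the priority argument, here is a minimal sketch in Python (not the code from the bug report, and not Emacs Lisp). The function name guess_delimiter and the flag prefer_semicolon are hypothetical. It assumes a delimiter is plausible when it occurs the same non-zero number of times in every non-empty line, and it ignores quoting entirely.

```python
def guess_delimiter(lines, prefer_semicolon=True):
    """Return the first candidate that splits every line into the same
    number of fields, or None if no candidate qualifies.

    prefer_semicolon is a hypothetical hint, e.g. derived from a locale
    that uses "," as the decimal separator.
    """
    order = [";", "\t", ","] if prefer_semicolon else [",", "\t", ";"]
    rows = [line for line in lines if line.strip()]
    for delim in order:
        # Set of per-line occurrence counts for this candidate.
        counts = {row.count(delim) for row in rows}
        # Plausible only if every line has the same non-zero count.
        if len(counts) == 1 and counts != {0}:
            return delim
    return None
```

For the record above both ";" and "," pass the consistency check, so only the order of the candidates decides the result. That is why the hint matters.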
The question was originally raised precisely in the context of an attempt
to improve separator guessing:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=47885
The patches have other problems, however. Advanced options for table
import are likely more suitable for e.g. csv-mode and may become an
unnecessary burden in org-mode (especially if kill-yank worked well in
both directions). Certainly, users should have the opportunity to
explicitly specify the dialect of the files they are going to import.
> [ BTW, I'll take the opportunity to advocate for the use of
> TSV instead, which is slightly less ill-defined. ]
In the real world one often does not have full control over the file
formats one has to deal with. In simple cases I can use space-separated,
fixed-width columns of numbers. On the other hand, downloaded bank
statements are precisely CSV with ";" as the delimiter and in a legacy
Windows 8-bit encoding (and such files have a kind of header whose
number of columns differs from that of the table that follows).
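A rough Python sketch of reading such a statement follows. The function name read_statement is hypothetical, and the caller has to supply the encoding, since which legacy code page a given bank uses is not something the file tells you. The preamble is skipped on the assumption that the real table is the largest group of rows sharing one column count.

```python
import csv
import io
from collections import Counter


def read_statement(data, encoding, delimiter=";"):
    """Decode a statement and return only the rows of the main table.

    data is the raw file content as bytes; encoding must be given
    explicitly (e.g. "cp1252" or "cp1251", depending on the source).
    """
    text = data.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = [row for row in reader if row]  # drop empty lines
    if not rows:
        return []
    # Assumption: the most frequent column count belongs to the table.
    width = Counter(len(row) for row in rows).most_common(1)[0][0]
    start = next(i for i, row in enumerate(rows) if len(row) == width)
    return [row for row in rows[start:] if len(row) == width]
```

The amounts come back as strings such as "-12,50". Converting them to numbers is a separate step that depends on the decimal separator.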
So the ability to get the decimal separator for the current locale may
slightly improve the user experience of importing CSV files, at least in
Org mode. However, it is just one aspect of supporting locale-aware
number formats in Emacs.
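For comparison, this is roughly how the decimal separator can be queried in Python, which exposes the C library's localeconv() data. It is only an illustration of the idea and says nothing about how Emacs would do it. The helper names are hypothetical, and parse_number deliberately ignores thousands separators.

```python
import locale


def current_decimal_separator():
    """Return the decimal separator of the user's LC_NUMERIC locale.

    Note: setlocale() changes process-wide state. Without this call
    Python stays in the "C" locale and reports ".".
    """
    try:
        locale.setlocale(locale.LC_NUMERIC, "")
    except locale.Error:
        pass  # unsupported locale settings: keep the current locale
    return locale.localeconv()["decimal_point"]


def parse_number(text, decimal_sep):
    """Parse a plain number written with the given decimal separator."""
    return float(text.strip().replace(decimal_sep, "."))
```

A caller could pass current_decimal_separator() as the default for parse_number while still letting the user override it for files produced in another locale.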