Re: sort does not use tab as delimiter

From: DI Oliver Maurhart
Subject: Re: sort does not use tab as delimiter
Date: Mon, 11 Dec 2006 09:03:07 +0100
User-agent: Thunderbird (X11/20061127)

Hi *,

>> Yes, it does, but this "feature" is pain ass to enter on a bash-shell:
> Absolutely.  And this bothers me in many respects, not just
> with sort -t.  Perhaps you should be directing your bug
> reports to the Bash maintainer.  After all, in this
> particular context (and in the other contexts I'm thinking
> of) it should be perfectly obvious to Bash that the user
> just wanted to type a tab.

Well, yes, maybe. To insert a tab on a console based on simultaneously
pressing a combination of several keys on the keyboard ... gee, <tab>:
bash-completion, <alt>-<tab>: whhoppp there it goes I'm in my second app
on the desktop, <ctrl>-<tab>: weeee, I'm in yet another desktop ...

Maybe a <ctrl>-<shift>-<alt>-<esc>-<tab> will do ... (hoping no window
manager has already reserved that combination for some fancy thing) but
it sure is a hard task to convince people that this keyboard-typing
kung fu is a *good* solution.

No. Actually the $'\t' works! It's pretty cool. I've been into Linux for
several years now, but every day you still get surprised by such small
goodies. Didn't know about that one.

On the other hand, I wonder whether this is also available in other
shells like csh, the Korn shell or some XYZ shell.

So I still believe that sort should be capable of handling tabs on its
own and should not depend on a feature of the user's shell to pass this
literal on correctly.

> If this were fixed with Bash, that would improve the
> situation for many programs, not just 'sort'.
> It's mainly because nobody has had the time to implement it
> cleanly yet.  This includes the patch you proposed, which
> supports only \t and which does not update the
> documentation.

Yes, right. What I provided is really nothing but a "quick fix" or
"hack". Nothing more. And it solely addresses the tabs in the sort

I think the bare minimum of escape sequences for whitespace chars can be
found in the ISO C99 standard (page 19, section 5.2.2, paragraph 2 of
the 5th May 2005 revision), which lists good candidates for a commonly
agreed-upon solution to this problem: \a, \b, \f, \n, \r, \t and \v.
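
To make that a bit more concrete, here is a rough sketch in Python (not
a patch against sort) of how a delimiter argument could be decoded using
just the C99 5.2.2 escapes. The names decode_delimiter and sort_by_field
are made up for this illustration, and unlike the real sort -k it
compares only the one selected field, not everything up to end of line.

```python
# Alphabetic escapes from ISO C99 5.2.2, paragraph 2.
C99_ESCAPES = {
    "a": "\a", "b": "\b", "f": "\f", "n": "\n",
    "r": "\r", "t": "\t", "v": "\v",
}


def decode_delimiter(arg):
    """Turn a two-character escape such as backslash-t into the real
    character; anything else is passed through unchanged.
    (Hypothetical helper, not part of any existing tool.)"""
    if len(arg) == 2 and arg[0] == "\\" and arg[1] in C99_ESCAPES:
        return C99_ESCAPES[arg[1]]
    return arg


def sort_by_field(lines, delimiter_arg, field):
    """Sort lines on the 1-based field, splitting on the decoded delimiter.
    Lines with too few fields sort as if that field were empty."""
    sep = decode_delimiter(delimiter_arg)

    def key(line):
        parts = line.split(sep)
        return parts[field - 1] if field <= len(parts) else ""

    return sorted(lines, key=key)


if __name__ == "__main__":
    rows = ["x\tpear", "y\tapple"]
    # The delimiter arrives as the two characters backslash and t,
    # exactly as a shell without $'...' quoting would pass it.
    print(sort_by_field(rows, "\\t", 2))
```

The point is only that the decoding step is tiny; the real work would be
in the documentation and in agreeing on which escapes to accept.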

And then again: when you start thinking about what could be done to
improve all programs, you quickly end up at the wide-char stuff. Right
now sort has a 1-char delimiter. However, when dealing with Unicode docs,
users perceive certain literals as single characters although they are
stored as multiple octets in the text stream.

Now by introducing wchar_t in sort you've successfully gained a free
ticket for a lot of unpaid work in the next few years ... :-)

Then yet again: sort *would* be capable of handling Unicode docs, if one
manages to transform the text into UTF-8 and passes it on to a
multi-char-delimiter-enabled sort. For UTF-8 the maximum number of
octets per character is 4 (as I remember).
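
A quick way to check the octet counts, sketched in Python (the helper
name utf8_octets is just something I made up for this mail):

```python
def utf8_octets(ch):
    """Number of octets a single code point occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Tab, a-umlaut, euro sign, an emoji, and the highest code point U+10FFFF.
samples = ["\t", "\u00e4", "\u20ac", "\U0001F600", chr(0x10FFFF)]

if __name__ == "__main__":
    for ch in samples:
        print("U+%04X needs %d octet(s)" % (ord(ch), utf8_octets(ch)))
```

So a tab stays a single octet, while a delimiter taken from further up
in Unicode needs two, three or four octets in the byte stream.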

But then yet again: why not use such a new multi-char feature directly
with non-Unicode texts as well? And then: what's a good upper bound for
the size of such multi-char delimiters? 10? 16? 1K? 32K?
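
Just to illustrate the idea (again a Python sketch under my own
assumptions, nothing that sort does today): if the delimiter is treated
as an arbitrary octet string, the same code serves a UTF-8 encoded
character and a plain multi-char ASCII delimiter alike. The function
name sort_by_field_bytes is hypothetical, and no upper bound on the
delimiter length is enforced here.

```python
def sort_by_field_bytes(lines, delimiter, field):
    """Sort byte strings on the 1-based field, splitting on a delimiter
    of any non-zero length. Comparison is plain octet order."""
    if not delimiter:
        raise ValueError("delimiter must not be empty")

    def key(line):
        parts = line.split(delimiter)
        return parts[field - 1] if field <= len(parts) else b""

    return sorted(lines, key=key)


if __name__ == "__main__":
    arrow = "\u2192".encode("utf-8")  # one visible character, three octets
    rows = [b"x" + arrow + b"pear", b"y" + arrow + b"apple"]
    print(sort_by_field_bytes(rows, arrow, 2))
    # The same function with an ordinary two-octet ASCII delimiter.
    print(sort_by_field_bytes([b"x::pear", b"y::apple"], b"::", 2))
```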

Phew ... feeling dizzy now ... I see: many ideas, no time.

But it's very interesting, I think I should start to get involved ...



/**
 * This is my signature
 * @return      all my account data (if you need it)
 */
CAddress CEMailSignature::getMyAddress() const {

   QString sTitle      = "Dipl.Ing.";
   QString sName       = "Oliver Maurhart";
   QString sProfession = "Software Development";

   QString sStreet     = "Duernfeld 3";
   QString sCity       = "9064 Pischeldorf";
   QString sState      = "AUSTRIA";

   // Telefon as hex: 0x3EAB83D3B45
   // But as dec seems more convenient, doesn't it?
   QString sTelefon    = "+43 / (0)664 / 825 12 05";

   // should return: "address@hidden"
   QString sEMail      = getCurrentMail().getFrom();

   // Chat?
   try {
   catch (IM::IamOfflineException& cException) {
      fprintf(stdout, "Sorry, but %s\n", cException.getReason());

   // normally I use a single line here ...
   return CAddress(sName,
