From: Walter Anema
Subject: [Zutils-bug] zgrep performance long line
Date: Wed, 15 Aug 2018 15:47:17 +0000

Hi Antonio,

You made a nice package of z utilities. I am using it in a Docker container (Alpine) to analyse JSON logs.

I have a performance problem with one particular file. It contains logging in JSON format on a single line, without any \n:

    (zcat /logs/s3/2018/04/11/08/prod-kinesis-firehose-stream-1-2018-04-11-08-05-23-bcdf3841-52b5-47eb-bf85-c36dfa2d0d55;echo ) | wc
    1 2145643 37786248

Somehow zgrep takes a long time:

    # /usr/bin/zgrep -V
    zgrep (zutils) 1.7
    Copyright (C) 2018 Antonio Diaz Diaz.
    License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

    97 97 776
    real 0m19.320s
    user 0m19.317s
    sys 0m0.078s

When I use GNU zgrep, it is more than 20 times faster:

    # zgrep -H
    zgrep (gzip) 1.5
    Copyright (C) 2010-2012 Free Software Foundation, Inc.
    This is free software. You may redistribute copies of it under the terms of
    the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
    There is NO WARRANTY, to the extent permitted by law.
    Written by Jean-loup Gailly.

    # time (/usr/bin/zgrep -o connect largefile_with_one_json_line | wc)
    97 97 776
    real 0m0.830s
    user 0m0.964s
    sys 0m0.044s

Can you explain the difference?

Best regards,
Walter Anema
Technical Application Management
be smart. get connected.
Portbase
Blaak 16 • 3011 TA Rotterdam • The Netherlands
+31 (0)88 625 25 37 • +31 (0)6 54 32 76 70
portbase.com

The Portbase e-mail disclaimer applies to this message.
Please consider the environment before printing this e-mail.