From: Bykov Alexey
Subject: Re: [Bug-wget] How to intercept wget to extract the raw requests and the raw responses?
Date: Thu, 15 Feb 2018 21:34:22 +0200
User-agent: Mozilla/5.0 (Windows NT 6.0; rv:49.0) Gecko/20100101 SeaMonkey/2.46

> wget --warc-file=httpbin -qO- https://httpbin.org/get
>
> How to convert the warc format to the actual header of requests and responses?

Greetings,

WARC is gzipped plain text. To write it uncompressed, add --no-warc-compression:

wget --warc-file=httpbin --no-warc-compression -qO response.raw -- https://httpbin.org/get

Extract headers with GNU sed:

sed -n -r -e "/WARC-Type: (request|response)/{s/.*: (.)/\n\L\1/;p;:a;N;s/\n$//;Ta;s/.*//;:b;N;s/\n$//;Tb;p;}" httpbin.warc > headers.txt

Extract headers with GNU AWK:

awk "{if(/WARC-Type: (response|request)/){print n;hp=1;np=0;}if(hp){if(np){if(!$1){np=0;hp=0;}else print}if(!np&&!$1)np=1;}}" httpbin.warc > headers.txt
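
The commands above are double-quoted, which presumably suits the Windows command prompt; in a POSIX shell the $1 inside double quotes would be expanded by the shell before awk sees it. Below is a rough single-quoted sketch of the same idea for a POSIX shell, not a drop-in equivalent. It assumes an uncompressed WARC file, strips the trailing CR that WARC and HTTP lines carry, and labels each header block with its record type. The function name warc_http_headers is made up for illustration.

```shell
# Sketch: print the HTTP header block of every request/response record
# found in an uncompressed WARC file (or on stdin).
# warc_http_headers is a hypothetical helper name, not part of wget.
warc_http_headers() {
  awk '
    { sub(/\r$/, "") }                       # drop the CR of each CRLF line ending

    /^WARC-Type: (request|response)$/ {      # a record we want to report on
      print "[" $2 "]"                       # label the block with the record type
      state = 1                              # 1 = still inside the WARC record headers
      next
    }

    state == 1 && $0 == "" {                 # blank line: WARC headers are over,
      state = 2                              # the HTTP headers start on the next line
      next
    }

    state == 2 && $0 == "" {                 # blank line: HTTP headers are over
      state = 0                              # skip the body until the next record
      print ""
      next
    }

    state == 2 { print }                     # an HTTP header (or start) line
  ' "$@"
}
```

Usage would be along the lines of: warc_http_headers httpbin.warc > headers.txt. Like the one-liners above, this keys on the WARC-Type line only, so a response body that happens to contain such a line would confuse it.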
Best regards.