Re: [Discuss-gnuradio] compressing I/Q files


From: Andrej Rode
Subject: Re: [Discuss-gnuradio] compressing I/Q files
Date: Thu, 14 Mar 2019 12:48:44 +0100

This can partly be explained by the fact that complex I/Q data captured
as 2x 32-bit floats only uses the [-1, 1] range of float. So instead of
using 32 bits, the data is already only using about 32 - 8 bits (since
the exponent carries hardly any information).

If you then captured "real" data instead of noise, the content is not
white, so it contains redundancy that can be compressed.
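
A minimal sketch of the exponent argument, assuming (this is not stated
above) that the floats are 16-bit ADC codes scaled by 1/32768. It counts
how many of the 256 possible float32 exponent values occur at all; the
helper name float32_exponent is made up for this illustration.

```python
import struct

def float32_exponent(x: float) -> int:
    """Return the 8-bit biased exponent field of x stored as IEEE 754 float32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return (bits >> 23) & 0xFF

# Assumption: every sample is a 16-bit code divided by 32768, so |x| <= 1.
exponents = {float32_exponent(code / 32768.0) for code in range(-32768, 32768)}
print(f"{len(exponents)} of 256 possible exponent values occur")
```

Only a small subset of exponent values shows up under this assumption,
which is one reason a generic compressor finds redundancy in float files.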

Thanks for posting your findings!

Cheers
Andrej


On Thu, 14 Mar 2019 10:49:45 +0100
Kristoff <address@hidden> wrote:

> Marcus, all,
> 
> 
> Thx.
> 
> In the meantime, I did a little bit of testing.
> 
> I took a 256 MB piece of an I/Q file (a pass of NOAA-19), sampled at
> 240 ksps. Gzip compressed this down to 40 MB. 7zip managed to get this
> down to 29 MB (but compressing took 10 to 20 times longer).
> 
> Now, after converting this file from float to short, you get a 128 MB
> file. However, if you then compress that, the gain isn't that big
> anymore: gzip 33 MB, 7zip 25 MB.
> 
> 
> My guess is that gzip and 7zip compress by looking for repetitive
> patterns. This means that converting 32-bit floats to 16-bit shorts
> does not really help if you plan to compress the files afterwards
> anyway.
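
A rough way to repeat this comparison on synthetic data is sketched here.
It is not the NOAA-19 recording: it assumes 16-bit codes carrying Gaussian
noise (sigma of 2000 codes, an arbitrary choice), stored once as float32
and once as int16. zlib implements DEFLATE, the algorithm gzip uses, and
the lzma module implements the LZMA family that 7zip uses by default.

```python
import lzma
import random
import struct
import zlib

random.seed(0)        # fixed seed so the run is repeatable
n_values = 40_000     # interleaved I and Q values (hypothetical capture length)

# Assumption: 16-bit ADC codes carrying Gaussian noise, sigma = 2000 codes.
codes = [max(-32768, min(32767, round(random.gauss(0.0, 2000.0))))
         for _ in range(n_values)]

raw_i16 = struct.pack(f"<{n_values}h", *codes)
raw_f32 = struct.pack(f"<{n_values}f", *(c / 32768.0 for c in codes))

def report(name, raw):
    """Compress raw bytes with DEFLATE and LZMA and print the sizes."""
    gz = len(zlib.compress(raw, 9))
    xz = len(lzma.compress(raw))
    print(f"{name}: raw {len(raw)} B, DEFLATE {gz} B, LZMA {xz} B")
    return gz, xz

gz_f32, xz_f32 = report("float32", raw_f32)
gz_i16, xz_i16 = report("int16", raw_i16)
```

The float32 file is twice as large before compression, but it also
shrinks by a larger factor, so the two compressed sizes end up much
closer together than the raw sizes. Real recordings will give different
numbers.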
> 
> 
> 
> Kristoff
> 
> 
> On 10/03/19 18:33, Marcus Müller wrote:
> > Hi Kristoff, Benny and Alban,
> >
> > TL;DR:
> > Benny is spot on. Other than that, decimate your signal if you
> > know the bandwidth is less than your sampling rate, and don't pin
> > too much hope on audio encoders.
> >
> > Long Version:
> >
> > Point is: the signal coming from your SDR device, whatever that
> > might be, has finite resolution – typically, no more than 16 bits
> > per channel. Hence, the conversion from float to short (or directly
> > getting short, if your device driver allows that) is lossless. For
> > example, USRPs' driver (UHD), and the GNU Radio USRP source, can be
> > configured to hand out the signed complex 16 bit conversion of the
> > data from the network or USB interface instead of the float32
> > conversion.
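
The lossless round trip can be checked exhaustively. This is a sketch
under two assumptions: the floats come straight from 16-bit codes (no
filtering, resampling or gain applied in between), and the scale factor
is 32768, which in practice depends on the driver. The helper names are
made up for the example.

```python
import struct

SCALE = 32768.0  # assumed full-scale factor; the real value depends on the driver

def to_float32(code: int) -> float:
    """Scale a 16-bit code to [-1, 1) and round-trip it through float32 storage."""
    return struct.unpack("<f", struct.pack("<f", code / SCALE))[0]

def to_short(sample: float) -> int:
    """Convert a float sample back to the nearest 16-bit code."""
    return int(round(sample * SCALE))

all_codes = range(-32768, 32768)
lossless = all(to_short(to_float32(c)) == c for c in all_codes)
print("round trip lossless for every 16-bit code:", lossless)
```

Once the float stream has been processed (for example filtered or
decimated), the samples no longer sit on the 16-bit grid and the
conversion rounds them.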
> >
> > Any other compression method can only do so much:
> > Your signal recording is essentially random – meaning that all
> > values should be roughly equally likely. Maybe extreme high
> > amplitudes are a little rarer, since you'd typically avoid those to
> > stay clear of clipping.
> > That means that the average info per sample is relatively high: From
> > seeing other samples, we know very little about it, so the surprise
> > we get from its actual value is pretty high.
> > Information-theoretically, the expected information content per
> > sample is the entropy of a source. Information and entropy are both
> > measured in bits: a completely fair random decision between 0 and
> > 1 ("flipping a coin") is worth 1 bit, and picking one out of 2¹⁶
> > values perfectly randomly is worth 16 bits.
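
The two examples can be reproduced with a small entropy estimator. This
sketch treats the symbols as independent and uses their observed
relative frequencies as probabilities; entropy_bits is a name chosen
for the illustration.

```python
import math
from collections import Counter

def entropy_bits(symbols) -> float:
    """Empirical Shannon entropy in bits per symbol, assuming independent symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

coin = entropy_bits([0, 1])              # both outcomes equally likely
uniform16 = entropy_bits(range(2**16))   # every 16-bit value exactly once
constant = entropy_bits([7, 7, 7, 7])    # no surprise at all

print(coin, uniform16, constant)
```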
> >
> > (Lossless) compression can, in the best case, achieve a compression
> > where the number of bits used per sample is equal to the entropy of
> > the source. Now, if your signal is somewhat noisy, and other than
> > that relatively interesting (i.e. you're not observing a constant
> > value), your source entropy often approaches the limit given by the
> > ADC. In my tests, even on severely backed-off signals, standard
> > Huffman- and Lempel-Ziv-based compressors (zip, gzip, 7z, zstd,
> > bz2, xz) achieved negligible compression ratios on radio recordings.
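
To see the entropy bound at work, the sketch below compares what DEFLATE
needs per sample with an entropy estimate for the same data. The input is
synthetic: a backed-off signal is modelled, as an assumption, by Gaussian
noise with a sigma of 50 codes stored as int16. It is not one of the
recordings mentioned above.

```python
import math
import random
import struct
import zlib
from collections import Counter

random.seed(2)   # fixed seed so the run is repeatable
n = 40_000

# Assumption: backed-off signal modelled as Gaussian noise, sigma = 50 codes.
codes = [round(random.gauss(0.0, 50.0)) for _ in range(n)]

# Entropy estimate from relative frequencies, treating samples as independent.
counts = Counter(codes)
entropy = -sum(c / n * math.log2(c / n) for c in counts.values())

raw = struct.pack(f"<{n}h", *codes)
deflate_bits = 8 * len(zlib.compress(raw, 9)) / n

print(f"entropy estimate: {entropy:.2f} bit/sample")
print(f"DEFLATE output:   {deflate_bits:.2f} bit/sample (raw: 16)")
```

In this synthetic case the upper byte of each sample is nearly constant,
so DEFLATE saves some bits, but it stays above the entropy estimate. How
close a real recording gets to the 16-bit limit depends on signal level
and content.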
> >
> > I've tried FLAC, too. FLAC doesn't allow setting the sampling rate
> > as high as typical SDR hardware actually uses (i.e. the header
> > field for the sampling rate simply isn't wide enough to hold 10⁷,
> > for example). But that's mainly a metadata problem that can be
> > solved by ignoring it.
> > However, FLAC relies on
> > a) signals having a "small" deviation from a linear function for
> > short time periods (for its linear prediction coding), and
> > b) the prediction residual following a geometric distribution (for
> > the residual coding that follows),
> >
> > and that's usually not given, because
> > a) if you already know you will be in need of compression, you're
> > probably not significantly oversampling your signal, but are already
> > decimating it to a rate barely more than sufficient. Everything else
> > would be a larger waste of space – and has no benefits for signal
> > analysis later, and
> > b) with the prior assumption broken, only a zero-order linear
> > precoder doesn't make things worse – i.e., simply handing through
> > the input samples to the residual coder. That residual coder, as
> > said, depends on the distribution of amplitudes to follow a
> > specific statistic to work well.  Sadly, that statistic doesn't
> > apply to I&Q signals, typically.
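
Why prediction only pays off for oversampled signals can be illustrated
with the simplest fixed predictor, "next sample equals previous sample",
which is one of the fixed predictor orders FLAC offers. The sketch
compares a heavily oversampled tone with white noise; both test signals
and the helper names are assumptions made for this example, and it does
not reproduce FLAC's actual encoder.

```python
import math
import random

def first_difference(x):
    """Residual of the order-1 fixed predictor: predict each sample as the previous one."""
    return [b - a for a, b in zip(x, x[1:])]

def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)

random.seed(1)   # fixed seed so the run is repeatable
n = 10_000
oversampled_tone = [math.sin(2 * math.pi * 0.001 * k) for k in range(n)]  # 1000 samples per cycle
white_noise = [random.gauss(0.0, 1.0) for _ in range(n)]

for name, sig in (("oversampled tone", oversampled_tone), ("white noise", white_noise)):
    gain = power(sig) / power(first_difference(sig))
    print(f"{name}: prediction gain {10 * math.log10(gain):.1f} dB")
```

For the tone the residual is far smaller than the signal, so it is cheap
to code. For white noise the residual is larger than the input, which is
why passing the samples through unchanged is the best such a predictor
can do.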
> >
> > My experience is that FLAC doesn't work well for anything that's not
> > massively oversampled AM audio – which is no surprise, because that
> > literally isn't very different from audio, which is what FLAC was
> > designed for.
> >
> > However, my FLAC experiments lie years in the past – maybe the
> > encoder got more versatile; Alban, do you have deviating experience?
> >
> > Best regards,
> > Marcus
> > On Sun, 2019-03-10 at 11:54 +0000, Benny Alexandar wrote:  
> >> Yes, converting 32-bit float to 16-bit short is an option;
> >> compressing using 7zip or gzip won't give good compression.
> >> From: Discuss-gnuradio <address@hidden> on behalf of
> >> Kristoff <address@hidden>
> >> Sent: Sunday, March 10, 2019 3:57 PM
> >> To: address@hidden
> >> Subject: [Discuss-gnuradio] compressing I/Q files
> >>   
> >> Hi all,
> >>
> >>
> >>
> >> Simple and short question:
> >> What is the best way to compress a raw I/Q file? A generic
> >> compression-tool like gzip, zip? Or are there better and
> >> specialised tools?
> >>
> >>
> >> Is converting the data in the I/Q file from float to short an
> >> option?
> >>
> >>
> >> Kristoff
> >>
> >>
> >> _______________________________________________
> >> Discuss-gnuradio mailing list
> >> address@hidden
> >> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
