coreutils

Re: RFC: cksum --base64/-b support


From: Pádraig Brady
Subject: Re: RFC: cksum --base64/-b support
Date: Mon, 30 Jan 2023 18:17:33 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Thunderbird/109.0

On 29/01/2023 20:40, Jim Meyering wrote:
> Hi Pádraig! Happy new year (belatedly ;-). Hope you're well.
>
> I'd like our generated announcements to be able to include
> base64-encoded checksums without having to recommend verifying them
> using openbsd's cksum, so...

This is so the checksums are shorter, right?
I.e. 4/3x the digest size rather than the 2x for hex.
Playing devil's advocate, is the complexity of
generally using base64 for this worth it, since say a 512 bit checksum
would only reduce from 128 chars to 86 chars (88 with '=' padding)?
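
The size trade-off can be checked with plain shell arithmetic. This is a
minimal sketch; the function names are made up for illustration and are not
part of any tool:

```shell
# Encoded length, in characters, of an N-bit digest (N a multiple of 8).
hex_len()       { echo $(( $1 / 4 )); }                    # 2 chars per byte
b64_len()       { echo $(( ( ($1 / 8) + 2 ) / 3 * 4 )); }  # with '=' padding
b64_len_nopad() { echo $(( ($1 + 5) / 6 )); }              # padding stripped

for bits in 256 512; do
  printf '%s bits: hex=%s base64=%s (unpadded %s)\n' \
    "$bits" "$(hex_len "$bits")" "$(b64_len "$bits")" "$(b64_len_nopad "$bits")"
done
# 256 bits: hex=64 base64=44 (unpadded 43)
# 512 bits: hex=128 base64=88 (unpadded 86)
```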

Also related to this is the use of variable length algorithms
(supported with the existing -l option).
A quick scan of https://www.keylength.com/ suggests newer algorithms
like blake2, blake3, or sha3 with -l 256 may be sufficient for this use case,
in which case the difference would only be 64 chars to 44 chars.
The following demonstrates that, while also using existing tools:

  $ sha256sum < /dev/null | xxd -r -p | basenc --base16
  E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855
  $ sha256sum < /dev/null | xxd -r -p | basenc --base64
  47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
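
The reverse direction, verifying a published base64 checksum with existing
tools, might look like the sketch below. It assumes GNU base64 (for -d), od
and tr are available; b64_to_hex is a hypothetical helper name, not an
existing command:

```shell
# Convert a base64-encoded digest (on stdin) to lower-case hex.
b64_to_hex() { base64 -d | od -An -tx1 | tr -d ' \n'; }

# Base64 SHA-256 of empty input, taken from the demo above.
published='47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU='
# sha256sum prints "<hex>  <name>"; keep only the hex field.
actual=$(sha256sum < /dev/null | cut -d' ' -f1)

if [ "$(printf '%s\n' "$published" | b64_to_hex)" = "$actual" ]; then
  echo OK
else
  echo MISMATCH
fi
```

This is the kind of pipeline a recipient would need without built-in base64
support in cksum, which is part of the argument for adding the option.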

> I've begun writing the code to add --base64/-b support for GNU cksum,
> prompted by this thread:
> https://lists.gnu.org/r/diffutils-devel/2023-01/msg00015.html and the
> fact that OpenBSD already has this option (as -b):
> https://man.openbsd.org/cksum

We originally considered this back at:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=7313
which showed ways to achieve this with existing tools.
Though if this is a common operation (as it would be for your use case),
and given that OpenBSD already implements it,
then it would be worth adding an option.

BTW I noted the following possible option when I recently refactored cksum:

  --digest_format={int, hex, base64, binary}
      /* cksum output formats:
         int (sum, and cksum default),
         hex (rest default),
         b64 (to reduce size),
         bin (would auto suppress names? restrict to single argument?) */

> Two questions:
> - blake2b's tag is inconsistently capitalized. All of the other tags
> are all-caps versions of their lower-case strings, but this one is
> spelled BLAKE2b, with a lower-case "b" at the end. I presume this is
> desired. Likely too late to change to make it consistent. Arose while
> considering how to implement support for the "x" and "b"
> option suffixes, to denote "use hex" or "use base64" as encoding,
> while the usual default is of course hex, and --base64 changes that.

BLAKE2b came from the original blake2 reference implementation.
I.e. that was the preferred naming, and couldn't be changed now
due to backwards compat.  It's not so bad to work around,
but yes would lead to messier code.

> Leading to my second question:
> - I'm inclined to work like the openbsd cksum and accept invocations
> like "cksum -a sha1x" and "cksum -a sha1b". Any objection?

Pro: BSD compatibility, so good.
Con: multiple ways to do the same thing, so possibly confusing.
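
To illustrate why the trailing "b" of blake2b makes suffix handling slightly
messy, here is a minimal sketch of one way to split such names. It is not
the cksum implementation; parse_algo and its algorithm list are hypothetical:

```shell
# Hypothetical helper: map "sha1b" to "sha1 base64" and "sha1x" to "sha1 hex".
# The algorithm list is illustrative only, not cksum's actual list.
parse_algo() {
  known=' md5 sha1 sha256 sha512 blake2b '
  # An exact algorithm name wins, so "blake2b" is never read as "blake2" + "b".
  case "$known" in
    *" $1 "*) echo "$1 hex"; return 0 ;;
  esac
  base=${1%?}                      # the name without its last character
  case "$known" in
    *" $base "*)
      case "$1" in
        *x) echo "$base hex"; return 0 ;;
        *b) echo "$base base64"; return 0 ;;
      esac ;;
  esac
  echo "unknown algorithm: $1" >&2
  return 1
}

parse_algo sha1b      # sha1 base64
parse_algo blake2b    # blake2b hex
parse_algo blake2bb   # blake2b base64
```

Checking for an exact name before stripping a suffix is the design choice
that keeps blake2b unambiguous in this sketch.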

> Also, comparing algorithms, openbsd has two that we don't: rmd160, sha512/256
> I'm not interested in adding those in this diff, of course, but it may
> be something to consider for compatibility.

Yes, I was considering it, along with sha3 and blake3.

> What's the schedule for the next release?
> Assuming this is desirable, want to include it there?
>
> My own ETA is variable, depending on pressure/desire.
> I've written most of the code (but not yet suffix support) and minimal
> tests, but no documentation or NEWS.

I hope to get the next release out in about 3 weeks,
so it would be good to include this if possible.

thanks,
Pádraig



