bug#49741: basenc --base64url decoding bug

From: Emil Lundberg
Subject: bug#49741: basenc --base64url decoding bug
Date: Mon, 26 Jul 2021 13:24:01 +0200

Hi! I seem to have encountered a bug in basenc. While decoding a large
base64url-encoded JSON blob, the decoder drops some characters,
rendering the output invalid JSON. I've verified against Python's
built-in base64url decoder, which correctly produces the expected result
while basenc does not.
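
For reference, a cross-check with Python's standard library could look like the minimal sketch below. The helper name `decode_base64url`, the padding step, and the sample string are assumptions for illustration; they are not taken from the attached test case.

```python
import base64


def decode_base64url(text: str) -> bytes:
    """Decode base64url text, tolerating line breaks and missing '=' padding."""
    # Remove any whitespace (e.g. line wrapping) before decoding.
    compact = "".join(text.split())
    # base64url input is often unpadded; restore padding to a multiple of 4.
    padded = compact + "=" * (-len(compact) % 4)
    return base64.urlsafe_b64decode(padded)


# Illustrative round trip on a tiny JSON fragment (not the real input.txt).
sample = base64.urlsafe_b64encode(b'{"minor": 0}').decode("ascii").rstrip("=")
print(decode_base64url(sample))
```

Feeding the contents of input.txt to such a helper and hashing the result gives an independent value to compare against the basenc output.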

I've attached the test case, which I've tried to minimize as much as I
can. Every attempt to remove more of the JSON values either stopped the
bug from triggering or made it harder to detect.

Reproduction instructions:

$ uname -a
Linux HOST 5.13.4-arch1-1 #1 SMP PREEMPT Tue, 20 Jul 2021 16:58:51 +0000 x86_64 GNU/Linux

$ basenc --version
basenc (GNU coreutils) 8.32

$ cat expected-output.txt | sha256sum
fdb9a77c44e9cd612ad3a3cc210e03ea9782e342bb8293b49530e032b2e4ed0e  -

$ cat actual-output.txt | sha256sum
86bce7aa1d0c2da8432cfbb6da4ad2e559012dadbd1abde711e96b2c518d2b11  -

$ basenc -d --base64 input.txt | sha256sum
86bce7aa1d0c2da8432cfbb6da4ad2e559012dadbd1abde711e96b2c518d2b11  -

$ diff actual-output.txt expected-output.txt
<             "minor: 0
>             "minor": 0
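
To pinpoint where two decoder outputs diverge, a byte-level comparison can help. The following is a minimal sketch; the function name `first_difference` and the sample byte strings are hypothetical and only mirror the diff above.

```python
from typing import Optional


def first_difference(actual: bytes, expected: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (a, b) in enumerate(zip(actual, expected)):
        if a != b:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(actual) != len(expected):
        return min(len(actual), len(expected))
    return None


# Illustrative inputs modelled on the diff output, not the real files.
print(first_difference(b'"minor: 0', b'"minor": 0'))
```

Reading actual-output.txt and expected-output.txt in binary mode and passing their contents to this function would give the offset of the dropped character.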

Installed from Arch Linux official repos, package version coreutils 8.32-1.

Thanks for making basenc, and please let me know if I can do anything
more to help!


