[bug-diffutils] bug#29311: Byte comparison from cmp diffutils

From: address@hidden
Subject: [bug-diffutils] bug#29311: Byte comparison from cmp diffutils
Date: Wed, 15 Nov 2017 23:44:29 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

Hi, I am writing because the cmp command from diffutils gives an unexpected result when it is used on binary files: if we compare two different files (not symlinks, not hard links, etc.), cmp says that they differ at byte 42.

~ > LC_ALL=C cmp /usr/bin/g++ /usr/bin/gcc
/usr/bin/g++ /usr/bin/gcc differ: char 42, line 1

During a discussion, another member convinced me to try hexdump for the comparison, and I saw that those files differ in many more bytes than just byte 42. So I wrote a program to check, just to be sure. (The program is silly: I forgot to return EXIT_SUCCESS, and by default, once it finds a differing byte, it considers all the following bytes different too, but it is only for this demonstration.)
src: https://arin.ga/swsNn1

~ > CFLAGS=-O3 make diffbyte_gcc && ./diffbyte_gcc
they diff by: 766087 byte
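
For illustration, counting every differing byte position (rather than stopping at the first one) can be sketched as below. This is not the program linked above; the function name `count_differing_bytes` and the choice to count surplus bytes of the longer input as differences are assumptions made for this sketch.

```python
def count_differing_bytes(a: bytes, b: bytes) -> int:
    """Count positions at which two byte strings differ.

    Assumption for this sketch: when the inputs have different
    lengths, every surplus byte of the longer one counts as a
    difference.
    """
    # Compare position by position over the common length.
    differing = sum(1 for x, y in zip(a, b) if x != y)
    # Add the bytes that exist in only one of the inputs.
    return differing + abs(len(a) - len(b))


if __name__ == "__main__":
    # Two short in-memory examples instead of real binaries.
    print(count_differing_bytes(b"abcdef", b"abXdeY"))
    print(count_differing_bytes(b"abc", b"abcde"))
```

To try it on real files, the two inputs could be read with `open(path, "rb").read()`.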

So something clearly does not work.
Some clarifications: I have only just noticed that with C as the locale it says "char", but in my language it says "byte" ("/bin/g++ /bin/gcc differenza: byte 42, riga 1"). That makes no difference, though, since the manpage uses only the term "byte", so I expect it to compare bytes, not only ASCII text. The description in the manpage says: "Compare two files byte by byte".

As another example, the wc command with the -c option really counts bytes:

~ > printf "\u2592"|wc -c

~ > printf "ciao\0x8" | wc -c

and (except for sparse files) du and wc also report the same result for binary files:

~ > du -b /bin/gcc
993584  /bin/gcc

~ > wc -c /bin/gcc
993584 /bin/gcc

From the info page I read:
The 'cmp' command compares two files, and if they differ, tells the first byte and line number where they differ or reports that one file is a prefix of the other. Bytes and lines are numbered starting with 1. The arguments of 'cmp' are as follows:

I do not know whether this means that the comparison stops at the first line, or rather as soon as one of the two buffers differs from the other, but the manpages and the program options say that it compares bytes, so it should do so.
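
The behaviour described in the info text quoted above ("tells the first byte and line number where they differ", with bytes and lines numbered from 1) can be sketched as follows. This is only a reading of that sentence, not the actual cmp source; the function name `first_difference` and the rule that the line number is one plus the number of newlines before the differing byte are assumptions of this sketch.

```python
from typing import Optional, Tuple


def first_difference(a: bytes, b: bytes) -> Optional[Tuple[int, int]]:
    """Return (byte_number, line_number) of the first differing byte.

    Both numbers start at 1. Returns None when no byte differs
    within the common length, i.e. the inputs are identical or one
    is a prefix of the other (this sketch does not tell those two
    cases apart).
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            # Assumed rule: lines are delimited by b"\n", so the line
            # number is 1 plus the newlines seen before this byte.
            line = a.count(b"\n", 0, i) + 1
            return i + 1, line
    return None


if __name__ == "__main__":
    print(first_difference(b"abc\ndef", b"abc\ndXf"))
    print(first_difference(b"same", b"same"))
```

Under this reading, a result such as "char 42, line 1" only says where the first difference is; it says nothing about how many bytes differ after it.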
