coreutils

[PATCH] ls: add --sort=width (-W) option to sort by filename width


From: Carl Edquist
Subject: [PATCH] ls: add --sort=width (-W) option to sort by filename width
Date: Fri, 9 Apr 2021 07:02:44 -0500 (CDT)

Dear Coreutils Maintainers,

I'd like to introduce my favorite 'ls' option, '-W', which I have used and enjoyed regularly over the last few years.

The concept is just to sort filenames by their printed widths.


(If this sounds odd, I invite you to hear it out, try it, and see for yourself!)


I am including a patch with my implementation and accompanying tests - as well as some sample output. And I'll happily field any requests for improvements.


But first, some motivation....


The main use case for me has to do with managing filenames in directories, as they are displayed by 'ls' itself.

There is a usual tidy/untidy cycle for me in my homedir, or various other user-managed directories where files tend to accumulate.

The "tidying" part of the cycle involves organizing files into subdirs until a bare 'ls' invocation fits comfortably within a window (e.g. 80x24).

(I feel that the 'ls' output is optimally useful when it fits into a single window, to see and reason about the entire directory's contents at once.)

Then over time, random files accumulate of various lengths; and before you know it, the output of 'ls' is several window-heights tall. (And I feel the usefulness of the 'ls' column output drops off significantly when you can't see the entire listing in a single window.)

The _various lengths_ part is significant here, because a longer filename makes the entire column it appears in wider. So if you have long filenames mixed in with shorter ones, you end up with mostly whitespace in the 'ls' output. (Which is also to say, filenames become inefficiently "packed" in the column display.)

When this is sufficiently annoying to motivate tidying up again, the first thing is actually to identify the long filenames, which are making a mess of the otherwise-nice default 'ls' column output. Tucking just the long ones into subdirs (or just renaming them to something shorter) is a quick way to condense the directory listing output significantly.


Originally I would identify the longest filenames in a directory with something in the shell like:


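    # Print the arguments NUL-delimited, sorted by length (shortest first).
    # Needs bash brace expansion and an awk that accepts NUL as RS (e.g. gawk).
    lsort0 () {
        [ $# -eq 0 ] ||                                    # no arguments: print nothing
        printf '%s\0' "$@" |
        awk '{print length, $0}' OFS='\t' {RS,ORS}='\0' |  # prefix each name with its length
        sort -zn | cut -zf2-                               # sort numerically, drop the prefix
    }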

    zlines () { tr '\0\n' '\n?'; }

    lsort0 * | tail -z | zlines


This does an OK job, but it seems like a lot of tricky work to accomplish, when 'ls' is already designed for listing files sorted by various criteria.

Also notably, the 'length' of the filename is not quite the right thing to measure, as it does not take into account the width of Unicode characters (sometimes 0 or 2), nor (more generally) the actual width that gets used when 'ls' displays it, which may include various quoting characters.

Really, only 'ls' itself has access to this information, so it can only be done properly if the feature is built into 'ls'.
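
To make the length-versus-width distinction concrete, here is a rough sketch (not part of the patch) that measures a name two ways. It assumes GNU 'wc -L', which reports display width according to the current locale; the function names 'char_count' and 'display_width' are made up for illustration. Even this only approximates what 'ls' prints, since it knows nothing about the quoting 'ls' may add.

```shell
# char_count NAME: number of characters in NAME, as the shell counts them
# in the current locale.
char_count () { printf '%s\n' "${#1}"; }

# display_width NAME: columns NAME would occupy, as measured by GNU 'wc -L'
# (locale-dependent; assumes NAME contains no newlines or tabs).
display_width () { printf '%s\n' "$1" | wc -L; }

# Example (plain ASCII, so both report 9):
#   char_count ls-vdir.c
#   display_width ls-vdir.c
```

For names containing wide or zero-width characters, the two numbers would be expected to differ in a UTF-8 locale, which is exactly the case where sorting by 'length' goes wrong.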


An interesting observation is that if you ask 'ls' to display files in the order of their width, you actually get an optimally-packed column display, in the default column format mode (-C).

This helps identify the outliers for long filenames, but it also looks neat and can easily cut in half the number of lines 'ls' takes to display a directory.

You can get a taste for this using the 'lsort0' function defined above, with an unpatched 'ls':

    lsort0 * | xargs -0 ls -dU --color=auto

(Try it in a messy homedir!  Neat, eh?)

This emulates what the new 'ls -W' does by itself.


(I provide the complicated 'printf | awk | sort | cut | xargs' pipeline, not to demonstrate that the new 'ls -W' option is superfluous, but to show how troublesome it is even to approximate the desired result without the option built into 'ls'.)

Additionally, 'ls -W' can be combined naturally with other 'ls' options like '-a' or '-r', or whatever decoration options you may have defined for your 'ls' alias in LS_OPTIONS.


So, that's what the new ls -W/--sort=width option is all about.

It helps identify the outliers for long filenames, and it also produces a more compact display of columns when listing a directory with many entries of various widths.


An implementation detail: this sorts files based on ls's internal 'quote_name_width' using the current filename quoting options. So it takes into account the actual width that 'ls' displays for each entry.

And ties are still broken by the default sorting of the filename itself - as is the case with other sort options.
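
The ordering described here (width first, name as tie-breaker) can be approximated outside of 'ls' with explicit sort keys. This is only a sketch under simplifying assumptions: it uses awk's length() instead of ls's 'quote_name_width', it reads one name per line (so names containing newlines are not handled), and it breaks ties in byte order via LC_ALL=C, whereas 'ls' compares names according to the locale. The name 'sort_by_width' is hypothetical.

```shell
# sort_by_width: read names on stdin, one per line, and print them ordered
# by length, with names of equal length ordered by name.
sort_by_width () {
    awk '{ printf "%d\t%s\n", length($0), $0 }' |        # prefix: length, TAB
    LC_ALL=C sort -t "$(printf '\t')" -k1,1n -k2 |        # key 1: length, key 2: name
    cut -f2-                                              # drop the length prefix
}

# Example:
#   printf '%s\n' bb a cc ab | sort_by_width
# prints a, ab, bb, cc (one per line).
```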


If you try it and you're impressed with how neatly 'ls -W' is able to pack the filenames into columns, at first you might almost think it must be a new 'ls --format' option; but really all it does is change the sort order.
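
To illustrate why the sort order alone makes such a difference, here is a simplified model of a column-major layout (a sketch for illustration, not the code 'ls' actually uses). It assumes each column is as wide as its longest entry plus two spaces of padding, and it searches for the fewest rows that fit in a given line width. The function name 'rows_needed' is made up.

```shell
# rows_needed WIDTH: read names on stdin (one per line, in display order)
# and print the fewest rows a column-major layout needs to fit in WIDTH
# columns, under the simplified model described above.
rows_needed () {
    awk -v width="$1" '
        { len[NR] = length($0) }
        END {
            if (NR == 0) { print 0; exit }
            for (rows = 1; rows <= NR; rows++) {
                total = 0
                # Each column holds "rows" consecutive names.
                for (start = 1; start <= NR; start += rows) {
                    max = 0
                    for (i = start; i < start + rows && i <= NR; i++)
                        if (len[i] > max) max = len[i]
                    total += max + 2      # widest entry plus padding
                }
                total -= 2                # no padding after the last column
                if (total <= width) { print rows; exit }
            }
            print NR                      # nothing fits: one name per row
        }'
}

# Same eight names, two orders, 20 columns wide:
#   printf '%s\n' aaaaaaaaaa b c d e f g zzzzzzzzzz | rows_needed 20   # 8
#   printf '%s\n' b c d e f g aaaaaaaaaa zzzzzzzzzz | rows_needed 20   # 2
```

In name order, every candidate column ends up containing one of the two long names, so the model falls back to a single column; in width order, the long names share one column and the short ones pack tightly beside it.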



That's about it. Thanks for your consideration, and I hope many find this to be as useful & enjoyable as I do.


Carl


-=-=+=-=-


* Demo! *


[coreutils/src]$ ls  # normal output
basename.c        expand.c            make-prime-list.c  shred.c
basenc.c          expr.c              make-prime-list.o  shuf.c
blake2            extent-scan.c       md5sum.c           single-binary.mk
cat.c             extent-scan.h       mkdir.c            sleep.c
chcon.c           extract-magic       mkfifo.c           sort.c
chgrp.c           factor.c            mknod.c            split.c
chmod.c           false.c             mktemp.c           stat.c
chown-core.c      fiemap.h            mv.c               statx.h
chown-core.h      find-mount-point.c  nice.c             stdbuf.c
chown.c           find-mount-point.h  nl.c               stty.c
chroot.c          fmt.c               nohup.c            sum.c
cksum.c           fold.c              nproc.c            sync.c
comm.c            force-link.c        numfmt.c           system.h
copy.c            force-link.h        od.c               tac-pipe.c
copy.h            fs-is-local.h       operand2sig.c      tac.c
coreutils-arch.c  fs.h                operand2sig.h      tail.c
coreutils-dir.c   getlimits.c         paste.c            tee.c
coreutils-vdir.c  group-list.c        pathchk.c          test.c
coreutils.c       group-list.h        pinky.c            timeout.c
coreutils.h       groups.c            pr.c               touch.c
cp-hash.c         head.c              primes.h           tr.c
cp-hash.h         hostid.c            printenv.c         true.c
cp.c              hostname.c          printf.c           truncate.c
csplit.c          id.c                prog-fprintf.c     tsort.c
cu-progs.mk       install.c           prog-fprintf.h     tty.c
cut.c             ioblksize.h         ptx.c              uname-arch.c
date.c            join.c              pwd.c              uname-uname.c
dcgen             kill.c              readlink.c         uname.c
dd.c              lbracket.c          realpath.c         uname.h
df.c              libstdbuf.c         relpath.c          unexpand.c
die.h             link.c              relpath.h          uniq.c
dircolors.c       ln.c                remove.c           unlink.c
dircolors.h       local.mk            remove.h           uptime.c
dircolors.hin     logname.c           rm.c               users.c
dirname.c         longlong.h          rmdir.c            version.c
du-tests          ls-dir.c            runcon.c           version.h
du.c              ls-ls.c             selinux.c          wc.c
echo.c            ls-vdir.c           selinux.h          who.c
env.c             ls.c                seq.c              whoami.c
expand-common.c   ls.h                set-fields.c       yes.c
expand-common.h   make-prime-list     set-fields.h


[coreutils/src]$ ls -W  # sort by width
cp.c   seq.c   sync.c   tsort.c   stdbuf.c    readlink.c     extent-scan.h
dd.c   sum.c   tail.c   uname.c   system.h    realpath.c     extract-magic
df.c   tac.c   test.c   uname.h   unlink.c    tac-pipe.c     fs-is-local.h
du.c   tee.c   true.c   users.c   uptime.c    truncate.c     operand2sig.c
fs.h   tty.c   uniq.c   basenc.c  whoami.c    unexpand.c     operand2sig.h
id.c   who.c   chcon.c  chroot.c  cp-hash.c   coreutils.c    uname-uname.c
ln.c   yes.c   chgrp.c  csplit.c  cp-hash.h   coreutils.h    prog-fprintf.c
ls.c   blake2  chmod.c  du-tests  dirname.c   cu-progs.mk    prog-fprintf.h
ls.h   comm.c  chown.c  expand.c  install.c   dircolors.c    coreutils-dir.c
mv.c   copy.c  cksum.c  factor.c  logname.c   dircolors.h    expand-common.c
nl.c   copy.h  false.c  fiemap.h  ls-vdir.c   getlimits.c    expand-common.h
od.c   date.c  ls-ls.c  groups.c  pathchk.c   ioblksize.h    make-prime-list
pr.c   echo.c  mkdir.c  hostid.c  relpath.c   libstdbuf.c    coreutils-arch.c
rm.c   expr.c  mknod.c  local.mk  relpath.h   chown-core.c   coreutils-vdir.c
tr.c   fold.c  nohup.c  ls-dir.c  selinux.c   chown-core.h   single-binary.mk
wc.c   head.c  nproc.c  md5sum.c  selinux.h   force-link.c   make-prime-list.c
cat.c  join.c  paste.c  mkfifo.c  timeout.c   force-link.h   make-prime-list.o
cut.c  kill.c  pinky.c  mktemp.c  version.c   group-list.c   find-mount-point.c
dcgen  link.c  rmdir.c  numfmt.c  version.h   group-list.h   find-mount-point.h
die.h  nice.c  shred.c  primes.h  basename.c  set-fields.c
env.c  shuf.c  sleep.c  printf.c  hostname.c  set-fields.h
fmt.c  sort.c  split.c  remove.c  lbracket.c  uname-arch.c
ptx.c  stat.c  statx.h  remove.h  longlong.h  dircolors.hin
pwd.c  stty.c  touch.c  runcon.c  printenv.c  extent-scan.c


[coreutils/src]$ # accumulate some long filenames...
[coreutils/src]$ touch {a,z}-some-obnoxiously-longish-filename

[coreutils/src]$ ls  # normal output, now much taller
a-some-obnoxiously-longish-filename  make-prime-list.c
basename.c                           make-prime-list.o
basenc.c                             md5sum.c
blake2                               mkdir.c
cat.c                                mkfifo.c
chcon.c                              mknod.c
chgrp.c                              mktemp.c
chmod.c                              mv.c
chown-core.c                         nice.c
chown-core.h                         nl.c
chown.c                              nohup.c
chroot.c                             nproc.c
cksum.c                              numfmt.c
comm.c                               od.c
copy.c                               operand2sig.c
copy.h                               operand2sig.h
coreutils-arch.c                     paste.c
coreutils-dir.c                      pathchk.c
coreutils-vdir.c                     pinky.c
coreutils.c                          pr.c
coreutils.h                          primes.h
cp-hash.c                            printenv.c
cp-hash.h                            printf.c
cp.c                                 prog-fprintf.c
csplit.c                             prog-fprintf.h
cu-progs.mk                          ptx.c
cut.c                                pwd.c
date.c                               readlink.c
dcgen                                realpath.c
dd.c                                 relpath.c
df.c                                 relpath.h
die.h                                remove.c
dircolors.c                          remove.h
dircolors.h                          rm.c
dircolors.hin                        rmdir.c
dirname.c                            runcon.c
du-tests                             selinux.c
du.c                                 selinux.h
echo.c                               seq.c
env.c                                set-fields.c
expand-common.c                      set-fields.h
expand-common.h                      shred.c
expand.c                             shuf.c
expr.c                               single-binary.mk
extent-scan.c                        sleep.c
extent-scan.h                        sort.c
extract-magic                        split.c
factor.c                             stat.c
false.c                              statx.h
fiemap.h                             stdbuf.c
find-mount-point.c                   stty.c
find-mount-point.h                   sum.c
fmt.c                                sync.c
fold.c                               system.h
force-link.c                         tac-pipe.c
force-link.h                         tac.c
fs-is-local.h                        tail.c
fs.h                                 tee.c
getlimits.c                          test.c
group-list.c                         timeout.c
group-list.h                         touch.c
groups.c                             tr.c
head.c                               true.c
hostid.c                             truncate.c
hostname.c                           tsort.c
id.c                                 tty.c
install.c                            uname-arch.c
ioblksize.h                          uname-uname.c
join.c                               uname.c
kill.c                               uname.h
lbracket.c                           unexpand.c
libstdbuf.c                          uniq.c
link.c                               unlink.c
ln.c                                 uptime.c
local.mk                             users.c
logname.c                            version.c
longlong.h                           version.h
ls-dir.c                             wc.c
ls-ls.c                              who.c
ls-vdir.c                            whoami.c
ls.c                                 yes.c
ls.h                                 z-some-obnoxiously-longish-filename
make-prime-list


[coreutils/src]$ ls -W  # sort by width for much denser output
cp.c    copy.c   rmdir.c   uptime.c     libstdbuf.c
dd.c    copy.h   shred.c   whoami.c     chown-core.c
df.c    date.c   sleep.c   cp-hash.c    chown-core.h
du.c    echo.c   split.c   cp-hash.h    force-link.c
fs.h    expr.c   statx.h   dirname.c    force-link.h
id.c    fold.c   touch.c   install.c    group-list.c
ln.c    head.c   tsort.c   logname.c    group-list.h
ls.c    join.c   uname.c   ls-vdir.c    set-fields.c
ls.h    kill.c   uname.h   pathchk.c    set-fields.h
mv.c    link.c   users.c   relpath.c    uname-arch.c
nl.c    nice.c   basenc.c  relpath.h    dircolors.hin
od.c    shuf.c   chroot.c  selinux.c    extent-scan.c
pr.c    sort.c   csplit.c  selinux.h    extent-scan.h
rm.c    stat.c   du-tests  timeout.c    extract-magic
tr.c    stty.c   expand.c  version.c    fs-is-local.h
wc.c    sync.c   factor.c  version.h    operand2sig.c
cat.c   tail.c   fiemap.h  basename.c   operand2sig.h
cut.c   test.c   groups.c  hostname.c   uname-uname.c
dcgen   true.c   hostid.c  lbracket.c   prog-fprintf.c
die.h   uniq.c   local.mk  longlong.h   prog-fprintf.h
env.c   chcon.c  ls-dir.c  printenv.c   coreutils-dir.c
fmt.c   chgrp.c  md5sum.c  readlink.c   expand-common.c
ptx.c   chmod.c  mkfifo.c  realpath.c   expand-common.h
pwd.c   chown.c  mktemp.c  tac-pipe.c   make-prime-list
seq.c   cksum.c  numfmt.c  truncate.c   coreutils-arch.c
sum.c   false.c  primes.h  unexpand.c   coreutils-vdir.c
tac.c   ls-ls.c  printf.c  coreutils.c  single-binary.mk
tee.c   mkdir.c  remove.c  coreutils.h  make-prime-list.c
tty.c   mknod.c  remove.h  cu-progs.mk  make-prime-list.o
who.c   nohup.c  runcon.c  dircolors.c  find-mount-point.c
yes.c   nproc.c  stdbuf.c  dircolors.h  find-mount-point.h
blake2  paste.c  system.h  getlimits.c  a-some-obnoxiously-longish-filename
comm.c  pinky.c  unlink.c  ioblksize.h  z-some-obnoxiously-longish-filename

Attachment: 0001-ls-add-sort-width-W-option.patch
Description: Text Data

