help-bash

Re: Adding option to select file types


From: Greg Wooledge
Subject: Re: Adding option to select file types
Date: Mon, 5 Jul 2021 08:50:09 -0400

On Mon, Jul 05, 2021 at 06:48:14AM +0200, lisa-asket@perso.be wrote:
> I have the following bash function and I want an option that allows me
> to pass the file type if I do not want the default .org and .texi

I'm assuming you mean "extensions".

> print-region ()
>   {
>     na=$1
>     nb=$2
>     dir=$3
>     
>       find "$dir" \( -name \*.org -o -name \*.texi \)  \
>         -exec awk -v a="$na" -v b="$nb"                \
>               'FNR == 1 {s="\n==> "FILENAME" <==\n"; f = 0}
>                FNR == a {print s; f = 1}
>                f {print; if (FNR == b) nextfile}' {} +
>   }

OK, the fundamental challenge here is to pass a *list* of extensions
and to generate a find command from that list.  You might think that
generating the dynamic find arguments is the harder part of this
challenge, but that's a straightforward operation; it's actually passing
the list that's the harder part.  I'll discuss that at the end.

Let's assume that after careful review, you've decided that the list
of filename extensions will be passed in the command-line arguments,
as optional arguments after the three mandatory arguments.

So your synopsis (which you've forgotten to document in the comments
for your function) is currently:

# usage: print-region startline endline startdir

And with the change, it will be:

# usage: print-region startline endline startdir [extension ...]

This also means the startdir cannot be an optional argument; it's
going to be mandatory, forever.  If the optional extensions are not
provided, then you'll use .org and .texi.  This should also be in
the comments.

So, here's one possible implementation.  I've used bash 3.1 syntax
features, and placed the documentation in a usage message, in lieu
of comments.

print-region ()
{
  if (($# < 3)); then
    echo "usage: print-region startline endline startdir [extension ...]" >&2
    echo "If extensions are not provided, .org and .texi will be used." >&2
    return 1
  fi

  local na=$1
  local nb=$2
  local dir=$3
  shift 3

  if (($# == 0)); then
    set -- .org .texi
  fi

  # Construct find arguments dynamically from the extension list.
  local findargs=( "$dir" '(' )
  local ext
  for ext in "$@"; do
    findargs+=( -name "*$ext" -o )
  done
  # Overwrite the final -o with ')'
  findargs[${#findargs[@]}-1]=')'

  find "${findargs[@]}" \
    -exec awk -v a="$na" -v b="$nb" '
      FNR == 1 {s="\n==> "FILENAME" <==\n"; f = 0}
      FNR == a {print s; f = 1}
      f {print; if (FNR == b) nextfile}
    ' {} +
}

I did not test it.
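
One low-risk way to check the argument construction without running find at all is to print the generated array. The following is a minimal sketch, not part of the function above; the helper name show_find_args and the example directory /srv/docs are made up for illustration.

```bash
# Hypothetical helper: build the same find arguments as print-region
# and print them one per line instead of running find.
show_find_args ()
{
  if (($# < 1)); then
    echo "usage: show_find_args startdir [extension ...]" >&2
    return 1
  fi

  local dir=$1
  shift

  # Same default as print-region when no extensions are given.
  if (($# == 0)); then
    set -- .org .texi
  fi

  local findargs=( "$dir" '(' )
  local ext
  for ext in "$@"; do
    findargs+=( -name "*$ext" -o )
  done
  # Overwrite the final -o with ')'
  findargs[${#findargs[@]}-1]=')'

  printf '%s\n' "${findargs[@]}"
}

show_find_args /srv/docs .md .txt
# Prints:
# /srv/docs
# (
# -name
# *.md
# -o
# -name
# *.txt
# )
```

Each element is printed on its own line, so a pattern like *.md shows up as a single literal argument, which is what find needs to receive.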

If you don't agree with the decision to pass the optional list of
extensions as arguments, then you'll need to come up with your own
alternative way of passing the list.  I chose the *simplest* way, and
the way that I would probably use in real life.  But I don't know what
your preferences are.

It's worth noting that I probably would have had startdir as an optional
argument in the original design, with . as the default value.  So, adding
the list of extensions would have been a much tougher decision for me.
For you, it's a lot simpler, as you don't need to support two sets of
optional arguments simultaneously.

So, just for the sake of curiosity, what would we do if the original
synopsis had been "print-region startline endline [startdir]"?

One choice would be to break backward compatibility and force startdir
to be provided, flat out.  This would lead us to the solution shown
above.

Another choice would be to require the startdir *only if* you're also
passing a list of extensions afterward.  That might look something
like this:

# usage: print-region startline endline [startdir [extension ...]]

And an implementation:

  local na=$1
  local nb=$2
  shift 2

  local dir=.
  if (($# > 0)); then
    dir=$1
    shift
  fi

  if (($# == 0)); then
    set -- .org .texi
  fi

Of course, there are many other ways to implement it as well.  This design
has the advantage of maintaining backward compatibility; the two-argument
call still works.
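
To see how that parsing behaves for different argument counts, here is a small sketch that wraps the snippet in a function and prints what it decided. The function name parse_region_args and the directory /srv/docs are hypothetical, and the usage check is my addition.

```bash
# Hypothetical wrapper around the parsing shown above; it only reports
# the values it would use, and does not run find or awk.
parse_region_args ()
{
  if (($# < 2)); then
    echo "usage: parse_region_args startline endline [startdir [extension ...]]" >&2
    return 1
  fi

  local na=$1
  local nb=$2
  shift 2

  # startdir is optional and defaults to the current directory.
  local dir=.
  if (($# > 0)); then
    dir=$1
    shift
  fi

  # Whatever remains is the extension list; fall back to the defaults.
  if (($# == 0)); then
    set -- .org .texi
  fi

  printf 'lines %s-%s in %s, extensions:' "$na" "$nb" "$dir"
  printf ' %s' "$@"
  printf '\n'
}

parse_region_args 10 20
# lines 10-20 in ., extensions: .org .texi
parse_region_args 10 20 /srv/docs
# lines 10-20 in /srv/docs, extensions: .org .texi
parse_region_args 10 20 /srv/docs .md
# lines 10-20 in /srv/docs, extensions: .md
```

The limitation of this design is visible in the last call: there is no way to pass extensions while leaving startdir at its default.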

Another design choice would be to introduce options, and make the
extension list an option.  This complicates the parsing quite a bit,
but doesn't break backward compatibility.  We still have some more
decisions to make, though: will the list be passed as a single option
argument, with some kind of delimiter?  Or will the list be passed as
a series of repeated option arguments?

The first choice would have a synopsis like:

print-region [--extensions extlist] startline endline [startdir]

(And you'd need to document how the list is formatted.)  The second
choice might look like:

print-region [-e extension ...] startline endline [startdir]

And in both cases, an option parser would need to be introduced, before
the straightforward na=$1 ... that we've been using.  You'd build up an
array of extensions in a local array variable, rather than using the
positional parameters.
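
As a rough illustration of such a parser, the sketch below accepts both forms with a plain while/case loop. Several things here are assumptions rather than anything decided above: the function name parse_print_region_opts, the choice of a comma as the delimiter for --extensions, and the error messages. It only prints what it parsed.

```bash
# Hypothetical option parser covering both synopses:
#   [-e extension ...]       repeated option
#   [--extensions extlist]   assumed to be a comma-separated list
parse_print_region_opts ()
{
  local exts=()
  local list

  while (($# > 0)); do
    case $1 in
      -e)
        (($# >= 2)) || { echo "missing argument to -e" >&2; return 1; }
        exts+=( "$2" )
        shift 2
        ;;
      --extensions)
        (($# >= 2)) || { echo "missing argument to --extensions" >&2; return 1; }
        # Split on commas without triggering filename expansion.
        IFS=, read -r -a list <<< "$2"
        exts+=( "${list[@]}" )
        shift 2
        ;;
      --)
        shift
        break
        ;;
      -*)
        echo "unknown option: $1" >&2
        return 1
        ;;
      *)
        break
        ;;
    esac
  done

  if (($# < 2)); then
    echo "usage: [-e extension ...] startline endline [startdir]" >&2
    return 1
  fi

  # No extensions given: use the defaults.
  if ((${#exts[@]} == 0)); then
    exts=( .org .texi )
  fi

  local na=$1
  local nb=$2
  local dir=${3:-.}

  printf 'na=%s nb=%s dir=%s\n' "$na" "$nb" "$dir"
  printf 'ext=%s\n' "${exts[@]}"
}

parse_print_region_opts -e .md -e .txt 10 20
parse_print_region_opts --extensions .md,.txt 10 20 /srv/docs
```

In the real function, the loop over "$@" that builds the find arguments would iterate over "${exts[@]}" instead.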

Finally, one last possible design decision would be to have the list of
extensions in an array variable in the caller's scope, and pass that
array variable by name to the function.  This is the most complicated
choice, for both the caller *and* the function, so I won't go into much
more detail here.  I just note that it's something that could be chosen.
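
For the curious, here is a bare-bones sketch of the by-name approach using indirect expansion, which should work in older bash versions as well. The function name show_exts_by_name and the variable names are hypothetical, and it only prints the list it received.

```bash
# Hypothetical sketch: the caller passes the NAME of an array variable,
# and the function copies its elements via indirect expansion.
show_exts_by_name ()
{
  # Underscore-prefixed locals reduce (but do not eliminate) the risk
  # that the caller's array has the same name as a local variable.
  local _ref="$1[@]"
  local _exts=( "${!_ref}" )

  # An unset or empty array falls back to the defaults.
  if ((${#_exts[@]} == 0)); then
    _exts=( .org .texi )
  fi

  printf '%s\n' "${_exts[@]}"
}

my_exts=( .md .txt )
show_exts_by_name my_exts
```

The caller has to create and name an array first, and the function has to guard against name collisions, which is a good part of why this is the most complicated choice.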


