[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 1/2] maint.mk: Split long argument lists
From: |
Roman Bolshakov |
Subject: |
Re: [PATCH v2 1/2] maint.mk: Split long argument lists |
Date: |
Thu, 13 Dec 2018 00:13:42 +0300 |
User-agent: |
NeoMutt/20180716 |
On Tue, Dec 04, 2018 at 04:10:08PM -0600, Eric Blake wrote:
> On 12/3/18 9:00 AM, Roman Bolshakov wrote:
> > $(VC_LIST_EXCEPT) is usually expanded into arguments for a command.
> > When a project contains too many, some operating systems can't pass all
> > the arguments because they hit the limit of arguments. FreeBSD and macOS
> > are known to have the limit of 256k of arguments.
> >
> > More on the issue:
> > http://lists.gnu.org/archive/html/bug-gnulib/2015-08/msg00019.html
> > https://www.redhat.com/archives/libvir-list/2015-August/msg00758.html
> >
> > xargs without flags can be used to limit number of arguments. The
> > default number of arguments (max-args for "-n" flag) is
> > platform-specific. If argument length exceeds default value for "-s"
> > flag (max-chars), xargs will feed less arguments than max-args.
> >
> > Signed-off-by: Roman Bolshakov <address@hidden>
> > ---
> > top/maint.mk | 53 +++++++++++++++++++++++++++++++++-------------------
> > 1 file changed, 34 insertions(+), 19 deletions(-)
> >
> > diff --git a/top/maint.mk b/top/maint.mk
> > index 4889ebacc..36a5df262 100644
> > --- a/top/maint.mk
> > +++ b/top/maint.mk
> > @@ -303,18 +303,22 @@ define _sc_search_regexp
> > \
> > : Filter by content;
> > \
> > test -n "$$files" && test -n "$$containing"
> > \
> > - && { files=$$(grep -l "$$containing" $$files); } || :;
> > \
>
> If $$files is large enough to overflow the command-line limit for execv*()
> here...
>
> > + && { files=$$(echo "$$files" \
>
> ...then why is it not large enough to overflow the limit for 'echo' here?
> Or is it because you are implicitly relying on 'echo' being a shell-builtin
> that does not suffer from the same command-line limit because there is no
> execv*() involved?
>
Yes, the built-in doesn't suffer from execve limitations :)
> > + | xargs grep -l "$$containing" /dev/null); } || :; \
>
> At any rate, as long as the rewrite worked for you, it does look like you
> are benefitting from 'echo' being a shell builtin that does not suffer from
> the same limits as an external program. If command-line arguments start
> affecting echo, we'd have to come up with something hairier like:
>
> xargs <<EOF grep -l "$$containing" /dev/null
> $$files
> EOF
>
AFAIK it could happen only if build host is very very low on memory. I
don't know if heredocs would help with that.
> > test -n "$$files" && test -n "$$non_containing"
> > \
> > - && { files=$$(grep -vl "$$non_containing" $$files); } || :; \
> > + && { files=$$(echo "$$files" \
> > + | xargs grep -vl "$$non_containing" /dev/null); } || :;
> > \
>
> Did you need /dev/null on this conversion?
>
No, I've dropped /dev/null for grep invocations with -l flag in v3.
> > @@ -323,9 +327,10 @@ define _sc_search_regexp
> > endef
> > sc_avoid_if_before_free:
> > - @$(srcdir)/$(_build-aux)/useless-if-before-free \
> > - $(useless_free_options) \
> > - $$($(VC_LIST_EXCEPT) | grep -v useless-if-before-free) && \
> > + @$(VC_LIST_EXCEPT) | grep -v useless-if-before-free \
> > + | xargs
> > \
> > + $(srcdir)/$(_build-aux)/useless-if-before-free \
> > + $(useless_free_options) && \
>
> Spacing before the && looks odd here.
>
Ok, I've left the first tab there and placed | and && like you advised
below. Hope this helps.
> > @@ -845,7 +853,10 @@ sc_prohibit_always-defined_macros:
> > case $$(echo all: | grep -l -f - Makefile) in Makefile);; *) \
> > echo '$(ME): skipping $@: you lack GNU grep' 1>&2; exit 0;; \
> > esac; \
> > - $(def_sym_regex) | grep -E -f - $$($(VC_LIST_EXCEPT)) \
> > + sym_regexes=$$(mktemp); \
>
> Are we guaranteed that 'mktemp' is portably present on all developer's
> machines?
>
That's a good question. It's not present on Solaris 9. What platforms
gnulib should cover?
> > + $(def_sym_regex) > $$sym_regexes; \
> > + $(VC_LIST_EXCEPT) | xargs \
> > + grep -E -f $$sym_regexes /dev/null \
> > && { echo '$(ME): define the above via some gnulib .h file' \
> > 1>&2; exit 1; } || :; \
>
> Even worse, you aren't cleaning up your temporary file. This conversion
> needs to be rewritten in a better manner. I suggest:
>
> $(VC_LIST_EXCEPT) | xargs sh -c \
> '$(def_sym_regex) | grep -E -f - "$$@"' dummy /dev/null
>
> which uses 'sh' as an intermediary to let you still feed $(def_sym_regex) as
> the stdin to the grep process (the dummy argument supplies $0 to sh, then
> the /dev/null is the usual placeholder to ensure grep sees more than one
> file at a time to avoid its output changing styles).
>
Ok, I was struggling to find a portable solution. Thank you for
proposing this one (and explaining it). The macro can't be expanded
without breaking shell parser though. I don't know... the approach with
mktemp is more readable but I'm not sure if that's what we want to use
because of portability concerns.
> > @@ -927,7 +938,8 @@ require_exactly_one_NL_at_EOF_ =
> > \
> > }
> > \
> > END { exit defined $$fail }
> > sc_prohibit_empty_lines_at_EOF:
> > - @perl -le '$(require_exactly_one_NL_at_EOF_)' $$($(VC_LIST_EXCEPT)) \
> > + @$(VC_LIST_EXCEPT) | xargs -I{} \
> > + perl -le '$(require_exactly_one_NL_at_EOF_)' {} \
>
> Why did you need -I{} here? If your only use of {} in the 'initial-args' to
> xargs is in the final position, this should be the same as:
>
> $(VC_LIST_EXCEPT) | xargs perl -le '$(require_exactly_one_NL_at_EOF_)'
>
Right, there's no need, thanks.
> > @@ -1033,7 +1046,8 @@ sc_prohibit_test_double_equal:
> > # definition of LDADD from the appropriate Makefile.am and exits 0
> > # when it contains "ICONV".
> > sc_proper_name_utf8_requires_ICONV:
> > - @progs=$$(grep -l 'proper_name_utf8 ''("' $$($(VC_LIST_EXCEPT)));\
> > + @progs=$$($(VC_LIST_EXCEPT) | xargs \
> > + grep -l 'proper_name_utf8 ''("' /dev/null);
> > \
>
> You don't need /dev/null here because of grep -l.
>
Fixed, thanks.
> > @@ -1155,8 +1169,8 @@ sc_po_check:
> > @if test -f $(po_file); then \
> > grep -E -v '^(#|$$)' $(po_file) \
> > | grep -v '^src/false\.c$$' | sort > address@hidden;
> > \
> > - files=$$(perl $(perl_translatable_files_list_) \
> > - $$($(VC_LIST_EXCEPT)) $(generated_files)); \
> > + files=$$($(VC_LIST_EXCEPT) | xargs -I{} \
> > + perl $(perl_translatable_files_list_) {} $(generated_files)); \
> > grep -E -l '$(_gl_translatable_string_re)' $$files \
>
> This one is wrong; if xargs splits the list, it ends up running perl on the
> tail end of $(generated_files) more than once. Better would be:
>
> { $(VC_LIST_EXCEPT); echo $(generated_files); } | \
> xargs perl $(perl_translatable_files_list_)) | \
> xargs grep -E -l '$(_gl_translatable_string_re)' \
>
> > | $(SED) 's|^$(_dot_escaped_srcdir)/||' | sort -u > address@hidden;
> > \
>
I agree that's much better, thanks.
> Also, this points out that we tend to use the style:
>
> command 1 \
> | xargs command 2 \
> | command 3
>
> rather than
>
> command 1 | xargs \
> command 2 | \
> command 3
>
> so while I didn't point out many spots where your style is different (by
> splitting xargs from its arguments), it may be worth reformatting to include
> xargs on the same line as its arguments rather than separately.
>
Ok, sure I've changed a bit with formatting to this one (it also
illustrates that I split command 2 because otherwise I have to move
backslash for every macro/routine I'm changing, but in some cases I have
to move backslash even beyond 79th character):
command 1 \
| xargs \
command 2 \
| command 3
> > @@ -1230,7 +1244,8 @@ sc_cross_check_PATH_usage_in_tests:
> > exit 1; }; \
> > good=$$(grep -E '$(_hv_regex_strong)' $(_hv_file)); \
> > grep -LFx "$$good" \
> > - $$(grep -lE '$(_hv_regex_weak)' $$($(VC_LIST_EXCEPT))) \
> > + $$($(VC_LIST_EXCEPT) | xargs \
> > + grep -lE '$(_hv_regex_weak)' /dev/null) \
>
> Are you sure that the inner $$() won't produce too many arguments for the
> outer grep -LFx, or is this another case where you'll want to rewrite to:
>
> $(VC_LIST_EXCEPT) | xargs grep -lE '$(_hv_regx_weak)' \
> | xargs grep -LFx "$$good"
>
It doesn't produce too many arguments (otherwise I'd catch this) but I
rewrote it as you advised because it's easier to properly indent.
Thank you,
Roman
[PATCH v2 2/2] maint.mk: Replace grep with $(GREP), Roman Bolshakov, 2018/12/03
Re: [PATCH v2 0/2] Fix syntax-check on macOS/FreeBSD, Bruno Haible, 2018/12/03