expanding TABs in indentation: the aftermath can be easy
From: Jim Meyering
Date: Fri, 11 Dec 2009 13:56:26 +0100
Jim Meyering wrote:
> Bruno Haible wrote:
>> What should I write in the NEWS file, about recommendations for people
>> who have patches on top of gnulib?
>
> We also need a way to keep things in order going forward.
> I.e., a syntax-check style rule that enforces this style.
>
> To that end, please prepare a file like the one below,
> to be committed along with your other changes,
> or as part of a subsequent change that enforces policy.
> I started based on your earlier outline.
>
> These are extended regular expressions that match
> any file that must retain TAB-based indentation.
> For now, let's not worry about TABs elsewhere.
> --------------------------
> # These contain Makefile snippets.
> ^modules/
>
> # The regex module is the only major source code for which we still
> # have bidirectional propagation between gnulib and glibc.
> ^lib/regcomp\.c$
> ^lib/regex\.[ch]$
> ^lib/regex_internal\.[ch]$
> ^lib/regexec\.c$
>
> # This is special.
> ^lib/.*\.charset$
>
> # This is a binary file.
> ^lib/.*\.class$
> --------------------------
>
>> What are the tricks?
>
> I'll try to post details tomorrow.
The first part is the "patch-xform" script below.
I'll put it in gnulib's build-aux soon.
For example, I've just used it in coreutils-with-latest-gnulib
to confirm that it can transform the two gl/lib/*.diff files that
no longer apply:
  cd coreutils/gl/lib &&
  for i in c h; do f=tempname.$i.diff; patch-xform $f > k && mv k $f; done
#!/usr/bin/perl
# Expand leading TABs in the context and modified lines of git unidiff patches.
# If --exclude=FILE is specified, do not modify the patches of any file whose
# name matches any of the perl regular expressions (one per line) in that file.
# The regular expressions are matched against each full, relative file name, as
# found in git unidiff headers, but without the typical "a/", "b/", etc. prefix.
# Here is a useful set of regular expressions:
#
# (?:^|\/)ChangeLog[^/]*$
# (?:^|\/)(?:GNU)?[Mm]akefile[^/]*$
# \.(?:am|mk)$
#
# Only lines to consider:
#
#   /^[ +-]/        modified and context lines, when in a diff
#
#   /^diff --git/   this is a git diff: ignore a/ and b/ file name prefix
#   /^--- (.*)/     use the file name in $1
#   /^\+\+\+ /      ignore
#
# Currently makes no attempt to detect the end of the final patch,
# so it may convert TABs to spaces on anything there that resembles
# a unidiff-context/modified line.
use strict;
use warnings;
use Text::Tabs;
use Getopt::Long;
(my $ME = $0) =~ s|.*/||;
my $VERSION = '0.1';
my $verbose;
sub usage ($)
{
  my ($exit_code) = @_;
  my $STREAM = ($exit_code == 0 ? *STDOUT : *STDERR);
  if ($exit_code != 0)
    {
      print $STREAM "Try `$ME --help' for more information.\n";
    }
  else
    {
      my $example_regexp = <<\EOF;
(?:^|\/)ChangeLog[^/]*$
(?:^|\/)(?:GNU)?[Mm]akefile[^/]*$
\.(?:am|mk)$
EOF
      print $STREAM <<EOF;
Usage: $ME [OPTIONS] [FILE]
Filter FILE (containing git unidiff output), expanding leading TABs
in the context and modified lines.

OPTIONS:

  --exclude=RE_FILE  if RE_FILE is specified, do not modify the patches of
                       any file whose name matches any of the perl regular
                       expressions (one per line) in that file.
  --help             display this help and exit
  --version          output version information and exit

With no FILE, or when FILE is -, read standard input.

Sample content for a RE_FILE:

$example_regexp
Be sure to exclude any binary files, e.g., .jpg, .pdf, etc. too.
EOF
    }
  exit $exit_code;
}
sub build_regexp ($)
{
  my ($file) = @_;
  # Read regexps from $file, one per line, 'OR' them together,
  # and wrap the result in (?:...).
  open IN, '<', $file
    or die "$ME: $file: cannot open for reading: $!\n";
  my @lines = <IN>;
  close IN;
  chomp @lines;
  # Skip blank lines: an empty alternative would match every file name.
  @lines = grep { /\S/ } @lines;
  my $re = join '|', @lines;
  return "(?:$re)";
}
{
  my $exclude_regexp_file;
  GetOptions
    (
     'exclude=s' => \$exclude_regexp_file,
     help => sub { usage 0 },
     verbose => \$verbose,
     version => sub { print "$ME version $VERSION\n"; exit },
    ) or usage 1;

  my $exempt_file_re;
  defined $exclude_regexp_file
    and $exempt_file_re = build_regexp $exclude_regexp_file;

  my $xform_tabs;
  while (defined (my $line = <>))
    {
      my $xformed;
      if ($line =~ /^--- [a-z]\/(.*)/)  # use the file name in $1
        {
          my $file_name = $1;
          $xform_tabs = (defined $exempt_file_re
                         ? $file_name !~ /$exempt_file_re/o
                         : 1);
          $verbose
            and warn "info: $file_name: " . ($xform_tabs ? 1 : 0) . "\n";
        }
      elsif ($line
             =~ /^(?:\@\@[ ]        # hunk header; this alternative is a
                                    # reconstruction of a garbled original
                  |(copy|rename)[ ]
                  |[ ]\d{6}$
                  |diff[ ]--git[ ]
                  |index[ ]
                 )
                /x)
        {
          # ignore
        }
      elsif ($line =~ /^(?:$|[ +-])/)
        {
          $verbose
            and warn "info: $.\n";
          # Process or not, depending on name.
          if ($xform_tabs)
            {
              $verbose
                and warn "info: $line\n";
              my $match = $line =~ /^([ +-])( *\t[ \t]*)(.*)/;
              print $match ? $1 . expand($2) . $3 . "\n" : $line;
              $xformed = 1;
              $verbose && $match
                and warn "info: MATCHED!\n";
            }
        }
      else
        {
          # warn "$ME: unrecognized line: $line\n";
          $xform_tabs = 0;
        }
      ! $xformed
        and print $line;
    }
}
END { # use File::Coda; # http://meyering.net/code/Coda/
  defined fileno STDOUT or return;
  close STDOUT and return;
  warn "$ME: failed to close standard output: $!\n";
  $? ||= 1;
}
# Local variables:
# indent-tabs-mode: nil
# End:
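To make the per-line rewrite concrete, here is a minimal shell/awk sketch of just that one substitution. It is not the script above: it skips the file-name tracking and the --exclude handling, and the name xform_lines is made up for illustration. It assumes 8-column tab stops, which is the Text::Tabs default. Note that the columns are counted from the character after the leading " ", "+" or "-" marker, as in the script, so a lone leading TAB always becomes eight spaces.

```shell
# Sketch of the per-line rewrite only; this is not the patch-xform script.
# xform_lines is a hypothetical name.  It reads unidiff lines on stdin.
xform_lines() {
  awk '
    {
      marker = substr($0, 1, 1)
      rest = substr($0, 2)
      # Leave the line alone unless it is a context, added or removed line
      # whose indentation contains a TAB.
      if (marker !~ /^[ +-]$/ || rest !~ /^ *\t/) { print; next }
      match(rest, /^[ \t]+/)
      blanks = substr(rest, 1, RLENGTH)
      tail = substr(rest, RLENGTH + 1)
      out = ""
      for (i = 1; i <= length(blanks); i++) {
        if (substr(blanks, i, 1) == "\t") {
          # Pad to the next multiple of 8 columns, counted after the marker.
          do { out = out " " } while (length(out) % 8 != 0)
        } else {
          out = out " "
        }
      }
      print marker out tail
    }'
}
```

For example, `printf '+\tfoo\n' | xform_lines` prints a "+" followed by eight spaces and "foo". TABs that appear after the first non-blank character are left untouched.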
You can do the same thing to a topic branch in git.
Here is pseudo-texinfo:
Let's assume that just after transforming @samp{master},
you tagged the result with @samp{tab}
and the changes you want to rebase are on the @samp{topic} branch.
With that, you would run these commands to rebase that branch:
@example
git checkout topic                                  [1]
git rebase tab^                                     [2]
git format-patch --stdout master \
  | patch-xform --exclude=leading-blank.exempt \
  > topic.xformed                                   [3]
git checkout -b topic2 tab                          [4]
git am topic.xformed                                [5]
git diff --ignore-space-change topic topic2         [6]
git branch -D topic                                 [7]
git branch -m topic2 topic                          [8]
git rebase master                                   [9]
@end example
Step 1 ensures that @samp{topic} is the current branch, which [2]
rebases onto @samp{tab^}, the change-set just before the problematic one.
The third step prints the patch series on @samp{topic}, filters it through
our patch-transforming script and saves the result in a temporary file.
Step 4 creates and makes current our temporary branch, @samp{topic2},
with its base at @samp{tab}, and [5] then applies the transformed
patch set to that new branch.
[6] is an optional cross-check to ensure that the only differences
between the two branches are safely ignorable.
Steps 7 and 8 clean up by removing the original @samp{topic} branch
and replacing it with the temporary one.
Finally, step 9 rebases our new branch to @samp{master}.
We can perform the same task more efficiently and concisely,
with the advantage of no temporary file, but perhaps at the
expense of readability, depending on your familiarity with
these @command{git} commands. You be the judge:
@example
git rebase tab^ topic                               [a]
git checkout -b topic2 tab                          [b]
git format-patch --stdout master..topic \
  | patch-xform --exclude=leading-blank.exempt \
  | git am                                          [c]
git diff --ignore-space-change topic topic2         [d]
git branch -D topic                                 [e]
git branch -m topic2 topic                          [f]
git rebase master                                   [g]
@end example
Step [a] combines [1] and [2], since there is no need to change the
current branch.
Since [c]'s use of @samp{git am} will modify the current branch
(contrast with [3], which just writes a temporary file),
step [b] must first create and switch to the destination branch, @samp{topic2}.
Step [c] forms the patch series for everything on the @samp{topic} branch,
filters it through our @command{patch-xform} script, and applies the
result to the current branch via @command{git am}.
The remaining steps, [d] through [g], are identical to [6] through [9].
However, none of the above qualifies as ``easy enough''
for most people. There are too many variables and interdependencies.
Note that [a] and [g] may evoke merge conflicts, so they delineate
the non-interactive core: [b] through [f].
Even for so few steps, there are four inputs:
@itemize
@item @var{P} parent branch name [master]
@item @var{T} tag marking the transition point on @var{P} [tab]
@item @var{B} name of branch to move [topic] (forked off of @var{P}
prior to @var{T})
@item file name blacklist: [leading-blank.exempt]
@end itemize
@c note that the list of branch names from "git branch --contains @var{T}"
@c must include @var{P}
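With the four inputs named, the non-interactive core can be wrapped in a single shell function. What follows is only a sketch: the function name move_branch, the XFORM override, and the choice of @var{B} with a "2" appended as the temporary branch name are all made up for illustration, and patch-xform is assumed to be on your PATH. Unlike the optional cross-check in [d], this version refuses to delete @var{B} if the whitespace-insensitive diff prints anything, since the deletion that follows is automatic.

```shell
# Hypothetical wrapper around the non-interactive core of the procedure.
# Usage: move_branch P T B EXEMPT_FILE
# XFORM names the filter; it defaults to patch-xform, assumed to be on PATH.
# POSIX sh has no local variables, so these names are global.
move_branch() {
  mb_parent=$1 mb_tag=$2 mb_branch=$3 mb_exempt=$4
  mb_tmp=${mb_branch}2            # temporary branch, like topic2 above
  mb_xform=${XFORM:-patch-xform}

  # Create the temporary branch at the tag, then apply the filtered series.
  # Only the exit status of "git am" is checked for this pipeline.
  git checkout -b "$mb_tmp" "$mb_tag" &&
  git format-patch --stdout "$mb_parent..$mb_branch" \
    | "$mb_xform" --exclude="$mb_exempt" \
    | git am || return 1

  # Cross-check: stop before deleting anything if the two branches
  # differ by more than the amount of white space.
  if [ -n "$(git diff --ignore-space-change "$mb_branch" "$mb_tmp")" ]; then
    echo "move_branch: $mb_branch and $mb_tmp differ; leaving both" >&2
    return 1
  fi

  # Replace the original branch with the transformed one.
  git branch -D "$mb_branch" &&
  git branch -m "$mb_tmp" "$mb_branch"
}
```

You would still run the two rebases by hand, before and after, since those are the steps that may stop for merge conflicts:
`git rebase tab^ topic`, then `move_branch master tab topic leading-blank.exempt`, then `git rebase master`.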
You can also think of the type of transformation as an input:
trailing-blank-removal or leading-TAB-to-space, or even both.
If you make that the fifth input, verify that @var{T} contains
only changes implied by this type.
Actually, there's an even better way:
automatically derive the type from @var{T}'s change set.
If this command prints no changes, then @var{T} is a trailing-blank-removal
delta:
@example
git diff --ignore-space-at-eol T^..T
@end example
Otherwise, if @var{T}'s delta transforms @kbd{TAB}s to spaces in indentation,
this command will print no diffs:
@example
git diff --ignore-space-change T^..T
@end example
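Those two checks can be chained into a small helper. This is a rough sketch under a few assumptions: the name classify_delta and its output strings are invented here, and the second test only shows that the delta changes nothing but the amount of white space, which is consistent with a leading-TAB-to-space conversion but does not prove it. The order matters, because --ignore-space-change also ignores white space at end of line.

```shell
# Hypothetical helper: guess the type of the change set tagged T.
# Usage: classify_delta T
classify_delta() {
  cd_tag=$1
  # Fail early if T has no parent to compare against.
  git rev-parse --verify --quiet "$cd_tag^" > /dev/null || {
    echo "classify_delta: cannot resolve $cd_tag^" >&2
    return 2
  }
  if [ -z "$(git diff --ignore-space-at-eol "$cd_tag^..$cd_tag")" ]; then
    echo trailing-blank-removal
  elif [ -z "$(git diff --ignore-space-change "$cd_tag^..$cd_tag")" ]; then
    # Only the amount of white space changed; assumed to be indentation.
    echo leading-TAB-to-space
  else
    echo other
  fi
}
```

With the tag from the examples above, `classify_delta tab` prints one of the three words, and anything other than the type you expected means @var{T} contains more than the mechanical change.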