Re: [lmi] Generating pages with tables in the new PDF generation code


From: Vadim Zeitlin
Subject: Re: [lmi] Generating pages with tables in the new PDF generation code
Date: Mon, 28 Aug 2017 15:33:57 +0200

On Sun, 27 Aug 2017 23:37:25 +0000 Greg Chicares <address@hidden> wrote:

GC> I fear we're talking past each other. Perhaps we first need to
GC> establish a definition of 'merge'.

 Yes, sorry, I should have been clearer. For me, a merge means integrating
the changes but, of course, not pushing them into production immediately.
In fact, you could even run "git merge" without committing the result, so
you would have an opportunity to review, test and tweak the changes before
doing "git commit", let alone "git push".

 But, skipping ahead, I don't think such a "git merge --no-commit" + local
changes + "git commit" workflow is the best idea in our case; I'll propose
something better (IMHO) below.
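
 For reference only, here is a minimal sketch of that "merge without
committing" workflow in a throwaway repository. The file and branch names
are made up for the example. Note that --no-ff is added because a merge that
can fast-forward creates no merge commit, so --no-commit alone could not
stop it.

```shell
set -eu
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q . && git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"

echo "base" > file.txt
git add file.txt && git commit -q -m "Initial commit"

# Somebody else's work, on a separate branch.
git checkout -q -b topic
echo "topic change" >> file.txt
git commit -q -am "Change made on the topic branch"

# Integrate it into master, but stop before the merge commit is created.
git checkout -q master
git merge --no-ff --no-commit topic

# Review what is about to be committed, then tweak it locally.
git diff --cached --stat
echo "reviewer tweak" >> file.txt
git add file.txt

# Only now is the merge recorded (and it is still not pushed anywhere).
git commit -q -m "Merge topic with local tweaks"
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "merge commit has $parent_count parents"
```

The drawback is visible in the result: the reviewer's tweak is folded into
the merge commit itself rather than recorded as a separate commit.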


GC> To me, 'merge' means integration into the production system, which
GC> is HEAD: the tip of master, which is the one and only branch.
GC> Almost no proposed change goes directly into production, no matter
GC> who the author may be. If I'm the author, then I make changes in a
GC> local working copy, where they are reworked again and again until
GC> they are ready to commit; at that point, I mv them into a separate
GC> ../stash directory, then do
GC>   git ls-files --deleted | xargs git checkout --
GC> to start with an unmodified copy of HEAD, and then create commits
GC> that put the changes into production. That final "create commits"
GC> step may sound like repetitive busywork,

 Unfortunately it's much worse than this. It's fundamentally incompatible
with the normal way of using git, because it means that all the commits that
finally go into master are different from the commits that I had originally,
and hence merging my local branch with master later would result in
conflicts for *all* the changes. Git can often recognize that a commit with
a different SHA-1 making the same changes is equivalent to the original
commit (although this is not nearly as fast or convenient as actually using
the same commit), but your commits often differ from mine in some minor
(and sometimes not so minor) ways, which prevents even this from working.
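
 To illustrate the difference between "same change, different SHA-1" and
"almost the same change", here is a small sketch using "git cherry", which
compares commits by the patch they introduce. The repository, files and
commit messages are invented for the example.

```shell
set -eu
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q . && git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"

echo "line 1" > a.txt && echo "line 1" > b.txt
git add a.txt b.txt && git commit -q -m "Initial commit"

# The original commits, made on a local branch.
git checkout -q -b topic
echo "line 2" >> a.txt && git commit -q -am "Extend a.txt"
echo "line 2" >> b.txt && git commit -q -am "Extend b.txt"

# Meanwhile master moves on, and the same work is re-created there.
git checkout -q master
echo "unrelated" > c.txt && git add c.txt && git commit -q -m "Unrelated work"

# First change re-created exactly: same patch, but a new SHA-1.
git cherry-pick topic~1 > /dev/null
# Second change re-created with a small difference.
echo "line 2, reworded" >> b.txt && git commit -q -am "Extend b.txt (reworded)"

# "-" marks topic commits with an equivalent patch already in master,
# "+" marks topic commits for which no equivalent patch was found.
cherry_output=$(git cherry master topic)
printf '%s\n' "$cherry_output"
```

In this toy case the exactly re-created commit is recognized as equivalent,
while the slightly modified one is not, so git still sees it as missing
from master.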

 In practice, when I tried merging my local group-quotes branch with the
changes that finally ended up in master, I had to abandon the effort after
spending an hour on it and realizing that I would need many more hours to
finish. Of course, I didn't actually need to preserve that branch, because
the work had been finished by the time it was merged into master, so I
could just drop it, which is what I ended up doing. But if the same thing
happens now, when you integrate-but-not-really-merge my half-finished
changes, it would be much worse because I would have to reconcile the
changes done in my branch with yours.

 So I'd really like to avoid this. And I think it can be done while
respecting your goals, which, AFAICS, are all nicely summarized in these
lines (please let me know if I'm forgetting anything):

GC> I certainly don't want to examine a series of individual commits.
GC> I do want to scrutinize every change that goes into production.
GC> Often I'll want to make modifications of my own

I'll refer to them as GC{1,2,3} below. My goals would be:

VZ1: Submit changes in several parts rather than all at once. I put this as
     a goal because I believe that this would allow you to integrate the
     changes faster and would be more efficient for both of us than doing
     it all at once.

VZ2: Be able to continue working while you're reviewing the changes already
     submitted for integration. This implies being able to easily
     re-integrate your changes before submitting the next bunch of mine for
     review. This, in turn, crucially depends on you using "git merge" to
     preserve the original commits instead of "git am" or, worse, redoing
     almost-but-not-quite-the-same commits manually.

VZ3: Keep the changes history even when everything is integrated. This is
     not crucial, but can be nice when returning to the code later and we
     basically get it for free anyhow if (VZ2) is satisfied.


 And here is a high-level description of the approach that would, I
believe, satisfy all of these goals:

0. You would do the integration work on a branch in your repository.
   This is necessary because of (VZ1) as you simply can't put any part of
   the new PDF generation code in master yet, as it's incomplete. I'll call
   this branch "direct-pdf-gen" because it's really the same one, but if
   this is too confusing you could give it a different name, e.g.
   "greg-integration-branch-for-direct-pdf-gen" or whatever. Just notice
   that this will be a public branch, i.e. you will have to push it to
   Savannah so that I can access it as well, so it should have a
   meaningful name and not, e.g., "ZOMG-PONIES!", to take a random example.

1. I would produce branches with changes ready (in my opinion) to be
   integrated. E.g. the first such branch might contain just the
   refactoring of the group quote generation code. We could use Github PR
   machinery for this because it's really convenient, but it is not at all
   necessary, the important thing is that I'll tell you, by posting here,
   "Greg, please get the changes from such-and-such-branch and integrate
   them".

2. At some later time (crucially, I'll be continuing to work on my branch
   in the meantime), you would integrate such-and-such-branch into your
   direct-pdf-gen integration branch. While doing it you will undoubtedly
   find things you want to change, and you would do it in one of 2 ways:

   (a) Either you would work on my such-and-such-branch and add your own
       commits fixing (or even undoing or rewriting) my changes in this
       branch. And then you would run "git merge such-and-such-branch" from
       the direct-pdf-gen branch to merge the changes.

   or

   (b) You could do "git merge such-and-such-branch" into direct-pdf-gen
       branch first and then commit your own changes on top of it.

   Personally the latter seems a bit simpler to me, but if you prefer the
   former it's not a problem at all. The critically important things are
   that you do not modify the existing commits (changing the code modified
   by them in a later commit doesn't count as "modifying existing commit")
   and you use "git merge" to bring them into your integration branch.

3. After finishing with integration you would push the direct-pdf-gen
   branch to Savannah so that I can see your changes (and maybe post about
   them here). I would then get them and merge them back into my own
   direct-pdf-gen branch which, by now, contains plenty of changes made
   since such-and-such-branch was created. And the magical thing is that
   this will allow me to get all your changes without conflicts, and when I
   need to submit my next bunch of changes for integration, it will be
   trivially simple for me to do it.

   Of course, if any of your changes done during the integration process
   conflict with the changes I did in private on my own branch at the same
   time, I would still get conflicts and would have to resolve them. But
   there will be very few, if any, of them.

4. Go back to step (1) if any changes remain.
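
 The steps above can be sketched in a single throwaway repository. This is
only an illustration under simplifying assumptions: there are no remotes, so
the two sides use differently named branches ("integration" for yours and
"author-work" for mine, both hypothetical names), and the file contents are
invented.

```shell
set -eu
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q . && git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"

echo "production code" > quote.txt
git add quote.txt && git commit -q -m "Initial commit on master"

# Step 0: the integration branch, separate from master.
git branch integration

# Step 1: the author prepares a branch that is ready for integration...
git checkout -q -b author-work
echo "refactoring" > refactor.txt
git add refactor.txt && git commit -q -m "Refactor group quote generation"
git branch such-and-such-branch
# ...and keeps working in the meantime.
echo "unfinished feature" > feature.txt
git add feature.txt && git commit -q -m "Start the next feature"

# Step 2, variant (b): merge first, then commit changes on top.
git checkout -q integration
git merge -q --no-ff -m "Merge such-and-such-branch" such-and-such-branch
echo "refactoring, reviewed" > refactor.txt
git commit -q -am "Integration tweaks to the refactoring"

# Step 3: the author merges the integration branch back.
git checkout -q author-work
git merge -q -m "Merge integration branch back" integration

# The author now has both the reviewer's tweak and the unfinished work.
cat refactor.txt feature.txt
```

Because the original commit is preserved by the merge, the merge back in
step 3 only has to consider the reviewer's later commit, and in this toy
case it completes without conflicts.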


 Now this may seem horribly complicated to you, but please trust me that it
isn't at all, and that doing it like this will be much simpler than what you
have been doing so far, while still satisfying:

GC1: Because you would only look at the changes done (or about to be done)
     by "git merge" instead of individual commits.

GC2: Because you will still be able to look at every change and it will be
     simpler for you to do it because there will be fewer changes in each
     round of them.

GC3: Because you will be able to change whatever you like, all that changes
     is that you will commit your changes as separate commits instead of
     bundling them together with the preceding ones.
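
 As a sketch of what "looking at the changes done by the merge" could mean
in practice, the three-dot form of "git diff" shows the combined effect of
a branch since it diverged, regardless of how many commits it contains. The
branch "integration" and the files below are hypothetical.

```shell
set -eu
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q . && git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"

echo "v1" > report.txt
git add report.txt && git commit -q -m "Initial commit"
git branch integration

# Several small commits on the branch submitted for integration.
git checkout -q -b such-and-such-branch
echo "v2" > report.txt && git commit -q -am "First attempt"
echo "v3" > report.txt && git commit -q -am "Second attempt"
echo "helper" > helper.txt
git add helper.txt && git commit -q -m "Add helper"

git checkout -q integration
# Number of individual commits the merge would bring in.
commit_count=$(git rev-list --count integration..such-and-such-branch)
# The same changes seen as a single diff against the common ancestor.
changed_files=$(git diff --name-only integration...such-and-such-branch)
git diff --stat integration...such-and-such-branch
echo "$commit_count commits, reviewed as one diff"
```

Dropping --stat shows the full patch, which is what would be scrutinized
before (or after) running "git merge".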


 Does this look reasonable to you? If so, I can describe all the git
commands you will need in excruciating detail and will try to create a
first branch with the changes to be integrated, to test the process.


GC> >  BTW, here is how I would review these changes:
GC> > 
GC> >   $ git log -p ..direct-pdf-gen -- group_quote_pdf_gen_wx.cpp
GC> > 
GC> > This command would show all the commits in the branch affecting this file
GC> > with the changes done by each commit to this file (only), due to -p
GC> > (--patch) option. Notice that this is not something you can do without git,
GC> > if you just diff the 2 versions of this file, you get a diff with
GC> > 185 insertions and 371 deletions, i.e. a huge number of changes which are
GC> > not easy to understand. Individual commits shown by the command above are
GC> > much simpler.
GC> 
GC> That doesn't quite work here, probably because of '--single-branch' below,

 Yes, sorry, my command above implicitly assumed that you had done "git
fetch" that I had mentioned before to get direct-pdf-gen into your
repository.

 And, I'm sorry to insist, but there is really no need to have different
clones, i.e. multiple git repositories. You can have them, of course, but
it's just completely useless and potentially confusing. Notice that you can
have multiple working directories associated with the same repository, but
I don't even see the need for this in lmi's case. Switching between branches
is almost magically fast with Git (well, at least when not using Cygwin...).
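
 For completeness, here is a minimal sketch of a second working directory
attached to the same repository, using "git worktree" (available in
reasonably recent git versions). The paths and file contents are
hypothetical.

```shell
set -eu
# Throwaway directories; every name below is hypothetical.
base=$(mktemp -d)
git init -q "$base/lmi" && cd "$base/lmi"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"

echo "master version" > file.txt
git add file.txt && git commit -q -m "Initial commit"

git checkout -q -b direct-pdf-gen
echo "branch version" > file.txt && git commit -q -am "Work on the branch"
git checkout -q master

# A second working directory sharing the same repository (no second clone).
git worktree add "$base/pdf" direct-pdf-gen > /dev/null 2>&1

main_content=$(cat file.txt)
second_content=$(cat "$base/pdf/file.txt")
echo "main directory: $main_content; second directory: $second_content"
```

Both directories share one object database and one set of branches, so
nothing needs to be fetched or pushed between them.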

GC> so I'll copy and paste everything I've done because it's so short:
GC> 
GC> /home/greg[0]$schroot --chroot=cross-lmi
GC> /home/greg[0]$ls -di /
GC> 19005443 /
GC> /home/greg[0]$cd /opt/lmi/
GC> /opt/lmi[0]$mkdir pdf
GC> /opt/lmi[0]$cd pdf
GC> /opt/lmi/pdf[0]$git clone https://github.com/vadz/lmi.git --branch direct-pdf-gen --single-branch
GC> /opt/lmi/pdf[0]$cd lmi
GC> /opt/lmi/pdf/lmi[0]$rm vcx *.sln *.bkl .gitignore
GC> /opt/lmi/pdf/lmi[0]$git log -p ..direct-pdf-gen -- group_quote_pdf_gen_wx.cpp |wc
GC>       0       0       0
GC> 
GC> Is it obvious to you what I should change to make this work? I'd
GC> guess that
GC> - git clone https://github.com/vadz/lmi.git --branch direct-pdf-gen --single-branch
GC> + git clone https://github.com/vadz/lmi.git
GC> should suffice, but that's just a guess.

 Yes, this would work because then you would have both master and
direct-pdf-gen branches and "..direct-pdf-gen" above, which is the same as
"HEAD..direct-pdf-gen", which is the same as "master..direct-pdf-gen" when
you're on master (which is tautologically equivalent to saying that HEAD
points to master), would show you all the commits that are in
direct-pdf-gen branch but not in master.
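
 The equivalence of these three spellings can be checked in a throwaway
repository, as in the sketch below (file names and commit messages are made
up). It also shows why the range is empty when the current branch is
direct-pdf-gen itself, which I assume is what happened in the
single-branch clone.

```shell
set -eu
# Throwaway repository; every name below is hypothetical.
work=$(mktemp -d) && cd "$work"
git init -q . && git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"

echo "v1" > file.txt
git add file.txt && git commit -q -m "Initial commit"

git checkout -q -b direct-pdf-gen
echo "v2" > file.txt && git commit -q -am "First branch commit"
echo "v3" > file.txt && git commit -q -am "Second branch commit"

# On master, the three spellings select the same commits.
git checkout -q master
implicit=$(git rev-list ..direct-pdf-gen)
with_head=$(git rev-list HEAD..direct-pdf-gen)
with_master=$(git rev-list master..direct-pdf-gen)
printf '%s\n' "$implicit"

# On direct-pdf-gen itself, HEAD is the branch tip, so the range is empty.
git checkout -q direct-pdf-gen
on_branch=$(git rev-list ..direct-pdf-gen)
echo "commits listed while on the branch: [$on_branch]"
```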

 However what would work even better would be just getting this branch into
your existing repository. You can do it using the command in my previous
reply but if you intend to update it often, it's better to configure a "git
remote", i.e. a remote repository, for it like this (the first "vadz" is
the name of the remote and you can use whatever you like instead):

        $ git remote add vadz https://github.com/vadz/lmi.git

Then you will be able to do

        $ git fetch vadz

to get all the latest changes from this repository. And then you could do,
assuming you're on master,

        $ git log -p ..vadz/direct-pdf-gen -- group_quote_pdf_gen_wx.cpp

to see all the changes on this branch affecting group_quote_pdf_gen_wx.cpp.
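
 The same sequence can be tried safely in throwaway directories before
touching a real repository. In the sketch below, local directories stand in
for the real repositories (they are assumptions of the example, not the
actual URLs), and --oneline replaces -p to keep the output short.

```shell
set -eu
base=$(mktemp -d)

# Local stand-in for the official repository (hypothetical).
git init -q "$base/official" && cd "$base/official"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.invalid"
echo "v1" > group_quote_pdf_gen_wx.cpp
echo "v1" > other.cpp
git add . && git commit -q -m "Initial commit"

# Local stand-in for the repository holding direct-pdf-gen (hypothetical).
git clone -q "$base/official" "$base/vadz" && cd "$base/vadz"
git config user.name "Example" && git config user.email "example@example.invalid"
git checkout -q -b direct-pdf-gen
echo "v2" > group_quote_pdf_gen_wx.cpp && git commit -q -am "Refactor, part 1"
echo "v2" > other.cpp && git commit -q -am "Touch another file"
echo "v3" > group_quote_pdf_gen_wx.cpp && git commit -q -am "Refactor, part 2"

# The existing clone: add the remote once, then fetch whenever needed.
git clone -q "$base/official" "$base/greg" && cd "$base/greg"
git remote add vadz "$base/vadz"
git fetch -q vadz

# Commits on the remote branch, not in master, that touch this one file.
git log --oneline ..vadz/direct-pdf-gen -- group_quote_pdf_gen_wx.cpp
commits_for_file=$(git rev-list --count ..vadz/direct-pdf-gen -- group_quote_pdf_gen_wx.cpp)
all_commits=$(git rev-list --count ..vadz/direct-pdf-gen)
echo "$commits_for_file of $all_commits commits touch the file"
```

Fetching only updates the remote-tracking branch vadz/direct-pdf-gen, so
master and the working directory are left untouched.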

 But, again, please don't do this yet, I will prepare a branch with the
changes ready for integration later, but I haven't had time to do it yet.


GC> I'm not sure a separate branch for a subset of changes is needed:
GC> a list of files might do as well, once we arrive at a common
GC> definition of 'merge'.

 I'm afraid discussing changes in the context of sets of files is just a
fundamentally wrong approach which is incompatible with the goals above. We
really need to work in the units of commits, i.e. branches.


 Sorry for an awfully long post but it would be very nice to have a proper
integration process in place, so I believe it's worth spending some time on
this, even if it probably is rather infuriating right now.

 Thanks for reading,
VZ

