Re: [lmi] VCS caching


From: Vadim Zeitlin
Subject: Re: [lmi] VCS caching
Date: Fri, 13 Apr 2018 16:14:45 +0200

On Fri, 13 Apr 2018 11:32:43 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2018-04-11 13:01, Vadim Zeitlin wrote:
GC> > On Wed, 11 Apr 2018 12:46:59 +0000 Greg Chicares 
GC> > <address@hidden> wrote:
GC> [...]
GC> > GC> Of course, if you already know the best way to achieve the
GC> > GC> goal I'm trying to express, that spoiler would be welcome.
GC> > 
GC> > I'm afraid I don't even see the problem
GC> 
GC> I needed to learn more about git and explore various approaches in order to
GC> understand why there actually isn't any problem.

 I'm not sure about this. I think you had to do this to prove that there
wasn't another solution that would be significantly better, but IMHO this
wasn't really necessary to check that this one worked, which would have
required less effort.

GC> First, although I already could read 'install_wx.sh', I wouldn't have known
GC> how to write it--I didn't understand why you used git in one particular way
GC> when there are other ways. And I could see numerous ways to speed it up by
GC> caching a long-lived local bare repository, but didn't know how to choose
GC> among them.

 Just to be clear, I did the simplest thing I could think of which still
allowed using a local mirror of the repository. I didn't even try to find
the best solution, but just chose one that was good enough and made it as
simple as I could.

GC> For example...
GC> 
GC> It seemed weird to iterate through `git submodule status` when git provides
GC> powerful commands that would appear to do that for us, e.g.:
GC> 
GC>   git clone --depth 1 --shallow-submodules --recurse-submodules \
GC>     file:///cache_for_lmi/vcs/wxWidgets
GC> 
GC> But that command clones submodules from github,

 Yes, I don't think there is any way to tell git-clone to get submodules
from anywhere other than the URLs in the .gitmodules file of the repository
being cloned, which is why I had to split the process in two: clone the
super repository first and then the submodules. Of course, this is what
"git clone --recurse-submodules" does internally anyhow, so it's not such a
big deal.

GC> even though it doesn't even
GC> mention github...because '.gitmodules' does. I tried using 'git config' to
GC> reset my local mirror, e.g.:
GC>   git config submodule.src/png.url /cache_for_lmi/vcs/libpng.git
GC> but I discovered that '.gitmodules' governs--as I suppose it must, because
GC> different wx SHA1s may use different submodule SHA1s--so that's why it makes
GC> sense that '.gitmodules' file is in the repository rather than in $GIT_DIR.

 The latter makes sense precisely because you wouldn't be able to clone a
repository with the submodules otherwise. The former is not really correct,
however: the submodule settings in .git/config take precedence over
.gitmodules, and this is why the "git config submodule.$subpath.url" command
in install_wx.sh actually works, i.e. the submodule will be cloned from the
given URL once it has been set. Note, however, that the submodule URL can
only be configured like this before the submodule is initialized using "git
submodule init" or "git submodule update --init", so it can't work _after_
"git clone --recurse-submodules", which initializes all the submodules.
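
 The ordering described above can be sketched as follows. This is a minimal,
self-contained demonstration on throwaway repositories, not the actual
contents of install_wx.sh: the repository names (upstream_sub,
mirror_sub.git, super, build) and the src/sub path are hypothetical, and
protocol.file.allow is set only because recent git versions may refuse
local-path submodule clones by default.

```shell
#!/bin/sh
# Sketch: redirect a submodule to a local mirror *before* it is initialized.
# All names below are hypothetical; nothing here is taken from install_wx.sh.
set -e
work=$(mktemp -d)
cd "$work"

# Fixed identity so that commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# 1. An "upstream" submodule repository and a bare local mirror of it.
git init -q upstream_sub
git -C upstream_sub commit -q --allow-empty -m "sub: initial"
git clone -q --bare upstream_sub mirror_sub.git

# 2. A super repository whose .gitmodules records the upstream URL.
git init -q super
git -C super -c protocol.file.allow=always \
    submodule --quiet add "$work/upstream_sub" src/sub
git -C super commit -q -m "super: add submodule"

# 3. Clone the super repository only, override the URL in .git/config,
#    and only then initialize and update the submodule.
git clone -q super build
git -C build config submodule.src/sub.url "$work/mirror_sub.git"
git -C build -c protocol.file.allow=always submodule --quiet update --init

# The submodule's origin shows where it was actually cloned from.
url=$(git -C build/src/sub config remote.origin.url)
echo "$url"
```

 If the override took effect, the printed URL ends in mirror_sub.git rather
than upstream_sub.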


GC> Even if I split the command above into two steps--clone wx, then handle the
GC> submodules with a single 'git submodule update --recursive' command, it
GC> didn't work properly.

 I'm not sure what exactly didn't work, but this looks very similar to what
install_wx.sh does, so it really ought to work if done in the right order.

GC> This git modification seemed promising:
GC>   https://github.com/git/git/commit/9671a76c174d9bd2b4f56243526fda51f9ff8e46

 I might be missing something, but I don't think it's related at all.

GC> Because this is the only way to learn git and I was already engaged in it, I
GC> decided to explore further. Running 'install_wx.sh' a couple days ago, I
GC> measured this disk usage:
GC>   $du -sh /opt/lmi/vcs/wxWidgets
GC>   767M    /opt/lmi/vcs/wxWidgets
GC> and I remember that getting everything from github took several minutes.
GC> With that as the baseline, how might I most effectively use a local bare
GC> repository?

 This is a question I didn't even ask because for me the only important
overhead to avoid is that of doing a remote clone. Cloning a local
repository is fast enough to not matter.

GC> I realize you're suggesting we use a local non-bare repository and
GC> build there, but first I want to explore the idea of using a cached
GC> bare repository as a drop-in replacement for github--i.e.,
GC>   wx_git_url="/cache_for_lmi/vcs/wxWidgets.git" install_wx.sh

 FWIW this is exactly how I use the script myself, except that I use a
repository on another machine on the LAN because cloning over the LAN is
still fast enough, even if it's slower than using the local file system, of
course.

GC> How about '--shared'? IOW:
GC>   git clone --shared "$wx_git_url" ${wx_dir##*/}
GC> I don't think '--shared' is risky at all for our use case: this is a
GC> throwaway clone, used only for building a particular version of wx, one
GC> time only. The disk savings are the same as '--reference' above (which
GC> makes sense because one implies the other):
GC>   410M    /cache_for_lmi/vcs
GC>   2.1M    /tmp/vcs/wxWidgets/.git
GC>   215M    /tmp/vcs/wxWidgets
GC> and it takes only 2.670 s. This seems like a win.

 Yes, I think we lose nothing by adding the --shared option to
install_wx.sh, as it will simply be ignored when the source repository is
not local.
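
 For illustration, here is a minimal sketch of what a --shared clone of a
local bare repository looks like on disk. It uses throwaway repositories
whose names (seed, cache.git, scratch) are hypothetical and stand in for the
cached wxWidgets mirror and the build directory.

```shell
#!/bin/sh
# Sketch: a --shared clone borrows objects from the source repository
# instead of copying them. Repository names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Build a small bare repository standing in for the local cache.
git init -q seed
echo hello > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "initial"
git clone -q --bare seed cache.git

# Throwaway clone that shares the cache's object store.
git clone -q --shared "$work/cache.git" scratch

# The clone records the borrowed object directory in its alternates file.
alternates=$(cat scratch/.git/objects/info/alternates)
echo "$alternates"
```

 The alternates file is the reason such a clone is only safe while the
cache keeps its objects, which matches the throwaway use described above.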

GC> Use '--separate-git-dir'?
GC>   $git init --separate-git-dir /cache_for_lmi/vcs/wxWidgets.git \
GC>     /tmp/vcs/wxWidgets

 Hmm, I don't think you want to do this: I've never used this option of
git-init before, but from its description, it doesn't create a new
repository at all, so why would you use it here?

 If you just want a new working tree, you should use git-worktree instead.
This should be the most space- and time-efficient way to do it, but it
won't work with remote repositories, so install_wx.sh would have to be made
more complicated to use git-worktree for local repositories and the current
code for remote ones. IMHO it's not worth it when the current version works
reasonably well in both cases.
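
 A minimal sketch of the git-worktree approach, assuming a local bare cache,
might look like this. The repository and directory names (seed, cache.git,
scratch) are hypothetical, and this is not what install_wx.sh currently
does.

```shell
#!/bin/sh
# Sketch: check out a build tree from a local bare repository with
# git-worktree, without cloning. All names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Build a small bare repository standing in for the local cache.
git init -q seed
echo hello > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "initial"
git clone -q --bare seed cache.git

# Add a detached working tree at the cache's HEAD commit; in real use
# the last argument would be the specific commit to build.
git -C cache.git worktree add --detach "$work/scratch" HEAD >/dev/null 2>&1

# The new tree contains the files, and its .git is a small pointer file
# referring back to the cache rather than a full repository.
ls scratch
```

 When the build tree is no longer needed, it can be deleted and the stale
entry cleaned up with "git worktree prune" run in the cache repository.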

GC> And it seems that, after much learning, I've come to the right answer
GC> (and now I understand why it's right):
GC> 
GC> > if the idea of using /srv/cache_for_lmi/vcs/wxWidgets
GC> > (and xxx.git alongside it for the submodules) and building in some other
GC> > directory (/opt/lmi/local/wx-scratch or wherever, it really doesn't
GC> > matter) is acceptable, then this is certainly what we should do.
GC> 
GC> Yup!
GC> 
GC> >  I don't know if I should make a patch for this as the changes seem so
GC> > trivial, but please let me know if I should.
GC> 
GC> No--if I had said 'yes', I'd have learned nothing: it's trivial, but only
GC> in retrospect.
GC> 
GC> But please tell me if there's a better technique than '--separate-git-dir'.

 I think git-worktree is the answer to this question. At the very least,
it's much clearer and simpler to use -- and I think it should be more
efficient too, but I didn't measure it.

 Regards,
VZ

