Hosting Git repositories: how useful will git-gc be?
I'm helping my sysadmin to set up some Git repository hosting via
gitosis. I'm trying to keep it as simple as possible.
A question: is it necessary, recommended, or useless to set up a cron job
that runs "git gc" in each repository? My understanding is that a push
over ssh does some packing; is that correct? Does receiving a pack
trigger a "git gc --auto"?
Re: Hosting Git repositories: how useful will git-gc be?
On Thu, Sep 03, 2009 at 11:45:25AM +0200, Matthieu Moy wrote:
> A question: is it necessary, recommended, or useless to set up a cron job
> that runs "git gc" in each repository? My understanding is that a push
> over ssh does some packing; is that correct? Does receiving a pack
> trigger a "git gc --auto"?
The objects are transferred as a pack. If the number of objects is less
than receive.unpackLimit (default 100), then they are unpacked to loose
objects. Otherwise we keep the pack, after adding any missing delta bases
that a thin pack refers to.
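
To make the threshold concrete, here is a small sketch of the rule described above. It only imitates the decision and is not Git's actual code; the function name push_storage is made up for illustration, and treating a push of exactly the limit as a kept pack is my reading of "less than".

```shell
# Illustrative only: mimics the unpack-or-keep decision, not Git's own code.
# Usage: push_storage NUM_OBJECTS [UNPACK_LIMIT]
push_storage() {
    num_objects=$1
    unpack_limit=${2:-100}   # 100 is the default mentioned for receive.unpackLimit
    if [ "$num_objects" -lt "$unpack_limit" ]; then
        echo loose           # small push: objects are exploded into loose files
    else
        echo pack            # larger push: the received pack is kept as-is
    fi
}
```

In a real repository you can check whether the limit has been overridden with `git config receive.unpackLimit` (it prints nothing if the setting is unset).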
So if you tend to push frequently, you will end up with a lot of loose
objects. Even if you have packs, they will be larger than necessary
because you will be missing deltas between objects across packs. And of
course you will eventually end up with a large number of packs, which is
less efficient (each pack has an index, but I believe we search the
packs linearly).
Receiving a pack does not (AFAICT looking at the code) trigger a "gc
--auto". Running it has other benefits, too, like pruning cruft and
packing refs. So I think it is probably a good idea to run it
periodically.
Running it daily or weekly is probably reasonable. You could run it on
every push using the post-update hook, but that may cause excessive I/O
for very little benefit.
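
For the periodic route, something along these lines might do as a starting point. It is a rough sketch rather than a tested setup: it assumes all repositories are bare directories named *.git directly under one parent directory (which you pass as the first argument), and the GIT override variable and the function name are made up here for dry runs.

```shell
#!/bin/sh
# Sketch: run "git gc" in every bare repository directly under one directory.

gc_all_repos() {
    root=$1
    for repo in "$root"/*.git; do
        [ -d "$repo" ] || continue   # skip non-directories and an unmatched glob
        # GIT is a hypothetical override, e.g. GIT=echo prints instead of running
        "${GIT:-git}" --git-dir="$repo" gc --quiet ||
            echo "gc failed: $repo" >&2
    done
}

# Only act when a repository root is given, so the file can be sourced safely.
if [ "$#" -gt 0 ]; then
    gc_all_repos "$1"
fi
```

A weekly crontab entry for the user that owns the repositories could then look like `0 3 * * 0 /usr/local/bin/git-gc-all.sh /home/git/repositories`, where both paths are placeholders for wherever you install the script and wherever gitosis keeps its repositories.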