|
I have a repository with 5 years' worth of history. I only want to keep
1 year, so I want to purge the first 4 years. As it happens, the
repository only has a single branch, which should simplify the problem.

Cheers,
Geoff Russell

P.S. Apologies: I've asked this question before but didn't get an answer
that I understood or that worked, so perhaps my description of the
problem was faulty. This is a second attempt.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at http://vger.kernel.org/majordomo-info.html
|
|
On 2008.11.17 10:56:23 +1030, Geoff Russell wrote:
> I have a repository with 5 years worth of history, I only want to keep
> 1 year, so I want to purge the first 4 years. As it happens, the
> repository only has a single branch which should simplify the problem.

Use filter-branch to drop the parents on the first commit you want to
keep, and then drop the old cruft.

Let's say $drop is the hash of the latest commit you want to drop. To
keep things sane and simple, make sure the first commit you want to
keep, i.e. the child of $drop, is not a merge commit. Then you can use:

    git filter-branch --parent-filter "sed -e 's/-p $drop//'" \
        --tag-name-filter cat -- \
        --all ^$drop

The above rewrites the parents of all commits that come "after" $drop.

Check the results with gitk.

Then, to clean out all the old cruft.

First, the backup references from filter-branch:

    git for-each-ref --format='%(refname)' refs/original | \
        while read ref
        do
            git update-ref -d "$ref"
        done

Then clean your reflogs:

    git reflog expire --expire=0 --all

And finally, repack and drop all the old unreachable objects:

    git repack -ad
    git prune # For objects that repack -ad might have left around

At that point, everything leading up to and including $drop should be
gone.

HTH
Björn
|
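The recipe above needs a value for $drop and a first kept commit that is not a merge. A minimal sketch of how both might be found by date follows. The function names are hypothetical and not from the thread, and first_kept assumes a single linear branch, as in Geoff's repository. Because rev-list prints full hashes, the result should match what the sed expression in the parent filter expects.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread) for choosing $drop by date.

# Print the full hash of the newest commit reachable from HEAD whose
# committer date is older than the cutoff given in $1.
find_drop() {
    git rev-list -1 --before="$1" HEAD
}

# Print the oldest commit between $1 and HEAD that descends from $1,
# i.e. the first commit that would be kept.
# Assumption: the history is a single linear branch.
first_kept() {
    git rev-list --reverse --ancestry-path "$1"..HEAD | head -n 1
}

# Succeed if commit $1 has at most one parent (it is not a merge).
is_not_merge() {
    # "rev-list --parents -1" prints "<commit> <parent>..." on one line,
    # so more than two words means more than one parent.
    set -- $(git rev-list --parents -1 "$1")
    [ "$#" -le 2 ]
}
```

Possible usage, run inside a copy of the repository:

    drop=$(find_drop "1 year ago")
    is_not_merge "$(first_kept "$drop")" && echo "drop=$drop"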
|
On 2008.11.17 03:24:12 +0100, Björn Steinbrink wrote:
> [...]
> At that point, everything leading up to and including $drop should be
> gone.

Hm, on second thought, if you have tags referencing some of the old
history, they'll still be around, I think. Just delete those before you
start the rewriting.

And of course do the above with a copy of your repo. Just in case.

Björn
|
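Tags that point into the old history could be listed and deleted along the following lines. This is only a sketch: the function names are hypothetical, and it assumes a reasonably recent Git in which `git tag` accepts `--merged`. As suggested above, try it on a copy of the repository first.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread) for finding tags that would
# keep the old history reachable.

# List tags whose target is reachable from commit $1, i.e. tags that
# point at $drop or at anything older than it.
# Assumption: this Git version supports "git tag --merged".
old_tags() {
    git tag --merged "$1"
}

# Delete every tag listed by old_tags for commit $1.
delete_old_tags() {
    old_tags "$1" | while read -r tag
    do
        git tag -d "$tag"
    done
}
```

Running old_tags "$drop" on its own first shows what delete_old_tags "$drop" would remove.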
|
On Mon, Nov 17, 2008 at 12:57 PM, Björn Steinbrink <[hidden email]> wrote:
> Hm, on second thought, if you have tags referencing some of the old
> history, they'll still be around, I think. Just delete those before you
> start the rewriting.
>
> And of course do the above with a copy of your repo. Just in case.

Great, I've just tested this and it is exactly what I want. I'm still
getting my head around why, but understanding will arrive with a little
more thought.

Many thanks,
Geoff
|
|
In reply to this post by Geoff Russell-3
Geoff,
I'm able to prune history with git filter-branch. For example, to throw
away history on the current branch before commit
171d7661eda111d3e35f6e8097a1a3a07b30026c, I tried:

    git filter-branch --parent-filter '
        if [ "$GIT_COMMIT" = 171d7661eda111d3e35f6e8097a1a3a07b30026c ]; then
            echo ""
        else
            read line; echo "$line"
        fi'

I found the diff between that commit and its rewritten version was
empty, and diffs to subsequent commits looked sane. It took an hour on
the git repository with about 16k commits. I probably should have
excluded all the commits I didn't want to keep to reduce processing
time.

However, after deleting all but the rewritten branch and cloning the
repository, I didn't notice any decrease in the size of .git/, so I'm
not sure why you'd want to do that. Also, all the remaining commit IDs
changed, so any previous clones would have a tough time merging with
yours.

Another possibility, whose results might be similar in runtime and
repository size, would be to run git rebase --interactive and squash
together all the commits before the ones you want to keep.

Marcel
|
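Whether the old history is really gone, and whether .git/ shrank, can be checked directly rather than by eye. The following is a sketch with hypothetical function names, not something posted in the thread. It relies on `git cat-file -e` and on the `size` and `size-pack` fields that `git count-objects -v` reports in KiB.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread) for checking the result.

# Succeed if commit $1 still exists in the object database.
commit_exists() {
    git cat-file -e "$1^{commit}" 2>/dev/null
}

# Print the disk space used by loose and packed objects, in KiB,
# by summing the "size" and "size-pack" fields of count-objects.
object_kib() {
    git count-objects -v |
        awk '$1 == "size:" || $1 == "size-pack:" { total += $2 }
             END { print total + 0 }'
}
```

Comparing object_kib before and after the cleanup steps from earlier in the thread (backup refs, reflogs, repack, prune), and running commit_exists "$drop" afterwards, should show whether the old objects are still being kept.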
