how to make "full" copy of a repo

how to make "full" copy of a repo

Christoph Anton Mitterer
Hey.

I was looking for an ideally simple way to make a "full" copy of a git
repo. Many howtos on this are floating around on the web, along with
lots of voodoo.


First, it shouldn't be just a clone, i.o.w.
- I want to have all refs (local/remote branches/tags) and of course all
objects from the source repo copied as is.
So its local branches should become my local branches and not remote
branches as well - and so on.
Basically I want to be able to delete the source afterwards (and all
backups ;) ) without losing anything.

- It shouldn't set the source repo as origin or its branches as remote
tracking branches; as said, it should be identical to the source repo,
just "freshly copied" via the "Git aware transport mechanisms".

- Whether GC or repacking happens, I don't care, as long as nothing that
is still reachable in the source repo gets lost (or would get lost
once I run a GC in the copied repo).

- Whether anything that other tools have added to .git (e.g. git-svn
stuff) gets lost, I don't care.

- It should work for both bare and non-bare repos, but it's okay when
it doesn't copy anything that is not committed or stashed.



I'd have said that either:
$ git clone --mirror URL-to-source-repo copy
for the direction from "outside" the source to a copy,
or alternatively:
$ cd source-repo
$ git push --mirror URL-to-copy
for the direction from "within" the source to a copy, with copy being an
empty bare or non-bare repo,
would do the job.

But:

a) the git-clone(1) text for --mirror:
   >and sets up a refspec configuration such that all these refs are
   >overwritten by a git remote update in the target repository.
   kinda confuses me since I wanted to be independent of the source
   repo and this seems to set up a remote to it?

b) do I need --all --tags for the push as well?

c) When following
   https://help.github.com/articles/duplicating-a-repository/
   it doesn't seem as if --mirror is what I want because they seem to
   advertise it rather as having the copy tracking the source repo.
   Of course I read about just using git-clone --bare, but that seems to
   not copy everything that --mirror does (remote-tracking branches,
   notes).

   So I'm a bit confused...


1) Is it working like I assumed above?
2) Does that also copy things like git-config, hooks, etc.?
3) Does it copy the configured remotes from the source?
4) What else is not copied by that? I'd assume anything that is not
   tracked by git and the stash of the source?



Thanks a lot,
Chris.

Re: how to make "full" copy of a repo

Kevin Daudt
On Sat, Mar 28, 2015 at 03:56:37AM +0100, Christoph Anton Mitterer wrote:

> [..]

Git clone is never going to get you a copy where nothing is lost.

What you are losing on clone is:

* config settings (this includes the configured remotes)
* hooks
* reflog (history of refs, though, by default disabled for bare
  repositories)
* Stashes, because the reference to them is stored in the reflog
* unreferenced objects (though you said those are not a requirement, it
  is still something that is lost)

git clone --mirror is used for repositories that regularly get updates
from the repositories they were cloned from. Though this is not what you
want, it's not difficult to reset the refspecs to the default refspecs.
Because it fetches all refs, it's not necessary to add --all --tags
(because tags are also refs).

git clone --mirror is the closest you are going to get by only using
git.
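
A minimal sketch of the git-only route, assuming the goal is a mirror clone that afterwards no longer knows about its source. The repository names (source, copy.git) and the scratch directory are made up for illustration, and deleting the remote's config section is only one possible way to cut the link:

```shell
#!/bin/sh
set -e
# Throwaway identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)   # scratch area, illustrative only
cd "$work"

# Stand-in source repo with one branch and one tag.
git init -q source
git -C source commit -q --allow-empty -m "initial commit"
git -C source tag v1

# Copy every ref and all reachable objects into a bare repository.
git clone -q --mirror source copy.git

# Forget the source: delete the [remote "origin"] config section.
# This only edits copy.git/config; refs and objects stay as they are.
git -C copy.git config --remove-section remote.origin

git -C copy.git for-each-ref --format='%(refname)'   # branches and tags
git -C copy.git remote                                # prints nothing
```

The result is a bare repository; config, hooks, reflogs and stashes of the source are still not part of it, as listed above.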

I guess you are aware of this, but if you want to retain more
information, you have to rely on other means, like scp, to get the other
things.

So to summarize, git clone is only used for cloning history, which means
objects and refs; the rest is not part of cloning. To get more, you have
to go outside git.

Hope this helps to clear up some confusion.

Kevin
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: how to make "full" copy of a repo

Torsten Bögershausen-2
In reply to this post by Christoph Anton Mitterer
On 2015-03-28 03.56, Christoph Anton Mitterer wrote:

> [..]
> c) When following
>    https://help.github.com/articles/duplicating-a-repository/
>    it doesn't seem as if --mirror is what I want because they seem to
>    advertise it rather as having the copy tracking the source repo.
>    Of course I read about just using git-clone --bare, but that seems to
>    not copy everything that --mirror does (remote-tracking branches,
>    notes).
>
>    So I'm a bit confused...
These instructions involve three repos:
the source "old", the destination "new" and a temporary one.
As you only push to "new", "new" should have no information about
"old" or "temp".
>
>
> 1) Is it working like I assumed above?
> 2) Does that also copy things like git-config, hooks, etc.?
> 3) Does it copy the configured remotes from the source?
> 4) What else is not copied by that? I'd assume anything that is not
>    tracked by git and the stash of the source?

You didn't write whether this is a bare repository,
whether it is on a local disk, or whether it is reachable by rsync.
Linux or Windows?

For a "full clone" (in the sense of having everything, bit for bit)
I would probably use rsync (after stopping all activity on the repo).

But I don't know where your repos live or whether they are bare, so
there may be other ways to go.
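
A rough sketch of such a file-level copy, assuming a local non-bare repository that nobody is writing to while the copy runs. The directory names and the sample hook are hypothetical, and the cp fallback is only there so the demo also runs where rsync is not installed:

```shell
#!/bin/sh
set -e
# Throwaway identity so the demo commit and stash work on any machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)   # scratch area, illustrative only
cd "$work"

# Stand-in repo with a commit, a stash and a custom hook.
git init -q source
echo one > source/file.txt
git -C source add file.txt
git -C source commit -q -m "initial commit"
echo two > source/file.txt
git -C source stash push -q
mkdir -p source/.git/hooks
printf '#!/bin/sh\nexit 0\n' > source/.git/hooks/pre-commit
chmod +x source/.git/hooks/pre-commit

# Copy the whole directory, .git included.
mkdir copy
if command -v rsync >/dev/null 2>&1; then
    rsync -a source/ copy/
else
    cp -a source/. copy/    # demo fallback when rsync is absent
fi

git -C copy fsck            # sanity check of the copied object store
git -C copy stash list      # the stash came along
git -C copy reflog          # so did the reflog
```

Unlike a clone, this carries over the config, hooks, reflog and stash, because it copies files rather than Git history.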

>
>
> Thanks a lot,
> Chris.
>


Re: how to make "full" copy of a repo

Ævar Arnfjörð Bjarmason
On Sat, Mar 28, 2015 at 7:52 PM, Torsten Bögershausen <[hidden email]> wrote:

> [..]
>
> For a "full clone" (in the sense of having everything, bit for bit)
> I would probably use rsync. (After stopping all activities on the repo)

This warrants more emphasis. If you rsync a repository that's
"active", i.e. receiving pushes, you *will* get corrupt copies. E.g. you
can easily copy something out of the objects directory that's in the
middle of being written, or copy the "refs" namespace after you copy
"objects" and end up with a ref that points at an object missing from
the copy.

There's unfortunately no good solution to this other than doing both
git --mirror backups and rsync backups (for hooks etc.) and combining
the two, or installing a hook for the duration that rejects all updates.
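
The "hook that bans all updates" idea could look roughly like the following, assuming a bare repository that only receives pushes. The repository names, the branch name "topic" and the hook's message are invented for this sketch:

```shell
#!/bin/sh
set -e
# Throwaway identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)   # scratch area, illustrative only
cd "$work"

git init -q --bare central.git
git init -q dev
git -C dev commit -q --allow-empty -m "initial commit"

# Hypothetical maintenance hook: reject every incoming ref update.
mkdir -p central.git/hooks
cat > central.git/hooks/pre-receive <<'EOF'
#!/bin/sh
cat >/dev/null    # drain the list of proposed ref updates
echo "Repository is frozen for a backup; try again later." >&2
exit 1
EOF
chmod +x central.git/hooks/pre-receive

# While the hook is in place, pushes are refused.
if git -C dev push -q "$work/central.git" HEAD:refs/heads/topic 2>/dev/null; then
    frozen_push=accepted
else
    frozen_push=rejected
fi
echo "push while frozen: $frozen_push"

# ... take the file-level copy of central.git here ...

# Lift the freeze; the same push now goes through.
rm central.git/hooks/pre-receive
git -C dev push -q "$work/central.git" HEAD:refs/heads/topic
```

This only blocks updates that arrive via push; anything that writes to the repository directly (local commits, gc) would have to be stopped separately.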

Re: how to make "full" copy of a repo

Christoph Anton Mitterer
In reply to this post by Kevin Daudt
On Sat, 2015-03-28 at 15:31 +0100, Kevin D wrote:
> What you are losing on clone is:
> * config settings (this includes the configured remotes)
> * hooks
that would be okay...


> * reflog (history of refs, though, by default disabled for bare
>   repositories)
is there a way to get this copied?


> * Stashes, because the reference to them is stored in the reflog
> * unreferenced objects (though you said those are not a requirement, it
>   is still something that is lost)
that would be okay for me as well.


> git clone --mirror is used for repositories that regularly get updates
> from the repositories they were cloned from. Though this is not what you
> want, it's not difficult to reset the refspecs to the default refspecs.
What do you mean here? What would I need to reset exactly?


> git clone --mirror is the closest you are going to get by only using
> git.
I see, thanks :)

> So to summarize, git clone is only used for cloning history, which means
> objects and refs, the rest is not part of cloning. To get more, you have
> to go outside git.

Thanks :)
Chris.


Re: how to make "full" copy of a repo

Christoph Anton Mitterer
In reply to this post by Torsten Bögershausen-2
On Sat, 2015-03-28 at 19:52 +0100, Torsten Bögershausen wrote:
> As you only push to "new", "new" should have no information about
> "old" or "temp".
Exactly, that would be the goal.
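
A small sketch of the push direction, assuming an empty bare target. It is meant to show that "new" ends up with all refs (including tags and remote-tracking refs) but without any remote configuration. The names old, new.git and the fake remote-tracking ref are made up for the demo:

```shell
#!/bin/sh
set -e
# Throwaway identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)   # scratch area, illustrative only
cd "$work"

# Stand-in "old" repo: a branch, a tag and a remote-tracking ref.
git init -q old
git -C old commit -q --allow-empty -m "initial commit"
git -C old tag v1
git -C old update-ref refs/remotes/upstream/main HEAD   # pretend remote-tracking branch

# Empty bare target, then push every ref under refs/ into it.
git init -q --bare new.git
git -C old push -q --mirror "$work/new.git"

git -C new.git for-each-ref --format='%(refname)'
git -C new.git config --get-regexp '^remote\.' || echo "no remotes configured in new.git"
```

Because the push is driven from the source side, nothing about "old" is written into the config of new.git.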

 
> > 1) Is it working like I assumed above?
> > 2) Does that also copy things like git-config, hooks, etc.?
> > 3) Does it copy the configured remotes from the source?
> > 4) What else is not copied by that? I'd assume anything that is not
> >    tracked by git and the stash of the source?
> You didn't write if this is a bare repository,
> if it is on a local disc, if it is reachable by rsync ?
> Linux or Windows ?
Linux.
And in principle I have both cases, but mostly non-bare repos.


Cheers,
Chris.


Re: how to make "full" copy of a repo

Kevin Daudt
In reply to this post by Christoph Anton Mitterer
On Sun, Mar 29, 2015 at 04:21:26AM +0200, Christoph Anton Mitterer wrote:
> On Sat, 2015-03-28 at 15:31 +0100, Kevin D wrote:
> [..]
>
> > * reflog (history of refs, though, by default disabled for bare
> >   repositories)
> is there a way to get this copied?
>
>

No, the reflog is considered something private to the repository, so
there is no way to get it through git clone.
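
A quick way to see this, sketched with two throwaway repositories (the names source and clone are illustrative): the source has one reflog entry per commit, while a fresh clone starts its own reflog from scratch.

```shell
#!/bin/sh
set -e
# Throwaway identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)   # scratch area, illustrative only
cd "$work"

git init -q source
git -C source commit -q --allow-empty -m "first"
git -C source commit -q --allow-empty -m "second"

git clone -q source clone

echo "source reflog:"
git -C source reflog    # one entry per commit made above
echo "clone reflog:"
git -C clone reflog     # only the entry written by the clone itself
```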

> [..]
>
> > git clone --mirror is used for repositories that regularly get updates
> > from the repositories they were cloned from. Though this is not what you
> > want, it's not difficult to reset the refspecs to the default refspecs.
> What do you mean here? What would I need to reset exactly?

git clone --mirror sets up the fetch refspec in such a way that local
refs would get reset to whatever upstream has:

+refs/*:refs/*

So every time you would fetch / pull, all your branches would reflect
the way they are on the mirrored repo (which is why it's called mirror).

The default refspec is:

+refs/heads/*:refs/remotes/origin/*

Which would only fetch heads (branches), and maps them as remote
tracking branches, so that your local branches are left alone.
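
The two refspecs can be inspected, and the mirror one reset, along these lines. This is a sketch with made-up repository names (source, mirror.git, plain); it assumes the remote is called origin, which is what git clone uses by default:

```shell
#!/bin/sh
set -e
# Throwaway identity so the demo commit works on any machine.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
work=$(mktemp -d)   # scratch area, illustrative only
cd "$work"

git init -q source
git -C source commit -q --allow-empty -m "initial commit"

git clone -q --mirror source mirror.git
git clone -q source plain

git -C mirror.git config --get-all remote.origin.fetch   # +refs/*:refs/*
git -C plain config --get-all remote.origin.fetch        # +refs/heads/*:refs/remotes/origin/*

# Switch the mirror back to the default refspec, so that a later fetch
# no longer overwrites the local branches.
git -C mirror.git config remote.origin.fetch '+refs/heads/*:refs/remotes/origin/*'
git -C mirror.git config --unset remote.origin.mirror
```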

> > git clone --mirror is the closest you are going to get by only using
> > git.
> I see, thanks :)
>
> > So to summarize, git clone is only used for cloning history, which means
> > objects and refs, the rest is not part of cloning. To get more, you have
> > to go outside git.
>
> Thanks :)
> Chris.



Re: how to make "full" copy of a repo

Duy Nguyen
In reply to this post by Kevin Daudt
On Sat, Mar 28, 2015 at 9:31 PM, Kevin D <[hidden email]> wrote:

> Git clone is never going to get you a copy where nothing is lost.
>
> What you are losing on clone is:
>
> * config settings (this includes the configured remotes)
> * hooks
> * reflog (history of refs, though, by default disabled for bare
>   repositories)
> * Stashes, because the reference to them is stored in the reflog
> * unreferenced objects (though you said those are not a requirement, it
>   is still something that is lost)

This is true. But I wonder if we should (and can) support a
--super-mirror option (disabled by default), where reflog and stashes
are kept, for backup purposes. We might keep unreferenced objects as
well if it's not hard to do.
--
Duy

Re: how to make "full" copy of a repo

Junio C Hamano
Duy Nguyen <[hidden email]> writes:

> This is true. But I wonder if we should (and can) support
> --super-mirror option (disabled by default), where reflog and stashes
> are kept, for backup purposes. We might keep unreferenced objects as
> well if it's not hard to do.

I doubt that we want to be in the business of filesystem backup.  Is
there a practical use case that is *not* "I am relocating out of
this directory on this machine and will be using the other one I am
making by copying"?

For the "I am relocating" scenario, rsync is the most suitable
option.  The caveat "activity at the original will leave the copied
result incomplete" will apply to whatever transport method you use,
but in the "I am relocating" scenario, you will have a period when
the original is quiet (i.e. you stop using the original at some
point before you start the copied one, and do not expect the
sequence to take zero downtime).

In a sense, "super-mirror" is even worse, if it is doing some "Git
activity" on the source which we may want to log, which means the
original will never be quiet during the copying.  Sure, send-pack
may currently not do any logging in the original repository, but
depending on the reason why such a copy is being made, the original
may even have custom hook-based logging data left somewhere in the
repository, and for copying such a repository the repository owner
would want to keep the logged data.

And if what super-mirror does is not considered a "Git activity" and
somehow bypasses all the Git rules in the original repository, then
what is the advantage of having it in Git in the first place, over
using something like rsync?
