can we prevent reflog deletion when branch is deleted?

can we prevent reflog deletion when branch is deleted?

Sitaram Chamarty
Hi,

Is there a way to prevent reflog deletion when the branch is deleted?
The last entry could simply be a line where the second SHA is all 0's.

--
Sitaram
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: can we prevent reflog deletion when branch is deleted?

Michael Haggerty-2
On 06/01/2013 03:31 AM, Sitaram Chamarty wrote:
> Is there a way to prevent reflog deletion when the branch is deleted?
> The last entry could simply be a line where the second SHA is all 0's.

This is a known problem.  The technical reason that this is not trivial
to solve is the possibility of a directory/file conflict between old
reflog files and references that might be created subsequently (which in
turn is a limitation of how loose references and reflogs are mapped to
filenames):

    git branch foo
    git branch -d foo
    git branch foo/bar

Under your proposal, the second line would retain the reflog file for
foo, which is named ".git/logs/refs/heads/foo".  But the third line
wants to create a file ".git/logs/refs/heads/foo/bar".  The existence of
the "foo" file prevents the creation of a "foo" directory.

A similar problem exists if "foo" and "foo/bar" are exchanged in the
above example.
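
The conflict can be reproduced with plain files, without involving git at all. The following is only a sketch: the scratch directories stand in for .git/logs/refs/heads, and the "retained reflog" is simulated by leaving an empty file behind, since git itself does not do that today.

```shell
#!/bin/sh
# Scratch directories standing in for .git/logs/refs/heads (no real repository).
logs=$(mktemp -d)/logs/refs/heads
mkdir -p "$logs"

# Case 1: the reflog of the deleted branch "foo" is kept as a regular file,
# then "foo/bar" needs "foo" to be a directory.
: > "$logs/foo"
if mkdir "$logs/foo" 2>/dev/null; then
    case1="ok"
else
    case1="conflict"
fi

# Case 2: the reflog of the deleted branch "foo/bar" is kept,
# then "foo" needs to be a regular file where a directory already exists.
logs2=$(mktemp -d)/logs/refs/heads
mkdir -p "$logs2/foo"
: > "$logs2/foo/bar"
if ( : > "$logs2/foo" ) 2>/dev/null; then
    case2="ok"
else
    case2="conflict"
fi

echo "foo then foo/bar: $case1"
echo "foo/bar then foo: $case2"
```

Both cases should report a conflict, because a path component cannot be a file and a directory at the same time.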

Peff proposed a solution to this problem [1], but AFAIK it is not making
progress.

Michael

[1]
http://thread.gmane.org/gmane.comp.version-control.git/201715/focus=201752

--
Michael Haggerty
[hidden email]
http://softwareswirl.blogspot.com/

Re: can we prevent reflog deletion when branch is deleted?

Jeff King
On Sat, Jun 01, 2013 at 05:00:07AM +0200, Michael Haggerty wrote:

> This is a known problem.  The technical reason that this is not trivial
> to solve is the possibility of a directory/file conflict between old
> reflog files and references that might be created subsequently (which in
> turn is a limitation of how loose references and reflogs are mapped to
> filenames):
> [...]
> Peff proposed a solution to this problem [1], but AFAIK it is not making
> progress.

I was running with the patch series you mentioned for a while, but there
are some weird bugs with it that need to be tracked down.  I don't
recall the details, but I would occasionally get error messages that
showed that some parts of the code were surprised that the reflog
existed without the ref existing.

While I think solving the D/F conflict in the ref namespaces overall
would be a nice thing to have, doing it with compatibility with the
current system is complex and error-prone. I wonder if simply sticking
the reflog entries into a big GRAVEYARD reflog wouldn't be a great deal
simpler and accomplish the "keep deleted reflogs" goal, which is what
people actually want.

-Peff

Re: can we prevent reflog deletion when branch is deleted?

artagnon
Jeff King wrote:
> I wonder if simply sticking
> the reflog entries into a big GRAVEYARD reflog wouldn't be a great deal
> simpler and accomplish the "keep deleted reflogs" goal, which is what
> people actually want.

Exactly what I was thinking when I read your proposal.  What is the
point of having individual graveyards for deleted branches?  The
branch names no longer have any significance, and separating the
reflogs using branch names nobody remembers is only making
discoverability harder.

What is the problem we are trying to solve?  Someone deletes a branch
by mistake, and wants to get it back?  There's the HEAD reflog for
that.  So, I think the problem is that the person did a flurry of
creations/deletions/rebases, and wants to reach one particular commit
she remembers seeing sometime in the past (possibly in refs/remotes/*,
in which case HEAD reflog wouldn't have logged it).  More than adding
a graveyard to provide hard-to-dissect information, I'm interested in
tooling support for the information we already have.

So, I want to search all reflogs for this particular commit I've seen:

  git log -1 --relative-date -g :/quuxery

Doesn't work.  I can only search one reflog at a time:

  git log -1 --relative-date @@{0}^{/quuxery}

Isn't this much too painful?

Our "default" reflog command displays useless information: why should
I see HEAD@{1} followed by HEAD@{2} and other numbers in ascending
order?  What is the point of that when the abbreviated sha1 is already
shown in the first field?  I use the following alias for reflog:

  rfl = log --oneline --relative-date -g

but it could easily be better.

There are tons of other issues: for instance, after an
interactive-rebase, 'git checkout -' doesn't take me back to the
previous branch (because the parser for @{<N>} is broken).  There are
way too many SHA-1s polluting the description, which can easily be
replaced by a git-describe output.  When I git checkout @~1, my prompt
doesn't scream the sha1; it shows me upstream-error~1, which makes a
lot more sense.

I was under the impression that heavy reflog users would be more
interested in fixing these issues before dumping even more data onto
the user.

Re: can we prevent reflog deletion when branch is deleted?

Jeff King
On Sat, Jun 01, 2013 at 01:29:07PM +0530, Ramkumar Ramachandra wrote:

> Jeff King wrote:
> > I wonder if simply sticking
> > the reflog entries into a big GRAVEYARD reflog wouldn't be a great deal
> > simpler and accomplish the "keep deleted reflogs" goal, which is what
> > people actually want.
>
> Exactly what I was thinking when I read your proposal.  What is the
> point of having individual graveyards for deleted branches?  The
> branch names no longer have any significance, and separating the
> reflogs using branch names nobody remembers is only making
> discoverability harder.

Why don't the branch names have significance? If I deleted branch "foo"
yesterday evening, wouldn't I want to be able to say "show me foo from
2pm yesterday" or even "show me all logs for foo, so that I can pick the
useful bit from the list"?

When I suggested a big graveyard reflog, I did not mean a straight
concatenation of the deleted reflogs; I meant one which would also
record the name of the ref whose log each entry came from.

If you mean "the branch names in the filesystem don't have
significance", I agree. Using a parallel hierarchy of reflogs was an
implementation choice that let us use the same reflog format.  Defining
a new GRAVEYARD format would need an additional field for the ref name
of each entry, but lets us drop the other naming complexities.
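
To make the idea concrete, here is a rough sketch of what such a file might look like. The layout is purely an assumption for illustration (no such format exists in git): each line is an ordinary reflog line with the ref name prepended as an extra first field, and the final entry for foo uses an all-zero new value to mark the deletion. The helper name graveyard_entries and the sample values are made up.

```shell
#!/bin/sh
# Hypothetical GRAVEYARD file: "<refname> " followed by a normal reflog line
# ("<old> <new> <ident> <time> <tz><TAB><message>"). Ref names cannot contain
# spaces, so a space-separated first field is unambiguous.
graveyard=$(mktemp)
z=0000000000000000000000000000000000000000
a=1111111111111111111111111111111111111111
b=2222222222222222222222222222222222222222
{
    printf 'refs/heads/foo %s %s A U Thor <author@example.com> 1370000000 +0000\tbranch: Created from HEAD\n' "$z" "$a"
    printf 'refs/heads/bar %s %s A U Thor <author@example.com> 1370000100 +0000\tbranch: Created from HEAD\n' "$z" "$a"
    printf 'refs/heads/foo %s %s A U Thor <author@example.com> 1370000200 +0000\tcommit: more work\n' "$a" "$b"
    printf 'refs/heads/foo %s %s A U Thor <author@example.com> 1370000300 +0000\tbranch: deleted\n' "$b" "$z"
} > "$graveyard"

# Print the entries recorded for one ref, with the name field stripped,
# which leaves lines in the usual reflog format.
graveyard_entries() {
    awk -v ref="$1" '$1 == ref { sub(/^[^ ]+ /, ""); print }' "$graveyard"
}

graveyard_entries refs/heads/foo
```

Because everything lives in one flat file, there is no per-ref path on disk and therefore nothing for a later foo/bar to collide with.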

> What is the problem we are trying to solve?  Someone deletes a branch
> by mistake, and wants to get it back?  There's the HEAD reflog for
> that.

The HEAD reflog is not sufficient for two reasons:

  1. Not all ref updates were part of the HEAD reflog (e.g.,
     refs/remotes, tags).

  2. It is not easy to deduce which ref each entry comes from, which
     makes "deleted_branch@{yesterday}" difficult. You can sometimes
     deduce the branch by reading the surrounding entries (e.g., for
     "checkout" entries), but I do not know offhand whether it can be
     done reliably in all cases (I suspect not, given that unreachable
     reflog entries may be pruned sooner than reachable ones, leaving
     "holes" in the reflog's story).

> More than adding a graveyard to provide hard-to-dissect information,
> I'm interested in tooling support for the information we already have.

I think that is an orthogonal concern. Already with the current reflogs,
such a tool would be useful. And even without such a tool, being able to
access reflog entries of deleted branches is still useful. Even simple
things like "git branch foo deleted@{yesterday}" and "git log -g
deleted" would give a safety net. And those are supported by the
existing porcelain tooling.

I do not necessarily disagree with your criticisms of the tooling around
reflogs, but they are just not my interest right now, and I do not think
working on one concept needs to hold up the other.

-Peff

Re: can we prevent reflog deletion when branch is deleted?

artagnon
Jeff King wrote:
> Why don't the branch names have significance? If I deleted branch "foo"
> yesterday evening, wouldn't I want to be able to say "show me foo from
> 2pm yesterday" or even "show me all logs for foo, so that I can pick the
> useful bit from the list"?

Oh, I misunderstood then.  I didn't realize that your usecase was actually

    git log foo@{yesterday}

where foo is a deleted branch.  Just to give some perspective, so we
don't limit our problem space:

I only ever batch-delete "cold" branches: if I haven't touched a
branch in ~2 months, I consider the work abandoned (due to disinterest
or otherwise) and remove it.  Most of my branches are short-lived, and
I don't remember branch names, much less the names of the cold
branches I deleted.  My usecase for a graveyard is "I lost something,
and I need to find it": I don't want to have to remember the original
branch name "foo"; if you can tell everything I deleted yesterday, I
can spot foo and the commit I was looking for.  The HEAD reflog is
almost good enough for me.

To be clear: I'm not against including branch name information; I just
don't want to _have_ to remember them to find what I'm looking for.

> Defining
> a new GRAVEYARD format would need an additional field for the ref name
> of each entry, but lets us drop the other naming complexities.

Certainly.  Putting it in the description will only lead to more
problems (like bugs in the @{<N>} parser).

> The HEAD reflog is not sufficient for two reasons:
>
>   1. Not all ref updates were part of the HEAD reflog (e.g.,
>      refs/remotes, tags).

Would be nice to solve, but it's not a big itch in my opinion.

>   2. It is not easy to deduce which ref each entry comes from, which
>      makes "deleted_branch@{yesterday}" difficult. You can sometimes
>      deduce the branch by reading the surrounding entries (e.g., for
>      "checkout" entries), but I do not know offhand whether it can be
>      done reliably in all cases (I suspect not, given that unreachable
>      reflog entries may be pruned sooner than reachable ones, leaving
>      "holes" in the reflog's story).

Yeah, this makes sense.
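
For illustration, deducing the branch from the surrounding "checkout" entries might look roughly like the sketch below. The reflog messages are invented sample data (only the "checkout: moving from X to Y" wording matches what git writes), and annotate_branches is a hypothetical helper, not an existing tool.

```shell
#!/bin/sh
# Invented HEAD reflog messages, newest first, one per line.
head_messages='checkout: moving from foo to master
commit: tweak quux
commit: add quux
checkout: moving from master to foo
commit: initial'

# Walk the entries oldest to newest and prefix each with the branch that the
# most recent "checkout" entry moved to. Entries seen before any checkout
# cannot be attributed by this simple scan and are marked "?".
annotate_branches() {
    printf '%s\n' "$head_messages" |
    awk '{ lines[NR] = $0 }
         END {
             branch = "?"
             for (i = NR; i >= 1; i--) {
                 if (lines[i] ~ /^checkout: moving from /) {
                     n = split(lines[i], w, " ")
                     branch = w[n]
                 }
                 print branch "\t" lines[i]
             }
         }'
}

annotate_branches
```

The oldest entry stays unattributed here, and any pruned "checkout" entry would silently attribute later commits to the wrong branch, which is the kind of hole described in the quoted point.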

> I do not necessarily disagree with your criticisms of the tooling around
> reflogs, but they are just not my interest right now, and I do not think
> working on one concept needs to hold up the other.

Oh, I didn't mean to hold up anything.  I brought it up because I
thought it would be of interest to heavy reflog users.

Re: can we prevent reflog deletion when branch is deleted?

Sitaram Chamarty
On Sat, Jun 1, 2013 at 3:17 PM, Ramkumar Ramachandra <[hidden email]> wrote:

> Jeff King wrote:
>> Why don't the branch names have significance? If I deleted branch "foo"
>> yesterday evening, wouldn't I want to be able to say "show me foo from
>> 2pm yesterday" or even "show me all logs for foo, so that I can pick the
>> useful bit from the list"?
>
> Oh, I misunderstood then.  I didn't realize that your usecase was actually
>
>     git log foo@{yesterday}
>
> where foo is a deleted branch.  Just to give some perspective, so we
> don't limit our problem space:
>
> I only ever batch-delete "cold" branches: if I haven't touched a
> branch in ~2 months, I consider the work abandoned (due to disinterest
> or otherwise) and remove it.  Most of my branches are short-lived, and
> I don't remember branch names, much less the names of the cold
> branches I deleted.  My usecase for a graveyard is "I lost something,
> and I need to find it": I don't want to have to remember the original
> branch name "foo"; if you can tell everything I deleted yesterday, I
> can spot foo and the commit I was looking for.  The HEAD reflog is
> almost good enough for me.

I think I'd have to be playing with *several* branches simultaneously
before I got to the point of forgetting the branch name!

More to the point, your use case may be relevant for a non-bare repo
where "work" is being done, but for a bare repo on a server, I think
the branch name *does* have significance, because it's what people are
collaborating on.

(Imagine someone accidentally nukes a branch, and then someone else
tries to "git pull" and finds it gone.  Any recovery at that point
must necessarily use the branch name).

PS: I am assuming core.logAllRefUpdates is on

Re: can we prevent reflog deletion when branch is deleted?

artagnon
Sitaram Chamarty wrote:
> I think I'd have to be playing with *several* branches simultaneously
> before I got to the point of forgetting the branch name!

Yeah, I work on lots of small unrelated things: the patch-series I
send in are usually the result of a few hours of work (up to a few days).
I keep the branch around until I've rewritten it for enough re-rolls
and am sufficiently sure that it'll hit master.

> More to the point, your use case may be relevant for a non-bare repo
> where "work" is being done, but for a bare repo on a server, I think
> the branch name *does* have significance, because it's what people are
> collaborating on.
>
> (Imagine someone accidentally nukes a branch, and then someone else
> tries to "git pull" and finds it gone.  Any recovery at that point
> must necessarily use the branch name).

Ah, you're mostly talking about central workflows.  I'm on the other
end of the spectrum: I want triangular workflows (and git.git is
slowly getting there).  However, I might have a (vague) thought on
server-side safety in general: I think the harsh dichotomy in ff-only
versus non-ff branches is very inelegant.  Imposing ff-only feels like
a hammer solution, because what happens in practice is different: the
`master` does not need to be rewritten most of the time, but I think
it's useful to allow some "safe" rewrites to undo the mistake of
checking in a private key or something [*1*].  By safety, I mean that
git should give the user easy access to recent dangling objects by
annotating it with enough information: sort of like a general-purpose
"pretty" reflog that is gc-safe (configurable trunc_length?).  It
serves more use cases than just the branch-removal problem.

Of course, the standard disclaimer applies: there's a high likelihood
that I'm saying nonsense, because I've never worked in a central
environment.

[Footnotes]

*1* It turns out that this is not uncommon:
https://github.com/search?q=path%3A.ssh%2Fid_rsa&type=Code&ref=searchresults

Re: can we prevent reflog deletion when branch is deleted?

Sitaram Chamarty
On Sat, Jun 1, 2013 at 11:26 PM, Ramkumar Ramachandra
<[hidden email]> wrote:

> Sitaram Chamarty wrote:
>> I think I'd have to be playing with *several* branches simultaneously
>> before I got to the point of forgetting the branch name!
>
> Yeah, I work on lots of small unrelated things: the patch-series I
> send in are usually the result of a few hours of work (up to a few days).
> I keep the branch around until I've rewritten it for enough re-rolls
> and am sufficiently sure that it'll hit master.
>
>> More to the point, your use case may be relevant for a non-bare repo
>> where "work" is being done, but for a bare repo on a server, I think
>> the branch name *does* have significance, because it's what people are
>> collaborating on.
>>
>> (Imagine someone accidentally nukes a branch, and then someone else
>> tries to "git pull" and finds it gone.  Any recovery at that point
>> must necessarily use the branch name).
>
> Ah, you're mostly talking about central workflows.  I'm on the other

Yes.  Not just because that's what "$dayjob" does, but also because
that's what gitolite does.

> end of the spectrum: I want triangular workflows (and git.git is
> slowly getting there).  However, I might have a (vague) thought on
> server-side safety in general: I think the harsh dichotomy in ff-only
> versus non-ff branches is very inelegant.  Imposing ff-only feels like
> a hammer solution, because what happens in practice is different: the
> `master` does not need to be rewritten most of the time, but I think
> it's useful to allow some "safe" rewrites to undo the mistake of
> checking in a private key or something [*1*].  By safety, I mean that

I suspect that's a big reason for why gitolite is so popular, at least
with central workflows.  It's trivial to set it up so master is
ff-only and any other branch is rewindable etc.

> git should give the user easy access to recent dangling objects by
> annotating it with enough information: sort of like a general-purpose
> "pretty" reflog that is gc-safe (configurable trunc_length?).  It
> serves more use cases than just the branch-removal problem.

Again, for "central workflow" folks, gitolite's log files actually
have enough info for all this and more.  Coupled with
"core.logAllRefUpdates", it's possible to recover anything that has
not been gc-ed, even deleted branches and tags.

But it would be nicer if git's own reflog is able to do that.  Hence
my original thought about preserving reflogs for deleted refs (even if
it is in a "graveyard" log to resolve the D/F conflict that Michael
and Peff were discussing up at the top of the thread).

> Of course, the standard disclaimer applies: there's a high likelihood
> that I'm saying nonsense, because I've never worked in a central
> environment.
>
> [Footnotes]
>
> *1* It turns out that this is not uncommon:
> https://github.com/search?q=path%3A.ssh%2Fid_rsa&type=Code&ref=searchresults

Hah!  Lovely...

--
Sitaram

Re: can we prevent reflog deletion when branch is deleted?

Sitaram Chamarty
In reply to this post by artagnon
[top posting, and not preserving cc's because the original email thread
below is just for context; I don't want to force people into a
discussion that they may have considered closed :-)]

Is there *any* way we can preserve a reflog for a deleted branch,
perhaps under logs/refs/deleted/<timestamp>/full/ref/name ?

Whatever it was that happened to a hundred or more repos on the Jenkins
project seems to be stirring up this debate in some circles.

Just some basic protection -- don't delete the reflog, and instead,
rename it to something that preserves the name but in a different
namespace.
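
As a very rough sketch of the rename, done by hand on a scratch directory rather than by git itself: the directory layout follows the path suggested above, while the archive_reflog helper, the use of a one-second timestamp, and the sample reflog content are all assumptions made for illustration. Git would not know how to read or expire files moved this way.

```shell
#!/bin/sh
# Scratch directory standing in for a bare repository's $GIT_DIR (not a real repository).
gitdir=$(mktemp -d)
mkdir -p "$gitdir/logs/refs/heads"
echo 'sample reflog line' > "$gitdir/logs/refs/heads/foo"

# Move the reflog of a ref to logs/refs/deleted/<timestamp>/<full ref name>
# and print the new location. Hypothetical helper, not a git command.
archive_reflog() {
    ref=$1
    stamp=$2
    src="$gitdir/logs/$ref"
    dst="$gitdir/logs/refs/deleted/$stamp/$ref"
    [ -f "$src" ] || return 1
    mkdir -p "$(dirname "$dst")" && mv "$src" "$dst" && echo "$dst"
}

archived=$(archive_reflog refs/heads/foo "$(date +%s)")
echo "reflog kept at: $archived"

# The old name is free again, so a later "foo/bar" reflog causes no conflict.
mkdir -p "$gitdir/logs/refs/heads/foo"
: > "$gitdir/logs/refs/heads/foo/bar"
```

The timestamp component keeps successive deletions of the same name apart, though two deletions within the same second would still collide in this sketch.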

sitaram

On 06/01/2013 11:26 PM, Ramkumar Ramachandra wrote:

> Sitaram Chamarty wrote:
>> I think I'd have to be playing with *several* branches simultaneously
>> before I got to the point of forgetting the branch name!
>
> Yeah, I work on lots of small unrelated things: the patch-series I
> send in are usually the result of a few hours of work (up to a few days).
> I keep the branch around until I've rewritten it for enough re-rolls
> and am sufficiently sure that it'll hit master.
>
>> More to the point, your use case may be relevant for a non-bare repo
>> where "work" is being done, but for a bare repo on a server, I think
>> the branch name *does* have significance, because it's what people are
>> collaborating on.
>>
>> (Imagine someone accidentally nukes a branch, and then someone else
>> tries to "git pull" and finds it gone.  Any recovery at that point
>> must necessarily use the branch name).
>
> Ah, you're mostly talking about central workflows.  I'm on the other
> end of the spectrum: I want triangular workflows (and git.git is
> slowly getting there).  However, I might have a (vague) thought on
> server-side safety in general: I think the harsh dichotomy in ff-only
> versus non-ff branches is very inelegant.  Imposing ff-only feels like
> a hammer solution, because what happens in practice is different: the
> `master` does not need to be rewritten most of the time, but I think
> it's useful to allow some "safe" rewrites to undo the mistake of
> checking in a private key or something [*1*].  By safety, I mean that
> git should give the user easy access to recent dangling objects by
> annotating it with enough information: sort of like a general-purpose
> "pretty" reflog that is gc-safe (configurable trunc_length?).  It
> serves more use cases than just the branch-removal problem.
>
> Of course, the standard disclaimer applies: there's a high likelihood
> that I'm saying nonsense, because I've never worked in a central
> environment.
>
> [Footnotes]
>
> *1* It turns out that this is not uncommon:
> https://github.com/search?q=path%3A.ssh%2Fid_rsa&type=Code&ref=searchresults
>


Re: can we prevent reflog deletion when branch is deleted?

Thomas Rast-2
Sitaram Chamarty <[hidden email]> writes:

> Whatever it was that happened to a hundred or more repos on the Jenkins
> project seems to be stirring up this debate in some circles.

Making us so curious ... and then you just leave us hanging there ;-)

Any pointers to this debate?

--
Thomas Rast
[hidden email]

Re: can we prevent reflog deletion when branch is deleted?

Jeff King
On Thu, Nov 14, 2013 at 08:56:07AM +0100, Thomas Rast wrote:

> > Whatever it was that happened to a hundred or more repos on the Jenkins
> > project seems to be stirring up this debate in some circles.
>
> Making us so curious ... and then you just leave us hanging there ;-)
>
> Any pointers to this debate?

I do not know about any particular debate in git circles, but I assume
Sitaram is referring to this incident:

  https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ

in which a Jenkins dev force-pushed and rewound history on 150 different
repos. In this case the reflog made rollback easy, but if he had pushed
a deletion, it would be harder.

-Peff

Re: can we prevent reflog deletion when branch is deleted?

Jeff King
In reply to this post by Sitaram Chamarty
On Thu, Nov 14, 2013 at 05:48:50AM +0530, Sitaram Chamarty wrote:

> Is there *any* way we can preserve a reflog for a deleted branch,
> perhaps under logs/refs/deleted/<timestamp>/full/ref/name ?

I had patches to do something like this here:

  http://thread.gmane.org/gmane.comp.version-control.git/201715/focus=201752

but there were definitely some buggy corners, as much of the code
assumed you needed to have a ref to have a reflog. I don't even run with
it locally anymore.

At GitHub, we log each change to an "audit log" in addition to the
regular reflog (we also stuff extra data from the environment into the
reflog message). So even after a branch is deleted, its audit log
entries remain, though you have to pull out the data by hand (git
doesn't know about it at all, except as an append-only sink for
writing). And git doesn't use the audit log for connectivity, either, so
eventually the objects could be pruned.

> Just some basic protection -- don't delete the reflog, and instead,
> rename it to something that preserves the name but in a different
> namespace.

That part is easy. Accessing it seamlessly and handling reflog
expiration are a little harder. Not because they're intractable, but
just because there are some low-level assumptions in the git code. The
patch series I mentioned above mostly works. It probably just needs
somebody to go through and find the corner cases.

-Peff

Re: can we prevent reflog deletion when branch is deleted?

Sitaram Chamarty
In reply to this post by Jeff King
On 11/14/2013 01:37 PM, Jeff King wrote:
> On Thu, Nov 14, 2013 at 08:56:07AM +0100, Thomas Rast wrote:
>
>>> Whatever it was that happened to a hundred or more repos on the Jenkins
>>> project seems to be stirring up this debate in some circles.
>>
>> Making us so curious ... and then you just leave us hanging there ;-)

Oh my apologies; I missed the URL!!  (But Peff supplied it before I saw
this email!)

>> Any pointers to this debate?
>
> I do not know about any particular debate in git circles, but I assume
> Sitaram is referring to this incident:
>
>   https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ
>
> in which a Jenkins dev force-pushed and rewound history on 150 different
> repos. In this case the reflog made rollback easy, but if he had pushed
> a deletion, it would be harder.

I don't know if they had a reflog on the server side; they used
client-side reflogs if I understood correctly.

I'm talking about server side (bare repo), assuming the site has
core.logAllRefUpdates set.

And I'll explain the "some circles" part as "something on LinkedIn".  To
be honest there's been a fair bit of FUDding by CVCS types there so I
stopped looking at the posts, but I get the subject lines by email and I
saw one that said "Git History Protection - if we needed proof..." or
something like that.

I admit I didn't check to see if a debate actually followed that post
:-)

Re: can we prevent reflog deletion when branch is deleted?

Jeff King
On Thu, Nov 14, 2013 at 04:26:46PM +0530, Sitaram Chamarty wrote:

> > I do not know about any particular debate in git circles, but I assume
> > Sitaram is referring to this incident:
> >
> >   https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ
> >
> > in which a Jenkins dev force-pushed and rewound history on 150 different
> > repos. In this case the reflog made rollback easy, but if he had pushed
> > a deletion, it would be harder.
>
> I don't know if they had a reflog on the server side; they used
> client-side reflogs if I understood correctly.
>
> I'm talking about server side (bare repo), assuming the site has
> core.logAllRefUpdates set.

Yes, they did have server-side reflogs (the pushes were to GitHub, and
we reflog everything). Client-side reflogs would not be sufficient, as
the client who pushed does not record the history he just rewound (he
_might_ have it at refs/remotes/origin/master@{1}, but if somebody
pushed since his last fetch, then he doesn't).

The "simplest" way to recover is to just have everyone push again
(without --force). The history will just silently fast-forward to
whoever has the most recent tip. The downside is that you have to wait
for that person to actually push. :)

I think they started with that, and then eventually GitHub support got
wind of it and pulled the last value for each repo out of the
server-side reflog for them.

-Peff

Re: can we prevent reflog deletion when branch is deleted?

Luca Milanesio
It would be really useful anyway to be able to create a server-side reference from a SHA-1 using the Git protocol.
Alternatively, being able to fetch from a remote repo by SHA-1 (for an object not referenced by any ref but still present) would let you create a new reference locally and push it.

Luca.

On 14 Nov 2013, at 11:09, Jeff King <[hidden email]> wrote:

> On Thu, Nov 14, 2013 at 04:26:46PM +0530, Sitaram Chamarty wrote:
>
>>> I do not know about any particular debate in git circles, but I assume
>>> Sitaram is referring to this incident:
>>>
>>>  https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ
>>>
>>> in which a Jenkins dev force-pushed and rewound history on 150 different
>>> repos. In this case the reflog made rollback easy, but if he had pushed
>>> a deletion, it would be harder.
>>
>> I don't know if they had a reflog on the server side; they used
>> client-side reflogs if I understood correctly.
>>
>> I'm talking about server side (bare repo), assuming the site has
>> core.logAllRefUpdates set.
>
> Yes, they did have server-side reflogs (the pushes were to GitHub, and
> we reflog everything). Client-side reflogs would not be sufficient, as
> the client who pushed does not record the history he just rewound (he
> _might_ have it at refs/remotes/origin/master@{1}, but if somebody
> pushed since his last fetch, then he doesn't).
>
> The "simplest" way to recover is to just have everyone push again
> (without --force). The history will just silently fast-forward to
> whoever has the most recent tip. The downside is that you have to wait
> for that person to actually push. :)
>
> I think they started with that, and then eventually GitHub support got
> wind of it and pulled the last value for each repo out of the
> server-side reflog for them.
>
> -Peff

Re: can we prevent reflog deletion when branch is deleted?

Sitaram Chamarty
In reply to this post by Jeff King
On 11/14/2013 04:39 PM, Jeff King wrote:

> On Thu, Nov 14, 2013 at 04:26:46PM +0530, Sitaram Chamarty wrote:
>
>>> I do not know about any particular debate in git circles, but I assume
>>> Sitaram is referring to this incident:
>>>
>>>   https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ
>>>
>>> in which a Jenkins dev force-pushed and rewound history on 150 different
>>> repos. In this case the reflog made rollback easy, but if he had pushed
>>> a deletion, it would be harder.
>>
>> I don't know if they had a reflog on the server side; they used
>> client-side reflogs if I understood correctly.
>>
>> I'm talking about server side (bare repo), assuming the site has
>> core.logAllRefUpdates set.
>
> Yes, they did have server-side reflogs (the pushes were to GitHub, and
> we reflog everything). Client-side reflogs would not be sufficient, as
> the client who pushed does not record the history he just rewound (he
> _might_ have it at refs/remotes/origin/master@{1}, but if somebody
> pushed since his last fetch, then he doesn't).
>
> The "simplest" way to recover is to just have everyone push again
> (without --force). The history will just silently fast-forward to
> whoever has the most recent tip. The downside is that you have to wait
> for that person to actually push. :)
>
> I think they started with that, and then eventually GitHub support got
> wind of it and pulled the last value for each repo out of the
> server-side reflog for them.

Great.  But what does GitHub do if the branches were *deleted* by
mistake (say someone does a "git push --mirror", most likely in a
script, for added fun and laughs)?

GitHub may be able to help people recover from that also, but plain Git
won't.

And that's what I would like to see a change in.
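
For reference, the server-side recovery Peff describes comes down to reading the previous value out of the ref's reflog file, which is exactly the file plain Git removes when the branch is deleted. A minimal Python sketch, assuming the usual reflog line layout (old SHA, new SHA, identity, timestamp, then a tab and the message); the helper names `parse_reflog_line` and `previous_tip` are made up for illustration:

```python
def parse_reflog_line(line):
    """Split one reflog line into (old_sha, new_sha, message)."""
    # Everything before the first tab is "old new identity timestamp tz"
    head, _, message = line.rstrip('\n').partition('\t')
    fields = head.split(' ')
    return fields[0], fields[1], message


def previous_tip(reflog_path):
    """Return the SHA the ref held before its most recent update,
    or None if the reflog has no entries."""
    last = None
    with open(reflog_path) as f:
        for line in f:
            if line.strip():
                last = parse_reflog_line(line)
    return last[0] if last else None
```

On a bare repo this would be pointed at something like logs/refs/heads/master; once the branch is deleted there is no such file left to read.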

>
> -Peff
>


Re: can we prevent reflog deletion when branch is deleted?

Sitaram Chamarty
In reply to this post by Luca Milanesio
On 11/14/2013 04:47 PM, Luca Milanesio wrote:
> Would be really useful anyway to have the ability to create a
> server-side reference based on a SHA-1, using the Git protocol.
> Alternatively, just fetching a remote repo based on a SHA-1 (not
> referenced by any ref-spec but still existent) so that you can create
> a new reference locally and push.

That's a security issue.

Just to clarify, what I am asking for is the ability to recover on the
server, where you have access to the actual files that comprise the
repo.

sitaram

>
> Luca.
>
> On 14 Nov 2013, at 11:09, Jeff King <[hidden email]> wrote:
>
>> On Thu, Nov 14, 2013 at 04:26:46PM +0530, Sitaram Chamarty wrote:
>>
>>>> I do not know about any particular debate in git circles, but I assume
>>>> Sitaram is referring to this incident:
>>>>
>>>>  https://groups.google.com/d/msg/jenkinsci-dev/-myjRIPcVwU/t4nkXONp8qgJ
>>>>
>>>> in which a Jenkins dev force-pushed and rewound history on 150 different
>>>> repos. In this case the reflog made rollback easy, but if he had pushed
>>>> a deletion, it would be harder.
>>>
>>> I don't know if they had a reflog on the server side; they used
>>> client-side reflogs if I understood correctly.
>>>
>>> I'm talking about server side (bare repo), assuming the site has
>>> core.logAllRefUpdates set.
>>
>> Yes, they did have server-side reflogs (the pushes were to GitHub, and
>> we reflog everything). Client-side reflogs would not be sufficient, as
>> the client who pushed does not record the history he just rewound (he
>> _might_ have it at refs/remotes/origin/master@{1}, but if somebody
>> pushed since his last fetch, then he doesn't).
>>
>> The "simplest" way to recover is to just have everyone push again
>> (without --force). The history will just silently fast-forward to
>> whoever has the most recent tip. The downside is that you have to wait
>> for that person to actually push. :)
>>
>> I think they started with that, and then eventually GitHub support got
>> wind of it and pulled the last value for each repo out of the
>> server-side reflog for them.
>>
>> -Peff

Re: can we prevent reflog deletion when branch is deleted?

Stephen Bash
In reply to this post by Jeff King
----- Original Message -----

> From: "Jeff King" <[hidden email]>
> Sent: Thursday, November 14, 2013 3:14:56 AM
> Subject: Re: can we prevent reflog deletion when branch is deleted?
>
> On Thu, Nov 14, 2013 at 05:48:50AM +0530, Sitaram Chamarty wrote:
>
> > Is there *any* way we can preserve a reflog for a deleted branch,
> > perhaps under logs/refs/deleted/<timestamp>/full/ref/name ?
>
> At GitHub, we log each change to an "audit log" in addition to the
> regular reflog (we also stuff extra data from the environment into the
> reflog message). So even after a branch is deleted, its audit log
> entries remain, though you have to pull out the data by hand (git
> doesn't know about it at all, except as an append-only sink for
> writing).

We recently ran into a similar situation at my $dayjob, so I made our
server side update hook log all pushes (including deletes) and added the
new log file to logrotate(8) -- note: make sure if logrotate recreates
the file that it allows everyone to write to it.  I'm sure it's not as
comprehensive as Peff's solution, but it's pretty simple for smaller
shops that want a little more protection.  Here are the relevant
excerpts from the script:

#!/usr/bin/env python

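import os, sys, pwd, stat
from datetime import datetime

def print_flush(msg):
    # Not shown in the original excerpt; assumed here to print the
    # message and flush so the pushing client sees it right away
    sys.stdout.write(msg + '\n')
    sys.stdout.flush()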

def log_push(too_many_changes):
    log_file = 'push-log.txt'
    try:
        f = open(log_file, 'a')

        try:
            # In case we just created the file, attempt to chmod it
            # (0o666 is accepted by Python 2.6+ and Python 3)
            os.chmod(log_file, 0o666)
        except OSError:
            # chmod will fail if the current user isn't the owner, but
            # if we've gotten this far we already have write permissions,
            # so just continue quietly
            pass

        # Linux/Mac okay, bad for Windows
        username = pwd.getpwuid(os.getuid())[0]
        f.write('%s: %s push by %s of %s from %s to %s\n'% \
                (datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
                'Failed' if too_many_changes else 'Successful', username,
                refname, oldsha, newsha))
        f.close()
    except IOError:
        try:
            log_stats = os.stat(log_file)
            # Figure out owner and permissions
            log_owner = pwd.getpwuid(log_stats.st_uid).pw_name
            log_perm = oct(stat.S_IMODE(log_stats.st_mode))
            # Parenthesize the joined string so % formats the whole message
            print_flush(('Unable to open %s for appending. Current owner '
                         'is %s and permissions are %s.') % (log_file,
                        log_owner, log_perm))
        except Exception:
            exception,desc,stack = sys.exc_info()
            print_flush(('Unable to open log file.  While generating error'
                         ' message encountered error: %s') % (desc,))

if len(sys.argv) != 4:
    print_flush('Usage: %s refname oldsha newsha'%sys.argv[0])
    sys.exit(1)

refname = sys.argv[1]
oldsha = sys.argv[2]
newsha = sys.argv[3]

if newsha == '0'*40:
    # Deleted ref: no checks to run, but still log the deletion
    log_push(False)
    sys.exit(0)

# ... checking for various rule/style violations ...

log_push(too_many_changes)
if too_many_changes:
    sys.exit(1)
else:
    sys.exit(0)
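
To get a deleted branch back from a log like this, one needs the last SHA it held before the deletion. A rough Python sketch of that lookup, assuming the line format written by log_push above and usernames without spaces; `deleted_refs` is a hypothetical helper, not part of the original script:

```python
import re

ZERO_SHA = '0' * 40

# Matches lines such as:
# 2013-11-14 10:05:00: Successful push by bob of refs/heads/foo from <old> to <new>
LINE_RE = re.compile(
    r'^(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): '
    r'(?P<status>Successful|Failed) push by (?P<user>\S+) '
    r'of (?P<ref>\S+) from (?P<old>[0-9a-f]{40}) to (?P<new>[0-9a-f]{40})$')


def deleted_refs(lines):
    """Map each ref whose latest accepted push was a deletion to the
    SHA it held just before that deletion."""
    state = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group('status') != 'Successful':
            continue  # skip rejected pushes and unparseable lines
        if m.group('new') == ZERO_SHA:
            state[m.group('ref')] = m.group('old')
        else:
            # The ref was (re)created or updated, so it is not deleted
            state.pop(m.group('ref'), None)
    return state
```

With the SHA in hand, "git update-ref <refname> <sha>" on the server recreates the branch, as long as the objects have not been pruned in the meantime.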

Re: can we prevent reflog deletion when branch is deleted?

Sitaram Chamarty
In reply to this post by Jeff King
On 11/14/2013 01:44 PM, Jeff King wrote:

> On Thu, Nov 14, 2013 at 05:48:50AM +0530, Sitaram Chamarty wrote:
>
>> Is there *any* way we can preserve a reflog for a deleted branch,
>> perhaps under logs/refs/deleted/<timestamp>/full/ref/name ?
>
> I had patches to do something like this here:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/201715/focus=201752
>
> but there were definitely some buggy corners, as much of the code
> assumed you needed to have a ref to have a reflog. I don't even run with
> it locally anymore.
>
> At GitHub, we log each change to an "audit log" in addition to the
> regular reflog (we also stuff extra data from the environment into the
> reflog message). So even after a branch is deleted, its audit log
> entries remain, though you have to pull out the data by hand (git
> doesn't know about it at all, except as an append-only sink for
> writing). And git doesn't use the audit log for connectivity, either, so
> eventually the objects could be pruned.
>
>> Just some basic protection -- don't delete the reflog, and instead,
>> rename it to something that preserves the name but in a different
>> namespace.
>
> That part is easy. Accessing it seamlessly and handling reflog
> expiration are a little harder. Not because they're intractable, but
> just because there are some low-level assumptions in the git code. The
> patch series I mentioned above mostly works. It probably just needs
> somebody to go through and find the corner cases.

The use cases I am talking about are those where someone deleted
something and it was noticed well within the earliest of Git's
expire timeouts.

So, no need to worry about expiry times and connecting it with object
pruning.  Really, just the equivalent of a "cp" or "mv" of one file is
all that most people need.

Gitolite's log does the same job as the audit log you describe.  So no
one who uses Gitolite needs this feature.  But people shouldn't have to
install Gitolite or anything else just to get this either!
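
As an illustration of how small the "cp" variant is, here is a Python sketch that copies a ref's reflog into the logs/refs/deleted/<timestamp>/full/ref/name layout proposed earlier in the thread. It is only a sketch under assumptions: the function name `preserve_reflog` and the timestamp format are invented, it would have to be run before the deletion goes through (for example from an update hook that sees an all-zero new SHA), and it does not address how Git itself treats files under that path during gc or reflog expiry.

```python
import os
import shutil
import time


def preserve_reflog(git_dir, refname, now=None):
    """Copy logs/<refname> to logs/refs/deleted/<timestamp>/<refname>.

    Returns the path of the copy, or None if the ref has no reflog.
    """
    parts = refname.split('/')
    src = os.path.join(git_dir, 'logs', *parts)
    if not os.path.isfile(src):
        return None
    # UTC timestamp keeps copies of the same ref name apart
    stamp = time.strftime('%Y%m%d-%H%M%S', time.gmtime(now))
    dst = os.path.join(git_dir, 'logs', 'refs', 'deleted', stamp, *parts)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copy2(src, dst)
    return dst
```

Because the copy lives under its own timestamped directory, the original path is free again once the branch is gone, so a later "foo/bar" does not collide with the saved log of a deleted "foo".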