Massive repository corruptions

Enrico Weigelt

Hi folks,


I've just reorganized several repositories (e.g. split a large
repo into several small ones), and then I had massive corruption
(broken pack files) in the new repos (after they had already been clean).

Maybe it has something to do with a cronjob which frequently runs GC
on all the repos, and it could get even worse if the filesystem
sometimes fills up during this process.

Could multiple GCs running on the same repo cause this?


cu
--
----------------------------------------------------------------------
 Enrico Weigelt, metux IT service -- http://www.metux.de/

 phone:  +49 36207 519931  email: [hidden email]
 mobile: +49 151 27565287  icq:   210169427         skype: nekrad666
----------------------------------------------------------------------
 Embedded-Linux / Portierung / Opensource-QM / Verteilte Systeme
----------------------------------------------------------------------
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Massive repository corruptions

Avery Pennarun
On Mon, Jul 12, 2010 at 9:56 PM, Enrico Weigelt <[hidden email]> wrote:
> I've just reorganized several repositories (e.g. split a large
> repo into several small ones), and then I had massive corruption
> (broken pack files) in the new repos (after they had already been clean).
>
> Maybe it has something to do with a cronjob which frequently runs GC
> on all the repos, and it could get even worse if the filesystem
> sometimes fills up during this process.
>
> Could multiple GCs running on the same repo cause this?

Multiple simultaneous gc's shouldn't be a problem - git locks things
as it needs them.  Plus, git only removes objects after it has safely
created a new packfile that contains them.  Maybe a filesystem filling
up could cause a problem, but git should be detecting that if it
happens (maybe there's a bug that causes it to not notice, though).

You could experience corruption if your computer crashed before
everything was synced to disk.

Do you know which packfiles are corrupted?  Does 'git index-pack' on
the files reveal anything?

Be sure to make a backup copy of your corrupted repositories before
doing any experiments, or you might accidentally fix the problem and
make it harder to trace.

Good luck.

Avery

Re: Massive repository corruptions

Enrico Weigelt
* Avery Pennarun <[hidden email]> wrote:

> Multiple simultaneous gc's shouldn't be a problem - git locks things
> as it needs them.  Plus, git only removes objects after it has safely
> created a new packfile that contains them.  Maybe a filesystem filling
> up could cause a problem, but git should be detecting that if it
> happens (maybe there's a bug that causes it to not notice, though).

Okay.

> You could experience corruption if your computer crashed before
> everything was synced to disk.

No machine crash, and no sign of filesystem or disk problems
(according to kernel log).

> Do you know which packfiles are corrupted?  Does 'git index-pack' on
> the files reveal anything?

git@blackwidow ~/metux/work.git/pack $ git index-pack pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
error: inflate: data stream error (incorrect data check)
fatal: pack has bad object at offset 37075832: inflate returned -3

(that's essentially the same thing git-gc reports)
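
For context: "incorrect data check" is the message zlib gives when a stream inflates but its trailing Adler-32 checksum does not match, and -3 is zlib's Z_DATA_ERROR code. The following is only a minimal Python sketch that provokes the same message on a deliberately damaged stream; the payload and the flipped bit are made up for illustration, and this is not how git itself reads packs.

```python
import zlib


def inflate_error(stream: bytes):
    """Return zlib's error text for a stream, or None if it inflates cleanly."""
    try:
        zlib.decompress(stream)
        return None
    except zlib.error as exc:
        return str(exc)


# Hypothetical payload standing in for one packed object.
payload = b"example object contents\n" * 1000
good = zlib.compress(payload)

# Flip one bit in the last byte, which is part of the Adler-32 trailer:
# the deflate data still decodes, but the checksum no longer matches.
bad = good[:-1] + bytes([good[-1] ^ 0x01])

print(inflate_error(good))
print(inflate_error(bad))
```

The intact stream inflates without error, while the damaged one is rejected with error -3 and the "incorrect data check" text. Damage elsewhere in a stream can produce other zlib messages, so this message alone does not say where the bad bytes came from.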


git@blackwidow ~/metux/work.git/pack $ git unpack-objects -r < pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
error: inflate: data stream error (incorrect data check)
error: inflate returned -3

error: inflate: data stream error (incorrect data check)
error: inflate returned -3

Unpacking objects: 100% (1223/1223), done.
fatal: final sha1 did not match
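
The "final sha1 did not match" line refers to the pack trailer: in a SHA-1 repository a pack file ends with a 20-byte SHA-1 computed over every byte before it. As a rough sketch (the function name `pack_trailer_ok` is hypothetical, and it checks only the trailer, not the individual objects), the same comparison can be done outside git:

```python
import hashlib
import os


def pack_trailer_ok(path, chunk_size=1 << 20):
    """Return True if the last 20 bytes of the file equal the SHA-1
    of everything that precedes them."""
    size = os.path.getsize(path)
    if size < 20:
        return False  # too short to even hold a trailer
    digest = hashlib.sha1()
    remaining = size - 20
    with open(path, "rb") as f:
        # Hash the body in chunks so large packs do not need to fit in RAM.
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                return False  # file shrank while reading
            digest.update(chunk)
            remaining -= len(chunk)
        trailer = f.read(20)
    return digest.digest() == trailer
```

A pack that fails this check on one machine but passes on another points away from the data itself and towards the machine doing the reading.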



cu

Re: Massive repository corruptions

Enrico Weigelt
* Enrico Weigelt <[hidden email]> wrote:

<snip>

What's strange:

When I copy pack files from another machine to this box and
run git index-pack on them here, it fails with the same error.

Also: pushing into a new (bare) repo sometimes fails with
inflate errors, and sometimes succeeds but leaves a broken packfile.


cu

Re: Massive repository corruptions

Enrico Weigelt
* Enrico Weigelt <[hidden email]> wrote:

> * Enrico Weigelt <[hidden email]> wrote:
>
> <snip>
>
> What's strange:
>
> When I copy pack files from another machine to this box and
> run git index-pack on them here, it fails with the same error.
>
> Also: pushing into a new (bare) repo sometimes fails with
> inflate errors, and sometimes succeeds but leaves a broken packfile.

Interesting: if I limit the pack size on the local repository
and manually copy the packs over via scp, git index-pack runs fine
on them.

A subsequent push doesn't seem to recognize the already transferred
packs and still sends the big one, which gets broken, but running
git-gc multiple times seems to clean up the mess.

Is there any way to limit the pack size on push?
(pack.packSizeLimit only affects git-repack, not remote transfers.)


cu

Re: Massive repository corruptions

Avery Pennarun
In reply to this post by Enrico Weigelt
On Tue, Jul 13, 2010 at 1:03 AM, Enrico Weigelt <[hidden email]> wrote:
> * Avery Pennarun <[hidden email]> wrote:
>> Do you know which packfiles are corrupted?  Does 'git index-pack' on
>> the files reveal anything?
>
> git@blackwidow ~/metux/work.git/pack $ git index-pack pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
> error: inflate: data stream error (incorrect data check)
> fatal: pack has bad object at offset 37075832: inflate returned -3
>
> (that's essentially the same thing git-gc reports)

What's the size of that .pack file?

Avery

Re: Massive repository corruptions

Valeo de Vries
In reply to this post by Enrico Weigelt
On 13 July 2010 06:31, Enrico Weigelt <[hidden email]> wrote:

> * Enrico Weigelt <[hidden email]> wrote:
>
> <snip>
>
> What's strange:
>
> When I copy pack files from another machine to this box and
> run git index-pack on them here, it fails with the same error.
>
> Also: pushing into a new (bare) repo sometimes fails with
> inflate errors, and sometimes succeeds but leaves a broken packfile.

The pack files you copied over from another machine, were they sane
(i.e. non-corrupt)? If so, that perhaps smells like your hard drive
could be on its last legs...
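
One way to answer the "were they sane" question is to checksum the same file on both machines and compare the results. A minimal sketch, assuming Python is available on both boxes (`file_sha1` is an illustrative name; a tool such as sha1sum does the same job):

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digests differ, the copy itself was damaged in transit or on disk; if they match and index-pack still fails on only one machine, the file contents are not the problem.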

Re: Massive repository corruptions

Enrico Weigelt
In reply to this post by Avery Pennarun
* Avery Pennarun <[hidden email]> wrote:

> On Tue, Jul 13, 2010 at 1:03 AM, Enrico Weigelt <[hidden email]> wrote:
> > * Avery Pennarun <[hidden email]> wrote:
> >> Do you know which packfiles are corrupted?  Does 'git index-pack' on
> >> the files reveal anything?
> >
> > git@blackwidow ~/metux/work.git/pack $ git index-pack pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
> > error: inflate: data stream error (incorrect data check)
> > fatal: pack has bad object at offset 37075832: inflate returned -3
> >
> > (that's essentially the same thing git-gc reports)
>
> What's the size of that .pack file?

Somewhat over 300MB.

Lowering the packfile size seemed to help.
(But I can still only do that for git-repack, not remote transfers.)


cu

Re: Massive repository corruptions

Avery Pennarun
On Tue, Jul 13, 2010 at 6:22 AM, Enrico Weigelt <[hidden email]> wrote:

> * Avery Pennarun <[hidden email]> wrote:
>> On Tue, Jul 13, 2010 at 1:03 AM, Enrico Weigelt <[hidden email]> wrote:
>> > * Avery Pennarun <[hidden email]> wrote:
>> >> Do you know which packfiles are corrupted?  Does 'git index-pack' on
>> >> the files reveal anything?
>> >
>> > git@blackwidow ~/metux/work.git/pack $ git index-pack pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
>> > error: inflate: data stream error (incorrect data check)
>> > fatal: pack has bad object at offset 37075832: inflate returned -3
>> >
>> > (that's essentially the same thing git-gc reports)
>>
>> What's the size of that .pack file?
>
> Somewhat over 300MB.
>
> Lowering the packfile size seemed to help.
> (But I can still only do that for git-repack, not remote transfers.)

If you got corruption at offset 37,075,832 (about 37 megs) and the
pack is over 300 megs, then the file itself is corrupted right in the
middle (not truncated) and this couldn't have been caused by disk full
errors.  Either you have memory corruption problems, or disk
corruption problems, or filesystem corruption problems.  You'd better
watch out.

Forcing the packfile size to be smaller probably just changes your
memory access patterns and moves your errors around.  But it doesn't
sound like a git bug at this point.
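
To make the truncated-versus-damaged distinction concrete: given a known-good copy of the same pack, the offset of the first differing byte shows which case applies. This is a sketch under that assumption (a good copy is at hand), and `first_difference` is a made-up helper name:

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the offset of the first differing byte, or None if the
    files are identical. If one file is a prefix of the other, the
    returned offset equals the length of the shorter file."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact byte inside the mismatching chunks.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file ended early.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended, no difference found
            offset += len(a)
```

An offset equal to the size of the shorter file means truncation, which is what a full disk would tend to produce. An offset well inside both files, like 37 MB into a 300 MB pack, means bytes were altered in place.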

Avery

Re: Massive repository corruptions

Enrico Weigelt
* Avery Pennarun <[hidden email]> wrote:

> If you got corruption at offset 37,075,832 (about 37 megs) and the
> pack is over 300 megs, then the file itself is corrupted right in the
> middle (not truncated) and this couldn't have been caused by disk full
> errors.  Either you have memory corruption problems, or disk
> corruption problems, or filesystem corruption problems.  You'd better
> watch out.

Hmm, I have no signs of any hardware corruption, but I had a patched
version of zlib installed. Maybe some of my patches broke it, and
some strange overflow or something like that caused the trouble.

Meanwhile, after reinstalling (unpatched) zlib and recloning the
broken repos, everything seems fine again. Maybe some of you would
like to have a look at my zlib patches ;-o


cu

Re: Massive repository corruptions

Enrico Weigelt
* Enrico Weigelt <[hidden email]> wrote:

> Hmm, I have no signs of any hardware corruption, but I had a patched
> version of zlib installed. Maybe some of my patches broke it, and
> some strange overflow or something like that caused the trouble.
>
> Meanwhile, after reinstalling (unpatched) zlib and recloning the
> broken repos, everything seems fine again. Maybe some of you would
> like to have a look at my zlib patches ;-o

This only seemed to help for a while. I again have trouble with broken
repos. But the strange thing: it seems to affect only large ones. For
example, I could git clone and repeatedly gc --aggressive the git
source without trouble.

If it *is* a hardware problem (which isn't that implausible, since that
machine is the only one making trouble now), how can I detect it?
Shouldn't broken memory or a broken disk raise some kernel log message?


cu

Re: Massive repository corruptions

Enrico Weigelt
* Jussi Sirpoma <[hidden email]> wrote:

> I once had a difficult-to-trace memory problem on a box when one of the
> last memory banks was bad. It was only used during high-load situations
> while compiling the kernel or something similar. The problem was finally
> pinpointed by memtest86, which stresses all memory.

Hmm, do you know some way to do a memory stress test without rebooting?
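
A crude userspace approximation is to fill a large amount of memory with known bit patterns and read it back. The sketch below assumes Python is available; the function name, sizes and patterns are arbitrary choices for illustration. It can only touch the pages the kernel happens to hand to the process, and reads may be served from CPU caches, so a clean run proves little and it is no substitute for memtest86.

```python
def check_memory(total_mb=64, block_mb=8, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill blocks of memory with each pattern, read them back, and
    return the number of (block, pattern) checks that failed."""
    block_size = block_mb * 1024 * 1024
    blocks = [bytearray(block_size) for _ in range(total_mb // block_mb)]
    failures = 0
    for pattern in patterns:
        expected = bytes([pattern]) * block_size
        # Write the pattern to every block first, then verify, so that
        # data sits in memory for a while before it is read back.
        for block in blocks:
            block[:] = expected
        for block in blocks:
            if block != expected:
                failures += 1
    return failures


if __name__ == "__main__":
    print("failed checks:", check_memory())
```

Raising `total_mb` towards the amount of free RAM makes it more likely that a bad bank gets used, in the same spirit as the high-load situations described above. Dedicated userspace tools such as memtester do this far more thoroughly.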


cu

Re: Massive repository corruptions

Enrico Weigelt
* Jussi Sirpoma <[hidden email]> wrote:

> Sorry, not really. Maybe compiling the kernel with lots of jobs would
> reveal some problems?

okay, I'll have a try :)

BTW: I'm currently recreating one of the broken repos (mozilla,
which might be large enough for a stress test ;-)) by adding and
fetching the remotes step by step. Repository size is now about 250M
(I reduced the pack size to 32M, since this already helped on some
other repos) - no breakage has occurred yet.

Let's see where it goes ...


cu