Make the git codebase thread-safe

49 messages

Make the git codebase thread-safe

Stefan Zager-2
We in the chromium project have a keen interest in adding threading to
git in the pursuit of performance for lengthy operations (checkout,
status, blame, ...).  Our motivation comes from hitting some
performance walls when working with repositories the size of chromium
and blink:

https://chromium.googlesource.com/chromium/src
https://chromium.googlesource.com/chromium/blink

We are particularly concerned with the performance of msysgit, and we
have already chalked up a significant performance gain by turning on
the threading code in pack-objects (which was already enabled for
posix platforms, but not on msysgit, owing to the lack of a correct
pread implementation).

To this end, I'd like to start submitting patches that make the code
base generally more thread-safe and thread-friendly.  Right after this
email, I'm going to send the first such patch, which makes the global
list of pack files (packed_git) internal to sha1_file.c.

I realize this may be a contentious topic, and I'd like to get
feedback on the general effort to add more threading to git.  I'd
appreciate any feedback you'd like to give up front.

Thanks!

Stefan Zager
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
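
The idea behind that first patch (hiding a global list behind one module so that access can later be serialized) can be illustrated with a minimal sketch. This is not the patch itself: git is written in C, Python is used here only for brevity, and `_packs`, `add_pack` and `snapshot_packs` are hypothetical names, not git's API.

```python
import threading

# Hypothetical stand-in for a module-private pack list (illustrative only).
_packs = []
_packs_lock = threading.Lock()


def add_pack(name):
    """Register a pack name; callers never touch the list directly."""
    with _packs_lock:
        if name not in _packs:
            _packs.append(name)


def snapshot_packs():
    """Return a copy so callers can iterate without holding the lock."""
    with _packs_lock:
        return list(_packs)


def _register_many(prefix, count):
    for i in range(count):
        add_pack(f"{prefix}-{i}.pack")


# Four threads register 100 distinct packs each through the accessor.
workers = [
    threading.Thread(target=_register_many, args=(f"t{n}", 100))
    for n in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Because every reader and writer goes through the two accessors, the locking policy can change in one place without touching callers.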

Re: Make the git codebase thread-safe

Robin H. Johnson-2
On Tue, Feb 11, 2014 at 05:54:51PM -0800,  Stefan Zager wrote:
> We in the chromium project have a keen interest in adding threading to
> git in the pursuit of performance for lengthy operations (checkout,
> status, blame, ...).  Our motivation comes from hitting some
> performance walls when working with repositories the size of chromium
> and blink:
+1 from Gentoo on performance improvements for large repos.

The main repository in the ongoing Git migration project looks to be in
the 1.5GB range (and for those that want to propose splitting it up, we
have explored that option and found it lacking), with very deep history
(but no branches of note, and very few tags).

--
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail     : [hidden email]
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

Re: Make the git codebase thread-safe

Duy Nguyen
In reply to this post by Stefan Zager-2
On Wed, Feb 12, 2014 at 8:54 AM, Stefan Zager <[hidden email]> wrote:
> We in the chromium project have a keen interest in adding threading to
> git in the pursuit of performance for lengthy operations (checkout,
> status, blame, ...).  Our motivation comes from hitting some
> performance walls when working with repositories the size of chromium
> and blink:
>
> https://chromium.googlesource.com/chromium/src
> https://chromium.googlesource.com/chromium/blink

I have no comments about thread safety improvements (well, not yet).
If you have investigated git performance on the chromium
repositories, could you please sum it up? Threading may be an option
to improve performance, but it's probably not the only option.
--
Duy

Re: Make the git codebase thread-safe

Duy Nguyen
In reply to this post by Robin H. Johnson-2
On Wed, Feb 12, 2014 at 9:02 AM, Robin H. Johnson <[hidden email]> wrote:

> On Tue, Feb 11, 2014 at 05:54:51PM -0800,  Stefan Zager wrote:
>> We in the chromium project have a keen interest in adding threading to
>> git in the pursuit of performance for lengthy operations (checkout,
>> status, blame, ...).  Our motivation comes from hitting some
>> performance walls when working with repositories the size of chromium
>> and blink:
> +1 from Gentoo on performance improvements for large repos.
>
> The main repository in the ongoing Git migration project looks to be in
> the 1.5GB range (and for those that want to propose splitting it up, we
> have explored that option and found it lacking), with very deep history
> (but no branches of note, and very few tags).

From v1.9, shallow clone should work for all push/pull/clone... so
history depth does not matter (on the client side). As for
gentoo-x86's large worktree, using index v4 and avoiding full-tree
operations (e.g. "status .", not "status") should make all
operations reasonably fast. I plan to make "status" fast even without
path limiting with the help of inotify, but that's not going to be
finished soon. Did I miss anything else?
--
Duy

Re: Make the git codebase thread-safe

Karsten Blees-2
Am 12.02.2014 04:43, schrieb Duy Nguyen:

> On Wed, Feb 12, 2014 at 9:02 AM, Robin H. Johnson <[hidden email]> wrote:
>> On Tue, Feb 11, 2014 at 05:54:51PM -0800,  Stefan Zager wrote:
>>> We in the chromium project have a keen interest in adding threading to
>>> git in the pursuit of performance for lengthy operations (checkout,
>>> status, blame, ...).  Our motivation comes from hitting some
>>> performance walls when working with repositories the size of chromium
>>> and blink:
>> +1 from Gentoo on performance improvements for large repos.
>>
>> The main repository in the ongoing Git migration project looks to be in
>> the 1.5GB range (and for those that want to propose splitting it up, we
>> have explored that option and found it lacking), with very deep history
>> (but no branches of note, and very few tags).
>
> From v1.9, shallow clone should work for all push/pull/clone... so
> history depth does not matter (on the client side). As for
> gentoo-x86's large worktree, using index v4 and avoiding full-tree
> operations (e.g. "status .", not "status") should make all
> operations reasonably fast. I plan to make "status" fast even without
> path limiting with the help of inotify, but that's not going to be
> finished soon. Did I miss anything else?
>

Regarding git-status on msysgit, enable core.preloadindex and core.fscache (as of 1.8.5.2).

There's no inotify on Windows, and I gave up using ReadDirectoryChangesW to keep fscache up to date, as it _may_ report DOS file names (e.g. C:\PROGRA~1 instead of C:\Program Files).

Re: Make the git codebase thread-safe

Erik Faye-Lund-2
In reply to this post by Stefan Zager-2
On Wed, Feb 12, 2014 at 2:54 AM, Stefan Zager <[hidden email]> wrote:

> We in the chromium project have a keen interest in adding threading to
> git in the pursuit of performance for lengthy operations (checkout,
> status, blame, ...).  Our motivation comes from hitting some
> performance walls when working with repositories the size of chromium
> and blink:
>
> https://chromium.googlesource.com/chromium/src
> https://chromium.googlesource.com/chromium/blink
>
> We are particularly concerned with the performance of msysgit, and we
> have already chalked up a significant performance gain by turning on
> the threading code in pack-objects (which was already enabled for
> posix platforms, but not on msysgit, owing to the lack of a correct
> pread implementation).

How did you manage to do this? I'm not aware of any way to implement
pread on Windows (without going down the insanity-path of wrapping and
potentially locking inside every IO operation)...

Re: Make the git codebase thread-safe

Stefan Zager-2
In reply to this post by Duy Nguyen
On Tue, Feb 11, 2014 at 6:11 PM, Duy Nguyen <[hidden email]> wrote:
>
> I have no comments about thread safety improvements (well, not yet).
> If you have investigated git performance on the chromium
> repositories, could you please sum it up? Threading may be an option
> to improve performance, but it's probably not the only option.

Well, the painful operations that we use frequently are pack-objects,
checkout, status, and blame.  Anything on Windows that touches a lot
of files is miserable due to the usual file system slowness on
Windows, and luafv.sys (the UAC file virtualization driver) seems to
make it much worse.

With threading turned on, pack-objects on Windows now takes about
twice as long as on Linux, which is still more than a 2x improvement
over the non-threaded operation.

Checkout is really bad on Windows.  The blink repository is ~200K
files, and a full clean checkout from the index takes about 10 seconds
on Linux, and about 3:30 on Windows.  I used the Very Sleepy profiler
to see where all the time was spent on Windows: 55% of the time was
spent in OpenFile, and 25% in CloseFile (both in win32).  My immediate
goal is to add threading to checkout, so those file system calls can
be done in parallel.

Enabling the fscache speeds up status quite a bit.  I'm optimistic
that parallelizing the stat calls will yield a further improvement.
Beyond that, it may not be possible to do much more without using a
file system watcher daemon, like facebook does with mercurial.
(https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/)

Blame is something that chromium and blink developers use heavily, and
it is not unusual for a blame invocation on the blink repository to
run for 30 seconds.  It seems like it should be possible to
parallelize blame, but it requires pack file operations to be
thread-safe.

Stefan
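
To make "parallelizing the stat calls" concrete, here is a minimal sketch that issues the calls from a thread pool, so that several file system requests are in flight at once. It is written in Python for brevity and is not how git implements core.preloadindex; `stat_paths_parallel` and its `workers` parameter are hypothetical names.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def stat_paths_parallel(paths, workers=8):
    """Return {path: size or None}, calling os.stat from a thread pool."""
    def stat_one(path):
        try:
            return path, os.stat(path).st_size
        except FileNotFoundError:
            # A missing path is reported as None instead of failing the batch.
            return path, None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(stat_one, paths))


# Small demonstration on temporary files of known sizes.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(20):
        p = os.path.join(tmp, f"file{i}.txt")
        with open(p, "wb") as fh:
            fh.write(b"x" * i)
        paths.append(p)
    missing = os.path.join(tmp, "missing.txt")
    sizes = stat_paths_parallel(paths + [missing])
```

Whether this helps in practice depends on how well the file system and the disk handle concurrent requests, which is the point debated later in the thread.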

Re: Make the git codebase thread-safe

Stefan Zager-2
In reply to this post by Duy Nguyen
On Tue, Feb 11, 2014 at 7:43 PM, Duy Nguyen <[hidden email]> wrote:
>
> From v1.9, shallow clone should work for all push/pull/clone... so
> history depth does not matter (on the client side). As for
> gentoo-x86's large worktree, using index v4 and avoiding full-tree
> operations (e.g. "status .", not "status") should make all
> operations reasonably fast. I plan to make "status" fast even without
> path limiting with the help of inotify, but that's not going to be
> finished soon. Did I miss anything else?

Chromium developers frequently want to run status over their entire
checkout, and a lot of them run 'git commit -a'.  We want to do
everything possible to speed this up.

Re: Make the git codebase thread-safe

Stefan Zager
In reply to this post by Erik Faye-Lund-2
On Wed, Feb 12, 2014 at 3:59 AM, Erik Faye-Lund <[hidden email]> wrote:

> On Wed, Feb 12, 2014 at 2:54 AM, Stefan Zager <[hidden email]> wrote:
>>
>> We are particularly concerned with the performance of msysgit, and we
>> have already chalked up a significant performance gain by turning on
>> the threading code in pack-objects (which was already enabled for
>> posix platforms, but not on msysgit, owing to the lack of a correct
>> pread implementation).
>
> How did you manage to do this? I'm not aware of any way to implement
> pread on Windows (without going down the insanity-path of wrapping and
> potentially locking inside every IO operation)...

I don't want to steal the thunder of my coworker, who wrote the
implementation.  He plans to submit it upstream soon-ish.  It relies
on using the lpOverlapped argument to ReadFile(), with some additional
tomfoolery to make sure that the implicit position pointer for the
file descriptor doesn't get modified.

Stefan
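
For readers wondering why the "implicit position pointer" matters: POSIX pread reads at an explicit offset and leaves the descriptor's file offset alone, whereas a seek-then-read emulation moves the shared offset, so two threads can disturb each other. The sketch below shows only that POSIX behavior, in Python and on a POSIX system (os.pread is not available on Windows). It does not reproduce the ReadFile()-based implementation described above, and the two helper names are hypothetical.

```python
import os
import tempfile


def read_at_with_seek(fd, length, offset):
    """Seek-then-read emulation: moves the shared file offset."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, length)


def read_at_with_pread(fd, length, offset):
    """Positional read: the descriptor's file offset is left untouched."""
    return os.pread(fd, length, offset)


fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"0123456789")
    os.lseek(fd, 2, os.SEEK_SET)  # pretend another reader left the offset at 2

    pread_data = read_at_with_pread(fd, 3, 5)
    offset_after_pread = os.lseek(fd, 0, os.SEEK_CUR)

    seek_data = read_at_with_seek(fd, 3, 5)
    offset_after_seek = os.lseek(fd, 0, os.SEEK_CUR)
finally:
    os.close(fd)
    os.unlink(path)
```

Both helpers return the same bytes, but only the seek-based one changes the offset that other users of the descriptor will see.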

Re: Make the git codebase thread-safe

Erik Faye-Lund-2
On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager <[hidden email]> wrote:

> On Wed, Feb 12, 2014 at 3:59 AM, Erik Faye-Lund <[hidden email]> wrote:
>> On Wed, Feb 12, 2014 at 2:54 AM, Stefan Zager <[hidden email]> wrote:
>>>
>>> We are particularly concerned with the performance of msysgit, and we
>>> have already chalked up a significant performance gain by turning on
>>> the threading code in pack-objects (which was already enabled for
>>> posix platforms, but not on msysgit, owing to the lack of a correct
>>> pread implementation).
>>
>> How did you manage to do this? I'm not aware of any way to implement
>> pread on Windows (without going down the insanity-path of wrapping and
>> potentially locking inside every IO operation)...
>
> I don't want to steal the thunder of my coworker, who wrote the
> implementation.  He plans to submit it upstream soon-ish.  It relies
> on using the lpOverlapped argument to ReadFile(), with some additional
> tomfoolery to make sure that the implicit position pointer for the
> file descriptor doesn't get modified.

Is the code available somewhere? I'm especially interested in the
"additional tomfoolery to make sure that the implicit position pointer
for the file descriptor doesn't get modified"-part, as this was what I
ended up butting my head into when trying to do this myself.

Re: Make the git codebase thread-safe

Matthieu Moy-2
In reply to this post by Stefan Zager-2
Stefan Zager <[hidden email]> writes:

> I'm optimistic that parallelizing the stat calls will yield a further
> improvement.

It has already been mentioned in the thread, but in case you overlooked
it: did you look at core.preloadindex? It seems at least very close to
what you want.

--
Matthieu Moy
http://www-verimag.imag.fr/~moy/

Re: Make the git codebase thread-safe

Stefan Zager
In reply to this post by Erik Faye-Lund-2
On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund <[hidden email]> wrote:

> On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager <[hidden email]> wrote:
>>
>> I don't want to steal the thunder of my coworker, who wrote the
>> implementation.  He plans to submit it upstream soon-ish.  It relies
>> on using the lpOverlapped argument to ReadFile(), with some additional
>> tomfoolery to make sure that the implicit position pointer for the
>> file descriptor doesn't get modified.
>
> Is the code available somewhere? I'm especially interested in the
> "additional tomfoolery to make sure that the implicit position pointer
> for the file descriptor doesn't get modified"-part, as this was what I
> ended up butting my head into when trying to do this myself.

https://chromium-review.googlesource.com/#/c/186104/

Re: Make the git codebase thread-safe

Erik Faye-Lund-2
On Wed, Feb 12, 2014 at 7:34 PM, Stefan Zager <[hidden email]> wrote:

> On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund <[hidden email]> wrote:
>> On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager <[hidden email]> wrote:
>>>
>>> I don't want to steal the thunder of my coworker, who wrote the
>>> implementation.  He plans to submit it upstream soon-ish.  It relies
>>> on using the lpOverlapped argument to ReadFile(), with some additional
>>> tomfoolery to make sure that the implicit position pointer for the
>>> file descriptor doesn't get modified.
>>
>> Is the code available somewhere? I'm especially interested in the
>> "additional tomfoolery to make sure that the implicit position pointer
>> for the file descriptor doesn't get modified"-part, as this was what I
>> ended up butting my head into when trying to do this myself.
>
> https://chromium-review.googlesource.com/#/c/186104/

ReOpenFile, that's fantastic. Thanks a lot!

Re: Make the git codebase thread-safe

Stefan Zager-2
In reply to this post by Matthieu Moy-2
On Wed, Feb 12, 2014 at 10:33 AM, Matthieu Moy
<[hidden email]> wrote:
> Stefan Zager <[hidden email]> writes:
>
>> I'm optimistic that parallelizing the stat calls will yield a further
>> improvement.
>
> It has already been mentioned in the thread, but in case you overlooked
> it: did you look at core.preloadindex? It seems at least very close to
> what you want.

Ah yes, sorry, I overlooked that.  We have indeed turned on
core.preloadindex, and it does indeed speed up status.  That speedup
is reflected in my previous comments about our observations working
with chromium and blink.

Stefan

Re: Make the git codebase thread-safe

David Kastrup
In reply to this post by Stefan Zager-2
Stefan Zager <[hidden email]> writes:

> On Tue, Feb 11, 2014 at 6:11 PM, Duy Nguyen <[hidden email]> wrote:
>>
>> I have no comments about thread safety improvements (well, not yet).
>> If you have investigated git performance on the chromium
>> repositories, could you please sum it up? Threading may be an option
>> to improve performance, but it's probably not the only option.
>
> Well, the painful operations that we use frequently are pack-objects,
> checkout, status, and blame.

Have you checked the patch in
<URL:http://thread.gmane.org/gmane.comp.version-control.git/241448> and
followups,
Message-ID: <[hidden email]>?

While this does not yet support -M and -C options, it's conceivable that
you don't use them in your server/scripts.

> Anything on Windows that touches a lot of files is miserable due to
> the usual file system slowness on Windows, and luafv.sys (the UAC file
> virtualization driver) seems to make it much worse.

There is an obvious solution here...  Dedicated hardware is not that
expensive.  Virtualization will always have a price.

> Blame is something that chromium and blink developers use heavily, and
> it is not unusual for a blame invocation on the blink repository to
> run for 30 seconds.  It seems like it should be possible to
> parallelize blame, but it requires pack file operations to be
> thread-safe.

Really, give the above patch a try.  I am taking longer to finish it
than anticipated (with a lot due to procrastination but that is,
unfortunately, a large part of my workflow), and it's cutting into my
"paychecks" (voluntary donations which to a good degree depend on timely
and nontrivial progress reports for my freely available work on GNU
LilyPond).

Note that it looks like the majority of the remaining time on GNU/Linux
tends to be spent in system time: I/O time, memory management.  And I
have an SSD drive.  When using packed repositories of considerable size,
decompression comes into play as well.  I don't think that you can hope
to get noticeably higher I/O throughput by multithreading, so really,
really, really consider dedicated hardware running on a native Linux
file system.

--
David Kastrup

Re: Make the git codebase thread-safe

Stefan Zager-2
On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup <[hidden email]> wrote:
> Stefan Zager <[hidden email]> writes:
>
>> Anything on Windows that touches a lot of files is miserable due to
>> the usual file system slowness on Windows, and luafv.sys (the UAC file
>> virtualization driver) seems to make it much worse.
>
> There is an obvious solution here...  Dedicated hardware is not that
> expensive.  Virtualization will always have a price.

Not sure I follow you.  We need to support people developing,
building, and testing on natively Windows machines.  And we need to
support users with reasonable hardware, including spinning disks.  If
we were only interested in optimizing for Google employees, each of
whom has one or more small nuclear reactors under their desk, this
would be easy.

>> Blame is something that chromium and blink developers use heavily, and
>> it is not unusual for a blame invocation on the blink repository to
>> run for 30 seconds.  It seems like it should be possible to
>> parallelize blame, but it requires pack file operations to be
>> thread-safe.
>
> Really, give the above patch a try.  I am taking longer to finish it
> than anticipated (with a lot due to procrastination but that is,
> unfortunately, a large part of my workflow), and it's cutting into my
> "paychecks" (voluntary donations which to a good degree depend on timely
> and nontrivial progress reports for my freely available work on GNU
> LilyPond).

I will give that a try.  How much of a performance improvement have you clocked?

> Note that it looks like the majority of the remaining time on GNU/Linux
> tends to be spent in system time: I/O time, memory management.  And I
> have an SSD drive.  When using packed repositories of considerable size,
> decompression comes into play as well.  I don't think that you can hope
> to get noticeably higher I/O throughput by multithreading, so really,
> really, really consider dedicated hardware running on a native Linux
> file system.

I have a background in hardware, and I have much more faith in modern
disk schedulers :)

Stefan

Re: Make the git codebase thread-safe

David Kastrup
Stefan Zager <[hidden email]> writes:

> On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup <[hidden email]> wrote:
>
>> Really, give the above patch a try.  I am taking longer to finish it
>> than anticipated (with a lot due to procrastination but that is,
>> unfortunately, a large part of my workflow), and it's cutting into my
>> "paychecks" (voluntary donations which to a good degree depend on timely
>> and nontrivial progress reports for my freely available work on GNU
>> LilyPond).
>
> I will give that a try.  How much of a performance improvement have
> you clocked?

Depends on file type and size.  With large files with lots of small
changes, performance improvements get more impressive.

Some ugly real-world examples are the Emacs repository, src/xdisp.c
(performance improvement about a factor of 3), a large file in the style
of /usr/share/dict/words clocking in at a factor of about 5.

Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there
are no improvements in system time (I/O) except for patch 4 of the
series which helps perhaps 20% or so.

So the benefits of the patch will come into play mostly for big, bad
files on Windows: other than that, the I/O time is likely to be the
dominant player anyway.

If you have benchmarked the stuff, for annoying cases expect I/O time to
go down maybe 10-20%, and user time to drop by a factor of 4.  Under
GNU/Linux, that makes for a significant overall improvement.  On
Windows, the payback is likely to be much smaller because of the worse I/O
performance.  Pity.

--
David Kastrup

Re: Make the git codebase thread-safe

Karsten Blees-2
In reply to this post by Erik Faye-Lund-2
Am 12.02.2014 19:37, schrieb Erik Faye-Lund:

> On Wed, Feb 12, 2014 at 7:34 PM, Stefan Zager <[hidden email]> wrote:
>> On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund <[hidden email]> wrote:
>>> On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager <[hidden email]> wrote:
>>>>
>>>> I don't want to steal the thunder of my coworker, who wrote the
>>>> implementation.  He plans to submit it upstream soon-ish.  It relies
>>>> on using the lpOverlapped argument to ReadFile(), with some additional
>>>> tomfoolery to make sure that the implicit position pointer for the
>>>> file descriptor doesn't get modified.
>>>
>>> Is the code available somewhere? I'm especially interested in the
>>> "additional tomfoolery to make sure that the implicit position pointer
>>> for the file descriptor doesn't get modified"-part, as this was what I
>>> ended up butting my head into when trying to do this myself.
>>
>> https://chromium-review.googlesource.com/#/c/186104/
>
> ReOpenFile, that's fantastic. Thanks a lot!

...but should be loaded dynamically via GetProcAddress, or are we ready to drop XP support?


Re: Make the git codebase thread-safe

Stefan Zager
On Wed, Feb 12, 2014 at 11:22 AM, Karsten Blees <[hidden email]> wrote:

> Am 12.02.2014 19:37, schrieb Erik Faye-Lund:
>> On Wed, Feb 12, 2014 at 7:34 PM, Stefan Zager <[hidden email]> wrote:
>>> On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund <[hidden email]> wrote:
>>>> On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager <[hidden email]> wrote:
>>>>>
>>>>> I don't want to steal the thunder of my coworker, who wrote the
>>>>> implementation.  He plans to submit it upstream soon-ish.  It relies
>>>>> on using the lpOverlapped argument to ReadFile(), with some additional
>>>>> tomfoolery to make sure that the implicit position pointer for the
>>>>> file descriptor doesn't get modified.
>>>>
>>>> Is the code available somewhere? I'm especially interested in the
>>>> "additional tomfoolery to make sure that the implicit position pointer
>>>> for the file descriptor doesn't get modified"-part, as this was what I
>>>> ended up butting my head into when trying to do this myself.
>>>
>>> https://chromium-review.googlesource.com/#/c/186104/
>>
>> ReOpenFile, that's fantastic. Thanks a lot!
>
> ...but should be loaded dynamically via GetProcAddress, or are we ready to drop XP support?

Right, that is an issue.  From our perspective, it's well past time to
drop XP support.

Re: Make the git codebase thread-safe

Junio C Hamano
In reply to this post by Stefan Zager-2
Stefan Zager <[hidden email]> writes:

> ...  I used the Very Sleepy profiler
> to see where all the time was spent on Windows: 55% of the time was
> spent in OpenFile, and 25% in CloseFile (both in win32).

This is somewhat interesting.

When we check things out, checkout_paths() has a list of paths to be
checked out, and iterates over them and calls checkout_entry().

I wonder if you can:

 - introduce a version of checkout_entry() that takes file
   descriptors to write to;

 - have asynchronous helper threads that pre-open the paths to be
   written out and feed <ce, file descriptor to be written> to a
   queue;

 - restructure that loop so that it reads the <ce, file descriptor
   to be written> from the queue, performs the actual writing out,
   and then feeds <file descriptor to be closed> to another queue; and

 - have other asynchronous helper threads that read <file
   descriptor to be closed> from the queue and close them.

Calls to write (and preparation of data to be written) will then
remain single-threaded, but it sounds like that codepath is not the
bottleneck in your measurement, so....
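
The structure proposed above can be sketched as a three-stage pipeline: one helper thread opens files, the main loop writes, and another helper thread closes. This is only an illustration in Python, not git code. `opener`, `closer`, `write_all` and `checkout` are hypothetical names, a (data, fd) pair stands in for <ce, file descriptor to be written>, and one helper thread per stage is an arbitrary choice.

```python
import os
import queue
import tempfile
import threading

_DONE = object()  # sentinel marking the end of a queue


def opener(entries, open_q):
    """Helper thread: pre-open each path and hand (data, fd) to the writer."""
    try:
        for path, data in entries:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            open_q.put((data, fd))
    finally:
        open_q.put(_DONE)  # always unblock the writer, even on error


def closer(close_q):
    """Helper thread: close descriptors the writer has finished with."""
    while True:
        fd = close_q.get()
        if fd is _DONE:
            break
        os.close(fd)


def write_all(fd, data):
    """Write every byte, retrying after short writes."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def checkout(entries):
    """Write all entries; only open() and close() run off the main thread."""
    open_q = queue.Queue(maxsize=64)  # bound how far the opener runs ahead
    close_q = queue.Queue()
    threads = [
        threading.Thread(target=opener, args=(entries, open_q)),
        threading.Thread(target=closer, args=(close_q,)),
    ]
    for t in threads:
        t.start()
    while True:
        item = open_q.get()
        if item is _DONE:
            break
        data, fd = item
        write_all(fd, data)  # writing stays single-threaded
        close_q.put(fd)
    close_q.put(_DONE)
    for t in threads:
        t.join()


# Demonstration: check out 100 small files, then read them back.
with tempfile.TemporaryDirectory() as tmp:
    entries = [
        (os.path.join(tmp, f"f{i}"), f"blob {i}\n".encode())
        for i in range(100)
    ]
    checkout(entries)
    written = []
    for path, _ in entries:
        with open(path, "rb") as fh:
            written.append(fh.read())
```

The bounded queue keeps the opener from racing arbitrarily far ahead of the writer. A real implementation would also need to cap the total number of open descriptors and propagate errors from the helper threads.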
