Bloom filters for have/want negotiation

Bloom filters for have/want negotiation

Michael Haggerty-2
I have been thinking about Wilhelm Bierbaum's talk at the last GitMerge
conference [1] in which he describes a scheme for using Bloom filters to
make the initial reference advertisement less expensive.

In his scheme (if I understand correctly) the client starts off the
conversation by passing the server a Bloom filter that indicates what
(refname, SHA-1) pairs the client already has. This makes it unnecessary
for the server to advertise those references, thereby reducing the cost
of incremental fetches dramatically if the server has very many references.

Because Bloom filters have false positives, this scheme is not 100%
reliable. Therefore I don't think we would want Git to depend on it.

But it got me thinking about how the client could use a Bloom filter in
a later stage of the negotiation, when telling the server what objects
it already has, while preserving 100% reliability.

The idea is to use connectivity information to correct mistakes caused
by reliance on the Bloom filter:

1. The server advertises the references that it has in the way that it
is currently done.
2. The client advertises the objects that it has (or some subset of
them; see below) via a Bloom filter.
3. The server sends the client the packfile that results from assuming
that the Bloom filter is giving correct answers. (This might mean that
too few objects are sent to the client.)
4. The client does a connectivity check on the packfile. If any objects
are missing, it asks the server for them via a reliable
(non-Bloom-filter-based) request.
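A Bloom filter can return false positives but never false negatives, which is what makes the correction in step 4 sufficient: anything the filter denies, the server can safely send. A minimal sketch of the data structure (the sizing and hash scheme here are illustrative, not what Git would actually use):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter over byte strings (illustrative only)."""

    def __init__(self, n_bits=8192, n_hashes=7):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, value):
        # Derive n_hashes bit positions from salted SHA-1 digests.
        for i in range(self.n_hashes):
            h = hashlib.sha1(b"%d:" % i + value).digest()
            yield int.from_bytes(h[:8], "big") % self.n_bits

    def add(self, value):
        for p in self._positions(value):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, value):
        # True may be a false positive; False is always correct.
        return all(self.bits[p // 8] >> (p % 8) & 1
                   for p in self._positions(value))
```

The server would send every object for which `might_contain` returns False; the connectivity check in step 4 then catches any objects withheld because of a false positive.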

How would one construct the Bloom filter for step 2? (Remember that a
properly-configured Bloom filter requires about 5 bits of space per
value stored for each factor of 0.1 in the false-positive rate. So, for
example, to store 5000 values with a 1% false-positive rate, the Bloom
filter would need to be approximately 5000 * 10 bits = 6.2 kB in size.)
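That rule of thumb can be written down directly (a sketch; 5 bits per factor of ten is the approximation used above, the theoretical figure being closer to 4.8):

```python
import math

def bloom_size_bytes(n_values, fp_rate):
    """Approximate Bloom filter size: ~5 bits per stored value
    for each factor of 10 reduction in the false-positive rate."""
    bits_per_value = 5 * math.log10(1.0 / fp_rate)
    return n_values * bits_per_value / 8

# 5000 values at a 1% false-positive rate -> ~6250 bytes (~6.2 kB),
# matching the example above
```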

Here are some possible schemes:

* Record *all* objects in the Bloom filter. The Git repo has
approximately 200k objects, so, supposing that we could live with a 10%
false-positive rate (see below), the Bloom filter would need to be about
125 kB.

* Record all commit objects in the Bloom filter. For the Git repo that
is about 40k commits, so for a 10% error rate the Bloom filter would
have to be about 25 kB.

* Record some subset of commits; for example, all unique branch and tag
tips, the peeled tags, plus some sparse subsets of commits deeper in the
history. The ls-remote for the Git repo lists 1730 unique SHA-1s, so,
supposing we choose 10x that number with a 1% error rate, the Bloom
filter would be about 20 kB.

* Record only the branch and tag tips and peeled tags. Please note that
for situations where the client has fetched from the server before and
still has the remote-tracking references from that fetch, this scheme
might work surprisingly well. For the Git repository, with a 1% error
rate, this would be about 2 kB.

For the first two schemes, we could tolerate a pretty high error rate
because the server could perform additional consistency checks to reduce
the error rate. For example, if the Bloom filter reports that the client
has commit X, but that the client does *not* have a parent of X, then
the server can assume that the check of X was a false positive and
discard it. Such consistency checks would not be possible with the third
or fourth schemes, so I have chosen lower false-positive rates for those
schemes.
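For the commit-based schemes, that server-side consistency check could look something like this (a sketch; the names and the commit-graph representation are hypothetical):

```python
def discard_inconsistent(claimed, parents):
    """Drop claimed commits whose parents are not also claimed:
    such claims are likely Bloom-filter false positives. Repeats
    until stable, so a discarded parent also invalidates its
    descendants. `parents` maps commit -> list of parent commits."""
    claimed = set(claimed)
    changed = True
    while changed:
        changed = False
        for c in list(claimed):
            if any(p not in claimed for p in parents.get(c, [])):
                claimed.discard(c)
                changed = True
    return claimed
```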

Additional points:

* The client can decide what to include in the Bloom filter. For
example, if it has done a recent fetch from the server, it might want to
send only the remote-tracking branch tips. But if it has never fetched
from this server before, it might want to send all commits.

* A Bloom filter could be computed at repack time rather than at each
fetch. On fetch, the precomputed Bloom filters could be loaded, any
loose objects added to it, and the result sent to the server.
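Merging the precomputed filter with one built from the loose objects is cheap, because Bloom filters with the same size and hash functions combine by bitwise OR (sketch):

```python
def merge_filters(packed, loose):
    """OR together two Bloom filter bit arrays of identical geometry.
    The result answers membership for the union of both object sets."""
    if len(packed) != len(loose):
        raise ValueError("filters must share size and hash functions")
    return bytes(a | b for a, b in zip(packed, loose))
```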

I don't have a gut feeling about the cost of this phase of the
negotiation, so I don't know whether this would be a net savings, let
alone one that is worth the added complexity. But I wanted to document
the idea in case somebody thinks it has promise. (I have no plans to
pursue it.)

Michael

[1] http://git-merge.com/videos/scaling-git-at-twitter-wilhelm-bierbaum.html

--
Michael Haggerty
[hidden email]

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bloom filters for have/want negotiation

Junio C Hamano
Michael Haggerty <[hidden email]> writes:

> 1. The server advertises the references that it has in the way that it
> is currently done.
> 2. The client advertises the objects that it has (or some subset of
> them; see below) via a Bloom filter.
> 3. The server sends the client the packfile that results from assuming
> that the Bloom filter is giving correct answers. (This might mean that
> too few objects are sent to the client.)

Wouldn't this, upon a false hit, end up sending objects the server
has (perhaps reachable from a branch that is not advertised to the
client) that are not reachable from the tips the client asked for?
If so, that has security ramifications for servers that restrict what
you can download based on who you are (and which branches are
supposed to be visible to you).

Using a Bloom filter (perhaps an invertible one) to identify what the
receiving end lacks is certainly an intriguing approach, though.

Re: Bloom filters for have/want negotiation

Shawn Pearce
In reply to this post by Michael Haggerty-2
On Fri, Sep 11, 2015 at 2:13 PM, Michael Haggerty <[hidden email]> wrote:
> I have been thinking about Wilhelm Bierbaum's talk at the last GitMerge
> conference [1] in which he describes a scheme for using Bloom filters to
> make the initial reference advertisement less expensive.
...
> But it got me thinking about how the client could use a Bloom filter in
> a later stage of the negotiation, when telling the server what objects
> it already has, while preserving 100% reliability.
...
> I don't have a gut feeling about the cost of this phase of the
> negotiation, so I don't know whether this would be a net savings, let
> alone one that is worth the added complexity. But I wanted to document
> the idea in case somebody thinks it has promise. (I have no plans to
> pursue it.)

Maybe I can help... it just so happens that I have Git servers at
$DAY_JOB instrumented in the smart HTTP negotiation code. They do
"many" fetch requests. :)

The average client is only sending 16 have lines; at 50 bytes/line
this is 800 bytes.

The worst case is due to a bug in the negotiation. With nothing
common, the client just goes on forever until it reaches roots
(something is wrong with MAX_IN_VAIN). We saw 56,318 have lines ... a
2.6 MiB section. But smart HTTP gzips, so this may be only 1.3 MiB on
the wire.

But average/worst doesn't tell the entire picture. Bucketing requests
by have lines gives us more:

  %req | have_lines
  -----|-----------
  100%   56,318
   99%       88
   98%       52
   97%       35
   96%       31
   95%       26

95% of requests have <= 26 have lines, or ~650 bytes after gzip on smart HTTP.
99% of requests have <= 88 have lines, or ~2.15 KiB after gzip.

The smart HTTP client with nodone extension is allowed to send up to
1024 have lines in a single round trip; if the server can make an
efficient graph cut using only that batch, it can immediately return
the pack in that same round trip.

Ergo, if this is all working correctly on smart HTTP, clients can
fetch from a server they already "know" with decent efficiency, and
smaller than your 2 KiB Bloom filter estimate for git.git at 1% error
rate.


In practice this is not always what happens. The client isn't making
great decisions about what to send on smart HTTP. For example, I have
observed a fetch session where the client sends:

  req #1:  16 haves
  req #2:  26 haves
  req #3:  45 haves
  req #4:  70 haves ... and done

So that is 4 round trips. It started out with 16 and waited for common
ACKs. I guess 10 were common, but no clean graph cut was made, so the
client sent another 16 to make 26. Again no clean cut, so it added more
to send 45, then finally at 70 we found the cut point.

Reading fetch-pack.c, the first flush is at 16 lines. On the 2nd
request my intent was to double to 32 lines, but it doesn't, because
count is not reset to 0. So flush_at starts at 16 and doubles to 32,
but count stays at 16, and only 16 new lines can be sent in the next
packet, when the intent was to probe 32 on the second round. IOW,
fetch-pack.c is not expanding the have window as quickly as I wanted.
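The effect can be modeled like this (a simplified sketch of the flush schedule, not the actual fetch-pack.c code):

```python
def flush_points(total_haves, reset_count):
    """Cumulative have-line counts at each flush. flush_at doubles
    after every flush, but when count is not reset (the behavior
    described above), each round still sends only flush_at - count
    new lines rather than a doubled batch."""
    points, count, flush_at, sent = [], 0, 16, 0
    while sent < total_haves:
        sent += 1
        count += 1
        if count == flush_at:
            points.append(sent)
            flush_at *= 2
            if reset_count:
                count = 0
    return points

# reset_count=False: flushes at 16, 32, 64  (only 16 new in round 2)
# reset_count=True:  flushes at 16, 48, 112 (32 new in round 2)
```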

In reality the above session probably sent:

  16 new,  0 state = 16 haves
  16 new, 10 state = 26 haves
  32 new, 13 state = 45 haves
  ?? new, ?? state = 70 haves

Maybe 120 unique have lines.


Looking at my request data and your Bloom filter "budget" of 2 KiB ...
we could just update fetch-pack.c to send 88 have lines in the first
request burst. Roughly 90% of my users' traffic might then be served
in just 1 round trip on smart HTTP, instead of 4.


This smart HTTP study also applies to the traditional bi-directional
protocol. With multi_ack, the client sends a 2nd batch while waiting
for a reply from the server. With data moving in both directions on
the wire, total round trips become less important and it's just data
volume. But again we are talking about 120 unique have lines, so
5.9 KiB (no compression).

The have lines are very verbose in text format. Just using binary
SHA-1s would nearly halve the negotiation to ~3 KiB. A binary SHA-1
have-line extension to upload-pack would be far simpler than a Bloom
filter.

I do wonder if we are stuffing the pipe deep enough with multi_ack on
Internet links. Git doesn't need very long to walk 16 commits,
certainly less than the 200 ms RTT a user might have talking to a
server on the other side of the US. It is possible both sides are
spending most of their time waiting for data transfer of the batches
in flight.

Re: Bloom filters for have/want negotiation

Junio C Hamano
Shawn Pearce <[hidden email]> writes:

> The worst case is due to a bug in the negotiation. With nothing
> common, the client just goes on forever until it reaches roots
> (something is wrong with MAX_IN_VAIN). We saw 56,318 have lines ... a
> 2.6 MiB section. But smart HTTP gzips, so this may be only 1.3 MiB on
> the wire.

This one is interesting.

> But average/worst doesn't tell the entire picture. Bucketing requests
> by have lines gives us more:
>
>   %req | have_lines
>   -----|-----------
>   100%   56,318
>    99%       88
>    98%       52
>    97%       35
>    96%       31
>    95%       26

So is this.  From this observation, at least for your audience, it
is expected that we would usually not need a very long back and
forth session to discover where the history diverges.

But that is kind of expected.  Because the current protocol, in
which the upload-pack speaks first and advertises all tips it has,
allows the fetcher to omit what is known to be common very early and
to concentrate on sending the "have"s from the branches the fetcher
has worked on since the last time it fetched.  The amount of "have"s
needed is expected to be small in an "everybody meets at the same
central place and pushes into and fetches out of there" workflow,
because the amount of work done by a single fetcher since the last
fetch will by definition be a lot smaller than what happened in the
central meeting place.

I wonder how flipping the "who speaks first" would affect that
equation, though.

> Ergo, if this is all working correctly on smart HTTP, clients can
> fetch from a server they already "know" with decent efficiency, and
> smaller than your 2 KiB Bloom filter estimate for git.git at 1% error
> rate.

Isn't this part a bit of oranges and apples comparison?  If I
understand the motivation of Michael's looking into Bloom filter or
some other techniques correctly, it is to find a way to address the
initial advertisement from the sender.  Your analysis is about the
amount of "have", which is an orthogonal issue.

> I do wonder if we are stuffing the pipe deep enough with multi_ack on
> Internet links. Git doesn't need very long to walk 16 commits,
> certainly less than the 200 ms RTT a user might have talking to a
> server on the other side of the US. It is possible both sides are
> spending most of their time waiting for data transfer of the batches
> in flight.

Yes, this is a very useful insight.  An experiment or two with
larger amount of data in flight may be an interesting thing to try.
Do we still have the deadlock possibility caused by our implicit
reliance on the fact that a single batch was expected to fit in a
pipe buffer, by the way, or have we addressed that issue already?

Re: Bloom filters for have/want negotiation

Shawn Pearce
On Sat, Sep 12, 2015 at 12:01 PM, Junio C Hamano <[hidden email]> wrote:

> Shawn Pearce <[hidden email]> writes:
>
>> The worst case is due to a bug in the negotiation. With nothing
>> common, the client just goes on forever until it reaches roots
>> (something is wrong with MAX_IN_VAIN). We saw 56,318 have lines ... a
>> 2.6 MiB section. But smart HTTP gzips, so this may be only 1.3 MiB on
>> the wire.
>
> This one is interesting.
>
>> But average/worst doesn't tell the entire picture. Bucketing requests
>> by have lines gives us more:
>>
>>   %req | have_lines
>>   -----|-----------
>>   100%   56,318
>>    99%       88
>>    98%       52
>>    97%       35
>>    96%       31
>>    95%       26
>
> So is this.  From this observation, at least for your audience, it
> is expected that we would usually not need a very long back and
> forth session to discover where the history diverges.
>
> But that is kind of expected.  Because the current protocol, in
> which the upload-pack speaks first and advertises all tips it has,
> allows the fetcher to omit what is known to be common very early and
> to concentrate on sending the "have"s from the branches the fetcher
> has worked on since the last time it fetched.  The amount of "have"s
> needed is expected to be small in an "everybody meets at the same
> central place and pushes into and fetches out of there" workflow,
> because the amount of work done by a single fetcher since the last
> fetch will by definition be a lot smaller than what happened in the
> central meeting place.

Correct... this was based around a very central meeting place.

Linus pulling from lieutenants likely has very different behavior.
During the merge window, Linus' repo probably has hundreds of commits
not in the peer he is currently pulling from. The have line
transmission may take some time to find a common base.

>> Ergo, if this is all working correctly on smart HTTP, clients can
>> fetch from a server they already "know" with decent efficiency, and
>> smaller than your 2 KiB Bloom filter estimate for git.git at 1% error
>> rate.
>
> Isn't this part a bit of oranges and apples comparison?  If I
> understand the motivation of Michael's looking into Bloom filter or
> some other techniques correctly, it is to find a way to address the
> initial advertisement from the sender.  Your analysis is about the
> amount of "have", which is an orthogonal issue.

No, I think you confused parts of the thread.

Michael started this thread with a lead-in about ref advertisement,
then punted and went to a discussion about the have negotiation. The
Bloom filter estimates he was writing were for the client to explain
its current state to the server, replacing the "have" negotiation.

>> I do wonder if we are stuffing the pipe deep enough with multi_ack on
>> Internet links. Git doesn't need very long to walk 16 commits,
>> certainly less than the 200 ms RTT a user might have talking to a
>> server on the other side of the US. It is possible both sides are
>> spending most of their time waiting for data transfer of the batches
>> in flight.
>
> Yes, this is a very useful insight.  An experiment or two with
> larger amount of data in flight may be an interesting thing to try.
> Do we still have the deadlock possibility caused by our implicit
> reliance on the fact that a single batch was expected to fit in a
> pipe buffer, by the way, or have we addressed that issue already?

We still deadlock if the pipe buffer is smaller than the batch size.

I was thinking last night this could be reworked in fetch-pack.c to be
a select()-based process.

Instead of immediately writing each have line to the socket, queue
them in a strbuf like we do with stateless_rpc. If the strbuf length
is > 0, add the socket to the writable FD_SET. Run select() with a 0
timeout every 5 commit iterations to see if the remote side is
readable or writable.

If it's readable, read the replies from the server and see how that
leaves the client state. If we are done, we can break out, discard our
buffered strbuf, and send "done" to begin the pack stream. If not, we
keep running the loop.

If it's writable, attempt to send out the current strbuf, trimming off
the prefix of whatever was successfully written.

fetch-pack can put its own cap on the amount of data it's willing to
store in that strbuf. Maybe it's 100,000 have lines, which is about 5
MiB of data. If the strbuf exceeds that, we stop queuing and just
pump the select() loop.

When sending "done" there may still be "NAK" or "ACK" lines coming
back from the peer due to the deeper buffering. The client would have
to consume those before it finds the PACK header. We can handle
that by keeping track of how many flush-pkts were written but have not
yet been NAKed by the peer. When breaking out to send "done", we read
until the remaining number of NAK lines has been consumed.
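In outline, that loop might look like this (a Python sketch of the idea only; the real change would be C in fetch-pack.c, and the function name and buffer cap here are illustrative):

```python
import select

def pump_haves(sock, have_lines, max_buffer=100_000 * 50):
    """Queue have lines in a buffer and use select() with a zero
    timeout to interleave socket reads/writes with the commit walk."""
    outbuf = b""
    for i, line in enumerate(have_lines):
        if len(outbuf) < max_buffer:   # cap memory; past it, just pump
            outbuf += line
        if i % 5 == 0:                 # poll every 5 commit iterations
            wants_write = [sock] if outbuf else []
            readable, writable, _ = select.select([sock], wants_write, [], 0)
            if readable:
                replies = sock.recv(65536)
                # ...examine ACK/NAK replies; break out and send "done"
                # once a graph cut is found
            if writable:
                n = sock.send(outbuf)
                outbuf = outbuf[n:]    # trim what was actually written
    while outbuf:                      # flush whatever is still queued
        select.select([], [sock], [])
        n = sock.send(outbuf)
        outbuf = outbuf[n:]
```

Partial writes are handled by trimming the sent prefix, which is the strbuf behavior described above.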


The nice part is we could make this improvement strictly in
fetch-pack.c; upload-pack is not affected and wouldn't care. So we may
be able to experiment with clients, and users who want more aggressive
negotiation could upgrade.

Re: Bloom filters for have/want negotiation

Michael Haggerty-2
In reply to this post by Shawn Pearce
On 09/12/2015 07:16 AM, Shawn Pearce wrote:

> On Fri, Sep 11, 2015 at 2:13 PM, Michael Haggerty <[hidden email]> wrote:
>> I have been thinking about Wilhelm Bierbaum's talk at the last GitMerge
>> conference [1] in which he describes a scheme for using Bloom filters to
>> make the initial reference advertisement less expensive.
> ...
>> But it got me thinking about how the client could use a Bloom filter in
>> a later stage of the negotiation, when telling the server what objects
>> it already has, while preserving 100% reliability.
> ...
>> I don't have a gut feeling about the cost of this phase of the
>> negotiation, so I don't know whether this would be a net savings, let
>> alone one that is worth the added complexity. But I wanted to document
>> the idea in case somebody thinks it has promise. (I have no plans to
>> pursue it.)
>
> Maybe I can help... it just so happens that I have Git servers at
> $DAY_JOB instrumented in the smart HTTP negotiate code. They do "many"
> fetch requests. :)
> [...]
>
> Ergo, if this is all working correctly on smart HTTP, clients can
> fetch from a server they already "know" with decent efficiency, and
> smaller than your 2 KiB Bloom filter estimate for git.git at 1% error
> rate.

Thanks for the awesome data, Shawn. Your data do indeed seem to prove
that there would be no benefit to using Bloom filters in this part of
the negotiation.

Michael

--
Michael Haggerty
[hidden email]
