[WIP PATCH 00/14] Protocol v2 patches

classic Classic list List threaded Threaded
40 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 03/14] upload-pack-2: Implement the version 2 of upload-pack

David Turner
On Mon, 2016-05-02 at 10:51 -0700, Stefan Beller wrote:

> On Mon, May 2, 2016 at 10:43 AM, David Turner <
> [hidden email]> wrote:
> > On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
> > > In upload-pack-2 we send each capability in its own packet
> > > buffer.
> > > The construction of upload-pack-2 is a bit unfortunate as I would
> > > like
> > > it to not be depending on a symlink linking to upload-pack.c, but
> > > I
> > > did
> > > not find another easy way to do it. I would like it to generate
> > > upload-pack-2.o from upload-pack.c but with '
> > > -DTRANSPORT_VERSION=2'
> > > set.
> >
> > Couldn't we check argv[0] and use that to determine protocol?  Then
> > we
> > could symlink executables rather than source code.
>
> IIRC I proposed a similar thing earlier, i.e.
>
>     if (argv[0] ends with 2)
>         do_protocol_v_2(...)
>
> but that may break (and confuse a lot!) some use cases.
> `git fetch` has the documented --upload-pack switch, so as a server
> -admin
> you are free to have git-upload-pack linking to
> "git-upload-pack-2.8" but additionally you still have
> "git-upload-pack-1.7" or "git-upload-pack-custom-2".
>
> so I think we should not do that :(
> I do like to symlink the executables though.

I think it would probably not break anyone if the new executable name
were sufficiently distinctive -- e.g. starts_with (strrchr(argv[0],
'/'), "git-upload-pack-protocol-v2"). But it would make custom
executables a bit more complicated for the future.

I guess it is better to have silly source code but clean binaries than
clean source code but silly user-visible rules.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 06/14] remote.h: add get_remote_capabilities, request_capabilities

David Turner
In reply to this post by Stefan Beller-4
On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
 

> +const char *known_capabilities[] = {
> + "multi_ack",
> + "thin-pack",
> + "side-band",
> + "side-band-64k",
> + "ofs-delta",
> + "shallow",
> + "no-progress",
> + "include-tag",
> + "multi_ack_detailed",
> + "allow-tip-sha1-in-want",
> + "allow-reachable-sha1-in-want",
> + "no-done",
> +};

I wonder if it is possible to not repeat the list from upload-pack.c?
It seems unfortunate to have to add the same string in two places
whenever you add a capability.

> +static int keep_capability(char *line)

s/keep_/is_known_/ ?  Also it would be good to handle capabilities that
are prefixes of others correctly.

> +{
> + int i;
> + for (i = 0; i < ARRAY_SIZE(known_capabilities); i++)
> + if (starts_with(line, known_capabilities[i]))
> + return 1;
> + return 0;
> +}
> +
> +void get_remote_capabilities(int in, char *src_buf, size_t src_len)

maybe rename "in" to "fd" or "in_fd"?  I don't immediately know what
"in" is supposed to be when I just look at this signature.

> +void request_capabilities(int out, struct string_list *list)

Maybe name this "send_capabilities_request"?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 08/14] fetch-pack: factor out get_selected_capabilities_list

David Turner
In reply to this post by Stefan Beller-4
On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
> + struct string_list_item *item;
> + get_selected_capabilities_list(&list, args);
> + for_each_string_list_item(item, &list) {
> + strbuf_addstr(&c, " ");
> + strbuf_addstr(&c, item->string);
> + }
> +

I think the contents of list leak here -- you need a string_list_clear.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 12/14] Add test for fetch-pack

David Turner
In reply to this post by Stefan Beller-4
On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:

> +test_expect_success 'fetch-pack with protocol version 2' '
> + test_when_finished "rm -rf repo1" &&
> + mkdir repo1 &&
> + (
> + cd repo1 &&
> + git init &&
> + test_commit 1 &&
> + test_commit 2 &&
> + test_commit 3 &&
> + echo "$(git rev-parse master) refs/heads/master"
> >expected &&
> + mkdir repo2 &&
> + (
> + cd repo2 &&
> + git init &&
> + git fetch-pack --transport-version=2 -
> -upload-pack=git-upload-pack-2 ../.git refs/heads/master >../actual
> + ) &&
> + test_cmp expected actual
> + )
> +'

This doesn't actually test that protocol v2 is in fact used (it just
tests that --transport-version=2 doesn't crash).  It would be nice to
actually test the version in-use.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [WIP PATCH 00/14] Protocol v2 patches

David Turner
In reply to this post by Stefan Beller-4
On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:

> Hi David,
>
> here are my patches for a protocol v2.
>
> ("Negotiate capabilities before doing anything else", or as code:
>
>         static void upload_pack_version_2(void)
>         {
>                 send_capabilities_version_2();
>                 receive_capabilities_version_2();
>
>                 /* The rest of the protocol stays the same,
> capabilities advertising
>                    is disabled though. */
>                 advertise_capabilities = 0;
>                 upload_pack();
>         }
> )

Overall, except for the comments I made, these patches seem sensible.

Would it be possible to add some docs on the new protocol when you re
-roll?  (I know these are just the initial patches, but it really helps
me to see an explanation along with the code).

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [WIP PATCH 00/14] Protocol v2 patches

Stefan Beller-4
On Mon, May 2, 2016 at 1:41 PM, David Turner <[hidden email]> wrote:

> On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
>> Hi David,
>>
>> here are my patches for a protocol v2.
>>
>> ("Negotiate capabilities before doing anything else", or as code:
>>
>>         static void upload_pack_version_2(void)
>>         {
>>                 send_capabilities_version_2();
>>                 receive_capabilities_version_2();
>>
>>                 /* The rest of the protocol stays the same,
>> capabilities advertising
>>                    is disabled though. */
>>                 advertise_capabilities = 0;
>>                 upload_pack();
>>         }
>> )
>
> Overall, except for the comments I made, these patches seem sensible.
>
> Would it be possible to add some docs on the new protocol when you re
> -roll?  (I know these are just the initial patches, but it really helps
> me to see an explanation along with the code).
>

Thanks for the review :) I'll fix the issues and add some docs, though
the reroll
may take some time. I really want to get the submodule groups stuff
done as well.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 03/14] upload-pack-2: Implement the version 2 of upload-pack

Duy Nguyen
In reply to this post by David Turner
On Tue, May 3, 2016 at 1:56 AM, David Turner <[hidden email]> wrote:

> On Mon, 2016-05-02 at 10:51 -0700, Stefan Beller wrote:
>> On Mon, May 2, 2016 at 10:43 AM, David Turner <
>> [hidden email]> wrote:
>> > On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
>> > > In upload-pack-2 we send each capability in its own packet
>> > > buffer.
>> > > The construction of upload-pack-2 is a bit unfortunate as I would
>> > > like
>> > > it to not be depending on a symlink linking to upload-pack.c, but
>> > > I
>> > > did
>> > > not find another easy way to do it. I would like it to generate
>> > > upload-pack-2.o from upload-pack.c but with '
>> > > -DTRANSPORT_VERSION=2'
>> > > set.
>> >
>> > Couldn't we check argv[0] and use that to determine protocol?  Then
>> > we
>> > could symlink executables rather than source code.
>>
>> IIRC I proposed a similar thing earlier, i.e.
>>
>>     if (argv[0] ends with 2)
>>         do_protocol_v_2(...)
>>
>> but that may break (and confuse a lot!) some use cases.
>> `git fetch` has the documented --upload-pack switch, so as a server
>> -admin
>> you are free to have git-upload-pack linking to
>> "git-upload-pack-2.8" but additionally you still have
>> "git-upload-pack-1.7" or "git-upload-pack-custom-2".
>>
>> so I think we should not do that :(
>> I do like to symlink the executables though.
>
> I think it would probably not break anyone if the new executable name
> were sufficiently distinctive -- e.g. starts_with (strrchr(argv[0],
> '/'), "git-upload-pack-protocol-v2"). But it would make custom
> executables a bit more complicated for the future.
>
> I guess it is better to have silly source code but clean binaries than
> clean source code but silly user-visible rules.

Maybe add --version to upload-pack? Then we can have a script
git-upload-pack-v2 that does "exec git upload-pack --version=2"
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 06/14] remote.h: add get_remote_capabilities, request_capabilities

Jeff King
In reply to this post by David Turner
On Mon, May 02, 2016 at 02:57:43PM -0400, David Turner wrote:

> On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
>  
> > +const char *known_capabilities[] = {
> > + "multi_ack",
> > + "thin-pack",
> > + "side-band",
> > + "side-band-64k",
> > + "ofs-delta",
> > + "shallow",
> > + "no-progress",
> > + "include-tag",
> > + "multi_ack_detailed",
> > + "allow-tip-sha1-in-want",
> > + "allow-reachable-sha1-in-want",
> > + "no-done",
> > +};
>
> I wonder if it is possible to not repeat the list from upload-pack.c?
> It seems unfortunate to have to add the same string in two places
> whenever you add a capability.

I think that in general, we'd stop adding capabilities to v1. If you
have a client which speaks the new capability, then it should also be
speaking the new protocol. That's not strictly true if other non-git.git
implementations want to learn capability X but not protocol v2, but I
think in practice it's not an unreasonable world view.

I guess there may be a grey area for a while, though, where even
v2-capable clients don't end up speaking it, because they don't yet know
that a particular server can handle it. So any capabilities added in
that grey area may want to go to both v1 and v2.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 06/14] remote.h: add get_remote_capabilities, request_capabilities

David Turner
On Tue, 2016-05-03 at 01:33 -0400, Jeff King wrote:

> On Mon, May 02, 2016 at 02:57:43PM -0400, David Turner wrote:
>
> > On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
> >  
> > > +const char *known_capabilities[] = {
> > > + "multi_ack",
> > > + "thin-pack",
> > > + "side-band",
> > > + "side-band-64k",
> > > + "ofs-delta",
> > > + "shallow",
> > > + "no-progress",
> > > + "include-tag",
> > > + "multi_ack_detailed",
> > > + "allow-tip-sha1-in-want",
> > > + "allow-reachable-sha1-in-want",
> > > + "no-done",
> > > +};
> >
> > I wonder if it is possible to not repeat the list from upload
> > -pack.c?
> > It seems unfortunate to have to add the same string in two places
> > whenever you add a capability.
>
> I think that in general, we'd stop adding capabilities to v1. If you
> have a client which speaks the new capability, then it should also be
> speaking the new protocol. That's not strictly true if other non
> -git.git
> implementations want to learn capability X but not protocol v2, but I
> think in practice it's not an unreasonable world view.
>
> I guess there may be a grey area for a while, though, where even
> v2-capable clients don't end up speaking it, because they don't yet
> know
> that a particular server can handle it. So any capabilities added in
> that grey area may want to go to both v1 and v2.

OK, but then there should be one list per protocol version rather than
two copies of the same list.  


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 06/14] remote.h: add get_remote_capabilities, request_capabilities

Stefan Beller-4
On Tue, May 3, 2016 at 2:21 PM, David Turner <[hidden email]> wrote:

> On Tue, 2016-05-03 at 01:33 -0400, Jeff King wrote:
>> On Mon, May 02, 2016 at 02:57:43PM -0400, David Turner wrote:
>>
>> > On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
>> >
>> > > +const char *known_capabilities[] = {
>> > > + "multi_ack",
>> > > + "thin-pack",
>> > > + "side-band",
>> > > + "side-band-64k",
>> > > + "ofs-delta",
>> > > + "shallow",
>> > > + "no-progress",
>> > > + "include-tag",
>> > > + "multi_ack_detailed",
>> > > + "allow-tip-sha1-in-want",
>> > > + "allow-reachable-sha1-in-want",
>> > > + "no-done",
>> > > +};
>> >
>> > I wonder if it is possible to not repeat the list from upload
>> > -pack.c?
>> > It seems unfortunate to have to add the same string in two places
>> > whenever you add a capability.
>>
>> I think that in general, we'd stop adding capabilities to v1. If you
>> have a client which speaks the new capability, then it should also be
>> speaking the new protocol. That's not strictly true if other non
>> -git.git
>> implementations want to learn capability X but not protocol v2, but I
>> think in practice it's not an unreasonable world view.
>>
>> I guess there may be a grey area for a while, though, where even
>> v2-capable clients don't end up speaking it, because they don't yet
>> know
>> that a particular server can handle it. So any capabilities added in
>> that grey area may want to go to both v1 and v2.
>
> OK, but then there should be one list per protocol version rather than
> two copies of the same list.
>

I thought this is by design as upload-pack is a different program, i.e. it
could be developed out of sync with the client, adding/removing
capabilities there but not in fetch-pack. That doesn't make sense though.

We could introduce known_capabilities_v1 and _v2 respectively in shared
header files, though.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 02/14] upload-pack.c: Refactor capability advertising

Junio C Hamano
In reply to this post by Stefan Beller-4
Stefan Beller <[hidden email]> writes:

> Instead of having the capabilities in a local string, keep them
> in a struct outside the function. This will allow us in a later patch
> to easily reuse the capabilities in version 2 of the protocol.
>
> Signed-off-by: Stefan Beller <[hidden email]>
> ---

Between a flat string and a list, you may be able to enumerate the
elements in a list more easily, but my knee-jerk reaction is that
this does not go far enough, if you are to introduce a table.  It
for example does not allow us to lose the conditional writing of
some capabilties based on allow_*_request and replace that with a
more table-driven approach.

Perhaps that can come in a later step (in other words, the above is
not a basis for rejecting this change; just pointing it out that
this does not have to be the end of the story).

>  upload-pack.c | 59 ++++++++++++++++++++++++++++++++++++++++-------------------
>  1 file changed, 40 insertions(+), 19 deletions(-)
>
> diff --git a/upload-pack.c b/upload-pack.c
> index aaaf883..85381d5 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -719,37 +719,58 @@ static void format_symref_info(struct strbuf *buf, struct string_list *symref)
>   strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
>  }
>  
> +static int advertise_capabilities = 1;
> +const char *all_capabilities[] = {
> + "multi_ack",
> + "thin-pack",
> + "side-band",
> + "side-band-64k",
> + "ofs-delta",
> + "shallow",
> + "no-progress",
> + "include-tag",
> + "multi_ack_detailed",
> + "allow-tip-sha1-in-want",
> + "allow-reachable-sha1-in-want",
> + "no-done",
> +};
> +
>  static int send_ref(const char *refname, const struct object_id *oid,
>      int flag, void *cb_data)
>  {
> - static const char *capabilities = "multi_ack thin-pack side-band"
> - " side-band-64k ofs-delta shallow no-progress"
> - " include-tag multi_ack_detailed";
>   const char *refname_nons = strip_namespace(refname);
>   struct object_id peeled;
>  
>   if (mark_our_ref(refname_nons, refname, oid))
>   return 0;
>  
> - if (capabilities) {
> - struct strbuf symref_info = STRBUF_INIT;
> -
> - format_symref_info(&symref_info, cb_data);
> - packet_write(1, "%s %s%c%s%s%s%s%s agent=%s\n",
> -     oid_to_hex(oid), refname_nons,
> -     0, capabilities,
> -     (allow_unadvertised_object_request & ALLOW_TIP_SHA1) ?
> -     " allow-tip-sha1-in-want" : "",
> -     (allow_unadvertised_object_request & ALLOW_REACHABLE_SHA1) ?
> -     " allow-reachable-sha1-in-want" : "",
> -     stateless_rpc ? " no-done" : "",
> -     symref_info.buf,
> -     git_user_agent_sanitized());
> - strbuf_release(&symref_info);
> + if (advertise_capabilities) {
> + int i;
> + struct strbuf capabilities = STRBUF_INIT;
> +
> + for (i = 0; i < ARRAY_SIZE(all_capabilities); i++) {
> + const char *cap = all_capabilities[i];
> + if (!strcmp(cap, "allow-tip-sha1-in-want")
> +    && !(allow_unadvertised_object_request & ALLOW_TIP_SHA1))
> + continue;
> + if (!strcmp(cap, "allow-reachable-sha1-in-want")
> +    && !(allow_unadvertised_object_request & ALLOW_REACHABLE_SHA1))
> + continue;
> + if (!strcmp(cap, "no-done") && !stateless_rpc)
> + continue;
> + strbuf_addf(&capabilities, " %s", cap);
> + }
> +
> + format_symref_info(&capabilities, cb_data);
> + strbuf_addf(&capabilities, " agent=%s", git_user_agent_sanitized());
> +
> + packet_write(1, "%s %s%c%s\n", oid_to_hex(oid), refname_nons,
> +     0, capabilities.buf);
> + strbuf_release(&capabilities);
> + advertise_capabilities = 0;
>   } else {
>   packet_write(1, "%s %s\n", oid_to_hex(oid), refname_nons);
>   }
> - capabilities = NULL;
>   if (!peel_ref(refname, peeled.hash))
>   packet_write(1, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons);
>   return 0;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 03/14] upload-pack-2: Implement the version 2 of upload-pack

Junio C Hamano
In reply to this post by David Turner
David Turner <[hidden email]> writes:

> On Fri, 2016-04-29 at 16:34 -0700, Stefan Beller wrote:
>> In upload-pack-2 we send each capability in its own packet buffer.
>> The construction of upload-pack-2 is a bit unfortunate as I would
>> like
>> it to not be depending on a symlink linking to upload-pack.c, but I
>> did
>> not find another easy way to do it. I would like it to generate
>> upload-pack-2.o from upload-pack.c but with '-DTRANSPORT_VERSION=2'
>> set.
>
> Couldn't we check argv[0] and use that to determine protocol?  Then we
> could symlink executables rather than source code.

Yeah, I do not have a good suggestion on the mechanism to actually
switch between behaviours other than what was already been discussed
in the thread, but the code resulting from the patch proposed is too
ugly with full of #ifdef for it to be the final form.

Just to make sure nobody gets me wrong; it is OK as a POC to move
the discussion forward.

A production quality implemention however would need to either be a
single executable that switches behaviour at runtime, or two
executables, each with its own *.c file with its own main(), that
share code in another common *.c file, I would think.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 04/14] connect: rewrite feature parsing to work on string_list

Junio C Hamano
In reply to this post by Stefan Beller-4
Stefan Beller <[hidden email]> writes:

> Later on when we introduce the version 2 transport protocol, the
> capabilities will not be transported in one lone string but each

s/lone/long/, I think.

> capability will be carried in its own pkt line.
>
> To reuse existing infrastructure we would either need to join the
> capabilities into a single string again later or refactor the current
> capability parsing to be using a data structure which fits both
> versions of the transport protocol. We chose to implement the later.
>
> Signed-off-by: Stefan Beller <[hidden email]>
> ---
>  builtin/receive-pack.c | 15 ++++++---
>  connect.c              | 82 +++++++++++++++++++++++---------------------------
>  connect.h              |  2 +-
>  upload-pack.c          | 13 ++++++--
>  4 files changed, 58 insertions(+), 54 deletions(-)

I am not sure if the churn is make a right tradeoff here.

A loop to concatenate each segment into a strbuf before passing it
to parse_feature_request would be at most 5 lines long, no?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 04/14] connect: rewrite feature parsing to work on string_list

David Turner
On Wed, 2016-05-04 at 13:13 -0700, Junio C Hamano wrote:

> Stefan Beller <[hidden email]> writes:
>
> > Later on when we introduce the version 2 transport protocol, the
> > capabilities will not be transported in one lone string but each
>
> s/lone/long/, I think.
>
> > capability will be carried in its own pkt line.
> >
> > To reuse existing infrastructure we would either need to join the
> > capabilities into a single string again later or refactor the
> > current
> > capability parsing to be using a data structure which fits both
> > versions of the transport protocol. We chose to implement the
> > later.
> >
> > Signed-off-by: Stefan Beller <[hidden email]>
> > ---
> >  builtin/receive-pack.c | 15 ++++++---
> >  connect.c              | 82 +++++++++++++++++++++++---------------
> > ------------
> >  connect.h              |  2 +-
> >  upload-pack.c          | 13 ++++++--
> >  4 files changed, 58 insertions(+), 54 deletions(-)
>
> I am not sure if the churn is make a right tradeoff here.
>
> A loop to concatenate each segment into a strbuf before passing it
> to parse_feature_request would be at most 5 lines long, no?

I started looking at this again briefly today, and I actually think
that string manipulation of this sort is a mistake.  String
manipulation is error-prone, and having an interface defined in terms
of strings makes it hard for a reader to see what sorts of input are
valid and invalid.   (It's also often somewhat less efficient).  So I
think it is a good idea to improve the interface while we're already
changing code in this area.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [WIP PATCH 00/14] Protocol v2 patches

David Turner
In reply to this post by Stefan Beller-4
I was looking at this again today, and noticed that it doesn't really
address the HTTP case.

The central problem is that protocol v2 goes like this:
server: I have capabilities w,x,y, and z
client: I want capabilities x and z.

But HTTP goes like this:
client: [request]
server: [response]

I tried to make libcurl do the receive-before-sending thing, but it
doesn't seem to be designed for it (even if you prime things by sending
a "hello" from the client first).  My thought was to hook up
CURLOPT_READFUNCTION and CURLOPT_WRITEFUNCTION, and have the read
function return CURL_READFUNC_PAUSE and then have the write (=client
receiving data ) function unpause the reader (= client sending data)
once it gets the capabilities.  But apparently pausing only works with
chunked encoding, which seems to cause Apache's mod_cgi to fail.

Maybe I'm missing something.  Has anyone else ever made something like
this work?

Of course, I could always use CURLOPT_CONNECT_ONLY to write my own HTTP
client, but that seems pretty unreasonable.

I also looked to see if libcurl had websockets support, since that's
one kind of bidirectional conversation over HTTP, but it doesn't seem
to.

Another choice is to make a separate /capabilities endpoint that gets
hit before /info/refs.  This is a bit bad because:
(a) it's another HTTP request
(b) it adds implicit state to the HTTP conversation.  If multiple git
servers were behind a load balancer, you might end up getting server A
for /capabilities and server B for /info/refs, and those servers might
have different capabilities.  This is not impossible when testing a git
server upgrade on one machine before rolling it out to a whole fleet.
 Maybe the rule for clients re capabilities is that they can request
whatever capabilities they want, but the server is free to ignore that
request and send whatever data it feels like.  That's not great, but it
should work (I think).

Does anyone else have any thoughts on how this ought to work?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [WIP PATCH 00/14] Protocol v2 patches

Duy Nguyen
On Wed, May 25, 2016 at 5:46 AM, David Turner <[hidden email]> wrote:

> I was looking at this again today, and noticed that it doesn't really
> address the HTTP case.
>
> The central problem is that protocol v2 goes like this:
> server: I have capabilities w,x,y, and z
> client: I want capabilities x and z.
>
> But HTTP goes like this:
> client: [request]
> server: [response]
>
> I tried to make libcurl do the receive-before-sending thing, but it
> doesn't seem to be designed for it (even if you prime things by sending
> a "hello" from the client first).  My thought was to hook up
> CURLOPT_READFUNCTION and CURLOPT_WRITEFUNCTION, and have the read
> function return CURL_READFUNC_PAUSE and then have the write (=client
> receiving data ) function unpause the reader (= client sending data)
> once it gets the capabilities.  But apparently pausing only works with
> chunked encoding, which seems to cause Apache's mod_cgi to fail.
>
> Maybe I'm missing something.  Has anyone else ever made something like
> this work?

It simply takes one more round-trip to negotiate. Not the best thing, but...

> I also looked to see if libcurl had websockets support, since that's
> one kind of bidirectional conversation over HTTP, but it doesn't seem
> to.

Yeah. And libcurl probably will not support websockets even in long
run. I've been searching for a websocket implementation for git and
finally settled for netcat-like programs, sitting in front of git and
dealing with network just like ssh. It will be the simplest way to add
either websocket or http/2 support. If either protocol gets popular
enough, smart-http can become a fall-back mechanism where performance
does not matter much.
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [WIP PATCH 00/14] Protocol v2 patches

Junio C Hamano
In reply to this post by David Turner
David Turner <[hidden email]> writes:

> I was looking at this again today, and noticed that it doesn't really
> address the HTTP case.
>
> The central problem is that protocol v2 goes like this:
> server: I have capabilities w,x,y, and z
> client: I want capabilities x and z.
>
> But HTTP goes like this:
> client: [request]
> server: [response]

I wonder if that can be solved by speculative request?

Let the connection initiator say "If you can do x and z, please do
so", and allow the responder to say either "OK, I can do x and z; by
the way the full capabilites I support are w, x, y and z", or
"Sorry, can't do that; I have capabilities w, x, and y".


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [WIP PATCH 00/14] Protocol v2 patches

David Turner
In reply to this post by Duy Nguyen
On Wed, 2016-05-25 at 06:03 +0700, Duy Nguyen wrote:

> On Wed, May 25, 2016 at 5:46 AM, David Turner <
> [hidden email]> wrote:
> > I was looking at this again today, and noticed that it doesn't
> > really
> > address the HTTP case.
> >
> > The central problem is that protocol v2 goes like this:
> > server: I have capabilities w,x,y, and z
> > client: I want capabilities x and z.
> >
> > But HTTP goes like this:
> > client: [request]
> > server: [response]
> >
> > I tried to make libcurl do the receive-before-sending thing, but it
> > doesn't seem to be designed for it (even if you prime things by
> > sending
> > a "hello" from the client first).  My thought was to hook up
> > CURLOPT_READFUNCTION and CURLOPT_WRITEFUNCTION, and have the read
> > function return CURL_READFUNC_PAUSE and then have the write
> > (=client
> > receiving data ) function unpause the reader (= client sending
> > data)
> > once it gets the capabilities.  But apparently pausing only works
> > with
> > chunked encoding, which seems to cause Apache's mod_cgi to fail.
> >
> > Maybe I'm missing something.  Has anyone else ever made something
> > like
> > this work?
>
> It simply takes one more round-trip to negotiate. Not the best thing,
> but...

Do you mean that it can be done with libcurl?  Or do you mean that I
should go with the /capabilities endpoint?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [WIP PATCH 00/14] Protocol v2 patches

David Turner
In reply to this post by Junio C Hamano
On Wed, 2016-05-25 at 09:23 -0700, Junio C Hamano wrote:

> David Turner <[hidden email]> writes:
>
> > I was looking at this again today, and noticed that it doesn't
> > really
> > address the HTTP case.
> >
> > The central problem is that protocol v2 goes like this:
> > server: I have capabilities w,x,y, and z
> > client: I want capabilities x and z.
> >
> > But HTTP goes like this:
> > client: [request]
> > server: [response]
>
> I wonder if that can be solved by speculative request?
>
> Let the connection initiator say "If you can do x and z, please do
> so", and allow the responder to say either "OK, I can do x and z; by
> the way the full capabilites I support are w, x, y and z", or
> "Sorry, can't do that; I have capabilities w, x, and y".

That protocol is somewhat more complicated.  And it's different than
the v2 protocol for the other transports (unless you are thinking that
we should change those too?).  It's actually a tiny bit like what I
originally proposed for HTTP.

It sounds OK to me, but I want to know what others think before I start
implementing.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [WIP PATCH 00/14] Protocol v2 patches

Jeff King
In reply to this post by David Turner
On Tue, May 24, 2016 at 06:46:48PM -0400, David Turner wrote:

> I tried to make libcurl do the receive-before-sending thing, but it
> doesn't seem to be designed for it (even if you prime things by sending
> a "hello" from the client first).  My thought was to hook up
> CURLOPT_READFUNCTION and CURLOPT_WRITEFUNCTION, and have the read
> function return CURL_READFUNC_PAUSE and then have the write (=client
> receiving data ) function unpause the reader (= client sending data)
> once it gets the capabilities.  But apparently pausing only works with
> chunked encoding, which seems to cause Apache's mod_cgi to fail.
>
> Maybe I'm missing something.  Has anyone else ever made something like
> this work?

I don't think it can work in the general case. HTTP is not full-duplex,
and you have to send off the request and wait for the response. Even if
you could convince the client and git-http-backend to do it, you're
going to get foiled by proxies, web server implementations, and other
middle-men.

> Of course, I could always use CURLOPT_CONNECT_ONLY to write my own HTTP
> client, but that seems pretty unreasonable.
>
> I also looked to see if libcurl had websockets support, since that's
> one kind of bidirectional conversation over HTTP, but it doesn't seem
> to.

I would love to see us move to a true bidirectional HTTP-based protocol.
It would clear up all of the drawbacks that the current HTTP protocol
has, and I think we could generally recommend it entirely over using
git://. But like you, I haven't figured out an easy way to do it.

I hoped that maybe HTTP/2 would solve some of that if we waited long
enough for it to be adopted, but it doesn't look like there's anything
out of the box. It seems like the recommended solutions still involve
websockets. I might be wrong, though; this is very much outside my area
of expertise.

> Another choice is to make a separate /capabilities endpoint that gets
> hit before /info/refs.  This is a bit bad because:
> (a) it's another HTTP request

Right, this is the extra round-trip I mentioned in:

  http://thread.gmane.org/gmane.comp.version-control.git/291640/focus=291951

I think you could get rid of it by making protocol v2 a true "client
speaks first" protocol, which aligns better with how HTTP works (but if
we do that, it would be nice to do it for _all_ of the transports, so
they stay closer to each other). But...

> (b) it adds implicit state to the HTTP conversation.  If multiple git
> servers were behind a load balancer, you might end up getting server A
> for /capabilities and server B for /info/refs, and those servers might
> have different capabilities.  This is not impossible when testing a git
> server upgrade on one machine before rolling it out to a whole fleet.
>  Maybe the rule for clients re capabilities is that they can request
> whatever capabilities they want, but the server is free to ignore that
> request and send whatever data it feels like.  That's not great, but it
> should work (I think).

I think this is already the case today. Every non-trivial git-over-http
request requires at least two HTTP requests: one to receive the server
fetch advertisement, and the second to actually do the work (and in the
fetch case, the have/want negotiation in the second one may actually
span several requests).

The capabilities from the server come in the first request, and then the
client sends back its capabilities in the second one. So if you are
hitting multiple incompatible servers, the server may not understand
your request. Likewise, if an upload-pack request takes multiple hits,
we send up the client capabilities in each request.

I don't think quietly ignoring unknown capabilities is a good idea. The
results would range from confusing breakages (e.g., ignored multi-ack or
no-done capabilities) to subtly wrong behavior (e.g., a server which
ignores "atomic" and proceeds with a half-failed push anyway).  Given
the rarity of the situation, it's probably better for the server to barf
with an appropriate error message. That sucks for the user, but it's
probably better than the alternatives.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
12