[RFC] How to pass Git config command line instructions to Submodule commands?

classic Classic list List threaded Threaded
36 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[RFC] How to pass Git config command line instructions to Submodule commands?

larsxschneider
Hi,

a few folks from the Git LFS project and I try to make cloning of repositories
with a lot of LFS files faster.

The core problem is that Git LFS uses a Git smudge filter to replace LFS
pointers with the actual file content. Right now, a smudge filter can only
be executed on an individual file which makes the operation slow for many
files [1].

We solved this issue by temporarily disabling the smudge filter for the clone
command via Git config (optimized in 1a8630 [2]):

    git -c filter.lfs.smudge= -c filter.lfs.required=false clone <url> <path>

Afterwards Git LFS runs a special command to download and replace all LFS
content in bulk [3]. This works great for LFS repositories.

However, I noticed that git config command line instructions such as
"-c filter.lfs.smudge=" are not passed to Git submodule operations. Thus
this does not work as expected:

    git -c filter.lfs.smudge= -c filter.lfs.required=false clone --recursive <url> <path>

I tried to work around that by copying the relevant pieced from the Git
Submodule command [4] and applying the command line Git config
manually (look closely at the modified checkout command):

    git -c filter.lfs.smudge= -c filter.lfs.required=false clone $@
    if [[ -z $2 ]]; then
        CLONE_PATH=$(basename ${1%.git});
    else
        CLONE_PATH=$2;
    fi
    pushd "$CLONE_PATH"
        git submodule init
        wt_prefix=$(git rev-parse --show-prefix)
        git submodule--helper list --prefix "$wt_prefix" | {
            while read mode sha1 stage sm_path
            do
                name=$(git submodule--helper name "$sm_path") || exit
                url=$(git config submodule."$name".url)
                if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
                then
                    git submodule--helper clone --prefix "$wt_prefix" --path "$sm_path" --name "$name" --url "$url"
                    pushd "$sm_path"
                        git -c filter.lfs.smudge= -c filter.lfs.required=false checkout -q $sha1 || exit
                        git-lfs pull || exit
                    popd
                fi
            done
        }
    popd

Do you see an easier way to pass command line Git config instructions to the
underlaying Git Submodule commands? If not, do you think a patch adding this
would be worth working on?

I also started a discussion about that on the Git LFS issue page [5].

Thanks,
Lars


[1] https://github.com/github/git-lfs/issues/931
[2] https://github.com/git/git/commit/1a8630dc3b1cc6f1361a4e5d94630133c24c97d9
[3] https://developer.atlassian.com/blog/2016/04/git-lfs-12-clone-faster/
[4] https://github.com/git/git/blob/6a6636270fbaf74609cd3e1bd207dd2c420d640a/git-submodule.sh#L686-L731
[5] https://github.com/github/git-lfs/issues/1172

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

Stefan Beller-4
On Mon, Apr 25, 2016 at 3:39 AM, Lars Schneider
<[hidden email]> wrote:

> Hi,
>
> a few folks from the Git LFS project and I try to make cloning of repositories
> with a lot of LFS files faster.
>
> The core problem is that Git LFS uses a Git smudge filter to replace LFS
> pointers with the actual file content. Right now, a smudge filter can only
> be executed on an individual file which makes the operation slow for many
> files [1].
>
> We solved this issue by temporarily disabling the smudge filter for the clone
> command via Git config (optimized in 1a8630 [2]):
>
>     git -c filter.lfs.smudge= -c filter.lfs.required=false clone <url> <path>
>
> Afterwards Git LFS runs a special command to download and replace all LFS
> content in bulk [3]. This works great for LFS repositories.
>
> However, I noticed that git config command line instructions such as
> "-c filter.lfs.smudge=" are not passed to Git submodule operations. Thus
> this does not work as expected:
>
>     git -c filter.lfs.smudge= -c filter.lfs.required=false clone --recursive <url> <path>

I have cc'd Jacob Keller, who authored origin/jk/submodule-c-credential,
which does work in that area (deciding which config option to pass down
into the submodule commands).

>
> I tried to work around that by copying the relevant pieced from the Git
> Submodule command [4] and applying the command line Git config
> manually (look closely at the modified checkout command):
>
>     git -c filter.lfs.smudge= -c filter.lfs.required=false clone $@
>     if [[ -z $2 ]]; then
>         CLONE_PATH=$(basename ${1%.git});
>     else
>         CLONE_PATH=$2;
>     fi
>     pushd "$CLONE_PATH"
>         git submodule init
>         wt_prefix=$(git rev-parse --show-prefix)
>         git submodule--helper list --prefix "$wt_prefix" | {
>             while read mode sha1 stage sm_path
>             do
>                 name=$(git submodule--helper name "$sm_path") || exit
>                 url=$(git config submodule."$name".url)
>                 if ! test -d "$sm_path"/.git && ! test -f "$sm_path"/.git
>                 then
>                     git submodule--helper clone --prefix "$wt_prefix" --path "$sm_path" --name "$name" --url "$url"

The init and then clone should be covered by
"git submodule--helper update-clone", which may be better named as
"list-and-clone-if-necessary", then you get parallel cloning for free
as well. ;)

>                     pushd "$sm_path"
>                         git -c filter.lfs.smudge= -c filter.lfs.required=false checkout -q $sha1 || exit
>                         git-lfs pull || exit
>                     popd
>                 fi
>             done
>         }
>     popd
>
> Do you see an easier way to pass command line Git config instructions to the
> underlaying Git Submodule commands? If not, do you think a patch adding this
> would be worth working on?

I would build on top of origin/jk/submodule-c-credential at least, and using
"git submodule--helper update-clone" (origin/sb/submodule-parallel-update)

>
> I also started a discussion about that on the Git LFS issue page [5].

Unrelated to this, but about LFS:
Currently it is only possible to store the big blobs at a $VENDOR hosting site.
How would you backup you whole repository including all binariers?

I think we should add an option to store one blob in a dedicated ref
(refs/lfs/$blobsha1 or such). That way you can make a backup of the
repository including all large files using "git clone --mirror" and
you don't have
to rely on the $VENDOR hosting your files.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

Jacob Keller
On Mon, Apr 25, 2016 at 10:02 AM, Stefan Beller <[hidden email]> wrote:

> On Mon, Apr 25, 2016 at 3:39 AM, Lars Schneider
> <[hidden email]> wrote:
>> Hi,
>>
>> a few folks from the Git LFS project and I try to make cloning of repositories
>> with a lot of LFS files faster.
>>
>> The core problem is that Git LFS uses a Git smudge filter to replace LFS
>> pointers with the actual file content. Right now, a smudge filter can only
>> be executed on an individual file which makes the operation slow for many
>> files [1].
>>
>> We solved this issue by temporarily disabling the smudge filter for the clone
>> command via Git config (optimized in 1a8630 [2]):
>>
>>     git -c filter.lfs.smudge= -c filter.lfs.required=false clone <url> <path>
>>
>> Afterwards Git LFS runs a special command to download and replace all LFS
>> content in bulk [3]. This works great for LFS repositories.
>>
>> However, I noticed that git config command line instructions such as
>> "-c filter.lfs.smudge=" are not passed to Git submodule operations. Thus
>> this does not work as expected:
>>
>>     git -c filter.lfs.smudge= -c filter.lfs.required=false clone --recursive <url> <path>
>
> I have cc'd Jacob Keller, who authored origin/jk/submodule-c-credential,
> which does work in that area (deciding which config option to pass down
> into the submodule commands).
>

This is a tricky question. The problem is that some configurations are
obviously not intended to go into the submodules, but determining how
is somewhat troublesome. There was some discussion on this previous
thread when we added support for credential options to pass through.

Thanks,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

Jeff King
On Mon, Apr 25, 2016 at 01:59:03PM -0700, Jacob Keller wrote:

> >> However, I noticed that git config command line instructions such as
> >> "-c filter.lfs.smudge=" are not passed to Git submodule operations. Thus
> >> this does not work as expected:
> >>
> >>     git -c filter.lfs.smudge= -c filter.lfs.required=false clone --recursive <url> <path>
> >
> > I have cc'd Jacob Keller, who authored origin/jk/submodule-c-credential,
> > which does work in that area (deciding which config option to pass down
> > into the submodule commands).
> >
>
> This is a tricky question. The problem is that some configurations are
> obviously not intended to go into the submodules, but determining how
> is somewhat troublesome. There was some discussion on this previous
> thread when we added support for credential options to pass through.

Right. I think it may be reasonable to pass through filter.* in the
whitelist.  They are not activated without a matching .gitattributes
entry in the repository (and people would generally configure them in
their user-level ~/.gitconfig for that reason).

It does mean that somebody would be stuck who really wanted to run the
smudge filter in their local repo, but for some reason not in the
subrepos. I am trying to think of a case in which that might be
security-relevant if you didn't trust the sub-repos[1]. But I really
don't see it. The filter is arbitrary code, but that's specified by the
user; we're just feeding it possibly untrusted blobs.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

Jeff King
On Mon, Apr 25, 2016 at 05:24:50PM -0400, Jeff King wrote:

> It does mean that somebody would be stuck who really wanted to run the
> smudge filter in their local repo, but for some reason not in the
> subrepos. I am trying to think of a case in which that might be
> security-relevant if you didn't trust the sub-repos[1]. But I really
> don't see it. The filter is arbitrary code, but that's specified by the
> user; we're just feeding it possibly untrusted blobs.

I forgot my [1], which was going to be: I wonder if there are any
interesting things you can do by feeding git-lfs untrusted content
(e.g., convincing it to hit arbitrary URLs). But I don't think so. The
URL is derived from the remote, and the LFS pointer files just contain
hashes.

That's all orthogonal to this thread anyway, though. People using LFS
generally have the config in ~/.gitconfig, so they run it for all repos,
trusted and untrusted.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

larsxschneider
In reply to this post by Jeff King

> On 25 Apr 2016, at 23:24, Jeff King <[hidden email]> wrote:
>
> On Mon, Apr 25, 2016 at 01:59:03PM -0700, Jacob Keller wrote:
>
>>>> However, I noticed that git config command line instructions such as
>>>> "-c filter.lfs.smudge=" are not passed to Git submodule operations. Thus
>>>> this does not work as expected:
>>>>
>>>>    git -c filter.lfs.smudge= -c filter.lfs.required=false clone --recursive <url> <path>
>>>
>>> I have cc'd Jacob Keller, who authored origin/jk/submodule-c-credential,
>>> which does work in that area (deciding which config option to pass down
>>> into the submodule commands).
>>>
>>
>> This is a tricky question. The problem is that some configurations are
>> obviously not intended to go into the submodules, but determining how
>> is somewhat troublesome. There was some discussion on this previous
>> thread when we added support for credential options to pass through.
>
> Right. I think it may be reasonable to pass through filter.* in the
> whitelist.  They are not activated without a matching .gitattributes
> entry in the repository (and people would generally configure them in
> their user-level ~/.gitconfig for that reason).
>
> It does mean that somebody would be stuck who really wanted to run the
> smudge filter in their local repo, but for some reason not in the
> subrepos. I am trying to think of a case in which that might be
> security-relevant if you didn't trust the sub-repos[1]. But I really
> don't see it. The filter is arbitrary code, but that's specified by the
> user; we're just feeding it possibly untrusted blobs.

This looks like a very promising solution and I can't think of a
security problem either (I checked the older thread [1] you
referenced, too)!

I got my Git-LFS use case working with the patch below.
For me it was necessary to export GIT_CONFIG_PARAMETERS
to make it available to the Git process if the process is
invoked as follows [2]:

(sanitize_submodule_env; cd "$sm_path" && git <something>")


Exporting the variable would not be necessary in this case:

(cd "$sm_path" && sanitize_submodule_env git <something>")

Unfortunately that does not work and I think the reason is that
clear_local_git_env (invoked by sanitize_submodule_env) clears
the $GIT_DIR, too.


If there is a reason against exporting GIT_CONFIG_PARAMETERS,
then this would work, too:

(sanitize_submodule_env; cd "$sm_path" && GIT_CONFIG_PARAMETERS=$GIT_CONFIG_PARAMETERS git <something>)


Would the patch below be an acceptable solution?


Thanks,
Lars


[1] http://thread.gmane.org/gmane.comp.version-control.git/264840
[2] https://github.com/git/git/blob/3ad15fd5e17bbb73fb1161ff4e9c3ed254d5b243/git-submodule.sh#L811



diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 3bd6883..9231089 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -129,6 +129,8 @@ static int submodule_config_ok(const char *var)
 {
  if (starts_with(var, "credential."))
  return 1;
+ if (starts_with(var, "filter."))
+ return 1;
  return 0;
 }

diff --git a/git-submodule.sh b/git-submodule.sh
index 2a84d7e..b02f5b9 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -199,7 +199,7 @@ sanitize_submodule_env()
 {
  sanitized_config=$(git submodule--helper sanitize-config)
  clear_local_git_env
- GIT_CONFIG_PARAMETERS=$sanitized_config
+ export GIT_CONFIG_PARAMETERS=$sanitized_config
 }

 #


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

Jeff King
On Thu, Apr 28, 2016 at 01:06:45PM +0200, Lars Schneider wrote:

> I got my Git-LFS use case working with the patch below.
> For me it was necessary to export GIT_CONFIG_PARAMETERS
> to make it available to the Git process if the process is
> invoked as follows [2]:
>
> (sanitize_submodule_env; cd "$sm_path" && git <something>")

Hrm. I'm not sure why you need to export. Or perhaps, I am not sure why
it ever works in the first place in git-submodule.sh. In this code:

> diff --git a/git-submodule.sh b/git-submodule.sh
> index 2a84d7e..b02f5b9 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -199,7 +199,7 @@ sanitize_submodule_env()
>  {
>   sanitized_config=$(git submodule--helper sanitize-config)
>   clear_local_git_env
> - GIT_CONFIG_PARAMETERS=$sanitized_config
> + export GIT_CONFIG_PARAMETERS=$sanitized_config
>  }

If you already have $GIT_CONFIG_PARAMETERS exported when we enter the
function, then we should not need to re-export it when changing the
value in the final line (the export bit is retained by the shell). But
if you don't have it set already, then $sanitized_config must by
definition be empty.

So it should do the right thing without the export.

At the same time, clear_local_git_env() will call "unset" on
GIT_CONFIG_PARAMETERS. Which would clear the export bit, meaning the
final line doesn't ever have any impact on sub-programs, and the whole
thing is totally broken. But then, why does the test in t5550 pass?

Confused...

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

Jeff King
On Thu, Apr 28, 2016 at 07:25:11AM -0400, Jeff King wrote:

> > diff --git a/git-submodule.sh b/git-submodule.sh
> > index 2a84d7e..b02f5b9 100755
> > --- a/git-submodule.sh
> > +++ b/git-submodule.sh
> > @@ -199,7 +199,7 @@ sanitize_submodule_env()
> >  {
> >   sanitized_config=$(git submodule--helper sanitize-config)
> >   clear_local_git_env
> > - GIT_CONFIG_PARAMETERS=$sanitized_config
> > + export GIT_CONFIG_PARAMETERS=$sanitized_config
> >  }
>
> If you already have $GIT_CONFIG_PARAMETERS exported when we enter the
> function, then we should not need to re-export it when changing the
> value in the final line (the export bit is retained by the shell). But
> if you don't have it set already, then $sanitized_config must by
> definition be empty.
>
> So it should do the right thing without the export.
>
> At the same time, clear_local_git_env() will call "unset" on
> GIT_CONFIG_PARAMETERS. Which would clear the export bit, meaning the
> final line doesn't ever have any impact on sub-programs, and the whole
> thing is totally broken. But then, why does the test in t5550 pass?
>
> Confused...

Ah. t5550 passes because it does not exercise this code path at all. We
try a recursive clone, which calls "git submodule update --init", which
does not seem to clear the config at all. So it works even without
14111fc49.

I tried to improve the test by adding git-fetch (note that I also fixed
a bug where we use $HTTP_URL instead of $HTTPD_URL, and added some
whitespace to make the result more readable):

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 69ef388..6ec3ba3 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -103,12 +103,23 @@ test_expect_success 'cmdline credential config passes into submodules' '
  git submodule add "$HTTPD_URL/auth/dumb/repo.git" sub &&
  git commit -m "add submodule"
  ) &&
+
  set_askpass wrong pass@host &&
  test_must_fail git clone --recursive super super-clone &&
  rm -rf super-clone &&
+
  set_askpass wrong pass@host &&
- git -c "credential.$HTTP_URL.username=user@host" \
+ git -c "credential.$HTTPD_URL.username=user@host" \
  clone --recursive super super-clone &&
+ expect_askpass pass user@host &&
+
+ set_askpass wrong pass@host &&
+ test_must_fail git -C super-clone fetch --recurse-submodules &&
+
+ set_askpass wrong pass@host &&
+ git -C super-clone \
+    -c "credential.$HTTPD_URL.username=user@host" \
+    fetch --recurse-submodules &&
  expect_askpass pass user@host
 '
 

but that doesn't pass, even with the export fix! That's because fetch
doesn't go through git-submodule at all; it calls "git fetch" itself,
and uses local_repo_env, which clears the config. It needs to learn to
use the same mechanism that sanitize_submodule_env() does.

So AFAICT 14111fc49 is totally broken. It doesn't actually work for
git-submodule (because of the missing export), nor for git-fetch
(because that skips the shell script), and the one case we are testing
already worked without it (but probably _should_ be sanitizing the
config, so is buggy, too).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

larsxschneider
In reply to this post by Jeff King

> On 28 Apr 2016, at 13:25, Jeff King <[hidden email]> wrote:
>
> On Thu, Apr 28, 2016 at 01:06:45PM +0200, Lars Schneider wrote:
>
>> I got my Git-LFS use case working with the patch below.
>> For me it was necessary to export GIT_CONFIG_PARAMETERS
>> to make it available to the Git process if the process is
>> invoked as follows [2]:
>>
>> (sanitize_submodule_env; cd "$sm_path" && git <something>")
>
> Hrm. I'm not sure why you need to export. Or perhaps, I am not sure why
> it ever works in the first place in git-submodule.sh. In this code:
>
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index 2a84d7e..b02f5b9 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -199,7 +199,7 @@ sanitize_submodule_env()
>> {
>> sanitized_config=$(git submodule--helper sanitize-config)
>> clear_local_git_env
>> - GIT_CONFIG_PARAMETERS=$sanitized_config
>> + export GIT_CONFIG_PARAMETERS=$sanitized_config
>> }
>
> If you already have $GIT_CONFIG_PARAMETERS exported when we enter the
> function, then we should not need to re-export it when changing the
> value in the final line (the export bit is retained by the shell). But
> if you don't have it set already, then $sanitized_config must by
> definition be empty.
>
> So it should do the right thing without the export.
>
> At the same time, clear_local_git_env() will call "unset" on
> GIT_CONFIG_PARAMETERS. Which would clear the export bit, meaning the
> final line doesn't ever have any impact on sub-programs, and the whole
> thing is totally broken. But then, why does the test in t5550 pass?
>
> Confused...

I am no expert in the Submodule code but I think the cloning of
the submodules is not yet guarded with sanitize_submodule_env [3].
That means the submodule is cloned with the GIT_CONFIG_PARAMETERS
of the super project. That might explain why t5550 passes as the
credential config is only used in that area.

The submodule checkout is guarded with sanitize_submodule_env
and therefore my Git-LFS filter use case is affect.

Does this sound reasonable?

Thanks,
Lars

[3] https://github.com/git/git/blob/3ad15fd5e17bbb73fb1161ff4e9c3ed254d5b243/git-submodule.sh#L704-L711
[4] https://github.com/git/git/blob/3ad15fd5e17bbb73fb1161ff4e9c3ed254d5b243/git-submodule.sh#L811

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

Jeff King
In reply to this post by Jeff King
On Thu, Apr 28, 2016 at 08:05:04AM -0400, Jeff King wrote:

> So AFAICT 14111fc49 is totally broken. It doesn't actually work for
> git-submodule (because of the missing export), nor for git-fetch
> (because that skips the shell script), and the one case we are testing
> already worked without it (but probably _should_ be sanitizing the
> config, so is buggy, too).

This last bit is not quite accurate. The test in t5550 doesn't pass
without 14111fc49. But it _does_ pass if we make
sanitize_submodule_env() in the shell script a noop. That's because it
is going through clone_submodule() in the C code, which uses the C-only
prepare_submodule_repo_env().

So that case _is_ correct right now. It's just that t5550 isn't testing
the shell script part, which is broken. Probably running "git submodule
update" in the resulting clone would cover that.

And for the fetch case, we probably just need to be calling
prepare_submodule_repo_env() there, too.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 0/5] fixes for sanitized submodule config

Jeff King
On Thu, Apr 28, 2016 at 08:17:53AM -0400, Jeff King wrote:

> So that case _is_ correct right now. It's just that t5550 isn't testing
> the shell script part, which is broken. Probably running "git submodule
> update" in the resulting clone would cover that.
>
> And for the fetch case, we probably just need to be calling
> prepare_submodule_repo_env() there, too.

So here's a series which fixes sanitizing in the "git-submodule" shell
script, along with "git fetch". And cleans up a few things along the
way.

  [1/5]: t5550: fix typo in $HTTPD_URL
  [2/5]: t5550: break submodule config test into multiple sub-tests
  [3/5]: submodule: export sanitize GIT_CONFIG_PARAMETERS
  [4/5]: submodule--helper: move config-sanitizing to submodule.c
  [5/5]: submodule: use prepare_submodule_repo_env consistently

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 1/5] t5550: fix typo in $HTTPD_URL

Jeff King
Commit 14111fc (git: submodule honor -c credential.* from
command line, 2016-02-29) accidentally wrote $HTTP_URL. It
happened to work because we ended up with "credential..helper",
which we treat the same as "credential.helper", applying it
to all URLs.

Signed-off-by: Jeff King <[hidden email]>
---
 t/t5550-http-fetch-dumb.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 48e2ab6..81cc57f 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -103,7 +103,7 @@ test_expect_success 'cmdline credential config passes into submodules' '
  test_must_fail git clone --recursive super super-clone &&
  rm -rf super-clone &&
  set_askpass wrong pass@host &&
- git -c "credential.$HTTP_URL.username=user@host" \
+ git -c "credential.$HTTPD_URL.username=user@host" \
  clone --recursive super super-clone &&
  expect_askpass pass user@host
 '
--
2.8.1.617.gbdccc2d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 2/5] t5550: break submodule config test into multiple sub-tests

Jeff King
In reply to this post by Jeff King
Right now we test only the cloning case, but there are other
interesting cases (e.g., fetching). Let's pull the setup
bits into their own test, which will make things flow more
logically once we start adding more tests which use the
setup.

Let's also introduce some whitespace to the clone-test to
split the two parts: making sure it fails without our
cmdline config, and that it succeeds with it.

Signed-off-by: Jeff King <[hidden email]>
---
 t/t5550-http-fetch-dumb.sh | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 81cc57f..e8e91bb 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -91,17 +91,21 @@ test_expect_success 'configured username does not override URL' '
  expect_askpass pass user@host
 '
 
-test_expect_success 'cmdline credential config passes into submodules' '
+test_expect_success 'set up repo with http submodules' '
  git init super &&
  set_askpass user@host pass@host &&
  (
  cd super &&
  git submodule add "$HTTPD_URL/auth/dumb/repo.git" sub &&
  git commit -m "add submodule"
- ) &&
+ )
+'
+
+test_expect_success 'cmdline credential config passes to submodule via clone' '
  set_askpass wrong pass@host &&
  test_must_fail git clone --recursive super super-clone &&
  rm -rf super-clone &&
+
  set_askpass wrong pass@host &&
  git -c "credential.$HTTPD_URL.username=user@host" \
  clone --recursive super super-clone &&
--
2.8.1.617.gbdccc2d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 3/5] submodule: export sanitized GIT_CONFIG_PARAMETERS

Jeff King
In reply to this post by Jeff King
Commit 14111fc (git: submodule honor -c credential.* from
command line, 2016-02-29) taught git-submodule.sh to save
the sanitized value of $GIT_CONFIG_PARAMETERS when clearing
the environment for a submodule. However, it failed to
export the result, meaning that it had no effect for any
sub-programs.

We didn't catch this in our initial tests because we checked
only the "clone" case, which does not go through the shell
script at all. Provoking "git submodule update" to do a
fetch demonstrates the bug.

Noticed-by: Lars Schneider <[hidden email]>
Signed-off-by: Jeff King <[hidden email]>
---
 git-submodule.sh           |  1 +
 t/t5550-http-fetch-dumb.sh | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/git-submodule.sh b/git-submodule.sh
index 2a84d7e..3a40d4b 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -200,6 +200,7 @@ sanitize_submodule_env()
  sanitized_config=$(git submodule--helper sanitize-config)
  clear_local_git_env
  GIT_CONFIG_PARAMETERS=$sanitized_config
+ export GIT_CONFIG_PARAMETERS
 }
 
 #
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index e8e91bb..13ac788 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -112,6 +112,23 @@ test_expect_success 'cmdline credential config passes to submodule via clone' '
  expect_askpass pass user@host
 '
 
+test_expect_success 'cmdline credential config passes submodule update' '
+ # advance the submodule HEAD so that a fetch is required
+ git commit --allow-empty -m foo &&
+ git push "$HTTPD_DOCUMENT_ROOT_PATH/auth/dumb/repo.git" HEAD &&
+ sha1=$(git rev-parse HEAD) &&
+ git -C super-clone update-index --cacheinfo 160000,$sha1,sub &&
+
+ set_askpass wrong pass@host &&
+ test_must_fail git -C super-clone submodule update &&
+
+ set_askpass wrong pass@host &&
+ git -C super-clone \
+    -c "credential.$HTTPD_URL.username=user@host" \
+    submodule update &&
+ expect_askpass pass user@host
+'
+
 test_expect_success 'fetch changes via http' '
  echo content >>file &&
  git commit -a -m two &&
--
2.8.1.617.gbdccc2d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 4/5] submodule--helper: move config-sanitizing to submodule.c

Jeff King
In reply to this post by Jeff King
These functions should be used by any code which spawns a
submodule process, which may happen in submodule.c (e.g.,
for spawning fetch). Let's move them there and make them
public so that submodule--helper can continue to use them.

Sine they're now public, let's also provide a basic overview
of their intended use.

Signed-off-by: Jeff King <[hidden email]>
---
 builtin/submodule--helper.c | 48 --------------------------------------------
 submodule.c                 | 49 +++++++++++++++++++++++++++++++++++++++++++++
 submodule.h                 | 16 +++++++++++++++
 3 files changed, 65 insertions(+), 48 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 3bd6883..de3ad5b 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -119,54 +119,6 @@ static int module_name(int argc, const char **argv, const char *prefix)
  return 0;
 }
 
-/*
- * Rules to sanitize configuration variables that are Ok to be passed into
- * submodule operations from the parent project using "-c". Should only
- * include keys which are both (a) safe and (b) necessary for proper
- * operation.
- */
-static int submodule_config_ok(const char *var)
-{
- if (starts_with(var, "credential."))
- return 1;
- return 0;
-}
-
-static int sanitize_submodule_config(const char *var, const char *value, void *data)
-{
- struct strbuf *out = data;
-
- if (submodule_config_ok(var)) {
- if (out->len)
- strbuf_addch(out, ' ');
-
- if (value)
- sq_quotef(out, "%s=%s", var, value);
- else
- sq_quote_buf(out, var);
- }
-
- return 0;
-}
-
-static void prepare_submodule_repo_env(struct argv_array *out)
-{
- const char * const *var;
-
- for (var = local_repo_env; *var; var++) {
- if (!strcmp(*var, CONFIG_DATA_ENVIRONMENT)) {
- struct strbuf sanitized_config = STRBUF_INIT;
- git_config_from_parameters(sanitize_submodule_config,
-   &sanitized_config);
- argv_array_pushf(out, "%s=%s", *var, sanitized_config.buf);
- strbuf_release(&sanitized_config);
- } else {
- argv_array_push(out, *var);
- }
- }
-
-}
-
 static int clone_submodule(const char *path, const char *gitdir, const char *url,
    const char *depth, const char *reference, int quiet)
 {
diff --git a/submodule.c b/submodule.c
index 90825e1..02eaf0e 100644
--- a/submodule.c
+++ b/submodule.c
@@ -13,6 +13,7 @@
 #include "argv-array.h"
 #include "blob.h"
 #include "thread-utils.h"
+#include "quote.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int parallel_jobs = 1;
@@ -1129,3 +1130,51 @@ int parallel_submodules(void)
 {
  return parallel_jobs;
 }
+
+/*
+ * Rules to sanitize configuration variables that are Ok to be passed into
+ * submodule operations from the parent project using "-c". Should only
+ * include keys which are both (a) safe and (b) necessary for proper
+ * operation.
+ */
+static int submodule_config_ok(const char *var)
+{
+ if (starts_with(var, "credential."))
+ return 1;
+ return 0;
+}
+
+int sanitize_submodule_config(const char *var, const char *value, void *data)
+{
+ struct strbuf *out = data;
+
+ if (submodule_config_ok(var)) {
+ if (out->len)
+ strbuf_addch(out, ' ');
+
+ if (value)
+ sq_quotef(out, "%s=%s", var, value);
+ else
+ sq_quote_buf(out, var);
+ }
+
+ return 0;
+}
+
+void prepare_submodule_repo_env(struct argv_array *out)
+{
+ const char * const *var;
+
+ for (var = local_repo_env; *var; var++) {
+ if (!strcmp(*var, CONFIG_DATA_ENVIRONMENT)) {
+ struct strbuf sanitized_config = STRBUF_INIT;
+ git_config_from_parameters(sanitize_submodule_config,
+   &sanitized_config);
+ argv_array_pushf(out, "%s=%s", *var, sanitized_config.buf);
+ strbuf_release(&sanitized_config);
+ } else {
+ argv_array_push(out, *var);
+ }
+ }
+
+}
diff --git a/submodule.h b/submodule.h
index 7ef3775..7577b3b 100644
--- a/submodule.h
+++ b/submodule.h
@@ -61,4 +61,20 @@ int push_unpushed_submodules(unsigned char new_sha1[20], const char *remotes_nam
 void connect_work_tree_and_git_dir(const char *work_tree, const char *git_dir);
 int parallel_submodules(void);
 
+/*
+ * This function is intended as a callback for use with
+ * git_config_from_parameters(). It ignores any config options which
+ * are not suitable for passing along to a submodule, and accumulates the rest
+ * in "data", which must be a pointer to a strbuf. The end result can
+ * be put into $GIT_CONFIG_PARAMETERS for passing to a sub-process.
+ */
+int sanitize_submodule_config(const char *var, const char *value, void *data);
+
+/*
+ * Prepare the "env_array" parameter of a "struct child_process" for executing
+ * a submodule by clearing any repo-specific envirionment variables, but
+ * retaining any config approved by sanitize_submodule_config().
+ */
+void prepare_submodule_repo_env(struct argv_array *out);
+
 #endif
--
2.8.1.617.gbdccc2d

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 5/5] submodule: use prepare_submodule_repo_env consistently

Jeff King
In reply to this post by Jeff King
Before 14111fc (git: submodule honor -c credential.* from
command line, 2016-02-29), it was sufficient for code which
spawned a process in a submodule to just set the child
process's "env" field to "local_repo_env" to clear the
environment of any repo-specific variables.

That commit introduced a more complicated procedure, in
which we clear most variables but allow through sanitized
config. For C code, we used that procedure only for cloning,
but not for any of the programs spawned by submodule.c. As a
result, things like "git fetch --recurse-submodules" behave
differently than "git clone --recursive"; the former will
not pass through the sanitized config.

We can fix this by using prepare_submodule_repo_env()
everywhere in submodule.c.

Signed-off-by: Jeff King <[hidden email]>
---
 submodule.c                | 14 +++++++-------
 t/t5550-http-fetch-dumb.sh | 11 +++++++++++
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/submodule.c b/submodule.c
index 02eaf0e..4e76b98 100644
--- a/submodule.c
+++ b/submodule.c
@@ -394,7 +394,7 @@ static int submodule_needs_pushing(const char *path, const unsigned char sha1[20
 
  argv[1] = sha1_to_hex(sha1);
  cp.argv = argv;
- cp.env = local_repo_env;
+ prepare_submodule_repo_env(&cp.env_array);
  cp.git_cmd = 1;
  cp.no_stdin = 1;
  cp.out = -1;
@@ -481,7 +481,7 @@ static int push_submodule(const char *path)
  const char *argv[] = {"push", NULL};
 
  cp.argv = argv;
- cp.env = local_repo_env;
+ prepare_submodule_repo_env(&cp.env_array);
  cp.git_cmd = 1;
  cp.no_stdin = 1;
  cp.dir = path;
@@ -527,7 +527,7 @@ static int is_submodule_commit_present(const char *path, unsigned char sha1[20])
 
  argv[3] = sha1_to_hex(sha1);
  cp.argv = argv;
- cp.env = local_repo_env;
+ prepare_submodule_repo_env(&cp.env_array);
  cp.git_cmd = 1;
  cp.no_stdin = 1;
  cp.dir = path;
@@ -710,7 +710,7 @@ static int get_next_submodule(struct child_process *cp,
  if (is_directory(git_dir)) {
  child_process_init(cp);
  cp->dir = strbuf_detach(&submodule_path, NULL);
- cp->env = local_repo_env;
+ prepare_submodule_repo_env(&cp->env_array);
  cp->git_cmd = 1;
  if (!spf->quiet)
  strbuf_addf(err, "Fetching submodule %s%s\n",
@@ -825,7 +825,7 @@ unsigned is_submodule_modified(const char *path, int ignore_untracked)
  argv[2] = "-uno";
 
  cp.argv = argv;
- cp.env = local_repo_env;
+ prepare_submodule_repo_env(&cp.env_array);
  cp.git_cmd = 1;
  cp.no_stdin = 1;
  cp.out = -1;
@@ -886,7 +886,7 @@ int submodule_uses_gitfile(const char *path)
 
  /* Now test that all nested submodules use a gitfile too */
  cp.argv = argv;
- cp.env = local_repo_env;
+ prepare_submodule_repo_env(&cp.env_array);
  cp.git_cmd = 1;
  cp.no_stdin = 1;
  cp.no_stderr = 1;
@@ -919,7 +919,7 @@ int ok_to_remove_submodule(const char *path)
  return 0;
 
  cp.argv = argv;
- cp.env = local_repo_env;
+ prepare_submodule_repo_env(&cp.env_array);
  cp.git_cmd = 1;
  cp.no_stdin = 1;
  cp.out = -1;
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 13ac788..3484b6f 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -112,6 +112,17 @@ test_expect_success 'cmdline credential config passes to submodule via clone' '
  expect_askpass pass user@host
 '
 
+test_expect_success 'cmdline credential config passes submodule via fetch' '
+ set_askpass wrong pass@host &&
+ test_must_fail git -C super-clone fetch --recurse-submodules &&
+
+ set_askpass wrong pass@host &&
+ git -C super-clone \
+    -c "credential.$HTTPD_URL.username=user@host" \
+    fetch --recurse-submodules &&
+ expect_askpass pass user@host
+'
+
 test_expect_success 'cmdline credential config passes submodule update' '
  # advance the submodule HEAD so that a fetch is required
  git commit --allow-empty -m foo &&
--
2.8.1.617.gbdccc2d
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] How to pass Git config command line instructions to Submodule commands?

Jeff King
In reply to this post by larsxschneider
On Thu, Apr 28, 2016 at 02:05:20PM +0200, Lars Schneider wrote:

> I am no expert in the Submodule code but I think the cloning of
> the submodules is not yet guarded with sanitize_submodule_env [3].
> That means the submodule is cloned with the GIT_CONFIG_PARAMETERS
> of the super project. That might explain why t5550 passes as the
> credential config is only used in that area.
>
> The submodule checkout is guarded with sanitize_submodule_env
> and therefore my Git-LFS filter use case is affect.
>
> Does this sound reasonable?

Yes, that's exactly what's going on. git-submodule.sh's code _is_
broken, which is what you're seeing. I've just posted a series to fix
this. I think it would be reasonable to add filter.* to the whitelist on
top of that series.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 0/5] fixes for sanitized submodule config

Johannes Schindelin
In reply to this post by Jeff King
Hi Peff,

On Thu, 28 Apr 2016, Jeff King wrote:

> On Thu, Apr 28, 2016 at 08:17:53AM -0400, Jeff King wrote:
>
> > So that case _is_ correct right now. It's just that t5550 isn't testing
> > the shell script part, which is broken. Probably running "git submodule
> > update" in the resulting clone would cover that.
> >
> > And for the fetch case, we probably just need to be calling
> > prepare_submodule_repo_env() there, too.
>
> So here's a series which fixes sanitizing in the "git-submodule" shell
> script, along with "git fetch". And cleans up a few things along the
> way.

Nice!

I reviewed those changes and they all look sensible to me (did not apply
them locally for lack of time, though).

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/5] t5550: break submodule config test into multiple sub-tests

Stefan Beller-4
In reply to this post by Jeff King
On Thu, Apr 28, 2016 at 6:37 AM, Jeff King <[hidden email]> wrote:

> Right now we test only the cloning case, but there are other
> interesting cases (e.g., fetching). Let's pull the setup
> bits into their own test, which will make things flow more
> logically once we start adding more tests which use the
> setup.
>
> Let's also introduce some whitespace to the clone-test to
> split the two parts: making sure it fails without our
> cmdline config, and that it succeeds with it.
>
> Signed-off-by: Jeff King <[hidden email]>
> ---
>  t/t5550-http-fetch-dumb.sh | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
> index 81cc57f..e8e91bb 100755
> --- a/t/t5550-http-fetch-dumb.sh
> +++ b/t/t5550-http-fetch-dumb.sh
> @@ -91,17 +91,21 @@ test_expect_success 'configured username does not override URL' '
>         expect_askpass pass user@host
>  '
>
> -test_expect_success 'cmdline credential config passes into submodules' '
> +test_expect_success 'set up repo with http submodules' '

set up or setup?

$ grep -r "set up" |wc -l
69
$ grep -r "setup" |wc -l
1162

Apart from that nit, this patch looks good to me.

>         git init super &&
>         set_askpass user@host pass@host &&
>         (
>                 cd super &&
>                 git submodule add "$HTTPD_URL/auth/dumb/repo.git" sub &&
>                 git commit -m "add submodule"
> -       ) &&
> +       )
> +'
> +
> +test_expect_success 'cmdline credential config passes to submodule via clone' '
>         set_askpass wrong pass@host &&
>         test_must_fail git clone --recursive super super-clone &&
>         rm -rf super-clone &&
> +
>         set_askpass wrong pass@host &&
>         git -c "credential.$HTTPD_URL.username=user@host" \
>                 clone --recursive super super-clone &&
> --
> 2.8.1.617.gbdccc2d
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 1/5] t5550: fix typo in $HTTPD_URL

Jacob Keller
In reply to this post by Jeff King
On Thu, Apr 28, 2016 at 6:36 AM, Jeff King <[hidden email]> wrote:
> Commit 14111fc (git: submodule honor -c credential.* from
> command line, 2016-02-29) accidentally wrote $HTTP_URL. It
> happened to work because we ended up with "credential..helper",
> which we treat the same as "credential.helper", applying it
> to all URLs.

You say "credential.helper" twice here? I think that's confusing. The
patch looks perfectly fine but I am having trouble parsing this
description. 'We end up with X which we treat the same as X"?

>
> Signed-off-by: Jeff King <[hidden email]>
> ---
>  t/t5550-http-fetch-dumb.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
> index 48e2ab6..81cc57f 100755
> --- a/t/t5550-http-fetch-dumb.sh
> +++ b/t/t5550-http-fetch-dumb.sh
> @@ -103,7 +103,7 @@ test_expect_success 'cmdline credential config passes into submodules' '
>         test_must_fail git clone --recursive super super-clone &&
>         rm -rf super-clone &&
>         set_askpass wrong pass@host &&
> -       git -c "credential.$HTTP_URL.username=user@host" \
> +       git -c "credential.$HTTPD_URL.username=user@host" \
>                 clone --recursive super super-clone &&
>         expect_askpass pass user@host
>  '
> --
> 2.8.1.617.gbdccc2d
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
12