Quantcast

[PATCH] git-am: Ignore whitespace before patches

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[PATCH] git-am: Ignore whitespace before patches

Ævar Arnfjörð Bjarmason
Change git-am to ignore whitespace (as defined by sh's read) at the
beginning of patches.

Empty lines are wont to creep in at the beginning of patches, here's
an example from a raw Gmail attachment:

    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 0a  |               .|
    52 65 74 75 72 6e 2d 50  61 74 68 3a 20 3c 61 76  |Return-Path: <av|

Whitespace is also likely to appear if the user copy/pastes the patch
around, e.g. via a pastebin, or any any number of other cases. This
harms nothing and makes git-am's detection more fault tolerant.

Signed-off-by: Ævar Arnfjörð Bjarmason <[hidden email]>
---
 git-am.sh     |   16 +++++++++++++++-
 t/t4150-am.sh |   30 ++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/git-am.sh b/git-am.sh
index 1056075..1b4baa8 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -172,7 +172,21 @@ check_patch_format () {
  # otherwise, check the first few lines of the first patch to try
  # to detect its format
  {
- read l1
+ while read -r line
+ do
+ case "$line" in
+ "")
+ # Just skip whitespace
+ continue
+ ;;
+ *)
+ # First non-empty line
+ l1=$line
+ break
+ ;;
+ esac
+ done
+
  read l2
  read l3
  case "$l1" in
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 810b04b..3d089de 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -318,6 +318,36 @@ test_expect_success 'am without --committer-date-is-author-date' '
  test "$at" != "$ct"
 '
 
+test_expect_success 'am applying a patch that begins with an empty line' '
+ git checkout first &&
+ test_tick &&
+ echo > patch1-white &&
+ cat patch1 >> patch1-white &&
+ git am patch1-white &&
+ git cat-file commit HEAD | sed -e "/^\$/q" >head1 &&
+ at=$(sed -ne "/^author /s/.*> //p" head1) &&
+ ct=$(sed -ne "/^committer /s/.*> //p" head1) &&
+ test "$at" != "$ct"
+'
+
+test_expect_success 'am applying a patch that begins with many empty lines' '
+ git checkout first &&
+ test_tick &&
+ echo "   " > patch1-white2 &&
+ echo "  " >> patch1-white2 &&
+ echo " " >> patch1-white2 &&
+ echo "" >> patch1-white2 &&
+ echo " " >> patch1-white2 &&
+ echo "  " >> patch1-white2 &&
+ echo "   " >> patch1-white2 &&
+ cat patch1 >> patch1-white2 &&
+ git am patch1-white2 &&
+ git cat-file commit HEAD | sed -e "/^\$/q" >head1 &&
+ at=$(sed -ne "/^author /s/.*> //p" head1) &&
+ ct=$(sed -ne "/^committer /s/.*> //p" head1) &&
+ test "$at" != "$ct"
+'
+
 # This checks for +0000 because TZ is set to UTC and that should
 # show up when the current time is used. The date in message is set
 # by test_tick that uses -0700 timezone; if this feature does not
--
1.7.1.84.gd92f8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH] git-am: Ignore whitespace before patches

Ævar Arnfjörð Bjarmason
On Sat, May 15, 2010 at 17:23, Ævar Arnfjörð Bjarmason <[hidden email]> wrote:

> Change git-am to ignore whitespace (as defined by sh's read) at the
> beginning of patches.
>
> Empty lines are wont to creep in at the beginning of patches, here's
> an example from a raw Gmail attachment:
>
>    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
>    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 0a  |               .|
>    52 65 74 75 72 6e 2d 50  61 74 68 3a 20 3c 61 76  |Return-Path: <av|
>
> Whitespace is also likely to appear if the user copy/pastes the patch
> around, e.g. via a pastebin, or any any number of other cases. This
> harms nothing and makes git-am's detection more fault tolerant.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <[hidden email]>
> ---
>  git-am.sh     |   16 +++++++++++++++-
>  t/t4150-am.sh |   30 ++++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletions(-)
>
> diff --git a/git-am.sh b/git-am.sh
> index 1056075..1b4baa8 100755
> --- a/git-am.sh
> +++ b/git-am.sh
> @@ -172,7 +172,21 @@ check_patch_format () {
>        # otherwise, check the first few lines of the first patch to try
>        # to detect its format
>        {
> -               read l1
> +               while read -r line
> +               do
> +                       case "$line" in
> +                               "")
> +                                       # Just skip whitespace
> +                                       continue
> +                                       ;;
> +                               *)
> +                                       # First non-empty line
> +                                       l1=$line
> +                                       break
> +                                       ;;
> +                       esac
> +               done
> +
>                read l2
>                read l3
>                case "$l1" in
> diff --git a/t/t4150-am.sh b/t/t4150-am.sh
> index 810b04b..3d089de 100755
> --- a/t/t4150-am.sh
> +++ b/t/t4150-am.sh
> @@ -318,6 +318,36 @@ test_expect_success 'am without --committer-date-is-author-date' '
>        test "$at" != "$ct"
>  '
>
> +test_expect_success 'am applying a patch that begins with an empty line' '
> +       git checkout first &&
> +       test_tick &&
> +       echo > patch1-white &&
> +       cat patch1 >> patch1-white &&
> +       git am patch1-white &&
> +       git cat-file commit HEAD | sed -e "/^\$/q" >head1 &&
> +       at=$(sed -ne "/^author /s/.*> //p" head1) &&
> +       ct=$(sed -ne "/^committer /s/.*> //p" head1) &&
> +       test "$at" != "$ct"
> +'
> +
> +test_expect_success 'am applying a patch that begins with many empty lines' '
> +       git checkout first &&
> +       test_tick &&
> +       echo "   " > patch1-white2 &&
> +       echo "  " >> patch1-white2 &&
> +       echo " " >> patch1-white2 &&
> +       echo "" >> patch1-white2 &&
> +       echo " " >> patch1-white2 &&
> +       echo "  " >> patch1-white2 &&
> +       echo "   " >> patch1-white2 &&
> +       cat patch1 >> patch1-white2 &&
> +       git am patch1-white2 &&
> +       git cat-file commit HEAD | sed -e "/^\$/q" >head1 &&
> +       at=$(sed -ne "/^author /s/.*> //p" head1) &&
> +       ct=$(sed -ne "/^committer /s/.*> //p" head1) &&
> +       test "$at" != "$ct"
> +'
> +
>  # This checks for +0000 because TZ is set to UTC and that should
>  # show up when the current time is used. The date in message is set
>  # by test_tick that uses -0700 timezone; if this feature does not
> --
> 1.7.1.84.gd92f8
>
>

Adding Giuseppe Bilotta who wrote the original code I'm modifying to
the CC list.

It would be nice to get an ack or tested-by for this trivial patch. It
makes git-am & GMail integration much easier. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH] git-am: Ignore whitespace before patches

Junio C Hamano
In reply to this post by Ævar Arnfjörð Bjarmason
Ævar Arnfjörð Bjarmason  <[hidden email]> writes:

> Change git-am to ignore whitespace (as defined by sh's read) at the
> beginning of patches.
>
> Empty lines are wont to creep in at the beginning of patches, here's
> an example from a raw Gmail attachment:
>
>     20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
>     20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 0a  |               .|
>     52 65 74 75 72 6e 2d 50  61 74 68 3a 20 3c 61 76  |Return-Path: <av|
>
> Whitespace is also likely to appear if the user copy/pastes the patch
> around, e.g. via a pastebin, or any any number of other cases. This
> harms nothing and makes git-am's detection more fault tolerant.

Actually cut-and-paste is often a major source of whitespace breakage
(including tabs silently being expanded), and I personally think a patch
like this to encourage the practice is going in a wrong direction.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH] git-am: Ignore whitespace before patches

Ævar Arnfjörð Bjarmason
On Thu, Jun 10, 2010 at 16:51, Junio C Hamano <[hidden email]> wrote:

> Ævar Arnfjörð Bjarmason  <[hidden email]> writes:
>
>> Change git-am to ignore whitespace (as defined by sh's read) at the
>> beginning of patches.
>>
>> Empty lines are wont to creep in at the beginning of patches, here's
>> an example from a raw Gmail attachment:
>>
>>     20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
>>     20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 0a  |               .|
>>     52 65 74 75 72 6e 2d 50  61 74 68 3a 20 3c 61 76  |Return-Path: <av|
>>
>> Whitespace is also likely to appear if the user copy/pastes the patch
>> around, e.g. via a pastebin, or any any number of other cases. This
>> harms nothing and makes git-am's detection more fault tolerant.
>
> Actually cut-and-paste is often a major source of whitespace breakage
> (including tabs silently being expanded), and I personally think a patch
> like this to encourage the practice is going in a wrong direction.

It doesn't encourage that copy/paste. It's just tangentally mentioned
in the commit message since it's a plausable use case.

What it does is enable the GMail -> download -> git-am workflow. GMail
(and doubtless countless other) E-Mail providers introduce whitespace
at the beginning of raw E-Mail messages, while otherwise leaving them
intact.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[PATCH v2] git-am: Ignore whitespace before patches

Ævar Arnfjörð Bjarmason
In reply to this post by Ævar Arnfjörð Bjarmason
Change git-am to ignore whitespace (as defined by sh's read) at the
beginning of patches.

This makes git-am work with patches downloaded from the GMail web
interface, here's an example from a raw Gmail attachment produced with
`hexdump -C':

    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 0a  |               .|
    52 65 74 75 72 6e 2d 50  61 74 68 3a 20 3c 61 76  |Return-Path: <av|

Having to tell GMail users that they must manually edit their patches
before git-am will accept them (as this article does:
http://evag.evn.am/git/git-and-gmail) isn't optimal.

This change is probably useful for other things than GMail patch
downloads, whitespace is also likely to appear if the user copy/pastes
the patch around, e.g. via a pastebin, or any any number of other
cases. This change harms nothing and makes git-am's detection more
fault tolerant.

Signed-off-by: Ævar Arnfjörð Bjarmason <[hidden email]>
---

I originally sent this on July 8 but it was never picked up. Junio commented:

>> Whitespace is also likely to appear if the user copy/pastes the patch
>> around, e.g. via a pastebin, or any any number of other cases. This
>> harms nothing and makes git-am's detection more fault tolerant.
>
> Actually cut-and-paste is often a major source of whitespace breakage
> (including tabs silently being expanded), and I personally think a patch
> like this to encourage the practice is going in a wrong direction.

I disagree and think git-am should be smarter. Any human looking at
something like a GMail mail.txt download will clearly see that it's a
patch, but git-am is pedantic and doesn't skip past whitespace at the
beginning of the file.

I think it should have more smarts and less pedanticness, and I run
into this bug every time I download a patch via GMail.

So please pick it up, thanks.

 git-am.sh     |   16 +++++++++++++++-
 t/t4150-am.sh |   30 ++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/git-am.sh b/git-am.sh
index e7f008c..4ed8544 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -173,7 +173,21 @@ check_patch_format () {
  # otherwise, check the first few lines of the first patch to try
  # to detect its format
  {
- read l1
+ while read -r line
+ do
+ case "$line" in
+ "")
+ # Just skip whitespace
+ continue
+ ;;
+ *)
+ # First non-empty line
+ l1=$line
+ break
+ ;;
+ esac
+ done
+
  read l2
  read l3
  case "$l1" in
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 810b04b..3d089de 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -318,6 +318,36 @@ test_expect_success 'am without --committer-date-is-author-date' '
  test "$at" != "$ct"
 '
 
+test_expect_success 'am applying a patch that begins with an empty line' '
+ git checkout first &&
+ test_tick &&
+ echo > patch1-white &&
+ cat patch1 >> patch1-white &&
+ git am patch1-white &&
+ git cat-file commit HEAD | sed -e "/^\$/q" >head1 &&
+ at=$(sed -ne "/^author /s/.*> //p" head1) &&
+ ct=$(sed -ne "/^committer /s/.*> //p" head1) &&
+ test "$at" != "$ct"
+'
+
+test_expect_success 'am applying a patch that begins with many empty lines' '
+ git checkout first &&
+ test_tick &&
+ echo "   " > patch1-white2 &&
+ echo "  " >> patch1-white2 &&
+ echo " " >> patch1-white2 &&
+ echo "" >> patch1-white2 &&
+ echo " " >> patch1-white2 &&
+ echo "  " >> patch1-white2 &&
+ echo "   " >> patch1-white2 &&
+ cat patch1 >> patch1-white2 &&
+ git am patch1-white2 &&
+ git cat-file commit HEAD | sed -e "/^\$/q" >head1 &&
+ at=$(sed -ne "/^author /s/.*> //p" head1) &&
+ ct=$(sed -ne "/^committer /s/.*> //p" head1) &&
+ test "$at" != "$ct"
+'
+
 # This checks for +0000 because TZ is set to UTC and that should
 # show up when the current time is used. The date in message is set
 # by test_tick that uses -0700 timezone; if this feature does not
--
1.7.2.1.295.gdf931

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH v2] git-am: Ignore whitespace before patches

Jon Seymour
On Thu, Aug 12, 2010 at 5:57 AM, Ævar Arnfjörð Bjarmason
<[hidden email]> wrote:

> Change git-am to ignore whitespace (as defined by sh's read) at the
> beginning of patches.
>
> This makes git-am work with patches downloaded from the GMail web
> interface, here's an example from a raw Gmail attachment produced with
> `hexdump -C':
>
>    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
>    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 0a  |               .|
>    52 65 74 75 72 6e 2d 50  61 74 68 3a 20 3c 61 76  |Return-Path: <av|
>
> Having to tell GMail users that they must manually edit their patches
> before git-am will accept them (as this article does:
> http://evag.evn.am/git/git-and-gmail) isn't optimal.

This is a good point. Current behaviour discourages testing of patches
as delivered by e-mail since the GMail user is more likely to overlook
actual whitespace errors in a patch because they come to expect this
usual failure.

jon.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH v2] git-am: Ignore whitespace before patches

Ævar Arnfjörð Bjarmason
On Wed, Aug 11, 2010 at 22:50, Jon Seymour <[hidden email]> wrote:

> On Thu, Aug 12, 2010 at 5:57 AM, Ævar Arnfjörð Bjarmason
> <[hidden email]> wrote:
>> Change git-am to ignore whitespace (as defined by sh's read) at the
>> beginning of patches.
>>
>> This makes git-am work with patches downloaded from the GMail web
>> interface, here's an example from a raw Gmail attachment produced with
>> `hexdump -C':
>>
>>    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
>>    20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 0a  |               .|
>>    52 65 74 75 72 6e 2d 50  61 74 68 3a 20 3c 61 76  |Return-Path: <av|
>>
>> Having to tell GMail users that they must manually edit their patches
>> before git-am will accept them (as this article does:
>> http://evag.evn.am/git/git-and-gmail) isn't optimal.
>
> This is a good point. Current behaviour discourages testing of patches
> as delivered by e-mail since the GMail user is more likely to overlook
> actual whitespace errors in a patch because they come to expect this
> usual failure.

Just to clarify, git-am doesn't print a whitespace error on GMail
patches currently, the detection just fails:

    $ git am ~/Desktop/mail.txt
    Patch format detection failed.

But with my patch:

    $ git am ~/Desktop/mail.txt
    Applying: git-am: Ignore whitespace before patches
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH v2] git-am: Ignore whitespace before patches

Jay Soffian
In reply to this post by Ævar Arnfjörð Bjarmason
On Wed, Aug 11, 2010 at 3:57 PM, Ævar Arnfjörð Bjarmason
<[hidden email]> wrote:
>> Actually cut-and-paste is often a major source of whitespace breakage
>> (including tabs silently being expanded), and I personally think a patch
>> like this to encourage the practice is going in a wrong direction.
>
> I disagree and think git-am should be smarter. Any human looking at
> something like a GMail mail.txt download will clearly see that it's a
> patch, but git-am is pedantic and doesn't skip past whitespace at the
> beginning of the file.

The point of git-am being pedantic is to prevent the original patch
from being applied w/silent corruption (e.g., tabs-to-spaces).

Perhaps, before making git-am less strict, we should modify
format-patch to include a sha1 of the diff output so that corruption
can be reliably detected by git-am.

Just a thought.

j.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH v2] git-am: Ignore whitespace before patches

Ævar Arnfjörð Bjarmason
On Thu, Aug 12, 2010 at 19:13, Jay Soffian <[hidden email]> wrote:

> On Wed, Aug 11, 2010 at 3:57 PM, Ævar Arnfjörð Bjarmason
> <[hidden email]> wrote:
>>> Actually cut-and-paste is often a major source of whitespace breakage
>>> (including tabs silently being expanded), and I personally think a patch
>>> like this to encourage the practice is going in a wrong direction.
>>
>> I disagree and think git-am should be smarter. Any human looking at
>> something like a GMail mail.txt download will clearly see that it's a
>> patch, but git-am is pedantic and doesn't skip past whitespace at the
>> beginning of the file.
>
> The point of git-am being pedantic is to prevent the original patch
> from being applied w/silent corruption (e.g., tabs-to-spaces).

The git-am code doesn't strike me as particularly pedantic. It'll fail
to spot any number of common pitfalls like patches being pasted inline
(recently discussed on list), double encoding, invalid author names /
E-Mail addresses etc. (which can e.g. happen when applying patches
from RT).

The parsing code just didn't think of this issue, I'm not aware of any
corruption that begins with whitespace being added to the beginning of
a patch, but I am aware of a non-corruption (GMail) that does that.

> Perhaps, before making git-am less strict

I don't think it's less strict with this patch, just more intelligent.

> we should modify format-patch to include a sha1 of the diff output
> so that corruption can be reliably detected by git-am.

There's a lot we could do in this department, and there was a previous
discussion on list about schemas like that (can't find it now).

We could do an ad-hoc checksum, but including more than the SHA would
be better, e.g.:

    --
    cs:<7 char SHA1> t:<NUM TABS> s:<NUM SPACES> c:<NUM CHARS ([^
\t])> ln:<NUM LINES>

That'd allow git-am to print more intelligent error messages than just
"ok/not ok", e.g.:

    * "your patch is $x lines, but the patch thinks it's $y, something
       may have gone wrong with wrapping"

    * "You have 0 tabs, but the patch thinks it has 20"

etc., but that's a project for another day.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH v2] git-am: Ignore whitespace before patches

Jonathan Nieder-2
In reply to this post by Jay Soffian
Jay Soffian wrote:

> Perhaps, before making git-am less strict, we should modify
> format-patch to include a sha1 of the diff output so that corruption
> can be reliably detected by git-am.

I seem to remember a discussion about hand-munging and rebasing of
patches suggesting such verification might be a bad idea[1].  I dunno.

[1] but all a search turned up is this:
http://thread.gmane.org/gmane.comp.version-control.git/136008/focus=136234
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH v2] git-am: Ignore whitespace before patches

Junio C Hamano
Jonathan Nieder <[hidden email]> writes:

> Jay Soffian wrote:
>
>> Perhaps, before making git-am less strict, we should modify
>> format-patch to include a sha1 of the diff output so that corruption
>> can be reliably detected by git-am.
>
> I seem to remember a discussion about hand-munging and rebasing of
> patches suggesting such verification might be a bad idea[1].  I dunno.
>
> [1] but all a search turned up is this:
> http://thread.gmane.org/gmane.comp.version-control.git/136008/focus=136234

You meant this perhaps.

http://thread.gmane.org/gmane.comp.version-control.git/138084/focus=138100

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Loading...