email address handling

Re: email address handling

Andrew Morton-9
On Fri, 1 Aug 2008 18:15:39 -0400 Theodore Tso <[hidden email]> wrote:

> How about this as a compromise?  Git continues to store the names in
> its internal format as it always does, but there is a configuration
> option which controls whether the various Author: and Committer:
> fields when displayed by git-log are in RFC-822 format or not.

Well I believe/expect/hope that git's name+email-address transformation
goes via a lookup in the kernel's .mailmap file.

And the existing .mailmap appears to have taken care that all the
"name" parts are in an MUA-usable form.  There are no periods or
commas.

So if everyone had a .mailmap entry then

- The Author: lines would all be MUA usable

- The Author lines would all be in their owners' preferred form.   I mean,
  converting

        "Morton, Andrew"

  into

        Morton, Andrew

  didn't improve things much.
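
As a rough illustration of the lookup described above, here is a minimal sketch in C. It assumes the simple "Proper Name <email>" line format that the .mailmap entries quoted in this thread follow. The function names and the exact-match comparison are assumptions made for the example; this is not git's actual mailmap code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split a mailmap-style line "Proper Name <email>" into its two parts.
 * Returns 0 on success, -1 if there is no <...> or a buffer is too small.
 */
static int parse_mailmap_line(const char *line,
			      char *name, size_t namelen,
			      char *email, size_t emaillen)
{
	const char *lt = strchr(line, '<');
	const char *gt = lt ? strchr(lt, '>') : NULL;
	size_t nlen, elen;

	if (!lt || !gt)
		return -1;

	/* The name is everything before '<', minus trailing spaces. */
	nlen = (size_t)(lt - line);
	while (nlen > 0 && line[nlen - 1] == ' ')
		nlen--;
	elen = (size_t)(gt - lt - 1);
	if (nlen >= namelen || elen >= emaillen)
		return -1;

	memcpy(name, line, nlen);
	name[nlen] = '\0';
	memcpy(email, lt + 1, elen);
	email[elen] = '\0';
	return 0;
}

/*
 * Return the preferred name for `email` (stored in `buf`), or `fallback`
 * if no line matches. Hypothetical helper: exact byte comparison only.
 */
static const char *mailmap_name_for(const char *const *lines, size_t nlines,
				    const char *email, const char *fallback,
				    char *buf, size_t buflen)
{
	char addr[256];
	size_t i;

	for (i = 0; i < nlines; i++)
		if (!parse_mailmap_line(lines[i], buf, buflen,
					addr, sizeof(addr)) &&
		    !strcmp(addr, email))
			return buf;
	return fallback;
}
```

Note that the name comes back exactly as written in the file, so whether it is MUA-usable depends entirely on how the entry was typed.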


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: email address handling

Linus Torvalds-3


On Fri, 1 Aug 2008, Andrew Morton wrote:
>
> And the existing .mailmap appears to have taken care that all the
> "name" parts are in an MUA-usable form.  There are no periods or
> commas.

Umm. Or quotes? I don't think so. Or even periods? You must not have
looked at things, I found one at the very first screenful.

        Ed L. Cashin <[hidden email]>
        Paolo 'Blaisorblade' Giarrusso <[hidden email]>
        S.Çağlar Onur <[hidden email]>

and that's just basically ignoring the fact that we only add mailmap
entries for people who can't get it right other ways (where admittedly
sometimes the "can't get it right" comes from the people in between: poor
Çağlar has had his name corrupted so many times that it's funny).

                        Linus

Re: email address handling

Linus Torvalds-3
In reply to this post by Andrew Morton-9


Btw, the real issue here is

 - why do you want to make things uglier and make up stupid rules that are
   irrelevant to git, just for something that you admit you hadn't ever
   even _noticed_ until now, and now that you know about it it's not even
   a problem any more?

especially as

 - we know people won't do the quoting _anyway_, since we actually have
   tons of examples of that in the kernel as-is.

Quoting should be for _tools_, not for people. And even if we did it, we
probably wouldn't be fully rfc2822-compliant anyway, because anybody sane
would decide to not quote '@' and '.', right?

Because those don't actually really have special meaning (yeah, they are
"special" characters in rfc-2822, but nobody cares, and the MUA can do it
for us, no)?

So now we'd actually not really be rfc-compliant _anyway_, because
everybody really realizes just how annoying that would really be.

                Linus

Re: email address handling

Andrew Morton-9
In reply to this post by Linus Torvalds-3
On Fri, 1 Aug 2008 15:23:23 -0700 (PDT) Linus Torvalds <[hidden email]> wrote:

>
>
> On Fri, 1 Aug 2008, Andrew Morton wrote:
> >
> > I preserve the quotes (when present) in signoffs for this exact reason.
>
> You must be one of the few ones.

Not the only one.  See d67d1c7bf948341fd8678c8e337ec27f4b46b206,
3bf2e77453a87c22eb57ed4926760ac131c84459, ...

> According to the RFC's, you should quote
> pretty much any punctuation mark, including "." itself. Which means that
> things like
>
> Signed-off-by: David S. Miller <[hidden email]>
>
> should be quoted if they were email addresses.
>
> That would be very irritating.

Yeah, it's ugly as sin.  But it has usability benefits.  Few people
actually need this treatment.

> It's even _more_ irritating for things like D'Souza (or Giuseppe D'Eliseo
> to take a real example from the kernel).  For David, we could just not use
> the "S." - for others, the special characters are very much part of the
> name. It would also be very irritating for important messages like
>
> Signed-off-by: Linus "I'm a moron" Torvalds <[hidden email]>
>
> etc, where it sure as heck isn't a rfc2822-compliant email address.

It might be.  Look at this guy:

From: Josef "Jeff" Sipek <[hidden email]>

Who later did an edit and became

From: "Josef 'Jeff' Sipek" <[hidden email]>

> So the thing is, "strict email format" is just very annoying. Git does
> know how to do (well, it _should_) it for "git send-email", but making the
> human-readable output ugly just because somebody might want to
> cut-and-paste it sounds really sad.

It didn't make human-readable output ugly.  It was already ugly and it
just left it alone so it was still usable.

> You could cut-and-paste just the stuff inside the angle brackets, though.
> That should work.

Sure.  I like to include people's names though.

Perhaps a suitable solution to all this would be to teach more things
to use .mailmap transformations and to update that file more.

otoh, if people really want to present themselves to the world in a
name-reversed, comma-stuffed, quote-wrapped form then that was their
choice..


Re: email address handling

Andrew Morton-9
In reply to this post by Linus Torvalds-3
On Fri, 1 Aug 2008 15:34:16 -0700 (PDT) Linus Torvalds <[hidden email]> wrote:

>
>
> On Fri, 1 Aug 2008, Andrew Morton wrote:
> >
> > And the existing .mailmap appears to have taken care that all the
> > "name" parts are in an MUA-usable form.  There are no periods or
> > commas.
>
> Umm. Or quotes? I don't think so. Or even periods? You must not have
> looked at things, I found one at the very first screenful.
>
> Ed L. Cashin <[hidden email]>
> Paolo 'Blaisorblade' Giarrusso <[hidden email]>
> S.__a__lar Onur <[hidden email]>

oh.  So .mailmap isn't usable either.  Argh.

I guess it'd be fairly simple to slap quotes around anything which
contains fishy characters.

> and that's just basically ignoring the fact that we only add mailmap
> entries for people who can't get it right other ways (where admittedly
> sometimes the "can't get it right" comes from the people in between: poor
> __a__lar has had his name corrupted so many times that it's funny).
  ^^^^^^^^ (lol)
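
The "slap quotes around anything which contains fishy characters" idea could look roughly like the sketch below. It assumes "fishy" means the RFC 2822 `specials` set, and the function names are hypothetical. Non-ASCII bytes are passed through untouched, so this covers only the quoting half of the problem and not the charset half.

```c
#include <stddef.h>
#include <string.h>

/* RFC 2822 "specials": characters that may not appear bare in a phrase. */
static int is_rfc2822_special(char ch)
{
	return ch != '\0' && strchr("()<>[]:;@\\,.\"", ch) != NULL;
}

/*
 * Copy `name` into `out`. If the name contains a special character, wrap it
 * in double quotes and backslash-escape any embedded '"' or '\'.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int quote_display_name(const char *name, char *out, size_t outlen)
{
	size_t i, n = 0, len = strlen(name);
	int need_quotes = 0;

	for (i = 0; i < len; i++)
		if (is_rfc2822_special(name[i]))
			need_quotes = 1;

	if (!need_quotes) {
		if (len + 1 > outlen)
			return -1;
		memcpy(out, name, len + 1);
		return 0;
	}

	/* Worst case: every byte escaped, plus two quotes and the NUL. */
	if (outlen < 2 * len + 3)
		return -1;
	out[n++] = '"';
	for (i = 0; i < len; i++) {
		if (name[i] == '"' || name[i] == '\\')
			out[n++] = '\\';
		out[n++] = name[i];
	}
	out[n++] = '"';
	out[n] = '\0';
	return 0;
}
```

With this rule a name such as `Ed L. Cashin` comes out wrapped in double quotes, while a name containing only apostrophes is left alone, because `'` is not in the specials set.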


Re: email address handling

Andrew Morton-9
In reply to this post by Linus Torvalds-3
On Fri, 1 Aug 2008 15:39:37 -0700 (PDT) Linus Torvalds <[hidden email]> wrote:

>
>
> Btw, the real issue here is
>
>  - why do you want to make things uglier and make up stupid rules that are
>    irrelevant to git, just for something that you admit you hadn't ever
>    even _noticed_ until now, and now that you know about it it's not even
>    a problem any more?

None of that is correct.

The real issue here is:

 - Why do you want to take usable RFC-compliant email addresses and
   mangle them in a manner which still doesn't match the person's
   actual name and which makes unsuspecting users of git potentially
   lose important email communications?

Ain't framing great?

> especially as
>
>  - we know people won't do the quoting _anyway_, since we actually have
>    tons of examples of that in the kernel as-is.
>
> Quoting should be for _tools_, not for people. And even if we did it, we
> probably wouldn't be fully rfc2822-compliant anyway, because anybody sane
> would decide to not quote '@' and '.', right?
>
> Because those don't actually really have special meaning (yeah, they are
> "special" characters in rfc-2822, but nobody cares, and the MUA can do it
> for us, no)?
>
> So now we'd actually not really be rfc-compliant _anyway_, because
> everybody really realizes just how annoying that would really be.
>

Linus, just admit it: copying and pasting from git-log output into the MUA
is *useful*.  And you've made it less reliable.


Re: email address handling

Linus Torvalds-3
In reply to this post by Linus Torvalds-3


On Fri, 1 Aug 2008, Linus Torvalds wrote:
>
> S.Çağlar Onur <[hidden email]>

Btw, poor guy is _really_ screwed. He'd show up as

        "=?utf-8?q?S=2E=C3=87a=C4=9Flar?= Onur" <[hidden email]>

which must really hurt.

Can you not see how STUPID it would be to say that the name should be
shown as an email encoding requires it?

Really. Just admit that you were wrong. The fact is, asking for rfc2822
encoding in logs etc is a HORRIBLY HORRIBLY stupid thing to do.

What you really wanted was just something you could cut-and-paste into your
mailer. Which actually means that the only special character is probably
",", and your claims of how bad the design was that it didn't leave the
total mess that rfc2822 is was actually not true, and was based on simply
not knowing how nasty the real world is...

Quite frankly, if I had one of the Finnish special characters in my name,
I'd piss on your grave if you suggested that. Try to guess what something
like

         =?ISO-8859-15?Q?Linus_T=F6rnqvist?= <[hidden email]>

is supposed to be.

                        Linus
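
For readers who do want to "guess" what such a string is supposed to be, the following is a minimal sketch of decoding a single RFC 2047 Q-encoded word. The function names are made up for the example. It handles only the `Q` encoding (not `B`), expects one encoded-word with nothing around it, and leaves the bytes in the named charset without converting them.

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Decode one "=?charset?Q?text?=" encoded-word into raw bytes.
 * Returns the decoded length, or -1 on malformed input or a short buffer.
 */
static int decode_q_word(const char *word, char *charset, size_t cslen,
			 char *out, size_t outlen)
{
	const char *cs, *cs_end, *text, *end;
	size_t n = 0;

	if (strncmp(word, "=?", 2) != 0)
		return -1;
	cs = word + 2;
	cs_end = strchr(cs, '?');
	if (!cs_end || (size_t)(cs_end - cs) >= cslen)
		return -1;
	if ((cs_end[1] != 'Q' && cs_end[1] != 'q') || cs_end[2] != '?')
		return -1;
	text = cs_end + 3;
	end = strstr(text, "?=");
	if (!end)
		return -1;

	memcpy(charset, cs, (size_t)(cs_end - cs));
	charset[cs_end - cs] = '\0';

	while (text < end) {
		char ch = *text++;

		if (n + 1 >= outlen)
			return -1;
		if (ch == '_') {
			ch = ' ';		/* '_' stands for a space */
		} else if (ch == '=') {
			int hi, lo;

			if (end - text < 2)
				return -1;
			hi = hexval(text[0]);
			lo = hexval(text[1]);
			if (hi < 0 || lo < 0)
				return -1;
			ch = (char)(hi * 16 + lo);	/* "=XX" is one byte */
			text += 2;
		}
		out[n++] = ch;
	}
	out[n] = '\0';
	return (int)n;
}
```

Applied to the ISO-8859-15 example above, this yields the charset name plus 15 bytes in which `=F6` has become the single byte 0xF6. Turning that byte into a displayable character is a separate charset-conversion step that this sketch does not attempt.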

Re: email address handling

Linus Torvalds-3
In reply to this post by Andrew Morton-9


On Fri, 1 Aug 2008, Andrew Morton wrote:
> > S.__a__lar Onur <[hidden email]>
>
> oh.  So .mailmap isn't usable either.  Argh.

Btw, your mailer really is broken. It seems to have turned my correct
utf-8 email into US-ASCII.

Or at least it was correct when it came back to me. I don't see the
corruption. But your mailer seems to be unable to handle any complex
character sets and did

        X-Mailer: Sylpheed 2.4.8 (GTK+ 2.12.5; x86_64-redhat-linux-gnu)
        Mime-Version: 1.0
        Content-Type: text/plain; charset=US-ASCII

and I wonder why?

Yeah, I feel superior, because alpine actually gets things right these
days. I too used to be character-set-confused.

                        Linus

Re: email address handling

Linus Torvalds-3
In reply to this post by Andrew Morton-9


On Fri, 1 Aug 2008, Andrew Morton wrote:
>
> Linus, just admit it: copying and pasting from git-log output into the MUA
> is *useful*.  And you've made it less reliable.

Oh, I admit it is useful.

But your "solution" is actually MUCH MUCH MUCH worse than what git does.

That's my argument here. Life is tough.  Not everything is going to be
easy. Your solution would "work", but it would be a horrid piece of crap.

                Linus

Re: email address handling

Andrew Morton-9
In reply to this post by Linus Torvalds-3
On Fri, 1 Aug 2008 15:52:36 -0700 (PDT) Linus Torvalds <[hidden email]> wrote:

>
>
> On Fri, 1 Aug 2008, Andrew Morton wrote:
> > > S.__a__lar Onur <[hidden email]>
> >
> > oh.  So .mailmap isn't usable either.  Argh.
>
> Btw, your mailer really is broken. It seems to have turned my correct
> utf-8 email into US-ASCII.
>
> Or at least it was correct when it came back to me. I don't see the
> corruption. But your mailer seems to be unable to handle any complex
> character sets and did
>
> X-Mailer: Sylpheed 2.4.8 (GTK+ 2.12.5; x86_64-redhat-linux-gnu)
> Mime-Version: 1.0
> Content-Type: text/plain; charset=US-ASCII
>
> and I wonder why?
>
> Yeah, I feel superior, because alpine actually gets things right these
> days. I too used to be character-set-confused.
>

sylpheed.  If you use its internal editor it mostly gets things right.
But if you use its use-external-editor feature it messes up those
things when saving out to its temporary file.  And it was written by a
Japanese guy.

I'll often fix it in changelogs by re-editing the changelog and doing
a copy-n-paste from sylpheed's display window into the editor, which
does work.  All a bit of a pain though.


Re: email address handling

Linus Torvalds-3
In reply to this post by Linus Torvalds-3


On Fri, 1 Aug 2008, Linus Torvalds wrote:
>
> That's my argument here. Life is tough.  Not everything is going to be
> easy. Your solution would "work", but it would be a horrid piece of crap.

..and I really think that the

        "=?utf-8?q?S=2E=C3=87a=C4=9Flar?= Onur" <[hidden email]>

example should be the one that makes you say "Ok, you're right".

The undeniable fact is, if we kept things in that format, even your broken
mailer wouldn't have corrupted it. You could cut-and-paste things, and
they'd show up correctly at the other end, regardless of whether the
problem is with your mailer or with the cut-and-paste, or anything else.

So clearly, "=?utf-8?q?S=2E=C3=87a=C4=9Flar?= Onur" _must_ be the superior
format that git should have used, no?

Because clearly that is the most automation-friendly thing that _never_
requires anybody to think at all, and you can cut-and-paste it between
programs without ever having to worry about anything at all. No special
characters, no special meanings, no need to worry about limitations of
implementation.

So the fact that git completely FUCKS IT UP, and when you do 'git log' git
will have corrupted this to

        Author: S.Çağlar Onur <[hidden email]>

is clearly git doing the wrong thing. Right?

WRONG.

The fact is, git does the right thing. And yes, it means that you cannot
just blindly cut-and-paste. And yes, it means that your mailer actually
has to work right for you to even -see- the right email address. And yes,
it means that any number of things can screw up, and corrupt it.

But it is STILL the right thing. Because what matters more than your
ability to cut-and-paste or anything like that is the fact that we should
make things look sane.

The thing is, you can actually get git to output the crazy names. Just do

        git show --pretty=email 37a4c940749670671adab211a2d9c9fed9f3f757

and now you get the email-prettified thing for at least the author. No,
git won't corrupt the actual message, so the Signed-off-by: lines will
still show Çağlar's first name, but you can actually get back that odd
format.

(In fact, --pretty=email will do it as

        From: =?utf-8?q?S.=C3=87a=C4=9Flar=20Onur?= <[hidden email]>

which is admittedly _even_uglier_, but whatever.. The difference between
really f*cking ugly and really f*cking uglier is not really relevant).

                        Linus
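
To make the shape of that `From:` value concrete, here is a small sketch of a Q-encoder that escapes non-ASCII bytes, `=`, `?`, `_` and the space character as `=XX`. The function names are hypothetical and the escape set is an assumption chosen to match the output quoted above; it is not a copy of git's pretty.c. The sketch also ignores the 75-character limit that RFC 2047 places on an encoded-word.

```c
#include <stdio.h>
#include <string.h>

/* Bytes this sketch chooses to write as "=XX". */
static int needs_q_escape(unsigned char ch)
{
	return ch >= 0x80 || ch == '=' || ch == '?' || ch == '_' || ch == ' ';
}

/*
 * Write `text` (bytes already in `charset`) to `out` as a single
 * "=?charset?q?...?=" encoded-word.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int encode_q_word(const char *charset, const char *text,
			 char *out, size_t outlen)
{
	size_t n;
	int w = snprintf(out, outlen, "=?%s?q?", charset);

	if (w < 0 || (size_t)w >= outlen)
		return -1;
	n = (size_t)w;

	for (; *text; text++) {
		unsigned char ch = (unsigned char)*text;

		/* Keep room for "=XX", the closing "?=" and the NUL. */
		if (n + 3 + 2 + 1 > outlen)
			return -1;
		if (needs_q_escape(ch))
			n += (size_t)sprintf(out + n, "=%02X", (unsigned)ch);
		else
			out[n++] = (char)ch;
	}

	if (n + 3 > outlen)
		return -1;
	memcpy(out + n, "?=", 3);	/* copies the terminating NUL too */
	return 0;
}
```

Feeding it the UTF-8 bytes of the author name from the commit above, with charset `utf-8`, reproduces the `From:` value shown: the `.` stays literal and the space becomes `=20`.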

Re: email address handling

Johannes Schindelin
In reply to this post by Junio C Hamano
Hi,

On Fri, 1 Aug 2008, Junio C Hamano wrote:

> Johannes Schindelin <[hidden email]> writes:
>
> > On Fri, 1 Aug 2008, Andrew Morton wrote:
> >
> >> I very very frequently copy and paste name+email address out of git
> >> output and into an MUA.  Have done it thousands and thousands of times,
> >> and it has always worked.  I'm sure that many others do the same thing.
>
> >
> > $ git log --pretty=email
> >
> > after this patch:
>
> You are quoting only Author: and not Signed-off-by: and Cc: that are used
> for e-mail purposes.

You might have realized that this was not a proper patch with a commit
message and a SOB?

As for Cc: I agree.  But not for S-O-B: this is not an email header.  And
I was very specific in only changing the behavior for "pretty=email".

At least _I_ was surprised that pretty=email did not behave as if it was
outputting email headers.

I agree with Linus for pretty=non-email, but not at all for pretty=email.

> I already said send-email is the right place to do this kind of thing,
> didn't I?

For the given scenario send-email is completely irrelevant.

Ciao,
Dscho


Re: email address handling

Johannes Schindelin
In reply to this post by Linus Torvalds-3
Hi,

On Fri, 1 Aug 2008, Linus Torvalds wrote:

> The thing is, you can actually get git to output the crazy names. Just
> do
>
> git show --pretty=email 37a4c940749670671adab211a2d9c9fed9f3f757
>
> and now you get the email-prettified thing for at least the author.

Ah, there lies the rub (you forgot that the original complaint was about
a comma, and pretty=email does not handle those):

-- snipsnap --

 pretty.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/pretty.c b/pretty.c
index 33ef34a..9db0333 100644
--- a/pretty.c
+++ b/pretty.c
@@ -79,7 +79,8 @@ int non_ascii(int ch)
 
 static int is_rfc2047_special(char ch)
 {
-	return (non_ascii(ch) || (ch == '=') || (ch == '?') || (ch == '_'));
+	return (non_ascii(ch) || (ch == '=') || (ch == '?') || (ch == '_') ||
+		(ch == ',') || (ch == '"') || (ch == '\''));
 }
 
 static void add_rfc2047(struct strbuf *sb, const char *line, int len,
@@ -89,7 +90,7 @@ static void add_rfc2047(struct strbuf *sb, const char *line, int len,
 
 	for (i = 0; i < len; i++) {
 		int ch = line[i];
-		if (non_ascii(ch))
+		if (is_rfc2047_special(ch))
 			goto needquote;
 		if ((i + 1 < len) && (ch == '=' && line[i+1] == '?'))
 			goto needquote;

Re: email address handling

Linus Torvalds-3


On Sat, 2 Aug 2008, Johannes Schindelin wrote:
>
> Ah, there lies the rub (you forgot that the original complaint was about
> a comma, and pretty=email does not handle those):

Indeed.

I wonder where that is_rfc2047_special() function came from.  The list of
"special" characters is totally bogus.

The real RFC has comma, but it has a lot of other characters too:

  especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" /
              "\" / <"> / "/" / "[" / "]" / "?" / "." / "="

because basically the rfc2047 encoding has to be a superset of the 822
(and later 2822) encodings.

                Linus
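
For comparison with the patch under discussion, a membership test for the `especials` list from RFC 2047 could be sketched as below. The function names are invented for the example, and it only answers "is this byte in the list (or non-ASCII)?". It takes no position on where in git such a check belongs or on what to do with the characters once found.

```c
#include <string.h>

/* Return 1 if `ch` is one of the RFC 2047 "especials". */
static int is_rfc2047_especial(char ch)
{
	return ch != '\0' && strchr("()<>@,;:\\\"/[]?.=", ch) != NULL;
}

/* Return 1 if any byte of `name` is an especial or is non-ASCII. */
static int name_has_especial_or_8bit(const char *name)
{
	for (; *name; name++)
		if (((unsigned char)*name & 0x80) || is_rfc2047_especial(*name))
			return 1;
	return 0;
}
```

Note that the comma and the period are both in the list while the apostrophe is not, so a name written with a middle initial is flagged and a name like D'Souza is not.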

Re: email address handling

Junio C Hamano
Linus Torvalds <[hidden email]> writes:

> On Sat, 2 Aug 2008, Johannes Schindelin wrote:
>>
>> Ah, there lies the rub (you forgot that the original complaint was about
>> a comma, and pretty=email does not handle those):
>
> Indeed.
>
> I wonder where that is_rfc2047_special() function came from.

It came from the earlier patch from Dscho I rejected yesterday ;-)

Re: email address handling

Junio C Hamano
In reply to this post by Linus Torvalds-3
Linus Torvalds <[hidden email]> writes:

> On Sat, 2 Aug 2008, Johannes Schindelin wrote:
>>
>> Ah, there lies the rub (you forgot that the original complaint was about
>> a comma, and pretty=email does not handle those):
>
> Indeed.
>
> I wonder where that is_rfc2047_special() function came from.  The list of
> "special" characters is totally bogus.

This function is about quoting inside dq pair, so the function does not
look at the set you listed.  It is about quoting non-ascii chars using the
?charset?Q? or ?charset?B? notation.

If we want to use double quotes that should be done elsewhere, not in that
function.


Re: email address handling

Junio C Hamano
Junio C Hamano <[hidden email]> writes:

> Linus Torvalds <[hidden email]> writes:
>
>> On Sat, 2 Aug 2008, Johannes Schindelin wrote:
>>>
>>> Ah, there lies the rub (you forgot that the original complaint was about
>>> a comma, and pretty=email does not handle those):
>>
>> Indeed.
>>
>> I wonder where that is_rfc2047_special() function came from.  The list of
>> "special" characters is totally bogus.
>
> This function is about quoting inside dq pair, so the function does not

s/is about/is NOT about/;

Sorry, I should grab coffee before continuing.

> look at the set you listed.  It is about quoting non-ascii chars using the
> ?charset?Q? or ?charset?B? notation.
>
> If we want to use double quotes that should be done elsewhere, not in that
> function.

Re: email address handling

Johannes Schindelin
In reply to this post by Junio C Hamano
Hi,

On Sat, 2 Aug 2008, Junio C Hamano wrote:

> Linus Torvalds <[hidden email]> writes:
>
> > I wonder where that is_rfc2047_special() function came from.

It comes straight from cdd406e(CMIT_FMT_EMAIL: Q-encode Subject: and
display-name part of From: fields.).

Ciao,
Dscho



Re: email address handling

Linus Torvalds-3


On Sat, 2 Aug 2008, Johannes Schindelin wrote:

> Hi,
>
> On Sat, 2 Aug 2008, Junio C Hamano wrote:
>
> > Linus Torvalds <[hidden email]> writes:
> >
> > > I wonder where that is_rfc2047_special() function came from.
>
> It comes straight from cdd406e(CMIT_FMT_EMAIL: Q-encode Subject: and
> display-name part of From: fields.).

That's not what I meant.

I meant "what drugs induced somebody to write that function and give it
that name, since it clearly has never seen rfc2047, and has nothing to do
with it".

In other words, it sure as hell didn't come from the rfc2047 in this
universe, so it must have come from some exciting alternate alien universe
with different laws of nature and internet.

Or maybe there's just another rfc2047 that I've not heard of.

                Linus

Re: email address handling

Junio C Hamano
In reply to this post by Linus Torvalds-3
Linus Torvalds <[hidden email]> writes:

> On Sat, 2 Aug 2008, Johannes Schindelin wrote:
>>
>> Ah, there lies the rub (you forgot that the original complaint was about
>> a comma, and pretty=email does not handle those):
>
> Indeed.
>
> I wonder where that is_rfc2047_special() function came from.  The list of
> "special" characters is totally bogus.
>
> The real RFC has comma, but it has a lot of other characters too:
>
>   especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" /
>               "\" / <"> / "/" / "[" / "]" / "?" / "." / "="
>
> because basically the rfc2047 encoding has to be a superset of the 822
> (and later 2822) encodings.

Hmm, you're right.  This has to be trickier than I originally thought it
would be ... ;-)
