[PATCHv5 0/2] Implement the GIT_TRACE_CURL environment variable

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCHv5 0/2] Implement the GIT_TRACE_CURL environment variable

Elia Pinto
This is the fifth version but in reality is the complete rewriting of the patches discussed here
(here called V1)

$gmane/290520
$gmane/290521

*Changes from V4
($gmane/292867)

-  add a better abstraction with the routine setup_curl_trace
-  curl_dump : drop the noex parameter, define nopriv boolean as int
-  use decimal constant where appropiate
-  fix multi-line comment
-  redo the authorization header skip with a replace of possible sensitive data.
   We prefer to print only:
       09:00:53.238330 http.c:534              => Send header: Authorization:  <redacted>
   intested of
       09:00:53.238330 http.c:534              => Send header: Authorization:  Basic(o other scheme) <redacted>
   as it was done in the original proposed suggestion by Jeff King.
   This is because i think it's better not to print even the authorization scheme.
   We add also the (previously missing) proxy-authorization case
-  curl_dump: fix strbuf memory leak

as suggested by Jeff King
($gmane/292891)
($gmane/292892)

In this series i keep the original curl_dump parsing code, even though it is
objectively difficult to read. This is because the same code is used internally by curl
to do "ascii-trace" and is also reported in the libcurl code examples and test.
I think this may make maintenance of code easier in the future (libcurl
new dev, new features and so on)

Of course if the maintainer (or other) believes it is really necessary
to rewrite the above code to accept the patches i will do.

*Changes from V3
($gmane/292040)

- add missing static to curl_dump
- reorder the patch order
- tried to fix all (but i am not sure) the problems reported by Julio ($gmane/292055)
- * squash the documentation with the http.c commit.
  * in the trace prefix each line to indicate it is about sending a header, and drop the
    initial hex count
  * auto-censor Authorization headers by default

    as suggested by Jeff King ($gmane/292074)

*Changes from V2
($gmane/291868)

- fix garbage comment in http.c (i am very sorry for that)
- add final '.' to the commit message for imap-send.c and to other commit messages
- typofix double ; in http.c
- merge the nice cleanup and code refactoring of Ramsay Jones (Thank you very much !!)
- squash the previous commit 2/4

*Changes from V1

- introduced GIT_TRACE_CURL variable with its documentation
- changed the name of the temporary variable "i" in "w" in the helper routine
- used the c escape sequences instead of the hex equivalent
- dropped the previous GIT_DEBUG_CURL env var
- curl_dump and curl_trace factored out to a shared implementation
in http.c


Elia Pinto (2):
  http.c: implement the GIT_TRACE_CURL environment variable
  imap-send.c: introduce the GIT_TRACE_CURL enviroment variable

 Documentation/git.txt |   8 ++++
 http.c                | 115 +++++++++++++++++++++++++++++++++++++++++++++++++-
 http.h                |   4 ++
 imap-send.c           |   1 +
 4 files changed, 126 insertions(+), 2 deletions(-)

--
2.8.2.435.ga07a3e0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCHv5 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Elia Pinto
Implement the GIT_TRACE_CURL environment variable to allow a
greater degree of detail of GIT_CURL_VERBOSE, in particular
the complete transport header and all the data payload exchanged.
It might be useful if a particular situation could require a more
thorough debugging analysis. Document the new GIT_TRACE_CURL
environment variable.

Helped-by: Torsten Bögershausen <[hidden email]>
Helped-by: Ramsay Jones <[hidden email]>
Helped-by: Junio C Hamano <[hidden email]>
Helped-by: Eric Sunshine <[hidden email]>
Helped-by: Jeff King <[hidden email]>
Signed-off-by: Elia Pinto <[hidden email]>
---
 Documentation/git.txt |   8 ++++
 http.c                | 115 +++++++++++++++++++++++++++++++++++++++++++++++++-
 http.h                |   4 ++
 3 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 34ff007..5e59576 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1076,6 +1076,14 @@ of clones and fetches.
  cloning of shallow repositories.
  See 'GIT_TRACE' for available trace output options.
 
+'GIT_TRACE_CURL'::
+ Enables a curl full trace dump of all incoming and outgoing data,
+ including descriptive information, of the git transport protocol.
+ This is similar to doing curl --trace-ascii on the command line.
+ This option overrides setting the GIT_CURL_VERBOSE environment
+ variable.
+ See 'GIT_TRACE' for available trace output options.
+
 'GIT_LITERAL_PATHSPECS'::
  Setting this variable to `1` will cause Git to treat all
  pathspecs literally, rather than as glob patterns. For example,
diff --git a/http.c b/http.c
index 985b995..5e2bc19 100644
--- a/http.c
+++ b/http.c
@@ -11,6 +11,7 @@
 #include "gettext.h"
 #include "transport.h"
 
+static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
 #if LIBCURL_VERSION_NUM >= 0x070a08
 long int git_curl_ipresolve = CURL_IPRESOLVE_WHATEVER;
 #else
@@ -478,6 +479,116 @@ static void set_curl_keepalive(CURL *c)
 }
 #endif
 
+static void curl_dump(const char *text, unsigned char *ptr, size_t size, int nopriv)
+{
+ size_t i;
+ struct strbuf out = STRBUF_INIT;
+ unsigned int width = 80;
+
+ strbuf_addf(&out, "%s, %10.10ld bytes (0x%8.8lx)\n",
+ text, (long)size, (long)size);
+ trace_strbuf(&trace_curl, &out);
+
+ for (i = 0; i < size; i += width) {
+ size_t w;
+ size_t prefix_len;
+ const char *header;
+
+ strbuf_reset(&out);
+ strbuf_addf(&out, "%s: ", text);
+ prefix_len = out.len;
+ for (w = 0; (w < width) && (i + w < size); w++) {
+ if ((i + w + 1 < size) && ptr[i + w] == '\r'
+    && ptr[i + w + 1] == '\n') {
+ i += (w + 2 - width);
+ break;
+ }
+ strbuf_addch(&out, (ptr[i + w] >= 0x20)
+ && (ptr[i + w] < 0x80) ? ptr[i + w] : '.');
+ if ((i + w + 2 < size)
+    && ptr[i + w + 1] == '\r'
+    && ptr[i + w + 2] == '\n') {
+ i += (w + 3 - width);
+ break;
+ }
+ }
+
+ /*
+ * if we are called with nopriv substitute a dummy value
+ * in the Authorization or Proxy-Authorization http header if
+ * present.
+ */
+ if (nopriv &&
+ (skip_prefix(out.buf + prefix_len, "Authorization:", &header)
+ || skip_prefix(out.buf + prefix_len, "Proxy-Authorization:", &header))) {
+ /* The first token is the type, which is OK to log */
+ while (isspace(*header))
+ header++;
+ /* Everything else is opaque and possibly sensitive */
+ strbuf_setlen(&out, header - out.buf);
+ strbuf_addstr(&out, " <redacted>");
+ }
+ strbuf_addch(&out, '\n');
+ trace_strbuf(&trace_curl, &out);
+ }
+ strbuf_release(&out);
+}
+
+void setup_curl_trace(CURL *handle)
+{
+ if (!trace_want(&trace_curl)) return;
+ curl_easy_setopt(handle, CURLOPT_VERBOSE, 1L);
+ curl_easy_setopt(handle, CURLOPT_DEBUGFUNCTION, curl_trace);
+ curl_easy_setopt(handle, CURLOPT_DEBUGDATA, NULL);
+}
+
+int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
+{
+ const char *text;
+ (void)handle; /* prevent compiler unused parameter warning if checked */
+ (void)userp; /* prevent compiler unused parameter warning if checked */
+ int nopriv = 0; /*
+ * default: there are no sensitive data
+ * in the trace to be skipped
+ */
+
+ switch (type) {
+ case CURLINFO_TEXT:
+ trace_printf_key(&trace_curl, "== Info: %s", data);
+ default: /* we ignore unknown types by default */
+ return 0;
+
+ case CURLINFO_HEADER_OUT:
+ text = "=> Send header";
+ nopriv = 1;
+ break;
+ case CURLINFO_DATA_OUT:
+ text = "=> Send data";
+ nopriv = 0;
+ break;
+ case CURLINFO_SSL_DATA_OUT:
+ text = "=> Send SSL data";
+ nopriv = 0;
+ break;
+ case CURLINFO_HEADER_IN:
+ text = "<= Recv header";
+ nopriv = 0;
+ break;
+ case CURLINFO_DATA_IN:
+ text = "<= Recv data";
+ nopriv = 0;
+ break;
+ case CURLINFO_SSL_DATA_IN:
+ text = "<= Recv SSL data";
+ nopriv = 0;
+ break;
+ }
+
+ curl_dump(text, (unsigned char *)data, size, nopriv);
+ return 0;
+}
+
+
 static CURL *get_curl_handle(void)
 {
  CURL *result = curl_easy_init();
@@ -576,9 +687,9 @@ static CURL *get_curl_handle(void)
  warning("protocol restrictions not applied to curl redirects because\n"
  "your curl version is too old (>= 7.19.4)");
 #endif
-
  if (getenv("GIT_CURL_VERBOSE"))
- curl_easy_setopt(result, CURLOPT_VERBOSE, 1);
+ curl_easy_setopt(result, CURLOPT_VERBOSE, 1L);
+ setup_curl_trace(result);
 
  curl_easy_setopt(result, CURLOPT_USERAGENT,
  user_agent ? user_agent : git_user_agent());
diff --git a/http.h b/http.h
index 36f558b..cd186a4 100644
--- a/http.h
+++ b/http.h
@@ -225,4 +225,8 @@ extern int finish_http_object_request(struct http_object_request *freq);
 extern void abort_http_object_request(struct http_object_request *freq);
 extern void release_http_object_request(struct http_object_request *freq);
 
+/* Debug callback and setup routine for curl_easy_setopt CURLOPT_DEBUGFUNCTION */
+void setup_curl_trace(CURL *handle);
+int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp);
+
 #endif /* HTTP_H */
--
2.8.2.435.ga07a3e0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCHv5 2/2] imap-send.c: introduce the GIT_TRACE_CURL environment variable

Elia Pinto
In reply to this post by Elia Pinto
Permit the use of the GIT_TRACE_CURL environment variable calling
the curl_trace and curl_dump http.c helper routine.

Helped-by: Torsten Bögershausen <[hidden email]>
Helped-by: Ramsay Jones <[hidden email]>
Helped-by: Junio C Hamano <[hidden email]>
Helped-by: Eric Sunshine <[hidden email]>
Helped-by: Jeff King <[hidden email]>
Signed-off-by: Elia Pinto <[hidden email]>
---
 imap-send.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/imap-send.c b/imap-send.c
index 938c691..50377c5 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1443,6 +1443,7 @@ static CURL *setup_curl(struct imap_server_conf *srvc)
 
  if (0 < verbosity || getenv("GIT_CURL_VERBOSE"))
  curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
+ setup_curl_trace(curl);
 
  return curl;
 }
--
2.8.2.435.ga07a3e0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv5 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Ramsay Jones-2
In reply to this post by Elia Pinto


On 02/05/16 15:28, Elia Pinto wrote:

[snip]

> diff --git a/http.h b/http.h
> index 36f558b..cd186a4 100644
> --- a/http.h
> +++ b/http.h
> @@ -225,4 +225,8 @@ extern int finish_http_object_request(struct http_object_request *freq);
>  extern void abort_http_object_request(struct http_object_request *freq);
>  extern void release_http_object_request(struct http_object_request *freq);
>  
> +/* Debug callback and setup routine for curl_easy_setopt CURLOPT_DEBUGFUNCTION */
> +void setup_curl_trace(CURL *handle);
> +int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp);

Given that you have wrapped the use of the debug callback into the
setup_curl_trace() routine (good), do you still need to export the
curl_trace() function?

I would make that static, so that only setup_curl_trace() needs to
be a public symbol. (I would also re-order the function definitions,
so that setup_curl_trace() comes after curl_trace(), but that is a
minor point).

ATB,
Ramsay Jones
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv5 0/2] Implement the GIT_TRACE_CURL environment variable

Jeff King
In reply to this post by Elia Pinto
On Mon, May 02, 2016 at 02:28:11PM +0000, Elia Pinto wrote:

> -  redo the authorization header skip with a replace of possible sensitive data.
>    We prefer to print only:
>        09:00:53.238330 http.c:534              => Send header: Authorization:  <redacted>
>    intested of
>        09:00:53.238330 http.c:534              => Send header: Authorization:  Basic(o other scheme) <redacted>
>    as it was done in the original proposed suggestion by Jeff King.
>    This is because i think it's better not to print even the authorization scheme.

I'm not sure I agree. If you're debugging curl's auth selection, that's
omitting an important piece of data. And unlike the actual credential, I
don't think it's particularly secret (and in many cases can be deduced
from the "WWW-Authenticate" header the server sends, coupled with curl's
code).

>    We add also the (previously missing) proxy-authorization case

Good catch.

> In this series i keep the original curl_dump parsing code, even though it is
> objectively difficult to read. This is because the same code is used internally by curl
> to do "ascii-trace" and is also reported in the libcurl code examples and test.
> I think this may make maintenance of code easier in the future (libcurl
> new dev, new features and so on)

I don't agree with this line of reasoning. The code in question is
purely about how we format the output buffer, not about parsing what
curl gives us. We _should_ be diverging if we prefer a different output
format. And I don't think it's a question just of readability (though I
do agree the existing one is hard to read); it also foils the redaction
of the authorization header.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv5 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Jeff King
In reply to this post by Elia Pinto
On Mon, May 02, 2016 at 02:28:12PM +0000, Elia Pinto wrote:

> + if (nopriv &&
> + (skip_prefix(out.buf + prefix_len, "Authorization:", &header)
> + || skip_prefix(out.buf + prefix_len, "Proxy-Authorization:", &header))) {
> + /* The first token is the type, which is OK to log */
> + while (isspace(*header))
> + header++;
> + /* Everything else is opaque and possibly sensitive */

The first comment there is now misleading, as we _don't_ keep the first
token (though as mentioned elsewhere, I think we should).

> +void setup_curl_trace(CURL *handle)
> +{
> + if (!trace_want(&trace_curl)) return;

Style: we always put conditional bodies on a new line, even when they
are trivial like this.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv5 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Junio C Hamano
In reply to this post by Elia Pinto
Elia Pinto <[hidden email]> writes:

> +static void curl_dump(const char *text, unsigned char *ptr, size_t size, int nopriv)
> +{
> + size_t i;
> + struct strbuf out = STRBUF_INIT;
> + unsigned int width = 80;
> +
> + strbuf_addf(&out, "%s, %10.10ld bytes (0x%8.8lx)\n",
> + text, (long)size, (long)size);
> + trace_strbuf(&trace_curl, &out);
> +
> + for (i = 0; i < size; i += width) {
> + ...
> + for (w = 0; (w < width) && (i + w < size); w++) {
> + if ((i + w + 1 < size) && ptr[i + w] == '\r'
> +    && ptr[i + w + 1] == '\n') {
> + i += (w + 2 - width);
> + break;
> + }
> + strbuf_addch(&out, (ptr[i + w] >= 0x20)
> + && (ptr[i + w] < 0x80) ? ptr[i + w] : '.');
> + if ((i + w + 2 < size)
> +    && ptr[i + w + 1] == '\r'
> +    && ptr[i + w + 2] == '\n') {
> + i += (w + 3 - width);
> + break;
> + }
> + }
> + ...
> + }
> + strbuf_release(&out);
> +}

There is no change in this hard-to-read double-loop since the
previous round?

> +int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
> +{
> + const char *text;
> + (void)handle; /* prevent compiler unused parameter warning if checked */
> + (void)userp; /* prevent compiler unused parameter warning if checked */

I really do not want to see these casts.  Unused parameters are
perfectly normal in a codebase with callback functions, no?  I do
not think these are the first occurrences of unused parameters in
our codebase, and I do not think we have such cast to void to them.
Why add this ugliness only to here?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv5 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Junio C Hamano
In reply to this post by Elia Pinto
Elia Pinto <[hidden email]> writes:

> +int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
> +{
> + const char *text;
> + (void)handle; /* prevent compiler unused parameter warning if checked */
> + (void)userp; /* prevent compiler unused parameter warning if checked */
> + int nopriv = 0; /*

This decl-after-statment breaks our build.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHv5 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Jeff King
In reply to this post by Junio C Hamano
On Mon, May 02, 2016 at 11:59:14AM -0700, Junio C Hamano wrote:

> > +int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
> > +{
> > + const char *text;
> > + (void)handle; /* prevent compiler unused parameter warning if checked */
> > + (void)userp; /* prevent compiler unused parameter warning if checked */
>
> I really do not want to see these casts.  Unused parameters are
> perfectly normal in a codebase with callback functions, no?  I do
> not think these are the first occurrences of unused parameters in
> our codebase, and I do not think we have such cast to void to them.
> Why add this ugliness only to here?

Yeah, it is pointless to use -Wunused-parameter in our code base, as it
turns up over 1200 hits (though some are repeats from include files).
Most are callbacks, but some also look like compat functions. But
clearly it's going to be a false positive any time the interface to the
function is dictated by something other than the function body.

I generally don't mind quieting false positive warnings if we think it
might provide an opportunity for cleanup (i.e., parameters that really
_can_ go away). But I don't think gcc provides a great way to do it.

You can mark individual parameters as unused, like:

diff --git a/git-compat-util.h b/git-compat-util.h
index ba51cfd..71b4f7b 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -331,7 +331,7 @@ extern char *gitdirname(char *);
 #endif
 
 #ifndef has_dos_drive_prefix
-static inline int git_has_dos_drive_prefix(const char *path)
+static inline int git_has_dos_drive_prefix(const char *path __attribute__((unused)))
 {
  return 0;
 }
@@ -339,7 +339,7 @@ static inline int git_has_dos_drive_prefix(const char *path)
 #endif
 
 #ifndef skip_dos_drive_prefix
-static inline int git_skip_dos_drive_prefix(char **path)
+static inline int git_skip_dos_drive_prefix(char **path __attribute__((unused)))
 {
  return 0;
 }

That's not too bad in a single-argument function, but for callbacks it
can get pretty tedious:

diff --git a/wt-status.c b/wt-status.c
index 1ea2ebe..7f00981 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1300,8 +1300,11 @@ struct grab_1st_switch_cbdata {
  unsigned char nsha1[20];
 };
 
-static int grab_1st_switch(unsigned char *osha1, unsigned char *nsha1,
-   const char *email, unsigned long timestamp, int tz,
+static int grab_1st_switch(unsigned char *osha1 __attribute__((unused)),
+   unsigned char *nsha1,
+   const char *email __attribute__((unused)),
+   unsigned long timestamp __attribute__((unused)),
+   int tz __attribute__((unused)),
    const char *message, void *cb_data)
 {
  struct grab_1st_switch_cbdata *cb = cb_data;

It would be much nicer if we had some way of declaring the whole
_function_ as having an interface dictated elsewhere. Then it becomes a
single attribute per callback function, and actually tells the human
reader something useful: not "I happen to not need these variables right
now", but "my interface is specified elsewhere and not up for debate".

But AFAIK, there's no such attribute.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html