[PATCH v6 0/2] Implement the GIT_TRACE_CURL environment variable

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[PATCH v6 0/2] Implement the GIT_TRACE_CURL environment variable

Elia Pinto
This is the sixty version but in reality is the complete rewriting of the patches discussed here
(here called V1)

$gmane/290520
$gmane/290521

*Changes from V5
($gmane/293236)

- don't export curl_trace anymore. Define it static
- fix a minor cleanup (style) in setup_curl_trace
- and, finally, i rewrote completely curl_dump, separating it
into two functions (one for http header and one for the http data),
hoping for a coherent implementation with curl --ascii-trace output
but easier to read than the previous implementation that
used a hard-to-read double-loop.

 
*Changes from V4
($gmane/292867)

-  add a better abstraction with the routine setup_curl_trace
-  curl_dump : drop the noex parameter, define nopriv boolean as int
-  use decimal constant where appropiate
-  fix multi-line comment
-  redo the authorization header skip with a replace of possible sensitive data.
   We prefer to print only:
       09:00:53.238330 http.c:534              => Send header: Authorization:  <redacted>
   intested of
       09:00:53.238330 http.c:534              => Send header: Authorization:  Basic(o other scheme) <redacted>
   as it was done in the original proposed suggestion by Jeff King.
   This is because i think it's better not to print even the authorization scheme.
   We add also the (previously missing) proxy-authorization case
-  curl_dump: fix strbuf memory leak

as suggested by Jeff King
($gmane/292891)
($gmane/292892)

In this series i keep the original curl_dump parsing code, even though it is
objectively difficult to read. This is because the same code is used internally by curl
to do "ascii-trace" and is also reported in the libcurl code examples and test.
I think this may make maintenance of code easier in the future (libcurl
new dev, new features and so on)

Of course if the maintainer (or other) believes it is really necessary
to rewrite the above code to accept the patches i will do.

*Changes from V3
($gmane/292040)

- add missing static to curl_dump
- reorder the patch order
- tried to fix all (but i am not sure) the problems reported by Julio ($gmane/292055)
- * squash the documentation with the http.c commit.
  * in the trace prefix each line to indicate it is about sending a header, and drop the
    initial hex count
  * auto-censor Authorization headers by default

    as suggested by Jeff King ($gmane/292074)

*Changes from V2
($gmane/291868)

- fix garbage comment in http.c (i am very sorry for that)
- add final '.' to the commit message for imap-send.c and to other commit messages
- typofix double ; in http.c
- merge the nice cleanup and code refactoring of Ramsay Jones (Thank you very much !!)
- squash the previous commit 2/4

*Changes from V1

- introduced GIT_TRACE_CURL variable with its documentation
- changed the name of the temporary variable "i" in "w" in the helper routine
- used the c escape sequences instead of the hex equivalent
- dropped the previous GIT_DEBUG_CURL env var
- curl_dump and curl_trace factored out to a shared implementation
in http.c




Elia Pinto (2):
  http.c: implement the GIT_TRACE_CURL environment variable
  imap-send.c: introduce the GIT_TRACE_CURL enviroment variable

 Documentation/git.txt |   8 ++++
 http.c                | 123 +++++++++++++++++++++++++++++++++++++++++++++++++-
 http.h                |   2 +
 imap-send.c           |   1 +
 4 files changed, 132 insertions(+), 2 deletions(-)

--
2.8.2.435.g7c6234f.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[PATCH v6 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Elia Pinto
Implement the GIT_TRACE_CURL environment variable to allow a
greater degree of detail of GIT_CURL_VERBOSE, in particular
the complete transport header and all the data payload exchanged.
It might be useful if a particular situation could require a more
thorough debugging analysis. Document the new GIT_TRACE_CURL
environment variable.

Helped-by: Torsten Bögershausen <[hidden email]>
Helped-by: Ramsay Jones <[hidden email]>
Helped-by: Junio C Hamano <[hidden email]>
Helped-by: Eric Sunshine <[hidden email]>
Helped-by: Jeff King <[hidden email]>
Signed-off-by: Elia Pinto <[hidden email]>
---
 Documentation/git.txt |   8 ++++
 http.c                | 124 +++++++++++++++++++++++++++++++++++++++++++++++++-
 http.h                |   2 +
 3 files changed, 132 insertions(+), 2 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index dd6dbf7..a46a356 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1077,6 +1077,14 @@ of clones and fetches.
  cloning of shallow repositories.
  See 'GIT_TRACE' for available trace output options.
 
+'GIT_TRACE_CURL'::
+ Enables a curl full trace dump of all incoming and outgoing data,
+ including descriptive information, of the git transport protocol.
+ This is similar to doing curl --trace-ascii on the command line.
+ This option overrides setting the GIT_CURL_VERBOSE environment
+ variable.
+ See 'GIT_TRACE' for available trace output options.
+
 'GIT_LITERAL_PATHSPECS'::
  Setting this variable to `1` will cause Git to treat all
  pathspecs literally, rather than as glob patterns. For example,
diff --git a/http.c b/http.c
index df6dd01..ba32bac 100644
--- a/http.c
+++ b/http.c
@@ -11,6 +11,7 @@
 #include "gettext.h"
 #include "transport.h"
 
+static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
 #if LIBCURL_VERSION_NUM >= 0x070a08
 long int git_curl_ipresolve = CURL_IPRESOLVE_WHATEVER;
 #else
@@ -477,6 +478,125 @@ static void set_curl_keepalive(CURL *c)
 }
 #endif
 
+static void curl_dump_header(const char *text, unsigned char *ptr, size_t size, int nopriv_header)
+{
+ struct strbuf out = STRBUF_INIT;
+ const char *header;
+ struct strbuf **header_list, **ptr_list;
+
+ strbuf_addf(&out, "%s, %10.10ld bytes (0x%8.8lx)\n",
+ text, (long)size, (long)size);
+ trace_strbuf(&trace_curl, &out);
+ strbuf_reset(&out);
+ strbuf_add(&out,ptr,size);
+ header_list = strbuf_split_max(&out, '\n', 0);
+
+ for (ptr_list = header_list; *ptr_list; ptr_list++) {
+ /*
+ * if we are called with nopriv_header substitute a dummy value
+ * in the Authorization or Proxy-Authorization http header if
+ * present.
+ */
+ if (nopriv_header &&
+ (skip_prefix((*ptr_list)->buf , "Authorization:", &header)
+ || skip_prefix((*ptr_list)->buf , "Proxy-Authorization:", &header))) {
+ /* The first token is the type, which is OK to log */
+ while (isspace(*header))
+ header++;
+ while (*header && !isspace(*header))
+ header++;
+ /* Everything else is opaque and possibly sensitive */
+ strbuf_setlen((*ptr_list),  header - (*ptr_list)->buf );
+ strbuf_addstr((*ptr_list), " <redacted>");
+ }
+ strbuf_insert((*ptr_list), 0, text, strlen(text));
+ strbuf_insert((*ptr_list), strlen(text), ": ", 2);
+ strbuf_rtrim((*ptr_list));
+ strbuf_addch((*ptr_list), '\n');
+ trace_strbuf(&trace_curl, (*ptr_list));
+ }
+ strbuf_list_free(header_list);
+ strbuf_release(&out);
+}
+static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
+{
+ size_t i;
+ struct strbuf out = STRBUF_INIT;
+ unsigned int width = 80;
+
+ strbuf_addf(&out, "%s, %10.10ld bytes (0x%8.8lx)\n",
+ text, (long)size, (long)size);
+ trace_strbuf(&trace_curl, &out);
+
+ for (i = 0; i < size; i += width) {
+ size_t w;
+
+ strbuf_reset(&out);
+ strbuf_addf(&out, "%s: ", text);
+ for (w = 0; (w < width) && (i + w < size); w++) {
+ strbuf_addch(&out, (ptr[i + w] >= 0x20)
+ && (ptr[i + w] < 0x80) ? ptr[i + w] : '.');
+ }
+ strbuf_addch(&out, '\n');
+ trace_strbuf(&trace_curl, &out);
+ }
+ strbuf_release(&out);
+}
+
+static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
+{
+ const char *text;
+ int nopriv_header = 0; /*
+ * default: there are no sensitive data
+ * in the trace to be skipped
+ */
+
+ switch (type) {
+ case CURLINFO_TEXT:
+ trace_printf_key(&trace_curl, "== Info: %s", data);
+ default: /* we ignore unknown types by default */
+ return 0;
+
+ case CURLINFO_HEADER_OUT:
+ text = "=> Send header";
+ nopriv_header = 1;
+ curl_dump_header(text, (unsigned char *)data, size, nopriv_header);
+ break;
+ case CURLINFO_DATA_OUT:
+ text = "=> Send data";
+ curl_dump_data(text, (unsigned char *)data, size);
+ break;
+ case CURLINFO_SSL_DATA_OUT:
+ text = "=> Send SSL data";
+ curl_dump_data(text, (unsigned char *)data, size);
+ break;
+ case CURLINFO_HEADER_IN:
+ text = "<= Recv header";
+ nopriv_header = 0;
+ curl_dump_header(text, (unsigned char *)data, size, nopriv_header);
+ break;
+ case CURLINFO_DATA_IN:
+ text = "<= Recv data";
+ curl_dump_data(text, (unsigned char *)data, size);
+ break;
+ case CURLINFO_SSL_DATA_IN:
+ text = "<= Recv SSL data";
+ curl_dump_data(text, (unsigned char *)data, size);
+ break;
+ }
+ return 0;
+}
+
+void setup_curl_trace(CURL *handle)
+{
+ if (!trace_want(&trace_curl))
+ return;
+ curl_easy_setopt(handle, CURLOPT_VERBOSE, 1L);
+ curl_easy_setopt(handle, CURLOPT_DEBUGFUNCTION, curl_trace);
+ curl_easy_setopt(handle, CURLOPT_DEBUGDATA, NULL);
+}
+
+
 static CURL *get_curl_handle(void)
 {
  CURL *result = curl_easy_init();
@@ -575,9 +695,9 @@ static CURL *get_curl_handle(void)
  warning("protocol restrictions not applied to curl redirects because\n"
  "your curl version is too old (>= 7.19.4)");
 #endif
-
  if (getenv("GIT_CURL_VERBOSE"))
- curl_easy_setopt(result, CURLOPT_VERBOSE, 1);
+ curl_easy_setopt(result, CURLOPT_VERBOSE, 1L);
+ setup_curl_trace(result);
 
  curl_easy_setopt(result, CURLOPT_USERAGENT,
  user_agent ? user_agent : git_user_agent());
diff --git a/http.h b/http.h
index 36f558b..5ab9d9c 100644
--- a/http.h
+++ b/http.h
@@ -225,4 +225,6 @@ extern int finish_http_object_request(struct http_object_request *freq);
 extern void abort_http_object_request(struct http_object_request *freq);
 extern void release_http_object_request(struct http_object_request *freq);
 
+/* setup routine for curl_easy_setopt CURLOPT_DEBUGFUNCTION */
+void setup_curl_trace(CURL *handle);
 #endif /* HTTP_H */
--
2.8.2.435.g7c6234f.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[PATCH v6 2/2] imap-send.c: introduce the GIT_TRACE_CURL enviroment variable

Elia Pinto
In reply to this post by Elia Pinto
Permit the use of the GIT_TRACE_CURL environment variable calling
the setup_curl_trace http.c helper routine.

Helped-by: Torsten Bögershausen <[hidden email]>
Helped-by: Ramsay Jones <[hidden email]>
Helped-by: Junio C Hamano <[hidden email]>
Helped-by: Eric Sunshine <[hidden email]>
Helped-by: Jeff King <[hidden email]>
Signed-off-by: Elia Pinto <[hidden email]>
---
 imap-send.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/imap-send.c b/imap-send.c
index 938c691..50377c5 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1443,6 +1443,7 @@ static CURL *setup_curl(struct imap_server_conf *srvc)
 
  if (0 < verbosity || getenv("GIT_CURL_VERBOSE"))
  curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);
+ setup_curl_trace(curl);
 
  return curl;
 }
--
2.8.2.435.g7c6234f.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH v6 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Junio C Hamano
In reply to this post by Elia Pinto
Elia Pinto <[hidden email]> writes:

> diff --git a/http.c b/http.c
> index df6dd01..ba32bac 100644
> --- a/http.c
> +++ b/http.c
> @@ -11,6 +11,7 @@
>  #include "gettext.h"
>  #include "transport.h"
>  
> +static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
>  #if LIBCURL_VERSION_NUM >= 0x070a08
>  long int git_curl_ipresolve = CURL_IPRESOLVE_WHATEVER;
>  #else
> @@ -477,6 +478,125 @@ static void set_curl_keepalive(CURL *c)
>  }
>  #endif
>  
> +static void curl_dump_header(const char *text, unsigned char *ptr, size_t size, int nopriv_header)
> +{
> + struct strbuf out = STRBUF_INIT;
> + const char *header;
> + struct strbuf **header_list, **ptr_list;
> +
> + strbuf_addf(&out, "%s, %10.10ld bytes (0x%8.8lx)\n",
> + text, (long)size, (long)size);
> + trace_strbuf(&trace_curl, &out);
> + strbuf_reset(&out);
> + strbuf_add(&out,ptr,size);
> + header_list = strbuf_split_max(&out, '\n', 0);
> +
> + for (ptr_list = header_list; *ptr_list; ptr_list++) {
> + /*
> + * if we are called with nopriv_header substitute a dummy value
> + * in the Authorization or Proxy-Authorization http header if
> + * present.
> + */
> + if (nopriv_header &&
> + (skip_prefix((*ptr_list)->buf , "Authorization:", &header)
> + || skip_prefix((*ptr_list)->buf , "Proxy-Authorization:", &header))) {
> + /* The first token is the type, which is OK to log */
> + while (isspace(*header))
> + header++;
> + while (*header && !isspace(*header))
> + header++;
> + /* Everything else is opaque and possibly sensitive */
> + strbuf_setlen((*ptr_list),  header - (*ptr_list)->buf );
> + strbuf_addstr((*ptr_list), " <redacted>");
> + }
> + strbuf_insert((*ptr_list), 0, text, strlen(text));
> + strbuf_insert((*ptr_list), strlen(text), ": ", 2);
> + strbuf_rtrim((*ptr_list));
> + strbuf_addch((*ptr_list), '\n');
> + trace_strbuf(&trace_curl, (*ptr_list));

This funny indentation makes me wonder why you didn't make it into a
helper function.  If it would require too many parameters, I could
understand the aversion, as it would end up looking like:

        for (header = headers; *header; header++) {
        if (hide_sensitive_header)
                        redact_sensitive_header(header, &too, &many, &other,
                                                &params, &you,
        &need, &to, &pass);
                strbuf_insert(*header, 0, text, strlen(text));
                strbuf_insert(*header, strlen(text), ": ", 2);
                ...
                trace_strbuf(&trace_curl, *header);
        }

but I think redact_sensitive_header() helper would only need to take
a single strbuf, from the look of your actual implementation.

Yes, I am also suggesting to rename header_list and ptr_list to
headers and header, respectively.  header_list may be an OK name (as
it is "a list, each element of which is a header"), but ptr_list is
a poor name--"a pointer into a list" is a much less interesting thing
for the reader of the code to learn from the name than it represents
"one header".

And your "const char *header" does not have to be here (it would be
an implementation detail of redact_sensitive_header() function), so
such a renaming would not cause conflicts in variable names.

> + }
> + strbuf_list_free(header_list);
> + strbuf_release(&out);
> +}
> +static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)

A blank line, between the end of previous function body and the
begining of this function, is missing.

> +{
> + size_t i;
> + struct strbuf out = STRBUF_INIT;
> + unsigned int width = 80;

In a few places Git limits the width of the output, like using function
name in hunk header lines and drawing diffstat in "diff --stat", we
do default to limit the total width to 80 display columns.

Given that this routine prefixes each and every line with a short
heading like "=> Send header: " that costs at around 15-20 columns,
and the loop we see below only counts the true payload without
counting the heading, you would probably want to reduce this
somewhat so that the whole thing would fit within 80 columns.

> + strbuf_addf(&out, "%s, %10.10ld bytes (0x%8.8lx)\n",
> + text, (long)size, (long)size);
> + trace_strbuf(&trace_curl, &out);
> +
> + for (i = 0; i < size; i += width) {
> + size_t w;
> +
> + strbuf_reset(&out);
> + strbuf_addf(&out, "%s: ", text);
> + for (w = 0; (w < width) && (i + w < size); w++) {
> + strbuf_addch(&out, (ptr[i + w] >= 0x20)
> + && (ptr[i + w] < 0x80) ? ptr[i + w] : '.');

Please think twice to make sure a long expression that has to span
multiple lines is cut at a sensible place.  This cuts at the worst
place where the syntactic element that binds strongest sits in the
expression.

The weakest binding is the comma between two parameters given to
addch(), so let's cut it there:

                for (w = 0; (w < width) && (i + w < size); w++) {
                        unsigned char ch = ptr[i + w];
                        strbuf_addch(&out,
                                     (0x20 <= ch && ch < 0x80) ? ch : '.');
                }

If you find the second line still too long, then cut it before '?', i.e.

                        strbuf_addch(&out,
                                     (0x20 <= ch && ch < 0x80)
                                     ? ch : '.');

or alternatively:


                        strbuf_addch(&out, ((0x20 <= ch && ch < 0x80)
                                            ? ch : '.'));

> +static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t size, void *userp)
> +{
> + const char *text;
> + int nopriv_header = 0; /*
> + * default: there are no sensitive data
> + * in the trace to be skipped
> + */
> +
> + switch (type) {
> + case CURLINFO_TEXT:
> + trace_printf_key(&trace_curl, "== Info: %s", data);
> + default: /* we ignore unknown types by default */
> + return 0;
> +
> + case CURLINFO_HEADER_OUT:
> + text = "=> Send header";
> + nopriv_header = 1;
> + curl_dump_header(text, (unsigned char *)data, size, nopriv_header);
> + case CURLINFO_HEADER_IN:
> + text = "<= Recv header";
> + nopriv_header = 0;
> + curl_dump_header(text, (unsigned char *)data, size, nopriv_header);
> + break;

I do not think an extra variable nopriv_header gives us any
readability advantage.  If you looked at these two calls to
curl_dump_header() and its parameters alone, you cannot tell
which one is sensitive.

Wouldn't it be easier to read if they were like this?

        static int curl_trace(..., char *data_, ...)
        {
                enum { NO_FILTER = 0, DO_FILTER = 1 };
                unsigned char *data = (unsigned char *) data_;

                curl_dump_header("Send header", data, size, DO_FILTER);
                curl_dump_header("Recv header", data, size, NO_FILTER);

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [PATCH v6 1/2] http.c: implement the GIT_TRACE_CURL environment variable

Jeff King
On Fri, May 20, 2016 at 01:39:06PM -0700, Junio C Hamano wrote:

> > +{
> > + size_t i;
> > + struct strbuf out = STRBUF_INIT;
> > + unsigned int width = 80;
>
> In a few places Git limits the width of the output, like using function
> name in hunk header lines and drawing diffstat in "diff --stat", we
> do default to limit the total width to 80 display columns.
>
> Given that this routine prefixes each and every line with a short
> heading like "=> Send header: " that costs at around 15-20 columns,
> and the loop we see below only counts the true payload without
> counting the heading, you would probably want to reduce this
> somewhat so that the whole thing would fit within 80 columns.

I think that may be a losing battle. Remember that we are getting the
usual trace header on top of this, which I think leaves only about 20
characters for actual data on each line.

I kind of doubt people will manually read the body data. In my
experience, debugging git-over-http you want either:

  1. Just the request/response headers to see what was said.

  2. A complete dump of each body with no other formatting (e.g., so you
     can replay it with curl).

This trace output gives you (1). You can in theory generate (2) from it
with a perl snippet, but the non-printing characters have been
irreversibly transformed.

So I dunno. I do not mind having the body stuff there for completeness,
but I doubt I'd use it myself. And I don't care much either way about
how long its lines are.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Loading...