[RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

classic Classic list List threaded Threaded
25 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Stefan Beller-4
After some fruitful discussion[1] in which Junio suggested trying a very
different route[2] that is more general and not submodule related, I considered
doing a mock for this.

This lets you label arbitrary pathspecs, e.g. in git.git we may want to have:

    t/t[0-9]*.sh label=tests
   
such that

    $ git log --author=Beller ":(label=tests)"
   
would show all commits in which I touched tests.

This has suprisingly few lines of code, as the first 3 patches are refactoring.
The actual new feature is in the last patch.

This would solve the submodule issues I want to solve, as I can produce a
.gitattributes like:

    ./submodule1 label=default
    ./submodule2 label=default
    ./submodule3 label=optional-feature1
   
and then I'd instruct the users to clone like this:

    git clone .. superproject
    cd superproject
    git submodule update --init :(label:default)
   
The second part of the submodule series to collapse these three commands
will come as an extra series later, then.

What annoys me here:
Attributes can only be set once at the moment, there is no way to collect all
attributes. If we'd have such a collection feature we would be able to
configure this:

    *.[ch] label=C,code
    contrib/** label=oldstuff
   
and run this:

    git status ":(label:C oldstuff)"
   
which would be the equivalent to

    contrib/**.[ch]
   
as in this proposed implementation the labels which are given within one
pathspec item are logical AND. To get the logical OR, you'd have to give multiple
pathspec items, i.e. ":(label:C)" ":(label:oldstuff)"

Feedback welcome!

Thanks,
Stefan

[1] http://thread.gmane.org/gmane.comp.version-control.git/294212
[2] http://thread.gmane.org/gmane.comp.version-control.git/294212/focus=294391

Stefan Beller (4):
  Documentation: correct typo in example for querying attributes
  pathspec: move long magic parsing out of prefix_pathspec
  pathspec: move prefix check out of the inner loop
  pathspec: record labels

 Documentation/glossary-content.txt            |   5 ++
 Documentation/technical/api-gitattributes.txt |   2 +-
 attr.h                                        |   1 +
 dir.c                                         |  31 ++++++++
 pathspec.c                                    | 109 +++++++++++++++++---------
 pathspec.h                                    |   1 +
 t/t6134-pathspec-with-labels.sh               |  91 +++++++++++++++++++++
 7 files changed, 201 insertions(+), 39 deletions(-)
 create mode 100755 t/t6134-pathspec-with-labels.sh

--
2.8.2.400.g66c4903.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 1/4] Documentation: correct typo in example for querying attributes

Stefan Beller-4
The introductory text of the example has a typo in the
specification which makes it harder to follow that example.

Signed-off-by: Stefan Beller <[hidden email]>
---
 Documentation/technical/api-gitattributes.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt
index 2602668..36fc9af 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -61,7 +61,7 @@ Querying Specific Attributes
 Example
 -------
 
-To see how attributes "crlf" and "indent" are set for different paths.
+To see how attributes "crlf" and "ident" are set for different paths.
 
 . Prepare an array of `struct git_attr_check` with two elements (because
   we are checking two attributes).  Initialize their `attr` member with
--
2.8.2.400.g66c4903.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 2/4] pathspec: move long magic parsing out of prefix_pathspec

Stefan Beller-4
In reply to this post by Stefan Beller-4
`prefix_pathspec` is quite a lengthy function and we plan on adding more.
Split it up for better readability. As we want to add code into the
inner loop of the long magic parsing, we also benefit from lower
indentation.

Signed-off-by: Stefan Beller <[hidden email]>
---
 pathspec.c | 84 +++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 47 insertions(+), 37 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index c9e9b6c..eba37c2 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -88,6 +88,52 @@ static void prefix_short_magic(struct strbuf *sb, int prefixlen,
  strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
+static void eat_long_magic(struct pathspec_item *item, const char *elt,
+ unsigned *magic, int *pathspec_prefix,
+ const char **copyfrom_, const char **long_magic_end)
+{
+ int i;
+ const char *copyfrom = *copyfrom_;
+ /* longhand */
+ const char *nextat;
+ for (copyfrom = elt + 2;
+     *copyfrom && *copyfrom != ')';
+     copyfrom = nextat) {
+ size_t len = strcspn(copyfrom, ",)");
+ if (copyfrom[len] == ',')
+ nextat = copyfrom + len + 1;
+ else
+ /* handle ')' and '\0' */
+ nextat = copyfrom + len;
+ if (!len)
+ continue;
+ for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+ if (strlen(pathspec_magic[i].name) == len &&
+    !strncmp(pathspec_magic[i].name, copyfrom, len)) {
+ *magic |= pathspec_magic[i].bit;
+ break;
+ }
+ if (starts_with(copyfrom, "prefix:")) {
+ char *endptr;
+ *pathspec_prefix = strtol(copyfrom + 7,
+  &endptr, 10);
+ if (endptr - copyfrom != len)
+ die(_("invalid parameter for pathspec magic 'prefix'"));
+ /* "i" would be wrong, but it does not matter */
+ break;
+ }
+ }
+ if (ARRAY_SIZE(pathspec_magic) <= i)
+ die(_("Invalid pathspec magic '%.*s' in '%s'"),
+    (int) len, copyfrom, elt);
+ }
+ if (*copyfrom != ')')
+ die(_("Missing ')' at the end of pathspec magic in '%s'"), elt);
+ *long_magic_end = copyfrom;
+ copyfrom++;
+ *copyfrom_ = copyfrom;
+}
+
 /*
  * Take an element of a pathspec and check for magic signatures.
  * Append the result to the prefix. Return the magic bitmap.
@@ -150,43 +196,7 @@ static unsigned prefix_pathspec(struct pathspec_item *item,
     (flags & PATHSPEC_LITERAL_PATH)) {
  ; /* nothing to do */
  } else if (elt[1] == '(') {
- /* longhand */
- const char *nextat;
- for (copyfrom = elt + 2;
-     *copyfrom && *copyfrom != ')';
-     copyfrom = nextat) {
- size_t len = strcspn(copyfrom, ",)");
- if (copyfrom[len] == ',')
- nextat = copyfrom + len + 1;
- else
- /* handle ')' and '\0' */
- nextat = copyfrom + len;
- if (!len)
- continue;
- for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
- if (strlen(pathspec_magic[i].name) == len &&
-    !strncmp(pathspec_magic[i].name, copyfrom, len)) {
- magic |= pathspec_magic[i].bit;
- break;
- }
- if (starts_with(copyfrom, "prefix:")) {
- char *endptr;
- pathspec_prefix = strtol(copyfrom + 7,
- &endptr, 10);
- if (endptr - copyfrom != len)
- die(_("invalid parameter for pathspec magic 'prefix'"));
- /* "i" would be wrong, but it does not matter */
- break;
- }
- }
- if (ARRAY_SIZE(pathspec_magic) <= i)
- die(_("Invalid pathspec magic '%.*s' in '%s'"),
-    (int) len, copyfrom, elt);
- }
- if (*copyfrom != ')')
- die(_("Missing ')' at the end of pathspec magic in '%s'"), elt);
- long_magic_end = copyfrom;
- copyfrom++;
+ eat_long_magic(item, elt, &magic, &pathspec_prefix, &copyfrom, &long_magic_end);
  } else {
  /* shorthand */
  for (copyfrom = elt + 1;
--
2.8.2.400.g66c4903.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 3/4] pathspec: move prefix check out of the inner loop

Stefan Beller-4
In reply to this post by Stefan Beller-4
The prefix check is not related the check of pathspec magic; also there
is no code that is relevant after we'd break the loop on a match for
"prefix:". So move the check before the loop and shortcircuit the outer
loop.

Signed-off-by: Stefan Beller <[hidden email]>
---
 pathspec.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index eba37c2..4dff252 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -107,21 +107,22 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
  nextat = copyfrom + len;
  if (!len)
  continue;
+
+ if (starts_with(copyfrom, "prefix:")) {
+ char *endptr;
+ *pathspec_prefix = strtol(copyfrom + 7,
+  &endptr, 10);
+ if (endptr - copyfrom != len)
+ die(_("invalid parameter for pathspec magic 'prefix'"));
+ continue;
+ }
+
  for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
  if (strlen(pathspec_magic[i].name) == len &&
     !strncmp(pathspec_magic[i].name, copyfrom, len)) {
  *magic |= pathspec_magic[i].bit;
  break;
  }
- if (starts_with(copyfrom, "prefix:")) {
- char *endptr;
- *pathspec_prefix = strtol(copyfrom + 7,
-  &endptr, 10);
- if (endptr - copyfrom != len)
- die(_("invalid parameter for pathspec magic 'prefix'"));
- /* "i" would be wrong, but it does not matter */
- break;
- }
  }
  if (ARRAY_SIZE(pathspec_magic) <= i)
  die(_("Invalid pathspec magic '%.*s' in '%s'"),
--
2.8.2.400.g66c4903.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

[PATCH 4/4] pathspec: record labels

Stefan Beller-4
In reply to this post by Stefan Beller-4
Labels were originally designed to manage large amount of
submodules, the discussion steered this in a more general
direction, such that other files can be labeled as well.

Labels are meant to describe arbitrary set of files, which
is not described via the tree layout.

Signed-off-by: Stefan Beller <[hidden email]>
---
 Documentation/glossary-content.txt |  5 +++
 attr.h                             |  1 +
 dir.c                              | 31 +++++++++++++
 pathspec.c                         | 24 +++++++++-
 pathspec.h                         |  1 +
 t/t6134-pathspec-with-labels.sh    | 91 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 152 insertions(+), 1 deletion(-)
 create mode 100755 t/t6134-pathspec-with-labels.sh

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index 8ad29e6..a1fc9e0 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -362,6 +362,11 @@ glob;;
  For example, "Documentation/{asterisk}.html" matches
  "Documentation/git.html" but not "Documentation/ppc/ppc.html"
  or "tools/perf/Documentation/perf.html".
+
+label:<white space separated list>;;
+ Labels can be assigned to pathspecs in the .gitattributes file.
+ By specifying a list of labels the pattern will match only
+ files which have all of the listed labels.
 +
 Two consecutive asterisks ("`**`") in patterns matched against
 full pathname may have special meaning:
diff --git a/attr.h b/attr.h
index 8b08d33..f6fc7c3 100644
--- a/attr.h
+++ b/attr.h
@@ -18,6 +18,7 @@ extern const char git_attr__false[];
 #define ATTR_TRUE(v) ((v) == git_attr__true)
 #define ATTR_FALSE(v) ((v) == git_attr__false)
 #define ATTR_UNSET(v) ((v) == NULL)
+#define ATTR_CUSTOM(v) (!(ATTR_UNSET(v) || ATTR_FALSE(v) || ATTR_TRUE(v)))
 
 /*
  * Send one or more git_attr_check to git_check_attr(), and
diff --git a/dir.c b/dir.c
index 656f272..51d5965 100644
--- a/dir.c
+++ b/dir.c
@@ -9,6 +9,7 @@
  */
 #include "cache.h"
 #include "dir.h"
+#include "attr.h"
 #include "refs.h"
 #include "wildmatch.h"
 #include "pathspec.h"
@@ -208,6 +209,27 @@ int within_depth(const char *name, int namelen,
  return 1;
 }
 
+void load_labels(const char *name, int namelen, struct string_list *list)
+{
+ static struct git_attr *attr;
+ struct git_attr_check check;
+ char *path = xmemdupz(name, namelen);
+
+ if (!attr)
+ attr = git_attr("label");
+ check.attr = attr;
+
+ if (git_check_attr(path, 1, &check))
+ die("git_check_attr died");
+
+ if (ATTR_CUSTOM(check.value)) {
+ string_list_split(list, check.value, ',', -1);
+ string_list_sort(list);
+ }
+
+ free(path);
+}
+
 #define DO_MATCH_EXCLUDE   1
 #define DO_MATCH_DIRECTORY 2
 
@@ -263,6 +285,15 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
     strncmp(item->match, name - prefix, item->prefix))
  return 0;
 
+ if (item->group) {
+ struct string_list has_labels = STRING_LIST_INIT_DUP;
+ struct string_list_item *si;
+ load_labels(name, namelen, &has_labels);
+ for_each_string_list_item(si, item->group)
+ if (!string_list_has_string(&has_labels, si->string))
+ return 0;
+ }
+
  /* If the match was just the prefix, we matched */
  if (!*match)
  return MATCHED_RECURSIVELY;
diff --git a/pathspec.c b/pathspec.c
index 4dff252..c227c25 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -94,6 +94,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
 {
  int i;
  const char *copyfrom = *copyfrom_;
+ const char *out;
  /* longhand */
  const char *nextat;
  for (copyfrom = elt + 2;
@@ -117,6 +118,20 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
  continue;
  }
 
+ if (skip_prefix(copyfrom, "label:", &out)) {
+ struct strbuf sb = STRBUF_INIT;
+ size_t l = nextat - out;
+ strbuf_add(&sb, out, l);
+ if (!item->group) {
+ item->group = xmalloc(sizeof(*item->group));
+ string_list_init(item->group, 1);
+ }
+ string_list_split(item->group, sb.buf, ' ', -1);
+ string_list_remove_empty_items(item->group, 0);
+ strbuf_release(&sb);
+ continue;
+ }
+
  for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
  if (strlen(pathspec_magic[i].name) == len &&
     !strncmp(pathspec_magic[i].name, copyfrom, len)) {
@@ -425,7 +440,7 @@ void parse_pathspec(struct pathspec *pathspec,
  for (i = 0; i < n; i++) {
  unsigned short_magic;
  entry = argv[i];
-
+ item[i].group = NULL;
  item[i].magic = prefix_pathspec(item + i, &short_magic,
  argv + i, flags,
  prefix, prefixlen, entry);
@@ -502,6 +517,13 @@ void copy_pathspec(struct pathspec *dst, const struct pathspec *src)
 
 void free_pathspec(struct pathspec *pathspec)
 {
+ int i;
+ for (i = 0; i < pathspec->nr; i++) {
+ if (pathspec->items[i].group)
+ string_list_clear(pathspec->items[i].group, 1);
+ free(pathspec->items[i].group);
+ }
+
  free(pathspec->items);
  pathspec->items = NULL;
 }
diff --git a/pathspec.h b/pathspec.h
index 0c11262..e3f7ebf 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -32,6 +32,7 @@ struct pathspec {
  int len, prefix;
  int nowildcard_len;
  int flags;
+ struct string_list *group;
  } *items;
 };
 
diff --git a/t/t6134-pathspec-with-labels.sh b/t/t6134-pathspec-with-labels.sh
new file mode 100755
index 0000000..0c061ce
--- /dev/null
+++ b/t/t6134-pathspec-with-labels.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+
+test_description='test labels in pathspecs'
+. ./test-lib.sh
+
+test_expect_success 'setup a tree' '
+ for p in file sub/file sub/sub/file sub/file2 sub/sub/sub/file sub2/file; do
+ if echo $p | grep /; then
+ mkdir -p $(dirname $p)
+ fi &&
+ : >$p &&
+ git add $p &&
+ git commit -m $p
+ done &&
+ git log --oneline --format=%s >actual &&
+ cat <<EOF >expect &&
+sub2/file
+sub/sub/sub/file
+sub/file2
+sub/sub/file
+sub/file
+file
+EOF
+ test_cmp expect actual
+'
+
+test_expect_success 'pathspec with labels and no .gitattributes exists' '
+ git ls-files ":(label:a)" >actual &&
+ test_must_be_empty actual
+'
+
+test_expect_success 'setup .gitattributes' '
+ cat <<-EOF >.gitattributes &&
+ /file label=b
+ sub/file label=a
+ sub/sub/* label=b,c
+ EOF
+ git add .gitattributes &&
+ git commit -m "add attributes"
+'
+
+test_expect_success 'check label' '
+ cat <<-EOF >expect &&
+ sub/file
+ EOF
+ git ls-files ":(label:a)" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'check label from label list' '
+ cat <<-EOF >expect &&
+ sub/sub/file
+ EOF
+ git ls-files ":(label:c)" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'check label with more labels' '
+ cat <<-EOF >expect &&
+ file
+ sub/sub/file
+ EOF
+ git ls-files ":(label:b)" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'check label with more labels but excluded path' '
+ cat <<-EOF >expect &&
+ sub/sub/file
+ EOF
+ git ls-files ":(label:b)" ":(exclude)./file" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'check label specifying more labels' '
+ cat <<-EOF >expect &&
+ sub/sub/file
+ EOF
+ git ls-files ":(label:b c)" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'check label specifying more labels' '
+ cat <<-EOF >expect &&
+ sub/file
+ sub/sub/file
+ EOF
+ git ls-files ":(label:b c)" ":(label:a)" >actual &&
+ test_cmp expect actual
+'
+test_done
--
2.8.2.400.g66c4903.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] pathspec: record labels

Junio C Hamano
Stefan Beller <[hidden email]> writes:

> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index 8ad29e6..a1fc9e0 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -362,6 +362,11 @@ glob;;
>   For example, "Documentation/{asterisk}.html" matches
>   "Documentation/git.html" but not "Documentation/ppc/ppc.html"
>   or "tools/perf/Documentation/perf.html".
> +
> +label:<white space separated list>;;
> + Labels can be assigned to pathspecs in the .gitattributes file.
> + By specifying a list of labels the pattern will match only
> + files which have all of the listed labels.
>  +
>  Two consecutive asterisks ("`**`") in patterns matched against
>  full pathname may have special meaning:

I think the new text is stuffed in the middle of the "glob;;" entry.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 3/4] pathspec: move prefix check out of the inner loop

Junio C Hamano
In reply to this post by Stefan Beller-4
Stefan Beller <[hidden email]> writes:

> The prefix check is not related the check of pathspec magic; also there
> is no code that is relevant after we'd break the loop on a match for
> "prefix:". So move the check before the loop and shortcircuit the outer
> loop.
>
> Signed-off-by: Stefan Beller <[hidden email]>
> ---

What were we thinking back when we added this in the middle of the
loop at 233c3e6c (parse_pathspec: preserve prefix length via
PATHSPEC_PREFIX_ORIGIN, 2013-07-14)?  I am a bit embarrassed.

>  pathspec.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/pathspec.c b/pathspec.c
> index eba37c2..4dff252 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -107,21 +107,22 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>   nextat = copyfrom + len;
>   if (!len)
>   continue;
> +
> + if (starts_with(copyfrom, "prefix:")) {
> + char *endptr;
> + *pathspec_prefix = strtol(copyfrom + 7,
> +  &endptr, 10);
> + if (endptr - copyfrom != len)
> + die(_("invalid parameter for pathspec magic 'prefix'"));
> + continue;
> + }
> +
>   for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
>   if (strlen(pathspec_magic[i].name) == len &&
>      !strncmp(pathspec_magic[i].name, copyfrom, len)) {
>   *magic |= pathspec_magic[i].bit;
>   break;
>   }
> - if (starts_with(copyfrom, "prefix:")) {
> - char *endptr;
> - *pathspec_prefix = strtol(copyfrom + 7,
> -  &endptr, 10);
> - if (endptr - copyfrom != len)
> - die(_("invalid parameter for pathspec magic 'prefix'"));
> - /* "i" would be wrong, but it does not matter */
> - break;
> - }
>   }
>   if (ARRAY_SIZE(pathspec_magic) <= i)
>   die(_("Invalid pathspec magic '%.*s' in '%s'"),
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] pathspec: record labels

Stefan Beller-4
In reply to this post by Junio C Hamano
On Thu, May 12, 2016 at 9:32 PM, Junio C Hamano <[hidden email]> wrote:

> Stefan Beller <[hidden email]> writes:
>
>> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
>> index 8ad29e6..a1fc9e0 100644
>> --- a/Documentation/glossary-content.txt
>> +++ b/Documentation/glossary-content.txt
>> @@ -362,6 +362,11 @@ glob;;
>>       For example, "Documentation/{asterisk}.html" matches
>>       "Documentation/git.html" but not "Documentation/ppc/ppc.html"
>>       or "tools/perf/Documentation/perf.html".
>> +
>> +label:<white space separated list>;;
>> +     Labels can be assigned to pathspecs in the .gitattributes file.
>> +     By specifying a list of labels the pattern will match only
>> +     files which have all of the listed labels.
>>  +
>>  Two consecutive asterisks ("`**`") in patterns matched against
>>  full pathname may have special meaning:
>
> I think the new text is stuffed in the middle of the "glob;;" entry.

It is. Will fix in a reroll.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] pathspec: record labels

Junio C Hamano
In reply to this post by Stefan Beller-4
Stefan Beller <[hidden email]> writes:

> +
> +label:<white space separated list>;;
> + Labels can be assigned to pathspecs in the .gitattributes file.
> + By specifying a list of labels the pattern will match only
> + files which have all of the listed labels.
>  +

Attributes are given to paths, not pathspecs.  You specify which paths
the entry applies to by matching the pattern (which is at the beginning
of the line) against each path, and take the matching entries.

In any case, what the first sentence tries to explain applies to
each and every attribute .gitattributes file can define, and
explaining it should be better left for the first sentence in the
DESCRIPTION section.

As to the second sentence, I would say "When specifying the paths to
work on to various Git commands, the :(label=<label>,...)  magic
pathspec can be used to select paths that have all the labels given
by the 'label' attribute.", or something like that, to clarify where
that "specifying" and "match" happens (they do not happen here, but
happen when using magic pathspec).

> +void load_labels(const char *name, int namelen, struct string_list *list)

This must be file scope static, no?

> +{
> + static struct git_attr *attr;
> + struct git_attr_check check;
> + char *path = xmemdupz(name, namelen);
> +
> + if (!attr)
> + attr = git_attr("label");
> + check.attr = attr;
> +
> + if (git_check_attr(path, 1, &check))
> + die("git_check_attr died");
> +
> + if (ATTR_CUSTOM(check.value)) {
> + string_list_split(list, check.value, ',', -1);
> + string_list_sort(list);
> + }
> +
> + free(path);
> +}

I am wondering where we should sanity check (and die, pointing out
an error in .gitattributes file).  Is this function a good place to
reject TRUE and FALSE?

By the way, ATTR_CUSTOM() is probably a bad name.  gitattributes.txt
calls them Set, Unset, Set to a value and Unspecified, and this is
trying to single out "Set to a value" case.  ATTR_STRING() may be
more appropriate.

> @@ -263,6 +285,15 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
>      strncmp(item->match, name - prefix, item->prefix))
>   return 0;
>  
> + if (item->group) {
> + struct string_list has_labels = STRING_LIST_INIT_DUP;
> + struct string_list_item *si;
> + load_labels(name, namelen, &has_labels);
> + for_each_string_list_item(si, item->group)
> + if (!string_list_has_string(&has_labels, si->string))
> + return 0;
> + }
> +

Is this the right place, before we even check if the prefix already
covered everything?

We are looking at one element in the pathspec here.  If that element
is known to be a "group", and the path has all the required labels,
is it correct to fall through?

    Ahh, you are making ":(label=...)makefile" to say "paths must
    match the string 'makefile' in some way, and further the paths
    must have all these labels?  Then falling through is correct.
    The placement before the "prefix covered" looks still a bit
    strange, but understandable.

Is this code leaking has_labels every time it is called?

By the way, we should pick one between label and group and stick to
it, no?  Perhaps call the new field "item->label"?

>   /* If the match was just the prefix, we matched */
>   if (!*match)
>   return MATCHED_RECURSIVELY;
> diff --git a/pathspec.c b/pathspec.c
> index 4dff252..c227c25 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -94,6 +94,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>  {
>   int i;
>   const char *copyfrom = *copyfrom_;
> + const char *out;

It probably is meant as "output", but it looks more like the "body" or
the "contents" of the named magic to me.

>   /* longhand */
>   const char *nextat;
>   for (copyfrom = elt + 2;
> @@ -117,6 +118,20 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>   continue;
>   }
>  
> + if (skip_prefix(copyfrom, "label:", &out)) {

This is good, but can you fix the "prefix:" one we see a few lines
above, too, using this to lose "copyfrom + 7" there?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] pathspec: record labels

Stefan Beller-4
On Thu, May 12, 2016 at 10:26 PM, Junio C Hamano <[hidden email]> wrote:

> Stefan Beller <[hidden email]> writes:
>
>> +
>> +label:<white space separated list>;;
>> +     Labels can be assigned to pathspecs in the .gitattributes file.
>> +     By specifying a list of labels the pattern will match only
>> +     files which have all of the listed labels.
>>  +
>
> Attributes are given to paths, not pathspecs.  You specify which paths
> the entry applies to by matching the pattern (which is at the beginning
> of the line) against each path, and take the matching entries.
>
> In any case, what the first sentence tries to explain applies to
> each and every attribute .gitattributes file can define, and
> explaining it should be better left for the first sentence in the
> DESCRIPTION section.
>
> As to the second sentence, I would say "When specifying the paths to
> work on to various Git commands, the :(label=<label>,...)  magic
> pathspec can be used to select paths that have all the labels given
> by the 'label' attribute.", or something like that, to clarify where
> that "specifying" and "match" happens (they do not happen here, but
> happen when using magic pathspec).

Reminder for myself: add a test for

    ":(label=a) :(exclude,label=b)"

>
>> +void load_labels(const char *name, int namelen, struct string_list *list)
>
> This must be file scope static, no?

Yes. It seems like I forget these statics often. :(

>
>> +{
>> +     static struct git_attr *attr;
>> +     struct git_attr_check check;
>> +     char *path = xmemdupz(name, namelen);
>> +
>> +     if (!attr)
>> +             attr = git_attr("label");
>> +     check.attr = attr;
>> +
>> +     if (git_check_attr(path, 1, &check))
>> +             die("git_check_attr died");
>> +
>> +     if (ATTR_CUSTOM(check.value)) {
>> +             string_list_split(list, check.value, ',', -1);
>> +             string_list_sort(list);
>> +     }
>> +
>> +     free(path);
>> +}
>
> I am wondering where we should sanity check (and die, pointing out
> an error in .gitattributes file).  Is this function a good place to
> reject TRUE and FALSE?

Would it make sense to mark a file as

    "follows the labeling system, but has no label" (TRUE)
    "doesn't follow the labeling system at all" (FALSE)

This may be a useful addition later on to say

    "I want to talk only about labeled files, but not label 'a'"

I would think.


>
> By the way, ATTR_CUSTOM() is probably a bad name.  gitattributes.txt
> calls them Set, Unset, Set to a value and Unspecified, and this is
> trying to single out "Set to a value" case.  ATTR_STRING() may be
> more appropriate.

Ok.

>
>> @@ -263,6 +285,15 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
>>           strncmp(item->match, name - prefix, item->prefix))
>>               return 0;
>>
>> +     if (item->group) {
>> +             struct string_list has_labels = STRING_LIST_INIT_DUP;
>> +             struct string_list_item *si;
>> +             load_labels(name, namelen, &has_labels);
>> +             for_each_string_list_item(si, item->group)
>> +                     if (!string_list_has_string(&has_labels, si->string))
>> +                             return 0;
>> +     }
>> +
>
> Is this the right place, before we even check if the prefix already
> covered everything?
>
> We are looking at one element in the pathspec here.  If that element
> is known to be a "group", and the path has all the required labels,
> is it correct to fall through?
>
>     Ahh, you are making ":(label=...)makefile" to say "paths must
>     match the string 'makefile' in some way, and further the paths
>     must have all these labels?  Then falling through is correct.

This is how I understood your initial design idea.

    :(label=C_code)contrib/

gives all the retired C programs.

>     The placement before the "prefix covered" looks still a bit
>     strange, but understandable.
>
> Is this code leaking has_labels every time it is called?

It does.

>
> By the way, we should pick one between label and group and stick to
> it, no?  Perhaps call the new field "item->label"?

will do.

>
>>       /* If the match was just the prefix, we matched */
>>       if (!*match)
>>               return MATCHED_RECURSIVELY;
>> diff --git a/pathspec.c b/pathspec.c
>> index 4dff252..c227c25 100644
>> --- a/pathspec.c
>> +++ b/pathspec.c
>> @@ -94,6 +94,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>>  {
>>       int i;
>>       const char *copyfrom = *copyfrom_;
>> +     const char *out;
>
> It probably is meant as "output", but it looks more like the "body" or
> the "contents" of the named magic to me.

ok.

>
>>       /* longhand */
>>       const char *nextat;
>>       for (copyfrom = elt + 2;
>> @@ -117,6 +118,20 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt,
>>                       continue;
>>               }
>>
>> +             if (skip_prefix(copyfrom, "label:", &out)) {
>
> This is good, but can you fix the "prefix:" one we see a few lines
> above, too, using this to lose "copyfrom + 7" there?

I can do that. Though I did not want to clutter this patch with it.

I wonder that you focus on the details already, but not on the grand
design of things. "Is it actually a sane thing I am proposing here?"
Though you may be biased as the the high level idea came from
you. :)

One of the things I switched last minute and tried to address in the
cover letter is the semantics of ORing or ANDing the labels given
within one pathspec item.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] pathspec: record labels

Junio C Hamano
Stefan Beller <[hidden email]> writes:

> Would it make sense to mark a file as
>
>     "follows the labeling system, but has no label" (TRUE)
>     "doesn't follow the labeling system at all" (FALSE)

Isn't the former be "label="

I do not know what you mean by the latter.  I would understand
"pretend this has all the labels under the sun", though.

>>     Ahh, you are making ":(label=...)makefile" to say "paths must
>>     match the string 'makefile' in some way, and further the paths
>>     must have all these labels?  Then falling through is correct.
>
> This is how I understood your initial design idea.
>
>     :(label=C_code)contrib/
>
> gives all the retired C programs.

OK, you probably meant s/C program/shell script/ with a different
label, but I think I got the idea.

> I wonder that you focus on the details already, but not on the grand
> design of things. "Is it actually a sane thing I am proposing here?"
> Though you may be biased as the the high level idea came from
> you. :)

Biased is the primary reason ;-) I trust that other reviewers can
stop me and correct course when I veer off the deep end by myself.

Just like you called this as a "mock", I am treating this as a
testbed to let you try out the "defaultGroup" thing to see how
flexibly you can express common wishes the end users have, i.e.
"does it give us an expressive enough system to replace repo?"
is the question I want this code to help answering.  And that is
why I was trying to make sure it is good enough quickly.

> One of the things I switched last minute and tried to address in
> the cover letter is the semantics of ORing or ANDing the labels
> given within one pathspec item.

That is something we may find out that the other way is more useful,
or your final choice is better, by seeing how easy to express common
patterns of submodule selections.  Perhaps we might end up wanting
both, but as you said, OR can be given by listing the same path with
differnt required-labels so I think what you have is good enough for
us to start experimenting with the higher layer.

One worry I have is that historically pathspec matching and
attribute matching have both been very hard to make it perform well.
The placement of string_list_has_string() and load_labels() at the
leaf level of the pathspec matching logic cannot be helped, but
these calls might turn out to be an unacceptable performance
bottleneck.  But it is a bit too early to worry about that, I think.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Duy Nguyen
In reply to this post by Stefan Beller-4
On Fri, May 13, 2016 at 7:19 AM, Stefan Beller <[hidden email]> wrote:

> After some fruitful discussion[1] in which Junio suggested trying a very
> different route[2] that is more general and not submodule related, I considered
> doing a mock for this.
>
> This lets you label arbitrary pathspecs, e.g. in git.git we may want to have:
>
>     t/t[0-9]*.sh label=tests
>
> such that
>
>     $ git log --author=Beller ":(label=tests)"
>
> would show all commits in which I touched tests.
>
> This has suprisingly few lines of code, as the first 3 patches are refactoring.
> The actual new feature is in the last patch.
>
> This would solve the submodule issues I want to solve, as I can produce a
> .gitattributes like:
>
>     ./submodule1 label=default
>     ./submodule2 label=default
>     ./submodule3 label=optional-feature1
>
> and then I'd instruct the users to clone like this:
>
>     git clone .. superproject
>     cd superproject
>     git submodule update --init :(label:default)
>
> The second part of the submodule series to collapse these three commands
> will come as an extra series later, then.

Yeah. I can see the use of this, even without submodules.

> What annoys me here:
> Attributes can only be set once at the moment, there is no way to collect all
> attributes. If we'd have such a collection feature we would be able to
> configure this:
>
>     *.[ch] label=C,code
>     contrib/** label=oldstuff

Instead of putting everything in under the same attribute name
"label", make one attribute per label? Would this work?

*.[ch] c-group code-group

And the pathspec magic name can be simply "attr", any git attribute
can be used with it, e.g. :(attr:c-group)

> and run this:
>
>     git status ":(label:C oldstuff)"
>
> which would be the equivalent to
>
>     contrib/**.[ch]
>
> as in this proposed implementation the labels which are given within one
> pathspec item are logical AND. To get the logical OR, you'd have to give multiple
> pathspec items, i.e. ":(label:C)" ":(label:oldstuff)"

It's even better if pathspec does not have to deal with combination
logic. Can attr macros deal with that (or if it can't at the moment,
can we improve it)?
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Junio C Hamano
Duy Nguyen <[hidden email]> writes:

> Instead of putting everything in under the same attribute name
> "label", make one attribute per label? Would this work?
>
> *.[ch] c-group code-group

The attribute subsystem expects that there will not be unbounded
large number of attributes, so this is not a good direction to go.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Junio C Hamano
Junio C Hamano <[hidden email]> writes:

> Duy Nguyen <[hidden email]> writes:
>
>> Instead of putting everything in under the same attribute name
>> "label", make one attribute per label? Would this work?
>>
>> *.[ch] c-group code-group
>
> The attribute subsystem expects that there will not be unbounded
> large number of attributes, so this is not a good direction to go.

Having said that, I do not mind too much if it turns out that it is
necessary to use an unbounded set of random attributes to solve a
specific problem, if the use case is good.

But even then, in order to avoid confusion and name clashes, I'd
prefer to see more like

        *.[ch] group-c group-code

that is matched by

        git cmd ':(group:c code)

i.e. reserve a single prefix that is not and will not be used for
other purposes.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Duy Nguyen
On Mon, May 16, 2016 at 2:33 AM, Junio C Hamano <[hidden email]> wrote:

> Junio C Hamano <[hidden email]> writes:
>
>> Duy Nguyen <[hidden email]> writes:
>>
>>> Instead of putting everything in under the same attribute name
>>> "label", make one attribute per label? Would this work?
>>>
>>> *.[ch] c-group code-group
>>
>> The attribute subsystem expects that there will not be unbounded
>> large number of attributes, so this is not a good direction to go.
>
> Having said that, I do not mind too much if it turns out that it is
> necessary to use an unbounded set of random attributes to solve a
> specific problem, if the use case is good.

I only have a vague idea so far, it seems to me a good idea to be able
to specify "binary-marked paths" or "filter attached paths" from
pathspec. We can already do this with scripts, but we need to be very
careful about quoting. And if it's pathspec it can be combined with
other magic (most likely negation)

> But even then, in order to avoid confusion and name clashes, I'd
> prefer to see more like
>
>         *.[ch] group-c group-code
>
> that is matched by
>
>         git cmd ':(group:c code)
>
> i.e. reserve a single prefix that is not and will not be used for
> other purposes.

For my above use case, i can still define macro group-binary that is
an alias of binary to get around this. So it's ok.
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Stefan Beller-4
In reply to this post by Duy Nguyen
On Sun, May 15, 2016 at 3:06 AM, Duy Nguyen <[hidden email]> wrote:
> Instead of putting everything in under the same attribute name
> "label", make one attribute per label? Would this work?
>
> *.[ch] c-group code-group
>
> And the pathspec magic name can be simply "attr", any git attribute
> can be used with it, e.g. :(attr:c-group)

So you want to be able to query something like:

    git ls-files ":(crlf=auto)"

as well?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Junio C Hamano
Stefan Beller <[hidden email]> writes:

> On Sun, May 15, 2016 at 3:06 AM, Duy Nguyen <[hidden email]> wrote:
>> Instead of putting everything in under the same attribute name
>> "label", make one attribute per label? Would this work?
>>
>> *.[ch] c-group code-group
>>
>> And the pathspec magic name can be simply "attr", any git attribute
>> can be used with it, e.g. :(attr:c-group)
>
> So you want to be able to query something like:
>
>     git ls-files ":(crlf=auto)"
>
> as well?

It would be more like

        git ls-files ":(attr:crlf=auto)"

It certainly sounds tempting, even though I do not want to keep what
this initial chunk of the series needs to do to the minimum.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Stefan Beller-4
On Mon, May 16, 2016 at 10:39 AM, Junio C Hamano <[hidden email]> wrote:

> Stefan Beller <[hidden email]> writes:
>
>> On Sun, May 15, 2016 at 3:06 AM, Duy Nguyen <[hidden email]> wrote:
>>> Instead of putting everything in under the same attribute name
>>> "label", make one attribute per label? Would this work?
>>>
>>> *.[ch] c-group code-group
>>>
>>> And the pathspec magic name can be simply "attr", any git attribute
>>> can be used with it, e.g. :(attr:c-group)
>>
>> So you want to be able to query something like:
>>
>>     git ls-files ":(crlf=auto)"
>>
>> as well?
>
> It would be more like
>
>         git ls-files ":(attr:crlf=auto)"
>
> It certainly sounds tempting, even though I do not want to keep what
> this initial chunk of the series needs to do to the minimum.

This is another case for using ':' instead of '='.
So I think ':' is better for this future enhancement.

Also this future enhancement may ask for

      git ls-files ":(attr:label=foo)"

or

      git ls-files ":(attr:-label)"

so the user can circumvent the warn and ignore strategy. :(
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Junio C Hamano
Stefan Beller <[hidden email]> writes:

> This is another case for using ':' instead of '='.
> So I think ':' is better for this future enhancement.
>
> Also this future enhancement may ask for
>
>       git ls-files ":(attr:label=foo)"
>
> or
>
>       git ls-files ":(attr:-label)"
>
> so the user can circumvent the warn and ignore strategy. :(

That is exactly what we want, I would say, if we want to accept
"arbitrary set of attributes with their states".

The "warn and ignore" comes only from "with '(:label=X Y Z)', we
inspect attributes X, Y and Z and insist them to be set to true; it
is a configuration error to set the label to anything but a string",
and accepting "arbitrary set of attributes with their states" makes
it moot (as I said earlier in $gmane/294776).

So are we leaning towards ":(attr:<state>)", where <state> is one of

    var=value
    var
    -var
    !var

now?  It makes the pathspec magic parser a bit more complex (and I
am not sure if we have to worry too much about quoting "value" part
to allow arbitrary sequence of bytes), though.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|

Re: [RFC PATCH 0/4] pathspec labels [WAS: submodule groups]

Stefan Beller-4
On Mon, May 16, 2016 at 2:18 PM, Junio C Hamano <[hidden email]> wrote:

> Stefan Beller <[hidden email]> writes:
>
>> This is another case for using ':' instead of '='.
>> So I think ':' is better for this future enhancement.
>>
>> Also this future enhancement may ask for
>>
>>       git ls-files ":(attr:label=foo)"
>>
>> or
>>
>>       git ls-files ":(attr:-label)"
>>
>> so the user can circumvent the warn and ignore strategy. :(
>
> That is exactly what we want, I would say, if we want to accept
> "arbitrary set of attributes with their states".
>
> The "warn and ignore" comes only from "with '(:label=X Y Z)', we
> inspect attributes X, Y and Z and insist them to be set to true; it
> is a configuration error to set the label to anything but a string",
> and accepting "arbitrary set of attributes with their states" makes
> it moot (as I said earlier in $gmane/294776).
>
> So are we leaning towards ":(attr:<state>)", where <state> is one of
>
>     var=value
>     var
>     -var
>     !var
>
> now?  It makes the pathspec magic parser a bit more complex (and I
> am not sure if we have to worry too much about quoting "value" part
> to allow arbitrary sequence of bytes), though.

And we want to have both "label=A B C" and attr:label=A B C" or *just* the
attr query?

We should not allow the user to add arbitrary attributes (i.e. labels).
Instead each of the "attr for labeling purposes" needs to follow a clear scheme
that allows us to add attributes later on that are outside of that scheme.

In a very first implementation we could require the attribute to start with
"label=". ;)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
12