[RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Thomas Rast
This is an attempt to rewrite the documentation for --full-history,
--simplify-merges, and --sparse for clarity.

--prune-merges does not exist yet; it would be intended to restore
the default behavior, overriding any earlier --full-history or
--simplify-merges.

WARNING: Does not match current code behavior.

Signed-off-by: Thomas Rast <[hidden email]>
---

This is fallout from my investigation of rev-list during the
filter-branch topic.  I found the descriptions quite unhelpful, and
investigated commit logs and empirical behavior a bit.

The problem is, rev-list behaves quite erratically when --sparse,
--full-history and --parents are combined.

With the history from the patch, i.e.

  $ git log --graph --pretty=oneline --abbrev-commit --decorate
  *-.   e0083e6... (refs/heads/master) Merge branches 'side' and 'unrelated'
  |\ \
  | | * b3127f4... (refs/heads/unrelated) d: unrelated
  | * | 984aa48... (refs/heads/side) C: dir=B
  | |/
  * | aad9982... B: dir
  |/
  * b60c459... A: dir
  * ad7052b... initial

where only commits with 'dir' in the subject touch the 'dir'
subdirectory, and C=B in terms of diffs, a few unexpected things
happen.  (I manually applied --abbrev-commit to rev-list output to make
it fit in fewer columns.)

As noted in the patch, I would expect the default pruning to simplify
the history to 'o -- A -- B -- m' (drop parent 'd' because the entire
sideline is non-touching; drop parent 'C' because it is the same as
'B'), and then have --dense decide to drop 'o' and 'm' because they
are not touching.  The output agrees:

  $ git rev-list --pretty=oneline HEAD -- dir
  aad9982... B: dir
  b60c459... A: dir

Also, --sparse tends to support this two-pass simplification theory:

  $ git rev-list --pretty=oneline --sparse HEAD -- dir
  e0083e6... Merge branches 'side' and 'unrelated'
  aad9982... B: dir
  b60c459... A: dir
  ad7052b... initial
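
The two listings above (default and --sparse) can be reproduced with a small toy model. This is only a sketch of the behaviour described in this mail, not git's implementation: the names PARENTS, DIR, is_treesame and rev_list are made up for illustration, and the contents of 'dir' are reduced to single letters.

```python
# Toy copy of the example history: commit -> parents (first parent first).
PARENTS = {"m": ["B", "C", "d"], "B": ["A"], "C": ["A"], "d": ["A"],
           "A": ["o"], "o": []}
# Content of `dir` in each commit; None means the path does not exist yet.
DIR = {"o": None, "A": "a", "B": "b", "C": "b", "d": "a", "m": "b"}


def is_treesame(c):
    """True if `dir` is unchanged relative to some parent (root: vs. an empty tree)."""
    if not PARENTS[c]:
        return DIR[c] is None
    return any(DIR[p] == DIR[c] for p in PARENTS[c])


def rev_list(tip, sparse=False):
    shown, todo, seen = [], [tip], set()
    while todo:
        c = todo.pop(0)
        if c in seen:
            continue
        seen.add(c)
        if sparse or not is_treesame(c):
            shown.append(c)              # --dense hides TREESAME commits
        same = [p for p in PARENTS[c] if DIR[p] == DIR[c]]
        if len(PARENTS[c]) > 1 and same:
            todo.append(same[0])         # merge: follow only one TREESAME parent
        else:
            todo.extend(PARENTS[c])
    return shown


print(rev_list("m"))               # default (--dense)
print(rev_list("m", sparse=True))  # --sparse
```

On this history the model yields B, A by default and m, B, A, o with sparse=True, in line with the two rev-list listings above. The traversal order is a simple queue and is not meant to mirror git's date ordering in general.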

Now according to current docs, --full-history should turn off removing
"merges which didn't change anything at all at some child", but still
"simplify away merges that didn't change anything at all into either
child."  How does being a merge have anything to do with its
_children_?  So I read this as saying that a merge is removed iff it
agrees with all _parents_.  However, despite having a merge that is
different relative to its third parent, it is dropped:

  $ git rev-list --pretty=oneline --full-history HEAD -- dir
  984aa48... C: dir=B
  aad9982... B: dir
  b60c459... A: dir

But --parents --full-history magically revives the merge:

  $ git rev-list --pretty=oneline --parents --full-history HEAD -- dir
  e0083e6... aad9982... 984aa48... b60c459... Merge branches 'side' and 'unrelated'
  984aa48... b60c459... C: dir=B
  aad9982... b60c459... B: dir
  b60c459... A: dir

One quickly verifies that --sparse --full-history shows everything, as
it should.

More to the point, --simplify-merges actually shows the merge when
--full-history does not, resulting in

  $ git rev-list --pretty=oneline --simplify-merges HEAD -- dir
  e0083e6... Merge branches 'side' and 'unrelated'
  984aa48... C: dir=B
  aad9982... B: dir
  b60c459... A: dir

despite the commit message (90818fdc) claiming that it is an
additional simplification on top of --full-history, and should
therefore output at most as many commits.  --parents seems to agree
that the simplification is working right:

  $ git rev-list --pretty=oneline --parents --simplify-merges HEAD -- dir
  e0083e6... aad9982... 984aa48... Merge branches 'side' and 'unrelated'
  984aa48... b60c459... C: dir=B
  aad9982... b60c459... B: dir
  b60c459... A: dir

(Note that it dropped the third unrelated parent.)
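
One way to read the missing third parent: in the --parents --full-history listing further up, the merge's rewritten parents are B, C and A (b60c459), and A is an ancestor of B. The parent-pruning step from the --simplify-merges description in the patch ("remove parents that are either identical to or ancestors of an existing parent") can be sketched as follows. The helper names ancestors and drop_redundant are hypothetical and the graph is the toy history from this mail; this is an illustration, not git's code.

```python
# Toy copy of the example history: commit -> parents.
PARENTS = {"m": ["B", "C", "d"], "B": ["A"], "C": ["A"], "d": ["A"],
           "A": ["o"], "o": []}


def ancestors(c):
    """All commits reachable from c, excluding c itself."""
    out, todo = set(), list(PARENTS[c])
    while todo:
        p = todo.pop()
        if p not in out:
            out.add(p)
            todo.extend(PARENTS[p])
    return out


def drop_redundant(parents):
    """Remove duplicate parents and parents that are ancestors of another parent."""
    unique = list(dict.fromkeys(parents))   # keeps the first occurrence, in order
    return [p for p in unique
            if not any(p in ancestors(q) for q in unique if q != p)]


# m's parents as printed by --parents --full-history: B, C, A
print(drop_redundant(["B", "C", "A"]))
```

Here A is dropped because it is reachable from B, leaving B and C, which matches the two parents printed for the merge in the --simplify-merges listing.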

Furthermore, with --sparse the same turns into

  $ git rev-list --pretty=oneline --sparse --parents --simplify-merges HEAD -- dir
  e0083e6... aad9982... 984aa48... b3127f4... Merge branches 'side' and 'unrelated'
  b3127f4... b60c459... d: unrelated
  984aa48... b60c459... C: dir=B
  aad9982... b60c459... B: dir
  b60c459... ad7052b... A: dir
  ad7052b... initial

So it didn't simplify away the parent 'd' after all, even though
90818fdc (in the second * bullet) does not mention any criterion that
would make 'd' its own replacement.  (I would actually expect it to
replace 'd' with 'initial' on that ancestor line, and subsequently
prune that parent because it is redundant.)

So if anyone read this mail up to this point:  Which of these are
actual bugs?  Which of them are a misunderstanding on my part?

Thanks for your time,

Thomas


 Documentation/rev-list-options.txt |   97 ++++++++++++++++++++++++++++-------
 1 files changed, 77 insertions(+), 20 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index ee6822a..0eaefd2 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -191,20 +191,6 @@ endif::git-rev-list[]
 
  Stop when a given path disappears from the tree.
 
---full-history::
-
- Show also parts of history irrelevant to current state of given
- paths. This turns off history simplification, which removed merges
- which didn't change anything at all at some child. It will still actually
- simplify away merges that didn't change anything at all into either
- child.
-
---simplify-merges::
-
- Simplify away commits that did not change the given paths, similar
- to `--full-history`, and further remove merges none of whose
- parent history changes the given paths.
-
 --no-merges::
 
  Do not print commits with more than one parent.
@@ -287,18 +273,89 @@ See also linkgit:git-reflog[1].
  Output uninteresting commits at the boundary, which are usually
  not shown.
 
+--
+
+History Simplification
+~~~~~~~~~~~~~~~~~~~~~~
+
+When optional paths are given, 'git-rev-list' simplifies merge and
+non-merge commits separately.  First, all non-merge commits that do
+not touch the given paths are marked as such.  We'll call them
+'non-touching' commits, and all other commits 'touching'.
+
+Second, merges are simplified.  You can choose three levels.  We
+illustrate the strategies with the following example history, where
+touching commits are shown with capital letters and both B and C
+contain the same changes:
+
+-----------------------------------------------------------------------
+ o -- A -- B -- m
+     |\      /|
+     | \- C -/ |
+     \       /
+      \-- d --/
+-----------------------------------------------------------------------
+
+--prune-merges::
+
+ This is the default.  A merge has its parents rewritten as
+ follows:
++
+ * All parents that do not have any touching ancestors are
+  removed.
++
+ * Of a set of parents that agree on the path contents, only
+  the first is kept.
++
+In our example, we get the following:
++
+-----------------------------------------------------------------------
+ o -- A -- B -- m
+-----------------------------------------------------------------------
+
+--simplify-merges::
+
+ For each commit C, compute its replacement in the final history:
++
+* First compute the replacements of all parents of C, and
+  rewrite C to have these parents.  Then remove parents that
+  are either identical to or ancestors of an existing parent.
++
+* If, after this simplification, the commit is touching, a root or
+  merge commit, or marked as uninteresting, it remains.
++
+In the example, history is simplified as follows.  (Note that while
+'o' remains, it will not be output with '\--dense'.)
++
+-----------------------------------------------------------------------
+ o -- A -- B -- m
+      \      /
+       \- C -/
+-----------------------------------------------------------------------
+
+--full-history::
+
+ Do not simplify merges at all.  Their ancestor lines are still
+ only shown if they have any touching commits, but the merges
+ themselves are always output.
+
+Finally the simplified history is output.  You can control which
+commits are shown:
+
 --dense::
+
+ Hide all non-touching non-merge commits.  This is the default.
+
 --sparse::
 
-When optional paths are given, the default behaviour ('--dense') is to
-only output commits that changes at least one of them, and also ignore
-merges that do not touch the given paths.
+ Output all commits.  (Still subject to merge simplification
+ and count and age limitations.)
 
-Use the '--sparse' flag to makes the command output all eligible commits
-(still subject to count and age limitation), but apply merge
-simplification nevertheless.
 
 ifdef::git-rev-list[]
+Bisection Helpers
+~~~~~~~~~~~~~~~~~
+
 --bisect::
 
 Limit output to the one commit object which is roughly halfway between
--
1.6.0.rc2.29.g7ec81

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Junio C Hamano
Thomas Rast <[hidden email]> writes:

>   $ git rev-list --pretty=oneline --full-history HEAD -- dir
> ...
> But --parents --full-history magically revives the merge:
> ...

Personally I do not think --full-history without --parents is of much
usefulness (I'd let Linus or somebody else defend this usage, or make it
imply revs.rewrite_parents otherwise).  If you remove that case from your
set of experiments in the equation, do the rest of the results make sense?

> More to the point, --simplify-merges actually shows the merge when
> --full-history does not, resulting in ...

One thing I forgot to mention (but the code of course does not forget to
do) in the series is that --simplify-merges implies revs.rewrite_parents
which roughly translates to your experiments from the command line to
always have --parents option.

>   $ git rev-list --pretty=oneline --sparse --parents --simplify-merges HEAD -- dir
>   e0083e6... aad9982... 984aa48... b3127f4... Merge branches 'side' and 'unrelated'
>   b3127f4... b60c459... d: unrelated
>   984aa48... b60c459... C: dir=B
>   aad9982... b60c459... B: dir
>   b60c459... ad7052b... A: dir
>   ad7052b... initial

I am not sure what one should expect from combination between these two
options.  --sparse says do not drop commits that are of no interest with
respect to the paths specified, while --simplify-merges tells it to
simplify merges so that the remaining graph shows only the ones that have
relevance to !TREESAME (iow "has some changes") nodes.


Re: [RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Linus Torvalds


On Sun, 10 Aug 2008, Junio C Hamano wrote:
>
> Personally I do not think --full-history without --parents is of much
> usefulness (I'd let Linus or somebody else defend this usage, or make it
> imply revs.rewrite_parents otherwise).  If you remove that case from your
> set of experiments in the equation, do the rest of the results make sense?

Oh, it's _very_ useful.

The most common case is "git whatchanged". It's useful to find a commit
that did some change _without_ any graphical front-end.

And then the merges and parenthood are totally pointless - no human can
try to tie things together in their head _anyway_, so why show them? You
just want to find the change.

                        Linus

Re: [RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Thomas Rast
In reply to this post by Junio C Hamano
Junio C Hamano wrote:

> Thomas Rast <[hidden email]> writes:
>
> >   $ git rev-list --pretty=oneline --full-history HEAD -- dir
> > ...
> > But --parents --full-history magically revives the merge:
> > ...
>
> Personally I do not think --full-history without --parents is of much
> usefulness (I'd let Linus or somebody else defend this usage, or make it
> imply revs.rewrite_parents otherwise).
Well,

  --parents::
          Print the parents of the commit.

does not mention any change in behaviour.  I find it very surprising
that a simple commit formatting option changes the way commits are
_selected_.

> One thing I forgot to mention (but the code of course does not forget to
> do) in the series is that --simplify-merges implies revs.rewrite_parents
> which roughly translates to your experiments from the command line to
> always have --parents option.

Then it makes sense of course.

> >   $ git rev-list --pretty=oneline --sparse --parents --simplify-merges HEAD -- dir
[...]
> I am not sure what one should expect from combination between these two
> options.  --sparse says do not drop commits that are of no interest with
> respect to the paths specified, while --simplify-merges tells it to
> simplify merges so that the remaining graph shows only the ones that have
> relevance to !TREESAME (iow "has some changes") nodes.

It makes sense assuming a one-pass (plus simplify-merges) model.  It
did not fit into my two-pass model that I tried to come up with for an
easier explanation.

So in my current (ahem, new) understanding, that means (assuming the
side effect of --parents):

The simplification follows commits backwards into history according to
the following rules:

--dense:
        Non-merges are included if !TREESAME[1], otherwise they are
        skipped.
--sparse:
        Non-merges are always included.

default:
        Merges are included unless they are TREESAME with a parent, in
        which case they are skipped and only that parent is followed.
--full-history:
        Merges are always included.

Conceptually, that builds a subset of the history, although it is
not kept in memory unless absolutely required.  Then:

--simplify-merges:
        Implies --full-history, then applies your algorithm on the
        resulting subset.

Which probably means that --sparse --simplify-merges makes no sense,
but explains the results.

Note that --full-history makes no exceptions, not even for merges that
are TREESAME w.r.t. all parents, contrary to what the current docs
state.  (This is empirically correct.)  If the above is correct, and
you think it has some merit, I'll rewrite my patch to reflect this
(with examples) to update the docs.
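
The rule table above can be transcribed fairly literally into a toy walker. This is a sketch under the same assumption as the rules (parent rewriting in effect), run on the example history from the first mail; PARENTS, DIR and select are made-up names, and the code is not git's implementation.

```python
# Toy copy of the example history: commit -> parents (first parent first).
PARENTS = {"m": ["B", "C", "d"], "B": ["A"], "C": ["A"], "d": ["A"],
           "A": ["o"], "o": []}
# Content of `dir` in each commit; None means the path does not exist yet.
DIR = {"o": None, "A": "a", "B": "b", "C": "b", "d": "a", "m": "b"}


def select(tip, dense=True, full_history=False):
    shown, todo, seen = [], [tip], set()
    while todo:
        c = todo.pop(0)
        if c in seen:
            continue
        seen.add(c)
        parents = PARENTS[c]
        same = [p for p in parents if DIR[p] == DIR[c]]
        if len(parents) > 1:                  # merge
            if full_history or not same:
                shown.append(c)               # included, all parents followed
                todo.extend(parents)
            else:
                todo.append(same[0])          # skipped, only that parent followed
        else:                                 # non-merge (a root is compared to an empty tree)
            changed = DIR[c] != (DIR[parents[0]] if parents else None)
            if changed or not dense:
                shown.append(c)
            todo.extend(parents)
    return shown


print(sorted(select("m")))                                   # default, --dense
print(sorted(select("m", full_history=True)))                # --full-history
print(sorted(select("m", dense=False, full_history=True)))   # --sparse --full-history
```

On this history the three calls select {A, B}, {A, B, C, m} and all six commits, which agrees with the default, --parents --full-history and --sparse --full-history results reported earlier in the thread. Read literally, the rules also skip a TREESAME merge under --sparse alone, whereas the --sparse listing in the first mail does show the merge, so the table is probably not complete for that combination.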

- Thomas


[1] I still think "touching" was a pretty neat idea ;-)

--
Thomas Rast
[hidden email]




Re: [RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Junio C Hamano
In reply to this post by Linus Torvalds
Linus Torvalds <[hidden email]> writes:

> On Sun, 10 Aug 2008, Junio C Hamano wrote:
>>
>> Personally I do not think --full-history without --parents is of much
>> usefulness (I'd let Linus or somebody else defend this usage, or make it
>> imply revs.rewrite_parents otherwise).  If you remove that case from your
>> set of experiments in the equation, do the rest of the results make sense?
>
> Oh, it's _very_ useful.
>
> The most common case is "git whatchanged". It's useful to find a commit
> that did some change _without_ any graphical front-end.
>
> And then the merges and parenthood are totally pointless - no human can
> try to tie things together in their head _anyway_, so why show them? You
> just want to find the change.

Oh, I was not talking about revs.print_parents part, but about
revs.rewrite_parents part.  What got Thomas puzzled about was exactly how
the set of commits _shown_ are different with and without --parents, which
sets both of these internal flags.  Your "pointless" argument applies to
"print_parents" part, but "rewrite_parents" affects the resulting set.


Re: [RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Linus Torvalds


On Sun, 10 Aug 2008, Junio C Hamano wrote:
>
> Oh, I was not talking about revs.print_parents part, but about
> revs.rewrite_parents part.  What got Thomas puzzled about was exactly how
> the set of commits _shown_ are different with and without --parents, which
> sets both of these internal flags.  Your "pointless" argument applies to
> "print_parents" part, but "rewrite_parents" affects the resulting set.

Umm. Since --parents sets both, I don't see the point of that statement.

The fact is, --parents makes us show parenthood. That means that we need
the merges to show up, otherwise the parenthood is meaningless.

So without "--parents", there is absolutely no point in showing the merges
that don't have any other reason to be shown. And _with_ --parents, we
need to show them because they matter for parenthood.

What's so confusing or hard to understand?

And yes, we now have split that "parents" flag into two separate flags
internally, but that has absolutely _zero_ meaning for "--parents" itself,
and it is totally irrelevant and meaningless to bring up that internal
implementation issue and mentioning "print_parents" and "rewrite_parents".

They are immaterial to the actual argument, and the split-up happened
because of the "--graph" flag, where we actually *do* show parents too,
but we show them through the graph, not by printing the SHA1 of the
parent. So the reason "rewrite_parents" is the one that affects the set of
commits that get output is a small _internal_ implementation detail, and I
really don't see what it has to do with anything at all.

So the fact is:

 - "rewrite_parents" is the flag that says that we are interested in
   parenthood and keeping the graph consistent and dense.

   This very much implies showing merges that would otherwise not be
   relevant.

 - "print_parents" is just a flag whether we should print the parent SHA1
   or not. Nothing less, nothing more.

   This one has absolutely no relevance to whether a merge should be shown
   or not, since it only affects the output _format_. It's related to
   pretty-printing, not to anything else!

I really don't see what the confusion is all about. It's very
straightforward and obvious.

                        Linus
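
The split between the two flags can be illustrated on the example history from the first mail. In this sketch, parent rewriting replaces each parent with its nearest included ancestor, while printing parents only changes the line format. INCLUDED is taken from the --parents --full-history listing earlier in the thread; rewrite, rewritten_parents and format_line are hypothetical names, and the model assumes that every excluded commit is a non-merge.

```python
# Toy copy of the example history: commit -> parents (first parent first).
PARENTS = {"m": ["B", "C", "d"], "B": ["A"], "C": ["A"], "d": ["A"],
           "A": ["o"], "o": []}
# Commits selected by --parents --full-history in the listing earlier in the thread.
INCLUDED = {"m", "B", "C", "A"}


def rewrite(c):
    """Nearest included commit at or below c, or None if there is none."""
    while c is not None and c not in INCLUDED:
        # Assumption: excluded commits are non-merges, so follow the only parent.
        c = PARENTS[c][0] if PARENTS[c] else None
    return c


def rewritten_parents(c):
    return [q for q in (rewrite(p) for p in PARENTS[c]) if q is not None]


def format_line(c, print_parents):
    # print_parents affects only the output format, never the selection.
    return " ".join([c] + (rewritten_parents(c) if print_parents else []))


for commit in ["m", "C", "B", "A"]:
    print(format_line(commit, print_parents=True))
```

For the merge this prints m followed by B, C and A: the excluded 'd' is replaced by its ancestor A, as in the listing where the third parent column is b60c459. A is printed without parents because 'initial' is not included.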

Re: [RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Junio C Hamano
In reply to this post by Thomas Rast
Thomas Rast <[hidden email]> writes:

> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index ee6822a..0eaefd2 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -191,20 +191,6 @@ endif::git-rev-list[]
> ...
> +--

What was the meaning of the double-dash at the beginning of line in
AsciiDoc markup?  I forgot.

> +History Simplification
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +When optional paths are given, 'git-rev-list' simplifies merge and
> +non-merge commits separately.  First, all non-merge commits that do
> +not touch the given paths are marked as such.  We'll call them
> +'non-touching' commits, and all other commits 'touching'.
> +
> +Second, merges are simplified.  You can choose three levels.  We
> +illustrate the strategies with the following example history, where
> +touching commits are shown with capital letters and both B and C
> +contain the same changes:
> +
> +-----------------------------------------------------------------------
> + o -- A -- B -- m
> +     |\      /|
> +     | \- C -/ |
> +     \       /
> +      \-- d --/
> +-----------------------------------------------------------------------

Please draw it a bit more consistently with pictures in other existing
documentation, perhaps like this:

              d---.  
             /     \
        o---A---B---m
             \     /
              C---.

> +--prune-merges::
> +
> + This is the default.  A merge has its parents rewritten as
> + follows:
> ++
> + * All parents that do not have any touching ancestors are
> +  removed.
> ++
> + * Of a set of parents that agree on the path contents, only
> +  the first is kept.
> ++
> +In our example, we get the following:
> ++
> +-----------------------------------------------------------------------
> + o -- A -- B -- m
> +-----------------------------------------------------------------------

I'd rather make this part of the base text.  In other words, remove
the "--prune-merges" header, dedent the description and start the sentence
with "By default, parents of a merge are rewritten with the following
rules:".

Then before listing the options, say something like "You can influence how
simplification works using the following options".

> +--simplify-merges::
> +
> + For each commit C, compute its replacement in the final history:
> ++
> +* First compute the replacements of all parents of C, and
> +  rewrite C to have these parents.  Then remove parents that
> +  are either identical to or ancestors of an existing parent.
> ++
> +* If, after this simplification, the commit is touching, a root or
> +  merge commit, or marked as uninteresting, it remains.
> ++
> +In the example, history is simplified as follows.  (Note that while
> +'o' remains, it will not be output with '\--dense'.)
> ++

Also this option implies --full-history's true meaning, "do not cull side
branches even when they led to the same conclusion", together with
--parents' meaning, "do not drop merges that are necessary to keep the
rewritten history still connected".

> +-----------------------------------------------------------------------
> + o -- A -- B -- m
> +      \      /
> +       \- C -/
> +-----------------------------------------------------------------------

Same comment on the way the picture is drawn.

> +--full-history::
> +
> + Do not simplify merges at all.  Their ancestor lines are still
> + only shown if they have any touching commits, but the merges
> + themselves are always output.

With clarification from Linus yesterday, this would need rewording.  It is
not about "simplifying merges", but its main focus is "do not cull side
branches".  When --parents is in effect, merges need to be shown because
otherwise the resulting list of commits won't be connected, but otherwise
you are getting a flat list of commits and useless merges won't be shown.
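
The difference described here can be sketched on the example history from the first mail: --full-history always follows every parent, and whether a merge that is TREESAME to a parent is shown depends only on parent rewriting. This is a toy model with made-up names (PARENTS, DIR, full_history), not git's implementation.

```python
# Toy copy of the example history: commit -> parents (first parent first).
PARENTS = {"m": ["B", "C", "d"], "B": ["A"], "C": ["A"], "d": ["A"],
           "A": ["o"], "o": []}
# Content of `dir` in each commit; None means the path does not exist yet.
DIR = {"o": None, "A": "a", "B": "b", "C": "b", "d": "a", "m": "b"}


def full_history(tip, rewrite_parents):
    shown, todo, seen = [], [tip], set()
    while todo:
        c = todo.pop(0)
        if c in seen:
            continue
        seen.add(c)
        parents = PARENTS[c]
        base = [DIR[p] for p in parents] or [None]   # a root is compared to an empty tree
        treesame = DIR[c] in base
        # A TREESAME merge is kept only when the graph must stay connected.
        if not treesame or (rewrite_parents and len(parents) > 1):
            shown.append(c)
        todo.extend(parents)                         # never cull side branches
    return shown


print(sorted(full_history("m", rewrite_parents=False)))
print(sorted(full_history("m", rewrite_parents=True)))
```

Without rewriting the model selects A, B and C as a flat list; with rewriting the merge m is added, matching the --full-history listings with and without --parents in the first mail.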

Re: [RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Junio C Hamano
In reply to this post by Thomas Rast
Thomas Rast <[hidden email]> writes:

> [1] I still think "touching" was a pretty neat idea ;-)

Actually, I felt it was a horrible wording, because the first association
I got from the word was with "emotionally moved", and not about !TREESAME
which is "Does the commit modify the specified paths?" at all.


Re: [RFC PATCH] Documentation: rev-list-options: clarify history simplification with paths

Thomas Rast
In reply to this post by Junio C Hamano
Junio C Hamano <[hidden email]> wrote:
>

Thanks for taking the time to look into this!

> What was the meaning of the double-dash at the beginning of line in
> AsciiDoc markup?  I forgot.

I wish I knew.  I simply copied that from elsewhere in the docs to
make it shut up about an error.  It would seem that it is required to
end the list before a title, except if there's already an 'if' doing
the split, unless on a full moon, except if it's Wednesday.  I've
tried to make the patches compile with asciidoc (8.2.5 here), but
that's about as far as it goes.

I haven't found any mention of the magic features of '^--' in the user
manual, though the cheat sheet

  http://powerman.name/doc/asciidoc

has nice examples how to interrupt lists, which I used for the
upcoming patch contents.

> Please draw it a bit more consistently with pictures in other existing
> documentation, perhaps like this:

Hmm.  I've tried to give the new examples a more compact and round
appearance, like in your example.  Tell me if that works for you.

> I'd rather make this part of the base text.  In other words, remove
> the "--prune-merges" header, dedent the description and start the sentence
> with "By default, parents of a merge are rewritten with the following
> rules:".
>
> Then before listing the options, say something like "You can influence how
> simplification works using the following options".

I dropped the "--prune-merges" since it would be a new option.
However, I would like to keep some sort of "Default mode" header (not
necessarily as a list header, if you have better ideas).  Otherwise,
upon encountering "--full-history ... differs from the default", the
reader would have to (tediously) scan several paragraph breaks to
discover where the default begins.


I completely rewrote it along the outlines given in my own followup.
I also devised a better example that shows the differences between
all-TREESAME merges, one-TREESAME merges, and (!)TREESAME commits.  I
am open to further suggestions of course.  (I'm also violating the
"no patches after midnight" rule, so feel free to point out obvious
mistakes too.)

I furthermore split the patch into two halves:

* 2/3 applies on top of master, so that it is independent of
  --simplify-merges.

* 3/3 replaces the docs coming with --simplify-merges with an extra
  paragraph in 'History Simplification'.

I hope that's the right way to proceed.  This does mean you will get a
merge conflict when merging jc/post-simplify, but it's a fairly
obvious one.

1/3 is just a one-character typo that I discovered along the way.

Finally, I'm attaching a script that generates a repository with the
history used in the example.

- Thomas

--
Thomas Rast
[hidden email]



make-test-repo (952 bytes)

[PATCH 1/3] Documentation: rev-list-options: Fix a typo

Thomas Rast
Signed-off-by: Thomas Rast <[hidden email]>
---
 Documentation/rev-list-options.txt |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 3aa3809..83070ed 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -71,7 +71,7 @@ For example, if you have this topology:
          o---x---a---a  branch A
 -----------------------------------------------------------------------
 +
-you would get an output line this:
+you would get an output like this:
 +
 -----------------------------------------------------------------------
  $ git rev-list --left-right --boundary --pretty=oneline A...B
--
1.6.0.rc2.56.g86ca


[PATCH 2/3] Documentation: rev-list-options: Rewrite simplification descriptions for clarity

Thomas Rast
In reply to this post by Thomas Rast
This completely rewrites the documentation of --full-history with lots
of examples.

Signed-off-by: Thomas Rast <[hidden email]>
---
 Documentation/rev-list-options.txt |  153 ++++++++++++++++++++++++++++++++----
 1 files changed, 136 insertions(+), 17 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 83070ed..f8e5fb9 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -43,11 +43,13 @@ endif::git-rev-list[]
 
 --parents::
 
- Print the parents of the commit.
+ Print the parents of the commit.  Also enables parent
+ rewriting, see 'History Simplification' below.
 
 --children::
 
- Print the children of the commit.
+ Print the children of the commit.  Also enables parent
+ rewriting, see 'History Simplification' below.
 
 ifdef::git-rev-list[]
 --timestamp::
@@ -191,14 +193,6 @@ endif::git-rev-list[]
 
  Stop when a given path disappears from the tree.
 
---full-history::
-
- Show also parts of history irrelevant to current state of a given
- path. This turns off history simplification, which removed merges
- which didn't change anything at all at some child. It will still actually
- simplify away merges that didn't change anything at all into either
- child.
-
 --no-merges::
 
  Do not print commits with more than one parent.
@@ -281,18 +275,144 @@ See also linkgit:git-reflog[1].
  Output uninteresting commits at the boundary, which are usually
  not shown.
 
+--
+
+History Simplification
+~~~~~~~~~~~~~~~~~~~~~~
+
+When optional paths are given, 'git-rev-list' simplifies commits with
+various strategies, according to the options you have selected.
+
+Suppose you specified `foo` as the <paths>.  We shall call commits
+that modify `foo` !TREESAME, and the rest TREESAME.  (In a diff
+filtered for `foo`, they look different and equal, respectively.)
+
+In the following, we will always refer to the same example history to
+illustrate the differences between simplification settings.  We assume
+that you are filtering for a file `foo` in this commit graph:
+-----------------------------------------------------------------------
+  .-A---M---N---O---P
+ /     /   /   /   /
+ I     B   C   D   E
+ \   /   /   /   /
+  `-------------'
+-----------------------------------------------------------------------
+The horizontal line of history A--P is taken to be the first parent of
+each merge.  The commits are:
+
+* `I` is the initial commit, in which `foo` exists with contents
+  "asdf", and a file `quux` exists with contents "quux".  Initial
+  commits are compared to an empty tree, so `I` is !TREESAME.
+
+* In `A`, `foo` contains just "foo".
+
+* `B` contains the same change as `A`.  Its merge `M` is trivial and
+  hence TREESAME to all parents.
+
+* `C` does not change `foo`, but its merge `N` changes it to "foobar",
+  so it is not TREESAME to any parent.
+
+* `D` sets `foo` to "baz".  Its merge `O` combines the strings from
+  `N` and `D` to "foobarbaz"; i.e., it is not TREESAME to any parent.
+
+* `E` changes `quux` to "xyzzy", and its merge `P` combines the
+  strings to "quux xyzzy".  Despite appearing interesting, `P` is
+  TREESAME to all parents.
+
+'rev-list' walks backwards through history, including or excluding
+commits based on whether '\--full-history' and/or parent rewriting
+(via '\--parents' or '\--children') are used.  The following settings
+are available.
+
+Default mode::
+
+ Commits are included if they are not TREESAME to any parent
+ (though this can be changed, see '\--sparse' below).  If the
+ commit was a merge, and it was TREESAME to one parent, follow
+ only that parent.  (Even if there are several TREESAME
+ parents, follow only one of them.)  Otherwise, follow all
+ parents.
++
+This results in:
++
+-----------------------------------------------------------------------
+  .-A---N---O
+ /         /
+ I---------D
+-----------------------------------------------------------------------
++
+Note how the rule to only follow the TREESAME parent, if one is
+available, removed `B` from consideration entirely.  `C` was
+considered via `N`, but is TREESAME.  Root commits are compared to an
+empty tree, so `I` is !TREESAME.
++
+Parent/child relations are only visible with '\--parents', but that does
+not affect the commits selected in default mode, so we have shown the
+parent lines.
+
+--full-history without parent rewriting::
+
+ This mode differs from the default in one point: always follow
+ all parents of a merge, even if it is TREESAME to one of them.
+ Even if more than one side of the merge has commits that are
+ included, this does not imply that the merge itself is!  In
+ the example, we get
++
+-----------------------------------------------------------------------
+ I  A  B  N  D  O
+-----------------------------------------------------------------------
++
+`P` and `M` were excluded because they are TREESAME to a parent.  `E`,
+`C` and `B` were all walked, but only `B` was !TREESAME, so the others
+do not appear.
++
+Note that without parent rewriting, it is not really possible to talk
+about the parent/child relationships between the commits, so we show
+them disconnected.
+
+--full-history with parent rewriting::
+
+ Ordinary commits are only included if they are !TREESAME
+ (though this can be changed, see '\--sparse' below).
++
+Merges are always included.  However, their parent list is rewritten:
+Along each parent, prune away commits that are not included
+themselves.  This results in
++
+-----------------------------------------------------------------------
+  .-A---M---N---O---P
+ /     /   /   /   /
+ I     B   /   D   /
+ \   /   /   /   /
+  `-------------'
+-----------------------------------------------------------------------
++
+Compare to '\--full-history' without rewriting above.  Note that `E`
+was pruned away because it is TREESAME, but the parent list of `P` was
+rewritten to contain `E`'s parent `I`.  The same happened for `C` and
+`N`.  Note also that `P` was included despite being TREESAME.
+
+In addition to the above settings, you can change whether TREESAME
+affects inclusion:
+
 --dense::
+
+ Commits that are walked are included if they are not TREESAME
+ to any parent.
+
 --sparse::
 
-When optional paths are given, the default behaviour ('--dense') is to
-only output commits that changes at least one of them, and also ignore
-merges that do not touch the given paths.
+ All commits that are walked are included.
++
+Note that without '\--full-history', this still simplifies merges: if
+one of the parents is TREESAME, we follow only that one, so the other
+sides of the merge are never walked.
 
-Use the '--sparse' flag to makes the command output all eligible commits
-(still subject to count and age limitation), but apply merge
-simplification nevertheless.
 
 ifdef::git-rev-list[]
+Bisection Helpers
+~~~~~~~~~~~~~~~~~
+
 --bisect::
 
 Limit output to the one commit object which is roughly halfway between
@@ -342,7 +462,6 @@ after all the sorted commit objects, there will be the same text as if
 `--bisect-vars` had been used alone.
 endif::git-rev-list[]
 
---
 
 Commit Ordering
 ~~~~~~~~~~~~~~~
--
1.6.0.rc2.56.g86ca
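
To check the example against the rules in the patch, a toy model may
help.  The following is only a sketch in Python, not git's
implementation: the names PARENTS, FOO and walk() are made up for
illustration, and TREESAME is modelled simply as "`foo` has the same
contents as in that parent".

```python
# Toy model of the example history; not git's revision walker.
# Parents are listed first-parent first.
PARENTS = {
    "I": [], "A": ["I"], "B": ["I"], "C": ["I"], "D": ["I"], "E": ["I"],
    "M": ["A", "B"], "N": ["M", "C"], "O": ["N", "D"], "P": ["O", "E"],
}
# Contents of `foo` in each commit, as described in the commit list.
FOO = {
    "I": "asdf", "A": "foo", "B": "foo", "C": "asdf", "D": "baz",
    "E": "asdf", "M": "foo", "N": "foobar", "O": "foobarbaz",
    "P": "foobarbaz",
}


def treesame_parents(commit):
    """Parents whose `foo` equals the commit's `foo`, in parent order."""
    return [p for p in PARENTS[commit] if FOO[p] == FOO[commit]]


def walk(tip, full_history=False, sparse=False):
    """Return the set of commits selected when walking back from tip."""
    selected, seen, todo = set(), set(), [tip]
    while todo:
        commit = todo.pop()
        if commit in seen:
            continue
        seen.add(commit)
        same = treesame_parents(commit)
        # A root has no parents, so `same` is empty and it counts as
        # !TREESAME (it is compared to the empty tree).
        if sparse or not same:
            selected.add(commit)
        if same and not full_history:
            todo.append(same[0])          # follow only one TREESAME parent
        else:
            todo.extend(PARENTS[commit])  # follow every parent
    return selected


print(sorted(walk("P")))                     # default mode
print(sorted(walk("P", full_history=True)))  # --full-history, no rewriting
print(sorted(walk("P", sparse=True)))        # --sparse
```

In this model the default walk selects I, A, N, O and D, and the
full-history walk additionally picks up B, which agrees with the two
diagrams in the patch.  Parent rewriting is deliberately left out.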

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 3/3] Documentation: rev-list-options: move --simplify-merges documentation

Thomas Rast
In reply to this post by Thomas Rast
Fit the --simplify-merges documentation into the 'History Simplification'
section, including an example.

Signed-off-by: Thomas Rast <[hidden email]>
---
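
The three rewriting rules documented below can be tried out on the
example history with a small model.  This is a hedged sketch in
Python, not what revision.c does: PARENTS, FOO, ancestors() and
simplify_merges() are hypothetical names, boundary commits are not
modelled, and the initial "--full-history with parent rewriting" pass
is folded into the last rule (an ordinary TREESAME commit is simply
replaced by its only parent).

```python
# Toy model of the example history; parents are first-parent first.
PARENTS = {
    "I": [], "A": ["I"], "B": ["I"], "C": ["I"], "D": ["I"], "E": ["I"],
    "M": ["A", "B"], "N": ["M", "C"], "O": ["N", "D"], "P": ["O", "E"],
}
FOO = {
    "I": "asdf", "A": "foo", "B": "foo", "C": "asdf", "D": "baz",
    "E": "asdf", "M": "foo", "N": "foobar", "O": "foobarbaz",
    "P": "foobarbaz",
}


def is_treesame(commit):
    """TREESAME here means: `foo` is unchanged relative to some parent."""
    return any(FOO[p] == FOO[commit] for p in PARENTS[commit])


def ancestors(commit, parents):
    """All proper ancestors of commit in the graph `parents`."""
    found, todo = set(), list(parents[commit])
    while todo:
        c = todo.pop()
        if c not in found:
            found.add(c)
            todo.extend(parents[c])
    return found


def simplify_merges(tip):
    """Return (replacement of tip, rewritten parent lists)."""
    replacement, new_parents = {}, {}

    def simplify(commit):
        if commit in replacement:
            return replacement[commit]
        # Replace each parent by its simplification, dropping duplicates.
        rewritten = []
        for p in PARENTS[commit]:
            q = simplify(p)
            if q not in rewritten:
                rewritten.append(q)
        # Drop parents that are ancestors of other parents.
        rewritten = [
            p for p in rewritten
            if not any(p in ancestors(o, new_parents)
                       for o in rewritten if o != p)
        ]
        # Roots, merges and !TREESAME commits remain; anything else is
        # replaced with its only parent.
        if len(rewritten) == 1 and is_treesame(commit):
            replacement[commit] = rewritten[0]
        else:
            replacement[commit] = commit
            new_parents[commit] = rewritten
        return replacement[commit]

    return simplify(tip), new_parents


tip, graph = simplify_merges("P")
print(tip, graph)
```

In this model `P` is replaced by `O`, and `N` keeps only `M` as a
parent, which matches the simplified diagram in the patch.
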
 Documentation/rev-list-options.txt |   48 +++++++++++++++++++++++++++++++----
 1 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index c6b0bf1..1d0048e 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -193,12 +193,6 @@ endif::git-rev-list[]
 
  Stop when a given path disappears from the tree.
 
---simplify-merges::
-
- Simplify away commits that did not change the given paths, similar
- to `--full-history`, and further remove merges none of whose
- parent history changes the given paths.
-
 --no-merges::
 
  Do not print commits with more than one parent.
@@ -414,6 +408,48 @@ Note that without '\--full-history', this still simplifies merges: if
 one of the parents is TREESAME, we follow only that one, so the other
 sides of the merge are never walked.
 
+Finally, there is a fourth simplification mode available:
+
+--simplify-merges::
+
+ First, build a history graph in the same way that
+ '\--full-history' with parent rewriting does (see above).
++
+Then simplify each commit `C` to its replacement `C'` in the final
+history according to the following rules:
++
+--
+* Set `C'` to `C`.
++
+* Replace each parent `P` of `C'` with its simplification `P'`.  In
+  the process, drop parents that are ancestors of other parents, and
+  remove duplicates.
++
+* If after this parent rewriting, `C'` is a root or merge commit (has
+  zero or >1 parents), a boundary commit, or !TREESAME, it remains.
+  Otherwise, it is replaced with its only parent.
+--
++
+The effect of this is best shown by way of comparing to
+'\--full-history' with parent rewriting.  The example turns into:
++
+-----------------------------------------------------------------------
+  .-A---M---N---O
+ /     /       /
+ I     B       D
+ \   /       /
+  `---------'
+-----------------------------------------------------------------------
++
+Note the major differences in `N` and `P` over '\--full-history':
++
+--
+* `N`'s parent list had `I` removed, because it is an ancestor of the
+  other parent `M`.  Still, `N` remained because it is !TREESAME.
++
+* `P`'s parent list similarly had `I` removed.  `P` was then
+  removed completely, because it had one parent and is TREESAME.
+--
 
 ifdef::git-rev-list[]
 Bisection Helpers
--
1.6.0.rc2.56.g86ca
