run-command: output owner picking strategy

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

run-command: output owner picking strategy

William Duclot
Hi,
I stumbled upon this piece of code (run-command.c:pp_collect_finish()), picking the owner of the output amongst parallel processes (introduced by Stephan Beller in commit c553c72eed64b5f7316ce227f6d5d783eae6f2ed)

/*
 * Pick next process to output live.
 * NEEDSWORK:
 * For now we pick it randomly by doing a round
 * robin. Later we may want to pick the one with
 * the most output or the longest or shortest
 * running process time.
 */
for (i = 0; i < n; i++)
   if (pp->children[(pp->output_owner + i) % n].state == GIT_CP_WORKING)
      break;
pp->output_owner = (pp->output_owner + i) % n;


Would it be useful to improve this round-robin into something smarter (as stated by the NEEDSWORK)? It seems to be only used for submodules fetch/clone.

The options would be (as said in the comment):
1 - pick the process with the longest running process time
2 - pick the process with the shortest running process time
3 - pick the process for which the output buffer is the longest

But with one of those strategies, wouldn't we lose the advantage of having the same output order as a non-parallelized version? Cf the commit message.

What do you think ?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: run-command: output owner picking strategy

Stefan Beller-4
On Fri, May 20, 2016 at 6:11 AM, William Duclot
<[hidden email]> wrote:

> Hi,
> I stumbled upon this piece of code (run-command.c:pp_collect_finish()), picking the owner
> of the output amongst parallel processes (introduced by Stephan Beller in commit c553c72eed64b5f7316ce227f6d5d783eae6f2ed)
>
> /*
>  * Pick next process to output live.
>  * NEEDSWORK:
>  * For now we pick it randomly by doing a round
>  * robin. Later we may want to pick the one with
>  * the most output or the longest or shortest
>  * running process time.
>  */
> for (i = 0; i < n; i++)
>    if (pp->children[(pp->output_owner + i) % n].state == GIT_CP_WORKING)
>       break;
> pp->output_owner = (pp->output_owner + i) % n;
>
>
> Would it be useful to improve this round-robin into something smarter (as stated by the NEEDSWORK)? It seems to be only used for submodules fetch/clone.
>
> The options would be (as said in the comment):
> 1 - pick the process with the longest running process time
> 2 - pick the process with the shortest running process time
> 3 - pick the process for which the output buffer is the longest
>
> But with one of those strategies, wouldn't we lose the advantage of having the same output order as a non-parallelized version? Cf the commit message.
>
> What do you think ?

When running in parallel we already may be out of order
(relative to serial processing). See the second example in the
commit message to produce a different order.

Consider we scheduled tasks to be run in 3 parallel processes:
(As we NEEDSWORK comment only addresses the ouput selection,
let's assume this is a fixes schedule, which we cannot alter.
Which is true if we only change the code you quoted. That picks
the process to output.)


    process 1: |---A---||-E----------|

    process 2: |-B-||-----------D----|

    process 3: |-------C-------||-F-|

output order:    A, B, D, C, F, E
timing:        |---A---|{B}|------D----|{C,F,E}

All outputs with {braces} are instant from buffers,
and output surrounded by |--dashes--| are live.

The output is produced by the current algorithm:
(1) Start with process 1 (A) whose output will be live
(2) Once A is done, flush all other done things, (B)
(3) live output will be round robin, so process 2 (D)
(4) Once D is done, flush all other done things (C, F, E)
    in order of who finshed first


(1) is uncontroversial. We have no information about tasks A,B,C,
    so pick a random candidate. We hardcoded process 1 for now.

(2) also uncontroversial IMHO. There is not much we can do different.

(3) is what this NEEDSWORK comment is about. Instead of outputting D
    we might have choosen C. (for $REASONS, e.g.: C is running longer than
    D already, so we expect it to finish sooner, by assuming
    any task takes the same expected time to finish. And as C
    is expected to finish earlier than D, we may have smoother
    output. "Less buffered bursts")

Then the output looks like this:

output order: A B C D F E
timing:        |---A---|{B}|-C-||-D-|{F,E}
(Same notation as above)

This seems to be better than the current behavior as we have more
different tasks with "live" output, i.e. you see stuff moving.
I made up the data to make the point though. We would need to use
live data and experiment with different strategies to find a
good/better solution.

Another way to do output would be to not round robin but keep outputting
process 1 live (different than current behavior, and not optimal IMHO)

output order: A B E C D F
timing:        |---A---|{B}|-----E----|{C,D,F}
(Same notation as above)

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: run-command: output owner picking strategy

William Duclot
> When running in parallel we already may be out of order
> (relative to serial processing). See the second example in the
> commit message to produce a different order.

Right, I could (should) have understood that by myself.
 
> Consider we scheduled tasks to be run in 3 parallel processes:
> (As we NEEDSWORK comment only addresses the ouput selection,
> let's assume this is a fixes schedule, which we cannot alter.
> Which is true if we only change the code you quoted. That picks
> the process to output.)
>
> [...]
 

> The output is produced by the current algorithm:
> (1) Start with process 1 (A) whose output will be live
> (2) Once A is done, flush all other done things, (B)
> (3) live output will be round robin, so process 2 (D)
> (4) Once D is done, flush all other done things (C, F, E)
>     in order of who finshed first
>
>
> (1) is uncontroversial. We have no information about tasks A,B,C,
>     so pick a random candidate. We hardcoded process 1 for now.
>
> (2) also uncontroversial IMHO. There is not much we can do different.

Agreed
 

> (3) is what this NEEDSWORK comment is about. Instead of outputting D
>     we might have choosen C. (for $REASONS, e.g.: C is running longer than
>     D already, so we expect it to finish sooner, by assuming
>     any task takes the same expected time to finish. And as C
>     is expected to finish earlier than D, we may have smoother
>     output. "Less buffered bursts")
>
> [...]
>
> This seems to be better than the current behavior as we have more
> different tasks with "live" output, i.e. you see stuff moving.
> I made up the data to make the point though. We would need to use
> live data and experiment with different strategies to find a
> good/better solution.

We should probably settle on what is the behavior we want to obtain,
before trying to find a strategy to implement (or approximate) it:
- Do we want to be as close as possible to a serial processing output?
- Do we want to see as much live output as possible?

I do not think that being close to serial processing is a relevant
behavior: we applied an arbitrary order to tasks when naming them for
explanations (A, B, C...), but the tasks aren't really sorted in any
way (and that's why the parallelization is relevant).Neither the user
nor git have any interest in getting these ouputs in a specific order.

Therefore, a "as much live output as possible" behavior would be more
sensible. But I wonder: is there a worthy benefit in optimizing the
output owner strategy? I'm not used to working with submodules, but I
don't think that having a great number of submodules is a common thing.
Basically: we could solve a problem, but is there a problem?
I'm not trying to bury this NEEDSWORK, I'd be happy to look into it if
need be!
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: run-command: output owner picking strategy

Stefan Beller-4
On Fri, May 20, 2016 at 11:29 AM, William Duclot
<[hidden email]> wrote:

>> When running in parallel we already may be out of order
>> (relative to serial processing). See the second example in the
>> commit message to produce a different order.
>
> Right, I could (should) have understood that by myself.
>
>> Consider we scheduled tasks to be run in 3 parallel processes:
>> (As we NEEDSWORK comment only addresses the ouput selection,
>> let's assume this is a fixes schedule, which we cannot alter.
>> Which is true if we only change the code you quoted. That picks
>> the process to output.)
>>
>> [...]
>
>> The output is produced by the current algorithm:
>> (1) Start with process 1 (A) whose output will be live
>> (2) Once A is done, flush all other done things, (B)
>> (3) live output will be round robin, so process 2 (D)
>> (4) Once D is done, flush all other done things (C, F, E)
>>     in order of who finshed first
>>
>>
>> (1) is uncontroversial. We have no information about tasks A,B,C,
>>     so pick a random candidate. We hardcoded process 1 for now.
>>
>> (2) also uncontroversial IMHO. There is not much we can do different.
>
> Agreed
>
>> (3) is what this NEEDSWORK comment is about. Instead of outputting D
>>     we might have choosen C. (for $REASONS, e.g.: C is running longer than
>>     D already, so we expect it to finish sooner, by assuming
>>     any task takes the same expected time to finish. And as C
>>     is expected to finish earlier than D, we may have smoother
>>     output. "Less buffered bursts")
>>
>> [...]
>>
>> This seems to be better than the current behavior as we have more
>> different tasks with "live" output, i.e. you see stuff moving.
>> I made up the data to make the point though. We would need to use
>> live data and experiment with different strategies to find a
>> good/better solution.
>
> We should probably settle on what is the behavior we want to obtain,
> before trying to find a strategy to implement (or approximate) it:
> - Do we want to be as close as possible to a serial processing output?
> - Do we want to see as much live output as possible?
>
> I do not think that being close to serial processing is a relevant
> behavior: we applied an arbitrary order to tasks when naming them for
> explanations (A, B, C...), but the tasks aren't really sorted in any
> way (and that's why the parallelization is relevant).Neither the user
> nor git have any interest in getting these ouputs in a specific order.

IIRC In serial processing the output was according to the sort order
within the tree. I agree that this sorting property is of no value to the user.

>
> Therefore, a "as much live output as possible" behavior would be more
> sensible.

I choose "as much live output" as an approximation of "least amount buffered
over time, i.e. if you were to integrate the buffer size over time
that should be
minimized. (c.f. users waiting for output: http://imgur.com/gallery/lhjhbB9)
I am not sure if that is ultimate thing to optimize for though.

> But I wonder: is there a worthy benefit in optimizing the
> output owner strategy?

Eventually there are more users than just submodules for this
parallel processing machinery, I would hope. They would also benefit
of a good fundament?

> I'm not used to working with submodules, but I
> don't think that having a great number of submodules is a common thing.

(Not yet, because of the chicken and egg problem: submodule UI is not
as polished because very few people use it. And few people use it because
of confusing UI. ;)

At his GitMerge2016 talk, Lars Schneider proposed a guideline to
not use more than 25 submodules as it "doesn't scale" IIRC.
And that resentment seems to be all over the place.

> Basically: we could solve a problem, but is there a problem?
> I'm not trying to bury this NEEDSWORK, I'd be happy to look into it if
> need be!

Well Git as a community doesn't ask you to solve any problems. ;)
So if you have fun thinking about scheduling problems (and as you do
it as part of a university project, if Matthieu is happy about this problem
also), go for it :)

If you find another more "interesting" problem (either as defined by
you personal interests or by the possible impact or by possible grading),
choose that?

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: run-command: output owner picking strategy

Junio C Hamano
Stefan Beller <[hidden email]> writes:

> I choose "as much live output" as an approximation of "least amount buffered
> over time, i.e. if you were to integrate the buffer size over time
> that should be
> minimized. (c.f. users waiting for output: http://imgur.com/gallery/lhjhbB9)
> I am not sure if that is ultimate thing to optimize for though.

It might be interesting to optimize to minimize the time the user
needs to stare an inactive screen, i.e. make sure there always is
something flowing.  That may involve throttling the output from a
buffer if that is the only buffer with any output, the foreground
thing hasn't made any output yet, and expected to stay silent for a
while, etc.



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Loading...