Consistent terminology: cached/staged/index

Re: Consistent terminology: cached/staged/index

Jeff King
On Sat, Feb 26, 2011 at 11:09:14PM +0200, Felipe Contreras wrote:

> On Tue, Feb 15, 2011 at 1:19 AM, Jonathan Nieder <[hidden email]> wrote:
> > When people talk about the staging area I tend to get confused.  I
> > think there's an idea that because it sounds more concrete, there is
> > less to explain --- or maybe I am just wired the wrong way.
>
> I don't like the phrase "staging area". A "stage" already has an area.
> You put things on the stage. Sometimes there are multiple stages.

As a native English speaker, this makes no sense to me. A stage as a
noun is either:

  1. a raised platform where you give performances

  2. a phase that some process goes through (e.g., "the early stages of
     Alzheimer's disease")

Whereas the term "staging area" is a stopping point on a journey for
collecting and organizing items. I couldn't find a definite etymology
online, but it seems to be military in origin (e.g., you would send all
your tanks to a staging area, then once assembled and organized, begin
your attack). You can't just call it "staging", which is not a noun, and
the term "stage" is not a synonym. "Staging area" has a very particular
meaning.

So the term "staging area" makes perfect sense to me; it is where we
collect changes to make a commit. I am willing to accept that it does not
to others (native English speakers or no), and that we may need to come
up with a better term. But I think just calling it "the stage" is even
worse; it loses the concept that it is a place for collecting and
organizing.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [hidden email]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Consistent terminology: cached/staged/index

Miles Bader-2
Jeff King <[hidden email]> writes:
> So the term "staging area" makes perfect sense to me; it is where we
> collect changes to make a commit. I am willing to accept that it does not
> to others (native English speakers or no), and that we may need to come
> up with a better term. But I think just calling it "the stage" is even
> worse; it loses the concept that it is a place for collecting and
> organizing.

Agreed.

"Staging area" is a good noun (phrase) for this.  "Stage" is a good verb
(for "move into the staging area"), but isn't intuitive as a noun.

-miles

--
In New York, most people don't have cars, so if you want to kill a person, you
have to take the subway to their house.  And sometimes on the way, the train
is delayed and you get impatient, so you have to kill someone on the subway.
  [George Carlin]

Re: Consistent terminology: cached/staged/index

Drew Northup
In reply to this post by Felipe Contreras

On Sat, 2011-02-26 at 22:36 +0200, Felipe Contreras wrote:

> On Thu, Feb 17, 2011 at 1:11 AM, Drew Northup <[hidden email]> wrote:
> >
> > On Sun, 2011-02-13 at 19:09 -0800, Pete Harlan wrote:
> >> On 02/13/2011 02:58 PM, Junio C Hamano wrote:
> >> >> --staged
> >> >> ~~~~~~~~
> >> >> diff takes --staged, but that is only to support some people's habits.
> >> > The term "stage" comes from "staging area", a term people used to explain
> >> > the concept of the index by saying "The index holds set of contents to be
> >> > made into the next commit; it is _like_ the staging area".
> >> >
> >> > My feeling is that "to stage" is primarily used, outside "git" circle, as
> >> > a logistics term.  If you find it easier to visualize the concept of the
> >> > index with "staging area" ("an area where troops and equipment in transit
> > >> > are assembled before a military operation"), you may find it easier to say
> >> > "stage this path ('git add path')", instead of "adding to the set of
> >> > contents...".
> >>
> >> FWIW, when teaching Git I have found that users immediately understand
> >> "staging area", while "index" and "cache" confuse them.
> >>
> >> "Index" means to them a numerical index into a data structure.
> >> "Cache" is a local copy of something that exists remotely.  Neither
> >> word describes the concept correctly from a user's perspective.
> >
> > According to the dictionary (actually, more than one) "cache" is a
> > hidden storage space. I'm pretty sure that's the sense most global and
> > therefore most appropriate to thinking about Git. (It certainly
> > describes correctly what web browser cache and on-CPU cache is doing.)
> > One would only think the definition you gave applied if they didn't know
> > that squirrels "cache" nuts. I don't think that the problem is the
> > idiom.
>
> Not really. If a squirrel "caches" nuts, it means a squirrel is
> putting them in a hidden place to save them for future use. So, in the
> future, if said squirrel wants a nut, it doesn't have to look for it
> in the trees, just go to the cache. So the cache makes it easier to
> access whatever you want.
>
> IOW; if you don't cache something, you would have more trouble getting
> it, but you still can.
>
> That's not what Git is doing. Git is not putting changes in a place so
> they can be more easily accessed in the future. It is using a temporary
> device that allows the commit to be built through an extended period
> of time. It's not a cache.

As I noted earlier, "cache" classically has nothing whatsoever to do
with temporality, it is a descriptor of visibility. Any notion of
temporality or intentionality is imposed by the reader. THAT'S THE
PROBLEM.

> >> I learned long ago to type "index" and "cached", but when talking (and
> >> thinking) about Git I find "the staging area" gets the point across
> >> very clearly and moves Git from interesting techie-tool to
> >> world-dominating SCM territory.  I'm surprised that that experience
> >> isn't universal.
> >
> > Perhaps that helps you associate it with other SCM/VCS software, but it
> > didn't help me. When I realized that the "index" is called that BECAUSE
> > IT IS AN INDEX (of content/data states for a pending commit operation)
> > the sky cleared and the sun came out.
>
> That's not an index. An index is a guide of pointers to something
> else. It allows you to find whatever you are looking for by looking in
> small table of pointers instead of looking through all the samples.
>
> IOW; if you don't index something, you would have more trouble finding
> it, but you still can.
>
> That's not what Git is doing.

Index: "That which guides, points out, informs, or directs" [1913
Edition Webster's Dictionary--new one says something pretty similar if
not the same].
As far as I can tell Git is using the "Index" to do just that. Again, I
am discarding all notions of connotation here and focusing solely on the
denotation of the word. Besides, it is still possible to build a commit
with git without the "Index"; it is a real royal pain--and not the least
advisable for day-to-day use.
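
A minimal sketch of that last claim, under stated assumptions: a commit object names a tree, and a tree names blobs, so a commit id can be derived from file contents alone without any index being involved. The file name, content, and identity string below are hypothetical, and the tree is assumed to be flat with regular files only. In a real repository the plumbing commands `git hash-object -w`, `git mktree`, and `git commit-tree` cover the same steps.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Return the SHA-1 id Git assigns to an object of the given kind."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: dict[str, bytes]) -> bytes:
    """Serialize a flat tree of regular files (name -> file content)."""
    body = b""
    for name in sorted(entries, key=str.encode):
        blob_id = git_object_id("blob", entries[name])
        # Each entry is "<mode> <name>\0" followed by the raw 20-byte id.
        body += b"100644 " + name.encode() + b"\0" + bytes.fromhex(blob_id)
    return body


def commit_body(tree_id: str, message: str, ident: str) -> bytes:
    """Serialize a parentless commit that points at tree_id."""
    lines = [f"tree {tree_id}", f"author {ident}", f"committer {ident}", "", message]
    return ("\n".join(lines) + "\n").encode()


files = {"README": b"hello\n"}                    # hypothetical content
tree_id = git_object_id("tree", tree_body(files))
ident = "A U Thor <author@example.com> 0 +0000"   # hypothetical identity
commit_id = git_object_id("commit", commit_body(tree_id, "initial", ident))
print(tree_id, commit_id)
```

The sketch only computes ids; it does not write anything into a repository, which is part of why doing this by hand is impractical for day-to-day use.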

> > In all reality the closest thing Git has to an actual staging area is
> > all of the objects in .git/objects only recorded by the index itself.
> > Git-stored objects not compressed into pack files could technically be
> > described as "cached" using the standard definition--they aren't visible
> > in the working directory. Unfortunately this probably just muddies the
> > water for all too many users.
>
> That's irrelevant. You can implement the same functionality in many
> other ways. How it is implemented doesn't matter, what matters is what
> the user experiences.

Please re-read what I said, more slowly and without notion of previous
disagreement if you can muster it. We both agree that the notion of
caching here is superfluous to most users. Alas, I am not one to say
that what any one user experiences should dictate to us how all users
SHOULD experience Git. It is fairly clear to me that isn't what is
currently happening and any efforts to force the matter thus far haven't
helped matters much if at all.

> > So, in summary--the index is real, objects "cached" pending
> > commit/cleanup/packing are real; any "staging area" is a rhetorical
> > combination of the two. Given that rhetorical device may not work in all
> > languages (as Junio mentioned earlier) I don't recommend that we rely on
> > it.
>
> Branches and tags are "rhetorical" devices as well. But behind scenes
> they are just refs. Shall we disregard 'branch' and 'tag'?
>
> No. What Git does behind scenes is irrelevant to the user. What
> matters is what the device does, not how it is implemented; the
> implementation might change. "Stage" is the perfect word; both verb
> and a noun that express a temporary space where things are prepared
> for their final form.

Yes they (branches and tags) are. They also have a "physical"
manifestation. A "staging area" does not. This obviously is of little
importance to you (as a user--I know you do more than that), but would
matter a great deal to somebody like myself currently mulling over how
to craft a contribution to this project.

Alas, as Junio pointed out earlier, "stage" is a metaphor of limited
utility (it also means a large number of things in English alone--I tend
to think of theaters and not states when I read it). In fact, it opens
up more questions: "Staged where? In a cache. Where is the cache? It
doesn't really exist, but it is a combination of the Index and
under-referenced objects in the object store acting as a cache. Why? How
does it do that?....." We are therefore where we started. Users are just
as confused as they were before, and we're looking for a good watering
hole to cluster at and come up with a better way to explain it without
getting into the gritty details.

Details sometimes matter, sometimes they don't, and much more often the
reality is halfway between the two. Currently I think that Git is in
that middle state. Discarding outright the notion of the Index and of
caching doesn't make sense (as, at some level, that's what's happening),
yet staging isn't perfect either. That's my point.

(Please also see my pending reply to Jeff's missive from 8:43 UTC
today.)

--
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

Re: Consistent terminology: cached/staged/index

Drew Northup
In reply to this post by Jeff King

On Sun, 2011-02-27 at 03:43 -0500, Jeff King wrote:

> On Sat, Feb 26, 2011 at 11:09:14PM +0200, Felipe Contreras wrote:
>
> > On Tue, Feb 15, 2011 at 1:19 AM, Jonathan Nieder <[hidden email]> wrote:
> > > When people talk about the staging area I tend to get confused.  I
> > > think there's an idea that because it sounds more concrete, there is
> > > less to explain --- or maybe I am just wired the wrong way.
> >
> > I don't like the phrase "staging area". A "stage" already has an area.
> > You put things on the stage. Sometimes there are multiple stages.
>
> As a native English speaker, this makes no sense to me. A stage as a
> noun is either:
>
>   1. a raised platform where you give performances
>
>   2. a phase that some process goes through (e.g., "the early stages of
>      Alzheimer's disease")

I definitely appreciate this notion. The equivalence of "stage ===
status of something, given place and or time" is itself metaphorical in
nature. I don't know how translatable the idiom is.

> Whereas the term "staging area" is a stopping point on a journey for
> collecting and organizing items. I couldn't find a definite etymology
> online, but it seems to be military in origin (e.g., you would send all
> your tanks to a staging area, then once assembled and organized, begin
> your attack). You can't just call it "staging", which is not a noun, and
> the term "stage" is not a synonym. "Staging area" has a very particular
> meaning.

I would have to check, but I believe you would find it linked to
metaphorical language about the "stage on which a battle is
fought" (battleground) and the fact that forces are sometimes organized
into formation--as they would appear upon a stage--in such an area
(before a parade or a march, for instance).

> So the term "staging area" makes perfect sense to me; it is where we
> collect changes to make a commit. I am willing to accept that it does not
> to others (native English speakers or no), and that we may need to come
> up with a better term. But I think just calling it "the stage" is even
> worse; it loses the concept that it is a place for collecting and
> organizing.
>
> -Peff

The concept of a "staging area" is definitely of limited use for many of
us attempting to learn how git works. The very fact that the object
cache and the Index (or multiple, as is useful at times) are distinct
elements is useful and should be mentioned somewhere. Alas, creating in
the user's mind that there is a distinct unified "staging area" acts
against this dissemination of knowledge. It definitely didn't help me.

If we use "staging area made up of the object store and information kept
in the Index" then we tie a knot on everything, make it clear that it
may be more complex than that--and you don't have to care, and we do not
foreclose on the possibility of more complete explanation later. That
does not bother me. We do however need to recognize that "staging area"
is an idiom of limited portability and deal with that appropriately.

A particular Three Stooges episode comes to mind here for me. The Three,
in one scene, are getting dressed up to go to an estate (a relative of
one of them has died) to collect an inheritance. They are jumping up and
down yelling "We're gonna get rich!" in the English original. However,
the only timing-appropriate thing the translator could
think of when producing the Spanish voice-over was "Vamos a
vestirse" (we're going to get dressed). Obviously this made them seem
like more utter fools than they were, but equally obviously the meaning
of the idiom "gonna get rich" was lost on the translator. This is what
has been replaying in my mind since Junio brought up the limited
portability of the notion of a "staging area" a little while back. He's
right--many idioms do not survive translation. This is why we need
to make the documentation robust and technically correct while also
attempting to be nice to new users.

--
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

Re: Consistent terminology: cached/staged/index

Phil Hord (hordp)
In reply to this post by Felipe Contreras
On 02/26/2011 04:09 PM, Felipe Contreras wrote:
> I don't like the phrase "staging area". A "stage" already has an area.
> You put things on the stage. Sometimes there are multiple stages.

A "staging area" (idiomatically, perhaps) is a location where things are
collected to be organized before deployment.  Sounds a lot like our index.

http://en.wikipedia.org/wiki/Staging_area

> If only a subset of the files are there, it's an 'index', if not, then
> I'd say it's a 'registry'. Anyway, it's something the user shouldn't
> care about.

When we pack up our kayak club for a trip, we stage equipment we're
bringing.  Eventually we make a decision about which equipment is going
and which is staying.  The decision is codified by the equipment we
leave in the staging area versus the equipment we remove to local
storage.  Everyone seems to understand the term when we use it in this
context.

I think the parade analogy is also pretty common.

I like "staging area(n)/stage(v)" better than "index" or "cache" because
of the connotation in English.  But if it doesn't translate well, the
search may need to go on.  Maybe we can fall back on stdc methods and
invent generic terms like strcpy.  How about "xnar"?

Phil


Re: Consistent terminology: cached/staged/index

Aghiles-2
In reply to this post by Pete Harlan
> FWIW, when teaching Git I have found that users immediately understand
> "staging area", while "index" and "cache" confuse them.

FWIW, same here.

-- aghiles

Re: Consistent terminology: cached/staged/index

Jon Seymour
In reply to this post by Miles Bader-2
On Sun, Feb 27, 2011 at 8:21 PM, Miles Bader <[hidden email]> wrote:

> Jeff King <[hidden email]> writes:
>> So the term "staging area" makes perfect sense to me; it is where we
>> collect changes to make a commit. I am willing to accept that it does not
>> to others (native English speakers or no), and that we may need to come
>> up with a better term. But I think just calling it "the stage" is even
>> worse; it loses the concept that it is a place for collecting and
>> organizing.
>
> Agreed.
>
> "Staging area" is a good noun (phrase) for this.  "Stage" is a good verb
> (for "move into the staging area"), but isn't intuitive as a noun.
>

When used to describe a pre-production environment, the noun in my experience
is inevitably 'staging' (short for staging environment) rather than 'stage',
which is consistent with the origin Jeff posits.

I guess the noun 'stage' does have a use in git-speak to refer to the
different arms of an unresolved merge.

jon.

Re: Consistent terminology: cached/staged/index

Junio C Hamano
Jon Seymour <[hidden email]> writes:

> I guess the noun 'stage' does have a use in git-speak to refer to the
> different arms of an unresolved merge.

That is correct.

For some historical background around "cache" and "index", this

  http://thread.gmane.org/gmane.comp.version-control.git/780/focus=924

may shed some light.

    From: Linus Torvalds <[hidden email]>
    Subject: Re: [RFC] Possible strategy cleanup for git add/remove/diff etc.
    Date: Tue, 19 Apr 2005 18:51:06 -0700 (PDT)
    Message-ID: <[hidden email]>

    That is indeed the whole point of the index file. In my world-view, the
    index file does _everything_. It's the staging area ("work file"), it's
    the merging area ("merge directory") and it's the cache file ("stat
    cache").

And this one:

  http://thread.gmane.org/gmane.comp.version-control.git/6670/focus=6863

is even more illuminating.

Notice that the word "staging area" is used in the old article as a way to
explain one of the three important aspects of the index, and the other
article that is about nailing down the terminology, the word does not even
come into the picture at all (one reason being that it will confuse
readers if "staging area" is used too casually in a document to precisely
define terminology, which needs to explain the merge stage(s) in the
index).
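
The point about merge stages can be sketched with a toy model. This is an illustration under assumptions, not Git's actual index format: each entry carries a stage number (0 for a normal entry; 1, 2, and 3 for the common ancestor, "ours", and "theirs" versions during an unresolved merge) alongside a stand-in for the cached stat data. The `Entry` class, the helper names, and the blob ids are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    path: str
    stage: int      # 0 = normal; 1 = common ancestor, 2 = ours, 3 = theirs
    blob_id: str    # placeholder ids, not real SHA-1 values
    mtime: int = 0  # stands in for the cached stat data


def unmerged_paths(index: list[Entry]) -> set[str]:
    """Paths that still have higher-stage entries, i.e. unresolved conflicts."""
    return {e.path for e in index if e.stage != 0}


def resolve(index: list[Entry], path: str, blob_id: str) -> list[Entry]:
    """Replace all stages of path with a single stage-0 entry."""
    kept = [e for e in index if e.path != path]
    return sorted(kept + [Entry(path, 0, blob_id)], key=lambda e: (e.path, e.stage))


index = [
    Entry("Makefile", 0, "aaa"),
    Entry("main.c", 1, "base"),
    Entry("main.c", 2, "ours"),
    Entry("main.c", 3, "theirs"),
]
print(unmerged_paths(index))    # {'main.c'}
index = resolve(index, "main.c", "merged")
print(unmerged_paths(index))    # set()
```

In this model "stage" as a noun already has a precise meaning inside the index, which is the collision with "staging area" described above.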

Re: Consistent terminology: cached/staged/index

MichaelJGruber
Junio C Hamano venit, vidit, dixit 28.02.2011 00:57:

> Jon Seymour <[hidden email]> writes:
>
>> I guess the noun 'stage' does have a use in git-speak to refer to the
>> different arms of an unresolved merge.
>
> That is correct.
>
> For some historical background around "cache" and "index", this
>
>   http://thread.gmane.org/gmane.comp.version-control.git/780/focus=924
>
> may shed some light.
>
>     From: Linus Torvalds <[hidden email]>
>     Subject: Re: [RFC] Possible strategy cleanup for git add/remove/diff etc.
>     Date: Tue, 19 Apr 2005 18:51:06 -0700 (PDT)
>     Message-ID: <[hidden email]>
>
>     That is indeed the whole point of the index file. In my world-view, the
>     index file does _everything_. It's the staging area ("work file"), it's
>     the merging area ("merge directory") and it's the cache file ("stat
>     cache").
>
> And this one:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/6670/focus=6863
>
> is even more illuminating.
>
> Notice that the word "staging area" is used in the old article as a way to
> explain one of the three important aspects of the index, and the other
> article that is about nailing down the terminology, the word does not even
> come into the picture at all (one reason being that it will confuse
> readers if "staging area" is used too casually in a document to precisely
> define terminology, which needs to explain the merge stage(s) in the
> index).

Oh, the classics :)

Thanks for an illuminating and entertaining read!

Michael

Re: Consistent terminology: cached/staged/index

Drew Northup
In reply to this post by Aghiles-2

On Sun, 2011-02-27 at 16:16 -0500, Aghiles wrote:
> > FWIW, when teaching Git I have found that users immediately understand
> > "staging area", while "index" and "cache" confuse them.
>
> FWIW, same here.

I would really like to hear the actual presentation. What one says in
person in front of a classroom and what one puts in a manpage are
frequently not the same thing--and there's a good reason for that.
If nothing else, we could come up with a better presentation at the end!

--
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

Re: Consistent terminology: cached/staged/index

Jeff King
In reply to this post by Drew Northup
On Sun, Feb 27, 2011 at 10:34:00AM -0500, Drew Northup wrote:

> The concept of a "staging area" is definitely of limited use for many of
> us attempting to learn how git works. The very fact that the object
> cache and the Index (or multiple, as is useful at times) are distinct
> elements is useful and should be mentioned somewhere.

Now your terminology has _me_ confused. What is the "object cache"?

> Alas, creating in the user's mind that there is a distinct unified
> "staging area" acts against this dissemination of knowledge. It
> definitely didn't help me.

I'm not sure what you mean by "distinct unified staging area". It is a
conceptual idea that you will put your changes somewhere, and when they
look good to you, then you will finalize them in some way.

But note that it is a mental model. The fact that it is implemented
inside the index, along with the stat cache, doesn't need to be relevant
to the user. And the fact that the actual content is in the object
store, with sha1-identifiers in the index, is not relevant either. At
least I don't think so, and I am usually of the opinion that we should
expose the data structures to the user, so that their mental model can
match what is actually happening. But in this case, I think they can
still have a pretty useful but simpler mental model.
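
For readers who do want the detail being set aside here, a minimal sketch, assuming a toy in-memory model rather than Git's on-disk formats: staging a file stores its content in an object store keyed by the blob's SHA-1, and the index records only that id against the path. The names `object_store`, `index`, and `stage` are hypothetical; the blob id formula (SHA-1 over `blob <size>\0` plus the content) is the one Git uses.

```python
import hashlib

object_store: dict[str, bytes] = {}   # blob id -> content
index: dict[str, str] = {}            # path -> blob id


def blob_id(content: bytes) -> str:
    """SHA-1 over the 'blob <size>\\0' header followed by the content."""
    header = f"blob {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def stage(path: str, content: bytes) -> None:
    """Toy 'git add': store the content, then record only its id in the index."""
    oid = blob_id(content)
    object_store[oid] = content
    index[path] = oid


stage("notes.txt", b"")
print(index["notes.txt"])   # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The simpler mental model ("put changes somewhere, finalize them later") never needs to mention either dictionary, which is the argument made above.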

> If we use "staging area made up of the object store and information kept
> in the Index" then we tie a knot on everything, make it clear that it
> may be more complex than that--and you don't have to care, and we do not
> foreclose on the possibility of more complete explanation later. That
> does not bother me. We do however need to recognize that "staging area"
> is an idiom of limited portability and deal with that appropriately.

Sure, I'm willing to accept that the specific words of the idiom aren't
good for people with different backgrounds.

One analogy I like for the index is that it's a bucket. It starts out
full of files from the last commit. You can put new, changed files in
the bucket. When it looks good, you dump the bucket into a commit. You
can have multiple buckets if you want. You can pull files from other
commits and put them in the bucket. You can take files out of the bucket
and put them in your work tree.

So maybe it should just be called "the bucket"?

I'm not sure that's a good idea, because while the analogy makes sense,
it doesn't by itself convey any meaning. That is, knowing the concept, I
can see that bucket is a fine term. But hearing about git's bucket, I
have no clue what it means. Whereas "staging area" I think is a bit more
specific, _if_ you know what a staging area is.

So there are two questions:

  1. Is there a more universal term that means something like "staging
     area"?

  2. Is the term "staging area", while meaningful to some, actually
     _worse_ to others than a term like "bucket"? That is, does it sound
     complex and scary, when it is really a simple thing. And while
     people won't know what the "git bucket" is off the bat, it is
     relatively easy to learn.

     And obviously, replace "bucket" here with whatever term makes more
     sense.

> A particular Three Stooges episode comes to mind here for me.

Wow, 180,000 messages and this is somehow the first Three Stooges
analogy on the git list.

-Peff

Re: Consistent terminology: cached/staged/index

Victor Engmark-2
In reply to this post by Drew Northup
On 03/01/2011 12:03 AM, Jeff King wrote:
> On Sun, Feb 27, 2011 at 10:34:00AM -0500, Drew Northup wrote:

> One analogy I like for the index is that it's a bucket. It starts out
> full of files from the last commit. You can put new, changed files in
> the bucket. When it looks good, you dump the bucket into a commit. You
> can have multiple buckets if you want. You can pull files from other
> commits and put them in the bucket. You can take files out of the bucket
> and put them in your work tree.
>
> So maybe it should just be called "the bucket"?
>
> I'm not sure that's a good idea, because while the analogy makes sense,
> it doesn't by itself convey any meaning. That is, knowing the concept, I
> can see that bucket is a fine term. But hearing about git's bucket, I
> have no clue what it means. Whereas "staging area" I think is a bit more
> specific, _if_ you know what a staging area is.
>
> So there are two questions:
>
>   1. Is there a more universal term that means something like "staging
>      area"?
>
>   2. Is the term "staging area", while meaningful to some, actually
>      _worse_ to others than a term like "bucket"? That is, does it sound
>      complex and scary, when it is really a simple thing. And while
>      people won't know what the "git bucket" is off the bat, it is
>      relatively easy to learn.

I like the name "git bucket", as in "a git bit bucket", but semantically
the connection is just "a container". Especially for beginners this can
result in the wrong connotations:
* Limited size. A modern harddisk is vastly larger than most Git
repositories, likening it more to a container ship than a bucket.
* Definite size. Harddisk space availability varies with time, unlike
most containers.
* Non-linear use. A full physical bucket could be used for many
different things, but a full git bucket can either be forgotten (with
checkout), remembered temporarily (with stash), or remembered
permanently (with commit).
* Container-specific features irrelevant for git: Handles, translucency
(or not), depth, material, dimensions of the opening...

How about a metaphor like "plan"? You either cancel/undo it (git
checkout), postpone/shelve it (git stash), resume/continue it (git
stash apply) or commit to it. Coming from the desktop metaphor, I
personally like `git undo`, `git postpone/resume` and `git commit` -
They give a clear sense of direction towards the commit, and much
clearer verbs for those new to VC in general.

--
Victor Engmark

Re: Consistent terminology: cached/staged/index

David-18
In reply to this post by Jeff King
On 1 March 2011 10:03, Jeff King <[hidden email]> wrote:

> On Sun, Feb 27, 2011 at 10:34:00AM -0500, Drew Northup wrote:
>
> I'm not sure what you mean by "distinct unified staging area". It is a
> conceptual idea that you will put your changes somewhere, and when they
> look good to you, then you will finalize them in some way.
>
> But note that it is a mental model. The fact that it is implemented
> inside the index, along with the stat cache, doesn't need to be relevant
> to the user. And the fact that the actual content is in the object
> store, with sha1-identifiers in the index, is not relevant either. At
> least I don't think so, and I am usually of the opinion that we should
> expose the data structures to the user, so that their mental model can
> match what is actually happening. But in this case, I think they can
> still have a pretty useful but simpler mental model.
>
>> If we use "staging area made up of the object store and information kept
>> in the Index" then we tie a knot on everything, make it clear that it
>> may be more complex than that--and you don't have to care, and we do not
>> foreclose on the possibility of more complete explanation later. That
>> does not bother me. We do however need to recognize that "staging area"
>> is an idiom of limited portability and deal with that appropriately.
>
> Sure, I'm willing to accept that the specific words of the idiom aren't
> good for people with different backgrounds.
>
> One analogy I like for the index is that it's a bucket. It starts out
> full of files from the last commit. You can put new, changed files in
> the bucket. When it looks good, you dump the bucket into a commit. You
> can have multiple buckets if you want. You can pull files from other
> commits and put them in the bucket. You can take files out of the bucket
> and put them in your work tree.
>
> So maybe it should just be called "the bucket"?
>
> I'm not sure that's a good idea, because while the analogy makes sense,
> it doesn't by itself convey any meaning. That is, knowing the concept, I
> can see that bucket is a fine term. But hearing about git's bucket, I
> have no clue what it means. Whereas "staging area" I think is a bit more
> specific, _if_ you know what a staging area is.
>
> So there are two questions:
>
>  1. Is there a more universal term that means something like "staging
>     area"?
>
>  2. Is the term "staging area", while meaningful to some, actually
>     _worse_ to others than a term like "bucket"? That is, does it sound
>     complex and scary, when it is really a simple thing. And while
>     people won't know what the "git bucket" is off the bat, it is
>     relatively easy to learn.
>
>     And obviously, replace "bucket" here with whatever term makes more
>     sense.
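
The bucket analogy quoted above maps fairly directly onto ordinary commands. The following is only a rough sketch in a throwaway repository; the file name, contents and commit messages are made up for illustration.

```shell
set -e
repo=$(mktemp -d)                 # scratch repository, purely illustrative
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "first"
echo "v2" > notes.txt
git add notes.txt
git commit -q -m "second"

# The bucket starts out full of files from the last commit.
bucket_before=$(git ls-files --cached)

# Pull a file from another commit and put it in the bucket
# (this form of checkout also updates the work tree).
git checkout -q HEAD~1 -- notes.txt
pulled=$(git diff --cached --name-only)

# When it looks good, dump the bucket into a commit.
git commit -q -m "back to v1"
content_in_commit=$(git show HEAD:notes.txt)
```

Here `git ls-files --cached` is used as a stand-in for "looking into the bucket".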

A suggestion: could your conceptual bucket be named "the precommit"?

Motives for this suggestion are:
1)  I imagine this word will be readily translatable;
2) Using an invented word like this neatly avoids the complication of
the various different connotations associated with existing words like
"index", "cache", and "stage" that others have raised.

The "precommit" would be a user concept that merely specifies the
content of the next commit. Its purpose is to simplify the user
interface and the documentation. For example, man git-status would
read like this:

"git status displays paths that have differences between the precommit
and the current HEAD commit, paths that have differences between the
working tree and the precommit, and paths in the working tree that are
not tracked by git."

The "precommit" is not to be associated with any specific data structure
in the implementation. For users who want more understanding, it can
be explained that the precommit is implemented by a combination of
data structures, which are then free to be named anything appropriate
to their individual function (e.g. "the index file") without triggering
all the issues that gave rise to this thread.
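
The three comparisons in that proposed git-status wording can be reproduced one at a time with existing commands. This is a minimal sketch in a scratch repository; the file names and contents are invented, and "precommit" is only the term proposed here, not an actual git option.

```shell
set -e
repo=$(mktemp -d)                 # scratch repository, purely illustrative
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > a.txt
echo "one" > b.txt
git add a.txt b.txt
git commit -q -m "base"

echo "changed and staged" > a.txt
git add a.txt                     # precommit now differs from HEAD
echo "changed, not staged" > b.txt   # work tree differs from the precommit
echo "brand new" > c.txt          # not tracked by git

precommit_vs_head=$(git diff --cached --name-only)
worktree_vs_precommit=$(git diff --name-only)
untracked=$(git ls-files --others --exclude-standard)

# git status reports all three groups at once.
git status --short
```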

Re: Consistent terminology: cached/staged/index

Matthieu Moy-2
David <[hidden email]> writes:

> A suggestion: could your conceptual bucket be named "the
> precommit"?

I actually like it.

Maybe "precommit area", or "precommit something", because "precommit"
could be seen either as an action (like the pre-commit hook) or as a
place to put stuff.

As a non-native speaker, I didn't know what "staging area" really meant
in English, but the "area" part of the expression immediately made sense
to me. Had it been called the "foobar-ing area", I would have found it
more intuitive than cache or index ;-).

--
Matthieu Moy
http://www-verimag.imag.fr/~moy/

Re: Consistent terminology: cached/staged/index

Alexey Feldgendler
In reply to this post by David-18
On Tue, 01 Mar 2011 10:11:11 +0100, David <[hidden email]> wrote:

> A suggestion: could your conceptual bucket be named "the precommit"?
>
> Motives for this suggestion are:
> 1)  I imagine this word will be readily translatable;

Less so than “staging area”, at least into Russian.

Just my two cents.


--
Alexey Feldgendler
Software Developer, Desktop Team, Opera Software ASA
[ICQ: 115226275] http://my.opera.com/feldgendler/

Re: Consistent terminology: cached/staged/index

Alexei Sholik
In reply to this post by Matthieu Moy-2
On 1 March 2011 11:15, Matthieu Moy <[hidden email]> wrote:

> David <[hidden email]> writes:
>
>> A suggestion: could your conceptual bucket be named "the
>> precommit"?
>
> ...
>
> As a non-native speaker, I didn't know what "staging area" really meant
> in English, but the "area" part of the expression immediately made sense
> to me. Had it been called the "foobar-ing area", I would have found it
> more intuitive than cache or index ;-).

Hello everyone,
I'm not a very experienced git-user and I still remember how it felt
when I started learning git. I don't recall the exact tutorial I used
(probably it was the 'Pro Git' Book), but anyway, it used the term
"staging area" and "to stage changes" from the outset. I'm also not a
native English speaker and I hadn't even heard of the term "to stage"
before, but managed to grasp at once what "to stage changes" meant.

As for names such as "bucket" and "precommit", I don't think they will
do. There are a lot of resources for beginners on the internet already,
and many of them already use "staging area" and "index". There's no need
to rename the staging area. The only source of confusion, as I see it,
comes from the interchangeable usage of the terms "staging area" and
"index" ("staged" and "cached" being the other confusing pair of
words).

I guess people who are familiar with git use the word "index"
because it's easier to type. But it confuses an unprepared reader. A
solution to the confusion should address these points:
 - clarify that "index" means the same thing as the "staging area" (in
the man pages, if it isn't there already?)
 - replace "cached" with "staged" for consistency with the term
"staging area" (I guess none of you would like to replace it with
"indexed" instead :-P)

Best regards,
Alexei Sholik

Re: Consistent terminology: cached/staged/index

Jonathan Nieder-2
In reply to this post by Piotr Krukowiecki
Hi again,

Piotr Krukowiecki wrote:

> is there a plan for using one term

To summarize: everyone knows what the staging area is, no one seems to
know what the index is, and the --cached options are confusing.

We need a new description (terminology, or better yet, story) for
"git's view of the work tree", since just saying "the index! the
index!" without a myth behind it confuses people.

Various commands take --cached (porcelain):

. git diff --cached - view staged changes relative to the named tree.
. git grep --cached - search in the staging area instead of the worktree.
. git rm --cached - only remove from the index.

(plumbing):

. git apply --cached - apply a patch without touching the worktree.
. git ls-files --cached - list paths that will have content in the next commit.

It would be reasonable to introduce a synonym --index-only.  That can
be confusing if you don't view the staging area as representing git's
deluded idea of what's in the work tree, though.  For the same reason
and some others, --no-worktree / --ignore-worktree wouldn't work so
well (e.g., "git ls-files --no-worktree" would be terribly confusing).
So, um, we're stuck?
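
For reference, here is a rough sketch of what some of the --cached options listed above do in practice. It assumes a scratch repository; the file names and contents are made up.

```shell
set -e
repo=$(mktemp -d)                 # scratch repository, purely illustrative
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "draft" > keep.txt
echo "local only" > local.txt
git add keep.txt local.txt
git commit -q -m "base"

echo "needle in the staged version" > keep.txt
git add keep.txt
echo "something else entirely" > keep.txt   # the work tree moves on

# git grep --cached searches the staged content, not the work tree.
staged_hit=$(git grep --cached -l needle)
worktree_hit=$(git grep -l needle || true)  # no match in the work tree

# git rm --cached drops the path from the index only; the file stays on disk.
git rm -q --cached local.txt
listed=$(git ls-files --cached)
```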

Various commands take --index or related options (porcelain):

. git filter-branch --index-filter - let hook tweak index before commit
. git stash apply --index - revive the stashed index changes, too
. git stash save --keep-index - do not stash changes already added to index

(toys):

. git grep --no-index - just act as a better "grep"; do not look for .git
. git diff --no-index - just act as a better "diff"; do not look for .git

(plumbing):

. git apply --index - next commit will have the patch applied, too
. git checkout-index --index - update stat() cache while at it
. git read-tree --index-output - write output to a different index file
. git update-index --index-info - apply changes in ls-tree or ls-files format
. GIT_INDEX_FILE - where information about the worktree goes

It would be possible to introduce synonyms along the lines of
GIT_STAGING_AREA_FILE, keeping in mind that they also affect the
merging process (and some of them also affect the stat() cache), if
that seems like the right thing to do.
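
As an illustration of GIT_INDEX_FILE, here is a small sketch that keeps a second index next to the default one. The path .git/alt-index is a hypothetical name chosen for this example, and the repository is a scratch one.

```shell
set -e
repo=$(mktemp -d)                 # scratch repository, purely illustrative
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"

alt_index="$repo/.git/alt-index"  # hypothetical location for a second index

# Fill the second index from HEAD, then stage a new file only there.
GIT_INDEX_FILE="$alt_index" git read-tree HEAD
echo "extra" > extra.txt
GIT_INDEX_FILE="$alt_index" git add extra.txt

in_alt=$(GIT_INDEX_FILE="$alt_index" git ls-files --cached)
in_default=$(git ls-files --cached)
```

The default index is untouched: extra.txt is staged only in the alternate file.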

Re: Consistent terminology: cached/staged/index

Drew Northup
In reply to this post by Alexey Feldgendler

On Tue, 2011-03-01 at 10:27 +0100, Alexey Feldgendler wrote:

> On Tue, 01 Mar 2011 10:11:11 +0100, David <[hidden email]> wrote:
>
> > A suggestion: could your conceptual bucket be named "the precommit"?
> >
> > Motives for this suggestion are:
> > 1)  I imagine this word will be readily translatable;
>
> Less so than “staging area”, at least into Russian.
>
> Just my two cents.

I was starting to think about "commit preparation area" this morning,
but it sounds horribly long. Would "Prep area" work provided that the
longer version has already been introduced into the discussion? This
provides a similar language metaphor to "staging area" hopefully without
the translation problem.

Also, I still think that it is important to note somewhere that the way
that git handles commits is not the way that most users are likely to
imagine (the Index doesn't contain the blob objects itself; a finalized
commit is not just a bundled collection of everything as somebody might
expect; etc) so this "Prep area" is a logical space not completely
analogous to stuff found in the ".git" directory. Pretending that
complexity does not exist will not help; letting the users know that
they don't need to grok all of the details to get started is, on the
other hand, quite important.

(Reconstructing the CC list... let me know if I left you out, spammed
you, etc...)

--
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


Re: Consistent terminology: cached/staged/index

Drew Northup
In reply to this post by Alexei Sholik

On Tue, 2011-03-01 at 11:32 +0200, Alexei Sholik wrote:

> I guess people who are familiar with git use the word "index"
> because it's easier to type. But it confuses an unprepared reader. A
> solution to the confusion should address these points:
>  - clarify that "index" means the same thing as the "staging area" (in
> the man pages, if it isn't there already?)

Alas, this isn't quite true. Blobs are copied to the .git/objects
directory (which I referred to earlier as an object store without proper
qualification) with each "git add" action AND are noted in the Index at
the same time. Therefore the Index quite literally contains
information about the blobs to be committed without containing the blobs
themselves. This is why I find any specific equivalence between Index
and "staging area" distasteful--it is misleading.
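
A quick way to see this split, sketched in a scratch repository (the file name and content are made up): after "git add", the index entry carries only the mode, the blob id and the path, while the content is read back from the object store.

```shell
set -e
repo=$(mktemp -d)                 # scratch repository, purely illustrative
cd "$repo"
git init -q

echo "hello" > hello.txt
git add hello.txt

# The index entry: mode, blob id, stage number and path.
entry=$(git ls-files --stage hello.txt)
blob=$(git rev-parse :hello.txt)

# The content itself is an object in the object store.
type=$(git cat-file -t "$blob")
content=$(git cat-file -p "$blob")

echo "$entry"
```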

(Yes, I made that mistake as well--helped along by a lot of third-party
documentation referring to a specific cache or a specific "staging area"
without noting that those were tools to understand the logical function
of git but did not have anything to do with implementation. When you
claim to be explaining "how something works" you should be doing just
that.)

--
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


Re: Consistent terminology: cached/staged/index

Alexei Sholik
On 1 March 2011 19:02, Drew Northup <[hidden email]> wrote:

>
> On Tue, 2011-03-01 at 11:32 +0200, Alexei Sholik wrote:
>
>> I guess people who are familiar with git use the word "index"
>> because it's easier to type. But it confuses an unprepared reader. A
>> solution to the confusion should address these points:
>>  - clarify that "index" means the same thing as the "staging area" (in
>> the man pages, if it isn't there already?)
>
> Alas, this isn't quite true. Blobs are copied to the .git/objects
> directory (which I referred to earlier as an object store without proper
> qualification) with each "git add" action AND are noted in the Index at
> the same time. Therefore the Index quite literally contains
> information about the blobs to be committed without containing the blobs
> themselves. This is why I find any specific equivalence between Index
> and "staging area" distasteful--it is misleading.

There's no reason to make it more confusing by spelling out
implementation details that users are not interested in.

Once I add a modified file to the index (via 'git add') or even add a new
file, its content is already tracked by git. This is the most relevant
part.

It is not relevant from the user's point of view whether it's already
in .git/objects or not. Once I've staged a file, I can rm it and then
'git checkout' it again to the version that's remembered in the
staging area, i.e. I will not lose its contents once it's been
staged.
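
A minimal sketch of that round trip, assuming a scratch repository with made-up file names and contents:

```shell
set -e
repo=$(mktemp -d)                 # scratch repository, purely illustrative
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "committed version" > draft.txt
git add draft.txt
git commit -q -m "base"

echo "staged version" > draft.txt
git add draft.txt                 # staged, but not committed

rm draft.txt                      # lose the work tree copy
git checkout -- draft.txt         # restore it from the staging area
restored=$(cat draft.txt)
```

What comes back is the staged version, not the one from the last commit.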

If what you're trying to say is that new users think of the 'staging
area' as some place where the content is stored before a subsequent
commit, there's nothing bad about it. If they try to find out
about its concrete location in the fs, they'll eventually find out
about the index and its true nature in terms of implementation.

--
Best regards,
Alexei Sholik